Wednesday, August 26, 2015

Dataclysm: sometimes numbers can give us new insight in sociology

Back in 1984 an organization called the Maoist Internationalist Movement (MIM) was born. What set MIM apart within communism was its theory that people are no longer divided simply into the three traditional classes: the proletariat (workers), the petty bourgeoisie (middle class) and the bourgeoisie (capitalist ruling class). Instead they laid out a new great divide: the 1st world and the 3rd world. Now, instead of looking at exploitation through a narrow scope, many communists were forced to expand their view and open their minds to the possibility that things had changed a lot since Marx and Engels' time.

It later came out that one of the main contributors to MIM theory was a man named Henry Park. Park was a sociologist, but he also worked in statistics. One of the reasons MIM was able to put forward its hypothesis so successfully, and in ways that frustrated those who tried to refute it, was its use of numbers. Statistics confronted many Marxists who held onto old ideas with cold, hard facts. MIM also used numbers to refute anti-communists, for example by showing that The Black Book of Communism contains significant mathematical errors.

While MIM is now gone, others have picked up where they left off and refined the theory further. There's no doubt that what they left behind was something groundbreaking and game-changing that could alter the course of history in the communist movement.

Sometimes we need to look outside of the typical political science and economic techniques we use to analyze sociological questions. It's worth taking a fresh look through a new lens every so often to see what we might have missed.

Christian Rudder's 2014 book Dataclysm: Who We Are (When We Think No One's Looking) is interesting because it looks at how such means of studying sociological data have now become a mainstream phenomenon. For those of you not familiar, Rudder is a co-founder of the dating site OKCupid (OKC), which sets itself apart from other dating services by using algorithms based on users' answers to questions to match them up.

And of course his field of education is mathematics. 

Rudder was able to mine a treasure trove of data from OKC to learn a lot about people, and to show how much we can predict about who they are based on their preferences. He was able to use some data from Twitter and Facebook as well.

Unfortunately for socialists and communists, there is not anything really "groundbreaking" in the data that Rudder uncovers. Mostly it just reaffirms much of what we already know (or at least assume we do).

People are closet racists. The fact is that while people would answer otherwise in public the stats tell us that by and large white people don't want to date outside of their "race". People on OKC might answer otherwise, but the way they use the site tells a different story. Whiteness is a poison that infects People of Colour (PoC) as well. While almost everyone on the site shows a preference for their own ethnicity, white people come in as a second choice across the board for all others. Rudder's stats are, however, limited to whites, Latinas/Latinos, Asians and Blacks.

While it turns out that "blind" dates end up going well for women 75% and men 85% of the time, racism runs so deep in our society that it even goes beyond sight. Rudder cites a study in which visually impaired people were found to end dating relationships once they found out they were seeing a PoC. Rudder also cites an example of how voters will tell pollsters they are voting for a Black candidate out of guilt but will then go to the polls and elect a white one. (p.127)

Rudder also found that politics matter less in matching a relationship than we might think. According to his data, the questions "Would you travel alone?" and "Do you like Scary Movies?" matter more to most Americans in the long term than whether or not you are a Democrat or Republican. This, in my view, could largely be tied to the overall privilege that we all enjoy in first world countries. At the end of the day our lives are not interrupted by struggle here, at least not significantly, so it is much easier for us to look past mainstream political differences than it is for someone in the third world, whose political opinions might be a matter of life or death.

One's perceived attractiveness to others, as it turns out, is not a huge issue for men. In general, "attractive" women tend to have 3 friends for every 2 that attractive men have on Facebook over those deemed less attractive. But the gap is much more staggering when it comes to job interview requests. Attractive women are far more likely than their counterparts to receive an interview on Shiftgig, and this holds true whether the person who selected them was a man or a woman, while for men looks didn't matter at all. (pp. 119-120)

How we act online is often different from our behaviour IRL. Rudder gives examples of people flying off the handle at misinterpreted tweets, such as when a 17-year-old woman joked that the world is 2,014 years old on New Year's Eve last year. Personally, it never ceases to amaze me how much misogyny, sexism, racism and hatred come out of people on the internet whenever these things happen. Other, more serious examples of internet-based mob justice include the misidentification of a Boston bombing suspect by 4chan and Reddit "gumshoe" trolls.

Some other interesting tidbits:
  • Twitter is not actually degrading language; instead of using contractions, people mostly work around the 140-character limit by using richer words
  • Autocomplete might actually be perpetuating stereotypes by suggesting them to people in searches
  • Everything from sexuality to drug use and intelligence can be gleaned about people on Facebook from their friends and likes, with surprising accuracy, using algorithmic tools

There is a lot of valuable sociological data out there, and we the users of the internet are passively compiling it every single day. What's troubling is that it's mostly private entities like Google, Twitter, Facebook and OKC that have access to it (at least Rudder admits this himself). By and large, we aren't getting anything other than free use of services in return, unless these corporate entities decide to donate the data to academics. But the reality is that this information holds a lot of commercial value, so it's likely they won't, at least not until they're ready to.

"... social scientists are very cagey with data sets; ... they treat them like big bags of weed-- possessive, slightly paranoid, always curious who else is holding and how dank that shit is." -- Rudder

Rudder reminds us of how this data can be used nefariously by talking briefly about the scope of PRISM. But also imagine if prospective employers started using algorithmic tools to analyze your social media before hiring you. A similar practice is already used by some prisons in the United States, which demand Facebook passwords to try to filter out guards with possible gang connections.

Sometimes the damage is unintentional and removed from human action, as in the case of the teenage girl whose pregnancy was outed to her father when Target data-mined her purchases.

Rudder's book can give us some idea of the direction society is heading in the way people are analyzed for both academic and commercial purposes. As socialists, we have to figure out how we can use these tools to our advantage, what will be to our disadvantage, how to avoid those disadvantages (perhaps by opting out of certain services, or by hiding or giving false data) and how this will affect the prospect of socialism in the future.

The book is a fun read, with Rudder laying down some witty and funny remarks here and there to keep things from being dry. Oh, and it's full of pie charts and graphs, so if you are into that kind of thing you'll enjoy it!
