Monday, 2 December 2013

Apples and Oranges and Acholi

I'm pretty sure there aren't muscles in my brain.  But I swear I can feel teeny tiny sinews straining in my head when I stare at my data.  Raking over the pages.  Dicing it up.  Piecing it back together in new combinations.  Milking every ounce of remotely interesting reportable information from it.  Participants 5 and 9 mentioned the word BLUE.  Great.  No idea what that means.  Yet.

Two things are in the back of my mind: Daniel Dennett's latest book, Intuition Pumps and Other Tools for Thinking, because one of the early chapters is about failure.  Strategic flops.  He talks a lot about the toil of generating ideas.  And it does feel like a rare and wonderful thing that we (people who have the luxury to just sit around and ponder) get a problem that really strains those brain muscles.  Data analysis has also brought to mind an image of sculptors from another age who could see their design within the marble while slowly chipping away toward it.  I am reasonably confident that somewhere within all the data I've collected, if I sift and sort enough, the description I envisioned at the start is poised to emerge.  What I don't know is what tools I need to get there.  And then again, if my first pass at designing an experiment like this wasn't quite right, was it a strategic flop?  Did it generate anything that will lead me in the right direction next time, or at least (fingers crossed) provide a partial picture?

When I initially designed my experiment (it was conducted in three stages: participants watched a short YouTube video on a laptop titled 'Crazy Nigerians,' either alone or in pairs; they were asked to describe what they had seen in Acholi, 'Lok ma ineno / Tell me what you saw'; they were then handed a mobile device to answer questions about the video; and finally they described the scene again in English), I knew it might not work (the risk of all experiments), but I hoped it would provide a certain kind of data, a certain kind of description of the situation I was investigating.  Unlike previous qualitative work, this experiment might be more useful to software engineers.  Additionally, it narrows the focus from language or communication generally to event construal, a type of cognitive connection to language.  How we conceptualize an event in time and space, who was there, and so on, and how this is stored in our memory and recalled through language, has been shown by previous bilingual research to vary by language-- but what happens when ICT is thrown into the mix?
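One way to keep those three stages straight during analysis is to hold each participant's material together as a single record.  Below is a minimal sketch in Python of what such a record could look like; the field names and example values are placeholders invented for illustration, not my actual coding scheme.

    # Minimal sketch of one session record; field names are illustrative only.
    from dataclasses import dataclass, field

    @dataclass
    class Session:
        participant_id: str
        watched_in_pair: bool                  # video watched alone or in pairs
        acholi_description: str = ""           # stage 1: spoken description in Acholi
        mobile_survey_answers: dict = field(default_factory=dict)  # stage 2: survey on the device
        english_description: str = ""          # stage 3: spoken description in English

    # A made-up example record
    session = Session(
        participant_id="P05",
        watched_in_pair=False,
        acholi_description="(transcribed Acholi narrative)",
        mobile_survey_answers={"scene_type": "fight"},
        english_description="(transcribed English narrative)",
    )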

As mentioned in previous posts, I modeled my experiment on the work of several others in the field of cognitive linguistics.  In particular, Aneta Pavlenko (the new president of the American Association for Applied Linguistics) and Christiane von Stutterheim, an expert on narrative.  Once my results take shape, I will integrate comments from members of the software development community.  As with earlier papers, writing to engage several disciplines is one of the biggest challenges.  I am by no means trying to please everyone, but I do want to get my point across and be prepared to answer reasonable criticism.

In subsequent posts, I will be addressing initial themes as well as analysis techniques, since this is a new, although adapted, method.  So far, three main themes have emerged: an alternative hypothesis indicating narrative disruption, the identity of the attacker in the video, and the conveyance of doubt/uncertainty.... Stay tuned.

Monday, 19 August 2013

Footprints, Skeletons, and other Telltale Signs

One of the tricks I'm told authors use as they develop characters for novels is to invent deep back stories, intricate details which may never appear in the actual novel but nevertheless inform the character's actions.  The incident with the clown and the candles at the 7th birthday party.  The aversion to sesame seeds, etc. etc....

When I began to investigate ICT for conflict management, I was mostly thinking about the software I had worked with on laptops.  The interface layout was a primary concern for me, and it was as if I was approaching my analysis like I was dissecting a painting.  In fact, there are many scholars who treat 'space as meaning' in this way.  As my project developed, I came to center more on mobile technologies, and their interface layout is dramatically different, often starkly minimal.  However, I believe the underlying logic of the original desktop software still guides the structure and organization of the mobile version, in the same way the novel's character is guided by previous experiences kept invisible to the reader.  Colleagues who play devil's advocate push me to imagine another way in which information could be organized.  They ask me to say precisely how the current system is inadequate.  I can't at the moment.  This is in part because the current system comes from my own culture and makes sense to me.  However, I think my experiment can demonstrate that sense-making, as captured by narrative flow and scene conceptualization, isn't universal.

Is it possible to track backwards, to recreate the logic of the desktop experience from the mobile one?  The mobile software I've been working with is not especially sophisticated, so it hasn't been designed for mobile only.  Generally, it reads like a survey, gathering information in a step-wise fashion from structured questions which try to streamline answer choices and reduce the user's need to type or scroll around.

Does the simplification, the invisible backbone of the desktop version driving the organization of information gathering actually end up driving the narrative instead?

As I analyze my initial findings, I am grouping participant responses with a few questions of my own.  Or rather, after reading the responses through a few times, the things that jumped out at me can best be captured with the following questions:
(The next step is to look for what doesn't jump out, what conforms, or what may even contradict my hypothesis.)

  1. Did participants tend to identify the scene (a video they watched) as an argument, fight, theft, or 'other' in the preliminary description?
  2. Did this change between step 2 and 3 (after interaction with mobile survey)? How?
  3. In which language (English or Acholi) did they give more details? Or equivalent?
  4. Participants seemed to include details from their personal lives to give structure to the ambiguous scene… examples? Frequency?
I used two surveys: one contained a single open question, and participants could type in much the same way as sending an SMS; the second version was a structured survey with multiple-choice, open-format, and yes/no questions.  It was patterned after the type of ICTs I expect participants to interact with more and more in the future.
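To make that comparison concrete, here is a rough tallying sketch for the first two questions above, assuming each response has already been hand-coded with a scene label per stage.  The participant IDs, labels, and survey assignments below are invented placeholders, not actual data.

    # Rough tallying sketch for questions 1 and 2; all records below are invented placeholders.
    from collections import Counter

    coded_responses = [
        # survey version, scene label at stage 1 (Acholi) and stage 3 (English)
        {"id": "P01", "survey": "open",       "stage1": "fight",    "stage3": "fight"},
        {"id": "P02", "survey": "structured", "stage1": "fight",    "stage3": "theft"},
        {"id": "P03", "survey": "structured", "stage1": "argument", "stage3": "theft"},
    ]

    # Question 1: how was the scene identified in the preliminary description?
    print(Counter(r["stage1"] for r in coded_responses))

    # Question 2: how many descriptions changed after the mobile survey, by survey version?
    print(Counter(r["survey"] for r in coded_responses if r["stage1"] != r["stage3"]))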

So far, I have noticed that some participants who have the open format do not change their description across the three description stages, while a few who interacted with the structured survey changed their version of events.  For example, two participants who first described watching a fight later said it was a robbery because this was one of the multiple-choice options, or narratives, presented throughout the structured survey.  Some participants told me afterwards they could see what I wanted them to say.  (To be honest, having watched the video perhaps 200 times, I have no idea what is going on in the scene, so I didn't have a 'true' narrative in mind among the choices.)

The complicated game of psychology is partly the result of a population that has been inundated with researchers.  While my work is unfamiliar and participants were generally curious and positive about participating, the survey questions that touch on theft, violence, or conflict may trigger a connection to the type of research they are habituated to and thus associate with certain objectives.  This is exactly why I wanted to conduct an experiment instead of interviews.  To avoid the guessing game.  The posturing and manipulation of responses.  Often the intention is to please me, the researcher, but an experiment like this compares the thought stage of language production, the conceptualization of scenarios, which manifests in the specific ways bilinguals cross or mix their languages.  I am reasonably confident the experimental design guards their responses from this kind of manipulation because they are unaware of what I am measuring.

That's not to say we don't discuss it.  Some of the most valuable insights for me were gained during discussions about the context and purpose of my research.  Participants asked some important questions.  However, I maintain that the quantitative, more than the qualitative, will be the more valuable outcome (and more novel aspect) of this research, as a first step toward treating culture as a variable in design.

Wednesday, 31 July 2013

The Culture of Whistleblowing







Baba Jukwa of Zimbabwe exposes corruption.  He remains hidden behind a cartoon symbol.  The posts on his Facebook page about government officials have put a high price on the discovery of his identity ahead of the 2013 elections on July 31st.

In a recent interview for ICT Africa, Dr. Jabulani Dhliwayo explained that in Zimbabwe, as well as in other African countries with similarly high levels of corruption, the risk to whistleblowers is extreme and the corruption so deep, so pervasive, that they have no chance to pursue justice through the systems and institutions of government.  A mask, a pseudonym, a means of remaining anonymous and outside the system are essential.  The interviewer questioned whether citizens could trust reports from a source who would not reveal themselves and did not work within a system which could verify reports.  Dr. Dhliwayo replied that this is the best they can hope for right now.

Zimbabwe is frequently mentioned in the same breath with Cuba and Belarus as one of the countries which have bought surveillance technology from China.  (Literally the same source is always quoted, which says 'suspected to have' and gives no more specifics, which I find suspicious and frustrating... so this suggests a side project.)  Anyway, while this firewall and tracking software is meant to limit and finally root out actors like Baba Jukwa, Dr. Dhliwayo explains that the skills of the security services are not as advanced as the software they're tasked to operate.  (Small sigh of relief for activists across Zimbabwe.)

Current U.S. cases, those of Edward Snowden and Bradley Manning, highlight the contrast between overtly engaging with the system and Baba Jukwa's strategy of anonymity, particularly with respect to the emphasis on the character of the individuals over the content of the material they sought to bring to public attention.  Does inventing a two-dimensional cartoon keep the focus on the political message and limit character assassination to the obvious speculation about identity, or judgement about revealing identity... all arguably less reality TV show than when there is a real individual with family, friends, neighbors, dentists, teachers, and baristas to dig dirt on?

Perhaps this is too simplistic or obvious.  The U.S. has a judicial system in which to bring perpetrators to justice, and there are ways, as Snowden and Assange are demonstrating, to take advantage of international legal systems to remain unharmed as a whistleblower.  Neither of these avenues would be available in many countries.

This is not the only example of using a mask to create a sort of shield from which to attack authority figures. 

La Comay, a five-foot puppet (or a man in a flamboyant costume topped off with a foam head), is another example of disguising the identity of a person pursuing sensitive topics.  While the actor who used this vehicle in Puerto Rico was not anonymous, by adopting such an over-the-top character whose comic and salacious style was far from formal journalism, he had more latitude on serious topics and could even book tough interview guests.  (The show is no longer broadcasting.)

And how does the mask strategy appear (or disappear) across cultures when using software platforms where crowds can report corruption or crimes?  Some are designed for reports to be made anonymously and to function within, or as auxiliary to, the justice system, but there are other crowd-created projects which have sprung up in different cultures and approach anonymity differently.  Vigilante or sunlight, they are often based on social media such as Facebook or Twitter.  These efforts have been criticized for lack of verification or worse, such as in Mexico, where some individuals have been tried as terrorists for inciting violence based on rumors they spread through Twitter.  The individuals were partially veiled, and then revealed.  Much of the reporting of violent drug-related crime is being spread in this manner, but the identities of the sources are, to some extent, hidden.  And this state of being in the shadows is not culturally problematic; in fact, it seems necessary because of the risks involved in reporting.  In contrast, in Canada after the 2011 Stanley Cup riots in Vancouver, citizens policed themselves via Facebook to 'name and shame' looters and other offenders (to the great frustration of the real police).  The need and desire to remain anonymous while pointing the finger was not as high a priority in Canada.  Neither, as it turns out, was verification.

Is there a culture, a history, a tradition to how we cry foul?  And thinking about how the whistleblower must go about presenting evidence (as I said, some countries simply have a higher risk factor) and how people expect whistleblowers to behave... is a mask permitted?  Is a mask even part of that tradition, versus a more cowboy approach of riding out and putting a target on yourself?  Or is this a measure of the health of our institutions, our systems of justice?  (I truly didn't intend to bring in the Lone Ranger, but he was a cowboy with a mask back in the days when our justice system was still getting up and running... I'm going to propose this as a scale to Freedom House.)




Monday, 24 June 2013

Yu's on First


No matter how clearly we think we've written something, we've all had someone read our email, our essay, our simple message and come back with a completely different reading than we intended.  In my experience, the miscommunication, the multiple interpretations, come at the semantic level: the result of a disagreement over degree or connotation or tone.

Similarly, we've all misheard things.  "Have I got cat spies?" "No. Have you got a half size?"  Ah.  Our brain fills in as best it can around the morphemes it missed, even though we know what we came up with isn't semantically solid.

Here in Gulu, I can only describe the mode of misunderstanding as feeling a bit like a hybrid of these two-- on the one hand, the problem I'm seeing happens between writer and reader, but on the other hand, the nature of the problem isn't as much semantic as it is denotative. 

In my experiment, I ask participants to fill in one of two surveys in Acholi.  (The surveys were designed with native speakers.)  I was prepared for semantic-level misunderstandings surrounding some questions, but I was not prepared for denotative deviation, something along the lines of my writing the word 'carrot' in English and having someone interpret it as 'marble.'  That's basically what has been happening: a dispute at the foundational level of word meaning.  I'll give a second example in case that wasn't clear.

I throw the ball.  (what is written on the page)
I throw the fight. (what you, a native speaker of this language understand from what you've read)

I've never had this kind of misunderstanding before.   

Not every participant has the same misinterpretations, so it's not a mistake, typo, or spelling variation.  I don't entirely have an explanation yet, but in one session a participant will lecture me about the inaccuracy of my survey for including such a bizarre term, and in the following session another participant sees no problem at all.

I understood from my research and from other linguistic environments that with an oral language, meaning is tied to listening and speaking.  This experiment brought the process into relief for me as a native speaker of a chirographic language, because I was able to watch the meaning mechanism weaken on the page, dissolve without the context of sound.  In the short-form writing mandated by mobile applications, meaning becomes fuzzy, unpredictable, contestable.  (I wrote about the 6 meanings of gwok in a previous post.  Coo has 9.)

One series of questions I asked participants before we began the experiment was about language use in daily life, at work, and with their mobile device.  Did they mostly talk on their phones, talk and text, how did they use their phone?  Most respondents were college educated.  Many were multilingual.  The most common answer to 'How do you use your phone?' was that they rarely texted, or not at all.  They preferred to call even though texting is cheaper.  Everyone texted in English.  And they told me this with a tone that said this was a foolish question.  A few later added that they could mix languages to joke with friends in Acholi, or send a text to their grandmother in the village who does not speak English.  I have no quantitative data to look at how frequently this happens.  There are other studies which look at the frequency and context of language mixing.  A participant from a micro-finance organization described why he used English to text:
“We wish to text in Luo, wish to maybe send voice mails in Luo, but [the mobile phone] has been invented in another country, so in English we have to now do that. [me: Why couldn’t you text in Luo?]  In Luo?  It is not easy. You know our language is very short. chuch chuch chuch [noise of texting action].  So to formulate is not easy.”
He was describing the problem of the misunderstanding I had encountered.  It is avoided in speech because Acholi (sometimes called Luo) uses a lot of repetition as well as intonation to convey meaning.  This kind of thing isn't easily adapted to mobile applications where the design focus is on minimalism and streamlining of interface text.

Have now wrapped up my three months in Gulu, with transcriptions and translations mostly completed.  On to data analysis.  Will be sharing first impressions from the field at Africa Writes, July 5th at the British Library.



Tuesday, 11 June 2013

Hardware Heartbreak

Speaking to people is the easy part.  Getting the mobile devices, laptop, digital recorder, camera, and 3G modem to cooperate-- that's the circus of modern fieldwork.

Everyone, research participants that is, seems delighted to take a short break from their workday and find out what I'm up to.... balancing my laptop with a Nollywood-style video playing, repositioning the set-up on a plastic chair to find the sweet spot for the 3G modem, trying to nonchalantly hold the digital voice recorder so that participants forget I'm recording (which means I sometimes forget myself and cover the mic with my hand), and pausing from pen and paper note-taking to pass a mobile phone where they fill out a brief survey about the video they've watched.  It's the first time many of them have used a touch screen, so I crouch and lean over their shoulder squinting in the lunchtime sun.  (Here at the equator, hardly the optimal time to use any device outside.)  I am conducting an experiment, which I'm told is very risky for a thesis methodology in the social sciences.  With five devices in this scheme, the chances of suffering 'technical difficulties' are terribly high.  Not to mention the chances of me just dropping one of them in the dirt.  The experiment is simple; the hardware is the challenge.

The who
the what
the how of it all.

Ideally, my participants are individuals who are engaged in their community in such a way that they might want to gather information or make reports to address a problem, take action, or make a policy.  The kind of people who might use these mobile applications in the future.  This could include NGO staff, social workers, local government staff, community development project members, and ICT students.  In addition, they are bilingual in English and Acholi (although they may speak other languages), but they have not spent more than six months outside of Uganda, which could contribute to acculturation.  Age was not a primary consideration, but an effort was made to speak with an equal number of men and women.

It's quick and painless.
I show a one-minute video, then ask people to tell me what they've seen.  First, in Acholi, then through a series of questions on a mobile device (also in Acholi), and finally a third time in English.  I chose the video to approximate a scene of conflict because I am interested in how software applications can ultimately be adapted as a tool in conflict resolution.  The video is not particularly violent, but it shows a scuffle on a crowded street involving many people.  It is unclear to the viewer why the fight started, and this ambiguity means s/he must draw on schema, past experience, intuition to understand what has transpired. (This connects to my previous post about narrative structure and cultural variations in interpreting narratives surrounding events.)

I will be comparing the three versions using cognitive linguistics-- a combination of looking at thought and language, focusing on how we decide (although perhaps not consciously) to articulate concepts when we have access to multiple languages in our minds.  I am looking for English concepts that appear in the Acholi versions-- especially the one produced with the mobile device.  This would be evidence of conceptual transfer, and transfer can happen in either direction.  At the level before language is produced, even when the participant is intending to use Acholi, they might be engaging with English concepts.  My hypothesis is that the mobile technology triggers this engagement.  Changing the interface language to Acholi isn't enough; participants will convey an anglicized narrative when using this ICT.  By doing an experiment I can measure the instances of transfer.  Of course there might not be any, or they could occur in a manner I don't expect.  That's the fun and the risk of doing an experiment.
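As a first, deliberately naive pass at that measurement, one could scan an Acholi transcript for items a human coder has already flagged as English-anchored concepts and count them per version.  The sketch below assumes exactly that kind of pre-flagged list; the flagged items and sample tokens are invented for illustration, and real coding obviously requires linguistic judgement rather than keyword matching.

    # Naive counting sketch: tally tokens a coder has flagged as English-anchored
    # concepts inside an Acholi transcript.  Flagged items and tokens are invented.
    flagged_english_concepts = {"robbery", "police", "report"}   # hypothetical coder output

    def count_candidate_transfers(tokens):
        # Count tokens previously flagged as candidate conceptual transfers.
        return sum(1 for t in tokens if t.lower() in flagged_english_concepts)

    # Made-up tokenised fragments of the spoken and mobile-device versions
    spoken_acholi = ["lok", "ma", "ineno", "i", "gang"]
    mobile_acholi = ["robbery", "i", "gang", "police"]

    print(count_candidate_transfers(spoken_acholi))   # -> 0
    print(count_candidate_transfers(mobile_acholi))   # -> 2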

It is only an initial step to describe in quantitative terms the limitations of current technology for capturing narratives in languages that are extremely different from the languages the software was designed for, and to point toward avenues for addressing these limitations.  That is the next step: to contemplate what it could look like... a step for software engineers here to consider.

Why does this matter?
There is already research that looks at how we remember events differently in different languages.  We connect these memories to sensory and emotional information through language.  If we are forced to recall events in another language, the narrative we give may be different.  If you stack up enough of those altered narratives as reports about a conflict, a crime, a human rights abuse, the final picture will compound these distortions.  What would happen if the narrative could be collected in another way?  A method that does not yet exist, but that reflects not only the language of the individuals sharing narratives, but their notion of what constitutes information worth collecting, and still further, how to piece that information together, how to organize it. An alternative to the current logic governing ICTs.

Beyond uses for conflict management, indigenous software that boosted regional or domestic use would be economically significant in places like Uganda.  If it organized information in a way that was incompatible with current software, i.e., the new couldn't immediately talk to the old, that would create a market for still more software to facilitate interoperability when it needed to happen.  And purposeful inaccessibility could make systems more secure.

But this is all a long way off.  This is the potential, the reason I find the topic so interesting.  My immediate concern is contextualizing results, such as when an individual with a degree in computer science and a job in crowdsourced ICT work prefers to have me type the text on the mobile device, or to speak their answers rather than write them in the provided textbox.  Is user-repellent the opposite of user-friendly?









Wednesday, 22 May 2013

Method Madness


I recently found this passage in doing some background reading on narrative.  It's the kind of excerpt that makes me want to get up and do a cartwheel or yell jackpot! and confirm to everyone sitting nearby that I've finally lost it.  (If you don't have the same reaction, that's probably good.)
"Sir Frederic Bartlett, a celebrated Cambridge psychologist, was the first scholar to investigate and theorize cross-linguistic and cross-cultural differences in narrative construction.  In Bartlett’s (1932) classic study, Western subjects were read a Native American story, The War of Ghosts, and then were asked to re-tell it.  Because the participants found both the story structure and many accompanying details unfamiliar, they repeatedly transformed the tale in recall, both through omissions of details and through rationalizations, which made the story conform to a more familiar Western pattern.  On the basis of these observations and experiments, Bartlett (1932) developed his theory of schema that informs much of contemporary cognitive science, psychology, and narrative study.(Pavelenko, in press, p.7)
What I find so exciting about Bartlett's experiment is that it is so similar to my own, but it's from 1932!  It tests the culturally defined pattern of narrative, which is linked to how we make sense of the entire plot line of the story.  Discovering that my methodology has some precedent, which I have now adapted and applied to communication via ICT, strengthens the validity of my approach.

Here in Gulu, Uganda, I am investigating the reverse of Bartlett's phenomenon.  I'm interested in individuals far from the culture of design... telling their own stories through a western artifact.  (Some Acholi speakers may call phones and tech-related objects "things for work made with craftsmanship in iron" or, alternatively, "things for work made with skill from whites.")  I hypothesize that when a narrative is given in Acholi via an ICT, a narrative shift occurs, but it is the ICT which triggers this shift.  The shift is between an Acholi narrative structure and a western one, and the spatial cues of the technology 'space' we enter when using ICT applications cause users to adapt their narratives and concepts to fit the western model, much in the way a bilingual Acholi-English speaker makes small changes when switching into English.  This change happens at the cognitive level, the level of categorization, ordering, and many other 'thinking'-level pre-language processes that the speaker may or may not be aware of.  My hypothesis is that the ICT, even if the interface language is Acholi, is recognized to be 'of the west' because of other visual cues, which every culture makes sense of in different ways.  (Check out international signage for some great examples.)

My hypothesis is based on years of field observations in a range of linguistic and cultural settings, but the challenge is how to create a research method that captures a phenomenon I have a strong inkling about in valid and reproducible terms that the academic community will also find compelling.

Other research that I find cartwheel-worthy comes out of ALT-I, the African Languages Technology Initiative.  In particular, the engineers Odejobi and Adegbola theorize, 
"services supporting CMC [computer-mediated-communication] intended for use in African environment should exploit and implement language technologies developed around African languages and cultures."
 They propose that this addition to current technologies of American and European origin should first,
"describe and represent the knowledge systems underlying African systems of communication in a form amenable to computation, e.g., numerical, graphical, or symbolically…. by critically and analytically address[ing] the question of how African people represent concepts." 
Their idea is both broad and ambitious, and my research, grounded in conceptual transfer theory within cognitive linguistics, begins to explore the possibilities they suggest.  If we concede that there are myriad ways in which different cultures communicate, why is there, as research teams led by both Hill and Zakaria propose, only one style of communication technology, only the western-engineered model of sharing our narratives, transmitting stories, moderating the digital information that has become interconnected with our very identities?  This research examines the impact of information and communication technology design-- the current mono-cultural design-- on narrative, identity, and participation, with examples from a bilingual Acholi-English case study in Gulu, Uganda.

...and this case study involves what exactly?  That is what I have been explaining to community leaders for the past several weeks in order to get the OK to start collecting data.  (My Acholi explanation is slowly getting better and involves the word apoka poka, which means 'difference.')


---------
Odejobi, T. and Adegbola, T. 2010. Computational and engineering issues in human computer interaction systems for supporting communication in African languages. In: O.A. Taiwo, ed. Handbook of research on discourse behavior and digital communication: language structures and social interaction. Chapter 56. [ebook] ISBN: 9781615207732 [Accessed 20 January 2012].


Friday, 10 May 2013

For the Birds and Mr Brooks

(this is a break from Gulu and current research while I'm out in the field collecting data, but check out my piece with the Policy and Internet Blog out of the Oxford Internet Institute)

This is a response to many pieces I've read on big data analysis and large-scale social media analysis and, in particular, a recent New York Times op-ed column by David Brooks titled 'What You'll Do Next,' in which Mr. Brooks wrote:
"The theory of big data is to have no theory, at least about human nature. You just gather huge amounts of information, observe the patterns and estimate probabilities about how people will act in the future. . . .

To discern meaningful correlations from meaningless ones, you often have to rely on some causal hypothesis about what is leading to what. You wind up back in the land of human theorizing."
Mr. Brooks is correct, and this is something I've written about before.  As much as big data scientists hate to admit it, there is social science underpinning their algorithms and interpretations.  Teams at top institutes are using a theory called homophily.  And just hearing the name turns my stomach.  Not so much because of the word (although, it is an odd term), but for the same reasons I grimaced when I heard the book Three Cups of Tea spun as a handbook for foreign policy.  However, since my visceral response is not a recognized metric (yet), I will enumerate my objections to the theory’s current application to big data mining.  

It is a social network theory roughly asserting that it is our common 'likes' that bind us.  It has inspired research papers with titles such as 'Birds of a Feather [Stick Together],' which chronicles many types of social network theory.  Did I mention that this is all sociology, not information science, and that it's the study of humans relating offline, out in the world?
Yes, these theories started back around the 1960s in the United States, when racial upheaval motivated social scientists to ask, 'What is it that binds us together at all?'  The insights they gained from work stretching into the 1990s, perhaps combined with the familiar word network, have attracted researchers from information science to apply the findings to the online domain.
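For anyone who hasn't met the theory, the basic measurement behind it is simple enough to show in a few lines: take a network of people, each tagged with a 'like,' and ask what share of ties connect people who share that like.  The toy network below is entirely made up, purely to illustrate the idea.

    # Toy illustration of the homophily idea; the network and 'likes' are invented.
    edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
    likes = {"A": "jazz", "B": "jazz", "C": "rock", "D": "jazz", "E": "rock"}

    same = sum(1 for u, v in edges if likes[u] == likes[v])
    print(f"{same} of {len(edges)} ties connect people with the same 'like' "
          f"({same / len(edges):.0%})")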
Now, assuming that humans behave online in the same way they do offline is one leap, but big data scientists have made yet another.  The heaps of data come from many sources such as social media, applications, and devices.  Headlines were made when researchers at MIT predicted the political leanings of mobile phone users and even tracked the spread of illness based on users' habits.  The ground-breaking results from MIT were based only on American user behavior, but discussed as though they were universally applicable.

Results assumed the one user/one device/one account rule of the US, but this isn't the pattern in some communal cultures.  Social scientists are just beginning to study user engagement with technology in places like Nigeria and Indonesia, and to discover that much of what we thought we knew, much of what we assumed was universal, does not hold up under scrutiny and is increasingly dynamic.
Bulk and ease of access to data do not immediately add up to persuasive conclusions.  I am not convinced by the argument I hear so often from big data proponents, 'the data speaks for itself,' because the models I come across, which are essential for any human to make sense of the tonnage of data, are based on Cold War-era foundations.  Theories like homophily do not take into account the advent of the internet or cultural variations.  When interpreting social media data or making assumptions about user behavior, culture is a variable that cannot be ignored.
The truth is we don’t know what to do with all this data yet.  And I am a bit torn here because my inner-engineer wants to build better models, to improve.  I am fascinated by the problem of how to incorporate cultural variation and increase what we can learn from the rich amount of information at our disposal.  But what will we use this be used for?  Researchers at Harvard's Berkman Center aspired, through this flawed method, to create a model of the Iranian blogosphere as 'unique as a snowflake.'  I probably don't need to explain the value of this research, but the social science foundation proposed simultaneously that all humans behave in a similar and predictable manner and also that unique cultural insights can be gained from a model that ignores cultural variation. (If you got lost in that last sentence, you're actually right where you should be. Most research grounded in homophily makes about that much sense.)  So this is where I hope the larger community of scientists, social and data scientists, can have a rigorous debate about how to do better... make better models and concern ourselves with the broader ethical implications.