
Category Archives: Hunter’s section

One thing that frustrates me about discussions of “big data” is that they often take for granted that big data solutions (especially to questions of consumer preference) are universally effective. To me, they often seem quite terrible: Greg Linden’s characterization of early Amazon recommendations as “going shopping with the village idiot” doesn’t feel that far from the present. For example, I recently bought some tupperware on Amazon. Now my Amazon homepage recommends that I buy 10 different kinds of tupperware, because apparently I am a tupperware-hoarding monster. Facebook’s personalized ads seem equally silly (either directly related to a page I “like”, or seemingly random). These often seem like over-engineered solutions, caught up in hype and the techno-utopian idea that “if we have enough data, the answer will become clear!” The Mayer-Schonberger and Cukier reading gets at this too: is it surprising that the frequency of previous incidents and cable age are the strongest indicators of which manholes might explode? So, my first question is: will big data make us dumb (or has it already)? If we worry more about the what than the why and treat big data as the proverbial hammer that makes every problem look like a nail, what are we losing?

On a slightly different note, one thing that makes me so uncomfortable about big data is its constant use of classification. Machine learning — responsible for so many of the algorithms we think of as “big data” — is most often about categorization: the emergence of categories and clusters from a data set in “unsupervised” learning, or the fitting of data to a set of predefined and labeled categories in “supervised” learning. Of course, even in the unsupervised mode, these categories never simply “emerge” — they depend on the features we use to describe our data, which come with certain assumptions about what might be important about it. We can classify consumers into different groups based on their preferences, and this might help some corporation sell more products, but these categories all require a mode of thinking that normalizes some behaviors and marginalizes others. The “urban tribes” project from UCSD’s computer vision program is a great example of this trend (and very much an instance of the big-data-as-hammer logic): why not just categorize people based on their looks? The blatant potential for work like this to reinforce damaging stereotypes almost goes without saying, but hey, look what big data can do!

Fast Company’s response is pretty good.
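To make the earlier point about features concrete, here is a minimal sketch (assuming Python with NumPy and scikit-learn; the “consumers” and their three features are entirely made up for illustration): the same people fall into quite different “emergent” clusters depending on which features we decide are worth measuring.

```python
# Toy illustration, not any real recommender: the clusters that "emerge"
# from unsupervised learning depend entirely on the features we choose.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# 200 hypothetical consumers described by three invented features:
# [monthly spend, kitchenware purchases, book purchases]
consumers = rng.integers(0, 50, size=(200, 3)).astype(float)

# Cluster the same people two ways: by how much they spend...
by_spend = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(consumers[:, [0]])

# ...and by what they buy.
by_taste = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(consumers[:, [1, 2]])

# Low agreement between the two groupings: the categories came from our
# feature choices, not from the data "speaking for itself."
print(adjusted_rand_score(by_spend, by_taste))
```

Swap in a different set of made-up features and you get yet another segmentation, each one normalizing whatever it happens to measure.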

I thought the lecture on Monday was particularly interesting, as it got me thinking about why we feel the way we do about being surveilled. Companies like Netflix and Amazon use big data to track us and pinpoint our taste. Many consumers definitely feel violated by this tracking, but some, like me, enjoy finishing an episode of Parks and Recreation and getting the recommendation to watch The Unbreakable Kimmy Schmidt based on our taste. However, when this surveillance goes from private to public, from companies like Netflix to the US government, more and more people feel violated by it. Although the data is gathered differently (from tracking your clicks to monitoring phone calls, even visually observing via drones), it can provide equally intimate information. People often enjoy having their internet activity guided and continue to use Amazon and the rest even after discovering their use of big data, but in situations like the one portrayed in Citizenfour, in which the extent of government knowledge is revealed, there is immediate backlash. We are equally minimized into statistically predictable minorities by all the entities we pour our information into, so why is this ok when it involves the TV shows we watch or books we buy, and a violation when it comes to our mundane phone calls or routes to work?

In the reading about the post-digital, Cramer discusses how digital really just means that something is divided into discrete units. I think it’s interesting to think about that, because I never thought of digital in that way. Now it makes sense; I guess I just never even had a good definition of what digital was. Good thing we figured that out by the last week of Digital Media.

The author then says that technically anything aesthetic is by definition analog, since our senses only perceive non-discrete signals. I think that’s something interesting to consider. The first thing that comes to mind is that classic duality of light we learn about in middle school, where light is both a (continuous) wave and a (discrete) particle. Similarly, you can see how our eyes have rods and cones, which are discrete structures that perceive light. Obviously, Cramer only mentions this as a technicality, but it’s interesting to consider in what ways aesthetics differs between the continuous and discrete domains.
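To make the “divided into discrete units” idea a little more concrete, here is a tiny sketch of an analog-to-digital conversion (plain Python; the sample rate and bit depth are deliberately coarse numbers I picked just for illustration): sample a continuous wave at discrete moments in time, then round each sample to one of a fixed set of levels.

```python
# Toy analog-to-digital conversion: discrete in time (sampling) and
# discrete in value (quantization).
import math

SAMPLE_RATE = 8   # samples per "second" (deliberately coarse)
LEVELS = 256      # 8-bit quantization

def analog(t):
    """Stand-in for a continuous signal, defined at every instant t."""
    return math.sin(2 * math.pi * t)

samples = [analog(n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
digital = [round((s + 1) / 2 * (LEVELS - 1)) for s in samples]

print(digital)  # e.g. [128, 218, 255, 218, 128, 37, 0, 37]
```

Digital, in Cramer’s sense, just means both of those discretizations have happened somewhere along the way; what our eyes and ears finally receive is the smoothed-out, analog end of the chain.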

If you look at a display, you can see that all the little lights are arranged in a grid pattern.


It’s sort of cool to think about how all the pictures we see on digital monitors are created from a grid of little lights like that, and to what extent it matters whether or not it is discrete. People care a great deal about screen resolution and whether or not they can see the pixels in an image. It’s also cool to look at how an animated movie is rendered: a ray-tracing algorithm literally divides the screen into a grid of pixels, shoots a ray through each pixel into the 3D scene, and bounces it around until it figures out what color that pixel should be. It is the epitome of a discretized aesthetic.
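Here is a drastically simplified sketch of that per-pixel loop (plain Python; the single hard-coded sphere, the tiny 40x20 “screen”, and the lack of any bouncing or lighting are all my own simplifications for illustration): one ray per pixel, one color decision per ray.

```python
# Bare-bones version of the per-pixel idea: for each pixel in a discrete
# grid, shoot one ray from the eye into a scene containing a single sphere
# and mark the pixel by whether the ray hits it. Real renderers add
# bounces, lighting, and anti-aliasing; this only shows the grid.
WIDTH, HEIGHT = 40, 20
SPHERE_CENTER, SPHERE_RADIUS = (0.0, 0.0, 3.0), 1.0

def hits_sphere(dx, dy, dz):
    # Ray from the origin in direction (dx, dy, dz); standard quadratic
    # test for intersection with the sphere.
    cx, cy, cz = SPHERE_CENTER
    a = dx * dx + dy * dy + dz * dz
    b = -2 * (dx * cx + dy * cy + dz * cz)
    c = cx * cx + cy * cy + cz * cz - SPHERE_RADIUS ** 2
    return b * b - 4 * a * c >= 0

for row in range(HEIGHT):
    line = ""
    for col in range(WIDTH):
        # Map the pixel to a direction through a virtual image plane at z = 1.
        x = (col / WIDTH) * 2 - 1
        y = 1 - (row / HEIGHT) * 2
        line += "#" if hits_sphere(x, y, 1.0) else "."
    print(line)
```

Run it and you should get a crude ASCII circle: a picture that exists only as a grid of discrete per-pixel decisions.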

 

Pretty fun stuff. So you do this per-pixel thing for every frame, and then play all the frames really quickly so that it looks smooth. So basically you combine a bunch of colored squares to look like a coherent picture, and a bunch of pictures to look like a coherent movement. This would obviously be characterized as digital. But then, apparently, since the light from this motion picture travels to our eyes as waves, it is called analog.

It’s cool to step back and look at something like a flip-book animation and see a whole bunch of pencil drawings that act as the images. You lose the per-pixel operation, but still have the motion simulation. This might feel a little less digital to most people.

In conclusion, it seems like the idea of something being digital has a lot to it, and there is a lot of gray area in deciding what is digital and what is not. Do we use technical definitions or not? Is there a specific time to use a technical definition, and how do we know when that time is?

On Thursday of last week, I contributed to a Wikipedia page for the first time. I eagerly checked in over the following days to see whether my additions would remain or be torn apart by the community, and for better or worse, it seemed I managed to fly under the radar. However, upon visiting the “Talk” discussion regarding my page, I found an interesting note. In a yellow box, the text alerted me that “This article is currently or was the subject of an educational assignment.” I followed this to discover Wikipedia: Student Assignment.

The page is dedicated entirely to the practice of editing Wikipedia pages for classroom assignments; in other words, it describes exactly what I had just done. The page opens with the suggestion that students can actually do more harm than good for the site, which led me to read on and understand how and why students would be less able to edit properly than regular Wikipedia contributors, whoever those people are. The page is broken down into advice for students, instructors, editors, and ambassadors, each with a strict reminder of Wikipedia’s policy guidelines and style. This made me nervous about my own work and whether I met the standards of an overwhelming community. Furthermore, the article took a clear stance on the structure of assigned classwork and how it should respect Wikipedia’s goals and aims (i.e., do not set arbitrary word requirements when Wikipedia prizes brevity).

The question I would like to pose this week is: why doesn’t our class have a course page for this Wikipedia assignment? The Student Assignment article makes a reasonable case for a course page, which would allow Professor Chun and the TAs to ensure that our Wiki-contributions are made with proper respect for Wikipedia’s guidelines. At the least, I feel students who are given this assignment would benefit from reading the Student Assignment page so that they understand the risks they run from improper editing or from using their real names, for example. Maybe we do not have a course page so that our activity could feel more free throughout the site, but by now we know better than to simply accept freedom.

While more and more algorithms are being produced to help make sense of big data and find causalities within it, I don’t think there will ever be a complete set of algorithms that can counter the free will of the people these formulas are processing. The very motivators, such as curiosity, that draw humans to find patterns and make meaning of the world will always be a variable that we irrational beings carry with us. If big data does ultimately have the answer, I think many people can or will do something different out of spite, tiring the predictions to death. The only way I can see out of this loop is enslavement via mind-control devices that eliminate all free will; only then would corporations have true control to aid in processing the data they collect for their benefit. I feel this tangent is related because if big data is moving our society toward the predictability of a mind-control device, then there won’t be any need to predict; we can just be what they want us to be. But in that same way, why can we not create an optimization of the world that allows for total success without enslavement? I guess it is the irrationality of people’s actions that dispels this. Crazy people.

As we wrap up our lectures this week and tie together some of the course’s overarching ideas, I was struck by the many distinctions Big Data completely obliterates – the difference between public and private, visible and invisible, the window of the interface and the ‘real’ world, a subjective ‘you’ as opposed to a collective ‘you.’ Mayer-Schonberger’s basic definition of Big Data is “things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value.” Data may be “big” in size, but it formulates insights that are very “small” as it, for example, zeroes in on one particular user’s preferences and “suggested purchases” on Amazon out of millions of customers. The haystack may be big, but the data always finds the needle.

Perhaps…

Suddenly, personal preferences don’t seem so unique to any given individual. I am grouped in with tons of other needles in the haystack who may be interested in listening to the Beatles just because we listened to the Rolling Stones. It’s not so easy to say that my public life is suddenly private; rather, my life is simultaneously private and public, belonging to me and to millions of others like me without need for distinction. In the end, Big Data still gets the job done, so I don’t really feel the need to grasp at a distinction between public and private any longer.

The same goes for the distinction between feeling visible and, at the same time, invisible on the Internet. Big Data may make my preferences visible, uncovering them from deep within the haystack. In doing so, I as a unique individual am not just made invisible as I’m grouped in with other needles, but completely obliterated. Just as Mayer-Schonberger mentions the loss of specific estimation as a quantity increases (e.g., it’s never one million…and one dollars), I am lost from sight among the masses. Suddenly, ‘one million people plus me’ equates to just ‘one million.’ At the same instant, Big Data is both watching and not watching you, largely because the subjective ‘you’ no longer holds any meaning distinct from the collective ‘you.’


My following question may seem arbitrary and nominally nit-picky, but it’s one I can’t shake after thinking through the above distinctions. Is Big Data really “big”? Could we not, at the same time, call it “Small Data”? Or “Data That Is Both Big And Small”? That’s definitely a bit more of a mouthful.

Most likely due to the fact that I study public health, the examples in the Mayer-Schonberger essays that resonated with me the most were those related to health issues, namely Google Flu Trends and the collection of vital-sign data from patients. When bodily functions and disease are quantified in such a way, it is difficult to avoid criticism of these practices from the perspective of biopower. Population behavior is assessed and regulated.

When big data is used for biological purposes, the issue of time must be considered. Analyses such as Google Flu Trends’ tracking of H1N1 occur more retrospectively (that is, during or after a pandemic) than prospectively. A person’s vital signs are taken once they have been admitted to the hospital, not preemptively.

The most prospective example was Target, which successfully identified a woman’s pregnancy and advertised relevant products to her. The capitalization upon this prediction likely affected the consumer decisions she made while pregnant (and possibly those that occurred after her child was born). There was a greater temporal distance between prediction and outcome, something I suspect will have much more significant implications for big data and its biological role in future years.

Rather than merely predicting immediate outcomes, at what point will big data be able to reach into an uncertain future in order to predict and regulate behavior years in advance? And how do our identities tie into these predictions (e.g., a pregnant girl whose place of residence, income, and other factors are likely discernible from data she unknowingly produces and may correlate with how she will raise her child)?

Finally, although biopower typically concerns state actions, Mayer-Schonberger’s examples demonstrate that a great deal of big data analysis is conducted by private corporations. Of course, companies such as Google are not completely separate from the state, but their activities appear to be motivated more by profit than by the subjugation and regulation of populations. But as the possibilities (and limitations) of big data are beginning to illustrate, a synchrony between profit-making and regulation may be necessary in the age of big data.

 

Florian Cramer’s discussion of the “post-digital” left me a bit unsettled. It isn’t that I haven’t felt unsettled in the course to date, but his challenging of many (or most?) of the topics we have studied comes just as I am trying to mentally wrap up what I have learned and how I can apply it outside of this academic framework. To have so much of it called into question complicates matters for me.
I think the single most interesting takeaway for me is that he called into question many of the dichotomies we have spent so much time defining and dissecting. Principal among these is the distinction between “old” and “new” media and between what is “digital” and what is “analog”. He characterizes our colloquially “digital” devices as in truth hybrids, stating that “Most ‘digital media’ devices are in fact analog-to-digital-to-analog converters,” going on to cite the example of the MP3 player, which many of us consider digital rather than analog (709). Indeed, his questioning of our need to define these boxes as separate entities is rather poignant. In discussing the example of the hipster typewriter at the end of the piece, he argues that, rather than making a bold move of ironic anachronism, the subject of the photo (and of its ridicule) chose the best tool for the job:

The alleged typewriter hipster later turned out to be a writer who earned his livelihood by selling custom-written stories from a bench in the park… Knowing the whole story, one can conclude only that his decision to bring a mechanical typewriter to the park was pragmatically the best option. Electronic equipment (a laptop with a printer) would have been cumbersome to set up, dependent on limited battery power and prone to weather damage and theft, while handwriting would have been too slow, insufficiently legible and lacking the appearance of a professional writer’s work.
Had he been an art student, even in a media arts program, the typewriter would still have been the right choice for this project. This is a perfect example of a post-digital choice: using the technology most suitable to the job, rather than automatically ‘defaulting’ to the latest ‘new media’ device. It also illustrates the post-digital hybridity of ‘old’ and ‘new’ media, since the writer advertises (again, on the sign on his typewriter case) his Twitter account “@rovingtypist”, and conversely uses this account to promote his story-writing service. (709)

This has made me question many of the other distinctions we have made. Those that come first to mind are capture/surveillance, labor/leisure, and public/private. Let’s take the first one: surveillance and capture. We imagine these spheres as somewhat separate (or at least that they used to be). But it is also possible to conceive of them as being more intertwined than they have been given credit for. For example, we still conceive of much of capture as an online “paper trail”, hearkening back to the fact that in the past there have been traces of events that could be used ex post to figure out information without ever interfacing with the subject, and perhaps without regard for the original intent of the capture. Indeed, many forms of capture come to mind (birth/death certificates, report cards, medical records) that are akin to the methods Facebook might use to capture events (name, date, name of event, and some form of documentation such as a photo or a signature) and to the logs a website may use to track your experience. This isn’t to say that there has been no change as a result of recent technological developments, but to say we have moved from a “disciplinary society” to a “control society” implies an absoluteness I am uncomfortable defending, especially after reading Cramer’s piece.
I could devote an essay to any of these distinctions (and indeed, I may do so very soon), but in the meantime I am looking forward to a conclusion to the lectures and one final section where we might be able to dig into these distinctions we have made.

One of the most striking aspects of this week’s discussions reminded me of conversations I have seen online concerning “polite politics”. This discussion centers on how marginalized groups attempt to abolish oppression in all its forms in hopes of securing fundamental liberties and privileges they have been excluded from. Polite politics tends to argue that to not be oppressed, marginalized groups must act in a peaceful, submissive manner to appeal to the oppressive groups. They have to make the oppressors feel unthreatened to prove they are worthy of not being oppressed. This argument was described in Beltrán’s essay concerning the DREAM Act and how the protestors went about demanding their freedoms. I was very much in favor of her criticism of routes to liberation that rest upon proving the individual’s self-worth rather than criticizing those doing the oppressing. I think this topic is particularly interesting in the way it puts the onus on the oppressors rather than the oppressed. The oppressed, I feel, should shift the focus of their situation to highlight their stories, but also to highlight the fact that majorities are acting wrongly, independent of the type of person they are oppressing. This harks back to similar ideas another theorist discusses in terms of visibility: if we rely on visibility as the standard of objective knowledge, we also rely on invisibility as a lack of knowledge and the unknown. In the same sense, by demarcating reasons somebody should not be oppressed, we also imply that there are reasons people should be oppressed.

In the past few lectures, we’ve been carefully examining the concept of “coming out” as well as the conceptions of “public” and “private” in relation to the internet. Before reading Beltrán’s piece, I hadn’t heard of the DREAM Act, and I found it interesting that DREAM activism is the complete opposite of the form of protest associated with Anonymous. While Anonymous is decentralized and has no strictly defined philosophy, all DREAMers share the same goal of becoming US citizens. In addition, while it is hard to see the individuals behind Anonymous protests, DREAM activism humanizes the people behind the protests. Contrary to the internet’s early years of anonymity, these young protestors use social media to expose their identities and give the movement legitimacy. As Beltrán puts it, they use a “participatory politics that rejects secrecy and criminalization in favor of more aggressive forms of nonconformist visibility, voice and protest.”