
Commenting on the introduction to Raw Data, somebody noted during section that data has become “singular”. More precisely, it has become a mass noun, a category of noun that is grammatically singular but inherently aggregate in meaning, and that is more usually applied to substances like water, liquid more generally, smoke certainly, but not many solids. (Note the mandatory ‘s’ on that last one.) Interestingly, discrete things like the dollar or the datum can, in sufficiently large aggregate, also become mass nouns: cash and the new meaning of data, respectively. I don’t think it’s too much of a leap to say that there’s a sort of commodification going on when this happens; “I have many dollar bills” is a strange statement, implying individual attention to each one rather than consideration of the cash in aggregate, as a resource. (Similarly, attention is a mass noun, despite the fact, increasingly ignored in the digital world, that it’s far from infinitely divisible.)


But like water, data has begun to be perceived as a fluid or continuous substance; as the Introduction notes, data are “corpuscular, like sand” but also “aggregative.” Sand is of course a mass noun; you can have a bucket of sand, a lot of sand, but not a sand or three sands. The individual grains of sand are not usually considered, any more than molecules of water are. This has odd implications for the social construction of hard drive space in an era of overabundant free space.


In the dark days of scarce storage space, the prevailing metaphor for a hard disk (as seen by the user; we’re being screen essentialists here) was an office. Data (datums) were individual documents; larger organizational structures were files, folders, briefcases and so on. The implication was that free space was a scarce resource, like desk space (note the stigma of a cluttered Desktop even on modern machines), and that data were to be considered individually. Increasingly, this paradigm is being modified; data is more like sand or cash, a continuous resource that occupies and permeates the free space of a storage medium. Meanwhile, free space has also changed, in subtler ways. I asked a few friends today whether free space on a hard drive was more like free space in a closet or free space in a wallet. The more technically minded of them were much more inclined to say “wallet”. Crucially, the more piracy-minded ones were especially likely to take that view.


In such abundance, free space cries out to be occupied. If data is like water, free space is a sponge. But this permeability creates a demand that legal channels cannot fill. To be sure, high-definition movies and high-quality music fill space quickly, but a full terabyte hard drive represents thousands (or hundreds of thousands) of dollars of legal media files. Piracy is often associated with an attitude of obsessive data gluttony. A common joke is that “I heard a song I liked, so I downloaded their discography.” I don’t think piracy creates this attitude, though; I think it’s a response to it. Lawrence Liang and others largely focus on piracy as a response to necessity, such as that imposed by poverty or regional unavailability of cultural content. While this is a strong influence, it can’t explain the discography paradigm of data overkill. By soaking up data, free space licenses and even encourages indiscriminate piracy: Why not download everything available? Free space exists to be filled, after all. In this sense, the permeability of free space implies a permittivity of free space. Together, they propagate a cultural wave of totally accessible media, of Everything That Ever Was, Available Forever at the speed of light.