‘Walking in the city with headphones on’: some thoughts about Music, Big Data & The Harkive Project


I’ve recently become involved with a Pop Music Writing Group as part of my PhD studies at BCU. We meet once a fortnight and respond to an academic paper, or a book chapter, with 2000 words of our own. The purpose is to get us all writing regularly, flexing the muscles so that the task of coming up with 80,000 words isn’t quite so daunting. This is the second piece of work I have produced for the group. It’s a response to a chapter from Michel De Certeau’s 1984 book, ‘The Practice of Everyday Life’, entitled ‘Walking in the City’, in which I begin to explore some ideas around music and ‘big data’ as they relate to my own PhD and The Harkive Project.


Walking in the city with headphones on

Following a recent conversation with friends about general health and fitness, I worked out, using Google Maps, that during the course of my normal, daily life, I regularly walk over 25 miles each week. This total did not include the steps I take around my home or office, or any sporadic forays into the world of sport, but consisted solely of my daily commute to work, which includes a half-mile walk at either end of a bus journey, twice a day, and a daily walk of over 2 miles with my dogs. That I walk the equivalent of a marathon each week during two activities I take entirely for granted was a surprise.

As I trudge through these miles I’m almost always accompanied by music. I listen using my iPhone, with headphones, and normally via my Spotify subscription, which gives me unlimited, mobile access to a large catalogue of songs and to my library of playlists. The only time I’m not listening to music is usually on Saturdays, when the walk I take coincides with a live football broadcast on the radio, which I listen to using the mobile BBC Radio app, also via headphones and using my iPhone.

The interesting thing to consider for the purposes of this essay is that data related to a lot of this activity is either logged, or is capable of being logged, by third parties who can find a use or value for it. My iPhone can report my geographic position and movement, and the songs and/or radio programmes I listen to are logged by the respective media outlets that deliver them. My mobile service provider, O2, as well as Spotify and the BBC, already know a certain amount of personal information about me, including my age, sex, postal address, and bank details, and from there it isn’t too great a leap to understand that it would be possible to cross-reference my listening and geographic activity with other consumer activities I engage in. My data can also be cross-referenced with other information, such as the local weather conditions, or the consumption patterns of others. Further to that, and like many others, I have an online identity that exists in numerous dispersed places, including social networks and in the logs of search engines, which could enable further cross-referential analysis. In short, from just a small element of my normal, everyday life – the activity of walking around the city with headphones on – I am generating a good deal of potentially useful data from which it is possible for organisations to glean valuable information about not just my music consumption, but also about my other habits, opinions and preferences. I am, of course, not alone here; millions of others generate similar data about themselves on a daily basis, and often without any effort to do so on their part.

There are a number of ways one can react to this as an individual: indifference, annoyance, ambivalence, fear and acceptance are all possible emotional responses. Whether, at one extreme, you view this data capture as symptomatic of a culture of surveillance and control consistent with the practices of 21st century Western capitalism, or, at the other, as a harmless and entirely non-intrusive means of media companies improving the quality of the services they offer, it is nevertheless a state of affairs that almost everyone who engages with media (and other) services in a hyper-connected modern world finds themselves implicated in. Extrapolating out from the tiny example of my walks around the city, and viewing the generation, collection and analysis of data on a global scale, we are collectively facilitating and assisting in the creation of millions of bits of data on a daily basis at a rate hitherto unseen in human history.

Due to the scale and voracity of such activity, issues and questions related to data protection, use, monetisation, ownership, access, surveillance, storage and archives are of growing interest to academics in a number of fields (see Housley et al (2014) for an overview). The realm of data is of particular interest to scholars of Popular Music because of its growing influence on matters related to the production, distribution and consumption of music, and it is here that my own area of research intersects with the wider debates.

In very broad terms, my PhD research project provides a mechanism and motivation for music listeners to share with me details of their music consumption, which I then intend to analyse. Clearly, then, by creating, promoting and operating The Harkive Project, I am engaging in very much the same activity I have described above, and in particular it is similar to the activities media companies and rights holders involved in the music industries are currently focussing a large amount of attention and resources on [1]. On a positive note, this has the benefit of making my project timely. On another, more problematic level, it raises a question: If my project is to make an original contribution to knowledge, how do I ensure that it is steered towards something new, something different, and is not in danger of simply replicating, or contributing to, the work and conclusions of those involved with industrial data analysis, both in and outside of the music industries? In other words: How is Harkive different?

In order to begin to explore this problem I’m going to attempt to map the work of Michel De Certeau, and in particular his discussion of walking in the city as a practice of everyday life (De Certeau, 1984, pp. 91–110), on to a discussion of some ideas around music and data, using my own experience of listening to music as I walk as a reference point. For the purpose of this we must first substitute the landscape of popular music consumption for De Certeau’s New York City. Imagine, if you will, not a city built of a network of roads, buildings and people, with laws and regulations governing activity, trade and movement, but one composed instead of an ecosystem of media businesses, music listeners and connections, both on and offline, that has regulatory frameworks of its own, including copyright legislation, pricing models, and so on.

By adapting Jeremy Silver’s (2013) idea of digital city-states, we can understand the larger players in the marketplace (Amazon, iTunes, Facebook, major labels and broadcasters, and so on) as the skyscrapers that dominate the skyline of the city and to which the main routes and thoroughfares carry and direct traffic. The smaller, side-streets lead to the mid-sized buildings (Bandcamp, Soundcloud, independent retailers, media outlets and labels), and the less-trodden paths to the unregulated, or niche areas of the landscape (band sites, messageboards, torrent sites, and so on). With this image in mind, we can then swap De Certeau’s view from the top of the World Trade Centre for the view afforded by the collected and collated data about music consumption (sales, streams, searches, social media metrics, and so on), which ‘makes the complexity of the city readable, and immobilizes its opaque mobility in a transparent text’, and for the people walking in the streets of New York, far below, we can instead see the music listeners navigating their way from song to song, service to service, within ‘the city’.

In this context, the data generated and collected by my music listening whilst walking can be understood as an infinitesimally small element of the texturological picture of music consumption practices that is created by music listeners daily. Along with millions of others, I am the co-writer of a ‘poem’ I cannot read; I am (we are) ‘the individual in the mass that is read by the all-seeing eye as representational of the individual’. As individuals we could perhaps find this problematic – it depersonalises us; it is an affront to our idea of self. But is there also a problem with applying such a logic to the arena of music consumption, where the idiosyncrasies of taste and other drivers so heavily influence our choices in listening to the music we do? I shall return to this question later in the essay.

We can also understand the design and development of the present landscape of consumption in terms of De Certeau’s idea of the city, which produces its own space by repressing that which could compromise it, creates systems to suppress tactics of opportunities, and creates universal and anonymous subjects; ‘a finite number of stable, isolatable, and interconnected properties’. If we consider, for example, the service I use on my daily walks, Spotify, as something which grew out of a response to disruptive digital technologies (piracy, in other words), then De Certeau’s model could easily be deployed here: the ecosystem of music consumption reorganised by the establishment of new ‘loci of exchange’ (Burkart and McCourt, 2006) in response to citizens not sticking to the designated pedestrian zones of the city, for instance.

According to De Certeau’s model, the concept of the city must always attempt to make the fact of the city fit its model. Even if ‘linking the city to the concept never makes them identical ... it [nevertheless] plays on their progressive symbiosis’. The data gold-rush and the continued drive for and investment in the creation of music discovery platforms [2] is a case in point here. We can track and ‘predict’ consumption with data, therefore listeners must consume according to this data via music discovery platform recommendations, which completes the circle. As Simon Frith observed, long before the advent of the age of Big Data, ‘the culture industry is the central agency in contemporary capitalism for the production and satisfaction of false needs’ (Frith, 1981, pp. 44–45), and in that sense, data can be seen as merely the latest logical step in the process of standardisation and rationalisation in popular music that was so heavily criticised by Adorno (Adorno and Simpson, 1942).

Yet, in spite of this, and just as the work of sub-cultural theorists (Hebdige, 1979) and sociologists examining music in everyday life (Bull, 2000; DeNora, 2000) who built upon Adorno et al attempted to show, there is hope to be found also in De Certeau’s model: ‘beneath the discourses that ideologise the city, the ruses and combinations of powers that have no readable identity proliferate’.

Just as it is in the city, so it is in music…

…and it is perhaps here where a glimmer of hope appears. The driving premise of Harkive from its very inception was the idea (my assumption) that, ‘No two people listen to music in precisely the same way’. If that is indeed the case, and De Certeau’s model would suggest that it is more than mere possibility, even in spite of growing and efficient rationalisation through data, then it follows that a reliance and focus on data alone is a flawed approach. Indeed, this idea is explored by Lazer et al (2014) in their caution against any creeping ‘Big Data Hubris’ in academic enquiry. I would argue that similar caution should be exercised by those operating at an industrial level.

Can data, for instance, ever fully account for what De Certeau refers to as ‘practices of space’ – the elusive movements of walkers in a city (for which we can read, music listeners)? A possible hypothetical aim, function or argument of Harkive, then, would be to argue that it cannot. This is not to say that data is without merit, of course, and, indeed, to test out such a hypothesis would require a methodology that mapped Harkive’s data against industrial data in order to challenge or disprove the conclusions drawn. It would also be one that simultaneously built on and challenged existing scholarly ideas around music consumption. However, whilst challenging existing ideas within the academy is the function of a good academic, a potential danger in identifying flaws and under-attended areas of industrial practice in the music industries would be that I provide a means for their reification. As I hint at below, however, neither the academic nor industrial process can ever be complete. The only guaranteed outcome in both cases would be more questions.

There is a great deal more to be made of this reading of De Certeau’s work, I feel: the idea that footsteps (listens, plays) are ‘an innumerable collection of singularities’; that walking (listening) is a speech act; that it has a grammar and a rhetoric of its own, and so on, lend themselves as ideal models for exploring music consumption in the context of data. Unfortunately, space does not permit me to explore them here, and in any case this line of thought would require a considerable amount of further development before anything concrete might emerge. However, from this explorative start there are perhaps the beginnings of a position of my own, the kernel of an argument. In closing, I will attempt to sketch out some related areas that might be included in such a development.

As hinted at above, the ideas considered provide numerous routes back to themes explored by popular music scholars over the years, and thus to possible areas of new knowledge in the context of modern developments:

  • If big data can be seen, for example, as representative of the latest logical step in the march of ‘reason’, and if the actuality of listening that reason can never sufficiently explain is ‘material reality’, then the presence of Adorno looms large;
  • Just as Sterne (2012) and Milner (2010) have pointed out in their explorations of the development of recording and audio technologies, the idea that a recording, however advanced, can capture a true representation of reality is fundamentally flawed – for Sterne (2006), the gaps between the zeros and ones in digital recording, its flaws, in other words, are where the interesting questions lie;
  • The affordances of zeros and ones are exactly what the service offered by Shazam uses to do its work. It now accounts for 10% of all digital music sales [3] and is heavily influencing music production and distribution through the monetisation of its data. It represents a further rationalisation of process in the music industries, yet a similar service, HitPredictor, armed with granular data and an algorithm which analyses the ‘hit potential’ of a song, entirely failed to predict the success of All About That Bass, one of the biggest hits of 2014. Building on Sterne’s observation above, is it possible that the failures and blind spots of Big Data are more interesting than its successes?;
  • I’m aware that I have completed a 2,500 word essay entitled ‘Walking in the city with headphones on’ based on a theory of everyday life, and only briefly mentioned a number of key studies in the field of popular music and everyday life, notably Michael Bull’s ‘Sounding Out The City’ (2000) and Tia DeNora’s ‘Music In Everyday Life’ (2000). Both studies pre-date the current developments in Big Data (although Michael Bull did update his study in 2006 to include a consideration of the rise of the iPod). The opportunity to build on both pieces of work to include digitally delivered music, big data and social media (and the idea – another assumption of mine – that the relationship between the acts of ‘listening to music’ and ‘communicating about music’ is evolving) would be another potentially fruitful route that my project could incorporate.


Adorno, T.W., Simpson, G., 1942. On popular music. Institute of Social Research.

Bull, M., 2000. Sounding out the city: Personal stereos and the management of everyday life. Berg Publishers.

Burkart, P., McCourt, T., 2006. Digital music wars: ownership and control of the celestial jukebox. Rowman & Littlefield, Oxford.

DeNora, T., 2000. Music in everyday life. Cambridge University Press, Cambridge.

Frith, S., 1981. Sound effects: youth, leisure, and the politics of rock’n’roll.

Hebdige, D., 1979. Subculture: the meaning of style. Methuen, London (etc.).

Housley, W., Procter, R., Edwards, A., Burnap, P., Williams, M., Sloan, L., Rana, O., Morgan, J., Voss, A., Greenhill, A., 2014. Big and broad social data and the sociological imagination: A collaborative response. Big Data Soc. 1, 2053951714545135.

Lazer, D.M., Kennedy, R., King, G., Vespignani, A., 2014. The parable of Google Flu: Traps in big data analysis.

De Certeau, M., 1984. The practice of everyday life. University of California Press, Berkeley.

Milner, G., 2010. Perfecting sound forever: the story of recorded music. Granta, London.

Silver, J., 2013. Digital medieval. Xtorical Publications Media.

Sterne, J., 2006. The mp3 as cultural artifact. New Media Soc. 8, 825–842.

Sterne, J., 2012. MP3: The meaning of a format. Duke University Press.

[1] The most high-profile recent example of this is the acquisition of MusicMetric, a firm specialising in music data collection and analysis, by Apple in a deal reported to be worth $50M. http://musically.com/2015/01/21/apple-buys-musicmetric/. Whilst the reasons for the purchase have not been made public by either party, industry experts have speculated that MusicMetric will be incorporated into the relaunch of the Beats Music service, which Apple acquired in 2014.

[2] For an overview of the manner in which data is having a growing influence on industrial practice in the music industries, see http://www.theatlantic.com/magazine/archive/2014/12/the-shazam-effect/382237/

[3] The figure was reported by the BBC in January 2015, but it should be noted that Shazam itself was the original source of the figures: http://www.bbc.co.uk/news/technology-30917477