By Zachary M. Seward
Editor’s Note: Zach is outreach editor for The Wall Street Journal, where he helps manage the newspaper’s relationship with companies like Twitter and Foursquare. Below, he explains one way that he makes use of those and other services.
The image below is my New York. More precisely, it’s a heatmap of where I spent my time in 2010 as documented by my activity on Foursquare, a location-based social network. Perhaps you can guess where I live, where I work and which baseball team I prefer.
But I never intended to create this map. I just used Foursquare normally, checking in 1,491 times over the course of the year, and ended up with this wealth of data. Generating the map was easy, thanks to a tool by programmer Steven Lehrburger.
Lifelogging has been around since at least Benjamin Franklin, but digital technology transformed the practice, allowing obsessive types to record, store and visualize every detail of their lives, from sleep cycles to eating habits. The goal? Ultimate self-awareness and reflection. “We’ve arrived at a time when the memory of machines creates ideas we’ve never considered,” Clive Thompson declared in a cover story about lifelogging for Fast Company in 2006.
I dig that notion but would never wear a Fitbit (to track every step I take) or use a service like Moodscope (to log my emotions). I just want to do my thing while passively self-quantifying. For instance, I spend money to satisfy needs, but because Mint has access to my bank accounts, the service can tell me how much I spent on food each month in 2010:
I moved to New York a year ago, and those first few months were, I began to recognize, extravagant. So I imposed a period of relative austerity that you can see had mixed results.
Personal finance is an age-old craft, but the difference here is that I did almost no additional work to compile a comprehensive dataset of my income, spending and investments. I can augment that with other, passively collected information: Foursquare tells me that I checked into the Italian restaurant around the corner from my apartment 19 times this year.
With enough of this data, I’m composing an image of myself outside of my self. I could tell you about my favorite music (“Oh, anything but country, really”), or we could look at this:
Those are the 10 artists who burned up my headphones in 2010 as recorded by Last.fm, a music service I’ve asked to, essentially, eavesdrop on me. Any song I listen to with iTunes, Hype Machine or Last.fm itself is automatically “scrobbled,” to use the site’s term. I like that word and think it should apply to all passive lifelogging (e.g., “TripIt has been scrobbling my business trips for years, so it should be easy to recommend a hotel in San Francisco”).
The scrobble is what bridges the gap between me and Nicholas Felton. He famously compiles a beautifully designed “annual report” of his life with data that includes every alcoholic beverage he drinks and the mode of transportation every time he moves from one place to another. Some of that is compiled with the help of digital tools, but the rest requires a devotion to self-quantification that will never gain broad adoption. Kevin Kelly predicted in 2007 that lifelogging “will become as pervasive as text is to us now,” and while we’re still far from that vision, the intervening years have brought us, promisingly, the mainstream scrobble.
My scrobbles are mine; your scrobbles are yours; together, they look something like this:
Beach House was pretty big this year, and that’s a timeline chart of all Last.fm users listening to the band (gray) compared to my listens (red). Obviously, my interest in Beach House did not survive the hype. With other artists, like Rihanna and Arcade Fire, my listening habits closely conformed to those of the broader public. The Web is often said to be pushing us toward a collective identity, but with enough data, individuality shines.
I live in New York, Manhattan, Upper Manhattan, Greater Harlem, West Harlem or Manhattanville, depending on how particular we want to get. But no label is as specific as the individual data points of my Foursquare activity, which dot the area like a fingerprint. Here are my check-ins plotted, in Google Earth, on top of a heatmap of Harlem’s black population:
I’m white, and I’m much more likely to travel down the largely white Upper West Side than across areas of Harlem with higher densities of African Americans. (Of course, that’s largely determined by the commercial landscape of Manhattan, my commute to work and where my friends live.) Census data can describe the segregation of my block, but how about telling me how segregated my life is? Location data points in that direction.
But most of my life isn’t spent on the move. It’s spent online, which makes passive data collection even easier. I don’t use any applications for logging computer use, but several services I rely on heavily return data about my usage. (That should be a standard feature, shouldn’t it?) If you’ve opted into Google Web History, it will tell you when you search:
And that one puzzles me. The parabolic curve suggests seasonal variation, but I’d like to see if it holds true over 2011, as well. In any event, I don’t really know how to interpret a decline in search activity, and my best guesses are a little personal for even this confessional post.
But personal stuff is, of course, the ripest area for analysis, which is why I was thrilled when, earlier this year, Princeton’s Bill Zeller released Graph Your Inbox, an extension for Google Chrome. The tool lets you graph the frequency of words, email addresses, subject lines, ex-girlfriends, unpaid bills or whatever else is hiding in your inbox. Of course, I immediately typed in two of the most common curse words and got these results over the past five years:
Oh no. My cursing — and, for that matter, the cursing of those with whom I correspond — has been steadily declining since I took a respectable job in the fall of 2008. That result was genuinely surprising, which is not always the point of self-quantification. Sometimes I’m just looking for confirmation of what I already know, like when I tweet (visualized by TweetStats):
Examining hourly data is, in many ways, more interesting and useful than looking at my life over the stretch of a year. It’s especially so when multiple datasets are compared. So while all of the above visualizations were generated without any programming, I had to lightly wrangle some data for this one. I grabbed my hour-by-hour usage statistics from Twitter, Last.fm, Google’s search engine, and Google Reader; then, I tried my hand at graphic design:
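For anyone curious what that light wrangling might look like, here is a minimal sketch in Python. It is not the process actually used for the chart. It assumes each service can hand over a plain list of ISO-formatted timestamps, and the names `hourly_share`, `events` and `profiles`, along with the sample timestamps, are made up for illustration. Each service is reduced to 24 hourly shares so that a heavy habit and a light one can sit on the same scale.

```python
from collections import Counter
from datetime import datetime


def hourly_share(timestamps):
    """Return 24 values: the fraction of events that fall in each hour of the day."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [counts.get(hour, 0) / total for hour in range(24)]


# Made-up sample data, for illustration only; real exports will look different.
events = {
    "tweets": [
        "2010-03-01T09:15:00",
        "2010-03-01T09:40:00",
        "2010-03-02T14:05:00",
        "2010-03-02T22:30:00",
    ],
    "listens": [
        "2010-03-01T10:02:00",
        "2010-03-01T10:06:00",
        "2010-03-01T17:45:00",
    ],
}

# One 24-hour profile per service, each summing to 1.0.
profiles = {name: hourly_share(stamps) for name, stamps in events.items()}

for name, profile in profiles.items():
    busiest_hour = max(range(24), key=lambda hour: profile[hour])
    print(f"{name}: busiest hour is {busiest_hour}:00")
```

Turning raw counts into shares is a design choice: it keeps the service with the most events from swamping the others when the curves are drawn together.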
I feel like my day starts at 6:30 a.m. (if only my teapot collected usage data), but it’s pretty clear that my day doesn’t begin in earnest until around 9 a.m., when I arrive at work. There’s also a certain rhythm in which these activities spike. Am I less productive in the middle of the day? Well, look, that’s when I’m more likely to be in meetings and not using any of these services, which are mostly part of my desk routine. But that’s the stammering of an anecdotalist. The truly quantified self would just say, we need more data.