A time-lapse video of the maps, cycled twice, is available below (best viewed at 720p):
Mood Variations
A number of interesting trends can be observed in the data. First, overall daily variations can be seen (first graph), with the early morning and late evening having the highest level of happy tweets. Second, geographic variations can be observed (second graph), with the west coast showing happier tweets in a pattern that is consistently three hours behind the east coast.
Similar variations were discovered independently by Michael Macy and Scott Golder, and first reported in the talk "Answers in Search of a Question" at the New Directions in Text Analysis Conference in May 2010.
Weekly Variations
Weekly trends can be observed as well, with weekends happier than weekdays. The peak in the overall tweet mood score is observed on Sunday mornings, and the trough occurs on Thursday evenings.
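As a rough illustration of how daily and weekly curves of this kind could be produced, the sketch below averages per-tweet mood scores into (weekday, hour) slots after shifting UTC timestamps to a local time. The function name `mean_mood_by_slot`, the fixed UTC offset, and the sample scores are assumptions made for this example; they are not taken from the project's own code, and a fixed offset ignores daylight saving time.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mean_mood_by_slot(tweets, utc_offset_hours):
    """Average mood scores per (weekday, hour) slot in local time.

    tweets: iterable of (naive UTC datetime, mood score) pairs.
    utc_offset_hours: fixed offset applied to every timestamp
        (an assumption; real time zones also observe daylight saving).
    Returns a dict mapping (weekday, hour) to the mean score,
    where weekday follows datetime.weekday() (Monday=0 ... Sunday=6).
    """
    totals = defaultdict(lambda: [0.0, 0])  # slot -> [sum of scores, count]
    shift = timedelta(hours=utc_offset_hours)
    for timestamp, score in tweets:
        local = timestamp + shift
        slot = (local.weekday(), local.hour)
        totals[slot][0] += score
        totals[slot][1] += 1
    return {slot: total / count for slot, (total, count) in totals.items()}


# Made-up sample data: three tweets with illustrative mood scores.
sample = [
    (datetime(2009, 8, 2, 14, 0), 7.0),
    (datetime(2009, 8, 2, 14, 30), 5.0),
    (datetime(2009, 8, 3, 2, 0), 4.0),
]
by_slot = mean_mood_by_slot(sample, utc_offset_hours=-5)
```

Running the same aggregation with offsets of -5 and -8 hours is one way to compare east coast and west coast curves on a common clock.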
About the Data and Visualization
The plots were calculated from over 300 million tweets (Sep 2006 - Aug 2009) collected by MPI-SWS researchers [1], and are rendered as density-preserving cartograms. This visualization includes both weekdays and weekends; in the future, we will create separate maps for each. The mood of each tweet was inferred using the ANEW word list [2], following the same basic methodology as previous work [3]. County area data were taken from the U.S. Census Bureau, and the base U.S. map was taken from Wikimedia Commons. User locations were inferred using the Google Maps API, and mapped into counties using PostGIS and U.S. county maps from the U.S. National Atlas. Mood colors were selected using Color Brewer 2.
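To make the word-list approach concrete, here is a minimal sketch of one common way to score text against a valence lexicon: take the mean valence of the words that appear in the list. This is an assumption about the general technique, not a statement of the exact procedure used for these maps. The lexicon `TOY_VALENCE` holds invented values on a 1 to 9 scale (the scale ANEW uses) and does not contain real ANEW ratings; `tweet_mood` is a hypothetical helper name.

```python
import re
from typing import Mapping, Optional

# Hypothetical valence values on a 1-9 scale. These are NOT real ANEW ratings.
TOY_VALENCE = {
    "happy": 8.0,
    "love": 8.5,
    "rain": 5.0,
    "traffic": 3.0,
    "sad": 2.0,
}


def tweet_mood(text: str, lexicon: Mapping[str, float] = TOY_VALENCE) -> Optional[float]:
    """Return the mean valence of lexicon words in the text.

    Returns None when no word of the text appears in the lexicon,
    so that unscored tweets can be excluded from later averages.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[word] for word in words if word in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


mood = tweet_mood("So sad about the rain")  # mean of 2.0 and 5.0
```

Returning `None` for tweets with no lexicon words, rather than a neutral score, keeps those tweets from pulling regional averages toward the middle of the scale.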
About Cartograms
A cartogram is a map in which the mapping variable (in this case, the number of tweets) is substituted for the true land area. Thus, the geometry of the actual map is altered so that the shape of each region is maintained as much as possible, but the area is scaled in order to be proportional to the number of tweets that originate in that region. The result is a density-equalizing map. The cartograms in this work were generated using the cart software by Mark E. J. Newman, available at http://www-personal.umich.edu/~mejn/cart.
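The scaling described above can be sketched numerically. The example below computes, for each region, the area it would need on the cartogram so that area is proportional to its tweet count while the total map area stays the same. It only illustrates the target areas; it does not reproduce the diffusion-based algorithm of the cart software, and the function name `target_areas`, the region names, and the numbers are invented for this example.

```python
def target_areas(true_area, tweet_count):
    """Compute cartogram target areas proportional to tweet counts.

    true_area: dict mapping region name to its real land area.
    tweet_count: dict mapping region name to its number of tweets.
    Returns a dict mapping region name to the area it should occupy
    so that the total area is preserved and every region ends up
    with the same number of tweets per unit of map area.
    """
    total_area = sum(true_area.values())
    total_tweets = sum(tweet_count.values())
    if total_tweets == 0:
        raise ValueError("at least one tweet is needed to scale the map")
    return {
        region: total_area * tweet_count[region] / total_tweets
        for region in true_area
    }


# Made-up regions: A is small but tweet-heavy, B is large but quiet.
areas = {"A": 100.0, "B": 300.0}
tweets = {"A": 30, "B": 10}
targets = target_areas(areas, tweets)
scale = {region: targets[region] / areas[region] for region in areas}
```

In this toy case region A triples in area and region B shrinks to a third of its size, after which both regions have the same tweet density.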
Who We Are
We are researchers from Northeastern University and Harvard University, studying the characteristics and dynamics of Twitter.
Alan Mislove, College of Computer and Information Science, Northeastern University
Sune Lehmann, Center for Complex Network Research, Northeastern University
Yong-Yeol Ahn, Center for Complex Network Research, Northeastern University
The most comprehensive list of stories that covered the project can be found here. For more information or media requests, please contact Sune Lehmann.
[1] Cha, M., Haddadi, H., Benevenuto, F., and Gummadi, K.P. Measuring User Influence in Twitter: The Million Follower Fallacy. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media (ICWSM), Washington, DC, May 2010.
[2] Bradley, M.M., & Lang, P.J. Affective norms for English words (ANEW): Stimuli, instruction manual and affective ratings. Technical report C-1, The Center for Research in Psychophysiology, University of Florida.
[3] O'Connor, B., Balasubramanyan, R., Routledge, B., & Smith, N. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. In Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media (ICWSM). Washington, DC, May 2010.