Excel, PowerPivot and Data Mining Twitter
Nov 17, 2010 By Karsten Januszewski
Out of the box, The Archivist (our Twitter archival and analysis service) provides six visualizations of sliced Twitter data. However, there are some visualizations that The Archivist doesn’t provide.
For example, consider an archive on the term soundcloud with 388,000+ tweets. The Archivist will show you the top 25 users who tweeted about soundcloud. I can learn from The Archivist that the user who tweeted the most about soundcloud was top100djsgirls. But what if I wanted to look at just the tweets from top100djsgirls? The Archivist can’t do that.
Similarly, The Archivist provides a tweet volume chart. I can see that the peak day for the term soundcloud over the last four months was in mid-August of 2010. But what if I’d like to look at just the tweets from that high-volume day?
Or what if I’d like to see a pie chart that shows the distribution of tweets by language? How about looking at the distribution of tweets over time, filtered on a given user?
Pulling an archive into Excel and using PowerPivot to slice and dice the data makes it possible to answer these kinds of questions (and more). PowerPivot is designed for working with large datasets (100,000+ rows) inside Excel, so it’s perfect for mining large tweet archives.
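To make those questions concrete, here is a rough sketch of the same slices in Python, using only the standard library. It is an illustration, not how The Archivist or PowerPivot works internally. The sample rows, the dj_example user, and the column names (user, language, created_at, text) are all assumptions; a real exported archive may be laid out differently.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The column names are assumptions,
# not The Archivist's actual export format.
SAMPLE = """user,language,created_at,text
top100djsgirls,en,2010-08-14,new mix on soundcloud
top100djsgirls,en,2010-08-14,another soundcloud set
dj_example,de,2010-08-14,soundcloud ist toll
dj_example,en,2010-09-01,listening on soundcloud
"""

tweets = list(csv.DictReader(io.StringIO(SAMPLE)))

# Just the tweets from one user.
by_user = [t for t in tweets if t["user"] == "top100djsgirls"]

# Find the highest-volume day, then keep only that day's tweets.
per_day = Counter(t["created_at"] for t in tweets)
peak_day, peak_count = per_day.most_common(1)[0]
on_peak_day = [t for t in tweets if t["created_at"] == peak_day]

# Tweet counts per language (the numbers behind a pie chart).
by_language = Counter(t["language"] for t in tweets)

# One user's tweets over time.
user_over_time = Counter(t["created_at"] for t in by_user)
```

Each of these is a filter or a group-and-count over the rows of the archive. PowerPivot lets you do the same kind of filtering and grouping interactively inside Excel, without writing any code.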
In the screencast below, I’ll walk you through exactly how to do this: