Excel, PowerPivot and Data Mining Twitter
Nov 17, 2010 By Karsten Januszewski
Out of the box, The Archivist (our Twitter archival and analysis service) provides six visualizations of sliced Twitter data. However, there are some visualizations that The Archivist doesn’t provide.
For example, consider an archive on the term soundcloud with 388,000+ tweets. The Archivist will show you the top 25 users who tweeted about soundcloud. I can learn from The Archivist that the user who tweeted the most about soundcloud was top100djsgirls. But what if I wanted to look at just the tweets from top100djsgirls? The Archivist can’t do that.
Similarly, The Archivist provides a tweets-by-volume chart. I can see that the peak day over the last four months for the term soundcloud was in mid-August of 2010. What if I’d like to look at just the tweets from that high-volume day?
Or, what if I’d like to see a pie chart that shows distribution based on the language of a tweet? How about looking at the distribution of tweets over time filtered on a given user?
Pulling an archive into Excel and using PowerPivot to slice and dice the data makes it possible to answer these kinds of questions (and more). PowerPivot is designed for working with large datasets (100,000+ rows) inside Excel, so it’s perfect for mining large tweet archives.
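To make the questions above concrete, here is a minimal sketch of the same slicing done in plain Python rather than PowerPivot. It is not how The Archivist or PowerPivot works internally. The column names (`user`, `created_at`, `language`, `text`), the sample rows, and every function name are assumptions invented for illustration; a real exported archive may use a different layout.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical export: the column names and rows are invented for illustration.
SAMPLE_CSV = """user,created_at,language,text
top100djsgirls,2010-08-14 09:15:00,en,New mix up on soundcloud
top100djsgirls,2010-08-14 18:40:00,en,Another soundcloud set
dj_example,2010-08-14 20:05:00,de,Neuer Track auf soundcloud
dj_example,2010-08-15 11:30:00,de,soundcloud ist super
listener_example,2010-09-02 07:45:00,es,Escuchando soundcloud
"""


def load_tweets(source):
    """Read tweets from a CSV file object and parse the timestamp column."""
    rows = []
    for row in csv.DictReader(source):
        row["created_at"] = datetime.strptime(row["created_at"], "%Y-%m-%d %H:%M:%S")
        rows.append(row)
    return rows


def tweets_by_user(tweets, user):
    """Keep only the tweets written by one user."""
    return [t for t in tweets if t["user"] == user]


def peak_day(tweets):
    """Return (date, count) for the day with the most tweets."""
    counts = Counter(t["created_at"].date() for t in tweets)
    return counts.most_common(1)[0]


def tweets_on_day(tweets, day):
    """Keep only the tweets posted on the given date."""
    return [t for t in tweets if t["created_at"].date() == day]


def language_distribution(tweets):
    """Count tweets per language (the data behind a pie chart)."""
    return Counter(t["language"] for t in tweets)


def daily_volume(tweets):
    """Return (date, count) pairs in date order (the data behind a volume chart)."""
    return sorted(Counter(t["created_at"].date() for t in tweets).items())


tweets = load_tweets(io.StringIO(SAMPLE_CSV))
busiest_day, busiest_count = peak_day(tweets)
print(len(tweets_by_user(tweets, "top100djsgirls")), "tweets from top100djsgirls")
print(busiest_count, "tweets on peak day", busiest_day)
print(dict(language_distribution(tweets)))
print(daily_volume(tweets_by_user(tweets, "dj_example")))
```

Each function corresponds to one of the questions: a filter on user, a filter on the peak day, a count by language, and a per-user count over time. In a pivot table, these would roughly be filters on the user and date fields and a count grouped by language.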
In the screencast below, I’ll walk you through exactly how to do this:



Follow the Conversation
7 comments so far.
I can't view your screencast.
Hmm, there's something goofy with the screencast. It seems that if you hit refresh it will appear. Not sure what's going on; will investigate. In the meantime, hitting refresh should get you to the screencast.
This looks extremely useful, but it looks to me as if you have turned off the ability to export tweets to Excel. Is that right?
I am not quite sure how I missed the desktop version of the software, but I now see that, which does contain an export to Excel button.
But I can't verify that that is working (save as well). It certainly doesn't present a window to allow me to name the file, and that is true of save as well.
Where should these files be located?
The window also references "remaining Twitter hits." How does that limit operate?
Thanks.
Karsten:
Is this still available? The video seems to be gone, and I do not see any way to get it. I hope it is available as I would very much like to use it.
Hmm, something's screwed up with the embedded video. Here's a link to the wmv: http://ecn.channel9.msdn.com/o9/ch9/e6aa/129e4c5c-fc9c-4b37-8cc7-9e2f0184e6aa/archivistdatamining_2MB_ch9.wmv
What does "remaining Twitter hits" mean?