The Archivist: Save And Export Twitter Searches Before They Go Away

May 15, 2009 In News By Karsten Januszewski

If you have used Twitter search before, you may have noticed that a given search only goes back a limited amount of time or a limited number of tweets. In fact, if you read the Twitter search documentation, you’ll see that the folks from Twitter say, "We also restrict the size of the search index by placing a date limit on the updates we allow you to search. This limit is currently around a month but is dynamic and subject to shrink as the number of tweets per day continues to grow."

Thus was born The Archivist, a new experiment from the Mix Online lab that Tim Aidlin and I cooked up recently. Our motive for writing the app was to solve a problem that, from my research, hasn’t quite been solved before: How do you archive Twitter searches? I looked on the Twitter Apps Wiki but didn’t see anything that accomplished exactly what I was looking for. So, what can you do but write it yourself?

The Archivist is a Windows application that runs on your local system and archives the tweets for a given search so that you can mine and analyze them later.

There are two screens. First, there’s the main screen, which is a big list of tweets:

 

The Archivist lets you start a search and fetches as many results as it can on the initial query. If you leave The Archivist open, it refreshes with the latest results every 10 minutes. You can also close The Archivist and reopen it later: it saves the tweets it has collected and then fetches as many as it can that were posted since the last search.
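
One way to picture the save-and-resume behavior is as a merge step: each refresh returns a batch of results, and only the tweets that are not already saved get added. The following is a minimal sketch in Python, not The Archivist's actual code (the app itself is a .NET program). The tweet shape (a dict with an `id` key) and the `merge_tweets` helper are assumptions made for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical tweet shape for this sketch: {"id": int, "text": str}
Tweet = Dict[str, object]


def merge_tweets(archive: List[Tweet], fetched: Iterable[Tweet]) -> List[Tweet]:
    """Return the saved tweets plus any fetched tweets not yet saved, newest first.

    Assumes each tweet carries a unique integer "id" and that ids grow over time.
    The input archive is left unmodified.
    """
    seen = {tweet["id"] for tweet in archive}
    merged = list(archive)
    for tweet in fetched:
        if tweet["id"] not in seen:
            seen.add(tweet["id"])
            merged.append(tweet)
    # Sort by id, highest first, so the most recent tweets lead the list.
    merged.sort(key=lambda tweet: tweet["id"], reverse=True)
    return merged
```

Calling `merge_tweets(saved, latest_results)` on every refresh keeps the archive free of duplicates even when consecutive result pages overlap.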

Then, there’s a graph that shows the number of tweets per day:

The chart shows the number of tweets per day for a given search, so you can quickly assess its traffic. For more comprehensive data analysis, The Archivist lets you export tweets to Excel. It also natively saves tweets in an XML format, which can be parsed for deeper analysis.
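
As a rough illustration of the kind of analysis a saved XML file makes possible, here is a small Python sketch that counts tweets per day. The element names (`tweets`, `tweet`, `created`, `text`) and the ISO timestamp format are assumptions invented for this example; The Archivist's real XML schema is not described here and may differ, so treat the parsing details as placeholders.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from datetime import datetime

# Made-up sample data in a hypothetical schema (not The Archivist's real format).
SAMPLE = """<tweets>
  <tweet><created>2009-05-14T23:10:00</created><text>first</text></tweet>
  <tweet><created>2009-05-15T08:05:00</created><text>second</text></tweet>
  <tweet><created>2009-05-15T19:45:00</created><text>third</text></tweet>
</tweets>"""


def tweets_per_day(xml_text: str) -> Counter:
    """Count tweets per calendar day, keyed by a YYYY-MM-DD string."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for node in root.iter("tweet"):
        created = datetime.fromisoformat(node.findtext("created"))
        counts[created.date().isoformat()] += 1
    return counts


if __name__ == "__main__":
    for day, count in sorted(tweets_per_day(SAMPLE).items()):
        print(day, count)
```

The same per-day counts could then be fed into a spreadsheet or a plotting library to reproduce a traffic chart.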

Install The Archivist today! (Requires .NET Framework 3.5 SP1.)

To learn more about The Archivist, read the documentation.

For details about the development of The Archivist, check out this post.

Oh, and follow Mix Online on Twitter!

Follow the Conversation

Thomas Lewis said on Aug 13, 2009

@Agnieszka_M: Did you install the .NET Framework 3.5 SP1 first? The link is available above. Once you install that, you can then click on the Install link and it should work. Let us know if that doesn't fix it. We would love to have you check The Archivist out!

Agnieszka_M said on Aug 13, 2009

Thanks, now it works...

I've tested it and it's great! Very helpful for my analysis. Thank you!

daniel said on Sep 8, 2009

The app will not launch, as it reports missing files. I re-installed MS .NET Framework 3.5 SP1, but that does not correct the problem. Any other suggestions?

Rahul said on Jan 12, 2010

Is there any way to archive only lang=en? I tried using, e.g., "Avatar" lang=en in the search box, but that doesn't work...

any ideas?

Dave said on Jan 22, 2010

Just downloaded, works fine. Can I archive more than one search?

Pradeep said on Jan 27, 2010

I downloaded the app and it starts fine. But when I give any search, it says, "The remote server returned an error: (403) Forbidden".
What could be the reason?

Andrei said on May 29, 2010

Hello! I've got that message too. I installed .NET 3.5 SP1 over and over again, but nothing... What's wrong with this app?

Thomas Lewis said on May 29, 2010

Hi Pradeep and Andrei, I have notified our developer on the project and he will look into it and respond back to you. Sorry for the inconvenience.

Karsten Januszewski said on Jun 1, 2010

Sorry for the delay in getting back to people's comments:

@daniel -- Instead of installing it from the website, try downloading the .exe directly from here: http://code.msdn.microsoft.com/archivistdesktop/Release/ProjectReleases.aspx

@Rahul -- Unfortunately, no. That's a feature we are hoping to implement at some point.

@Dave -- The only way to archive more than one search is to run two instances of The Archivist.

@Pradeep and @Andrei -- A 403 indicates that you are being rate limited by Twitter.

Rick Stavanja said on Jul 16, 2010

Any chance of this code ever going open-source?

Michael Windham said on Jan 13, 2012

I've been looking for a solution that will allow me to run a wildcard search, like searching for example* and getting everything that includes words starting with "example". For some reason this has been extremely difficult for me to find for Twitter.
Does your application allow this?

Karsten Januszewski said on Jan 13, 2012

@Michael Windham - No, neither Archivist Web nor Archivist Desktop allows for wildcard searching.

Jun said on Jan 22, 2012

The search operator link doesn't work at this page:
http://visitmix.com/work/archivist-desktop/documentation.html

Would you please fix it?

Thanks.

Karsten Januszewski said on Jan 23, 2012

@Jun -- Hmm, looks like Twitter turned that page into a pop-up, thus the broken link. Here's the syntax:

EXAMPLES:

"happy hour"
containing the exact phrase "happy hour".

love OR hate
containing either "love" or "hate" (or both).

beer -root
containing "beer" but not "root".

#haiku
containing the hashtag "haiku".

from:alexiskold
sent from person "alexiskold".

to:techcrunch
sent to person "techcrunch".

@mashable
referencing person "mashable".

"happy hour" near:"san francisco"
containing the exact phrase "happy hour" and sent near "san francisco".

near:NYC within:15mi
sent within 15 miles of "NYC".

superhero since:2010-12-27
containing "superhero" and sent since date "2010-12-27" (year-month-day).

ftw until:2010-12-27
containing "ftw" and sent up to date "2010-12-27".

movie -scary :)
containing "movie", but not "scary", and with a positive attitude.

flight :(
containing "flight" and with a negative attitude.

traffic ?
containing "traffic" and asking a question.

hilarious filter:links
containing "hilarious" and linking to URLs.

news source:twitterfeed
containing "news" and entered via TwitterFeed.
