Wednesday, May 18, 2011

Parsing Twitter's User Timeline with Python

When you want to show tweets on a website, the usual approach is to embed a piece of JavaScript. Twitter offers a very useful widget for this purpose: http://twitter.com/about/resources/widgets/

This is the easiest way to show a user timeline: you just copy and paste some JavaScript code. Sometimes, though, you need more flexibility, and a copy-and-paste JavaScript widget is not enough.

In this post you'll find a way to parse a Twitter user timeline from a JSON file with Python. There are many ways to do that; here's mine.

This is not a Python wrapper around the Twitter API. If you are looking for something like that, please visit http://code.google.com/p/python-twitter/

Getting the Twitter User Timeline

To download the latest tweets, I use the GET statuses/user_timeline API method, which returns a JSON document. I've created a function to download the JSON and parse it, and a Tweet class to store the tweet info.

The "read_tweets()" function parses the JSON file and stores the tweet info in a list of Tweet instances. The Tweet class also provides some methods to store the info easily.

Let's see the code.
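Here is a minimal sketch of how the Tweet class and the parsing half of "read_tweets()" could look. It is not the original code: it is written for Python 3 with the standard json module instead of simplejson, and every name except Tweet and "set_text()" (set_id, set_user, set_date, parse_timeline) is a hypothetical one chosen for illustration. It assumes each status in the timeline carries the id, text, created_at and user.screen_name fields.

```python
import json
from datetime import datetime


class Tweet:
    """Plain container for one tweet (attribute and setter names are illustrative)."""

    def __init__(self):
        self.id = None
        self.user = None
        self.date = None
        self.text = None

    def set_id(self, value):
        self.id = int(value)

    def set_user(self, screen_name):
        self.user = screen_name

    def set_date(self, created_at):
        # Assumed date layout: "Wed May 18 10:15:00 +0000 2011".
        self.date = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")

    def set_text(self, text):
        # Only stores the plain text here; the HTML conversion is left out.
        self.text = text


def parse_timeline(json_text):
    """Turn the JSON text of a user timeline into a list of Tweet instances."""
    tweets = []
    for status in json.loads(json_text):
        tweet = Tweet()
        tweet.set_id(status["id"])
        tweet.set_user(status["user"]["screen_name"])
        tweet.set_date(status["created_at"])
        tweet.set_text(status["text"])
        tweets.append(tweet)
    return tweets
```

Keeping the parsing separate from the download makes it possible to try the code against a saved JSON file without touching the network.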



In "read_tweets()" I use urllib2 to read the file and simplejson to parse it. To store the info I use the Tweet class methods.
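The download step might be sketched as follows. This is a Python 3 version, so urllib.request and json stand in for urllib2 and simplejson. The function name download_timeline, the opener parameter and the URL are assumptions: the URL is my best recollection of the 2011-era address of statuses/user_timeline, and that version of the API has since been retired, so treat it as a placeholder.

```python
import json
import urllib.parse
import urllib.request

# Assumed 2011-era endpoint (placeholder; that API version has been retired).
TIMELINE_URL = "http://api.twitter.com/1/statuses/user_timeline.json"


def download_timeline(screen_name, count=10, opener=urllib.request.urlopen):
    """Download a user timeline and return the decoded JSON (a list of dicts).

    `opener` is any callable that takes a URL and returns a file-like object;
    it defaults to urllib.request.urlopen and can be swapped for offline use.
    """
    query = urllib.parse.urlencode({"screen_name": screen_name, "count": count})
    response = opener(TIMELINE_URL + "?" + query)
    try:
        raw = response.read()
    finally:
        response.close()
    return json.loads(raw.decode("utf-8"))
```

Each dictionary in the returned list can then be handed to the Tweet class methods to fill in one Tweet instance.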




The most important Tweet class method is "set_text()". It converts plain text into HTML with links for URLs, users and hashtags. I use Python regular expressions to find http://xxx, https://xxx, #xxx and @xxx and replace each one with a valid link.
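The conversion could be sketched like this. It is one possible take, not the original "set_text()": the function name linkify, the exact regular expression and the link targets (http://twitter.com/USER for users and a twitter.com search for hashtags) are my assumptions. The sketch also escapes the text before adding links, assuming the text arrives unescaped; drop that step if your source already delivers HTML-escaped text.

```python
import html
import re

# One pattern with three alternatives, so a "#" or "@" inside a URL
# is consumed as part of the URL and never linked on its own.
TOKEN_RE = re.compile(
    r'(?P<url>https?://[^\s<"]+)'
    r"|(?<!\w)@(?P<user>\w+)"
    r"|(?<!\w)#(?P<tag>\w+)"
)


def _to_link(match):
    """Build the <a> element for whichever alternative matched."""
    url, user, tag = match.group("url"), match.group("user"), match.group("tag")
    if url:
        return f'<a href="{url}">{url}</a>'
    if user:
        return f'<a href="http://twitter.com/{user}">@{user}</a>'
    return f'<a href="http://twitter.com/search?q=%23{tag}">#{tag}</a>'


def linkify(text):
    """Escape the plain text, then wrap URLs, @users and #hashtags in links."""
    escaped = html.escape(text, quote=False)
    return TOKEN_RE.sub(_to_link, escaped)
```

Doing everything in a single pass is a deliberate choice: running three separate substitutions one after another would let the hashtag pattern match the fragment of a URL that had already been turned into a link.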

Django advice

If you are going to use this code with Django, let me give you some advice.

"read_tweets()" downloads a JSON file every time you call it, so please don't call it from a view: it would slow the view down too much. Instead, store the tweets returned by "read_tweets()" somewhere and have the view read the stored copy. There are many ways to store them. I use a "manage.py" script run from cron that stores them in memcache, but the choice is yours.

If you are going to show the HTML stored in Tweet.html_text in a Django template, don't forget to use the safe filter.

[python]{{ tweet.html_text|safe }}[/python]
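As for keeping the download out of the view, the split between the cron job and the view might be sketched like this. The names refresh_tweets, cached_tweets, CACHE_KEY and DictCache are hypothetical. The functions only rely on a cache object with "set(key, value, timeout)" and "get(key, default)" methods, which is the shape of Django's cache API; DictCache is a tiny in-memory stand-in so the sketch runs without Django.

```python
CACHE_KEY = "latest_tweets"  # hypothetical key name


class DictCache:
    """In-memory stand-in for a cache backend (ignores the timeout)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, timeout=None):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def refresh_tweets(cache, fetch, timeout=3600):
    """Run this from cron: download once and keep the result in the cache.

    `fetch` is any callable that returns the list of tweets.
    """
    tweets = fetch()
    cache.set(CACHE_KEY, tweets, timeout)
    return tweets


def cached_tweets(cache):
    """Call this from the view: it only reads the cache, never the network."""
    return cache.get(CACHE_KEY, [])
```

In a Django project you would pass django.core.cache.cache instead of DictCache, and pick a timeout longer than the interval between cron runs so the view never finds the cache empty.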

Tuesday, May 3, 2011

Updating the Recent Workspaces List in Eclipse

When you switch workspaces in Eclipse, a list of the workspaces you have used most recently appears. Even if some of those workspaces have been deleted from disk, the list is not updated, so it can show workspaces that no longer exist. To change the list, you have to edit the file 'org.eclipse.ui.ide.prefs'.


On my Ubuntu machine, this file is located at ~/.eclipse/org.eclipse.platform_3.5.0_155965261/configuration/.settings/org.eclipse.ui.ide.prefs

The entry to edit is "RECENT_WORKSPACES".
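If you would rather not edit the entry by hand, the cleanup could be sketched in Python. The function name prune_recent_workspaces is hypothetical, and the separator is an assumption: I am assuming the paths are joined by a literal backslash followed by "n", but some Eclipse versions may use a comma, so check your own file and pass the right separator. Close Eclipse and keep a backup of the file before changing it.

```python
import os


def prune_recent_workspaces(line, sep="\\n", exists=os.path.isdir):
    """Drop workspaces that no longer exist from a RECENT_WORKSPACES line.

    `line` is the full "RECENT_WORKSPACES=..." line without its line ending.
    `sep` is the assumed separator between paths (backslash + "n" by default).
    `exists` decides whether a path is kept; it defaults to os.path.isdir.
    """
    key, _, value = line.partition("=")
    kept = [path for path in value.split(sep) if path and exists(path)]
    return key + "=" + sep.join(kept)
```

The function works on a single line of text, so reading the prefs file, finding the line that starts with "RECENT_WORKSPACES=" and writing the result back is left to the caller.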