Unofficial del.icio.us Python API for research
by Michael G. Noll on December 18, 2006 (last updated: July 8, 2007)
One of my recent research tasks required me to retrieve various information from del.icio.us, a well-known social bookmarking service. My programming language of choice is Python, and so I wrote a basic Python module for getting the data I needed.
Before you start
It seems this was a head-to-head race: one day before this article was published, del.icio.us released a similar “API” in the form of JSON url feeds, together with new tagometers (see their blog entry).
How the del.icio.us “API” compares with my module (+ marks an advantage of theirs, - a drawback):
+ exact count for common tags (my API retrieves only a weight score)
- no native Python module (it’s JSON, after all)
- no support for getting a user’s public tag vocabulary (other than your own)
deliciousapi.py
IMPORTANT NOTE: It is strongly advised that you read the del.icio.us Terms of Use document before using this Python module. In particular, read section 5 “Intellectual Property”.
My (as in: unofficial) del.icio.us Python API provides the following features:
- getting a user’s tags including tag counts, i.e. her tagging vocabulary
- getting a url’s so-called “common tags”, i.e. the most popular tags assigned to user bookmarks of said url, if any (between 0 and 25 tags are returned)
- getting the total number of bookmarks for a url, i.e. the number of users who have bookmarked the url
- HTTP proxy support
Note: Only public del.icio.us data will be mined (read below). This means that this API does not (yet) provide means to access your private bookmark data.
Here is a code snippet to demonstrate basic usage of deliciousapi.py:
import deliciousapi

d = deliciousapi.DeliciousAPI()

url = "http://www.wikipedia.org/"

# list of (tag, tag_weight) tuples
item_tags = d.get_common_tags(url)

number_of_users = d.get_number_of_users(url)

# list of (tag, tag_count) tuples; a_delicious_username can be any del.icio.us username
user_tags = d.get_tags(a_delicious_username)
The official del.icio.us API does not provide the functionality mentioned above, so this module queries the del.icio.us website directly and extracts the required information by parsing the HTML code of the resulting web pages (a kind of poor man’s web mining). The module is able to detect IP throttling, which del.icio.us employs to temporarily block abusive HTTP request behavior, and raises a custom Python error when that happens. Please be a nice netizen and do not stress the del.icio.us service more than necessary. I don’t, and you shouldn’t either.
del.icio.us does not provide exact tag counts for a url’s common tags but rather lists a “weight” for each tag. A weight is a number between 1 and 5 (higher = more popular). When you use the API to get a user’s tags, however, the tag count is exact: you get the precise number of bookmarks to which the user has assigned a specific tag.
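Because both calls return plain lists of tuples, ordinary list operations are enough to work with the results. Here is a small sketch that uses hand-typed sample data rather than live results: the weights are copied from the google.com/webmasters example on this page, while the user tag counts and all three helper function names are invented for illustration.

```python
# Sample data typed in by hand, NOT fetched from del.icio.us.
common_tags = [("google", 5), ("seo", 3), ("webmaster", 3),
               ("reference", 2), ("tools", 2), ("blog", 1)]
# Invented (tag, tag_count) pairs for a hypothetical user.
user_tags = [("python", 42), ("research", 17), ("tagging", 8)]


def tags_with_min_weight(tags, min_weight):
    """Return the tag names whose weight (1 to 5) is at least min_weight."""
    return [tag for tag, weight in tags if weight >= min_weight]


def top_tags(tags, n):
    """Return the n tuples with the highest count, highest first."""
    return sorted(tags, key=lambda pair: pair[1], reverse=True)[:n]


def tag_shares(tags):
    """Map each tag to its share of all tag assignments.

    Only meaningful for exact counts (user tags), not for 1-5 weights.
    """
    if not tags:
        return {}
    total = float(sum(count for _, count in tags))
    return dict((tag, count / total) for tag, count in tags)


strong = tags_with_min_weight(common_tags, 3)
best = top_tags(user_tags, 2)
shares = tag_shares(user_tags)
```

Note that shares only make sense for a user's tags, where counts are exact. Weights for common tags are coarse buckets, so comparing them by threshold, as in `tags_with_min_weight`, is about as far as they go.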
deliciousmonitor.py
I have also written a Python script for monitoring del.icio.us bookmark RSS feeds. The default RSS feed is the “hotlist” of urls you see on the del.icio.us frontpage.
This script uses my delicious Python API and demonstrates how it can be used. Basically, it mirrors the RSS feed and retrieves additional metadata such as an entry’s most popular tags from the del.icio.us service itself.
Here is an example output:
<document url="http://www.google.com/webmasters/" users="1720" tags="25">
  <tag name="analytics" weight="1" />
  <tag name="blog" weight="1" />
  <tag name="design" weight="1" />
  <tag name="development" weight="1" />
  <tag name="google" weight="5" />
  <tag name="howto" weight="1" />
  <tag name="html" weight="1" />
  <tag name="internet" weight="1" />
  <tag name="marketing" weight="1" />
  <tag name="programming" weight="1" />
  <tag name="reference" weight="2" />
  <tag name="resources" weight="1" />
  <tag name="search" weight="1" />
  <tag name="seo" weight="3" />
  <tag name="tips" weight="1" />
  <tag name="tool" weight="1" />
  <tag name="tools" weight="2" />
  <tag name="tutorial" weight="1" />
  <tag name="utilities" weight="1" />
  <tag name="web" weight="2" />
  <tag name="webdesign" weight="2" />
  <tag name="webdev" weight="1" />
  <tag name="webmaster" weight="3" />
  <tag name="website" weight="1" />
  <tag name="work" weight="1" />
</document>
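If you want to read such records back into Python, the standard library's ElementTree (part of Python since 2.5) is sufficient. The sketch below parses a shortened copy of the record above. It assumes one `<document>` element per record, as in the example; how deliciousmonitor.py wraps multiple records in a file is not shown here, so adjust accordingly.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example record above (only 4 of the 25 tags).
record = """<document url="http://www.google.com/webmasters/" users="1720" tags="25">
  <tag name="google" weight="5" />
  <tag name="seo" weight="3" />
  <tag name="webmaster" weight="3" />
  <tag name="reference" weight="2" />
</document>"""

doc = ET.fromstring(record)

# Attributes arrive as strings, so convert the numeric ones explicitly.
url = doc.get("url")
users = int(doc.get("users"))

# Rebuild the same (tag, tag_weight) tuples that get_common_tags() returns.
tags = [(t.get("name"), int(t.get("weight"))) for t in doc.findall("tag")]
```

This yields the same list-of-tuples shape as the API itself, so code written against `get_common_tags()` results can be reused on stored records.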
Download
You can now download the del.icio.us API from Python Cheese Shop (includes only deliciousapi.py). Just run easy_install DeliciousAPI or sudo easy_install DeliciousAPI, and after installation, a simple import deliciousapi will do the trick.
An alternative is to download the code straight from my Subversion repository.
- deliciousapi.py (read del.icio.us Terms of Use before using)
- deliciousmonitor.py
Note that deliciousmonitor.py depends on deliciousapi.py and Mark Pilgrim’s excellent Universal Feed Parser. The code has been tested with Python 2.4.3 and 2.5.
License
The code is licensed to you under version 2 of the GNU General Public License.
Thanks, I will play with it a bit. Watch http://wiki.mobbing-gegner.de/Linux/Internet/Web2.0/Bookmarken-mit-delicious and perhaps you will see the result.
[...] I had to write my own Python scripts to retrieve data since, unfortunately, Michael G. Noll’s Unofficial del.icio.us Python API for research are not available [...]
The cause of the problem was an outdated local installation of SVN::Web after the package was upgraded on the server on which this blog’s running. It’s fixed now, and you can download the del.icio.us Python API again and have fun!
Thanks Jean-Etienne for your bug report!
My del.icio.us API is now also available from Python Cheese Shop, i.e. you can use “easy_install DeliciousAPI” to download and install it. After installation, you can use “import deliciousapi” in your Python scripts to access the module.