Unofficial del.icio.us Python API for research

by Michael G. Noll on December 18, 2006 (last updated: July 8, 2007)

One of my recent research tasks required me to retrieve various information from del.icio.us, a well-known social bookmarking service. My programming language of choice is Python, and so I wrote a basic Python module for getting the data I needed.

This page is outdated. Please use only the new and current version of this document.

Before you start

Seems like it was a head-to-head race. One day before the publication of this article, del.icio.us released a similar “API” with their JSON url feeds together with new tagometers (see their blog entry).

How the del.icio.us “API” compares with my module (+ marks an advantage of theirs, - a drawback):
+ exact count for common tags (my API retrieves only a weight score)
- no native Python module (it’s JSON, after all)
- no support for getting a user’s public tag vocabulary (other than your own)

deliciousapi.py

IMPORTANT NOTE: It is strongly advised that you read the del.icio.us Terms of Use document before using this Python module. In particular, read section 5 “Intellectual Property”.

My (as in: unofficial) del.icio.us Python API provides the following features:

  • getting a user’s tags including tag counts, i.e. her tagging vocabulary
  • getting a url’s so-called “common tags”, i.e. the most popular tags assigned to user bookmarks of said url, if any (between 0 and 25 tags)
  • getting the total number of bookmarks for a url, i.e. the number of users who have bookmarked the url
  • HTTP proxy support

Note: Only public del.icio.us data is mined (see below). This means that this API does not (yet) provide a way to access your private bookmark data.

Here is a code snippet to demonstrate basic usage of deliciousapi.py:

    import deliciousapi

    d = deliciousapi.DeliciousAPI()
    url = "http://www.wikipedia.org/"
    item_tags = d.get_common_tags(url)  # list of (tag, tag_weight) tuples
    number_of_users = d.get_number_of_users(url)
    a_delicious_username = "someuser"  # placeholder; can be any del.icio.us username
    user_tags = d.get_tags(a_delicious_username)  # list of (tag, tag_count) tuples

The official del.icio.us API does not provide the functionality mentioned above, so this module queries the del.icio.us website directly and extracts the required information by parsing the HTML code of the resulting web pages (a kind of poor man’s web mining). The module is able to detect IP throttling, which del.icio.us employs to temporarily block abusive HTTP request behavior, and raises a custom Python error when that happens. Please be a nice netizen and do not stress the del.icio.us service more than necessary. I don’t, and you shouldn’t either.
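
Because throttling surfaces as an exception, client code can back off and retry instead of hammering the service. The following is a minimal sketch and not part of deliciousapi.py: ThrottledError is a hypothetical stand-in for the module’s custom error (its real name is not given here), and fetch_with_backoff, max_attempts and base_delay are illustrative names. It is written for a current Python 3 interpreter.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for the module's custom throttling error."""


def fetch_with_backoff(fetch, url, max_attempts=4, base_delay=60.0, sleep=time.sleep):
    """Call fetch(url), waiting longer after each throttling error."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
```

With the module, fetch might be d.get_common_tags, once ThrottledError is swapped for whatever exception the module actually raises. The sleep parameter is injectable only so that the waiting can be replaced in tests.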

del.icio.us does not provide exact tag counts for a url’s common tags but rather lists a “weight” for each tag. A weight is a number between 1 and 5 (higher = more popular). When you use the API to get a user’s tags, however, the tag count is exact: you get the precise number of bookmarks to which the user has assigned a specific tag.
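
Since the weights are coarse (1 to 5), a typical first step is to keep only the heavier tags and rank them. Here is a small sketch of that idea, assuming the list of (tag, tag_weight) tuples described above; strongest_tags and min_weight are illustrative names, not part of the module, and the sample data is taken from the example output further down this page.

```python
def strongest_tags(common_tags, min_weight=2):
    """Return tags with weight >= min_weight, heaviest first, ties alphabetical."""
    kept = [(tag, weight) for tag, weight in common_tags if weight >= min_weight]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# A few (tag, tag_weight) tuples in the shape returned for a url's common tags.
sample = [("google", 5), ("reference", 2), ("seo", 3), ("blog", 1), ("webmaster", 3)]
print(strongest_tags(sample))
```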

deliciousmonitor.py

I have also written a Python script for monitoring del.icio.us bookmark RSS feeds. The default RSS feed is the “hotlist” of urls you see on the del.icio.us frontpage.

This script uses my delicious Python API and demonstrates how it can be used. Basically, it mirrors the RSS feed and retrieves additional metadata such as an entry’s most popular tags from the del.icio.us service itself.

Here is an example output:

    <document url="http://www.google.com/webmasters/" users="1720" tags="25">
        <tag name="analytics" weight="1" />
        <tag name="blog" weight="1" />
        <tag name="design" weight="1" />
        <tag name="development" weight="1" />
        <tag name="google" weight="5" />
        <tag name="howto" weight="1" />
        <tag name="html" weight="1" />
        <tag name="internet" weight="1" />
        <tag name="marketing" weight="1" />
        <tag name="programming" weight="1" />
        <tag name="reference" weight="2" />
        <tag name="resources" weight="1" />
        <tag name="search" weight="1" />
        <tag name="seo" weight="3" />
        <tag name="tips" weight="1" />
        <tag name="tool" weight="1" />
        <tag name="tools" weight="2" />
        <tag name="tutorial" weight="1" />
        <tag name="utilities" weight="1" />
        <tag name="web" weight="2" />
        <tag name="webdesign" weight="2" />
        <tag name="webdev" weight="1" />
        <tag name="webmaster" weight="3" />
        <tag name="website" weight="1" />
        <tag name="work" weight="1" />
    </document>
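
An element like the one above could be assembled from the values the API returns. This is only a sketch using the standard library’s xml.etree.ElementTree, not the actual code of deliciousmonitor.py; build_document is a hypothetical helper, and it assumes that the tags attribute holds the number of tag entries (which matches the 25 entries in the example).

```python
import xml.etree.ElementTree as ET


def build_document(url, number_of_users, common_tags):
    """Build a <document> element from a url, its user count and (tag, weight) tuples."""
    doc = ET.Element(
        "document",
        {"url": url, "users": str(number_of_users), "tags": str(len(common_tags))},
    )
    # Sort alphabetically by tag name, as in the example output.
    for name, weight in sorted(common_tags):
        ET.SubElement(doc, "tag", {"name": name, "weight": str(weight)})
    return doc


doc = build_document(
    "http://www.google.com/webmasters/", 1720, [("seo", 3), ("google", 5)]
)
print(ET.tostring(doc, encoding="unicode"))
```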

Download

You can now download my del.icio.us Python API from the Python Cheese Shop (the package includes only deliciousapi.py). Just run easy_install DeliciousAPI or sudo easy_install DeliciousAPI; after installation, a simple import deliciousapi will do the trick.

An alternative is to download the code straight from my Subversion repository.

Note that deliciousmonitor.py depends on deliciousapi.py and Mark Pilgrim’s excellent Universal Feed Parser. The code has been tested with Python 2.4.3 and 2.5.

License

The code is licensed to you under version 2 of the GNU General Public License.