Del.icio.us Python API
From Michael G. Noll
One of my recent research tasks required me to retrieve various information from del.icio.us, a well-known social bookmarking service. My programming language of choice is Python, and so I wrote a basic Python module for getting the data I needed.
Figure 1: A tag cloud as seen on del.icio.us.
deliciousapi.py
IMPORTANT NOTE: It is strongly advised that you read the del.icio.us Terms of Use document before using this Python module. In particular, read section 5 "Intellectual Property".
Part of the functionality in DeliciousAPI is implemented by calling the official del.icio.us API or parsing its JSON feeds; other parts are provided by mining and scraping data directly from the del.icio.us website. The module is able to detect IP throttling, which del.icio.us employs to temporarily block abusive HTTP request behavior, and will raise a custom Python error to indicate that. Please be a nice netizen and do not stress the del.icio.us service more than necessary. I don't, and you shouldn't either.
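When the module signals throttling, the polite reaction is to wait and retry rather than hammer the service. The following is a minimal sketch of exponential backoff, not code from deliciousapi.py: the class `ThrottleError` is a stand-in for the module's custom error (its real name is not given on this page), and `call_with_backoff` with all its parameters is a hypothetical helper. It is written for a current Python 3 interpreter.

```python
import time


class ThrottleError(Exception):
    """Stand-in for the module's custom throttling error (assumed name)."""


def call_with_backoff(func, throttle_exc=ThrottleError, retries=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call func(); on throttle_exc, wait base_delay * 2**attempt and retry.

    Re-raises the error once `retries` retries have been used up.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except throttle_exc:
            if attempt == retries:
                raise  # give up after the last allowed retry
            sleep(base_delay * 2 ** attempt)
```

In practice you would pass the module's own error class as `throttle_exc` and wrap a call such as `lambda: dapi.get_url(url)`. The `sleep` parameter exists only so that the waiting can be replaced in tests.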
DeliciousAPI provides the following features plus some more:
- get_url(): returns all public bookmarks of a URL, i.e. its "history"
- get_user():
- returns a user's full bookmark collection including private bookmarks if you know username AND password; in this case, all communication with del.icio.us is encrypted via SSL
- returns a user's most recent public bookmarks (up to 100) if you don't know the password
- get_tags_of_user(): returns a user's full tagging vocabulary, i.e. tags and tag counts, aggregated over all public bookmarks
- HTTP proxy support
Please note that DeliciousAPI currently cannot scrape a user's full public bookmark collection if you don't know the user's password. This is for technical reasons on the del.icio.us side.
Here is a code snippet to demonstrate basic usage of deliciousapi.py:
    import deliciousapi

    dapi = deliciousapi.DeliciousAPI()
    url = "http://www.michael-noll.com/wiki/Del.icio.us_Python_API"
    username = "jsmith"

    # DeliciousURL object, providing
    #   .title          : title of the web document as stored on delicious.com
    #   .url            : URL of the corresponding web document
    #   .total_bookmarks: total number of bookmarks/users for this url
    #   .bookmarks      : list of (user, tags, comment, timestamp) tuples
    #   .top_tags       : list of (tag, tag_count) tuples, representing the
    #                     most popular tags of this url (up to 10)
    #   .tags           : dict mapping tags to total tag count
    #
    # Note that by default, get_url() retrieves only the
    # 50 most recent bookmarks of a given url. You can control
    # this behavior with the max_bookmarks parameter (see
    # docstrings).
    url_metadata = dapi.get_url(url)

    print url_metadata
    # output: [http://www.michael-noll.com/wiki/Del.icio.us_Python_API] 103 total bookmarks (= users), 187 tags (37 unique), 10 out of 10 max 'top' tags

    print url_metadata.title
    # output: Del.icio.us Python API - Michael G. Noll

    print url_metadata.bookmarks
    # output: [
    #   (u'neetij', [u'python', u'api', u'del.icio.us', u'programming'], None, datetime.datetime(2008, 8, 4, 0, 0)),
    #   (u'jsf.online', [u'software', u'programming', u'free', u'development', u'del.icio.us', u'python', u'2008'], u'Python API - wraps the del.icio.us api for python', datetime.datetime(2008, 8, 4, 0, 0)),
    #   (u'as11018', [u'python', u'api', u'programming'], None, datetime.datetime(2008, 7, 30, 0, 0)),
    #   ...]

    print url_metadata.top_tags
    # output: [ (u'python', 91), (u'api', 73), (u'del.icio.us', 71), ... ]

    print url_metadata.tags
    # output: { u'is:api': 1, u'code': 6, u'toread': 1, ... }

    # If get_user() is called with both username and password, the full
    # bookmark collection of the user is returned, including any private
    # bookmarks. Communication is encrypted via SSL. You can use get_user()
    # for creating a backup of your del.icio.us bookmarks.
    #
    # If get_user() is called without password, only the most recent
    # public bookmarks of the given user are returned (up to 100).
    #
    # DeliciousUser object, providing
    #   .bookmarks : list of (url, tags, title, notes, timestamp) tuples
    #   .tags      : dict mapping tags to total tag count
    #   .username  : name of the corresponding del.icio.us user
    user_metadata = dapi.get_user(username)

    print user_metadata
    # output: [jsmith] 31 bookmarks, 78 tags (45 unique)

    print user_metadata.bookmarks
    # output: [ (u'http://www.twellow.com/', [u'mashup', u'tools', u'twitter'], u'Twellow.com :: Twitter users organized into business categories', u'Kind of yellow pages for Twitter, interesting.', datetime.datetime(2008, 6, 25, 0, 0, 0)), ... ]

    # dict mapping tags to tag counts (see the sample output below)
    user_tags = dapi.get_tags_of_user(username)

    print user_tags
    # output: { 'golf': 1, 'toread': 11, 'recipe': 1, 'rest': 4, ... }
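The bookmark tuples returned by get_url() are easy to post-process locally. As a minimal sketch, assuming bookmarks shaped like the (user, tags, comment, timestamp) tuples in the sample output above, the following rebuilds a tag-count mapping and a short "top tags" list without any network access. The helper name `count_tags` is hypothetical and not part of deliciousapi.py, the sample data is copied from the output above, and the code targets a current Python 3 interpreter rather than the Python 2 versions the module was tested with.

```python
from collections import Counter
from datetime import datetime

# Sample data copied from the url_metadata.bookmarks output above;
# each entry is a (user, tags, comment, timestamp) tuple.
bookmarks = [
    ("neetij", ["python", "api", "del.icio.us", "programming"],
     None, datetime(2008, 8, 4)),
    ("jsf.online", ["software", "programming", "free", "development",
                    "del.icio.us", "python", "2008"],
     "Python API - wraps the del.icio.us api for python",
     datetime(2008, 8, 4)),
    ("as11018", ["python", "api", "programming"],
     None, datetime(2008, 7, 30)),
]


def count_tags(bookmarks):
    """Hypothetical helper: map each tag to how often it occurs."""
    counts = Counter()
    for _user, tags, _comment, _timestamp in bookmarks:
        counts.update(tags)
    return counts


tag_counts = count_tags(bookmarks)
# The two most frequent tags as (tag, count) tuples, similar in shape
# to the .top_tags attribute described above.
top_tags = tag_counts.most_common(2)
```

With this three-bookmark sample, `python` and `programming` each occur three times, so they make up `top_tags`. The real `.top_tags` attribute is computed by del.icio.us over all bookmarks of the URL, so its numbers will differ from a count over a partial list.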
deliciousmonitor.py
I have also written a Python script for monitoring del.icio.us bookmark RSS feeds. The default RSS feed is the "hotlist" of URLs you see on the del.icio.us front page.
This script uses my delicious Python API and demonstrates how it can be used. Basically, it mirrors the RSS feed and retrieves additional metadata such as an entry’s most popular tags from the del.icio.us service itself.
Here is an example output:
    <document url="http://www.michael-noll.com/wiki/Del.icio.us_Python_API" users="103" top_tags="10">
        <top_tag name="python" count="91" />
        <top_tag name="api" count="73" />
        <top_tag name="del.icio.us" count="71" />
        <top_tag name="delicious" count="32" />
        <top_tag name="programming" count="29" />
        ...
    </document>
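Output in this shape can be read back with the standard library. The following is a minimal sketch, assuming each record is a single `document` element with `top_tag` children exactly as in the example above. The function name `parse_document` and the layout of the returned dict are hypothetical, and the code targets a current Python 3 interpreter.

```python
import xml.etree.ElementTree as ET

# Sample record copied from the example output above (elided tags omitted).
sample = """
<document url="http://www.michael-noll.com/wiki/Del.icio.us_Python_API" users="103" top_tags="10">
  <top_tag name="python" count="91" />
  <top_tag name="api" count="73" />
  <top_tag name="del.icio.us" count="71" />
  <top_tag name="delicious" count="32" />
  <top_tag name="programming" count="29" />
</document>
"""


def parse_document(xml_text):
    """Hypothetical helper: turn one <document> record into a dict."""
    doc = ET.fromstring(xml_text.strip())
    top_tags = [(tag.get("name"), int(tag.get("count")))
                for tag in doc.findall("top_tag")]
    return {
        "url": doc.get("url"),
        "users": int(doc.get("users")),
        "top_tags": top_tags,
    }


record = parse_document(sample)
```

Here `record["top_tags"]` is a list of (tag, count) tuples in document order, which mirrors the shape of the `.top_tags` attribute of the API.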
Download
You can now download and install the del.icio.us API from the Python Cheese Shop (the package includes only deliciousapi.py) via setuptools/easy_install. Just run
- easy_install DeliciousAPI, or
- easy_install -U DeliciousAPI for updates
and after installation, a simple import deliciousapi in your Python scripts will do the trick.
An alternative is to download the code straight from my Subversion repository.
- deliciousapi.py
- deliciousmonitor.py (requires deliciousapi.py and Universal Feed Parser)
The code has been tested with Python 2.4.3 and 2.5.
License
The code is licensed to you under the GNU General Public License, version 2.
Feedback
Comments, questions and constructive feedback are always welcome. Just drop me a note.