Del.icio.us Python API
From Michael G. Noll
One of my recent research tasks required me to retrieve various information from Delicious.com, a well-known social bookmarking service. My programming language of choice is Python, and so I wrote a basic Python module for getting the data I needed.
Figure 1: A tag cloud as seen on Delicious.com.
deliciousapi.py
IMPORTANT NOTE: It is strongly advised that you read the Delicious.com Terms of Use document before using this Python module. In particular, read section 5 "Intellectual Property".
Part of the functionality in DeliciousAPI is implemented by calling the official Delicious.com API or parsing its JSON feeds; other parts are provided by mining and scraping data directly from the Delicious.com website. The module detects IP throttling, which Delicious.com employs to temporarily block abusive HTTP request behavior, and raises a custom Python error when that happens. Please be a nice netizen and do not stress the Delicious.com service more than necessary. I don’t, and you shouldn’t either.
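If you want your own scripts to back off politely when the throttling error is raised, a retry wrapper along the following lines might help. This is only a sketch: `ThrottleError` is a hypothetical stand-in defined here because this page does not name the module's error class, and `call_with_backoff` with its parameters is purely illustrative. Check the docstrings in deliciousapi.py for the actual exception to catch.

```python
import time


class ThrottleError(Exception):
    """Hypothetical stand-in for the throttling error raised by the module."""


def call_with_backoff(func, max_attempts=4, base_delay=60.0, sleep=time.sleep):
    """Call func() and retry with a doubling delay while it is throttled.

    Re-raises the last ThrottleError if every attempt is throttled.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ThrottleError:
            if attempt == max_attempts:
                raise
            sleep(delay)  # give the service time to lift the block
            delay *= 2    # wait twice as long before the next attempt
```

With the real exception class swapped in, a call such as `call_with_backoff(lambda: dapi.get_url(url))` would wait 60, 120 and then 240 seconds between attempts before giving up. The `sleep` parameter exists only so the waiting can be replaced in tests.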
DeliciousAPI provides the following features plus some more:
- get_urls(): retrieves the most recent URLs that have been bookmarked and annotated with a given tag; supports retrieving links from the Delicious hotlist (front page) as well as /popular/<tag> and /tag/<tag>
- get_url(): returns all public bookmarks of a URL, i.e. its "history"
- get_user():
  - returns a user's full bookmark collection, including private bookmarks, if you know the username AND password; in this case, all communication with Delicious.com is encrypted via SSL
  - returns a user's public bookmark collection if you don't know the user's password (additional parameter: max_bookmarks; by default, only the 50 most recent bookmarks are retrieved)
- get_tags_of_user(): returns a user's full tagging vocabulary, i.e. tags and tag counts, aggregated over all public bookmarks
- HTTP proxy support
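To make the idea of a "tagging vocabulary" concrete, here is a small sketch of how tag counts can be aggregated from a list of bookmarks. It assumes bookmarks shaped like the (url, tags, title, notes, timestamp) tuples described in the usage snippet on this page; the function names `count_tags` and `top_tags` are made up for this illustration and are not part of deliciousapi.py, whose own aggregation may differ.

```python
def count_tags(bookmarks):
    """Return a dict mapping each tag to the number of times it was used.

    Assumes each bookmark is a (url, tags, title, notes, timestamp) tuple.
    """
    counts = {}
    for url, tags, title, notes, timestamp in bookmarks:
        for tag in tags:
            counts[tag] = counts.get(tag, 0) + 1
    return counts


def top_tags(counts, limit=10):
    """Return up to `limit` (tag, count) tuples, most frequent first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

Feeding the `.bookmarks` list of a user object into `count_tags()` should give a dict comparable to what get_tags_of_user() reports, at least for the bookmarks that were actually retrieved.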
Here is a code snippet to demonstrate basic usage of deliciousapi.py:
    import deliciousapi

    dapi = deliciousapi.DeliciousAPI()
    url = "http://www.michael-noll.com/wiki/Del.icio.us_Python_API"
    username = "jsmith"

    # web pages shown on the front page of Delicious.com aka the 'hotlist'
    featured_links = dapi.get_urls()

    # popular web pages tagged with "photography"
    popular_photography_links = dapi.get_urls(tag="photography")

    # web pages recently tagged with "web2.0", up to a maximum of
    # 300 URLs if possible; note that get_urls() cannot guarantee
    # that the list of URLs is free of duplicate items - this is
    # due to the way Delicious.com generates the regular feeds for
    # a given tag (i.e. /tag/<tag> as opposed to /popular/<tag>)
    recent_web20_links = dapi.get_urls(tag="web2.0", popular=False, max_urls=300)

    # DeliciousURL object, providing
    #   .title          : title of the web document as stored on delicious.com
    #   .url            : URL of the corresponding web document
    #   .total_bookmarks: total number of bookmarks/users for this url
    #   .bookmarks      : list of (user, tags, comment, timestamp) tuples
    #   .top_tags       : list of (tag, tag_count) tuples, representing the
    #                     most popular tags of this url (up to 10)
    #   .tags           : dict mapping tags to total tag count
    #
    # Note that by default, get_url() retrieves only the
    # 50 most recent bookmarks of a given url. You can control
    # this behavior with the max_bookmarks parameter (see
    # docstrings).
    url_metadata = dapi.get_url(url)

    print url_metadata
    # output: [http://www.michael-noll.com/wiki/Del.icio.us_Python_API] 103 total bookmarks (= users), 187 tags (37 unique), 10 out of 10 max 'top' tags

    print url_metadata.title
    # output: Del.icio.us Python API - Michael G. Noll

    print url_metadata.bookmarks
    # output: [
    #   (u'neetij', [u'python', u'api', u'del.icio.us', u'programming'], None, datetime.datetime(2008, 8, 4, 0, 0)),
    #   (u'jsf.online', [u'software', u'programming', u'free', u'development', u'del.icio.us', u'python', u'2008'], u'Python API - wraps the del.icio.us api for python', datetime.datetime(2008, 8, 4, 0, 0)),
    #   (u'as11018', [u'python', u'api', u'programming'], None, datetime.datetime(2008, 7, 30, 0, 0)),
    #   ...]

    print url_metadata.top_tags
    # output: [ (u'python', 91), (u'api', 73), (u'del.icio.us', 71), ... ]

    print url_metadata.tags
    # output: { u'is:api': 1, u'code': 6, u'toread': 1, ... }

    # If get_user() is called with both username and password, the full
    # bookmark collection of the user is returned, including any private
    # bookmarks. Communication is encrypted via SSL. You can use get_user()
    # for creating a backup of your Delicious.com bookmarks.
    #
    # If get_user() is called without password, only the most recent
    # public bookmarks of the given user are returned (up to 100).
    #
    # DeliciousUser object, providing
    #   .bookmarks : list of (url, tags, title, notes, timestamp) tuples
    #   .tags      : dict mapping tags to total tag count
    #   .username  : name of the corresponding del.icio.us user
    user_metadata = dapi.get_user(username)

    print user_metadata
    # output: [jsmith] 31 bookmarks, 78 tags (45 unique)

    print user_metadata.bookmarks
    # output: [ (u'http://www.twellow.com/', [u'mashup', u'tools', u'twitter'], u'Twellow.com :: Twitter users organized into business categories', u'Kind of yellow pages for Twitter, interesting.', datetime.datetime(2008, 6, 25, 0, 0, 0)), ... ]

    # dict mapping tags to tag counts
    user_tags = dapi.get_tags_of_user(username)

    print user_tags
    # output: { 'golf': 1, 'toread': 11, 'recipe': 1, 'rest': 4, ... }
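Since get_urls() cannot guarantee a duplicate-free result for the regular /tag/<tag> feeds, you may want to filter the list yourself. The helper below is a minimal sketch that assumes the result is a plain list of URL strings; `unique_urls` is a name invented for this example and is not provided by the module.

```python
def unique_urls(urls):
    """Return the URLs in their original order with later duplicates dropped."""
    seen = set()
    result = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

Applied to the snippet above, this would be `unique_urls(recent_web20_links)`. Keeping the first occurrence preserves the feed's ordering, which a plain `set()` would not.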
deliciousmonitor.py
I have also written a Python script for monitoring Delicious.com bookmark RSS feeds. The default RSS feed is the "hotlist" of URLs you see on the Delicious.com front page.
This script requires DeliciousAPI and demonstrates how it can be used. Basically, it mirrors the RSS feed and retrieves additional metadata such as an entry’s most popular tags from the Delicious.com service itself.
Here is some example output:
    <document url="http://www.michael-noll.com/wiki/Del.icio.us_Python_API" users="103" top_tags="10">
        <top_tag name="python" count="91" />
        <top_tag name="api" count="73" />
        <top_tag name="del.icio.us" count="71" />
        <top_tag name="delicious" count="32" />
        <top_tag name="programming" count="29" />
        ...
    </document>
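For readers who want to produce records in the same shape, here is a rough sketch using the standard library's ElementTree. It is not how deliciousmonitor.py is actually implemented; `document_element` is a hypothetical helper, and the assumption that the top_tags attribute holds the number of top tags is inferred from the example output only.

```python
import xml.etree.ElementTree as ET


def document_element(url, users, top_tags):
    """Build a <document> element from a URL, its user count and its top tags.

    top_tags is a list of (tag, tag_count) tuples, most popular first.
    """
    doc = ET.Element("document", {
        "url": url,
        "users": str(users),
        "top_tags": str(len(top_tags)),  # assumption: number of top tags
    })
    for name, count in top_tags:
        ET.SubElement(doc, "top_tag", {"name": name, "count": str(count)})
    return doc
```

A call like `document_element(url_metadata.url, url_metadata.total_bookmarks, url_metadata.top_tags)` followed by `ET.tostring(doc)` would serialize one entry. ElementTree takes care of escaping characters such as `&` in URLs and tag names.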
Download
You can now download and install DeliciousAPI from the Python Cheese Shop via setuptools/easy_install (the package includes only deliciousapi.py). Just run
- easy_install DeliciousAPI, or
- easy_install -U DeliciousAPI for updates
and after installation, a simple import deliciousapi in your Python scripts will do the trick.
An alternative is to download the code straight from my git repository.
- deliciousapi.py
- deliciousmonitor.py (requires deliciousapi.py and Universal Feed Parser)
The code has been tested with Python 2.4.3 and 2.5.
License
The code is licensed to you under the GNU General Public License, version 2.
Feedback
Comments, questions and constructive feedback are always welcome. Just drop me a note.