Here is an overview of my talks, scientific papers, articles, and other published content.

Talks & Presentations

Upcoming talks:

Past talks:

Academic Papers & Publications

Selected Press Coverage

Research Data Sets

  • CABS120k08 (published 2008)
    Large research data set about web metadata based on a sample of 120,000 web documents, with data retrieved from the Open Directory Project, the AOL Search query log corpus AOL500k, Google PageRank,!, and anchor text from incoming hyperlinks.
  • DMOZ100k06 (published 2007)
    Large research data set about document metadata based on a random sample of 100,000 web documents from the Open Directory combined with data retrieved from!, Google, and ICRA.


See my separate Tutorials section.