Here is an overview of my talks, podcasts, scientific papers, external articles, and other published content.

External Articles

Podcasts

Talks & Presentations

Academic Papers & Publications

Selected Press Coverage

Research Data Sets

  • CABS120k08 (published 2008)
    Large research data set on web metadata, based on a sample of 120,000 web documents, with data retrieved from the Open Directory Project, the AOL Search query log corpus AOL500k, Google PageRank, Delicious.com/Yahoo!, and anchor text from incoming hyperlinks.
  • DMOZ100k06 (published 2007)
    Large research data set on document metadata, based on a random sample of 100,000 web documents from the Open Directory Project, combined with data retrieved from Delicious.com/Yahoo!, Google, and ICRA.

Patents

Tutorials

See my separate Tutorials section.