Michael G. Noll

Applied Research. Big Data. Distributed Systems. Open Source.


Here is an overview of my scientific papers, articles and other published documents.

Academic Papers & Publications

Talks & Presentations

I have been a speaker at international scientific and industrial conferences, meetups, and workshops.

Selected Press Coverage

Research Data Sets

  • CABS120k08 (published 2008)
    Large research data set about web metadata based on a sample of 120,000 web documents with data retrieved from the Open Directory Project, the AOL Search query log corpus AOL500k, Google PageRank, Delicious.com/Yahoo!, and anchor text from incoming hyperlinks.
  • DMOZ100k06 (published 2007)
    Large research data set about document metadata based on a random sample of 100,000 web documents from the Open Directory combined with data retrieved from Delicious.com/Yahoo!, Google, and ICRA.


See my separate Tutorials section.