Publications

From Michael G. Noll

Here is an overview of my papers, articles and other published documents.

Academic Papers & Publications

Talks & Presentations

I have been a speaker at international scientific conferences, research seminars and workshops such as:

Selected Press Coverage

Research Data Sets

  • CABS120k08 (published 2008)
    Large research data set about Web metadata based on a sample of 120,000 Web documents with data retrieved from the Open Directory Project, the AOL Search query log corpus AOL500k, Google PageRank, Delicious.com/Yahoo!, and anchor text from incoming hyperlinks
  • DMOZ100k06 (published 2007)
    Large research data set about document metadata based on a random sample of 100,000 Web documents from the Open Directory combined with data retrieved from Delicious.com/Yahoo!, Google, and ICRA

Tutorials

In my PhD project, I use Hadoop extensively. Hadoop is a Yahoo!-sponsored open-source framework for distributed computing and data storage, similar to Google's MapReduce and Google File System. Here are some tutorials to get you started.