AOL500k

From Michael G. Noll

The AOL500k data set is a large collection of 20,000,000 search queries from 650,000 users, sampled over three months (March-May 2006) and originally published by AOL Research. The release spurred extensive public discussion of privacy issues and ended in a PR disaster for AOL. As a result, the two employees responsible were fired and AOL's CTO resigned.

More information about AOL500k, including download links, is available in my blog post: AOL Research publishes 650,000 user queries

Tags: 500k, aol, aol500k, collection, data, privacy, queries, query, query logs, research, sample, search