About this entry
You’re currently reading “The Metadata Triumvirate: Social Annotations, Anchor Texts and Search Queries”.
- Author:
- Michael G. Noll
- Published:
- Sep 05, 2008
- Last updated:
- Jul 20, 2010
- Tags:
- acm, anchor text, anchor texts, aol500k, cabs120k08, categorization, classification, del.icio.us, dmoz, evaluation, google, ieee, metadata, open directory project, paper, papers, publication, Publications, Research, search queries, search query, social-bookmarking, social-tagging, study, tag, tagging, tags, web2.0, wi, yahoo
The Metadata Triumvirate: Social Annotations, Anchor Texts and Search Queries
My paper “The Metadata Triumvirate: Social Annotations, Anchor Texts and Search Queries” has been accepted for publication and presentation at this year’s IEEE/WIC/ACM International Conference on Web Intelligence (WI), which will be held in Sydney, Australia, from December 9 to 12, 2008.
Abstract
In this paper, we study and compare three different but related types of “metadata” about web documents: social annotations provided by readers of web documents, hyperlink anchor text provided by authors of web documents, and search queries of users trying to find web documents. We introduce a large research data set called CABS120k08 which we have created for this study from a variety of information sources such as AOL500k, the Open Directory Project, del.icio.us/Yahoo!, Google and the WWW in general. We use this data set to investigate several characteristics of said metadata including length, novelty, diversity, and similarity and discuss theoretical and practical implications.
Full Paper & Presentation
- M. G. Noll, C. Meinel
The Metadata Triumvirate: Social Annotations, Anchor Texts and Search Queries (PDF)
Proceedings of 7th IEEE/WIC/ACM International Conference on Web Intelligence (WI), IEEE CS Press, Sydney, Australia, December 2008, pp. 640-647, ISBN 978-0-7695-3496-1 (IEEE Link, BibTeX)
- Presentation: The Metadata Triumvirate (PDF), my talk at WI 2008

Related Links
- List of my publications
- Exploring Social Annotations for Web Document Classification, Proceedings of 23rd Int’l ACM Symposium on Applied Computing, Fortaleza, Ceará, Brazil, March 2008, pp. 2315-2320, ISBN 978-1-59593-753-7
- Authors vs. Readers: A Comparative Study of Document Metadata and Content in the WWW, Proceedings of 7th Int’l ACM Symposium on Document Engineering (ACM DocEng), Winnipeg, Canada, August 2007, pp. 177-186, ISBN 978-1-59593-776-6
- 7th IEEE/WIC/ACM International Conference on Web Intelligence (WI), Sydney, Australia, December 2008
- CABS120k08, a large research data set about Web metadata based on a sample of 120,000 web documents with data retrieved from the Open Directory Project, the AOL Search query log corpus AOL500k, Google PageRank, Delicious.com, and anchor text from incoming hyperlinks
- DMOZ100k06, a large research data set about document metadata based on a random sample of 100,000 web documents