Open Source Software

You can find my open source software projects primarily under my GitHub account. Because I now work full-time on Apache Kafka, however, there is not much time left for personal projects.

Example projects:

  • Wirbelsturm – A Vagrant- and Puppet-based tool to perform 1-click local and remote deployments, with a focus on big data infrastructure such as Apache Kafka or Apache Storm.
  • avro-hadoop-starter – Example MapReduce jobs in Java, Hadoop Streaming, Pig and Hive that work on Avro data.
  • Replephant – A Clojure library to perform interactive analysis of Hadoop cluster usage via REPL and to generate usage reports.
  • ruby-bootstrap – Bootstraps an installation of rvm, Ruby, bundler and any defined gems in a Ruby project directory.
  • DevOps-related projects such as puppet-storm and puppet-kafka.
  • Cookiemonster – Strip cookies from XMLHttpRequests in Mozilla Firefox, using JavaScript and XUL.


SPEAR Ranking Algorithm

Ching-man Au Yeung and I have developed the SPEAR ranking algorithm for ranking users in social networks by their expertise and influence within the community.

SPEAR Algorithm: Discoverers and Followers
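
To give a rough feel for the discoverers-and-followers idea, here is a minimal sketch in Python. It is not the API of the SPEAR Python library. The function names (`spear_sketch`, `normalize`), the input format, the square-root credit, and the sum-to-one normalization are all assumptions made for illustration. The sketch assumes a HITS-style mutual reinforcement between user expertise and resource quality, in which a user who bookmarks a resource before others (a discoverer) earns more credit than those who come later (followers).

```python
import math


def normalize(scores):
    """Scale scores so they sum to 1 (assumed normalization for this sketch)."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)
    return {key: value / total for key, value in scores.items()}


def spear_sketch(activities, iterations=50):
    """Rank users and resources from (timestamp, user, resource) tuples.

    Hypothetical helper for illustration only, not the SPEAR library's API.
    """
    # Keep only the earliest time each user touched each resource.
    first_seen = {}
    for timestamp, user, resource in activities:
        key = (user, resource)
        if key not in first_seen or timestamp < first_seen[key]:
            first_seen[key] = timestamp

    # Group the (timestamp, user) pairs by resource.
    by_resource = {}
    for (user, resource), timestamp in first_seen.items():
        by_resource.setdefault(resource, []).append((timestamp, user))

    # Credit for a user on a resource: 1 plus the number of later users,
    # dampened by a square root (an assumed credit function).
    weights = {}
    for resource, entries in by_resource.items():
        for timestamp, user in entries:
            followers = sum(1 for other_ts, _ in entries if other_ts > timestamp)
            weights[(user, resource)] = math.sqrt(1 + followers)

    users = {user for user, _ in weights}
    resources = {resource for _, resource in weights}
    expertise = dict.fromkeys(users, 1.0)
    quality = dict.fromkeys(resources, 1.0)

    # Mutual reinforcement: expertise from resource quality, then quality
    # from the updated expertise, followed by normalization of both.
    for _ in range(iterations):
        new_expertise = dict.fromkeys(users, 0.0)
        for (user, resource), weight in weights.items():
            new_expertise[user] += weight * quality[resource]
        new_quality = dict.fromkeys(resources, 0.0)
        for (user, resource), weight in weights.items():
            new_quality[resource] += weight * new_expertise[user]
        expertise = normalize(new_expertise)
        quality = normalize(new_quality)

    return expertise, quality


if __name__ == "__main__":
    # Made-up toy data: three users bookmark the same resource in sequence.
    toy = [(1, "alice", "doc1"), (2, "bob", "doc1"), (3, "carol", "doc1")]
    user_scores, _ = spear_sketch(toy)
    for name in sorted(user_scores, key=user_scores.get, reverse=True):
        print(name, round(user_scores[name], 3))
```

In this toy data the earliest user ranks highest and the last one lowest, which is the discoverer-versus-follower effect in its simplest form. For real use, refer to the SPEAR Python library itself rather than this sketch.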

Our work on SPEAR has been covered by international press and news media, including Technology Review and Communications of the ACM, as well as leading technology blogs such as ReadWriteWeb. We were also very happy that Yahoo! invited us to write a featured article about SPEAR, given that we used a social bookmarking service owned by Yahoo! at the time as the data source for our scientific experiments.

To get started, read my quick introduction on how to use the SPEAR Python library.