Apache Hadoop is a free and open source implementation of frameworks for reliable, scalable, distributed computing and data storage. It enables applications to work with thousands of nodes and petabytes of data, which makes it well suited to large-scale data processing in both research and business settings. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.
I have written the following tutorials related to the Hadoop technology stack:
- Writing An Hadoop MapReduce Program In Python
- Running Hadoop On Ubuntu Linux (Single-Node Cluster)
- Running Hadoop On Ubuntu Linux (Multi-Node Cluster)
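To give a feel for the MapReduce programming model that Hadoop implements, here is a minimal word-count sketch in plain Python. It runs in a single process and does not use Hadoop at all; the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical and chosen only for illustration, and the code is not taken from the tutorials listed above.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group values by key, mimicking what the framework does between
    the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reduce step: sum all counts emitted for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce locally over an iterable of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["foo bar foo", "Bar baz"]))
```

In a real cluster the map and reduce steps would run on different nodes and the grouping would be handled by the framework; the sketch only shows how the three phases fit together.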
Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. It is being used at companies and institutions such as Twitter, Yahoo! and PARC.
I have written the following tutorials related to Storm:
- Running a Multi-Node Storm Cluster
- Implementing Real-Time Trending Topics with a Distributed Rolling Count Algorithm in Storm
- Understanding the Parallelism of a Storm Topology
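As a rough illustration of the rolling count idea mentioned in the trending topics tutorial title, here is a single-process sliding-window counter in plain Python. It is only a sketch under my own assumptions: the class `RollingCounter` and its methods are hypothetical names, it does not use the Storm API, and it leaves out the distributed part entirely.

```python
from collections import defaultdict


class RollingCounter:
    """Counts items over a sliding window made of a fixed number of slots.

    Each call to advance() moves the window forward by one slot and
    forgets whatever was counted in the oldest slot.
    """

    def __init__(self, num_slots):
        if num_slots < 1:
            raise ValueError("num_slots must be at least 1")
        self.num_slots = num_slots
        self.head = 0  # index of the slot currently being written to
        self.slots = defaultdict(lambda: [0] * num_slots)

    def increment(self, item):
        """Record one occurrence of item in the current slot."""
        self.slots[item][self.head] += 1

    def totals(self):
        """Return the per-item counts summed over the whole window."""
        return {item: sum(counts) for item, counts in self.slots.items()}

    def advance(self):
        """Move to the next slot, clearing it so the oldest counts expire."""
        self.head = (self.head + 1) % self.num_slots
        for counts in self.slots.values():
            counts[self.head] = 0
        # Drop items that no longer have any counts inside the window.
        for item in [i for i, c in self.slots.items() if sum(c) == 0]:
            del self.slots[item]


if __name__ == "__main__":
    counter = RollingCounter(num_slots=2)
    for topic in ["a", "a", "b"]:
        counter.increment(topic)
    counter.advance()
    counter.increment("a")
    print(counter.totals())
```

In a streaming setting, `advance()` would typically be called on a timer so that each slot covers a fixed time interval, while `increment()` is called for every incoming item.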
If you are a Firefox add-on developer, the tutorials below might come in handy.
- Cookie Monster for XMLHttpRequest