Tutorial: Writing An Hadoop MapReduce Program In Python

I finished another tutorial in my Hadoop series:
Writing An Hadoop MapReduce Program In Python.

In this hands-on tutorial, I describe how to write a simple MapReduce program for Hadoop in the Python programming language without using Jython to translate our code into Java jar files. It’s the most Pythonic and straightforward way I know of to write a MapReduce program for the Hadoop framework. I tried to keep the code as readable and understandable as possible while using no more than 25 lines of code (excluding comments).

Of course, the tutorial also explains how to prepare and run the Python MapReduce program on a Hadoop cluster.
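
To give a rough idea of what such a program can look like, here is a minimal sketch. It assumes a word-count job and the plain-text, tab-separated record style used by Hadoop Streaming, which is one common way to run Python on Hadoop without Jython; the code in the tutorial itself may differ. The names `mapper`, `reducer` and `run_locally` are hypothetical and only serve this illustration.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one tab-separated "word<TAB>1" record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Reduce step: sum the counts for each word.

    The input must already be sorted by word, which is what Hadoop's
    shuffle phase provides between the map and reduce steps.
    """
    pairs = (record.rsplit("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort -> reduce in one process (no Hadoop needed)."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_locally(["foo bar foo", "bar baz foo"]):
        print(record)
```

On a real cluster, the map and reduce steps would typically live in two separate scripts that read from standard input and write to standard output; the local simulation above is just a quick way to check the logic before submitting a job.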