Hadoop Tutorial: Getting Started with Hadoop and MapReduce

This tutorial covers resources you can use to learn about Hadoop and MapReduce. It also discusses how to approach learning Big Data as a subject in its entirety.

Studying Hadoop or MapReduce can be a daunting task if you get your hands dirty right at the start.

One prerequisite for learning Hadoop is solid experience with Java. Good analytical skills help a lot as well, and the final secret sauce for success is motivation: you need to teach yourself a lot of things in the Big Data arena.

To learn Hadoop, I followed this schedule:

  1. Start with the very basics of MR at code.google.com/edu/parallel/dsd-tutorial.html and code.google.com/edu/parallel/mapreduce-tutorial.html
  2. Then watch the first two lectures at http://www.cs.washington.edu/education/courses/cse490h/08au/lectures.htm, a very good introductory course on MapReduce and Hadoop.
  3. Read the seminal paper at labs.google.com/papers/mapreduce.html and the improvements in its updated version at http://www.cs.washington.edu/education/courses/cse490h/08au/readings/communications200801-dl.pdf
  4. Then watch all the other videos at the U. Washington link given above.
  5. Search YouTube for the terms MapReduce and Hadoop to find videos by O’Reilly and Google RoundTable that give a good overview of the future of Hadoop and MapReduce.
  6. Then move on to the most important videos:
    Cloudera Videos
    Google MiniLecture Series

Along with all the multimedia above, we need good written material:

  1. Architecture diagrams at hadooper.blogspot.com are good to have on your wall.
  2. Hadoop: The Definitive Guide goes deeper into the nuts and bolts of the whole system, whereas Hadoop in Action is a good read with lots of teaching examples for learning Hadoop concepts. Pro Hadoop is not for beginners.
  3. PDFs of the documentation from the Apache Foundation and hadoop.apache.org/common/docs/stable/ will help you learn how to model your problem as an MR solution, so that you gain the full advantages of Hadoop.
  4. The HDFS paper by Yahoo! Research is also a good read for gaining in-depth knowledge of Hadoop.
  5. Subscribe to the user mailing lists of Common, MapReduce and HDFS to learn about current problems, their solutions and planned fixes.
  6. Try the http://developer.yahoo.com/hadoop/tutorial/module1.html link for a beginner-to-expert path to Hadoop.
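
To make "modeling your problem as an MR solution" concrete, here is a minimal sketch that simulates the map, shuffle and reduce phases in memory on one machine. It is not Hadoop code: the helper names (run_mapreduce, parse_reading, max_reading) and the sample "year,temperature" records are made up purely for illustration.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Simulate the three MapReduce phases in memory (illustration only)."""
    # Map phase: every input record yields zero or more (key, value) pairs.
    intermediate = [pair for record in records for pair in mapper(record)]

    # Shuffle phase: collect all values that share the same key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: collapse each key's list of values into one result.
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def parse_reading(record):
    """Mapper: turn a hypothetical 'year,temperature' line into (year, temp)."""
    year, temperature = record.split(",")
    yield year, int(temperature)


def max_reading(year, temperatures):
    """Reducer: keep the highest temperature seen for a year."""
    return max(temperatures)


# Made-up sample input, not real measurements.
readings = ["1949,111", "1950,0", "1949,78", "1950,22", "1950,-11"]
result = run_mapreduce(readings, parse_reading, max_reading)
print(result)  # {'1949': 111, '1950': 22}
```

The point of the exercise is to decide what the key is (here, the year) and what has to happen to all values that share a key (here, taking the maximum). Once a problem is phrased that way, the framework can spread the map and reduce calls across many machines.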

In addition, the following two books are good resources:

  • Hadoop: The Definitive Guide
  • Hadoop in Action

For any queries …
Contact Apache, Google, Bing, Yahoo!

11 thoughts on “Hadoop Tutorial : Getting started with Hadoop and Mapreduce”

  1. Hello, I’m a network admin on the Windows side looking into Hadoop training. I have no knowledge of it beyond reading about it, and no programming skills. I see that Java and an understanding of it are required. What Java course do you recommend I take (intro to Java programming or some other Java course)? Would Java alone be enough before I take a Hadoop Administration course, or do I need Linux experience as well? If you can, please let me know. Thanks.


    1. Hi Mehul,

      MapReduce and HDFS are not about the language, but more about conceptually understanding how distributed components work. If you are good with computer science and algorithms, the language won’t be difficult to pick up for Hadoop. Also, as Hadoop supports hadoop-streaming, you are not bound to one language. I’d prefer Python as it’s easier to learn and has good learning resources around it (http://www.codecademy.com/tracks/python).
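
As a rough sketch of what a streaming-style job can look like in Python: with streaming, the mapper and reducer are ordinary programs that read lines from standard input and write tab-separated key/value lines to standard output. The function names and the single-script "map"/"reduce" argument below are assumptions made for illustration, not part of Hadoop itself.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of consecutive lines that share a key.

    Assumes the input is already sorted by key, which is what happens
    between the map and reduce steps.
    """
    pairs = (
        line.rstrip("\n").split("\t", 1)
        for line in sorted_lines
        if line.strip()  # skip blank lines
    )
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical command-line convention: pass "map" or "reduce" as the argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for output_line in step(sys.stdin):
        print(output_line)
```

Saved as, say, wordcount.py (a hypothetical file name), the script can be tried without a cluster by piping a text file through `python wordcount.py map | sort | python wordcount.py reduce`; the `sort` stands in for the shuffle step between the two phases.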

      If you want more tutorials on Hadoop internals, you may be interested in the following links:

      A presentation I gave at colleges in Solapur, which you may want to go through: http://www.slideshare.net/VaradMeru/big-data-hadoop-nosql-and-more

      Please find below the blog articles I’ve written, in logical order:

      Hadoop Setup on a Single Node (for Dev)

      Eclipse Setup for programming

      Step-by-Step MapReduce Programming

      Pig for Beginners

      Hive for Beginners


  2. Hi,
    I recently started my job search and have lately been hearing about Hadoop. Before that, I was thinking of getting trained in Java. Now I’m in a dilemma, choosing between becoming a Java developer and learning Hadoop, and I need some advice. If we know Java, can we get ourselves trained in Hadoop, and does Hadoop really need hands-on experience in Java? I only know the OOP concepts in Java.

    Please help me out.


  3. Hello sir,
    I have one question for you, or for anyone who can answer it:
    Why do we use Writables in Hadoop to transfer data across the network when Java already has serialization?


  4. Hello sir, I’m a system admin on the Windows side and I am looking into getting into Hadoop.
    I have very few programming skills and I don’t know much about Java.

    Can I take a Hadoop Administration course, or do I need to learn Java and Linux first?


  5. Hi,
    I recently started my job search and have lately been hearing about Hadoop. Before that, I was thinking of getting trained in Java. Now I’m in a dilemma, choosing between becoming a Java developer and learning Hadoop, and I need some advice. I have worked on SAP and Tally earlier.

