Hadoop Tutorial : Map Reduce Introduction and Internal Data flow

This tutorial introduces the MapReduce programming paradigm, which is widely used in Big Data analytics. We will also walk through an example step by step to understand the mechanisms involved.
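To make the data flow concrete before diving in, here is a minimal sketch in plain Java (no Hadoop dependency) of the three phases a word count goes through: map, shuffle and reduce. The class name `MapReduceFlowSketch` and its method names are made up for illustration and are not part of the Hadoop API; a real job runs these phases distributed across a cluster, while this sketch runs them in one process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-process illustration of the MapReduce data flow.
class MapReduceFlowSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the list of values for one key into a single sum.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs map, shuffle and reduce in sequence over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(emitted).entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("the quick fox", "the lazy dog")));
    }
}
```

The point to notice is that the reducer never sees individual pairs: it receives each key once, together with every value emitted for that key.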

Continue reading “Hadoop Tutorial : Map Reduce Introduction and Internal Data flow”

Hadoop : WordCount with Custom Mapper and Reducer

Here is the next article in the series. In the last post we learned how to write a word count program without explicit custom mappers or reducers. You can find the post here

Today we will go a step further and rewrite the same word count program with our own custom mapper and reducer.

We will use two classes in addition to our wordcount class.

Class WCMap Continue reading “Hadoop : WordCount with Custom Mapper and Reducer”

Hadoop : A wordcount without explicit mapper/reducer

So I was trying out my first Hadoop program and I was a little wary of writing a mapper and a reducer. But I still wanted the program to give me the count of every word in the input files.

So I wrote a Hadoop driver program that uses TokenCounterMapper as its map class. This class is provided by Hadoop; it tokenizes the input text and emits each word with a count of 1.

Just the recipe that I ordered …
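To get a feel for what that mapper hands over to the next stage, here is a rough plain-Java imitation of its behaviour. This is not Hadoop's source code: the class `TokenCountSketch` and the method `tokenizeAndCount` are names I made up, and the sketch returns a list where the real mapper would write each pair to the job context. It only assumes whitespace tokenization via `StringTokenizer`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-in that mimics "tokenize the line, emit (word, 1)".
class TokenCountSketch {

    // Splits a line on whitespace and pairs every token with the count 1.
    // Repeated words are NOT merged here; summing is the reducer's job.
    static List<Map.Entry<String, Integer>> tokenizeAndCount(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            emitted.add(Map.entry(tokens.nextToken(), 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> pair : tokenizeAndCount("to be or not to be")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Note that "to" and "be" each show up twice in the output, each time with a 1 next to it. The mapper does no adding up at all, which is why a reducer is still needed.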

Now I needed a reducer which Continue reading “Hadoop : A wordcount without explicit mapper/reducer”