So I was trying out my first Hadoop program, and I was a little wary of writing my own mapper and reducer. But I still wanted to write a program that would give me the word count for all words in the input files.
So I wrote a Hadoop driver program with TokenCounterMapper as the map class. This class is provided by Hadoop; it tokenizes the input text and emits each word with a count of 1.
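To see what TokenCounterMapper is doing under the hood, here is a plain-Java sketch of the idea: split each input line on whitespace and emit a (word, 1) pair per token. This is an illustration only, with no Hadoop types; the class and method names are my own, not Hadoop's.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;
import java.util.StringTokenizer;

// Illustration only: the core of what TokenCounterMapper's map() does
// for a single line of input text.
public class TokenEmitSketch {
    // Emit a (word, 1) pair for every whitespace-separated token in the line.
    public static List<Entry<String, Integer>> map(String line) {
        List<Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            emitted.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return emitted;
    }
}
```

In the real mapper the pairs go to the framework via `context.write(...)` rather than being returned, and the framework then groups them by word before the reducer sees them.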
Just the recipe that I ordered …
Now I needed a reducer that could actually count. So I used the IntSumReducer class, which sums the values in the reducer's input list and writes the result to the context.
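Stripped of the Hadoop types, the reduce step is just adding up the list of counts that the shuffle phase grouped under each word. A minimal sketch of that idea (the class name is mine, for illustration only):

```java
import java.util.List;

// Illustration only: the core of what IntSumReducer's reduce() does
// for a single key (word).
public class SumSketch {
    // Sum the counts that were grouped under one word.
    public static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }
}
```

The real reducer receives an `Iterable<IntWritable>` per key and writes the sum back through the context, but the arithmetic is exactly this.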
Bingo, the program does what it is supposed to do. Counting words…
Here is the listing…
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class wordcount extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        int rc = ToolRunner.run(configuration, new wordcount(), args);
        System.exit(rc);
    }

    @Override
    public int run(String[] args) throws Exception {
        // Use the configuration that ToolRunner populated for us.
        Job job = new Job(getConf());
        job.setJarByClass(wordcount.class);
        job.setMapperClass(TokenCounterMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // waitForCompletion submits the job and blocks until it finishes;
        // by convention, return 0 on success and nonzero on failure.
        return job.waitForCompletion(true) ? 0 : 1;
    }
}
The class names are changed… the functions within remain the same… nice try!
Can you tell me the procedure (steps) for executing a MapReduce program using Eclipse, and how to use it in Hadoop? It will help.
Hello, just wanted to mention, I liked this blog post. It was practical. Keep on posting!