Thursday, February 6, 2014

Running "WordCount" Map Reduce Job in Hadoop 1.0.3

This post explains the steps required to run the WordCount MapReduce job in Hadoop 1.0.3.

  1. Create a folder to store the input files; words will be counted from these files. For the current setup we have three books in plain-text format.
  • su - hduser
  • mkdir /tmp/sandhu
2. Copy the three files into the /tmp/sandhu folder and verify them with the following commands.
  • cd /tmp/sandhu
  • ls -l
The output should list the three files.

3. Start the Hadoop Cluster:
  • /home/hadoop/bin/start-all.sh
4. Before we run the actual MapReduce job, we first have to copy the files from our local file system to Hadoop’s HDFS.
  • cd /home/hadoop
  • bin/hadoop dfs -copyFromLocal /tmp/sandhu /home/hduser/sandhu
Check that the files were copied to HDFS correctly with the following command (a programmatic check using the FileSystem API is sketched after this step).
  • bin/hadoop dfs -ls /home/hduser/sandhu
The output should list the copied files.
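
The same check can also be done from Java through Hadoop's FileSystem API. The sketch below is only an illustration: the class name ListInput is made up for this post, and it assumes it is run on a cluster node with the cluster's core-site.xml on the classpath so that fs.default.name points at HDFS.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListInput {
  public static void main(String[] args) throws Exception {
    // Picks up fs.default.name from the core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // List the files copied into HDFS in step 4.
    for (FileStatus status : fs.listStatus(new Path("/home/hduser/sandhu"))) {
      System.out.println(status.getPath() + "\t" + status.getLen() + " bytes");
    }
  }
}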
5. Now, we actually run the WordCount example job.
  • bin/hadoop jar hadoop*examples*.jar wordcount /home/hduser/sandhu /home/hduser/sandhu-output
The console output will show the job's map and reduce progress.


6. Retrieve the job result from HDFS:
  • bin/hadoop dfs -cat /home/hduser/sandhu-output/part-r-00000
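The result can also be read from a Java program via the FileSystem API. This is a minimal sketch, not a required step: the class name ReadResult is a placeholder, and it assumes the cluster configuration is on the classpath and that the job wrote its output to /home/hduser/sandhu-output as in step 5.

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadResult {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Output file written by the reducer in step 5; each line is "word<TAB>count".
    Path result = new Path("/home/hduser/sandhu-output/part-r-00000");
    BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(result)));
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    } finally {
      reader.close();
    }
  }
}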
7. Hadoop APIs
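
The job we ran in step 5 is built on the org.apache.hadoop.mapreduce API (Mapper, Reducer and Job). Below is a minimal WordCount sketch along the lines of the example bundled with Hadoop 1.0.3, shown here only to illustrate how the pieces fit together; it is not a copy of the shipped source.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: wires the mapper and reducer together and submits the job.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "word count");   // constructor used by the 1.0.x API
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

If you compile and package a class like this yourself, it can be submitted the same way as in step 5 (bin/hadoop jar <your-jar> WordCount <input> <output>); the jar name here is a placeholder, and the examples jar shipped with Hadoop already contains an equivalent job.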

