Starting and Testing HDFS and MapReduce

To verify that the configuration is set up correctly, start HDFS and MapReduce and run some simple jobs.

  1. On the NameNode host:

    su $HDFS_USER
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start namenode
  2. On the SecondaryNameNode host:

    su $HDFS_USER
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start secondarynamenode
  3. On each DataNode host:

    su root
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
    Note: You must start DataNodes as root because they use a privileged port.

  4. On the JobTracker host:

    su $MAPRED_USER
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start jobtracker
  5. On the History Server host:

    su $MAPRED_USER
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start historyserver
  6. On each TaskTracker host:

    su $MAPRED_USER
    /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start tasktracker
  7. Open the Kerberos administration shell. On the KDC:

    kadmin.local
  8. Create a test hdfs principal. At the kadmin.local prompt:

    addprinc hdfs@EXAMPLE.COM

    You are asked to enter and confirm a password for the principal.

  9. Exit kadmin.local:

    exit
  10. On a machine in the cluster, switch to the hdfs UNIX user and then log in to Kerberos as the test principal:

    su $HDFS_USER
    kinit hdfs@EXAMPLE.COM
  11. Run some sample MapReduce jobs, such as teragen and terasort:

    /usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop/hadoop-examples.jar teragen 100000 /test/10gsort/input
    /usr/lib/hadoop/bin/hadoop jar /usr/lib/hadoop/hadoop-examples.jar terasort 
                                    /test/10gsort/input /test/10gsort/output
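Steps 1 through 6 all invoke the same wrapper script with a different daemon name. The following dry-run helper (a sketch for illustration, not part of the Hadoop distribution) makes the pattern explicit by printing, rather than executing, each start command:

```shell
#!/bin/sh
# Sketch only: prints the start commands used in steps 1-6 above.
# Paths match this document; adjust them if your install differs.
DAEMON_SH=/usr/lib/hadoop/bin/hadoop-daemon.sh
CONF_DIR=/etc/hadoop/conf

start_cmd() {
  # $1 is the daemon name (namenode, datanode, jobtracker, ...)
  echo "$DAEMON_SH --config $CONF_DIR start $1"
}

for d in namenode secondarynamenode datanode jobtracker historyserver tasktracker; do
  start_cmd "$d"
done
```

Run each printed command on the appropriate host, as the appropriate user (remember that DataNodes must be started as root).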

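When sizing the sample job, note that teragen writes fixed 100-byte rows, so the row count controls the data volume. A quick sketch of the arithmetic (the 10 GB sort implied by the /test/10gsort directory name would need a much larger row count than the sample command uses):

```shell
#!/bin/sh
# Each teragen row is exactly 100 bytes.
ROWS=100000
BYTES=$((ROWS * 100))
echo "teragen $ROWS writes $BYTES bytes (about 10 MB)"

# A true 10 GB sort would require 100,000,000 rows:
GB10_ROWS=$((10 * 1000 * 1000 * 1000 / 100))
echo "rows needed for 10 GB: $GB10_ROWS"
```

The 100,000-row sample is deliberately small so the smoke test finishes quickly; it still exercises HDFS writes, the shuffle, and Kerberos-authenticated job submission end to end.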