2.3.1. Configuring Secure HDFS and MapReduce

Use the following instructions to configure, start and test secure HDFS and MapReduce:

  1. Using a text editor, edit the /etc/hadoop/conf/core-site.xml file on every host in your cluster to add or modify the following information:

    [Note] Note

    Be sure to set the hadoop.security.auth_to_local key with your mapping rules.

    <property>   
            <name>hadoop.security.authentication</name>   
            <value>kerberos</value>   
            <description>Set the authentication for the cluster. Valid values are: simple or kerberos.   
            </description>  
    </property> 

    <property>  
            <name>hadoop.security.authorization</name>  
            <value>true</value>  
            <description>Enable authorization for different protocols.  
            </description> 
    </property>  

    <property>
            <name>hadoop.security.auth_to_local</name>
            <value>RULE:[2:$1@$0]([jt]t@.*EXAMPLE.COM)s/.*/$MAPREDUCE_USER/
    RULE:[2:$1@$0]([nd]n@.*EXAMPLE.COM)s/.*/$HDFS_USER/
    DEFAULT</value>
            <description>The mapping from Kerberos principal names to local OS user names.</description>
    </property>

    For mapping Kerberos principal names to local OS user names, see Creating Mappings Between Principals and UNIX Usernames.
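
    One way to check that these rules resolve principals to the local users you expect is to run Hadoop's HadoopKerberosName utility class once core-site.xml is in place. The host name and realm below are examples only; substitute principals from your own realm:

    hadoop org.apache.hadoop.security.HadoopKerberosName nn/host1.example.com@EXAMPLE.COM
    hadoop org.apache.hadoop.security.HadoopKerberosName jt/host1.example.com@EXAMPLE.COM

    Each command prints the local OS user name that the principal resolves to, so you can confirm that the NameNode/DataNode principals map to the user you substituted for $HDFS_USER and the JobTracker/TaskTracker principals map to the user you substituted for $MAPREDUCE_USER before restarting any services.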

  2. Using a text editor, edit the /etc/hadoop/conf/hdfs-site.xml file on every host in your cluster to add or modify the following information:

    <property>
            <name>dfs.block.access.token.enable</name> 
            <value>true</value> 
            <description> If "true", access tokens are used as capabilities
            for accessing datanodes. If "false", no access tokens are checked on
            accessing datanodes. </description> 
    </property> 

    <property>
            <name>dfs.namenode.kerberos.principal</name> 
            <value>nn/_HOST@EXAMPLE.COM</value> 
            <description> Kerberos principal name for the
            NameNode </description> 
    </property>   

    <property>
            <name>dfs.secondary.namenode.kerberos.principal</name> 
            <value>nn/_HOST@EXAMPLE.COM</value>    
            <description>Kerberos principal name for the secondary NameNode.    
            </description>          
    </property>  

    <property>
            <name>dfs.web.authentication.kerberos.keytab</name> 
            <value>/etc/security/keytabs/spnego.service.keytab</value>    
            <description>The Kerberos keytab file with the credentials for the
      	HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.    
            </description>          
    </property>  

    <property>
            <name>dfs.namenode.keytab.file</name> 
            <value>/etc/security/keytabs/nn.service.keytab</value>    
            <description>Combined keytab file containing the namenode service and host principals.   
            </description>          
    </property>  

    <property>
            <name>dfs.secondary.namenode.keytab.file</name> 
            <value>/etc/security/keytabs/nn.service.keytab</value>    
            <description>Combined keytab file containing the namenode service and host principals.   
            </description>          
    </property>  

    <property>
            <name>dfs.datanode.keytab.file</name> 
            <value>/etc/security/keytabs/dn.service.keytab</value>    
            <description>The filename of the keytab file for the DataNode.   
            </description>          
    </property>  

    <property>
            <name>dfs.datanode.kerberos.principal</name> 
            <value>dn/_HOST@EXAMPLE.COM</value>    
            <description>The Kerberos principal that the DataNode runs as. "_HOST" is replaced by the real host name.
             </description>          
    </property>  

    <property>
            <name>dfs.datanode.address</name> 
            <value>0.0.0.0:1019</value>           
    </property>  

    <property>
            <name>dfs.datanode.http.address</name> 
            <value>0.0.0.0:1022</value>           
    </property>  

    <property>
            <name>dfs.namenode.kerberos.internal.spnego.principal</name> 
            <value>${dfs.web.authentication.kerberos.principal}</value>           
    </property>  

    <property>
            <name>dfs.secondary.namenode.kerberos.internal.spnego.principal</name> 
            <value>${dfs.web.authentication.kerberos.principal}</value>
    </property>  

    <property>
            <name>dfs.web.authentication.kerberos.principal</name> 
            <value>HTTP/_HOST@EXAMPLE.COM</value>
            <description>The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.
             </description>          
    </property>  
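
    Before restarting HDFS, it can be useful to confirm that the keytab files referenced above are present on each host and contain the expected entries. The paths below simply reuse the keytab locations from the properties in this step:

    klist -kt /etc/security/keytabs/nn.service.keytab
    klist -kt /etc/security/keytabs/dn.service.keytab
    klist -kt /etc/security/keytabs/spnego.service.keytab

    klist -kt lists every principal stored in a keytab, so you can verify that the nn/_HOST, dn/_HOST, and HTTP/_HOST entries were created for the correct fully qualified host names.
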
  3. Using a text editor, edit the /etc/hadoop/conf/mapred-site.xml file on every host in your cluster to add or modify the following information:

    <property>  
            <name>mapreduce.jobtracker.kerberos.principal</name>  
            <value>jt/_HOST@EXAMPLE.COM</value>  
            <description>Kerberos principal name for the JobTracker   </description> 
    </property>  

    <property>  
            <name>mapreduce.tasktracker.kerberos.principal</name>   
            <value>tt/_HOST@EXAMPLE.COM</value>  
            <description>Kerberos principal name for the TaskTracker. "_HOST" is replaced by the host name of the TaskTracker.
            </description> 
    </property> 

    <property>   
            <name>mapreduce.jobtracker.keytab.file</name>   
            <value>/etc/security/keytabs/jt.service.keytab</value>   
            <description>The keytab for the JobTracker principal.   
            </description>   
    </property>

    <property>   
            <name>mapreduce.tasktracker.keytab.file</name>   
            <value>/etc/security/keytabs/tt.service.keytab</value>    
            <description>The filename of the keytab for the TaskTracker</description>  
    </property>

    <property>    
            <name>mapreduce.jobhistory.kerberos.principal</name>     
            <!--cluster variant -->  
            <value>jt/_HOST@EXAMPLE.COM</value>    
            <description> Kerberos principal name for JobHistory. 
                          This must map to the same user as the JobTracker user ($MAPREDUCE_USER).
            </description>  
    </property> 

    <property>   
            <name>mapreduce.jobhistory.keytab.file</name>     
            <!--cluster variant -->   
            <value>/etc/security/keytabs/jt.service.keytab</value>   
            <description>The keytab for the JobHistory principal.
            </description>  
    </property>   
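
    As a final check of the MapReduce keytabs, you can try obtaining a ticket with kinit before starting the daemons. The host name below is taken from hostname -f and the realm is the example realm used throughout this section; adjust both for your environment. Note that kdestroy clears the current credential cache, so run this test from a session whose tickets you do not need to keep:

    kinit -kt /etc/security/keytabs/jt.service.keytab jt/$(hostname -f)@EXAMPLE.COM
    klist
    kdestroy

    If kinit succeeds, the keytab and principal match what is registered in the KDC; a failure here usually means the principal was created for a different host name or the keytab was copied to the wrong machine.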

