Kerberos Authentication for Hadoop

For user authentication in Hadoop, CloverDX can use the Kerberos authentication protocol.

To use Kerberos, you must configure three things: Java, the project, and the Hadoop connection. For more information, see Kerberos requirements and setting.

Note that the following instructions apply to the Tomcat application server on Unix-like systems.

Java Setting

There are three ways to configure Java for Kerberos. For the first two options (configuration via system properties and via an explicit configuration file), you must add the JVM parameters to both setenv.sh in CloverDX Server and CloverDXDesigner.ini in CloverDX Designer.

In CloverDX Designer, also add the same parameters to the Window > Preferences > CloverDX Runtime > VM parameters pane.

Configuration via system properties

Set the Java system property java.security.krb5.realm to the name of your Kerberos realm, for example:
```
-Djava.security.krb5.realm=EXAMPLE.COM
```
Set the Java system property java.security.krb5.kdc to the hostname of your Kerberos key distribution center, for example:
```
-Djava.security.krb5.kdc=kerberos.example.com
```

Configuration via config file

Set the Java system property java.security.krb5.conf to point to the location of your Kerberos configuration file, for example:
```
-Djava.security.krb5.conf="/path/to/krb5.conf"
```
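
The two property-based options above can be sanity-checked from inside the JVM. The following is a minimal sketch, not part of CloverDX: the class name Krb5SettingsCheck and its check method are hypothetical. It relies on the JDK rule that java.security.krb5.realm and java.security.krb5.kdc must be set together, and it only tests that the file named by java.security.krb5.conf is readable, not that its content is valid.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a CloverDX or Hadoop class.
final class Krb5SettingsCheck {

    /** Returns human-readable problems; an empty list means the settings look consistent. */
    static List<String> check(String realm, String kdc, String confPath) {
        List<String> problems = new ArrayList<>();
        // The JDK expects the realm and the KDC to be set together or not at all.
        if ((realm == null) != (kdc == null)) {
            problems.add("java.security.krb5.realm and java.security.krb5.kdc must be set together");
        }
        // If an explicit configuration file is named, it has to be readable by the JVM user.
        if (confPath != null && !Files.isReadable(Paths.get(confPath))) {
            problems.add("java.security.krb5.conf points to an unreadable file: " + confPath);
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = check(
                System.getProperty("java.security.krb5.realm"),
                System.getProperty("java.security.krb5.kdc"),
                System.getProperty("java.security.krb5.conf"));
        if (problems.isEmpty()) {
            System.out.println("Kerberos system properties look consistent");
        }
        problems.forEach(System.out::println);
    }
}
```

Running the class with the same -D parameters that you put into setenv.sh or CloverDXDesigner.ini shows whether the JVM sees them as intended.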

Configuration via config file in Java installation directory

Put the krb5.conf file into the $JAVA_HOME/lib/security directory, e.g. /opt/jdk-11.0.14+9/lib/security/krb5.conf.

Project Setting

Copy the .keytab file into the project, e.g. conn/clover.keytab.
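
Before wiring the keytab into a connection, it can help to confirm that the keytab and principal work on their own. The sketch below is an assumption-laden illustration rather than anything CloverDX does internally: the class KeytabLoginCheck is hypothetical, and it uses the JDK's standard JAAS API with the Krb5LoginModule to attempt a keytab login. The actual login in main requires a reachable KDC and the Java setting described above.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Hypothetical helper for illustration; not a CloverDX class.
final class KeytabLoginCheck {

    /** Builds an in-memory JAAS configuration for a keytab-based Kerberos login. */
    static Configuration keytabConfig(String principal, String keytabPath) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");      // take the key from the keytab
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");    // fail instead of asking for a password
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED, options);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] { entry };
            }
        };
    }

    public static void main(String[] args) throws LoginException {
        // args[0] = principal, e.g. clover/clover@EXAMPLE.COM; args[1] = path to the keytab
        LoginContext ctx = new LoginContext("keytab-check", null, null, keytabConfig(args[0], args[1]));
        ctx.login(); // throws LoginException if the keytab or principal is rejected
        System.out.println("Logged in as: " + ctx.getSubject().getPrincipals());
        ctx.logout();
    }
}
```

A LoginException at this stage points to a problem with the keytab, the principal, or the Java setting rather than with the Hadoop connection itself.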

Connection Setting

Kerberos authentication requires the hadoop-auth-*.jar library on the classpath of both the HDFS and MapReduce connection and the Hive connection.

HDFS and MapReduce Connection

Set Username to the principal name, e.g. clover/clover@EXAMPLE.COM.

Set the following parameters in the Hadoop Parameters pane:

cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab
hadoop.security.authentication=Kerberos
yarn.resourcemanager.principal=yarn/_HOST@EXAMPLE.COM

Example 10. Properties needed to connect to a Hadoop High Availability (HA) cluster in Hadoop connection

mapreduce.app-submission.cross-platform\=true

yarn.application.classpath\=\:$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/*, $HADOOP_COMMON_HOME/lib/*, $HADOOP_HDFS_HOME/*, $HADOOP_HDFS_HOME/lib/*, $HADOOP_MAPRED_HOME/*, $HADOOP_MAPRED_HOME/lib/*, $HADOOP_YARN_HOME/*, $HADOOP_YARN_HOME/lib/*\:
yarn.app.mapreduce.am.resource.mb\=512
mapreduce.map.memory.mb\=512
mapreduce.reduce.memory.mb\=512
mapreduce.framework.name\=yarn
yarn.log.aggregation-enable\=true

mapreduce.jobhistory.address\=example.com\:port

yarn.resourcemanager.ha.enabled\=true
yarn.resourcemanager.ha.rm-ids\=rm1,rm2
yarn.resourcemanager.hostname.rm1\=example.com
yarn.resourcemanager.hostname.rm2\=example.com
yarn.resourcemanager.scheduler.address.rm1\=example.com\:port
yarn.resourcemanager.scheduler.address.rm2\=example.com\:port

fs.permissions.umask-mode\=000
fs.defaultFS\=hdfs\://nameservice1
fs.default.name\=hdfs\://nameservice1
dfs.nameservices\=nameservice1
dfs.ha.namenodes.nameservice1\=namenode1,namenode2
dfs.namenode.rpc-address.nameservice1.namenode1\=example.com\:port
dfs.namenode.rpc-address.nameservice1.namenode2\=example.com\:port
dfs.client.failover.proxy.provider.nameservice1\=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

type=HADOOP
host=nameservice1
username=clover/clover@EXAMPLE.COM
hostMapred=Not needed for YARN

The _HOST string in yarn/_HOST@EXAMPLE.COM and hive/_HOST@EXAMPLE.COM is a placeholder that is automatically replaced with the actual hostname of the service. Using the placeholder is the recommended approach because it also works with a high-availability Hadoop cluster setup.
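
To make the substitution concrete, here is a minimal sketch of how such a placeholder can be resolved. It approximates the behavior of Hadoop's own principal handling but does not call it. The class HostPlaceholder and the hostname rm1.example.com are hypothetical, chosen only for illustration.

```java
import java.util.Locale;

// Hypothetical helper for illustration; Hadoop performs this substitution internally.
final class HostPlaceholder {

    private static final String HOST_PATTERN = "_HOST";

    /** Replaces the _HOST instance part of primary/instance@REALM with the lowercased hostname. */
    static String resolve(String principal, String hostname) {
        String[] parts = principal.split("[/@]");
        // Only three-part principals whose middle part is exactly _HOST are rewritten.
        if (parts.length != 3 || !HOST_PATTERN.equals(parts[1])) {
            return principal;
        }
        return parts[0] + "/" + hostname.toLowerCase(Locale.ROOT) + "@" + parts[2];
    }

    public static void main(String[] args) {
        // prints yarn/rm1.example.com@EXAMPLE.COM
        System.out.println(resolve("yarn/_HOST@EXAMPLE.COM", "RM1.example.com"));
        // prints clover/clover@EXAMPLE.COM (no placeholder, so unchanged)
        System.out.println(resolve("clover/clover@EXAMPLE.COM", "RM1.example.com"));
    }
}
```

Because the hostname is filled in per connection, one configured principal can match whichever node is currently active, which is why the placeholder suits high-availability setups.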

If you encounter the error "No common protection layer between client and server", set the hadoop.rpc.protection parameter (authentication, integrity, or privacy) to match your Hadoop cluster configuration.

Hive Connection

1. Add ;principal=hive/_HOST@EXAMPLE.COM to the URL, e.g. jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM
2. Set User to the principal name, e.g. clover/clover@EXAMPLE.COM
3. Set cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab in Advanced JDBC properties.
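
The URL change in step 1 can be sketched as a small helper. This is an illustrative assumption, not a CloverDX API: the class HiveKerberosUrl is hypothetical. It relies on the jdbc:hive2 URL layout in which session variables such as principal come before any ?hive_conf or #hive_var part.

```java
// Hypothetical helper for illustration; not part of CloverDX or the Hive JDBC driver.
final class HiveKerberosUrl {

    /** Adds ;principal=<servicePrincipal> to the session part of a jdbc:hive2 URL unless already present. */
    static String withPrincipal(String jdbcUrl, String servicePrincipal) {
        if (jdbcUrl.contains(";principal=")) {
            return jdbcUrl;
        }
        // Session variables must precede the optional ?hive_conf and #hive_var parts.
        int cut = jdbcUrl.length();
        for (char marker : new char[] {'?', '#'}) {
            int idx = jdbcUrl.indexOf(marker);
            if (idx >= 0 && idx < cut) {
                cut = idx;
            }
        }
        return jdbcUrl.substring(0, cut) + ";principal=" + servicePrincipal + jdbcUrl.substring(cut);
    }

    public static void main(String[] args) {
        // prints jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM
        System.out.println(withPrincipal(
                "jdbc:hive2://hive.example.com:10000/default", "hive/_HOST@EXAMPLE.COM"));
    }
}
```

Note that the principal in the URL is the Hive service principal, while the User field in step 2 holds the client principal that owns the keytab.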
