Kerberos Authentication for Hadoop
For user authentication in Hadoop, CloverDX can use the Kerberos authentication protocol.
To use Kerberos, you have to set up your Java, project and HDFS connection. For more information, see Kerberos requirements and setting.
Note that the following instructions apply to the Tomcat application server and Unix-like systems.
Java Setting
There are several ways to configure Java for Kerberos.
For the first two options (configuration via system properties and via a configuration file), you must modify both setenv.sh in CloverDX Server and CloverDXDesigner.ini in CloverDX Designer.
Additionally, add the parameters in CloverDX Designer to → → → pane.
Configuration via system properties
Set the Java system property java.security.krb5.realm to the name of your Kerberos realm, for example: -Djava.security.krb5.realm=EXAMPLE.COM
Set the Java system property java.security.krb5.kdc to the hostname of your Kerberos key distribution center, for example: -Djava.security.krb5.kdc=kerberos.example.com
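On the server side, the two properties might be appended to the JVM options in setenv.sh roughly as sketched below. This assumes that Tomcat passes CATALINA_OPTS to the JVM, which is standard Tomcat behaviour; the variable names KRB5_REALM and KRB5_KDC are illustrative only, and the realm and KDC hostname are the placeholder values from the examples above.

```shell
# setenv.sh (sketch): append the Kerberos system properties to the JVM options.
# EXAMPLE.COM and kerberos.example.com are placeholders; use your own realm and KDC.
KRB5_REALM="EXAMPLE.COM"
KRB5_KDC="kerberos.example.com"

CATALINA_OPTS="${CATALINA_OPTS:-} -Djava.security.krb5.realm=${KRB5_REALM}"
CATALINA_OPTS="${CATALINA_OPTS} -Djava.security.krb5.kdc=${KRB5_KDC}"
export CATALINA_OPTS
```

In CloverDXDesigner.ini, which is assumed here to follow the Eclipse launcher format, the same -D options would typically go on separate lines after the -vmargs line.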
Configuration via config file
Set the Java system property java.security.krb5.conf to point to the location of your Kerberos configuration file, for example: -Djava.security.krb5.conf="/path/to/krb5.conf"
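A possible setenv.sh fragment for this option is sketched below. It assumes, as above, that Tomcat reads CATALINA_OPTS; the readability check is an optional safeguard added for illustration and is not something CloverDX requires.

```shell
# setenv.sh (sketch): point the JVM at a Kerberos configuration file.
# /path/to/krb5.conf is the placeholder path from the example above.
KRB5_CONF="/path/to/krb5.conf"

if [ -r "$KRB5_CONF" ]; then
    CATALINA_OPTS="${CATALINA_OPTS:-} -Djava.security.krb5.conf=${KRB5_CONF}"
    export CATALINA_OPTS
else
    # Warn instead of silently starting with a missing configuration file.
    echo "Warning: Kerberos configuration file not readable: $KRB5_CONF" >&2
fi
```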
Configuration via config file in Java installation directory
Put the krb5.conf file into the %JAVA_HOME%/lib/security directory, e.g. /opt/jdk1.8.0_144/jre/lib/security/krb5.conf.
Note: If you are using AES256 in Kerberos, install the JCE unlimited strength policy files for Java 8 into your Java installation. For more information, see the README.txt in the downloaded zip archive.
Project Setting
- Copy the .keytab file into the project, e.g. conn/clover.keytab.
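The copy step could be scripted as in the following sketch. The helper name install_keytab and both paths are hypothetical; restricting the file to its owner is a common precaution for keytabs rather than a documented CloverDX requirement.

```shell
# Sketch: copy a keytab into the project's conn directory and restrict access to it.
# install_keytab is a hypothetical helper; the paths are placeholders.
install_keytab() {
    src="$1"          # keytab exported by your Kerberos administrator
    project_dir="$2"  # root directory of the CloverDX project
    mkdir -p "$project_dir/conn" || return 1
    cp "$src" "$project_dir/conn/clover.keytab" || return 1
    # A keytab holds long-term keys, so make it readable by the owner only.
    chmod 600 "$project_dir/conn/clover.keytab"
}

# Optional check, if the MIT Kerberos client tools are installed:
#   klist -k -t conn/clover.keytab                            # list the principals in the keytab
#   kinit -k -t conn/clover.keytab clover/clover@EXAMPLE.COM  # try to obtain a ticket
```

For example, `install_keytab /tmp/clover.keytab /path/to/project` would place the file at /path/to/project/conn/clover.keytab.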
Connection Setting
Note: Kerberos authentication requires the …
HDFS and MapReduce Connection
- Set Username to the principal name, e.g. clover/clover@EXAMPLE.COM.
- Set the following parameters in the Hadoop Parameters pane:
cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab
hadoop.security.authentication=Kerberos
yarn.resourcemanager.principal=yarn/_HOST@EXAMPLE.COM

Example 32.1. Properties needed to connect to a Hadoop High Availability (HA) cluster in Hadoop connection
mapreduce.app-submission.cross-platform\=true
yarn.application.classpath\=\:$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/*, $HADOOP_COMMON_HOME/lib/*, $HADOOP_HDFS_HOME/*, $HADOOP_HDFS_HOME/lib/*, $HADOOP_MAPRED_HOME/*, $HADOOP_MAPRED_HOME/lib/*, $HADOOP_YARN_HOME/*, $HADOOP_YARN_HOME/lib/*\:
yarn.app.mapreduce.am.resource.mb\=512
mapreduce.map.memory.mb\=512
mapreduce.reduce.memory.mb\=512
mapreduce.framework.name\=yarn
yarn.log.aggregation-enable\=true
mapreduce.jobhistory.address\=example.com\:port
yarn.resourcemanager.ha.enabled\=true
yarn.resourcemanager.ha.rm-ids\=rm1,rm2
yarn.resourcemanager.hostname.rm1\=example.com
yarn.resourcemanager.hostname.rm2\=example.com
yarn.resourcemanager.scheduler.address.rm1\=example.com\:port
yarn.resourcemanager.scheduler.address.rm2\=example.com\:port
fs.permissions.umask-mode\=000
fs.defaultFS\=hdfs\://nameservice1
fs.default.name\=hdfs\://nameservice1
dfs.nameservices\=nameservice1
dfs.ha.namenodes.nameservice1\=namenode1,namenode2
dfs.namenode.rpc-address.nameservice1.namenode1\=example.com\:port
dfs.namenode.rpc-address.nameservice1.namenode2\=example.com\:port
dfs.client.failover.proxy.provider.nameservice1\=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
type=HADOOP
host=nameservice1
username=clover/clover@EXAMPLE.COM
hostMapred=Not needed for YARN
Tip: The _HOST string in yarn/_HOST@EXAMPLE.COM and hive/_HOST@EXAMPLE.COM is a placeholder that is automatically replaced with the actual hostname. This is the recommended approach, as it also works with a high-availability Hadoop cluster setup.
If you encounter the error "No common protection layer between client and server", set the hadoop.rpc.protection parameter (authentication, integrity or privacy) to match your Hadoop cluster configuration.
Hive Connection
- Add ;principal=hive/_HOST@EXAMPLE.COM to the URL, e.g. jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM
- Set User to the principal name, e.g. clover/clover@EXAMPLE.COM
- Set cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab in Advanced JDBC properties.