Kerberos Authentication for Hadoop
For user authentication in Hadoop, CloverDX can use the Kerberos authentication protocol.
To use Kerberos, you have to set up Java, your project, and the HDFS connection. For more information, see Kerberos requirements and setting.
Note that the following instructions apply to the Tomcat application server and Unix-like systems.
Java Setting
There are several ways of setting up Java for Kerberos. In the case of the first two options (configuration via system properties and via a configuration file), you must modify both setenv.sh in CloverDX Server and CloverDXDesigner.ini in CloverDX Designer. Additionally, add the parameters in CloverDX Designer to the → → → pane.
Configuration via system properties
Set the Java system property java.security.krb5.realm to the name of your Kerberos realm, for example: -Djava.security.krb5.realm=EXAMPLE.COM
Set the Java system property java.security.krb5.kdc to the hostname of your Kerberos key distribution center, for example: -Djava.security.krb5.kdc=kerberos.example.com
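For example, on Tomcat these two properties can be appended to the JVM options in setenv.sh, while in CloverDX Designer they go into CloverDXDesigner.ini. The following is only a sketch: it assumes the example realm and KDC above, that your setenv.sh exports CATALINA_OPTS (some installations use JAVA_OPTS instead), and that CloverDXDesigner.ini has an Eclipse-style -vmargs section.

# setenv.sh (CloverDX Server on Tomcat): append the Kerberos system properties
export CATALINA_OPTS="$CATALINA_OPTS -Djava.security.krb5.realm=EXAMPLE.COM -Djava.security.krb5.kdc=kerberos.example.com"

CloverDXDesigner.ini (each option on its own line, after -vmargs):

-vmargs
-Djava.security.krb5.realm=EXAMPLE.COM
-Djava.security.krb5.kdc=kerberos.example.com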
Configuration via config file
Set the Java system property java.security.krb5.conf to point to the location of your Kerberos configuration file, for example: -Djava.security.krb5.conf="/path/to/krb5.conf"
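If you use this option, a minimal krb5.conf might look as follows. This is only a sketch with the example realm and KDC used above; a real file will typically contain additional settings (ticket lifetimes, encryption types, domain_realm mappings) matching your Kerberos environment.

[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kerberos.example.com
        admin_server = kerberos.example.com
    }

[domain_realm]
    .example.com = EXAMPLE.COM
    example.com = EXAMPLE.COM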
Configuration via config file in Java installation directory
Put the krb5.conf file into the %JAVA_HOME%/lib/security directory, e.g. /opt/jdk1.8.0_144/jre/lib/security/krb5.conf.

Note: If you are using AES256 in Kerberos, install the JCE Unlimited Strength Jurisdiction Policy Files into your Java installation (Java 8). For more information, see the README.txt in the downloaded zip archive.
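Installing the policy files typically amounts to copying the two policy JARs from the downloaded archive over the ones shipped with the JRE. A sketch, assuming the Java path from the example above; the archive and directory names here are illustrative, so check your actual download and the bundled README.txt:

# Unpack the downloaded archive and copy the policy JARs into the JRE security directory
unzip jce_policy-8.zip
cp UnlimitedJCEPolicyJDK8/local_policy.jar UnlimitedJCEPolicyJDK8/US_export_policy.jar /opt/jdk1.8.0_144/jre/lib/security/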
Project Setting
- Copy the .keytab file into the project, e.g. conn/clover.keytab.
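Before referencing the keytab from a connection, it can be useful to verify that it contains the expected principal. A quick check with the standard Kerberos client tools (a sketch; it assumes klist and kinit are installed and uses the example principal from this chapter):

# List the principals stored in the keytab
klist -k -t conn/clover.keytab

# Optionally verify that a ticket can actually be obtained with it
kinit -kt conn/clover.keytab clover/clover@EXAMPLE.COM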
Connection Setting
Note: Kerberos authentication requires the […]
HDFS and MapReduce Connection
- Set Username to the principal name, e.g. clover/clover@EXAMPLE.COM.
- Set the following parameters in the Hadoop Parameters pane:

  cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab
  hadoop.security.authentication=Kerberos
  yarn.resourcemanager.principal=yarn/_HOST@EXAMPLE.COM
Example 32.1. Properties needed to connect to a Hadoop High Availability (HA) cluster in a Hadoop connection

mapreduce.app-submission.cross-platform\=true
yarn.application.classpath\=\:$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/*, $HADOOP_COMMON_HOME/lib/*, $HADOOP_HDFS_HOME/*, $HADOOP_HDFS_HOME/lib/*, $HADOOP_MAPRED_HOME/*, $HADOOP_MAPRED_HOME/lib/*, $HADOOP_YARN_HOME/*, $HADOOP_YARN_HOME/lib/*\:
yarn.app.mapreduce.am.resource.mb\=512
mapreduce.map.memory.mb\=512
mapreduce.reduce.memory.mb\=512
mapreduce.framework.name\=yarn
yarn.log.aggregation-enable\=true
mapreduce.jobhistory.address\=example.com\:port
yarn.resourcemanager.ha.enabled\=true
yarn.resourcemanager.ha.rm-ids\=rm1,rm2
yarn.resourcemanager.hostname.rm1\=example.com
yarn.resourcemanager.hostname.rm2\=example.com
yarn.resourcemanager.scheduler.address.rm1\=example.com\:port
yarn.resourcemanager.scheduler.address.rm2\=example.com\:port
fs.permissions.umask-mode\=000
fs.defaultFS\=hdfs\://nameservice1
fs.default.name\=hdfs\://nameservice1
fs.nameservices\=nameservice1
fs.ha.namenodes.nameservice1\=namenode1,namenode2
fs.namenode.rpc-address.nameservice1.namenode1\=example.com\:port
fs.namenode.rpc-address.nameservice1.namenode2\=example.com\:port
fs.client.failover.proxy.provider.nameservice1\=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

type=HADOOP
host=nameservice1
username=clover/clover@EXAMPLE.COM
hostMapred=Not needed for YARN
Tip: The _HOST string in yarn/_HOST@EXAMPLE.COM and hive/_HOST@EXAMPLE.COM is a placeholder that is automatically replaced with the actual hostname. This is the recommended approach and works even with a high-availability Hadoop cluster setup.

If you encounter the error No common protection layer between client and server, set the hadoop.rpc.protection parameter to match your Hadoop cluster configuration.
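For instance, if the cluster encrypts its RPC traffic, the parameter would be added in the Hadoop Parameters pane like the other properties. The value must match hadoop.rpc.protection in the cluster's core-site.xml; Hadoop accepts authentication, integrity, and privacy:

hadoop.rpc.protection=privacy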
Hive Connection
- Add ;principal=hive/_HOST@EXAMPLE.COM to the URL, e.g. jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM.
- Set User to the principal name, e.g. clover/clover@EXAMPLE.COM.
- Set cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab in Advanced JDBC properties.
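Put together, a Kerberos-enabled Hive connection might look as follows. This is only a sketch using the example principal and host names from this section; substitute the values for your cluster.

URL:  jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM
User: clover/clover@EXAMPLE.COM

Advanced JDBC properties:
cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab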