Kerberos Authentication for Hadoop
For user authentication in Hadoop, CloverDX can use the Kerberos authentication protocol.
To use Kerberos, you have to set up your Java installation, your project, and the HDFS connection. For more information, see Kerberos requirements and setting.
Note that the following instructions apply to the Tomcat application server and Unix-like systems.
There are several ways of configuring Java for Kerberos.
With the first two options (configuration via system properties and configuration via a configuration file), you must modify both setenv.sh in CloverDX Server and CloverDXDesigner.ini in CloverDX Designer. Additionally, in CloverDX Designer, add the parameters to the relevant preferences pane.
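For illustration, a minimal sketch of the system-property variant in Tomcat's setenv.sh (CATALINA_OPTS is standard Tomcat; the realm and KDC values are the examples used in the list below):

export CATALINA_OPTS="$CATALINA_OPTS -Djava.security.krb5.realm=EXAMPLE.COM -Djava.security.krb5.kdc=kerberos.example.com"

In CloverDXDesigner.ini, the same -D options would typically go on separate lines after -vmargs, as is usual for Eclipse-based .ini files.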
- Configuration via system properties

  Set the Java system property java.security.krb5.realm to the name of your Kerberos realm, for example:

  -Djava.security.krb5.realm=EXAMPLE.COM

  Set the Java system property java.security.krb5.kdc to the hostname of your Kerberos key distribution center, for example:

  -Djava.security.krb5.kdc=kerberos.example.com
- Configuration via config file

  Set the Java system property java.security.krb5.conf to point to the location of your Kerberos configuration file (a sketch of a minimal krb5.conf follows this list), for example:

  -Djava.security.krb5.conf="/path/to/krb5.conf"
- Configuration via config file in the Java installation directory

  Put the krb5.conf file into the %JAVA_HOME%/lib/security directory, e.g. /opt/jdk-11.0.14+9/lib/security/krb5.conf.
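For reference, a minimal krb5.conf matching the example realm and KDC above might look like this (an illustrative sketch, not a complete configuration):

[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kerberos.example.com
    }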
Project Setting

- Copy the .keytab file into the project, e.g. conn/clover.keytab.
  Kerberos authentication requires the .keytab file.
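Before wiring the keytab into a connection, you can sanity-check it with the standard MIT Kerberos client tools (the principal and path follow the examples in this section):

kinit -kt conn/clover.keytab clover/clover@EXAMPLE.COM
klist

If kinit succeeds and klist shows a ticket for the principal, the keytab and the Java Kerberos settings are consistent.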
HDFS and MapReduce Connection

- Set Username to the principal name, e.g. clover/clover@EXAMPLE.COM.

- Set the following parameters in the Hadoop Parameters pane:

  cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab
  hadoop.security.authentication=Kerberos
  yarn.resourcemanager.principal=yarn/_HOST@EXAMPLE.COM
Example 10. Properties needed to connect to a Hadoop High Availability (HA) cluster in Hadoop connection

mapreduce.app-submission.cross-platform\=true
yarn.application.classpath\=\:$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/*, $HADOOP_COMMON_HOME/lib/*, $HADOOP_HDFS_HOME/*, $HADOOP_HDFS_HOME/lib/*, $HADOOP_MAPRED_HOME/*, $HADOOP_MAPRED_HOME/lib/*, $HADOOP_YARN_HOME/*, $HADOOP_YARN_HOME/lib/*\:
yarn.app.mapreduce.am.resource.mb\=512
mapreduce.map.memory.mb\=512
mapreduce.reduce.memory.mb\=512
mapreduce.framework.name\=yarn
yarn.log.aggregation-enable\=true
mapreduce.jobhistory.address\=example.com\:port
yarn.resourcemanager.ha.enabled\=true
yarn.resourcemanager.ha.rm-ids\=rm1,rm2
yarn.resourcemanager.hostname.rm1\=example.com
yarn.resourcemanager.hostname.rm2\=example.com
yarn.resourcemanager.scheduler.address.rm1\=example.com\:port
yarn.resourcemanager.scheduler.address.rm2\=example.com\:port
fs.permissions.umask-mode\=000
fs.defaultFS\=hdfs\://nameservice1
fs.default.name\=hdfs\://nameservice1
fs.nameservices\=nameservice1
fs.ha.namenodes.nameservice1\=namenode1,namenode2
fs.namenode.rpc-address.nameservice1.namenode1\=example.com\:port
fs.namenode.rpc-address.nameservice1.namenode2\=example.com\:port
fs.client.failover.proxy.provider.nameservice1\=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
type=HADOOP
host=nameservice1
username=clover/clover@EXAMPLE.COM
hostMapred=Not needed for YARN
The _HOST string in yarn/_HOST@EXAMPLE.COM and hive/_HOST@EXAMPLE.COM is a placeholder that is automatically replaced with the actual hostname. This is the recommended form, as it works even with a high-availability Hadoop cluster setup.
- If you encounter the error No common protection layer between client and server, set the hadoop.rpc.protection parameter to match your Hadoop cluster configuration.
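For example, if your cluster's core-site.xml configures RPC protection for full encryption, the matching entry in the Hadoop Parameters pane would be the following (valid values are authentication, integrity, and privacy; use whichever your cluster is configured with):

hadoop.rpc.protection=privacy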
Hive Connection
- Add ;principal=hive/_HOST@EXAMPLE.COM to the URL, e.g.:

  jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM

- Set User to the principal name, e.g. clover/clover@EXAMPLE.COM.

- Set cloveretl.hadoop.kerberos.keytab=${CONN_DIR}/clover.keytab in Advanced JDBC properties.
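Outside CloverDX, the same URL can be smoke-tested with Hive's beeline client, assuming you already hold a Kerberos ticket (e.g. from kinit, since beeline reads the ticket cache rather than the keytab property):

beeline -u "jdbc:hive2://hive.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM"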