
    Libraries Needed for Hadoop

    Hadoop components work only if the Hadoop libraries are accessible from CloverDX. The libraries are required by HadoopReader, HadoopWriter, ExecuteMapReduce, HDFS and Hive.

    The Hadoop libraries are necessary to establish a Hadoop connection; see Hadoop connection.

    The officially supported Hadoop distribution is Cloudera 5, version 5.6.0. Other versions close to this one might work, but compatibility is not guaranteed.
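
    CDH jar file names usually embed the distribution version (for example, hadoop-common-2.6.0-cdh5.6.0.jar), so a quick check can flag jars built for a different CDH release. The following is a minimal sketch in Python, not part of CloverDX; the helper names cdh_version and matches_supported_cdh are hypothetical, and the check relies only on the "-cdh<version>.jar" naming pattern.

```python
import re

# Officially supported CDH version, as stated in the text above.
SUPPORTED_CDH = "5.6.0"

# Matches the "-cdh<version>.jar" suffix used in CDH jar file names.
_CDH_SUFFIX = re.compile(r"-cdh(\d+(?:\.\d+)*)\.jar$")


def cdh_version(jar_name):
    """Return the CDH version embedded in a jar file name, or None if there is none."""
    match = _CDH_SUFFIX.search(jar_name)
    return match.group(1) if match else None


def matches_supported_cdh(jar_name):
    """True if the jar carries the supported CDH version or has no CDH suffix at all.

    Third-party jars such as guava-15.0.jar have no CDH suffix, so this
    hypothetical helper does not judge them.
    """
    version = cdh_version(jar_name)
    return version is None or version == SUPPORTED_CDH
```

    A jar for which matches_supported_cdh returns False is not necessarily unusable; as noted above, nearby versions might work, but compatibility is not guaranteed.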

    Cloudera 5

    The following libraries are needed to connect to Cloudera 5.

    Common libraries
    • hadoop-common-2.6.0-cdh5.6.0.jar

    • hadoop-auth-2.6.0-cdh5.6.0.jar

    • guava-15.0.jar

    • avro-1.7.6-cdh5.6.0.jar

    • htrace-core4-4.0.1-incubating.jar

    • servlet-api-3.0.jar

    HDFS
    • hadoop-hdfs-2.6.0-cdh5.6.0.jar

    • protobuf-java-2.5.0.jar

    MapReduce
    • hadoop-annotations-2.6.0-cdh5.6.0.jar

    • hadoop-mapreduce-client-app-2.6.0-cdh5.6.0.jar

    • hadoop-mapreduce-client-common-2.6.0-cdh5.6.0.jar

    • hadoop-mapreduce-client-core-2.6.0-cdh5.6.0.jar

    • hadoop-mapreduce-client-hs-2.6.0-cdh5.6.0.jar

    • hadoop-mapreduce-client-jobclient-2.6.0-cdh5.6.0.jar

    • hadoop-mapreduce-client-shuffle-2.6.0-cdh5.6.0.jar

    • jackson-core-asl-1.9.2.jar

    • jackson-mapper-asl-1.9.12.jar

    • hadoop-yarn-api-2.6.0-cdh5.6.0.jar

    • hadoop-yarn-client-2.6.0-cdh5.6.0.jar

    • hadoop-yarn-common-2.6.0-cdh5.6.0.jar

    Hive
    • hive-jdbc-1.1.0-cdh5.6.0.jar

    • hive-exec-1.1.0-cdh5.6.0.jar

    • hive-metastore-1.1.0-cdh5.6.0.jar

    • hive-service-1.1.0-cdh5.6.0.jar

    • libfb303-0.9.2.jar

    • slf4j-api-1.7.5.jar

    • slf4j-log4j12-1.7.5.jar
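
    Before configuring a connection, it can help to verify that a folder of collected jars is complete. The following is a minimal Python sketch, not a CloverDX tool: the function missing_jars and the dictionary REQUIRED_JARS are hypothetical names, and the jar names are copied from the lists above. It assumes that all required jars have been copied into a single folder.

```python
from pathlib import Path

# Jar names copied from the lists above, grouped by section.
REQUIRED_JARS = {
    "common": [
        "hadoop-common-2.6.0-cdh5.6.0.jar",
        "hadoop-auth-2.6.0-cdh5.6.0.jar",
        "guava-15.0.jar",
        "avro-1.7.6-cdh5.6.0.jar",
        "htrace-core4-4.0.1-incubating.jar",
        "servlet-api-3.0.jar",
    ],
    "hdfs": [
        "hadoop-hdfs-2.6.0-cdh5.6.0.jar",
        "protobuf-java-2.5.0.jar",
    ],
    "mapreduce": [
        "hadoop-annotations-2.6.0-cdh5.6.0.jar",
        "hadoop-mapreduce-client-app-2.6.0-cdh5.6.0.jar",
        "hadoop-mapreduce-client-common-2.6.0-cdh5.6.0.jar",
        "hadoop-mapreduce-client-core-2.6.0-cdh5.6.0.jar",
        "hadoop-mapreduce-client-hs-2.6.0-cdh5.6.0.jar",
        "hadoop-mapreduce-client-jobclient-2.6.0-cdh5.6.0.jar",
        "hadoop-mapreduce-client-shuffle-2.6.0-cdh5.6.0.jar",
        "jackson-core-asl-1.9.2.jar",
        "jackson-mapper-asl-1.9.12.jar",
        "hadoop-yarn-api-2.6.0-cdh5.6.0.jar",
        "hadoop-yarn-client-2.6.0-cdh5.6.0.jar",
        "hadoop-yarn-common-2.6.0-cdh5.6.0.jar",
    ],
    "hive": [
        "hive-jdbc-1.1.0-cdh5.6.0.jar",
        "hive-exec-1.1.0-cdh5.6.0.jar",
        "hive-metastore-1.1.0-cdh5.6.0.jar",
        "hive-service-1.1.0-cdh5.6.0.jar",
        "libfb303-0.9.2.jar",
        "slf4j-api-1.7.5.jar",
        "slf4j-log4j12-1.7.5.jar",
    ],
}


def missing_jars(lib_dir, groups=("common",)):
    """Return the required jar names from the given groups that are absent from lib_dir."""
    present = {path.name for path in Path(lib_dir).glob("*.jar")}
    return [
        jar
        for group in groups
        for jar in REQUIRED_JARS[group]
        if jar not in present
    ]
```

    For example, missing_jars("/path/to/libs", groups=("common", "hdfs")) would list the jars still to be copied for HDFS access; the folder path here is only a placeholder.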

    The libraries can be found in your CDH installation or in a package downloaded from Cloudera.

    CDH installation

    In a CDH installation, the required libraries reside in the following directories.

    • /usr/lib/hadoop

    • /usr/lib/hadoop-hdfs

    • /usr/lib/hadoop-mapreduce

    • /usr/lib/hadoop-yarn

    • Third-party libraries are located in the lib subdirectories of the directories above.

    Package downloaded from Cloudera

    The files can also be found in a package downloaded from Cloudera, in the following locations.

    • share/hadoop/common

    • share/hadoop/hdfs

    • share/hadoop/mapreduce2

    • share/hadoop/yarn

    • The lib subdirectories of the locations above.
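
    The two directory layouts described above can be searched the same way: look in each listed directory and in its lib subdirectory. The following Python sketch illustrates this under stated assumptions; search_dirs and locate_jars are hypothetical helper names, and the root argument stands for wherever the downloaded package was unpacked.

```python
from pathlib import Path

# Directory lists taken from the two sections above.
CDH_INSTALL_DIRS = [
    "/usr/lib/hadoop",
    "/usr/lib/hadoop-hdfs",
    "/usr/lib/hadoop-mapreduce",
    "/usr/lib/hadoop-yarn",
]
PACKAGE_DIRS = [
    "share/hadoop/common",
    "share/hadoop/hdfs",
    "share/hadoop/mapreduce2",
    "share/hadoop/yarn",
]


def search_dirs(base_dirs, root=None):
    """Expand each base directory into itself plus its lib subdirectory.

    root is the (assumed) folder where a downloaded package was unpacked;
    leave it as None for the absolute CDH installation paths.
    """
    dirs = []
    for base in base_dirs:
        path = Path(root, base) if root else Path(base)
        dirs.extend([path, path / "lib"])
    return dirs


def locate_jars(jar_names, dirs):
    """Map each jar name to the first matching file found in dirs, or None."""
    found = {}
    for name in jar_names:
        found[name] = next(
            (d / name for d in dirs if (d / name).is_file()), None
        )
    return found
```

    A jar mapped to None was not found in any of the listed locations and has to be obtained from elsewhere in the installation or package.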