
We are building a Spark-based application on Spark 2.3.0. Our Spark jobs interact with HBase. While creating the JAR, we get the following compile-time error: [ERROR] class file for org.apache.spark.Logging not found. The error occurs in the code that reads data from the HBase tables.

We are able to successfully write data into the HBase tables using the dependency versions below.

We are using the following configuration in pom.xml:

<properties>
    <org.apache.spark.version>2.3.0</org.apache.spark.version>
    <scala.version>2.11</scala.version>
    <hbase.version>1.0.0-cdh5.4.0</hbase.version>
</properties>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${org.apache.spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version}</artifactId>
            <version>${org.apache.spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-spark</artifactId>
            <version>1.2.0-cdh5.10.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-protocol</artifactId>
            <version>${hbase.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.htrace</groupId>
            <artifactId>htrace-core</artifactId>
            <version>3.1.0-incubating</version>
        </dependency>

We found multiple solutions on Stack Overflow, all of which suggest using Spark 1.6 instead: java.lang.NoClassDefFoundError: org/apache/spark/Logging (the org.apache.spark.Logging trait was made private in Spark 2.0, so libraries compiled against Spark 1.x can no longer find it).

This is not possible for us.

Is there any other workaround to solve this issue?

Thanks

Anuj Mehra
  • in the link you mention there is an answer: "The error is because you are using Spark 2.0 libraries with the connector from Spark 1.6 (which looks for the Spark 1.6 logging class). Use the 2.0.5 version of the connector." Does that help you investigate further? – Karthick Sep 21 '18 at 15:26
  • I am not able to find any such JAR in the Cloudera Maven repo: https://mvnrepository.com/artifact/org.apache.hbase/hbase-spark?repo=cloudera – Anuj Mehra Sep 24 '18 at 07:50

1 Answer


Replying to my own older question. Since we could not roll back to Spark 1.6 (we are on Spark 2.3), we found a workaround using HBaseContext.bulkGet.

We are doing something like the following:

// assumes imports from org.apache.hadoop.hbase.client, org.apache.hadoop.hbase.util.Bytes,
// and scala.collection.JavaConverters._; the table name is a placeholder
val respRdd = keyDf.rdd.mapPartitions { rows =>
  // create one HBase connection/table instance per partition, not per row
  val table = ConnectionFactory.createConnection(HBaseConfiguration.create())
    .getTable(TableName.valueOf("my_table"))
  // build a list with one Get per key, then fetch the whole list in a single batched call
  val gets = rows.map(row => new Get(Bytes.toBytes(row.getString(0)))).toList
  table.get(gets.asJava).iterator.map(result => Bytes.toString(result.value()))
}
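
For completeness, here is a minimal sketch of what the HBaseContext.bulkGet call itself can look like. The table name, batch size, and the row-key/result mappings are illustrative placeholders, and `spark` is assumed to be the active SparkSession:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{Get, Result}
import org.apache.hadoop.hbase.spark.HBaseContext
import org.apache.hadoop.hbase.util.Bytes

// HBaseContext wraps the SparkContext and the HBase configuration
val hbaseContext = new HBaseContext(spark.sparkContext, HBaseConfiguration.create())

val resultRdd = hbaseContext.bulkGet[String, String](
  TableName.valueOf("my_table"),                 // placeholder table name
  100,                                           // number of Gets batched per call
  keyDf.rdd.map(_.getString(0)),                 // extract the row key from each Row
  (key: String) => new Get(Bytes.toBytes(key)),  // build a Get per key
  (result: Result) => Bytes.toString(result.getRow) // convert each Result
)

bulkGet distributes the gets across executors and batches them per region server, which avoids creating a connection per record.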
Anuj Mehra