Unable to configure hive.exec hooks due to missing jar - hadoop

I am trying to use Hive and switch databases with the 'use db' command. My setup is Hadoop 2.4.0 and Hive 0.13.1. I add the following three properties to a .settings file:
set hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;
set hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;
set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;
I then open hive command line, passing in the .settings file via 'hive -i my.settings' and then I get:
hive> use db;
hive.exec.pre.hooks Class not found:org.apache.hadoop.hive.ql.hooks.ATSHook
FAILED: Hive Internal Error: java.lang.ClassNotFoundException(org.apache.hadoop.hive.ql.hooks.ATSHook)
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
It seems there is a jar missing from my classpath. I tried searching the web for a jar containing the "org.apache.hadoop.hive.ql.hooks.ATSHook" class, but have had no luck. I tried adding every directory under HIVE_HOME that contains jars to yarn-site.xml via:
<property>
<name>yarn.application.classpath</name>
<value>
...
/apps/hive/hive-0.13.1/hcatalog/share/hcatalog/*,
/apps/hive/hive-0.13.1/hcatalog/share/hcatalog/storage-handlers/hbase/lib/*,
/apps/hive/hive-0.13.1/hcatalog/share/webhcat/java-client/*,
/apps/hive/hive-0.13.1/hcatalog/share/webhcat/svr/lib/*,
/apps/hive/hive-0.13.1/lib/*
</value>
</property>
Still no luck. Does anyone know whether there is some additional step I need to take to configure these properties?

Apparently the jar is only available in the as-yet-unreleased Hive 0.14.0, so I had to download and build Hive according to the directions on the Hive Wiki, which is simply:
mvn clean install -DskipTests -Phadoop-2
Once that was built I was able to do this:
hive> add jar <HIVE_HOME>/ql/target
> ;
Or, alternatively, by adding this property to hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>file:///<HIVE_HOME>/ql/target/hive-exec-0.14.0-SNAPSHOT.jar</value>
</property>
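Tying this back to the original .settings file, here is a minimal sketch of what it could look like once the jar is built (assuming the snapshot jar path from the build above, and that ADD JAR runs before the hook properties take effect):
add jar <HIVE_HOME>/ql/target/hive-exec-0.14.0-SNAPSHOT.jar;
set hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;
set hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;
set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook;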
I also found a nice SlideShare presentation about plugins.

Related

Alluxio Error: java.lang.IllegalArgumentException: Wrong FS

I am able to run wordcount on Alluxio with an example jar provided by Cloudera, using:
sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar wordcount -libjars /home/nn1/alluxio-1.2.0/core/client/target/alluxio-core-client-1.2.0-jar-with-dependencies.jar alluxio://nn1:19998/wordcount alluxio://nn1:19998/wc1
and it's a success.
But I can't run it when I use the jar created from my attached code, which is also a sample wordcount example:
sudo -u hdfs hadoop jar /home/nn1/HadoopWordCount-0.0.1-SNAPSHOT-jar-with-dependencies.jar edu.am.bigdata.C45TreeModel.C45DecisionDriver -libjars /home/nn1/alluxio-1.2.0/core/client/target/alluxio-core-client-1.2.0-jar-with-dependencies.jar alluxio://10.30.60.45:19998/abdf alluxio://10.30.60.45:19998/outabdf
The above code is built using Maven. The pom.xml file contains:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>2.6.0-mr1-cdh5.4.5</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0-cdh5.4.5</version>
</dependency>
Could you please help me run my wordcount program on the Alluxio cluster? I hope no extra configuration needs to be added to the pom file to run it.
I am getting the following error after running my jar:
java.lang.IllegalArgumentException: Wrong FS:
alluxio://10.30.60.45:19998/outabdf, expected: hdfs://10.30.60.45:8020
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:657)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1215)
at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1211)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1211)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1412)
at edu.WordCount.run(WordCount.java:47)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at edu.WordCount.main(WordCount.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
The problem comes from the call to
FileSystem fs = FileSystem.get(conf);
on line 101. The FileSystem created by FileSystem.get(conf) will only support paths with the scheme defined by Hadoop's fs.defaultFS property. To fix the error, change that line to
FileSystem fs = FileSystem.get(URI.create("alluxio://nn1:19998/"), conf);
By passing a URI, you override fs.defaultFS, enabling the created FileSystem to support paths using the alluxio:// scheme.
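For illustration, here is a minimal sketch of that pattern (the master hostname and output path are simply the ones from this question):
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
// Bind the FileSystem to the Alluxio master instead of whatever fs.defaultFS points at
FileSystem fs = FileSystem.get(URI.create("alluxio://nn1:19998/"), conf);
// alluxio:// paths now resolve against the Alluxio FileSystem rather than HDFS
boolean outputExists = fs.exists(new Path("alluxio://nn1:19998/outabdf"));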
You could also fix the error by modifying fs.defaultFS in your core-site.xml
<property>
<name>fs.defaultFS</name>
<value>alluxio://nn1:19998/</value>
</property>
However, this could impact other systems that share the core-site.xml file, so I recommend the first approach of passing an alluxio:// URI to FileSystem.get().

GeoMesa Configuration Error

I am using Hadoop 2.7 with GeoServer 2.8.0, but while I am trying to configure GeoMesa 1.2.0, I am getting this error message:
$ geomesa
Using GEOMESA_HOME = /usr/local/geomesa/dist/tools/geomesa-tools-1.2.0
Warning: you have not set ACCUMULO_HOME and/or HADOOP_HOME as environment variables.
GeoMesa tools will not run without the appropriate Accumulo and Hadoop jars in the tools classpath.
Please ensure that those jars are present in the classpath by running 'geomesa classpath' .
To take corrective action, please place the necessary jar files in the lib directory of geomesa-tools.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/accumulo/core/client/TableNotFoundException
at org.locationtech.geomesa.tools.commands.TableConfCommand.<init>(TableConfCommand.scala:32)
at org.locationtech.geomesa.tools.Runner$.createCommand(Runner.scala:50)
at org.locationtech.geomesa.tools.Runner$.main(Runner.scala:21)
at org.locationtech.geomesa.tools.Runner.main(Runner.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.accumulo.core.client.TableNotFoundException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 4 more
How can I fix this?
The GeoMesa tools need Hadoop and Accumulo jars in order to connect to Accumulo.
One quick option is to run the GeoMesa tools from a tablet server or another machine already configured to be part of the Hadoop cluster. If you are using another machine, you can mirror the $HADOOP_HOME and $ACCUMULO_HOME directories from a cluster node locally.
As another alternative, you can use the install-hadoop-accumulo.sh script in the geomesa-tools/bin directory to download a set of Hadoop and Accumulo jars.
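For example, a rough shell sketch of the environment-variable route (the install paths below are assumptions; point them at your actual Hadoop and Accumulo installations):
export HADOOP_HOME=/usr/local/hadoop      # assumed Hadoop install path
export ACCUMULO_HOME=/usr/local/accumulo  # assumed Accumulo install path
geomesa classpath                         # confirm the Hadoop and Accumulo jars now appear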
Verify that the corresponding jar file is present in the classpath; you can check this with the command geomesa classpath.
If the jar is not present, copy it into the GeoMesa tools lib directory. In my case it is at the following path:
/*/geomesa-1.2.4/dist/tools/geomesa-tools-1.2.4/lib/common/
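As a rough illustration of that copy (assuming $ACCUMULO_HOME/lib holds the Accumulo jars and $GEOMESA_HOME matches the tools directory shown in the output above; the exact lib layout can differ between GeoMesa versions):
cp $ACCUMULO_HOME/lib/accumulo-core-*.jar $GEOMESA_HOME/lib/common/  # supplies the missing TableNotFoundException class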

Using Oozie to create a hive table on hbase causes an error with libthrift?

I'm using an Oozie hive action on Cloudera (CDH 4) to create an HBase-backed Hive table. Running the create table command on my local dev util box executes without error. When I execute the same command via an Oozie hive action on the cluster, I get this error:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], main() threw exception, org.apache.thrift.EncodingUtils.setBit(BIZ)B
java.lang.NoSuchMethodError: org.apache.thrift.EncodingUtils.setBit(BIZ)B
at org.apache.hadoop.hive.ql.plan.api.Query.setStartedIsSet(Query.java:487)
at org.apache.hadoop.hive.ql.plan.api.Query.setStarted(Query.java:474)
at org.apache.hadoop.hive.ql.QueryPlan.updateCountersInQueryPlan(QueryPlan.java:309)
at org.apache.hadoop.hive.ql.QueryPlan.getQueryPlan(QueryPlan.java:450)
at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:622)
at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:504)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1106)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:713)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:302)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:260)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Googling around, most answers said that this was due to different versions of thrift on hive, hbase, or hadoop; but as far as I can tell (using find -name in a shell action) they all have version 0.9.0:
Stdoutput ./lib/flume-ng/lib/libthrift-0.9.0.jar
Stdoutput ./lib/hcatalog/share/webhcat/svr/lib/libthrift-0.9.0.jar
Stdoutput ./lib/whirr/lib/libthrift-0.9.0.jar
Stdoutput ./lib/whirr/lib/libthrift-0.5.0.jar
Stdoutput ./lib/hive/lib/libthrift-0.9.0-cdh4-1.jar
Stdoutput ./lib/oozie/libserver/libthrift-0.9.0.jar
Stdoutput ./lib/oozie/libtools/libthrift-0.9.0.jar
Stdoutput ./lib/hbase/lib/libthrift-0.9.0.jar
Stdoutput ./lib/mahout/lib/libthrift-0.9.0.jar
These same versions are on my dev util box, and the hive command works fine. Any ideas what could be causing this issue?
Thanks in advance!
The issue was with a jar included in the workflow's lib directory. That jar pulled in transitive dependencies on an older version of Thrift.
I was able to circumvent this by moving the hive action into a sub-workflow, then setting
<global>
<configuration>
<property>
<name>oozie.use.system.libpath</name>
<value>false</value>
</property>
<property>
<name>oozie.libpath</name>
<value>${wf:appPath()}/lib</value>
</property>
</configuration>
</global>
on the workflow. This essentially told it to use the lib in my subworkflow's directory, not the main workflow's lib (which included the bad jar).
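For context, a rough sketch of how the parent workflow might invoke the hive action through a sub-workflow action (the action name, app path subdirectory, and transition targets are placeholders):
<action name="hive-via-subwf">
<sub-workflow>
<app-path>${wf:appPath()}/hive-subwf</app-path>
</sub-workflow>
<ok to="end"/>
<error to="fail"/>
</action>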

Apache Oozie throws ClassNotFoundException while creating MySQL DB

I am trying to use a MySQL DB with Apache Oozie.
My $OOZIE_HOME is
-bash: /opt/oozie_install/oozie-3.3.0-cdh4.2.2: Is a directory
But I copied mysql-connector-java-5.1.29-bin.jar to almost every possible place. For example, I copied it into:
/opt/oozie_install/oozie-3.3.0-cdh4.2.2
/opt/oozie_install/oozie-3.3.0-cdh4.2.2/libs
/opt/oozie_install/oozie-3.3.0-cdh4.2.2/libtools
/usr/lib/jvm/jdk/libs
/user/home/hadoop/
But I am still getting ClassnotFoundException.
java.lang.Exception: Could not connect to the database: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at org.apache.oozie.tools.OozieDBCLI.validateConnection(OozieDBCLI.java:473)
at org.apache.oozie.tools.OozieDBCLI.createDB(OozieDBCLI.java:179)
at org.apache.oozie.tools.OozieDBCLI.run(OozieDBCLI.java:118)
at org.apache.oozie.tools.OozieDBCLI.main(OozieDBCLI.java:64)
Caused by: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.oozie.tools.OozieDBCLI.createConnection(OozieDBCLI.java:462)
at org.apache.oozie.tools.OozieDBCLI.validateConnection(OozieDBCLI.java:469)
Where exactly am I supposed to copy the MySQL connector?
I have verified my oozie-site.xml and followed the usual steps to set up MySQL with Oozie.
You have to copy mysql-connector-java-5.1.29-bin.jar to the /opt/oozie_install/oozie-3.3.0-cdh4.2.2/libext directory and then restart the Oozie instance. Make sure that the MySQL user oozie has sufficient privileges on the database oozie; if not, grant sufficient permissions using the GRANT command on the MySQL server.
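As a rough sketch of those two steps (run the copy from wherever the connector jar was downloaded; the password is a placeholder):
cp mysql-connector-java-5.1.29-bin.jar /opt/oozie_install/oozie-3.3.0-cdh4.2.2/libext/
# then restart the Oozie server so the new jar is picked up

mysql> GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'localhost' IDENTIFIED BY 'oozie_password';
mysql> FLUSH PRIVILEGES;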
I met the same problem. I finally solved it by editing oozie-env.sh and appending JAVA_HOME at the end: export JAVA_HOME=/usr/local/jdk1.7 (where the path is your own Java installation).
I ran into this issue when I was converting my local Derby instance to MySQL. The difference between my issue and the others is that I did not install an RPM. My Oozie instance was pre-compiled in a tar.gz file. I had to copy the mysql-connector-java-bin.jar to the oozie-server/lib directory. This was in addition to copying it to the lib, libext, and libtools directories. I am not sure if all of those are needed, but I do know that oozie-server/lib is needed for Oozie to start. Hope this helps someone!

Problems running Manning's Hadoop in Practice 4.1 MapReduce code on Hadoop 1.0.3

I am attempting to run the 4.1 example code from Manning's "Hadoop in Practice" at http://www.manning.com/lam/
I am running Ubuntu 10.4 with Hadoop 1.0.3 and Java 6.
Following the examples from http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/, I used the wordcount example to verify the installation.
I then tried running the 4.1 example using:
hduser#ubuntu:/usr/local/hadoop$ bin/hadoop jar MyJob.jar MyJob /user/hduser/4.1/input /user/hduser/4.1output
I get the error:
Exception in thread "main" java.lang.ClassNotFoundException: MyJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
The public run method in the working example and the one in Manning's code appear to be different.
I appreciate your assistance!
Give the complete path of the jar. For example, if MyJob.jar is present inside your home directory, then: hduser#ubuntu:/usr/local/hadoop$ bin/hadoop jar /home/hduser/MyJob.jar MyJob /user/hduser/4.1/input /user/hduser/4.1output
I had the same problem with Hadoop 1.0.3.16 and Java 6, but I managed to get the Manning example 4.1 working by adding job.setJar("/path/to/MyJob.jar"); after job.setJobName("MyJob");. I thought of making this change because I was getting a warning: WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String). Do you get the same warning, Tariq?
I also tried adding job.setJarByClass(MyJob.class); instead but this did not work.
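For reference, a minimal sketch of where that call sits in an old-API driver like the book's (the jar path is a placeholder):
JobConf job = new JobConf(conf, MyJob.class);
job.setJobName("MyJob");
// Point Hadoop at the job jar explicitly so the MyJob class can be found on the cluster
job.setJar("/path/to/MyJob.jar");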
Cheers, Alex
