Hadoop JAR command - Setting java.library.path

I'm trying to run a Java program on a Hadoop cluster. Here's the command:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/lib/*:/home/rgupta/bdAnalytics/lib/*
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub > $logsFldr/subsHdpOMQ_$1.log 2>&1 &
#java -Djava.library.path=/usr/local/lib -classpath class/:lib/:lib/jzmq-2.1.3.jar bigdat.twitter.queue.TweetOMQSub > log/subsFilterOMQ_$1.log 2>&1 &
This throws the following error:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jzmq in java.library.path
If I use the plain java command above (the commented-out line), it works fine. Also, the Hadoop node where I'm trying to test does have the necessary jzmq libraries under the /usr/local/lib directory. Is there a way I can set java.library.path for the hadoop jar command?
Please suggest how I can fix this.

Sorry, I misread your question, so editing:
You should be able to use the -libjars option.
In your case:
HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/lib/:/home/rgupta/bdAnalytics/lib/
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub -libjars /usr/local/lib ...
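Note that -libjars expects a comma-separated list of jar files rather than a directory, and it is only honored when the main class parses generic options (e.g. via ToolRunner). A sketch with the jar named in the question (the exact path is an assumption):
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub -libjars /home/rgupta/bdAnalytics/lib/jzmq-2.1.3.jar
Also, -libjars only ships jars; it does not put native .so files on java.library.path, which is what the UnsatisfiedLinkError is complaining about.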

Try export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib" (the quotes matter, since the value contains a space)
and export the other jars the usual way, as you are already doing before running a job, using HADOOP_CLASSPATH.
Hope this helps.
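Putting both together, a minimal sketch with the paths from the question:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib"   # native .so files
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/rgupta/bdAnalytics/lib/*   # application jars
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub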

Related

How can I fix ClassNotFoundException when executing an HBase Java application from the command line?

I don't know anything about bash, but I put together a script to help me run my HBase Java application:
#!/bin/bash
HADOOP_CLASSPATH="$(hbase classpath)"
hadoop jar my.jar my_pkg.my_class
When I run it, I get:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy
When I echo out the HADOOP_CLASSPATH I see that hbase-server-1.2.0-cdh5.8.0.jar is there...
Is the hadoop jar command ignoring the HADOOP_CLASSPATH?
I have also tried running the commands from the command line instead of using my script; I get the same error.
The approach was inspired by this Cloudera question.
The solution was to include the Hadoop classpath on the same line. I am not certain what the difference is, but this works:
HADOOP_CLASSPATH="$(hbase classpath)" hadoop jar my.jar my_pkg.my_class
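The difference is shell semantics: a plain HADOOP_CLASSPATH=... line on its own only sets a shell-local variable, which the hadoop child process does not inherit, while prefixing the command (as above) puts the variable into that command's environment. Exporting it should work just as well; a minimal sketch of the same script:
#!/bin/bash
# export makes the variable visible to the hadoop child process
export HADOOP_CLASSPATH="$(hbase classpath)"
hadoop jar my.jar my_pkg.my_class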

Not able to start the Master process of Spark

I have set up a three-node Hadoop cluster and am trying to run Spark using Hadoop's YARN and HDFS.
I set the various environment variables like HADOOP_HOME, HADOOP_CONF_DIR, SPARK_HOME, etc.
Now, when I try to run the master process of Spark using start-master.sh, it gives me some exceptions; the main contents of the exception file are below:
Spark Command: /usr/local/java/bin/java -cp /usr/local/spark/conf/:/usr/local/spark/jars/*:/usr/local/hadoop/etc/hadoop/ -Xmx1g org.apache.spark.deploy.master.Master --host master.hadoop.cluster --port 7077 --webui-port 8080
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
As it reports a missing class, I am not able to understand how to provide this class or which jar it should pick the class file from. Does this jar come bundled with the Spark download?
Can anyone please help me fix this issue?
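One quick check (a sketch, assuming the /usr/local/spark layout shown in the command above): the slf4j classes normally ship in the distribution's jars directory, so listing it should show an slf4j jar:
ls /usr/local/spark/jars/ | grep -i slf4j
If nothing shows up, the download is likely a "Hadoop free" build, which expects Hadoop's classpath (including slf4j) to be supplied via SPARK_DIST_CLASSPATH.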

Spark Submit Issue

I am trying to run a fat jar on a Spark cluster using Spark submit.
I made the cluster using the "spark-ec2" executable in the Spark bundle on AWS.
The command I am using to run the jar file is:
bin/spark-submit --class edu.gatech.cse8803.main.Main --master yarn-cluster ../src1/big-data-hw2-assembly-1.0.jar
In the beginning it gave me the error that at least one of the HADOOP_CONF_DIR or YARN_CONF_DIR environment variables must be set.
I didn't know what to set them to, so I used the following command:
export HADOOP_CONF_DIR=/mapreduce/conf
Now the error has changed to:
Could not load YARN classes. This copy of Spark may not have been compiled with YARN support.
Run with --help for usage help or --verbose for debug output
The home directory structure is as follows:
ephemeral-hdfs hadoop-native mapreduce persistent-hdfs scala spark spark-ec2 src1 tachyon
I even set the YARN_CONF_DIR variable to the same value as HADOOP_CONF_DIR, but the error message does not change. I am unable to find any documentation that highlights this issue; most docs just mention these two variables without further details.
You need to compile Spark against YARN to use it.
Follow the steps explained here: https://spark.apache.org/docs/latest/building-spark.html
Maven:
build/mvn -Pyarn -Phadoop-2.x -Dhadoop.version=2.x.x -DskipTests clean package
SBT:
build/sbt -Pyarn -Phadoop-2.x assembly
You can also download a pre-compiled version here: http://spark.apache.org/downloads.html (choose a "Pre-built for Hadoop" package).
Download a prebuilt Spark that supports Hadoop 2.x versions from https://spark.apache.org/downloads.html.
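To check whether a given Spark build was compiled with YARN support (a sketch, assuming a Spark 1.x layout with an assembly jar under spark/lib/):
jar tf spark/lib/spark-assembly-*.jar | grep -m1 'deploy/yarn'
# no output means the build lacks YARN support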
The --master argument should be: --master spark://hostname:7077, where hostname is the name of your Spark master server. You can also specify this value as spark.master in the spark-defaults.conf file and leave out the --master argument when using Spark submit from the command line. Including the --master argument will override the value set (if one exists) in the spark-defaults.conf file.
Reference: http://spark.apache.org/docs/1.3.0/configuration.html
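For example, the equivalent spark-defaults.conf entry (hostname is a placeholder for your master host):
# conf/spark-defaults.conf
spark.master   spark://hostname:7077
With that in place, the submit command from the question can drop the --master flag:
bin/spark-submit --class edu.gatech.cse8803.main.Main ../src1/big-data-hw2-assembly-1.0.jar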

Hadoop and JZMQ - no jzmq in java.library.path

I'm trying to get the JZMQ code working on ONE of the nodes of a Hadoop cluster. I have the necessary native jzmq library files installed under the /usr/local/lib directory on that node.
Here's the list:
libjzmq.a libjzmq.la libjzmq.so libjzmq.so.0 libjzmq.so.0.0.0 libzmq.a libzmq.la libzmq.so libzmq.so.3 libzmq.so.3.0.0 pkgconfig
If I run the Java command below from my shell script, it works absolutely fine:
java -Djava.library.path=/usr/local/lib -classpath class/:lib/:lib/jzmq-2.1.3.jar bigdat.twitter.queue.TweetOMQSub
But when I run the following command:
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub
it throws:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jzmq in java.library.path
I explicitly set the necessary files/jars in the Hadoop classpath, opts, etc., using export commands:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib/"
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/txtUser/analytics/lib/*
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native/:/usr/local/lib/
The source code JAR and jzmq-2.1.3.jar files are present under /home/txtUser/analytics/lib/ folder on Hadoop node.
Also, /usr/local/lib is added to the system ld.conf.
Can anyone give inputs on what I may be doing wrong here?
Add the following line to /etc/profile or .bashrc:
export JAVA_LIBRARY_PATH=/usr/local/lib
Reload /etc/profile or .bashrc.
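A minimal sketch of the two steps (assuming .bashrc; Hadoop's launcher scripts fold JAVA_LIBRARY_PATH into the -Djava.library.path they pass to the JVM):
echo 'export JAVA_LIBRARY_PATH=/usr/local/lib' >> ~/.bashrc
source ~/.bashrc   # reload so the current shell picks it up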

Apache Pig Mapreduce Mode

I'm getting the following exception while trying to run Pig in MapReduce mode:
ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath). If you plan to use local mode, please put -x local option in command line
I have the following Hadoop variable set:
export HADOOP_INSTALL=/media/Software/hadoop-1.0.4
Thanks
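The error itself points at the fix: Pig needs Hadoop's configuration files (core-site.xml etc.) on its classpath. A sketch of one common approach (PIG_CLASSPATH is a standard Pig variable; the conf path assumes the stock hadoop-1.0.4 layout):
export HADOOP_INSTALL=/media/Software/hadoop-1.0.4
export PIG_CLASSPATH=$HADOOP_INSTALL/conf   # where core-site.xml and hadoop-site.xml live
export PATH=$PATH:$HADOOP_INSTALL/bin
pig   # should now start in mapreduce mode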
