Set LD_LIBRARY_PATH or java.library.path for YARN / Hadoop2 Jobs

I have a Hadoop FileSystem implementation which uses native libraries via JNI.
Apparently I have to make the shared object available independently of the currently executed job, but I can't find a way to tell Hadoop/YARN where it should look for the shared object.
I had partial success with the following approaches while starting the wordcount example with yarn.
Setting export JAVA_LIBRARY_PATH=/path when starting the ResourceManager and the NodeManager.
This helps with the ResourceManager and the NodeManager, but the actual job/application fails. Printing LD_LIBRARY_PATH and java.library.path while executing the wordcount example yields the following result:
/logs/userlogs/application_x/container_x_001/stdout
...
java.library.path : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_001:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
LD_LIBRARY_PATH : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x
Setting yarn.app.mapreduce.am.env="LD_LIBRARY_PATH=/path"
This did help with some of the jobs. The actual map/reduce job did work (at least I got the correct results), but the call failed with the error no jni-xtreemfs in java.library.path.
Somehow the first application/job did work and shows:
/logs/userlogs/application_x/container_x_001/stdout
...
java.library.path : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_001:/path:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
LD_LIBRARY_PATH : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_001:/path
But the second and all subsequent ones failed with:
/logs/userlogs/application_x/container_x_002/stdout
...
java.library.path : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_002:/opt/hadoop-2.7.1/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
LD_LIBRARY_PATH : /tmp/hadoop-u/nm-local-dir/usercache/u/appcache/application_x/container_x_002/opt/hadoop-2.7.1/lib/native
The stack trace for the latter shows that the error occurred while executing YarnChild:
2015-08-03 15:24:03,851 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.UnsatisfiedLinkError: no jni-xtreemfs in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1886)
at java.lang.Runtime.loadLibrary0(Runtime.java:849)
at java.lang.System.loadLibrary(System.java:1088)
at org.xtreemfs.common.libxtreemfs.jni.NativeHelper.loadLibrary(NativeHelper.java:54)
at org.xtreemfs.common.libxtreemfs.jni.NativeClient.<clinit>(NativeClient.java:41)
at org.xtreemfs.common.libxtreemfs.ClientFactory.createClient(ClientFactory.java:72)
at org.xtreemfs.common.libxtreemfs.ClientFactory.createClient(ClientFactory.java:51)
at org.xtreemfs.common.clients.hadoop.XtreemFSFileSystem.initialize(XtreemFSFileSystem.java:191)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Supplying libjni-xtreemfs.so via the command-line argument -files
This does work. I assume the .so is copied to the tmp directory. But it is not a feasible solution, because it would require users to supply the path to the .so on every call.
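For reference, a minimal sketch of that invocation (the examples jar location, the library path, and the input/output directories below are placeholders; the wordcount driver accepts -files because it parses the generic options):
yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount \
    -files /path/to/libjni-xtreemfs.so \
    /input /output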
Does anybody know how I can globally set LD_LIBRARY_PATH or java.library.path, or can suggest which configuration option I probably missed? I'd be very thankful!

Short answer: put the following in your mapred-site.xml:
<property>
<name>mapred.child.java.opts</name>
<value>-Djava.library.path=$PATH_TO_NATIVE_LIBS</value>
</property>
Explanation:
The jobs/applications aren't executed by YARN itself but by a mapred (map/reduce) container, whose configuration is controlled by the mapred-site.xml file. Specifying custom Java parameters there causes the actual workers to spin up with the correct path.
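If a cluster-wide default is not wanted, the same option can also be passed per job on the command line; a hedged sketch, assuming your driver goes through ToolRunner/GenericOptionsParser (my-app.jar and MyDriver are placeholders):
hadoop jar my-app.jar MyDriver \
    -D mapred.child.java.opts=-Djava.library.path=/path/to/native/libs \
    /input /output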

Use mapreduce.map.env in your job or site configuration.
Usage is as follows:
<property>
<name>mapreduce.map.env</name>
<value>LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/my/libs</value>
</property>
Note:
The Hadoop docs encourage the use of mapreduce.map.env for this over mapred.child.java.opts: "Usage of -Djava.library.path can cause programs to no longer function if hadoop native libraries are used."
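If the reduce tasks and the MapReduce ApplicationMaster also need the native library, the companion properties can be set the same way; a sketch with placeholder paths:
<property>
<name>mapreduce.reduce.env</name>
<value>LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/my/libs</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/my/libs</value>
</property>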

Related

Run install pig from ozzie hadoop

I have installed oozie on my system and I have also installed pig. Now I want ozzie to run workflow from pig which is installed on my system not from the ozzie sharelib. Please help, as I get the following error:
2015-08-19 17:15:25,724 WARN PigActionExecutor:523 - SERVER[edb-node1] USER[hduser] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000002-150819170943510-oozie-hdus-W] ACTION[0000002-150819170943510-oozie-hdus-W#pig-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.PigMain], exception invoking main(), java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
2015-08-19 17:15:25,728 WARN PigActionExecutor:523 - SERVER[edb-node1] USER[hduser] GROUP[-] TOKEN[] APP[pig-wf] JOB[0000002-150819170943510-oozie-hdus-W] ACTION[0000002-150819170943510-oozie-hdus-W#pig-node] Launcher exception: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:234)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.PigMain not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
... 13 more
You've got error messages that show clearly that you have an incomplete CLASSPATH.
That's because the pig command line does a lot of things, among which is silently setting up the appropriate Pig JARs in the CLASSPATH before invoking the PigMain Java class. But Oozie calls the Java class directly; the CLASSPATH issues are supposed to be handled either...
by the Pig ShareLib, when active (see the job.properties sketch below)
or by you, as a knowledgeable Java developer -- after all, it's your
choice not to use the default way to run Pig, so you know what you
are doing, right?
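As a hedged illustration of the ShareLib route, assuming the ShareLib has been installed on HDFS (hostnames and paths below are placeholders), the job's properties file would set oozie.use.system.libpath so the launcher picks up the Pig JARs:
# job.properties (illustrative values)
nameNode=hdfs://namenode:8020
jobTracker=resourcemanager:8032
oozie.wf.application.path=${nameNode}/user/hduser/pig-wf
oozie.use.system.libpath=true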
Before asking your question, did you try a search on StackOverflow (and/or Google) with the following keywords? The results could prove useful.
oozie pig custom classpath
In the above case, have you checked whether you are able to access the Pig CLI (grunt shell) or run a Pig script manually, without using Oozie?
What happened here is that the Pig JAR was not accessible when you tried to use it via Oozie. The best way to avoid such issues is to use the Oozie ShareLib, but as you mention you don't want it to work using the ShareLib, use some alternative like:
Update the Pig home path in the Hadoop classpath, as sketched below. This helps MR fetch the list of JARs from the HDFS temp location every time Oozie submits a request to MR.
Update the Pig home in your bash profile (if Pig shows a "command not found" error while accessing the grunt shell).
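A hedged sketch of the first option, assuming Pig is installed under /usr/local/pig (the path and the use of wildcard classpath entries are assumptions; adjust to your installation):
# e.g. in hadoop-env.sh on the nodes, or in the shell that submits the job
export PIG_HOME=/usr/local/pig                  # assumed install location
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$PIG_HOME/*:$PIG_HOME/lib/*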
Hope these will help some.

Job via Oozie HDP 2.1 not creating job.splitmetainfo

When trying to execute a Sqoop job which has my Hadoop program passed as a jar file in the -jarFiles parameter, the execution blows up with the error below. No resolution seems to be available. Other jobs with the same Hadoop user are executed successfully.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/user/root/.staging/job_1423050964699_0003/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1541)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1396)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1363)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:976)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:135)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1241)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1041)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1452)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1448)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1381)
So here is the way I solved it. We are using CDH5 to run Camus to pull data from Kafka. We run CamusJob, which is responsible for getting data from Kafka, using the command line:
hadoop jar...
The problem is that new hosts didn't get the so-called "yarn-gateway". Cloudera calls the pack of configs related to a service, copied to /etc/hadoop/conf, a "gateway". So I just clicked "deploy client configuration" in the CM UI. The YARN client conf was copied to each YARN NodeManager node, and that solved the problem.

NoClassDefFoundError for simple Yarn application

I was trying to run the simple YARN application from simple-yarn-app, but I am getting the following exception in my application error logs.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/conf/YarnConfiguration
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
at java.lang.Class.getMethod0(Class.java:2774)
at java.lang.Class.getMethod(Class.java:1663)
at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.conf.YarnConfiguration
But if I run the "yarn classpath" command on all my datanodes, I see the following output:
/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-yarn/lib/*
which has the paths to the yarn-client, yarn-api, yarn-common and hadoop-common jars required by the application. Can anyone point me in the direction of where I might have forgotten to set the right classpath?
I found that Hadoop does not resolve the $HADOOP_HOME and $YARN_HOME environment variables while iterating over the YarnConfiguration attributes. Running the following in your YARN client will print the unresolved configuration entries, like
$HADOOP_HOME/*, $HADOOP_HOME/lib/*
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Prints each entry of yarn.application.classpath as configured (i.e. unresolved)
YarnConfiguration conf = new YarnConfiguration();
for (String c : conf.getStrings(
        YarnConfiguration.YARN_APPLICATION_CLASSPATH,
        YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH)) {
    System.out.println(c);
}
So, if you provide the full path for the yarn.application.classpath property, the NoClassDefFoundError issue gets resolved.
<property>
<description>CLASSPATH for YARN applications. A comma-separated list of CLASSPATH entries</description>
<name>yarn.application.classpath</name>
<value>
/etc/hadoop/conf,
/usr/lib/hadoop/*,
/usr/lib/hadoop/lib/*,
/usr/lib/hadoop-hdfs/*,
/usr/lib/hadoop-hdfs/lib/*,
/usr/lib/hadoop-mapreduce/*,
/usr/lib/hadoop-mapreduce/lib/*,
/usr/lib/hadoop-yarn/*,
/usr/lib/hadoop-yarn/lib/*
</value>
</property>
The issue will happen on YARN clusters where the ResourceManager and/or NodeManager daemons are started with an application classpath that is incomplete. Even something as simple as the included spark-shell will fail:
user@linux$ spark-shell --master yarn-client
Sadly, you only find out when launching your application, or after running it long enough to hit the class(es) that are missing. To remedy this issue, I took the output of the following CLASSPATH command,
user@linux$ yarn classpath
and cleaned it up (because it contains duplicates and non-canonical entries), appended it to the below YARN configuration directive, which is found in
/etc/hadoop/conf/yarn-site.xml, and finally restarted the YARN cluster daemons:
user@linux$ sudo vi /etc/hadoop/conf/yarn-site.xml
[ ... ]
<property>
<name>yarn.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/*,
$HADOOP_COMMON_HOME/lib/*,
$HADOOP_HDFS_HOME/*,
$HADOOP_HDFS_HOME/lib/*,
$HADOOP_MAPRED_HOME/*,
$HADOOP_MAPRED_HOME/lib/*,
$YARN_HOME/*,
$YARN_HOME/lib/*,
/etc/hadoop/conf,
/usr/lib/hadoop/*,
/usr/lib/hadoop/lib,
/usr/lib/hadoop/lib/*,
/usr/lib/hadoop-hdfs,
/usr/lib/hadoop-hdfs/*,
/usr/lib/hadoop-hdfs/lib/*,
/usr/lib/hadoop-yarn/*,
/usr/lib/hadoop-yarn/lib/*,
/usr/lib/hadoop-mapreduce/*,
/usr/lib/hadoop-mapreduce/lib/*
</value>
</property>
The entries above that don't contain references to environment variables are the ones that I added. Remember to copy this modified file to all nodes on your YARN cluster prior to restarting the ResourceManager and NodeManager daemons.
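One hedged way to push the modified file out, assuming passwordless SSH and placeholder hostnames:
for host in node01 node02 node03; do          # placeholder hostnames
    scp /etc/hadoop/conf/yarn-site.xml "$host":/etc/hadoop/conf/yarn-site.xml
done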
In general, you'll want to package all non-provided dependencies (classes and modules) into your application archive. =:)
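If you build with Maven, one common way to do that packaging is the maven-shade-plugin; a minimal sketch (the plugin version shown is only an example):
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.3</version> <!-- example version -->
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>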

hbase-master fails to start

I am trying to set up a Hadoop development environment. I am using CDH4 and following the installation instructions on their website https://ccp.cloudera.com/display/CDH4DOC/.
I got to the point where I was able to install CDH4 in pseudo-distributed mode, and I am following the part regarding "Components that require additional configuration".
I have installed the HBase master package, but when I try to start the service I get the following error:
$ sudo /sbin/service hbase-master start
starting master, logging to /var/log/hbase/hbase-hbase-master-slc01euu.out
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: org.apache.hadoop.util.PlatformName. Program will exit.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
I suppose that it has something to do with some env variable (I believe HADOOP_HOME), but I am not sure where to look, since all the previous processes (namenode, datanode, jobtracker, tasktracker) started with no problem.
When I search for the HADOOP_HOME variable it says that it is undefined.
Do you guys have any idea about how I could solve this?
Thanks a lot in advance.
Please set HADOOP_HOME to your Hadoop installation directory,
e.g. export HADOOP_HOME=/usr/hadoop
Add a pointer to your HADOOP_CONF_DIR to the HBASE_CLASSPATH environment variable in hbase-env.sh.
Add a copy of hdfs-site.xml (and core-site.xml) under the ${HBASE_HOME}/conf folder.
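A hedged sketch of those steps, assuming Hadoop is installed in /usr/hadoop with its config under /usr/hadoop/conf (adjust the paths to your CDH layout):
# in the environment of the user that starts HBase (e.g. ~/.bashrc)
export HADOOP_HOME=/usr/hadoop                   # assumed install path
export HADOOP_CONF_DIR=$HADOOP_HOME/conf

# in ${HBASE_HOME}/conf/hbase-env.sh, so HBase picks up the Hadoop configuration
export HBASE_CLASSPATH=$HBASE_CLASSPATH:$HADOOP_CONF_DIR

# copy the HDFS configs so HBase sees them
cp $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/hdfs-site.xml ${HBASE_HOME}/conf/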

java.io.EOFException when trying to run examples on HBase standalone

I'm trying to run this example: https://github.com/larsgeorge/hbase-book/blob/master/ch03/src/main/java/client/PutExample.java, from this book: http://ofps.oreilly.com/titles/9781449396107/, on a standalone HBase installation. Starting HBase works fine and the shell is accessible, but when I try to run the example I get the following error:
Exception in thread "main" java.io.IOException: Call to /127.0.0.1:55958 failed on local exception: java.io.EOFException
at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:872)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:841)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:141)
at $Proxy4.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:174)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:295)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:272)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:324)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:228)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1228)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1190)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1177)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:914)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:810)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:784)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1014)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:814)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:778)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:188)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:159)
at client.CRUDExample.main(CRUDExample.java:26)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:548)
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:486)
Thanks in advance.
The problem was that I compiled the examples against a newer version of HBase than the one I was running. To fix this error for these examples, edit pom.xml, make sure that the HBase dependency is the same version as the one you're running, and build again. (Also don't forget to remove chXX/target/cached_classpath.txt, otherwise it still adds the other library to your classpath.)
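For illustration, the dependency in the pom.xml would look roughly like this, with the version pinned to whatever your server runs (the coordinates follow the pre-modular HBase releases the book targets; the version shown is a placeholder):
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>0.92.1</version> <!-- replace with the version of your running HBase -->
</dependency>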
I met this problem with exactly the same error logs, but for a different reason.
It was fixed after I added the items below to the HBase client config:
<property>
<name>hbase.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hbase.rpc.engine</name>
<value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
