Starting the datanode on a worker node throws this error - hadoop

I am using hadoop-2.5.1 and hbase-1.0.1.
When I start the datanode on a worker node with
hadoop-daemon.sh start datanode
it throws SLF4J: Class path contains multiple SLF4J bindings
and the datanode does not start.

I encountered the same error while installing Hive.
The simple solution is to remove the duplicate slf4j jar from the Hive lib folder; that duplicate is what produces the multiple-bindings error.
There are several slf4j jars, so check your error log to see which one is the culprit; most likely it is slf4j-log4j12 or something similar, depending on your Hive and Hadoop versions.
The error occurs because the same jar is present both in the Hadoop lib folder (which holds all the Hadoop-related jars) and in the Hive lib folder. When Hive is installed after Hadoop, the common jar is added a second time even though it already sits on the Hadoop classpath. Since Hive picks up that jar from the Hadoop lib folder anyway through its dependency on Hadoop, it is safe to remove the copy from the Hive lib folder.
Hope this solves your query.
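A minimal sketch of how you might find and remove the duplicate binding (the HADOOP_HOME/HIVE_HOME paths and the exact jar name are assumptions; check your own error log for the real file name):
# list the slf4j bindings on both classpaths to spot the duplicate
find $HADOOP_HOME -name "slf4j-log4j12*.jar"
find $HIVE_HOME/lib -name "slf4j-log4j12*.jar"
# move the Hive copy aside rather than deleting it outright
mv $HIVE_HOME/lib/slf4j-log4j12-*.jar /tmp/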

Related

Create Oozie Shared Library

I have Oozie installed and running. I am trying to run the example workflows and got an error saying I need to create the shared library. So I put the share folder in HDFS under /user/{username}/share/lib.
When I run bin/oozie-setup.sh sharelib create -fs hdfs://{host}:8020, I get an error: java.lang.ClassNotFoundException: org.apache.htrace.SamplerBuilder
I have tried adding the htrace-core jar to the libext folder of Oozie, but that gives me a Java Servlet error. Did I miss a step? I cannot seem to find anything related to this.
It ended up being an issue with my libext directory. I went back, added the ExtJS zip, and then copied in all the .jar files from my Hadoop directory. After filtering out the ones that overlap with Tomcat and the other jars bundled with the Oozie UI, I was able to get an example workflow running!
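A rough sketch of that repair, assuming Oozie is installed under $OOZIE_HOME and Hadoop under $HADOOP_HOME (both placeholders), and that you have downloaded the ExtJS archive the Oozie setup docs ask for:
# populate libext with the ExtJS archive and the Hadoop jars
cp /path/to/ext-2.2.zip $OOZIE_HOME/libext/
find $HADOOP_HOME/share/hadoop -name "*.jar" -exec cp {} $OOZIE_HOME/libext/ \;
# rebuild the war and recreate the sharelib
$OOZIE_HOME/bin/oozie-setup.sh prepare-war
$OOZIE_HOME/bin/oozie-setup.sh sharelib create -fs hdfs://{host}:8020
As the answer notes, you may still have to weed out jars that clash with the Tomcat/servlet jars already shipped with the Oozie web app.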

Shouldn't Oozie/Sqoop jar location be configured during package installation?

I'm using HDP 2.4 in CentOS 6.7.
I have created the cluster with Ambari, so Oozie was installed and configured by Ambari.
I got two errors while running Oozie/Sqoop related to jar file location. The first concerned postgresql-jdbc.jar, since the Sqoop job is incrementally importing from Postgres. I added the postgresql-jdbc.jar file to HDFS and pointed to it in workflow.xml:
<file>/user/hdfs/sqoop/postgresql-jdbc.jar</file>
It solved the problem. But the second error seems to concern kite-data-mapreduce.jar. However, doing the same for this file:
<file>/user/hdfs/sqoop/kite-data-mapreduce.jar</file>
does not seem to solve the problem:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception, org/kitesdk/data/DatasetNotFoundException
java.lang.NoClassDefFoundError: org/kitesdk/data/DatasetNotFoundException
It seems strange that this is not configured automatically by Ambari and that we have to copy jar files into HDFS as the errors appear.
Is this the correct methodology, or did I miss some configuration step?
This happens because jars are missing from the classpath. I would suggest setting the property oozie.use.system.libpath=true in the job.properties file. With that set, all the Sqoop-related jars are added to the classpath automatically from /user/oozie/share/lib/lib_<timestamp>/sqoop/*.jar; you then only need to add your custom jars (such as the JDBC driver) to the lib directory of the workflow application path.
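A minimal job.properties sketch with that flag set (the nameNode, jobTracker and application-path values are placeholders for your cluster):
nameNode=hdfs://{host}:8020
jobTracker={host}:8050
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/sqoop-wf
With oozie.use.system.libpath=true the launcher pulls the Sqoop-related jars from the sharelib, so only truly custom jars such as postgresql-jdbc.jar need to live in the workflow's lib directory.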

How to add the Mahout XML classifier jar to a Hadoop cluster, as I don't want to add that library to the Hadoop classpath

I am parsing an XML file using XmlInputFormat, which lives in the mahout-examples jar, but while running the map reduce jar I get the error below:
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.mahout.classifier.bayes.XmlInputFormat not found
Please let me know how I can make these jars available when running on a multi-node Hadoop cluster.
Include all the mahout-examples JARs in the "-libjars" command line option of the hadoop jar ... command. The JARs will be placed in the distributed cache and made available to all of the job's task attempts. More specifically, you will find them in one of the ${mapred.local.dir}/taskTracker/archive/${user.name}/distcache/… subdirectories on the local nodes.
Please refer to this link for more details.
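A sketch of the invocation (the job jar, driver class, mahout-examples jar path and input/output paths are all placeholders; note that -libjars is only honored when the driver parses generic options, e.g. via ToolRunner):
# ship the mahout-examples jar to the cluster via the distributed cache
hadoop jar my-xml-job.jar com.example.XmlParseDriver \
    -libjars /path/to/mahout-examples-job.jar \
    /input/xml /output/parsed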

Hadoop job DocumentDB dependency jar file

I have a Hadoop job that gets its input from Azure DocumentDB. I have put the DocumentDB dependency jars under a directory called 'lib'. However, when I run the job it gives me a ClassNotFoundException for one of the classes in the jar file. I also tried adding the jar files using the -libjars option, but that didn't work either. Does anyone have any idea what could be wrong?

How does zookeeper determine the 'java.library.path' for a hadoop job?

I am running Hadoop jobs on a distributed cluster using Oozie, and I set 'oozie.libpath' for the Oozie jobs.
Recently I deleted some of my older-version jar files from the library path Oozie uses and replaced them with newer versions. However, when I run my Hadoop job, both the older and newer versions of the jars get loaded, and the MapReduce job still uses the older version.
I am not sure where ZooKeeper is loading the jar files from. Are there any default locations it loads jars from? There is only one library path in my HDFS and it does not contain those jar files.
I have found what was going wrong. The wrong jar was shipped with my job file. I should have checked there first.
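If you hit the same thing, a quick way to see which copies of a jar are actually in play (the job file, workflow path and Oozie URL below are placeholders):
# jars bundled inside the job file itself
jar tf my-job.jar | grep '\.jar$'
# jars in the workflow's lib directory on HDFS
hdfs dfs -ls /user/{username}/apps/my-wf/lib
# sharelibs the Oozie server knows about
oozie admin -oozie http://{host}:11000/oozie -shareliblist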
