sqoop2 not finding log4j2 from hadoop

I am trying to install Sqoop2 (1.99.7) on my Ubuntu server, following the instructions provided on the Apache website here. I have a working Hadoop installation, and I have downloaded and extracted the Sqoop archive to /usr/local/sqoop:
tar -xvf sqoop-1.99.7-bin-hadoop200.tar.gz
mv sqoop-1.99.7-bin-hadoop200 /usr/local/sqoop
I believe I have all the environment variables defined, in particular HADOOP_HOME, which I understand tells Sqoop where to look for the Hadoop jar files.
However, when I try to verify the installation with sqoop2-tool verify, I get the following output.
Setting conf dir: /usr/local/sqoop/bin/../conf
Sqoop home directory: /usr/local/sqoop
Sqoop tool executor:
Version: 1.99.7
Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at org.apache.sqoop.security.authentication.SimpleAuthenticationHandler.secureLogin(SimpleAuthenticationHandler.java:36)
at org.apache.sqoop.security.AuthenticationManager.initialize(AuthenticationManager.java:98)
at org.apache.sqoop.core.SqoopServer.initialize(SqoopServer.java:57)
at org.apache.sqoop.tools.tool.VerifyTool.runTool(VerifyTool.java:36)
at org.apache.sqoop.tools.ToolRunner.main(ToolRunner.java:72)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more
Somehow, it is failing to find the log4j2 configuration file. I'm not sure why this is the case.
This question is similar to the one here, but the solution provided does not help. If I modify the sqoop.properties file and point it directly at the Hadoop configuration directory /usr/local/hadoop/etc/hadoop (which is where my core-site.xml, hdfs-site.xml, etc. are located), I continue to get the error above.
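For reference, the line I edited in sqoop.properties looks roughly like this (the property name is how I recall it from the stock 1.99.7 file, so treat it as an assumption):
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop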
EDIT
Output of grep -r "org.apache.hadoop.conf.Configuration" /usr/local/hadoop | grep jar
Binary file /usr/local/hadoop/share/hadoop/common/sources/hadoop-common-2.8.0-sources.jar matches
Binary file /usr/local/hadoop/share/hadoop/common/hadoop-common-2.8.0.jar matches
Binary file /usr/local/hadoop/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-common-2.8.0.jar matches
Binary file /usr/local/hadoop/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/hadoop-common-2.8.0.jar matches

sqoop.properties is a Java properties file. Environment variables should be defined in sqoop-env.sh or set using the export command.
Can you try executing the environment variable export commands below before running the sqoop command? If it works, you can add these commands to the sqoop-env.sh environment file.
export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HDFS_HOME=/usr/local/hadoop
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_YARN_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
Make sure /usr/local/hadoop is correct.
Edit -
If you look at the last line of the sqoop command, it's a bash script that uses the hadoop command internally to invoke the Sqoop class, so all Hadoop-related libs will be loaded into the Sqoop environment if the HADOOP_COMMON_HOME env variable is correct.
Are you able to execute hadoop commands on this server? Can you share the output of ${HADOOP_COMMON_HOME}/bin/hadoop fs -ls /? If that works, this error could be a compatibility issue: your Sqoop version may not be compatible with your Hadoop version.
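As a rough check (assuming the paths above are correct), you can confirm that the hadoop-common jar, which provides org.apache.hadoop.conf.Configuration, is where Sqoop will look for it, and then re-run the verification:
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
# the jar that contains org.apache.hadoop.conf.Configuration
ls ${HADOOP_COMMON_HOME}/share/hadoop/common/hadoop-common-*.jar
# if it is present, try the verification again
/usr/local/sqoop/bin/sqoop2-tool verify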

Related

Snappy compressed file on HDFS appears without extension and is not readable

I configured a Map Reduce job to save its output as a Sequence file compressed with Snappy. The MR job executes successfully; however, in HDFS the output file looks like the following:
I expected the file to have a .snappy extension, i.e. to be part-r-00000.snappy. Now I think this may be the reason the file is not readable when I try to read it from a local file system using this pattern: hadoop fs -libjars /path/to/jar/myjar.jar -text /path/in/HDFS/to/my/file
So I'm getting the –libjars: Unknown command when executing the command:
hadoop fs –libjars /root/hd/metrics.jar -text /user/maria_dev/hd/output/part-r-00000
And when I'm using this command hadoop fs -text /user/maria_dev/hd/output/part-r-00000, I'm getting the error:
18/02/15 22:01:57 INFO compress.CodecPool: Got brand-new decompressor [.snappy]
-text: Fatal internal error
java.lang.RuntimeException: java.io.IOException: WritableName can't load class: com.hd.metrics.IpMetricsWritable
Caused by: java.lang.ClassNotFoundException: Class com.hd.ipmetrics.IpMetricsWritable not found
Could it be that the absence of the .snappy extension causes the problem? What other command should I try to read the compressed file?
The jar is on my local file system under /root/hd/. Where should I place it to avoid the ClassNotFoundException, or how should I modify the command?
Instead of hadoop fs –libjars (which actually has the wrong hyphen and should be -libjars; copy that exactly and you won't see Unknown command), you should be using the HADOOP_CLASSPATH environment variable:
export HADOOP_CLASSPATH=/root/hd/metrics.jar:${HADOOP_CLASSPATH}
hadoop fs -text /user/maria_dev/hd/output/part-r-*
The error clearly says ClassNotFoundException: Class com.hd.ipmetrics.IpMetricsWritable not found.
It means that a required library is missing from the classpath.
To clarify your doubts:
Map-Reduce outputs files named part-* by default, and the extension carries no meaning. Remember that an extension is just metadata, usually needed by the Windows operating system to pick a suitable program for the file. It has no meaning in Linux/Unix, and the system's behavior is not going to change even if you rename the file to .snappy (you may actually try this).
The command looks absolutely fine for inspecting the snappy file, but it seems that some required jar files are not there, which is causing the ClassNotFoundException.
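As a quick sanity check (assuming the writable class is supposed to live in /root/hd/metrics.jar and a JDK is available on the box), you can list the jar's contents and confirm the class is really in it before adding it to HADOOP_CLASSPATH:
jar tf /root/hd/metrics.jar | grep IpMetricsWritable
export HADOOP_CLASSPATH=/root/hd/metrics.jar:${HADOOP_CLASSPATH}
hadoop fs -text /user/maria_dev/hd/output/part-r-00000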
EDIT 1:
By default, hadoop picks up jar files from the path emitted by the command below:
$ hadoop classpath
By default it lists all the Hadoop core jars.
You can add your own jar by executing the command below at the prompt:
export HADOOP_CLASSPATH=/path/to/my/custom.jar
After executing this, check the classpath again with the hadoop classpath command; you should see your jar listed along with the Hadoop core jars.
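For example, using the placeholder jar path from above, the check could look like this:
export HADOOP_CLASSPATH=/path/to/my/custom.jar
# the custom jar should now appear in the printed classpath
hadoop classpath | tr ':' '\n' | grep custom.jar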

Could not format the Namenode in hadoop 2.6?

I have installed Hadoop 2.6 on Ubuntu 14.04. I just followed this blog.
While I am trying to format the namenode, I am hitting the error below:
hduser@data1:~$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
/usr/local/hadoop/bin/hdfs: line 276: /home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java: No such file or directory
/home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java: No such file or directory
This error occurs because the JAVA_HOME you have provided does not contain java.
Just add this line to hadoop-env.sh and /home/hduser/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
I think you have already set the $JAVA_HOME but you did it wrong (just a guess):
/home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java
It should be:
/usr/lib/jvm/java-7-openjdk-amd64/bin/java
You probably added ~ before the path when you exported JAVA_HOME, which prepended the home directory /home/hduser.
To check this out, type java -version and see if java is working, then type echo $JAVA_HOME and inspect the path manually.
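For instance, the checks could look like this (assuming the OpenJDK 7 path from above):
echo $JAVA_HOME
# should print /usr/lib/jvm/java-7-openjdk-amd64, not a path under /home/hduser
ls $JAVA_HOME/bin/java
$JAVA_HOME/bin/java -version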
I figured it out. The entry we made was for amd64, but these are really i386 machines. Verify the path and that should fix the issue.

Error in Pig: Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop', and try again

I am trying to start Pig 0.12.0 on a Mac after installing Pig from the Apache website.
Before starting the Pig shell, I added the 4 lines below to a pig-env.sh file I created in the conf directory:
export JAVA_HOME=/usr
export PIG_HOME=/Users/Hadoop_Cluster/pig-0.12.0
export HADOOP_HOME=Users/Hadoop_Cluster/hadoop-1.2.1
export PIG_CLASSPATH=$HADOOP_HOME/conf/
Also, I added the text below to the pig.properties file:
fs.default.name=hdfs://localhost:9000
mapred.job.tracker=localhost:9001
I copied core-site.xml, hdfs-site.xml, and mapred-site.xml from Hadoop_home/conf to pig_home/conf.
I get the error below when starting Pig from the command line in Pig's bin directory. The error says:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop', and Try again
If it is not there, copy pig-0.12.0-withouthadoop.jar (renamed or not, it shouldn't matter) to your $PIG_HOME, so that in the end the file /Users/Hadoop_Cluster/pig-0.12.0/pig-0.12.0-withouthadoop.jar exists.
Also be careful about the lower case/upper case letters. Otherwise it should be fine.
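In other words, something like this (the source path is only an example; use wherever your extracted jar actually sits):
# hypothetical source location; adjust to where you found the jar
cp /path/to/pig-0.12.0-withouthadoop.jar /Users/Hadoop_Cluster/pig-0.12.0/
ls /Users/Hadoop_Cluster/pig-0.12.0/*withouthadoop*.jar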
Finally it works.
All I did was rename the file in the conf directory to "pig-withouthadoop.jar" instead of pig-0.12.0-withouthadoop. I also made sure Hadoop was not in safe mode.
I kept the same settings as below in the pig-env.sh file, and all three Hadoop config files are copied to the pig_home/conf directory:
export JAVA_HOME=/usr
export PIG_HOME=/Users/Hadoop_Cluster/pig-0.12.0
export HADOOP_HOME=/Users/Hadoop_Cluster/hadoop-1.2.1
export PIG_CLASSPATH=$HADOOP_HOME/conf/
I too got the same error. I solved it by removing /bin from the PIG_HOME path in .bashrc, then sourcing .bashrc and starting Pig:
export PIG_HOME=/home/hadoop/pig-0.13.0/bin ==> wrong
export PIG_HOME=/home/hadoop/pig-0.13.0 ==> correct..
You need to do what the generated error says:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop'
One needs to run the command ant jar-withouthadoop to get pig-withouthadoop.jar.
If ant is not installed, Ubuntu users can try apt-get install ant.
The ant jar-withouthadoop build takes roughly 15-20 minutes, so be patient while it completes.
I scratched my head all day and kept looking for solutions on Google; none helped.
On extraction of the Pig tar there is no jar created in the home directory; the steps above need to be followed to create the jar file and run Pig successfully.
I don't exactly know why this is necessary, but this is the solution that worked for me with Hadoop 1.2 (out of safe mode) and Pig 0.12.1.
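Roughly, the build sequence looks like this (assuming $PIG_HOME points at your Pig directory and a JDK is available):
sudo apt-get install ant     # Ubuntu; skip if ant is already installed
cd $PIG_HOME
ant jar-withouthadoop        # takes roughly 15-20 minutes
ls *withouthadoop*.jar       # the built jar should now be present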
The key is to find pig-withouthadoop.jar in your $PIG_HOME. So use
find / -name '*withouthadoop*'
to locate it. The jar you find may not be named pig-withouthadoop.jar exactly; rename it and cp it to $PIG_HOME. Worked for me.

Hadoop and JZMQ - no jzmq in java.library.path

I'm trying to get the JZMQ code working on one of the nodes of a Hadoop cluster. I have the necessary native jzmq library files installed under the /usr/local/lib directory on that node.
Here's the list -
libjzmq.a libjzmq.la libjzmq.so libjzmq.so.0 libjzmq.so.0.0.0 libzmq.a libzmq.la libzmq.so libzmq.so.3 libzmq.so.3.0.0 pkgconfig
In my shell script, if I run the Java command below, it works absolutely fine:
java -Djava.library.path=/usr/local/lib -classpath class/:lib/:lib/jzmq-2.1.3.jar bigdat.twitter.queue.TweetOMQSub
But when I run the command below, it throws Exception in thread "main" java.lang.UnsatisfiedLinkError: no jzmq in java.library.path:
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub
I explicitly set the necessary files/jars in the Hadoop classpath, opts, etc., using export commands:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib/"
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/txtUser/analytics/lib/*
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native/:/usr/local/lib/
The source code JAR and jzmq-2.1.3.jar files are present under /home/txtUser/analytics/lib/ folder on Hadoop node.
Also, /usr/local/lib is added to the system ld.conf.
Can anyone give input on what I may be doing wrong here?
Add the following line to /etc/profile or .bashrc:
export JAVA_LIBRARY_PATH=/usr/local/lib
Reload /etc/profile or .bashrc.
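Concretely, something along these lines, using the paths already mentioned above:
echo 'export JAVA_LIBRARY_PATH=/usr/local/lib' >> ~/.bashrc
source ~/.bashrc
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub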

Doubts in configuration of SQOOP

Scenario:
I have configured Sqoop on my PC, but I am facing a problem: when I run bin/sqoop I get the following error:
Error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getInstances(Ljava/lang/String;Ljava/lang/Class;)Ljava/util/List;
at com.cloudera.sqoop.tool.SqoopTool.loadPlugins(SqoopTool.java:139)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:209)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:228)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:237)
Question:
What could be the problem? I have also set the paths of $HBASE_HOME and $ZOOKEEPER_HOME.
Please suggest how I can fix this.
Thanks.
I am giving you the steps as I configured them on my machine.
Download sqoop-1.3.0-cdh3u1 from the Cloudera archive.
Download mysql-connector-java-5.0.8 and copy the mysql-connector-java-5.0.8.jar file to the lib and bin directories of Sqoop (for the Sqoop-MySQL connection).
Copy all jars from lib to bin (optional)
Add these 2 lines to your .bash_profile file:
export SQOOP_HOME=/home/hadoop/Desktop/Cloudera/sqoop-1.3.0-cdh3u1
export PATH=$PATH:$SQOOP_HOME/bin
Save it and just type sqoop help in the terminal.
It worked on my machine. Post the steps you followed.
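For instance, assuming the paths from the steps above:
source ~/.bash_profile
echo $SQOOP_HOME     # should print /home/hadoop/Desktop/Cloudera/sqoop-1.3.0-cdh3u1
sqoop help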
Maybe this helps:
https://issues.apache.org/jira/browse/SQOOP-384
Try to downgrade to a different version of Sqoop.
