Hadoop and JZMQ - no jzmq in java.library.path - hadoop

I'm trying to get the JZMQ code working on one of the nodes of a Hadoop cluster. I have the necessary native jzmq library files installed under /usr/local/lib on that node.
Here's the list -
libjzmq.a libjzmq.la libjzmq.so libjzmq.so.0 libjzmq.so.0.0.0 libzmq.a libzmq.la libzmq.so libzmq.so.3 libzmq.so.3.0.0 pkgconfig
If I run the Java command below from my shell script, it works absolutely fine:
java -Djava.library.path=/usr/local/lib -classpath class/:lib/:lib/jzmq-2.1.3.jar bigdat.twitter.queue.TweetOMQSub
But when I run the command below:
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub
it throws:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jzmq in java.library.path
I explicitly set the necessary files/jars in the Hadoop classpath, opts, etc., using export commands:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib/"
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/txtUser/analytics/lib/*
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native/:/usr/local/lib/
The JAR built from the source code and the jzmq-2.1.3.jar file are present under the /home/txtUser/analytics/lib/ folder on the Hadoop node.
Also, /usr/local/lib has been added to the system's ld.so.conf.
Can anyone suggest what I may be doing wrong here?

Add the following line to /etc/profile or .bashrc:
export JAVA_LIBRARY_PATH=/usr/local/lib
Then reload /etc/profile or .bashrc.
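For example, a minimal sketch of applying that and re-running the job (the jar path and class name are taken from the question; HADOOP_OPTS is the alternative route already shown above):
echo 'export JAVA_LIBRARY_PATH=/usr/local/lib' >> ~/.bashrc
source ~/.bashrc
ldconfig -p | grep jzmq    # confirm the native jzmq library is visible system-wide
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib"    # alternative: pass the JVM flag directly
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub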

Related

sqoop2 not finding log4j2 from hadoop

I am trying to install Sqoop2 (1.99.7) on my Ubuntu server, following the instructions provided on the Apache website here. I have a working Hadoop installation, and I have downloaded and extracted the Sqoop file to the /usr/local/sqoop location.
tar -xvf sqoop-1.99.7-bin-hadoop200.tar.gz
mv sqoop-1.99.7-bin-hadoop200 /usr/local/sqoop
I believe I have all the environment variables defined, in particular HADOOP_HOME, which I understand tells Sqoop where to look for the Hadoop jar files.
However, when I try to verify the installation with sqoop2-tool verify I get the following output.
Setting conf dir: /usr/local/sqoop/bin/../conf
Sqoop home directory: /usr/local/sqoop
Sqoop tool executor:
Version: 1.99.7
Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at org.apache.sqoop.security.authentication.SimpleAuthenticationHandler.secureLogin(SimpleAuthenticationHandler.java:36)
at org.apache.sqoop.security.AuthenticationManager.initialize(AuthenticationManager.java:98)
at org.apache.sqoop.core.SqoopServer.initialize(SqoopServer.java:57)
at org.apache.sqoop.tools.tool.VerifyTool.runTool(VerifyTool.java:36)
at org.apache.sqoop.tools.ToolRunner.main(ToolRunner.java:72)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more
Somehow, it is failing to find the log4j2 configuration file. I'm not sure why this is the case.
This question is similar to the one here, but the solution provided does not help. If I modify the sqoop.properties file to point directly to the Hadoop configuration directory /usr/local/hadoop/etc/hadoop (which is where my core-site.xml, hdfs-site.xml, etc. are located), I continue to get the error above.
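For reference, a minimal sketch of checking that edit (the property name is assumed from the 1.99.7 default sqoop.properties, so treat it as an assumption):
grep 'mapreduce.configuration.directory' /usr/local/sqoop/conf/sqoop.properties
# expected to show something like:
# org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop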
EDIT
Output of grep -r "org.apache.hadoop.conf.Configuration" /usr/local/hadoop | grep jar
Binary file /usr/local/hadoop/share/hadoop/common/sources/hadoop-common-2.8.0-sources.jar matches
Binary file /usr/local/hadoop/share/hadoop/common/hadoop-common-2.8.0.jar matches
Binary file /usr/local/hadoop/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/hadoop-common-2.8.0.jar matches
Binary file /usr/local/hadoop/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/hadoop-common-2.8.0.jar matches
sqoop.properties is a Java properties file. Environment variables should be defined in sqoop-env.sh or set using export commands.
Can you try executing the environment variable export commands below before running the sqoop command? If it works, you can add these commands to the sqoop-env.sh environment file.
export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HDFS_HOME=/usr/local/hadoop
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_YARN_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop
Make sure /usr/local/hadoop is correct.
Edit -
If you look at the last line of the sqoop command, it's a bash script that uses the hadoop command internally to invoke the Sqoop class, so all Hadoop-related libs will be loaded into the Sqoop environment if the HADOOP_COMMON_HOME env variable is correct.
Are you able to execute hadoop commands on this server? Can you share the output of ${HADOOP_COMMON_HOME}/bin/hadoop fs -ls / ? If this works, the error could be a compatibility issue: the Sqoop version may not be compatible with your Hadoop version.
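A quick check along those lines (a sketch; paths taken from the question and the exports above):
${HADOOP_COMMON_HOME}/bin/hadoop fs -ls /
# if that works, confirm hadoop-common is on the classpath the hadoop command builds:
hadoop classpath | tr ':' '\n' | grep hadoop-common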

spark-shell throws error in Apache spark

I have installed Hadoop on Ubuntu in VirtualBox (host OS Windows 7). I have also installed Apache Spark, configured SPARK_HOME in .bashrc and added HADOOP_CONF_DIR to spark-env.sh. Now when I start spark-shell it throws an error and does not initialize the Spark context or SQL context. Am I missing something in the installation? I would also like to run it on a cluster (a 3-node Hadoop cluster is set up).
I had the same issue when trying to install Spark locally on Windows 7. Please make sure the paths below are correct, and I am sure it will work for you. I answered the same question in this link, so you can follow the steps below and it will work.
Create JAVA_HOME variable: C:\Program Files\Java\jdk1.8.0_181\bin
Add the following part to your path: ;%JAVA_HOME%\bin
Create SPARK_HOME variable: C:\spark-2.3.0-bin-hadoop2.7\bin
Add the following part to your path: ;%SPARK_HOME%\bin
The most important part: the Hadoop path should include the bin folder before winutils.exe, as follows: C:\Hadoop\bin. Make sure winutils.exe is located inside this path.
Create HADOOP_HOME Variable: C:\Hadoop
Add the following part to your path: ;%HADOOP_HOME%\bin
Now you can open cmd and run spark-shell; it will work.

Error in Pig: Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop', and try again

I am trying to start Pig 0.12.0 on a Mac after installing Pig from the Apache website.
Before starting the Pig shell, I created a pig-env.sh file in the conf directory and copied in the 4 lines below.
Export JAVA_HOME=/usr
Export PIG_HOME=/Users/Hadoop_Cluster/pig-0.12.0
Export HADOOP_HOME=Users/Hadoop_Cluster/hadoop-1.2.1
Export PIG_CLASSPATH=$HADOOP_HOME/conf/
Also, Added below text in pig.properties file:
Fs.default.name=hdfs://localhost:9000
Mapred.job.tracker=localhost:9001
I copied core-site.xml, hdfs-site.xml and mapred-site.xml from Hadoop_home/conf to pig_home/conf.
I get the error below when starting Pig from the command line under Pig's bin directory. The error says:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop', and try again
If it is not there, copy pig-0.12.0-withouthadoop.jar (renamed or not, it shouldn't matter) to your $PIG_HOME, so that in the end the file /Users/Hadoop_Cluster/pig-0.12.0/pig-0.12.0-withouthadoop.jar exists.
Also be careful about lower case/upper case letters (e.g. export, not Export). Otherwise it should be fine.
Finally it works.
All I did was rename the file (in the conf directory) to "pig-withouthadoop.jar" instead of pig-0.12.0-withouthadoop.jar. I also made sure Hadoop is not in safe mode.
I kept the same settings as below in the pig-env.sh file, and all 3 Hadoop config files are copied to the pig_home/conf directory.
export JAVA_HOME=/usr
export PIG_HOME=/Users/Hadoop_Cluster/pig-0.12.0
export HADOOP_HOME=/Users/Hadoop_Cluster/hadoop-1.2.1
export PIG_CLASSPATH=$HADOOP_HOME/conf/
I too got the same error. I solved it by removing /bin from the PIG_HOME path in .bashrc, then sourcing .bashrc and starting Pig.
export PIG_HOME=/home/hadoop/pig-0.13.0/bin ==> wrong
export PIG_HOME=/home/hadoop/pig-0.13.0 ==> correct..
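In other words, a small sketch of applying the fix:
# reload the corrected PIG_HOME, then start Pig
source ~/.bashrc
pig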
You need to follow what the error message says:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop'
You need to run the command ant jar-withouthadoop to get pig-withouthadoop.jar.
If ant is not installed, Ubuntu users can try apt-get install ant.
The ant jar-withouthadoop build takes roughly 15-20 minutes, so be patient while it runs.
I scratched my head all day and kept looking for solutions on Google; none helped.
On extracting the Pig tar, no jar is created in the home directory. The steps above are needed to create the jar file and run Pig successfully.
I don't know exactly why it works this way, but this is the solution that worked for me with Hadoop 1.2 (out of safe mode) and Pig 0.12.1.
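A minimal sketch of those steps, assuming PIG_HOME points at the extracted Pig directory (apt-get applies to Ubuntu, as noted above):
sudo apt-get install ant    # skip if ant is already installed
cd $PIG_HOME                # e.g. /Users/Hadoop_Cluster/pig-0.12.0
ant jar-withouthadoop       # builds the jar; takes roughly 15-20 minutes
ls *withouthadoop*.jar      # rename to pig-withouthadoop.jar if needed, per the answers above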
The key is to find pig-withouthadoop.jar in your $PIG_HOME. To locate it, use
find / -name "*withouthadoop*"
The jar you find may have a versioned name; rename it to pig-withouthadoop.jar and cp it to $PIG_HOME. Worked for me.

Failed to locate the winutils binary in the hadoop binary path

I am getting the following error while starting the namenode for the latest hadoop-2.2 release. I didn't find a winutils.exe file in the Hadoop bin folder. I tried the commands below:
$ bin/hdfs namenode -format
$ sbin/yarn-daemon.sh start resourcemanager
ERROR [main] util.Shell (Shell.java:getWinUtilsPath(303)) - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:863)
Simple Solution:
Download it from here and add it to $HADOOP_HOME/bin.
(Source)
IMPORTANT UPDATE:
For hadoop-2.6.0 you can download the binaries from the Titus Barik blog.
I not only needed to point HADOOP_HOME to the extracted directory [path], but also to provide the system property -Djava.library.path=[path]\bin to load the native libs (dll).
If you face this problem when running a self-contained local application with Spark (i.e., after adding spark-assembly-x.x.x-hadoopx.x.x.jar or the Maven dependency to the project), a simpler solution would be to put winutils.exe (download from here) in "C:\winutil\bin". Then you can tell the code where the Hadoop home directory (containing winutils.exe) is by adding the following line:
System.setProperty("hadoop.home.dir", "c:\\\winutil\\\")
Source: Click here
If we directly take the binary distribution of Apache Hadoop 2.2.0 release and try to run it on Microsoft Windows, then we'll encounter ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path.
The binary distribution of Apache Hadoop 2.2.0 release does not contain some windows native components (like winutils.exe, hadoop.dll etc). These are required (not optional) to run Hadoop on Windows.
So you need to build a Windows native binary distribution of Hadoop from the source code, following the "BUILD.txt" file located inside the Hadoop source distribution. You can also follow the posts below for a step-by-step guide with screenshots:
Build, Install, Configure and Run Apache Hadoop 2.2.0 in Microsoft Windows OS
ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
The statement
java.io.IOException: Could not locate executable null\bin\winutils.exe
shows that the null comes from expanding or resolving an environment variable. If you look at the source of Shell.java in the Common package, you will find that the HADOOP_HOME variable is not getting set, so you receive null in its place, hence the error.
So HADOOP_HOME needs to be set properly, or else the hadoop.home.dir system property.
Hope this helps.
Thanks,
Kamleshwar.
winutils.exe is used for running shell commands for Spark.
When you need to run Spark without installing Hadoop, you need this file.
Steps are as follows:
Download the winutils.exe from following location for hadoop 2.7.1
https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin
[NOTE: If you are using separate hadoop version then please download the winutils from corresponding hadoop version folder on GITHUB from the location as mentioned above.]
Now, create a folder 'winutils' in the C:\ drive, create a folder 'bin' inside 'winutils', and copy winutils.exe into that folder.
So the location of winutils.exe will be C:\winutils\bin\winutils.exe
Now, open environment variable and set HADOOP_HOME=C:\winutils
[NOTE: Please do not add \bin in HADOOP_HOME and no need to set HADOOP_HOME in Path]
Your issue should now be resolved!
I just ran into this issue while working with Eclipse. In my case, I had the correct Hadoop version downloaded (hadoop-2.5.0-cdh5.3.0.tgz), I extracted the contents and placed it directly in my C drive. Then I went to
Eclipse->Debug/Run Configurations -> Environment (tab) -> and added
variable: HADOOP_HOME
Value: C:\hadoop-2.5.0-cdh5.3.0
You can download winutils.exe here:
http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe
Then copy it to your HADOOP_HOME/bin directory.
In PySpark, to run a local Spark application using PyCharm, use the lines below:
os.environ['HADOOP_HOME'] = "C:\\winutils"
print os.environ['HADOOP_HOME']
winutils.exe is required for Hadoop to perform Hadoop-related commands. Please download the hadoop-common-2.2.0 zip file. winutils.exe can be found in its bin folder. Extract the zip file and copy winutils.exe into the local hadoop/bin folder.
I was facing the same problem. Removing the bin\ from the HADOOP_HOME path solved it for me. The path for the HADOOP_HOME variable should look something like:
C:\dev\hadoop2.6\
System restart may be needed. In my case, restarting the IDE was sufficient.
As most answers here refer to pretty old versions of winutils, I will leave a link to the most comprehensive repository, which supports all versions of Hadoop including the most recent ones:
https://github.com/kontext-tech/winutils
(find the directory corresponding to your Hadoop version, or try the most recent one).
If you have admin permissions on your machine:
Put the bin directory into C:\winutils
The whole path should be C:\winutils\bin\winutils.exe
Set HADOOP_HOME to C:\winutils
If you don't have admin permissions or want to put the binaries into user space:
Put the bin directory into C:\Users\vryabtse\AppData\Local\Programs\winutils or a similar user directory.
Set HADOOP_HOME to the path of this directory.
Set up the HADOOP_HOME variable in Windows to resolve the problem.
You can find the answer in org/apache/hadoop/hadoop-common/2.2.0/hadoop-common-2.2.0-sources.jar!/org/apache/hadoop/util/Shell.java :
IOException from
public static final String getQualifiedBinPath(String executable)
    throws IOException {
  // construct hadoop bin path to the specified executable
  String fullExeName = HADOOP_HOME_DIR + File.separator + "bin"
      + File.separator + executable;
  File exeFile = new File(fullExeName);
  if (!exeFile.exists()) {
    throw new IOException("Could not locate executable " + fullExeName
        + " in the Hadoop binaries.");
  }
  return exeFile.getCanonicalPath();
}
HADOOP_HOME_DIR from
// first check the Dflag hadoop.home.dir with JVM scope
String home = System.getProperty("hadoop.home.dir");
// fall back to the system/user-global env variable
if (home == null) {
  home = System.getenv("HADOOP_HOME");
}
Download the desired version of the Hadoop folder (say, if you are installing Spark on Windows, the Hadoop version your Spark build was built for) from this link as a zip.
Extract the zip to the desired directory.
You need to have a directory of the form hadoop\bin (explicitly create such a hadoop\bin directory structure if you want), with bin containing all the files from the bin folder of the downloaded Hadoop. This will contain many files such as hdfs.dll, hadoop.dll etc. in addition to winutils.exe.
Now create an environment variable HADOOP_HOME and set it to <path-to-hadoop-folder>\hadoop. Then add ;%HADOOP_HOME%\bin; to the PATH environment variable.
Open a "new command prompt" and try rerunning your command.
Download winutils.exe
from the URL:
https://github.com/steveloughran/winutils/hadoop-version/bin
Paste it under HADOOP_HOME/bin.
Note: you should set the environment variables:
User variable:
Variable: HADOOP_HOME
Value: Hadoop or Spark dir
I used "hbase-1.3.0" and "hadoop-2.7.3" versions. Setting HADOOP_HOME environment variable and copying 'winutils.exe' file under HADOOP_HOME/bin folder solves the problem on a windows os.
Attention to set HADOOP_HOME environment to the installation folder of hadoop(/bin folder is not necessary for these versions).
Additionally I preferred using cross platform tool cygwin to settle linux os functionality (as possible as it can) because Hbase team recommend linux/unix env.
I was getting the same issue on Windows. I fixed it by:
Downloading hadoop-common-2.2.0-bin-master from the link.
Creating a user variable HADOOP_HOME in Environment Variables and assigning the path of the hadoop-common bin directory as its value.
You can verify it by running hadoop in cmd.
Restart the IDE and run it.
I recently got the same error message while running a Spark application in IntelliJ IDEA. What I did was download the winutils.exe compatible with the Spark version I was running and move it to the Spark bin directory. Then in IntelliJ I edited the run configuration.
The 'Environment variables' field was empty, so I entered HADOOP_HOME = P:\spark-2.4.7-bin-hadoop2.7
Since winutils.exe is in the P:\spark-2.4.7-bin-hadoop2.7\bin directory, it will be located while running.
So, by setting HADOOP_HOME, the null becomes the HADOOP_HOME directory, and the complete path is P:\spark-2.4.7-bin-hadoop2.7\bin\winutils.exe
That was how I resolved it.

Hadoop JAR command - Setting java.library.path

I'm trying to run a Java program on a Hadoop cluster. Here's the command:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/lib/*:/home/rgupta/bdAnalytics/lib/*
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub > $logsFldr/subsHdpOMQ_$1.log 2>&1 &
#java -Djava.library.path=/usr/local/lib -classpath class/:lib/:lib/jzmq-2.1.3.jar bigdat.twitter.queue.TweetOMQSub > log/subsFilterOMQ_$1.log 2>&1 &
This throws the following error:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jzmq in java.library.path
If I use the plain java command above (commented out), it works fine. Also, the Hadoop node where I'm trying to test it does have the necessary jzmq libraries under the /usr/local/lib directory. Is there a way I can set java.library.path for the hadoop jar command?
Please suggest how I can fix this.
Sorry, I misread your question, so editing:
You should be able to use the -libjars option.
In your case:
HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/local/lib/:/home/rgupta/bdAnalytics/lib/
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub -libjars /usr/local/lib ...
Try export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib"
and export the other jars the usual way you are doing before running a job, using HADOOP_CLASSPATH.
Hope this helps.
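Putting it together, a minimal sketch (paths and class name taken from the question; one possible combination rather than the definitive fix):
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/rgupta/bdAnalytics/lib/*
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=/usr/local/lib"    # directory containing libjzmq.so
hadoop jar $jarpath bigdat.twitter.queue.TweetOMQSub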
