CLASSPATH issue in Hadoop on Cygwin while running "hadoop version" command

I have installed Hadoop version 2.1 beta from Apache on Windows using the Cygwin terminal. Running the command hadoop version gives me this error:
Error: Could not find or load main class org.apache.hadoop.util.VersionInfo

You can also add the following to your ~/.bashrc:
export HADOOP_CLASSPATH=$(cygpath -pw $(hadoop classpath)):$HADOOP_CLASSPATH
This solved it for me.
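For context: hadoop classpath prints a Unix-style, colon-separated list under Cygwin, and cygpath -pw rewrites it into the Windows form the JVM expects. A rough illustration (the paths below are made up, not taken from any particular install):
$ cygpath -pw "/cygdrive/c/hadoop/etc/hadoop:/cygdrive/c/hadoop/share/hadoop/common/*"
C:\hadoop\etc\hadoop;C:\hadoop\share\hadoop\common\*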

I ran into the same issue when trying to install Hadoop 2.2.0 on Windows Server 2008 SP1 64-bit.
I installed Cygwin64 and configured OpenSSH.
The answer by user2870991 works for me.
Modify the \hadoop\bin\hadoop script as below: comment out the original exec line and insert the new one.
#exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
# add -classpath "$(cygpath -pw "$CLASSPATH")" to fix the script when running under Cygwin
exec "$JAVA" -classpath "$(cygpath -pw "$CLASSPATH")" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"

Add the statement below in hadoop-config.sh (around line 285):
CLASSPATH=`cygpath -wp "$CLASSPATH"`
# the existing block below is unchanged, shown for context
if [ "$HADOOP_CLASSPATH" != "" ]; then
  # Prefix it if it's to be preceded
  if [ "$HADOOP_USER_CLASSPATH_FIRST" != "" ]; then
    CLASSPATH=${HADOOP_CLASSPATH}:${CLASSPATH}
  else
    CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
  fi
fi
Output:
admin@admin-PC /cygdrive/e/hadoop/hadoop-2.2.0/bin
$ ./hadoop version
Hadoop 2.2.0
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /E:/hadoop/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar

Related

Unexpected behavior in bash scripting on a ssh function

I've built a Raspberry Pi cluster, with Spark and Hadoop installed, and have made a few functions in .bashrc to make communication and interaction a little easier:
function otherpis {
  grep "pi" /etc/hosts | awk '{print $2}' | grep -v $(hostname)
}
function clustercmd {
  for pi in $(otherpis); do ssh $pi "$@"; done
  $@
}
Here otherpis simply looks at the hosts file, where I've listed all the other Raspberry Pis in the cluster with their static IP addresses. I've also configured ssh with authorized keys so I don't have to enter a password every time I ssh in.
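For illustration, the /etc/hosts entries that otherpis relies on might look something like this (the addresses and hostnames here are made up):
192.168.0.101    pi1
192.168.0.102    pi2
192.168.0.103    pi3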
I can call commands like
$ clustercmd date
Thu 03 Oct 2019 02:00:13 PM CDT
Thu 03 Oct 2019 02:00:11 PM CDT
Thu 03 Oct 2019 02:00:12 PM CDT
......
or
$ clustercmd sudo mkdir -p /opt/hadoop_tmp/hdfs
and it works just fine. But for some reason, when I pass anything involving hadoop or spark to the command, it says the command can't be found, except on the Pi I'm invoking it from.
$ clustercmd hadoop version | grep Hadoop
bash: hadoop: command not found
bash: hadoop: command not found
.....
Hadoop 3.2.1
But when I manually ssh into a pi and call the command, it works just fine.
$ ssh pi2
pi@pi2: $ hadoop version | grep Hadoop
Hadoop 3.2.1
I've exported all the proper paths in .bashrc. I've chown'd all relevant directories on each Pi. No matter what I try, only the spark and hadoop commands aren't registering; everything else is. I even have a function that will copy a file across the entire cluster:
function clusterscp {
  for pi in $(otherpis); do
    cat $1 | ssh $pi "sudo tee $1" > /dev/null 2>&1
  done
}
I set up hadoop and spark on the first Pi, then mass-transferred all files and configurations with the above function with no problems. Any insight would help.
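For reference, a typical invocation of clusterscp looks like this (the path is only an illustration):
$ clusterscp /opt/hadoop/etc/hadoop/core-site.xml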
EDIT
Adding all exported paths in .bashrc
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin
export HADOOP_OPTS="-XX:-PrintWarnings -Djava.net.preferIPv4Stack=true"
export HADOOP_HOME_WARN_SUPPRESS=1
export HADOOP_ROOT_LOGGER="WARN,DRFA"
Note that, as I stated earlier, when I'm actually SSH'd into a Pi all exported paths work fine; it is only when I run a clustercmd command that hadoop and spark are not found.
Solved
I fixed this by moving all exports above this block in .bashrc:
# If not running interactively, don't do anything
case $- in
  *i*) ;;
  *) return ;;
esac
I also added the exports to .profile in the home directory. This was originally suggested by mangusta; his answer names the file ".bash_profile", but on the Pi the file should just be ".profile".
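A minimal sketch of the ordering that fixed it, using the exports from above (contents abbreviated):
# ~/.bashrc -- exports placed BEFORE the interactive check so that
# non-interactive ssh commands (like clustercmd) also pick them up
export HADOOP_HOME=/opt/hadoop
export SPARK_HOME=/opt/spark
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin

# If not running interactively, don't do anything beyond this point
case $- in
  *i*) ;;
  *) return ;;
esac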
~/.bashrc is executed when you have already logged in and want to open a new terminal window or execute a new shell instance.
If you login into machine (locally or remotely), what actually runs is ~/.bash_profile, not ~/.bashrc.
Try including your HADOOP_HOME and SPARK_HOME folders in PATH inside .bash_profile on all your pi_N hosts.
I think you can directly debug the problem reported by bash ("command not found") by running
echo $PATH
in the ssh shells on each RPi, and comparing the result with the result of
clustercmd 'echo $PATH'
Probably this will show that mangusta's answer is correct.
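If that guess is right, the two outputs would differ roughly like this (the paths below are illustrative, not actual output):
$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/opt/hadoop/bin:/opt/hadoop/sbin:/opt/spark/bin
$ clustercmd 'echo $PATH'
/usr/local/bin:/usr/bin:/bin
/usr/local/bin:/usr/bin:/bin
...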

Spring boot CLI doesn't work on Git Bash on windows

Spring Boot CLI refuses to run in a Git Bash window. I've added it to the PATH in Windows and it works from cmd. The error in Git Bash is:
$ spring
Error: Could not find or load main class org.springframework.boot.loader.JarLauncher
The problem occurs because, when attempting to run the Spring Boot CLI jar, the script uses malformed file paths for both JAVA_HOME and the classpath. Looking at the bin directory of the Spring installation, you can see two scripts:
spring
spring.bat
The spring.bat script is executed when you run from the Windows CMD and works fine; however, Git Bash uses the spring script. That script attempts to correct the issue by using cygpath to ensure the file paths are in a Unix format, but it only does so when it determines that it is running in a Cygwin environment, and it does not make this determination when running from Git Bash. As a result the file paths become malformed.
Fortunately there is a hack that can resolve this issue if you are interested. Comment out the if statement at lines 17 through 19 and then add its contents as a separate line, like so:
# if $cygwin ; then
# [ -n "$JAVA_HOME" ] && JAVA_HOME=`cygpath --unix "$JAVA_HOME"`
# fi
[ -n "$JAVA_HOME" ] && JAVA_HOME=`cygpath --unix "$JAVA_HOME"`
And again for another if statement at line 92:
# if $cygwin; then
# SPRING_HOME=`cygpath --path --mixed "$SPRING_HOME"`
# CLASSPATH=`cygpath --path --mixed "$CLASSPATH"`
# fi
SPRING_HOME=`cygpath --path --mixed "$SPRING_HOME"`
CLASSPATH=`cygpath --path --mixed "$CLASSPATH"`
You will now be able to run the Spring Boot CLI from git bash.
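Alternatively, instead of removing the checks, one could widen the detection so the script also treats Git Bash like Cygwin. A rough sketch, assuming the script initializes a cygwin=false flag before its uname check (as such launchers typically do) and that cygpath is available in your Git Bash, which the fix above already relies on:
# Sketch: widen the existing uname check so Git Bash (MINGW*/MSYS*) is
# treated like Cygwin and the cygpath conversions below are applied there too.
case "`uname`" in
  CYGWIN*|MINGW*|MSYS*) cygwin=true ;;
esac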
I had the same issue while running the "spring" command in Git Bash. I found that my Spring was installed on the D drive (while Java was installed on the C drive). When running the "spring" command in Git Bash from a C-drive directory, I would get the above error. If I switched the directory to the D drive in Git Bash and ran the "spring" command again, it would work.

Unable to change path for Maven on Mac

I was trying to change M2_HOME in .bash_profile to configure a new version of Maven. Earlier it was set to 2.2.1; now I'm trying to change the path to 3.3.3. This is my .bash_profile:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home
export M2_HOME=/usr/local/apache-maven-3.3.3
#export M2_HOME=/usr/local/apache-maven-2.2.1
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$M2_HOME/bin
export CATALINA_HOME=/Library/Tomcat/apache-tomcat-7.0.68
When I run source ~/.bash_profile and then mvn -version, I get the following error:
$ mvn -version
Error: Could not find or load main class org.codehaus.classworlds.Launcher
Any suggestions to solve this please?
PS: I'm on OS X El Capitan
A simpler alternative is to set up some bash aliases. I have added the following to my ~/.bash_profile for switching between maven versions and Java versions:
export BASE_PATH=$PATH
#alias java5="export JAVA_HOME=`/usr/libexec/java_home -v1.5 -a x86_64 -d64`"
alias java6="export JAVA_HOME=`/usr/libexec/java_home -v1.6`"
alias java7="export JAVA_HOME=`/usr/libexec/java_home -v1.7`"
alias java8="export JAVA_HOME=`/usr/libexec/java_home -v1.8`"
# maven versions
alias m30="PATH=~/tools/apache-maven-3.0.5/bin:$BASE_PATH"
alias m31="PATH=~/tools/apache-maven-3.1.1/bin:$BASE_PATH"
alias m32="PATH=~/tools/apache-maven-3.2.5/bin:$BASE_PATH"
alias m33="PATH=~/tools/apache-maven-3.3.9/bin:$BASE_PATH"
Note the use of /usr/libexec/java_home for setting up JAVA_HOME, which is similar to Linux's alternatives mechanism for switching Java versions.
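For reference, running the lookup directly prints the JDK home that the alias will export; with the JDK from the question installed it would look roughly like this:
$ /usr/libexec/java_home -v 1.8
/Library/Java/JavaVirtualMachines/jdk1.8.0_25.jdk/Contents/Home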
So, in a new terminal session the following:
[steve@steves-mbp ~]$ java8
[steve@steves-mbp ~]$ m33
[steve@steves-mbp ~]$
sets me up to use maven 3.3 and Java 8.
Please also take into account the comment by @khmarbaise regarding M2_HOME, and forget that this environment variable exists.
Adding a new symlink for mvn3 worked for me:
ln -s /usr/local/apache-maven-3.3.3/bin/mvn /usr/local/User/bin/mvn3
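Assuming /usr/local/User/bin is on your PATH, the new command should then report the 3.3.3 install (output trimmed to the relevant line):
$ mvn3 -version
Apache Maven 3.3.3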
It is common, particularly in an environment where a product portfolio is quite large, to have to support multiple Maven versions. Similar to having to support multiple Java versions, you can create a script that will track and modify your environment accordingly. I use a Mac, so the notion of a jEnv type of mechanism is what I use for Java. On Windows, a similar concept can be used, although it would take some coding to properly adjust the path settings.
Here's a /usr/local/bin/mvnEnv bash script that I use to quickly change my Maven runtime. It's not nearly as comprehensive as jEnv, but it works for me so perhaps it can work for you. Adjust the various parameters to conform to your various Maven installs and update your PATH appropriately, if on Windows. (I know you're using a Mac, so the Windows comment is for others that may have this issue on Windows.)
Just update your ~/.bash_profile to call this script with the appropriate parameters if you need a default. Then, when you need a different version of Maven, you can just execute the script like
mvnEnv v33
And voila, you've just quickly changed your Maven version! If you don't know what versions of Maven are supported, simply execute the mvnEnv command and a list of valid versions will be printed. You will, however, have to add any new versions of Maven to the script for the new version to be available.
#!/bin/bash
echo "Setting the maven implementation version"
v22=/usr/local/Cellar/maven2/2.2.1/libexec/bin/mvn
v30=/usr/local/Cellar/maven30/3.0.5/libexec/bin/mvn
v31=/usr/local/Cellar/maven31/3.1.1/libexec/bin/mvn
v32=/usr/local/Cellar/maven32/3.2.5/libexec/bin/mvn
v33=/usr/local/Cellar/maven/3.3.9/libexec/bin/mvn

if [ -e /usr/local/bin/mvn ]
then
    echo "Remove the maven soft link."
    sudo rm /usr/local/bin/mvn
else
    echo "Maven soft link could not be found."
fi

maven=$v22
if [ $# == 0 ] || [ -z "${!1// }" ]
then
    echo "No Arguments supplied, using default $maven"
    echo "Available versions:"
    echo "    v22 = 2.2.1"
    echo "    v30 = 3.0.5"
    echo "    v31 = 3.1.1"
    echo "    v32 = 3.2.5"
    echo "    v33 = 3.3.9"
elif [ -e ${!1} ]
then
    echo "Setting maven to use ${!1} via $1"
    maven=${!1}
else
    echo "Using the default maven setting, provided argument [$1] is not recognized."
fi

echo "Creating new soft link to $maven"
sudo ln -s $maven /usr/local/bin/mvn
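A sample session might look like this, assuming an existing /usr/local/bin/mvn link and a Maven 3.3.9 install at the path defined above (messages taken from the script's own echo statements; the sudo password prompt is omitted):
$ mvnEnv v33
Setting the maven implementation version
Remove the maven soft link.
Setting maven to use /usr/local/Cellar/maven/3.3.9/libexec/bin/mvn via v33
Creating new soft link to /usr/local/Cellar/maven/3.3.9/libexec/bin/mvn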

Unable to find Namenode class when setting up Hadoop on Windows 8

I'm trying to set up Hadoop 2.4.1 on my machine using Cygwin, and I'm stuck when I try to run
$ hdfs namenode -format
which gives me
Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
I think it's due to an undefined environment variable, since I can run
$ hadoop version
without a problem. I've defined the following:
JAVA_HOME
HADOOP_HOME
HADOOP_INSTALL
as well as adding the Hadoop \bin and \sbin (and Cygwin's \bin) to the Path. Am I missing an environment variable that I need to define?
OK, it looks like the hadoop\bin\hdfs file also has to be changed, like the hadoop\bin\hadoop file described here.
The end of the file must be changed from:
exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$#"
to
exec "$JAVA" -classpath "$(cygpath -pw "$CLASSPATH")" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$#"
I assume I'll have to make similar changes to the hadoop\bin\mapred and hadoop\bin\yarn when I get to using those files.

How to adapt bin/hdfs for executing from outside $HADOOP_HOME/bin?

I'm trying to modify the hdfs script so that it still functions even though it is no longer located in $HADOOP_HOME/bin, but when I execute the modified hdfs I get:
hdfs: line 110: exec: org.apache.hadoop.fs.FsShell: not found
line 110 is:
exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$#"
These are the changes I made to the script (the original line is shown in each comment):
bin="$HADOOP_HOME"/bin      # was: bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
./hdfs-config.sh            # was: . "$bin"/hdfs-config.sh
$ hadoop version
Hadoop 0.20.3-SNAPSHOT
Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append -r 1041718
Compiled by hammer on Mon Dec 6 17:38:16 CET 2010
Why don't you simply put a second copy of Hadoop on the system and let it have a different value for HADOOP_HOME?
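A minimal sketch of that approach (the paths are purely illustrative):
# Illustrative: keep a second, modifiable copy of Hadoop and point only the
# current shell at it, leaving the original installation untouched.
export HADOOP_HOME=/opt/hadoop-custom
export PATH="$HADOOP_HOME/bin:$PATH"
hadoop version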
