Error in Pig: Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop', and try again

I am trying to start Pig 0.12.0 on macOS after installing Pig from the Apache website.
Before starting the Pig shell, I created a pig-env.sh file in the conf directory and copied these four lines into it:
Export JAVA_HOME=/usr
Export PIG_HOME=/Users/Hadoop_Cluster/pig-0.12.0
Export HADOOP_HOME=Users/Hadoop_Cluster/hadoop-1.2.1
Export PIG_CLASSPATH=$HADOOP_HOME/conf/
I also added the text below to the pig.properties file:
Fs.default.name=hdfs://localhost:9000
Mapred.job.tracker=localhost:9001
I copied the core-site.xml, hdfs-site.xml and mapred-site.xml files from
$HADOOP_HOME/conf to $PIG_HOME/conf.
I get the error below when starting Pig from the command line in Pig's bin directory:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop', and try again

Check whether the file /Users/Hadoop_Cluster/pig-0.12.0/pig-0.12.0-withouthadoop.jar exists. If it is not there, copy pig-0.12.0-withouthadoop.jar (renamed or not, it shouldn't matter) into your $PIG_HOME so that in the end it does.
Also be careful about lower-case/upper-case letters: the export lines and property names above are case-sensitive (export, not Export; fs.default.name, not Fs.default.name). Otherwise it should be fine.
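A minimal way to check and fix this from a shell, assuming the paths from the question (the source path of the cp is a placeholder for wherever your download landed):
ls $PIG_HOME/pig*withouthadoop*.jar
cp /path/to/pig-0.12.0-withouthadoop.jar $PIG_HOME/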

Finally it works.
All I did was rename the file in the conf directory to pig-withouthadoop.jar instead of pig-0.12.0-withouthadoop.jar, and make sure Hadoop was not in safe mode.
I kept the same settings as below in the pig-env.sh file, and all three Hadoop config files (core-site.xml, hdfs-site.xml, mapred-site.xml) are copied to PIG_HOME/conf:
export JAVA_HOME=/usr
export PIG_HOME=/Users/Hadoop_Cluster/pig-0.12.0
export HADOOP_HOME=/Users/Hadoop_Cluster/hadoop-1.2.1
export PIG_CLASSPATH=$HADOOP_HOME/conf/
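To cover the safe-mode part, on Hadoop 1.x these standard commands check and clear it (run as the user that administers HDFS):
hadoop dfsadmin -safemode get
hadoop dfsadmin -safemode leave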

I too got the same error. I solved it by removing /bin from the PIG_HOME path in .bashrc, then sourcing .bashrc and starting Pig:
export PIG_HOME=/home/hadoop/pig-0.13.0/bin ==> wrong
export PIG_HOME=/home/hadoop/pig-0.13.0 ==> correct..
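For completeness, the reload-and-verify step would look like this (pig -x local simply starts the Grunt shell in local mode to confirm the fix):
source ~/.bashrc
pig -x local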

You need to follow what the error message says:
Cannot locate pig-withouthadoop.jar. do 'ant jar-withouthadoop'
That is, run the command ant jar-withouthadoop to build pig-withouthadoop.jar.
If ant is not installed, Ubuntu users can try apt-get install ant.
The ant jar-withouthadoop build takes roughly 15-20 minutes, so one needs to be patient to get this sorted.
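Putting those steps together (a sketch, assuming $PIG_HOME points at the extracted source tarball containing build.xml):
sudo apt-get install ant
cd $PIG_HOME
ant jar-withouthadoop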
I scratched my head all day and kept looking for solutions on Google; none helped.
Extracting the Pig tar does not create a jar in the home directory; the above is needed to create the jar file and run Pig successfully.
I don't know exactly why this is done, but this is the solution that worked for me with Hadoop 1.2 (out of safe mode) and Pig 0.12.1.

The key is to find pig-withouthadoop.jar in your $PIG_HOME.
Use
find / -name '*withouthadoop*'
to locate it. It may be under a versioned name rather than plain pig-withouthadoop.jar; if so, rename it and cp it to $PIG_HOME. Worked for me.

Related

Not a valid jar when I was running an example of Hadoop

I have been learning Hadoop recently, using a sandbox on VirtualBox. I downloaded a Python script built on the mrjob framework and ran the following command:
python RatingsBreakdown.py -r hadoop --hadoop-streaming-jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming-jar u.data
and then got this:
Running step 1 of 1...
Not a valid JAR: /usr/hdp/2.6.3.0-235/hadoop-mapreduce/hadoop-streaming-jar
This is the jar on my computer:
lib/hadoop-mapreduce/hadoop-streaming.jar
A valid jar ends with .jar, and your command has a mistake: it passes hadoop-streaming-jar where the file is actually hadoop-streaming.jar.
You can open the folder (cd foldername) to check the file name, or use tab completion to fill in the file name; that way you reduce mistakes.
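The corrected command then differs only in the file name (same path as in the question):
python RatingsBreakdown.py -r hadoop --hadoop-streaming-jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar u.data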

Could not format the Namenode in hadoop 2.6?

I have installed Hadoop 2.6 on Ubuntu 14.04, just following this blog.
While trying to format the namenode, I am hitting the error below:
hduser@data1:~$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
/usr/local/hadoop/bin/hdfs: line 276: /home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java: No such file or directory
/home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java: No such file or directory
This error occurs because the JAVA_HOME you have provided does not contain java.
Just add this line in hadoop-env.sh and /home/hduser/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
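Applied concretely, that might look like this (the etc/hadoop location of hadoop-env.sh is an assumption based on the /usr/local/hadoop prefix visible in the question's error):
echo 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64' >> /usr/local/hadoop/etc/hadoop/hadoop-env.sh
echo 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64' >> /home/hduser/.bashrc
source /home/hduser/.bashrc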
I think you have already set $JAVA_HOME but did it wrong (just a guess):
/home/hduser/usr/lib/jvm/java-7-openjdk-amd64/bin/java
It should be:
/usr/lib/jvm/java-7-openjdk-amd64/bin/java
You probably added ~ before the path when you exported JAVA_HOME, and that prepended your home directory /home/hduser.
To check this, type java -version and see if java is working, then type echo $JAVA_HOME and verify the path manually.
I figured it out. The entry we made was for amd64, but these are really i386 machines. Verify the path and that should fix the issue.
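Two standard commands help confirm which JVM directory actually exists and which java binary the shell resolves (nothing here is specific to this setup):
ls /usr/lib/jvm/
readlink -f $(which java)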

pig local mode spill data issue

I am trying to solve this issue but am unable to understand it. The Pig script ran successfully on a 1.8 GB data file on my development machine.
When I try to run it on the server, it states that it cannot find a local device to spill data to (spill0.out).
I have modified the pig.temp.dir property in the pig.properties file to point to a location with free space.
error:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill0.out
So how do I find out where Pig is spilling the data, and can the spill directory location be changed somehow?
I am using Pig in local mode.
Any ideas, suggestions, or workarounds would be of great help.
Thanks.
I found an answer.
We need to set the following properties in the $PIG_HOME/conf/pig.properties file (example values are sketched after this answer):
mapreduce.jobtracker.staging.root.dir
mapred.local.dir
pig.temp.dir
and then test.
This has helped me solve the problem.
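A hedged example of what those entries could look like; the directories are placeholders, so point them at volumes with enough free space:
mapreduce.jobtracker.staging.root.dir=/data/tmp/staging
mapred.local.dir=/data/tmp/mapred
pig.temp.dir=/data/tmp/pig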
This is not a problem with Pig.
I'm not using Pig and I get exactly the same error.
The problem seems to be more related to Hadoop; I also use it in local mode, with Hadoop 2.6.0.
I had no luck with these answers; Pig (version 0.15.0) was still writing pigbag* files to the /tmp dir, so I just renamed my /tmp dir and created a symbolic link to the desired location like this:
sudo -s                                # change to root
cd /
mv tmp tmp_local                       # set the old tmp aside
ln -s /desired/new/tmp/location tmp    # point tmp at the new location
chmod 1777 tmp                         # restore tmp permissions (world-writable, sticky bit)
mv tmp_local/* tmp                     # carry existing contents over
Make sure there are no active applications writing to the tmp folder at the time of running these commands.

hbase installation on single node

I have installed single-node Hadoop on Ubuntu 12.04. Now I am trying to install HBase on top of it (version 0.94.18), but I get the following errors (even though I extracted it to /usr/local/hbase):
Error: Could not find or load main class org.apache.hadoop.hbase.util.HBaseConfTool
Error: Could not find or load main class org.apache.hadoop.hbase.zookeeper.ZKServerTool
starting master, logging to /usr/lib/hbase/hbase-0.94.8/logs/hbase-hduser-master-ubuntu.out
nice: /usr/lib/hbase/hbase-0.94.8/bin/hbase: No such file or directory
cat: /usr/lib/hbase/hbase-0.94.8/conf/regionservers: No such file or directory
To resolve this error:
Download the binary version of HBase.
Edit the conf files hbase-env.sh and hbase-site.xml.
Set up the HBase home directory.
Start HBase with start-hbase.sh.
Explanation of the above error:
'Could not find or load main class' means your downloaded version does not have the required jar.
Hi, can you tell when this error occurs?
I think you set the environment variable wrong.
You should enter the command below:
export HBASE_HOME="/usr/lib/hbase/hbase-0.94.18"
Then try hbase; it will work.
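Spelled out slightly (the PATH line is my addition so the hbase scripts resolve from any directory; adjust the version directory to whatever you actually extracted):
export HBASE_HOME=/usr/lib/hbase/hbase-0.94.18
export PATH=$PATH:$HBASE_HOME/bin
start-hbase.sh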
If you want a shell script, you can download it from this link: https://github.com/tonyreddy/Apache-Hadoop1.2.1-SingleNode-installation-shellscript
It has hadoop, hive, hbase and pig.
Thanks,
Tony.
It is not recommended to run HBase from the source distribution directly; instead, download the binary distribution as mentioned on the official site, follow the same instructions, and you will get it up and running.
You could try installing version 0.94.27.
Download it from: HBase 0.94.27 download
This one worked for me.
Follow the instructions specified in:
HBase installation guide
sed "s/<\/configuration>/<property>\n<name>hbase.rootdir<\/name>\n<value>hdfs:\/\/'$c':54310\/hbase<\/value>\n<\/property>\n<property>\n<name>hbase.cluster.distributed<\/name>\n<value>true<\/value>\n<\/property>\n<property>\n<name>hbase.zookeeper.property.clientPort<\/name>\n<value>2181<\/value>\n<\/property>\n<property>\n<name>hbase.zookeeper.quorum<\/name>\n<value>'$c'<\/value>\n<\/property>\n<\/configuration>/g" -i.bak hbase/conf/hbase-site.xml
sed 's/localhost/'$c'/g' hbase/conf/regionservers -i
sed 's/#\ export\ HBASE_MANAGES_ZK=true/export\ HBASE_MANAGES_ZK=true/g' hbase/conf/hbase-env.sh -i
Just run these three commands, replacing $c with your hostname.
Then try again; it will work.

Cannot find hadoop installation: $HADOOP_HOME must be set or hadoop must be in the path

So a little background: I've been trying to set up Hive on a CentOS 6 machine. I followed the instructions in this YouTube video: http://www.youtube.com/watch?v=L2lSrHsRpOI
In my case I'm using Hadoop 1.1.2 and Hive 0.9.0, and every directory labeled "mnt" in the video I replaced with "opt", because that's where all of my Hadoop and Hive packages were unpacked.
As I reached the portion of the video where I was actually supposed to run Hive via ./hive, this error popped up:
"Cannot find hadoop installation: $HADOOP_HOME must be set or hadoop must be in the path"
One of the questions I have is: in which directory do I have to edit the .profile file? I don't understand why we would have to go to the home directory for this change. In case it helps, this is what I put in the .profile file in my /home/hadoop directory:
export HADOOP_HOME=/opt/hadoop/hadoop
export HIVE_HOME=/opt/hadoop/hive
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin
Thank you so much!
Go to the /etc/profile.d directory and create a hadoop.sh file there containing:
export HADOOP_HOME=/opt/hadoop/hadoop
export HIVE_HOME=/opt/hadoop/hive
export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin
After you save the file, make sure to
chmod +x /etc/profile.d/hadoop.sh
source /etc/profile.d/hadoop.sh
This should take care of it.
