Error with Pig on Local machine - hadoop

I am a newbie, so pardon me if the question seems silly. I have installed Hadoop 1.2.1 and the basic wordcount example works fine on my local machine, so as the next step I installed Pig 0.13.0.
When I ran pig -help it seemed to work fine. But when I run pig version I get an IOException as below:
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-06 01:00:08,321 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:29:34
2014-08-06 01:00:08,322 [main] INFO org.apache.pig.Main - Logging error messages to: /home/<user>/pig/log/pig_1407301208318.log
2014-08-06 01:00:09,856 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist.
Details at logfile: /home/<user>/pig/log/pig_1407301208318.log
The content of the log file is as below:
java.io.FileNotFoundException: File version does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:402)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:255)
at org.apache.pig.impl.io.FileLocalizer.fetchFilesInternal(FileLocalizer.java:778)
at org.apache.pig.impl.io.FileLocalizer.fetchFile(FileLocalizer.java:722)
at org.apache.pig.Main.run(Main.java:550)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
It would be great if someone could help me understand what needs to be done to fix this.

I think nothing is wrong; Pig itself will run fine. The problem is that you used this command:
pig version
But you should use
pig -version
I think you will get the same error if you write pig help.
Have a nice day

Try this
pig -version
If you type pig version without the dash, Pig searches for a script file named 'version'.
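The difference between the two invocations can be sketched in shell. The helper below, pig_arg_kind, is hypothetical and only mimics the behaviour described in these answers (a leading dash marks an option, anything else is taken as a script file name). It is not Pig's actual argument parser.

```shell
#!/bin/sh
# Hypothetical helper: classify one command-line argument the way the
# answers above describe Pig treating it. Illustration only.
pig_arg_kind() {
  case "$1" in
    -*) echo "option" ;;   # e.g. -version, -help, -x
    *)  echo "script" ;;   # e.g. version, which is then looked up as a file
  esac
}

# Usage:
#   pig_arg_kind -version   # prints: option
#   pig_arg_kind version    # prints: script
```

This matches the stack trace in the question, where FileLocalizer.fetchFile fails with "File version does not exist".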

Pig script does not exist error, even though I can see it in HDFS

I am trying to run a Pig script with the -useHCatalog and -f options, but it fails.
It says the script does not exist, although I can see the file in the HDFS file system. See below.
[hdfs@ip-xx-xx-xx-x-xx ec2-user]$ pig -useHCatalog -f /user/admin/pig/scripts/hcat1.pig
WARNING: Use "yarn jar" to launch YARN applications.
16/04/01 13:44:13 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/04/01 13:44:13 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
16/04/01 13:44:13 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2016-04-01 13:44:13,645 [main] INFO org.apache.pig.Main - Apache Pig version 0.15.0.2.3.4.0-3485 (rexported) compiled Dec 16 2015, 04:30:33
2016-04-01 13:44:13,645 [main] INFO org.apache.pig.Main - Logging error messages to: /tmp/hsperfdata_hdfs/pig_1459532653643.log
2016-04-01 13:44:14,184 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File /user/admin/pig/scripts/hcat1.pig does not exist
Details at logfile: /tmp/hsperfdata_hdfs/pig_1459532653643.log
2016-04-01 13:44:14,203 [main] INFO org.apache.pig.Main - Pig script completed in 753 milliseconds (753 ms)
[hdfs@ip-xxx-xx-xx-xx ec2-user]$ hadoop fs -cat /user/admin/pig/scripts/hcat1.pig
a = load 'trucks' using org.apache.hive.hcatalog.pig.HCatLoader();
b = filter a by truckid == 'A1';
store b INTO '/user/admin/pig/scritps/outputb1';
You need to specify the complete HDFS URI to run scripts that are stored in HDFS; a bare path like the one above appears to be looked up on the local file system.
Here is what you need:
$ pig -useHCatalog hdfs://namenode_hostname:port/user/admin/pig/scripts/hcat1.pig
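As a small sketch of the answer above, the fully qualified URI can be assembled from its parts. The helper name hdfs_uri is hypothetical, and the host and port used in the usage comment are placeholders; the real values would come from fs.defaultFS (fs.default.name on older Hadoop) in core-site.xml.

```shell
#!/bin/sh
# Hypothetical helper: build a fully qualified HDFS URI from a NameNode
# host, a port, and an absolute HDFS path.
hdfs_uri() {
  printf 'hdfs://%s:%s%s\n' "$1" "$2" "$3"
}

# Usage (namenode_hostname and 8020 are assumed values, not taken from
# the asker's cluster):
#   pig -useHCatalog "$(hdfs_uri namenode_hostname 8020 /user/admin/pig/scripts/hcat1.pig)"
```

Another option that avoids URIs altogether is to copy the script to the local disk first with hadoop fs -get and pass the local path to -f.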

PIG setup throwing error

I was trying to install Pig v0.13.0 on my Fedora 20 system. After extracting the tar.gz contents, I set JAVA_HOME and added PIG/bin to the PATH. Then I typed the command pig in the console and this is what I got. I am unable to understand what went wrong:
[root@localhost /]# pig
14/12/21 00:05:15 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/12/21 00:05:15 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/12/21 00:05:15 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-12-21 00:05:16,082 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-12-21 00:05:16,083 [main] INFO org.apache.pig.Main - Logging error messages to: //pig_1419100516081.log
2014-12-21 00:05:16,130 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /root/.pigbootup not found
2014-12-21 00:05:16,765 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-12-21 00:05:16,771 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-12-21 00:05:16,771 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
2014-12-21 00:05:16,780 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
2014-12-21 00:05:19,130 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2014-12-21 00:05:19,130 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:8021
2014-12-21 00:05:19,136 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> ls
2014-12-21 00:05:33,697 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Call From localhost.localdomain/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Details at logfile: //pig_1419100516081.log
Please let me know why the ls command in the grunt shell threw this error.
When you type pig in the console, it starts in MAPREDUCE mode by default, which requires access to a Hadoop cluster and an HDFS installation.
It looks like your Hadoop cluster is not configured properly, which is why you are getting the connection refused error. Please follow this link to solve the connection-refused problem: http://wiki.apache.org/hadoop/ConnectionRefused
As a workaround, use LOCAL mode, which doesn't need a Hadoop installation.
In the console, type pig -x local to bring up the grunt shell, then type the ls command.
Local mode
$ pig -x local
Mapreduce mode
$ pig
(or) //try to connect HDFS
$ pig -x mapreduce
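The choice between the two modes can be sketched as a pre-flight check: probe the NameNode port first and fall back to local mode if nothing answers. This is only an illustration under assumptions: the function names are hypothetical, and the probe relies on bash's /dev/tcp feature, so it needs bash rather than a plain POSIX sh.

```shell
#!/bin/bash
# Sketch: pick a Pig execution mode depending on whether the NameNode
# port accepts TCP connections. Function names are hypothetical.

# Succeeds if a TCP connection to host $1, port $2 can be opened.
namenode_reachable() {
  # Open the connection in a subshell so the descriptor closes on exit.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Prints "mapreduce" if the NameNode answers, otherwise "local".
pig_mode_for() {
  if namenode_reachable "$1" "$2"; then
    echo "mapreduce"
  else
    echo "local"
  fi
}

# Usage (localhost:8020 is the address shown in the log above):
#   pig -x "$(pig_mode_for localhost 8020)"
```

A successful probe only shows that something is listening on the port; it does not prove that HDFS is healthy.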
OK, I got this working. When I connect in Pig's mapreduce mode, the ls command has to be written as ls hdfs:/. Changing the above command from ls to ls hdfs:/ resolved my problem. In local mode, the plain ls command works fine.

Integrating Pig with HBase

I have installed hadoop-2.5.0, Pig 0.13.0 and HBase 0.98.6.1 on Linux. When I try to run a simple Pig script, the following error occurs:
2014-10-14 16:01:54,891 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Details at logfile: /home/labuser/pig_1413279561970.log
Pasted the log below...
Pig Stack Trace
ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:281)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:344)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:382)
at org.apache.hadoop.hbase.TableName.<clinit>(TableName.java:82)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
It seems that HBase 0.98.6.1 is not supported by Pig 0.13.0.
So how can I make it work? Or which version of HBase does Pig 0.13.0 support?
The root cause for this has been identified as https://issues.apache.org/jira/browse/HBASE-6658, which says the class "org.apache.hadoop.hbase.filter.WritableByteArrayComparable" was renamed.
You may need to recompile Pig with the build profile that matches the HBase version you're using.
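A NoSuchMethodError like the one in the trace usually means the HBase classes Pig was compiled against differ from the ones found at run time. Before recompiling, it may help to check which HBase jar sits in Pig's lib directory (another report in this collection lists hbase-0.94.1.jar under pig-0.13.0/lib) and compare it with the installed HBase. The helper name and the PIG_HOME variable below are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the version from an HBase jar file name
# such as hbase-0.94.1.jar. Jar names that follow another pattern are
# not handled.
hbase_jar_version() {
  basename "$1" .jar | sed 's/^hbase-//'
}

# Usage (PIG_HOME is an assumed variable pointing at the Pig install):
#   for jar in "$PIG_HOME"/lib/hbase-*.jar; do
#     hbase_jar_version "$jar"
#   done
```

If the printed version is much older than the installed HBase, a mismatch of the kind described in the answer is a plausible cause.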

Not able to run Pig on Windows 7 using Cygwin

I configured Pig as directed in the documentation.
Environment: Windows 7, Hadoop-0.20.2, Pig 0.13.0, Cygwin
But when I type pig (mapreduce mode) at the command prompt, it just displays the output below. I am not sure whether Pig has started or not; I don't see the grunt shell to execute a script.
By the way, Hadoop is running on the same node.
Can someone please help?
$ pig
Find hadoop at /hadoop-0.20.2/bin/hadoop
dry run:
HADOOP_CLASSPATH: C:\cygwin64\pig-0.13.0\conf;C;C:\Program Files\Java\jdk1.6.0_25\lib\tools.jar;C;C:\cygwin64\hadoop-0.20.2\conf;C:\cygwin64\pig-0.13.0\lib\accumulo-core-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-fate-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-server-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-start-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\accumulo-trace-1.5.0.jar;C:\cygwin64\pig-0.13.0\lib\avro-1.7.5.jar;C:\cygwin64\pig-0.13.0\lib\avro-mapred-1.7.5.jar;C:\cygwin64\pig-0.13.0\lib\avro-tools-1.7.5-nodeps.jar;C:\cygwin64\pig-0.13.0\lib\groovy-all-1.8.6.jar;C:\cygwin64\pig-0.13.0\lib\hbase-0.94.1.jar;C:\cygwin64\pig-0.13.0\lib\jruby-complete-1.6.7.jar;C:\cygwin64\pig-0.13.0\lib\js-1.7R2.jar;C:\cygwin64\pig-0.13.0\lib\json-simple-1.1.jar;C:\cygwin64\pig-0.13.0\lib\jython-standalone-2.5.3.jar;C:\cygwin64\pig-0.13.0\lib\piggybank.jar;C:\cygwin64\pig-0.13.0\lib\protobuf-java-2.4.0a.jar;C:\cygwin64\pig-0.13.0\lib\zookeeper-3.4.5.jar:C:\cygwin64\PIG-01~1.0/pig-withouthadoop-h2.jar:
HADOOP_OPTS: -Xmx1000m -Dpig.log.dir=C:\cygwin64\PIG-01~1.0\logs -Dpig.log.file=pig.log -Dpig.home.dir=C:\cygwin64\PIG-01~1.0\
HADOOP_CLIENT_OPTS: -Xmx1000m -Dpig.log.dir=C:\cygwin64\PIG-01~1.0\logs -Dpig.log.file=pig.log -Dpig.home.dir=C:\cygwin64\PIG-01~1.0\
/hadoop-0.20.2/bin/hadoop jar C:\cygwin64\PIG-01~1.0/pig-withouthadoop-h2.jar
When I run in debug mode, I see this exception. This is because the Hadoop jar is not set.
localhsot@mymachine ~
$ echo $PIG_INSTALL
C:\cygwin64\pig-0.13.0
localhsot@mymachine ~
$ export PIG_INSTALL=/cygdrive/c/cygwin64/pig-0.13.0
localhsot@mymachine ~
$ export HADOOP_INSTALL=/cygdrive/c/cygwin64/hadoop-0.20.2/
localhsot@mymachine ~
$ export PATH=$PATH:$PIG_INSTALL/bin:$HADOOP_INSTALL/bin
$ pig
14/08/26 14:05:12 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/26 14:05:12 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/26 14:05:12 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-26 14:05:12,998 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-08-26 14:05:12,998 [main] INFO org.apache.pig.Main - Logging error messages to: C:\cygwin64\home\chparekh\pig_1409076312996.log
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/task/JobContextImpl
at org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java:68)
at org.apache.pig.Main.run(Main.java:643)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.task.JobContextImpl
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 8 more
You can refer to the link below; I hope it helps.
http://abhijitsureshshingate.wordpress.com/2013/07/08/code-debug-test-apache-pig-scripts-using-eclipse-on-windows/
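The question's switch from a Windows-style PIG_INSTALL to a /cygdrive path can be sketched as a small conversion. Cygwin ships the cygpath tool for this (cygpath -u); the function below is a hypothetical stand-in that only handles simple drive-letter paths, shown so the mapping is visible.

```shell
#!/bin/sh
# Hypothetical helper: turn a simple Windows path such as
# C:\cygwin64\pig-0.13.0 into its /cygdrive form. On a real Cygwin
# install, cygpath -u is the more robust choice.
win_to_cygdrive() {
  # First character is the drive letter; lower-case it.
  drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
  # Drop the "C:" prefix and turn backslashes into forward slashes.
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/cygdrive/%s%s\n' "$drive" "$rest"
}

# Usage:
#   export PIG_INSTALL="$(win_to_cygdrive 'C:\cygwin64\pig-0.13.0')"
```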

Pig 0.13 ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl

Just installed Pig 0.13 and I am attempting to use it with Hadoop 1.1.2. (Pig documentation states Pig 0.13 is compatible with Hadoop 1.1.2). Per the Pig install instructions, I set $PIG_CLASSPATH to point at /etc/hadoop where core-site.xml, hdfs-site.xml, and mapred-site.xml are defined. Hadoop cluster is functional and works fine with non-Pig jobs. Based on the error descriptions below, I understand that Pig cannot find the JobContextImpl class it is looking for.
Based on the Hadoop 1.1.2 API documentation, I don't believe "task" is a sub-package of the "mapreduce" package. I have tried adding hadoop-core-1.1.2.jar directly to $PIG_CLASSPATH and that did not work. (After looking at the contents of hadoop-core-1.1.2.jar, and the Hadoop 1.1.2 API documentation, I don't believe JobContextImpl is defined in the package Pig is attempting to load it from). How do I get Pig 0.13 to work with Hadoop 1.1.2?
=======Error follows as below=======
14/08/03 14:01:05 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/03 14:01:05 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/03 14:01:05 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-03 14:01:05,959 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-08-03 14:01:05,959 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hadoop/pig-0.13.0/bin/pig_1407088865958.log
2014-08-03 14:01:06,112 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master.localdomain:8020/
2014-08-03 14:01:06,388 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master.localdomain:8021
2014-08-03 14:01:06,440 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl
Details at logfile: /home/hadoop/pig-0.13.0/bin/pig_1407088865958.log
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.pig.tools.pigstats.PigStatsUtil
at org.apache.pig.Main.run(Main.java:643)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
===Contents of pig_1407088865958.log ===
Pig Stack Trace
ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/task/JobContextImpl
at org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java:68)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:79)
at org.apache.pig.Main.run(Main.java:510)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.task.JobContextImpl
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 9 more
Though it is unclear how well this works for everyone, it appears that the asker mentioned how he solved the problem:
In my searching for help I saw posts stating that it needs to be
recompiled with a parameter that indicates version. The parameter
values I saw were 23,24. I did not know how that parameter mapped to
the version of hadoop that I am using 1.1.2. I hacked the bin/pig
script to point to hadoop-core-1.1.2.jar. The script requires
HADOOP_HOME be set (which is deprecated).
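The asker's suspicion that hadoop-core-1.1.2.jar does not contain JobContextImpl can be checked directly by listing the jar. The class org.apache.hadoop.mapreduce.task.JobContextImpl belongs to the Hadoop 2.x line, so its absence would suggest that the Pig jar being launched was built against the Hadoop 2 API. The sketch below uses hypothetical helper names and assumes the unzip tool is installed.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether a jar contains a class.

# Convert a fully qualified class name to its entry path inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

# Succeeds if the jar ($1) lists the class ($2). Assumes unzip exists.
jar_has_class() {
  unzip -l "$1" 2>/dev/null | grep -qF "$(class_to_entry "$2")"
}

# Usage (the jar path is the one named in the question):
#   jar_has_class hadoop-core-1.1.2.jar \
#     org.apache.hadoop.mapreduce.task.JobContextImpl \
#     && echo "found" || echo "missing"
```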
