Integrating Pig with HBase - Hadoop

I have installed Hadoop 2.5.0, Pig 0.13.0 and HBase 0.98.6.1 on Linux. When I try to run a simple Pig script, the following error occurs:
2014-10-14 16:01:54,891 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Details at logfile: /home/labuser/pig_1413279561970.log
Pasted the log below...
Pig Stack Trace
ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:281)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:344)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:382)
at org.apache.hadoop.hbase.TableName.<clinit>(TableName.java:82)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
It seems that HBase 0.98.6.1 does not work with Pig 0.13.0.
So how do I make it work? Or which version of HBase does work with Pig 0.13.0?

The root cause has been identified as https://issues.apache.org/jira/browse/HBASE-6658, which says the class "org.apache.hadoop.hbase.filter.WritableByteArrayComparable" was renamed.
You may need to re-compile Pig against the HBase profile you're using.
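For reference, such a rebuild might look roughly like the following (a sketch only, assuming Pig's ant build from the 0.13.0 source tree; the hadoopversion/hbaseversion values are examples and should match your cluster):
# run from the Pig source directory; builds Pig against the Hadoop 2 / HBase 0.95+ APIs
ant clean jar -Dhadoopversion=23 -Dhbaseversion=95 -Dprotobuf-java.version=2.5.0
The resulting jar can then be used in place of the pre-built pig-0.13.0 jar.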

Related

Which version of Pig should be used for HBase 0.98.8?

I have Hadoop 2.5.1 installed.
Hive version is 0.13.1
Pig version is 0.13.0
HBase version is 0.98.8
If I want to load files from HDFS into HBase using Pig, will my Pig version work fine?
For now I am facing the following issue:
2014-12-24 16:11:24,783 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Make sure you build Pig with the following options:
-Dhbaseversion=95 -Dhadoopversion=23 -Dprotobuf-java.version=2.5.0

Pig 0.13 ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl

I just installed Pig 0.13 and am attempting to use it with Hadoop 1.1.2 (the Pig documentation states that Pig 0.13 is compatible with Hadoop 1.1.2). Per the Pig install instructions, I set $PIG_CLASSPATH
to point at /etc/hadoop, where core-site.xml, hdfs-site.xml, and mapred-site.xml are defined. The Hadoop cluster is functional and works fine with non-Pig jobs. Based on the error descriptions below, I understand that Pig cannot find the JobContextImpl class it is looking for.
Based on the Hadoop 1.1.2 API documentation, I don't believe "task" is a sub-package of the "mapreduce" package. I have tried adding hadoop-core-1.1.2.jar directly to $PIG_CLASSPATH
and that did not work. (After looking at the contents of hadoop-core-1.1.2.jar and the Hadoop 1.1.2 API documentation, I don't believe JobContextImpl is defined in the package Pig is attempting to load it from.) How do I get Pig 0.13 to work with Hadoop 1.1.2?
======= The error follows below =======
14/08/03 14:01:05 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/03 14:01:05 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/03 14:01:05 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-03 14:01:05,959 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:27:58
2014-08-03 14:01:05,959 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hadoop/pig-0.13.0/bin/pig_1407088865958.log
2014-08-03 14:01:06,112 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master.localdomain:8020/
2014-08-03 14:01:06,388 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: master.localdomain:8021
2014-08-03 14:01:06,440 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl
Details at logfile: /home/hadoop/pig-0.13.0/bin/pig_1407088865958.log
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.pig.tools.pigstats.PigStatsUtil
at org.apache.pig.Main.run(Main.java:643)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
===Contents of pig_1407088865958.log ===
Pig Stack Trace
ERROR 2998: Unhandled internal error. org/apache/hadoop/mapreduce/task/JobContextImpl
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/task/JobContextImpl
at org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java:68)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:79)
at org.apache.pig.Main.run(Main.java:510)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.task.JobContextImpl
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 9 more
Though it is unclear how well this works for everyone, it appears that the asker described how he solved the problem:
In my searching for help I saw posts stating that Pig needs to be recompiled with a parameter that indicates the Hadoop version. The parameter values I saw were 23 and 24, and I did not know how those mapped to the version of Hadoop I am using (1.1.2). I hacked the bin/pig script to point to hadoop-core-1.1.2.jar. The script requires HADOOP_HOME to be set (which is deprecated).
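For completeness, that workaround roughly amounts to something like this (a sketch only, not a verified fix; the paths are assumptions for a typical Hadoop 1.1.2 install):
# bin/pig insists on HADOOP_HOME being set
export HADOOP_HOME=/usr/local/hadoop-1.1.2
# put the Hadoop 1.x core jar and the Hadoop config directory on Pig's classpath
export PIG_CLASSPATH=$HADOOP_HOME/hadoop-core-1.1.2.jar:/etc/hadoop
# myscript.pig is a placeholder for your own script
pig myscript.pig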

Error with Pig on Local machine

I am a newbie, so pardon me if the question appears to be very silly. I have installed Hadoop 1.2.1 and the basic wordcount example works fine on my local machine, so as the next level of exploration I installed Pig 0.13.0.
When I tried running pig -help, it seemed to work fine. But when I run pig version, I get an IOException as below:
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-06 01:00:08,321 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:29:34
2014-08-06 01:00:08,322 [main] INFO org.apache.pig.Main - Logging error messages to: /home/<user>/pig/log/pig_1407301208318.log
2014-08-06 01:00:09,856 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist.
Details at logfile: /home/<user>/pig/log/pig_1407301208318.log
The content of the log file is as below:
java.io.FileNotFoundException: File version does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:402)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:255)
at org.apache.pig.impl.io.FileLocalizer.fetchFilesInternal(FileLocalizer.java:778)
at org.apache.pig.impl.io.FileLocalizer.fetchFile(FileLocalizer.java:722)
at org.apache.pig.Main.run(Main.java:550)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
It would be great if someone could help me understand what needs to be done to fix this.
I think nothing is wrong and Pig will run fine. The problem is that you used this command:
pig version
But you should use:
pig -version
I think you will get the same error if you write pig help.
Have a nice day.
Try this:
pig -version
If you give pig version alone, then Pig searches for a file named 'version'.
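In other words, without the leading dash the argument is treated as a script file name. A quick illustration (output abbreviated):
pig version      # Pig looks for a script called "version" -> ERROR 2997: Encountered IOException. File version does not exist.
pig -version     # prints the release, e.g. "Apache Pig version 0.13.0 (r1606446)"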

Reading a Hive table using a Pig script

I am trying to read a Hive table using a Pig script, but when I run the Pig code to read the table, it gives me the following error:
2014-02-12 15:48:36,143 [main] WARN org.apache.hadoop.hive.conf.HiveConf - hive-site.xml not found on CLASSPATH
2014-02-12 15:49:10,781 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backed error: Error: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
Hadoop version: 1.1.1
Hive version: 0.9.0
Pig version: 0.10.0
Pig code:
a = LOAD '/user/hive/warehouse/test' USING
org.apache.pig.piggybank.storage.HiveColumnarLoader('name string');
Is it due to some version mismatch?
Why can't you use HCatalog to access Hive metadata in Pig?
Check this for an example
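A minimal sketch of reading a Hive table through HCatalog (the table name 'test' is assumed, and the HCatalog jars must be on Pig's classpath, e.g. by starting Pig with -useHCatalog on newer releases or by REGISTERing the HCatalog jars):
-- load the Hive table via its metastore definition instead of reading the warehouse files directly
a = LOAD 'test' USING org.apache.hcatalog.pig.HCatLoader();
DUMP a;
On recent Hive releases the loader class is org.apache.hive.hcatalog.pig.HCatLoader instead.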

Can't store a Pig relation into HBase

Hi, I am trying to store a Pig relation into HBase:
store result INTO 'hbase://hourlyAggregation' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('countDetails:ansCount countDetails:divCount countDetails:unansCount countDetails:engCount');
This runs fine in local mode. When I try to run Pig in mapred mode, my job fails and the log shows no error:
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
Details at logfile: /home/HadoopUser/pig_1384412383791.log
Pig Stack Trace
---------------
ERROR 2244: Job failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:119)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:172)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:500)
at org.apache.pig.Main.main(Main.java:107)
================================================================================
My profile is as follows:
export JAVA_HOME=/home/hadoop/jdk1.6.0_39
export HADOOP_HOME=$MY_HOME/hadoop-0.20.2-cdh3u4
export CLASSPATH=$JAVA_HOME/lib/tools.jar:.
export HIVE_HOME=$MY_HOME/hive-0.7.1-cdh3u4
export PIG_HOME=$MY_HOME/pig-0.8.1-cdh3u4
export HBASE_HOME=$MY_HOME/hbase-0.90.6-cdh3u4
export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
Please help me with this.
I even tried registering the ZooKeeper and HBase jars in $PIG_HOME/lib.
JobTracker log:
14-Nov-2013 14:48:29 (17sec)
java.lang.RuntimeException: could not instantiate 'org.apache.pig.backend.hadoop.hbase.HBaseStorage' with arguments '[countDetails:ansCount countDetails:divCount countDetails:unansCount countDetails:engCount]'
at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:502)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getStoreFunc(POStore.java:218)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:85)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:68)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:278)
at org.apache.hadoop.mapred.Task.initialize(Task.java:511)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:306)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
Registering the HBase and ZooKeeper jars in $PIG_HOME/lib and the Guava jar in the HBase lib directory did work. Thanks Lorand Bendig for your support.
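For anyone hitting the same thing, the fix boils down to something like this (a sketch; jar names and versions are assumptions and should match your CDH3u4 installation):
# make the HBase and ZooKeeper classes visible to Pig and its map-reduce tasks
cp $HBASE_HOME/hbase-0.90.6-cdh3u4.jar $PIG_HOME/lib/
cp $HBASE_HOME/lib/zookeeper-*.jar $PIG_HOME/lib/
# and check that a Guava jar is present under the HBase lib directory
ls $HBASE_HOME/lib/guava-*.jar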
