Reading a Hive table using a Pig script - hadoop

I am trying to read a Hive table using a Pig script, but when I run the Pig code to read the table it gives me the following error:
2014-02-12 15:48:36,143 [main] WARN org.apache.hadoop.hive.conf.HiveConf
-hive-site.xml not found on CLASSPATH 2014-02-12 15:49:10,781 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate
exception from backed error: Error: Found class
org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
(Newlines and whitespace added for readability.)
Hadoop version: 1.1.1
Hive version: 0.9.0
Pig version: 0.10.0
Pig code:
a = LOAD '/user/hive/warehouse/test' USING
org.apache.pig.piggybank.storage.HiveColumnarLoader('name string');
Is this due to a version mismatch?

Why can't you use HCatalog to access the Hive metadata in Pig? Check this for an example.
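For reference, a minimal sketch of the HCatalog route with the versions above. The table name 'test' is assumed from the warehouse path in the question, and on these older releases the loader class is org.apache.hcatalog.pig.HCatLoader (newer Hive bundles it as org.apache.hive.hcatalog.pig.HCatLoader); the HCatalog jars and hive-site.xml must be on Pig's classpath (newer Pig releases accept pig -useHCatalog).
-- Sketch only: table name and loader package are assumptions, see note above
a = LOAD 'test' USING org.apache.hcatalog.pig.HCatLoader();
-- columns are referenced by their Hive names, so no schema string is needed
b = FOREACH a GENERATE name;
DUMP b;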

Related

Which version of Pig should I use for HBase 0.98.8?

I have Hadoop 2.5.1 installed.
Hive version is 0.13.1
Pig version is 0.13.0
HBase version is 0.98.8
If I want to load files from HDFS into HBase using Pig, will my Pig version work fine?
For now I am facing the following issue:
2014-12-24 16:11:24,783 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Make sure you build Pig with the following options:
-Dhbaseversion=95 -Dhadoopversion=23 -Dprotobuf-java.version=2.5.0
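For example, from a Pig 0.13.0 source checkout the rebuild might look like the sketch below; the source directory name is an assumption, and clean/jar are Pig's standard ant build targets.
# Rebuild Pig against Hadoop 2.x and HBase 0.95+ APIs (directory name assumed)
cd pig-0.13.0-src
ant clean jar -Dhadoopversion=23 -Dhbaseversion=95 -Dprotobuf-java.version=2.5.0
Afterwards, run your script with the freshly built Pig jar instead of the stock one.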

ERROR 2998: Unhandled internal error when running the code

When executing the following command:
pig -x local -f /Hbase/load_hbase.pig
I get the following error:
2014-11-08 23:36:47,455 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.1 (r1585011) compiled Apr 05 2014, 01:41:34
2014-11-08 23:36:47,455 [main] INFO org.apache.pig.Main - Logging error messages to: /home/eduardo/pig_1415497007452.log
2014-11-08 23:36:47,817 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/eduardo/.pigbootup not found
2014-11-08 23:36:47,918 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-11-08 23:36:48,436 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable
Here is the code that I run:
-- load the comma-separated hourly weather data
raw_data = LOAD '/data/QCLCD201211/201201hourly.txt' USING PigStorage(',');
weather_data = FOREACH raw_data GENERATE $1, $10;
-- RANK prepends a rank field ($0), which HBaseStorage uses as the row key
ranked_data = RANK weather_data;
final_data = FILTER ranked_data BY $0 IS NOT NULL;
STORE final_data INTO 'hbase://weather'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('info:date info:temp');
I wonder what I'm doing wrong. Here are the versions of Hadoop, HBase, and Pig:
Hadoop: hadoop-1.2.1
Hbase: hbase-0.96.2-hadoop1
Pig: pig-0.12.1
Copy the Pig jar and HBase jar into Hadoop's lib directory:
1) Copy these files to the Hadoop library:
sudo cp /usr/lib/pig/lib/pig-common-0.8.0-cdh3u0.jar /usr/lib/hadoop/lib/
sudo cp /usr/lib/pig/lib/hbase-0.96.2-cdh3u0.jar /usr/lib/hadoop/lib/
2) Stop HBase and Hadoop using the following commands:
/usr/lib/hadoop/bin/stop-all.sh
/usr/lib/hbase/bin/stop-hbase.sh
3) Restart Hadoop and HBase using:
/usr/lib/hadoop/bin/start-all.sh
/usr/lib/hbase/bin/start-hbase.sh
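As an alternative to copying jars around, HBase's own classpath can be exposed to Pig when it is launched (the same technique appears in the profile of a later question on this page); the install paths below are assumptions:
# Sketch: put HBase's jars on Pig's classpath instead of copying them (paths assumed)
export HBASE_HOME=/usr/lib/hbase
export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
pig -x local -f /Hbase/load_hbase.pig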

Integrating Pig with HBase

I have installed hadoop-2.5.0, pig 0.13.0, and HBase 0.98.6.1 on Linux. When I try to run a simple Pig script, the following error occurs:
2014-10-14 16:01:54,891 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Details at logfile: /home/labuser/pig_1413279561970.log
I've pasted the log below.
Pig Stack Trace
ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
at org.apache.hadoop.hbase.TableName.&lt;init&gt;(TableName.java:281)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:344)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:382)
at org.apache.hadoop.hbase.TableName.&lt;clinit&gt;(TableName.java:82)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
It seems that HBase 0.98.6.1 is not supported by Pig 0.13.0.
So how can I make it work? Or which version of HBase does support Pig 0.13.0?
The root cause has been identified as https://issues.apache.org/jira/browse/HBASE-6658, where it says the class "org.apache.hadoop.hbase.filter.WritableByteArrayComparable" was renamed.
You may need to re-compile Pig against the HBase profile you're using (for example, with the -Dhbaseversion build option shown in the answer above).

Can't store Pig relation into HBase

Hi, I am trying to store a Pig relation into HBase.
store result INTO 'hbase://hourlyAggregation' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('countDetails:ansCount countDetails:divCount countDetails:unansCount countDetails:engCount');
This runs fine in local mode. When I try to run Pig in mapred mode, my job fails and the log shows no error:
ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
Details at logfile: /home/HadoopUser/pig_1384412383791.log
Pig Stack Trace
---------------
ERROR 2244: Job failed, hadoop does not return any error message
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:119)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:172)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:500)
at org.apache.pig.Main.main(Main.java:107)
================================================================================
My profile is as follows:
export JAVA_HOME=/home/hadoop/jdk1.6.0_39
export HADOOP_HOME=$MY_HOME/hadoop-0.20.2-cdh3u4
export CLASSPATH=$JAVA_HOME/lib/tools.jar:.
export HIVE_HOME=$MY_HOME/hive-0.7.1-cdh3u4
export PIG_HOME=$MY_HOME/pig-0.8.1-cdh3u4
export HBASE_HOME=$MY_HOME/hbase-0.90.6-cdh3u4
export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
Please help me with this.
I even tried registering the ZooKeeper and HBase jars in $PIG_HOME/lib.
JobTracker log:
14-Nov-2013 14:48:29 (17sec)
java.lang.RuntimeException: could not instantiate 'org.apache.pig.backend.hadoop.hbase.HBaseStorage' with arguments '[countDetails:ansCount countDetails:divCount countDetails:unansCount countDetails:engCount]'
at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:502)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getStoreFunc(POStore.java:218)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:85)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:68)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:278)
at org.apache.hadoop.mapred.Task.initialize(Task.java:511)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:306)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
Registering the HBase and ZooKeeper jars in $PIG_HOME/lib and the Guava jar in HBase's lib directory did work. Thanks Lorand Bendig for your support.
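A sketch of that fix using the CDH3u4 layout from the profile above; the exact jar names and source locations are assumptions and should be matched to your install:
# Make HBase and ZooKeeper classes visible to Pig, and Guava visible to HBase
cp $HBASE_HOME/hbase-0.90.6-cdh3u4.jar $PIG_HOME/lib/
cp $HBASE_HOME/lib/zookeeper-*.jar $PIG_HOME/lib/    # jar name assumed
cp $PIG_HOME/lib/guava-*.jar $HBASE_HOME/lib/        # source of the Guava jar assumed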

Hive shell exception: java type java.lang.Integer can't be mapped for this datastore

I have Hadoop and HBase installed. When I run the show tables command in the Hive shell, the following error is raised.
Hive version: 0.10.0
HBase version: 0.90.6
Hadoop version: 1.1.2
hive> show tables;
FAILED: Error in metadata: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException javax.jdo.JDOFatalInternalException: JDBC type integer declared for field
"org.apache.hadoop.hive.metastore.model.MTable.createTime" of java type java.lang.Integer cant be mapped for this datastore.
NestedThrowables:
org.datanucleus.exceptions.NucleusException: JDBC type integer declared for field "org.apache.hadoop.hive.metastore.model.MTable.createTime" of java type java.lang.Integer cant be mapped for this datastore.)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
I found where the problem comes from. The error is related to the language settings of the Linux box. Before launching Hive, export LANG=C is needed.
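In other words, something like this before starting the Hive shell:
# Work around the locale-related metastore mapping error
export LANG=C
hive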
