Which version of Pig should I use with HBase 0.98.8? - hadoop

I have Hadoop 2.5.1 installed.
Hive version is 0.13.1.
Pig version is 0.13.0.
HBase version is 0.98.8.
If I want to load files from HDFS into HBase using Pig, will my Pig version work?
For now I am facing the following issue:
2014-12-24 16:11:24,783 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z

Make sure you build Pig with the following options:
-Dhbaseversion=95 -Dhadoopversion=23 -Dprotobuf-java.version=2.5.0

Related

Specify a vaild path to the correct hive jars using $HIVE_METASTORE_JARS or change spark.sql.hive.metastore.version to 1.2.1

When I run spark-submit on a jar that uses HiveContext, I get the error below.
spark-defaults.conf has:
spark.sql.hive.metastore.version 0.14.0
spark.sql.hive.metastore.jars ----/external_jars/hive-metastore-0.14.0.jar
#spark.sql.hive.metastore.jars maven
I would like to use Hive Metastore version 0.14. Spark and Hadoop are on different clusters.
Can anyone help me resolve this?
16/09/19 16:52:24 INFO HiveContext: default warehouse location is /apps/hive/warehouse
Exception in thread "main" java.lang.IllegalArgumentException: Builtin jars can only be used when hive execution version == hive metastore version. Execution: 1.2.1 != Metastore: 0.14.0.
Specify a vaild path to the correct hive jars using $HIVE_METASTORE_JARS or change spark.sql.hive.metastore.version to 1.2.1.
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:254)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:237)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:441)
at org.apache.spark.sql.SQLContext$$anonfun$4.apply(SQLContext.scala:272)
at org.apache.spark.sql.SQLContext$$anonfun$4.apply(SQLContext.scala:271)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$cla
Try setting the following on the Spark Hadoop configuration:
val hadoopConfig: Configuration = spark.hadoopConfiguration
hadoopConfig.set("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)
hadoopConfig.set("fs.file.impl", classOf[org.apache.hadoop.fs.LocalFileSystem].getName)
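The exception in the question talks about builtin jars, so one thing that may be worth checking is which spark.sql.hive.metastore.* settings are actually present in the spark-defaults.conf that the job reads. The following is a minimal sketch, not part of Spark: parse_spark_defaults and metastore_settings are hypothetical helper names, and the parser assumes the simple "key whitespace value" layout shown in the question (it does not handle key=value lines).

```python
def parse_spark_defaults(text):
    """Parse spark-defaults.conf style text into a dict of key -> value."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and commented-out settings such as "#spark.sql...".
        if not line or line.startswith("#"):
            continue
        # Assumption: key and value are separated by whitespace.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


def metastore_settings(settings):
    """Keep only the Hive metastore related keys."""
    prefix = "spark.sql.hive.metastore."
    return {k: v for k, v in settings.items() if k.startswith(prefix)}
```

Reading the file used by spark-submit and printing metastore_settings(parse_spark_defaults(text)) shows at a glance whether both the version and the jars entries are set and not commented out.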

ERROR 2998: Unhandled internal error. Run the code

When I run Pig with the options -x local -f /Hbase/load_hbase.pig
I get the following error:
2014-11-08 23:36:47,455 [main] INFO org.apache.pig.Main - Apache Pig version 0.12.1 (r1585011) compiled Apr 05 2014, 01:41:34
2014-11-08 23:36:47,455 [main] INFO org.apache.pig.Main - Logging error messages to: /home/eduardo/pig_1415497007452.log
2014-11-08 23:36:47,817 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/eduardo/.pigbootup not found
2014-11-08 23:36:47,918 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2014-11-08 23:36:48,436 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable
Here is the code that I run:
raw_data = LOAD '/data/QCLCD201211/201201hourly.txt' USING PigStorage(',');
weather_data = FOREACH raw_data GENERATE $1, $10;
ranked_data = RANK weather_data;
final_data = FILTER ranked_data BY $0 IS NOT NULL;
STORE final_data INTO 'hbase://weather'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('info:date info:temp');
I wonder what I'm doing wrong. Here are the versions of Hadoop, HBase and Pig:
Hadoop: hadoop-1.2.1
HBase: hbase-0.96.2-hadoop1
Pig: pig-0.12.1
Copy the Pig jar and the HBase jar into the Hadoop lib directory.
1) Copy these files to the Hadoop library:
sudo cp /usr/lib/pig/lib/pig-common-0.8.0-cdh3u0.jar /usr/lib/hadoop/lib/
sudo cp /usr/lib/pig/lib/hbase-0.96.2-cdh3u0.jar /usr/lib/hadoop/lib/
2) Stop HBase and Hadoop using the following commands:
/usr/lib/hadoop/bin/stop-all.sh
/usr/lib/hbase/bin/stop-hbase.sh
3) Restart HBase and Hadoop using the following commands:
/usr/lib/hadoop/bin/start-all.sh
/usr/lib/hbase/bin/start-hbase.sh

Integrating Pig with Hbase

I have installed hadoop-2.5.0, Pig 0.13.0 and HBase 0.98.6.1 on Linux. When I try to run a simple Pig script, the following error occurs:
2014-10-14 16:01:54,891 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
Details at logfile: /home/labuser/pig_1413279561970.log
Pasted the log below...
Pig Stack Trace
ERROR 2998: Unhandled internal error. org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.util.Bytes.equals([BLjava/nio/ByteBuffer;)Z
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:281)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:344)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:382)
at org.apache.hadoop.hbase.TableName.<clinit>(TableName.java:82)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
It seems that HBase 0.98.6.1 does not work with Pig 0.13.0.
So how can I make it work? Or which version of HBase does Pig 0.13.0 support?
The root cause for this has been identified as https://issues.apache.org/jira/browse/HBASE-6658, which says the class "org.apache.hadoop.hbase.filter.WritableByteArrayComparable" was renamed.
You may need to recompile Pig using the HBase profile that matches the HBase version you're running.
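A NoSuchMethodError or a missing class like the ones above usually means a class was loaded from a jar of a different version than the one Pig was compiled against. Before rebuilding, it can help to see which jars actually contain the class in question. The following is a minimal sketch under that assumption; jars_containing is a hypothetical helper, and the directories you pass in depend on your installation.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, lib_dirs):
    """Return paths of jars under lib_dirs that contain class_name."""
    # A class such as a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for lib_dir in lib_dirs:
        for jar in sorted(Path(lib_dir).glob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as archive:
                    if entry in archive.namelist():
                        hits.append(str(jar))
            except zipfile.BadZipFile:
                # Skip files that are named *.jar but are not valid archives.
                continue
    return hits
```

For example, calling jars_containing("org.apache.hadoop.hbase.util.Bytes", [...]) with the Pig, HBase and Hadoop lib directories lists every jar that ships that class. More than one hit with different HBase versions in the file names would point to a classpath conflict.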

Error with Pig on Local machine

I am a newbie, so pardon me if the question seems silly. I have installed Hadoop 1.2.1 and the basic wordcount example works fine on my local machine, so as the next level of exploration I installed Pig 0.13.0.
When I tried running pig -help it seemed to work fine. But when I run pig version I get an IOException as below:
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
14/08/06 01:00:08 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2014-08-06 01:00:08,321 [main] INFO org.apache.pig.Main - Apache Pig version 0.13.0 (r1606446) compiled Jun 29 2014, 02:29:34
2014-08-06 01:00:08,322 [main] INFO org.apache.pig.Main - Logging error messages to: /home/<user>/pig/log/pig_1407301208318.log
2014-08-06 01:00:09,856 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist.
Details at logfile: /home/<user>/pig/log/pig_1407301208318.log
The content of the log file is as below:
java.io.FileNotFoundException: File version does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:402)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:255)
at org.apache.pig.impl.io.FileLocalizer.fetchFilesInternal(FileLocalizer.java:778)
at org.apache.pig.impl.io.FileLocalizer.fetchFile(FileLocalizer.java:722)
at org.apache.pig.Main.run(Main.java:550)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
It would be great if someone could help me understand what needs to be done to fix this.
I think nothing is wrong; Pig will run fine. The problem is that you used this command:
pig version
But you should use
pig -version
I think you will get the same error if you write pig help.
Have a nice day
Try this:
pig -version
If you give pig version alone, Pig searches for a file named 'version'.
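The behaviour described above can be illustrated with a simplified model of command-line handling. This is only a sketch, not Pig's actual argument parser: classify_args is a hypothetical function that treats tokens starting with a dash as options and the first bare token as a script file to open.

```python
def classify_args(args):
    """Split args into option names and the first bare token (a script path)."""
    options = []
    script = None
    for arg in args:
        if arg.startswith("-"):
            # Tokens with a leading dash are interpreted as options.
            options.append(arg.lstrip("-"))
        elif script is None:
            # The first token without a dash is taken as a file to run.
            script = arg
    return options, script
```

Under this model, classify_args(["-version"]) yields the option version and no script, while classify_args(["version"]) yields no options and a script named version, which matches the "File version does not exist" error in the question.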

Reading hive table using Pig script

I am trying to read a Hive table using a Pig script, but when I run Pig code to read the table, it gives me the following error:
2014-02-12 15:48:36,143 [main] WARN org.apache.hadoop.hive.conf.HiveConf
-hive-site.xml not found on CLASSPATH 2014-02-12 15:49:10,781 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate
exception from backed error: Error: Found class
org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
(Ignore newlines and whitespace added for readability)
Hadoop version
1.1.1
Hive version
0.9.0
Pig version
0.10.0
Pig code
a = LOAD '/user/hive/warehouse/test' USING
org.apache.pig.piggybank.storage.HiveColumnarLoader('name string');
Is it due to a version mismatch?
Why not use HCatalog to access Hive metadata in Pig?
Check this for an example
