"Could not get input splits" Error, with Hive-Cassandra-CqlStorageHandler - hadoop

Im trying to read data from cassandra using Hive with CqlStorageHandler.
The versions:
Hive 0.11.0
Hadoop 1.2.1
Cassandra 1.2.6
Im able to create EXTERNAL table with the following HIVE Query
CREATE EXTERNAL TABLE input(number string,name string,address string) STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES ("cassandra.columns.mapping" = ":key, name, address", "cassandra.ks.name" ="cassandradb", "cassandra.host" = "localhost" ,"cassandra.port" = "9160") TBLPROPERTIES ("cassandra.input.split.size" = "64000","cassandra.range.size" = "1000","cassandra.slice.predicate.size" = "1000");
(The table "input" is already existing and containing some data in cassandra created with CQL3)
However, When I try to read data with the following query
select * from input where number="1";
Im facing the folowing issue:
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: Could not get input splits
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:189)
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:213)
at org.apache.hadoop.hive.cassandra.input.cql.HiveCqlInputFormat.getSplits(HiveCqlInputFormat.java:169)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:292)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:297)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:185)
... 31 more
Caused by: java.lang.NumberFormatException: For input string: "143514173170822869679056708180186660043"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:444)
at java.lang.Long.valueOf(Long.java:540)
at org.apache.cassandra.dht.Murmur3Partitioner$1.fromString(Murmur3Partitioner.java:188)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:239)
at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:207)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Job Submission failed with exception 'java.io.IOException(Could not get input splits)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Am I missing anything? Kindly advise.

Related

Unable to load an file from ADLS(azure Datalake) to Hive table

When ever i am tring to load a file from my azure datalake storage to an Hive table using below command,
hiveContext.sql(LOAD DATA INPATH 'adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv' OVERWRITE INTO TABLE sample.test03)
i am getting an error :ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
Whole error Log:
17/07/05 05:45:48 INFO SparkSqlParser: Parsing command: CREATE TABLE IF NOT EXISTS sample.test03 ( GEO STRING,UPC STRING,WeekEnding STRING,BaseDollars INT,BaseDollars_AnyPromo INT,BaseDollars_Display INT,BaseDollars_FeatAndDisp INT,BaseDollars_FeatAndOrDisp INT,BaseDollars_Feature INT,BaseDollars_NoPromo INT,BaseDollars_TPR INT,BaseUnits INT,BaseUnits_AnyPromo INT,BaseUnits_Display INT,BaseUnits_EQ STRING,BaseUnits_EQ_AnyPromo STRING,BaseUnits_EQ_Display STRING,BaseUnits_EQ_FeatAndDisp STRING,BaseUnits_EQ_FeatAndOrDisp STRING,BaseUnits_EQ_Feature STRING,BaseUnits_EQ_NoPromo STRING,BaseUnits_EQ_TPR STRING,BaseUnits_FeatAndDisp INT,BaseUnits_FeatAndOrDisp INT,BaseUnits_Feature INT,BaseUnits_NoPromo INT,BaseUnits_TPR INT,Dollars INT,Dollars_AnyPromo INT,Dollars_Display INT,Dollars_FeatAndDisp INT,Dollars_FeatAndOrDisp INT,Dollars_Feature INT,Dollars_NoPromo INT,Dollars_TPR INT,PACV_Discount INT,PACV_DispWOFeat INT,PACV_FeatAndDisp INT,PACV_FeatWODisp INT,Units INT,Units_AnyPromo INT,Units_Display INT,Units_EQ INT,Units_EQ_AnyPromo STRING,Units_EQ_Display STRING,Units_EQ_FeatAndDisp STRING,Units_EQ_FeatAndOrDisp STRING,Units_EQ_Feature STRING,Units_EQ_NoPromo STRING,Units_EQ_TPR STRING,Units_FeatAndDisp INT,Units_FeatAndOrDisp INT,Units_Feature INT,Units_NoPromo INT,Units_TPR INT,ACV INT ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE
17/07/05 05:45:49 INFO SparkSqlParser: Parsing command: LOAD DATA INPATH 'adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv' OVERWRITE INTO TABLE sample.test03
17/07/05 05:45:49 INFO SessionState: Could not get hdfsEncryptionShim, it is only applicable to hdfs filesystem.
17/07/05 05:45:49 INFO Hive: Replacing src:adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv, dest: wasb://bieno-da-d-56872-unilevercom-hdi-01#049bienobrunilevercomstg.blob.core.windows.net/hive/warehouse/sample.db/test03/NLS_SYN_SCT.csv, Status:false
17/07/05 05:45:49 ERROR ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.sql.hive.client.Shim_v0_14.loadTable(HiveShim.scala:633)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply$mcV$sp(HiveClientImpl.scala:646)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:646)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:646)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:280)
at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227)
at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226)
at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:269)
at org.apache.spark.sql.hive.client.HiveClientImpl.loadTable(HiveClientImpl.scala:645)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply$mcV$sp(HiveExternalCatalog.scala:248)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:246)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:246)
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:72)
at org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:246)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.loadTable(SessionCatalog.scala:297)
at org.apache.spark.sql.execution.command.LoadDataCommand.run(tables.scala:335)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:682)
at com.accenture.Unilever.Nielsen.RestatementSample.Restatement(RestatementSample.scala:70)
at com.accenture.Unilever.StageToEnrich.RestatementLogic$.main(RestatementLogic.scala:36)
at com.accenture.Unilever.StageToEnrich.RestatementLogic.main(RestatementLogic.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error moving: adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv into: wasb://bieno-da-d-56872-unilevercom-hdi-01#049bienobrunilevercomstg.blob.core.windows.net/hive/warehouse/sample.db/test03/NLS_SYN_SCT.csv
at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2919)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1640)
... 44 more
Caused by: java.io.IOException: Error moving: adl://bienodad56872stgadlstemp.azuredatalakestore.net/Enriched/Nielsen/NielsenScantrack/Incremental_withoutRepartition/NLS_SYN_SCT.csv into: wasb://bieno-da-d-56872-unilevercom-hdi-01#049bienobrunilevercomstg.blob.core.windows.net/hive/warehouse/sample.db/test03/NLS_SYN_SCT.csv
at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2913)
... 45 more
I can execute the same code from HIVE shell but from spark script i am getting this error. Is there any special Jar file i need to include. Any help will be appreciated.

Error while saving as parquet {Could not read footer}

I am writing a spark application.
After processing logs , i save the output in parquet format. (using Dataframe.saveAsParquetFile() api)
But I am getting error sometime while saving as parquet. The error goes away if i rerun the process to save as parquet.
Please let me know root cause for this
java.io.IOException: Could not read footer: java.io.IOException: Could not read footer for file org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus#6233d82f
at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:238)
at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.refresh(newParquet.scala:369)
at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$metadataCache$lzycompute(newParquet.scala:154)
at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$metadataCache(newParquet.scala:152)
at org.apache.spark.sql.parquet.ParquetRelation2.refresh(newParquet.scala:197)
at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.insert(commands.scala:134)
at org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.run(commands.scala:114)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:68)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:87)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:939)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:939)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:332)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:144)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:135)
at org.apache.spark.sql.DataFrame.saveAsParquetFile(DataFrame.scala:1494)
at ParquetWriter$.main(ParquetWriter.scala:182)
at ParquetWriter.main(ParquetWriter.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Could not read footer for file org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus#6233d82f
at parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:230)
at parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:224)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: file:/{directory}/output.parquet2/part-r-01418.gz.parquet at 17920
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:248)
at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:273)
at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:211)
at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:229)
at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:193)
at org.apache.hadoop.fs.FSInputChecker.readFully(FSInputChecker.java:431)
at org.apache.hadoop.fs.FSInputChecker.seek(FSInputChecker.java:412)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:48)
at org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream.seek(ChecksumFileSystem.java:318)
at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:413)
at parquet.hadoop.ParquetFileReader$2.call(ParquetFileReader.java:228)
... 5 more
Thanks

Error while creating Hive/Hbase Table

Failed to create Hive Table.
Here is the log
0: jdbc:hive2://localhost:10000> CREATE TABLE hbase_table_1(key int, value string)
0: jdbc:hive2://localhost:10000> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
0: jdbc:hive2://localhost:10000> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
0: jdbc:hive2://localhost:10000> TBLPROPERTIES ("hbase.table.name" = "xyz");
...
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.io.IOException: Attempt to start meta tracker failed.
at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:204)
at org.apache.hadoop.hbase.client.HBaseAdmin.startCatalogTracker(HBaseAdmin.java:262)
at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:235)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:306)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:322)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:200)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:664)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
at com.sun.proxy.$Proxy8.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4135)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71)
at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:200)
... 35 more
) (state=08S01,code=1)
It looks like it is looking for /hbase location in zookeeper but it is not present. Can you verify in your zookeeper ,if above /hbase location is present?

Issue in Hive-0.14.0 Integration in HBase-0.98.8-hadoop2

I have hive 0.14.1 hbase 0.98.8 and hadoop 2.5.0 I am trying to integrate hive with hbase and put zookeeper-3.4.6.jar,hbase-common-0.98.8-hadoop2.jar file from HBase/lib to Hive/lib. Steps Followed are as follows:
1.hive --auxpath $HIVE_HOME/lib/hive-hbase-handler-0.14.1.jar,$HIVE_HOME/lib/hbase-common-0.98.8-hadoop2.jar,$HIVE_HOME/lib/zookeeper-3.4.6.jar,$HIVE_HOME/lib/guava-11.0.2.jar -hiveconf hbase.master=masternode:60000
2.CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
This Command does not works for me.It give me error.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Scan$Builder.setCaching(I)Lorg/apache/hadoop/hbase/protobuf/generated/ClientProtos$Scan$Builder;
at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:216)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:125)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:93)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:283)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:188)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:183)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:110)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:745)
at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:542)
at org.apache.hadoop.hbase.catalog.MetaReader.tableExists(MetaReader.java:310)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:307)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:321)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:182)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:602)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:595)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
at com.sun.proxy.$Proxy8.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:670)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3959)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:295)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1604)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1364)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1177)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Scan$Builder.setCaching(I)Lorg/apache/hadoop/hbase/protobuf/generated/ClientProtos$Scan$Builder;
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:879)
at org.apache.hadoop.hbase.protobuf.RequestConverter.buildScanRequest(RequestConverter.java:488)
at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:303)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:164)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:117)
... 40 more
)
You may need to add the hbase-protocol JAR to the classpath. The symbol that could not be resolved per your stack trace appears to come from there, at least based on cursory inspection only.

Error while creating an Hive Table on top of an Hbase table

I am trying to create a Hive table on top of HBase, but getting error each time. Please tell me what I am doing wrong here.
CREATE TABLE hbase_trades(key string, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val") TBLPROPERTIES ("hbase.table.name" = "trades");
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.io.IOException: Attempt to start meta tracker failed.
at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:199)
at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:221)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:269)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:285)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:483)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:476)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy12.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:598)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3697)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:253)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1485)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1263)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1091)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:921)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:199)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:425)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:195)
... 33 more
)
You need to tell Hive where to find the zookeepers quorum which would elect the HBase master
Change zk1.mymachine.com,zk2.mymachine.com with the name of your corresponding zookeepers' hosts.
$HIVE_SRC/build/dist/bin/hive -hiveconf hbase.zookeeper.quorum=zk1.mymachine.com,zk2.mymachine.com

Resources