Hive and Cassandra integration using CqlStorageHandler - hadoop

I followed the GitHub project below to expose Cassandra data through a Hive table. I copied the appropriate Cassandra jars into the Hive lib folder, but when I run the following query against Cassandra I get the error shown. Please help me resolve it.
https://github.com/milliondreams/hive/tree/cas-support-cql/cassandra-handler
hive> CREATE EXTERNAL TABLE messages(row_key string, col1 string, col2 string)
STORED BY 'org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler' WITH SERDEPROPERTIES("cql.primarykey" = "row_key")
TBLPROPERTIES ("cassandra.ks.name" = "mycqlks", "cassandra.ks.stratOptions"="'DC':1, 'DC2':1",
"cassandra.ks.strategy"="NetworkTopologyStrategy");
java.lang.NoSuchMethodError: org.apache.hadoop.hive.metastore.MetaStoreUtils.getSchema(Lorg/apache/hadoop/hive/metastore/api/Table;)Ljava/util/Properties;
at org.apache.hadoop.hive.cassandra.cql.CqlManager.createColumnFamily(CqlManager.java:238)
at org.apache.hadoop.hive.cassandra.cql.CqlManager.createCFIfNotFound(CqlManager.java:189)
at org.apache.hadoop.hive.cassandra.cql.CqlStorageHandler.preCreateTable(CqlStorageHandler.java:247)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:462)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:455)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at com.sun.proxy.$Proxy11.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3776)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:256)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.DDLTask

Which version of Hive are you using?
It has to be Hive 0.9, as per https://github.com/milliondreams/hive/tree/cas-support-cql/cassandra-handler
I think you are using a version >= 0.11.0.
Version 0.9.0: http://svn.apache.org/repos/asf/hive/tags/release-0.9.0/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Version 0.10.0: http://svn.apache.org/repos/asf/hive/tags/release-0.10.0/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Version 0.11.0: http://svn.apache.org/repos/asf/hive/tags/release-0.11.0/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
The single-argument method org.apache.hadoop.hive.metastore.MetaStoreUtils.getSchema(Table), which the handler calls, is missing in 0.11.0, hence the NoSuchMethodError.
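To confirm this on your own installation, a small reflection probe can report whether a method with a given parameter list is visible on the classpath. This is only a sketch: the class name MethodProbe and its command-line usage are made up for illustration, and it handles reference parameter types only (not primitives or arrays).

```java
import java.util.Arrays;

public class MethodProbe {

    /**
     * Returns true if the named class exposes a public method with exactly
     * the given parameter types (given as fully qualified class names).
     * Classes are loaded without running their static initializers.
     */
    public static boolean hasMethod(String className, String methodName, String... paramTypeNames) {
        ClassLoader loader = MethodProbe.class.getClassLoader();
        try {
            Class<?>[] paramTypes = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < paramTypeNames.length; i++) {
                paramTypes[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            // getMethod throws NoSuchMethodException when no public method matches.
            Class.forName(className, false, loader).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: MethodProbe <class> <method> [paramType...]");
            return;
        }
        String[] paramTypeNames = Arrays.copyOfRange(args, 2, args.length);
        boolean found = hasMethod(args[0], args[1], paramTypeNames);
        System.out.println(args[0] + "." + args[1] + (found ? " is present" : " is missing"));
    }
}
```

Run it with the Hive jars on the classpath, passing org.apache.hadoop.hive.metastore.MetaStoreUtils, getSchema and org.apache.hadoop.hive.metastore.api.Table as arguments. The exact classpath depends on your installation and will probably need the Hadoop jars as well. If it prints "is missing", the installed Hive is newer than the one the handler was built against.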

Related

Hive on Spark - java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS

My application versions are listed below. I am unable to run a SELECT from the beeline command line. Are these versions compatible?
Hadoop - 2.7.3
Hive - 2.3.0
Spark - 2.0.2
java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS
at org.apache.hive.spark.client.rpc.RpcConfiguration.<clinit>(RpcConfiguration.java:47)
at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:134)
at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:516)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)

Hive 2.1.0 Issue org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source

I am using Hadoop 2.7.2 and Hive 2.1.0. I had enabled Tez but reverted to MR as the execution engine, and I now get the error below when running an INSERT SELECT query on ORC and Parquet tables:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://.../user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18/.hive-staging_hive_2016-10-20_14-46-33_718_5449118624054166228-1/-ext-10000 to destination hdfs://..../user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2959)
at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1532)
at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1461)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:497)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: rename for src path: hdfs:.....:9000/user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18/.hive-staging_hive_2016-10-20_14-46-33_718_5449118624054166228-1/-ext-10000/000000_0 to dest:hdfs://.....:9000/user/hive/warehouse/access_logs.db/crawlstats_dpp/day=2016-10-18 returned false
Any clue about the issue would be highly appreciated.
Regards.

Problems with Apache Spark to consult a table

I am trying Spark with Hive.
I want to select from one table:
hiveContext.hql("select * from final_table").collect()
but I get this error:
ERROR Hive: NoSuchObjectException(message:default.final_table table not found)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1569)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:106)
at com.sun.proxy.$Proxy27.get_table(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1008)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
at com.sun.proxy.$Proxy28.getTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1000)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:974)
at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:70)
at org.apache.spark.sql.hive.HiveContext$$anon$2.org$apache$spark$sql$catalyst$analysis$OverrideCatalog$$super$lookupRelation(HiveContext.scala:253)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at org.apache.spark.sql.catalyst.analysis.OverrideCatalog$$anonfun$lookupRelation$3.apply(Catalog.scala:141)
at scala.Option.getOrElse(Option.scala:120)
but when I try this
hiveContext.hql("CREATE TABLE IF NOT EXISTS TestTable (key INT, value STRING)")
I don't have any problem and the table is created.
Any ideas about this problem, or a solution?
Thanks!
Which command do you use to start Spark? Most likely the Hive metastore is not set up correctly, which means that each time you start your cluster you create a new temporary local metastore. To use the Hive metastore, follow these guides: "Run Spark with built-in Hive", "Configuring a remote PostgreSQL database for the Hive Metastore", and https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables. That way the tables you create will persist in the Hive metastore across cluster restarts.

Phoenix Error while trying to create table

When I either log in to sqlline.py in Phoenix or try to create a table in Phoenix through the APIs, I get an exception.
With my limited knowledge of Phoenix, I am unable to figure out why Phoenix checks for the SYSTEM.CATALOG table even before it creates it. Any help will be greatly appreciated.
Stack trace:
*4/11/18 06:07:18 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: SYSTEM.CATALOG, row=SYSTEM.CATALOG,\x00SYSTEM\x00CATALOG,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:151)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1059)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1121)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:914)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1053)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1156)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:422)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:183)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:226)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:908)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1351)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:131)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:112)
at sqlline.SqlLine$DatabaseConnection.connect(SqlLine.java:4650)
at sqlline.SqlLine$DatabaseConnection.getConnection(SqlLine.java:4701)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3942)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3851)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.SqlLine$ReflectiveCommandHandler.execute(SqlLine.java:2810)
at sqlline.SqlLine.dispatch(SqlLine.java:817)
at sqlline.SqlLine.initArgs(SqlLine.java:633)
at sqlline.SqlLine.begin(SqlLine.java:680)
at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
at sqlline.SqlLine.main(SqlLine.java:424)
Error: SYSTEM.CATALOG (state=08000,code=101)
org.apache.phoenix.exception.PhoenixIOException: SYSTEM.CATALOG
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:97)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:935)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1053)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1156)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:422)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:183)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:226)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:908)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1351)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:131)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:112)
at sqlline.SqlLine$DatabaseConnection.connect(SqlLine.java:4650)
at sqlline.SqlLine$DatabaseConnection.getConnection(SqlLine.java:4701)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3942)
at sqlline.SqlLine$Commands.connect(SqlLine.java:3851)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sqlline.SqlLine$ReflectiveCommandHandler.execute(SqlLine.java:2810)
at sqlline.SqlLine.dispatch(SqlLine.java:817)
at sqlline.SqlLine.initArgs(SqlLine.java:633)
at sqlline.SqlLine.begin(SqlLine.java:680)
at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
at sqlline.SqlLine.main(SqlLine.java:424)
Caused by: org.apache.hadoop.hbase.TableNotFoundException: SYSTEM.CATALOG
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1139)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1001)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:958)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:914)
... 23 more*
Check whether SYSTEM.CATALOG exists under hdfs://.../hbase/data/default/.
If there is nothing there, try running bin/hbase clean --cleanZk.
Before you run the command, you must stop the HBase Master and RegionServers, but keep ZooKeeper running.

NoSuchMethodError using Guava 15 on Hadoop (2.3.0)

I have a compiled jar for Hadoop including this library:
com.google.guava:guava:jar:15.0:compile
When I submit it into my Hadoop CDH5.0.1 cluster I have this error:
java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.createStarted()Lcom/google/common/base/Stopwatch;
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
at com.trovit.pipeline.AdsPipelineDriver.main(AdsPipelineDriver.java:17)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
The main issue is that Hadoop has an older version of Guava on its classpath and loads it before mine, so the job crashes because the method I use doesn't exist in that version.
I've tried configuration parameters such as mapreduce.task.classpath.user.precedence, mapreduce.task.classpath.first and mapreduce.job.user.classpath.first, but none of them worked.
Any suggestions for solving this problem?
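One way to check which copy of a class actually wins is to print the location it was loaded from. The sketch below is a diagnostic only, not a fix. The class name JarLocator is made up for illustration, and the default argument simply reuses the Stopwatch class from the stack trace above.

```java
import java.security.CodeSource;

public class JarLocator {

    /**
     * Returns the jar or directory a class was loaded from, or a marker
     * string when the JVM does not expose it (e.g. bootstrap classes).
     */
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to the Guava class named in the error; pass another class name to override.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Stopwatch";
        System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
    }
}
```

Calling locationOf from the job's own driver, or running the class with the classpath reported by the hadoop classpath command, should show whether the Hadoop-bundled Guava jar or the one packaged with the job is picked up first.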

Resources