Hive remote postgres metastore - hadoop

I was doing multi-node setup using Apache distribution .I was able to complete hadoop installation successfully (Hadoop 2.7.3).
When I tried hive (Hive 2.3),its working without issues with the default metastore(derby).Then I changed the hive-site.xml to point to my external postgresDB
I gave host,username,password as per the tutorial .But when I ran the schemainit it is faliling as bellow ,still showing derby details and initialization
is failing .Anybody faced the same issue ever?
bash-4.2$ /data/hive/bin/schematool -initSchema -dbType postgres --verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.postgres.sql
Connecting to jdbc:derby:;databaseName=metastore_db;create=true
Connected to: Apache Derby (version 10.10.2.0 - (1582446))
Driver: Apache Derby Embedded JDBC Driver (version 10.10.2.0 - (1582446))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:derby:> !autocommit on
Autocommit status: true
0: jdbc:derby:> SET statement_timeout = 0
Error: Syntax error: Encountered "statement_timeout" at line 1, column 5. (state=42X01,code=30000)
Closing: 0: jdbc:derby:;databaseName=metastore_db;create=true
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:590)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:563)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1145)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:980)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:959)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:586)
... 8 more
*** schemaTool failed ***

Related

Install Hive on Windows

I am trying to install hive 3.1.2 on windows 10, hadoop 3.2.2.
I can start hadoop server and start hive shell by run "hive".
First problem is it show a lot of WARN:
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-09-09T21:01:22,001 INFO [main] org.apache.hadoop.hive.conf.HiveConf - Found configuration file file:/C:/my_programs/hive_3.1.2/conf/hive-site.xml
2021-09-09T21:01:22,303 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
2021-09-09T21:01:23,557 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
Hive Session ID = f879881f-c49b-449b-b8cf-81302c585358
Logging initialized using configuration in jar:file:/C:/my_programs/hive_3.1.2/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
2021-09-09T21:01:25,309 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Created HDFS directory: /tmp/hive/admin/f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:01:25,317 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Created local directory: C:/Users/admin/AppData/Local/Temp/admin/f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:01:25,325 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Created HDFS directory: /tmp/hive/admin/f879881f-c49b-449b-b8cf-81302c585358/_tmp_space.db
2021-09-09T21:01:25,345 INFO [main] org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:01:25,345 INFO [main] org.apache.hadoop.hive.ql.session.SessionState - Updating thread name to f879881f-c49b-449b-b8cf-81302c585358 main
2021-09-09T21:01:25,383 WARN [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
2021-09-09T21:01:53,237 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] CliDriver - Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
It still run hive shell when I start run
hive> show databases;
it come to error like this:
hive> show databases;
2021-09-09T21:05:01,341 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: f879881f-c49b-449b-b8cf-81302c585358
FAILED: HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
2021-09-09T21:05:43,092 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.conf.HiveConf - Using the default value passed in for log id: f879881f-c49b-449b-b8cf-81302c585358
2021-09-09T21:05:43,093 INFO [f879881f-c49b-449b-b8cf-81302c585358 main] org.apache.hadoop.hive.ql.session.SessionState - Resetting thread name to main
hive>
I have read some solution and I think that problem come from hive metastore.
I followed by tutorial that connect derby metastore with hive.
But when I try to run
schematool -dbType derby -initSchema
Window cannot run schematool as a command line.
So I really confuse how can init db to hive, or can do in another way?
Update 2021/9/20:
I have fix all variable in my paths, and right now I got stuck at new problem. The error is quite clear but no solution found on my research:
PS C:\my_programs\hive_3.1.2> .\bin\hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/my_programs/hive_3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/my_programs/hadoop-3.2.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2021-09-20T20:22:34,274 INFO [main] org.apache.hadoop.hive.conf.HiveConf - Found configuration file file:/C:/my_programs/hive_3.1.2/conf/hive-site.xml
2021-09-20T20:22:34,694 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
2021-09-20T20:22:37,728 WARN [main] org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.server2.enable.impersonation does not exist
Hive Session ID = 89fc5e06-2a55-496c-aea0-ab5512839ac3
Logging initialized using configuration in jar:file:/C:/my_programs/hive_3.1.2/lib/hive-exec-3.1.2.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.getTimeDuration(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;Ljava/util/concurrent/TimeUnit;)J
at org.apache.hadoop.hdfs.client.impl.DfsClientConf.<init>(DfsClientConf.java:248)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:307)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:291)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:173)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:624)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:591)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:747)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
PS C:\my_programs\hive_3.1.2>
It seems like error at hdfs config. But I don't know what I need to do with hadoop config, core-site.xml or hdfs-site.xml, or st else. Do we need some config at hadoop side to connect with hive???

Unable to read HiveServer2 configs from ZooKeeper

I use HDP3.1. And I Ambari to deploy hadoop cluster and hive. After deployed, I can run hive in shell successfully. And then I deploy Apache Kylin2.6, it can sync hive table. But when I build the cube, I got the following error:
java.io.IOException: OS command error exit with return code: 1, error message: SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://datacenter1:2181,datacenter2:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/15 10:04:53 [main]: INFO jdbc.HiveConnection: Connected to datacenter3:10000
19/02/15 10:04:53 [main]: WARN jdbc.HiveConnection: Failed to connect to datacenter3:10000
19/02/15 10:04:53 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs from ZooKeeper
Error: Could not open client transport for any of the Server URI's in ZooKeeper: Failed to open new session: java.lang.IllegalArgumentException: Cannot modify dfs.replication at runtime. It is not in list of params that are allowed to be modified at runtime (state=08S01,code=0)
Cannot run commands specified using -e. No current connection
The command is:
hive -e "USE default;
I run hive command in shell. It's success. The connection string is same as the string when run build cube in kylin. I'm confused why it is success in shell but failed in building cube.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://datacenter1:2181,datacenter2:2181,datacenter3:2181/default;password=hdfs;serviceDiscoveryMode=zooKeeper;user=hdfs;zooKeeperNamespace=hiveserver2
19/02/15 12:10:19 [main]: INFO jdbc.HiveConnection: Connected to datacenter3:10000
Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
0: jdbc:hive2://datacenter1:2181,datacenter2:>
You can try to add these two properties to hive-site.xml.
<property>
<name>hive.security.authorization.sqlstd.confwhitelist</name>
<value>mapred.*|hive.*|mapreduce.*|spark.*</value>
</property>
<property>
<name>hive.security.authorization.sqlstd.confwhitelist.append</name>
<value>mapred.*|hive.*|mapreduce.*|spark.*</value>
</property>
Finally, I found the root cause. There is 'Cannot modify dfs.replication at runtime.' error message in the error log. Kylin set this property in $KYLIN_HOME/conf/kylin_hive_conf.xml. And when it is running hive command, it will auto append the properties in that file. The final command likes: hive --hiveconf dfs.replication=2 ..........
It looks like that dfs.replication property can't be appened to hive command. I removed this property in kylin_hive_conf.xml. And it works now.

Failed to initialize schema for HiveServer2 in Apache Hive 3.0.0 on Cygwin (Windows 10)

I already had a Hadoop 3.0.0 cluster consisting of 2 machine: 1 namenode + RM and 1 datanode. I tried to install Apache Hive 3.0.0 by following this document.
When I run schematool -dbType derby -initSchema --verbose on Cygwin, an exception was thrown:
$ schematool -dbType derby -initSchema --verbose
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/BigSol/apache-hive-3.0.0-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/BigSol/hadoop-3.0.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User: APP
Starting metastore schema initialization to 3.0.0
org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 3.0.0
org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 3.0.0
at org.apache.hadoop.hive.metastore.MetaStoreSchemaInfo.generateInitFileName(MetaStoreSchemaInfo.java:137)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:580)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:562)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1445)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
*** schemaTool failed ***
When viewing the line of code that thrown the exception, I found that Hive tried to find a SQL schema located at %HIVE_HOME%\scripts\metastore\upgrade\derby\hive-schema-3.0.0.derby.sql.
I doubt that Cygwin messed up the path so that Hive didn't find that schema.
My questions:
How can I correct the path (or fix the problem)?
Are there batch files equivalent to *.sh files in %HIVE_HOME%\bin directory as Hive 2.1.1 have?
I found the solution. After running schematool on a Linux machine and copied metastore_db directory to Windows machine, I managed to start HiveServer2 but the beeline CLI said that the jar in C:\cygdrive\c\BigSol\apache-hive-3.0.0-bin\lib\hive-beeline-3.1.0.jar was not found.
It turned out that java in Cygwin parse the wrong path. I made a symbolic link from C:\cygdrive\c to C:\ and it worked.

Unable to start Hive CLI Hadoop(MapR)

I am trying to access hive CLI. However, it is failing to start with the following AccessControl issue.
Strangly enough, I am able to query hive data from Hue without the AccessControl issue. However, hive CLI is not working.
I am on a MapR cluster.
Any help is much appreciated.
[<user_name>#<edge_node> ~]$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/hive/hive-2.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/opt/mapr/hive/hive-2.1/conf/hive-log4j2.properties Async: true
2017-09-23 23:52:08,988 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-api-jdo-4.2.4.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-api-jdo-4.2.1.jar."
2017-09-23 23:52:08,993 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-core-4.1.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-core-4.1.6.jar."
2017-09-23 23:52:09,004 WARN [main] DataNucleus.General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/opt/mapr/spark/spark-2.1.0/jars/datanucleus-rdbms-4.1.19.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/opt/mapr/hive/hive-2.1/lib/datanucleus-rdbms-4.1.7.jar."
2017-09-23 23:52:09,038 INFO [main] DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
2017-09-23 23:52:09,039 INFO [main] DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2017-09-23 23:52:14,2251 ERROR JniCommon fs/client/fileclient/cc/jni_MapRClient.cc:2172 Thread: 20235 mkdirs failed for /user/<user_name>, error 13
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: User <user_name>(user id 50005586) has been denied access to create <user_name>
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:617)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:531)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:646)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hadoop.security.AccessControlException: User <user_name>(user id 50005586) has been denied access to create <user_name>
at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1256)
at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1276)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1913)
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getDefaultDestDir(DagUtils.java:823)
at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getHiveJarDirectory(DagUtils.java:917)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createJarLocalResource(TezSessionState.java:616)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:256)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.beginOpen(TezSessionState.java:220)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:614)
... 10 more
The error is saying you're defined access to create a directory in the file system. This is likely /user/<user name>, which will need to be added by the HDFS / MapR FS super user.
I am able to query hive data from Hue without the AccessControl
Hue communicates via Thrift and HiveServer2.
Hive CLI bypasses HiveServer2 and is deprecated.
You should use Beeline instead.
beeline -n $(whoami) -u jdbc:hive2://hiveserver:10000/default
And if you're in a kerberized cluster, then you'll need some extra options there.

Hive Server2 ACID transactions not working

I am using Hadoop-2.6.0 secured with kerberos. Installed hive server2 1.1.0 version with derby database as connectionurl, enabled security and enabled Authorization. When enabling transaction configuration, I am getting the below exception and cannot execute any queries;
Exception
Error: Error while compiling statement: FAILED: LockException [Error 10280]: Error communicating with the metastore (state=42000,code=10280)
Logs
[Error 10280]: Error communicating with the metastore
org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidTxns(DbTxnManager.java:300)
at org.apache.hadoop.hive.ql.Driver.recordValidTxns(Driver.java:927)
Caused by: MetaException(message:Unable to select from transaction database, java.sql.SQLSyntaxErrorException: Table/View 'TXNS' does not exist.
So i have created a below property in hive-site.xml file as mentioned in a blog here
Configuration
<property>
<name>hive.in.test</name>
<value>true</value>
</property>
If i set the above property then getting the below exception where i am struck and unable to solve it. I cannot run any query even use mydb;
Exception
Error: Error while compiling statement: FAILED: NullPointerException null (state=42000,code=40000)
Logs
Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:103)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkQFileTestHack(TxnHandler.java:1146)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.(TxnHandler.java:117)
I need a solution to work ACID transactions in Hive Server2. I found two related questions but not solved my issue.
hive 0.14 update and delete queries configuration error
Hive Transactions are crashing
Upgrade your hive mysql metastore db with hive-txn-schema-0.14.0.mysql.sql as follows..
mysql> SOURCE /usr/local/hadoop/hive/scripts/metastore/upgrade/mysql/hive-txn-schema-0.14.0.mysql.sql;

Resources