Error in loading hive - hadoop

I am new to Hive. When I started Hive from the command prompt on Windows, I got the following error:
(A required privilege is not held by the client)
Can anyone tell me why?

Try running with elevated privileges (sudo access; on Windows, open the command prompt with "Run as administrator") and restart the program.

Related

hive prompt is not loading using the hive command

I installed Hive successfully. Then I tried to launch the HiveServer2 Thrift service. Something went wrong, and now the hive prompt no longer loads either.
Logging initialized using configuration in jar:file:/usr/local/hadoop/hive/lib/hive-common-1.2.2.jar!/hive-log4j.properties
Execution is stuck at this message. Can anyone help me fix this?

Hive jdbc connection is giving error if MR is involved

I am working on a Hive JDBC connection in HDP 2.1.
The code works fine for queries where MapReduce is not involved, like "select * from tablename". The same code shows an error when the query is modified with a 'where' clause or if we specify column names (which will run MapReduce in the background).
I have verified the correctness of the query by executing it in HiveCLI.
Also I have verified the read/write permissions for the table for the user through which I am running the java-jdbc code.
The error is as follows
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:355)
at com.testing.poc.hivejava.HiveJDBCTest.main(HiveJDBCTest.java:25)
Today I also got this exception when I submitted a Hive task from Java. The output was:
hive_driver:org.apache.hive.jdbc.HiveDriver
hive_url:jdbc:hive2://10.174.242.28:10000/default
get connection sessucess
Hive connection obtained successfully!
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I ran the same SQL in Hive directly and it worked well. Then I looked at the log in /var/log/hive/hadoop-cmf-hive-HIVESERVER2-cloud000.log.out and found the reason for this error:
Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=anonymous, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
Solution
I used the following command:
sudo -u hdfs hadoop fs -chmod -R 777 /
This solved the error!
hive_driver:org.apache.hive.jdbc.HiveDriver
hive_url:jdbc:hive2://cloud000:10000/default
get connection sessucess
Hive connection obtained successfully!
Heart beat
Insert executed successfully!
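A narrower alternative to opening up the whole filesystem might be to give the denied user its own HDFS home directory. The sketch below is an assumption rather than part of the fix reported above: the helper names denied_user and print_home_dir_fix are hypothetical, and it assumes the job only needs a writable /user/<name> directory for staging. It prints the hadoop fs commands instead of running them, so they can be reviewed first.

```shell
# Extract the user name from an HDFS AccessControlException message.
denied_user() {
    # $1: a log line containing "user=<name>,"
    printf '%s\n' "$1" | sed -n 's/.*user=\([^,]*\),.*/\1/p'
}

# Print (do not run) commands that would create an HDFS home directory
# for that user. Assumption: staging happens under /user/<name>.
print_home_dir_fix() {
    user=$(denied_user "$1")
    [ -n "$user" ] || return 1
    printf 'sudo -u hdfs hadoop fs -mkdir -p /user/%s\n' "$user"
    printf 'sudo -u hdfs hadoop fs -chown %s /user/%s\n' "$user" "$user"
}
```

Calling print_home_dir_fix with the "Permission denied" line from the HiveServer2 log prints the two commands for user anonymous. Connecting with a real cluster user name instead of an empty one may avoid the problem altogether.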
If you use beeline to execute the same queries, do you see the same behaviour as you get while running your test program?
The beeline client also uses the open source JDBC driver and connects to Hive server, which is similar to what you do in your program. HiveCLI on the other hand has Hive embedded in it and does not connect to a remote Hive server by default. You can use HiveCLI to connect to a remote Hive Server 1 but I don't believe you can use it to connect to Hive Server2 (use beeline for Hive Server 2).
For this error, you can take a look at the hive.log and hiveserver2.log on the server side to get more insight into what might have caused the MapReduce error.
Hope this helps.
Cheers,
Holman

java.sql.SQLException: Failed to start database '/var/lib/hive/metastore/metastore_db' in hive

I am new to Hive. When I try to execute any Hive command, for example:
hive> SHOW TABLES;
it shows the error below:
FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Failed to start database '/var/lib/hive/metastore/metastore_db', see the next exception for details.
NestedThrowables:
java.sql.SQLException: Failed to start database '/var/lib/hive/metastore/metastore_db', see the next exception for details.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
It looks like a Derby locking issue. You can temporarily fix it by deleting the lock files inside the directory /var/lib/hive/metastore/metastore_db, but the issue will occur again in the future:
sudo rm -rf /var/lib/hive/metastore/metastore_db/*.lck
With the default embedded Derby metastore, it is not possible to start multiple instances of Hive at the same time. Moving the Hive metastore to a MySQL or PostgreSQL server solves this.
See the following Cloudera documentation for changing the Hive metastore:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_18_4.html
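Before deleting anything, it can help to see which lock files are actually present and to remove only those. A minimal sketch, assuming the metastore directory from the error message; the function names list_derby_locks and remove_derby_locks are hypothetical, and you should first make sure no other Hive or Spark session is still using the database.

```shell
# List Derby lock files directly inside a metastore_db directory.
list_derby_locks() {
    # $1: path to the metastore_db directory
    find "$1" -maxdepth 1 -type f -name '*.lck' | sort
}

# Remove only those lock files and report each one that was removed.
remove_derby_locks() {
    list_derby_locks "$1" | while IFS= read -r lock; do
        rm -f -- "$lock" && printf 'removed %s\n' "$lock"
    done
}
```

For example, list_derby_locks /var/lib/hive/metastore/metastore_db shows what would be deleted; the directory may only be readable with sudo.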
I encountered a similar error when I forgot about another instance of spark-shell running on the same node.
Update hive-site.xml under the ~/hive/conf folder with the name/value below and try again:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
</property>
In my case I needed to create a directory and grant proper permissions:
$ sudo mkdir /var/lib/hive/metastore/
$ sudo chown hdfs:hdfs /var/lib/hive/metastore/

Hive JDBC client throws SQLException

I am connecting to a Hive installation using JDBC client code. I have created a test table with two columns (column1, column2), both of string type. When I execute simple queries like "select * from test" I get results in the Java program, but queries with where clauses and other complex queries throw the following exception:
"Query returned non-zero code: 1, cause: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask"
I have tried changing permissions of the HDFS directories where the file is present, and of /tmp on the local filesystem, but this didn't work.
This is my connection code:
Connection con = DriverManager.getConnection("jdbc:hive://"+host+":"+port+"/default", "", "");
Statement stmt = con.createStatement();
The error is thrown at the executeQuery() method.
Checking the logs on the server gives the following exception:
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
at org.apache.hadoop.mapred.JobClient.init(JobClient.java:478)
at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:457)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:426)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:138)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1374)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1160)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:973)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:893)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Job Submission failed with exception 'java.io.IOException(Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.)'
The queries work when run from a command prompt but not from the JDBC client.
I am stuck on this. Any suggestions would be helpful.
UPDATE
I am using the Cloudera CDH4 Hadoop/Hive distribution. The script that I ran is as follows:
#!/bin/bash
HADOOP_HOME=/usr/lib/hadoop/client
HIVE_HOME=/usr/lib/hive
echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt
HADOOP_CORE={{ls $HADOOP_HOME/hadoop*core*.jar}}
CLASSPATH=.:$HADOOP_CORE:$HIVE_HOME/conf
for i in ${HIVE_HOME}/lib/*.jar ; do
CLASSPATH=$CLASSPATH:$i
done
for i in ${HADOOP_HOME}/*.jar ; do
CLASSPATH=$CLASSPATH:$i
done
java -cp $CLASSPATH com.hive.test.HiveConnect
I changed HADOOP_CORE={{ls $HADOOP_HOME/hadoop-*-core.jar}} to HADOOP_CORE={{ls $HADOOP_HOME/hadoop*core*.jar}} because there was no jar file in my HADOOP_HOME starting with hadoop- and ending with -core.jar. Is this correct? Also, running the script gives the following error:
/usr/lib/hadoop/client/hadoop*core*.jar}}: No such file or directory
I also modified the script to add the Hadoop client jars to the classpath, because the script threw an error that hadoop fileReader was not found. So I added the following as well:
for i in ${HADOOP_HOME}/*.jar ; do
CLASSPATH=$CLASSPATH:$i
done
This executes the class file and runs the query "select * from test" but fails on "select column1 from test".
Still no success and the same error.
Since it runs fine with the hive shell, can you check whether the user running the hive shell and the user running the Java program (with JDBC) are the same?
Next, start the Thrift server.
cd to where Hive is installed and issue this command:
bin/hive --service hiveserver &
You should see:
Starting Hive Thrift Server
A quick way to ensure the HiveServer is running is to use the netstat command to determine if port 10000 is open and listening for connections:
netstat -nl | grep 10000
tcp 0 0 :::10000 :::* LISTEN
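Note that a plain grep 10000 also matches longer numbers that merely contain 10000. A slightly stricter check, as a sketch only: the is_listening helper is a hypothetical name, and it reads netstat -nl style output on standard input.

```shell
# Succeed if the given port appears as a listening socket in netstat output.
is_listening() {
    # $1: port number; netstat output is read from stdin.
    # Match ":<port>" or ".<port>" followed by whitespace on a LISTEN line.
    grep -E "[:.]$1[[:space:]].*LISTEN" >/dev/null
}
```

Usage would look like: netstat -nl | is_listening 10000 && echo "HiveServer port is open"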
Next, create a file called myhivetest.sh with the following content, replacing HADOOP_HOME, HIVE_HOME and package.youMainClass according to your requirements:
#!/bin/bash
HADOOP_HOME=/your/path/to/hadoop
HIVE_HOME=/your/path/to/hive
echo -e '1\x01foo' > /tmp/a.txt
echo -e '2\x01bar' >> /tmp/a.txt
HADOOP_CORE={{ls $HADOOP_HOME/hadoop-*-core.jar}}
CLASSPATH=.:$HADOOP_CORE:$HIVE_HOME/conf
for i in ${HIVE_HOME}/lib/*.jar ; do
CLASSPATH=$CLASSPATH:$i
done
java -cp $CLASSPATH package.youMainClass
Save the myhivetest.sh and do a chmod +x myhivetest.sh. You can run the bash script using ./myhivetest.sh, which will build your classpath before invoking your hive program.
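The {{ls ...}} in the script looks like wiki markup left over from a backtick command substitution. Bash does not expand it: it reads the line as a temporary assignment followed by a command whose name ends in .jar}}, which would explain the "No such file or directory" error reported in the question. A sketch of the same classpath logic using $(...) instead, assuming the jar layout described above; build_classpath is a hypothetical helper name.

```shell
# Build a Java classpath from the Hadoop jars and the Hive lib jars.
build_classpath() {
    # $1: directory holding the Hadoop jars, $2: Hive home directory
    classpath=".:$2/conf"
    for jar in "$1"/*.jar "$2"/lib/*.jar; do
        [ -e "$jar" ] || continue   # skip patterns that matched nothing
        classpath="$classpath:$jar"
    done
    printf '%s\n' "$classpath"
}

# Command substitution with $(...) replaces the {{ls ...}} form:
# HADOOP_CORE=$(ls "$HADOOP_HOME"/hadoop-*-core.jar)
# java -cp "$(build_classpath "$HADOOP_HOME" "$HIVE_HOME")" package.youMainClass
```

Because the loop already adds every jar in the Hadoop directory, the separate HADOOP_CORE variable is not strictly needed in this variant.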
Please follow the instructions here for details.
There are two modes, embedded and standalone.
You should look at standalone mode.
For your information:
Hive is not an extensive query engine akin to a DBMS like MySQL, Oracle or Teradata.
Hive has limitations on the complexity of queries you can run, such as very complex joins.
Hive runs Hadoop MapReduce jobs when you run a query.
Check this tutorial for which types of queries are supported and which are not.
Hope this helps.
I had the same issue and managed to resolve it.
This error popped up when I was running the Hive JDBC client on a Hadoop cluster with /user accounts set up.
In such an environment, the ability to run MapReduce jobs is based on permissions.
Because the connection string was wrong, the MapReduce framework was not able to set up staging directories and trigger the job.
Please look at your connection string (if this error pops up in a Hadoop cluster setup).
If the connection string looks like this:
Connection con = DriverManager
.getConnection(
"jdbc:hive2://cluster.xyz.com:10000/default",
"hive", "");
Change it to
Connection con = DriverManager
.getConnection(
"jdbc:hive2://cluster.xyz.com:10000/default",
"user1", "");
where user1 is a configured user on the cluster setup.
I was having similar issues. I am trying to query Hive using Oracle SQL Developer (http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html) combined with a third-party JDBC driver as described here: https://blogs.oracle.com/datawarehousing/entry/oracle_sql_developer_data_modeler. Yes, I know that I could use Hue to do this but I interact with many other databases, including Oracle, and it is nice to have a rich client that I can save SQL queries and simple reports directly on my machine.
I am running the latest version of Cloudera CDH (5.4) on a cluster on AWS.
I was able to issue simple queries such as "SELECT * FROM SAMPLE_07" and receive a result, but running "SELECT COUNT(*) FROM SAMPLE_07" would throw a JDBC error. I was able to solve this by creating a user in Hue, and entering this user information in the Oracle SQL Developer connection information dialog. After doing this, I was able to run both queries.
What was confusing about this is that I was able to run a simple SELECT statement and received no error -- what I am used to is either a) I can log into a system to run queries or b) I can't. It is strange that it "sort of" works without the correct user ID, but I guess it is one of those strange Hadoop things.

FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

I shut down my HDFS client while HDFS and Hive instances were running. Now, after logging back into Hive, I can't execute any DDL tasks, e.g. "show tables" or "describe tablename". It gives me the error below:
ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Can anybody suggest what I need to do to get my metastore_db instantiated without recreating the tables? Otherwise, I have to duplicate the effort of creating the entire database/schema once again.
I have resolved the problem. These are the steps I followed:
1. Go to $HIVE_HOME/bin/metastore_db
2. Copy db.lck to db.lck1 and dbex.lck to dbex.lck1
3. Delete the lock entries from db.lck and dbex.lck
4. Log out from the hive shell as well as from all running instances of HDFS
5. Log back in to HDFS and the hive shell. If you run DDL commands, it may again give you the "Could not instantiate HiveMetaStoreClient" error
6. Now copy db.lck1 back to db.lck and dbex.lck1 back to dbex.lck
7. Log out from all hive shell and HDFS instances
8. Log back in and you should see your old tables
Note: Step 5 may seem a little weird because even after deleting the lock entries, it will still give the HiveMetaStoreClient error, but it worked for me.
Advantage: You don't have to duplicate the effort of re-creating the entire database.
Hope this helps somebody facing the same error. Please vote if you find it useful. Thanks in advance.
I was told that we generally get this exception if the Hive console was not terminated properly.
The fix:
Run the jps command, look for the "RunJar" process and kill it using the kill -9 command.
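The jps step can be scripted. A sketch, assuming jps prints one "<pid> <name>" pair per line; runjar_pids is a hypothetical helper name, and it only selects the PIDs so you can inspect them before killing anything.

```shell
# Print the PIDs of RunJar processes from jps output read on stdin.
runjar_pids() {
    # Each jps line has the form "<pid> <main class name>".
    awk '$2 == "RunJar" { print $1 }'
}
```

To apply it: for pid in $(jps | runjar_pids); do kill -9 "$pid"; done. Note that RunJar is the name shown for anything launched through hadoop jar, so check that the PIDs really belong to the stale Hive session.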
See: getting error in hive
Have you copied the jar containing the JDBC driver for your metadata db into Hive's lib dir?
For instance, if you're using MySQL to hold your metadata db, you will need to copy
mysql-connector-java-5.1.22-bin.jar into $HIVE_HOME/lib.
This fixed that same error for me.
I faced the same issue and resolved it by starting the metastore service. The service may have stopped if your machine was rebooted or went down. You can start it by logging in as $HIVE_USER and running:
nohup hive --service metastore>$HIVE_LOG_DIR/hive.out 2>$HIVE_LOG_DIR/hive.log &
I had a similar problem with the Hive server and followed the steps below:
1. Go to $HIVE_HOME/bin/metastore_db
2. Copy db.lck to db.lck1 and dbex.lck to dbex.lck1
3. Delete the lock entries from db.lck and dbex.lck
4. Log back in to the hive shell. It is working.
Thanks
For instance, I use MySQL to hold the metadata db, so I copied
mysql-connector-java-5.1.22-bin.jar into the $HIVE_HOME/lib folder.
That resolved my error.
I was also facing the same problem, and figured out that I had both hive-default.xml and hive-site.xml (created manually by me).
I moved my hive-site.xml to hive-site.xml-template (as I did not need this file), then
started Hive, and it worked fine.
Cheers,
Ajmal
I faced this issue while running the hive command from the command line.
I resolved it by running the kinit command, as I was using Kerberized Hive:
kinit -kt <your keytab file location> <kerberos principal>
