Can't create Hive table through JDBC connection - hadoop

I am running Hive2 on my ubuntu and trying to create tables both through the hive interface and through beeline\jdbc.
I have no problem creating tables through hive interface but when accessing through jdbc I get a permission denied error.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/user/hive/warehouse/page_view":cto:supergroup:drwxrwxr-x
From the exception I see it is trying to create the table in a directory which does not exist (/user/hive/warehouse/...)
my hive-default.xml has the following configuration:
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/local/apache-hive-1.2.1-bin/metastore_db</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>3</value>
<description>Number of retries while opening a connection to metastore</description>
</property>
<property>
<name>hive.metastore.failure.retries</name>
<value>1</value>
<description>Number of retries upon failure of Thrift metastore calls</description>
</property>
<property>
<name>hive.metastore.client.connect.retry.delay</name>
<value>1s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
Number of seconds for the client to wait between consecutive connection attempts
</description>
</property>
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>600s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
MetaStore Client socket timeout in seconds
</description>
</property>
<property>
<name>hive.metastore.client.socket.lifetime</name>
<value>0s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
MetaStore Client socket lifetime in seconds. After this time is exceeded, client
reconnects on the next MetaStore operation. A value of 0s means the connection
has an infinite lifetime.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mine</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.ds.connection.url.hook</name>
<value/>
<description>Name of the hook to use for retrieving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used</description>
</property>
<property>
<name>javax.jdo.option.Multithreaded</name>
<value>true</value>
<description>Set this to true if multiple threads access metastore through JDO concurrently.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/local/apache-hive-1.2.1-bin/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
So why it is trying to create the metastore_db under /user/hive/warehouse?

My problem was I did not change the hive-default.xml.template name to hive-site.xml as I should.

Related

Getting error while running hive on windows

I am getting following error when I try to run hive on windows, my hadoop is running fine on windows:
Connecting to jdbc:hive2://
Error applying authorization policy on hive configuration: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Beeline version 2.1.1 by Apache Hive
Error applying authorization policy on hive configuration: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Connection is already closed.
Here is my hive-site.xml:
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>user name for connecting to mysql server </description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepwd</value>
<description>password for connecting to mysql server </description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.server2.enable.impersonation</name>
<value>true</value>
</property>
<property>
<name>hive.exec.script.wrapper</name>
<value/>
<description/>
</property>
<property>
<name>hive.exec.plan</name>
<value/>
<description/>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>true</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoCreateTables</name>
<value>True</value>
</property>
Here is my hive-env.sh:
> export HADOOP_HOME=C:\Users\namaagarwal\Desktop\hadoop-2.6.2 export
> JAVA_HOME=C:\Progra~1\Java\jdk1.8.0_151
> HIVE_CONF_DIR=C:\Users\namaagarwal\Desktop\hadoop-2.6.2\hive\apache-hive-2.1.1-bin\conf
mysql user is created and derbyclient.jar and mysql-connector-java-5.0.5.jar is placed in hive/bin/lib directory. Also I have created schema for my database by using SOURCE C:/Users/namaagarwal/Desktop/hadoop-2.6.2/hive/apache-hive-2.1.1-bin/scripts/metastore/upgrade/mysql/hive-txn-schema-2.0.0.mysql.sql;
I have searched a lot but didn't get any solution for this.
What am I missing??

HDFS HA cluster Standby node doesn't become Active when the actual active namenode shutdown

I have configured HDFS on HA mode. I have a "Active" node and a "Standby" node. I have started ZKFC.
If I stop zkfc of the active node, the standby node change the state and put as "Active" node.
The problem is when I shutdown the Active server having zkfc started and one "Active" server and one "Standby" server, the Standby server doesn't change his status, always stay as Standby.
My core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://auto-ha</value>
</property>
</configuration>
My hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
<description>
The actual address the RPC server will bind to. If this optional address is
set, it overrides only the hostname portion of dfs.namenode.rpc-address.
It can also be specified per name node or name service for HA/Federation.
This is useful for making the name node listen on all interfaces by
setting it to 0.0.0.0.
</description>
</property>
<property>
<name>dfs.namenode.servicerpc-bind-host</name>
<value>0.0.0.0</value>
<description>
The actual address the service RPC server will bind to. If this optional address is
set, it overrides only the hostname portion of dfs.namenode.servicerpc-address.
It can also be specified per name node or name service for HA/Federation.
This is useful for making the name node listen on all interfaces by
setting it to 0.0.0.0.
</description>
</property>
<property>
<name>dfs.namenode.http-bind-host</name>
<value>0.0.0.0</value>
<description>
The actual adress the HTTP server will bind to. If this optional address
is set, it overrides only the hostname portion of dfs.namenode.http-address.
It can also be specified per name node or name service for HA/Federation.
This is useful for making the name node HTTP server listen on all
interfaces by setting it to 0.0.0.0.
</description>
</property>
<property>
<name>dfs.namenode.https-bind-host</name>
<value>0.0.0.0</value>
<description>
The actual adress the HTTPS server will bind to. If this optional address
is set, it overrides only the hostname portion of dfs.namenode.https-address.
It can also be specified per name node or name service for HA/Federation.
This is useful for making the name node HTTPS server listen on all
interfaces by setting it to 0.0.0.0.
</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///hdfs/data</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>auto-ha</value>
</property>
<property>
<name>dfs.ha.namenodes.auto-ha</name>
<value>nn01,nn02</value>
</property>
<property>
<name>dfs.namenode.rpc-address.auto-ha.nn01</name>
<value>master1:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.auto-ha.nn01</name>
<value>master1:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.auto-ha.nn02</name>
<value>master2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.auto-ha.nn02</name>
<value>master2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master1:8485;master2:8485;master3:8485/auto-ha</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hdfs/journalnode</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/ikerlan/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.auto-ha</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master1:2181,master2:2181,master3:2181</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.auto-ha</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
</configuration>
I have checked the logs and the problem is when trying to Fence, I have the next fail:
2017-02-24 12:46:29,389 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master2/172.16.8.232:8020. Already tried 0 time$
2017-02-24 12:46:49,399 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at master2/172.16.8.232:8020 $
org.apache.hadoop.net.ConnectTimeoutException: Call From master1/172.16.8.231 to master2:8020 failed on socket timeout exception: org.$
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.transitionToStandby(Unknown Source)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolClientSideTranslatorPB.transitionToStandby(HAServiceProtocolClientSideTran$
at org.apache.hadoop.ha.FailoverController.tryGracefulFence(FailoverController.java:172)
at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:514)
at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:505)
at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:892)
at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:910)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:809)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:418)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: org.apache.hadoop.net.ConnectTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch :$
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:534)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
I just added the next properties and now it works fine:
HDFS_SITE.XML
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
</property>
CORE-SITE.XML
<property>
<name>hs.zookeeper.quorum</name>
<value>master1:2181,master2:2181,master3:2181</value>
</property>
The problem was that can't connect using sshfence so using the shell(/bin/true) it works fine.

Hiveserver2 failed to open new session in beeline

I'm running hive 2.1.1, hadoop 2.7.3 on Ubuntu 16.04.
ps aux | grep hive shows that hiveserver2 is running.
I'm trying to login with user [hive2] and password [password] to hivesever2 through beeline.
Here is my beeline output:
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
17/02/14 13:51:41 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: server is not allowed to impersonate anonymous (state=08S01,code=0)
I am able to connect to the embedded mode by entering !connect jdbc:hive2:// in beeline.
Here's my hive-site.xml:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
</property>
<property>
<name>beeline.hs2.connection.user</name>
<value>hive2</value>
</property>
<property>
<name>beeline.hs2.connection.password</name>
<value>password</value>
</property>
<property>
<name>beeline.hs2.connection.hosts</name>
<value>localhost:10000</value>
</property>
</configuration>
I removed beeline-hs2-connection.xml in case it will overwrite hive-site.xml.
Here's my core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.proxyuser.centos.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.centos.hosts</name>
<value>*</value>
</property>
</configuration>
How could I fix the error and connect to jdbc:hive2://localhost:10000?
Thanks!
User: server is not allowed to impersonate anonymous
Here server is the user which is attempting to impersonate anonymous user.
Add these properties to core-site.xml and restart the services.
<property>
<name>hadoop.proxyuser.server.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.server.groups</name>
<value>*</value>
</property>

Hadoop - Tables not displayed in Hive

The problem i am facing is:
Everytime I login in to HIVE CLI, all the created databases & tables are gone. I can see them in the warehouse directory in Hadoop GUI. However same is not reflecting through CLI. Please help me resolve the issue.
I am using Hadoop - 1.0.4 & Hive - 1.2.1.
I have configured (warehouse dir, temp dir, derby metastore dir) inhive-site.xml as per documentation.
properties in hive-site.xml
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hadoop/hive</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hadoop/hive</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>700</value>
<description>The permission for the user specific scratch directories that get created.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>3</value>
<description>Number of retries while opening a connection to metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/hadoop/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

Connection Refused - Why does zookeeper tries to connect to localhost instead of a server ip

My Reduce Tasks fail because all the namenode fails trying to connect to localhost instead of 10.10.187.170 ..
My application even tries to connect manually in the code..
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "10.10.187.170");
conf.set("hbase.zookeeper.property.clientPort","2181");
conf.set("hbase.master","10.10.187.170");
My hbase-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera CM on 2013-08-14T06:27:30.291Z-->
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ip-10-10-187-170.eu-west-1.compute.internal:8020/hbase</value>
</property>
<property>
<name>hbase.client.write.buffer</name>
<value>2097152</value>
</property>
<property>
<name>hbase.client.pause</name>
<value>1000</value>
</property>
<property>
<name>hbase.client.retries.number</name>
<value>10</value>
</property>
<property>
<name>hbase.client.scanner.caching</name>
<value>1</value>
</property>
<property>
<name>hbase.client.keyvalue.maxsize</name>
<value>10485760</value>
</property>
<property>
<name>hbase.rpc.timeout</name>
<value>60000</value>
</property>
<property>
<name>hbase.security.authentication</name>
<value>simple</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>60000</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/hbase</value>
</property>
<property>
<name>zookeeper.znode.rootserver</name>
<value>root-region-server</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ip-10-10-187-170.eu-west-1.compute.internal</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
Error:
2013-08-24 20:05:36,213 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-08-24 20:05:36,213 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
I finally found out why it was connecting to the localhost.
I was having more than one configuration object to interact with 2 or more tables in HBase. Only for one of the conf object, i was setting the zookeeper quorum address. Didn't set it for the other objects.
By setting the zookeeper quorum address for all the objects did the trick.
Edit : Using only one conf object is more desirable(mapper or reducer). And by using it like
public void setup(Configuration conf) {
conf.set("hbase.zookeeper.quorum", "xxxxx");
}
The above code is easier to follow than having multiple objects which reduces HBase access time.

Resources