Hadoop - Tables not displayed in Hive

Hadoop - Tables not displayed in Hive - hadoop

The problem i am facing is:
Everytime I login in to HIVE CLI, all the created databases & tables are gone. I can see them in the warehouse directory in Hadoop GUI. However same is not reflecting through CLI. Please help me resolve the issue.
I am using Hadoop - 1.0.4 & Hive - 1.2.1.
I have configured (warehouse dir, temp dir, derby metastore dir) inhive-site.xml as per documentation.
properties in hive-site.xml
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hadoop/hive</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hadoop/hive</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>700</value>
<description>The permission for the user specific scratch directories that get created.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>3</value>
<description>Number of retries while opening a connection to metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/hadoop/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

Related

Unable to access databases created with hive and run queries on hue

I would like to use Hue as a visualization interface for hive, the server hiveserver 2 starts well and I can work in command without problem.
My hadoop is also functional (single node running on localhost), I managed to configure the hdfs files for hue and I can easily view hdfs files with the interface hue. but my big problem for weeks is to make a HIVE request with hue (even if I configured according to the research I found on the internet). I can not do it and get stuck on it
your help will be really appreciated.
this is hive-site.xml
<?xml version="1.0"?>
-<configuration>
-<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hive_temp</value>
<description>Local scratch space for Hive jobs</description>
</property>
-<property>
<name>hive.execution.engine</name>
<value>mr</value>
<description> Expects one of [mr, tez, spark]. Chooses execution engine. Options are: mr (Map reduce, default)</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true&useSSL=false</value>
<description>metadata is stored in a MySQL server</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepassword</value>
<description>password to use against metastore database</description>
</property>
-<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive_tmp</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
</property>
-<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/warehouse</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
</property>
</configuration>
and hive configuration in HUE pseudo-distributed.ini
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=localhost
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10002
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/usr/local/hive/conf

cant impersonate on hive server2

Im trying to connect to hive server 2 through a JDBC connector, but Im getting the error:
'user x cant impersonate y'
I added these properties to my core-site.xml file:
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
</property>
Also, in hive-site.xml I have:
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
<description>
Setting this property to true will have HiveServer2 execute
Hive operations as the user making the calls to it.
</description>
</property>
I have my authentication set to none and I am connecting as anonymous. I have restarted my cluster since changing the config files as well as running:
hadoop fs -chmod g+w /user/hive/warehouse
hadoop fs -chmod g+w /tmp
Can anyone suggest why Im still getting the error?

If you are trying to connect as user named anonymous, the properties should be
<property>
<name>hadoop.proxyuser.anonymous.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.anonymous.groups</name>
<value>*</value>
</property>

Can't create Hive table through JDBC connection

I am running Hive2 on my ubuntu and trying to create tables both through the hive interface and through beeline\jdbc.
I have no problem creating tables through hive interface but when accessing through jdbc I get a permission denied error.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/user/hive/warehouse/page_view":cto:supergroup:drwxrwxr-x
From the exception I see it is trying to create the table in a directory which does not exist (/user/hive/warehouse/...)
my hive-default.xml has the following configuration:
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/local/apache-hive-1.2.1-bin/metastore_db</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>3</value>
<description>Number of retries while opening a connection to metastore</description>
</property>
<property>
<name>hive.metastore.failure.retries</name>
<value>1</value>
<description>Number of retries upon failure of Thrift metastore calls</description>
</property>
<property>
<name>hive.metastore.client.connect.retry.delay</name>
<value>1s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
Number of seconds for the client to wait between consecutive connection attempts
</description>
</property>
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>600s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
MetaStore Client socket timeout in seconds
</description>
</property>
<property>
<name>hive.metastore.client.socket.lifetime</name>
<value>0s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
MetaStore Client socket lifetime in seconds. After this time is exceeded, client
reconnects on the next MetaStore operation. A value of 0s means the connection
has an infinite lifetime.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mine</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.ds.connection.url.hook</name>
<value/>
<description>Name of the hook to use for retrieving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used</description>
</property>
<property>
<name>javax.jdo.option.Multithreaded</name>
<value>true</value>
<description>Set this to true if multiple threads access metastore through JDO concurrently.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/local/apache-hive-1.2.1-bin/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
So why it is trying to create the metastore_db under /user/hive/warehouse?

My problem was I did not change the hive-default.xml.template name to hive-site.xml as I should.

Impala cannot find com.mysql.jdbc.Driver

I'm trying to set up Cloudera Impala with CDH4 in pseudo distributed mode on Red Hat 5. I have Hive using JDBC to connect to a MySQL metastore, but I'm having trouble setting up Impala with JDBC. I've been following the instructions found here: http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_impala_jdbc.html
I've extracted the JARs to a directory and included that directory in $CLASSPATH. I've also included /usr/lib/hive/lib in $CLASSPATH, which has mysql-connector-java-5.1.25-bin.jar.
In both my Hive and Impala conf directories, I have hive-site.xml including the following properties:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
But when I run sudo service impala-server restart, the server log has this error:
ERROR common.MetaStoreClientPool: Error initializing Hive Meta Store client
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Which it says is cause by this:
Caused by: org.datanucleus.store.rdbms.datasource.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.datasource.dbcp.DBCPDataSourceFactory.makePooledDataSource(DBCPDataSourceFactory.java:80)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initDataSourceTx(ConnectionFactoryImpl.java:144)
... 57 more
Is there any step I'm missing to configure Impala with JDBC?

I fixed this by copying mysql-connector-java-5.1.25-bin.jar to /var/lib/impala - the startup script was telling the classpath to look here for the connector jar for some reason.

MR scratch error on hive/hbase integration

I'm running hive and hbase on a 2-node-hadoop.
I'm using hadoop-0.20.205.0, hive-0.9.0, hbase-0.92.0, and zookeeper-3.4.2.
hive and hbase works fine separately. Then I followed this manual to integrate hive and hbase.
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
hive started without errors, and I created the sample table
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
show tables in hive and list or scan in hbase works well.
But when I select * from hbase_table_1; in hive, I get errors
2012-09-12 11:25:56,975 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: Hive Internal Error: java.lang.RuntimeException(Error while making MR scratch directory - check filesystem config (null))
java.lang.RuntimeException: Error while making MR scratch directory - check filesystem config (null)
...
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://10.10.10.15:54310/tmp/hive-hadoop/hive_2012-09-12_11-25-56_602_1946700606338541381, expected: hdfs://hadoop01:54310
It says fs is wrong, but I don't think it's right to config fs to such a path, and where should I config it?
Here is my config files. Ip address of hadoop01 is 10.10.10.15.
hbase-site.xml
<configuration>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>10.10.10.15</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/datas/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop01:54310/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
Anyone can help please?

I solved it myself.
Modify $HADOOP_HOME/conf/core-site.xml, change dfs.default.name from ip to hostname. like this
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop01:54310/</value>
</property>
Make sure that both this property and hbase.rootdir property in hbase-site.xml use same hostname or ip.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Hadoop - Tables not displayed in Hive - hadoop

Related

Unable to access databases created with hive and run queries on hue

cant impersonate on hive server2

Can't create Hive table through JDBC connection

Impala cannot find com.mysql.jdbc.Driver

MR scratch error on hive/hbase integration

Categories

Resources