I'm running Hive and HBase on a 2-node Hadoop cluster.
I'm using hadoop-, hive-0.9.0, hbase-0.92.0, and zookeeper-3.4.2.
Hive and HBase work fine separately. Then I followed this manual to integrate Hive and HBase.
Hive started without errors, and I created the sample table:
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
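For readers unfamiliar with the storage handler: the entries in hbase.columns.mapping pair up by position with the Hive columns, and :key stands for the HBase row key. A minimal Python sketch of that pairing for the table above (pair_columns is a made-up helper for illustration, not part of Hive):

```python
def pair_columns(hive_columns, columns_mapping):
    """Pair Hive column names with hbase.columns.mapping entries by position.

    Returns {hive_column: (family, qualifier)}; the row key is (None, None).
    pair_columns is a hypothetical helper, not a Hive API.
    """
    entries = [entry.strip() for entry in columns_mapping.split(",")]
    if len(entries) != len(hive_columns):
        raise ValueError("mapping needs exactly one entry per Hive column")
    pairs = {}
    for column, entry in zip(hive_columns, entries):
        if entry == ":key":
            pairs[column] = (None, None)  # stored as the HBase row key
        else:
            family, _, qualifier = entry.partition(":")
            pairs[column] = (family, qualifier)
    return pairs


print(pair_columns(["key", "value"], ":key,cf1:val"))
```

So the Hive column key becomes the row key, and value is stored in column family cf1 under the qualifier val.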
show tables in Hive and list or scan in HBase work well.
But when I run select * from hbase_table_1; in Hive, I get these errors:
2012-09-12 11:25:56,975 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: Hive Internal Error: java.lang.RuntimeException(Error while making MR scratch directory - check filesystem config (null))
java.lang.RuntimeException: Error while making MR scratch directory - check filesystem config (null)
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://, expected: hdfs://hadoop01:54310
It says the FS is wrong, but I don't think it's right to configure the FS with such a path. And where should I configure it?
Here are my config files. The IP address of hadoop01 is
<description>The directory shared by RegionServers.</description>
<description>Property from ZooKeeper's config zoo.cfg.
The directory where the snapshot is stored.</description>
<description>The directory shared by RegionServers.</description>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)</description>
Can anyone help, please?
I solved it myself.
Modify $HADOOP_HOME/conf/core-site.xml and change fs.default.name from the IP address to the hostname, like this:
Make sure that this property and the hbase.rootdir property in hbase-site.xml both use the same hostname or IP address.
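A quick way to verify that the two files agree is to compare the scheme, host, and port of both URIs. The Python sketch below is only an illustration: it assumes the core-site.xml property meant here is fs.default.name (the Hadoop 1.x key), the /hbase path and the 192.0.2.10 address are placeholders, and the helper names are made up. Only hadoop01 and port 54310 come from the error message above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_property(site_xml, name):
    """Return the <value> of the named <property> in a Hadoop-style site file."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def same_filesystem(core_site_xml, hbase_site_xml):
    """True when fs.default.name and hbase.rootdir share scheme, host and port."""
    fs = urlparse(read_property(core_site_xml, "fs.default.name") or "")
    rootdir = urlparse(read_property(hbase_site_xml, "hbase.rootdir") or "")
    return bool(fs.netloc) and (fs.scheme, fs.netloc) == (rootdir.scheme, rootdir.netloc)


# Example values: hostname and port from the error message; /hbase and the
# 192.0.2.10 address are placeholders, not taken from the original post.
CORE_SITE = """<configuration><property>
  <name>fs.default.name</name><value>hdfs://hadoop01:54310</value>
</property></configuration>"""
HBASE_SITE_HOSTNAME = """<configuration><property>
  <name>hbase.rootdir</name><value>hdfs://hadoop01:54310/hbase</value>
</property></configuration>"""
HBASE_SITE_IP = """<configuration><property>
  <name>hbase.rootdir</name><value>hdfs://192.0.2.10:54310/hbase</value>
</property></configuration>"""

print(same_filesystem(CORE_SITE, HBASE_SITE_HOSTNAME))
print(same_filesystem(CORE_SITE, HBASE_SITE_IP))
```

The comparison is purely textual on purpose: Hadoop's "Wrong FS" check compares the URI strings, so a hostname and the IP address it resolves to still count as different filesystems.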
I would like to use Hue as a visualization interface for Hive. HiveServer2 starts fine and I can work from the command line without problems.
My Hadoop is also functional (single node running on localhost). I managed to configure HDFS for Hue and can easily browse HDFS files in the Hue interface. But my big problem for weeks has been running a Hive query from Hue, even though I configured it according to what I found on the internet. I cannot get it to work and I am stuck.
Your help would be really appreciated.
This is hive-site.xml:
<?xml version="1.0"?>
<description>Local scratch space for Hive jobs</description>
<description> Expects one of [mr, tez, spark]. Chooses execution engine. Options are: mr (Map reduce, default)</description>
<description>metadata is stored in a MySQL server</description>
<description>Driver class name for a JDBC metastore</description>
<description>Username to use against metastore database</description>
<description>password to use against metastore database</description>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
And this is the Hive configuration in Hue's pseudo-distributed.ini:
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
# Port where HiveServer2 Thrift server runs on.
# Hive configuration directory, where hive-site.xml is located
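One check that narrows this kind of problem down is whether the HiveServer2 Thrift port is reachable at all from the machine running Hue. The Python sketch below assumes localhost (the post says everything runs on one node) and port 10000, HiveServer2's default Thrift port; the values actually set in the ini file are not shown above, and thrift_port_reachable is a made-up helper name.

```python
import socket


def thrift_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed values: adjust host and port to match the Hue ini settings.
    print(thrift_port_reachable("localhost", 10000))
```

A successful connection only rules out a host or port mismatch; authentication or impersonation settings could still make the query fail.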
I've created an HBase cluster in a Hadoop HA cluster. My region servers are failing to start with the following exception in the logs:
2017-09-12 11:41:32,116 ERROR [regionserver/my.hostname.com/] regionserver.HRegionServer: Failed init
java.io.IOException: Failed on local exception: java.net.SocketException: Invalid argument; Host Details : local host is: "my.hostname.com/"; destination host is: "":8020;
I'm pretty sure the problem is caused by the Hadoop HA configuration.
I think HBase doesn't understand the nameservice and thinks it's an IP address.
excerpt from core-site.xml:
<description>NameNode URI</description>
excerpt from hdfs-site.xml:
my hbase-site.xml:
It was a silly mistake. HBase was missing the path to the Hadoop configuration files. I simply added HADOOP_CONF_DIR to hbase-env.sh.
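Why the missing configuration matters: a logical HA nameservice is only defined in hdfs-site.xml (under dfs.nameservices), so a client that cannot see that file has no choice but to treat the name as an ordinary host. The Python sketch below illustrates the lookup; the nameservice name mycluster and the host namenode1 are hypothetical, since the real values are not shown above, and the helper names are made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def declared_nameservices(hdfs_site_xml):
    """Collect the logical names listed in dfs.nameservices (comma-separated)."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "dfs.nameservices":
            value = prop.findtext("value", "")
            return {ns.strip() for ns in value.split(",") if ns.strip()}
    return set()


def rootdir_uses_nameservice(rootdir, hdfs_site_xml):
    """True when hbase.rootdir points at a logical HA nameservice, not a host."""
    authority = urlparse(rootdir).netloc
    return authority in declared_nameservices(hdfs_site_xml)


# Hypothetical example: "mycluster" is a placeholder nameservice name.
HDFS_SITE = """<configuration><property>
  <name>dfs.nameservices</name><value>mycluster</value>
</property></configuration>"""

print(rootdir_uses_nameservice("hdfs://mycluster/hbase", HDFS_SITE))
print(rootdir_uses_nameservice("hdfs://namenode1:8020/hbase", HDFS_SITE))
```

If the first check is true for your hbase.rootdir, HBase needs to load the same hdfs-site.xml as the rest of the cluster, which is what pointing HADOOP_CONF_DIR at the Hadoop configuration directory achieves.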
The problem I am facing is:
Every time I log in to the Hive CLI, all the databases and tables I created are gone. I can see them in the warehouse directory in the Hadoop GUI, but they do not show up in the CLI. Please help me resolve the issue.
I am using Hadoop 1.0.4 and Hive 1.2.1.
I have configured the warehouse dir, temp dir, and Derby metastore dir in hive-site.xml as per the documentation.
Properties in hive-site.xml:
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
<description>Local scratch space for Hive jobs</description>
<description>Temporary local directory for added resources in the remote file system.</description>
<description>The permission for the user specific scratch directories that get created.</description>
<description>location of default database for the warehouse</description>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
<description>Number of retries while opening a connection to metastore</description>
<description>JDBC connect string for a JDBC metastore</description>
I am setting up CDH4 in a pseudo-distributed mode.
I have set up Hadoop, and as suggested on CDH4 installation guide, have also completed the hdfs demo successfully.
I have also set up Hive and HBase.
To populate HBase, I have written a Java client that bulk-loads the data (around 1M rows each in 4 tables).
Now I am facing two issues:
1. While the Java client is loading the dummy data into HBase, the regionserver shuts down after around 450,000 rows have been entered in total.
2. Using Hive, I am not able to access tables created in HBase or, worse, even create tables from the Hive shell, though the HBase shell shows me the data and table structure (whatever was generated before the regionserver shut down).
I have seen other posts about the same problem. It seems that the second issue is related to my /etc/hosts or hive-site.xml, so I am pasting the contents of both.
/etc/hosts u17162752.onlinehome-server.com u17162752 default-domain.com hbase.zookeeper.quorum localhost cloudera-vm # Added by NetworkManager localhost.localdomain localhost cloudera-vm-local localhost
<description>the URL of the MySQL database</description>
<description>IP address (or fully-qualified domain name) and port of the metastore host</description>
<description>Enable Hive's Table Lock Manager Service</description>
<description>Zookeeper quorum used by Hive's Table Lock Manager</description>
<description>Zookeeper quorum used by Hive's Table Lock Manager</description>
These issues are holding me back from accomplishing the task I am supposed to do.
Thanks in advance.
PS: This is my first post to this forum, so apologies for anything inappropriate you might have found. Thanks for bearing with me.
Hi Tariq, thanks for your response. I have somehow managed to get past this. Now I am facing another issue.
I already have 4 tables in HBase, for which I want to create external tables in the Hive shell. But running the create external table commands in the Hive shell gives the following error:
'ERROR: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in -ROOT- for region .META.,,1.1028785192 containing row'
This error also appears when I do something in the HBase shell.
The other error that comes with it in the HBase shell is related to ZooKeeper. Stack trace:
'WARN zookeeper.ZKUtil: catalogtracker-on- org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation#6a9a56bf- 0x1413718482c0010 Unable to get data of znode /hbase/unassigned/1028785192
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/unassigned/1028785192'
Please help. Thanks!
I'm trying to set up Cloudera Impala with CDH4 in pseudo-distributed mode on Red Hat 5. I have Hive using JDBC to connect to a MySQL metastore, but I'm having trouble setting up Impala with JDBC. I've been following the instructions found here: http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_impala_jdbc.html
I've extracted the JARs to a directory and included that directory in $CLASSPATH. I've also included /usr/lib/hive/lib in $CLASSPATH, which has mysql-connector-java-5.1.25-bin.jar.
In both my Hive and Impala conf directories, I have hive-site.xml including the following properties:
But when I run sudo service impala-server restart, the server log has this error:
ERROR common.MetaStoreClientPool: Error initializing Hive Meta Store client
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Which it says is caused by this:
Caused by: org.datanucleus.store.rdbms.datasource.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.datasource.dbcp.DBCPDataSourceFactory.makePooledDataSource(DBCPDataSourceFactory.java:80)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initDataSourceTx(ConnectionFactoryImpl.java:144)
... 57 more
Is there any step I'm missing to configure Impala with JDBC?
I fixed this by copying mysql-connector-java-5.1.25-bin.jar to /var/lib/impala. For some reason, the startup script was pointing the classpath there for the connector jar.
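A related pitfall worth ruling out: under the JVM's classpath rules, a plain directory entry such as /usr/lib/hive/lib only exposes loose .class files. Jars inside it are picked up only when the jar is listed explicitly or the entry ends in a * wildcard. The Python sketch below mimics that rule for a single jar; jar_on_classpath is a made-up helper, and it says nothing about which classpath the Impala startup script actually builds.

```python
import os
from pathlib import Path


def jar_on_classpath(classpath, jar_name):
    """Return True if jar_name is reachable under JVM classpath rules.

    A plain directory entry exposes loose .class files only; jars inside it
    count only via an explicit path or a trailing '*' wildcard.
    """
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if entry.endswith("*"):
            # 'dir/*' expands to every jar directly inside dir.
            if (Path(entry[:-1]) / jar_name).is_file():
                return True
        elif Path(entry).name == jar_name and Path(entry).is_file():
            return True
    return False


if __name__ == "__main__":
    jar = "mysql-connector-java-5.1.25-bin.jar"
    print(jar_on_classpath(os.environ.get("CLASSPATH", ""), jar))
```

Running it in the environment the service starts from shows whether the connector jar would be visible through $CLASSPATH at all; if the service script sets its own classpath, as it did here, placing the jar where that script looks is the simpler fix.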