The problem i am facing is:
Everytime I login in to HIVE CLI, all the created databases & tables are gone. I can see them in the warehouse directory in Hadoop GUI. However same is not reflecting through CLI. Please help me resolve the issue.
I am using Hadoop - 1.0.4 & Hive - 1.2.1.
I have configured (warehouse dir, temp dir, derby metastore dir) inhive-site.xml as per documentation.
properties in hive-site.xml
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/<username> is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hadoop/hive</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/hadoop/hive</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.scratch.dir.permission</name>
<value>700</value>
<description>The permission for the user specific scratch directories that get created.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>3</value>
<description>Number of retries while opening a connection to metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/hadoop/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
Related
I would like to use Hue as a visualization interface for hive, the server hiveserver 2 starts well and I can work in command without problem.
My hadoop is also functional (single node running on localhost), I managed to configure the hdfs files for hue and I can easily view hdfs files with the interface hue. but my big problem for weeks is to make a HIVE request with hue (even if I configured according to the research I found on the internet). I can not do it and get stuck on it
your help will be really appreciated.
this is hive-site.xml
<?xml version="1.0"?>
-<configuration>
-<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/hive_temp</value>
<description>Local scratch space for Hive jobs</description>
</property>
-<property>
<name>hive.execution.engine</name>
<value>mr</value>
<description> Expects one of [mr, tez, spark]. Chooses execution engine. Options are: mr (Map reduce, default)</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true&useSSL=false</value>
<description>metadata is stored in a MySQL server</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>
-<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hivepassword</value>
<description>password to use against metastore database</description>
</property>
-<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive_tmp</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
</property>
-<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/warehouse</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission</description>
</property>
</configuration>
and hive configuration in HUE pseudo-distributed.ini
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=localhost
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10002
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/usr/local/hive/conf
Im trying to connect to hive server 2 through a JDBC connector, but Im getting the error:
'user x cant impersonate y'
I added these properties to my core-site.xml file:
<property>
<name>hadoop.proxyuser.hive.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hive.groups</name>
<value>*</value>
</property>
Also, in hive-site.xml I have:
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
<description>
Setting this property to true will have HiveServer2 execute
Hive operations as the user making the calls to it.
</description>
</property>
I have my authentication set to none and I am connecting as anonymous. I have restarted my cluster since changing the config files as well as running:
hadoop fs -chmod g+w /user/hive/warehouse
hadoop fs -chmod g+w /tmp
Can anyone suggest why Im still getting the error?
If you are trying to connect as user named anonymous, the properties should be
<property>
<name>hadoop.proxyuser.anonymous.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.anonymous.groups</name>
<value>*</value>
</property>
I am running Hive2 on my ubuntu and trying to create tables both through the hive interface and through beeline\jdbc.
I have no problem creating tables through hive interface but when accessing through jdbc I get a permission denied error.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/user/hive/warehouse/page_view":cto:supergroup:drwxrwxr-x
From the exception I see it is trying to create the table in a directory which does not exist (/user/hive/warehouse/...)
my hive-default.xml has the following configuration:
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/local/apache-hive-1.2.1-bin/metastore_db</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value/>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>3</value>
<description>Number of retries while opening a connection to metastore</description>
</property>
<property>
<name>hive.metastore.failure.retries</name>
<value>1</value>
<description>Number of retries upon failure of Thrift metastore calls</description>
</property>
<property>
<name>hive.metastore.client.connect.retry.delay</name>
<value>1s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
Number of seconds for the client to wait between consecutive connection attempts
</description>
</property>
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>600s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
MetaStore Client socket timeout in seconds
</description>
</property>
<property>
<name>hive.metastore.client.socket.lifetime</name>
<value>0s</value>
<description>
Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified.
MetaStore Client socket lifetime in seconds. After this time is exceeded, client
reconnects on the next MetaStore operation. A value of 0s means the connection
has an infinite lifetime.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mine</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.ds.connection.url.hook</name>
<value/>
<description>Name of the hook to use for retrieving the JDO connection URL. If empty, the value in javax.jdo.option.ConnectionURL is used</description>
</property>
<property>
<name>javax.jdo.option.Multithreaded</name>
<value>true</value>
<description>Set this to true if multiple threads access metastore through JDO concurrently.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/usr/local/apache-hive-1.2.1-bin/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
So why it is trying to create the metastore_db under /user/hive/warehouse?
My problem was I did not change the hive-default.xml.template name to hive-site.xml as I should.
I'm trying to set up Cloudera Impala with CDH4 in pseudo distributed mode on Red Hat 5. I have Hive using JDBC to connect to a MySQL metastore, but I'm having trouble setting up Impala with JDBC. I've been following the instructions found here: http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Impala/ciiu_impala_jdbc.html
I've extracted the JARs to a directory and included that directory in $CLASSPATH. I've also included /usr/lib/hive/lib in $CLASSPATH, which has mysql-connector-java-5.1.25-bin.jar.
In both my Hive and Impala conf directories, I have hive-site.xml including the following properties:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
But when I run sudo service impala-server restart, the server log has this error:
ERROR common.MetaStoreClientPool: Error initializing Hive Meta Store client
javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Which it says is cause by this:
Caused by: org.datanucleus.store.rdbms.datasource.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
at org.datanucleus.store.rdbms.datasource.dbcp.DBCPDataSourceFactory.makePooledDataSource(DBCPDataSourceFactory.java:80)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl.initDataSourceTx(ConnectionFactoryImpl.java:144)
... 57 more
Is there any step I'm missing to configure Impala with JDBC?
I fixed this by copying mysql-connector-java-5.1.25-bin.jar to /var/lib/impala - the startup script was telling the classpath to look here for the connector jar for some reason.
I'm running hive and hbase on a 2-node-hadoop.
I'm using hadoop-0.20.205.0, hive-0.9.0, hbase-0.92.0, and zookeeper-3.4.2.
hive and hbase works fine separately. Then I followed this manual to integrate hive and hbase.
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
hive started without errors, and I created the sample table
CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
show tables in hive and list or scan in hbase works well.
But when I select * from hbase_table_1; in hive, I get errors
2012-09-12 11:25:56,975 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: Hive Internal Error: java.lang.RuntimeException(Error while making MR scratch directory - check filesystem config (null))
java.lang.RuntimeException: Error while making MR scratch directory - check filesystem config (null)
...
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://10.10.10.15:54310/tmp/hive-hadoop/hive_2012-09-12_11-25-56_602_1946700606338541381, expected: hdfs://hadoop01:54310
It says fs is wrong, but I don't think it's right to config fs to such a path, and where should I config it?
Here is my config files. Ip address of hadoop01 is 10.10.10.15.
hbase-site.xml
<configuration>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>10.10.10.15</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/datas/zookeeper</value>
<description>Property from ZooKeeper's config zoo.cfg.
The directory where the snapshot is stored.
</description>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop01:54310/hbase</value>
<description>The directory shared by RegionServers.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
</description>
</property>
Anyone can help please?
I solved it myself.
Modify $HADOOP_HOME/conf/core-site.xml, change dfs.default.name from ip to hostname. like this
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop01:54310/</value>
</property>
Make sure that both this property and hbase.rootdir property in hbase-site.xml use same hostname or ip.