Hive: Unable to access database - hadoop

I find myself in a bit of a 'hive' pickle here. On booting the Hive CLI from my home directory, I can access the 'fooDB' database, which I had created earlier:
hadoop@server-7:~$ hive
/usr/local/hive/hive-1.1.0-cdh5.5.2/bin/hive: line 258: no: command not found
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> SHOW DATABASES;
OK
default
fooDB
Time taken: 0.717 seconds, Fetched: 2 row(s)
But when I try to boot it from any other location in my file-system, I am unable to access 'fooDB':
hadoop@server-7:~/Downloads$ hive
/usr/local/hive/hive-1.1.0-cdh5.5.2/bin/hive: line 258: no: command not found
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> SHOW DATABASES;
OK
default
Time taken: 0.72 seconds, Fetched: 1 row(s)
Basically, objects created after starting the Hive CLI from one particular location in the file system, say '/home/hadoop/dir1', are not accessible via the Hive CLI from any other location, and vice versa.
The relevant Hive section of my .bashrc looks like this:
## HIVE VARIABLES ##
export HIVE_HOME=/usr/local/hive/hive-1.1.0-cdh5.5.2
export HIVE_CONF_DIR=$HIVE_HOME/conf
export PATH=$PATH:$HIVE_HOME/bin
So I am not really sure how to proceed here. I also tried using an alias for hive, which did not help. Any help would be appreciated. Thanks!

After scouring the web, I finally bumped into this, which was exactly what I was looking for.
Hope this helps out people coming across the above problem!

Related

Not able to see databases after creating new hive metastore

I have manually installed Hadoop and Hive on my Ubuntu 16.04 laptop. Hive was working fine, and I created a few test databases (Derby metastore).
After restarting the laptop, I found that Hive was running, but any command like show databases gave an error.
I followed the solutions given on the web, i.e.:
1) rename metastore_db to metastore_db.tmp
2) run schematool to generate a new metastore_db
3) remove the temporary metastore_db.tmp (not removing it gives an error when you run Hive)
Now I am able to run Hive, but on running show databases I see only the default database.
Is there any way to add the databases I created previously (for example /user/hive/warehouse/computersalesdb.db, saved in the HDFS filesystem) to the newly generated metastore?
UPDATE:
On further analysis I found that a metastore_db folder is being created wherever I run Hive. So this seems to be the cause of the problem. The solution is:
1) As advised in the comment by @cricket_007, keep the metastore in MySQL or whichever other RDBMS you are using.
2) Always run Hive from the same folder.
3) Set the property "javax.jdo.option.ConnectionURL" in hive-site.xml to create the metastore in a specific folder.
Leaving this comment for the benefit of other newbies like me :D
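Point 3) might look like the following hive-site.xml fragment; the absolute path below is only a placeholder, and this is just a sketch of the idea of pinning the embedded Derby metastore to one fixed location regardless of where Hive is launched:

```xml
<!-- hive-site.xml: use an absolute databaseName so the embedded Derby
     metastore no longer depends on the current working directory.
     /home/hadoop/hive/metastore_db is a hypothetical path; adjust it. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/home/hadoop/hive/metastore_db;create=true</value>
</property>
```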

Hive tables went missing

I had created a couple of tables in Hive and ran a few queries on them. Then I exited Hive and shut down Hadoop MapReduce and DFS. I came back the next day, only to find that the tables had gone missing!
My Hive uses a local metastore. After a lot of searching, I found only one similar issue posted by someone. The answer suggested that if a local metastore is used, Hive should be started from that same location, and I had done the same. I ran Hive from the master only and never even logged into a slave. The metastore folder is still there. So what could have gone wrong? I checked the Hadoop datanode logs and the Hive metastore logs but found nothing. Where can I find out what went wrong? Please help me with this. Also, what can be done to avoid such things?
If you use local metastore, Hive creates metastore_db in the directory from where hiveserver2 is started. So if you start the hiveserver2 from a different directory location next time, then a new metastore_db will be created at that location and this metastore_db will not have metadata about your earlier tables.
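The "moves with your working directory" behaviour described above is just relative-path resolution: the default Derby connection URL uses the relative name metastore_db, which resolves against whatever directory the Hive process was started from. A quick shell illustration of that mechanic (no Hive involved):

```shell
#!/bin/sh
# Illustration: a relative directory name like Derby's default
# "metastore_db" resolves against the process's current working
# directory, so each start location gets its own copy.
base=$(mktemp -d)
mkdir -p "$base/dir1" "$base/dir2"

# Simulate starting Hive from dir1: Derby creates metastore_db there.
( cd "$base/dir1" && mkdir -p metastore_db )

# Simulate starting Hive from dir2: no metastore_db is visible here.
( cd "$base/dir2" && { [ -d metastore_db ] && echo "found"; } || echo "missing" )
# prints "missing"
```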
Were you using a database the first day? Were you using it the second day?
Meaning
hive> show databases;
OK
default
test
Time taken: 1.575 seconds
hive> use test;
hive> show tables;
OK
blah
Time taken: 0.141 seconds
hive> select * from blah;
If you forgot to use a database or create one things could get messy.
Also what does the following command return?
sudo -u hdfs hadoop fs -ls -R \

how to access hadoop hdfs with greenplum external table

Our data warehouse is based on Hive; now we need to move data from Hive to Greenplum. We want to use an external table with gphdfs, but it looks like something is going wrong.
The table creation script is:
CREATE EXTERNAL TABLE flow.http_flow_data(like flow.zb_d_gsdwal21001)
LOCATION ('gphdfs://mdw:8081/user/hive/warehouse/flow.db/d_gsdwal21001/prov_id=018/day_id=22/month_id=201202/data.txt')
FORMAT 'TEXT' (DELIMITER ' ');
when we run
bitest=# select * from flow.http_flow_data limit 1;
ERROR: external table http_flow_data command ended with error. sh: java: command not found (seg12 slice1 sdw3:40000 pid=17778)
DETAIL: Command: gphdfs://mdw:8081/user/hive/warehouse/flow.db/d_gsdwal21001/prov_id=018/day_id=22/month_id=201202/data.txt
Our Hadoop is 1.0 and Greenplum is 4.1.2.1.
I want to know whether we need to configure anything to let Greenplum access Hadoop.
Have you opened the port (8081) to listen for the month_id=201202 directory?
I would double-check the admin guide; I think you can use gphdfs, but not until Greenplum 4.2.
Have you checked that Java is installed on your Greenplum system? It is required for gphdfs to work.
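As a quick sanity check for the point above, something like the following (run as the gpadmin user on each segment host) would confirm whether java is resolvable on the PATH that gphdfs inherits; the echo wording is illustrative only:

```shell
#!/bin/sh
# gphdfs shells out to "java", so the JVM must be on PATH for the
# user running the Greenplum segments; a missing JVM produces the
# "sh: java: command not found" error seen in the question.
if command -v java >/dev/null 2>&1; then
    echo "java found at: $(command -v java)"
else
    echo "java missing from PATH"
fi
```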

hbase cannot find an existing table

I set up an HBase cluster to store data from OpenTSDB. Recently, due to a reboot of some of the nodes, HBase lost the table "tsdb". I can still see it on HBase's master node page, but when I click on it, it gives me a TableNotFoundException:
org.apache.hadoop.hbase.TableNotFoundException: tsdb
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:952)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:818)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:782)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:249)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:213)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:171)
......
I entered the HBase shell and tried to scan the 'tsdb' table, but got a similar message:
hbase(main):018:0> scan 'tsdb'
ROW COLUMN+CELL
ERROR: Unknown table tsdb!
However, when I tried to re-create the table, the HBase shell told me it already exists...
hbase(main):013:0> create 'tsdb', {NAME => 't', VERSIONS => 1, BLOOMFILTER=>'ROW'}
ERROR: Table already exists: tsdb!
And I can also list the table in hbase shell
hbase(main):001:0> list
TABLE
tsdb
tsdb-uid
2 row(s) in 0.6730 seconds
Taking a look at the log, I found this, which should be the cause of my issue:
2012-05-14 12:06:22,140 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: tsdb, row=tsdb,,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
It says it cannot find the row for tsdb in .META., but there are indeed tsdb rows in .META.:
hbase(main):002:0> scan '.META.'
ROW COLUMN+CELL
tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:regioninfo, timestamp=1336311752799, value={NAME => 'tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x
x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00 05\x00\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.', STARTKEY => '\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\
\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5. x00\x05\x00\x001', ENDKEY => '\x00\x00\x10O\xA3\x8C\x80\x00\x00\x01\x00\x00\x0B\x00\x00\x02\x00\x00\x19\x00\x00\x03\x00\x00\x1A\x00\x00\x05\x00\x001', ENCODED => 7cd0d2205d9ae5f
cadf843972ec74ec5,}
tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:server, timestamp=1337011527000, value=brycobapd01.usnycbt.amrs.bankofamerica.com:60020
x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00
\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.
tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:serverstartcode, timestamp=1337011527000, value=1337011518948
......
tsdb-uid,,1336081042372.a30d8074431c6a31c6a0a30e61fedefa. column=info:server, timestamp=1337011527458, value=bry200163111d.usnycbt.amrs.bankofamerica.com:60020
tsdb-uid,,1336081042372.a30d8074431c6a31c6a0a30e61fedefa. column=info:serverstartcode, timestamp=1337011527458, value=1337011519807
6 row(s) in 0.2950 seconds
Here is the result after I ran "hbck" on the cluster
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/249438af5657bf1881a837c23997747e on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/4f8c65fb72910870690b94848879db1c on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/63276708b4ac9f11e241aca8b56e9def on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/e54ee4def67d7f3b6dba75a3430e0544 on HDFS, but not listed in META or deployed on any region server
ERROR: (region tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.) First region should start with an empty key. You need to create a new region and regioninfo in HDFS to plug the hole.
ERROR: Found inconsistency in table tsdb
Summary:
-ROOT- is okay.
Number of regions: 1
Deployed on: master-node,60020,1337011518948
.META. is okay.
Number of regions: 1
Deployed on: slave-node-2,60020,1337011519845
Table tsdb is inconsistent.
Number of regions: 5
Deployed on: slave-node-2,60020,1337011519845 slave-node-1,60020,1337011519807 master-node,60020,1337011518948
tsdb-uid is okay.
Number of regions: 1
Deployed on: slave-node-1,60020,1337011519807
5 inconsistencies detected.
Status: INCONSISTENT
I have run
bin/hbase hbck -fix
which unfortunately did not solve my problem.
Could someone help me out with the following?
1) Is it possible to recover the table "tsdb"?
2) If 1) cannot be done, what is the suggested way to gracefully remove 'tsdb' and create a new one?
3) What is the recommended way to reboot a node? Currently, I leave my master node always up; for the other nodes, I run these commands immediately after a reboot:
# start data node
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start jobtracker
# start hbase
bin/hbase-daemon.sh start zookeeper
bin/hbase-daemon.sh start regionserver
Many Thanks!
A bit late, maybe it's helpful to the searcher.
Run the ZooKeeper shell hbase zkcli
In the shell run ls /hbase/table
Run rmr /hbase/table/TABLE_NAME
Restart HBase
I am not very sure why you are unable to scan it. However, to recreate the table, you can try this:
1) Manually delete all entries for this table in the .META. table, and
2) Delete the directory corresponding to this table from HDFS
Try creating the table again after that.
If you are using CDH 4.3, the path in ZooKeeper should be /hbase/table94/.
To expand on @Devin Bayer's answer, run:
delete /hbase/table/<name_of_zombie_table>
if you find any zombie tables being maintained by ZooKeeper. For more help on this issue, you should google 'HBase zombie tables'.
Try to fix the meta:
hbase hbck
hbase hbck -fixMeta
hbase hbck -fixAssignments
hbase hbck -fixReferenceFiles
and try again afterwards.
More instructions on deleting the tables:
~/hbase-0.94.12/bin/hbase shell
> truncate 'tsdb'
> truncate 'tsdb-meta'
> truncate 'tsdb-uid'
> truncate 'tsdb-tree'
> exit
I also had to restart the tsd daemon.
I get a similar error message when I try an HBase connection from a Java client on a machine that doesn't have the TCP privileges to access the HBase machines.
The table indeed exists when I do hbase shell on the HBase machine itself.
Does OpenTSDB have all the privileges/port configuration needed to access the HBase machine?
I face these issues at my workplace too. I usually either delete the znodes and then remove the corresponding table, or restart both the HBase HMaster and HRegionServer to get the hbck status back to OK.
It is enough to remove the specified table from your ZooKeeper path.
For example, if zookeeper.znode.parent is configured to /blob in hbase-site.xml, you should start zkCli.sh on your ZooKeeper server and remove that directory with the command rmr /blob/table/tsdb.
hbase-clean.sh --cleanZk
It works well, simple enough.

Hive doesn't respond when I try to make a query

I have a setup on an EC2 instance that uses Whirr to spin up new Hadoop instances. I have been trying to get Hive to work with this setup. Hive should be configured to use MySQL as the local metastore. The issue I am having is that every time I try to run a query like CREATE TABLE testers (foo INT, bark STRING); via the Hive interface, it just hangs and doesn't seem to be doing anything.
Any help would be appreciated.
I would first get the debug output from the Hive command line to see where it is hanging. Run the Hive shell with this parameter, then paste the output of your command:
hive -hiveconf hive.root.logger=DEBUG,console