ERROR: Found inconsistency in table HBase - hadoop

We are using HBase 0.94.7 with two region servers. One region is permanently stuck in transition. When I run hbase hbck, it reports inconsistencies, but neither hbase hbck -repair nor hbase hbck -fix helps because of this region in transition. Here is the output from hbase hbck:
ERROR: Region { meta => LogTable,\x00\x00\x01\xE8\x00\x00\x01#\x07B\x02\xCF\xEF\xCE>.,1374573828457.f41ff2fae25d1dab3f16306f4f995369., hdfs => hdfs://master:8020/hbase/LogTable/f41ff2fae25d1dab3f16306f4f995369, deployed => } not deployed on any region server.
ERROR: There is a hole in the region chain between \x00\x00\x01\xE8\x00\x00\x01#\x07B\x02\xCF\xEF\xCE>. and \x00\x00\x01\xFC\x00\x00\x01#\x08\x1E1\x0F\x07&\xCE\x11. You need to create a new .regioninfo and region dir in hdfs to plug the hole.
ERROR: Found inconsistency in table LogTable
ERROR: Found lingering reference file hdfs://master:8020/hbase/LogTable/f41ff2fae25d1dab3f16306f4f995369/l/d9c7d33257ae406caf8d94277ff6d247.fbda7904cd1f0ac9583e04029a138487
ERROR: Found lingering reference file hdfs://master:8020/hbase/LogTable/f41ff2fae25d1dab3f16306f4f995369/l/b4f4b4ba52f041d5b9ee03318cac7fb7.fbda7904cd1f0ac9583e04029a138487
ERROR: Found lingering reference file hdfs://master:8020/hbase/LogTable/f41ff2fae25d1dab3f16306f4f995369/l/ee7dd42b15fe4622882ec6a7a773e01f.fbda7904cd1f0ac9583e04029a138487
When I tried hbase hbck -repair, it looped infinitely because of the region in transition:
INFO util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {NAME => 'LogTable'....}
I have no clue how to resolve this problem. Can anyone help?
Thanks

Try this:
hbase org.apache.hadoop.hbase.util.Merge
Usage:
bin/hbase merge [table-name] region-1 region-2

This looks like you had a failed region split; see [HBASE-8502](https://issues.apache.org/jira/browse/HBASE-8502) for more details.
This bug leaves behind references to parent regions that have been moved in HDFS. To fix it, just delete the reference files listed in the hbck output, e.g. hadoop fs -rm hdfs://master:8020/hbase/LogTable/f41ff2fae25d1dab3f16306f4f995369/l/d9c7d33257ae406caf8d94277ff6d247.fbda7904cd1f0ac9583e04029a138487.
Once the bad references are gone, the region should be assigned automatically. You may have to do the assignment from the shell, though in my experience it only takes a minute or two for the region to get reassigned. Then run hbase hbck -fix again to confirm there are no other inconsistencies.
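As a concrete sketch, the three reference files reported by hbck above can be removed in one loop. The DRY_RUN guard is my addition so the commands can be inspected before anything is actually deleted:

```shell
# Remove the lingering reference files hbck reported. With DRY_RUN set, the
# loop only prints the commands; unset DRY_RUN to delete for real.
REGION_DIR="hdfs://master:8020/hbase/LogTable/f41ff2fae25d1dab3f16306f4f995369/l"
REFS="d9c7d33257ae406caf8d94277ff6d247.fbda7904cd1f0ac9583e04029a138487
b4f4b4ba52f041d5b9ee03318cac7fb7.fbda7904cd1f0ac9583e04029a138487
ee7dd42b15fe4622882ec6a7a773e01f.fbda7904cd1f0ac9583e04029a138487"
DRY_RUN=1
removed=0
for ref in $REFS; do
  if [ -n "$DRY_RUN" ]; then
    echo "hadoop fs -rm $REGION_DIR/$ref"   # preview only
  else
    hadoop fs -rm "$REGION_DIR/$ref"        # actually delete
  fi
  removed=$((removed + 1))
done
echo "$removed reference file(s) handled"
```

If the region is not reassigned on its own afterwards, running assign '<region name>' from the hbase shell forces the assignment.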

Related

HBase: The table test does not exist in meta but has a znode. run hbck to fix inconsistencies (which fails)

I recently added a table named test while getting started with HBase.
I decided to reinstall HBase due to some issues.
After reinstalling and running the HBase shell I tried:
hbase(main):004:0> list
TABLE
0 row(s) in 0.0070 seconds
=> []
So there are no tables. Now I tried to create the table test:
hbase(main):005:0> create 'test', 'testfamily'
ERROR: Table already exists: test!
I took a look into the log files and found the following entry
2018-06-21 07:53:30,646 WARN [ProcedureExecutor-2]
procedure.CreateTableProcedure: The table test does not exist in meta
but has a znode. run hbck to fix inconsistencies.
I ran hbck and got the following:
$ hbase hbck test
Table hbase:meta is okay.
Number of regions: 1
Deployed on: my_IP,16201,1529567081041
0 inconsistencies detected.
Status: OK
I'm wondering if there's a way to remove the znode by hand?
I have also faced the same issue, where it showed the following error:
The table does not exist in meta but has a znode. run hbck to fix inconsistencies.
The cause is apparent from the error itself: the inconsistency arises because the table exists in your ZooKeeper quorum (distributed/pseudo-distributed mode) or single ZooKeeper node (standalone mode) but is not present in HBase.
So the solution is to remove the table from the ZooKeeper node.
To do so:
Open the ZooKeeper client: bin/zkCli.sh
You can see all the tables known to ZooKeeper with ls /hbase/table
Find the table name mentioned in the error and run rmr /hbase/table/<table_name>. This removes that table from ZooKeeper's state.
Try to create the table again from HBase. It will be created without any problem.
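A minimal sketch of the session, assuming the default zookeeper.znode.parent of /hbase and the table name test from the question. The commands are collected in a variable and echoed here so they can be reviewed first; the commented-out line applies them for real:

```shell
# Commands for the ZooKeeper CLI: list the table znodes, then remove the
# stale one left behind after the reinstall.
ZK_CMDS='ls /hbase/table
rmr /hbase/table/test'
echo "$ZK_CMDS"
# To apply them against a running quorum (adjust the server address):
# echo "$ZK_CMDS" | bin/zkCli.sh -server localhost:2181
```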

FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

I shut down my HDFS client while the HDFS and Hive instances were running. Now when I log back into Hive, I can't execute any of my DDL tasks, e.g. "show tables" or "describe tablename". It gives me the error below:
ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
Can anybody suggest what I need to do to get my metastore_db instantiated without recreating the tables? Otherwise, I have to duplicate the effort of creating the entire database/schema all over again.
I have resolved the problem. These are the steps I followed:
1. Go to $HIVE_HOME/bin/metastore_db
2. Copy db.lck to db.lck1 and dbex.lck to dbex.lck1
3. Delete the lock entries from db.lck and dbex.lck
4. Log out from the hive shell as well as from all running instances of HDFS
5. Re-login to HDFS and the hive shell. If you run DDL commands, it may again give you the "Could not instantiate HiveMetaStoreClient" error
6. Now copy db.lck1 back to db.lck and dbex.lck1 back to dbex.lck
7. Log out from all hive shell and HDFS instances
8. Re-login and you should see your old tables
Note: Step 5 may seem a little weird, because even after deleting the lock entries it will still give the HiveMetaStoreClient error, but it worked for me.
Advantage: you don't have to duplicate the effort of re-creating the entire database.
Hope this helps somebody facing the same error. Please vote if you find it useful. Thanks!
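The lock-file shuffle above can be sketched as follows. DB_DIR would normally be $HIVE_HOME/bin/metastore_db; the sketch creates a scratch stand-in (my addition) so the steps can be walked through safely:

```shell
# Stand-in for the embedded Derby metastore directory; point DB_DIR at the
# real $HIVE_HOME/bin/metastore_db when applying this for real.
DB_DIR=${DB_DIR:-$(mktemp -d)/metastore_db}
mkdir -p "$DB_DIR"
touch "$DB_DIR/db.lck" "$DB_DIR/dbex.lck"    # Derby's lock files

# Steps 2-3: back up the lock files, then empty the live ones.
cp "$DB_DIR/db.lck" "$DB_DIR/db.lck1"
cp "$DB_DIR/dbex.lck" "$DB_DIR/dbex.lck1"
: > "$DB_DIR/db.lck"
: > "$DB_DIR/dbex.lck"

# Steps 4-5 happen outside this script: log out of Hive/HDFS, log back in,
# and expect the HiveMetaStoreClient error once more.

# Steps 6-8: restore the backups, then log out and back in again.
cp "$DB_DIR/db.lck1" "$DB_DIR/db.lck"
cp "$DB_DIR/dbex.lck1" "$DB_DIR/dbex.lck"
```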
I was told that we generally get this exception when the Hive console is not terminated properly.
The fix:
Run the jps command, look for the "RunJar" process, and kill it using the kill -9 command.
See: getting error in hive
Have you copied the jar containing the JDBC driver for your metadata db into Hive's lib dir?
For instance, if you're using MySQL to hold your metadata db, you will need to copy
mysql-connector-java-5.1.22-bin.jar into $HIVE_HOME/lib.
This fixed that same error for me.
I faced the same issue and resolved it by starting the metastore service. Sometimes the service stops if your machine is rebooted or goes down. You can start the service by running:
Login as $HIVE_USER
nohup hive --service metastore > $HIVE_LOG_DIR/hive.out 2> $HIVE_LOG_DIR/hive.log &
I had a similar problem with the hive server and followed these steps:
1. Go to $HIVE_HOME/bin/metastore_db
2. Copied the db.lck to db.lck1 and dbex.lck to dbex.lck1
3. Deleted the lock entries from db.lck and dbex.lck
4. Re-login to the hive shell. It is working now.
Thanks
For instance, I use MySQL to hold the metadata db. I copied mysql-connector-java-5.1.22-bin.jar into the $HIVE_HOME/lib folder and my error was resolved.
I was also facing the same problem and figured out that I had both hive-default.xml and hive-site.xml (created manually by me). I moved my hive-site.xml to hive-site.xml-template (as I did not need that file), then started Hive, and it worked fine.
Cheers,
Ajmal
I have faced this issue too, in my case while running the hive command from the command line.
I resolved it by running the kinit command, as I was using kerberized Hive.
kinit -kt <your keytab file location> <kerberos principal>

NoServerForRegionException while running Hadoop MapReduce job on HBase

I am executing a simple Hadoop MapReduce program with HBase as an input and output.
I am getting the error:
java.lang.RuntimeException: org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for OutPut,,99999999999999 after 10 tries.
This exception appeared for us when there was a difference in HBase versions.
Our code was built and run with the 0.94.x HBase jars, whereas the HBase server was running 0.90.3.
When we changed our pom file to the right version (0.90.3) of the HBase jars, it started working fine.
Run bin/hbase hbck and find which machine each region server is running on.
Make sure that all your region servers are up and running.
Start any region server that is down (for example, with hbase-daemon.sh start regionserver).
Even if the region server on a machine is started, it may still fail because of time sync.
Make sure you have NTP installed on all region server nodes and on the HBase master node.
Since HBase is a key-value store that uses the timestamp as an index, it only tolerates a time skew of less than 3 seconds between nodes.
Deleting the WAL logs (or moving them to /tmp) helped in our case:
hdfs dfs -mv /apps/hbase/data/MasterProcWALs/state-*.log /tmp

hdfs data directory "is in an inconsistent state: is incompatible with others."

Sorry, but this is getting on my nerves...
It starts exactly when I load a table through Hive. And dear old Google is not able to help either.
My situation: a single-node setup. The namenode is working properly, but datanode startup fails with this message:
ERROR datanode.DataNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /xxxxxx/hadoop/hdfs-data-dir is in an inconsistent state: is incompatible with others.
I have already tried re-formatting my namenode, but it doesn't help.
I also tried to find ways to "format" my datanode, but no success so far.
Help please...
This site pointed me at a solution after a drive got reformatted:
I ran into a problem with Hadoop where it wouldn't start up after I reformatted a drive. To fix this, make sure the VERSION number is the same across all Hadoop directories:
md5sum /hadoop/sd*/dfs/data/current/VERSION
If they aren't the same across all partitions, you will get the error.
I simply copied the VERSION information from one of the other drives, changed permissions, and restarted HDFS.
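The VERSION check can be sketched like this. The /hadoop/sd*/dfs/data layout comes from the quoted fix; the scratch directories are my addition so the comparison can be demonstrated without touching real data dirs:

```shell
# BASE would normally be /hadoop, with one sd* directory per drive; a scratch
# layout stands in for it here so the steps are safe to try.
BASE=$(mktemp -d)
for d in sda sdb; do
  mkdir -p "$BASE/$d/dfs/data/current"
  echo "namespaceID=123" > "$BASE/$d/dfs/data/current/VERSION"
done

# The checksums must all match; a mismatched drive is the one to fix.
md5sum "$BASE"/sd*/dfs/data/current/VERSION

# To fix a mismatch, copy a good VERSION file over the bad one
# (and restore ownership/permissions to match the good copy):
cp "$BASE/sda/dfs/data/current/VERSION" "$BASE/sdb/dfs/data/current/VERSION"
```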
Found a fix. I needed to:
create a fresh HDFS directory,
remove the write permissions from the group (chmod g-w xxxx), and
remove all temporary files pertaining to hadoop/hdfs from /tmp.
I am convinced that there could be a better/cleaner way to fix this, so I am keeping the question open.

hbase cannot find an existing table

I set up an HBase cluster to store data from OpenTSDB. Recently, due to a reboot of some of the nodes, HBase lost the table "tsdb". I can still see it on HBase's master node page, but when I click on it, it gives me a TableNotFoundException:
org.apache.hadoop.hbase.TableNotFoundException: tsdb
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:952)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:818)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:782)
at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:249)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:213)
at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:171)
......
I entered the hbase shell and tried to scan the 'tsdb' table, but got a similar message:
hbase(main):018:0> scan 'tsdb'
ROW COLUMN+CELL
ERROR: Unknown table tsdb!
However, when I tried to re-create the table, the hbase shell told me it already exists...
hbase(main):013:0> create 'tsdb', {NAME => 't', VERSIONS => 1, BLOOMFILTER=>'ROW'}
ERROR: Table already exists: tsdb!
And I can also list the table in the hbase shell:
hbase(main):001:0> list
TABLE
tsdb
tsdb-uid
2 row(s) in 0.6730 seconds
Taking a look at the log, I found this, which should be the cause of my issue:
2012-05-14 12:06:22,140 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: tsdb, row=tsdb,,99999999999999
at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
It says it cannot find a row for tsdb in .META., but there are indeed tsdb rows in .META.:
hbase(main):002:0> scan '.META.'
ROW COLUMN+CELL
tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:regioninfo, timestamp=1336311752799, value={NAME => 'tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x
x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00 05\x00\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.', STARTKEY => '\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\
\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5. x00\x05\x00\x001', ENDKEY => '\x00\x00\x10O\xA3\x8C\x80\x00\x00\x01\x00\x00\x0B\x00\x00\x02\x00\x00\x19\x00\x00\x03\x00\x00\x1A\x00\x00\x05\x00\x001', ENCODED => 7cd0d2205d9ae5f
cadf843972ec74ec5,}
tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:server, timestamp=1337011527000, value=brycobapd01.usnycbt.amrs.bankofamerica.com:60020
x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00
\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.
tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\ column=info:serverstartcode, timestamp=1337011527000, value=1337011518948
......
tsdb-uid,,1336081042372.a30d8074431c6a31c6a0a30e61fedefa. column=info:server, timestamp=1337011527458, value=bry200163111d.usnycbt.amrs.bankofamerica.com:60020
tsdb-uid,,1336081042372.a30d8074431c6a31c6a0a30e61fedefa. column=info:serverstartcode, timestamp=1337011527458, value=1337011519807
6 row(s) in 0.2950 seconds
Here is the result after I ran hbck on the cluster:
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/249438af5657bf1881a837c23997747e on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/4f8c65fb72910870690b94848879db1c on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/63276708b4ac9f11e241aca8b56e9def on HDFS, but not listed in META or deployed on any region server
ERROR: Region hdfs://slave-node-1:9000/hbase/tsdb/e54ee4def67d7f3b6dba75a3430e0544 on HDFS, but not listed in META or deployed on any region server
ERROR: (region tsdb,\x00\x00\x0FO\xA2\xF1\xD0\x00\x00\x01\x00\x00\x0E\x00\x00\x02\x00\x00\x12\x00\x00\x03\x00\x00\x13\x00\x00\x05\x00\x001,1336311752340.7cd0d2205d9ae5fcadf843972ec74ec5.) First region should start with an empty key. You need to create a new region and regioninfo in HDFS to plug the hole.
ERROR: Found inconsistency in table tsdb
Summary:
-ROOT- is okay.
Number of regions: 1
Deployed on: master-node,60020,1337011518948
.META. is okay.
Number of regions: 1
Deployed on: slave-node-2,60020,1337011519845
Table tsdb is inconsistent.
Number of regions: 5
Deployed on: slave-node-2,60020,1337011519845 slave-node-1,60020,1337011519807 master-node,60020,1337011518948
tsdb-uid is okay.
Number of regions: 1
Deployed on: slave-node-1,60020,1337011519807
5 inconsistencies detected.
Status: INCONSISTENT
I have run
bin/hbase hbck -fix
which unfortunately does not solve my problem.
Could someone help me out with the following?
1. Is it possible to recover the table "tsdb"?
2. If 1 cannot be done, what is the suggested way to gracefully remove 'tsdb' and create a new one?
3. I'd greatly appreciate it if anybody could tell me the recommended way to reboot a node. Currently, I leave my master node always up. For the other nodes, I run these commands immediately after reboot.
command:
# start data node
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start jobtracker
# start hbase
bin/hbase-daemon.sh start zookeeper
bin/hbase-daemon.sh start regionserver
Many Thanks!
A bit late, but maybe it's helpful to other searchers.
Run the ZooKeeper shell hbase zkcli
In the shell run ls /hbase/table
Run rmr /hbase/table/TABLE_NAME
Restart Hbase
I am not quite sure why you are unable to scan it. However, to recreate the table, you can try this:
1) Delete all entries in the .META. table for this table manually, and
2) Delete the directory corresponding to this table from HDFS.
Try creating the table again after that.
If you are using CDH 4.3, then the path in ZooKeeper should be /hbase/table94/.
To expand on @Devin Bayer's answer, run:
delete /hbase/table/<name_of_zombie_table>
if you find any zombie tables being maintained by ZooKeeper. For more help on this issue, you should google 'HBase zombie tables'.
Try to fix meta:
hbase hbck
hbase hbck -fixMeta
hbase hbck -fixAssignments
hbase hbck -fixReferenceFiles
and then try again.
More instructions on deleting the tables:
~/hbase-0.94.12/bin/hbase shell
> truncate 'tsdb'
> truncate 'tsdb-meta'
> truncate 'tsdb-uid'
> truncate 'tsdb-tree'
> exit
I also had to restart the tsd daemon.
I get a similar error message when I try an HBase connection from a Java client on a machine that doesn't have the TCP privileges to access the HBase machines.
The table indeed exists when I run hbase shell on the HBase machine itself.
Does OpenTSDB have all the privileges/port configuration needed to access the HBase machine?
I face these issues at my workplace too. I usually either delete the znodes and then remove the corresponding table, or restart HBase (both the HMaster and the HRegionServers) to get the hbck status back to OK.
It is enough to remove the specified table from your ZooKeeper path.
For example, if zookeeper.znode.parent is configured to blob in hbase-site.xml, start zkCli.sh on your ZooKeeper server and remove that directory with the command rmr /blob/table/tsdb.
hbase-clean.sh --cleanZk
It works well, and it's simple enough.
