HBase: HFile stats not changed after flush

I have an HBase table 'emp'. I created some rows in it using hbase-shell, among which the biggest rowkey is 123456789.
When I check the HBase UI (the web console), following the path below:
regions -> emp,,1582232348771.4f2d545621630d98353802540fbf8b00. -> hdfs://namenode:9000/hbase/data/default/emp/4f2d545621630d98353802540fbf8b00/personal data/15a04db0d3a44d2ca7e12ab05684c876 (store file)
I can see "Key of biggest row: 123456789", so everything is good.
But the problem came when I deleted the row containing the rowkey 123456789 using hbase-shell. I also put some other rows, then finally flushed the table with flush 'emp'.
I see a second HFile generated, but the "Key of biggest row" of the first HFile is still 123456789.
So I am very confused: this row no longer exists in my HBase table, and I already did a flush (so everything in the memstore should be in HFiles). Why do the stats still show this rowkey? What is going on behind the scenes?
And how can I update the stats?

You're correct that everything in the memstore is now in HFiles, but until a compaction takes place the deleted row still exists on disk: the delete only writes a marker into the new, second HFile, while the first HFile is immutable, so its file-level stats keep reporting 123456789 as the biggest row.
If you force a compaction with major_compact 'table_name', 'col_fam', you should see this record disappear (and be left with one HFile whose stats no longer mention the deleted row).
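If you'd rather trigger the flush and major compaction programmatically instead of from hbase-shell, here is a minimal sketch using the HBase client Admin API from Scala, assuming the 'emp' table from the question and an hbase-site.xml on the classpath (the object and variable names are just for illustration):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.ConnectionFactory

    object CompactEmp {
      def main(args: Array[String]): Unit = {
        // The client picks up the cluster location from hbase-site.xml on the classpath.
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val admin = conn.getAdmin
        try {
          val table = TableName.valueOf("emp")
          admin.flush(table)        // push anything left in the memstore into HFiles
          admin.majorCompact(table) // rewrite the HFiles, dropping deleted cells and their markers
          // majorCompact only schedules the compaction; the region server runs it in the background.
        } finally {
          admin.close()
          conn.close()
        }
      }
    }

Once the major compaction has completed, the per-file stats shown in the UI come from the newly written HFile, so the deleted rowkey should no longer appear.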

Related

How to delete empty partitions in CrateDB?

CrateDB: 4.x.x
We have one table that we partition by day.
We take a snapshot of the table based on that partition, and after taking the backup we delete that day's data.
Due to the many partitions, the shard count is more than 2000, while the configured number of shards per partition is 6.
I have observed that old partitions hold no data but still exist in the database.
Because of this, it takes more time for the cluster to become healthy and available for writes after restarting CrateDB.
So, is there any way to delete those partitions?
Is there any way to stop replication of data on cluster startup? It takes too much time for the cluster to become healthy, and the table is not writable until that process finishes.
Any solution for this issue would be a great help.
You should be able to delete empty partitions with a DELETE with an exact match on the partitioned by column. Like DELETE FROM <tbl> WHERE <partitioned_by_column> = <value>
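For example, with a hypothetical table metrics partitioned by a day column, running DELETE FROM metrics WHERE day = '2024-01-01' should let CrateDB drop that whole (empty) partition and its shards outright, because the WHERE clause matches the partition column exactly.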

HDFS File Compaction with continuous ingestion

We have a few tables in HDFS that are getting approx. 40k new files per day. We need to compact these tables every two weeks, and for that we need to stop ingestion.
We have a Spark ingestion job getting data from Kafka and adding it to HDFS (Hive external tables) every 30 minutes. The data is queried as soon as it is ingested; our SLA is less than an hour, so we cannot increase the batch interval.
The tables are partitioned on two fields. We constantly receive older data, so most of the partitions are updated during each ingestion batch, e.g.:
/user/head/warehouse/main_table/state=CA/store=macys/part-00000-017258f8-aaa-bbb-ccc-wefdsds.c000.snappy.parquet
We are looking into ways to reduce the number of files created, but even with that we will have to compact every 3-4 weeks, if not every two.
As most of the partitions are updated constantly, we need to stop ingestion (~1 day) before starting compaction, which is impacting our users.
Is there a way to compact automatically without stopping ingestion?
The chosen partitioning scheme is somewhat unfortunate. Still, there are a couple of things you can do. I'm relying on the fact that you can change a partition's location atomically in Hive (alter table ... partition ... set location):
1. Copy a partition's HDFS directory to a different location.
2. Compact the copied data.
3. Copy the new files that were ingested since step 1.
4. Run "alter table ... partition ... set location" to point Hive to the new, compacted location.
5. Start ingesting to this new location (if this step is tricky, you can just as well replace the "small" files in the original partition location with their compacted versions and run "alter table ... partition ... set location" again to point Hive back to the original partition location).
You'll have to keep this process running, iterating partition by partition, on a continuous basis.
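To make this concrete, below is a rough Spark (Scala) sketch of steps 1, 2 and 4 for a single partition; the table name, paths and target file count are illustrative assumptions, and step 3 (picking up files ingested while the copy ran) plus the loop over partitions are left out:

    import org.apache.spark.sql.SparkSession

    object CompactOnePartition {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("compact-one-partition")
          .enableHiveSupport()   // so the ALTER TABLE below goes through the Hive metastore
          .getOrCreate()

        // Hypothetical locations -- adjust to your layout and namenode.
        val srcDir  = "hdfs:///user/head/warehouse/main_table/state=CA/store=macys"
        val destDir = "hdfs:///user/head/warehouse/main_table_compacted/state=CA/store=macys"

        // Steps 1-2: read the many small files and rewrite them as a handful of larger ones.
        spark.read.parquet(srcDir)
          .coalesce(4)           // target file count is an assumption; tune it to your data volume
          .write.mode("overwrite")
          .parquet(destDir)

        // Step 4: atomically repoint the Hive partition at the compacted copy.
        spark.sql(
          s"ALTER TABLE main_table PARTITION (state='CA', store='macys') SET LOCATION '$destDir'")

        spark.stop()
      }
    }

A scheduler (cron, Oozie, Airflow, ...) would then run this partition by partition so the table is compacted continuously.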
Thank you Facha for your suggestions, I really appreciate them.
I am pretty new to HDFS concepts, so please don't mind the basic questions.
What would be the impact on running queries that are accessing these specific files while swapping the uncompacted files for the compacted ones (alter table ... partition ... set location)? I believe the queries might fail. How can we minimize the impact?
"Copy a partition's HDFS directory to a different location"
As we have two partition columns in one table, state and store, will I have to iterate through each sub-partition?
/tableName/state=CA/store=macys/file1.parquet
/tableName/state=CA/store=macys/file2.parquet
/tableName/state=CA/store=JCP/file1.parquet
/tableName/state=CA/store=JCP/file2.parquet
/tableName/state=NY/store=macys/file1.parquet
/tableName/state=NY/store=macys/file2.parquet
/tableName/state=NY/store=JCP/file1.parquet
/tableName/state=NY/store=JCP/file2.parquet
For each state:
    for each store:
        get the list of files in /tableName/state=$STATE/store=$STORE (to replace later)
        compact /tableName/state=$STATE/store=$STORE (Spark job?)
        replace the uncompacted files with the compacted files
        alter table ... partition ... set location
I would prefer your other suggestion in step 5: "just as well replace the 'small' files in the original partition location with their compacted version".
How would I go about implementing it? Would it be best done with scripting, Scala, or some other programming language? I have basic knowledge of scripting, good experience in Java, and I'm new to Scala but can learn it in a couple of days.
Regards,
P

Having trouble importing new data into an existing table using Hue (Hadoop)

When I load new data into an existing table and then run select count(1) to get the total number of rows loaded, I only get the count from one HDFS file.
The row count only reflects the contents of one HDFS file.
(Screenshots in the original question showed the Hue import dialog used to load the "new data", the total count in MySQL, the total count in Hue, and the HDFS file browser.)
Do you have any idea what I'm doing wrong?
Try this:
invalidate metadata default.movie;
Most probably you are using Impala as the engine to retrieve the data, and this command reloads the metadata. From the Impala documentation:
By default, the cached metadata for all tables is flushed. If you specify a table name, only the metadata for that one table is flushed. Even for a single table, INVALIDATE METADATA is more expensive than REFRESH, so prefer REFRESH in the common case where you add new data files for an existing table.
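For example, if new files are later added to the same (assumed) default.movie table, running refresh default.movie; reloads just that table's file metadata and is much cheaper than another full invalidate metadata.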
If you want to go further, check this out.

How are row-level deletes handled in HBase?

I am a newbie in HBase, so could someone please clarify my question about row-level deletes in HBase?
Say we have 10 records in a table, so every record will be stored in a separate HFile. So if we try to delete any record, it will delete the actual HFile. That is how I understood row-level deletes to be handled in HBase.
But during compaction, smaller HFiles are merged into a larger HFile, so all the data ends up stored together in larger HFiles. Now, how are row-level deletes handled if all the data is stored together?
Basically, the record just gets marked for deletion, and the actual deletion happens during the next major compaction. Please see the Deletion in HBase article for details.
An HFile is not created as soon as you insert data. First the data is stored in the memstore. Once the memstore is sufficiently large, it is flushed to an HFile. A new HFile is not created for every record or row. Also remember that since records are held in memory, they are sorted there and then flushed to the HFile; this is why records in HFiles are always sorted.
HFiles are immutable (any file in HDFS, for that matter, is expected to be immutable). Deletion of records does not happen right away; they are marked for deletion. When a major compaction runs, the records marked for deletion (and the markers themselves) are actually removed, and the new HFile does not contain them. Until then, the record still exists on disk; however, it is masked from display whenever it is queried.
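If you want to see the delete marker for yourself, here is a small sketch using the HBase client API from Scala, borrowing the 'emp' table and rowkey 123456789 from the earlier question (a raw scan returns delete markers and the cells they mask instead of hiding them):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Scan}
    import org.apache.hadoop.hbase.util.Bytes

    object ShowTombstone {
      def main(args: Array[String]): Unit = {
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("emp"))
        try {
          // This only writes a delete marker (tombstone); the old cells stay in their immutable HFiles.
          table.delete(new Delete(Bytes.toBytes("123456789")))

          // A raw scan shows the tombstone and the masked cells until a major compaction purges them.
          val scan = new Scan()
          scan.setRaw(true)
          val scanner = table.getScanner(scan)
          var result = scanner.next()
          while (result != null) {
            println(result)
            result = scanner.next()
          }
          scanner.close()
        } finally {
          table.close()
          conn.close()
        }
      }
    }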

How exactly does the Cassandra read procedure work?

I have a little experience with Cassandra, but I have one question regarding the Cassandra read process.
Suppose we have 7 SSTables for a given table in our Cassandra DB. Now, if we perform a read query that cannot be served from the memtable, Cassandra will look into the SSTables. My question is:
During this process, will Cassandra load all 7 SSTables into memory, or will it just look into all the SSTables and load only the relevant rows instead of loading the SSTables in full?
Thanking you in advance!
And please do correct me if I have interpreted something wrong.
It would also be great if someone could explain or point to better resources about how SSTables work.
"During this process will cassandra load all the sstables(7)"
No, Cassandra wouldn't load all 7 SSTables. Each SSTable has a Bloom filter (in memory) that indicates whether the data might be present in that SSTable.
If the Bloom filter indicates that the data may be in the SSTable, Cassandra looks into the partition key cache and, on a hit, uses the compression offset map (in memory) to locate the compressed block that holds the data we are looking for.
If found in the partition key cache, the compressed block is then read (I/O) to get the data.
If not found, it looks into the partition summary to get the location of the index entry, reads that location (I/O) into memory, and then continues with the compression offset map flow described earlier.
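If you want to check how many SSTables your reads actually touch in practice, nodetool tablehistograms <keyspace> <table> (called cfhistograms in older releases) reports the per-read SSTable count percentiles for that table.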
To start with, this Cassandra Reads link should help; it depicts the read path pictorially.
One more thing: there is also a row cache that holds hot (frequently accessed) rows; if the row is found in the row cache, the SSTables are not hit or loaded at all.
Go through this rowcache link to understand row cache and partition key cache.
Another great presentation, shared by Jeff Jirsa, is Understanding Cassandra Table Options. It is really worth going through.
On a different note, compaction happens periodically to reduce the number of SSTables and to remove rows based on tombstones.
