I am trying to understand the read and write paths of HBase. When a row is updated via the put command, the data is written to the memstore. But suppose that for that key an old value was already present in the block cache.
At this point a value X is present in the block cache and a new value Y is present in the memstore. If I execute a read command, I get Y. But isn't X the expected value? As per my understanding, whenever a read comes in, the block cache is checked before the memstore.
Is my understanding wrong? Or is there an intermediate step where the block cache gets updated or invalidated?
This interaction is missing from most of the docs. As per my understanding, before the memstore is updated, the corresponding block, if present in the block cache, is invalidated, precisely to avoid the case you are highlighting.
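For what it's worth, here is a minimal sketch of that scenario using the HBase Java client API from Scala (the table name t, column family cf, qualifier q and row key row1 are all made up for illustration); whatever the mechanism, the final get returns the newest value Y:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
val admin = connection.getAdmin
val table = connection.getTable(TableName.valueOf("t"))
val (cf, q, row) = (Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("row1"))

// write X and flush so it ends up in an HFile
table.put(new Put(row).addColumn(cf, q, Bytes.toBytes("X")))
admin.flush(TableName.valueOf("t"))

// read once so the block holding X is pulled into the block cache
table.get(new Get(row))

// write Y: it now sits only in the memstore
table.put(new Put(row).addColumn(cf, q, Bytes.toBytes("Y")))

// the read returns the newest cell, i.e. Y, not the cached X
println(Bytes.toString(table.get(new Get(row)).getValue(cf, q)))

table.close(); connection.close()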
Related
I am trying to understand how reads and writes happen in HBase and how HBase does its caching.
From various articles and videos, I found that a read merge happens when a read request is made to HBase. What I understood is:
Whenever a read request is made, the block cache is checked first for the data.
Then the memstore is checked. If the data is found in either the block cache or the memstore, it is sent to the client.
Otherwise it is fetched from the HFiles.
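To make my understanding concrete, this is roughly how I picture the flow (purely illustrative Scala; the Maps are just stand-ins I made up, not HBase internals):

case class Cell(value: String, timestamp: Long)
val blockCache = Map.empty[String, Cell]   // cached HFile blocks
val memstore   = Map.empty[String, Cell]   // in-memory edits not yet flushed
val hfiles     = Map.empty[String, Cell]   // data read from disk

def read(rowKey: String): Option[Cell] =
  blockCache.get(rowKey)                   // 1. block cache is checked first
    .orElse(memstore.get(rowKey))          // 2. then the memstore (or is it skipped on a cache hit? see my doubts below)
    .orElse(hfiles.get(rowKey))            // 3. otherwise fetched from the HFiles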
My doubts are:
Are both the block cache and the memstore always checked for the data, or is the memstore ignored if the data is found in the block cache?
If the memstore is not checked (because the data was found in the block cache), how will the client get the latest value if there was an edit in the memstore?
I created a new table and added one row. I issued a get command to fetch the data. I got the data, but I didn't see any change in the block cache's cache hits or reads. Why?
I know these are multiple questions, but they are all linked to the read merge and HBase caching. I need clarity on these concepts and could not find any in the documentation.
In the Spark shell I use the code below to read from a CSV file:
val df = spark.read.format("org.apache.spark.csv").option("header", "true").option("mode", "DROPMALFORMED").csv("/opt/person.csv") //spark here is the spark session
df.show()
Assume this displays 10 rows. If I add a new row to the CSV by editing the file, would calling df.show() again show the new row? If so, does that mean the DataFrame reads from the external source (in this case a CSV file) on every action?
Note that I am neither caching the DataFrame nor recreating it using the Spark session.
After each action, Spark forgets about the loaded data and any intermediate values used along the way.
So if you invoke 4 actions one after another, it computes everything from the beginning each time.
The reason is simple: Spark works by building a DAG, which describes the path of operations from reading the data to the action, and then it executes it.
That is why cache and broadcast variables exist. The onus is on the developer to cache the data or DataFrame if they know they are going to reuse it N times.
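For example, a small sketch in the spark shell (the path and options are just placeholders):

val df = spark.read.option("header", "true").csv("/opt/person.csv")
df.count()                // action 1: reads the file
df.show()                 // action 2: reads the file again from scratch

val cached = df.cache()   // mark the DataFrame for caching (lazy)
cached.count()            // the first action materializes the cache
cached.show()             // subsequent actions are served from the cached data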
TL;DR: A DataFrame is no different from an RDD. You can expect the same rules to apply.
With a simple plan like this the answer is yes. It will read the data for every show, although if the action doesn't require all the data (like here, where show() only needs the first few rows) it won't read the complete file.
In the general case (complex execution plans) data can be accessed from the shuffle files.
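If you want to see this yourself, one rough way (assuming the df from the question) is to look at the physical plan before and after caching:

df.explain()              // shows a FileScan over the CSV: every action re-scans the file
df.cache()                // register the plan with the cache manager
df.count()                // materializes the cache
df.explain()              // now shows InMemoryTableScan / InMemoryRelation: later actions read the cache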
I am a newbie in HBase, so could someone please clarify my question about row-level deletes in HBase?
Say we have 10 records in a table, so every record is stored in a separate HFile. If we try to delete a record, it deletes the actual HFile. This is how I understood row-level deletes are handled in HBase.
But during compaction, smaller HFiles are merged into a larger HFile.
So all the data will be stored together in larger HFiles. Now, how are row-level deletes handled if all the data is stored together?
Basically, the row just gets marked for deletion, and the actual deletion happens during the next compaction. Please see the Deletion in HBase article for details.
An HFile is not created as soon as you insert data. The data is first stored in the memstore; once the memstore is sufficiently large, it is flushed to an HFile. A new HFile is not created for every record or row. Also, since records are first held in memory, they are sorted there and then flushed to the HFile; this is why records in HFiles are always sorted.
HFiles are immutable (any file in HDFS, for that matter, is expected to be immutable). Deletion of records does not happen right away; they are marked for deletion. When the system runs a compaction (minor or major), the records marked for deletion are actually removed, and the new HFile does not contain them. If no compaction has run, the records still exist, but they are masked from the results whenever queried.
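If you want to see the delete markers yourself, here is a rough sketch using the HBase Java client from Scala (table t and family cf are hypothetical); a raw scan returns the tombstones that a normal scan hides:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Scan}
import org.apache.hadoop.hbase.util.Bytes

val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
val table = connection.getTable(TableName.valueOf("t"))

// deleting a row only writes a tombstone marker; the HFiles are untouched
table.delete(new Delete(Bytes.toBytes("row1")))

// a raw scan returns delete markers alongside the still-present cells,
// showing that nothing has been physically removed yet
val scanner = table.getScanner(new Scan().setRaw(true))
scanner.forEach(result => println(result))
scanner.close()

// after a major compaction the marked cells and the markers themselves are gone
connection.getAdmin.majorCompact(TableName.valueOf("t"))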
I have a little experience with Cassandra, but I have one question regarding the Cassandra read process.
Suppose we have 7 SSTables for a given table in our Cassandra DB. Now, if we perform a read query for data that is not in the memtable, Cassandra will look into the SSTables. My question is:
During this process, will Cassandra load all 7 SSTables into the memtable, or will it just look into the SSTables and load only the relevant rows instead of loading all the SSTables?
Thanks in advance!
And please do correct me if I have interpreted something wrong.
It would also be great if someone could explain, or point to better resources on, how SSTables work.
During this process will cassandra load all the sstables(7)
No, Cassandra would not load all 7 SSTables. Each SSTable has a Bloom filter (held in memory) that indicates whether the data might be in that SSTable.
If the Bloom filter indicates the data might be in the SSTable, Cassandra looks into the partition key cache and, via the compression offset map (in memory), locates the compressed block that holds the data we are looking for.
If the key is found in the partition key cache, the compressed block is read (disk I/O) to get the data.
If not, it looks into the partition summary to get the location of the index entry, reads that location (disk I/O) into memory, and then continues with the compression offset map flow described earlier.
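To make that flow concrete, here is a purely illustrative sketch in Scala; none of these names are Cassandra's actual classes or APIs, they just model the steps above:

case class Key(value: String)
case class Row(data: String)

trait SSTable {
  def bloomFilterMightContain(key: Key): Boolean      // in-memory Bloom filter
  def keyCacheOffset(key: Key): Option[Long]          // partition key cache (in memory)
  def summaryThenIndexOffset(key: Key): Option[Long]  // partition summary (memory) + index entry (one disk read)
  def readBlockAt(offset: Long, key: Key): Row        // compression offset map -> read one compressed block
}

def readFromSSTable(key: Key, sstable: SSTable): Option[Row] =
  if (!sstable.bloomFilterMightContain(key)) None      // "definitely not here": skip this SSTable entirely
  else sstable.keyCacheOffset(key)                     // hit in the partition key cache?
    .orElse(sstable.summaryThenIndexOffset(key))       // otherwise summary -> index entry
    .map(offset => sstable.readBlockAt(offset, key))   // read only the relevant block, never the whole file

// only SSTables whose Bloom filter says "maybe" are touched, and only single blocks are read
def read(key: Key, sstables: Seq[SSTable]): Seq[Row] =
  sstables.flatMap(readFromSSTable(key, _))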
To start with, I think this Cassandra Reads link should help; it depicts the read path pictorially.
One more thing: there is also a row cache that holds hot (frequently accessed) rows; if the row is found there, the SSTables are not hit at all.
Go through this row cache link to understand the row cache and the partition key cache.
Another great resource is the presentation Understanding Cassandra Table Options, shared by Jeff Jirsa. It is really worth going through.
On a different note, compaction happens periodically to reduce the number of SSTables and to remove rows based on tombstones.
I read this in the Apache documentation:
InputSplit represents the data to be processed by an individual Mapper.
Typically, it presents a byte-oriented view on the input and is the responsibility of RecordReader of the job to process this and present a record-oriented view.
Link - https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/mapred/InputSplit.html
Can somebody explain the difference between byte-oriented view and record-oriented view?
HDFS splits files into blocks (the byte-oriented view) so that each block is less than or equal to the configured block size. The split is therefore not a logical one: part of the last record may reside in one block while the rest of it sits in another. That is fine for storage, but at processing time a partial record in a block cannot be processed as it is. This is where the record-oriented view comes in: it ensures the remaining part of the last record is fetched from the other block, so the Mapper sees only complete records. This is the input split (record-oriented view).
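To illustrate the idea, here is a conceptual sketch in Scala (not Hadoop's actual LineRecordReader; it just models a newline-delimited file processed as byte ranges):

def readRecords(file: Array[Byte], start: Int, end: Int): Seq[String] = {
  // read one '\n'-terminated line starting at `from`; returns the line and the position after it
  def readLine(from: Int): (String, Int) = {
    var p = from
    while (p < file.length && file(p) != '\n'.toByte) p += 1
    (new String(file.slice(from, p), "UTF-8"), math.min(p + 1, file.length))
  }
  var pos = start
  // every split except the first discards its (possibly partial) first line:
  // the reader of the previous split is responsible for that record
  if (start != 0) pos = readLine(pos)._2
  val records = scala.collection.mutable.Buffer[String]()
  // keep reading as long as the record starts inside this split; the last record
  // may run past `end` into the next block, so no record is ever cut in half
  while (pos <= end && pos < file.length) {
    val (line, next) = readLine(pos)
    records += line
    pos = next
  }
  records.toSeq
}

The point of the sketch is that the skip-first-line rule and the read-one-record-past-the-end rule complement each other, so every record is emitted exactly once even though the byte-oriented split cuts records in half.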