Given that HBase is a database with its files stored in HDFS, how does it enable random access to a singular piece of data within HDFS? By which method is this accomplished?
From the Apache HBase Reference Guide:
HBase internally puts your data in indexed "StoreFiles" that exist on HDFS for high-speed lookups. See the Chapter 5, Data Model and the rest of this chapter for more information on how HBase achieves its goals.
Scanning both chapters didn't reveal a high-level answer for this question.
So how does HBase enable random access to files stored in HDFS?
HBase stores data in HFiles that are indexed (sorted) by their key. Given a random key, the client can determine which region server to ask for the row from. The region server can determine which region to retrieve the row from, and then do a binary search through the region to access the correct row. This is accomplished by having sufficient statistics to know the number of blocks, block size, start key, and end key.
For example: a table may contain 10 TB of data. But, the table is broken up into regions of size 4GB. Each region has a start/end key. The client can get the list of regions for a table and determine which region has the key it is looking for. Regions are broken up into blocks, so that the region server can do a binary search through its blocks. Blocks are essentially long lists of key, attribute, value, version. If you know what the starting key is for each block, you can determine one file to access, and what the byte-offset (block) is to start reading to see where you are in the binary search.
hbase acess hdfs file by using hfile . you can check the url to get the detail: http://hbase.apache.org/book/hfilev2.html
Related
In most cases, Geode allocates one partitioned region for each data
structure. For example, each Sorted Set is allocated its own
partitioned region, in which the key is the user data and the value is
the user-provided score, and entries are indexed by score. The two
exceptions to this design are data types String and HyperLogLog. All
Strings are allocated to a single partitioned region.
For WAN replication, we create a gateway-sender and then assign this sender to a particular region for replication. With redis adaptor, we only have two regions by default as written above. Since a region for a "set" data structure will be created only when we add a key for it. How can we replicate those regions with redis adaptor?
https://cwiki.apache.org/confluence/display/GEODE/GemFire+Multi-site+%28WAN%29+Architecture
Steps for WAN replication done by me:
start locator --name=dc1 --properties-file=gemfire.properties
start server --name=redis --redis-port=11211 --J=-Dgemfireredis.regiontype=REPLICATE
create gateway-sender --id=dc1 --remote-distributed-system-id=3
create gateway-receiver
Now, I list regions which are currently available.
Cluster-1 gfsh>list regions
List of regions
---------------
ReDiS_HlL
ReDiS_StRiNgS
Assign both the regions to the gateway-sender
alter region --name=ReDiS_StRiNgS --gateway-sender-id=dc1
It is able to replicate the strings but not other data structures.
gemfire.properties
mcast-port=0
locators=1dc1[10334]
distributed-system-id=1
remote-locators=dc2[10334]
I have ran the same commands on dc2.
Before creating the region for other data structures, the Redis adapter implementation looks as the cache.xml to see if the region is defined. So, in your case, you can define a region with a gateway-sender in cache.xml while starting the server. Please see this reference for creating the cache.xml file this hierarchy information will also be useful. Once you have you can run the following command:
gfsh>start server --cache-xml-file=/path/to/cache.xml
I have got a question regarding hbase databases. We access the data first by defining a row key, column family and in the last by column qualifier.
My question is will HBase store all column families with the same row key together in one node or not?
UPDATE: As an example, I want to multiply val1 and val2 in a map/reduce job. While val1 and val2 are stored in database like this: Row=00000 Column Family:M, m000001_1234567=val1, Row=00000 Column Family: R, r000001_1234567=val2. Can I make sure that I have access to both val1 and val2 in the same node running the map?
As you might be aware its actually the HFile that has the actual key value data stored and it would be distributed accross the datanodes. The zookeeper / HLog /Memestore help in locating the rowkey data and retrieve it.
The Key-value storage would be grouped and stored in each node , say keys [A-L] goes to one node and the rest [M-z] to another node , considering 2 node scenario.
Question 1: Will HBase store all column families with the same row key together in one node?
Yes, but there are a few special cases.
The recommened way to set up an HBase cluster is the collocated (or co-located) configuration: use the some machines for HDFS Data Nodes and HBase Region Servers (in contrast to dedicating the machines to specifically one of these roles, in which case all reads would be remote and performance would suffer). In such a setup, when a Region Server saves data to HDFS, the first replica of the data will always get saved to the local disk. However, the placement of any further replicas are not consistent - different parts may be placed on different nodes. This means that if a machine dies, no data will get lost, but the data of that region will not be found on any single machine any more, bit will be scattered all around the cluster instead. Even in this case, a single row will probably still to be stored on a single Data Node, but it won't be local to the new Region Server any more.
This is not the only way how data locality can get lost, previously even restarting HBase had this effect. A lot of older posts mention this, but this has actually been fixed since then in HBASE-2896.
Even if data locality gets lost, the next major compaction will restore it.
Sources and recommended reading:
How Scaling Really Works in Apache HBase
HBase and data locality
HBase File Locality in HDFS
Major compaction and data locality
Question 2: When reading an HBase table from a MapReduce job, does each mapper run on the node where the data it uses is stored?
My understanding is that apart from the special case mentioned above, the answer is yes, but I couldn't find this explicitly mentioned anywhere.
Sources and recommended reading:
Understanding Map Reduce on HTable
The MapReduce Integration section of Tutorial: HBase
What is the fastest and most efficient way to bulk delete hbase records? Hbase client API or a MapReduce job?
The HBase Client API does not allow to do bulk deletions unless you know the row keys of the cells you want to delete.
The BulkDeleteEndpoint can be leveraged to do bulk deletes based on the results of a scanner.
The fastest and most efficient way for large contiguous datasets is to drop entire regions by deleting their HDFS directories and removing them from the META table. This incurs virtually no IO so it's arguably almost free.
Note, however that this is not yet available directly through the high level APIs, so you have to script / code it in order to get it done.
Here's an example, from the HBase mailing lists, of how you could do it using the shell.
Close the region from the shell (read up on how this works using shell
help -- don't do unassign)
Then just delete the content of the region in HDFS once the region is
closed (the region dir name in HDFS is the same as the region encoded name,
the last portion of a region name -- check refguide).
After the delete in HDFS, call assign region.
Source http://search-hadoop.com/m/YGbbl9ZaSQ2HLT&subj=Re+Delete+a+region+from+hbase
HBase Client API is faster because you perform operations directly on the database while using MapReduce which means that tasks will run over jobs and that's take time according to my experience.
Over than that HBase will allow you to run specific operations in Column families that MapReduce can't do.
I am trying to import some HDFS data to an already existing HBase table.
The table I have was created with 2 column families, and with all the default settings that HBase comes with when creating a new table.
The table is already filled up with a large volume of data, and it has 98 online regions.
The type of row keys it has, are under the form of(simplified version) :
2-CHARS_ID + 6-DIGIT-NUMBER + 3 X 32-CHAR-MD5-HASH.
Example of key: IP281113ec46d86301568200d510f47095d6c99db18630b0a23ea873988b0fb12597e05cc6b30c479dfb9e9d627ccfc4c5dd5fef.
The data I want to import is on HDFS, and I am using a Map-Reduce process to read it. I emit Put objects from my mapper, which correspond to each line read from the HDFS files.
The existing data has keys which will all start with "XX181113".
The job is configured with :
HFileOutputFormat.configureIncrementalLoad(job, hTable)
Once I start the process, I see it configured with 98 reducers (equal to the online regions the table has), but the issue is that 4 reducers got 100% of the data split among them, while the rest did nothing.
As a result, I see only 4 folder outputs, which have a very large size.
Are these files corresponding to 4 new regions which I can then import to the table? And if so, why only 4, while 98 reducers get created?
Reading HBase docs
In order to function efficiently, HFileOutputFormat must be configured such that each output HFile fits within a single region. In order to do this, jobs whose output will be bulk loaded into HBase use Hadoop's TotalOrderPartitioner class to partition the map output into disjoint ranges of the key space, corresponding to the key ranges of the regions in the table.
confused me even more as to why I get this behaviour.
Thanks!
The number of maps you'd get doesn't depend on the number of regions you have in the table but rather how the data is split into regions (each region contains a range of keys). since you mention that all your new data start with the same prefix it is likely it only fit into a few regions.
You can pre split your table so that the new data would be divided between more regions
i have a problem while write data in hbase.I have 4 region server.when i write data and use radom key ,data write to any region but they are in one region server.One server are busy, three server are free.How do write regularity in all region server.
HBase partitions it's tables across region servers. See :
How HBase partitions table across regionservers?
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
I am not sure how random or far apart your random key should be to be able to write to different partitions.
See discussions on hbase.hregion.max.filesize and base.hregion.maxfilesize which suggests that tables are split to new regions when the appropriate data size has been reached.