Triple composite key in Hbase - hadoop

I have a use case where I want 3 level composite key.
For eg.
Rollnumber:class:friendsRollNumber
I would want to query "Get all friends for a particular roll number and a class"
I could not find sufficient examples over net to use composite keys and range scans over it together.
Currently, I am doing the following.
byte[]rowkey = Bytes.add(Bytes.tobytes("myrollnumber"),Bytes.tobytes("myClass"),Bytes.tobytes("myFriendsRollNumber"))
This is the way I form row key .
Will it select region server based on myRollNumber and myClass ? If not How can I do that ?
Also , For range scan , what is the correct way to use it. I am doing it in the following way. I am still in process of writing the code , so have not tested it.
Scan s = new Scan();
Filter f = New PrefixFilter(Bytes.tobytes("myrollnumber"),Bytes.tobytes("class"))
s.setFilter(f)
Is the above way correct to scan as per my requirement ?
Also , how to get the individual parts of rowKey from the scanner ?

Try this:
byte[] prefix=Bytes.toBytes("rollnumber" + "class");
Scan scan = new Scan(prefix));
PrefixFilter prefixFilter = new PrefixFilter(prefix);
scan.addFilter(prefixFilter);
ResultScanner resultScanner = table.getScanner(scan);

For your requirement, you can use start stop row feature of Scan. You do not need a filter.
byte[]startRow= Bytes.add(Bytes.tobytes("search_rollnumber"),Bytes.tobytes("myClass"));
byte[]stopRow= Bytes.add(Bytes.tobytes("search_rollnumber"),Bytes.tobytes("myClass"));
stopRow[stopRow.length - 1]++;
Scan s = new Scan(startRow, stopRow);
Using this scan you will get all rows starting with search_rollnumberMyClass.
I am not sure you use : in your rowkey. But i think you should if both rollnumber and class are represented as integer.

Related

MongoTemplate, Criteria and Hashmap

Good Morning.
I'm starting to learn some mongo right now.
I'm facing this problem right now, and i'm start to think if this is the best approach to resolve this "task", or if is bettert to turn around and write another way to solve this "problem".
My goal is to iterate a simple map of values (key) and vector\array (values)
My test map will be recived by a rest layer.
{
"1":["1","2","3"]
}
now after some logic, i need to use the Dao in order to look into db.
The Key will be "realm", the value inside vector are "castle".
Every Realm have some castle and every castle have some "rules".
I need to find every rules for each avaible combination of realm-castle.
AccessLevel is a pojo labeled by #Document annotation and it will have various params, such as castle and realm (both simple int)
So the idea will be to iterate a map and write a long query for every combination of key-value.
public AccessLevel searchAccessLevel(Map<String,Integer[]> request){
Query q = new Query();
Criteria c = new Criteria();
request.forEach((k,v)-> {
for (int i: Arrays.asList(v)
) {
q.addCriteria(c.andOperator(
Criteria.where("realm").is(k),
Criteria.where("castle").is(v))
);
}
});
List<AccessLevel> response=db.find(q,AccessLevel.class);
for (AccessLevel x: response
) {
System.out.println(x.toString());
}
As you can see i'm facing an error concerning $and.
Due to limitations of the org.bson.Document, you can't add a second '$and' expression specified as [...]
it seems mongo can't handle various $and, something i'm pretty used to abuse over sql
select * from a where id =1 and id=2 and id=3 and id=4
(not the best, sincei can use IN(), but sql allow me)
So, the point is: mongo can actualy work in this way and i need to dig more into the problem, or i need to do another approach, like using criterion.in(), and make N interrogation via mongotemplate one for every key in my Map?

Using Spring MongoTemplate to update nested arrays in MongoDB

Can anyone help with a MongoTemplate question?
I have got a record structure which has nested arrays and I want to update a specific entry in a 2nd level array. I can find the appropriate entry easier enough by the Set path needs the indexes of both array entries & the '$' only refers to the leaf item. For example if I had an array of teams which contained an array of player I need to generate an update path like :
val query = Query(Criteria.where( "teams.players.playerId").`is`(playerId))
val update = Update()
with(update) {
set("teams.$.players.$.name", player.name)
This fails as the '$' can only be used once to refer to the index in the players array, I need a way to generate the equivalent '$' for the index in the teams array.
I am thinking that I need to use a separate Aggregate query using the something like this but I can't get it to work.
project().and(ArrayOperators.arrayOf( "markets").indexOf("")).`as`("index")
Any ideas for this Mongo newbie?
For others who is facing similar issue, One option is to use arrayFilters in UpdateOptions. But looks like mongotemplate in spring does not yet support the use of UpdateOptions directly. Hence, what can be done is:
Sample for document which contain object with arrays of arrayObj (which contain another arrays of arrayObj).
Bson filter = eq("arrayObj.arrayObj.id", "12345");
UpdateResult result = mongoTemplate.getDb().getCollection(collectionName)
.updateOne(filter,
new Document("$set", new Document("arrayObj.$[].arrayObj.$[x].someField"), "someValueToUpdate"),
new UpdateOptions().arrayFilters(
Arrays.asList(Filters.eq("x.id, "12345))
));

Hbase filter to find rows without a specific column

I want to filter out all rows that do not have a specific column. any idea which comparator to use?
You can use skip filter combined with qualifier filter.
If you use the java client API:
Filter filter = new QualifierFilter(CompareFilter.CompareOp.EQUAL,new BinaryComparator(Bytes.toBytes("column-name")));
Filter filter2 = new SkipFilter(filter);
scan.setFilter(filter2);
this will return all the row without that specific column
SingleColumnValueFilter has method setFilterIfMissing that excludes all row that do not contain given column if it is given true. All that is needed is to design filter so it will always pass and call setFilterIfMissing(true)
SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes(columnFamily), Bytes.toBytes("column_name"), CompareFilter.CompareOp.NOT_EQUAL, Bytes.toBytes("non-sense"));
filter.setFilterIfMissing(true);
scan.setFilter(filter);

Hadoop: Map Reduce: read from HBase, but filter rows by content of one column

I am really new to Hadoop and I am not able to find an answer to my question. I want to write a map reduce job, where I read from HBase and write then in a simple text file.
In HBase, Ive got a column representing an id. Now I dont want to work on all containing rows in my HBase Table, but only on those between a maxId and a minId.
I found out that I could possibly user filters (scan.setFilter), so that I can filter rows which dont match my request.
This is my first Map Reduce Job, so please be patient :-)
Ive got a Starter Class, where I configure the job and the Scan Object and then start the job.
Now, my first try looks like this:
private Scan getScan()
{
final Scan scan = new Scan();
// ** FILTER **
List<Filter> filters = new ArrayList<Filter>();
Filter filter1 = new ValueFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL, new BinaryComparator(Bytes.toBytes(Integer.parseInt(minId))));
filters.add(filter1);
Filter filter2 = new ValueFilter(CompareFilter.CompareOp.LESS_OR_EQUAL, new BinaryComparator(Bytes.toBytes(Integer.parseInt(maxId))));
filters.add(filter2);
FilterList filterList = new FilterList(filters);
scan.setFilter(filterList);
scan.setCaching(500);
scan.setCacheBlocks(false);
// id
scan.addColumn("columnfamily".getBytes(), "id".getBytes());
return scan;
}
Well, Im not sure if this is the right way to do it. I also read that I could pass my minId and maxId maybe with the Configuration Object to the Map Job, but Im not sure how.
Besides, what have I to do afterwards? I would normally just initiate the job with initTableMapperJob and pass the Scan Object to it. Ive read something of ResultScanner and so, do I need them? I thought the MapReduce Framework would now automatically pass the correct rows to my map job, is that correct?

Is it possible to set where condition to hbase row-keys?

Is it possible to set where condition to hbase row-keys? Suppose I have row-keys 1,2,3,4,5...
I need to query like "where row-key<4"??
I think you want an InclusiveStopFilter
s = new Scan(Bytes.toBytes("startRow"));
s.setFilter(new InclusiveStopFilter(Bytes.toBytes("stopRow")));
http://svn.apache.org/repos/asf/hbase/branches/0.90/src/main/java/org/apache/hadoop/hbase/filter/InclusiveStopFilter.java
You can easily write your own FilterBase implementation with any meaning you want.
http://svn.apache.org/repos/asf/hbase/branches/0.90/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java
override filterRowKey method like in InclusiveStopFilter sources.
You can have a scan start and stop row:
Scan s = new Scan();
s.setStartRow(Bytes.toBytes("startRow"));
s.setStopRow(Bytes.toBytes("endRow"));

Resources