Hbase - get column names for row by column name prefix - hadoop

I have an HBase table with the following layout.
For a row key, my columns are of the form a_1, a_2, a_3, b_1, c_1, C_2 and so on (a compound key format).
Suppose one of my rows looks like this:
row key - row1
column family - c1
columns - a_1, a_2, a_3, b_1, b_2, c_1, C_2, d_9, d_99
Is there any operation that retrieves a, b, c, d as the columns for row1? I don't care what the suffixes after a, b, c... are.
I can get all column names for a given row, split each name to take its first part, add those parts to a set, and emit the set. Is there a better way of doing this with filters or some other HBase mechanism?

You can use ColumnPrefixFilter for that. See the following code:
Configuration hadoopConf = new Configuration();
hadoopConf.set("hbase.zookeeper.quorum", "localhost");
hadoopConf.set("hbase.zookeeper.property.clientPort", "2181");
HTable hTable = new HTable(hadoopConf, "KunderaExamples");
Scan scan = new Scan();
// Keep only the columns whose qualifier starts with the given prefix
scan.setFilter(new ColumnPrefixFilter(Bytes.toBytes("A")));
ResultScanner scanner = hTable.getScanner(scan);
Iterator<Result> resultsIter = scanner.iterator();
while (resultsIter.hasNext())
{
    Result result = resultsIter.next();
    List<KeyValue> values = result.list();
    for (KeyValue value : values)
    {
        // KeyValue exposes byte arrays, so convert them before printing
        System.out.println(Bytes.toStringBinary(value.getKey()));
        System.out.println(Bytes.toString(value.getQualifier()));
        System.out.println(Bytes.toString(value.getValue()));
    }
}
scanner.close();
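ColumnPrefixFilter returns the cells whose qualifier starts with a prefix you already know. To recover the distinct prefixes themselves (a, b, c, d), the client-side approach from the question still applies. A minimal sketch, assuming the qualifiers have already been read from the Result and converted to strings, and that the prefix is everything before the first underscore (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class QualifierPrefixes {
    // Collects the distinct part of each qualifier before the first '_'.
    // A qualifier without '_' is kept whole.
    static Set<String> prefixes(List<String> qualifiers) {
        Set<String> out = new TreeSet<>();
        for (String q : qualifiers) {
            int idx = q.indexOf('_');
            out.add(idx < 0 ? q : q.substring(0, idx));
        }
        return out;
    }

    public static void main(String[] args) {
        // Qualifiers of row1 as listed in the question
        List<String> quals = Arrays.asList(
                "a_1", "a_2", "a_3", "b_1", "b_2", "c_1", "C_2", "d_9", "d_99");
        // Prefixes are case-sensitive, so "C" and "c" are distinct entries
        System.out.println(prefixes(quals)); // prints [C, a, b, c, d]
    }
}
```

The comparison is case-sensitive, so C_2 contributes a separate prefix C; normalise the case first if that is not wanted.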

Related

How to set start and end row key HBASE

If I have row keys like
a_c
b_c
j_f
f_d
d_c
I should get all the rows matching _c. How do I set the start and stop row keys here? I am trying to get the scan result using start and stop row keys only, not with RowFilter or other filter types.
You can write your own filter if you don't want to use RowFilter. If you can't write your own filter and don't want to use RowFilter, I suggest using PrefixFilter.
Example for Java:
byte[] prefixF = Bytes.toBytes("_c");
Scan scan = new Scan(prefixF);
PrefixFilter prefixFilter = new PrefixFilter(prefixF);
scan.setFilter(prefixFilter);
ResultScanner resultScanner = table.getScanner(scan);
The above code is equivalent to hbase> scan 'YourTablename', { FILTER => "PrefixFilter('_c')"}
You can also use the HBase STARTROW and ENDROW scan options. They restrict the scan to the given row key range (ENDROW excluded).
scan 'table_name', {STARTROW=>"<start_row_key>", ENDROW=>"<end_row_key>"}
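Start and stop rows, like PrefixFilter, work on the leading bytes of the row key, because rows are stored in sorted key order. A suffix such as _c therefore does not map to a single key range in general. The following plain-Java sketch only simulates that behaviour with a sorted set (it does not call HBase; the class and method names are hypothetical): a range scan returns a contiguous slice, while suffix matching has to test every key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class RowRangeSketch {
    // Simulates STARTROW/ENDROW: start row inclusive, stop row exclusive.
    static List<String> scanRange(NavigableSet<String> sortedKeys,
                                  String startRow, String stopRow) {
        return new ArrayList<>(sortedKeys.subSet(startRow, true, stopRow, false));
    }

    // Suffix matching cannot use the sort order, so every key is checked.
    static List<String> withSuffix(NavigableSet<String> sortedKeys, String suffix) {
        List<String> hits = new ArrayList<>();
        for (String key : sortedKeys) {
            if (key.endsWith(suffix)) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Row keys from the question; the TreeSet keeps them sorted
        NavigableSet<String> keys =
                new TreeSet<>(Arrays.asList("a_c", "b_c", "j_f", "f_d", "d_c"));
        System.out.println(scanRange(keys, "b_c", "f_d")); // prints [b_c, d_c]
        System.out.println(withSuffix(keys, "_c"));        // prints [a_c, b_c, d_c]
    }
}
```

In HBase itself the per-key test would normally run server-side in a filter (for example a RowFilter with a RegexStringComparator), which is what the question was hoping to avoid.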

Dynamically rename a set of columns using Power Query

I am trying to dynamically rename a set of columns in Power Query, List1 being the original column names and List2 the new column names. I think I need to merge List1 and List2 into a single list of pairs, but can't figure out the correct syntax.
Many thanks!
let
//list of original column names
List1= {"Name1","Name2","Name3","Name4"},
//Create test table
Source = Table.FromRows({{1231,1233,4121,5232},{3546,3426,1246,3464}} , List1),
//list of new column names
List2 = {"NewName 1","NewName 2","NewName 3","NewName 4"},
//Rename columns (in practice, the two lists of names will be dynamic, not hard coded as below)
Result = Table.RenameColumns(Source, {
{"Name1","NewName 1"},
{"Name2","NewName 2"},
{"Name3","NewName 3"},
{"Name4", "NewName 4"}})
in
Result
If you have a table with the old and new names, you can use the following pattern:
let
rename_list = Table.ToColumns(Table.Transpose(Table2)),
result = Table.RenameColumns(Table1, rename_list, MissingField.Ignore)
in result
where Table2 is the "Rename Table" and Table1 is the initial table with data.
This idea is described in detail here:
https://bondarenkoivan.wordpress.com/2015/04/17/dynamic-table-headers-in-power-query-sap-bydesign-odata/
If you have the resulting column names you want, it seems like you could convert Source back to rows, then call Table.FromRows on List2
let
//list of original column names
List1= {"Name1","Name2","Name3","Name4"},
//Create test table
Source = Table.FromRows({{1231,1233,4121,5232},{3546,3426,1246,3464}} , List1),
//list of new column names
List2 = {"NewName 1","NewName 2","NewName 3","NewName 4"},
Result = Table.FromRows(Table.ToRows(Source), List2)
in
Result
(Unless it is wrong to assume that e.g. Name2 will always be the second column.)
Restating the original problem using Ivan's solution, here goes. Carl's gives the same result and is a little simpler for the example I gave; however, my situation benefits from having the rename pairs set out explicitly in a table (i.e. Table2). In addition, using the MissingField.Ignore parameter with Table.RenameColumns means that only the columns I select for renaming in my production query are changed; the rest remain unchanged.
let
//list of original column names
List1= {"Name1","Name2","Name3","Name4"},
//Create test table
Source = Table.FromRows({{1231,1233,4121,5232},{3546,3426,1246,3464}} , List1),
//list of new column names
List2 = {"NewName 1","NewName 2","NewName 3","NewName 4"},
//Rename columns (in practice, the two lists of names will be dynamic, not hard coded as below)
//Bring List1 and List2 together as rows in a table
Table2 = Table.FromRows({List1,List2}),
//Create a list of rename pairs
RenameList = Table.ToColumns(Table2),
//Call to Table.RenameColumns
Result = Table.RenameColumns(Source, RenameList, MissingField.Ignore)
in
Result
Finally... figured it out using the following function
Table.TransformColumnNames(table as table, nameGenerator as function, optional options as nullable record) as table
First create a nameGenerator function (e.g. MyFuncRenameColumns) to provide a new column name given any original column name as an input.
In my example, here's my code for MyFuncRenameColumns:
let
MyFunctionSwitchColumnName = (originalColumnName) as text =>
let
//list of original column names
List1= {"Name1","Name2","Name3","Name4"},
//Create table
Source = Table.FromRows({{1231,1233,4121,5232},{3546,3426,1246,3464}} , List1),
//list of new column names
List2 = {"NewName 1","NewName 2","NewName 3","NewName 4"},
//Create table matching List1 to corresponding new value in List2
CreateRecord = Record.FromList(List2,List1),
ConvertedtoTable = Record.ToTable(CreateRecord),
//Filter table to just the row where the input originalColumnName matches
ReduceExcess = Table.SelectRows(ConvertedtoTable, each [Name] = originalColumnName),
//Return the matching result in the [Value] column (or give the original column name if there was no valid match)
NewColumnName = try ReduceExcess{0}[Value] otherwise originalColumnName
in
NewColumnName
in
MyFunctionSwitchColumnName
Here's where you use it as one of the parameters for Table.TransformColumnNames:
let
//list of original column names
List1= {"Name1","Name2","Name3","Name4"},
//Create table
Source = Table.FromRows({{1231,1233,4121,5232},{3546,3426,1246,3464}} , List1),
RenameColumns = Table.TransformColumnNames(Source, MyFuncRenameColumns)
in
RenameColumns
Hope that helps someone!

HBase Get values where rowkey in

How do I get all the values in HBase given Rowkey values?
val tableName = "myTable"
val hConf = HBaseConfiguration.create()
val hTable = new HTable(hConf, tableName)
val theget= new Get(Bytes.toBytes("1001-A")) // rowkey values (1001-A, 1002-A, 2010-A, ...)
val result = hTable.get(theget)
val values = result.listCells()
The code above only works for one row key.
You can use batch operations. See the Javadoc for batch operations on HTable.
Another approach is to Scan with a start row key and an end row key (the first and last row keys from a set of keys sorted in ascending order). This makes more sense if there are many row keys.
There is an htable.get method that takes a list of Gets:
List<Get> gets = ....
Result[] results = htable.get(gets);
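For the Scan approach, the start and stop rows can be derived from the set of keys. A minimal sketch in plain Java, assuming ASCII row keys so that string order matches HBase's byte order (the class and method names are hypothetical). Because the stop row is exclusive, a zero byte is appended to the last key so that it is still included. The scan also returns any other rows that fall inside the range, so the results still need to be checked against the key set.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

public class ScanBoundsSketch {
    // Returns {startRow, stopRow} covering every key in a non-empty collection.
    static byte[][] scanBounds(Collection<String> rowKeys) {
        TreeSet<String> sorted = new TreeSet<>(rowKeys);
        byte[] start = sorted.first().getBytes(StandardCharsets.UTF_8);
        byte[] last = sorted.last().getBytes(StandardCharsets.UTF_8);
        // The stop row is exclusive: the last key followed by 0x00 is the
        // smallest possible key after it (copyOf pads with a zero byte).
        byte[] stop = Arrays.copyOf(last, last.length + 1);
        return new byte[][] { start, stop };
    }

    public static void main(String[] args) {
        byte[][] bounds = scanBounds(Arrays.asList("2010-A", "1001-A", "1002-A"));
        System.out.println(new String(bounds[0], StandardCharsets.UTF_8)); // prints 1001-A
        System.out.println(bounds[1].length); // prints 7 (6 key bytes + trailing 0x00)
    }
}
```

The two arrays would then be passed as the start and stop rows of the Scan.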

How to apply several QualifierFilter to a row in HBase

We would like to filter a scan on an HBase table with two QualifierFilters.
That is, we only want the rows of the table that have a certain column 'col_A' AND (!) a certain other column 'col_B'.
Our current approach looks like this:
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
Filter filter1 = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("col_A".getBytes()));
filterList.addFilter(filter1);
Filter filter2 = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("col_B".getBytes()));
filterList.addFilter(filter2);
Scan scan = new Scan();
scan.setFilter(filterList);
...
The ResultScanner does not return any results from this scan although there are several rows in the HBase table which do have both columns 'col_A' and 'col_B'.
If we only apply filter1 to the scan, everything works fine and we get all the rows which have 'col_A'.
If we only apply filter2 to the scan it is the same. We do get all rows which have 'col_B'.
Only if we combine these two filters we do not get any results.
What would be the right way to get only the rows from the table which do have col_A AND col_B?
You can achieve this by defining the following filters:
List<Filter> filters = new ArrayList<Filter>(2);
byte[] colfam = Bytes.toBytes("c");
byte[] fakeValue = Bytes.toBytes("DOESNOTEXIST");
byte[] colA = Bytes.toBytes("col_A");
byte[] colB = Bytes.toBytes("col_B");
SingleColumnValueFilter filter1 =
new SingleColumnValueFilter(colfam, colA , CompareOp.NOT_EQUAL, fakeValue);
filter1.setFilterIfMissing(true);
filters.add(filter1);
SingleColumnValueFilter filter2 =
new SingleColumnValueFilter(colfam, colB, CompareOp.NOT_EQUAL, fakeValue);
filter2.setFilterIfMissing(true);
filters.add(filter2);
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL, filters);
Scan scan = new Scan();
scan.setFilter(filterList);
The idea here is to define one SingleColumnValueFilter per column you are looking for, each with a fake value and a CompareOp.NOT_EQUAL operator. With setFilterIfMissing(true), rows that lack the column are dropped, so each filter keeps every row that has the named column with any value other than the fake one.
Source: http://mapredit.blogspot.com/2012/05/using-filters-in-hbase-to-match-two.html
I think this line is the issue:
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
You want it to be:
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);
With MUST_PASS_ALL, the filter looks for a single column that matches both qualifiers, and no such column exists. Note that MUST_PASS_ONE returns rows that have either column, so rows with only one of the two still have to be discarded afterwards.
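Why the original FilterList returns nothing can be illustrated without HBase. QualifierFilter is evaluated per cell, and a single cell has only one qualifier, so requiring both filters to pass on the same cell can never succeed. The sketch below is a plain-Java simulation under that assumption (rows are modelled as a map from row key to qualifier set; all names are hypothetical), contrasting the per-cell check with the per-row check the question actually needs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class QualifierAndSketch {
    // Per-cell AND: a cell passes only if its qualifier equals both names,
    // which is impossible when the two names differ.
    static List<String> cellLevelAnd(Map<String, Set<String>> rows, String a, String b) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            for (String qualifier : row.getValue()) {
                if (qualifier.equals(a) && qualifier.equals(b)) {
                    hits.add(row.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    // Per-row AND: the row passes if it contains both qualifiers.
    static List<String> rowLevelAnd(Map<String, Set<String>> rows, String a, String b) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            if (row.getValue().containsAll(Arrays.asList(a, b))) {
                hits.add(row.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> rows = new TreeMap<>();
        rows.put("r1", new HashSet<>(Arrays.asList("col_A", "col_B")));
        rows.put("r2", new HashSet<>(Arrays.asList("col_A")));
        rows.put("r3", new HashSet<>(Arrays.asList("col_B", "col_C")));
        System.out.println(cellLevelAnd(rows, "col_A", "col_B")); // prints []
        System.out.println(rowLevelAnd(rows, "col_A", "col_B"));  // prints [r1]
    }
}
```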

How to iterate through table using selenium?

I have a table called UserManagement that contains information about users. The table is updated whenever a new user is created. If I create two users, I need to check whether they were actually created. The table contains ID, UserName, FirstName, LastName, Bdate, etc. The ID is generated automatically.
I am running a Selenium-TestNG script. Using Selenium, how can I get the UserName of the two users I created? Do I have to iterate through the table? If so, how do I iterate through it?
Use ISelenium.GetTable(string) to get the contents of the table cells you want. For example,
selenium.GetTable("UserManagement.0.1");
will return the contents of the table's first row and second column. You could then assert that the correct username or usernames appear in the table.
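Get the row count using selenium.getXpathCount("//table[@id='fjsfj']//tr") and store it in a variable rowCount:
int rowCount = selenium.getXpathCount("//table[@id='fjsfj']//tr").intValue();
Store the column count in a variable.
Ex:
int colCount = 5;
Set the value to look for, i.e. the new user:
String user1 = "ABC";
for (int i = 0; i < rowCount; i++)
{
    for (int j = 0; j < colCount; j++)
    {
        // getTable takes "tableLocator.row.column"; row and column start at 0
        if (user1.equals(selenium.getTable("fjsfj." + i + "." + j)))
        {
            System.out.println(user1 + " inserted");
            break;
        }
    }
}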
Get the number of rows using:
int noOfRowsInTable = selenium.getXpathCount("//table[@id='TableId']//tr").intValue();
If the UserName you want is at a fixed position, say the 2nd column, then for each row index i (XPath indices start at 1) read it as shown below:
selenium.getText("xpath=//table[@id='TableId']//tr[" + i + "]//td[2]");
Note: the number of columns in the table can be found the same way, by counting the cells of one row:
int noOfColumnsInTable = selenium.getXpathCount("//table[@id='TableId']//tr[1]//td").intValue();
Generically, something like this?
table = @browser.table(:id, 'tableID')
table.rows.each do |row|
  # perform row operations here
  row.cells.each do |cell|
    # do cell operations here
  end
end
