HBase filter to find rows without a specific column

I want to filter out all rows that do not have a specific column. Any idea which comparator to use?

You can use a SkipFilter combined with a QualifierFilter.
If you use the Java client API:
// NOT_EQUAL filters out any cell whose qualifier is "column-name";
// SkipFilter then drops every row in which at least one cell was filtered out.
Filter filter = new QualifierFilter(CompareFilter.CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes("column-name")));
Filter filter2 = new SkipFilter(filter);
scan.setFilter(filter2);
This will return all the rows without that specific column.
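To see why the comparison operator matters here, the following is a minimal plain-Java model of the rule SkipFilter applies: a row is dropped as soon as one of its cells fails the wrapped filter. It uses no HBase classes; `scanWithSkip` and `sampleRows` are hypothetical names, and the sample table is made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Plain-Java model of the row-skipping rule; no HBase classes are involved.
final class SkipFilterSketch {

    // Keeps a row only if every one of its qualifiers passes the inner check,
    // mirroring how SkipFilter drops a whole row once one cell is filtered out.
    static List<String> scanWithSkip(Map<String, Set<String>> rows, Predicate<String> inner) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            if (row.getValue().stream().allMatch(inner)) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    // Hypothetical sample table: row key -> qualifiers present in that row.
    static Map<String, Set<String>> sampleRows() {
        Map<String, Set<String>> rows = new LinkedHashMap<>();
        rows.put("row1", Set.of("column-name", "other"));
        rows.put("row2", Set.of("other"));
        rows.put("row3", Set.of("column-name"));
        return rows;
    }

    public static void main(String[] args) {
        // Inner check "qualifier != column-name": rows containing the column are skipped.
        System.out.println(scanWithSkip(sampleRows(), q -> !q.equals("column-name")));
        // Inner check "qualifier == column-name": only rows holding nothing but that column survive.
        System.out.println(scanWithSkip(sampleRows(), q -> q.equals("column-name")));
    }
}
```

In this model the "not equal" check keeps only row2 (the row lacking the column), while the "equal" check keeps only row3 (the row that has no other column).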

SingleColumnValueFilter has a method, setFilterIfMissing, that excludes all rows that do not contain the given column when it is set to true. All that is needed is to design the filter so that it always passes, and then call setFilterIfMissing(true):
SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes(columnFamily), Bytes.toBytes("column_name"), CompareFilter.CompareOp.NOT_EQUAL, Bytes.toBytes("non-sense"));
filter.setFilterIfMissing(true);
scan.setFilter(filter);
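A rough plain-Java model of this behaviour may help. It assumes the rule is: if the column is present, the value check decides; if it is absent, the row is kept unless filterIfMissing is true. The class, the `scan` helper, and the sample rows are hypothetical and only illustrate the logic; they are not the HBase implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of a single-column value check with a "filter if missing" switch.
final class FilterIfMissingSketch {

    static List<String> scan(Map<String, Map<String, String>> rows, String column,
                             Predicate<String> valueCheck, boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : rows.entrySet()) {
            String value = row.getValue().get(column);
            // Missing column: keep the row only when filterIfMissing is off.
            boolean keep = (value == null) ? !filterIfMissing : valueCheck.test(value);
            if (keep) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    // Hypothetical sample table: row key -> (qualifier -> value).
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("row1", Map.of("column_name", "42", "other", "x"));
        rows.put("row2", Map.of("other", "y"));
        rows.put("row3", Map.of("column_name", "non-sense"));
        return rows;
    }

    public static void main(String[] args) {
        Predicate<String> notPlaceholder = v -> !v.equals("non-sense");
        System.out.println(scan(sampleRows(), "column_name", notPlaceholder, true));
        System.out.println(scan(sampleRows(), "column_name", notPlaceholder, false));
    }
}
```

Note that row3 is dropped in both runs: the "always passes" trick only holds as long as no real value equals the placeholder, so the placeholder should be something that cannot occur in the data.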

Related

Filters For Charts on Apex

I have figured out how to add a filter to my chart. Is there a way to make it display all data, rather than no data, when I leave this filter null?
This is the line I used to create the filter:
Paint_shop = :P9_Select_Shop
Use an OR condition, so that a null filter value matches every row:
where (Paint_shop = :P9_Select_Shop or :P9_Select_Shop is null)

Combine PowerBI DAX Filter and SELECTCOLUMN

I want to create a new table based on this one:
that filters for Warehouse=2 and "drops" the columns "Price" and "Cost" like this:
I have managed to apply the filter in the first step using:
FILTER(oldtable;oldtable[Warehouse]=2)
and then in the next step could create another table that only selects the required columns using:
newtable2=SELECTCOLUMNS("newtable1";"Articlename";...)
But I want to be able to combine these two functions and create the table straight away.
This is very simple, because in your first step, a table is returned which you can use directly in your second statement.
newTabel = SELECTCOLUMNS(FILTER(warehouse;warehouse[Warehouse]=2);"ArticleName";warehouse[Articlename];"AmountSold";warehouse[AmountSold];"WareHouse";warehouse[Warehouse])
If you want to keep the overview, you can also use variables and return:
newTabel =
var filteredTable = FILTER(warehouse;warehouse[Warehouse]=2)
return SELECTCOLUMNS(filteredTable;"ArticleName";warehouse[Articlename];"AmountSold";warehouse[AmountSold];"WareHouse";warehouse[Warehouse])

Custom filter to search criteria. Magento 2

I need to add a range filter for year. I overrode the Magento class FullText\Collection in my module and also made changes to the search_request.xml file. I found this code and it works for me:
$skus = [
'CNS334',
'U012840'
];
$this->filterBuilder->setField('sku');
$this->filterBuilder->setValue($skus);
$this->filterBuilder->setConditionType('in');
$this->searchCriteriaBuilder->addFilter($this->filterBuilder->create());
But I have data in other tables. I tried to join them, but I can't get any results from my filtering.
$this->getSelect()->join(
[
'my' => 'make_year',
],
'e.entity_id = my.product_id'
);
$this->searchCriteriaBuilder->addFilter($this->filterBuilder
->setField('year')
->setValue(2014)
->setConditionType('from')
->create());
$this->searchCriteriaBuilder->addFilter($this->filterBuilder
->setField('year')
->setValue(2015)
->setConditionType('to')
->create());
$this->searchCriteriaBuilder->addFilter($this->filterBuilder->create());
To join your query with other tables you need a criteria mapper.
As an example, please look at \Magento\CatalogInventory\Model\ResourceModel\Stock\Item\StockItemCriteria::setStockStatus(), which is where the filter is initiated.
Setting a new item in the data array triggers the mapper method \Magento\CatalogInventory\Model\ResourceModel\Stock\Item\StockItemCriteriaMapper::mapStockStatus().
This mapper contains a \Magento\Framework\DB\Select, which we can work with as in an old-style collection.
The name of the mapper method correlates with the criteria's data array index, so if you add $this->data['custom_field'], the mapper function should be mapCustomField().
Please also note that if there is a specific mapper function for the field, the mapper will try to filter by the mapped fields.
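The naming rule described above (data key custom_field leads to mapper method mapCustomField) can be sketched as a small string conversion. This is only an illustration of the convention in Java, not Magento's own code; `mapperMethodFor` is a hypothetical helper, and it assumes keys are snake_case.

```java
// Illustrative sketch of the key-to-method naming convention; not Magento code.
final class MapperNameSketch {

    // Converts a snake_case data key such as "custom_field" into "mapCustomField".
    static String mapperMethodFor(String dataKey) {
        StringBuilder name = new StringBuilder("map");
        for (String part : dataKey.split("_")) {
            if (part.isEmpty()) {
                continue; // tolerate doubled or leading underscores
            }
            name.append(Character.toUpperCase(part.charAt(0)));
            name.append(part.substring(1));
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(mapperMethodFor("custom_field")); // mapCustomField
        System.out.println(mapperMethodFor("year"));         // mapYear (hypothetical key)
    }
}
```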

Hadoop Pig: Show entries using STARTSWITH

I am having issues using the STARTSWITH string function. I want to display all records whose System_Period begins with 20040.
transactions = LOAD '/home/cloudera/datasets/assignment2/Transactions.csv'
USING PigStorage(',') AS (Branch_Number:int, Contract_Number:int,
Customer_Number:int,Invoice_Date:chararray, Invoice_Number:int,
Product_Number:int, Sales_Amount:double, Employee_Number:int,
Service_Date:chararray, System_Period:int);
sysGroup = GROUP transactions BY System_Period;
sysFilter = FILTER sysGroup BY STARTSWITH(transactions.System_Period, 20040);
DUMP sysFilter;
The error I am receiving is
Could not infer the matching function for org.apache.pig.builtin.STARTSWITH as multiple or none of them fit. Please use an explicit cast.
STARTSWITH takes two strings and checks whether the first one begins with the second. You cannot pass a relation or a bag to it, and it accepts only strings (chararray), not integers. Either load System_Period as chararray and FILTER the records that begin with 20040 before the GROUP BY, casting the field afterwards if you need to:
transactions = LOAD '/home/cloudera/datasets/assignment2/Transactions.csv'
USING PigStorage(',') AS (Branch_Number:int, Contract_Number:int,
Customer_Number:int,Invoice_Date:chararray, Invoice_Number:int,
Product_Number:int, Sales_Amount:double, Employee_Number:int,
Service_Date:chararray, System_Period:chararray);
sysFilter = FILTER transactions BY STARTSWITH(System_Period, '20040');
Or, after the GROUP BY, FLATTEN the result and then filter:
transactions = LOAD '/home/cloudera/datasets/assignment2/Transactions.csv'
USING PigStorage(',') AS (Branch_Number:int, Contract_Number:int,
Customer_Number:int,Invoice_Date:chararray, Invoice_Number:int,
Product_Number:int, Sales_Amount:double, Employee_Number:int,
Service_Date:chararray, System_Period:chararray);
sysGroup = GROUP transactions BY System_Period;
flatres = FOREACH sysGroup GENERATE group,FLATTEN(transactions);
sysFilter = FILTER flatres BY STARTSWITH(System_Period, '20040');
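The two orderings can be compared in a small Java sketch, assuming the periods are held as integers in memory. It shows the prefix test being applied to the text form of each value, once before grouping and once after grouping and flattening. The class, method names, and sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Java analogue of "filter then group" versus "group, flatten, then filter".
final class StartsWithSketch {

    // Filter first: the int period is converted to text before the prefix test.
    static List<Integer> filterFirst(List<Integer> periods, String prefix) {
        List<Integer> out = new ArrayList<>();
        for (Integer p : periods) {
            if (String.valueOf(p).startsWith(prefix)) {
                out.add(p);
            }
        }
        return out;
    }

    // Group by period, flatten the groups back into records, then filter.
    static List<Integer> groupFlattenFilter(List<Integer> periods, String prefix) {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        for (Integer p : periods) {
            grouped.computeIfAbsent(p, k -> new ArrayList<>()).add(p);
        }
        List<Integer> flattened = new ArrayList<>();
        for (List<Integer> group : grouped.values()) {
            flattened.addAll(group);
        }
        return filterFirst(flattened, prefix);
    }

    public static void main(String[] args) {
        List<Integer> periods = List.of(200401, 200402, 200312, 200401);
        System.out.println(filterFirst(periods, "20040"));
        System.out.println(groupFlattenFilter(periods, "20040"));
    }
}
```

Both calls select the same three records; only their order differs, because the grouped variant emits them group by group. Filtering first also avoids grouping records that are discarded anyway.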

Column Value Range Filter in Hbase 0.94

I want to use a range filter in HBase on more than one column. I know we can use SingleColumnValueFilter with AND/OR conditions, but I want to run the same filter condition against two different columns.
Example: my HBase table
rowkey,cf:bidprice,cf:askprice,cf:product
I want to filter all the rows with (cf:bidprice>10 and cf:bidprice<20) or (cf:askprice>10 and cf:askprice<20).
I think I figured it out. The code snippet below is an example implementation.
byte[] startRow=Bytes.toBytes("startrow");
byte[] endRow=Bytes.toBytes("stoprow");
SingleColumnValueFilter bidPriceGreaterFilter=new SingleColumnValueFilter("q".getBytes(), "bidprice".getBytes(), CompareFilter.CompareOp.GREATER_OR_EQUAL, "12345".getBytes());
SingleColumnValueFilter bidPricelesserFilter=new SingleColumnValueFilter("q".getBytes(), "bidprice".getBytes(), CompareFilter.CompareOp.LESS_OR_EQUAL, "12346".getBytes());
SingleColumnValueFilter askPriceGreaterFilter=new SingleColumnValueFilter("q".getBytes(), "askprice".getBytes(), CompareFilter.CompareOp.GREATER_OR_EQUAL, "12345".getBytes());
SingleColumnValueFilter askPricelesserFilter=new SingleColumnValueFilter("q".getBytes(), "askprice".getBytes(), CompareFilter.CompareOp.LESS_OR_EQUAL, "12346".getBytes());
FilterList andFilter1= new FilterList(FilterList.Operator.MUST_PASS_ALL);
andFilter1.addFilter(bidPriceGreaterFilter);
andFilter1.addFilter(bidPricelesserFilter);
FilterList andFilter2= new FilterList(FilterList.Operator.MUST_PASS_ALL);
andFilter2.addFilter(askPriceGreaterFilter);
andFilter2.addFilter(askPricelesserFilter);
FilterList finalFilterList=new FilterList(FilterList.Operator.MUST_PASS_ONE);
finalFilterList.addFilter(andFilter1);
finalFilterList.addFilter(andFilter2);
Scan scan = new Scan(startRow,endRow);
scan.setFilter(finalFilterList);
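One caveat with range filters like these: a byte[] comparison value is compared byte by byte (lexicographically), not numerically, so prices stored as plain strings only order correctly when they all have the same width. The sketch below illustrates this in plain Java with `Arrays.compareUnsigned`; it uses no HBase classes, and `inRange` and `pad` are hypothetical helpers. Zero-padding to a fixed width is just one possible encoding and is an assumption, not something the snippet above does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

// Shows how lexicographic byte comparison treats string-encoded numbers.
final class LexicographicRangeSketch {

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Inclusive range test on raw bytes, compared lexicographically as unsigned values.
    static boolean inRange(String value, String low, String high) {
        byte[] v = bytes(value);
        return Arrays.compareUnsigned(v, bytes(low)) >= 0
                && Arrays.compareUnsigned(v, bytes(high)) <= 0;
    }

    // Hypothetical helper: left-pads a non-negative price with zeros to ten digits.
    static String pad(long price) {
        return String.format(Locale.ROOT, "%010d", price);
    }

    public static void main(String[] args) {
        // "100" sorts between "10" and "20" as bytes, although 100 is outside 10..20.
        System.out.println(inRange("100", "10", "20"));
        // With fixed-width encoding the byte order matches the numeric order.
        System.out.println(inRange(pad(100), pad(10), pad(20)));
        System.out.println(inRange(pad(15), pad(10), pad(20)));
    }
}
```

The values 12345 and 12346 in the snippet above have the same length, so they compare as expected; the problem appears only once stored values differ in width.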
