RethinkDB filter array, return only matched values

I have a table like this:
{
  dummy: [
    "new val",
    "new val 2",
    "new val 3",
    "other"
  ]
}
I want to get only the values matching "new". I am using a query like this:
r.db('db').table('table')('dummy').filter(function (val) {
  return val.match("^new")
})
but it's giving this error:
e: Expected type STRING but found ARRAY in
What is wrong with the query? If I remove .match("^new"), it returns all the values.
Thanks

The reason you're getting Expected type STRING but found ARRAY is that the value of the dummy field is an array itself, and you cannot apply match to arrays.
Even though the filter you tried may look intuitive, you have to rethink your query a bit: map over the documents and, in an inner expression, filter each dummy array down to the values matching ^new.
For example:
r.db('db')
  .table('table')
  .getField('dummy')
  .map((array) => array.filter((element) => element.match("^new")))
Output:
[
  "new val",
  "new val 2",
  "new val 3"
]
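For reference, the same query through the RethinkDB Python driver might look like the sketch below; the connection details and the newer RethinkDB() entry point are assumptions, and the logic mirrors the JavaScript version:

from rethinkdb import RethinkDB

r = RethinkDB()
conn = r.connect("localhost", 28015)  # assumed host/port

# Map each document's dummy array to just the elements matching "^new".
result = (
    r.db("db")
    .table("table")
    .get_field("dummy")
    .map(lambda arr: arr.filter(lambda el: el.match("^new")))
    .run(conn)
)
print(list(result))  # one filtered array per document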

Related

Search and extract an element located at various paths of a JSON structure

I have JSON in a PostgreSQL database and I need to extract an array that is not always located in the same place.
Problem
Need to extract the choices array of a particular element by name.
The element name is known, but not where it sits in the structure.
Rules
All element names are unique.
The choices attribute may not be present.
JSON structure
pages : [
  {
    name : 'page1',
    elements : [
      { name : 'element1', choices : [...] },
      { name : 'element2', choices : [...] }
    ]
  }, {
    name : 'page2',
    elements : [
      {
        name : 'element3',
        templateElements : [
          {
            name : 'element4',
            choices : [...]
          }, {
            name : 'element5',
            choices : [...]
          }
        ]
      }, {
        name : 'element6',
        choices : [...]
      }
    ]
  }, {
    name : 'element7',
    templateElements : [
      {
        name : 'element8',
        choices : [...]
      }
    ]
  }
]
My attempt to extract the elements by flattening the structure:
SELECT pages::jsonb ->> 'name',
       pageElements::jsonb ->> 'name',
       pageElements::jsonb -> 'choices',
       pages.*
FROM myTable as myt,
     jsonb_array_elements(myt.json -> 'pages') as pages,
     jsonb_array_elements(pages -> 'elements') as pageElements
Alas, the choices column is always null in my results. And this will not work when the element is located somewhere else, like:
page.elements.templateElements
page.templateElements
... and so on
I don't know if there is a way to search for a key (name) wherever it sits in the JSON structure and extract another key (choices).
I wish to call a select with the element name as a parameter and have it return the choices of this element.
For instance, if I call the select with an element name (element1, element4, or element8), the choices array of this element (as rows, JSON, or text, no preference here) should be returned.
Wow! The solution found went beyond expectations! JSONPath was the way to go.
Amazing what we can do with this.
SQL
-- Use jsonpath to search, filter and return what's needed
SELECT jsonb_path_query(
         myt.jsonb,
         '$.** ? (@.name == "element_name_to_look_at")'
       )->'choices' as jsonbChoices
FROM myTable as myt
Explanation of the jsonpath in SQL
jsonb_path_query(jsonb_data, '$.** ? (@.name == "element_name_to_look_at")')->'choices'
jsonb_path_query : the PostgreSQL jsonpath function
jsonb_data : a database column with jsonb data, or a jsonb expression
$.** : search everywhere from the root element
? : where clause / filter
@ : the object returned by the search
@.name == "element_name_to_look_at" : every object whose name equals element_name_to_look_at
->'choices' : for each object returned by the jsonpath, get its choices attribute
Final version
After getting the choices jsonb arrays, we return a dataset with every choice.
The choices arrays look like this:
[{value:'code1',text:'Code Label 1'}, {value:'code2',text:'Code Label 2'},...]
SELECT choices.*
FROM (
    -- Use jsonpath to search, filter and return what's needed
    SELECT jsonb_path_query(myt.jsonb, '$.** ? (@.name == "element_name_to_look_at")')->'choices' as jsonbChoices
    FROM myTable as myt
) choice,
-- Explode the returned jsonb array into rows and columns
jsonb_to_recordset(choice.jsonbChoices) as choices(value text, text text);
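If you'd rather bind the element name as a parameter than splice it into the path string, jsonb_path_query accepts a third vars argument. A minimal sketch with Python and psycopg2 (the connection string, table name, and element name are assumptions):

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # hypothetical connection string
with conn.cursor() as cur:
    # $name inside the jsonpath is bound from the vars argument,
    # so the element name travels as a normal query parameter.
    cur.execute(
        """
        SELECT jsonb_path_query(myt.jsonb,
                                '$.** ? (@.name == $name)',
                                jsonb_build_object('name', %s::text))->'choices'
        FROM myTable AS myt
        """,
        ("element1",),
    )
    for (choices,) in cur.fetchall():
        print(choices)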

Rethink merge array

I have a query:
r.table('orgs')
  .filter(function(org) {
    return r.expr(['89a26384-8fe0-11e8-9eb6-529269fb1459', '89a26910-8fe0-11e8-9eb6-529269fb1459'])
      .contains(org('id'));
  })
  .pluck("users")
This returns the following output:
{
  "users": [
    "3a415919-f50b-4e15-b3e6-719a2a6b26a7"
  ]
}
{
  "users": [
    "d8c4aa0a-8df6-4d47-8b0e-814a793f69e2"
  ]
}
How do I get the result as:
[
  "3a415919-f50b-4e15-b3e6-719a2a6b26a7",
  "d8c4aa0a-8df6-4d47-8b0e-814a793f69e2"
]
First, don't use that complicated and resource-consuming .filter directly on a table. Since the field you're testing against is already indexed (id), you can use:
r.table('orgs').getAll('89...59', '89...59')
or
r.table('orgs').getAll(r.args(['89...59', '89...59']))
which is way faster (way!). I recently found an article about just how much faster it is.
Now, to get the arrays of users without the wrapping, use the brackets operation:
r.table('orgs').getAll(...).pluck('users')('users')
This will provide a result like:
[
  '123',
  '456'
],
[
  '123',
  '789'
]
We just removed the "users" wrapping, but the result is an array of arrays. Let's flatten this 2D array with .concatMap:
r.table('orgs').getAll(...).pluck('users')('users').concatMap(function (usrs) {
  return usrs;
})
Now we've concatenated the sub-arrays into one; however, we may see duplicates (from my previous example result, you'd have '123' twice). Just .distinct the thing:
r.table('orgs').getAll(...).pluck('users')('users').concatMap(function (usrs) {
  return usrs;
}).distinct()
From the example I used, you now have:
'123',
'456',
'789'
Et voilà!
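Putting it all together, the equivalent of the final chain through the RethinkDB Python driver might look like this (a sketch; the connection details are assumptions):

from rethinkdb import RethinkDB

r = RethinkDB()
conn = r.connect("localhost", 28015)  # assumed host/port

users = (
    r.table("orgs")
    .get_all(
        "89a26384-8fe0-11e8-9eb6-529269fb1459",
        "89a26910-8fe0-11e8-9eb6-529269fb1459",
    )
    .pluck("users")["users"]        # brackets operation
    .concat_map(lambda usrs: usrs)  # flatten the 2D array
    .distinct()                     # drop duplicates
    .run(conn)
)
print(users)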

How to search for an exact sub-string in AWS CloudSearch?

When I search for 'bcde' I would like to get all of the following matches:
'abcde'
'bcdef'
'abcdef'
What is the way to achieve this result in AWS CloudSearch (preferably with the simple query parser)? A prefix search will not give me the first result. Is there any other way?
After a few attempts without success, I decided on the following approach:
I created a text-array field and stored the string in it piece by piece, from back to front, and it worked.
Example: my string is "abcde" and I search for bcde. Normally this would not work,
but my text-array field will contain the following strings:
e, de, cde, bcde, abcde. So a search for bcde will find "abcde", because the term is in the text-array field.
Oh man, but if I search for bcd, that term is not in the text-array field.
All right, but the string "bcde" starts with "bcd", so IT WORKS! =)
My PHP code to build the insert looks like this:
$term = "abcde";
$arrStr = str_split($term);
$arrTerms = [];
$aux = 1;
foreach ($arrStr as $str) {
    $arrTerms[] = substr($term, ($aux * -1));
    $aux++;
}
$data = [
    'type' => 'add',
    'id' => [your_id],
    'fields' => [
        'id' => [your_id],
        'field-text' => $term,
        'field-text-array' => $arrTerms
    ],
];
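If it helps, the suffix list itself is a one-liner in most languages; a quick Python sketch of the same idea:

term = "abcde"
# Every suffix of the term, shortest to longest:
# ['e', 'de', 'cde', 'bcde', 'abcde']
suffixes = [term[-i:] for i in range(1, len(term) + 1)]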
If your index field is of type "text", a simple structured query will return all the matches that include your query string.
Example
Query: ( and part_part_number:'009' )
Result:
1  _score: 10.379914  part_part_number: 009
2  _score: 10.379914  part_part_number: A-009-DY
3  _score: 10.379914  part_part_number: BY-009
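For completeness, submitting such a structured query from code could look like the following boto3 sketch (the CloudSearch domain endpoint and region are assumptions):

import boto3

client = boto3.client(
    "cloudsearchdomain",
    region_name="us-east-1",  # assumed region
    endpoint_url="https://search-mydomain-xxxxxxxxxx.us-east-1.cloudsearch.amazonaws.com",
)

response = client.search(
    query="(and part_part_number:'009')",
    queryParser="structured",
)
for hit in response["hits"]["hit"]:
    print(hit["id"], hit.get("fields"))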

Power Query: Expand all columns that have records in them

Using Power Query in Microsoft Excel 2013, I created a table that looks like this:
// To insert this in Power Query, prepend a '=' to the 'Table.FromRows'
Table.FromRows(
    {
        {"0", "Tom", "null", "null"},
        {"1", "Bob", [ name="Berlin", street="BarStreet" ], [ name="Mary", age=25 ]},
        {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
    },
    {"ID", "Name", "Address", "Wife"}
)
Now, I want to expand the columns Address and Wife by using the name attribute
on both records. Manually, I would do it like this:
// To insert this in Power Query, prepend a '=' to the 'Table.FromRows'
let
    t = Table.FromRows(
        {
            {"0", "Tom", "null", "null"},
            {"1", "Bob", [ name="Berlin", street="BarStreet" ], [ name="Mary", age=25 ]},
            {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
        },
        {"ID", "Name", "Address", "Wife"}
    ),
    expAddress = Table.ExpandRecordColumn(t, "Address", {"name"}, {"Address → name"}),
    expWife = Table.ExpandRecordColumn(expAddress, "Wife", {"name"}, {"Wife → name"})
in
    expWife
Background
Whenever I have data tables that have a different layout, I need to rewrite the
query. In a fantasy world, you could expand all columns that have Records in
them using a specific key. Ideally, you would have the following library
functions:
// Returns a list with the names of the columns that match the specified type.
// Will also try to infer the type of a column if the table is untyped.
Table.ColumnsOfTypeInfer(
    table as table,
    listOfTypes as list
) as list

// Expands a column of records into columns with each of the values.
Table.ExpandRecordColumnByKey(
    table as table,
    columns as list,
    key as text
) as table
Then, I could call:
// To insert this in Power Query, prepend a '=' to the 'Table.FromRows'
let
    t = Table.FromRows(
        {
            {"0", "Tom", "null", "null"},
            {"1", "Bob", [ name="Berlin", street="BarStreet" ], [ name="Mary", age=25 ]},
            {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
        },
        {"ID", "Name", "Address", "Wife"}
    ),
    recordColumns = Table.ColumnsOfTypeInfer(t, {type record}),
    expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
    expAll
Question
Can you get a list of columns with a specific type that is not specified in the table, aka infer it?
Can you make that record expansion generic?
Edit: Added row #0 with two null cells.
(First off, thanks for the clear explanation and sample data and suggestions!)
1) There's no way in M code to do type inference. This limitation might almost be considered a "feature", because if the source data changes in a way that causes the inferred type to be different, it will almost certainly break your query.
Once you load your untyped data, it should be quick to use the Detect Data Type button to generate the M for this. Or if you are reading data from JSON it should be mostly typed enough already.
If you have a specific scenario where this doesn't work, would you like to update your question? :)
2) It's very possible, and only a little convoluted, to make the record expansion generic, as long as the cell values of the table are records. This finds the columns where every row is either null or a record and expands their name field.
Here are some simple implementations you can add to your library:
let
    t = Table.FromRows(
        {
            {"0", "Tom", null, null},
            {"1", "Bob", [ name="Berlin", street="BarStreet" ], [ name="Mary", age=25 ]},
            {"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
        },
        {"ID", "Name", "Address", "Wife"}
    ),
    Table.ColumnsOfAllRowType = (table as table, typ as type) as list => let
        ColumnNames = Table.ColumnNames(table),
        ColumnsOfType = List.Select(ColumnNames, (name) =>
            List.AllTrue(List.Transform(Table.Column(table, name), (cell) => Type.Is(Value.Type(cell), typ))))
    in
        ColumnsOfType,
    Table.ExpandRecordColumnByKey = (table as table, columns as list, key as text) as table =>
        List.Accumulate(columns, table, (state, columnToExpand) =>
            Table.ExpandRecordColumn(state, columnToExpand, {key}, { columnToExpand & " → " & key })),
    recordColumns = Table.ColumnsOfAllRowType(t, type nullable record),
    expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
    expAll
If a new library function can be implemented in just M we're less likely to add it to our standard library, but if you feel it is missing feel free to suggest it at: https://ideas.powerbi.com/forums/265200-power-bi/
You might have a good argument for adding something like Table.ReplaceTypeFromFirstRow(table as table) as table, because constructing the type with M is very messy.
Sorry to come to this a bit late, but I just had a similar challenge. I tried using Chris Webb's ExpandAll function:
http://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/
... but that only works on Table-type columns, not Record-type columns. However, I managed to hack it to that purpose. I duplicated Chris's function as "ExpandAllRecords" and made 3 edits:
replaced each if _ is table then Table.ColumnNames(_) with each if _ is record then Record.FieldNames(_)
replaced Table.ExpandTableColumn with Table.ExpandRecordColumn
replaced ExpandAll with ExpandAllRecords
I tried getting both tables and records expanding in one function, but I kept getting type errors.
Anyway, with that in place, the final query is just:
let
    t = Table.FromRows(
        {
            {"1", "Tom", null, [ name="Jane", age=35 ]},
            {"2", "Bob", [ name="Berlin", street="BarStreet" ], [ name="Mary", age=25 ]},
            {"3", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
        },
        {"ID", "Name", "Address", "Wife"}
    ),
    Output = ExpandAllRecords(t)
in
    Output
Edit: Out of concern that the great snippet above (by Chris Webb, mentioned by @MikeHoney) will one day disappear, I'll mirror the entire code here:
let
    //Define a function taking two parameters - a table and an optional column number
    Source = (TableToExpand as table, optional ColumnNumber as number) =>
    let
        //If the column number is missing, make it 0
        ActualColumnNumber = if (ColumnNumber = null) then 0 else ColumnNumber,
        //Find the column name relating to the column number
        ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},
        //Get a list containing all of the values in the column
        ColumnContents = Table.Column(TableToExpand, ColumnName),
        //Iterate over each value in the column and,
        //if the value is of type table, get a list of all of the columns in the table;
        //then get a distinct list of all of these column names
        ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents,
            each if _ is table then Table.ColumnNames(_) else {}))),
        //Append the original column name to the front of each of these column names
        NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),
        //Is there anything to expand in this column?
        CanExpandCurrentColumn = List.Count(ColumnsToExpand) > 0,
        //If this column can be expanded, then expand it
        ExpandedTable = if CanExpandCurrentColumn
            then
                Table.ExpandTableColumn(TableToExpand, ColumnName,
                    ColumnsToExpand, NewColumnNames)
            else
                TableToExpand,
        //If the column has been expanded then keep the column number the same, otherwise add one to it
        NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber + 1,
        //If the column number is now greater than the number of columns in the table,
        //then return the table as it is;
        //else call the ExpandAll function recursively with the expanded table
        OutputTable = if NextColumnNumber > (Table.ColumnCount(ExpandedTable) - 1)
            then
                ExpandedTable
            else
                ExpandAll(ExpandedTable, NextColumnNumber)
    in
        OutputTable
in
    Source
You can then use this function on the XML file as follows:
let
    //Load the XML file
    Source = Xml.Tables(File.Contents("C:\Users\Chris\Documents\PQ XML Expand All Demo.xml")),
    ChangedType = Table.TransformColumnTypes(Source, {{"companyname", type text}}),
    //Call the ExpandAll function to expand all columns
    Output = ExpandAll(ChangedType)
in
    Output
(Source and downloadable example: Chris Webb's BI Blog, 2014-05-21)

Keep id order as in query

I'm using Elasticsearch to get a mapping of ids to some values, but it is crucial that the results keep the same order as the ids in the query.
Example:
def term_mapping(ids)
ids = ids.split(',')
self.search do |s|
s.filter :terms, id: ids
end
end
res = term_mapping("4,2,3,1")
The result collection should contain the objects with the ids in order 4,2,3,1...
Do you have any idea how I can achieve this?
If you need to use search, you can sort the ids before you send them to Elasticsearch and retrieve the results sorted by id, or you can create a custom sort script that returns the position of the current document in the array of ids. However, a simpler and faster solution is to use Multi-Get instead of search.
One option is to use the Multi GET API. If that doesn't work for you, another solution is to sort the results after you retrieve them from Elasticsearch. In Python, that can be done this way:
doc_ids = ["123", "333", "456"] # We want to keep this order
order = {v: i for i, v in enumerate(doc_ids)}
es_results = [{"_id": "333"}, {"_id": "456"}, {"_id": "123"}]
results = sorted(es_results, key=lambda x: order[x['_id']])
# Results:
# [{'_id': '123'}, {'_id': '333'}, {'_id': '456'}]
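And for the Multi GET route, a minimal sketch with the elasticsearch-py client (7.x-style body parameter; the index name is an assumption):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
doc_ids = ["123", "333", "456"]

# mget returns the docs in the same order as the requested ids.
response = es.mget(index="my-index", body={"ids": doc_ids})
docs = [doc for doc in response["docs"] if doc.get("found")]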
Maybe this problem is already resolved, but this answer may help someone.
We can use a pinned query in Elasticsearch; there is no need for a loop to sort the order:
qs = {
  "size" => drug_ids.count,
  "query" => {
    "pinned" => {
      "ids" => drug_ids,
      "organic" => {
        "terms" => {
          "id" => drug_ids
        }
      }
    }
  }
}
It will keep the sequence of the input as-is.
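For comparison, the same pinned query sent from Python might look like this (a sketch with elasticsearch-py; the index name is an assumption, and pinned queries require Elasticsearch 7.4+ on the default distribution):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
drug_ids = ["4", "2", "3", "1"]

# Documents matching "ids" are pinned to the top, in the given order;
# the organic terms query returns the rest.
body = {
    "size": len(drug_ids),
    "query": {
        "pinned": {
            "ids": drug_ids,
            "organic": {"terms": {"id": drug_ids}},
        }
    },
}
res = es.search(index="drugs", body=body)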
