How do I dynamically name a collection? - syntax

Title: How do I dynamically name a collection?
Pseudo-code: collect(n) AS :Label
The primary purpose of this is to make the properties easy to read in the API server (a Node application).
Verbose example:
MATCH (user:User)--(n)
WHERE n:Movie OR n:Actor
RETURN user,
CASE
WHEN n:Movie THEN "movies"
WHEN n:Actor THEN "actors"
END as type, collect(n) as :type
Expected output in JSON:
[{
"user": {
....
},
"movies": [
{
"_id": 1987,
"labels": [
"Movie"
],
"properties": {
....
}
}
],
"actors:" [ .... ]
}]
The closest I've gotten is:
[{
"user": {
....
},
"type": "movies",
"collect(n)": [
{
"_id": 1987,
"labels": [
"Movie"
],
"properties": {
....
}
}
]
}]
The goal is to be able to read the JSON result with ease like so:
neo4j.cypher.query(statement, function(err, results) {
for (var result of results) {
var user = result.user;
var movies = result.movies;
}
});
Edit:
I apologize for any confusion caused by my imprecise database terminology.

I'm wondering if it's enough just to output the user along with separate lists of their movies and actors, rather than a more complicated approach that matches and combines both:
MATCH (user:User)
OPTIONAL MATCH (user)--(m:Movie)
OPTIONAL MATCH (user)--(a:Actor)
RETURN user, COLLECT(DISTINCT m) as movies, COLLECT(DISTINCT a) as actors
(The DISTINCT matters: the two OPTIONAL MATCHes multiply rows, with each movie row repeated once per actor and vice versa, so collecting without it would produce duplicates.)

This query should return each User and his/her related movies and actors (in separate collections):
MATCH (user:User)--(n)
WHERE n:Movie OR n:Actor
RETURN user,
REDUCE(s = {movies:[], actors:[]}, x IN COLLECT(n) |
CASE WHEN x:Movie
THEN {movies: s.movies + x, actors: s.actors}
ELSE {movies: s.movies, actors: s.actors + x}
END) AS types;
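Per user, the types value then comes back as a single map, roughly shaped like:
{
"user": {
....
},
"types": {
"movies": [ .... ],
"actors": [ .... ]
}
}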

As for a dynamic solution, one that will work with any node connected to your user, there are a few options, but I don't believe you can make the column names dynamic like this, or even the names of the returned collections, though we can associate each collection with its type.
MATCH (user:User)--(n)
WITH user, LABELS(n) as type, COLLECT(n) as nodes
WITH user, {type:type, nodes:nodes} as connectedNodes
RETURN user, COLLECT(connectedNodes) as connectedNodes
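The result keeps one row per user, with the label/collection pairing preserved inside the row; the JSON shape is roughly:
{
"user": {
....
},
"connectedNodes": [
{ "type": ["Movie"], "nodes": [ .... ] },
{ "type": ["Actor"], "nodes": [ .... ] }
]
}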
Or, if you prefer working with multiple rows, one row each per node type:
MATCH (user:User)--(n)
WITH user, LABELS(n) as type, COLLECT(n) as collection
RETURN user, {type:type, data:collection} as connectedNodes
Note that LABELS(n) returns a list of labels, since nodes can be multi-labeled. If you are guaranteed that every node of interest has exactly one label, you can use the first element of the list rather than the list itself: just use LABELS(n)[0] instead.
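For example, the multi-row version above becomes:
MATCH (user:User)--(n)
WITH user, LABELS(n)[0] as type, COLLECT(n) as collection
RETURN user, {type:type, data:collection} as connectedNodes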

You can dynamically sort nodes by label and then convert them to a map using the APOC library:
WITH ['Actor','Movie'] as LBS
// What are the nodes we need:
MATCH (U:User)--(N) WHERE size(filter(l in labels(N) WHERE l in LBS))>0
WITH U, LBS, N, labels(N) as nls
UNWIND nls as nl
// Combine the nodes on their labels:
WITH U, LBS, N, nl WHERE nl in LBS
WITH U, nl, collect(N) as RELS
WITH U, collect( [nl, RELS] ) as pairs
// Convert pairs "label - values" to the map:
CALL apoc.map.fromPairs(pairs) YIELD value
RETURN U as user, value
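With the sample labels above, each row then carries one map per user, keyed by label, roughly:
{
"user": {
....
},
"value": {
"Actor": [ .... ],
"Movie": [ .... ]
}
}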

Related

ElasticSearch - Combining Queries for 4 separate randomly sorted groups?

I'm fairly new to elasticsearch (though with a fair bit of SQL experience) and am currently struggling to put a proper query together. I have 2 boolean fields, isPlayer and isEvil, on which each entry is either true or false. Based on those, I want to split my dataset into 4 groups:
isPlayer: true, isEvil: true
isPlayer: true, isEvil: false
isPlayer: false, isEvil: true
isPlayer: false, isEvil: false
I want to sort these groups randomly within themselves, then attach them together into one long list that I can paginate. I'd like to do that inside the query, as that seems like the "correct" way to do this, since I'd do it similarly in SQL. In that list, the groups are to be sorted in order: first all entries of Group 1 in a random order, then all entries of Group 2 in a random order, then all entries of Group 3, and so on. It is necessary that the randomness of the sorting is reproducible given the same inputs, so if the sorting is based on random_score, ideally I'd be using a seed for the randomness.
I can build a single query, but how do I combine 4?
So far I've found MultiSearch and Disjunction Max Query as possible approaches. MultiSearch seems like it doesn't support pagination. Regarding Disjunction Max Query, it might be that I'm missing the forest for the trees, but I'm struggling to have the subqueries be randomly sorted only within themselves before appending them to one another.
Here is how I write a single query for now, without Disjunction Max Query, in case it helps:
{
"query": {
"bool": {
"should": [
{
"term": {
"isPlayer": true
}
},
{
"term": {
"isEvil": true
}
}
]
}
}
}
The solution to this problem is not to run 4 separate groups, but instead to ensure the groups all have different score ranges and to sort by score. This can be achieved by scoring the hits not by some matching criterion, but through a script_score query, which lets you write code that returns the score yourself (the default scripting language is called "painless", but I've seen examples of Groovy as well).
The logic is fairly simple:
If isPlayer = true, add 2 points to the score
If isEvil = true, add 4 points to the score
Either way, add a random number between 0 and 1 to the score at the end
This creates the 4 groups I wanted with distinct score-ranges:
isPlayer = true, isEvil = true --> Score-range: 6-7
isPlayer = false, isEvil = true --> Score-range: 4-5
isPlayer = true, isEvil = false --> Score-range: 2-3
isPlayer = false, isEvil = false --> Score-range: 0-1
The query would look like this:
"query": {
"script_score": {
"query": {
"match_all": {}
},
"script": {
"source": """
double score = 0;
if (doc['isPlayer'].value) {
score += 2;
}
if (doc['isEvil'].value) {
score += 4;
}
int partialSeed = 1;
score += randomScore(partialSeed, 'id');
return score;
"""
}
}
}
}
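Since everything now comes back as one hit list sorted by _score (descending by default, so the 6-7 group leads), pagination is just the standard from/size parameters wrapped around the same query. A sketch, with placeholder page values:
{
"from": 0,
"size": 25,
"query": {
"script_score": { ... as above ... }
}
}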

Rethink merge array

I have a query -
r.table('orgs')
.filter(function(org) {
return r.expr(['89a26384-8fe0-11e8-9eb6-529269fb1459', '89a26910-8fe0-11e8-9eb6-529269fb1459'])
.contains(org('id'));
})
.pluck("users")
This returns the following output:
{
"users": [
"3a415919-f50b-4e15-b3e6-719a2a6b26a7"
]
} {
"users": [
"d8c4aa0a-8df6-4d47-8b0e-814a793f69e2"
]
}
How do I get the result as -
[
"3a415919-f50b-4e15-b3e6-719a2a6b26a7","d8c4aa0a-8df6-4d47-8b0e-814a793f69e2"
]
First, don't use that complicated and resource-consuming .filter directly on a table. Since your tested field is already indexed (id), you can:
r.table('orgs').getAll('89...59', '89...59')
or
r.table('orgs').getAll(r.args(['89...59', '89...59']))
which is way faster (way!). I recently found this article about how much faster that is.
Now to get an array of users without the wrapping, using the brackets operation:
r.table('orgs').getAll(...).pluck('users')('users')
will provide a result like
[
'123',
'456'
],
[
'123',
'789'
]
We just removed the "users" wrapping, but the result is an array of arrays. Let's flatten this 2D array with .concatMap:
r.table('orgs').getAll(...).pluck('users')('users').concatMap(function (usrs) {
return usrs;
})
Now we've concatenated the sub-arrays into one; however, we may see duplicates (from my previous result example, you'd have '123' twice). Just .distinct the thing:
r.table('orgs').getAll(...).pluck('users')('users').concatMap(function (usrs) {
return usrs;
}).distinct()
From the example I took, you now have:
'123',
'456',
'789'
Et voilà!
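If your driver hands you a cursor rather than a plain array, appending ReQL's coerceTo should flatten that last step; a sketch of the whole chain (ids elided as before):
r.table('orgs').getAll(r.args(['89...59', '89...59']))
.pluck('users')('users')
.concatMap(function (usrs) {
return usrs;
})
.distinct()
.coerceTo('array')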

Group list of objects/AR relation by user_id

I have a list of objects which is actually an AR (ActiveRecord) relation. My objects have these fields:
{
agreement_id: 1,
app_user_id: 1,
agency_name: 'Small business 1'
..etc..
},
{
agreement_id: 2,
app_user_id: 1,
agency_name: 'Small business 2'
..etc..
}
I'm representing my object as a hash for easier understanding. I need to map my list of objects to a format like this:
{
1 => [1,2]
}
This represents a list of agreement_ids grouped by user. I always know which user I'm grouping on. Here is what I've tried so far:
where(app_user_id: user_id).where('...').select('app_user_id, agreement_id').group_by(&:app_user_id)
This gives me the structure that I want, but not exactly the data that I want; here is the output:
{1=>
[#<Agreement:0x6340fdbb agreement_id: 1, app_user_id: 1>,
#<Agreement:0x91bd4dd agreement_id: 2, app_user_id: 1>]
}
I also thought I'd be able to do this with the map method, and here is what I tried:
where(app_user_id: user_id).where('....').select('app_user_id, agreement_id').map do |ag|
{ ag.app_user_id => ag.agreement_id }
end.reduce(&:merge)
But it only produces the mapping with the last agreement_id like this :
{1=>2}
I've tried some other things not worth mentioning. Can anyone suggest a way that would make this work?
This might work:
where(app_user_id: user_id)
.where('...')
.select('app_user_id, agreement_id')
.group_by(&:app_user_id)
.map { |k, v| [k, v.map(&:agreement_id)] }
.to_h
(The final .to_h matters: mapping each pair to its own Hash would give an array of one-entry hashes rather than the single hash you want.)
Try this one:
where(app_user_id: user_id).
where('...').
select('app_user_id, agreement_id').
map { |a| [a.app_user_id, a.agreement_id] }.
group_by(&:first).
map { |k, pairs| [k, pairs.map(&:last)] }.
to_h
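Alternatively, you could push the whole grouping into the database; a minimal sketch, assuming PostgreSQL (for array_agg) and a Rails version with Arel.sql for raw fragments:
where(app_user_id: user_id).
where('...').
group(:app_user_id).
pluck(:app_user_id, Arel.sql('array_agg(agreement_id)')).
to_h
# => {1=>[1, 2]}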

Powerquery: Expand all columns of that have records in them

Using Power Query in Microsoft Excel 2013, I created a table that looks like this:
// To insert this in Power Query, append a '=' before the 'Table.FromRows'
Table.FromRows(
{
{"0", "Tom", "null", "null"},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
)
Now, I want to expand the columns Address and Wife by using the name attribute
on both records. Manually, I would do it like this:
// To insert this in Power Query, append a '=' before the 'Table.FromRows'
let
t = Table.FromRows(
{
{"0", "Tom", "null", "null"},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
expAddress = Table.ExpandRecordColumn(t, "Address", {"name"}, {"Address → name"}),
expWife = Table.ExpandRecordColumn(expAddress, "Wife", {"name"}, {"Wife → name"})
in
expWife
Background
Whenever I have data tables that have a different layout, I need to rewrite the
query. In a fantasy world, you could expand all columns that have Records in
them using a specific key. Ideally, you would have the following library
functions:
// Returns a list with the names of the columns that match the specified type.
// Will also try to infer the type of a column if the table is untyped.
Table.ColumnsOfTypeInfer(
table as table,
listOfTypes as list
) as list
// Expands a column of records into columns with each of the values.
Table.ExpandRecordColumnByKey(
table as table,
columns as list,
key as text
) as table
Then, I could call
// To insert this in Power Query, append a '=' before the 'Table.FromRows'
let
t = Table.FromRows(
{
{"0", "Tom", "null", "null"},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
recordColumns = Table.ColumnsOfTypeInfer(t, {type record}),
expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
expAll
Question
Can you get a list of columns with a specific type that is not specified in the table, aka infer it?
Can you make that record expansion generic?
Edit: Added row #0 with two null cells.
(First off, thanks for the clear explanation and sample data and suggestions!)
1) There's no way in M code to do type inference. This limitation might almost be considered a "feature", because if the source data changes in a way that causes the inferred type to be different, it will almost certainly break your query.
Once you load your untyped data, it should be quick to use the Detect Data Type button to generate the M for this. Or if you are reading data from JSON it should be mostly typed enough already.
If you have a specific scenario where this doesn't work, do you want to update your question? :)
2) It's very possible and only a little convoluted to make the record expansion generic, as long as the cell values of the table are records. This finds columns where all rows are either null or a record and expands the name column.
Here are some simple implementations you can add to your library:
let
t = Table.FromRows(
{
{"0", "Tom", null, null},
{"1", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"2", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
Table.ColumnsOfAllRowType = (table as table, typ as type) as list => let
ColumnNames = Table.ColumnNames(table),
ColumnsOfType = List.Select(ColumnNames, (name) =>
List.AllTrue(List.Transform(Table.Column(table, name), (cell) => Type.Is(Value.Type(cell), typ))))
in
ColumnsOfType,
Table.ExpandRecordColumnByKey = (table as table, columns as list, key as text) as table =>
List.Accumulate(columns, table, (state, columnToExpand) =>
Table.ExpandRecordColumn(state, columnToExpand, {key}, { columnToExpand & " → " & key })),
recordColumns = Table.ColumnsOfAllRowType(t, type nullable record),
expAll = Table.ExpandRecordColumnByKey(t, recordColumns, "name")
in
expAll
If a new library function can be implemented in just M we're less likely to add it to our standard library, but if you feel it is missing feel free to suggest it at: https://ideas.powerbi.com/forums/265200-power-bi/
You might have a good argument for adding something like Table.ReplaceTypeFromFirstRow(table as table) as table, because constructing the type with M is very messy.
Sorry to come to this a bit late, but I just had a similar challenge. I tried using Chris Webb's ExpandAll function:
http://blog.crossjoin.co.uk/2014/05/21/expanding-all-columns-in-a-table-in-power-query/
... but that only works on Table-type columns, not Record-type columns. However, I managed to hack it to that purpose: I duplicated Chris's function as "ExpandAllRecords" and made 3 edits:
replaced each if _ is table then Table.ColumnNames(_) with each if _ is record then Record.FieldNames(_)
replaced Table.ExpandTableColumn with Table.ExpandRecordColumn
replaced ExpandAll with ExpandAllRecords
I tried getting both tables and records expanding in one function, but I kept getting type errors.
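Applying those three edits to Chris Webb's function (mirrored in full further down) gives roughly the following sketch; untested here, and it must be pasted as a new blank query named ExpandAllRecords, because the recursive call refers to the query by that name:
let
//Define function taking two parameters - a table and an optional column number
Source = (TableToExpand as table, optional ColumnNumber as number) =>
let
ActualColumnNumber = if (ColumnNumber=null) then 0 else ColumnNumber,
ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},
ColumnContents = Table.Column(TableToExpand, ColumnName),
//Collect the distinct field names across all record values in this column
ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents,
each if _ is record then Record.FieldNames(_) else {}))),
NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),
CanExpandCurrentColumn = List.Count(ColumnsToExpand)>0,
//Expand the record column if it has any fields to expand
ExpandedTable = if CanExpandCurrentColumn
then
Table.ExpandRecordColumn(TableToExpand, ColumnName,
ColumnsToExpand, NewColumnNames)
else
TableToExpand,
NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber+1,
OutputTable = if NextColumnNumber>(Table.ColumnCount(ExpandedTable)-1)
then
ExpandedTable
else
ExpandAllRecords(ExpandedTable, NextColumnNumber)
in
OutputTable
in
Source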
Anyway, with that in place, the final query is just:
let
t = Table.FromRows(
{
{"1", "Tom", null, [ name="Jane", age=35 ]},
{"2", "Bob", [ name="Berlin" , street="BarStreet" ], [ name="Mary", age=25 ]},
{"3", "Jim", [ name="Hamburg", street="FooStreet" ], [ name="Marta", age=30 ]}
},
{"ID", "Name", "Address", "Wife"}
),
Output = ExpandAllRecords(t)
in
Output
Edit:
Out of concern that the great snippet (by Chris Webb, mentioned by @MikeHoney) will one day disappear, I'll mirror the entire code here:
let
//Define function taking two parameters - a table and an optional column number
Source = (TableToExpand as table, optional ColumnNumber as number) =>
let
//If the column number is missing, make it 0
ActualColumnNumber = if (ColumnNumber=null) then 0 else ColumnNumber,
//Find the column name relating to the column number
ColumnName = Table.ColumnNames(TableToExpand){ActualColumnNumber},
//Get a list containing all of the values in the column
ColumnContents = Table.Column(TableToExpand, ColumnName),
//Iterate over each value in the column and then
//If the value is of type table get a list of all of the columns in the table
//Then get a distinct list of all of these column names
ColumnsToExpand = List.Distinct(List.Combine(List.Transform(ColumnContents,
each if _ is table then Table.ColumnNames(_) else {}))),
//Append the original column name to the front of each of these column names
NewColumnNames = List.Transform(ColumnsToExpand, each ColumnName & "." & _),
//Is there anything to expand in this column?
CanExpandCurrentColumn = List.Count(ColumnsToExpand)>0,
//If this column can be expanded, then expand it
ExpandedTable = if CanExpandCurrentColumn
then
Table.ExpandTableColumn(TableToExpand, ColumnName,
ColumnsToExpand, NewColumnNames)
else
TableToExpand,
//If the column has been expanded then keep the column number the same, otherwise add one to it
NextColumnNumber = if CanExpandCurrentColumn then ActualColumnNumber else ActualColumnNumber+1,
//If the column number is now greater than the number of columns in the table
//Then return the table as it is
//Else call the ExpandAll function recursively with the expanded table
OutputTable = if NextColumnNumber>(Table.ColumnCount(ExpandedTable)-1)
then
ExpandedTable
else
ExpandAll(ExpandedTable, NextColumnNumber)
in
OutputTable
in
Source
You can then use this function on the XML file as follows:
let
//Load XML file
Source = Xml.Tables(File.Contents("C:\Users\Chris\Documents\PQ XML Expand All Demo.xml")),
ChangedType = Table.TransformColumnTypes(Source,{{"companyname", type text}}),
//Call the ExpandAll function to expand all columns
Output = ExpandAll(ChangedType)
in
Output
(Source and downloadable example: Chris Webb's BI Blog, 2014-05-21)

Keep id order as in query

I'm using elasticsearch to get a mapping of ids to some values, but it is crucial that the results come back in the same order as the ids in the query.
Example:
def term_mapping(ids)
ids = ids.split(',')
self.search do |s|
s.filter :terms, id: ids
end
end
res = term_mapping("4,2,3,1")
The result collection should contain the objects with the ids in order 4,2,3,1...
Do you have any idea how I can achieve this?
If you need to use search, you can sort the ids before you send them to elasticsearch and retrieve the results sorted by id, or you can create a custom sort script that returns the position of the current document in the array of ids. However, a simpler and faster solution would be to simply use Multi-Get instead of search.
One option is to use the Multi GET API. A minimal sketch of such a request (the index name is a placeholder); unlike search, mget returns the documents in exactly the order the ids were requested:
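GET /my-index/_mget
{
"ids": ["4", "2", "3", "1"]
}
If this doesn't work for you, another solution is to sort the results after you retrieve them from es. In python, this can be done this way: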
doc_ids = ["123", "333", "456"] # We want to keep this order
order = {v: i for i, v in enumerate(doc_ids)}
es_results = [{"_id": "333"}, {"_id": "456"}, {"_id": "123"}]
results = sorted(es_results, key=lambda x: order[x['_id']])
# Results:
# [{'_id': '123'}, {'_id': '333'}, {'_id': '456'}]
This may already be resolved, but perhaps this answer will help someone:
we can use a pinned query in ES. There is no need to loop over the results to sort them afterwards.
qs = {
"size" => drug_ids.count,
"query" => {
"pinned" => {
"ids" => drug_ids,
"organic" => {
"terms" => {
"id" => drug_ids
}
}
}
}
}
It will keep the sequence of the input as-is.
