Rethink merge array - rethinkdb

I have a query -
r.table('orgs')
.filter(function(org) {
return r.expr(['89a26384-8fe0-11e8-9eb6-529269fb1459', '89a26910-8fe0-11e8-9eb6-529269fb1459'])
.contains(org('id'));
})
.pluck("users")
This returns the following output:
{
"users": [
"3a415919-f50b-4e15-b3e6-719a2a6b26a7"
]
} {
"users": [
"d8c4aa0a-8df6-4d47-8b0e-814a793f69e2"
]
}
How do I get the result as -
[
"3a415919-f50b-4e15-b3e6-719a2a6b26a7","d8c4aa0a-8df6-4d47-8b0e-814a793f69e2"
]

First, don't use that complicated and resource-consuming .filter directly on a table. Since your tested field is already indexed (id), you can:
r.table('orgs').getAll('89...59', '89...59')
or
r.table('orgs').getAll(r.args(['89...59', '89...59']))
which is much faster. I recently found this article about how much faster that is.
Now to get an array of users without the wrapping, using the brackets operation:
r.table('orgs').getAll(...).pluck('users')('users')
will provide a result like
[
'123',
'456'
],
[
'123',
'789'
]
We just removed the "users" wrapping, but the result is an array of arrays. Let's flatten this 2D array with .concatMap:
r.table('orgs').getAll(...).pluck('users')('users').concatMap(function (usrs) {
return usrs;
})
Now we've concatenated the sub-arrays into one, however we see duplicates (from my previous result example, you'd have '123' twice). Just .distinct the thing:
r.table('orgs').getAll(...).pluck('users')('users').concatMap(function (usrs) {
return usrs;
}).distinct()
From the example I took, you have now:
'123',
'456',
'789'
Et voilà!
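If it helps to see what each step does to the data, here is a rough plain-Python model of the pipeline. It is not the RethinkDB driver, the org ids are made up, and sorting the final result is my own choice to keep the output stable.

```python
# Plain-Python model of the pipeline above; this is NOT the RethinkDB driver.
# Sample documents mirror the '123'/'456'/'789' example from the answer.
orgs = [
    {"id": "org-a", "users": ["123", "456"]},  # hypothetical ids
    {"id": "org-b", "users": ["123", "789"]},
]

# pluck('users')('users'): keep only the value of each "users" field
user_lists = [org["users"] for org in orgs]

# concatMap(usrs => usrs): flatten the list of lists into one list
flat = [user for users in user_lists for user in users]

# distinct(): drop duplicates (sorted here only to make the result stable)
unique_users = sorted(set(flat))

print(unique_users)
```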

Related

RethinkDB filter array, return only matched values

I have a table like this
{
dummy: [
"new val",
"new val 2",
"new val 3",
"other",
]
}
want to get only matched values to "new", I am using query like this:
r.db('db').table('table')('dummy').filter(function (val) {
return val.match("^new")
})
but it's giving an error:
e: Expected type STRING but found ARRAY in
What is wrong with the query? If I remove .match("^new"), it returns all values.
Thanks
The reason you're getting "Expected type STRING but found ARRAY" is that the value of the dummy field is itself an array, and you cannot apply match to an array.
The filter you tried runs on the sequence of dummy arrays, not on the strings inside them. Rework the query a bit: map each array to a new array, and filter its elements in an inner expression.
For example:
r.db('db')
.table('table')
.getField('dummy')
.map((array) => array.filter((element) => element.match("^new")))
Output:
[
"new val" ,
"new val 2" ,
"new val 3"
]
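The same map-then-filter shape can be modelled in plain Python, which may make the nesting easier to see. This is only an illustration: the rows are hypothetical and Python's re.match stands in for ReQL's match.

```python
import re

# Hypothetical rows standing in for the table; each has an array field "dummy".
rows = [
    {"dummy": ["new val", "new val 2", "new val 3", "other"]},
]

# getField('dummy'): one array per row
arrays = [row["dummy"] for row in rows]

# map(array => array.filter(element => element.match("^new"))):
# the outer loop walks the arrays, the inner filter tests each string
matched = [[el for el in array if re.match(r"^new", el)] for array in arrays]

print(matched)
```

Note that the result holds one filtered array per row, so with several rows you get several arrays back.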

How do I dynamically name a collection?

Pseudo-code: collect(n) AS :Label
The primary purpose of this is for easy reading of the properties in the API Server (node application).
Verbose example:
MATCH (user:User)--(n)
WHERE n:Movie OR n:Actor
RETURN user,
CASE
WHEN n:Movie THEN "movies"
WHEN n:Actor THEN "actors"
END as type, collect(n) as :type
Expected output in JSON:
[{
"user": {
....
},
"movies": [
{
"_id": 1987,
"labels": [
"Movie"
],
"properties": {
....
}
}
],
"actors": [ .... ]
}]
The closest I've gotten is:
[{
"user": {
....
},
"type": "movies",
"collect(n)": [
{
"_id": 1987,
"labels": [
"Movie"
],
"properties": {
....
}
}
]
}]
The goal is to be able to read the JSON result with ease like so:
neo4j.cypher.query(statement, function(err, results) {
  for (var result of results) {
    var user = result.user
    var movies = result.movies
  }
})
Edit:
I apologize for any confusion in my inability to correctly name database semantics.
I'm wondering if it's enough just to output the user and their lists of both actors and movies, rather than trying to do a more complicated means of matching and combining both.
MATCH (user:User)
OPTIONAL MATCH (user)--(m:Movie)
OPTIONAL MATCH (user)--(a:Actor)
RETURN user, COLLECT(DISTINCT m) as movies, COLLECT(DISTINCT a) as actors
This query should return each User and his/her related movies and actors (in separate collections):
MATCH (user:User)--(n)
WHERE n:Movie OR n:Actor
RETURN user,
REDUCE(s = {movies:[], actors:[]}, x IN COLLECT(n) |
CASE WHEN x:Movie
THEN {movies: s.movies + x, actors: s.actors}
ELSE {movies: s.movies, actors: s.actors + x}
END) AS types;
As for a dynamic solution to your question, one that will work with any node connected to your user, there are a few options. I don't believe you can make the column names dynamic like this, or even the names of the returned collections, but we can associate each collection with its type.
MATCH (user:User)--(n)
WITH user, LABELS(n) as type, COLLECT(n) as nodes
WITH user, {type:type, nodes:nodes} as connectedNodes
RETURN user, COLLECT(connectedNodes) as connectedNodes
Or, if you prefer working with multiple rows, one row each per node type:
MATCH (user:User)--(n)
WITH user, LABELS(n) as type, COLLECT(n) as collection
RETURN user, {type:type, data:collection} as connectedNodes
Note that LABELS(n) returns a list of labels, since nodes can be multi-labeled. If you are guaranteed that every node of interest has exactly one label, you can use the first element of the list rather than the list itself: just use LABELS(n)[0] instead.
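Since the column names can't be made dynamic in the query itself, one option is to fold the rows into the desired shape on the client. Below is a minimal sketch of that idea in Python. The row shape is an assumption (one row per user and type, with LABELS(n)[0] as the type), fold_by_type is a hypothetical helper, and the lowercase-plus-"s" key naming is only for illustration.

```python
# Hypothetical rows, shaped like the one-row-per-type query above when
# LABELS(n)[0] is used as the type. Real driver results will look different.
rows = [
    {"user": {"_id": 1}, "connectedNodes": {"type": "Movie", "data": [{"_id": 1987}]}},
    {"user": {"_id": 1}, "connectedNodes": {"type": "Actor", "data": [{"_id": 42}]}},
    {"user": {"_id": 2}, "connectedNodes": {"type": "Movie", "data": [{"_id": 1987}]}},
]

def fold_by_type(rows):
    """Merge rows per user, using each type as a key such as "movies"."""
    merged = {}
    for row in rows:
        user_id = row["user"]["_id"]
        entry = merged.setdefault(user_id, {"user": row["user"]})
        # Illustrative naming rule only: "Movie" -> "movies"
        key = row["connectedNodes"]["type"].lower() + "s"
        entry.setdefault(key, []).extend(row["connectedNodes"]["data"])
    return list(merged.values())

results = fold_by_type(rows)
for result in results:
    print(result["user"], result.get("movies", []), result.get("actors", []))
```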
You can dynamically sort nodes by label, and then convert to the map using the apoc library:
WITH ['Actor','Movie'] as LBS
// What are the nodes we need:
MATCH (U:User)--(N) WHERE size(filter(l in labels(N) WHERE l in LBS))>0
WITH U, LBS, N, labels(N) as nls
UNWIND nls as nl
// Combine the nodes on their labels:
WITH U, LBS, N, nl WHERE nl in LBS
WITH U, nl, collect(N) as RELS
WITH U, collect( [nl, RELS] ) as pairs
// Convert pairs "label - values" to the map:
CALL apoc.map.fromPairs(pairs) YIELD value
RETURN U as user, value

With a hash of lists how do I operate on each key/list element once in random order?

For example, if my HoL looks like:
%HoL = (
"flintstones" => [ "fred", "barney" ],
"jetsons" => [ "george", "jane", "elroy" ],
"simpsons" => [ "homer", "marge", "bart" ],
);
And I want to create a loop that will allow me to operate only once on each key/element pair in a completely random order (so that it jumps between keys randomly too, not just elements), how do I do that? I'm thinking it will use shuffle, but figuring out the specifics is defeating me.
(Sorry for noobishness of question; I haven't been coding long. I was also unable to find an answer for this specific problem by googling, though I daresay it's been answered somewhere before.)
Build an array of all key-value pairs, then shuffle that:
use List::Util 'shuffle';
my %HoL = (
"flintstones" => [ "fred", "barney" ],
"jetsons" => [ "george", "jane", "elroy" ],
"simpsons" => [ "homer", "marge", "bart" ],
);
# Build an array of arrayrefs ($ref->[0] is the key and $ref->[1] is the value)
my @ArrayOfPairs = map {
    my $key = $_;
    map { [ $key, $_ ] } @{$HoL{$key}}
} keys %HoL;
for my $pair (shuffle @ArrayOfPairs) {
    print "$pair->[1] $pair->[0]\n";
}
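For comparison, here is the same build-pairs-then-shuffle idea sketched in Python, using random.shuffle from the standard library. The variable names are mine.

```python
import random

hol = {
    "flintstones": ["fred", "barney"],
    "jetsons": ["george", "jane", "elroy"],
    "simpsons": ["homer", "marge", "bart"],
}

# Flatten the hash of lists into (key, element) pairs
pairs = [(key, name) for key, names in hol.items() for name in names]

# Shuffle in place, so the loop jumps between keys as well as elements
random.shuffle(pairs)

for key, name in pairs:
    print(name, key)
```

Because every pair appears exactly once in the list before shuffling, each key/element combination is visited exactly once.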

Keep id order as in query

I'm using elasticsearch to get a mapping of ids to some values, but it is crucial that I keep the order of the results in the order that the ids have.
Example:
def term_mapping(ids)
ids = ids.split(',')
self.search do |s|
s.filter :terms, id: ids
end
end
res = term_mapping("4,2,3,1")
The result collection should contain the objects with the ids in order 4,2,3,1...
Do you have any idea how I can achieve this?
If you need to use search, you can sort the ids before you send them to elasticsearch and retrieve the results sorted by id, or you can create a custom sort script that returns the position of the current document in the array of ids. However, a simpler and faster solution would be to use Multi-Get instead of search.
One option is to use the Multi GET API. If this doesn't work for you, another solution is to sort the results after you retrieve them from es. In python, this can be done this way:
doc_ids = ["123", "333", "456"] # We want to keep this order
order = {v: i for i, v in enumerate(doc_ids)}
es_results = [{"_id": "333"}, {"_id": "456"}, {"_id": "123"}]
results = sorted(es_results, key=lambda x: order[x['_id']])
# Results:
# [{'_id': '123'}, {'_id': '333'}, {'_id': '456'}]
This problem may already be resolved, but this answer might help someone.
We can use a pinned query in ES, so there is no need for a loop to sort the results:
qs = {
  "size" => drug_ids.count,
  "query" => {
    "pinned" => {
      "ids" => drug_ids,
      "organic" => {
        "terms" => {
          "id" => drug_ids
        }
      }
    }
  }
}
It will keep the results in the same order as the input ids.

Ordering array by dependencies with perl

I have an array of hashes:
my @arr = get_from_somewhere();
The @arr contents (for example) are:
@arr = (
{ id => "id2", requires => 'someid', text => "another text2" },
{ id => "xid4", requires => 'id2', text => "text44" },
{ id => "someid", requires => undef, text => "some text" },
{ id => "id2", requires => 'someid', text => "another text2" },
{ id => "aid", requires => undef, text => "alone text" },
{ id => "id2", requires => 'someid', text => "another text2" },
{ id => "xid3", requires => 'id2', text => "text33" },
);
I need something like:
my $texts = join("\n", get_ordered_texts(@arr) );
So I need to write a sub that returns the array of texts from the hashes in dependency order. From the above example I need to get:
"some text", #someid the id2 depends on it - so need be before id2
"another text2", #id2 the xid3 and xid4 depends on it - and it is depends on someid
"text44", #xid4 the xid4 and xid3 can be in any order, because nothing depend on them
"text33", #xid3 but need be below id2
"alone text", #aid nothing depends on aid and hasn't any dependencies, so this line can be anywhere
As you can see, @arr can contain duplicated "lines" ("id2" in the above example); each id should be output only once.
I'm not providing any code example yet, because I have no idea how to start. ;(
Is there a CPAN module that can be used for this?
Can anybody point me in the right direction?
Using Graph:
use Graph qw( );
my @recs = (
    { id => "id2", requires => 'someid', text => "another text2" },
    { id => "xid4", requires => 'id2', text => "text44" },
    { id => "someid", requires => undef, text => "some text" },
    { id => "id2", requires => 'someid', text => "another text2" },
    { id => "aid", requires => undef, text => "alone text" },
    { id => "id2", requires => 'someid', text => "another text2" },
    { id => "xid3", requires => 'id2', text => "text33" },
);
sub get_ordered_recs {
    my %recs;
    my $graph = Graph->new();
    for my $rec (@_) {
        my ($id, $requires) = @{$rec}{qw( id requires )};
        $graph->add_vertex($id);
        $graph->add_edge($requires, $id) if $requires;
        $recs{$id} = $rec;
    }
    return map $recs{$_}, $graph->topological_sort();
}
my @texts = map $_->{text}, get_ordered_recs(@recs);
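The same approach can be sketched in Python with graphlib.TopologicalSorter from the standard library (Python 3.9+). This is a rough equivalent rather than a translation of Graph.pm: get_ordered_recs mirrors the Perl name, and skipping ids that have no record is my own assumption about how missing requirements should be handled.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

recs = [
    {"id": "id2", "requires": "someid", "text": "another text2"},
    {"id": "xid4", "requires": "id2", "text": "text44"},
    {"id": "someid", "requires": None, "text": "some text"},
    {"id": "id2", "requires": "someid", "text": "another text2"},
    {"id": "aid", "requires": None, "text": "alone text"},
    {"id": "id2", "requires": "someid", "text": "another text2"},
    {"id": "xid3", "requires": "id2", "text": "text33"},
]

def get_ordered_recs(recs):
    by_id = {}
    sorter = TopologicalSorter()
    for rec in recs:
        by_id[rec["id"]] = rec  # duplicate ids collapse into one entry
        if rec["requires"] is None:
            sorter.add(rec["id"])
        else:
            # add(node, predecessor): the requirement must come first
            sorter.add(rec["id"], rec["requires"])
    # static_order() raises graphlib.CycleError on a dependency loop.
    # Ids that appear only as a requirement have no record and are skipped.
    return [by_id[i] for i in sorter.static_order() if i in by_id]

texts = [rec["text"] for rec in get_ordered_recs(recs)]
print("\n".join(texts))
```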
An interesting problem.
Here's my first round solution:
sub get_ordered_texts {
    my @pending = @_;   # records not yet placed
    my %dep_found;      # track the set of ids already placed
    my @sorted_arr;     # output
    while (@pending) {
        my @waiting;
        for my $value (@pending) {
            # skip duplicates of ids we have already placed
            next if $dep_found{ $value->{id} };
            # defer this text until its requirement has been placed
            if (defined $value->{requires}
                and not $dep_found{ $value->{requires} }) {
                push @waiting, $value;
                next;
            }
            # Add to the sorted list and remember that we found it
            push @sorted_arr, $value->{text};
            $dep_found{ $value->{id} }++;
        }
        # infinite loop protection: this pass made no progress
        die "some requirements don't exist or there is a dependency loop"
            if @waiting == @pending;
        @pending = @waiting;
    }
    return \@sorted_arr;
}
This is not terribly efficient: each pass over the list may place only one record, so the worst case is O(n^2). If you don't have a huge dataset, it's probably OK.
I would use a directed graph to represent the dependency tree and then walk the graph. I've done something very similar using Graph.pm.
Each of your hashes would be a graph vertex and each edge would represent a dependency. This has the added benefit of supporting more complex dependencies in the future, as well as providing shortcut functions for working with the graph.
You didn't say what to do if the dependencies are "independent" of each other.
E.g. id1 requires id2; id3 requires id4; id3 requires id5. What should the order be? (other than 2 before 1 and both 4/5 before 3)
What you want is basically a BFS (Breadth First Search) of a tree (directed graph) of dependencies (or a forest depending on answers to #1 - the forest being a set of non-connected trees).
To do that:
Find all of the root nodes (ids that don't have a requirement themselves)
You can easily do that by making a hash of ALL the IDs using grep on your data structure
Put all those root nodes into a starting array.
Then implement BFS. If you need help implementing basic BFS using an array and a loop in Perl, ask a separate question. There may be a CPAN module, but the algorithm/code is rather trivial (at least once you've written it once :)
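The steps above can be sketched roughly as follows, in Python rather than Perl. It assumes each record has at most one requirement, as in the question's data, so the dependencies form a forest. bfs_order is a hypothetical name, and records caught in a dependency loop or pointing at a missing id are simply never reached in this sketch.

```python
from collections import deque

recs = [
    {"id": "id2", "requires": "someid", "text": "another text2"},
    {"id": "xid4", "requires": "id2", "text": "text44"},
    {"id": "someid", "requires": None, "text": "some text"},
    {"id": "id2", "requires": "someid", "text": "another text2"},
    {"id": "aid", "requires": None, "text": "alone text"},
    {"id": "xid3", "requires": "id2", "text": "text33"},
]

def bfs_order(recs):
    # A hash of all ids also removes the duplicated records
    by_id = {rec["id"]: rec for rec in recs}

    # Map each id to the ids that directly depend on it
    children = {}
    for rec in by_id.values():
        if rec["requires"] is not None:
            children.setdefault(rec["requires"], []).append(rec["id"])

    # Root nodes: ids that have no requirement themselves
    queue = deque(i for i, rec in by_id.items() if rec["requires"] is None)

    ordered = []
    while queue:
        current = queue.popleft()
        ordered.append(by_id[current]["text"])
        queue.extend(children.get(current, []))
    return ordered

texts = bfs_order(recs)
print("\n".join(texts))
```

All roots are emitted first, then their direct dependents, and so on, which satisfies the "requirement before dependent" rule level by level.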
