Keep id order as in query - elasticsearch

I'm using elasticsearch to get a mapping of ids to some values, but it is crucial that I keep the order of the results in the order that the ids have.
Example:
def term_mapping(ids)
ids = ids.split(',')
self.search do |s|
s.filter :terms, id: ids
end
end
res = term_mapping("4,2,3,1")
The result collection should contain the objects with the ids in order 4,2,3,1...
Do you have any idea how I can achieve this?

If you need to use search, you can sort the ids before you send them to Elasticsearch and retrieve the results sorted by id, or you can write a custom sort script that returns the position of the current document in the array of ids. However, a simpler and faster solution is to use Multi-Get instead of search.
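The Multi-Get suggestion above can be sketched roughly as follows. This only builds the request body for the `_mget` endpoint and reads back a response; no request is sent. It assumes the document `_id` is the same value as the `id` field used in the question, and the helper names `build_mget_body` and `docs_in_request_order` are made up for illustration. As far as I know, Multi-Get lists documents in the order the ids were requested, which is what makes it a fit here.

```python
import json


def build_mget_body(ids_csv):
    """Build a Multi-Get request body from a comma-separated id string."""
    ids = [part.strip() for part in ids_csv.split(",") if part.strip()]
    return {"ids": ids}


def docs_in_request_order(mget_response):
    """Keep only the documents that were found, preserving response order."""
    return [doc for doc in mget_response["docs"] if doc.get("found")]


# Hypothetical usage: the body would be sent to <index>/_mget.
body = build_mget_body("4,2,3,1")
print(json.dumps(body))

# A hand-written response shaped like a Multi-Get result (illustrative data).
fake_response = {
    "docs": [
        {"_id": "4", "found": True},
        {"_id": "2", "found": False},
        {"_id": "3", "found": True},
        {"_id": "1", "found": True},
    ]
}
print([doc["_id"] for doc in docs_in_request_order(fake_response)])
```

Ids that do not exist still occupy a slot in the response with `found` set to false, so the sketch filters them out instead of assuming every id resolves.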

One option is to use the Multi-Get API. If this doesn't work for you, another solution is to sort the results after you retrieve them from Elasticsearch. In Python, this can be done this way:
doc_ids = ["123", "333", "456"] # We want to keep this order
order = {v: i for i, v in enumerate(doc_ids)}
es_results = [{"_id": "333"}, {"_id": "456"}, {"_id": "123"}]
results = sorted(es_results, key=lambda x: order[x['_id']])
# Results:
# [{'_id': '123'}, {'_id': '333'}, {'_id': '456'}]

This problem may already be resolved, but this answer might still help someone.
You can use the pinned query in Elasticsearch, so there is no need for a loop to sort the results:
qs = {
  "size" => drug_ids.count,
  "query" => {
    "pinned" => {
      "ids" => drug_ids,
      "organic" => {
        "terms" => {
          "id" => drug_ids
        }
      }
    }
  }
}
This keeps the results in the same order as the input ids.


How to sort string with numbers in it numerically?

I am getting some JSON data and putting it inside a MutableList. I have a class with id, listId, and name inside of it. I'm trying to sort the list by listId, which is just an integer, and then by name, which has the format "Item 123". I'm doing the following:
val sortedList = data.sortedWith(compareBy({ it.listId }, { it.name }))
This sorts by listId correctly, but the names are sorted alphabetically, so the numbers go 1, 13, 2, 3. How can I sort by both fields but have the name sorted numerically as well?
I think this will work:
val sortedList = data.sortedWith(compareBy(
    { it.listId },
    { it.name.substring(0, it.name.indexOf(' ')) },
    { it.name.substring(it.name.indexOf(' ') + 1).toInt() }
))
However, it is not computationally efficient, because it calls String.indexOf() many times during the sort.
If you have a very long list, consider building another list in which each item stores the string part and the integer part of the name separately.
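The idea of splitting each name only once can be sketched like this in Python, where `sorted` computes the key a single time per element. The `Item` class and its field names are stand-ins for the class in the question, and the sketch assumes every name ends with a space followed by an integer, as in "Item 123".

```python
from dataclasses import dataclass


@dataclass
class Item:
    id: int
    list_id: int
    name: str


def sort_key(item):
    """Split "Item 123" into ("Item", 123) so the number compares numerically."""
    prefix, _, number = item.name.rpartition(" ")
    return (item.list_id, prefix, int(number))


items = [
    Item(1, 2, "Item 13"),
    Item(2, 1, "Item 2"),
    Item(3, 1, "Item 13"),
    Item(4, 1, "Item 1"),
]

# The key function runs once per item, not once per comparison.
sorted_items = sorted(items, key=sort_key)
for item in sorted_items:
    print(item.list_id, item.name)
```

A name without a trailing integer would make `int()` raise, so real data may need a fallback for malformed names.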

Group list of objects/AR relation by user_id

I have a list of objects which is actually an ActiveRecord relation. My object has these fields:
{
agreement_id: 1,
app_user_id: 1,
agency_name: 'Small business 1'
..etc..
},
{
agreement_id: 2,
app_user_id: 1,
agency_name: 'Small business 2'
..etc..
}
I'm representing my object as a Hash for easier understanding. I need to map my list of objects to a format like this:
{
1 => [1,2]
}
This represents a list of agreement_ids grouped by user. I always know which user I'm grouping on. Here is what I've tried so far:
where(app_user_id: user_id).where('...').select('app_user_id, agreement_id').group_by(&:app_user_id)
This gives me the structure I want but not exactly the data I want. Here is the output:
{1=>
[#<Agreement:0x6340fdbb agreement_id: 1, app_user_id: 1>,
#<Agreement:0x91bd4dd agreement_id: 2, app_user_id: 1>]
}
I also thought I would be able to do this with the map method, and here is what I tried:
where(app_user_id: user_id).where('....').select('app_user_id, agreement_id').map do |ag|
{ ag.app_user_id => ag.agreement_id }
end.reduce(&:merge)
But it only produces the mapping with the last agreement_id, because merge overwrites the value stored under the same key:
{1=>2}
I've tried some other things not worth mentioning. Can anyone suggest a way that would make this work?
This might work:
where(app_user_id: user_id)
  .where('...')
  .select('app_user_id, agreement_id')
  .group_by(&:app_user_id)
  .transform_values { |agreements| agreements.map(&:agreement_id) }
Try this:
where(app_user_id: user_id).
  where('...').
  select('app_user_id, agreement_id').
  map { |a| [a.app_user_id, a.agreement_id] }.
  group_by(&:first).
  transform_values { |pairs| pairs.map(&:last) }
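The grouping step itself, independent of ActiveRecord, can be sketched as below. The rows are plain dictionaries standing in for the records in the question, and `agreements_by_user` is a made-up helper name. Appending to a per-user list avoids the overwrite that a key-by-key merge causes.

```python
from collections import defaultdict


def agreements_by_user(rows):
    """Group agreement ids under the user id they belong to."""
    grouped = defaultdict(list)
    for row in rows:
        # Append instead of assign, so earlier ids for the same user are kept.
        grouped[row["app_user_id"]].append(row["agreement_id"])
    return dict(grouped)


rows = [
    {"agreement_id": 1, "app_user_id": 1, "agency_name": "Small business 1"},
    {"agreement_id": 2, "app_user_id": 1, "agency_name": "Small business 2"},
]
print(agreements_by_user(rows))
```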

How do I dynamically name a collection?

Pseudo-code: collect(n) AS :Label
The primary purpose of this is for easy reading of the properties in the API Server (node application).
Verbose example:
MATCH (user:User)--(n)
WHERE n:Movie OR n:Actor
RETURN user,
CASE
WHEN n:Movie THEN "movies"
WHEN n:Actor THEN "actors"
END as type, collect(n) as :type
Expected output in JSON:
[{
"user": {
....
},
"movies": [
{
"_id": 1987,
"labels": [
"Movie"
],
"properties": {
....
}
}
],
"actors": [ .... ]
}]
The closest I've gotten is:
[{
"user": {
....
},
"type": "movies",
"collect(n)": [
{
"_id": 1987,
"labels": [
"Movie"
],
"properties": {
....
}
}
]
}]
The goal is to be able to read the JSON result with ease like so:
neo4j.cypher.query(statement, function(err, results) {
  for (var result of results) {
    var user = result.user
    var movies = result.movies
  }
})
Edit:
I apologize for any confusion in my inability to correctly name database semantics.
I'm wondering if it's enough just to output the user and their lists of both actors and movies, rather than trying to do a more complicated means of matching and combining both.
MATCH (user:User)
OPTIONAL MATCH (user)--(m:Movie)
OPTIONAL MATCH (user)--(a:Actor)
RETURN user, COLLECT(m) as movies, COLLECT(a) as actors
This query should return each User and his/her related movies and actors (in separate collections):
MATCH (user:User)--(n)
WHERE n:Movie OR n:Actor
RETURN user,
REDUCE(s = {movies:[], actors:[]}, x IN COLLECT(n) |
CASE WHEN x:Movie
THEN {movies: s.movies + x, actors: s.actors}
ELSE {movies: s.movies, actors: s.actors + x}
END) AS types;
As far as a dynamic solution to your question, one that will work with any node connected to your user, there are a few options, but I don't believe you can get the column names to be dynamic like this, or even the names of the collections returned, though we can associate them with the type.
MATCH (user:User)--(n)
WITH user, LABELS(n) as type, COLLECT(n) as nodes
WITH user, {type:type, nodes:nodes} as connectedNodes
RETURN user, COLLECT(connectedNodes) as connectedNodes
Or, if you prefer working with multiple rows, one row each per node type:
MATCH (user:User)--(n)
WITH user, LABELS(n) as type, COLLECT(n) as collection
RETURN user, {type:type, data:collection} as connectedNodes
Note that LABELS(n) returns a list of labels, since nodes can have multiple labels. If you are guaranteed that every node of interest has exactly one label, you can use the first element of the list rather than the list itself. Just use LABELS(n)[0] instead.
You can dynamically group nodes by label and then convert the result to a map using the APOC library:
WITH ['Actor','Movie'] as LBS
// What are the nodes we need:
MATCH (U:User)--(N) WHERE size(filter(l in labels(N) WHERE l in LBS))>0
WITH U, LBS, N, labels(N) as nls
UNWIND nls as nl
// Combine the nodes on their labels:
WITH U, LBS, N, nl WHERE nl in LBS
WITH U, nl, collect(N) as RELS
WITH U, collect( [nl, RELS] ) as pairs
// Convert pairs "label - values" to the map:
CALL apoc.map.fromPairs(pairs) YIELD value
RETURN U as user, value
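If the column names cannot be made dynamic in the query, the reshaping can also happen on the client. The sketch below folds rows shaped like the "closest I've gotten" output (one row per user and type) into one object per user with a key per type. It assumes each user carries an `_id` field and that the collected nodes are returned under a `nodes` key; both names, and the helper `fold_rows`, are assumptions for illustration.

```python
def fold_rows(rows):
    """Merge one-row-per-type results into one dict per user, keyed by type."""
    by_user = {}
    for row in rows:
        user_id = row["user"]["_id"]  # assumed unique identifier
        entry = by_user.setdefault(user_id, {"user": row["user"]})
        # The value of "type" becomes the key, e.g. "movies" or "actors".
        entry[row["type"]] = row["nodes"]
    return list(by_user.values())


rows = [
    {"user": {"_id": 1}, "type": "movies", "nodes": [{"_id": 1987}]},
    {"user": {"_id": 1}, "type": "actors", "nodes": [{"_id": 2001}]},
    {"user": {"_id": 2}, "type": "movies", "nodes": [{"_id": 1990}]},
]
for result in fold_rows(rows):
    print(result["user"], result.get("movies"), result.get("actors"))
```

A user with no nodes of a given type simply has no such key, which is why the usage reads with `get`.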

Faster query by value

I want to query MongoDB to count how many documents have at least one nested value equal to 0 inside their top-level results document.
For instance, in this collection:
{name: "mary", results: {"foo" : 0, "bar" : 8}}
{name: "bob", results: {"baz" : 9, "qux" : 0}}
{name: "leia", results: {"foo" : 9, "norf" : 5}}
my query should return 2, because two of the documents have 0 as one of the values inside results.
Here's my attempt
db.collection.find({$where : function() {
  for (var key in this.results) {
    if (this.results[key] === 0) { return true; }
  }
  return false;
}})
which works on the above dataset, but is too slow. My real data are 100k documents, each having 500 nested documents inside results, and the above query takes a few minutes. Is it possible to design this query in a faster way?
With the current schema, there is no other way to do it than the one you are using.
You can only change the schema or use aggregations, but I don't think that is what you want.
There is a related post about this that you can check:
mongoDB: find by embedded value
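To make the "change the schema" option concrete, here is a rough sketch of one possible reshaping: turning the results sub-document into a list of key/value pairs. With that shape, a plain query on the value field (something like `{"results.v": 0}`) would no longer need `$where`, and the field could be indexed. The `k`/`v` field names and both helper functions are assumptions, and the counting below runs in plain Python only to show the intended result.

```python
def to_attribute_list(doc):
    """Reshape {"foo": 0, "bar": 8} into [{"k": "foo", "v": 0}, {"k": "bar", "v": 8}]."""
    return {
        "name": doc["name"],
        "results": [{"k": key, "v": value} for key, value in doc["results"].items()],
    }


def count_docs_with_value(docs, target):
    """Count reshaped documents that contain at least one pair with the target value."""
    return sum(
        1 for doc in docs if any(pair["v"] == target for pair in doc["results"])
    )


collection = [
    {"name": "mary", "results": {"foo": 0, "bar": 8}},
    {"name": "bob", "results": {"baz": 9, "qux": 0}},
    {"name": "leia", "results": {"foo": 9, "norf": 5}},
]
reshaped = [to_attribute_list(doc) for doc in collection]
print(count_docs_with_value(reshaped, 0))
```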

increment value in a hash

I have a bunch of posts which have category tags in them.
I am trying to find out how many times each category has been used.
I'm using rails with mongodb, BUT I don't think I need to be getting the occurrence of categories from the db, so the mongo part shouldn't matter.
This is what I have so far
@recent_posts = current_user.recent_posts #returns the 10 most recent posts
@categories_hash = {'tech' => 0, 'world' => 0, 'entertainment' => 0, 'sports' => 0}
@recent_posts.each do |cat|
  cat.categories.each do |addCat|
    @categories_hash.increment(addCat) #obviously this is where I'm having problems
  end
end
the structure of the post is
{"_id" : ObjectId("idnumber"), "created_at" : "Tue Aug 03...", "categories" :["world", "sports"], "message" : "the text of the post", "poster_id" : ObjectId("idOfUserPoster"), "voters" : []}
I'm open to suggestions on how else to get the count of categories, but I will want to get the count of voters eventually, so it seems to me the best way is to increment the categories_hash and then add voters.length. One thing at a time, though: for now I'm just trying to figure out how to increment values in the hash.
If you aren't familiar with map/reduce and you don't care about scaling up, this is not as elegant as map/reduce, but should be sufficient for small sites:
@categories_hash = Hash.new(0)
current_user.recent_posts.each do |post|
  post.categories.each do |category|
    @categories_hash[category] += 1
  end
end
If you're using MongoDB, an elegant way to aggregate tag usage is a map/reduce operation. MongoDB supports map/reduce operations using JavaScript code. Map/reduce runs on the db server(s), i.e. your application does not have to retrieve and analyze every document (which wouldn't scale well for large collections).
As an example, here are the map and reduce functions I use in my blog on the articles collection to aggregate the usage of tags (which is used to build the tag cloud in the sidebar). Documents in the articles collection have a key named 'tags' which holds an array of strings (the tags).
The map function simply emits 1 on every used tag to count it:
function () {
if (this.tags) {
this.tags.forEach(function (tag) {
emit(tag, 1);
});
}
}
The reduce function sums up the counts:
function (key, values) {
var total = 0;
values.forEach(function (v) {
total += v;
});
return total;
}
As a result, the database returns a hash that has a key for every tag and its usage count as a value. E.g.:
{ 'rails' => 5, 'ruby' => 12, 'linux' => 3 }
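To see what the two functions do together, the flow can be imitated locally in Python. This is only an in-memory illustration of the emit-then-sum idea, not how the database executes it; `map_tags`, `reduce_counts`, and `map_reduce` are made-up names, and the sample articles are invented.

```python
from collections import defaultdict


def map_tags(doc):
    """Emit (tag, 1) for every tag on a document, like the map function above."""
    for tag in doc.get("tags") or []:
        yield tag, 1


def reduce_counts(key, values):
    """Sum the emitted counts for one tag, like the reduce function above."""
    return sum(values)


def map_reduce(docs):
    """Collect emitted values per key, then reduce each key's values."""
    emitted = defaultdict(list)
    for doc in docs:
        for key, value in map_tags(doc):
            emitted[key].append(value)
    return {key: reduce_counts(key, values) for key, values in emitted.items()}


articles = [
    {"title": "a", "tags": ["rails", "ruby"]},
    {"title": "b", "tags": ["ruby", "linux"]},
    {"title": "c"},  # no tags key, skipped like the `if (this.tags)` guard
]
print(map_reduce(articles))
```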
