Faster query by value

Faster query by value - performance

I want to query MongoDB to count how many documents have at least one nested field with value 0 inside their top-level results document.
For instance, in this collection:
{name: "mary", results: {"foo" : 0, "bar" : 8}}
{name: "bob", results: {"baz" : 9, "qux" : 0}}
{name: "leia", results: {"foo" : 9, "norf" : 5}}
my query should return 2, because two of the documents have 0 as the value of a nested field of results.
Here's my attempt:
db.collection.find({$where : function() {
    for (var key in this.results) {
        if (this.results[key] === 0) { return true; }
    }
    return false;
}})
This works on the dataset above, but it is too slow. My real data has 100k documents, each with 500 nested fields inside results, and the query takes a few minutes. Is it possible to write this query in a faster way?

With your current schema, there is no way to do it other than the one you are already using.
Your only alternatives are to change the schema or to use aggregation, but I don't think that is what you want.
There is a post about it you can check here:
mongoDB: find by embedded value
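To make the "change the schema" option concrete, here is a minimal sketch in plain Ruby with no database involved. The reshaped layout (results stored as an array of k/v pairs) and the helper name reshape are assumptions for illustration, not something from the question. The idea is that a fixed field name such as results.v can be indexed, whereas arbitrary keys under results cannot be covered by a single index.

```ruby
# Sample documents from the question, as plain Ruby hashes.
docs = [
  { "name" => "mary", "results" => { "foo" => 0, "bar" => 8 } },
  { "name" => "bob",  "results" => { "baz" => 9, "qux" => 0 } },
  { "name" => "leia", "results" => { "foo" => 9, "norf" => 5 } }
]

# Hypothetical reshaped layout: results becomes an array of {k, v} pairs.
# Stored this way, a plain query on the field "results.v" with value 0
# can be backed by an index on "results.v" instead of per-document JavaScript.
def reshape(doc)
  pairs = doc["results"].map { |key, value| { "k" => key, "v" => value } }
  doc.merge("results" => pairs)
end

reshaped = docs.map { |doc| reshape(doc) }

# Local stand-in for counting the documents that such a query would match.
zero_count = reshaped.count do |doc|
  doc["results"].any? { |pair| pair["v"] == 0 }
end

puts zero_count
```

On the three sample documents this counts mary and bob, matching the expected answer of 2.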

Related

Number of characters named Rick between episode 1 and 2

I have a problem. I am using the https://rickandmortyapi.com/graphql GraphQL API. I want to query the number of characters named Rick that appear between episode 1 and episode 2. Unfortunately my query is wrong. How can I get the desired output? I want to use the _" ..."Meta to get the metadata.
query {
  characters(filter: {name: "Rick"}) {
    _episode(filter: {id : 1
                      id: 2}) {
      count
    }
  }
}

MongoDB Query if (Field A - Field B) > N

I have been stuck on this for several hours now. I need to write a query that returns all documents where (Field A - Field B) > N
// sample data
{ _id: '...', estimated_hours: 0, actual_hours: 0 },
{ _id: '...', estimated_hours: 10, actual_hours: 9 },
{ _id: '...', estimated_hours: 20, actual_hours: 30 }
Borrowing answers from this stack question, I wrote the code below. In my mind this should have worked, but I am consistently getting records back that do not match the query.
## Attempt 1
n = 0
records = API::Record.where('$where': "(this.estimated_hours - this.actual_hours) > #{n}")
## should return the following, but I'm getting additional records
#=> [{ _id: '...', estimated_hours: 10, actual_hours: 9 }]
I know I can likely accomplish this with $project, but then I have to explicitly tell $project which fields I want returned. I need all the fields to be returned, because we use a third-party library that handles pagination.

db.collection.find({
  $where: "(this.estimated_hours - this.actual_hours) > 1"
})
Similar example for reference
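Another option that avoids $where and still returns whole documents is the $expr operator (available since MongoDB 3.6) combined with $subtract and $gt. Below is a minimal sketch that only builds the filter as a Ruby hash and checks the same predicate locally against the sample data. Whether API::Record.where passes such a raw selector through unchanged is an assumption you would need to verify for your ODM.

```ruby
# Sample data from the question (ids omitted).
records = [
  { estimated_hours: 0,  actual_hours: 0 },
  { estimated_hours: 10, actual_hours: 9 },
  { estimated_hours: 20, actual_hours: 30 }
]

n = 0

# Filter document: (estimated_hours - actual_hours) > n, evaluated by the server.
# Single quotes keep the "$" operator and field names as literal strings.
filter = {
  '$expr' => {
    '$gt' => [
      { '$subtract' => ['$estimated_hours', '$actual_hours'] },
      n
    ]
  }
}

# Local check of the same predicate, to show which sample records should match.
matching = records.select do |record|
  record[:estimated_hours] - record[:actual_hours] > n
end

p filter
p matching
```

With n = 0 only the second sample record satisfies the predicate, which is the result the question expects.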

Group list of objects/AR relation by user_id

I have a list of objects, which is actually an ActiveRecord relation. Each object has these fields:
{
  agreement_id: 1,
  app_user_id: 1,
  agency_name: 'Small business 1'
  ..etc..
},
{
  agreement_id: 2,
  app_user_id: 1,
  agency_name: 'Small business 2'
  ..etc..
}
I'm representing my objects as hashes for easier understanding. I need to map my list of objects to a format like this:
{
1 => [1,2]
}
This represents a list of agreement_ids grouped by user. I always know which user I'm grouping on. Here is what I've tried so far:
where(app_user_id: user_id).where('...').select('app_user_id, agreement_id').group_by(&:app_user_id)
This gives me the structure I want, but not exactly the data I want. Here is its output:
{1=>
[#<Agreement:0x6340fdbb agreement_id: 1, app_user_id: 1>,
#<Agreement:0x91bd4dd agreement_id: 2, app_user_id: 1>]
}
I also thought I would be able to do this with the map method, and here is what I tried:
where(app_user_id: user_id).where('....').select('app_user_id, agreement_id').map do |ag|
  { ag.app_user_id => ag.agreement_id }
end.reduce(&:merge)
But it only produces a mapping with the last agreement_id, like this:
{1=>2}
I've tried some other things not worth mentioning. Can anyone suggest a way that would make this work?

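This might work (transform_values requires Ruby 2.4+, and it returns a single hash rather than an array of one-key hashes):
where(app_user_id: user_id)
  .where('...')
  .select('app_user_id, agreement_id')
  .group_by(&:app_user_id)
  .transform_values { |agreements| agreements.map(&:agreement_id) }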
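A minimal runnable sketch of the grouping step, using plain Struct objects as stand-ins for the Agreement records (an assumption so that it runs without a database or Rails). It groups by app_user_id and then reduces each group to its agreement_ids with transform_values (Ruby 2.4+).

```ruby
# Stand-in for the ActiveRecord model; hypothetical, for illustration only.
Agreement = Struct.new(:agreement_id, :app_user_id)

agreements = [
  Agreement.new(1, 1),
  Agreement.new(2, 1)
]

# group_by builds { user_id => [records] };
# transform_values then keeps only the agreement_id of each record.
grouped = agreements
  .group_by(&:app_user_id)
  .transform_values { |group| group.map(&:agreement_id) }

p grouped
```

The result is a single hash that maps user 1 to the array [1, 2], which is the shape asked for in the question.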

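Try this one (the last step strips the user id back out of each pair, so the values are plain agreement_ids):
where(app_user_id: user_id).
  where('...').
  select('app_user_id, agreement_id').
  map { |a| [a.app_user_id, a.agreement_id] }.
  group_by(&:first).
  transform_values { |pairs| pairs.map(&:last) }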

Keep id order as in query

I'm using elasticsearch to get a mapping of ids to some values, but it is crucial that I keep the order of the results in the order that the ids have.
Example:
def term_mapping(ids)
  ids = ids.split(',')
  self.search do |s|
    s.filter :terms, id: ids
  end
end
res = term_mapping("4,2,3,1")
The result collection should contain the objects with the ids in order 4,2,3,1...
Do you have any idea how I can achieve this?

If you need to use search, you can sort the ids before you send them to Elasticsearch and retrieve the results sorted by id, or you can create a custom sort script that returns the position of the current document in the array of ids. However, a simpler and faster solution would be to use Multi-Get instead of search.

One option is to use the Multi GET API. If this doesn't work for you, another solution is to sort the results after you retrieve them from Elasticsearch. In Python, this can be done this way:
doc_ids = ["123", "333", "456"] # We want to keep this order
order = {v: i for i, v in enumerate(doc_ids)}
es_results = [{"_id": "333"}, {"_id": "456"}, {"_id": "123"}]
results = sorted(es_results, key=lambda x: order[x['_id']])
# Results:
# [{'_id': '123'}, {'_id': '333'}, {'_id': '456'}]

This problem may already be resolved, but this answer may still help someone.
We can use a pinned query in Elasticsearch, so there is no need for a loop to sort the results:
qs = {
  "size" => drug_ids.count,
  "query" => {
    "pinned" => {
      "ids" => drug_ids,
      "organic" => {
        "terms" => {
          "id" => drug_ids
        }
      }
    }
  }
}
It keeps the results in the same order as the input ids.

increment value in a hash

I have a bunch of posts which have category tags in them.
I am trying to find out how many times each category has been used.
I'm using rails with mongodb, BUT I don't think I need to be getting the occurrence of categories from the db, so the mongo part shouldn't matter.
This is what I have so far
@recent_posts = current_user.recent_posts # returns the 10 most recent posts
@categories_hash = {'tech' => 0, 'world' => 0, 'entertainment' => 0, 'sports' => 0}
@recent_posts.each do |cat|
  cat.categories.each do |addCat|
    @categories_hash.increment(addCat) # obviously this is where I'm having problems
  end
end
end
the structure of the post is
{"_id" : ObjectId("idnumber"), "created_at" : "Tue Aug 03...", "categories" :["world", "sports"], "message" : "the text of the post", "poster_id" : ObjectId("idOfUserPoster"), "voters" : []}
I'm open to suggestions on how else to get the count of categories, but I will want to get the count of voters eventually, so it seems to me the best way is to increment the categories_hash and then add the voters.length. But one thing at a time: I'm just trying to figure out how to increment values in the hash.

If you aren't familiar with map/reduce and you don't care about scaling up, this is not as elegant as map/reduce, but should be sufficient for small sites:
@categories_hash = Hash.new(0)
current_user.recent_posts.each do |post|
  post.categories.each do |category|
    @categories_hash[category] += 1
  end
end
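A self-contained version of the counting loop above that you can run without Rails or MongoDB. The Post struct and the sample posts are made up for illustration. Hash.new(0) gives every missing key a default of 0, so += 1 works on the first occurrence of a category. On Ruby 2.7+ the same counts can be obtained with tally.

```ruby
# Hypothetical stand-in for the post model; only categories matters here.
Post = Struct.new(:categories)

recent_posts = [
  Post.new(["world", "sports"]),
  Post.new(["tech"]),
  Post.new(["world"])
]

# Missing keys default to 0, so no need to pre-populate the hash.
categories_hash = Hash.new(0)
recent_posts.each do |post|
  post.categories.each do |category|
    categories_hash[category] += 1
  end
end

# Equivalent one-liner on Ruby 2.7+.
tallied = recent_posts.flat_map(&:categories).tally

p categories_hash
p tallied
```

Note that with Hash.new(0), categories that never occur (such as 'entertainment' here) are simply absent from the hash rather than present with a count of 0.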

If you're using MongoDB, an elegant way to aggregate tag usage is a map/reduce operation. MongoDB supports map/reduce operations written in JavaScript. Map/reduce runs on the database server(s), so your application does not have to retrieve and analyze every document (which wouldn't scale well for large collections).
As an example, here are the map and reduce functions I use in my blog on the articles collection to aggregate the usage of tags (which is used to build the tag cloud in the sidebar). Documents in the articles collection have a key named 'tags' which holds an array of strings (the tags).
The map function simply emits 1 for every used tag to count it:
function () {
  if (this.tags) {
    this.tags.forEach(function (tag) {
      emit(tag, 1);
    });
  }
}
The reduce function sums up the counts:
function (key, values) {
  var total = 0;
  values.forEach(function (v) {
    total += v;
  });
  return total;
}
As a result, the database returns a hash that has a key for every tag and its usage count as a value. E.g.:
{ 'rails' => 5, 'ruby' => 12, 'linux' => 3 }
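To see what the map and reduce functions above compute, here is a local simulation in plain Ruby. It is only a sketch of the data flow (emit, group by key, reduce) and does not run on the database server. The sample articles are made up for illustration.

```ruby
# Hypothetical sample articles; the last one has no tags key at all.
articles = [
  { "tags" => ["ruby", "rails"] },
  { "tags" => ["ruby", "linux"] },
  { "title" => "untagged" }
]

# Map step: emit a [tag, 1] pair for every tag, skipping untagged articles.
emitted = articles.flat_map do |article|
  (article["tags"] || []).map { |tag| [tag, 1] }
end

# Grouping step: collect the emitted values under each key.
values_by_tag = emitted
  .group_by(&:first)
  .transform_values { |pairs| pairs.map(&:last) }

# Reduce step: sum the values for each key.
tag_counts = values_by_tag.transform_values { |values| values.sum }

p tag_counts
```

Here "ruby" ends up with a count of 2, while "rails" and "linux" each get 1.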
