Selecting age count without intervals - ruby

What I am trying to do is write a query that returns the number of people at each individual age, not in intervals: the count of people who have been alive for 1, 2, 3, ... 67, ... 99, ... years.
I am not familiar with NoSQL, but I know that because time keeps passing, the age counts will have to be updated or refreshed periodically. My idea was to have a collection (or something similar) keyed by age, with the number of people of that age as the value. When a new person is created, the count for his or her age is incremented, and, as I said, something else updates the counts over time.
What I am trying to figure out is whether there is a way to fetch the number of people at each age in real time, without keeping a counter. Or, if I must use a counter, how can I have the database increment it automatically so that my program does not need to do it?

You can achieve this by using MongoDB's aggregation framework. In order to keep it up to date in real time, what you need to do is the following:
Project an ageMillis field by subtracting the date of birth (dob) from the current date. You will get an age value in milliseconds.
Divide ageMillis by the number of milliseconds in a year (31536000000 for a 365-day year) and project the result onto an ageDecimal field. You don't want to group by this value because it still contains a fractional part.
Project the ageDecimal field and a decimal field containing the decimal portion of the age. You are able to do this using the $mod operator.
Subtract decimal from ageDecimal and project it to an age field. This gives you the age value in years.
Group by the age field and keep track of the count using $sum. Basically you add 1 for every document you see for that age.
If needed, sort by age field.
The command in the mongo shell would look something like the command below, using JavaScript's Date() object to get the current date. If you want to do this in Ruby, you would have to change that bit of code and make sure that for the rest, you follow the syntax for the Ruby driver.
db.collection.aggregate([
    { "$project" :
        {
            "ageMillis" : { "$subtract" : [ new Date(), "$dob" ] }
        }
    },
    { "$project" :
        {
            "ageDecimal" : { "$divide" : [ "$ageMillis", 31536000000 ] }
        }
    },
    { "$project" :
        {
            "ageDecimal" : "$ageDecimal",
            "decimal" : { "$mod" : [ "$ageDecimal", 1 ] }
        }
    },
    { "$project" :
        {
            "age" : { "$subtract" : [ "$ageDecimal", "$decimal" ] }
        }
    },
    { "$group" :
        {
            "_id" : { "age" : "$age" },
            "count" : { "$sum" : 1 }
        }
    },
    { "$sort" :
        {
            "_id.age" : 1
        }
    }
]);
This should give you the results that you want. Note that the aggregate() method returns a cursor. You will have to iterate through it to get the results.
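To make the arithmetic in the pipeline above concrete, here is a minimal plain-Ruby sketch of the same calculation, run on in-memory values instead of a collection. The birth dates and the fixed `now` are made-up sample data, and `age_in_years` is a hypothetical helper name, not part of any driver.

```ruby
# Milliseconds in a 365-day year, the constant used in the pipeline above.
MS_PER_YEAR = 31_536_000_000

# Whole years elapsed, mirroring ageDecimal - (ageDecimal % 1).
# Hypothetical helper for illustration only.
def age_in_years(dob, now)
  age_millis = ((now - dob) * 1000).to_i # Time - Time gives seconds
  age_millis / MS_PER_YEAR               # integer division drops the fraction
end

# Made-up sample data standing in for documents with a dob field.
now  = Time.utc(2020, 1, 1)
dobs = [Time.utc(1990, 6, 15), Time.utc(1990, 1, 2), Time.utc(2000, 3, 1)]

# Equivalent of the $group / $sum stage: one counter per age.
counts = Hash.new(0)
dobs.each { |dob| counts[age_in_years(dob, now)] += 1 }

# Equivalent of the $sort stage: ascending by age.
sorted = counts.sort.to_h
puts sorted.inspect
```

With the 365-day constant, the person born on 1990-01-02 already lands in the 30 group on 2020-01-01, one day before the actual birthday, because leap days are ignored. If that drift matters, a calendar-based age calculation would be needed instead.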

The aggregation framework is the best approach for this. Mongoid exposes the lower level collection object through a .collection accessor. This allows the native driver implementation of aggregate to be used.
The basic math here is:
Floor (the result with its fractional part removed) of:
( difference from date of birth to now in milliseconds /
number of milliseconds in a year )
Feed the current Time value into your aggregation statement to get the current age:
now = Time.now # capture once so both expressions use the same instant
res = Model.collection.aggregate([
  { "$group" => {
    "_id" => {
      # age in whole years: (ms since dob / ms per year) minus its fractional part
      "$subtract" => [
        { "$divide" => [
          { "$subtract" => [ now, "$dob" ] },
          31536000000
        ]},
        { "$mod" => [
          { "$divide" => [
            { "$subtract" => [ now, "$dob" ] },
            31536000000
          ]},
          1
        ]}
      ]
    },
    "count" => { "$sum" => 1 }
  }},
  { "$sort" => { "_id" => -1 } }
])
pp res


How to search through nested array and retrieve only matched elements with mongo and springdata [duplicate]

This question already has answers here:
Find in Double Nested Array MongoDB
(2 answers)
Spring data Match and Filter Nested Array
(1 answer)
Closed 3 years ago.
I'm looking to search my collection and retrieve only the elements that match a Criteria.
Here is a document from my collection:
{
"_id" : "id",
"name" : "test",
"groupUsers" : [
{
"name" : "blabla",
"toys" : [
{
"createdAt" : ISODate("2019-10-30T12:59:41.409Z"),
},
{
"createdAt" : ISODate("2019-11-30T12:59:10.409Z"),
},
{
"createdAt" : ISODate("2019-12-30T12:59:12.409Z"),
}
],
"createdAt" : ISODate("2019-10-30T12:33:39.036Z")
},
{
"name" : "blabla2",
"toys" : [
{
"createdAt" : ISODate("2019-10-32T12:59:41.409Z"),
},
{
"createdAt" : ISODate("2019-11-30T12:59:56.409Z"),
},
{
"createdAt" : ISODate("2019-12-30T12:59:15.409Z"),
}
],
"createdAt" : ISODate("2019-10-32T12:33:39.036Z")
}
],
}
I want to retrieve the whole document, but the toys returned should depend on when the user was added to the group. For example, user blabla2 (in the example above) should get the whole group, but with only the last two toys of the first user in the response.
Anyway, I guess it's something really basic but I don't know why I can't figure it out.
What I'm Doing
I run a first query to get the current user and find out when he was added to the group (note that the date gets converted into a java.util.Date here).
Aggregation groupAgg = newAggregation(match(Criteria.where("_id").is(groupId).and("groupUsers.userId").is(userId)));
GroupUser groupUser = mongoTemplate.aggregate(groupAgg, Group.class, GroupUser.class).getUniqueMappedResult();
In a second query, I want to get the whole document, but filtered by the criteria defined above.
MatchOperation matchedGroup = match(new Criteria("_id").is(groupId));
MatchOperation matchedToys = match(
new Criteria("groupUsers.toys.createdAt").gte(groupUser.getCreatedAt()));
Aggregation aggregation = newAggregation(matchedGroup, matchedToys);
AggregationResults<Group> result = mongoTemplate.aggregate(aggregation, Group.class, Group.class);
Group group = result.getUniqueMappedResult();
This query doesn't work. I'm looking for something that, even if there is no match (for example, no toys have been created yet), still returns the basic group response rather than null.
Maybe I need to unwind the nested array?
Any help is appreciated. I'm using Spring Data.
Try this query:
db.testers.aggregate([
    {
        $addFields: {
            "groupUsers": {
                // rebuild each group user, replacing only its toys array
                $map: {
                    "input": "$groupUsers",
                    "as": "doc",
                    "in": {
                        $mergeObjects: [
                            "$$doc",
                            {
                                "toys": {
                                    // keep only toys created at or after the cutoff date
                                    $filter: {
                                        "input": "$$doc.toys",
                                        "as": "sn",
                                        "cond": {
                                            "$and": [
                                                { "$gte": [ "$$sn.createdAt", ISODate('2015-06-17T10:03:46.000Z') ] }
                                            ]
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            }
        }
    }
]).pretty()
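As a rough illustration of what the $map, $mergeObjects, and $filter combination above does, here is a plain-Ruby sketch working on an in-memory hash rather than on MongoDB. The `filter_toys` helper, the cutoff date, and the empty toys array for blabla2 are assumptions made for the example.

```ruby
# Keep every group user, but only the toys created at or after the cutoff.
# Hypothetical helper; it mirrors $map over groupUsers, $filter over toys,
# and $mergeObjects to put the filtered array back into each user.
def filter_toys(group, cutoff)
  users = group["groupUsers"].map do |user|
    toys = user.fetch("toys", []).select { |toy| toy["createdAt"] >= cutoff }
    user.merge("toys" => toys)
  end
  group.merge("groupUsers" => users)
end

# Sample document loosely based on the one in the question.
group = {
  "_id" => "id",
  "name" => "test",
  "groupUsers" => [
    { "name" => "blabla", "toys" => [
      { "createdAt" => Time.utc(2019, 10, 30, 12, 59, 41) },
      { "createdAt" => Time.utc(2019, 11, 30, 12, 59, 10) },
      { "createdAt" => Time.utc(2019, 12, 30, 12, 59, 12) }
    ] },
    { "name" => "blabla2", "toys" => [] }
  ]
}

# Assumed join date of the requesting user.
cutoff = Time.utc(2019, 10, 31, 12, 33, 39)
filtered = filter_toys(group, cutoff)
puts filtered["groupUsers"].map { |u| [u["name"], u["toys"].size] }.inspect
```

Because the outer document is only transformed and never matched against the toy dates, a group with no qualifying toys still comes back, with empty toys arrays, rather than disappearing from the result.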

MongoDb: how to search in multiple collections?

I got this structure in my mongodb (2 collections: restaurant and cocktail)
restaurant {id=1001, name="Res1", coor=[12.392, 19.123], cocktail=[13, 0, 92]}
cocktail {id=13, name="Capiroska"}, {id=167, name="Capirinha"}, {id=92, name="Negroni"}, {id=0, name="Martini"}
Multiple restaurants and multiple cocktails, N:N relationship.
My goal is to find which different cocktails I can drink within a specified area.
I've already written a query that finds all restaurants near my position like this:
mongoTemplate.find(new Query(Criteria.where("address.location").withinSphere(new Circle(latitude, longitude, radius/6371))
), Restaurant.class);
So that I obtain a list of restaurants.
Next steps are:
How to obtain distinct cocktail's id (no repetitions allowed)
How to look into cocktail collection in order to obtain all cocktail names
TY in advance!
This might not answer your question completely, but it can help.
how to obtain distinct cocktails id (no repetitions allowed)
Your cocktail field is an array, so a direct group or distinct might not work; you can use $unwind first.
$unwind peels off one document for each element of the array
and returns those resulting documents.
For example, for this object
{id=1001, name="Res1", coor=[12.392, 19.123], cocktail=[13, 0, 92]}
and this query
db.temp.aggregate( [ { $unwind: "$cocktail" } ] )
will result in
{ "_id" : 1001, "name" : "Res1", "coor" : [ 12.392, 19.123 ], "cocktail" : 13 }
{ "_id" : 1001, "name" : "Res1", "coor" : [ 12.392, 19.123 ], "cocktail" : 0 }
{ "_id" : 1001, "name" : "Res1", "coor" : [ 12.392, 19.123 ], "cocktail" : 92 }
Once you have one record per cocktail, you can group them to collect the distinct cocktail ids:
db.temp.aggregate( [
    { $unwind: "$cocktail" },
    {
        "$group": {
            // _id: null puts every unwound document into a single group
            _id: null,
            // $addToSet keeps each cocktail id only once
            items: { $addToSet: "$cocktail" }
        }
    }
] );
This should answer your first question.
To get the cocktail names you need $lookup, $group, and $project, something like this:
db.temp.aggregate([
    {
        "$unwind": "$cocktail"
    },
    {
        "$lookup": {
            "from": "cocktail",
            // after $unwind, cocktail holds a single id
            "localField": "cocktail",
            "foreignField": "_id",
            "as": "cocktails"
        }
    },
    { "$unwind": "$cocktails" },
    {
        "$group": {
            "_id": "$_id",
            "name": { "$first": "$name" },
            "coor": { "$first": "$coor" },
            "cocktail": { "$push": "$cocktails" }
        }
    },
    {
        "$project": {
            "name": 1,
            "coor" : 1,
            "cocktail.name" : 1
        }
    }
]).pretty()
Note: this is just one approach; it might not be the best way, and it is untested.
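To show the idea behind the unwind, group, and lookup steps without a database, here is a small plain-Ruby sketch on in-memory arrays. The second restaurant is made up for the example, and the `_id` key on cocktails is an assumption (the question writes it as id).

```ruby
# Sample data based on the question; "Res2" is a hypothetical addition.
restaurants = [
  { "_id" => 1001, "name" => "Res1", "cocktail" => [13, 0, 92] },
  { "_id" => 1002, "name" => "Res2", "cocktail" => [13, 167] }
]
cocktails = [
  { "_id" => 13,  "name" => "Capiroska" },
  { "_id" => 167, "name" => "Capirinha" },
  { "_id" => 92,  "name" => "Negroni" },
  { "_id" => 0,   "name" => "Martini" }
]

# $unwind + $group with $addToSet: flatten the id arrays, drop repeats.
distinct_ids = restaurants.flat_map { |r| r["cocktail"] }.uniq

# $lookup: resolve each id against the cocktail collection.
names_by_id = cocktails.map { |c| [c["_id"], c["name"]] }.to_h
names = distinct_ids.map { |id| names_by_id[id] }.compact

puts distinct_ids.inspect
puts names.inspect
```

Unlike $addToSet, which does not guarantee any order, `uniq` keeps first-seen order here; sort the result if a stable order matters.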
You can query data across two or more collections using a join ($lookup) in MongoDB.
Depending on your scenario, the following link might help.
https://www.mongodb.com/blog/post/joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2-part-1-of-3-introduction
What about using aggregation?
First perform a match stage to get all the restaurants, then unwind the cocktail field, and after that group by the cocktail field. At this stage you have all the unique cocktail ids, so you can perform the lookup stage.
Order of stages:
match
project (optional)
unwind
group
lookup
project (because you want only the cocktail names instead of the complete collection)
The code is in Kotlin; convert it to Java if needed. If you are using IntelliJ as your IDE, it may be able to help with the conversion.
val match = Aggregation.match(Criteria.where("address.location").withinSphere(Circle(latitude, longitude, radius / 6371)))
val project = Aggregation.project("cocktail")
val unwind = Aggregation.unwind("cocktail")
val group = Aggregation.group("cocktail") // after this stage the cocktail id is stored in _id
val lookup = Aggregation.lookup("your cocktail collection name", "_id", "id", "cocktailCollection")
val project1 = Aggregation.project("cocktailCollection.name").andExclude("_id")
val aggregation = Aggregation.newAggregation(match, project, unwind, group, lookup, project1)
println(aggregation) // if you want to see the query
// "restaurant" is the collection name taken from the question; adjust it to your own.
// Document is org.bson.Document.
val result = mongoTemplate.aggregate(aggregation, "restaurant", Document::class.java)

Springdata mongodb aggregation match

After asking question to understand a bit more of the aggregation framework in MongoDB I finally found the way to do aggregation for my need (thanks to a StackExchange user)
So basically here is a document from my collection:
{
"_id" : ObjectId("s4dcsd5s4d6c54s6d"),
"items" : [
{
type : "TYPE_1",
text : "blablabla"
},
{
type : "TYPE_2",
text : "blablabla"
},
{
type : "TYPE_3",
text : "blablabla"
},
{
type : "TYPE_1",
text : "blablabla"
},
{
type : "TYPE_2",
text : "blablabla"
},
{
type : "TYPE_1",
text : "blablabla"
}
]
}
The idea was to be able to filter only some elements of my collections (avoiding Type 2 and 3). In fact I have more than 30 types and 6 are not allowed but for simplicity I made this example.
So the aggregation command in command line is this one:
db.history.aggregate([{
$match: {
_id: ObjectId("s4dcsd5s4d6c54s6d")
}
}, {
$unwind: '$items'
}, {
$match: {
'items.type': { '$nin': [ "TYPE_2" , "TYPE_3"] }
}
},
{ $limit: 10 }
]);
With this I am able to retrieve the first 10 items of this document whose type is neither TYPE_2 nor TYPE_3.
However, when I use Spring Data there is no output. I looked at the examples to build mine, but it's still not working.
So I did:
Aggregation aggregation = newAggregation(
match(Criteria.where("id").is(myID)),
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
AggregationResults<PersonnalHistory> results = mongAccess.getOperation().aggregate(query,
"items", PersonnalHistory.class);
PersonnalHistory is annotated with @Document(collection = "history") and its id field with the @Id annotation
ignoreditemstype is a list containing TYPE_2 and TYPE_3
Here is what I have in the toString method of aggregation:
{
"aggregate" : "__collection__" ,
"pipeline" : [
{ "$match": { "id" : "s4dcsd5s4d6c54s6d"} },
{ "$unwind": "$items"},
{ "$match": { "items.type": { "$nin" : [ "TYPE_2" , "TYPE_3" ] } } },
{ "$limit" : 3},
{ "$skip" : 0 }
]
}
I tried a lot of stuff (to have at least an answer :) ) like removing id or the nin:
aggregation = newAggregation(
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
aggregation = newAggregation(
match(Criteria.where("id").is(myid)),
unwind("items")
);
For information when I do a simple query like:
query.addCriteria(Criteria.where("id").is(myID));
My document is returned. However I have thousands of items. So I just want to have the 15 first (in fact the 15 first are the 15 last added)
Do you maybe see what I am doing wrong?
Yes, it looks like you are passing a plain String where an ObjectId is expected:
Aggregation aggregation = newAggregation(
match(Criteria.where("_id").is(new ObjectId(myID))),
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
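As for why it works with a simple query: my answer would be that Spring Data's aggregation support is less mature than its query support. A simple query is mapped against the entity, so the id String is presumably converted to an ObjectId for you, while this aggregation pipeline gets no such conversion.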
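One more thing worth checking in the pipeline above, independent of the ObjectId issue: stages run in the order they are listed, so limit(3) followed by skip(offsetLong) skips within the three documents that were already kept. The plain-Ruby sketch below mimics the unwind, match, limit, and skip steps on an in-memory array. The text values, the limit, and the offset are made up for the example, and no Spring Data or MongoDB API is involved.

```ruby
# Items shaped like the ones in the question; the text values are
# hypothetical so that individual items can be told apart.
items = [
  { "type" => "TYPE_1", "text" => "a" },
  { "type" => "TYPE_2", "text" => "b" },
  { "type" => "TYPE_3", "text" => "c" },
  { "type" => "TYPE_1", "text" => "d" },
  { "type" => "TYPE_2", "text" => "e" },
  { "type" => "TYPE_1", "text" => "f" }
]
ignored = %w[TYPE_2 TYPE_3]

# Equivalent of the $match with $nin after $unwind.
allowed = items.reject { |item| ignored.include?(item["type"]) }

limit  = 2
offset = 1

# $limit then $skip: the skip eats into the already limited page.
limit_then_skip = allowed.first(limit).drop(offset).map { |i| i["text"] }

# $skip then $limit: the usual paging order.
skip_then_limit = allowed.drop(offset).first(limit).map { |i| i["text"] }

puts limit_then_skip.inspect
puts skip_then_limit.inspect
```

If the goal is paging through the items, placing skip before limit is probably what is wanted.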

Query Mongo Embedded Documents with a size

I have a ruby on rails app using Mongoid and MongoDB v2.4.6.
I have the following MongoDB structure, a record which embeds_many fragments:
{
"_id" : "76561198045636214",
"fragments" : [
{
"id" : 76561198045636215,
"source_id" : "source1"
},
{
"id" : 76561198045636216,
"source_id" : "source2"
},
{
"id" : 76561198045636217,
"source_id" : "source2"
}
]
}
I am trying to find all records in the database that contain fragments with duplicate source_ids.
I'm pretty sure I need to use $elemMatch as I need to query embedded documents.
I have tried
Record.elem_match(fragments: {source_id: 'source2'})
which works but doesn't restrict to duplicates.
I then tried
Record.elem_match(fragments: {source_id: 'source2', :source_id.with_size => 2})
which returns no results (but is a valid query). The query Mongoid produces is:
selector: {"fragments"=>{"$elemMatch"=>{:source_id=>"source2", "source_id"=>{"$size"=>2}}}}
Once that works, I need to change it to match a $size greater than 1.
Is this possible? It feels like I'm very close. This is a one-off cleanup operation so query performance isn't too much of an issue (however we do have millions of records to update!)
Any help is much appreciated!
I have been able to achieve desired outcome but in testing it's far too slow (will take many weeks to run across our production system). The problem is double query per record (we have ~30 million records in production).
Record.where('fragments.source_id' => 'source2').each do |record|
query = record.fragments.where(source_id: 'source2')
if query.count > 1
# contains duplicates, delete all but latest
query.desc(:updated_at).skip(1).delete_all
end
# needed to trigger after_save filters
record.save!
end
The problem with the current approach is that the standard MongoDB query forms do not actually "filter" the nested array documents in any way, which is essentially what you need in order to "find the duplicates" within your documents.
For this, the aggregation framework is probably the best approach. There is no direct "mongoid" style approach to these queries, as those are geared towards the existing "rails" style of dealing with relational documents.
You can, though, access the "moped" form through the .collection accessor on your model class:
Record.collection.aggregate([
  # Keep only arrays with two or more elements as candidates
  { "$match" => {
    "$and" => [
      { "fragments" => { "$not" => { "$size" => 0 } } },
      { "fragments" => { "$not" => { "$size" => 1 } } }
    ]
  }},
  # Unwind the arrays to "de-normalize" them into separate documents
  { "$unwind" => "$fragments" },
  # Group back and count occurrences of each "key" value
  { "$group" => {
    "_id" => { "_id" => "$_id", "source_id" => "$fragments.source_id" },
    "fragments" => { "$push" => "$fragments.id" },
    "count" => { "$sum" => 1 }
  }},
  # Keep only the keys found more than once
  { "$match" => { "count" => { "$gte" => 2 } } }
])
That would return results like this:
{
  "_id" : { "_id": "76561198045636214", "source_id": "source2" },
  "fragments": [ 76561198045636216, 76561198045636217 ],
  "count": 2
}
That at least gives you something to work with when dealing with the "duplicates" here.
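For intuition, the unwind, group, and match steps above correspond roughly to the following plain-Ruby sketch on a single in-memory record. It only illustrates the grouping logic and is not a replacement for running the aggregation on the server; the variable names are assumptions.

```ruby
# The sample record from the question as a plain hash.
record = {
  "_id" => "76561198045636214",
  "fragments" => [
    { "id" => 76561198045636215, "source_id" => "source1" },
    { "id" => 76561198045636216, "source_id" => "source2" },
    { "id" => 76561198045636217, "source_id" => "source2" }
  ]
}

# Group fragments by source_id, keep groups with two or more members,
# and collect the fragment ids of each remaining group.
duplicates = record["fragments"]
  .group_by { |fragment| fragment["source_id"] }
  .select { |_source_id, fragments| fragments.size >= 2 }
  .map { |source_id, fragments| [source_id, fragments.map { |f| f["id"] }] }
  .to_h

puts duplicates.inspect
```

Doing this per record in application code is what made the loop in the question slow; the sketch is only meant to show what each pipeline stage contributes.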

Difference with count result in Mongo group by query with Ruby/Javascript

I'm using Mongoid to get a count of certain types of records in a Mongo database. When running the query with the javascript method:
db.tags.group({
cond : { tag: {$ne:'donotwant'} },
key: { tag: true },
reduce: function(doc, out) { out.count += 1; },
initial: { count: 0 }
});
I get the following results:
[
{"tag" : "thing", "count" : 4},
{"tag" : "something", "count" : 1},
{"tag" : "test", "count" : 1}
]
Does exactly what I want it to do. However, when I utilize the corresponding Mongoid code to perform the same query:
Tag.collection.group(
:cond => {:tag => {:$ne => 'donotwant'}},
:key => [:tag],
:reduce => "function(doc, out) { out.count += 1 }",
:initial => { :count => 0 },
)
the count parameters are (seemingly) selected as floats instead of integers:
[
{"tag"=>"thing", "count"=>4.0},
{"tag"=>"something", "count"=>1.0},
{"tag"=>"test", "count"=>1.0}
]
Am I misunderstanding what's going on behind the scenes? Do I need to (can I?) cast those counts or is the javascript result just showing it without the .0?
JavaScript doesn't distinguish between floats and ints. It has a single Number type that is implemented as a double. So what you are seeing in Ruby is correct: the mongo shell output follows JavaScript printing conventions and displays Numbers that have no fractional component without the '.0'.
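If integer counts are wanted on the Ruby side, one option is to convert them after the query returns. This is a minimal sketch that uses the result array from the question as literal data; `normalized` is just an illustrative variable name.

```ruby
# Result rows as returned in the question, with Float counts.
results = [
  { "tag" => "thing",     "count" => 4.0 },
  { "tag" => "something", "count" => 1.0 },
  { "tag" => "test",      "count" => 1.0 }
]

# The counts are whole numbers stored as doubles, so to_i loses nothing.
normalized = results.map { |row| row.merge("count" => row["count"].to_i) }

puts normalized.inspect
```

Comparisons such as `4.0 == 4` are already true in Ruby, so the conversion mostly matters for display or for code that checks the class of the value.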
