Update object in array with new fields mongodb - ruby

I have a MongoDB document; horses is an array of objects with id, name, and type:
{
"_id" : 33333333333,
"horses" : [
{
"id" : 72029,
"name" : "Awol",
"type" : "flat"
},
{
"id" : 822881,
"name" : "Give Us A Reason",
"type" : "flat"
},
{
"id" : 826474,
"name" : "Arabian Revolution",
"type" : "flat"
}
]
}
I need to add new fields to one of these objects.
I tried something like this, but it didn't do what I wanted:
horse = {
"place" => 1,
"body" => 11
}
Card.where({'_id' => 33333333333}).find_and_modify({'$set' => {'horses.' + index.to_s => horse}}, upsert:true)
But all the existing fields are removed and the new ones inserted. How do I make it add the new fields to the existing ones?

Indeed, this command will overwrite the whole subdocument:
'$set': {
'horses.0': {
"place" : 1,
"body" : 11
}
}
You need to set individual fields:
'$set': {
'horses.0.place': 1,
'horses.0.body': 11
}
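
In Ruby, a minimal sketch of that fix, building the dot-notation keys from the hash in the question (index is the horse's array position, as in the question's own call):

# Turn the new-fields hash into 'horses.<index>.<field>' keys so $set
# adds/updates those fields without replacing the embedded document.
horse = { 'place' => 1, 'body' => 11 }
fields = horse.each_with_object({}) do |(field, value), set|
  set["horses.#{index}.#{field}"] = value
end
Card.where('_id' => 33333333333).find_and_modify({ '$set' => fields }, upsert: true)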

Related

Get document by min size of array in MongoDB

I have a Mongo collection:
{
"_id" : 123,
"index" : "111",
"students" : [
{
"firstname" : "Mark",
"lastname" : "Smith"
}
]
}
{
"_id" : 456,
"index" : "222",
"students" : [
{
"firstname" : "Mark",
"lastname" : "Smith"
}
]
}
{
"_id" : 789,
"index" : "333",
"students" : [
{
"firstname" : "Neil",
"lastname" : "Smith"
},
{
"firstname" : "Sofia",
"lastname" : "Smith"
}
]
}
I want to get the document whose index is in a given set, for example givenSet = ["111","333"], and which has the minimum length of the students array.
The result should be the first document, with _id: 123, because its index is in the givenSet and its students array length is 1, which is smaller than the third document's.
I need to write a custom JSON @Query for a Spring Mongo repository. I am new to Mongo and am a bit stuck on this problem.
I wrote something like this:
@Query("{'index':{$in : ?0}, length:{$size:$students}, $sort:{length:-1}, $limit:1}")
Department getByMinStudentsSize(Set<String> indexes);
And got the error message: '$size needs a number'.
Should I just use .count() or something like that?
You should use the aggregation framework for this type of query:
Filter the results based on your condition.
Add a new field and assign the array size to it.
Sort on the new field.
Limit the result.
The solution should look something like this:
db.collection.aggregate([
{
"$match": {
index: {
"$in": [
"111",
"333"
]
}
}
},
{
"$addFields": {
"students_size": {
"$size": "$students"
}
}
},
{
"$sort": {
students_size: 1
}
},
{
"$limit": 1
}
])
working example: https://mongoplayground.net/p/ih4KqGg25i6
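
For the Ruby driver, the equivalent pipeline might look like this sketch (client is an assumed Mongo::Client handle; the collection name and values follow the shell example above):

# Match on the given indexes, compute each students array's size,
# sort ascending on that size, and keep only the smallest document.
given_set = ['111', '333']
smallest = client[:collection].aggregate([
  { '$match'     => { 'index' => { '$in' => given_set } } },
  { '$addFields' => { 'students_size' => { '$size' => '$students' } } },
  { '$sort'      => { 'students_size' => 1 } },
  { '$limit'     => 1 }
]).first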
You are getting the issue because the second param should be enclosed in curly braces, and the second param is the projection:
@Query("{{'index':{$in : ?0}}, {length:{$size:'$students'}}, $sort:{length:1}, $limit:1}")
Department getByMinStudentsSize(Set<String> indexes);
Below is the MongoDB query:
db.collection.aggregate(
[
{
"$match" : {
"index" : {
"$in" : [
"111",
"333"
]
}
}
},
{
"$project" : {
"studentsSize" : {
"$size" : "$students"
},
"students" : 1.0
}
},
{
"$sort" : {
"studentsSize" : 1.0
}
},
{
"$limit" : 1.0
}
],
{
"allowDiskUse" : false
}
);

Aggregating on generic nested array in Elasticsearch with NEST

I'm trying to analyse data with Elasticsearch. I started working with Elasticsearch and NEST about four months ago, so I might have missed some obvious stuff. All examples are simplified or altered, but the core is the same.
The data contains an array of nested objects, each of which also contains an array of nested objects, and each of those in turn contains an array of nested objects. The data is obtained from an information request which contains XML messages. The messages are parsed, and each element containing (multiple) text elements is saved under the message name with its element name, its location, and an array of all text element names and values. I'm thinking this set-up might make analyzing the data easier.
Mapping example:
{
"data" : {
"properties" : {
"id" : { "type" : "string" },
"action" : { "type" : "string" },
"result" : { "type" : "string" },
"details" : {
"type" : "nested",
"properties" : {
"description" : { "type" : "string" },
"message" : {
"type" : "nested",
"properties" : {
"name" : { "type" : "string" },
"nodes" : {
"type" : "nested",
"properties" : {
"name" : { "type" : "string" },
"value" : { "type" : "string" }
}
},
"source" : { "type" : "string" }
}
}
}
}
}
}
}
Data example:
{
"id" : "123456789",
"action" : "GetInformation",
"result" : "Success",
"details" : [{
"description" : "Request",
"message" : [{
"name" : "Body",
"source" : "Message|Body",
"nodes" : [{
"name" : "Action",
"value" : "GetInformation"
}, {
"name" : "Identity",
"value" : "1234"
}
]
}
]
}, {
"description" : "Response",
"message" : [{
"name" : "Object",
"source" : "Message|Body|Object",
"nodes" : [{
"name" : "ID",
"value" : "123"
}, {
"name" : "Name",
"value" : "Jim"
}
]
}, {
"name" : "Information",
"source" : "Message|Body|Information",
"nodes" : [{
"name" : "Type",
"value" : "Birth City"
}, {
"name" : "City",
"value" : "Los Angeles"
}
]
}, {
"name" : "Information",
"source" : "Message|Body|Information",
"nodes" : [{
"name" : "Type",
"value" : "City of Residence"
}, {
"name" : "City",
"value" : "New York"
}
]
}
]
}
]
}
XML Example:
<Message>
<Body>
<Object>
<ID>123</ID>
<Name>Jim</Name>
</Object>
<Information>
<Type>Birth City</Type>
<City>Los Angeles</City>
</Information>
<Information>
<Type>City of Residence</Type>
<City>New York</City>
</Information>
</Body>
</Message>
I want to analyse the Name and Value properties of Nodes so I can get an overview of each city within the index that functions as a birthplace and how many people were born in them. Something like:
Dictionary<string, int> birthCities = {
{"Los Angeles", 400}, {"New York", 800},
{"Detroit", 500}, {"Michigan", 700} };
The code I have so far:
var response = client.Search<Data>(search => search
.Query(query =>
query.Match(match=> match
.OnField(data=>data.Action)
.Query("GetInformation")
)
)
.Aggregations(a1 => a1
.Nested("Messages", messages => messages
.Path(data => data.details.FirstOrDefault().Message)
.Aggregations(a2 => a2
.Terms("Sources", termSource => termSource
.Field(data => data.details.FirstOrDefault().Message.FirstOrDefault().Source)
.Aggregations(a3 => a3
.Nested("Nodes", nodes => nodes
.Path(data => data.details.FirstOrDefault().Message.FirstOrDefault().Nodes)
.Aggregations(a4 => a4
.Terms("Names", termName => termName
.Field(data => data.details.FirstOrDefault().Message.FirstOrDefault().Nodes.FirstOrDefault().Name)
.Aggregations(a5 => a5
.Terms("Values", termValue => termValue
.Field(data => data.details.FirstOrDefault().Message.FirstOrDefault().Nodes.FirstOrDefault().Value)
)
)
)
)
)
)
)
)
)
)
);
var dict = new Dictionary<string, long>();
var sAggr = response.Aggs.Nested("Messages").Terms("Sources");
foreach (var item in sAggr.Items)
{
if (item.Key.Equals("information"))
{
var nAggr = item.Nested("Nodes").Terms("Names");
foreach (var nItem in nAggr.Items)
{
if (nItem.Key.Equals("city"))
{
var vAgg = nItem.Terms("Values");
foreach (var vItem in vAgg.Items)
{
if (!dict.ContainsKey(vItem.Key))
{
dict.Add(vItem.Key, 0);
}
dict[vItem.Key] += vItem.DocCount;
}
}
}
}
}
This code gives me every city and how many times they occur, but since they're saved with the same element name and at the same location (both of which I'm not able to change), I've found no way to distinguish between birth cities and cities of residence.
Specific types for each action are sadly not an option. So my question is: how can I count all occurrences of a city name with the Birth City type, preferably without having to import and go through all the documents?

Recursively create hierarchical hash from Mongo collection in Ruby

I am trying to create a hash that looks like this:
{"difficulty"=>{"easy"=>{}, "normal"=>{}, "hard"=>{}}, "terrain"=>{"snow"=>{"sleet"=>{}, "powder"=>{}}, "jungle"=>{}, "city"=>{}}}
From a MongoDB collection, which is an enumerable list of hashes that looks like this:
{
"_id" : "globalSettings",
"groups" : [
"difficulty",
"terrain"
],
"parent" : null,
"settings" : {
"maxEnemyCount" : 10,
"maxDamageInflicted" : 45,
"enemyHealthPoints" : 40,
"maxEnemySpeed" : 25,
"maxPlayerSpeed" : 32,
"lightShader" : "diffuse",
"fogDepth" : 12,
"terrainModifier" : 9
}
}
{
"_id" : "difficulty",
"groups" : [
"easy",
"normal",
"hard"
],
"parent" : "globalSettings",
"settings" : {
}
}
{
"_id" : "terrain",
"groups" : [
"snow",
"jungle",
"city"
],
"parent" : "globalSettings",
"settings" : {
}
}
{
"_id" : "snow",
"groups" : [
"sleet",
"powder"
],
"parent" : "terrain",
"settings" : {
"fogDepth" : 4
}
}
{
"_id" : "jungle",
"groups" : [ ],
"parent" : "terrain",
"settings" : {
"terrainModifier" : 6
}
}
{
"_id" : "city",
"groups" : [ ],
"parent" : "terrain",
"settings" : {
"lightShader" : "bumpedDiffuse"
}
}
{
"_id" : "easy",
"groups" : [ ],
"parent" : "difficulty",
"settings" : {
"maxEnemyCount" : 5
}
}
{
"_id" : "normal",
"groups" : [ ],
"parent" : "difficulty",
"settings" : {
}
}
{
"_id" : "hard",
"groups" : [ ],
"parent" : "difficulty",
"settings" : {
"maxEnemyCount" : 20
}
}
{
"_id" : "sleet",
"groups" : [ ],
"parent" : "snow",
"settings" : {
"fogDepth" : 2
}
}
{
"_id" : "powder",
"groups" : [ ],
"parent" : "snow",
"settings" : {
"terrainModifier" : 2
}
}
Every time I try to write the function, I get stuck when setting the parent of the group. How do I recurse, yet keep track of the hierarchy path?
The closest I've come is with this:
def dbCurse(nodes, parent = nil)
withParent, withoutParent = nodes.partition { |n| n['parent'] == parent }
withParent.map do |node|
newNode={}
newNode[node["_id"]]={}
newNode[node["_id"]].merge(node["_id"] => dbCurse(withoutParent, node['_id']))
end
end
which gives me a crazy mix of arrays and hashes:
{"globalSettings"=>[{"difficulty"=>[{"easy"=>[]}, {"normal"=>[]}, {"hard"=>[]}]}, {"terrain"=>[{"snow"=>[{"sleet"=>[]}, {"powder"=>[]}]}, {"jungle"=>[]}, {"city"=>[]}]}]}
I think the arrays are getting mixed in there from the #map but I'm not sure how to get rid of them to get the clean hash of hashes I show at the top of my question.
Thank you,
David
So looking at your sample input, I'm going to make a core assumption:
The order of the hash objects in the input list is somewhat well-defined. Namely, if the globalSettings hash refers to the group difficulty, then the next object in the list with _id == 'difficulty' and parent == 'globalSettings' is the correct match.
If this is true, then you can write a function that accepts a description of what you're looking for (i.e., the object with _id == 'difficulty' and parent == 'globalSettings') along with a reference to where you want that data stored, which can then recurse using deeper references.
def doit(obj_list, work = {})
work.each do |key, data|
# fetch the node
node_i = obj_list.index { |n| n['_id'] == key && n['parent'] == data[:parent] } or next
node = obj_list.delete_at(node_i)
# for each group of this node, create a new hash
# under this node's output pointer and queue it for parsing
new_work = {}
node['groups'].each do |group|
data[:output][group] = {}
new_work[group] = { parent: key, output: data[:output][group] }
end
# get the group data for this node
doit(obj_list, new_work)
end
end
require 'json'
input_data = JSON.parse(IO.read './input.json')
output_data = {}
doit( input_data, 'globalSettings' => { parent: nil, output: output_data } )
The trick here is that I'm handing the recursive call to doit the names of the next objects that I'm looking for from the list (using the current object's group list) and pairing each of those desired names with their parent and a reference to where I want the function to put the found data. Each recursive call to doit will use deeper and deeper references into the original output hash.
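With the sample collection above, output_data ends up as exactly the hash of hashes shown at the top of the question, since globalSettings' groups are written straight into the hash passed as its output: {"difficulty"=>{"easy"=>{}, "normal"=>{}, "hard"=>{}}, "terrain"=>{"snow"=>{"sleet"=>{}, "powder"=>{}}, "jungle"=>{}, "city"=>{}}}.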

Update an existing collection in MongoDB

I have a collection
{
"_id" : 100000001,
"horses" : []
"race" : {
"date" : ISODate("2014-06-05T00:00:00.000Z"),
"time" : ISODate("2014-06-05T02:40:00.000Z"),
"type" : "Flat",
"name" : "Hindwoods Maiden Stakes (Div I)",
"run_befor" : 11,
"finish" : null,
"channel" : "ATR",
},
"track" : {
"fences" : 0,
"omitted" : 0,
"hdles" : 0,
"name" : "Lingfield",
"country" : "GB",
"type" : "",
"going" : "Good"
}
}
I'm trying to update it
The @result value:
{
"race":{
"run_after":"10",
"finish":{
"time":152.34,
"slow":1,
"fast":0,
"gap":5.34
}
},
"track":{
"name":"Lingfield",
"country":"GB",
"type":"",
"going":"Good",
"fences":0,
"omitted":0,
"hdles":0
}
}
Card.where(_id: 100000001).update(@result)
When I update, all of the document's existing data is deleted and the new data inserted.
The same happens if I use set().
How do I update so that existing fields are updated and missing ones are added?
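
This is the same problem as the first question above: the update replaces whole subdocuments, so the fix is to $set individual dot-notation fields. A sketch, using the find_and_modify form from that question and a hypothetical flatten_fields helper:

# Recursively flatten a nested hash into dot-notation keys, e.g.
# { 'race' => { 'run_after' => '10' } } => { 'race.run_after' => '10' },
# so $set only touches those fields and leaves everything else intact.
def flatten_fields(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), out|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      out.merge!(flatten_fields(value, path))
    else
      out[path] = value
    end
  end
end

Card.where(_id: 100000001).find_and_modify({ '$set' => flatten_fields(@result) }, upsert: true)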

Dynamic fields and slow queries

Currently, I'm managing a set of lists containing a number of members.
Every list can look different when it comes to its fields and the naming of those fields.
Typically, a basic list member could look like this (from my members collection):
{
"_id" : ObjectId("52284ae408edcb146200009f"),
"list_id" : 1,
"status" : "active",
"imported" : 1,
"fields" : {
"firstname" : "John",
"lastname" : "Doe",
"email" : "john#example.com",
"birthdate" : ISODate("1977-09-03T23:08:20.000Z"),
"favorite_color" : "Green",
"interests" : [
{
"id" : 8,
"value" : "Books"
},
{
"id" : 10,
"value" : "Travel"
},
{
"id" : 12,
"value" : "Cooking"
},
{
"id" : 15,
"value" : "Wellnes"
}
]
},
"created_at" : ISODate("2012-05-06T15:12:26.000Z"),
"updated_at" : ISODate("2012-05-06T15:12:26.000Z")
}
All the fields under the "fields" key are unique to the current list ID, and these fields can change for every list ID, which means a new list could look like this:
{
"_id" : ObjectId("52284ae408edcb146200009f"),
"list_id" : 2,
"status" : "active",
"imported" : 1,
"fields" : {
"fullname" : "John Doe",
"email" : "john#example.com",
"cell" : 123456787984
},
"created_at" : ISODate("2012-05-06T15:12:26.000Z"),
"updated_at" : ISODate("2012-05-06T15:12:26.000Z")
}
Currently, my application allows users to search dynamically in each of the custom fields, but since they have no indexes, this process can be very slow.
I don't believe it's an option to allow list creators to select which fields should be indexed, but I really need to speed this up.
Is there any solution for this?
If you refactor your documents so that you have an array of fields, you can leverage indexes:
fields: [
{ name: 'fullName', value: 'John Doe' },
{ name: 'email', value: 'john#example.com' },
...
]
Create an index on fields.name and fields.value.
Of course this is not a solution for "deeper" values like your interests list.
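
With the Ruby driver, the index and a lookup against it might look like this sketch (client is an assumed Mongo::Client, and members the assumed collection name):

# Compound multikey index over the name/value pairs; $elemMatch makes
# both conditions match within the same array element.
client[:members].indexes.create_one({ 'fields.name' => 1, 'fields.value' => 1 })
member = client[:members].find(
  'fields' => { '$elemMatch' => { 'name' => 'email', 'value' => 'john@example.com' } }
).first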
