I have List<Item> A in my database and I got List<Item> B from outside, and I want to get the items whose values have changed.
Lists A and B have the same properties.
List<Item> A
{ ID : 1, Name: "Tool A" , Status : 1, Price : 100 }
{ ID : 2, Name: "Tool B" , Status : 2, Price : 200 }
{ ID : 3, Name: "Tool C" , Status : 3, Price : 300 }
{ ID : 4, Name: "Tool D" , Status : 4, Price : 400 }
{ ID : 5, Name: "Tool E" , Status : 5, Price : 500 }
List<Item> B
{ ID : 1, Name: null , Status : 1, Price : 100 }
{ ID : 2, Name: null , Status : 2, Price : 200 }
{ ID : 3, Name: null , Status : 3000, Price : 300 }
{ ID : 4, Name: null , Status : 4, Price : 40000 }
{ ID : 5, Name: null , Status : 5, Price : 500 }
I want to store the updated data in the database, which would be:
{ ID : 3, Name: "Tool C", Status : 3000, Price : 300 }
{ ID : 4, Name: "Tool D", Status : 4, Price : 40000 }
The values of the "Name" property come from List<Item> A, even though they are null in List<Item> B.
I was thinking of something like this, but it gives me odd results:
var changedList = A.AsEnumerable()
    .Where(aa => B.Any(bb => aa.Status != bb.Status || aa.Price != bb.Price));
You can use the ExceptBy method from MoreLINQ:
var changedList = A.ExceptBy(B, item => new { item.Status, item.Price });
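Note that ExceptBy applied to A returns the changed rows with A's old values. To get B's updated values with Name carried over from A, a sketch along these lines may help (it assumes IDs match one-to-one between the lists, with MoreLINQ's ExceptBy in scope):
// Changed rows from B, with the null Name filled in from the matching A row.
var changed = B.ExceptBy(A, item => new { item.Status, item.Price })
    .Select(b =>
    {
        b.Name = A.First(a => a.ID == b.ID).Name;
        return b;
    })
    .ToList();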
From the response below, I want to fail the test if the queryTime value is more than 1000 ms.
Response Data:
{
"metadata" :{
"count" : 1,
"pageSize" : 100,
"page" : 1,
"TotalPages" : 1,
"queryTime" : "5224ms"
},
"result": {
"transactionName" : "Test"
}
}
You can use a JSR223 Assertion for this with a script like the one below (it uses JSON.parse, so set the assertion's language to JavaScript):
// Parse the response body
var json = JSON.parse(prev.getResponseDataAsString());
// queryTime looks like "5224ms"; strip the unit and parse the number
var queryTime = json.metadata.queryTime;
var time = parseInt(queryTime.split("m")[0]);
if (time > 1000) {
    log.info("QueryTime " + time);
    AssertionResult.setFailure(true);
}
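If you prefer Groovy (the recommended JSR223 language), an equivalent sketch using JsonSlurper would look like this; the failure message is an extra touch for readability:
import groovy.json.JsonSlurper

// Parse the response body and pull out the queryTime value, e.g. "5224ms"
def json = new JsonSlurper().parseText(prev.getResponseDataAsString())
def time = json.metadata.queryTime.replace('ms', '') as int

if (time > 1000) {
    log.info('QueryTime ' + time)
    AssertionResult.setFailure(true)
    AssertionResult.setFailureMessage("queryTime was ${time}ms, expected at most 1000ms")
}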
I have People documents in my elastic index and each person has multiple addresses, each address has a lat/long point associated.
I'd like to geo-sort all the people by proximity to a specific origin location; however, having multiple locations per person complicates this. What has been decided is to take the shortest distance per person to the origin point and use that number as the sort value.
Example of my people index, roughed out in 'pseudo-JSON', showing a couple of person documents, each having multiple addresses:
person {
name: John Smith
addresses [
{ lat: 43.5234, lon: 32.5432, 1 Main St. }
{ lat: 44.983, lon: 37.3432, 2 Queen St. W. }
{ ... more addresses ... }
]
}
person {
name: Jane Doe
addresses [
... she has a bunch of addresses too ...
]
}
... many more people docs each having multiple addresses like above ...
Currently I'm using an Elasticsearch script field with an inline Groovy script, shown below. The script calculates the distance in meters from the origin for each address, collects those distances into an array per person, and picks the minimum from the array as that person's sort value.
string groovyShortestDistanceMetersSortScript = string.Format("[doc['geo1'].distance({0}, {1}), doc['geo2'].distance({0}, {1})].min()",
origin.Latitude,
origin.Longitude);
var shortestMetersSort = new SortDescriptor<Person>()
.Script(sd => sd
.Type("number")
.Script(script => script
.Inline(groovyShortestDistanceMetersSortScript)
)
.Order(SortOrder.Ascending)
);
Although this works, I wonder if using a scripted field might be more expensive or too complex at query time, and whether there is a better way to achieve the desired sort order by indexing the data differently and/or by using aggregations, maybe even doing away with the script field altogether.
Any thoughts and guidance are appreciated. I'm sure somebody else has run into this same requirement (or similar) and has found a different or better solution.
I'm using the NEST API in this code sample, but I will gladly accept answers in Elasticsearch JSON format because I can port those into NEST code.
When sorting on distance from a specified origin where the field being sorted on contains a collection of values (in this case geo_point types), we can specify how a value should be collected from the collection using the sort mode. Here, a mode of "min" uses the nearest location to the origin as the sort value. Here's an example:
public class Person
{
public string Name { get; set; }
public IList<Address> Addresses { get; set; }
}
public class Address
{
public string Name { get; set; }
public GeoLocation Location { get; set; }
}
void Main()
{
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var indexName = "people";
var connectionSettings = new ConnectionSettings(pool)
.InferMappingFor<Person>(m => m.IndexName(indexName));
var client = new ElasticClient(connectionSettings);
if (client.IndexExists(indexName).Exists)
client.DeleteIndex(indexName);
client.CreateIndex(indexName, c => c
.Settings(s => s
.NumberOfShards(1)
.NumberOfReplicas(0)
)
.Mappings(m => m
.Map<Person>(mm => mm
.AutoMap()
.Properties(p => p
.Nested<Address>(n => n
.Name(nn => nn.Addresses)
.AutoMap()
)
)
)
)
);
var people = new[] {
new Person {
Name = "John Smith",
Addresses = new List<Address>
{
new Address
{
Name = "Buckingham Palace",
Location = new GeoLocation(51.501476, -0.140634)
},
new Address
{
Name = "Empire State Building",
Location = new GeoLocation(40.748817, -73.985428)
}
}
},
new Person {
Name = "Jane Doe",
Addresses = new List<Address>
{
new Address
{
Name = "Eiffel Tower",
Location = new GeoLocation(48.858257, 2.294511)
},
new Address
{
Name = "Uluru",
Location = new GeoLocation(-25.383333, 131.083333)
}
}
}
};
client.IndexMany(people);
// call refresh for testing (avoid in production)
client.Refresh("people");
var towerOfLondon = new GeoLocation(51.507313, -0.074308);
client.Search<Person>(s => s
.MatchAll()
.Sort(so => so
.GeoDistance(g => g
.Field(f => f.Addresses.First().Location)
.PinTo(towerOfLondon)
.Ascending()
.Unit(DistanceUnit.Meters)
// Take the minimum address location distance from
// our target location, The Tower of London
.Mode(SortMode.Min)
)
)
);
}
This creates the following search request:
{
"query": {
"match_all": {}
},
"sort": [
{
"_geo_distance": {
"addresses.location": [
{
"lat": 51.507313,
"lon": -0.074308
}
],
"order": "asc",
"mode": "min",
"unit": "m"
}
}
]
}
which returns
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : null,
"hits" : [ {
"_index" : "people",
"_type" : "person",
"_id" : "AVcxBKuPlWTRBymPa4yT",
"_score" : null,
"_source" : {
"name" : "John Smith",
"addresses" : [ {
"name" : "Buckingham Palace",
"location" : {
"lat" : 51.501476,
"lon" : -0.140634
}
}, {
"name" : "Empire State Building",
"location" : {
"lat" : 40.748817,
"lon" : -73.985428
}
} ]
},
"sort" : [ 4632.035195223564 ]
}, {
"_index" : "people",
"_type" : "person",
"_id" : "AVcxBKuPlWTRBymPa4yU",
"_score" : null,
"_source" : {
"name" : "Jane Doe",
"addresses" : [ {
"name" : "Eiffel Tower",
"location" : {
"lat" : 48.858257,
"lon" : 2.294511
}
}, {
"name" : "Uluru",
"location" : {
"lat" : -25.383333,
"lon" : 131.083333
}
} ]
},
"sort" : [ 339100.6843074794 ]
} ]
}
}
The value returned in the sort array for each hit is the minimum distance, in the sort unit specified (in our case, metres), between the specified point (the Tower of London) and that person's addresses.
Per the guidelines in the Sorting by Distance documentation, it can often make more sense to score by distance, which can be achieved by using a function_score query with a decay function.
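For reference, a minimal sketch of such a query against the index above (the scale value is an arbitrary placeholder controlling how quickly the score decays with distance from the origin):
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "functions": [
        {
          "gauss": {
            "addresses.location": {
              "origin": { "lat": 51.507313, "lon": -0.074308 },
              "scale": "2km"
            }
          }
        }
      ]
    }
  }
}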
I'm trying to compute sums from category and subcategory data, but I have to say it's difficult to get the syntax right.
Here is my data:
{ id : 1, category: 12, subcategory: 14, inv_amt: 470 },
{ id : 2, category: 12, subcategory: 14, inv_amt: 660 },
{ id : 3, category: 12, subcategory: 15, inv_amt: 605 },
{ id : 4, category: 13, subcategory: 14, inv_amt: 4760 },
{ id : 5, category: 13, subcategory: 16, inv_amt: 6600 },
{ id : 6, category: 13, subcategory: 16, inv_amt: 6050 },
{ id : 7, category: 14, subcategory: 17, inv_amt: 460 }
I would like to get the total amount by category and subcategory, like this:
for (category = 12) :
subcategory : 14 => Sum = 1130
subcategory : 15 => Sum = 605
for (category = 13) :
subcategory : 14 => Sum = 4760
subcategory : 16 => Sum = 12650
for (category = 14) :
subcategory : 17 => Sum = 460
I use this code, which gives me the sum for a single category:
this.totalCreances = this.creances
.filter(c => c.category === remise.category)
.map(c => c.inv_amt)
.reduce((sum, current) => sum + current);
It gives a single result (a number).
I tried to add a second filter (on the map), but it didn't work.
How can I make a second selection on the result of the first one?
Thanks,
Bea
You can achieve this using array.reduce
items.reduce<{ [id: number]: { [id: number]: number } }>((acc, cur) => {
  if (cur.category in acc) {
    if (cur.subcategory in acc[cur.category]) {
      // Category and subcategory both seen before: add to the running sum
      acc[cur.category][cur.subcategory] += cur.inv_amt;
    } else {
      // New subcategory under an existing category
      acc[cur.category][cur.subcategory] = cur.inv_amt;
    }
  } else {
    // First time this category appears
    acc[cur.category] = {
      [cur.subcategory]: cur.inv_amt
    };
  }
  return acc;
}, {});
This will produce an object that contains your data. With your example data, that object looks like this:
{ '12': { '14': 1130, '15': 605 },
'13': { '14': 4760, '16': 12650 },
'14': { '17': 460 } }
You can then traverse that object to print out the various totals, for example:
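A minimal sketch of that traversal, assuming the result of the reduce above has been assigned to a variable named totals:
// totals is the { category: { subcategory: sum } } object built above
for (const [category, subs] of Object.entries(totals)) {
  console.log(`for (category = ${category}) :`);
  for (const [subcategory, sum] of Object.entries(subs)) {
    console.log(`subcategory : ${subcategory} => Sum = ${sum}`);
  }
}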
The generic form of what you're looking for is called groupBy. You could probably find a JS implementation of it and use that if you preferred; a hand-rolled sketch follows.
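A minimal generic groupBy sketch (hand-rolled, not from any particular library):
function groupBy<T, K extends PropertyKey>(arr: T[], key: (item: T) => K): Record<K, T[]> {
  return arr.reduce((acc, item) => {
    const k = key(item);
    (acc[k] = acc[k] ?? []).push(item);
    return acc;
  }, {} as Record<K, T[]>);
}
// e.g. groupBy(items, x => x.category) yields { 12: [...], 13: [...], 14: [...] }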
There's another way to do it which is slightly slower but may be intuitively clearer. This version includes the printing step:
let uniqueCategories = [...new Set(items.map(x => x.category))];
uniqueCategories.forEach(category => {
console.log(`for (category = ${category}) :`);
// get the distinct subcategories within this category
let subcategories = [...new Set(items.filter(x => x.category === category).map(x => x.subcategory))];
subcategories.forEach(subcategory => {
let sum: number = items.filter(x => x.category === category && x.subcategory === subcategory)
.reduce((acc, cur) => acc + cur.inv_amt, 0);
console.log(`subcategory : ${subcategory} => Sum = ${sum}`)
});
});
You get the categories and loop through them; within each, you get the subcategories and loop through those, filtering the entire list down each time.
I need to update a specific soldier in this user collection. For example:
user: {
myArmy : {
money : 100,
fans : 100,
mySoldiers : [{
_id : ddd111bbb,
mySkill : 50,
myStamina : 50,
myMoral : 50,
},
{
_id : ddd111dd ,
mySkill : 50,
myStamina : 50,
myMoral : 50,
}],
}
}
I want my update query to do something like the following:
conditions = { _id : user._id };
update =
{ 'myArmy.mySoldiers._id' : soldierId},
{
'$set': {
'myArmy.money' : balanceToSet,
'myArmy.fans' : fansToSet,
'myArmy.mySoldiers.$.skill': skillToSet,
'myArmy.mySoldiers.$.stamina': staminaToSet,
'myArmy.mySoldiers.$.moral': moralToSet
}
}
and this is the final query:
User.update(conditions, update, options, function (err) {
if (err) deferred.reject(err);
stream.resume();
});
And the end result, if soldierId is 'ddd111bbb', should be:
user: {
myArmy : {
money : 200,
fans : 100,
mySoldiers : [{
_id : ddd111bbb,
mySkill : 150,
myStamina : 250,
myMoral : 50,
},
{
_id : ddd111dd ,
mySkill : 50,
myStamina : 50,
myMoral : 50,
}],
}
}
The skill, stamina and moral values should change only on that specific soldier.
How do I get the $ to resolve to this soldier's index in the array? What is missing from the update query above?
This is what I was looking for:
conditions = { _id : user._id , 'myArmy.mySoldiers._id' : soldierId};
update = {
$set: {
'myArmy.balance': balanceToSet,
'myArmy.fans' : fansToSet,
'myArmy.tokens' : tokensToSet,
'myArmy.mySoldiers.$.skill' : skillToSet,
'myArmy.mySoldiers.$.stamina': staminaToSet,
'myArmy.mySoldiers.$.moral' : moralToSet
}
}
This gave me the result I wanted. Earlier I had accidentally put the array-match condition inside the update document instead of the query conditions...
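For reference, the positional $ operator resolves to the index of the first array element matched by the query, which is why 'myArmy.mySoldiers._id' has to appear in the conditions rather than in the update. A minimal shell sketch of the same idea (collection and variable names are hypothetical):
db.users.update(
  { _id: user._id, 'myArmy.mySoldiers._id': soldierId },
  { $set: { 'myArmy.mySoldiers.$.mySkill': 150 } }
);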
I have raw data which is processed by the aggregation framework, and the results are later saved in another collection. Let's assume the aggregation results in something like this:
cursor = {
"result" : [
{
"_id" : {
"x" : 1,
"version" : [
"2_0"
],
"date" : {
"year" : 2015,
"month" : 3,
"day" : 26
}
},
"x" : 1,
"count" : 2
}
],
"ok" : 1
};
Note that in most cases the cursor contains more than about 2k elements.
So now I'm looping through the cursor (cursor.forEach) and performing the following steps:
// Try to increment values:
var inc = db.runCommand({
findAndModify: my_coll,
query : {
"_id.x" : "1",
"value.2_0" : {
"$elemMatch" : {
"date" : ISODate("2015-12-18T00:00:00Z")
}
}
},
update : { $inc: {
"value.2_0.$.x" : 1
} }
});
// If no row was affected by the inc operation, the sub-element doesn't exist at all,
// so let's push it
if (inc.value == null) {
date[date.key] = date.value;
var up = db.getCollection(my_coll).update(
{
"_id.x" : 1
},
{
$push : {}
},
{ writeConcern: { w: "majority", wtimeout: 5000 } }
);
// No document found to push the sub-element into, so let's create it
if (up.nMatched == 0) {
db.getCollection(my_coll).insert({
"_id" : {
"x" : 1
},
"value" : {}
});
}}
Resulting data structure:
data = {
"_id" : {
"x" : 1,
"y" : 1
},
"value" : {
"2_0" : [
{
"date" : ISODate("2014-12-17T00:00:00.000Z"),
"x" : 1
},
{
"date" : ISODate("2014-12-18T00:00:00.000Z"),
"x" : 2
}
]
}
};
In short, I have to apply these actions to process my data (a consolidated sketch follows the list):
Try to increment values.
If no data was affected by the increment operation, push the data onto the array.
If no data was affected by the push operation, create a new document.
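The three steps, gathered into one helper for clarity (a sketch with hypothetical names; the shape of the pushed sub-element is assumed from the resulting data structure shown above):
// One upsert pass per aggregated row.
function applyRow(collName, x, date) {
  // 1. Try to increment an existing sub-element for this date.
  var inc = db.runCommand({
    findAndModify: collName,
    query: { "_id.x": x, "value.2_0": { $elemMatch: { date: date } } },
    update: { $inc: { "value.2_0.$.x": 1 } }
  });
  if (inc.value != null) return;
  // 2. No sub-element matched: push a new one onto the array.
  var up = db.getCollection(collName).update(
    { "_id.x": x },
    { $push: { "value.2_0": { date: date, x: 1 } } }
  );
  if (up.nMatched > 0) return;
  // 3. No document matched at all: insert one with the first sub-element.
  db.getCollection(collName).insert({
    _id: { x: x },
    value: { "2_0": [ { date: date, x: 1 } ] }
  });
}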
Problem:
In some cases the aggregation returns more than 2k results to which I have to apply the steps above, and this causes a performance bottleneck. While I'm processing the already-aggregated data, new raw data accumulates for aggregation, and later I cannot even run the aggregation on that new raw data because it exceeds the 64MB size limit, all due to the slowness of the first pass.
Question:
With this data structure, how can I improve performance when incrementing the x values (see the data structure above) or adding sub-elements?
Also, I cannot use MongoDB bulk operations because the nested structure relies on the positional parameter.
Maybe the chosen data model is not correct? Or maybe I'm not approaching the aggregation task correctly at all?
How can I improve the insertion of aggregated data?