I have the following aggregation query that works when I use the command line in Mongo.
{'$group':
{ '_id':
{'serviceName': '$serviceName'},
'timeAverage':
{'$avg':
{'$subtract': ['$lastCheckTime', '$enqueuedTime']}
}
}
}
But as far as I can tell, in Spring MongoDB there is no support for doing "subtract" inside of an avg operation in a group operation.
How would I go about making this work?
You could try projecting the difference field first by using the SpEL andExpression in the projection operation and then use it in the avg accumulator in the group operation:
Aggregation agg = newAggregation(
project("serviceName")
.andExpression("lastCheckTime - enqueuedTime").as("interval"),
group("serviceName")
.avg("interval").as("timeAverage")
);
Alternatively, use the $subtract arithmetic aggregation operator, which Spring Data MongoDB exposes as minus():
Aggregation agg = newAggregation(
project("serviceName")
.and("lastCheckTime").minus("enqueuedTime").as("interval"),
group("serviceName")
.avg("interval").as("timeAverage")
);
This translates to the following native aggregation pipeline:
[
{
"$project": {
"serviceName": 1,
"interval": { "$subtract":[ "$lastCheckTime", "$enqueuedTime" ] }
}
},
{
"$group": {
"_id": "$serviceName",
"timeAverage": { "$avg": "$interval" }
}
}
]
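To make explicit what this two-stage pipeline computes, here is a minimal plain-Java sketch of the same arithmetic, with no MongoDB involved. The Check class and the sample values are hypothetical stand-ins for the documents; only the three field names come from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class IntervalAverageSketch {

    // Hypothetical stand-in for a document with the three fields the pipeline uses.
    static final class Check {
        final String serviceName;
        final long enqueuedTime;
        final long lastCheckTime;

        Check(String serviceName, long enqueuedTime, long lastCheckTime) {
            this.serviceName = serviceName;
            this.enqueuedTime = enqueuedTime;
            this.lastCheckTime = lastCheckTime;
        }
    }

    // Mirrors $project (interval = lastCheckTime - enqueuedTime) followed by
    // $group on serviceName with $avg over interval.
    static Map<String, Double> averageIntervalByService(List<Check> checks) {
        return checks.stream().collect(Collectors.groupingBy(
                c -> c.serviceName,
                TreeMap::new,
                Collectors.averagingLong(c -> c.lastCheckTime - c.enqueuedTime)));
    }

    public static void main(String[] args) {
        List<Check> checks = Arrays.asList(
                new Check("billing", 100L, 160L),
                new Check("billing", 200L, 240L),
                new Check("search", 300L, 330L));
        // prints {billing=50.0, search=30.0}
        System.out.println(averageIntervalByService(checks));
    }
}
```

The subtraction happens per document before any grouping, which is why projecting the interval first and averaging it afterwards gives the same result as the original single-stage $group.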
I've been at this for a day and I don't quite understand how I do it! This is the query I want to "recreate" with the new Java API Client (using Spring Boot)
{
"aggs": {
"range": {
"date_range": {
"field": "timestamp",
"ranges": [
{ "to": "now-2d" }
]
}
}
,
"aggs": {
"top_hits": {
"_source": {
"includes": [ "Id", "timestamp" ]
}
}
}
}
}
I tried doing it with DateRangeAggregation.of, but I can't seem to get the right results or type. Here's what I have:
SearchResponse<MyDto> response = client.search(b -> b
.index("test-index")
.size(0)
.aggregations("range",a->a.dateRange(DateRangeAggregation.of(d->d
.field("timestamp").ranges(r->r.to(t->t.expr("now-2d"))))))
.aggregations("hits", a -> a
.topHits(h->h.source(SourceConfig.of(c->c.filter(f->f.includes(Arrays.asList("Id", "timestamp"))))))),
MyDto.class
);
I've also tried removing the sub-aggregation and query for now, but I don't seem to be on the right track to even get the doc_count from the bucket. I don't really understand how to work with dateRange() here.
Edit: I played around a bit and was able to at least get the doc_count, though I'm not sure whether this is a good way to do it:
Aggregation agg = Aggregation.of(a -> a
.dateRange(d->d.field("timestamp").ranges(r->r.to(FieldDateMath.of(v->v.expr("now-2d"))))));
SearchResponse<MyDto> response = client.search(b -> b
.index("test-index")
.size(0)
.aggregations("range", agg),
MyDto.class
);
return response.aggregations().get("range").dateRange().buckets().array().get(0).docCount();
I also fixed the query above; it had an unnecessary extra query that broke the result.
My thought process was wrong. I wanted the documents that were aggregated within this time range, and I mistakenly thought top_hits would give them to me, but that's not how it works! I made a separate range query that actually returns the documents I needed instead.
The MongoDB documentation states that it is possible to put multiple conditions inside the cond of a $filter, like this:
{
$filter: {
input: [ 1, "a", 2, null, 3.1, NumberLong(4), "5" ],
as: "num",
cond: { $and: [
{ $gte: [ "$$num", NumberLong("-9223372036854775807") ] },
{ $lte: [ "$$num", NumberLong("9223372036854775807") ] }
] }
}
}
Source: https://docs.mongodb.com/manual/reference/operator/aggregation/filter/
I would like to do the same in Spring Data MongoDB. I have this aggregation:
val aggregation = Aggregation.newAggregation(
project()
.and(
filter("some")
.`as`("some")
.by(
ComparisonOperators.Lte.valueOf("some.field").lessThanEqualToValue(2000)
/* here I would like to add an "and", like .and(valueOf("some.field").greaterThanValue(1000)), but it does not exist on the value returned by valueOf */
)
)
)
How can I do this? I was looking for some "and" function at the level of "by" or "filter", but it does not exist. Adding another "and" would override my previous value (for example, if I wanted to filter on a range of some.field or on multiple fields). How do I do this for multiple fields?
I have articles and tags collections. Articles contain tags, which is an array of ObjectIds. I want to fetch the tagName as well, so I unwind (this gives me multiple rows, one per tag array entry) => lookup (joins with the tags collection) => group (combines it back into the original result set).
My MongoDB query is as follows, and it gives me the correct result:
db.articles.aggregate([
{"$unwind": "$tags"},
{
"$lookup": {
"localField": "tags",
"from": "tags",
"foreignField": "_id",
"as": "materialTags"
}
},
{
"$group": {
"_id": "$_id",
"title": {"$first": "$title"},
"materialTags": {"$push": "$materialTags"}
}
}
])
My corresponding Spring code:
UnwindOperation unwindOperation = Aggregation.unwind("tags");
LookupOperation lookupOperation1 = LookupOperation.newLookup()
.from("tags")
.localField("tags")
.foreignField("_id")
.as("materialTags");
//I also want to add group operation but unable to find the proper syntax ??.
Aggregation aggregation = Aggregation.newAggregation(unwindOperation,
lookupOperation1, ??groupOperation?? );
AggregationResults<Article> resultList
= mongoTemplate.aggregate(aggregation, "articles", Article.class);
I tried to play around with the group operation, but without much luck. How can I add a group operation matching the original query?
Thanks in advance.
Group query syntax in Spring for
{
"$group": {
"_id": "$_id",
"title": {"$first": "$title"},
"materialTags": {"$push": "$materialTags"}
}
}
is
Aggregation.group("_id").first("title").as("title").push("materialTags").as("materialTags")
Final query
UnwindOperation unwindOperation = Aggregation.unwind("tags");
LookupOperation lookupOperation1 = LookupOperation.newLookup()
.from("tags")
.localField("tags")
.foreignField("_id")
.as("materialTags");
Aggregation aggregation = Aggregation.newAggregation(unwindOperation,
lookupOperation1, Aggregation.group("_id").first("title").as("title").push("materialTags").as("materialTags") );
AggregationResults<Article> resultList
= mongoTemplate.aggregate(aggregation, "articles", Article.class);
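As a rough illustration of what the three stages do to the data, the following plain-Java sketch walks through unwind, lookup, and group by hand. It does not use MongoDB or Spring; the Article class, the tag map, and the sample values are hypothetical, and the title field handled by $first is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnwindLookupGroupSketch {

    // Hypothetical stand-in for an article document holding tag ids.
    static final class Article {
        final String id;
        final List<String> tags;

        Article(String id, List<String> tags) {
            this.id = id;
            this.tags = tags;
        }
    }

    // For each article id, collects one lookup result (a list of matching tag names)
    // per unwound tag id, which is what $push of the $lookup output produces.
    static Map<String, List<List<String>>> materialTags(
            List<Article> articles, Map<String, String> tagNamesById) {
        Map<String, List<List<String>>> grouped = new LinkedHashMap<>();
        for (Article article : articles) {
            for (String tagId : article.tags) {            // $unwind: one row per tag id
                List<String> matches = new ArrayList<>();  // $lookup: array of matching tags
                String name = tagNamesById.get(tagId);
                if (name != null) {
                    matches.add(name);
                }
                // $group by _id with $push of the lookup array
                grouped.computeIfAbsent(article.id, k -> new ArrayList<>()).add(matches);
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("t1", "java");
        tags.put("t2", "mongodb");
        List<Article> articles = Arrays.asList(new Article("a1", Arrays.asList("t1", "t2")));
        // prints {a1=[[java], [mongodb]]}
        System.out.println(materialTags(articles, tags));
    }
}
```

Note the nested lists in the output: because $lookup always yields an array, pushing it in the group stage gives an array of arrays, one inner array per unwound tag.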
For more information, see the references below:
http://www.baeldung.com/spring-data-mongodb-projections-aggregations
spring data mongodb group by
Create Spring Data Aggregation from MongoDb aggregation query
https://www.javacodegeeks.com/2016/04/data-aggregation-spring-data-mongodb-spring-boot.html
I have a collection in the following format:
{
id:____,
name:"Carlos",
city:"Mumbai"
},
{
id:____,
name:"Pravin",
city:"Mumbai"
},
{
id:_____,
name:"Gaurav",
city:"Ahmedabad"
}
I want the whole document, distinct by city. I tried db.collection.distinct("city"), but it returns only the distinct cities.
Current Output:
["Mumbai","Ahmedabad"]
Expected Output:
{
id:____,
name:"Carlos",
city:"Mumbai"
},
{
id:_____,
name:"Gaurav",
city:"Ahmedabad"
}
As you can see above, there is only one record for "Mumbai". This is the kind of output I need.
Does anyone know how to get the whole document with distinct in spring-mongodb?
You could try running an aggregation pipeline in which you include the other fields inside the $group pipeline stage using the $first operator. Two examples that show this approach follow:
Mongo Shell:
pipeline = [
{
"$group": {
"_id": "$city",
"id": { "$first": "$_id" },
"name": { "$first": "$name" }
}
}
]
db.collection.aggregate(pipeline);
Spring Data MongoDB:
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
MongoTemplate mongoTemplate = repository.getMongoTemplate();
Aggregation agg = newAggregation(
group("city")
.first("_id").as("id")
.first("name").as("name")
);
AggregationResults<OutputType> result = mongoTemplate.aggregate(agg,
"collection", OutputType.class);
List<OutputType> mappedResult = result.getMappedResults();
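For intuition, grouping on city with $first amounts to keeping the first document seen for each city. A minimal plain-Java sketch of that idea follows; the Person class is a hypothetical stand-in for the documents and the ids are made up. One caveat: this sketch preserves input order, whereas MongoDB does not guarantee the order of $group output, and without a preceding $sort the document that $first picks depends on the order in which documents reach the stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FirstPerCitySketch {

    // Hypothetical stand-in for a document in the collection.
    static final class Person {
        final int id;
        final String name;
        final String city;

        Person(int id, String name, String city) {
            this.id = id;
            this.name = name;
            this.city = city;
        }
    }

    // Keeps the first person encountered for each city, like $group with $first.
    static List<Person> firstPerCity(List<Person> people) {
        Map<String, Person> firstByCity = new LinkedHashMap<>();
        for (Person person : people) {
            firstByCity.putIfAbsent(person.city, person);
        }
        return new ArrayList<>(firstByCity.values());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person(1, "Carlos", "Mumbai"),
                new Person(2, "Pravin", "Mumbai"),
                new Person(3, "Gaurav", "Ahmedabad"));
        for (Person person : firstPerCity(people)) {
            // prints "Carlos (Mumbai)" then "Gaurav (Ahmedabad)"
            System.out.println(person.name + " (" + person.city + ")");
        }
    }
}
```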
I'm using Elasticsearch and Nest to create a query for documents within a specific time range as well as doing some filter facets. The query looks like this:
{
"facets": {
"notfound": {
"query": {
"term": {
"statusCode": {
"value": 404
}
}
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"time": {
"from": "2014-04-05T05:25:37",
"to": "2014-04-07T05:25:37"
}
}
}
]
}
}
}
In the specific case, the total hits of the search is 21 documents, which fits the documents within that time range in Elasticsearch. But the "notfound" facet returns 38, which fits the total number of ErrorDocuments with a StatusCode value of 404.
As I understand the documentation, facets collect data from within the search. In this case, the "notfound" facet should never be able to return a count higher than 21.
What am I doing wrong here?
There's a distinct difference between a filter, a query, a filtered query, and a facet filter, and it is good to know.
Top level filter
{
filter: {}
}
This acts as a post-filter, meaning it filters the results after the query phase has ended. Since facets are part of the query phase, filters do not influence the documents that are faceted over. Filters do not alter the score and are therefore very cacheable.
Top level query
{
query: {}
}
Queries influence the score of a document and are therefore less cacheable than filters. Queries run in the query phase and thus also influence the documents that are faceted over.
Filtered query
{
query: {
filtered: {
filter: {},
query: {}
}
}
}
This allows you to run filters in the query phase, taking advantage of their better cacheability while having them influence the documents that are faceted over.
Facet filter
"facets" : {
"<FACET NAME>" : {
"<FACET TYPE>" : {
...
},
"facet_filter" : {
"term" : { "user" : "kimchy"}
}
}
}
This allows you to apply a filter to the documents that the facet is run over. Remember that the result will be a combination of the query phase and the facet filter unless you also specify global:true on the facet.
Query Facet/Filter Facet
{
"facets" : {
"wow_facet" : {
"query" : {
"term" : { "tag" : "wow" }
}
}
}
}
This is the one that @thomasardal is using in this case, which is perfectly fine: it is a facet type that returns a single value, the query hit count.
Your Query Facet returns 38 and not 21 because you use a top-level filter for your time range.
You can fix this either by moving the filter into a filtered query so that it runs in the query phase, or by applying a facet filter (not a filter facet) to your query facet. Because filters are cached better, though, you are better off using a facet filter inside a filter facet.
Confusingly, Filter Facets are specified using .FacetFilter() on the search object. I will change this in 1.0 to avoid future confusion.
Sadly, .FacetFilter() and .FacetQuery() in NEST do not allow you to specify a facet filter like you can with other facets:
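The difference between the two placements of the filter can be sketched in plain Java, without Elasticsearch. Everything here is hypothetical: the Doc class, the day number standing in for the timestamp, and the sample data. Each method returns the pair {hits, facet count}.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class PostFilterSketch {

    // Hypothetical document: a status code plus a day number standing in for the timestamp.
    static final class Doc {
        final int statusCode;
        final int day;

        Doc(int statusCode, int day) {
            this.statusCode = statusCode;
            this.day = day;
        }
    }

    // Top-level filter: the facet counts every document from the query phase;
    // the filter only trims the returned hits afterwards.
    static long[] withPostFilter(List<Doc> docs, Predicate<Doc> facet, Predicate<Doc> filter) {
        long facetCount = docs.stream().filter(facet).count();
        long hits = docs.stream().filter(filter).count();
        return new long[] {hits, facetCount};
    }

    // Filtered query: the filter runs in the query phase, so the facet
    // only sees documents that already passed the filter.
    static long[] withFilteredQuery(List<Doc> docs, Predicate<Doc> facet, Predicate<Doc> filter) {
        long hits = docs.stream().filter(filter).count();
        long facetCount = docs.stream().filter(filter.and(facet)).count();
        return new long[] {hits, facetCount};
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc(404, 1), new Doc(404, 5), new Doc(200, 5), new Doc(404, 9));
        Predicate<Doc> notFound = d -> d.statusCode == 404;
        Predicate<Doc> inRange = d -> d.day >= 4 && d.day <= 6;
        System.out.println(Arrays.toString(withPostFilter(docs, notFound, inRange)));    // [2, 3]
        System.out.println(Arrays.toString(withFilteredQuery(docs, notFound, inRange))); // [2, 1]
    }
}
```

With the post-filter, the facet count (3) exceeds the number of hits (2), which is the same effect as the 38 versus 21 in the question.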
var results = typedClient.Search<object>(s => s
.FacetTerm(ft=>ft
.OnField("myfield")
.FacetFilter(f=>f.Term("filter_facet_on_this_field", "value"))
)
);
Your issue here is that you are performing a Filter Facet and not a normal facet on your query (which would follow the restrictions applied via the query filter). In the JSON, the issue is the "query" between the facet name "notfound" and the "term" entry. This tells Elasticsearch to run this as a separate query and facet on the results of that separate query, not on your main query with the date range filter. So your JSON should look like the following:
{
"facets": {
"notfound": {
"term": {
"statusCode": {
"value": 404
}
}
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"time": {
"from": "2014-04-05T05:25:37",
"to": "2014-04-07T05:25:37"
}
}
}
]
}
}
}
Since I see you have this tagged with NEST as well: in your NEST call you are probably using FacetFilter on your search request. Switch this to just Facet to get the desired result.