Elasticsearch - Grouping aggregation - 2 fields

My mapping is
{
"myapp": {
"mappings": {
"attempts": {
"properties": {
"answers": {
"properties": {
"question_id": {
"type": "long"
},
"status": {
"type": "long"
}
}
},
"exam_id": {
"type": "long"
}
}
}
}
}
}
and I want to group by question_id and status.
For every question_id, I want to know how many answers have status 1 or 2.
P.S. Two attempts can contain the same questions.

First of all, you need to update your mapping and make answers a nested field. If it is not nested, the answers lose the correlation between the question_id field and the status field.
{
"myapp": {
"mappings": {
"attempts": {
"properties": {
"answers": {
"type":"nested", <-- Update here
"properties": {
"question_id": {
"type": "long"
},
"status": {
"type": "long"
}
}
},
"exam_id": {
"type": "long"
}
}
}
}
}
}
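To make the loss of correlation concrete, here is a minimal Python sketch (not Elasticsearch code) that mimics how an array of plain objects is flattened into one multi-valued list per sub-field. The sample document and the helper names flatten_object_array, matches_flat and matches_nested are made up for illustration.

```python
# Hypothetical sample document; field names follow the mapping above.
attempt = {
    "exam_id": 1,
    "answers": [
        {"question_id": 10, "status": 1},
        {"question_id": 11, "status": 2},
    ],
}


def flatten_object_array(doc, field):
    """Mimic non-nested indexing: one multi-valued list per sub-field."""
    flat = {}
    for item in doc[field]:
        for key, value in item.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def matches_flat(flat, question_id, status):
    """Each condition is checked against its own list, independently."""
    return (question_id in flat["answers.question_id"]
            and status in flat["answers.status"])


def matches_nested(doc, question_id, status):
    """Both conditions must hold on the same answer object."""
    return any(a["question_id"] == question_id and a["status"] == status
               for a in doc["answers"])


flat = flatten_object_array(attempt, "answers")
print(flat)
print(matches_flat(flat, 10, 2))       # True: a false positive
print(matches_nested(attempt, 10, 2))  # False: question 10 has status 1
```

With the flattened form, question 10 appears to have status 2 even though no single answer says so, which is the problem the nested type avoids.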
You can use status in a sub-aggregation as shown below
"aggs": {
"nested_qid_agg": {
"nested": {
"path": "answers"
},
"aggs": {
"qid": {
"terms": {
"field": "answers.question_id",
"size": 0
},
"aggs": {
"status": {
"terms": {
"field": "answers.status",
"size": 0
}
}
}
}
}
}
}
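The response to this aggregation contains one bucket per question_id, each holding its own status buckets. A small Python sketch of reading it into a lookup table follows; the sample response is made up, and counts_by_question is a hypothetical helper, not part of any client library.

```python
def counts_by_question(response):
    """Turn the nested terms/terms response into {question_id: {status: count}}."""
    result = {}
    qid_buckets = response["aggregations"]["nested_qid_agg"]["qid"]["buckets"]
    for qid_bucket in qid_buckets:
        result[qid_bucket["key"]] = {
            status_bucket["key"]: status_bucket["doc_count"]
            for status_bucket in qid_bucket["status"]["buckets"]
        }
    return result


# Made-up response with the shape the aggregation above would produce.
sample_response = {
    "aggregations": {
        "nested_qid_agg": {
            "doc_count": 5,
            "qid": {
                "buckets": [
                    {"key": 10, "doc_count": 3, "status": {"buckets": [
                        {"key": 1, "doc_count": 2},
                        {"key": 2, "doc_count": 1},
                    ]}},
                    {"key": 11, "doc_count": 2, "status": {"buckets": [
                        {"key": 2, "doc_count": 2},
                    ]}},
                ]
            },
        }
    }
}

counts = counts_by_question(sample_response)
print(counts)  # {10: {1: 2, 2: 1}, 11: {2: 2}}
```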

Related

Aggregation based off of nested document field with filters on nested document and parent

I have the following mapping:
{
"accountId": {
"type": "long"
},
"storeProductId": {
"type": "long"
},
"storeSchemaId": {
"type": "long"
},
"yoyoValues": {
"type": "nested",
"properties": {
"yoyoNameId": {
"type": "long"
},
"dataType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "long"
},
"languageId": {
"type": "long"
},
"value_Number": {
"type": "float"
},
"value_Raw": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
and I'm trying to get the max and min values of value_Number for all nested documents with a yoyoNameId of 3 whose parent document has an accountId of 1285 and a storeSchemaId of 241.
Every time I've tried, I've been unable to filter the nested documents properly, so I end up with the min and max values across all nested documents of the matching parent documents.
I've tried several different queries, but my most recent one is as follows:
{
"size": 0,
"aggs": {
"filter-layer": {
"filters": {
"filters": [
{
"term": {
"accountId": 1285
}
},
{
"term": {
"yoyoSchemaId": 241
}
},
{
"nested": {
"path": "yoyoValues",
"query": {
"bool": {
"filter": [
{
"term": {
"yoyoValues.yoyoNameId": 3
}
}
]
}
}
}
}
]
},
"aggs": {
"yoyoValues": {
"nested": {
"path": "yoyoValues"
},
"inner": {
"filter": {
"term": {
"yoyoValues.yoyoNameId": 3
}
},
"aggs": {
"min_value": {
"min": {
"field": "yoyoValues.value_Number"
}
},
"max_value": {
"max": {
"field": "yoyoValues.value_Number"
}
}
}
}
}
}
}
}
}
Can someone please help me correct this query? I'm limited to elastic v7.13.
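One possible shape for such a request, sketched as a Python dict: the parent conditions go into the top-level query, and the nested condition goes into a filter aggregation placed under the nested aggregation's own aggs, so only matching nested documents reach min and max. This is a sketch under assumptions, not a verified answer: it uses storeSchemaId from the mapping (the attempt above uses yoyoSchemaId), and build_min_max_query is a hypothetical helper name.

```python
import json


def build_min_max_query(account_id, store_schema_id, yoyo_name_id):
    """Build a search body: parent filters in the query, nested filter in aggs."""
    return {
        "size": 0,
        # Restrict the parent documents first.
        "query": {
            "bool": {
                "filter": [
                    {"term": {"accountId": account_id}},
                    # Assumption: the mapping's field is storeSchemaId.
                    {"term": {"storeSchemaId": store_schema_id}},
                ]
            }
        },
        "aggs": {
            "yoyoValues": {
                # Step into the nested documents of the matching parents.
                "nested": {"path": "yoyoValues"},
                "aggs": {
                    "inner": {
                        # Keep only the nested documents with the wanted name id.
                        "filter": {"term": {"yoyoValues.yoyoNameId": yoyo_name_id}},
                        "aggs": {
                            "min_value": {"min": {"field": "yoyoValues.value_Number"}},
                            "max_value": {"max": {"field": "yoyoValues.value_Number"}},
                        },
                    }
                },
            }
        },
    }


body = build_min_max_query(1285, 241, 3)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request; the values would then be found under aggregations.yoyoValues.inner in the response.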

ElasticSearch - How can I do nested field aggregation with field aliases?

I'm trying to query a nested field's inner hits for cardinality, but it's not working for field aliases (where resellers.price is an alias). I'm using an Elasticsearch example to show this:
GET /products/_search
{
"aggs": {
"resellers": {
"nested": {
"path": "resellers"
},
"aggs": {
"unique_prices": {
"cardinality": { "field": "resellers.price" }
}
}
}
}
}
Adding a working example with index data, mapping, search query and search result
Index Mapping:
{
"mappings": {
"properties": {
"resellers": {
"type": "nested",
"properties": {
"cost": {
"type": "integer"
},
"price": {
"type": "alias",
"path": "resellers.cost"
}
}
}
}
}
}
Index Data:
{
"resellers": {
"cost": 200
}
}
{
"resellers": {
"cost": 100
}
}
{
"resellers": {
"cost": 200
}
}
Search Query:
{
"size": 0,
"aggs": {
"resellers": {
"nested": {
"path": "resellers"
},
"aggs": {
"unique_prices": {
"cardinality": {
"field": "resellers.price"
}
}
}
}
}
}
Search Result:
"aggregations": {
"resellers": {
"doc_count": 3,
"unique_prices": {
"value": 2
}
}
}

In Kibana how can you sum nested fields and then bucket for each document?

We have multiple nested fields which need to be summed and then graphed almost as if it were a value of the parent document (using scripted fields is not an ideal solution for us).
Given the example index mapping:
{
"mapping": {
"_doc": {
"properties": {
"build_name": { "type": "keyword" },
"start_ms": { "type": "date" },
"projects": {
"type": "nested",
"properties": {
"project_duration_ms": { "type": "long" },
"project_name": { "type": "keyword" }
}
}
}
}
}
}
Example doc._source:
{
"build_name": "example_build_1",
"start_ms": "1611252094540",
"projects": [
{ "project_duration_ms": "19381", "project_name": "example_project_1" },
{ "project_duration_ms": "2081", "project_name": "example_project_2" }
]
},
{
"build_name": "example_build_2",
"start_ms": "1611252097638",
"projects": [
{ "project_duration_ms": "21546", "project_name": "example_project_1" },
{ "project_duration_ms": "2354", "project_name": "example_project_2" }
]
}
It would be ideal to get an aggregation something like:
....
"aggregations" : {
"builds" : {
"total_durations" : {
"buckets" : [
{
"key": "example_build_1",
"start_ms": "1611252094540",
"total_duration": "21462"
},
{
"key": "example_build_2",
"start_ms": "1611252097638",
"total_duration": "23900"
}
]
}
}
}
}
No scripted fields necessary. This nested sum aggregation should do the trick:
{
"size": 0,
"aggs": {
"builds": {
"terms": {
"field": "build_name"
},
"aggs": {
"total_durations_parent": {
"nested": {
"path": "projects"
},
"aggs": {
"total_durations": {
"sum": {
"field": "projects.project_duration_ms"
}
}
}
}
}
}
}
}
Your use case is a great candidate for the copy_to parameter, which puts the build durations into one top-level list of longs so that the nested query isn't required when summing them up.
Adjust the mapping like so:
"properties": {
"build_name": { "type": "keyword" },
"start_ms": { "type": "date" },
"total_duration_ms": { "type": "long" }, <--
"projects": {
"type": "nested",
"properties": {
"project_duration_ms": {
"type": "long",
"copy_to": "total_duration_ms" <--
},
"project_name": { "type": "keyword" }
}
}
}
After reindexing (which is required due to the newly added field), the above query gets simplified to:
{
"size": 0,
"aggs": {
"builds": {
"terms": {
"field": "build_name"
},
"aggs": {
"total_durations": {
"sum": {
"field": "total_duration_ms"
}
}
}
}
}
}
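As a sanity check on the expected totals, here is a plain-Python sketch of what the copy_to approach computes, using the two example documents from the question. It only models the indexed values (copy_to does not change _source), and apply_copy_to and sum_by_build are hypothetical helper names.

```python
builds = [
    {"build_name": "example_build_1", "projects": [
        {"project_duration_ms": 19381, "project_name": "example_project_1"},
        {"project_duration_ms": 2081, "project_name": "example_project_2"},
    ]},
    {"build_name": "example_build_2", "projects": [
        {"project_duration_ms": 21546, "project_name": "example_project_1"},
        {"project_duration_ms": 2354, "project_name": "example_project_2"},
    ]},
]


def apply_copy_to(doc):
    """Model copy_to: collect every project duration into one top-level list."""
    indexed = dict(doc)
    indexed["total_duration_ms"] = [
        project["project_duration_ms"] for project in doc["projects"]
    ]
    return indexed


def sum_by_build(docs):
    """Model the terms + sum aggregation on the top-level field."""
    totals = {}
    for doc in docs:
        name = doc["build_name"]
        totals[name] = totals.get(name, 0) + sum(doc["total_duration_ms"])
    return totals


totals = sum_by_build([apply_copy_to(doc) for doc in builds])
print(totals)  # {'example_build_1': 21462, 'example_build_2': 23900}
```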

Nested aggregation in nested field?

I am new to Elasticsearch and don't know a lot about aggregations, but I have this ES6 mapping:
{
"mappings": {
"test": {
"properties": {
"id": {
"type": "integer"
},
"countries": {
"type": "nested",
"properties": {
"global_id": {
"type": "keyword"
},
"name": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
},
"areas": {
"type": "nested",
"properties": {
"global_id": {
"type": "keyword"
},
"name": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"parent_global_id": {
"type": "keyword"
}
}
}
}
}
}
}
How can I get all documents grouped by areas and then by countries? The documents also have to be returned in full, not just the nested documents. Is this even possible?
1) Aggregation _search query:
First aggregate by area, using a nested aggregation with a path because areas is nested. Then use reverse_nested to go back to the root document and apply a nested aggregation on countries.
{
"size": 0,
"aggs": {
"agg_areas": {
"nested": {
"path": "areas"
},
"aggs": {
"areas_name": {
"terms": {
"field": "areas.name.raw"
},
"aggs": {
"agg_reverse": {
"reverse_nested": {},
"aggs": {
"agg_countries": {
"nested": {
"path": "countries"
},
"aggs": {
"countries_name": {
"terms": {
"field": "countries.name.raw"
}
}
}
}
}
}
}
}
}
}
}
}
2) Retrieve documents:
Add a top_hits aggregation inside your aggregation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html
top_hits is slow, so read the documentation and adjust size and sort to your context.
...
"terms": {
"field": "areas.name.raw"
},
"aggregations": {
"hits": {
"top_hits": { "size": 100}
}
},
...
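Putting both steps together, one possible full request body is sketched below as a Python dict. Two choices here are assumptions rather than part of the answer above: top_hits is placed under reverse_nested, on the reasoning that hits collected there are root documents rather than nested area objects, and the terms aggregations use the raw keyword sub-fields from the mapping. build_area_country_query and hits_per_area are hypothetical names.

```python
import json


def build_area_country_query(hits_per_area=3):
    """Group by area, then by country, and collect root documents per area."""
    return {
        "size": 0,
        "aggs": {
            "agg_areas": {
                "nested": {"path": "areas"},
                "aggs": {
                    "areas_name": {
                        # Assumption: aggregate on the keyword sub-field.
                        "terms": {"field": "areas.name.raw"},
                        "aggs": {
                            "agg_reverse": {
                                # Back to the root document.
                                "reverse_nested": {},
                                "aggs": {
                                    # Assumption: hits here are full root documents.
                                    "hits": {"top_hits": {"size": hits_per_area}},
                                    "agg_countries": {
                                        "nested": {"path": "countries"},
                                        "aggs": {
                                            "countries_name": {
                                                "terms": {"field": "countries.name.raw"}
                                            }
                                        },
                                    },
                                },
                            }
                        },
                    }
                },
            }
        },
    }


area_query = build_area_country_query(hits_per_area=5)
print(json.dumps(area_query, indent=2))
```

Keeping the top_hits size small matters, since every area bucket carries its own list of hits.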

Elasticsearch aggregation on nested objects

I have a document with the following mappings:
{
"some_doc_name": {
"mappings": {
"_doc": {
"properties": {
"stages": {
"properties": {
"name": {
"type": "text"
},
"durationMillis": {
"type": "long"
}
}
}
}
}
}
}
}
And I would like to have an aggregation like: "The average duration of the stages whose name contains the SCM token".
I tried something like:
{
"aggs": {
"scm_stage": {
"filter": {
"bool": {
"should": [{
"match_phrase": {
"stages.name": "SCM"
}
}]
}
},
"aggs" : {
"avg_duration": {
"avg": {
"field": "stages.durationMillis"
}
}
}
}
}
}
But that's giving me the average of all stages for all documents that contain at least one stage with the SCM token. Any advice on how to get this aggregation right?
Answering my own question, thanks to the help of val.
My mappings file was missing the "type": "nested"; it should look something like:
...
"stages": {
"type": "nested",
"properties": {
"id": {
"type": "keyword",
"ignore_above": 256
},
...
Then I can get my aggregation working with something like this:
{
"size": 0,
"query": {
"nested": {
"path": "stages",
"query": {
"match": {
"stages.name": "scm"
}
}
}
},
"aggs": {
"stages": {
"nested": {
"path": "stages"
},
"aggs": {
"stages-filter": {
"filter": {
"terms": {
"stages.name": [
"scm"
]
}
},
"aggs": {
"avg_duration": {
"avg": {
"field": "stages.durationMillis"
}
}
}
}
}
}
}
}
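The difference between the two behaviours can be reproduced in a few lines of plain Python. This is only a model of the semantics with made-up stage data, and has_token, avg_document_level and avg_nested_level are hypothetical helper names; it assumes at least one stage matches.

```python
docs = [
    {"stages": [
        {"name": "SCM checkout", "durationMillis": 100},
        {"name": "Build", "durationMillis": 900},
    ]},
    {"stages": [
        {"name": "Test", "durationMillis": 500},
    ]},
]


def has_token(name, token):
    """Rough stand-in for an analyzed match: lowercase whitespace tokens."""
    return token.lower() in name.lower().split()


def avg_document_level(docs, token):
    """Non-nested behaviour: average ALL stages of any document with a match."""
    values = [
        stage["durationMillis"]
        for doc in docs
        if any(has_token(s["name"], token) for s in doc["stages"])
        for stage in doc["stages"]
    ]
    return sum(values) / len(values)


def avg_nested_level(docs, token):
    """Nested behaviour: average only the stages that match themselves."""
    values = [
        stage["durationMillis"]
        for doc in docs
        for stage in doc["stages"]
        if has_token(stage["name"], token)
    ]
    return sum(values) / len(values)


print(avg_document_level(docs, "scm"))  # 500.0, the Build stage is included
print(avg_nested_level(docs, "scm"))    # 100.0, only the SCM stage counts
```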
