Simple group by in Elasticsearch - elasticsearch

This is my search:
{
"query": {
"filtered": {
"filter": {
"term": { "cityId": "10777"}
},
"query" : {
"query_string": {
"query": "pizza",
"fields": ["name", "main", "category.name"]
}
}
}
},
"sort": [
{ "premium": { "order": "desc" } }
]
}
This works perfectly.
It returns results from several categories, and I would like to group the results by category.
For example:
Group by the category "pizzerias"

All you have to do is to add a terms aggregation to the mix and you're done.
Supposing your category field is the category.name one, you can do it like this.
{
"query": {
"filtered": {
"filter": {
"term": {
"cityId": "10777"
}
},
"query": {
"query_string": {
"query": "pizza",
"fields": [
"name",
"main",
"category.name"
]
}
}
}
},
"sort": [
{
"premium": {
"order": "desc"
}
}
],
"aggs": {
"categories": {
"terms": {
"field": "category.name"
}
}
}
}
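For readers on Elasticsearch 5.0 or later, where the filtered query no longer exists, the same request can be written with a bool query. The sketch below builds that body as a plain Python dict and reads the per-category counts out of a response. The helper names and the sample response are made up for illustration, and no client library is assumed.

```python
def build_category_search(city_id, text, agg_name="categories"):
    """Build the search body: full-text query, city filter, terms aggregation."""
    return {
        "query": {
            "bool": {
                # Scored part of the search (was "query" inside "filtered").
                "must": {
                    "query_string": {
                        "query": text,
                        "fields": ["name", "main", "category.name"],
                    }
                },
                # Non-scoring part (was "filter" inside "filtered").
                "filter": {"term": {"cityId": city_id}},
            }
        },
        "sort": [{"premium": {"order": "desc"}}],
        # The "group by": one bucket per distinct category.name value.
        "aggs": {agg_name: {"terms": {"field": "category.name"}}},
    }


def category_counts(response, agg_name="categories"):
    """Map each bucket key (category name) to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


# Hypothetical response fragment, shaped like a terms aggregation result.
sample_response = {
    "aggregations": {
        "categories": {
            "buckets": [
                {"key": "pizzerias", "doc_count": 12},
                {"key": "restaurants", "doc_count": 5},
            ]
        }
    }
}

body = build_category_search("10777", "pizza")
counts = category_counts(sample_response)
```

The hits still come back sorted by premium; the grouped counts arrive separately under aggregations.categories.buckets in the response.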


sorting on multiple aggregations

Say I have documents like so, in a people index:
{
zip: string,
birthDate: Date,
graduationDate: Date,
marriedDate: Date,
deathDate: Date,
...
}
I want to be able to do a single query to elastic search where I retrieve several different counts of records, all with a birthDate within a specific range, then a secondary term query like graduationDate:* or marriedDate:*, then grouped by zip. The kicker is that I want to be able to sort by these counts.
So far I have this:
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"query_string": {
"query": "birthDate:[1979-03-01 TO 1979-03-31]"
}
}
]
}
},
"aggs": {
"total": {
"aggs": {
"group_by_zip": {
"composite": {
"sources": [
{
"zip": {
"terms": {
"field": "zip"
}
}
}
]
}
}
}
},
"graduated": {
"filter": {
"query_string": {
"query": "graduationDate:*"
}
},
"aggs": {
"group_by_zip": {
"composite": {
"sources": [
{
"zip": {
"terms": {
"field": "zip"
}
}
}
]
}
}
}
},
"married": {
"filter": {
"query_string": {
"query": "marriedDate:*"
}
},
"aggs": {
"group_by_zip": {
"composite": {
"sources": [
{
"zip": {
"terms": {
"field": "zip"
}
}
}
]
}
}
}
},
"died": {
"filter": {
"query_string": {
"query": "deathDate:*"
}
},
"aggs": {
"group_by_zip": {
"composite": {
"sources": [
{
"zip": {
"terms": {
"field": "zip"
}
}
}
]
}
}
}
}
}
}
But I can't figure out how to sort this by, say, the doc_count of married in descending order, and get the same collection of zips for each of the aggs. There are 41692 zip codes, so this obviously needs to page.
I figured it out!
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"query_string": {
"query": "birthDate:[1979-03-28 TO 1979-03-31]"
}
}
]
}
},
"aggs": {
"group_by_zip": {
"composite": {
"sources": [
{
"zip": {
"terms": {
"field": "zip"
}
}
}
]
},
"aggs": {
"total": {
"filter": {
"query_string": {
"query": "*"
}
}
},
"graduated": {
"filter": {
"query_string": {
"query": "graduationDate:*"
}
}
},
"married": {
"filter": {
"query_string": {
"query": "marriedDate:*"
}
}
},
"died": {
"filter": {
"query_string": {
"query": "deathDate:*"
}
}
},
"sort_it": {
"bucket_sort": {
"sort": [
{"graduated>_count": {"order": "desc"}}
]
}
}
}
}
}
}
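To make the sort key and the paging explicit, here is a rough sketch that builds the same request as a Python dict. The function and parameter names are invented for illustration. The size and after options of the composite aggregation are used for paging, with after taking the after_key returned by the previous page. One caveat I would verify on your own version: as far as I can tell, bucket_sort reorders only the buckets of the page that composite returns, so the ordering is per page rather than across all 41692 zips.

```python
def zip_counts_body(birth_from, birth_to, sort_by="married",
                    page_size=100, after=None):
    """Group by zip, count each life event per zip, sort buckets by one count."""
    composite = {
        "size": page_size,
        "sources": [{"zip": {"terms": {"field": "zip"}}}],
    }
    if after is not None:
        # after_key from the previous response, e.g. {"zip": "12345"}
        composite["after"] = after

    def has(field):
        # Filter sub-aggregation: counts docs in the bucket where the field is set.
        return {"filter": {"query_string": {"query": f"{field}:*"}}}

    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {
                        "query_string": {
                            "query": f"birthDate:[{birth_from} TO {birth_to}]"
                        }
                    }
                ]
            }
        },
        "aggs": {
            "group_by_zip": {
                "composite": composite,
                "aggs": {
                    "total": {"filter": {"query_string": {"query": "*"}}},
                    "graduated": has("graduationDate"),
                    "married": has("marriedDate"),
                    "died": has("deathDate"),
                    # "<agg>>_count" points at the doc_count of that filter agg.
                    "sort_it": {
                        "bucket_sort": {
                            "sort": [{f"{sort_by}>_count": {"order": "desc"}}]
                        }
                    },
                },
            }
        },
    }


first_page = zip_counts_body("1979-03-28", "1979-03-31", sort_by="married")
next_page = zip_counts_body("1979-03-28", "1979-03-31", sort_by="married",
                            after={"zip": "12345"})  # hypothetical after_key
```

Putting every count under the single group_by_zip aggregation is what guarantees that each zip bucket carries all four counts, which the original attempt with four separate composite aggregations could not do.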

Elasticsearch - inner_hits aggregations

I am trying to run an aggregation restricted by the conditions {"wildcard": {"data.addresses.ces.cp": "maria*"}} and
{"match": {"data.addresses.ces.direction": "rodriguez"}}, but it does not return the results of the query.
{ "_source": "created_at",
"size": 1,
"sort": [
{
"created_at.keyword": {
"order": "desc"
}
}
],
"query": {
"nested": {
"path": "data.addresses",
"inner_hits": {
},
"query": {
"nested": {
"path": "data.addresses.ces",
"query": {
"bool": {
"filter": [
{"wildcard": {"data.addresses.ces.cp": "maria*"}},
{"match": {"data.addresses.ces.direction": "rodriguez"}}
]
}
}
}
}
}
}
}
How can I perform an aggregation that returns only the values matched by the query, and not all the values in the JSON?
If aggregations don't support inner_hits, how can I apply the wildcard and match conditions in aggs?
You need to repeat the filter conditions in the aggregation part so that the aggregation only runs on the selected nested documents:
{
"_source": "created_at",
"size": 1,
"sort": [
{
"created_at.keyword": {
"order": "desc"
}
}
],
"query": {
"nested": {
"path": "data.addresses",
"inner_hits": {},
"query": {
"nested": {
"path": "data.addresses.ces",
"query": {
"bool": {
"filter": [
{
"wildcard": {
"data.addresses.ces.cp": "maria*"
}
},
{
"match": {
"data.addresses.ces.direction": "rodriguez"
}
}
]
}
}
}
}
}
},
"aggs": {
"addresses": {
"nested": {
"path": "data.addresses"
},
"aggs": {
"ces": {
"nested": {
"path": "data.addresses.ces"
},
"aggs": {
"query": {
"filter": {
"bool": {
"filter": [
{
"wildcard": {
"data.addresses.ces.cp": "maria*"
}
},
{
"match": {
"data.addresses.ces.direction": "rodriguez"
}
}
]
}
},
"aggs": {
"cp": {
"terms": {
"field": "data.addresses.ces.cp"
}
},
"direction": {
"terms": {
"field": "data.addresses.ces.direction"
}
}
}
}
}
}
}
}
}
}
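Since the same conditions now appear twice, it is easy for the two copies to drift apart. A minimal sketch of one way to avoid that is to build the condition once and reuse it in both places. The function names are made up for illustration, and the body is a plain Python dict with no client library assumed.

```python
import copy


def ces_conditions(cp_pattern, direction):
    """The conditions every selected data.addresses.ces document must satisfy."""
    return {
        "bool": {
            "filter": [
                {"wildcard": {"data.addresses.ces.cp": cp_pattern}},
                {"match": {"data.addresses.ces.direction": direction}},
            ]
        }
    }


def build_search(cp_pattern, direction):
    cond = ces_conditions(cp_pattern, direction)
    return {
        "_source": "created_at",
        "size": 1,
        "sort": [{"created_at.keyword": {"order": "desc"}}],
        "query": {
            "nested": {
                "path": "data.addresses",
                "inner_hits": {},
                "query": {
                    "nested": {
                        "path": "data.addresses.ces",
                        # Copy, so later edits to one side cannot leak into the other.
                        "query": copy.deepcopy(cond),
                    }
                },
            }
        },
        "aggs": {
            "addresses": {
                "nested": {"path": "data.addresses"},
                "aggs": {
                    "ces": {
                        "nested": {"path": "data.addresses.ces"},
                        "aggs": {
                            "query": {
                                # Same conditions, applied to the nested docs
                                # that the aggregation walks over.
                                "filter": copy.deepcopy(cond),
                                "aggs": {
                                    "cp": {"terms": {"field": "data.addresses.ces.cp"}},
                                    "direction": {
                                        "terms": {"field": "data.addresses.ces.direction"}
                                    },
                                },
                            }
                        },
                    }
                },
            }
        },
    }


body = build_search("maria*", "rodriguez")
```

The query decides which top-level documents match, while the nested aggregations step into every nested document of those hits. That is why the filter has to be repeated inside the aggregation.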

What is wrong with my Elasticsearch query? Getting an "expected [END_OBJECT]" error

I'm trying to write an Elasticsearch query that applies a geolocation filter and does some matching on nested documents, but I get this error whenever I add the nested query.
"[bool] malformed query, expected [END_OBJECT] but found [FIELD_NAME]"
{
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "10km",
"geolocation": [
-73.980090948125,
40.747844918436
]
}
},
"must": {
"multi_match": {
"query": "New York",
"fields": [
"name^2",
"city",
"state",
"zip"
],
"type": "best_fields"
}
}
},
"nested": {
"path": "amenities",
"query": {
"bool": {
"must": [
{
"match": {
"amenities.name": "Pool"
}
}
]
}
}
}
},
"aggs": {
"reviews": {
"nested": {
"path": "reviews"
},
"aggs": {
"avg_rating": {
"avg": {
"field": "reviews.rating"
}
}
}
}
}
}
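You have simply misplaced the nested query. It sits next to bool inside query, but a query object may hold only one query type, so the nested clause has to go inside the bool query, for example in its must array. Try it like this: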
{
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "10km",
"geolocation": [
-73.980090948125,
40.747844918436
]
}
},
"must": [
{
"multi_match": {
"query": "New York",
"fields": [
"name^2",
"city",
"state",
"zip"
],
"type": "best_fields"
}
},
{
"nested": {
"path": "amenities",
"query": {
"match": {
"amenities.name": "Pool"
}
}
}
}
]
}
},
"aggs": {
"reviews": {
"nested": {
"path": "reviews"
},
"aggs": {
"avg_rating": {
"avg": {
"field": "reviews.rating"
}
}
}
}
}
}
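The rule behind the error can be sketched in a few lines of Python: every query object holds exactly one query type, and anything else must be nested inside a compound query such as bool. The helper names below are invented for illustration, and the body is a plain dict.

```python
def has_single_query_type(query):
    """True if the query object names exactly one query type (bool, nested, ...)."""
    return isinstance(query, dict) and len(query) == 1


def hotel_search(text, lon, lat, amenity, distance="10km"):
    """Geo filter plus text match plus a nested amenity match, all inside one bool."""
    return {
        "sort": [{"_score": {"order": "desc"}}],
        "query": {
            "bool": {
                # Geo points given as arrays are ordered [lon, lat].
                "filter": {
                    "geo_distance": {"distance": distance, "geolocation": [lon, lat]}
                },
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "fields": ["name^2", "city", "state", "zip"],
                            "type": "best_fields",
                        }
                    },
                    # The nested query is one more clause of the bool query.
                    {
                        "nested": {
                            "path": "amenities",
                            "query": {"match": {"amenities.name": amenity}},
                        }
                    },
                ],
            }
        },
    }


body = hotel_search("New York", -73.980090948125, 40.747844918436, "Pool")

# Shape of the original, rejected request: two query types side by side.
broken_query = {"bool": {}, "nested": {}}
```

Here has_single_query_type(body["query"]) holds, while broken_query fails the same check, which mirrors what the parser complains about.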

Is it possible to do nested sort with a conditional query on a sort element

We have the following structure in an index. Only the part of the document relevant to this question is shown.
"instance" : {
"id" : 1,
"instFields": [
{
"sourceFieldId": 2684,
"fieldValue": "false",
"fieldBoolean": false
},
{
"sourceFieldId": 1736,
"fieldValue": "DODGE",
"fieldString": "DODGE"
},
{
"sourceFieldId": 1560,
"fieldValue": "GRAY",
"fieldString": "GRAY"
},
{
"sourceFieldId": 1558,
"fieldValue": "CHALLENGER",
"fieldString": "CHALLENGER"
},
{
"sourceFieldId": 1556,
"fieldValue": "2010",
"fieldDouble": 2010
}
]
}
The user's first query asks for all instances where sourceFieldId=1736, which returns all the DODGE instances. This works fine with an appropriate Elasticsearch query. While viewing the DODGE records, the user wants to sort by any of those sourceFieldIds, for example by color (sourceFieldId=1560).
Say we have the following query with a sort:
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"nested": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"instance.dataSourceId": "196"
}
},
{
"term": {
"instance.dsTypeId": "5"
}
},
{
"nested": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"instance.instFields.sourceFieldId": "1558"
}
},
{
"term": {
"instance.instFields.fieldString.raw": "challenger"
}
}
]
}
}
}
},
"path": "instance.instFields"
}
}
]
}
}
}
},
"path": "instance",
"inner_hits": {
"name": "inner_data"
}
}
},
{
"nested": {
"query": {
"bool": {
"should": {
"bool": {
"must": [
{
"match": {
"instance.entitlements.roleId": {
"query": "1",
"type": "boolean"
}
}
},
{
"match": {
"instance.entitlements.read": {
"query": "true",
"type": "boolean"
}
}
}
]
}
}
}
},
"path": "instance.entitlements"
}
}
]
}
}
}
},
"sort": {
"instance.instFields.fieldString.raw": {
"order": "asc",
"nested_path": "instance.instFields",
"nested_filter": {
"bool": {
"filter": {
"bool": {
"must": [
{
"nested": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"instance.dataSourceId": "196"
}
},
{
"term": {
"instance.dsTypeId": "5"
}
},
{
"nested": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"instance.instFields.sourceFieldId": "1558"
}
},
{
"term": {
"instance.instFields.fieldString.raw": "challenger"
}
}
]
}
}
}
},
"path": "instance.instFields"
}
}
]
}
}
}
},
"path": "instance",
"inner_hits": {
"name": "inner_data1"
}
}
},
{
"nested": {
"query": {
"bool": {
"should": {
"bool": {
"must": [
{
"match": {
"instance.entitlements.roleId": {
"query": "1",
"type": "boolean"
}
}
},
{
"match": {
"instance.entitlements.read": {
"query": "true",
"type": "boolean"
}
}
}
]
}
}
}
},
"path": "instance.entitlements"
}
}
]
}
}
}
}
}
}
}
The resulting docs must contain the entire instance with all its instFields, because the user's page also displays the other values of each DODGE record.
The issue is that the sort still has to know that it should sort on the entry where "sourceFieldId": 1560 (the sourceFieldId for color).
Is there a way to write such a sort in ES without using dynamic scripting or dynamic templating? Something like:
"sort": {
"instance.instFields.fieldString.raw": (where sourceFieldId=1560?)
You should be able to achieve this with the nested_filter option in sort.
From the documentation:
nested_filter A filter that the inner objects inside the nested path
should match with in order for its field values to be taken into
account by sorting. Common case is to repeat the query / filter inside
the nested filter or query. By default no nested_filter is active.
For example, to sort on the color field it would be:
{
"sort": {
"instance.instFields.fieldValue.raws": {
"order": "asc",
"nested_path": "instance.instFields",
"nested_filter": {
"term": {
"instance.instFields.sourceFieldId": "1560"
}
}
}
}
}
Edit: to sort on more than one field, pass a list with one sort entry per sourceFieldId:
"sort": [{
"instance.instFields.fieldValue": {
"order": "asc",
"nested_path": "instance.instFields",
"nested_filter": {
"term": {
"instance.instFields.sourceFieldId": "1560"
}
}
}
},
{
"instance.instFields.fieldValue": {
"order": "asc",
"nested_path": "instance.instFields",
"nested_filter": {
"term": {
"instance.instFields.sourceFieldId": "1558"
}
}
}
}
]
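Because each sort entry differs only in the sourceFieldId, the clause can be generated. A minimal sketch in Python, using the nested_path and nested_filter options from the answer above: the function names and the value_field default are assumptions for illustration, and I believe newer Elasticsearch versions replace these two options with a single nested option, so check the docs for your version.

```python
def sort_by_source_field(source_field_id, order="asc",
                         value_field="instance.instFields.fieldValue"):
    """One sort entry that only looks at instFields with the given sourceFieldId."""
    return {
        value_field: {
            "order": order,
            "nested_path": "instance.instFields",
            # Only nested objects matching this filter contribute sort values.
            "nested_filter": {
                "term": {"instance.instFields.sourceFieldId": str(source_field_id)}
            },
        }
    }


def multi_sort(*source_field_ids, order="asc"):
    """Sort by several source fields, in the order given."""
    return {"sort": [sort_by_source_field(i, order) for i in source_field_ids]}


# Color (1560) first, then model (1558), as in the edited answer.
sort_clause = multi_sort(1560, 1558)
```

The main query stays untouched, so the hits still return the whole instance; only the values used for ordering are restricted by nested_filter.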

Elasticsearch sort not working with partial_fields

I have this kind of sort:
"sort": [
{
"_script": {
"script": "return doc.score*10 + doc['field2'].value",
"type": "number",
"order": "asc"
}
}
]
and these partial fields:
"filter": {
"partial_fields": {
"fields": {
"exclude": [
"field5*"
]
}
}
}
The problem is that the sort does not work when partial_fields is set. Is there a reason for this? Or do I have to remove partial_fields to get the sort working?
Here's the whole query:
{
"size": 10,
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"text": {
"name_en": {
"query": "testing",
"operator": "or",
"boost": 20
}
}
}
]
}
},
"filter": {
"and": [
{
"term": {
"_type": "test"
}
}
]
}
},
"filter": {
"partial_fields": {
"fields": {
"exclude": [
"field2*"
]
}
}
}
},
"sort": [
{
"_script": {
"script": "return doc.score*1000 + doc['field2'].value",
"type": "number",
"order": "asc"
}
}
]
}
Thanks.
In the documentation for partial fields, I do not see partial_fields nested under a filter node in the JSON request. I think this could be your issue. Try moving the partial_fields section up to the same level as sort, like this:
{
"size": 10,
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"text": {
"name_en": {
"query": "testing",
"operator": "or",
"boost": 20
}
}
}
]
}
},
"filter": {
"and": [
{
"term": {
"_type": "test"
}
}
]
}
}
},
"partial_fields": {
"fields": {
"exclude": [
"field2*"
]
}
},
"sort": [
{
"_script": {
"script": "return doc.score*1000 + doc['field2'].value",
"type": "number",
"order": "asc"
}
}
]
}
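If the request body is assembled in code, the same move can be done mechanically. The sketch below is only an illustration with invented names: it takes a body where partial_fields ended up under query.filter, as in the question, and lifts it to the top level next to sort.

```python
import copy


def hoist_partial_fields(body):
    """Return a copy of body with partial_fields moved from query.filter to the top."""
    fixed = copy.deepcopy(body)
    stray = fixed.get("query", {}).get("filter", {})
    if "partial_fields" in stray:
        fixed["partial_fields"] = stray.pop("partial_fields")
        if not stray:
            # Nothing else lived in that misplaced filter node, so drop it.
            del fixed["query"]["filter"]
    return fixed


# Trimmed-down version of the request from the question.
original = {
    "size": 10,
    "query": {
        "filtered": {"query": {"match_all": {}}},
        "filter": {"partial_fields": {"fields": {"exclude": ["field2*"]}}},
    },
    "sort": [
        {
            "_script": {
                "script": "return doc.score*1000 + doc['field2'].value",
                "type": "number",
                "order": "asc",
            }
        }
    ],
}

fixed = hoist_partial_fields(original)
```

The input dict is left unchanged, so the original and the corrected body can be compared side by side.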
