How to sort an object by its child object fields? - elasticsearch

How could I sort products by GroupModel.PeerOrder but only if GroupModel.ParentGroupId matches some id?
My C# models:
public class ProductModel
{
public int Id { get; set; }
public int Title { get; set; }
public List<GroupModel> Groups { get; set; }
}
public class GroupModel
{
public int Id { get; set; }
public int Title { get; set; }
public int ParentGroupId { get; set; }
public int PeerOrder { get; set; }
}

Use Nested Query to filter matching "GroupModel.ParentGroupId" values and then apply Nested Sort Query to sort results by "GroupModel.PeerOrder".
As per documentation:
Nested Query: Nested query allows to query nested objects / docs (see nested mapping). The query is executed against the nested objects / docs as if they were indexed as separate docs (they are, internally) and resulting in the root parent doc (or parent nested mapping). Here is a sample mapping:
PUT /my_index
{
"mappings": {
"_doc" : {
"properties" : {
"obj1" : {
"type" : "nested"
}
}
}
}
}
GET /_search
{
"query": {
"nested" : {
"path" : "obj1",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{ "match" : {"obj1.name" : "blue"} },
{ "range" : {"obj1.count" : {"gt" : 5}} }
]
}
}
}
}
}
Nested Sort Query: It is possible to sort by the value of a nested field, even though the value exists in a separate nested document.
GET /_search
{
"query": {
"nested": {
"path": "comments",
"filter": {
"range": {
"comments.date": {
"gte": "2014-10-01",
"lt": "2014-11-01"
}
}
}
}
},
"sort": {
"comments.stars": {
"order": "asc",
"mode": "min",
"nested_filter": {
"range": {
"comments.date": {
"gte": "2014-10-01",
"lt": "2014-11-01"
}
}
}
}
}
}

This is what I ended up with (With the help from #ydrall - thank you very much! :>):
The query part (for us it works just fine without this first part):
"query": {
"bool": {
"must": [
{
"nested": {
"path": "groups",
"query": {
"bool": {
"must": [
{
"match": {
"groups.parentGroupId": 3
}
}
]
}
}
}
}
]
}
}
The sorting part of the query:
"sort": [
{
"groups.peerOrder": {
"order": "asc",
"nested_path": "groups",
"nested_filter": {
"match": {
"groups.parentGroupId": 3
}
}
}
}
]
Index mappings:
"mappings": {
"productmodel": {
"properties": {
"groups": {
"type": "nested",
"properties": {
"id": {
"type": "integer"
},
"parentGroupId": {
"type": "integer"
},
"peerOrder": {
"type": "integer"
}
}
}
}
}
}
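The query and sort fragments above can be combined into a single request body. The following is a minimal sketch, not the poster's actual code: the function name `build_nested_sort_search` and the flag `use_legacy_sort_syntax` are hypothetical, and the field names are taken from the mapping above. As far as I know, newer Elasticsearch versions (6.1 onward) deprecate `nested_path` / `nested_filter` in favor of a `nested` object inside the sort, so the sketch can emit either form.

```python
import json


def build_nested_sort_search(parent_group_id, use_legacy_sort_syntax=True):
    """Build a search body that keeps products belonging to the given parent
    group and sorts them by the peerOrder of that matching group."""
    # The same filter is used in the query and in the sort, so that only the
    # group with the requested parentGroupId contributes a sort value.
    group_filter = {"match": {"groups.parentGroupId": parent_group_id}}

    sort_options = {"order": "asc"}
    if use_legacy_sort_syntax:
        # Syntax used in the answer above (older Elasticsearch versions).
        sort_options["nested_path"] = "groups"
        sort_options["nested_filter"] = group_filter
    else:
        # Assumption: newer syntax with a single "nested" object.
        sort_options["nested"] = {"path": "groups", "filter": group_filter}

    return {
        "query": {"nested": {"path": "groups", "query": group_filter}},
        "sort": [{"groups.peerOrder": sort_options}],
    }


if __name__ == "__main__":
    print(json.dumps(build_nested_sort_search(3), indent=2))
```

Repeating the filter inside the sort matters: without it, a product that belongs to several groups would be sorted by the peerOrder of any of its groups, not just the one with the requested parent.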

Related

Doing aggregation on object in Elasticsearch

I would like to run an aggregation on an object-typed field, but I couldn't make it work. I created the mapping with dynamic_templates because my object is a dictionary whose keys come from a list of constants. Here are my document, mapping, and queries. I can't even find the indexed field with a must -> exists query.
Document
{
"property":{
"innerProperty":{
"constantKey":{
"someArrays":[
{
"id":"12345"
}
]
}
}
}
}
Mapping
{
"dynamic_templates": [
{
"property_map": {
"path_match": "property.innerProperty.*",
"mapping": {
"type": "object",
"dynamic": false
}
}
}
]
}
Mapping result after adding a document
"property": {
"properties": {
"innerProperty": {
"properties": {
"constantKey": {
"type": "object",
"dynamic": "false"
}
}
}
}
}
Query
GET /index/_search
{
"query": {
"bool": {
"must": {
"exists": {
"field": "property.innerProperty.constantKey"
}
}
}
}
}
Aggregation
GET /index/_search
{
"aggs": {
"property-agg": {
"terms": {
"field": "property.innerProperty.constantKey"
}
}
},
"size": 0
}
Neither of them works. I would like to aggregate by constantKey so that I get the correct document counts for my facets.

Elasticsearch querying number of dates in array matching query

I have documents in the following form
PUT test_index/_doc/1
{
"dates" : [
"2018-07-15T14:12:12",
"2018-09-15T14:12:12",
"2018-11-15T14:12:12",
"2019-01-15T14:12:12",
"2019-03-15T14:12:12",
"2019-04-15T14:12:12",
"2019-05-15T14:12:12"],
"message" : "hello world"
}
How do I query for documents such that there are n number of dates within the dates array falling in between two specified dates?
For example: Find all documents with 3 dates in the dates array falling in between "2018-05-15T14:12:12" and "2018-12-15T14:12:12" -- this should return the above document as "2018-07-15T14:12:12", "2018-09-15T14:12:12" and "2018-11-15T14:12:12" fall between "2018-05-15T14:12:12" and "2018-12-15T14:12:12".
I recently faced the same problem and came up with two solutions.
1) If you do not want to change your current mapping, you can query for the documents using query_string. Note that you will have to build the query string from the dates in your range ("\"2019-04-08\" OR \"2019-04-09\" OR \"2019-04-10\" ").
{
"query": {
"query_string": {
"default_field": "dates",
"query": "\"2019-04-08\" OR \"2019-04-09\" OR \"2019-04-10\" "
}
}
}
However, this type of query only makes sense if the range is short.
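Building the OR-ed list of quoted dates by hand gets tedious, so here is a minimal sketch of generating it for a short day range. The helper names `build_date_or_query` and `build_query_string_body` are hypothetical; the field name `dates` comes from the question.

```python
from datetime import date, timedelta


def build_date_or_query(start, end):
    """Return '"d1" OR "d2" OR ...' with one quoted ISO date per day,
    covering start..end inclusive."""
    if end < start:
        raise ValueError("end must not be before start")
    day_count = (end - start).days + 1
    quoted = [
        '"{}"'.format((start + timedelta(days=offset)).isoformat())
        for offset in range(day_count)
    ]
    return " OR ".join(quoted)


def build_query_string_body(start, end, field="dates"):
    """Wrap the generated string in a query_string request body."""
    return {
        "query": {
            "query_string": {
                "default_field": field,
                "query": build_date_or_query(start, end),
            }
        }
    }
```

The generated string grows by one clause per day, which is why this approach is only practical for short ranges.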
2) The second way is the nested approach, but you will have to change your current mapping as follows:
{
"properties": {
"dates": {
"type": "nested",
"properties": {
"key": {
"type": "date",
"format": "yyyy-MM-dd"
}
}
}
}
}
Your query will then look something like this:
{
"query": {
"nested": {
"path": "dates",
"query": {
"bool": {
"must": [
{
"range": {
"dates.key": {
"gte": "2018-04-01",
"lte": "2018-12-31"
}
}
}
]
}
}
}
}
}
You can create dates as a nested document and use bucket selector aggregation.
{
"empId":1,
"dates":[
{
"Days":"2019-01-01"
},
{
"Days":"2019-01-02"
}
]
}
Mapping:
"mappings" : {
"properties" : {
"empId" : {
"type" : "keyword"
},
"dates" : {
"type" : "nested",
"properties" : {
"Days" : {
"type" : "date"
}
}
}
}
}
GET profile/_search
{
"query": {
"bool": {
"filter": {
"nested": {
"path": "dates",
"query": {
"range": {
"dates.Days": {
"format": "yyyy-MM-dd",
"gte": "2019-05-01",
"lte": "2019-05-30"
}
}
}
}
}
}
},
"aggs": {
"terms_parent_id": {
"terms": {
"field": "empId"
},
"aggs": {
"availabilities": {
"nested": {
"path": "dates"
},
"aggs": {
"avail": {
"range": {
"field": "dates.Days",
"ranges": [
{
"from": "2019-05-01",
"to": "2019-05-30"
}
]
},
"aggs": {
"count_Total": {
"value_count": {
"field": "dates.Days"
}
}
}
},
"max_hourly_inner": {
"max_bucket": {
"buckets_path": "avail>count_Total"
}
}
}
},
"bucket_selector_page_id_term_count": {
"bucket_selector": {
"buckets_path": {
"children_count": "availabilities>max_hourly_inner"
},
"script": "params.children_count>=19;" ---> give the number of days that should match
}
},
"hits": {
"top_hits": {
"size": 10
}
}
}
}
}
}
I found my own answer to this, although I'm not sure how efficient it is compared to the other answers:
GET test_index/_search
{
"query":{
"bool" : {
"filter" : {
"script" : {
"script" : {"source":"""
int count = 0;
for (int i=0; i<doc['dates'].length; ++i) {
if (params.first_date < doc['dates'][i].toInstant().toEpochMilli() && doc['dates'][i].toInstant().toEpochMilli() < params.second_date) {
count += 1;
}
}
if (count >= 2) {
return true
} else {
return false
}
""",
"lang":"painless",
"params": {
"first_date": 1554818400000,
"second_date": 1583020800000
}
}
}
}
}
}
}
Here the parameters are the two dates in epoch milliseconds. I've required at least 2 matches, but obviously you can generalise to any number.
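A minimal sketch of producing the epoch-millisecond parameters and generalising the match count. The helper names and the extra `min_matches` script parameter are assumptions added for illustration, not part of the original answer, and naive timestamps are assumed to be in UTC.

```python
from datetime import datetime, timedelta, timezone

# Same counting logic as the script above, with the threshold passed in as
# params.min_matches (an assumed, illustrative parameter name).
PAINLESS_SOURCE = """
int count = 0;
for (int i = 0; i < doc['dates'].length; ++i) {
  long millis = doc['dates'][i].toInstant().toEpochMilli();
  if (params.first_date < millis && millis < params.second_date) {
    count += 1;
  }
}
return count >= params.min_matches;
"""

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_timestamp):
    """Convert an ISO-8601 timestamp to epoch milliseconds (naive input = UTC)."""
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids float rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def build_count_in_range_query(first_date, second_date, min_matches):
    """Build the script-filter request body for the given range and threshold."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "source": PAINLESS_SOURCE,
                            "lang": "painless",
                            "params": {
                                "first_date": to_epoch_millis(first_date),
                                "second_date": to_epoch_millis(second_date),
                                "min_matches": min_matches,
                            },
                        }
                    }
                }
            }
        }
    }
```

For instance, `to_epoch_millis("2020-03-01T00:00:00")` gives 1583020800000, the `second_date` value used in the answer.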

Problem in nest query for sorting in aggregation

I need to query Elasticsearch for the last registered records in a specified area, within a specified range of serial numbers.
For this reason I mapped my index like this:
{ "settings": {
"index": {
"number_of_shards": 5,
"number_of_replicas": 2
}
},
"mapping": {
"AssetStatus": {
"properties": {
"serialnumber": {
"type": "text",
"fielddata": true
},
"vehiclestate": {
"type": "text",
"fielddata": true
},
"vehiclegeopoint": {
"type": "geo-point",
"fielddata": true
},
"vehiclespeed": {
"type": "number",
"fielddata": true
},
"vehiclefuelpercent": {
"type": "text",
"fielddata": true
},
"devicebatterypercent": {
"type": "text",
"fielddata": true
},
"networklatency": {
"type": "text",
"fielddata": true
},
"satellitescount": {
"type": "number",
"fielddata": true
},
"createdate": {
"type": "date",
"fielddata": true
}
}
}
}
}
This query works correctly:
{
"query": {
"bool": {
"must": [
{
"term": {
"serialnumber.keyword": "2228187d-b1a5-4e18-82bb-4d12438e0ec0"
}
},
{
"range": {
"vehiclegeopoint.lat": {
"gt": "31.287958",
"lt": "31.295485"
}
}
},
{
"range": {
"vehiclegeopoint.lon": {
"gt": "48.639844",
"lt": "48.652032"
}
}
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 0,
"sort": [],
"aggs": {
"SerialNumberGroups": {
"terms": {
"field": "serialnumber.keyword"
},
"aggs": {
"tops": {
"top_hits": {
"sort": [
{
"createdate.keyword": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}
whereas my NEST query fails with this error:
Invalid NEST response built from a unsuccessful low level call on
POST: /fms2/AssetStatus/_search?typed_keys=true
Audit trail of this API call:
[1] BadResponse: Node: http://localhost:9200/ Took: 00:00:00.1917118
OriginalException: Elasticsearch.Net.ElasticsearchClientException: Request failed to execute. Call: Status code 400 from: POST
/fms2/AssetStatus/_search?typed_keys=true. ServerError: Type:
parsing_exception Reason: "Unknown key for a VALUE_STRING in [field]."
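Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
My NEST query is this: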
var searchResponse =
Client.Search<AssetStatus>(x => x
.Index(settings.DefaultIndex)
.Type("AssetStatus")
.Query(fq => fq.GeoBoundingBox(c => c.Field(f => f.VehicleGeoPoint).BoundingBox(new GeoLocation(TopLeft.Lat, TopLeft.Lon), new GeoLocation(BottomRight.Lat, BottomRight.Lon))))
.Query(fq =>
fq.Bool(b => b.
Filter(
f => f.Match(m => m.Field(g => g.SerialNumber.Suffix("keyword").Equals(sn)))
)
))
.Aggregations(a => a
.Terms("group_by_SerialNumber", st => st
.Field(o => o.SerialNumber.Suffix("keyword"))
.Size(0)
.Aggregations(b=> b.TopHits("top_hits", lastRegistered => lastRegistered
.Field(bf=> bf.CreateDate.Suffix("keyword"))
.Size(1)))
))
);
The problem is with sorting inside the aggregation.
The problem was in my POCO: as you can see in my NEST query, my property names start with uppercase letters, while the fields in the index are lowercase. I needed to use NEST's attribute (data annotation) mapping on the POCO so that each property maps to the right field name.
[ElasticsearchType(Name = "AssetStatus")]
public class AssetStatus
{
[Text]
[PropertyName("serialnumber")]
public string SerialNumber { get; set; }
[Text]
[PropertyName("vehiclestate")]
public string VehicleState { get; set; }
[GeoPoint]
[PropertyName("vehiclegeopoint")]
public GeoPoint VehicleGeoPoint { get; set; }
[Number]
[PropertyName("vehiclespeed")]
public int VehicleSpeed { get; set; }
[Text]
[PropertyName("vehiclefuelpercent")]
public string VehicleFuelPercent { get; set; }
[Text]
[PropertyName("devicebatterypercent")]
public string DeviceBatteryPercent { get; set; }
[Text]
[PropertyName("networklatency")]
public string NetworkLatency { get; set; }
[Number]
[PropertyName("satellitescount")]
public byte SatellitesCount { get; set; }
[Date]
[PropertyName("createdate")]
public string CreateDate { get; set; }
}

ElasticSearch count nested fields

How do you count the objects in a nested field (a list of nested objects) that meet a certain condition in ElasticSearch?
EXAMPLE
Given a Customer index with a Customer type that has a nested Services field with the following structure:
public class Customer
{
public int Id;
public List<Service> Services;
}
public class Service
{
public int Id;
public DateTime Date;
public decimal Rating;
}
How do I count all services which happened in June 2017 and got a rating higher than 5?
Good question :)
To get what you want, you should define your mapping up front; a nested property mapping works well here.
The nested type is a specialized version of the object datatype that allows arrays of objects to be indexed and queried independently of each other.
https://www.elastic.co/guide/en/elasticsearch/reference/2.4/nested.html
Please find below full example :)
Mapping
PUT example_nested_test
{
"mappings": {
"nested": {
"properties": {
"Id": {
"type": "long"
},
"Services": {
"type": "nested",
"properties": {
"Id": {
"type": "long"
},
"Date": {
"type": "date"
},
"Rating": {
"type": "long"
}
}
}
}
}
}
}
POST Data
POST example_nested_test/nested/100
{
"Id" : 1,
"Services": [
{
"Id": 1,
"Date" :"2017-05-10",
"Rating" : 5
},
{
"Id": 2,
"Date" :"2013-05-10",
"Rating" : 2
},
{
"Id": 4,
"Date" :"2017-05-10",
"Rating" : 6
}]
}
Query
GET example_nested_test/_search
{
"size":0,
"aggs": {
"Services": {
"nested": {
"path": "Services"
},
"aggs": {
"Rating": {
"filter": {
"bool": {
"must": [
{
"range": {
"Services.Date": {
"gt": "2017",
"format": "yyyy"
}
}
},
{
"range": {
"Services.Rating": {
"gt": "5"
}
}
}
]
}
}
}
}
}
}
}
Result :
"aggregations": {
"Services": {
"doc_count": 3,
"Rating": {
"doc_count": 1
}
}
}
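The query above filters on dates after the start of 2017 rather than on June 2017 specifically. A minimal sketch of narrowing it to a single month follows; the helper names `month_bounds` and `build_service_count_aggs` are hypothetical, and the field names are those of the mapping above. Note that none of the three sample services is dated June 2017, so against that sample document I would expect the inner count to be 0.

```python
from datetime import date


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    start = date(year, month, 1)
    # Roll over to January of the next year after December.
    next_start = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), next_start.isoformat()


def build_service_count_aggs(year, month, min_rating):
    """Count nested Services dated within the given month and rated above min_rating."""
    gte, lt = month_bounds(year, month)
    return {
        "size": 0,
        "aggs": {
            "Services": {
                "nested": {"path": "Services"},
                "aggs": {
                    "Rating": {
                        "filter": {
                            "bool": {
                                "must": [
                                    # Half-open range: start of month <= Date < start of next month.
                                    {"range": {"Services.Date": {"gte": gte, "lt": lt, "format": "yyyy-MM-dd"}}},
                                    {"range": {"Services.Rating": {"gt": min_rating}}},
                                ]
                            }
                        }
                    }
                },
            }
        },
    }
```

The number of matching services is then the `doc_count` of the inner `Rating` bucket, as in the result shown above.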

ElasticSearch - Unable to filter on an array of strings

I have the following model class:
public class NewsItem
{
public String Language { get; set; }
public DateTime DateUpdated { get; set; }
public List<String> Tags { get; set; }
}
I index it with NEST using the automapping, resulting in the mapping below:
{
"search": {
"mappings": {
"news": {
"properties": {
"dateUpdated": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"language": {
"type": "string"
},
"tags": {
"type": "string"
}
}
}
}
}
}
I then run a query on language which works fine:
{
"query": {
"constant_score": {
"filter": [
{
"terms": {
"language": [
"en"
]
}
}
]
}
},
"sort": {
"dateUpdated": {
"order": "desc"
}
}
}
But running the same query on the tags property doesn't work. Are there any special tricks for querying an array field? I read the docs again and again and I don't understand why this query gives no results:
{
"query": {
"constant_score": {
"filter": [
{
"terms": {
"tags": [
"Hillary"
]
}
}
]
}
},
"sort": {
"dateUpdated": {
"order": "desc"
}
}
}
The document returned from another query:
{
"_index": "search",
"_type": "news",
"_score": 0.12265198,
"_source": {
"tags": [
"Hillary"
],
"language": "en",
"dateUpdated": "2016-11-07T15:41:00Z"
}
}
Your tags field is analyzed, hence Hillary has been indexed to hillary. So you have two ways out:
A. Use a match query instead (since the terms query does not analyze the token):
{
"query": {
"bool": {
"filter": [
{
"match": { <--- use match here
"tags": "Hillary"
}
}
]
}
},
"sort": {
"dateUpdated": {
"order": "desc"
}
}
}
B. Keep the terms query but lowercase the token:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"tags": [
"hillary" <--- lowercase here
]
}
}
]
}
},
"sort": {
"dateUpdated": {
"order": "desc"
}
}
}
Elasticsearch by default runs an analyzer on all strings, whereas the terms filter computes an exact match. This means ES stores 'Hillary' as 'hillary' while you are querying for 'Hillary'. There are two ways to fix this: either use a match query instead of a terms query, or don't automap and instead create the index yourself with the tags field analyzed the way you want. You can also query for 'hillary', but that only solves this one case: if a tag were something like 'us elections', 'us' and 'elections' would be stored as separate tokens.
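The mismatch can be illustrated without a cluster. The sketch below is a toy model, not Elasticsearch code: `toy_analyze` is a hypothetical, very rough stand-in for the default analyzer (split on whitespace and lowercase), and the two helper functions only mimic how terms and match compare against the indexed tokens.

```python
def toy_analyze(text):
    """Rough stand-in for the default analyzer: whitespace split plus lowercase."""
    return [token.lower() for token in text.split()]


def index_tokens(values):
    """Collect the tokens that would be stored for an analyzed string field."""
    return {token for value in values for token in toy_analyze(value)}


def terms_matches(indexed_values, query_terms):
    """terms: compares the query terms as-is against the stored tokens."""
    stored = index_tokens(indexed_values)
    return any(term in stored for term in query_terms)


def match_matches(indexed_values, query_text):
    """match: analyzes the query text first, then compares tokens."""
    stored = index_tokens(indexed_values)
    return any(token in stored for token in toy_analyze(query_text))


if __name__ == "__main__":
    tags = ["Hillary"]
    print(terms_matches(tags, ["Hillary"]))  # exact term is compared to "hillary"
    print(terms_matches(tags, ["hillary"]))
    print(match_matches(tags, "Hillary"))
    print(toy_analyze("us elections"))
```

In this model a terms lookup for "Hillary" misses because only "hillary" is stored, while match succeeds because the query text is lowercased the same way. A multi-word tag is split into separate tokens, which is why lowercasing alone is not a general fix.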

Resources