Finding nested result of a certain parent - elasticsearch

The following query works: it returns every house that has a (nested) period with an arrival date in the range I specify, together with the matching periods.
Now I want to get only the arrival dates for one particular house, but I cannot figure out the Elasticsearch syntax for this.
GET /houses/house/_search
{
"_source" : ["HouseId"],
"query": {
"nested": {
"path": "Periods",
"query": {
"bool": {
"must": [
{"range": {
"Periods.ArrivalDate": {
"gte" : "2017-10-01",
"lt" : "2017-11-01"
}
}
}
]
}
},
"inner_hits" : {}
}
}
}
This is the mapping (shortened to what I hope are the relevant parts):
{
"houses": {
"mappings": {
"house": {
"properties": {
"Periods": {
"type": "nested",
"properties": {
"ArrivalDate": {
"type": "date",
"format": "yyyy-MM-dd"
},
....
"HouseId": {
"type": "keyword"
},
So I would like to find the available arrival dates for a house with a certain HouseId within a certain month.

I think I have it figured out, but please let me know if better solutions are available:
{
"_source":[
"HouseCode",
"Country",
"Region"
],
"query":{
"bool":{
"must":[
{
"match":{
"HouseId":"someid"
}
},
{
"nested":{
"path":"Periods",
"query":{
"range":{
"Periods.ArrivalDate":{
"gte":"2017-05-01",
"lt":"2017-06-01"
}
}
},
"inner_hits":{
"size":1000
}
}
}
]
}
}
}

This query returns whole houses.
If you want to get only some of the Periods, you should use a nested aggregation combined with a filter aggregation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-nested-aggregation.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html
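As a rough sketch of that suggestion, the request body could be built like this in Python. The aggregation names (periods, in_month, arrival_dates) and the helper name are made up for illustration, and the terms aggregation on the date field is just one assumed way to list the dates; the field names come from the mapping in the question.

```python
import json


def arrival_dates_body(house_id, start, end, max_dates=1000):
    """Build a search body that lists arrival dates of one house in [start, end)."""
    # Same range condition as in the question, reused as a filter aggregation.
    date_range = {"range": {"Periods.ArrivalDate": {"gte": start, "lt": end}}}
    return {
        "size": 0,  # no house hits needed, only the aggregation result
        "query": {"term": {"HouseId": house_id}},
        "aggs": {
            "periods": {  # hypothetical name: step into the nested Periods
                "nested": {"path": "Periods"},
                "aggs": {
                    "in_month": {  # hypothetical name: keep only periods in range
                        "filter": date_range,
                        "aggs": {
                            "arrival_dates": {  # hypothetical name: one bucket per date
                                "terms": {
                                    "field": "Periods.ArrivalDate",
                                    "size": max_dates,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = arrival_dates_body("someid", "2017-05-01", "2017-06-01")
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent to the same _search endpoint as the original query; the dates would then be read from the buckets of the innermost aggregation rather than from the hits.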

Elasticsearch query by generic properties: keywords and numeric values

I have this mapping in ES 7.9:
{
"mappings": {
"properties": {
"cid": {
"type": "keyword",
"store": true
},
"id": {
"type": "keyword",
"store": true
},
"a": {
"type": "nested",
"properties": {
"attribute":{
"type": "keyword"
},
"key": {
"type": "keyword"
},
"num": {
"type": "float"
}
}
}
}
}
}
And some documents indexed like:
{
"cid": "177",
"id": "1",
"a": [
{
"attribute": "tags",
"key": [
"heel",
"thong",
"low_heel",
"economic"
]
},
{
"attribute": "weight",
"num": 15
}
]
}
Basically, an object can have multiple attributes (the array property a).
These attributes can differ per client. In this example there are two attributes, tags and weight, but other documents could have attributes such as vendor, size, power, etc., so the model has to be generic enough to support attributes that are not known beforehand.
An attribute can be a list of keywords (like tags) or a numeric value (like weight).
I need an ES query that fetches the document ids matching this pseudo-query:
cid="177" and (tag="flat" or tag="heel") and tag="economic" and weight<20
I arrived at this query, which seems to work as expected:
{
"_source": ["id"],
"query": {
"bool": {
"must" : [
{"term" : { "cid" : "177" }},
{
"nested": {
"path": "a",
"query": {
"bool":{
"must":[
{"term" : { "a.attribute": "tags"}},
{"terms" : { "a.key": ["flat","heel"]}}
]
}
}
}
},
{
"nested": {
"path": "a",
"query": {
"bool":{
"must":[
{"term" : { "a.attribute": "tags"}},
{"term" : { "a.key": "economic"}}
]
}
}
}
},
{
"nested": {
"path": "a",
"query": {
"bool":{
"must":[
{"term" : { "a.attribute": "weight" } },
{"range": { "a.num": {"lt": 20} } }
]
}
}
}
}
]
}
}
}
Is this query correct, or am I getting the correct results by chance?
Is the query (or mapping) optimal, or should I rethink something?
Can the query be simplified?
The query is correct.
The mapping is sound and the query is optimal.
The query could be written more compactly with query_string:
{
"_source": [
"id"
],
"query": {
"bool": {
"must": [
{
"term": {
"cid": "177"
}
},
{
"nested": {
"path": "a",
"query": {
"query_string": {
"query": "a.attribute:tags AND ((a.key:flat OR a.key:heel) AND a.key:economic)"
}
}
}
},
{
"nested": {
"path": "a",
"query": {
"query_string": {
"query": "a.attribute:weight AND a.num:<20"
}
}
}
}
]
}
}
}
However, this would be less optimal, because the query_string clauses still have to be compiled internally into essentially the same query DSL that you already have. You would also still need the two separate nested groups, so you are good to go with what you've got.
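If the repetition in the original query is the main concern, one option is to generate the nested clauses in client code instead of changing the query itself. The following is only a sketch under that assumption; nested_attr and build_query are hypothetical helper names, and the field names are taken from the mapping in the question.

```python
import json


def nested_attr(attribute, condition):
    """One nested clause: attribute name and condition must hold in the same nested object."""
    return {
        "nested": {
            "path": "a",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"a.attribute": attribute}},
                        condition,
                    ]
                }
            },
        }
    }


def build_query(cid, conditions):
    """Combine the client id with one nested clause per (attribute, condition) pair."""
    must = [{"term": {"cid": cid}}]
    must += [nested_attr(attribute, condition) for attribute, condition in conditions]
    return {"_source": ["id"], "query": {"bool": {"must": must}}}


# cid="177" and (tag="flat" or tag="heel") and tag="economic" and weight<20
query = build_query("177", [
    ("tags", {"terms": {"a.key": ["flat", "heel"]}}),
    ("tags", {"term": {"a.key": "economic"}}),
    ("weight", {"range": {"a.num": {"lt": 20}}}),
])

if __name__ == "__main__":
    print(json.dumps(query, indent=2))
```

This produces the same structure as the hand-written query, with one nested group per condition, so new attributes such as vendor or size only add an entry to the list.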

Filter nested sorting in elasticsearch

I have a document with a nested structure; each nested object has an assignment_name and a due_date.
The mapping:
{
"goal": {
"mappings": {
"doc": {
"properties": {
"title": {
"type": "keyword"
},
// lots of other fields here ...
"steps": {
"type": "nested",
"properties": {
"assignment_name": {
"type": "keyword"
},
"due_date": {
"type": "date"
}
// lots of other fields here
}
}
}
}
}
}
}
I want to:
Filter all documents that have a specific assignment_name (e.g. user_a)
Sort the results by the next due_date, not taking other assignments into account.
This query gives me results in random order (no sorting):
{
"query":{
"bool":{
"filter":[
{
"nested":{
"path":"steps",
"query":{
"term":{
"steps.assignment_name":"user_a"
}
}
}
}
]
}
},
"sort":[
{
"steps.due_date":{
"order":"asc",
"nested":{
"path":"steps",
"filter":{
"term":{
"steps.assignment_name":"user_a"
}
}
}
}
}
],
"from":0,
"size":25
}
First, make sure that the datatype of the steps field is nested. Then use nested sorting to sort the documents by a field of the nested documents.
The query would be:
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "steps",
"query": {
"term": {
"steps.assignment_name": "user_a"
}
}
}
}
]
}
},
"sort": [
{
"steps.due_date": {
"order": "asc",
"nested": {
"path": "steps",
"filter": {
"term": {
"steps.assignment_name": "user_a"
}
}
}
}
}
]
}
The key point is to use the same filter in the sort as in the main query. This ensures that only the due_date values of the matching nested documents are considered when sorting.
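One way to keep the two filters from drifting apart is to build the request from a single filter object. This is a minimal sketch, assuming the request is assembled in Python; assignee_sorted_search is a hypothetical helper name, and the explicit "mode": "min" only spells out what ascending order already uses by default.

```python
import json


def assignee_sorted_search(assignee, size=25, offset=0):
    """Filter goals by assignee and sort by that assignee's earliest due_date."""
    # Defined once and used in both the query and the sort.
    same_filter = {"term": {"steps.assignment_name": assignee}}
    return {
        "query": {
            "bool": {
                "filter": [
                    {"nested": {"path": "steps", "query": same_filter}}
                ]
            }
        },
        "sort": [
            {
                "steps.due_date": {
                    "order": "asc",
                    "mode": "min",  # earliest due_date among the matching steps
                    "nested": {"path": "steps", "filter": same_filter},
                }
            }
        ],
        "from": offset,
        "size": size,
    }


if __name__ == "__main__":
    print(json.dumps(assignee_sorted_search("user_a"), indent=2))
```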

How to join ElasticSearch query with multi_match, boosting, wildcard and filter?

I'm trying to achieve these goals:
Filter results by a bool query, like "status=1"
Filter results by a bool range query, like "distance: gte 10 AND lte 60"
Filter results by matching at least one int value from an int array
Search for words in many fields while calculating the document score. Some fields need wildcards, some need boosting, like importantfield^2, somefield*, someotherfield^0.75
All of the above points are joined by the AND operator. All terms within one point are joined by the OR operator.
So far I have written something like this, but the wildcards are not working: searching for "abc" doesn't find "abcd" in the "name" field.
How can I solve this?
{
"filtered": {
"query": {
"multi_match": {
"query": "John Doe",
"fields": [
"*name*^1.75",
"someObject.name",
"tagsArray",
"*description*",
"ownerName"
]
}
},
"filter": {
"bool": {
"must": [
{
"term": {
"status": 2
}
},
{
"bool": {
"should": [
{
"term": {
"someIntsArray": 1
}
},
{
"term": {
"someIntsArray": 5
}
}
]
}
},
{
"range": {
"distanceA": {
"lte": 100
}
}
},
{
"range": {
"distanceB": {
"gte": 50,
"lte": 100
}
}
}
]
}
}
}
}
Mappings:
{
"documentId": {
"type": "integer"
},
"ownerName": {
"type": "string",
"index": "not_analyzed"
},
"description": {
"type": "string"
},
"status": {
"type": "byte"
},
"distanceA": {
"type": "short"
},
"createdAt": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"distanceB": {
"type": "short"
},
"someObject": {
"properties": {
"someObject_id": {
"type": "integer"
},
"name": {
"type": "string",
"index": "not_analyzed"
}
}
},
"someIntsArray": {
"type": "integer"
},
"tags": {
"type": "string",
"index": "not_analyzed"
}
}
You can use a query_string query if you want to apply wildcards across multiple fields and at the same time apply different boost values to individual fields.
Your query would look like this:
POST <your_index_name>/_search
{
"query":{
"bool":{
"must":[
{
"query_string":{
"query":"abc*",
"fields":[
"*name*^1.75",
"someObject.name",
"tagsArray",
"*description*",
"ownerName"
]
}
}
],
"filter":{
"bool":{
"must":[
{
"term":{
"status":"2"
}
},
{
"bool":{
"minimum_should_match":1,
"should":[
{
"term":{
"someIntsArray":1
}
},
{
"term":{
"someIntsArray":5
}
}
]
}
},
{
"range":{
"distanceA":{
"lte":100
}
}
},
{
"range":{
"distanceB":{
"gte": 50,
"lte":100
}
}
}
]
}
}
}
}
}
Note that for the field someIntsArray I've set "minimum_should_match":1, so that you won't end up with documents that have neither of those values.
Updated answer:
Going by the updated comment, you can put the fields that need wildcard search in a query_string query and use a simple match query with boosting for the others, as shown below. Combine both queries (you can add more match queries depending on your requirements) in a should clause. That way you control where the wildcard query is used and where it is not.
{
"query":{
"bool":{
"should":[
{
"query_string":{
"query":"joh*",
"fields":[
"name^2"
]
}
},
{
"match":{
"description":{
"query":"john",
"boost":15
}
}
}
],
"filter":{
"bool":{
"must":[
{
"term":{
"status":"2"
}
},
{
"bool":{
"minimum_should_match":1,
"should":[
{
"term":{
"someIntsArray":1
}
},
{
"term":{
"someIntsArray":5
}
}
]
}
},
{
"range":{
"distanceA":{
"lte":100
}
}
},
{
"range":{
"distanceB":{
"gte":50,
"lte":100
}
}
}
]
}
}
}
}
}
Let me know if this helps
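For reference, here is a sketch of how the first query above could be assembled in Python. build_search is a hypothetical helper, and it assumes the search text contains no query_string special characters. It also swaps the inner should of two term clauses for a single terms query, which matches documents containing any of the listed values.

```python
import json


def build_search(prefix, any_of_ints, status=2):
    """Scored wildcard search on several fields, restricted by non-scoring filters."""
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "query": prefix + "*",  # trailing wildcard: abc* finds abcd
                            "fields": [
                                "*name*^1.75",
                                "someObject.name",
                                "tagsArray",
                                "*description*",
                                "ownerName",
                            ],
                        }
                    }
                ],
                "filter": [
                    {"term": {"status": status}},
                    # any one of the values is enough (OR within this point)
                    {"terms": {"someIntsArray": any_of_ints}},
                    {"range": {"distanceA": {"lte": 100}}},
                    {"range": {"distanceB": {"gte": 50, "lte": 100}}},
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search("abc", [1, 5]), indent=2))
```

All entries in filter have to match (AND between the points), while only the query_string clause contributes to the score.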

Elasticsearch nested geo-shape query

Suppose I have the following mapping:
"mappings": {
"doc": {
"properties": {
"name": {
"type": "text"
},
"location": {
"type": "nested",
"properties": {
"point": {
"type": "geo_shape"
}
}
}
}
}
}
}
There is one document in the index:
POST /example/doc?refresh
{
"name": "Wind & Wetter, Berlin, Germany",
"location": {
"type": "point",
"coordinates": [13.400544, 52.530286]
}
}
How can I make a nested geo-shape query?
Example of usual geo-shape query from the documentation (the "bool" block can be skipped):
{
"query":{
"bool": {
"must": {
"match_all": {}
},
"filter": {
"geo_shape": {
"location": {
"shape": {
"type": "envelope",
"coordinates" : [[13.0, 53.0], [14.0, 52.0]]
},
"relation": "within"
}
}
}
}
}
}
Example of a nested query is:
{
"query": {
"nested" : {
"path" : "obj1",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{ "match" : {"obj1.name" : "blue"} },
{ "range" : {"obj1.count" : {"gt" : 5}} }
]
}
}
}
}
}
Now, how do I combine them? The documentation mentions that the nested filter has been replaced by the nested query, which behaves as a query in “query context” and as a filter in “filter context”.
If I try a query against the point itself:
{
"query": {
"nested": {
"path": "location",
"query": {
"geo_shape": {
"location.point": {
"shape": {
"type": "point",
"coordinates": [
13.400544,
52.530286
]
},
"relation": "disjoint"
}
}
}
}
}
}
I still get the document back even though the relation is "disjoint", so this is not correct. I tried different combinations with "bool", "filter", etc., but the query is ignored and the whole index is returned. Maybe it's impossible with this type of mapping?
Clearly I am missing something here. Can somebody help me out with that, please? Any help is greatly appreciated.
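One detail that may be worth checking: the mapping declares the geo_shape at location.point, while the sample document puts type and coordinates directly under location. The sketch below assumes the shape is indexed under location.point and builds the matching nested geo_shape query. Both helper names are hypothetical, and whether this explains the behaviour described above is not confirmed here.

```python
import json


def doc_with_nested_point(name, lon, lat):
    """Document body with the shape stored under location.point, as the mapping declares."""
    return {
        "name": name,
        "location": {"point": {"type": "point", "coordinates": [lon, lat]}},
    }


def nested_geo_shape_query(shape, relation="intersects"):
    """Wrap a geo_shape query on location.point in a nested query on location."""
    return {
        "query": {
            "nested": {
                "path": "location",
                "query": {
                    "geo_shape": {
                        "location.point": {"shape": shape, "relation": relation}
                    }
                },
            }
        }
    }


# Envelope taken from the documentation example quoted in the question.
envelope = {"type": "envelope", "coordinates": [[13.0, 53.0], [14.0, 52.0]]}

if __name__ == "__main__":
    doc = doc_with_nested_point("Wind & Wetter, Berlin, Germany", 13.400544, 52.530286)
    print(json.dumps(doc, indent=2))
    print(json.dumps(nested_geo_shape_query(envelope, "within"), indent=2))
```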

Multi_match and match queries together

I have the following query in Elasticsearch:
{
"query": {
"multi_match": {
"query": "bluefin bat",
"type": "phrase",
"fields": [
"title^5",
"body.value"
]
}
},
"highlight": {
"fields": {
"body.value": {
"number_of_fragments": 3
}
}
},
"fields": [
"title",
"id"
]
}
I have tried using "dis_max", but then both of my fields have to be searched with the same query text.
The remaining match query uses a different query text and looks like this:
{
"query": {
"match": {
"ingredients": {
"query": "key1, key2",
"analyzer": "keyword_analyzer"
}
}
}
}
How can I combine these two queries without using dis_max?
I figured out the answer. multi_match internally applies "dis_max", so you cannot combine it with dis_max here.
What I could do instead was use a bool query.
I could use should, which corresponds to a boolean OR, or must, which is equivalent to AND.
So this is how I modified my query:
{
"query": {
"bool":{
"should": [
{"multi_match":
{"query": "SOME_QUERY",
"type": "phrase",
"fields": ["title^5","body"]
}
},
{
"match":{
"labels" :{
"query": "SOME_QUERY",
"analyzer": "keyword_analyzer"
}
}
},
{
"match":{
"displayName" :{
"query": "SOME_QUERY",
"fuzziness": "AUTO"
}
}
}
],
"minimum_number_should_match": "50%"
}
},
"fields": ["title","id","labels","displayName","username"],
"highlight": {
"fields": {
"body.storage.value": {
"number_of_fragments": 3}
}
}
}
I hope this helps someone in future.
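The same idea as a small Python sketch, in case the query is built in code. combined_query is a hypothetical helper; it lets each clause carry its own query text, which was the original requirement, and it assumes a version where bool accepts the minimum_should_match spelling.

```python
import json


def combined_query(phrase_text, label_text, name_text, min_match="50%"):
    """Join a multi_match and two match queries with bool/should (OR semantics)."""
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "multi_match": {
                            "query": phrase_text,
                            "type": "phrase",
                            "fields": ["title^5", "body"],
                        }
                    },
                    {
                        "match": {
                            "labels": {
                                "query": label_text,
                                "analyzer": "keyword_analyzer",
                            }
                        }
                    },
                    {
                        "match": {
                            "displayName": {
                                "query": name_text,
                                "fuzziness": "AUTO",
                            }
                        }
                    },
                ],
                # share of should clauses that has to match
                "minimum_should_match": min_match,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(combined_query("bluefin bat", "key1, key2", "bluefin"), indent=2))
```

Replacing should with must in the helper would give AND semantics instead.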
