Elasticsearch query by generic properties: keywords and numeric values - elasticsearch

I have this mapping in ES 7.9:
{
"mappings": {
"properties": {
"cid": {
"type": "keyword",
"store": true
},
"id": {
"type": "keyword",
"store": true
},
"a": {
"type": "nested",
"properties": {
"attribute":{
"type": "keyword"
},
"key": {
"type": "keyword"
},
"num": {
"type": "float"
}
}
}
}
}
}
And some documents indexed like:
{
"cid": "177",
"id": "1",
"a": [
{
"attribute": "tags",
"key": [
"heel",
"thong",
"low_heel",
"economic"
]
},
{
"attribute": "weight",
"num": 15
}
]
}
Basically, an object can have multiple attributes (a property array).
Those attributes can be different for each client. In this example, I have 2 types of attributes: tag and weight, however other documents could have other attributes like vendor, size, power, etc., so the model has to be generic enough to support beforehand unknown attributes.
An attribute can be a list of keywords (like tags) or a numeric value (like weight).
I need an ES query to fetch the documents ids with this pseudo-query:
cid="177" and (tag="flat" or tag="heel") and tag="economic" and weight<20
I managed to reach this query that seems to be working as expected:
{
"_source": ["id"],
"query": {
"bool": {
"must" : [
{"term" : { "cid" : "177" }},
{
"nested": {
"path": "a",
"query": {
"bool":{
"must":[
{"term" : { "a.attribute": "tags"}},
{"terms" : { "a.key": ["flat","heel"]}}
]
}
}
}
},
{
"nested": {
"path": "a",
"query": {
"bool":{
"must":[
{"term" : { "a.attribute": "tags"}},
{"term" : { "a.key": "economic"}}
]
}
}
}
},
{
"nested": {
"path": "a",
"query": {
"bool":{
"must":[
{"term" : { "a.attribute": "weight" } },
{"range": { "a.num": {"lt": 20} } }
]
}
}
}
}
]
}
}
}
Is this query correct or I am getting the correct results by chance?
Is the query (or mapping) optimal or I should rethink something?
Can the query be simplified?

The query is correct.
The mapping is great and the query is optimal.
While the query can be simplified:
{
"_source": [
"id"
],
"query": {
"bool": {
"must": [
{
"term": {
"cid": "177"
}
},
{
"nested": {
"path": "a",
"query": {
"query_string": {
"query": "a.attribute:tags AND ((a.key:flat OR a.key:heel) AND a.key:economic)"
}
}
}
},
{
"nested": {
"path": "a",
"query": {
"query_string": {
"query": "a.attribute:weight AND a.num:<20"
}
}
}
}
]
}
}
}
it'd be less optimal due to the fact that these query_strings would still need to be internally compiled into essentially the query DSL that you've got above. Plus you'd still be needing the two separate nested groups so... You're good to roll with what you've got.

Related

Elasticsearch nested query and sorting

can somebody help me to understand what Elastic means by nested. In documentations https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_nested_sorting_examples is an example which does not show how the document object looks like. It look like I should imagine the mapping from the search query. Query looks like:
POST /_search
{
"query": {
"nested": {
"path": "parent",
"query": {
"bool": {
"must": {"range": {"parent.age": {"gte": 21}}},
"filter": {
"nested": {
"path": "parent.child",
"query": {"match": {"parent.child.name": "matt"}}
}
}
}
}
}
},
"sort" : [
{
"parent.child.age" : {
"mode" : "min",
"order" : "asc",
"nested": {
"path": "parent",
"filter": {
"range": {"parent.age": {"gte": 21}}
},
"nested": {
"path": "parent.child",
"filter": {
"match": {"parent.child.name": "matt"}
}
}
}
}
}
]
}
Can somebody write a document structure on which this query will work?
Something like this.
{
"parent": {
"name": "Elasti Sorch",
"age": 23,
"child": [
{
"name": "Kibana Lion",
"age": 12
},
{
"name": "Matt",
"age": 15
}
]
}
}
In Elastic nested means it's an array of objects. To store an array of objects into a field in elastic search you have to map the field to a nested while creating the index.
PUT parent
{
"mappings": {
"doc":{
"properties": {
"name":{
"type": "text"
},
"age":{
"type": "integer"
},
"child":{
"type": "nested",
"properties": {
"name":{
"type":"text"
},
"age":{
"type":"integer"
}
}
}
}
}
}
}
and a sample nested document cab be inserted like this
POST parent/doc
{
"name":"abc",
"age":50,
"child":[
{
"name":"son1",
"age":25
},
{
"name":"adughter1",
"age":20
}
]
}

Elasticsearch nested geo-shape query

Suppose I have the following mapping:
"mappings": {
"doc": {
"properties": {
"name": {
"type": "text"
},
"location": {
"type": "nested",
"properties": {
"point": {
"type": "geo_shape"
}
}
}
}
}
}
}
There is one document in the index:
POST /example/doc?refresh
{
"name": "Wind & Wetter, Berlin, Germany",
"location": {
"type": "point",
"coordinates": [13.400544, 52.530286]
}
}
How can I make a nested geo-shape query?
Example of usual geo-shape query from the documentation (the "bool" block can be skipped):
{
"query":{
"bool": {
"must": {
"match_all": {}
},
"filter": {
"geo_shape": {
"location": {
"shape": {
"type": "envelope",
"coordinates" : [[13.0, 53.0], [14.0, 52.0]]
},
"relation": "within"
}
}
}
}
}
}
Example of a nested query is:
{
"query": {
"nested" : {
"path" : "obj1",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{ "match" : {"obj1.name" : "blue"} },
{ "range" : {"obj1.count" : {"gt" : 5}} }
]
}
}
}
}
}
Now how to combine them? In the documentation it is mentioned that nested filter has been replaced by nested query. And that it behaves as a query in “query context” and as a filter in “filter context”.
If I try query for intersect with the point:
{
"query": {
"nested": {
"path": "location",
"query": {
"geo_shape": {
"location.point": {
"shape": {
"type": "point",
"coordinates": [
13.400544,
52.530286
]
},
"relation": "disjoint"
}
}
}
}
}
}
I still get back the document even if relation is "disjoint", so it's not correct. I tried different combinations, with "bool" and "filter", etc. but query is ignored, returning the whole index. Maybe it's impossible with this type of mapping?
Clearly I am missing something here. Can somebody help me out with that, please? Any help is greatly appreciated.

Finding nested result of a certain parent

I have the following query which works in giving me all periods (nested) +houses they belong to that have an arrivaldate for the period I specify.
Now I want to try and get just the arrivaldates for a certain house, but I cannot figure out the syntax of how to do this in Elasticsearch.
GET /houses/house/_search
{
"_source" : ["HouseId"],
"query": {
"nested": {
"path": "Periods",
"query": {
"bool": {
"must": [
{"range": {
"Periods.ArrivalDate": {
"gte" : "2017-10-01",
"lt" : "2017-11-01"
}
}
}
]
}
},
"inner_hits" : {}
}
}
}
The mapping is this (shortened to I hope the relevant parts)
{
"houses": {
"mappings": {
"house": {
"properties": {
"Periods": {
"type": "nested",
"properties": {
"ArrivalDate": {
"type": "date",
"format": "yyyy-MM-dd"
},
....
"HouseId": {
"type": "keyword"
},
So I would like to find the available arrivaldates for a house with a certain HouseId within a certain month
I think I have it figured out, but please let me know if better solutions are available:
{
"_source":[
"HouseCode",
"Country",
"Region"
],
"query":{
"bool":{
"must":[
{
"match":{
"HouseId":"someid"
}
},
{
"nested":{
"path":"Periods",
"query":{
"range":{
"Periods.ArrivalDate":{
"gte":"2017-05-01",
"lt":"2017-06-01"
}
}
},
"inner_hits":{
"size":1000
}
}
}
]
}
}
}
The query will return you whole houses.
If you wan to get only some Periods, you should use a nested aggregations, combined with a filter aggregation :
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-nested-aggregation.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html

Searching in specific fields of types

Consider the following query:
{
"query" : {
"match_phrase" : {
"_all" : "Smith"
}
}
}
How would I specify in which fields of which types it may search, instead of searching in everything? (field names may be non-unique across types)
I've tried the query below, but it didn't work (it doesn't return results, it does when I remove person. from all fields):
{
"query": {
"multi_match": {
"query": "Smith",
"fields": [
"person.first_name",
"person.last_name",
"person.age"
],
"lenient": true
}
}
}
I'm sending these queries to http://localhost:9200/tsf-model/_search.
If you can build your query dynamically, I think you can use a combination of your multi_match query and a type query for each type, in order to achieve what you want:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"filter": [
{
"type": {
"value": "type1"
}
},
{
"multi_match": {
"query": "Smith",
"fields": [
"field1",
"field3",
"field5"
]
}
}
]
}
},
{
"bool": {
"filter": [
{
"type": {
"value": "type2"
}
},
{
"multi_match": {
"query": "Smith",
"fields": [
"field2",
"field4",
"field6"
]
}
}
]
}
}
]
}
}
}

Elastic Search Relevance for query based on most matches

I have a following mapping
posts":{
"properties":{
"prop1": {
"type": "nested",
"properties": {
"item1": {
"type": "string",
"index": "not_analyzed"
},
"item2": {
"type": "string",
"index": "not_analyzed"
},
"item3": {
"type": "string",
"index": "not_analyzed"
}
}
},
"name": {
"type": "string",
"index": "not_analyzed"
}
}
}
Consider the objects indexed like following for these mapping
{
"name": "Name1",
"prop1": [
{
"item1": "val1",
"item2": "val2",
"item3": "val3"
},
{
"item1": "val1",
"item2": "val5",
"item3": "val6"
}
]
}
And another object
{
"name": "Name2",
"prop1": [
{
"item1": "val2",
"item2": "val7",
"item3": "val8"
},
{
"item1": "val12",
"item2": "val9",
"item3": "val10"
}
]
}
Now say i want to search documents which have prop1.item1 value to be either "val1" or "val2". I also want the result to be sorted in such a way that the document with both val1 and val2 would have more score than the one with only one of "val1" or "val2".
I have tried the following query but that doesnt seem to score based on number of matches
{
"query": {
"filtered": {
"query": {"match_all": {}},
"filter": {
"nested": {
"path": "prop1",
"filter": {
"or": [
{
"and": [
{"term": {"prop1.item1": "val1"}},
{"term": {"prop1.item2": "val2"}}
]
},
{
"and": [
{"term": {"prop1.item1": "val1"}},
{"term": {"prop1.item2": "val5"}}
]
},
{
"and": [
{"term": {"prop1.item1": "val12"}},
{"term": {"prop1.item2": "val9"}}
]
}
]
}
}
}
}
}
}
Now although it should give both documents, first document should have more score as it contains 2 of the things in the filter whereas second contains only one.
Can someone help with the right query to get results sorted based on most matches ?
The biggest problem you have with your query is that you are using a filter. Therefore no score is calculated. Than you use a match_all query which gives all documents a score of 1. Replace the filtered query with a query and use the bool query instead of the bool filter.
Hope that helps.
Scores aren't calculated on filters use a nested query instead:
{
"query": {
"nested": {
"score_mode": "sum",
"path": "prop1",
"query": {
"bool": {
"should": [{
"bool": {
"must": [{
"match": {
"prop1.item1": "val1"
}
},
{
"match": {
"prop1.item2": "val2"
}
}]
}
},
{
"bool": {
"must": [{
"match": {
"prop1.item1": "val1"
}
},
{
"match": {
"prop1.item2": "val5"
}
}]
}
},
{
"bool": {
"must": [{
"match": {
"prop1.item1": "val12"
}
},
{
"match": {
"prop1.item2": "val9"
}
}]
}
}]
}
}
}
}
}

Resources