Pick up objects inside a polygon, using Elasticsearch

I am trying to collect all the items that lie within a polygon, but I get this error:
"reason": "failed to find geo_point field [SitePoint.coordinates]",
I have spent some time on it and still can't see what is wrong.
My index:
{
"took": 43,
"timed_out": false,
"hits": {
"total": 5,
"max_score": 0,
"hits": [
{
"_index": "my_index",
"_type": "doc",
"_id": "xxxx",
"_score": 0,
"_source": {
"SitePoint": {
"type": "Point",
"coordinates": [
18.85491,
-33.92305
]
}
}
}
]
}
}
My query:
GET my_index/_search
{
"query": {
"bool" : {
"must" : {
"match_all" : {}
},
"filter" : {
"geo_polygon" : {
"SitePoint.coordinates" : {
"points" : [
[18.85096,-33.96311],
[18.87787,-33.92564],
[18.85096,-33.96311]
]
}
}
}
}
}
}
Can someone help me please?

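The issue seems to be that geo_polygon operates on geo_point fields, whereas SitePoint looks like a geo_shape field (its value is a GeoJSON Point object with type and coordinates).
Use a geo_shape query on SitePoint instead of geo_polygon. Note that a polygon's coordinates are a list of closed rings, and each ring needs at least four positions with the first and last identical, so the three points from your query have to be extended to a real polygon.
Try this query: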
GET my_index/_search
{
"query": {
"bool" : {
"must" : {
"match_all" : {}
},
"filter" : {
"geo_shape" : {
"SitePoint" : {
"shape": {
"type": "polygon",
"coordinates" : [
[
[18.85096,-33.96311],
[18.87787,-33.92564],
[18.85096,-33.96311]
]
]
},
"relation": "within"
}
}
}
}
}
}
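A minimal Python sketch of how the request body could be assembled so the polygon is always a valid closed ring. The helper names (closed_ring, geo_shape_within) and the corner coordinates are hypothetical and not from the question; the sketch only builds the JSON body and does not talk to a cluster.

```python
import json


def closed_ring(points):
    """Return a closed linear ring: the first position is repeated at the end."""
    ring = [list(p) for p in points]
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    if len(ring) < 4:
        # Three distinct corners plus the closing position is the minimum.
        raise ValueError("a polygon ring needs at least 4 positions")
    return ring


def geo_shape_within(field, points):
    """Build a geo_shape filter for documents whose shape lies within the polygon."""
    return {
        "geo_shape": {
            field: {
                "shape": {
                    "type": "polygon",
                    # A GeoJSON polygon is a list of rings; the first is the outer boundary.
                    "coordinates": [closed_ring(points)],
                },
                "relation": "within",
            }
        }
    }


# Hypothetical (lon, lat) corners, chosen only for illustration.
corners = [(18.80, -33.97), (18.90, -33.97), (18.90, -33.90), (18.80, -33.90)]
query = {"query": {"bool": {"filter": geo_shape_within("SitePoint", corners)}}}
print(json.dumps(query, indent=2))
```

Passing the three points from the question to closed_ring raises ValueError, because the first and last points are identical and only two distinct corners remain.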

Related

Elasticsearch query showing weird behavior : bug?

To sum up things quickly, we are using Elasticsearch 6.8.4 and have documents with fields such as "statutPublicOuInterne" (public or internal state) or "identifiant" (identifier).
I cannot share the whole JSON (_source) for security reasons (corporate restrictions), but it looks like the following:
"_source": {
"dateCreation": "2020-11-05T16:31:28.404+01:00",
"dateDerModif": "2020-11-05T16:31:49.183+01:00",
"contenu": { ... }
"langue": "fr",
"observations": null,
"statutPublicOuInterne": "enAttenteTraitementCommissionTask",
"identifiant": "SFB-20201105-ELUH",
(...)
}
Some of the "statutPublicOuInterne" can have values such as "enAttenteTraitementCommissionTask" or "enCoursTraitementCommissionTask".
1st question: for some reason, when I search for statutPublicOuInterne=enCoursTraitementCommissionTask, it doesn't work, but if I search for statutPublicOuInterne=enCoursTraitementCommission (without "Task"), it works! That seems so weird to me and I really can't explain it.
2nd question: if I assume I need to search without the "Task" at the end, then searching for statutPublicOuInterne=enCoursTraitementCommission works but statutPublicOuInterne=enAttenteTraitementCommission doesn't work! (nor does statutPublicOuInterne=enAttenteTraitementCommissionTask work)
The query is as follows:
{
"query": {
"bool" : {
"must" : [
{
"match" : {
"statutPublicOuInterne" : {
"query" : "enAttenteTraitementCommission"
}
}
}
]
}
}
}
I just can't understand why it doesn't find anything, because if I search for this document with its "identifiant" field, then it works:
{
"query": {
"bool" : {
"must" : [
{
"match" : {
"identifiant" : {
"query" : "SFB-20201105-ELUH"
}
}
}
]
}
}
}
The response is:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 2.0283146,
"hits": [
{
"_index": "some-index",
"_type": "demandes",
"_id": "SFB-20201105-ELUH",
"_score": 2.0283146,
"_source": {
"dateCreation": "2020-11-05T16:31:28.404+01:00",
"dateDerModif": "2020-11-05T16:31:49.183+01:00",
"contenu": { ... }
"langue": "fr",
"observations": null,
"statutPublicOuInterne": "enAttenteTraitementCommissionTask",
"identifiant": "SFB-20201105-ELUH",
(...)
}
}
]
}
}
We can clearly see "statutPublicOuInterne": "enAttenteTraitementCommissionTask" in the response.
Am I missing something?
Many thanks in advance for your help!
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"statutPublicOuInterne": {
"type": "text"
}
}
}
}
Index Data:
{
"dateCreation": "2020-11-05T16:31:28.404+01:00",
"dateDerModif": "2020-11-05T16:31:49.183+01:00",
"langue": "fr",
"observations": null,
"statutPublicOuInterne": "enAttenteTraitementCommissionTask",
"identifiant": "SFB-20201105-ELUH"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"match": {
"statutPublicOuInterne": {
"query": "enAttenteTraitementCommissionTask"
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "64700803",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"dateCreation": "2020-11-05T16:31:28.404+01:00",
"dateDerModif": "2020-11-05T16:31:49.183+01:00",
"langue": "fr",
"observations": null,
"statutPublicOuInterne": "enAttenteTraitementCommissionTask",
"identifiant": "SFB-20201105-ELUH"
}
}
]

elasticsearch nested aggregation is empty

So, I have an index in Elasticsearch 7.6, which has documents similar to this one:
{
"_index": "my-index",
"_type": "_doc",
"_id": "kjdskjwolsjj",
"_version": 1,
"_score": null,
"_source": {
"timestamp": "2018-04-22T20:11:35.0292586Z",
"batchId": "9c96d360-5549-4b3b-85c8-756330117bad",
"userId": "id-001-001",
"things": [
{
"id": 650055867,
"name": "green"
},
{
"id": 523,
"name": "eggs"
},
{
"id": 1269,
"name": "ham"
}
]
}
}
Of course, this is just one document of many in the index. I would like to create an aggregate bucket of all the "things" in my index, so that I could sub aggregate against that bucket.
My agg query looks like this:
{
"aggs": {
"all_things": {
"nested": {
"path": "_source.things"
}
}
}
}
(BTW ... if I used just "things" as the nested path, it complains "[nested] nested path [things] is not nested".)
Finally the result (using the Kibana console) is:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1408,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"all_things" : {
"doc_count" : 0
}
}
}
Could someone explain why I get no docs in my bucket? Or perhaps a decent way to create a bucket of all my "things"?
Thanks.
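You need to map things as nested before indexing the documents. An existing field's type cannot be changed in place, so create the index with this mapping and then (re)index the data: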
PUT my-index
{
"mappings": {
"properties": {
"things": {
"type": "nested"
}
}
}
}
POST my-index/_doc
{
"timestamp": "2018-04-22T20:11:35.0292586Z",
"batchId": "9c96d360-5549-4b3b-85c8-756330117bad",
"userId": "id-001-001",
"things": [
{
"id": 650055867,
"name": "green"
},
{
"id": 523,
"name": "eggs"
},
{
"id": 1269,
"name": "ham"
}
]
}
Then and only then will your nested aggs work:
GET my-index/_search
{
"size": 0,
"aggs": {
"things_ids": {
"nested": {
"path": "things"
},
"aggs": {
"things_ids": {
"cardinality": {
"field": "things.id"
}
}
}
}
}
}
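Since the goal was a bucket of all "things" to sub-aggregate against, here is a small Python sketch that builds such a request body. The function name nested_terms_agg and the aggregation names are hypothetical, and it assumes things is already mapped as nested as shown above; it only constructs the JSON and does not contact Elasticsearch.

```python
import json


def nested_terms_agg(path, field, size=10):
    """Build a nested aggregation with a terms sub-aggregation on one inner field."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "all_things": {
                "nested": {"path": path},
                "aggs": {
                    "by_value": {
                        # Inner fields are addressed by their full dotted path.
                        "terms": {"field": f"{path}.{field}", "size": size}
                    }
                },
            }
        },
    }


body = nested_terms_agg("things", "id")
print(json.dumps(body, indent=2))
```

The path is "things", not "_source.things": _source is part of the search response, not part of the field name.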

Elasticsearch filter by multiple fields in an object which is in an array field

The goal is to filter products with multiple prices.
The data looks like this:
{
"name":"a",
"price":[
{
"membershipLevel":"Gold",
"price":"5"
},
{
"membershipLevel":"Silver",
"price":"50"
},
{
"membershipLevel":"Bronze",
"price":"100"
}
]
}
I would like to filter by membershipLevel and price. For example, if I am a silver member and query price range 0-10, the product should not appear, but if I am a gold member, the product "a" should appear. Is this kind of query supported by Elasticsearch?
You need to map price with the nested datatype and query it with a nested query.
The mapping, sample document, query, and response are below:
Mapping:
PUT my_price_index
{
"mappings": {
"properties": {
"name":{
"type":"text"
},
"price":{
"type":"nested",
"properties": {
"membershipLevel":{
"type":"keyword"
},
"price":{
"type":"double"
}
}
}
}
}
}
Sample Document:
POST my_price_index/_doc/1
{
"name":"a",
"price":[
{
"membershipLevel":"Gold",
"price":"5"
},
{
"membershipLevel":"Silver",
"price":"50"
},
{
"membershipLevel":"Bronze",
"price":"100"
}
]
}
Query:
POST my_price_index/_search
{
"query": {
"nested": {
"path": "price",
"query": {
"bool": {
"must": [
{
"term": {
"price.membershipLevel": "Gold"
}
},
{
"range": {
"price.price": {
"gte": 0,
"lte": 10
}
}
}
]
}
},
"inner_hits": {}
}
}
}
The query above returns all documents that have a nested price entry with price.price between 0 and 10 and price.membershipLevel equal to Gold.
Notice that I've used inner_hits. Even though price is nested, Elasticsearch returns the entire parent document in _source, not only the nested object that satisfied the query.
To find the exact nested object that matched, use inner_hits.
The response looks like this.
Response:
{
"took" : 128,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808291,
"hits" : [
{
"_index" : "my_price_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.9808291,
"_source" : {
"name" : "a",
"price" : [
{
"membershipLevel" : "Gold",
"price" : "5"
},
{
"membershipLevel" : "Silver",
"price" : "50"
},
{
"membershipLevel" : "Bronze",
"price" : "100"
}
]
},
"inner_hits" : {
"price" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808291,
"hits" : [
{
"_index" : "my_price_index",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "price",
"offset" : 0
},
"_score" : 1.9808291,
"_source" : {
"membershipLevel" : "Gold",
"price" : "5"
}
}
]
}
}
}
}
]
}
}
Hope this helps!
Let me show you how to do it using nested fields together with query and filter context. I will use your example to show how to define the index mapping, index sample documents, and write the search query.
It's important to note the include_in_parent param in the Elasticsearch mapping, which lets us query these nested fields without using the nested query.
Please refer to the Elasticsearch documentation about it:
If true, all fields in the nested object are also added to the parent
document as standard (flat) fields. Defaults to false.
Index Def
{
"mappings": {
"properties": {
"product": {
"type": "nested",
"include_in_parent": true
}
}
}
}
Index sample docs
{
"product": {
"price" : 5,
"membershipLevel" : "Gold"
}
}
{
"product": {
"price" : 50,
"membershipLevel" : "Silver"
}
}
{
"product": {
"price" : 100,
"membershipLevel" : "Bronze"
}
}
Search query to show Gold with price range 0-10
{
"query": {
"bool": {
"must": [
{
"match": {
"product.membershipLevel": "Gold"
}
}
],
"filter": [
{
"range": {
"product.price": {
"gte": 0,
"lte" : 10
}
}
}
]
}
}
}
Result
"hits": [
{
"_index": "so-60620921-nested",
"_type": "_doc",
"_id": "1",
"_score": 1.0296195,
"_source": {
"product": {
"price": 5,
"membershipLevel": "Gold"
}
}
}
]
Search query to exclude Silver, with same price range
{
"query": {
"bool": {
"must": [
{
"match": {
"product.membershipLevel": "Silver"
}
}
],
"filter": [
{
"range": {
"product.price": {
"gte": 0,
"lte" : 10
}
}
}
]
}
}
}
The above query returns no results because no document matches.
P.S.: this SO answer might help you understand nested fields and how to query them in detail.
You have to use nested fields and a nested query to achieve this: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
Define your price property with type "nested" and then you will be able to filter by every property of the nested object.
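To see why the nested type matters here, the following Python sketch imitates the two behaviours on the product from the question. It is a toy model and not Elasticsearch code; the function names are hypothetical, and the prices are written as numbers for simplicity.

```python
product = {
    "name": "a",
    "price": [
        {"membershipLevel": "Gold", "price": 5},
        {"membershipLevel": "Silver", "price": 50},
        {"membershipLevel": "Bronze", "price": 100},
    ],
}


def matches_flattened(doc, level, low, high):
    """Plain object arrays: each field becomes an independent list of values."""
    levels = [p["membershipLevel"] for p in doc["price"]]
    prices = [p["price"] for p in doc["price"]]
    return level in levels and any(low <= v <= high for v in prices)


def matches_nested(doc, level, low, high):
    """Nested objects: both conditions must hold on the same inner object."""
    return any(
        p["membershipLevel"] == level and low <= p["price"] <= high
        for p in doc["price"]
    )


print(matches_flattened(product, "Silver", 0, 10))  # True: Silver and 5 come from different entries
print(matches_nested(product, "Silver", 0, 10))     # False
print(matches_nested(product, "Gold", 0, 10))       # True
```

The flattened variant wrongly matches a Silver member with a 0-10 range because the level and the price are checked independently, which is the situation a nested query avoids.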

Elasticsearch advanced autocomplete

I want to autocomplete user input with Elasticsearch. There are tons of tutorials on how to do so, but none go into the really detailed stuff.
The last issue I'm having with my query is that it should score results that are not real "autocompletions" lower. Example:
IS:
I type: "Bed"
I find: "Bed", "Bigbed", "Fancy Bed", "Bed Frame"
WANT:
I type: "Bed"
I find: "Bed", "Bed Frame", [other "Bed XXX" results], "Fancy Bed", "Bigbed"
So I want Elasticsearch to first complete "to the right", if that makes sense, and only then return results that have words in front of the input.
I've tried the completion suggester, but it doesn't do other things I want and it has the same issue.
In German there are lots of words like Bigbed (which isn't a real word in English, I know), and I don't want those words ranked high. But since they match more closely than Bed Frame (because that is 2 tokens), they show up near the top.
This is my query currently:
POST autocompletion/_search?pretty
{
"query": {
"function_score": {
"query": {
"match": {
"keyword": {
"query": "Bed",
"fuzziness": 1,
"minimum_should_match": "100%"
}
}
},
"field_value_factor": {
"field": "bias",
"factor": 1
}
}
}
}
If you use the Elasticsearch completion suggester, as explained at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html, and query like this:
{
"suggest": {
"song-suggest" : {
"prefix" : "bed",
"completion" : {
"field" : "suggest"
}
}
}
}
You will get:
{
"took": 13,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": 0.0,
"hits": []
},
"suggest": {
"song-suggest": [
{
"text": "bed",
"offset": 0,
"length": 3,
"options": [
{
"text": "Bed",
"_index": "autocomplete",
"_type": "_doc",
"_id": "1",
"_score": 34.0,
"_source": {
"suggest": {
"input": [
"Bed"
],
"weight": 34
}
}
},
{
"text": "Bed Frame",
"_index": "autocomplete",
"_type": "_doc",
"_id": "3",
"_score": 34.0,
"_source": {
"suggest": {
"input": [
"Bed Frame"
],
"weight": 34
}
}
}
]
}
]
}
}
If you want to use the search API instead, you can combine two queries: a match query on "Bed", and a prefix query on the keyword sub-field that boosts values starting with "Bed" ("Bed ****").
Here is the mapping:
{
"mappings": {
"_doc" : {
"properties" : {
"suggest" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword"
}
}
}
}
}
}
}
Here is the search query:
{
"query" : {
"bool" : {
"must" : [
{
"match" : {
"suggest" : "Bed"
}
}
],
"should" : [
{
"prefix" : {
"suggest.keyword" : "Bed"
}
}
]
}
}
}
The should clause will boost documents starting with "Bed". Et voilà!
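The effect of combining must and should can be imitated with a toy ranking in Python. This is only an illustration under simplifying assumptions: the scores are made-up integers instead of real relevance scores, tokens are produced by lowercasing and splitting on whitespace, and the function names are hypothetical.

```python
titles = ["Bed", "Bigbed", "Fancy Bed", "Bed Frame"]


def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def score(title, user_input):
    """Return 0 if the must clause fails, else 1 plus 1 for the prefix boost."""
    if user_input.lower() not in tokens(title):
        return 0  # "must" match on the analyzed text failed
    points = 1
    if title.startswith(user_input):  # "should" prefix on the exact, case-sensitive value
        points += 1
    return points


def rank(candidates, user_input):
    """Keep matching titles and order them by descending score (stable sort)."""
    hits = [(score(t, user_input), t) for t in candidates]
    hits = [(s, t) for s, t in hits if s > 0]
    return [t for s, t in sorted(hits, key=lambda pair: -pair[0])]


print(rank(titles, "Bed"))  # ['Bed', 'Bed Frame', 'Fancy Bed']
```

In this model "Bigbed" drops out entirely, because its only token is "bigbed" and the match clause looks for the token "bed"; if such words should still appear at the bottom, the must clause would have to be loosened.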

Making an elasticsearch nested query for match on all fields of nested object

The basic question is as follows:
Is there a convenient way to specify a multi-field match on all fields for a nested query?
For a normal query, { match : { _all : "query string" }} works.
This doesn't work in a nested query, perhaps because the nested object doesn't have an _all field?
The more detailed question below:
I have a nested document called "Parent" as follows:
{
"children" : [
{
"field_a": "value_a_1",
"field_b" : "value_b_1",
"field_c" : [ {
"field_c_a" : "value_c_a_1",
"field_c_b" : "value_c_b_1"
} ]
},
{
"field_a": "value_a_2",
"field_b" : "value_b_2",
"field_c" : [ {
"field_c_a" : "value_c_a_2",
"field_c_b" : "value_c_b_2"
} ]
}
]
}
This is the mapping I used for making the children nested objects:
"Parent" : {
"properties" : {
"children" : {
"type" : "nested",
"include_in_parent" : true
}
}
}
And here is a query, where I want to select a few terms using a match on all children fields query, and a term query:
"query" : {
"nested": {
"path" : "children",
"query" : {
"bool" : {
"must" : [
{"multi_match" : {"query": "value_c_a_1", "fields" : ["children.*"]}},
{"term" : {"children.field_a" : "value_a_1" }}
]
}
}
}
}
The above query doesn't work because I can't select all fields in a multimatch query for a nested object.
"query" : {
"nested": {
"path" : "children",
"query" : {
"bool" : {
"must" : [
{"multi_match" : {"query": "value_c_a_1", "fields" : ["*_c_a"]}}
]
}
}
}
}
The query above works because the pattern matching allows a * to be placed before a string but not after for some reason (?)
Is there a nice shorthand way to select all the fields of a nested object?
It would also be nice to know why the expected children.* wildcard doesn't work as expected.
Worked normally for me (note: Elasticsearch 1.7). Notice the different names though (aparent, somechildren) - I haven't tested it with the original names, but it could have something to do with it.
Schema:
curl -XPUT localhost:9200/test -d '{
"mappings": {
"aparent": {
"properties": {
"somechildren": {
"type": "nested",
"include_in_parent": true
}
}
}
}
}'
Document:
curl -XPUT localhost:9200/test/aparent/1 -d '{
"somechildren" : [
{
"field_a": "value_a_1",
"field_b" : "value_b_1",
"field_c" : [ {
"field_c_a" : "value_c_a_1",
"field_c_b" : "value_c_b_1"
} ]
},
{
"field_a": "value_a_2",
"field_b" : "value_b_2",
"field_c" : [ {
"field_c_a" : "value_c_a_2",
"field_c_b" : "value_c_b_2"
} ]
}
]
}'
Query:
GET test/_search
{
"query": {
"nested": {
"path": "somechildren",
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "value_c_a_1",
"fields": [
"somechildren.*"
]
}
},
{
"term": {
"somechildren.field_a": {
"value": "value_a_1"
}
}
}
]
}
}
}
}
}
Result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.8603305,
"hits": [
{
"_index": "test",
"_type": "aparent",
"_id": "1",
"_score": 0.8603305,
"_source": {
"somechildren": [
{
"field_a": "value_a_1",
"field_b": "value_b_1",
"field_c": [
{
"field_c_a": "value_c_a_1",
"field_c_b": "value_c_b_1"
}
]
},
{
"field_a": "value_a_2",
"field_b": "value_b_2",
"field_c": [
{
"field_c_a": "value_c_a_2",
"field_c_b": "value_c_b_2"
}
]
}
]
}
}
]
}
}
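One way to reason about which fields a pattern such as "somechildren.*" or "*_c_a" selects is to match it against the full dotted field paths. The Python sketch below does this with glob matching; it assumes that wildcard expansion behaves like a simple glob over full paths, which is an assumption for illustration and not something stated in the thread. The expand helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Full dotted paths of the leaf fields in the sample document.
field_paths = [
    "children.field_a",
    "children.field_b",
    "children.field_c.field_c_a",
    "children.field_c.field_c_b",
]


def expand(pattern, paths):
    """Return the paths that a glob-style pattern would select."""
    return [p for p in paths if fnmatchcase(p, pattern)]


print(expand("children.*", field_paths))  # all four paths
print(expand("*_c_a", field_paths))       # ['children.field_c.field_c_a']
```

Note that fnmatchcase also gives special meaning to ? and [...], so this model is only reasonable for patterns that use * alone.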
