Elasticsearch point in polygon query - elasticsearch

I'm using Elasticsearch v6.7.1.
I have created a few indices and filled them with data. Each index has a field gps_coords with lat and lon values (coordinates).
What I want to do is write a query where I pass a polygon and check whether a certain point falls within that polygon. Is that possible?
This is a query that I've already tried:
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_shape": {
          "location": {
            "shape": {
              "type": "polygon",
              "coordinates": [
                [25.0245351, 54.5693374],
                [25.0245351, 54.83232],
                [25.4815808, 54.83232],
                [25.4815808, 54.5693374],
                [25.0245351, 54.5693374]
              ]
            },
            "relation": "within"
          }
        }
      }
    }
  }
}
But it returns this error:
{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "Invalid LinearRing found. Found a single coordinate when expecting a coordinate array"
      }
    ],
    "type": "parse_exception",
    "reason": "Invalid LinearRing found. Found a single coordinate when expecting a coordinate array"
  },
  "status": 400
}
Here is my index mapping:
[
    'index' => 'places',
    'body' => [
        'mappings' => [
            'place' => [
                'properties' => [
                    'gps_coords' => [
                        'ignore_malformed' => true,
                        'type' => 'geo_shape'
                    ]
                ]
            ]
        ],
        'number_of_replicas' => 0
    ]
];
Can someone please point me in the right direction?
Thank you!

First, in the query sample you use a location field instead of gps_coords, as already stated in the comment. But I believe this is just a typo, because that's not the source of the error.
The reason you receive a parse exception is that you are missing one pair of brackets in the polygon definition of the geo_shape query: the coordinates of a polygon must be an array of linear rings, and each ring is itself an array of coordinate pairs. The correct form would be (just the relevant part):
"shape": {
"type": "polygon",
"coordinates" : [
[[25.0245351, 54.5693374],
[25.0245351, 54.83232],
[25.4815808, 54.83232],
[25.4815808, 54.5693374],
[25.0245351, 54.5693374]]
]
}
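Putting both fixes together (the gps_coords field name and the extra level of brackets), the full query would look something like this; it is a sketch assembled from the snippets above, assuming gps_coords is mapped as geo_shape:
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_shape": {
          "gps_coords": {
            "shape": {
              "type": "polygon",
              "coordinates": [
                [
                  [25.0245351, 54.5693374],
                  [25.0245351, 54.83232],
                  [25.4815808, 54.83232],
                  [25.4815808, 54.5693374],
                  [25.0245351, 54.5693374]
                ]
              ]
            },
            "relation": "within"
          }
        }
      }
    }
  }
}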

Yeah, sorry, my bad, I just copied the example from the ES docs, not my code. Okay, I will try to add those brackets and see if it helps, thanks!

Related

How to get the best matching document in Elasticsearch?

I have an index where I store all the places used in my documents. I want to use this index to see if the user mentioned one of the places in the text query I receive.
Unfortunately, I have two documents whose name is similar enough to trick Elasticsearch scoring: Stockholm and Stockholm-Arlanda.
My test phrase is intyg stockholm, and this is the query I use to get the best matching document:
{
  "size": 1,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "intyg stockholm"
          }
        }
      ],
      "must": [
        {
          "term": {
            "type": {
              "value": "4"
            }
          }
        },
        {
          "terms": {
            "name": [
              "intyg",
              "stockholm"
            ]
          }
        },
        {
          "exists": {
            "field": "data.coordinates"
          }
        }
      ]
    }
  }
}
As you can see, I use a terms query to find the interesting documents, and I use a match query in the should part of the root bool query so that scoring puts the document I want (Stockholm) on top.
This code worked locally (where I run ES in a container), but it broke when I started testing on a cluster hosted in AWS (with the exact same dataset). I found this explanation of what happens, and adding the search type argument actually fixes the issue.
Since the workaround is best not used in production, I'm looking for other ways to get the expected result.
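For reference, the search type workaround referred to above is presumably dfs_query_then_fetch, which collects global term statistics from all shards before scoring (at the cost of an extra round trip per search). It is passed as a query parameter; the index name here is just a placeholder:
GET /my-index/_search?search_type=dfs_query_then_fetch
{
  "size": 1,
  "query": { ... }
}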
Here are the two documents:
// Stockholm
{
  "type": 4,
  "name": "Stockholm",
  "id": "42",
  "searchableNames": [
    "Stockholm"
  ],
  "uniqueId": "Place:42",
  "data": {
    "coordinates": "59.32932349999999,18.0685808"
  }
}
// Stockholm-Arlanda
{
  "type": 4,
  "name": "Stockholm-Arlanda",
  "id": "1832",
  "searchableNames": [
    "Stockholm-Arlanda"
  ],
  "uniqueId": "Place:1832",
  "data": {
    "coordinates": "59.6497622,17.9237807"
  }
}

Combining terms with synonyms - ElasticSearch

I am new to Elasticsearch and have a synonym analyzer in place, which looks like this:
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "graph_synonyms": {
            "type": "synonym_graph",
            "synonyms": [
              "gowns, dresses",
              "backpacks, bags",
              "coats, jackets"
            ]
          }
        },
        "analyzer": {
          "search_time_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "graph_synonyms"
            ]
          }
        }
      }
    }
  }
}
And the mapping looks like this:
{
  "properties": {
    "category": {
      "type": "text",
      "search_analyzer": "search_time_analyzer",
      "fields": {
        "no_synonyms": {
          "type": "text"
        }
      }
    }
  }
}
If I search for gowns, I get proper results for both gowns and dresses.
But the problem is that if I search for red gowns (the system does not have any red gowns), the expected behavior is to search for red dresses and return those results. Instead, it returns results for gowns and dresses irrespective of the color.
I would like to configure the system so that it considers both the terms and their respective synonyms, if any, and then returns the results.
For reference, this is what my search query looks like:
"query":
{
"bool":
{
should:
[
{
"multi_match":
{
"boost": 300,
"query": term,
"type": "cross_fields",
"operator": "or",
"fields": ["bu.keyword^10", "bu^10", "category.keyword^8", "category^8", "category.no_synonyms^8", "brand.keyword^7", "brand^7", "colors.keyword^2", "colors^2", "size.keyword", "size", "hash.keyword^2", "hash^2", "name"]
}
}
]
}
}
Sample document:
_source: {
  productId: '12345',
  name: 'RUFFLE FLORAL TRIM COTTON MAXI DRESS',
  brand: [ 'self-portrait' ],
  mainImage: 'http://test.jpg',
  description: 'Self-portrait presents this maxi dress, crafted from cotton, to offer your off-duty ensembles an elegant update. Trimmed with ruffled broderie details, this piece is an effortless showcase of modern femininity.',
  status: 'active',
  bu: [ 'womenswear' ],
  category: [ 'dresses', 'gowns' ],
  tier1: [],
  tier2: [],
  colors: [ 'WHITE' ],
  size: [ '4', '6', '8', '10' ],
  hash: [
    'ballgown', 'cotton', 'effortless', 'elegant', 'floral', 'jar',
    'maxi', 'modern', 'off-duty', 'ruffle', 'ruffled', '1', '2', 'crafted'
  ],
  styleCode: '211274856'
}
How can I achieve the desired output? Any help would be appreciated. Thanks
You can configure an index-time analyzer instead of a search-time analyzer, like below:
{
  "properties": {
    "category": {
      "type": "text",
      "analyzer": "search_time_analyzer",
      "fields": {
        "no_synonyms": {
          "type": "text"
        }
      }
    }
  }
}
Once you are done with the index mapping change, reindex your data and try the query below.
Please note that I have changed the operator to and and the search analyzer to standard:
{
  "query": {
    "multi_match": {
      "boost": 300,
      "query": "gowns red",
      "analyzer": "standard",
      "type": "cross_fields",
      "operator": "and",
      "fields": [
        "category",
        "colors"
      ]
    }
  }
}
Why your current query is not working:
Indexing: your current mapping indexes data with the standard analyzer, so it does not index any synonym values for category.
Searching: your current query uses the or operator, so if you search for red gowns, the resulting query is red OR gowns OR dresses, which gives you results irrespective of the color. Also, if you change the operator to and with the existing configuration, you will get zero results, because the query becomes red AND gowns AND dresses.
Solution: once you make the changes suggested above, synonyms are indexed for the category field as well, and the query works with the and operator. If you then search for gowns red, the generated query is gowns AND red, and it matches because the category field contains both gowns and dresses thanks to the synonyms applied at index time.
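If you want to double-check that the synonym expansion is actually happening, a quick sanity check is to run the analyzer against a sample phrase with the _analyze API (the index name below is just a placeholder):
POST my-index/_analyze
{
  "analyzer": "search_time_analyzer",
  "text": "red gowns"
}
You should see dresses alongside red and gowns in the returned tokens, which is exactly what lets the and query match.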

Does Elasticsearch support geo queries like ST_DWithin in PostGIS

How can I perform a query in Elasticsearch that checks whether a geo_point is within a specified distance (or radius/buffer) of a line (defined by two pairs of lat/lon)?
This is not implemented in Elasticsearch as far as I know. But you can still achieve what you want by computing the buffer polygon offline and then using it in a geo_polygon query. The geo_shape query could also be used, but you need a geo_shape field instead of a geo_point one.
So, for instance, using turf you can precompute the polygon around the line using the buffer feature. Below, I'm defining a line along some road somewhere in San Jose (CA) and a buffer of 50 meters around that line/road:
const turf = require('@turf/turf');

// A line along some road in San Jose, with a 50-meter buffer around it
const line = turf.lineString([[-121.862282, 37.315430], [-121.851553, 37.305532]], { name: 'line 1' });
const bufferPoly = turf.buffer(line, 50, { units: 'meters' });
You'll get the following polygon (abbreviated):
{
  "type": "Feature",
  "properties": {
    "name": "line 1"
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [-121.85121372045873, 37.305765606399724],
        [-121.85116304254947, 37.30570833334188],
        [-121.85112738572346, 37.30564429665501],
        [-121.85110812025259, 37.30557595721911],
        ...
        [-121.85121372045873, 37.305765606399724]
      ]
    ]
  }
}
Then you can leverage the geo_polygon query like this:
GET /_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_polygon": {
          "your_geo_point": {
            "points": [
              [-121.85121372045873, 37.305765606399724],
              [-121.85116304254947, 37.30570833334188],
              [-121.85112738572346, 37.30564429665501],
              [-121.85110812025259, 37.30557595721911],
              ...
              [-121.85121372045873, 37.305765606399724]
            ]
          }
        }
      }
    }
  }
}
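As mentioned above, the geo_shape query is an alternative if the field is mapped as geo_shape rather than geo_point. A minimal sketch (the field name your_geo_shape is a placeholder, and the coordinates would be the buffer polygon computed by turf):
GET /_search
{
  "query": {
    "bool": {
      "filter": {
        "geo_shape": {
          "your_geo_shape": {
            "shape": {
              "type": "polygon",
              "coordinates": [ ... ]
            },
            "relation": "intersects"
          }
        }
      }
    }
  }
}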

Elasticsearch - check if array contains a value

I want to check whether a field of array type (long) includes certain values.
The only way I found is using a script: ElasticSearch Scripting: check if array contains a value.
But it is still not working for me:
Query:
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "script": {
          "script": "doc['Commodity'].values.contains(param1)",
          "params": {
            "param1": 50
          }
        }
      }
    }
  }
}
But I get 0 hits, even though I have records like this:
{
  "_index": "aaa",
  "_type": "logs",
  "_id": "2zzlXEOgRtujWiCGtX6s9Q",
  "_score": 1,
  "_source": {
    "Commodity": [
      50
    ],
    "Type": 1,
    "SourceId": "fsd",
    "Id": 123
  }
}
Try this instead of that script:
{
  "query": {
    "filtered": {
      "filter": {
        "terms": {
          "Commodity": [
            55,
            150
          ],
          "execution": "and"
        }
      }
    }
  }
}
For those of you using the latest version of Elasticsearch (7.1.1), please note that
"filtered" and "execution" are deprecated, so @Andrei Stefan's answer may not help anymore.
You can go through the discussion below for alternative approaches:
https://discuss.elastic.co/t/is-there-an-alternative-solution-to-terms-execution-and-on-es-2-x/41089
In the answer written by nik9000 in that discussion, I just replaced "term" with "terms" (in PHP) and it started working with array inputs, with AND applied across each of the "terms" keys I used.
EDIT: upon request, here is a sample query written in PHP.
'body' => [
    'query' => [
        'bool' => [
            'filter' => [
                ['terms' => ['key1' => $array1]],
                ['terms' => ['key2' => $array2]],
                ['terms' => ['key3' => $array3]],
                ['terms' => ['key4' => $array4]],
            ]
        ]
    ]
]
key1, key2, key3, and key4 are keys present in my Elasticsearch data, and they will be searched for in their respective arrays. An AND is applied between the ['terms' => ...] lines, as the JSON sketch below shows.
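For reference, the JSON that a request body like this compiles to would look roughly as follows (keys and values are hypothetical):
{
  "query": {
    "bool": {
      "filter": [
        { "terms": { "key1": ["value1", "value2"] } },
        { "terms": { "key2": ["value3"] } }
      ]
    }
  }
}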
For those of you who are using ES 6.x, this might help.
Here I am checking whether the user (rennish.joseph@gmail.com) has any orders by passing in an array of order IDs:
GET user-orders/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "orders": ["123456", "45678910"]
          }
        },
        {
          "term": {
            "user": "rennish.joseph@gmail.com"
          }
        }
      ]
    }
  }
}

ElasticSearch get all fields even if their value is null

I want to search Elasticsearch and retrieve specific fields from all records, no matter their value. But the response contains, for each record, only the fields whose value is not null. Is there a way to force Elasticsearch to return the exact same set of fields for all records?
Example Request:
{
  "fields": ["Field1", "Field2", "Field3"],
  "query": {
    "match_all": {}
  }
}
Example Response:
{
"hits": [
{
"fields": {
"Field1": [
"bla"
],
"Field2": [
"test"
]
}
},
{
"fields": {
"Field1": [
"bla"
],
"Field2": [
"test"
],
"Field3": [
"somevalue"
]
}
}
]
}
My goal is to get something for "Field3" in the first hit.
As per the guide given in the following link, any field whose value is null, [], or "" is not stored or indexed in the document. This is a consequence of the inverted index, and it has to be handled explicitly in your program.
Link - http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_dealing_with_null_values.html
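One index-side option described in that guide is the null_value mapping parameter, which replaces explicit null values with a placeholder at index time so they become searchable and appear in doc values. A minimal sketch using current mapping syntax (the index name and placeholder string are arbitrary; note this only applies to explicit null, not to missing fields or empty arrays):
PUT my-index
{
  "mappings": {
    "properties": {
      "Field3": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}
With that in place you could, for example, retrieve Field3 via docvalue_fields and get the "NULL" placeholder back, or simply normalize missing fields in your application code.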
