Unexpected Geo_shape query behaviour - elasticsearch

My documents have geo_shapes that associate them with an area. If I give ES (1.7) a geo_point, I want it to return the documents whose area contains that point.
I've recreated the problem with the following toy example:
# create the index
put location_test
put location_test/_mapping/place
{
"place": {
"properties": {
"message": {"type": "string"},
"coverage": {"type": "geo_shape"}
}
}
}
# check the mapping is correct
get location_test/place/_mapping
# location 1
put location_test/place/1
{
"message": "we will be in this box",
"coverage": {
"type" : "envelope",
"coordinates" : [[1, 0], [0, 1] ]
}
}
# location 2
put location_test/place/2
{
"message": "we will be outside this box",
"coverage": {
"type" : "envelope",
"coordinates" : [[2, 1], [1, 2] ]
}
}
# all documents returned - OK
get location_test/place/_search
{
"query": { "match_all": {}}
}
# should only get document 1, but get both.
get location_test/place/_search
{
"query": {
"geo_shape": {
"coverage": {
"shape": {
"type": "point"
"coordinates": [0.1,0.1]
}
}
}
}
}
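For reference, the expectation stated in the last comment can be checked by hand. The sketch below is plain Python and says nothing about how Elasticsearch indexes shapes; envelope_contains is a hypothetical helper that treats each envelope as the bounding box of its two corners, whatever order they are given in.

```python
def envelope_contains(envelope, point):
    """Return True if point (lon, lat) lies inside the box spanned by the two corners."""
    (x1, y1), (x2, y2) = envelope
    lon, lat = point
    inside_lon = min(x1, x2) <= lon <= max(x1, x2)
    inside_lat = min(y1, y2) <= lat <= max(y1, y2)
    return inside_lon and inside_lat


# The two envelopes and the query point from the toy example.
box_1 = [[1, 0], [0, 1]]
box_2 = [[2, 1], [1, 2]]
point = [0.1, 0.1]

in_box_1 = envelope_contains(box_1, point)
in_box_2 = envelope_contains(box_2, point)
```

Under that reading the point lies in the first box only, which is why a single hit (document 1) is the expected result.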

Besides the fact that you're missing a comma after "type": "point" in your last query, I do get a single hit when POSTing the query to the _search endpoint:
curl -XPOST localhost:9200/location_test/place/_search -d '{
"query": {
"geo_shape": {
"coverage": {
"shape": {
"type": "point",
"coordinates": [0.1,0.1]
}
}
}
}
}'
Results:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "location_test",
"_type" : "place",
"_id" : "1",
"_score" : 1.0,
"_source":{"message":"we will be in this box","coverage":{"type":"envelope","coordinates":[[1,0],[0,1]]}}
} ]
}
}
When sending a payload you should use POST instead of GET, as not all HTTP clients send a payload with GET requests.
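As a rough illustration of that advice, here is a minimal Python sketch that builds the same search as an explicit POST request. The helper name build_search_request is made up for this example, and the host and port are simply the ones used in the curl command above. The request is only constructed, not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request


def build_search_request(base_url, index, doc_type, body):
    """Build (but do not send) a POST request carrying a JSON search body."""
    url = f"{base_url}/{index}/{doc_type}/_search"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",  # explicit, so the body is never dropped by a GET
    )


query = {
    "query": {
        "geo_shape": {
            "coverage": {
                "shape": {"type": "point", "coordinates": [0.1, 0.1]}
            }
        }
    }
}

request = build_search_request("http://localhost:9200", "location_test", "place", query)
# urllib.request.urlopen(request) would actually send it.
```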

Related

Elastic Search shows "Unknown key for a START_OBJECT" exception

I am sending the following query to Elasticsearch to get the documents whose value lies between from and to:
{
"range" : {
"variables.value.long" : {
"from" : -1.0E19,
"to" : 9.1E18,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
}
Despite that, Elasticsearch throws the following error:
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "Unknown key for a START_OBJECT in [range].",
"line": 2,
"col": 13
}
],
"type": "parsing_exception",
"reason": "Unknown key for a START_OBJECT in [range].",
"line": 2,
"col": 13
},
"status": 400
}
Does anybody know what this error means and why I am getting it?
There is some lack of context here like your mappings or the full query you are running, but this is how a range query should look for your document.
Create index
PUT test_andromachiii
{
"mappings": {
"properties": {
"variables": {
"properties": {
"values": {
"properties": {
"long": {
"type": "double"
}
}
}
}
}
}
}
}
Index document
POST test_andromachiii/_doc
{
"variables": {
"values": {
"long": 9.1E18
}
}
}
Run Query
POST test_andromachiii/_search
{
"query": {
"range": {
"variables.values.long": {
"gte": -1.0E19,
"lte": 9.1E18,
"boost": 1
}
}
}
}
Note: lte means less than or equal to; gte means greater than or equal to.
Response
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "test_andromachiii",
"_type" : "_doc",
"_id" : "gtGj73cBbr4pOF0Is9my",
"_score" : 1.0,
"_source" : {
"variables" : {
"values" : {
"long" : 9.1E18
}
}
}
}
]
}
}
It looks like you're using version <0.90.4. If that's the case, simply wrap your range in a parent query object:
{
"query":{
"range":{
"variables.value.long":{
"from":-1.0E19,
"to":9.1E18,
"include_lower":true,
"include_upper":true,
"boost":1.0
}
}
}
}
If you're using any newer version than that, note that:
The from, to, include_lower and include_upper parameters have been deprecated in 0.90.4 in favour of gt, gte, lt, and lte.
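To make the mapping between the old and new parameters concrete, here is a small Python sketch. The function name modernize_range is hypothetical, and treating a missing include_lower or include_upper as true is an assumption of this sketch. It rewrites a legacy range clause and wraps it in the query key.

```python
def modernize_range(field, legacy):
    """Translate from/to/include_* into gt/gte/lt/lte and wrap the clause in "query"."""
    bounds = {}
    if "from" in legacy:
        # Assumption: an absent include_lower means the bound is inclusive.
        lower_key = "gte" if legacy.get("include_lower", True) else "gt"
        bounds[lower_key] = legacy["from"]
    if "to" in legacy:
        # Assumption: an absent include_upper means the bound is inclusive.
        upper_key = "lte" if legacy.get("include_upper", True) else "lt"
        bounds[upper_key] = legacy["to"]
    if "boost" in legacy:
        bounds["boost"] = legacy["boost"]
    return {"query": {"range": {field: bounds}}}


legacy_clause = {
    "from": -1.0e19,
    "to": 9.1e18,
    "include_lower": True,
    "include_upper": True,
    "boost": 1.0,
}
search_body = modernize_range("variables.value.long", legacy_clause)
```

The resulting search_body has query at the top level and range beneath it, with gte and lte in place of the older parameters.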
This error is saying (somewhat cryptically) that you have a key range with an Object value, in a place where that key isn't recognised.
The specific cause here is that your range needs to sit under a higher-level key, e.g. query or a bool query, rather than at the top level of the request body.
Credit: https://discuss.elastic.co/t/unknown-key-for-a-start-object-in-should/140008/3

Named rescore queries

Named queries help me to identify which part of the query hit.
For normal queries this works perfectly.
However for rescore queries named queries don't show up in the response.
Question: Is this a bug or intentional? Is there a workaround?
Update: I raised a feature request
I attached some code to reproduce the problem:
Set up a test index with a single document
PUT /test
{
"mappings": {
"properties": {
"field1": {
"type": "keyword"
},
"field2": {
"type": "keyword"
}
}
}
}
POST /test/_doc/1
{
"field1": "a",
"field2": "b"
}
"Normal" and rescore query.
GET /test/_search
{
"query": {
"term": {
"field1": {
"value": "a",
"_name": "query_field_1"
}
}
},
"rescore": {
"query": {
"rescore_query": {
"term": {
"field2": {
"value": "b",
"_name": "query_field_2"
}
}
}
},
"window_size": 50
}
}
Response: Only the name of the "normal" query shows up in matched_queries.
That "query_field_2" also matched can be verified by comparing the score with and without the rescore query.
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.5753642,
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"field1" : "a",
"field2" : "b"
},
"matched_queries" : [
"query_field_1" <<-----HERE I'D EXPECT query_field_2------
]
}
]
}
}
The naming is perhaps unfortunate but the rescore query just tweaks the scoring and is applied after the query and post_filter phases. So since it's not an actual query, it cannot be _named.
It's certainly worth a feature request though.
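Until something like that exists, the score comparison mentioned in the question is the only check available. Below is a rough Python sketch of how the two request bodies could be prepared. The helpers without_rescore and rescore_changed_score are hypothetical names; sending the requests and reading _score from the responses is left to whichever client you use.

```python
import copy


def without_rescore(body):
    """Return a copy of the search body with the rescore section removed."""
    stripped = copy.deepcopy(body)
    stripped.pop("rescore", None)
    return stripped


def rescore_changed_score(score_with, score_without, tolerance=1e-6):
    """True if the rescore phase moved the score, i.e. the rescore query matched."""
    return abs(score_with - score_without) > tolerance


search_body = {
    "query": {"term": {"field1": {"value": "a", "_name": "query_field_1"}}},
    "rescore": {
        "query": {
            "rescore_query": {
                "term": {"field2": {"value": "b", "_name": "query_field_2"}}
            }
        },
        "window_size": 50,
    },
}

# Run both bodies against the same index and compare the _score of each hit.
baseline_body = without_rescore(search_body)
```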

access query value from function_score to compute new score

I need to customize ES score. The score function I need to implement is:
score = len(document_term) - len(query_term)
For instance, one of my documents in the ES index is:
{
"name": "foobar"
}
And the search query
{
"query": {
"function_score": {
"query": {
"match": {
"name": {
"query": "foo"
}
}
},
"functions": [
{
"script_score": {
"script": {
"source": "doc['name'].value.length() - ?LEN(query_term)?"
}
}
}
],
"boost_mode": "replace"
}
}
}
The above search should produce a score of 6 - 3 = 3, but I couldn't find a way to access the value of the query term.
Is it possible to access the value of the query term in a function_score context?
There is no direct way to do this. However, you can achieve it as shown below, by supplying the query string in two different parts of the query.
Before that, one important note: you cannot use doc['myfield'].value if the field is of type text. Instead, you need a sibling field of type keyword and must refer to that in the script, as I've done below:
Mapping:
PUT myindex
{
"mappings" : {
"properties" : {
"myfield" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
Sample Document:
POST myindex/_doc/1
{
"myfield": "I've become comfortably numb"
}
Query:
POST <your_index_name>/_search
{
"query": {
"function_score": {
"query": {
"match": {
"myfield": "numb"
}
},
"functions": [
{
"script_score": {
"script": {
"source": "return doc['myfield.keyword'].value.length() - params.myquery.length()",
"params": {
"myquery": "numb" <---- Add the query string here as well
}
}
}
}
],
"boost_mode": "replace"
}
}
}
Response:
{
"took" : 558,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 24.0,
"hits" : [
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "1",
"_score" : 24.0,
"_source" : {
"myfield" : "I've become comfortably numb"
}
}
]
}
}
Hope this helps!
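If the request body is built in client code, the duplication can at least be kept in one place. The following Python sketch assumes the field names from the example above; length_difference_query and expected_score are hypothetical helper names. It builds the body from a single term and also computes the score you would expect for a given stored value.

```python
def length_difference_query(field, keyword_field, term):
    """Build a function_score body that uses the same term in the match and in params."""
    script = {
        "source": f"return doc['{keyword_field}'].value.length() - params.myquery.length()",
        "params": {"myquery": term},
    }
    return {
        "query": {
            "function_score": {
                "query": {"match": {field: term}},
                "functions": [{"script_score": {"script": script}}],
                "boost_mode": "replace",
            }
        }
    }


def expected_score(stored_value, term):
    """Length of the stored keyword value minus length of the query term."""
    return len(stored_value) - len(term)


body = length_difference_query("myfield", "myfield.keyword", "numb")
score = expected_score("I've become comfortably numb", "numb")
```

For the sample document the stored value has 28 characters and the term has 4, so score is 24, which matches the _score of 24.0 in the response above.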

elastic search 5 - how to query Object datatype and nested array of json

I want to query nested data already loaded into Elasticsearch 5, but every query returns nothing. The data consists of an object datatype and a nested array of JSON objects.
This is the nested datatype, i.e. the team_members array of JSON objects:
[{
"id": 6,
"name": "mike",
"priority": 1
}, {
"id": 7,
"name": "james",
"priority": 2
}]
This is the object datatype, i.e. the availability_slot JSON:
{
"monday": {
"on": true,
"end_time": "15",
"start_time": "9",
"end_time_unit": "pm",
"start_time_unit": "am",
"events_starts_every": 10
}
}
This is my elasticsearch mapping:
{
"meetings_development_20170716013030509": {
"mappings": {
"meeting": {
"properties": {
"account": {"type": "integer"},
"availability_slot": {
"properties": {
"monday": {
"properties": {
"end_time": {"type": "text"},
"end_time_unit": {"type": "text"},
"events_starts_every": {
"type":"integer"
},
"on": {"type": "boolean"},
"start_time": {"type": "text"},
"start_time_unit": {
"type": "text"
}
}
}
}
},
"team_members": {
"type": "nested",
"properties": {
"id": {"type": "integer"},
"name": {"type": "text"},
"priority": {"type": "integer"}
}
}
}
}
}
}
}
I have two queries which are failing for different reasons:
query 1
This query returns a count of zero even though the records exist in Elasticsearch. I discovered that the query fails because of the filter:
curl -u elastic:changeme http://172.19.0.4:9200/meetings_development/_search?pretty -d '{"query":{"nested":{"path":"team_members","score_mode":"avg","query":{"bool":{"must":[{"match":{"team_members.name":"mike"}},{"match":{"team_members.priority":1}}],"filter":[{"match":{"account":1}}]}}}}}'
This returns zero result:
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
query 1 without filter
The same query from above without the filter works:
curl -u elastic:changeme http://172.19.0.4:9200/meetings_development/_search?pretty -d '{"query":{"nested":{"path":"team_members","score_mode":"avg","query":{"bool":{"must":[{"match":{"team_members.name":"mike"}},{"match":{"team_members.priority":1}}]}}}}}'
The query above returns 3 hits:
{
"took" : 312,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 2.1451323,
"hits" : [{**results available here**} ]
}
}
query 2 for the object datatype
curl -u elastic:changeme http://172.19.0.4:9200/meetings_development/_search?pretty -d '{"query":{"bool":{"must":{"match":{"availability_slot.start_time":1}},"filter":[{"match":{"account":1}}]}}}'
The query returns a hit of zero but the data is in elasticsearch:
{
"took" : 172,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
How do I get both queries to work while filtering by account? Thanks
The Elasticsearch guide was very helpful in coming up with the correct queries shown below:
query 1 for the nested array of json
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "sales call"
}
},
{
"nested": {
"path": "team_members",
"score_mode": "avg",
"query": {
"bool": {
"must": {
"match": {"team_members.name": "mike"}
}
}
}
}
}
],
"filter": {
"term": {
"account": 1
}
}
}
}
}
Just pass the query to elastic search like this:
curl http://172.19.0.4:9200/meetings_development/_search?pretty -d '{"query":{"bool":{"must":[{"match":{"name":"sales call"}},{"nested":{"path":"team_members","score_mode":"avg","query":{"bool":{"must":{"match":{"team_members.name":"mike"}}}}}}],"filter":{"term":{"account":1}}}}}'
correct syntax for query 2 for the object datatype, i.e. json
{
"query": {
"bool": {
"must": {
"match": {"availability_slot.monday.start_time": "9"}
},
"filter": [{
"match": {"account": 1}
}]
}
}
}
You then pass this to Elasticsearch like this:
curl http://172.19.0.4:9200/meetings_development/_search?pretty -d '{"query":{"bool":{"must":{"match":{"availability_slot.monday.start_time":"9"}},"filter":[{"match":{"account":1}}]}}}'
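Comparing the failing and working requests, two things changed: the account filter moved out of the nested query into the outer bool, and the object field is addressed by its full path including monday. Here is a hedged Python sketch of both shapes. The function names nested_member_search and slot_search are made up for illustration, and the extra name match from the answer is left out for brevity.

```python
def nested_member_search(member_name, account_id):
    """Nested match on team_members, with the account filter on the outer bool."""
    nested_clause = {
        "nested": {
            "path": "team_members",
            "score_mode": "avg",
            # Only team_members.* fields are referenced inside the nested query.
            "query": {"match": {"team_members.name": member_name}},
        }
    }
    return {
        "query": {
            "bool": {
                "must": [nested_clause],
                # account is a top-level field, so it is filtered out here.
                "filter": [{"term": {"account": account_id}}],
            }
        }
    }


def slot_search(day, start_time, account_id):
    """Match an object sub-field by its full dotted path, filtered by account."""
    field = f"availability_slot.{day}.start_time"
    return {
        "query": {
            "bool": {
                "must": {"match": {field: start_time}},
                "filter": [{"match": {"account": account_id}}],
            }
        }
    }


members_body = nested_member_search("mike", 1)
slot_body = slot_search("monday", "9", 1)
```

Either dictionary can be serialised with json.dumps and passed to curl or a client library as the request body.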

How can I get element at a particular index in elasticsearch?

I have stored three JSON objects in Elasticsearch; each object has a name and a projects array.
{"name": "haris","projects": [{"title": "Splunk"},{"title": "QRadar"},{"title": "LogAnalysis"}]}
{"name": "khalid","projects": [{"title": "MS"},{"title": "Google"},{"title": "Apple"}]}
{"name": "Hamid","projects": [{"title": "Toyota"},{"title": "Honda"},{"title": "Kia"}]}
I have written a query to extract a particular object by _id and its specific property projects
curl -XGET 'localhost:9200/jsontest/_search?pretty' -d '{"query" : { "match" : {"_id":"AV1kzzZqAzHWQ2S7B8f1"} }, "_source": ["projects"]}'
As expected it returns projects object
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [
{
"_index" : "jsontest",
"_type" : "json",
"_id" : "AV1kzzZqAzHWQ2S7B8f1",
"_score" : 1.0,
"_source" : {
"projects" : [{"title" : "Splunk"},{"title" : "QRadar"},{"title" : "LogAnalysis"}
]
}
}
]
}
}
Question: is there a way to retrieve value at a particular index of projects? This is dummy data, in my real scenario projects can have a large number of elements and each element itself is a json object with a lot of properties. I only need to retrieve value at certain index of projects.
Here is what I would do.
First the mapping
PUT test/my_objects/_mapping
{
"properties": {
"name":{
"type": "string",
"index": "not_analyzed"
},
"projects": {
"type": "nested"
}
}
}
Second, the projects are indexed
PUT test/my_objects/1111
{
"name": "haris",
"projects": [
{"title": "Splunk"},
{"title": "QRadar"},
{"title": "LogAnalysis"}
]
}
Finally the aggregation query
GET test/my_objects/_search
{
"aggs": {
"by_name": {
"terms": {
"field": "name"
},
"aggs": {
"by_project": {
"nested": {
"path": "projects"
},
"aggs": {
"by_title": {
"terms": {
"field": "projects.title"
}
}
}
}
}
}
}
}
It's not tested and a bit tedious because of the nested aggs, but it should work if you adapt it further to your requirements.
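A different route, not part of the answer above, is to pick the element on the client once the search response has arrived. This still transfers the whole projects array, so treat it as a fallback sketch only. The helper project_at is a hypothetical name, and the sample response is trimmed from the one shown in the question.

```python
def project_at(response, position, hit=0):
    """Return projects[position] from the chosen hit, or None if it does not exist."""
    hits = response["hits"]["hits"]
    if not 0 <= hit < len(hits):
        return None
    projects = hits[hit]["_source"].get("projects", [])
    if 0 <= position < len(projects):
        return projects[position]
    return None


# Trimmed copy of the search response from the question.
sample_response = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_index": "jsontest",
                "_type": "json",
                "_id": "AV1kzzZqAzHWQ2S7B8f1",
                "_source": {
                    "projects": [
                        {"title": "Splunk"},
                        {"title": "QRadar"},
                        {"title": "LogAnalysis"},
                    ]
                },
            }
        ],
    }
}

second_project = project_at(sample_response, 1)
missing_project = project_at(sample_response, 5)
```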
