Prioritise match results by geo_point and ordering by closest location in Elasticsearch - elasticsearch

I have a GET request that matches a query string. It searches address strings and currently returns solid, relevant results from an index.
Now I'd like to prioritise results by distance, so that the relevant strings are returned first and ordered by the closest geo_point.
Putting the sort at the same level, right after the query parameter, does not return hits sorted by distance. It returns odd results, which is definitely not what I want.
This is the mapping I use:
{
"location": {
"properties": {
"address": {
"type": "string"
},
"gps": {
"type": "geo_point"
}
}
}
}
The request I am doing now is:
GET /locations/location/_search
{
"query": {
"match" : {
"address" : {
"query": "Churchill Av"
}
}
},
"sort": [
{
"_geo_distance": {
"gps": {
"lat": 51.358599,
"lon": 0.531964
},
"order": "asc",
"unit": "km",
"distance_type": "plane"
}
}
]
}
I know that the best way would be getting the results by match first and then sorting only those few results by distance, so the geo-distance calculation is not too expensive.
I tried this question, but it didn't help.
EDIT: I need to mention that I store geo_point data in my index like this:
"_source": {
"address" : "Abcdef, ghijk, lmnoprst",
"gps": [
51.50,
1.25
]
}
QUESTION: How do I set up the geo distance sorting / filter so that results are sorted after the match query, ordered by the closest geo_point?

EDIT:
I realised that my geo_point data is stored as an array, and array values are read as [Lon, Lat], not [Lat, Lon] as I expected. That is why the sort did not work with the _geo_distance object notation:
{
"lat" : Lat,
"lon" : Long
}
From the elastic.co docs:
Please note that string geo-points are ordered as lat,lon, while array
geo-points are ordered as the reverse: lon,lat.
So, with my data stored the way it is, the sort notation that works is
"_geo_distance": {
"gps": [51.358599,0.531964]
}
OR
"_geo_distance": {
"gps": {
"lat": 0.531964,
"lon": 51.358599
}
}
... because Elasticsearch reads my stored arrays as [LON, LAT] and not [LAT, LON], as I had thought.
Now it works as expected. The remaining problem is that I should reindex the data with latitude and longitude swapped within the geo_point array.
I hope this remark could help someone else.
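To make the ordering rule concrete, here is a minimal Python sketch. The helper names to_geo_array and geo_distance_sort are made up for illustration, and it assumes the data has been reindexed so that latitude and longitude are stored correctly. It only builds the request body; it does not talk to a cluster.

```python
import json

def to_geo_array(lat, lon):
    """Return the array form of a geo_point, which is ordered [lon, lat]."""
    return [lon, lat]

def geo_distance_sort(field, lat, lon, unit="km"):
    """Build a _geo_distance sort clause using the explicit lat/lon object form."""
    return {
        "_geo_distance": {
            field: {"lat": lat, "lon": lon},
            "order": "asc",
            "unit": unit,
        }
    }

# Request body: match on the address, then sort the hits by distance.
body = {
    "query": {"match": {"address": {"query": "Churchill Av"}}},
    "sort": [geo_distance_sort("gps", 51.358599, 0.531964)],
}

print(to_geo_array(51.50, 1.25))  # the array to store for lat=51.50, lon=1.25
print(json.dumps(body, indent=2))
```

Using the object form with named lat and lon keys in both the documents and the sort avoids the array-order trap entirely.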

Related

How to rank ElasticSearch documents based on scores

I have an Elasticsearch index that contains thousands of documents; each document represents a user.
Each document has a set of fields (is_verified: boolean, country: string, is_creator: boolean). I also have another service that calls ES search to look up documents. How can I rank the retrieved documents based on those fields? For example, a verified user that matches should come before an unverified one.
Is there some kind of document scoring while indexing the documents? If yes, can I modify it based on my criteria?
What should I read to understand how ranking works in Elasticsearch?
Thanks.
I guess the sort function mentioned by Mikael is pretty straightforward and should cover your use cases. Check the Elastic docs for more information on that.
But in case you want to do really fancy sorting, you could use a bool query with different boost values to set the desired relevancy for each matched field. I tried to come up with a real-life example, but honestly didn't find one. For the sake of completeness, the following snippet should give you an idea of how to achieve results similar to the sort API (but still, I would prefer using sort).
GET /yourindexname/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "Monica"
}
}
],
"should": [
{
"term": {
"is_verified": {
"value": true,
"boost": 2
}
}
},
{
"term": {
"is_creator": {
"value": true,
"boost": 2
}
}
}
]
}
}
}
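If the set of boosted flags changes often, the same request body can be assembled programmatically. This is only a sketch: boosted_bool_query is a hypothetical helper, and the field names and boost values are the ones from the snippet above.

```python
import json

def boosted_bool_query(field, text, boosts):
    """Build a bool query: one required match plus optional boosted term clauses."""
    should = [
        {"term": {name: {"value": True, "boost": boost}}}
        for name, boost in boosts.items()
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                "should": should,
            }
        }
    }

body = boosted_bool_query("name", "Monica", {"is_verified": 2, "is_creator": 2})
print(json.dumps(body, indent=2))
```

The should clauses never exclude a document; they only add to the score of documents that already satisfy the must clause.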
is there some kind of document scoring while indexing the documents ? if yes can i modify it based on my criteria ?
I wouldn't assign a fixed score to a document while indexing, as the score should depend on the query. However, if you insist on having a predefined relevancy for each document, you could in theory add a relevancy field holding that value and use it later in the query for ordering:
GET /yourindexname/_search
{
"query" : {
"match" : {
"name": "Monica"
}
},
"sort" : [
{
"relevancy": {
"order": "desc"
}
},
"_score"
]
}
You can consider using the sort API inside your search queries. In the example below we search on the country field and sort the results by the boolean field is_verified. You can also add the other boolean field inside the sort brackets.
GET /yourindexname/_search
{
"query" : {
"match" : {
"country": "Iceland"
}
},
"sort" : [
{
"is_verified": {
"order": "desc"
}
}
]
}
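To see what ordering such a sort produces, here is a small client-side emulation in Python. The hits and their _score values are invented, and it assumes _score is added as a second sort key after is_verified, which the query above does not do by itself.

```python
# Hypothetical hits; names and scores are made up for illustration.
hits = [
    {"name": "A", "is_verified": False, "_score": 2.1},
    {"name": "B", "is_verified": True, "_score": 1.3},
    {"name": "C", "is_verified": True, "_score": 1.9},
]

def rank(hits):
    """Emulate sorting by is_verified desc, then _score desc."""
    # True compares greater than False, so reverse=True puts verified users first.
    return sorted(hits, key=lambda h: (h["is_verified"], h["_score"]), reverse=True)

ordered = [h["name"] for h in rank(hits)]
print(ordered)  # ['C', 'B', 'A']
```

The unverified hit ends up last even though it has the highest score, which is the behaviour the question asks for.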

Elasticsearch ordering by field value which is not in the filter

Can somebody please help me write a query that orders the result items by the value of a field that is not part of the query in the request? I have this query:
{
"_source": [
"ico",
"name",
"city",
"status"
],
"sort": {
"_score": "desc",
"status": "asc"
},
"size": 20,
"query": {
"bool": {
"should": [
{
"match": {
"normalized": {
"query": "idona",
"analyzer": "standard",
"boost": 3
}
}
},
{
"term": {
"normalized2": {
"value": "idona",
"boost": 2
}
}
},
{
"match": {
"normalized": "idona"
}
}
]
}
}
}
The result is sorted alphabetically ascending by the status field. Status contains a few values like [active, canceled, old....] and I need something like a boost for every possible value in the query, e.g. active boost 5, canceled boost 4, old boost 3 and so on. Is it possible to do this? Thanks.
You would need a custom sort using a script to achieve what you want.
I've just used a generic match_all query here; you can add your own query logic there. The solution you are looking for is in the sort section of the query below.
Make sure that status is of type keyword.
Custom Sorting Based on Values
POST <your_index_name>/_search
{
"query":{
"match_all":{
}
},
"sort":[
{ "_score": "desc" },
{
"_script":{
"type":"number",
"script":{
"lang":"painless",
"inline":"if(params.scores.containsKey(doc['status'].value)) { return params.scores[doc['status'].value];} return 100000;",
"params":{
"scores":{
"active":5,
"old":4,
"cancelled":3
}
}
},
"order":"desc"
}
}
]
}
In the above query, add your values to the scores section. For example, if you have a status new and want to give it the value 6, your scores would look like this:
{
"scores":{
"active":5,
"old":4,
"cancelled":3,
"new":6
}
}
So basically the documents are first sorted by _score, and the script sort is then applied among documents with the same _score.
Note that the script sort uses desc order, as I understand that you want active documents at the top, followed by the other values. Feel free to play around with it.
Hope this helps!
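To check the ordering before running it against a cluster, the script's logic can be emulated in plain Python. This is only a sketch: the documents and their _score values are invented, and status_key is a hypothetical stand-in for the painless script.

```python
SCORES = {"active": 5, "old": 4, "cancelled": 3}
DEFAULT = 100000  # same fallback as the script: unknown statuses get the largest key

def status_key(status, scores=SCORES, default=DEFAULT):
    """Mirror the script: look up the status, fall back to the default."""
    return scores.get(status, default)

# Made-up documents; three of them share the same _score.
docs = [
    {"name": "a", "_score": 1.0, "status": "old"},
    {"name": "b", "_score": 1.0, "status": "active"},
    {"name": "c", "_score": 2.0, "status": "cancelled"},
    {"name": "d", "_score": 1.0, "status": "pending"},
]

# _score desc first, then the script value desc, as in the sort array.
ordered = sorted(docs, key=lambda d: (d["_score"], status_key(d["status"])), reverse=True)
print([d["name"] for d in ordered])  # ['c', 'd', 'b', 'a']
```

Note what the fallback does: with desc order, a status missing from scores (here "pending") sorts above "active" among documents with the same _score. If unknown statuses should go to the bottom instead, return a small value such as 0.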

Exact match search on text field

I'm using Elasticsearch to search data. My data contains a text field, and when I run a match query on the input, it returns the input together with another string.
_mapping
"direction": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
Elastic Data
[
{
direction: "North"
},
{
direction: "North East"
}
]
Query
{
"match": {
"direction": {
"query": "North",
"operator": "and"
}
}
}
Result
[
{
direction: "North"
},
{
direction: "North East"
}
]
Expected Result
[
{
direction: "North"
}
]
Note: it should return only the exact-match direction.
You may want to look at term queries, which are used on the keyword datatype to perform exact-match searches.
POST <your_index_name>/_search
{
"query": {
"term": {
"direction.keyword": {
"value": "North"
}
}
}
}
The reason you observe this is that you are querying a text field using a match query. The values of a text field are broken down into tokens, which are then stored in an inverted index. This process is called analysis. Text fields are not meant to be used for exact matching.
Also note that whatever words/tokens you put in a match query go through the analysis phase as well before the query is executed.
Hope it helps!
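The difference can be illustrated with a toy model in Python. The analyze function below is a very rough stand-in for the standard analyzer (it only lowercases and splits on whitespace), and all function names are made up; it is not how Elasticsearch is implemented.

```python
def analyze(text):
    """Very rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()

def match_and(doc_value, query):
    """match with operator "and": every query token must be among the document's tokens."""
    return set(analyze(query)) <= set(analyze(doc_value))

def term_on_keyword(doc_value, query):
    """term on the keyword sub-field: the whole, untouched string must be equal."""
    return doc_value == query

docs = ["North", "North East"]

print([d for d in docs if match_and(d, "North")])        # ['North', 'North East']
print([d for d in docs if term_on_keyword(d, "North")])  # ['North']
```

"North East" contains the token north, so the match query returns it even with operator and; only the comparison against the unanalyzed value excludes it.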
Based on your mapping, you should not search on the field direction but on direction.keyword if you want an exact match. The field direction is of type text and gets analyzed, in your case into the words north and east.
Try this
{ "query" : { "bool" : { "must": { "term": { "direction.keyword": "North" } } } } }

Elasticsearch: get all documents where array contains one of many values

I have the following document data structure in Elasticsearch:
{
"topics": [ "a", "b", "c", "d" ]
}
I have a selection list where the user can filter which topics to show. When the user is OK with their filter, they will be presented with all documents that have any of the topics they selected in the array "topics"
I've tried the query
{
"query": {
"terms": {
"topics": ["a", "b"]
}
}
}
but this returns no results.
To expand on the query. For example, the list ["a", "b"] would match the first, second and third objects in the array below.
Is there a good way to do this in Elasticsearch? Obviously I could do multiple "match" queries but that's verbose as I have hundreds of topics
Edit: my mapping
{
"fb-cambodia-post": {
"mappings": {
"scrapedpost": {
"properties": {
"topics": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
}
}
}
}
}
As @Filip Cordas mentioned, you can use topics.keyword, like this:
{
"query": {
"terms": {
"topics.keyword": [
"a" , "b"
]
}
}
}
This does a case-sensitive search and looks for an exact match. If you want a case-insensitive search, you can use query_string like:
{
"query": {
"query_string": {
"default_field": "topics",
"query": "A OR B"
}
}
}
I will give some more info on the problem. The query with the data you added ("a", "b", "c") will work, but it won't if the topics contain uppercase letters or multiple words. This is due to the analyzer applied to the topics field. When you add a string value to Elasticsearch, it uses the standard analyzer by default. The terms query only compares raw terms exactly as they are given. So if you have something like "Topic1" in the document and you search "terms":["Topic1"], it won't return anything, because the standard analyzer lowercases the term; the query that returns the value is "terms":["topic1"]. As of 5.0, Elastic added the default "keyword" subfield, which stores the value as is, with no transformation applied. A terms query on that field, "topics.keyword":["Topic1"], will get you the values, but "topics.keyword":["topic1"] won't. The match query, in contrast, applies the analyzer to the input string as well, and so you get the right result.
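The four cases above can be replayed with a toy lookup in Python. Everything here is a simplification under stated assumptions: index_text only lowercases single-word values, index_keyword stores them untouched, and terms_query does an exact any-of lookup. The names are hypothetical.

```python
def index_text(values):
    """Simplified text field: single-word values are stored lowercased."""
    return {v.lower() for v in values}

def index_keyword(values):
    """Simplified keyword sub-field: values are stored exactly as given."""
    return set(values)

def terms_query(indexed, wanted):
    """terms query: true if any of the wanted raw terms is in the index."""
    return any(term in indexed for term in wanted)

topics = ["Topic1", "b"]
text_terms = index_text(topics)
keyword_terms = index_keyword(topics)

print(terms_query(text_terms, ["Topic1"]))     # False: index holds "topic1"
print(terms_query(text_terms, ["topic1"]))     # True
print(terms_query(keyword_terms, ["Topic1"]))  # True
print(terms_query(keyword_terms, ["topic1"]))  # False: keyword is case-sensitive
```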

ElasticSearch - sort search results by relevance and custom field (Date)

For example, I have entities with two fields, Text and Date. I want to search these entities and get the results sorted by Date. But if I simply sort by Date, the result is unexpected.
For the search query "Iphone 6", the newest texts that contain only "6" are at the top of the results, not those with "iphone 6". Without sorting, the results look good, but they are not ordered by Date as I want.
How do I write a custom sort function that considers both relevance and Date? Or is there a way to give the Date field a weight that is taken into account in scoring?
In addition, I may want to suppress results that match only "6". How can I customize the search to match only on bigrams, for example?
Did you try a bool query like this?
{
"query": {
"bool": {
"must": {
"match": {
"field": "iphone 6"
}
}
}
},
"sort": {
"date": {
"order": "desc"
}
}
}
Or, with your own query, you can do the following, which I guess is the more appropriate way.
Just add this as the sort:
"sort": [
{ "date": { "order": "desc" }},
{ "_score": { "order": "desc" }}
]
All matching results are sorted first by date, then by relevance.
The solution is to use both _score and the date field in the sort: _score as the first sort key and the date field as the secondary sort key.
You can use a simple match query to perform the relevance match.
Try it out.
Data setup:
POST ecom/prod
{
"name":"iphone 6",
"date":"2019-02-10"
}
POST ecom/prod
{
"name":"iphone 5",
"date":"2019-01-10"
}
POST ecom/prod
{
"name":"iphone 6",
"date":"2019-02-28"
}
POST ecom/prod
{
"name":"6",
"date":"2019-03-01"
}
Query for relevance and date based sorting:
POST ecom/prod/_search
{
"query": {
"match": {
"name": "iphone 6"
}
},
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"date": {
"order": "desc"
}
}
]
}
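Here is a client-side emulation of that sort in Python, using the four sample documents from the data setup. The _score values are invented for illustration; real scores depend on the index and will differ.

```python
from datetime import date

# Sample documents from the data setup; the _score values are hypothetical.
hits = [
    {"name": "iphone 6", "date": date(2019, 2, 10), "_score": 1.2},
    {"name": "iphone 5", "date": date(2019, 1, 10), "_score": 0.4},
    {"name": "iphone 6", "date": date(2019, 2, 28), "_score": 1.2},
    {"name": "6", "date": date(2019, 3, 1), "_score": 0.7},
]

# _score desc is the primary key, date desc breaks ties between equal scores.
ordered = sorted(hits, key=lambda h: (h["_score"], h["date"]), reverse=True)

for h in ordered:
    print(h["name"], h["date"].isoformat(), h["_score"])
```

The date only decides the order among hits with exactly the same score, so the newest document ("6") does not jump ahead of the better matches.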
You could definitely use a phrase matching query for this.
It does position-aware matching, so documents are considered a match for your query only if both "iphone" and "6" occur in the searched field AND their occurrences respect this order: "iphone" shows up before "6".
It looks like you want to sort first by relevance and then by date. The query below sorts the matches by date; add _score ahead of the date in the sort if relevance should come first.
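A phrase query is written with match_phrase. The Python sketch below builds such a request body for the name field used in the sample data and adds a naive check that imitates the position-aware behaviour. phrase_query and contains_phrase are hypothetical helpers, and the check ignores everything a real analyzer does beyond lowercasing and whitespace splitting.

```python
import json

def phrase_query(field, phrase):
    """Build a match_phrase request body for the given field."""
    return {"query": {"match_phrase": {field: phrase}}}

def contains_phrase(text, phrase):
    """Naive position-aware check: phrase tokens must appear consecutively, in order."""
    tokens = text.lower().split()
    wanted = phrase.lower().split()
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )

print(json.dumps(phrase_query("name", "iphone 6")))

for name in ["iphone 6", "6", "6 iphone", "new iphone 6 case"]:
    print(name, "->", contains_phrase(name, "iphone 6"))
```

Documents that contain only "6", or both words in the wrong order, are rejected, which addresses the wish to suppress results that match on "6" alone.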
{ "query" : {
"match" : {
"my_field" : "my query"
}
},
"sort": {
"pubDate": {
"order": "desc",
"mode": "min"
}
}
}
When sorting on fields with more than one value, remember that the
values do not have any intrinsic order; a multivalue field is just a
bag of values. Which one do you choose to sort on? For numbers and
dates, you can reduce a multivalue field to a single value by using
the min, max, avg, or sum sort modes. For instance, you could sort on
the earliest date in each dates field by using the above query.
elasticsearch guide sorting
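The effect of the sort mode on a multi-valued field can be sketched in Python. The prices field and the documents are invented for illustration; the point is only how min, max, avg and sum each reduce a list to the single value that is then compared.

```python
# Each mode reduces a multi-valued field to one sort value.
MODES = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
}

def sort_docs(docs, field, mode, reverse=False):
    """Sort documents by a multi-valued field reduced with the given mode."""
    return sorted(docs, key=lambda d: MODES[mode](d[field]), reverse=reverse)

# Hypothetical documents with a multi-valued numeric field.
docs = [
    {"id": 1, "prices": [30, 5]},
    {"id": 2, "prices": [10, 12]},
    {"id": 3, "prices": [20]},
]

for mode in ("min", "max", "avg"):
    print(mode, [d["id"] for d in sort_docs(docs, "prices", mode)])
```

The same documents come out in a different order for each mode, so the mode has to be chosen to match what "earliest" or "cheapest" means for the data.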
I think your relevance is broken. You should use two different analyzers, one for indexing and another for searching, like this:
PUT /my_index/my_type/_mapping
{
"my_type": {
"properties": {
"name": {
"type": "string",
"analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
You can also read more about this here: https://www.elastic.co/guide/en/elasticsearch/guide/master/_index_time_search_as_you_type.html
Once you fix the relevance, sorting should work correctly.
