Elasticsearch: boost nearby locations, and boost documents with no location

I have a location field of type geo_point.
I want to implement the following conditions:
If a document has a location, boost it more the closer it is.
If a document has no location, boost it by 5.
Ultimately, I want documents with a location to appear in order of distance, while documents without a location get a boost so that they are not pushed to the end of the results.
Below is my query. It already returns the nearest documents by location, but I don't know how to boost documents that have no location.
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"should": {
"distance_feature": {
"field": "location",
"pivot": "1000m",
"boost": 8,
"origin": {
"lat": 33.489009,
"lon": 133.022831
}
}
},
"filter": [
{
"terms" : {
"state": ["AVAILABLE"]
}
}
]
}
}
}

You could try to do it like this:
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"minimum_should_match": 1,
"should": [
{
"distance_feature": {
"field": "location",
"pivot": "1000m",
"boost": 8,
"origin": {
"lat": 33.489009,
"lon": 133.022831
}
}
},
{
"constant_score": {
"filter": {
"bool": {
"must_not": {
"exists": {
"field": "location"
}
}
}
},
"boost": 5
}
}
],
"filter": [
{
"terms": {
"state": [
"AVAILABLE"
]
}
}
]
}
}
}
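To see how the two should clauses interact, here is a rough Python sketch of the scoring arithmetic. It uses the formula documented for distance_feature, score = boost * pivot / (pivot + distance), and assumes that a document without a location receives a flat score of 5 from the second clause. The function names and sample distances are illustrative only, and the constant contribution of match_all is left out.

```python
def distance_feature_score(distance_m, pivot_m=1000.0, boost=8.0):
    """Documented distance_feature formula: boost * pivot / (pivot + distance)."""
    return boost * pivot_m / (pivot_m + distance_m)


def location_clause_score(distance_m, missing_boost=5.0):
    """Score from the should clauses only; None means the document has no location.

    Assumption: the missing-location clause contributes a flat `missing_boost`.
    """
    if distance_m is None:
        return missing_boost
    return distance_feature_score(distance_m)


if __name__ == "__main__":
    # Hypothetical distances in metres; None stands for "no location".
    for distance in (0, 600, 1000, 3000, None):
        print(distance, round(location_clause_score(distance), 3))
```

Under these assumptions a document without a location ranks level with one about 600 m from the origin, so adjusting the flat boost (or the pivot) moves where location-less documents sit in the ordering.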

Related

Limit the size per index when searching multiple indices in Elasticsearch

I have been following the guidelines from this post: Full text Search with Multiple index in Elastic Search using NEST C#
I get the desired output, but how can I limit the number of results for each index within the same DSL query?
POST http://localhost:9200/componenttypeindex%2Cprojecttypeindex/Componenttype%2CProjecttype/_search?pretty=true&typed_keys=true
{
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"term": {
"_index": {
"value": "componenttypeindex"
}
}
}
],
"must": [
{
"multi_match": {
"fields": [
"Componentname",
"Summary^1.1"
],
"operator": "or",
"query": "test"
}
}
]
}
},
{
"bool": {
"filter": [
{
"term": {
"_index": {
"value": "projecttypeindex"
}
}
}
],
"must": [
{
"multi_match": {
"fields": [
"Projectname",
"Summary^0.3"
],
"operator": "or",
"query": "test"
}
}
]
}
}
]
}
}
}
With your query, you could use aggregations to group the hits by index and limit the number of hits per index (in this case, to 5):
{
"size": 0,
"query": {
... Same query as above ...
},
"aggs": {
"index_agg": {
"terms": {
"field": "_index",
"size": 20
},
"aggs": {
"hits_per_index": {
"top_hits": {
"size": 5
}
}
}
}
}
}
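Conceptually, the terms aggregation buckets the hits by _index and top_hits keeps the best few in each bucket (sorted by score by default). A minimal Python sketch of that grouping, using hypothetical (index, id, score) tuples rather than a real Elasticsearch response:

```python
from collections import defaultdict


def top_hits_per_index(hits, size=5):
    """Group (index, doc_id, score) tuples by index and keep the best `size` per group."""
    grouped = defaultdict(list)
    for index, doc_id, score in hits:
        grouped[index].append((doc_id, score))
    # Within each index, sort by score descending and truncate to `size`.
    return {
        index: sorted(docs, key=lambda d: d[1], reverse=True)[:size]
        for index, docs in grouped.items()
    }


if __name__ == "__main__":
    # Made-up sample hits, for illustration only.
    sample = [
        ("componenttypeindex", "c1", 0.5),
        ("componenttypeindex", "c2", 0.9),
        ("projecttypeindex", "p1", 0.3),
    ]
    print(top_hits_per_index(sample, size=1))
```

Because "size": 0 suppresses the normal hits list, the per-index results are read from the aggregation buckets instead.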

Filter query by length of nested objects. ie. min_child

I'm trying to filter my query by the number of nested objects found. The Elasticsearch documentation mentions that using a script is an expensive task, so I've set out to do it with a score, though I can't seem to get the script to work either.
Here are my mappings:
"mappings": {
"properties": {
"dates" : {
"type" : "nested",
"properties" : {
"rooms" : {
"type" : "integer"
},
"timestamp" : {
"type" : "long"
}
}
},
"doc_id" : {
"type" : "text"
},
"distance" : {
"type" : "integer"
}
...
}
}
Here's some example data:
PUT /test/_doc/1
{
"doc_id": "1",
"distance": 1,
"dates": [
{
"rooms": 1,
"timestamp": 1
},
{
"rooms": 1,
"timestamp": 2
},
...
]
}
I'm filtering by the parent's distance field, among others, and filtering the nested dates by their timestamps and rooms. I need to restrict the results to documents with an exact number of matching nested dates.
I tried to borrow from here.
This is my search query:
GET /test/_search
{
"query" : {
"function_score": {
"min_score": 20,
"boost": 1,
"functions": [
{
"script_score": {
"script": {
"source": "if (_score > 20) { return - 1; } return _score;"
}
}
}
],
"query": {
"bool" : {
"filter": [
{ "range": { "distance": { "lt": 5 }}},
{
"nested": {
"score_mode": "sum",
"boost": 10,
"path": "dates",
"query": {
"bool": {
"filter": [
{ "range": { "dates.rooms": { "gte": 1 } } },
{ "range": { "dates.timestamp": { "lte": 2 }}},
{ "range": { "dates.timestamp": { "gte": 1 }}}
]
}
}
}
}
]
}
}
}
}
}
This returns all the results that match, yet they all have a score of 0.0 and aren't getting filtered by the number of nested objects found.
If this is the right solution, how can I get this working? If not, how can I get a script to do it within this search?
Thanks!
Before getting started, keep in mind that the scoring function changed between Elasticsearch 6 and 7. You can find the updated code samples in this gist.
Your question didn't outline the specifics of your search. Reading the code, it seems you want to retrieve all documents where the distance is less than five and the number of matching nested dates is exactly 2. If this is correct, the code you submitted does not achieve it.
The reason: your function score wraps both your primary condition and your condition on the number of matching nested dates (mixing the two is tricky, though not impossible). To keep things simple, separate them so that the function score applies only to the nested count.
Supposing you are using Elasticsearch 7+, this might work:
{
"_source": {
"includes": ["*"],
"excludes": ["dates"]
},
"query": {
"bool": {
"must": [
{"range": {"distance": {"lt": 5}}},
{
"function_score": {
"min_score": 20,
"boost": 1,
"score_mode": "multiply",
"boost_mode": "replace",
"functions": [
{
"script_score": {
"script": {
"source": "if (_score > 20) { return 0; } return _score;"
}
}
}
],
"query": {
"nested": {
"path": "dates",
"boost": 10,
"score_mode": "sum",
"query": {
"constant_score": {
"boost": 1,
"filter": {
"bool": {
"should": [
{
"bool": {
"must": [
{"term": {"dates.timestamp": 1}},
{"range": {"dates.rooms": {"lt": 5}}}
],
"should": [
{"term": {"dates.other_prop": 1}},
{"term": {"dates.other_prop": 4}}
]
}
},
{
"bool": {
"must": [
{"term": {"dates.timestamp": 2}},
{"range": {"dates.rooms": {"lt": 5}}}
],
"should": [
{"term": {"dates.other_prop": 1}},
{"term": {"dates.other_prop": 3}}
]
}
}
]
}
}
}
}
}
}
}
}
]
}
}
}
I managed to get it all working with scoring, since clauses in a filter context are not scored. Using GET /test/_explain/[id] helped me understand exactly what was happening.
GET /test/_search
{
// Don't return the nested fields, they are returned in the inner_hits
"_source": {
"includes": [ "*" ],
"excludes": [ "dates" ]
},
"query": {
"function_score": {
// Score is calculated with 1 point for each matched inner property and outer property.
// 7 is the exact score to allow
"min_score": 7,
"boost": 1,
"score_mode": "sum",
"boost_mode": "multiply",
"functions": [
{
"script_score": {
"script": {
// Ignore any results that don't match exactly
"source": "if (_score == 7) { return 1; } return 0;",
"lang": "painless"
}
}
}
],
"query": {
"bool" : {
"must" : [
{ "range" : { "distance" : { "lt": 10 }}},
{
"nested": {
"inner_hits" : {},
"path": "dates",
"score_mode": "sum",
"query": {
"bool": {
// Match each required nested object individually, then verify with the score if we got 1 match for each should
"should": [
{
"bool": {
"must": [
{ "term": { "dates.timestamp": 1 }},
{ "range": { "dates.rooms": { "lt": 5 } } }
],
"should": [
{ "term": { "dates.other_prop": 1 }},
{ "term": { "dates.other_prop": 4 }}
]
}
},
{
"bool": {
"must": [
{ "term": { "dates.timestamp": 2 }},
{ "range": { "dates.rooms": { "lt": 5 } } }
],
"should": [
{ "term": { "dates.other_prop": 1 }},
{ "term": { "dates.other_prop": 3 }}
]
}
}
]
}
}
}
}
]
}
}
}
}
}
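The score-counting trick in this query can be followed with a small Python sketch. It assumes, as the comments in the query describe, that every matching clause contributes exactly 1 point; that is worth confirming with _explain on real data. The helper names, the WANTED list and the sample document are hypothetical and only mirror the structure of the query above.

```python
# One entry per nested object that must be present (mirrors the two should branches).
WANTED = [
    {"timestamp": 1, "other_props": {1, 4}},
    {"timestamp": 2, "other_props": {1, 3}},
]


def nested_object_score(date, wanted):
    """Points for one nested object against one wanted spec; 0 if a must clause fails."""
    if date["timestamp"] != wanted["timestamp"] or not date["rooms"] < 5:
        return 0
    score = 2  # the two must clauses: term on timestamp, range on rooms
    if date.get("other_prop") in wanted["other_props"]:
        score += 1  # one of the optional should clauses matched
    return score


def document_score(doc, wanted_specs, max_distance=10):
    """1 point for the distance range plus the sum over nested objects (score_mode sum)."""
    if not doc["distance"] < max_distance:
        return 0
    nested = sum(
        nested_object_score(date, wanted)
        for date in doc["dates"]
        for wanted in wanted_specs
    )
    return 1 + nested if nested else 0


def passes(doc, wanted_specs, exact_score=7):
    """Mimic the script_score plus min_score: keep only documents scoring exactly 7."""
    return document_score(doc, wanted_specs) == exact_score
```

With two wanted nested objects worth 3 points each plus 1 point for the distance clause, a full match totals 7. A document missing one nested object, or matching extra ones, lands on a different total and is dropped.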

How to add filters to an Elasticsearch query when using function_score?

Here is my current elastic query:
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"must": [{
"multi_match": {
"query": "ocean",
"fields": [],
"fuzziness": "AUTO"
}}],
"must_not": [{
"exists": {
"field": "parentId"
}
}]
}
},
"functions" : [
{
"gauss": {
"createTime": {
"origin": "2020-07-09T23:50:00",
"scale": "365d",
"decay": 0.3
}
}
}
]
}
}
}
How do I properly add filters to this? I think maybe the fact that I'm using function_score makes this different? I would like to add a hard filter, for example, only show me results with uploadUser: 'Mr. Bean' ... but still keep the scoring in place for the results that pass this filter.
I tried using filter in various places, and also must, but I either get no results or all the results.
I'm using Elasticsearch 7. Thanks for your help.
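You can try the search query below. The hard restriction goes in the filter clause of the bool query inside function_score: filter clauses decide which documents match but do not contribute to the score, so the gauss function still ranks whatever passes. Note that a term query matches the exact indexed value, so this assumes uploadUser is mapped as a keyword field; if it is an analyzed text field, a keyword sub-field or a match_phrase query may be needed instead.
Refer to the official Elasticsearch documentation to learn more about the function score query.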
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"term": {
"uploadUser": "Mr. Bean"
}
},
"must": [
{
"multi_match": {
"query": "ocean",
"fields": [
],
"fuzziness": "AUTO"
}
}
],
"must_not": [
{
"exists": {
"field": "parentId"
}
}
]
}
},
"functions": [
{
"gauss": {
"createTime": {
"origin": "2020-07-09T23:50:00",
"scale": "365d",
"decay": 0.3
}
}
}
]
}
}
}
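For intuition about what the gauss function contributes once the filter has narrowed the result set, here is a rough Python sketch of the decay formula as documented for function_score. Distances are expressed in days from the origin, and the function name and sample ages are illustrative assumptions.

```python
import math


def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gauss decay as documented for function_score.

    sigma^2 = -scale^2 / (2 * ln(decay)); score = exp(-d^2 / (2 * sigma^2)),
    where d = max(0, distance - offset).
    """
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    d = max(0.0, distance - offset)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))


if __name__ == "__main__":
    # Hypothetical document ages in days, with scale=365d and decay=0.3 as in the query.
    for age_days in (0, 180, 365, 730):
        print(age_days, round(gauss_decay(age_days, scale=365, decay=0.3), 4))
```

A document exactly one scale (365 days) away from the origin gets a multiplier equal to decay (0.3). With the default boost_mode of multiply, that factor scales the text relevance score of each document that passed the filter.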

elasticsearch: Add weight for each match of array

I want to add a weight for each match (instead of adding a weight once if one of those matched):
Having docs like this:
[{
"username": "xyz",
"categories": [
{
"category.id": 1
},
{
"category.id": 2
}
]
}, {
"username": "xyz2",
"categories": [
{
"category.id": 1
}
]
}]
And currently, I have this query:
{
"query": {
"filtered": {
"query": {
"function_score": {
"query": {
"bool": {}
},
"score_mode": "sum",
"boost_mode": "sum",
"functions": [
{
"weight": 1.1,
"filter": {
"terms": {
"category.id": [
1,
2
]
}
}
}
]
}
},
"filter": {
"bool": {
"must_not": [
{
"terms": {
"_id": [
8
]
}
}
]
}
}
}
},
"from": 0,
"size": 30
}
With this query, both entries would receive a single weight of 1.1, but I want the first entry to get 2 * 1.1 because 2 categories are matched. How could I achieve that?
EDIT: Sorry, I forgot to mention the Elasticsearch version. It's 1.7.2.
This might be a bit cumbersome, since the query needs one function per ID, but I don't think there is any other way. Also, note that your field reference is incomplete: it should be categories.category.id. Finally, be careful with dots in field names when upgrading, because their handling has changed across releases.
{
"query": {
"filtered": {
"query": {
"function_score": {
"query": {
"match_all": {}
},
"score_mode": "sum",
"boost_mode": "sum",
"functions": [
{
"weight": 1.1,
"filter": {
"term": {
"categories.category.id": 1
}
}
},
{
"weight": 1.1,
"filter": {
"term": {
"categories.category.id": 2
}
}
}
]
}
},
"filter": {
"bool": {
"must_not": [
{
"terms": {
"_id": [
8
]
}
}
]
}
}
}
},
"from": 0,
"size": 30
}
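The effect of one function per ID can be checked with a small Python sketch of the arithmetic. It assumes the inner match_all contributes a query score of 1.0; score_mode sum adds up the weights of the functions whose filters match, and boost_mode sum adds that total to the query score. The function name and sample category sets are illustrative.

```python
def function_score_sum(doc_category_ids, weighted_ids, weight=1.1, query_score=1.0):
    """Query score plus one `weight` for every weighted ID present in the document."""
    matched = [cid for cid in weighted_ids if cid in doc_category_ids]
    return query_score + weight * len(matched)


if __name__ == "__main__":
    # "xyz" has categories 1 and 2; "xyz2" has only category 1.
    print("xyz", function_score_sum({1, 2}, [1, 2]))
    print("xyz2", function_score_sum({1}, [1, 2]))
```

With a single terms filter covering both IDs there is only one function, so each document can collect the weight at most once. Splitting it into one term filter per ID is what lets the first document collect it twice.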

Elasticsearch: No query registered for [exists]

When I run the query below, I get the exception No query registered for [exists]. Please help me.
{
"query": {
"function_score": {
"query": {
"bool": {
"must": {
"match": {
"_all": {
"query": "cardio new york"
}
}
}
}
},
"functions": [
{
"gauss": {
"geo_location": {
"origin": {
"lat": 40.7127,
"lon": -74.0059
},
"scale": "100km",
"offset": "0km",
"decay": 0.9
}
}
},
{
"gauss": {
"startdate": {
"origin": "now",
"scale": "30d",
"offset": "30d"
}
},
"weight": 0.5
}
],
"filter": {
"query": {
"bool": {
"must": {
"match": {
"_all": {
"query": "cardio new york"
}
}
},
"should": {
"exists": {
"fields": [
"venue",
"geo_location"
]
}
}
}
}
}
}
}
}
I am trying to filter the search results after the function_score by combining a bool query with a match query.
exists is not a query, it's a filter, so you cannot use it in a bool query. Instead, I would use a bool filter and wrap only the match in a query filter, like this:
"filter": {
"bool": {
"must": [{
"query": {
"match": {
"_all": {
"query": "cardio new york"
}
}
}
}, {
"exists": {
"fields": [
"venue",
"geo_location"
]
}
}]
}
}
This depends on your version. Before version 2.0, you would use the exists filter, as demonstrated in imotov's answer.
From 2.0 onward, the exists filter has been replaced by the exists query (see the documentation for the current version).
The newer exists query would therefore look like the following:
{
"index": "foobars",
"type": "foo",
"body": {
"query": {
"bool": {
"must":
{"exists" : { "field" : "bar" }}
}
},
"from": 0,
"size": 20
}
}
