Is it possible to do this sort in ElasticSearch, without using script_score?

I would like to do this sort with a single ElasticSearch query (without resorting to using script_score):
Objects with region=DE and language=de, sorted by createdDate.
Objects with region=DE and any language, sorted by createdDate.
Objects with any region and language=en, sorted by createdDate.
At first, I thought I could do a function_score query (boost_mode: replace, score_mode: sum) and:
If region=DE and language=de, set score to 300000000000000 + createdDate.
If region=DE and language!=de, set score to 200000000000000 + createdDate.
If region!=DE and language=en, set score to 100000000000000 + createdDate.
I can add the createdDate to the score by using field_value_factor. But I can't find a function_score function to add 300000000000000 to the score if region=DE and language=de.
Is it possible to do this without using script_score?

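Here's how to do it: give each tier a fixed score with constant_score, then sort by _score first and by the creation date second: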
{
  "sort": [
    { "_score": "desc" },
    { "createdDate": "desc" }
  ],
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "constant_score": {
                "filter": {
                  "bool": {
                    "must": [
                      { "term": { "region": "DE" } },
                      { "term": { "language": "de" } }
                    ]
                  }
                },
                "boost": 3
              }
            },
            {
              "constant_score": {
                "filter": {
                  "bool": {
                    "must": [
                      { "term": { "region": "DE" } },
                      { "not": { "term": { "language": "de" } } }
                    ]
                  }
                },
                "boost": 2
              }
            },
            {
              "constant_score": {
                "filter": {
                  "bool": {
                    "must": [
                      { "not": { "term": { "region": "DE" } } },
                      { "term": { "language": "en" } }
                    ]
                  }
                },
                "boost": 1
              }
            }
          ]
        }
      }
    }
  }
}
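The ordering this query aims for can be checked outside Elasticsearch with a small Python sketch. It is only a model of the intended result, not of how Elasticsearch computes scores: the `tier` and `rank` helpers and the sample documents are made up for illustration, and `createdDate` is assumed to be a plain number.

```python
def tier(doc: dict) -> int:
    """Return the constant score of the clause the document matches, or 0 for no match."""
    region_is_de = doc.get("region") == "DE"
    if region_is_de and doc.get("language") == "de":
        return 3  # region=DE and language=de
    if region_is_de:
        return 2  # region=DE and any other language
    if doc.get("language") == "en":
        return 1  # any other region and language=en
    return 0      # matched by none of the three clauses


def rank(docs: list) -> list:
    """Drop non-matching documents, then sort by tier and createdDate, both descending."""
    matched = [d for d in docs if tier(d) > 0]
    return sorted(matched, key=lambda d: (tier(d), d["createdDate"]), reverse=True)


# Hypothetical sample documents.
docs = [
    {"id": "a", "region": "US", "language": "en", "createdDate": 50},
    {"id": "b", "region": "DE", "language": "de", "createdDate": 10},
    {"id": "c", "region": "DE", "language": "fr", "createdDate": 40},
    {"id": "d", "region": "DE", "language": "de", "createdDate": 30},
    {"id": "e", "region": "FR", "language": "fr", "createdDate": 60},
]
ranked_ids = [d["id"] for d in rank(docs)]
print(ranked_ids)
```

Because the three clauses are mutually exclusive, a document matches at most one of them, so its score is exactly one of the boosts and the secondary sort only has to break ties inside a tier.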

Related

Simplest way to query an elasticsearch index with chained conditions

I have an index of products, and I want to find all the products that fulfill conditions such as:
((type = "orange" and price > 10) or (type = "apple" and price > 8)) and on_sale=True.
What about:
(type = "orange" or type = "apple") and (price <= 25 or on_sale=True)?
You need to combine bool clauses using "must" and "should".
Below is the query for the first statement:
GET _search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "on_sale": { "value": "True" } } },
        {
          "bool": {
            "should": [
              {
                "bool": {
                  "must": [
                    { "term": { "type": { "value": "orange" } } },
                    { "range": { "price": { "gt": 10 } } }
                  ]
                }
              },
              {
                "bool": {
                  "must": [
                    { "term": { "type": { "value": "apple" } } },
                    { "range": { "price": { "gt": 8 } } }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
}
It is just a matter of wrapping "must" and "should" clauses inside one another as required. It takes a little practice to work out how to chain them, but any boolean combination can be expressed with this syntax.
For the second query:
{
  "query": {
    "bool": {
      "must": [
        { "terms": { "type": [ "orange", "apple" ] } },
        {
          "bool": {
            "should": [
              { "term": { "on_sale": { "value": "True" } } },
              { "range": { "price": { "lte": 25 } } }
            ]
          }
        }
      ]
    }
  }
}
When you need "and", use "must"; when you need "or", use "should".
HTH.
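If the nesting gets hard to read, one option is to build the query from small helpers. The following Python sketch does this for the first condition (price > 10 for oranges, price > 8 for apples, all on sale). The helper names `all_of`, `any_of`, `term`, and `price` are hypothetical, not part of any Elasticsearch client; they only produce plain dictionaries that can be serialized to JSON.

```python
import json


def all_of(*clauses):
    """AND: every clause must match."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    """OR: at least one clause must match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


def term(field, value):
    return {"term": {field: {"value": value}}}


def price(op, amount):
    """Range clause on the price field; op is one of gt, gte, lt, lte."""
    return {"range": {"price": {op: amount}}}


# ((type = orange and price > 10) or (type = apple and price > 8)) and on_sale = True
first = all_of(
    term("on_sale", "True"),
    any_of(
        all_of(term("type", "orange"), price("gt", 10)),
        all_of(term("type", "apple"), price("gt", 8)),
    ),
)
body = {"query": first}
print(json.dumps(body, indent=2))
```

Setting `minimum_should_match` explicitly in `any_of` is a defensive choice here: it keeps the "or" semantics even if someone later adds a `must` or `filter` clause next to the `should` list.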

Elasticsearch Add additional condition if type is different

GET test/_search
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "bool": {
            "must": [
              { "term": { "fragmentId": "1" } },
              { "term": { "type": "fragment" } }
            ]
          }
        },
        {
          "bool": {
            "must": [
              { "term": { "fragmentId": "1" } },
              { "term": { "type": "cf" } },
              { "range": { "start": { "gte": 1, "lte": 5 } } }
            ]
          }
        }
      ]
    }
  }
}
I am looking for two kinds of documents: one with fragmentId = 1 and type = "fragment", and another with fragmentId = 1, type = "cf", and start between 1 and 5.
The above query does the job, but I have to write the fragmentId and type conditions twice. Is there a way to add the range condition only when the type is cf, basically merging both bools into one?
This is the query you're looking for. The shared fragmentId condition moves into a top-level filter, and the should clauses keep only what differs between the two cases:
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "fragmentId": "1" } }
      ],
      "minimum_should_match": 1,
      "should": [
        { "term": { "type": "fragment" } },
        {
          "bool": {
            "filter": [
              { "term": { "type": "cf" } },
              { "range": { "start": { "gte": 1, "lte": 5 } } }
            ]
          }
        }
      ]
    }
  }
}
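The rewrite is plain boolean factoring: (A and B) or (A and C and D) becomes A and (B or (C and D)). A short Python sketch can confirm that the two forms accept the same documents. The predicates `original` and `factored` and the generated sample documents are illustrative only; they mirror the conditions of the two queries rather than call Elasticsearch.

```python
from itertools import product


def original(doc):
    """The asker's query: the fragmentId condition is repeated in both branches."""
    is_fragment = doc["fragmentId"] == "1" and doc["type"] == "fragment"
    is_cf = doc["fragmentId"] == "1" and doc["type"] == "cf" and 1 <= doc["start"] <= 5
    return is_fragment or is_cf


def factored(doc):
    """The answer's query: fragmentId is checked once, the rest sits in the should list."""
    return doc["fragmentId"] == "1" and (
        doc["type"] == "fragment" or (doc["type"] == "cf" and 1 <= doc["start"] <= 5)
    )


# Every combination of a few representative field values.
samples = [
    {"fragmentId": f, "type": t, "start": s}
    for f, t, s in product(["1", "2"], ["fragment", "cf", "other"], [0, 1, 3, 5, 6])
]
agree = all(original(d) == factored(d) for d in samples)
print(agree)
```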

ElasticSearch should with nested and bool must_not exists

With the following mapping:
"categories": {
  "type": "nested",
  "properties": {
    "category": { "type": "integer" },
    "score": { "type": "float" }
  }
},
I want to use the categories field to return documents that either:
have a score above a threshold in a given category, or
do not have the categories field
This is my query:
{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "categories",
            "query": {
              "bool": {
                "must": [
                  { "terms": { "categories.category": [ <id> ] } },
                  { "range": { "categories.score": { "gte": 0.5 } } }
                ]
              }
            }
          }
        },
        {
          "bool": {
            "must_not": [
              { "exists": { "field": "categories" } }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
It correctly returns documents both with and without the categories field, and orders the results so the ones I want are first, but it doesn't filter the results having score below the 0.5 threshold.
Great question.
This happens because, from Elasticsearch's point of view, categories is not itself a field (that is, a field with an inverted index that can be queried); categories.category and categories.score are.
As a result, the exists query on categories finds nothing in any document, so the must_not clause is true for every document, which explains the results you see.
Change the query to the one below, which wraps the exists checks in a nested query inside must_not, and your use case should work correctly.
POST <your_index_name>/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "nested": {
            "path": "categories",
            "query": {
              "bool": {
                "must": [
                  { "terms": { "categories.category": [ "100" ] } },
                  { "range": { "categories.score": { "gte": 0.5 } } }
                ]
              }
            }
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "nested": {
                  "path": "categories",
                  "query": {
                    "bool": {
                      "must": [
                        { "exists": { "field": "categories.category" } },
                        { "exists": { "field": "categories.score" } }
                      ]
                    }
                  }
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
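To make the intended logic of the corrected query explicit, here is a minimal Python sketch of the same condition evaluated on plain dictionaries. It models only the matching rule, not Elasticsearch internals; the function names and sample documents are assumptions made for illustration.

```python
def has_passing_category(doc, category_ids, threshold=0.5):
    """First should clause: some nested entry is in a wanted category and scores high enough."""
    return any(
        entry.get("category") in category_ids
        and entry.get("score") is not None
        and entry["score"] >= threshold
        for entry in doc.get("categories", [])
    )


def has_complete_entry(doc):
    """The nested query under must_not: some entry carries both category and score."""
    return any(
        "category" in entry and "score" in entry
        for entry in doc.get("categories", [])
    )


def matches(doc, category_ids, threshold=0.5):
    """A document matches if it passes the threshold or has no usable categories at all."""
    return has_passing_category(doc, category_ids, threshold) or not has_complete_entry(doc)


# Hypothetical sample documents.
high = {"categories": [{"category": 100, "score": 0.9}]}
low = {"categories": [{"category": 100, "score": 0.2}]}
other = {"categories": [{"category": 200, "score": 0.9}]}
missing = {"title": "no categories field"}

results = [matches(d, {100}) for d in (high, low, other, missing)]
print(results)
```

The `low` document is the case the original query let through: it has categories, so it should be rejected once it fails the threshold.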

Using multiple Should queries

I want to get docs that are similar to multiple "groups", but separately. Each group has its own rules (terms).
When I try to use more than one should clause inside a "bool", I get items that are a mix of the terms of both should clauses.
I want to use a single query in total, not msearch, for example.
Can someone please help me with that?
{
  "explain": true,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must_not": [
            { "term": { "p_id": "123" } },
            { "term": { "p_id": "124" } }
          ]
        }
      },
      "query": {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  { "term": { "cat": "1" } },
                  { "term": { "cat": "2" } },
                  { "term": { "keys": "a" } },
                  { "term": { "keys": "b" } }
                ]
              }
            },
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  { "term": { "cat": "6" } },
                  { "term": { "cat": "7" } },
                  { "term": { "keys": "r" } },
                  { "term": { "keys": "u" } }
                ]
              }
            }
          ]
        }
      }
    }
  },
  "from": 0,
  "size": 3
}
You can try a terms aggregation on multiple fields with scripting, and add a top hits aggregation as a sub-aggregation. Be warned that this will be pretty slow. Add this after the query/filter and adjust the size parameter as needed:
"aggs": {
  "Cat_and_Keys": {
    "terms": {
      "script": "doc['cat'].values + doc['keys'].values"
    },
    "aggs": {
      "separate_docs": { "top_hits": { "size": 1 } }
    }
  }
}
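As a rough mental model of what this aggregation returns, the Python sketch below buckets hits by every value found in `cat` or `keys` and keeps the best-scoring hit per bucket. It assumes the script yields the concatenated values of both fields and that the top hit is chosen by score; the function name and sample hits are hypothetical, and the sketch says nothing about how Elasticsearch executes the aggregation.

```python
from collections import defaultdict


def one_hit_per_value(hits):
    """Group hits under each of their cat/keys values and keep the highest-scoring one."""
    buckets = defaultdict(list)
    for hit in hits:
        for value in hit["cat"] + hit["keys"]:
            buckets[value].append(hit)
    return {value: max(group, key=lambda h: h["score"]) for value, group in buckets.items()}


# Hypothetical hits, as if returned by the query above.
hits = [
    {"id": 1, "cat": ["1"], "keys": ["a"], "score": 2.0},
    {"id": 2, "cat": ["1"], "keys": ["b"], "score": 3.0},
    {"id": 3, "cat": ["6"], "keys": ["a"], "score": 1.0},
]
top = {value: hit["id"] for value, hit in one_hit_per_value(hits).items()}
print(top)
```

Note that the buckets are per value, not per group: to get one result list per group, the buckets whose keys belong to the same group still have to be merged on the client side.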

Multiple query_strings (nested and not nested)

I have got the following index:
{
  "thread": {
    "properties": {
      "members": {
        "type": "nested",
        "properties": {
          "memberId": { "type": "keyword" },
          "firstName": {
            "type": "keyword",
            "copy_to": [ "members.fullName" ]
          },
          "fullName": { "type": "text" },
          "lastName": {
            "type": "keyword",
            "copy_to": [ "members.fullName" ]
          }
        }
      },
      "name": { "type": "text" }
    }
  }
}
I want to implement a search that finds all threads that match either a member's name or the thread name, as long as the user ID matches.
My current query looks like this:
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "members",
            "score_mode": "none",
            "query": {
              "bool": {
                "filter": [
                  { "match": { "members.id": "123456789" } }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "members",
            "query": {
              "bool": {
                "must": {
                  "simple_query_string": {
                    "query": "Rhymen",
                    "fields": ["members.fullName"]
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
Can I filter the members and thread names in one query or do I have to merge two separate queries? I tried adding a "should" with "minimum_should_match: 1" so I could add a second not nested "query_string". But that didn't work as expected (scores were pretty screwed).
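Yes, I think this should work in a single query.
The memberId condition has to appear in both places: once at the top level, to restrict the results to the user's threads, and again inside the nested name clause, so that the memberId and the name are matched on the same member.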
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "members",
            "query": {
              "term": { "members.memberId": { "value": 1 } }
            }
          }
        },
        {
          "bool": {
            "should": [
              { "term": { "name": { "value": "thread_name" } } },
              {
                "nested": {
                  "path": "members",
                  "query": {
                    "bool": {
                      "must": [
                        { "term": { "members.fullName": { "value": "trump" } } },
                        { "term": { "members.memberId": { "value": 1 } } }
                      ]
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
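Since the memberId condition is repeated, it can help to generate the request body from one function so the two copies cannot drift apart. The Python sketch below is one way to do that. The `thread_search` function is hypothetical, and it swaps the `term` clauses on the text fields for `match` clauses, on the assumption that `name` and `members.fullName` are analyzed text fields as in the mapping above.

```python
import json


def thread_search(member_id, text):
    """Build a request body: the user must be a member, and the text must match
    either the thread name or the name of that same member."""
    member_clause = {"term": {"members.memberId": {"value": member_id}}}
    name_or_member = {
        "bool": {
            "should": [
                {"match": {"name": text}},
                {
                    "nested": {
                        "path": "members",
                        "query": {
                            "bool": {
                                "must": [
                                    {"match": {"members.fullName": text}},
                                    member_clause,
                                ]
                            }
                        },
                    }
                },
            ],
            "minimum_should_match": 1,
        }
    }
    return {
        "query": {
            "bool": {
                "must": [
                    {"nested": {"path": "members", "query": member_clause}},
                    name_or_member,
                ]
            }
        }
    }


body = thread_search("123456789", "Rhymen")
print(json.dumps(body, indent=2))
```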
