How to boost certain documents if the search query contains a certain term/text in elastic - elasticsearch

If the search query contains "fruits", I want to boost the products from a certain category.
{
"query": {
"bool": {
"should": [
{
"constant_score": {
"boost": 2,
"filter": {
"term": { "categories": "3" }
}
}
}
]
}
}
}
I have the above query, which gives a constant score to items in category 3. I want to apply this constant score (a boost, i.e. increased relevancy) only when a certain text (for example "fruits") is present in the search term.
Sample Elasticsearch document (categories holds category ids, because the same product is shown in multiple categories):
{
"id" : 1231,
"name" : {
"ar" : "Arabic fruit name",
"en" : "english fruit name"
},
"categories" : [3,1,3]
}
How do I achieve this? I use elasticsearch 7.2

Original answer:
{
"query": {
"bool": {
"should": [
{
"constant_score": {
"boost": 2,
"filter": {
"bool": {
"filter": [
{
"term": {
"categories": "3"
}
}
],
"should": [
{
"match": {
"name.ar": "fruit"
}
},
{
"match": {
"name.en": "fruit"
}
}
],
"minimum_should_match": 1
}
}
}
}
]
}
}
}
This should work, if I understand correctly what you're looking for.
By the way, I suggest using "match_phrase" instead of "match" if you want to match "fruit name" exactly, and not "fruit" or "name" separately.
Update (based on the comment):
In that case I'd suggest reorganizing your schema in the following manner:
"item": {
"properties": {
"name": {
"type": "text"
},
"language": {
"type": "keyword"
}
}
}
So your sample would become:
{
"id" : 1231,
"item" : [
{"name": "Arabic fruit name", "language": "ar"},
{"name": "english fruit name", "language": "en"}
],
"categories" : [3,1,3]
}
And then you can match against "item.name"
Why? Because Elasticsearch, at least by default, flattens an array of objects when indexing, so internally "item.name" looks like ["Arabic fruit name", "english fruit name"].
With your original sample, two different fields are created in the mapping (name.ar and name.en), which is not a great design if you need to scale.
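If the boost should depend on what the user typed rather than on the product name, one option is to decide in application code whether to attach the constant_score clause at all. The Python sketch below is only an illustration and not part of the original answer: the function name build_query, the list of trigger words, and the multi_match over name.en and name.ar are assumptions. The category id "3" and the boost of 2 come from the question.

```python
def build_query(search_text, trigger_terms=("fruit", "fruits"), category="3", boost=2):
    """Build a search body; boost `category` only if the text has a trigger term."""
    bool_query = {
        # Main full-text clause (fields assumed from the sample document).
        "must": [
            {"multi_match": {"query": search_text, "fields": ["name.en", "name.ar"]}}
        ]
    }
    # Naive whitespace tokenization; a real application may need proper analysis.
    tokens = search_text.lower().split()
    if any(term in tokens for term in trigger_terms):
        # Optional clause: adds a fixed score to products in the category.
        bool_query["should"] = [
            {"constant_score": {"boost": boost, "filter": {"term": {"categories": category}}}}
        ]
    return {"query": {"bool": bool_query}}
```

Calling build_query("fresh fruits") yields a body with the boosting clause, while build_query("laptop") yields the plain text query.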

Related

How can we sort records by a specific value of a field in Elasticsearch

We want to sort records by a specific value of a field. For example:
We have data with a country code, name, and other details, and we want records with country code 'US' at the top, followed by records with country code 'AR'.
So if we search for obama, all obama records from US come first, then the obama records from AR. Within each country we also want to sort the records by a rating score.
I am trying a filter query with boost but not getting the expected data: with filter we only get the filtered records, whereas we want to sort the records based on a boost for specific values of the country field.
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2.0
}
}
],
"filter": {
"bool": {
"should": [
{
"term": {
"countryCode": {
"value": "US",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "AR",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "ES",
"boost": 2
}
}
}
]
}
}
}
},
"size": 50,
"sort": [
{
"rating": {
"order": "desc"
}
},
{
"_score": {
"order": "desc"
}
}
]
}
Expectation:
All records that belong to country US should be at the top, sorted by rating.
All records that belong to country AR should come after the US records, sorted by rating.
All records that belong to country ES should come after the AR records, sorted by rating.
Expected example:
[
{"name": "obama a", "countryCode": "us", "rating": 5},
{"name": "obama b", "countryCode": "us", "rating": 4},
{"name": "obama ac", "countryCode": "ar", "rating": 3},
{"name": "obama ess", "countryCode": "es", "rating": 3.5}
]
If you want to tune the score but not drop the document you can use should.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
must
The clause (query) must appear in matching documents and will contribute to the score.
filter
The clause (query) must appear in matching documents. However unlike must the score of the query will be ignored. Filter clauses are executed in filter context, meaning that scoring is ignored and clauses are considered for caching.
should
The clause (query) should appear in the matching document.
must_not
The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.
Here is an example:
POST test_stackoverflow_us/_bulk?refresh=true&pretty
{ "index": {}}
{"name":"obama a", "countryCode":"us", "rating":5}
{ "index": {}}
{"name":"obama b", "countryCode":"us", "rating":4}
{ "index": {}}
{"name":"obama ac", "countryCode":"ar", "rating":3}
{ "index": {}}
{"name":"obama ess", "countryCode":"es", "rating":3.5}
GET test_stackoverflow_us/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"match_phrase_prefix": {
"name": {
"query": "obama"
}
}
}
],
"boost": 2
}
}
],
"should": [
{
"term": {
"countryCode": {
"value": "us",
"boost": 4
}
}
},
{
"term": {
"countryCode": {
"value": "ar",
"boost": 3
}
}
},
{
"term": {
"countryCode": {
"value": "es",
"boost": 2
}
}
}
]
}
},
"size": 50,
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"rating": {
"order": "desc"
}
}
]
}
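One caveat with additive boosts is that the text score of the name match is mixed into _score, so the country tiers are not strictly separated. A possible variation, sketched below under my own assumptions (it is not from the original answer), moves the name match into filter context and wraps each country in constant_score, so that _score should be driven by the country boost alone and rating breaks ties within a country. The helper name country_priority_query is hypothetical, and the lowercase country codes assume the indexed values were lowercased as in the sample data.

```python
def country_priority_query(name_prefix, country_boosts, size=50):
    """Rank by country tier first (via constant scores), then by rating."""
    should = [
        {"constant_score": {"filter": {"term": {"countryCode": code}}, "boost": boost}}
        for code, boost in country_boosts.items()
    ]
    return {
        "query": {
            "bool": {
                # Filter context: the name must match but adds nothing to _score.
                "filter": [{"match_phrase_prefix": {"name": {"query": name_prefix}}}],
                # Optional clauses: each matching country adds its fixed boost.
                "should": should,
            }
        },
        "size": size,
        # _score (country tier) first, rating as the tie-breaker inside a tier.
        "sort": [{"_score": {"order": "desc"}}, {"rating": {"order": "desc"}}],
    }


body = country_priority_query("obama", {"us": 4, "ar": 3, "es": 2})
```

Documents from countries outside the list still match, with no country boost added, so they sort after the listed countries.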

Elastic Search : Search keyword results of a specific category

I'm trying to build a query where I'm trying to search for names of people of a specific country. If I provide input as John and USA, I should only find results of people by the name John (by the property : name) from USA (by the property : country) and results from other countries shouldn't appear in the results.
What I have tried :
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
With the above query, the problem I'm seeing is that the results also show people who don't have their name as John but are from USA.
Expectation : To filter results of given keyword specific to given country.
Instead of should, you need to use a must clause for your name query.
The query below should give you the expected results. Refer to the official bool query documentation to understand the difference, with examples.
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
You are using a should clause; that's why it is not working. Use must instead of should and it will resolve your issue.
You can use "type":"phrase_prefix" to match Jo with John.
You can change your query as shown below and it will work:
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"type":"phrase_prefix",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
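The reason should alone did not restrict the results: when a bool query also contains a must or filter clause, its should clauses are optional by default and only influence scoring. Besides switching to must, a similar effect can be had by keeping should and requiring at least one of them with minimum_should_match. The Python sketch below builds both variants; the function name people_query and its required_via parameter are made up for illustration, and the field names are the ones from the question.

```python
def people_query(name, country, required_via="must"):
    """Build a query where the name clause is mandatory, not just a score hint."""
    name_clause = {
        "multi_match": {
            "query": name,
            "fields": ["username", "first_name", "last_name"],
            "fuzziness": "AUTO",
        }
    }
    bool_query = {
        "filter": [
            {"match": {"country": country}},
            {"match": {"is_citizen": True}},
        ]
    }
    if required_via == "must":
        # Variant 1: the name clause must match and contributes to the score.
        bool_query["must"] = [name_clause]
    else:
        # Variant 2: keep should, but require at least one should clause.
        bool_query["should"] = [name_clause]
        bool_query["minimum_should_match"] = 1
    return {"query": {"bool": bool_query}}
```

With a single name clause the two variants should return the same set of documents; the second form becomes useful when there are several alternative clauses and any one of them is enough.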

ElasticSearch: Find record with multiple conditions on a list of sub-elements

I'm saving documents like this to ElasticSearch:
[
{
"text": "Sam works for Google.",
"entities": [
{
"text": "Sam",
"type": "PERSON"
},
{
"text": "Google",
"type": "ORGANIZATION"
}
]
}
]
It's essentially a sentence and entities that appear in that sentence. Now, I want to find any document that has entities of type "PERSON" AND "ORGANIZATION". I tried a boolean must query:
{
"bool": {
"must": [
{
"match": {
"entities.type": "PERSON"
}
},
{
"match": {
"entities.type": "ORGANIZATION"
}
}
]
}
}
... but that seems to look for entities that are of both types, which obviously returns nothing. How do I need to formulate my query?
Thanks!
You should use the query below, as your original query doesn't have the correct field name.
{
"query": {
"bool": {
"must": [
{
"match": {
"entities.type": "PERSON"
}
},
{
"match": {
"entities.type": "ORGANIZATION"
}
}
]
}
}
}
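This works when entities is mapped as a plain object field (the default), because Elasticsearch then flattens the array of objects into multi-valued fields, so a single document carries both PERSON and ORGANIZATION under entities.type. The toy Python below only imitates that flattening to show why both must clauses can match the same document; flatten_objects is a hypothetical helper, not an Elasticsearch API, and it ignores text analysis. If the field were mapped as nested instead, each condition would need its own nested query.

```python
def flatten_objects(doc, field):
    """Imitate how an array of objects becomes multi-valued dotted fields."""
    flat = {}
    for obj in doc[field]:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


doc = {
    "text": "Sam works for Google.",
    "entities": [
        {"text": "Sam", "type": "PERSON"},
        {"text": "Google", "type": "ORGANIZATION"},
    ],
}

flat = flatten_objects(doc, "entities")
# Each must clause only needs its value to be present somewhere in the list.
has_both = all(t in flat["entities.type"] for t in ("PERSON", "ORGANIZATION"))
```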

How to boost specific terms in elastic search?

If I have the following mapping:
PUT /book
{
"settings": {},
"mappings": {
"properties": {
"title": {
"type": "text"
},
"author": {
"type": "text"
}
}
}
}
How can I boost specific authors higher than others?
In case of the below example:
PUT /book/_doc/1
{
"title": "car parts",
"author": "john smith"
}
PUT /book/_doc/2
{
"title": "car",
"author": "bob bobby"
}
PUT /book/_doc/3
{
"title": "soap",
"author": "sam sammy"
}
PUT /book/_doc/4
{
"title": "car designs",
"author": "joe walker"
}
GET /book/_search
{
"query": {
"bool": {
"should": [
{ "match": { "title": "car" }},
{ "match": { "title": "parts" }}
]
}
}
}
How do I make it so that books by "joe walker" are at the top of the search results?
One solution is to make use of function_score.
The function_score allows you to modify the score of documents that are retrieved by a query (quoted from the Elasticsearch documentation).
Based on your mappings, try running this query, for example:
GET book/_search
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"match": {
"title": "car"
}
},
{
"match": {
"title": "parts"
}
}
]
}
},
"functions": [
{
"filter": {
"match": {
"author": "joe walker"
}
},
"weight": 30
}
],
"max_boost": 30,
"score_mode": "max",
"boost_mode": "multiply"
}
}
}
The query inside function_score is the same should query that you used.
Now we want to take all the results from the query and give more weight (increase the score) to joe walker's books, meaning prioritize his books over the others.
To achieve that, we created a function (inside functions) that computes a new score for each document returned by the query that passes the joe walker filter.
You can play with the weight and other params.
Hope it helps
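To make the effect of these parameters concrete, here is a rough Python model of the arithmetic for score_mode "max" and boost_mode "multiply", as far as I understand the documented behaviour. It is a simplification, not Elasticsearch code: combined_score is a hypothetical helper, the query scores 0.5 and 0.9 are invented, and treating "no function matched" as a factor of 1 is an assumption.

```python
def combined_score(query_score, matching_weights, max_boost=30.0):
    """Approximate final score for score_mode="max", boost_mode="multiply"."""
    # Assumption: when no function filter matches, the function factor is 1.
    factor = max(matching_weights) if matching_weights else 1.0
    # max_boost caps the function factor before it is combined with the query score.
    factor = min(factor, max_boost)
    # boost_mode "multiply": query score times function factor.
    return query_score * factor


joe = combined_score(0.5, [30])  # a joe walker book with a modest text score
other = combined_score(0.9, [])  # a better text match by another author
```

Under these assumptions the joe walker book ends up with a much higher score than the better text match, which is why a large weight pushes his books to the top.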

Multi match query with terms lookup searching multiple indices elasticsearch 6.x

All,
I am working on building a NEST 6.x query that takes a search term and looks in different fields in different indices.
This is the one I have so far, but it is not returning the results I am expecting.
Please see the details below.
Indices used
dev-sample-search
user-agents-search
The way the search should work is as follows:
The value in the query field (27921093) is searched against the fields agentNumber, customerName, fileNumber, documentid (these are all analyzed fields).
The search should limit the documents to the agentNumbers the user sampleuser#gmail.com has access to (sample data for user-agents-search is added below).
agentNumber, customerName, fileNumber, documentid and status are part of the index dev-sample-search.
The status field is defined as a keyword.
The fields in the user-agents-search index are all keywords.
Sample user-agents-search index data:
{
"id": "sampleuser#gmail.com",
"user": "sampleuser#gmail.com",
"agentNumber": [
"123.456.789",
"1011.12.13.14"
]
}
Sample dev-sample-search index data:
{
"agentNumber": "123.456.789",
"customerName": "Bank of america",
"fileNumber":"test_file_1123",
"documentid":"1234456789"
}
GET dev-sample-search/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"multi_match": {
"type": "best_fields",
"query": "27921093",
"operator": "and",
"fields": [
"agentNumber",
"customerName",
"fileNumber",
"documentid^10"
]
}
}
],
"filter": [
{
"bool": {
"must": [
{
"terms": {
"agentNumber": {
"index": "user-agents-search",
"type": "_doc",
"user": "sampleuser#gmail.com",
"path": "agentNumber"
}
}
},
{
"bool": {
"must_not": [
{
"terms": {
"status": {
"value": "pending"
}
}
},
{
"term": {
"status": {
"value": "cancelled"
}
}
},
{
"term": {
"status": {
"value": "app cancelled"
}
}
}
],
"should": [
{
"term": {
"status": {
"value": "active"
}
}
},
{
"term": {
"status": {
"value": "terminated"
}
}
}
]
}
}
]
}
}
]
}
}
}
I see a couple of things that you may want to look at:
In the terms lookup query, "user": "sampleuser#gmail.com", should be "id": "sampleuser#gmail.com",.
If at least one should clause in the filter clause should match, set "minimum_should_match" : 1 on the bool query containing the should clause
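Putting both suggestions together, the filter part could look roughly like the Python sketch below. This is my own illustration, not the answerer's code: agent_filter is a hypothetical helper, it assumes the document _id in user-agents-search equals the user's email, and it collapses the individual status clauses into terms lists, which is a simplification of the original query.

```python
def agent_filter(user_id):
    """Filter to the user's agent numbers and to allowed status values."""
    return {
        "bool": {
            "must": [
                {
                    # Terms lookup: fetch agentNumber values from the user's document.
                    "terms": {
                        "agentNumber": {
                            "index": "user-agents-search",
                            "type": "_doc",
                            "id": user_id,  # lookup by document id, not "user"
                            "path": "agentNumber",
                        }
                    }
                },
                {
                    "bool": {
                        "must_not": [
                            {"terms": {"status": ["pending", "cancelled", "app cancelled"]}}
                        ],
                        "should": [{"terms": {"status": ["active", "terminated"]}}],
                        # Without this, should is optional in filter context.
                        "minimum_should_match": 1,
                    }
                },
            ]
        }
    }


flt = agent_filter("sampleuser#gmail.com")
```

The returned dictionary would go inside the filter array of the outer bool query, next to the multi_match in must.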
