Elasticsearch Query: How to do an inline "calculation" between fields, and then use it as a boost variable?

I have a Books index with documents that look something like this:
{
"title": "To Kill a Mockingbird",
"summary": "To Kill a Mockingbird takes place in Alabama during the Depression..",
"type": "book",
"views": 36
},
{
"title": "The Genius of Birds",
"summary": "The Genius Of Birds shines a new light on a genuinely underrated kind..",
"type": "book",
"views": 10
},
{
"title": "Handbook of Bird Biology",
"summary": "The Handbook of Bird Biology is an essential reference for birdwatchers..",
"type": "book",
"views": 27
}
In Elasticsearch v5.1, below is my current simple query, which works on its own:
{
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"term": {
"type": "book"
}
}
]
}
},
"must": {
"multi_match": {
"query": "the bird",
"fields": [
"title",
"summary"
]
}
}
}
}
}
(Searching for the words the bird in the fields title and summary, where the type must be book.)
This gives me a simple result based on the title and summary fields, but I need to modify it a little further.
Is it possible to modify the query to look something like this:
..
"must": {
"multi_match": {
"query": "the bird",
"fields": [
"title^(0.1*views)",
"summary"
]
}
}
..
I don't know what this is called in ES, but basically I want to boost a field (title) by the value of another field (views).
Or, in the simplest form, something like:
field1^(field2)

Thanks to Aarchit Saxena for the hint in the comment section. I now know the feature is called field_value_factor, and by exploring further from there I finally got the query I needed.
The original query (above) now looks like this:
{
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"term": {
"type": "book"
}
}
]
}
},
"must": {
"multi_match": {
"query": "the bird",
"fields": [
"title",
"summary"
]
}
}
}
},
"functions": [
{
"field_value_factor": {
"field": "views",
"factor": 1,
"modifier": "none",
"missing": 1
}
}
],
"boost": 1,
"boost_mode": "multiply"
}
}
}
Thank you.
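To see what the function_score query above does to the score, here is a minimal Python sketch of the arithmetic. It assumes the documented behaviour of field_value_factor: the field value is multiplied by factor, passed through modifier, and then combined with the query score according to boost_mode. The function names and the sample query score of 2.0 are made up for illustration; this is a toy model, not Elasticsearch code. Note that the function multiplies the score of the whole query, not only the title match, and that "factor": 0.1 would reproduce the 0.1*views idea from the question.

```python
import math

def field_value_factor(value, factor=1.0, modifier="none", missing=1.0):
    """Toy model of field_value_factor (only a subset of modifiers)."""
    if value is None:            # the document has no value for the field
        value = missing
    x = factor * value           # factor is applied before the modifier
    if modifier == "none":
        return x
    if modifier == "log1p":      # common logarithm of (1 + x)
        return math.log10(1 + x)
    if modifier == "sqrt":
        return math.sqrt(x)
    raise ValueError(f"modifier not modelled in this sketch: {modifier}")

def boosted_score(query_score, views, **kwargs):
    """boost_mode "multiply": query score times the function value."""
    return query_score * field_value_factor(views, **kwargs)

# Hypothetical query score of 2.0 for each of the three books.
for views in (36, 10, 27):
    print(views, boosted_score(2.0, views, factor=0.1))
```

With "modifier": "none" the score grows linearly with views, so a very popular book can swamp text relevance; a modifier such as log1p or sqrt dampens that effect.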

Related

Elastic Search : Search keyword results of a specific category

I'm trying to build a query that searches for people's names within a specific country. If I provide John and USA as input, I should only get people named John (by the property: name) from the USA (by the property: country); results from other countries shouldn't appear.
What I have tried:
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
With the above query, the problem I'm seeing is that the results also include people who are not named John but are from the USA.
Expectation: filter the results for the given keyword down to the given country.
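Instead of should, you need to use a must clause for your name query. Because the bool query also has a filter clause, the should clause is optional by default: it only affects scoring, so every document that passes the filter is returned.
The query below should give you the expected results (note the must clause). Refer to the official bool query documentation to understand the difference, with examples.
"query": {
"bool": {
"must": [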
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"fuzziness": "AUTO",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}
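The difference between should and must here comes down to the default for minimum_should_match. The following is a minimal Python sketch of that rule, assuming the documented default (1 when a bool query has should clauses and no must or filter clauses, otherwise 0). The function bool_matches and its boolean inputs are hypothetical; it only models whether a document is returned, not how it is scored.

```python
def bool_matches(must=(), should=(), filter_=(), minimum_should_match=None):
    """Toy model of bool query matching.

    Each argument is a list of booleans saying whether that clause
    matched a given document.
    """
    if minimum_should_match is None:
        # Default: 1 only when there are should clauses and no must/filter.
        minimum_should_match = 1 if should and not (must or filter_) else 0
    return (all(must) and all(filter_)
            and sum(should) >= minimum_should_match)

# A person not named John, from the USA: name clause fails, filters pass.
original = bool_matches(should=[False], filter_=[True, True])
fixed = bool_matches(must=[False], filter_=[True, True])
alternative = bool_matches(should=[False], filter_=[True, True],
                           minimum_should_match=1)

print(original, fixed, alternative)
```

The last call shows an alternative to the answer above: keeping should but setting "minimum_should_match": 1 on the bool query also makes the name clause mandatory.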
You are using a should clause, which is why it is not working. Use must instead of should and it will resolve your issue.
You can also use "type": "phrase_prefix" to match Jo with John.
Change your query as shown below and it will work:
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "John",
"fields": ["username", "first_name", "last_name"],
"type":"phrase_prefix",
"minimum_should_match": "50%"
}
}
],
"filter": [
{
"match": {
"country": "USA"
}
},
{
"match": {
"is_citizen": true
}
}
]
}
}

How to boost specific terms in elastic search?

If I have the following mapping:
PUT /book
{
"settings": {},
"mappings": {
"properties": {
"title": {
"type": "text"
},
"author": {
"type": "text"
}
}
}
}
How can I boost specific authors higher than others?
For example, given the following:
PUT /book/_doc/1
{
"title": "car parts",
"author": "john smith"
}
PUT /book/_doc/2
{
"title": "car",
"author": "bob bobby"
}
PUT /book/_doc/3
{
"title": "soap",
"author": "sam sammy"
}
PUT /book/_doc/4
{
"title": "car designs",
"author": "joe walker"
}
GET /book/_search
{
"query": {
"bool": {
"should": [
{ "match": { "title": "car" }},
{ "match": { "title": "parts" }}
]
}
}
}
How do I make my search put books by "joe walker" at the top of the search results?
One solution is to use function_score, which lets you modify the score of the documents retrieved by a query.
Based on your mapping, try a query like this:
GET book/_search
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"match": {
"title": "car"
}
},
{
"match": {
"title": "parts"
}
}
]
}
},
"functions": [
{
"filter": {
"match": {
"author": "joe walker"
}
},
"weight": 30
}
],
"max_boost": 30,
"score_mode": "max",
"boost_mode": "multiply"
}
}
}
The query inside function_score is the same should query that you used.
We then want to take all the results of that query and give more weight (a higher score) to joe walker's books, so that they are prioritized over the others.
To achieve that, we add a function (inside functions) whose filter selects joe walker's books and whose weight is multiplied into the score of each matching document.
You can experiment with the weight and the other parameters.
Hope it helps.
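To make the effect of weight, score_mode, max_boost and boost_mode concrete, here is a rough Python sketch of how I understand the final score to be combined. The query scores are invented, only the modes used in the query above (plus "sum") are modelled, and the function function_score is a hypothetical stand-in rather than Elasticsearch code.

```python
def function_score(query_score, function_values, score_mode="max",
                   boost_mode="multiply", max_boost=float("inf")):
    """function_values: weights of the functions whose filter matched."""
    if not function_values:
        combined = 1.0               # no function matched: score unchanged
    elif score_mode == "max":
        combined = max(function_values)
    elif score_mode == "sum":
        combined = sum(function_values)
    else:
        raise ValueError(f"score_mode not modelled: {score_mode}")
    combined = min(combined, max_boost)   # cap the function score
    if boost_mode == "multiply":
        return query_score * combined
    raise ValueError(f"boost_mode not modelled: {boost_mode}")

# Invented query scores for the three books that match "car" / "parts".
docs = [
    {"id": 1, "author": "john smith", "score": 1.2},
    {"id": 2, "author": "bob bobby", "score": 0.6},
    {"id": 4, "author": "joe walker", "score": 0.5},
]

for d in docs:
    weights = [30] if d["author"] == "joe walker" else []
    d["final"] = function_score(d["score"], weights, max_boost=30)

ranking = [d["id"] for d in sorted(docs, key=lambda d: d["final"], reverse=True)]
print(ranking)
```

Two caveats. The boost is multiplicative, so it raises joe walker's books a lot but does not strictly guarantee first place if another document scores more than 30 times higher. Also, because author is a text field, the match filter accepts any author containing either joe or walker; a match_phrase filter would be stricter, if that is what is wanted.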

Elasticsearch: Conditionally filter query on fields if they exist in multi-index query

I have a query for a general search which spans multiple indices. Some of the indices have a field called is_published and some have a field called date_review, some have both.
I'm struggling to write a query which will search across fields and filter on the fields mentioned above but only if they exist. I have managed to achieve what I want on the individual fields using missing and/or exists, but it excludes the other variants.
In English, I want to keep documents in the result where:
is_published is true OR the field does not exist
date_review is in the future OR the field does not exist
So, if a document has is_published and it's false, remove it. If a document has date_review in the past, remove it. If it has is_published == false and date_review is in the future, remove it.
I hope this makes sense?
For the purpose of answering, assume the documents might look like this:
// Has `is_published` flag
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"is_published": true
}
// Has `date_review` flag
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"date_review": "2017-01-01"
}
// Has both `is_published` and `date_review` flags
{
"label": "My document",
"body": "Lorem ipsum doler et sum.",
"is_published": true,
"date_review": "2017-01-01"
}
At the moment, my [unfiltered] query looks like this:
{
"index": "index-1,index-2,index-3",
"type": "item",
"body": {
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "my search phrase",
"type": "phrase_prefix",
"fuzziness": null,
"fields": [
"label^3",
"body"
]
}
},
"filter": []
}
}
}
}
Very grateful for any pointers.
Thanks.
You can try a query like this one:
{
"query": {
"filtered": {
"query": {
"multi_match": {
"query": "my search phrase",
"type": "phrase_prefix",
"fuzziness": null,
"fields": [
"label^3",
"body"
]
}
},
"filter": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"missing": {
"field": "is_published"
}
},
{
"term": {
"is_published": true
}
}
]
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"missing": {
"field": "date_review"
}
},
{
"range": {
"date_review": {
"gt": "now"
}
}
}
]
}
}
]
}
}
}
}
}
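The filtered query and the missing filter used above were deprecated in Elasticsearch 2.x and removed in 5.0. On newer versions the same logic can be expressed, as far as I can tell, with a bool query whose filter clause uses must_not plus exists in place of missing. The sketch below builds that request body as a Python dict so it can be printed as JSON; the helper name absent_or is hypothetical, and the null fuzziness setting is simply left out.

```python
import json

def absent_or(field, clause):
    """Match documents where `field` is absent OR `clause` matches."""
    return {
        "bool": {
            "minimum_should_match": 1,
            "should": [
                {"bool": {"must_not": {"exists": {"field": field}}}},
                clause,
            ],
        }
    }

body = {
    "query": {
        "bool": {
            # Scored part: the same multi_match as in the answer above.
            "must": {
                "multi_match": {
                    "query": "my search phrase",
                    "type": "phrase_prefix",
                    "fields": ["label^3", "body"],
                }
            },
            # Non-scoring part: both conditions must hold.
            "filter": [
                absent_or("is_published", {"term": {"is_published": True}}),
                absent_or("date_review",
                          {"range": {"date_review": {"gt": "now"}}}),
            ],
        }
    }
}

print(json.dumps(body, indent=2))
```

Each absent_or block keeps a document when the field is missing or when the condition holds, and the surrounding filter list requires both blocks, which mirrors the nested bool/should structure of the original answer.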

ElasticSearch How to AND a nested query

I am trying to figure out how to AND my Elastic Search query. I've tried a few different variations but I am always hitting a parser error.
What I have is a structure like this:
{
"title": "my title",
"details": [
{ "name": "one", "value": 100 },
{ "name": "two", "value": 21 }
]
}
I have defined details as a nested type in my mappings. What I'm trying to achieve is a query where it matches a part of the title and it matches various details by the detail's name and value.
I have the following query which gets me nearly there but I haven't been able to figure out how to AND the details. As an example I'd like to find anything that has:
detail of one with value less than or equal to 100
AND detail of two with value less than or equal to 25
The following query only allows me to search by one detail name/value:
"query" : {
"bool": {
"must": [
{ "match": {"title": {"query": titleQuery, "operator": "and" } } },
{
"nested": {
"path": "details",
"query": {
"bool": {
"must": [
{ "match": {"details.name" : "one"} },
{ "range": {"details.value" : { "lte": 100 } } }
]
}
}
} // nested
}
] // must
}
}
As a second question, would it be better to query the title and then move the nested part of the query into a filter?
You were so close! Just add another "nested" clause in your outer "must":
POST /test_index/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "title",
"operator": "and"
}
}
},
{
"nested": {
"path": "details",
"query": {
"bool": {
"must": [
{"match": {"details.name": "one" } },
{ "range": { "details.value": { "lte": 100 } } }
]
}
}
}
},
{
"nested": {
"path": "details",
"query": {
"bool": {
"must": [
{"match": {"details.name": "two" } },
{ "range": { "details.value": { "lte": 25 } } }
]
}
}
}
}
]
}
}
}
Here is some code I used to test it:
http://sense.qbox.io/gist/1fc30d49a810d22e85fa68d781114c2865a7c92e
EDIT: Oh, the answer to your second question is "yes", though if you're using 2.0 things have changed a little.
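Two separate nested clauses are needed because a nested query is evaluated against each nested object on its own: its inner query has to be satisfied by a single details entry. The following toy Python model (not Elasticsearch code; nested_any is a made-up helper) shows the difference using the document from the question.

```python
doc = {
    "title": "my title",
    "details": [
        {"name": "one", "value": 100},
        {"name": "two", "value": 21},
    ],
}

def nested_any(doc, predicate):
    """A nested clause matches if at least one nested object satisfies
    the whole inner query."""
    return any(predicate(d) for d in doc["details"])

# Two nested clauses, ANDed by the outer must (the approach in the answer).
two_clauses = (
    nested_any(doc, lambda d: d["name"] == "one" and d["value"] <= 100)
    and nested_any(doc, lambda d: d["name"] == "two" and d["value"] <= 25)
)

# All four conditions inside one nested clause: a single object would
# have to be named both "one" and "two", which can never happen.
one_clause = nested_any(
    doc,
    lambda d: (d["name"] == "one" and d["value"] <= 100
               and d["name"] == "two" and d["value"] <= 25),
)

print(two_clauses, one_clause)  # True False
```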

Elasticsearch additional boost if multiple conditions are met

Imagine I have a document, which looks like this:
{
"Title": "Smartphones in United Kingdom",
"Text": "A huge text about the topic",
"CategoryTags": [
{
"CategoryID": 1,
"CategoryName": "Smartphone"
},
{
"CategoryID": 2,
"CategoryName": "Apple"
},
{
"CategoryID": 3,
"CategoryName": "Samsung"
}
],
"GeographyTags": [
{
"GeographyID": 1,
"GeographyName": "Western Europe"
},
{
"GeographyID": 2,
"GeographyName": "United Kingdom"
}
]
}
CategoryTags and GeographyTags are stored as nested subdocuments.
Say I search for "apple united kingdom" in my search bar. How would I write a query that boosts this document if it matches both a category and a geography at the same time?
I was thinking of a multi_match query, but I couldn't figure out how to deal with the nested documents here.
I was also thinking of nesting a must inside a should clause. Would that make sense?
POST /_search
{
"template": {
"size": "50",
"_source": {
"include": "Title"
},
"query": {
"filtered": {
"query": {
"bool": {
"minimum_number_should_match": "2<50%",
"must": [
{
"match": {
"Text": {
"query": "{{SearchPhrase}}"
}
}
}
],
"should": [
{
"match": {
"Title": {
"query": "{{SearchPhrase}}",
"type": "phrase",
"boost": "20"
}
}
},
{
"bool": {
"must": [
{
"nested": {
"path": "CategoryTags",
"query": {
"match": {
"CategoryTags.CategoryName": "{{SearchPhrase}}"
}
}
}
},
{
"nested": {
"path": "GeographyTags",
"query": {
"match": {
"GeographyTags.GeographyName": "{{SearchPhrase}}"
}
}
}
}
]
}
}
]
}
}
}
}
}
}

Resources