Promote results in Elasticsearch

I searched the documentation for a way to promote Elasticsearch results when a specific field has a certain value, but I didn't find any good practice. For example, say a user lives in Paris: when that user runs a search, I want the documents relevant to Paris to appear first, or at least to be promoted.

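There is a lot to this, but the feature to research is "boosting". It can be applied at the mapping level or at the query level. Note that index-time (mapping-level) boosts were deprecated in Elasticsearch 5.0, so the query-level form is generally the safer choice.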
Mapping example (a boost of 2 gives matches on this field a 2x boost to the final score):
{
"mappings": {
"_doc": {
"properties": {
"location": {
"type": "keyword",
"boost": 2
}
}
}
}
}
Query example (the should clause applies a 3x boost if the location matches):
GET /_search
{
"query": {
"bool": {
"must": {
"match": {
"content": {
"query": "full text search",
"operator": "and"
}
}
},
"should": [
{ "term": {
"location": {
"value": "xxx",
"boost": 3
}
}}
]
}
}
}
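The query-level approach can also be assembled programmatically, so that the boosted location comes from the current user. The sketch below is only an illustration in Python that builds the request body; the function name build_promoted_query, its parameters, and the value "paris" are hypothetical, while the field names content and location are taken from the example above.

```python
import json


def build_promoted_query(text, user_location, boost=3.0):
    """Build a bool query: `must` matches the text, `should` promotes a location."""
    return {
        "query": {
            "bool": {
                # Every hit has to satisfy the full-text clause.
                "must": {
                    "match": {"content": {"query": text, "operator": "and"}}
                },
                # Optional clause: documents that match it get extra score.
                "should": [
                    {"term": {"location": {"value": user_location, "boost": boost}}}
                ],
            }
        }
    }


# "paris" is a made-up value standing in for the current user's location.
body = build_promoted_query("full text search", "paris")
print(json.dumps(body, indent=2))
```

Because the location sits in should next to a must clause, documents from other locations still match; they simply score lower.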

Related

Proximity-Relevance in elasticsearch

I have a JSON record in Elasticsearch with these fields:
"streetName": "5 Street",
"name": ["Shivam Apartments"]
I tried the query below, but it returns nothing once I add the streetName bool clause:
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": {
"match": {
"name": {
"query": "shivam apartments",
"minimum_should_match": "80%"
}
}
}
}
},
{
"bool": {
"must": {
"match": {
"streetName": {
"query": "5 street",
"minimum_should_match": "80%"
}
}
}
}
}
]
}
}
}
Document Mapping
{
"rabc_documents": {
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "autocomplete_analyzer",
"position_increment_gap": 0
},
"streetName": {
"type": "keyword"
}
}
}
}
}
According to the Elasticsearch documentation on the keyword type,
"Keyword fields are only searchable by their exact value".
Keyword values are also case sensitive.
Taking this into account:
Searching for "5 street" will not match "5 Street" ('s' vs 'S') on a keyword field.
minimum_should_match has no useful effect on a keyword field, because the whole value is a single term.
Suggestion: for partial matches, use a "text" mapping instead of "keyword". Keyword fields are meant for filtering, term-based aggregations, and the like.
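To make the difference concrete, here is a toy model in Python. It does not call Elasticsearch; the analyze function is an assumed stand-in for a simple analyzer (lowercase, then split on whitespace), and all function names are hypothetical.

```python
def keyword_match(stored, query):
    # A keyword field is indexed as one untouched term, so only an exact match hits.
    return stored == query


def analyze(text):
    # Toy analyzer (assumption): lowercase, then split on whitespace.
    return text.lower().split()


def text_match_all(stored, query):
    # Every analyzed query term must appear among the analyzed stored terms.
    stored_terms = set(analyze(stored))
    return all(term in stored_terms for term in analyze(query))


print(keyword_match("5 Street", "5 street"))   # False: case differs
print(keyword_match("5 Street", "5 Street"))   # True: exact value
print(text_match_all("5 Street", "5 street"))  # True: both sides are lowercased
```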

How can we use exists query in tandem with the search query?

I have a scenario in Elasticsearch where my indexed docs look like this:
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123", "note": "imp"}
The requirement is to run a term match query and score the results by relevance, which is straightforward. The additional aspect is that any document in the search results that has a note field should be given higher relevance. How can we achieve this with a DSL query? Using exists we can check which docs contain a note, but how do we integrate that with the match query? I have tried a lot of ways but none worked.
With ES 5, you could boost your exists query to give a higher score to documents with a note field. For example,
{
"query": {
"bool": {
"must": {
"match": {
"name": {
"query": "your term"
}
}
},
"should": {
"exists": {
"field": "note",
"boost": 4
}
}
}
}
}
With ES 2, you could try boosting a filtered subset with function_score:
{
"query": {
"function_score": {
"query": {
"match": { "name": "your term" }
},
"functions": [
{
"filter": { "exists" : { "field" : "note" }},
"weight": 4
}
],
"score_mode": "sum"
}
}
}
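As a rough mental model of how this function_score query combines scores, the Python sketch below mimics the arithmetic only. It rests on two assumptions that are not stated in the answer: boost_mode is left at its default of multiply, and a document that matches no function filter gets a neutral function score of 1. The function name and the sample score 1.5 are made up.

```python
def function_score(query_score, has_note, weight=4.0):
    # score_mode "sum": add up the weights of the functions whose filter matches.
    matching_weights = [weight] if has_note else []
    # Assumption: when no function matches, the function score is neutral (1).
    combined = sum(matching_weights) if matching_weights else 1.0
    # Assumption: boost_mode is left at its default, "multiply".
    return query_score * combined


print(function_score(1.5, has_note=True))   # 6.0
print(function_score(1.5, has_note=False))  # 1.5
```

Under those assumptions, a document with a note field ends up with four times the score of an otherwise identical document without one.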
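I believe you are looking for the boosting query feature. Note that the documentation describes negative_boost as a value between 0 and 1 that demotes documents matching the negative query, and that the filtered query was removed in ES 5.0, so treat the snippet below as a sketch to adapt rather than a drop-in query.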
https://www.elastic.co/guide/en/elasticsearch/reference/5.1/query-dsl-boosting-query.html
{
"query": {
"boosting": {
"positive": {
<put your original query here>
},
"negative": {
"filtered": {
"filter": {
"exists": {
"field": "note"
}
}
}
},
"negative_boost": 4
}
}
}

Search specific fields in nested documents as one document

I have the following structure:
{
"mappings": {
"document": {
"properties": {
"title": {
"type": "string"
},
"paragraphs": {
"type": "nested",
"properties": {
"paragraph": {
"type" : "object",
"properties" : {
"content": { "type": "string"},
"number":{"type":"integer"}
}
}
}
}
}
}
}
}
With these sample documents
{
"title":"Dubai seeks cause of massive hotel fire at New Year",
"paragraphs":[
{"paragraph": {"number": "1", "content":"Firefighters managed to subdue the blaze, but part of the Address Downtown Hotel is still smouldering."}},
{"paragraph": {"number": "2", "content":"A BBC reporter says a significant fire is still visible on the 20th floor, where the blaze apparently started."}},
{"paragraph": {"number": "3", "content":"The tower was evacuated and 16 people were hurt. But a fireworks show went ahead at the Burj Khalifa tower nearby."}},
{"paragraph": {"number": "4", "content":"The Burj Khalifa is the world's tallest building and an iconic symbol of the United Arab Emirates (UAE)."}}]
}
{
"title":"Munich not under imminent IS threat",
"paragraphs":[{"paragraph": {"number": "1", "content":"German officials say there is no sign of any imminent terror attack, after an alert that shut down two Munich railway stations on New Year's Eve."}}]
}
I can now search each paragraph using
{
"query": {
"nested": {
"path": "paragraphs", "query": {
"query_string": {
"default_field": "paragraphs.paragraph.content",
"query": "Firefighters AND still"
}
}
}
}
}
Question: How can I write a query that searches several paragraphs, but only the content field?
This works, but searches all fields
{
"query": {
"query_string": {
"query": "Firefighters AND apparently AND 1"
}
}
}
It matches Firefighters from paragraph 1 and apparently from paragraph 2, which is what I want. However, I do not want 1 to be matched, since it isn't in a content field.
Clarification: The first search runs per paragraph, which I sometimes want. However, I also want to be able to search the whole document (all paragraphs) at other times.
Solution
I added "include_in_parent": true as it is mentioned in https://www.elastic.co/guide/en/elasticsearch/reference/1.7/mapping-nested-type.html
The way you are querying is wrong because nested documents are indexed separately. See the last paragraph of the documentation page.
Your query
{
"query": {
"nested": {
"path": "paragraphs",
"query": {
"query_string": {
"default_field": "paragraphs.paragraph.content",
"query": "Firefighters AND apparently"
}
}
}
}
}
is looking for both words in the same paragraph, which is why you are not getting a result. You need to query them separately, like this:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "paragraphs",
"query": {
"match": {
"paragraphs.paragraph.content": "firefighters"
}
}
}
},
{
"nested": {
"path": "paragraphs",
"query": {
"match": {
"paragraphs.paragraph.content": "apparently"
}
}
}
}
]
}
}
}
This will give you the right results.
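When the list of terms varies, the same shape can be generated rather than written by hand. The Python sketch below only builds the request body; the helper names nested_term and all_terms_anywhere are hypothetical, and the path and field defaults are copied from the mapping in the question.

```python
import json


def nested_term(path, field, term):
    """One nested clause: `term` must occur in some nested object under `path`."""
    return {"nested": {"path": path, "query": {"match": {field: term}}}}


def all_terms_anywhere(terms, path="paragraphs",
                       field="paragraphs.paragraph.content"):
    """Require every term, each in any (not necessarily the same) nested object."""
    clauses = [nested_term(path, field, term) for term in terms]
    return {"query": {"bool": {"must": clauses}}}


body = all_terms_anywhere(["firefighters", "apparently"])
print(json.dumps(body, indent=2))
```

Each term gets its own nested clause, so the terms may be satisfied by different paragraphs while the search still touches only the content field.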
As a side note, I do not think you need the object datatype inside paragraphs. The following will work fine too:
"paragraphs": {
"type": "nested",
"properties": {
"content": {
"type": "string"
},
"number": {
"type": "integer"
}
}
}
Hope this helps!!

Exact and fuzzy search

My setup:
I have some documents with name "Apple", "Apple delicous", ...
This is my query:
GET p_index/_search
{
"query": {
"bool": {
"should": [
{"match": {
"name": "apple"
}},
{ "fuzzy": {
"name": "apple"
}}
]
}
}
}
I want the exact match to be shown first and then the fuzzy one:
apple
apple delicous
Second, I am wondering why I do not get any result if I enter only app in the search:
GET p_index/_search
{
"query": {
"bool": {
"should": [
{"match": {
"name": "app"
}},
{ "fuzzy": {
"name": "app"
}}
]
}
}
}
There are two problems here.
1) To give a higher score to an exact match, you could try adding "index": "not_analyzed" to a sub-field of your name field, like this:
"name": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
After that your query would look like this
{
"query": {
"bool": {
"should": [
{
"match": {
"name": "apple"
}
},
{
"match": {
"name.raw": {
"query": "apple",
"boost": 5
}
}
}
]
}
}
}
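This will give a higher score to the document with "apple" than to the one with "apple delicous". Note that name.raw is not analyzed, so it only matches when the stored value is exactly "apple", including case.
2) To better understand fuzziness, you should go through this and this article.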
From the Docs
The fuzziness parameter can be set to AUTO, which results in the
following maximum edit distances:
0 for strings of one or two characters
1 for strings of three, four, or five characters
2 for strings of more than five characters
So, the reason your fuzzy query did not return apple for app is that the edit distance between those words is 2, and since "app" is only a three-letter word, the fuzziness value is 1. You could achieve the desired result with the following query:
{
"query": {
"fuzzy": {
"name": {
"value": "app",
"fuzziness": 2
}
}
}
}
I seriously would not recommend using this query, because it will return bizarre results: it will also return cap, arm, pip and a lot of other words, as they fall within an edit distance of 2.
This would be a better query:
{
"query": {
"fuzzy": {
"name": {
"value": "appl"
}
}
}
}
It will return apple.
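The edit-distance reasoning above can be checked with a small Python sketch. It implements plain Levenshtein distance and the AUTO thresholds quoted from the docs; the function names are hypothetical, and the sketch leaves out transpositions, which Elasticsearch counts as a single edit by default.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def auto_fuzziness(term):
    """Maximum edit distance allowed by fuzziness AUTO, per the quoted thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


for query in ("app", "appl"):
    distance = levenshtein(query, "apple")
    # Prints the query, its distance to "apple", and whether AUTO would allow it.
    print(query, distance, distance <= auto_fuzziness(query))
```

For app the distance to apple is 2 but AUTO allows only 1, so there is no match; for appl the distance is 1, which is within the limit.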
I hope this helps.
I think this will help you:
{"query":{"bool":{"must":[{"function_score":{"query":{"multi_match":{"query":"airetl","fields":["brand_lower"],"boost":1,"fuzziness":"AUTO","prefix_length":1}}}}]}}

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following Elasticsearch query:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, if I query using "match", I get an error message with a 400 status response:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is the match query not supported in nested bool filters?
Since the term query looks for the exact term in the field’s inverted index and I want to query gender data as a case-insensitive field, which approach should I try?
Settings of the index:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for field Gender:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is because there is no match filter, only match queries, even though there are both term queries and term filters.
Your query can be as simple as this, i.e. no need for a filtered query, simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
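The difference between term and match on this field can be pictured with a toy model in Python. It does not talk to Elasticsearch; analyzer_keyword mirrors the custom analyzer from the question (keyword tokenizer plus lowercase filter), while the stored value "Male" and the helper names are assumptions made for illustration.

```python
def analyzer_keyword(text):
    # Mirrors the question's analyzer: keyword tokenizer + lowercase filter,
    # so the whole input becomes one lowercased term.
    return [text.lower()]


# What ends up in the inverted index for a document with gender "Male".
indexed_terms = set(analyzer_keyword("Male"))


def term_lookup(value):
    # term: the value is looked up as-is, with no analysis.
    return value in indexed_terms


def match_lookup(value):
    # match: the value is run through the field's analyzer first.
    return any(term in indexed_terms for term in analyzer_keyword(value))


print(term_lookup("malE"))   # False: "malE" is not in the index
print(term_lookup("male"))   # True: already lowercase
print(match_lookup("malE"))  # True: analyzed to "male" before lookup
```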
This answer is for Elasticsearch 7.x. As I understand the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for these fields should look like this:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search request would be:
{
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
}
This search request returns all documents where gender is "Male", "MALE", "mALe", etc. So even if you indexed the gender field as "mALe", the match query for "gender": "malE" will still retrieve it. This is because gender is a text field: with the default analyzer, both the indexed value and the match query's input are lowercased before terms are compared. Still, it should not be hard for a client of the API to pass a lowercase value to the match query in the first place. As for the sentiment field, since it's a keyword field, you can also search for values that contain spaces, like very positive.
