How to limit the results in a multi match query? - elasticsearch

I use several match_phrase clauses in my search. However, I need to limit the results of each match_phrase clause separately: I want only 2 results for each one. I can't find any limit/size attribute for this. Do you know of any solution?
Example code:
{
"query": {
"bool": {
"should": [
{
"match_phrase": {
"text": {
"query": " Home is clear and big ",
"slop": 2
}
}
},
{
"match_phrase": {
"text": {
"query": "365 different company use our system in test",
"slop": 2
}
}
}
]
}
}
}

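Use size (the search API has no limit parameter):
{"size" : 3, "from":0, "query": ...}
Note that this caps the total number of hits, not the number of hits per clause.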

The simplest solution is to run two individual searches, one for each condition. The size parameter can then be set to retrieve only the first 2 results of each query.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html
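To make the "one search per condition" idea concrete, here is a minimal sketch that builds the request body for the multi search (_msearch) endpoint, so both searches go out in a single round trip. The helper name build_msearch_body is made up for illustration, and the sketch assumes the index name is given in the URL (for example POST /my-index/_msearch), which is why each header line is an empty object.

```python
import json

def build_msearch_body(field, phrases, size=2, slop=2):
    """Build an NDJSON body for _msearch: one small search per phrase.

    Hypothetical helper; assumes the target index is named in the request URL.
    """
    lines = []
    for phrase in phrases:
        # Header line: empty, so the index from the URL is used.
        lines.append(json.dumps({}))
        # Body line: an ordinary search request limited to `size` hits.
        lines.append(json.dumps({
            "size": size,
            "query": {
                "match_phrase": {field: {"query": phrase, "slop": slop}}
            },
        }))
    # The multi search format requires a trailing newline.
    return "\n".join(lines) + "\n"

body = build_msearch_body(
    "text",
    ["Home is clear and big", "365 different company use our system in test"],
)
print(body)
```

Each response in the returned list then holds at most 2 hits for its own phrase, so you also know which condition each hit satisfied.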
The boolean should query will not distinguish which condition has been satisfied: it returns documents for which at least one of the two conditions holds. The scores for the two matches will be combined into a single score but it will be impossible to tell which s

Related

Elasticsearch - Impact of adding Boost to query

I have a very simple Elastic query mentioned below.
{
"query": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"match": {
"tag": {
"query": "Audience: PRO Brand: Samsung",
"boost": 3,
"operator": "and"
}
}
},
{
"match": {
"tag": {
"query": "audience: PRO brand samsung",
"boost": 2,
"operator": "or"
}
}
}
]
}
}
]
}
}
}
I want to know whether adding a boost to the query has any performance impact, and whether boosting helps on a very large data set where the search words occur frequently.
Elasticsearch already applies a boost parameter with a default value (1.0), so IMO setting a different value won't make much difference to performance, but you should measure it yourself.
Regarding your second question: adding a boost definitely makes sense when your search words are common, because it helps surface the relevant documents. For example, suppose you are searching for "query" in an index containing Elasticsearch posts (where the word "query" will be very common), but you want to give more weight to documents that have the tag elasticsearch-query. Adding a boost in this case will give you more relevant results.
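As a rough illustration of that example, the sketch below builds a bool query where the text match is required and a tag match only adds a boosted score. The field names body and tags, the helper name, and the boost value are all assumptions, not something taken from the question.

```python
import json

def boosted_tag_query(text, tag, tag_boost=3.0):
    """Require a text match; reward documents carrying the given tag.

    Field names "body" and "tags" are hypothetical placeholders.
    """
    return {
        "query": {
            "bool": {
                # Every hit must match the common search word.
                "must": [{"match": {"body": text}}],
                # A matching tag is optional but raises the score.
                "should": [
                    {"match": {"tags": {"query": tag, "boost": tag_boost}}}
                ],
            }
        }
    }

request = boosted_tag_query("query", "elasticsearch-query")
print(json.dumps(request, indent=2))
```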

Give more score to documents that contains all query terms

I have a problem with scoring in Elasticsearch. When a user enters a query that contains 3 terms, sometimes a document that contains two of the words many times outscores a document that contains all three words. For example, if the user enters "elasticsearch query tutorial", I want documents that contain all of these words to score higher than a document with a lot of "tutorial" and "elasticsearch" terms in it.
PS: I am using minimum should match and shingles in my query. Although they made ranking a lot better, they did not solve this problem completely. I need something like the query coordination factor in Lucene's practical scoring function. Is there anything like that in Elasticsearch with BM25?
One of the possible solutions could be using function score:
{
"query": {
"function_score": {
"query": { "match_all": {} },
"functions": [
{
"filter": { "match": { "title": "elasticsearch" } },
"weight": 1
},
{
"filter": { "match": { "title": "tutorial" } },
"weight": 1
}
],
"score_mode": "sum"
}
}
}
In this case, documents that match more of the terms would clearly rank higher. However, this would completely ignore TF-IDF or any other scoring factors.
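The function score approach above can be generated for any number of terms. This is only a sketch: the helper name is invented, and splitting the user input on whitespace is an assumption that will not match what an analyzer does with punctuation, stemming, or stop words.

```python
import json

def term_count_function_score(field, user_query):
    """Score each document by how many of the query terms it contains.

    Hypothetical helper; mirrors the match_all + weight-per-filter shape above.
    """
    terms = user_query.split()  # naive tokenization, an assumption
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                # One filter per term; each matching filter contributes weight 1.
                "functions": [
                    {"filter": {"match": {field: term}}, "weight": 1}
                    for term in terms
                ],
                # Add up the weights of all filters that matched.
                "score_mode": "sum",
            }
        }
    }

request = term_count_function_score("title", "elasticsearch query tutorial")
print(json.dumps(request, indent=2))
```

If the relevance score should still count, one option to experiment with is replacing match_all with a regular match query and setting boost_mode, but I have not tested how that balances against BM25 on real data.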

ElasticSearch Ignoring words having one single letter

I'm a beginner with Elasticsearch. I have an application that uses Elasticsearch to look up ingredients in a given food or fruit...
I'm facing a scoring problem when the user types, for example, "Vitamine d".
Elasticsearch gives the best score to the "vitamine" phrase, even though the phrase "Vitamine D" exists and should normally have the highest score.
It looks like Elasticsearch ignores the second word when it is just one letter ("d" in my case).
I tried another example, "vitamine b12", and got the correct scoring.
Here is the query that the application sends to the server:
{
"from": 0,
"size": 5,
"query": {
"bool": {
"must": [
{
"match": {
"constNomFr": {
"query": "vitamine d"
}
}
}
],
"should": [
{
"prefix": {
"constNomFr": {
"value": "vitamine d",
"boost": 2
}
}
}
]
}
},
"_source": {
"excludes": [
"alimentDtos"
]
}
}
What could I modify to make it work?
Thank you so much.
If you can identify your ingredients, I recommend indexing them in a separate field "ingredients" and setting its type to keyword. This way you can use a term filter, and you can even run aggregations.
You may already have your documents indexed that way. In that case, if you are using the default mapping, just run your query against your_field_name.keyword.
If you don't have your ingredients indexed as an array, then you should take a look at the Elasticsearch analyzers to choose or build the right one.
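Here is a small sketch of the .keyword suggestion, assuming the field was created by the default dynamic mapping so that a keyword sub-field exists. The helper name is made up, and keep in mind that a term query on a keyword field matches the stored value exactly, including case.

```python
import json

def exact_keyword_search(field, value, size=5):
    """Term query against the keyword sub-field of a text field.

    Assumes `<field>.keyword` exists (default dynamic mapping).
    """
    return {
        "from": 0,
        "size": size,
        "query": {
            # Not analyzed: the value must equal the indexed value exactly.
            "term": {f"{field}.keyword": {"value": value}}
        },
    }

request = exact_keyword_search("constNomFr", "Vitamine D")
print(json.dumps(request, indent=2))
```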

Elasticsearch fuzzy matching: How can I get direct hits first?

I'm using Elasticsearch to search names in a database, and I want it to be fuzzy to allow for minor spelling errors. Based on the advice I've found on the matter, I'm using "match" and "fuzziness" instead of "fuzzy", which definitely seems to be more accurate. This is my query:
{ "query":
{ "match":
{ "last_name":
{ "query": "Beach",
"type": "phrase",
"fuzziness": 2
}
}
}
}
However, even though I have numerous results with last_name "Beach" (I know there's at least 100), I also get results with last_name "Beech" and "Berch" in the first 10 hits returned by my query. Can someone help me figure out how to get the exact matches first?
Try changing your query to a bool query with 2 should clauses.
The first one is your current query, and the second is a query that only returns exact matches; give that one a big boost (like 10.0).
That should put your exact matches on top while still listing your partial matches.
I tried to edit Constantijn's answer above to include a sample based on it, but the edit is still pending approval, so I will just put the sample here instead:
{
"query": {
"bool": {
"should": [
{
"match": {
"last_name": {
"query": "Beach",
"fuzziness": 2,
"boost": 1
}
}
},
{
"match": {
"last_name": {
"query": "Beach",
"boost": 10
}
}
}
]
}
}
}

elasticsearch boost importance of exact phrase match

Is there a way in Elasticsearch to boost the importance of the exact phrase appearing in the document?
For example, if I was searching for the phrase "web developer" and the words "web developer" appeared together, they would be boosted by 5 compared to "web" and "developer" appearing separately throughout the document. Thereby any document that contained "web developer" together would appear first in the results.
You can combine different queries using a bool query, and you can assign a different boost to each of them as well. Let's say you have a regular match query for both terms, regardless of their positions, and then a phrase query with a higher boost.
Something like the following:
{
"query": {
"bool": {
"should": [
{
"match": {
"field": "web developer"
}
},
{
"match_phrase": {
"field": {
"query": "web developer",
"boost": 5
}
}
}
],
"minimum_should_match": 1
}
}
}
As an alternative to javanna's answer, you could do something similar with must and should clauses within a bool query:
{
"query": {
"bool": {
"must": {
"match": {
"field": {
"query": "web developer",
"operator": "and"
}
}
},
"should": {
"match_phrase": {
"field": "web developer"
}
}
}
}
}
Untested, but I believe the must clause here will match results containing both 'web' and 'developer' and the should clause will score phrases matching 'web developer' higher.
You could try using rescore to run an exact phrase match on your initial results. From the docs:
"Rescoring can help to improve precision by reordering just the top (eg 100 - 500) documents returned by the query and post_filter phases, using a secondary (usually more costly) algorithm, instead of applying the costly algorithm to all documents in the index."
https://www.elastic.co/guide/en/elasticsearch/reference/current/filter-search-results.html#rescore
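A minimal sketch of what such a rescore request could look like: a cheap match query picks the candidates, and a match_phrase query reorders only the top window. The helper name, the window size, and the weights are illustrative guesses, not tuned values.

```python
import json

def phrase_rescore_search(field, text, window_size=100, phrase_weight=5.0):
    """Match on individual terms first, then rescore the top hits by phrase."""
    return {
        # First pass: any document containing the terms.
        "query": {"match": {field: text}},
        "rescore": {
            # Only the top `window_size` hits per shard are rescored.
            "window_size": window_size,
            "query": {
                "rescore_query": {"match_phrase": {field: text}},
                # Weight of the original score in the combined score.
                "query_weight": 1.0,
                # Weight of the phrase score in the combined score.
                "rescore_query_weight": phrase_weight,
            },
        },
    }

request = phrase_rescore_search("field", "web developer")
print(json.dumps(request, indent=2))
```

Documents outside the window keep their original order, so the window has to be large enough to contain the phrase matches you care about.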
I used the sample query below in my case, and it works. It returns both exact and fuzzy results, but the exact ones are boosted!
{ "query": {
"bool": {
"should": [
{
"match": {
"name": "pala"
}
},
{
"fuzzy": {
"name": "pala"
}
}
]
}}}
I do not have enough reputation to comment on James Adison's answer, which I agree with.
What is still missing is the boost factor, which can be done using the following syntax:
{
"match_phrase":
{
"fieldName": {
"query": "query string for exact match",
"boost": 10
}
}
}
I think that is already the default behaviour of a match query with the "or" operator. It will rank documents with the phrase "web developer" first and then those with the separate terms "web" or "developer". You can still boost your query using the answers above, though. Correct me if I'm wrong.
