What is the purpose of score for a user in elastic search query result? - elasticsearch

The main difference between a must and a filter query is the _score calculation.
Can anyone tell me what the purpose of the score shown in the query result is?
How can we use the score?

The score tells you how relevant a given document is to the executed query. The higher the score, the more relevant the document. For example, consider the following documents:
# Doc 1
{
  "title": "What is the purpose of score for a user in elastic search query result?"
}
# Doc 2
{
  "title": "What is the purpose of score in life?"
}
Then, if you query for a title that includes the words purpose score elastic (something you would do, for example, in the Stack Overflow search bar), the first document will get a higher score and will appear at the top of the list of results.
On the other hand, filters only tell you whether or not a document matches the query. The answer is either yes or no, so there is no need to calculate a score.
For further details, have a read of the always very good Elastic documentation.
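As a rough illustration of the difference, the sketch below builds the same match clause twice as Python dictionaries: once under must (scoring context) and once under filter (yes/no matching only). The field name and query text are taken from the example above; the variable names are purely illustrative, and sending the bodies to a cluster is left out.

```python
import json

# The clause shared by both variants.
match_clause = {"match": {"title": "purpose score elastic"}}

# Scoring context: the clause contributes to each hit's _score.
scored_query = {"query": {"bool": {"must": [match_clause]}}}

# Filter context: the clause only decides whether a document matches.
filtered_query = {"query": {"bool": {"filter": [match_clause]}}}

print(json.dumps(scored_query, indent=2))
print(json.dumps(filtered_query, indent=2))
```

Both bodies should return the same set of documents; only the first is expected to rank them by relevance.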

Related

Boosting score in Elasticsearch based on aggregation result

I need to boost the score of my documents based on a particular value. That value can be obtained from an aggregation query.
Currently I am using two queries to do it and would like to achieve it in a single query.
1- The first query gets me the chapter with the highest number of occurrences, based on a simple term/match query.
2- Once I have the most frequently occurring chapter, I fire another query, which is the same query as above with an additional term query that has a boost factor of 10.
Any input on whether this can be accomplished in one query is welcome. Thanks in advance.
Ashit
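For reference, the two-step approach described in the question could look roughly like the sketch below. It is not a single-query solution. The field name "chapter", the aggregation name "top_chapter", the base query, and the sample response are all assumptions made for illustration; only the shape of a terms aggregation response (a list of buckets with key and doc_count) and the term query's boost option are standard.

```python
def top_bucket_key(response, agg_name):
    """Return the key of the bucket with the highest doc_count, or None."""
    buckets = response["aggregations"][agg_name]["buckets"]
    if not buckets:
        return None
    return max(buckets, key=lambda b: b["doc_count"])["key"]


def boosted_query(base_query, field, value, boost=10):
    """Wrap the base query and add an optional, boosted term clause."""
    return {
        "query": {
            "bool": {
                "must": [base_query],
                "should": [{"term": {field: {"value": value, "boost": boost}}}],
            }
        }
    }


# Hypothetical base query and a hand-written step-1 response.
base = {"match": {"text": "some search text"}}
step1_response = {
    "aggregations": {
        "top_chapter": {
            "buckets": [
                {"key": "chapter-3", "doc_count": 42},
                {"key": "chapter-1", "doc_count": 17},
            ]
        }
    }
}

chapter = top_bucket_key(step1_response, "top_chapter")
step2_body = boosted_query(base, "chapter", chapter)
```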

Scoring in Match phrase and Match query not as expected

I am a little confused about how scoring is done in match phrase and match queries, judging by my results.
For Match Phrase
I have a query like the one below
sd.Query(q =>
    q.MatchPhrase(m => m
        .Field(p => p.title)
        .Query("Test Article in Credit")
    ));
The results obtained are as below
a. document with "Test Article in Credit - Consumer" score 12.64
b. document with "Test Article with Credit -X" score 12.64
c. document with "Test Article in Credit - XYZ" score 10.92
d. document with "Test Article in Credit" score 10.22
e. document with "Test Article in Credit -Z" score 09.40
The first two are from one index and the last three are from a different index.
In my view, the fourth one should have the highest score and the second should have the lowest.
I am using the standard analyzer.
Similarly, my match query does the same:
a document with the title "Test" has a higher score than
a document with the title "Test Article in Credit".
I know scoring favors more relevant articles based on
the frequency of the term and the length of the text.
How can I restructure my query to return documents in the right order?
It's extremely hard from here to say why your expectations don't match up with the results, but I will do you one better: you can find out yourself by using the Explain API.
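One way to get at the same information is to set "explain": true in the search body, which attaches an _explanation tree (value, description, details) to each hit. The sketch below builds such a body for the phrase query from the question and flattens an explanation tree into readable lines. The sample tree is hand-written with made-up numbers purely to show the shape; real output from a cluster will contain different entries.

```python
def format_explanation(node, depth=0):
    """Flatten an _explanation tree into indented 'value  description' lines."""
    line = "{}{:.2f}  {}".format("  " * depth, node["value"], node["description"])
    lines = [line]
    for child in node.get("details", []):
        lines.extend(format_explanation(child, depth + 1))
    return lines


# Search body asking Elasticsearch to explain each hit's score.
search_body = {
    "explain": True,
    "query": {"match_phrase": {"title": "Test Article in Credit"}},
}

# Hypothetical explanation fragment (illustrative values only).
sample_explanation = {
    "value": 1.25,
    "description": "product of:",
    "details": [
        {"value": 2.5, "description": "idf (illustrative value)", "details": []},
        {"value": 0.5, "description": "fieldNorm (illustrative value)", "details": []},
    ],
}

for line in format_explanation(sample_explanation):
    print(line)
```

Comparing the idf and fieldNorm entries of two hits side by side usually shows which factor is responsible for an unexpected ordering.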

How can I get response(which is searched by score) in Elasticsearch?

I need your help. I want to run a search that uses the usual query conditions and also uses a score range as a condition. Can this be done? If you know how, I hope you can share.
I have an example in the picture:
In the picture, the score range is [0,1]. If I want to get only the hits whose scores are in [0.2,0.6], how do I do it? Please help! Excuse my English!
Elasticsearch provides a min_score field that can be included in a request body search to filter out documents with a _score less than a specified value.
There is no way to filter out documents with a _score greater than a certain value, but: why do you want to do this? Scores in Lucene by definition mean that documents were found matching your search query, and that some results are more relevant than others. I recommend that you read "What is Relevance?" in the Elasticsearch documentation, and "Apache Lucene - Scoring" for a basic understanding of how the scoring formula works.
Also, the Lucene score range isn't always [0,1]: it can be greater than 1.
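If an upper bound is really needed, one option is to let min_score handle the lower bound on the server and drop the higher-scoring hits on the client. The sketch below assumes the usual response layout (hits.hits[]._score); the match query, the field name, and the sample response are hypothetical.

```python
def hits_in_score_range(response, low, high):
    """Keep only hits whose _score lies in the closed interval [low, high]."""
    hits = response.get("hits", {}).get("hits", [])
    return [
        h for h in hits
        if h.get("_score") is not None and low <= h["_score"] <= high
    ]


# Lower bound enforced by Elasticsearch through min_score.
search_body = {
    "min_score": 0.2,
    "query": {"match": {"title": "example"}},  # hypothetical query
}

# Hand-written response used only to exercise the filter.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_score": 0.9},
            {"_id": "2", "_score": 0.55},
            {"_id": "3", "_score": 0.2},
        ]
    }
}

kept = hits_in_score_range(sample_response, 0.2, 0.6)
```

Keep in mind that client-side filtering interacts badly with pagination, since a page may end up with fewer hits than requested.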

elasticsearch scoring on multiple indexes

I have an index for every quarter of a year ("index-2015.1", "index-2015.2", ...).
I have around 30 million documents in each index.
Each document has a text field ('title').
My documents are sorted by (1) _score and (2) created date.
The problem is:
When searching for some text in the 'title' field across all indexes ("index-201*"), the first results always come from one index.
Let's say I am searching for 'title=home', and I have 10k documents in "index-2015.1" with title=home and 10k documents in "index-2015.2" with title=home. Then the first results are all documents from "index-2015.1" (and not from "index-2015.2", or mixed), even though "index-2015.2" contains documents with a later "created date" than those in "index-2015.1".
Is there a reason for this?
The reason is probably that scores are specific to the index. If you really have multiple indices, the score of the documents will be calculated (slightly) differently for each index.
Simply put, among other things, the score of a matching document depends on the query terms and their occurrences in the index. The score is calculated relative to the index (actually, by default, relative to each separate shard). Elasticsearch does some normalization, but I don't know the details.
I'm not really able to explain it well, but here's the article about scoring. I think you want to read at least the part about TF/IDF, which should explain why you get different scores.
https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html
EDIT:
So, after testing it a bit on my machine, it seems possible to use another search_type, to achieve a score suitable for your case.
POST /index1,index2/_search?search_type=dfs_query_then_fetch
{
  "query": {
    "match": {
      "title": "home"
    }
  }
}
The important part is search_type=dfs_query_then_fetch. If you are programming in Java or something similar, there should be a way to specify it in the request. For details about the search types, refer to the documentation.
Basically, it first collects the term frequencies from all affected shards (and indexes), so the score is computed from statistics that are shared across all of them.
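To see why per-shard statistics can push all documents from one index to the top, the sketch below compares the inverse document frequency of one term computed per index with the value computed from combined statistics. It uses Lucene's classic TF/IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)); newer Elasticsearch versions default to BM25, which uses a different formula but depends on per-shard statistics in the same way. The document counts are made up for illustration.

```python
import math


def classic_idf(doc_count, doc_freq):
    """IDF as defined by Lucene's classic TF/IDF similarity."""
    return 1.0 + math.log(doc_count / (doc_freq + 1.0))


# Hypothetical (total docs, docs containing the term) per index.
stats = {
    "index-2015.1": (1000, 10),   # term is rare here, so IDF is high
    "index-2015.2": (1000, 200),  # term is common here, so IDF is low
}

# Default behaviour: each index (shard) scores with its own statistics.
local_idf = {name: classic_idf(n, df) for name, (n, df) in stats.items()}

# dfs_query_then_fetch: statistics are combined before scoring.
total_docs = sum(n for n, _ in stats.values())
total_df = sum(df for _, df in stats.values())
global_idf = classic_idf(total_docs, total_df)

for name, value in sorted(local_idf.items()):
    print(name, round(value, 3))
print("global", round(global_idf, 3))
```

With local statistics, every match in the index where the term is rarer gets a higher IDF factor than any match in the other index; with the combined value, both indexes use the same factor.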
According to Andrei Stefan and Slomo, index boosting solves my problem:
body={
"indices_boost" : { "index-2015.4" : 1.4, "index-2015.3" : 1.3,"index-2015.2" : 1.2 ,"index-2015.1" : 1.1 }
}
EDIT:
Using search_type=dfs_query_then_fetch (as Slomo described) solves the problem in a better way (depending on your business model...)

Elasticsearch query too many results

I'm trying to set up a simple search that returns simple results with a custom ordering. The ordering I get back is fine and is based on a custom score.
The problem is that for this query
"query": {
"query_string": {
"query": query_term,
"fields": ["name_auto"],
}
}
NOTE: name_auto is an edge n-gram field in Elasticsearch.
I always get a result set, even if the query does not make any sense.
Example:
I have an Elasticsearch index populated with the names of all the Android applications.
If I search for face, I get back all the results related to it, ordered by number of comments on the Play Store, meaning [facebook, facebook messenger, ...].
The problem is that when I query for something like facesomeuselesschars, I still get the same results as before, but for sure there is nothing that matches "someuselesschars".
Can anybody help about
Elasticsearch will always return results that match your query, even if the scores of those results are poor. Your query for 'facesomeuselesschars' will match anything that has 'face' in it because of your n-grams (e.g. the first four characters of your query will match multiple tokens in your index).
The rest of the characters in your query will simply lower the score of the returned match, but not prevent it from being returned.
If you want to set a minimum score that a result must reach, you can use the min_score parameter.
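The overlap can be reproduced with a small sketch that mimics edge n-gram tokenization in plain Python. It assumes the same edge n-gram analysis is applied to the query text as to the indexed names, and the min_gram and max_gram values are guesses, since the actual analyzer settings are not shown in the question.

```python
def edge_ngrams(text, min_gram=1, max_gram=20):
    """Return the set of prefixes of text between min_gram and max_gram long."""
    text = text.lower()
    longest = min(max_gram, len(text))
    return {text[:n] for n in range(min_gram, longest + 1)}


indexed_tokens = edge_ngrams("facebook")
query_tokens = edge_ngrams("facesomeuselesschars")

# Tokens produced on both sides; any one of them is enough for a match.
shared = sorted(indexed_tokens & query_tokens, key=len)
print(shared)
```

The shared prefixes are what keep "facebook" in the result set; the remaining query tokens have nothing to match, so they cannot exclude the document, only affect its score.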
