Why is Elasticsearch returning the wrong relevance score? - elasticsearch

I am learning Elasticsearch. I indexed the following data into the megacorp index under the employee type (shown here as a search response):
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.6931472,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.6931472,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
}
]
}
}
Then I ran the following request:
GET /megacorp/employee/_search
{
"query" : {
"match" : {
"about" : "rock climbing"
}
}
}
However, the result I got is as follows:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.6682933,
"hits" : [
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.6682933,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
]
}
},
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
}
]
}
}
My doubt is that the relevance score for the following record:
{
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
]
}
}
is lower than that of the previous one. I ran the query with
explain: true
and got the following result:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.6682933,
"hits" : [
{
"_shard" : "[megacorp][2]",
"_node" : "pGtCz_FvSTmteJwQKvn_lg",
"_index" : "megacorp",
"_type" : "employee",
"_id" : "2",
"_score" : 0.6682933,
"_source" : {
"first_name" : "Jane",
"last_name" : "Smith",
"age" : 32,
"about" : "I like to collect rock albums",
"interests" : [
"music"
],
"fielddata" : true
},
"_explanation" : {
"value" : 0.6682933,
"description" : "sum of:",
"details" : [
{
"value" : 0.6682933,
"description" : "weight(about:rock in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.6682933,
"description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details" : [
{
"value" : 0.6931472,
"description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
"details" : [
{
"value" : 1.0,
"description" : "docFreq",
"details" : [ ]
},
{
"value" : 2.0,
"description" : "docCount",
"details" : [ ]
}
]
},
{
"value" : 0.96414346,
"description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
"details" : [
{
"value" : 1.0,
"description" : "termFreq=1.0",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "parameter k1",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "parameter b",
"details" : [ ]
},
{
"value" : 5.5,
"description" : "avgFieldLength",
"details" : [ ]
},
{
"value" : 6.0,
"description" : "fieldLength",
"details" : [ ]
}
]
}
]
}
]
}
]
}
},
{
"_shard" : "[megacorp][3]",
"_node" : "pGtCz_FvSTmteJwQKvn_lg",
"_index" : "megacorp",
"_type" : "employee",
"_id" : "1",
"_score" : 0.5753642,
"_source" : {
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests" : [
"sports",
"music"
],
"fielddata" : true
},
"_explanation" : {
"value" : 0.5753642,
"description" : "sum of:",
"details" : [
{
"value" : 0.2876821,
"description" : "weight(about:rock in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.2876821,
"description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details" : [
{
"value" : 0.2876821,
"description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
"details" : [
{
"value" : 1.0,
"description" : "docFreq",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "docCount",
"details" : [ ]
}
]
},
{
"value" : 1.0,
"description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
"details" : [
{
"value" : 1.0,
"description" : "termFreq=1.0",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "parameter k1",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "parameter b",
"details" : [ ]
},
{
"value" : 6.0,
"description" : "avgFieldLength",
"details" : [ ]
},
{
"value" : 6.0,
"description" : "fieldLength",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.2876821,
"description" : "weight(about:climbing in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.2876821,
"description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details" : [
{
"value" : 0.2876821,
"description" : "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
"details" : [
{
"value" : 1.0,
"description" : "docFreq",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "docCount",
"details" : [ ]
}
]
},
{
"value" : 1.0,
"description" : "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
"details" : [
{
"value" : 1.0,
"description" : "termFreq=1.0",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "parameter k1",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "parameter b",
"details" : [ ]
},
{
"value" : 6.0,
"description" : "avgFieldLength",
"details" : [ ]
},
{
"value" : 6.0,
"description" : "fieldLength",
"details" : [ ]
}
]
}
]
}
]
}
]
}
}
]
}
}
Can you please tell me the reason behind this?

Short answer: Relevance in Elasticsearch is not a simple topic :) Details below.
I was trying to reproduce your case...
First, I indexed the two documents:
POST /megacorp/employee/1
{
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}
POST /megacorp/employee/2
{
"first_name": "Jane",
"last_name": "Smith",
"age": 32,
"about": "I like to collect rock albums",
"interests": [
"music"
]
}
and later I used your query:
GET /megacorp/employee/_search
{
"query": {
"match": {
"about": "rock climbing"
}
}
}
My results were totally different:
{
"took": 89,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.5753642,
"hits": [
{
"_index": "megacorp",
"_type": "employee",
"_id": "1",
"_score": 0.5753642,
"_source": {
"first_name": "John",
"last_name": "Smith",
"age": 25,
"about": "I love to go rock climbing",
"interests": [
"sports",
"music"
]
}
},
{
"_index": "megacorp",
"_type": "employee",
"_id": "2",
"_score": 0.2876821,
"_source": {
"first_name": "Jane",
"last_name": "Smith",
"age": 32,
"about": "I like to collect rock albums",
"interests": [
"music"
]
}
}
]
}
}
As you can see, the results are in the "expected" order. Note, however, that the _score values are totally different from yours.
The question is: Why? What happened?
A detailed answer for this situation is given in the article Practical BM25 - Part 1: How Shards Affect Relevance Scoring in Elasticsearch.
In short: as you may have noticed, Elasticsearch stores documents split across shards. For speed, it uses the query_then_fetch strategy by default. This means that Elasticsearch first asks every shard for its results and later fetches and merges them before presenting them to the user. The same applies to the score calculation: each shard computes scores using only its own local statistics.
As you can see, in our results 5 shards were queried. Elasticsearch uses 5 shards by default if not specified at index creation (it can be set with the number_of_shards parameter). That is why our scores are different. Moreover, if you try this again yourself, there is a good chance you will get yet another set of results. Everything depends on how the documents are distributed among the shards. If you set number_of_shards to 1 for this index, you will get the same scores every time.
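To see concretely where the numbers come from, each shard's local statistics from the explain output can be plugged into the BM25 formula. A minimal sketch in Python, assuming the default parameters k1 = 1.2 and b = 0.75:

```python
import math

def idf(doc_freq, doc_count):
    # idf = log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

def tf_norm(freq, field_len, avg_field_len, k1=1.2, b=0.75):
    # tfNorm = (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))
    return (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * field_len / avg_field_len))

# Jane's shard held 2 documents with the "about" field, 1 of them containing "rock":
jane = idf(1, 2) * tf_norm(1.0, 6.0, 5.5)        # ~0.6682933, as in the explain

# John's shard held only 1 document, so idf is the same for "rock" and "climbing":
john = 2 * idf(1, 1) * tf_norm(1.0, 6.0, 6.0)    # ~0.5753642, as in the explain
```

With a single shard (or simply more data per shard), the local statistics converge and John's document outscores Jane's, as in the reproduction above.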
One additional point, also mentioned in the article:
People start loading just a few documents into their index and ask
“why does document A have a higher/lower score than document B” and
sometimes the answer is that the user has a relatively high ratio of
shards to documents so that the scores are skewed across different
shards.
Elasticsearch was designed to handle large amounts of data, and the more data you put into an index, the more accurate the scores become.
I hope this answer resolves your doubts.

Related

Normalization of term frequency in elasticsearch

I recently started working with elasticsearch (version 7.17.2) and there is something related to term frequency normalization and boosting that I don't quite understand.
To keep it simple, suppose I just create an index with
PUT test
and add a couple of documents
POST test/_doc/1
{
"firstname": "foo",
"lastname": "bar"
}
POST test/_doc/2
{
"firstname": "foo",
"lastname": "baz"
}
Now I want to perform the following search
POST test/_search
{
"explain": true,
"query": {
"bool": {
"should": {
"multi_match": {
"fields": [
"firstname^3",
"lastname^5"
],
"query": "foo bar"
}
}
}
}
}
which returns
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 3.465736,
"hits" : [
{
"_shard" : "[test][0]",
"_node" : "Or9Q1aPLTi-liJvA8NJW6g",
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_score" : 3.465736,
"_source" : {
"firstname" : "foo",
"lastname" : "bar"
},
"_explanation" : {
"value" : 3.465736,
"description" : "max of:",
"details" : [
{
"value" : 0.5469647,
"description" : "sum of:",
"details" : [
{
"value" : 0.5469647,
"description" : "weight(firstname:foo in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.5469647,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 6.6000004,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.18232156,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 2,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.45454544,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
}
]
},
{
"value" : 3.465736,
"description" : "sum of:",
"details" : [
{
"value" : 3.465736,
"description" : "weight(lastname:bar in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 3.465736,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 11.0,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.6931472,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 1,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 2,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.45454544,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
}
]
}
]
}
},
{
"_shard" : "[test][0]",
"_node" : "Or9Q1aPLTi-liJvA8NJW6g",
"_index" : "test",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.5469647,
"_source" : {
"firstname" : "foo",
"lastname" : "baz"
},
"_explanation" : {
"value" : 0.5469647,
"description" : "max of:",
"details" : [
{
"value" : 0.5469647,
"description" : "sum of:",
"details" : [
{
"value" : 0.5469647,
"description" : "weight(firstname:foo in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.5469647,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 6.6000004,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.18232156,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 2,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.45454544,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
}
]
}
]
}
}
]
}
}
I deliberately gave lastname more relevance than firstname (5 vs. 3). In the explanation, for instance for the contribution of firstname:foo, the score is computed as boost * idf * tf.
While I gave the field firstname a relevance boost of 3, its actual boost according to the explanation is 6.6. After some investigation, I figured out that this value corresponds to 3 * (1.2 + 1), that is, my boost of 3 multiplied by (k_1 + 1), where k_1 is a parameter of the default BM25 similarity function, with a default value of 1.2.
I know this might be related to some normalization that Elasticsearch performs behind the scenes (whose documentation is rather sparse), but I have seen it happen in two ways:
Exactly as in this example, with tf = freq / (freq + k1 * (1 - b + b * dl / avgdl)).
Like they do it on wikipedia, with tfNorm = (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)). Notice already that the value is called tfNorm instead of just tf, and that the (k1 + 1) factor appears explicitly in the tfNorm and not "hidden" in the boost. Here are the wikipedia elasticsearch settings and mappings, in case they help.
What I would like to clarify is what is the difference between these two behaviors and how to switch between them, perhaps by updating the mapping.
BONUS QUESTION: Actually, there is a third option, that we can see in the same wikipedia example, searching for the field all_near_match. There, tfNorm = (freq * (k1 + 1)) / (freq + k1), and there is an annotation saying that the b parameter in the BM25 similarity function is 0 because norms omitted for field. How does this other approach relate with the other two I described above?
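The relationship between the boost factor and the two tf formulations can be checked numerically. A sketch assuming the default BM25 parameters (k1 = 1.2, b = 0.75); the variable names are illustrative:

```python
import math

k1, b = 1.2, 0.75
freq, dl, avgdl = 1.0, 1.0, 1.0   # one token per field in this example

# Newer Lucene: tf = freq / (freq + k1 * (1 - b + b * dl / avgdl)),
# with the constant (k1 + 1) folded into the boost instead.
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))

# Older Lucene (and the wikipedia settings): tfNorm carries (k1 + 1) itself.
tf_norm = (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * dl / avgdl))
assert abs(tf_norm - (k1 + 1) * tf) < 1e-12   # same quantity, relocated factor

# So a field boost of 3 appears in the explain as 3 * (k1 + 1) = 6.6,
# and lastname^5 as 5 * 2.2 = 11.0. Reproducing the top score:
idf_bar = math.log(1 + (2 - 1 + 0.5) / (1 + 0.5))   # "bar" in 1 of 2 docs
score = 5 * (k1 + 1) * idf_bar * tf                  # ~3.465736

# Bonus case: with norms omitted, b is effectively 0 and dl/avgdl drops out:
tf_no_norms = (freq * (k1 + 1)) / (freq + k1)        # 2.2 / 2.2 = 1.0
```

Since (k1 + 1) is a constant factor applied to every document, the two formulations produce identical rankings; only the absolute score values differ.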
Thank you very much!

How to calculate gauss value?

I am just curious how this value was computed. I applied the formula but I think I am missing something; can anybody tell me, please? I am running a single-node ELK stack, version 7.16.
POST sneaker/_search
{
"query": {
"function_score": {
"functions": [
{
"gauss": {
"price": {
"origin": "300",
"scale": "200"
}
}
}
]
}
}
, "explain": true
}
Query result
"max_score" : 1.0,
"hits" : [
{
"_shard" : "[sneaker][0]",
"_node" : "29ds_f0VSM6_-eDNhdQPLw",
"_index" : "sneaker",
"_type" : "_doc",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"brand" : "flite",
"price" : 300,
"rating" : 2,
"release_date" : "2020-12-21"
},
"_explanation" : {
"value" : 1.0,
"description" : "function score, product of:",
"details" : [
{
"value" : 1.0,
"description" : "*:*",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "min of:",
"details" : [
{
"value" : 1.0,
"description" : "Function for field price:",
"details" : [
{
"value" : 1.0,
"description" : "exp(-0.5*pow(MIN[Math.max(Math.abs(300.0(=doc value) - 300.0(=origin))) - 0.0(=offset), 0)],2.0)/28853.900817779268)",
"details" : [ ]
}
]
},
I looked up the Gaussian distribution formula, but it is different from this.
I want to know where the value 28853.900817779268 comes from.
If you look at the official documentation for the gauss decay function, you'll find the following formula for computing sigma:
sigma^2 = -scale^2 / (2 * ln(decay))
Using scale = 200 and decay = 0.5 (the default value if unspecified), we get:
-200^2 / (2 * ln(0.5)) = 28853.90081
which is what you're seeing in the explanation of the query.
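A quick numeric check of both the sigma computation and the resulting score, using only the standard library:

```python
import math

# sigma^2 = -scale^2 / (2 * ln(decay)); decay defaults to 0.5 when unspecified.
scale, decay = 200.0, 0.5
sigma2 = -scale ** 2 / (2 * math.log(decay))     # 28853.900817779268

# The gauss decay itself: exp(-0.5 * max(0, |value - origin| - offset)^2 / sigma^2).
# Here the document's price equals the origin, so the score is 1.0.
origin, offset, price = 300.0, 0.0, 300.0
score = math.exp(-0.5 * max(0.0, abs(price - origin) - offset) ** 2 / sigma2)
```

A document priced at 500 (one scale away from the origin) would score exp(-0.5 * 200^2 / 28853.9) = 0.5, which is exactly what decay = 0.5 promises.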

Elastic Search Query for Relevancy Given a Phrase Rather Than Just One Word

Elasticsearch querying/boosting is not working as I would expect it to...
I have an index where documents look like this:
{
"entity_id" : "x",
"entity_name" : "y",
"description": "search engine",
"keywords" : [
"Google"
]
}
I'm trying to get the document to show up with a relevance score when querying with a search phrase that contains one of the keywords, like this:
{
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "What are some of products for Google?",
"boost": 10,
"fields": ["keywords"]
}
}
],
"filter": {
"term" : { "entity_name" : "y" }
}
}
}
}
The problem is that my results are not as expected for three reasons:
The result contains hits that do not have any relevancy to "Google" or "Products" or any words in the search phrase.
The document that I am expecting to get returned has a _score = 0.0
The document that I am expecting to get returned has a mysterious "_ignored" : [ "description.keyword"],
The response looks like this:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_score" : 0.0,
"_source": {
"entity_id" : "a",
"entity_name" : "y",
"description": "some other entity",
"keywords": ["Other"]
}
},
{
"_score" : 0.0,
"_ignored" : [
"description.keyword"
],
"_source": {
"entity_id" : "x",
"entity_name" : "y",
"description": "search engine",
"keywords": ["Google"]
}
}
]
}
}
What am I doing wrong?
TLDR;
You are using the wrong query type: query_string is not suitable for your needs; try match instead.
To understand
First and foremost:
_ignored is a field that tracks all the fields that were malformed at index time and are thus ignored at search time. [doc]
Why is my score 0:
It is because of the query_string query. [doc]
Returns documents based on a provided query string, using a parser with a strict syntax.
eg:
"query": "(new york city) OR (big apple)"
The query_string query splits (new york city) OR (big apple) into two parts: new york city and big apple.
To illustrate my point, look at the example below:
POST /so_relevance_score/_doc
{
"entity_id" : "x",
"entity_name" : "y",
"description": "search engine",
"keywords" : [
"Google"
]
}
POST /so_relevance_score/_doc
{
"entity_id" : "x",
"entity_name" : "y",
"description": "consumer electronic",
"keywords" : [
"Apple"
]
}
GET /so_relevance_score/_search
{
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "What are some of products for Google?",
"boost": 10,
"fields": ["keywords"]
}
}
],
"filter": {
"term" : { "entity_name" : "y" }
}
}
}
}
will return the following results:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "so_relevance_score",
"_type" : "_doc",
"_id" : "0uYgP34Bpf2xEaYqLYai",
"_score" : 0.0,
"_source" : {
"entity_id" : "x",
"entity_name" : "y",
"description" : "search engine",
"keywords" : [
"Google"
]
}
},
{
"_index" : "so_relevance_score",
"_type" : "_doc",
"_id" : "1eYmP34Bpf2xEaYquoZC",
"_score" : 0.0,
"_source" : {
"entity_id" : "x",
"entity_name" : "y",
"description" : "consumer electronic",
"keywords" : [
"Apple"
]
}
}
]
}
}
The score is 0 for both documents, which means that Elasticsearch considers both equally relevant to this query.
But if you were to change the query type to match
GET /so_relevance_score/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"keywords": "What are some of products for Google?"
}
}
],
"filter": {
"term" : { "entity_name" : "y" }
}
}
}
}
I get:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.6931471,
"hits" : [
{
"_index" : "so_relevance_score",
"_type" : "_doc",
"_id" : "0uYgP34Bpf2xEaYqLYai",
"_score" : 0.6931471,
"_source" : {
"entity_id" : "x",
"entity_name" : "y",
"description" : "search engine",
"keywords" : [
"Google"
]
}
},
{
"_index" : "so_relevance_score",
"_type" : "_doc",
"_id" : "1eYmP34Bpf2xEaYquoZC",
"_score" : 0.0,
"_source" : {
"entity_id" : "x",
"entity_name" : "y",
"description" : "consumer electronic",
"keywords" : [
"Apple"
]
}
}
]
}
}
With a relevance score!
If you want to fine tune your results, I suggest diving into the documentation for query types [doc]
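As a sanity check, the 0.6931471 score above can be reproduced by hand with the BM25 defaults, assuming each document's keywords field analyzes to a single token (so dl == avgdl == 1):

```python
import math

# Assumed BM25 defaults in recent Elasticsearch/Lucene versions.
k1, b = 1.2, 0.75

# "google" appears in 1 of the 2 documents that have a keywords field:
idf = math.log(1 + (2 - 1 + 0.5) / (1 + 0.5))    # = ln(2)

# dl == avgdl == 1 makes the length-normalization term equal to 1:
tf = 1.0 / (1.0 + k1 * (1 - b + b * 1.0 / 1.0))  # = 1 / (1 + k1)

score = (k1 + 1) * idf * tf                       # the (k1 + 1) and 1/(1 + k1) cancel
```

So the Google document scores ln(2) ≈ 0.693147, matching the 0.6931471 in the response (single-precision rounding), while the Apple document matches no term and stays at 0.0.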

Why is queryWeight included for some result scores, but not others, in the same query?

I'm executing a query_string query with one term on multiple fields, _all and tags.name, and trying to understand the scoring. Query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents returned by the query:
Document 1 has an exact match on tags.name, but not on _all.
Document 8 has an exact match on both tags.name and on _all.
Document 8 should win, and it does, but I'm confused by how the scoring works out. It seems like Document 1 is getting penalized by having its tags.name score multiplied by the IDF twice, whereas Document 8's tags.name score is only multiplied by the IDF once. In short:
They both have a component weight(tags.name:animal in 0) [PerFieldSimilarity].
In Document 1, we have weight = score = queryWeight x fieldWeight.
In Document 8, we have weight = fieldWeight!
Since queryWeight contains idf, this results in Document 1 getting penalized by its idf twice.
Can anyone make sense of this?
Additional information
If I remove _all from the fields of the query, queryWeight is completely gone from the explain.
Adding "use_dis_max":true as an option has no effect.
However, additionally adding "tie_breaker":0.7 (or any value) does affect Document 8 by giving it the more-complicated formula we see in Document 1.
Thoughts: It's plausible that a boolean query (which this is) might do this on purpose to give more weight to queries that match more than one sub-query. However, this doesn't make any sense for a dis_max query, which is supposed to just return the maximum of the sub-queries.
Here are the relevant explain requests. Look for embedded comments.
Document 1 (match only on tags.name):
curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}':
{
"ok" : true,
"_index" : "questions_1390104463",
"_type" : "question",
"_id" : "1",
"matched" : true,
"explanation" : {
"value" : 0.058849156,
"description" : "max of:",
"details" : [ {
"value" : 0.058849156,
"description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
// weight = score = queryWeight x fieldWeight
"details" : [ {
// score and queryWeight are NOT a part of the other explain!
"value" : 0.058849156,
"description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details" : [ {
"value" : 0.30685282,
"description" : "queryWeight, product of:",
"details" : [ {
// This idf is NOT a part of the other explain!
"value" : 0.30685282,
"description" : "idf(docFreq=1, maxDocs=1)"
}, {
"value" : 1.0,
"description" : "queryNorm"
} ]
}, {
"value" : 0.19178301,
"description" : "fieldWeight in 0, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 0.30685282,
"description" : "idf(docFreq=1, maxDocs=1)"
}, {
"value" : 0.625,
"description" : "fieldNorm(doc=0)"
} ]
} ]
} ]
} ]
}
Document 8 (match on both _all and tags.name):
curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}':
{
"ok" : true,
"_index" : "questions_1390104463",
"_type" : "question",
"_id" : "8",
"matched" : true,
"explanation" : {
"value" : 0.15342641,
"description" : "max of:",
"details" : [ {
"value" : 0.033902764,
"description" : "btq, product of:",
"details" : [ {
"value" : 0.033902764,
"description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
"details" : [ {
"value" : 0.033902764,
"description" : "fieldWeight in 0, product of:",
"details" : [ {
"value" : 0.70710677,
"description" : "tf(freq=0.5), with freq of:",
"details" : [ {
"value" : 0.5,
"description" : "phraseFreq=0.5"
} ]
}, {
"value" : 0.30685282,
"description" : "idf(docFreq=1, maxDocs=1)"
}, {
"value" : 0.15625,
"description" : "fieldNorm(doc=0)"
} ]
} ]
}, {
"value" : 1.0,
"description" : "allPayload(...)"
} ]
}, {
"value" : 0.15342641,
"description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
// weight = fieldWeight
// No score or queryWeight in sight!
"details" : [ {
"value" : 0.15342641,
"description" : "fieldWeight in 0, product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(freq=1.0), with freq of:",
"details" : [ {
"value" : 1.0,
"description" : "termFreq=1.0"
} ]
}, {
"value" : 0.30685282,
"description" : "idf(docFreq=1, maxDocs=1)"
}, {
"value" : 0.5,
"description" : "fieldNorm(doc=0)"
} ]
} ]
} ]
}
}
I have no answer. I just want to mention that I posted the question to the Elasticsearch forum: https://groups.google.com/forum/#!topic/elasticsearch/xBKlFkq0SP0
I'll post an update here when I get an answer.

Elasticsearch gives different scores for same documents

I have some documents that have the same content, but when I query for them I get different scores, even though the queried field contains the same text. I ran the query with explain enabled, but I am not able to work out the reason for the different scores.
My query is
curl 'localhost:9200/acqindex/_search?pretty=1' -d '{
"explain" : true,
"query" : {
"query_string" : {
"query" : "text:shimla"
}
}
}'
Search response :
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 31208,
"max_score" : 268.85962,
"hits" : [ {
"_shard" : 0,
"_node" : "KOebAnGhSJKUHLPNxndcpQ",
"_index" : "acqindex",
"_type" : "autocomplete_questions",
"_id" : "50efec6c38cc6fdabd8653a3",
"_score" : 268.85962, "_source" : {"_class":"com.ixigo.next.cms.model.AutoCompleteObject","_id":"50efec6c38cc6fdabd8653a3","ad":"rajasthan,IN","category":["Destination"],"ctype":"destination","eid":"503b2a65e4b032e338f0d24b","po":8.772307692307692,"text":"shimla","url":"/travel-guide/shimla"},
"_explanation" : {
"value" : 268.85962,
"description" : "sum of:",
"details" : [ {
"value" : 38.438133,
"description" : "weight(text:shi in 5860), product of:",
"details" : [ {
"value" : 0.37811017,
"description" : "queryWeight(text:shi), product of:",
"details" : [ {
"value" : 5.0829277,
"description" : "idf(docFreq=7503, maxDocs=445129)"
}, {
"value" : 0.074388266,
"description" : "queryNorm"
} ]
}, {
"value" : 101.658554,
"description" : "fieldWeight(text:shi in 5860), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shi)=1)"
}, {
"value" : 5.0829277,
"description" : "idf(docFreq=7503, maxDocs=445129)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=5860)"
} ]
} ]
}, {
"value" : 66.8446,
"description" : "weight(text:shim in 5860), product of:",
"details" : [ {
"value" : 0.49862078,
"description" : "queryWeight(text:shim), product of:",
"details" : [ {
"value" : 6.7029495,
"description" : "idf(docFreq=1484, maxDocs=445129)"
}, {
"value" : 0.074388266,
"description" : "queryNorm"
} ]
}, {
"value" : 134.05899,
"description" : "fieldWeight(text:shim in 5860), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shim)=1)"
}, {
"value" : 6.7029495,
"description" : "idf(docFreq=1484, maxDocs=445129)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=5860)"
} ]
} ]
}, {
"value" : 81.75818,
"description" : "weight(text:shiml in 5860), product of:",
"details" : [ {
"value" : 0.5514458,
"description" : "queryWeight(text:shiml), product of:",
"details" : [ {
"value" : 7.413075,
"description" : "idf(docFreq=729, maxDocs=445129)"
}, {
"value" : 0.074388266,
"description" : "queryNorm"
} ]
}, {
"value" : 148.2615,
"description" : "fieldWeight(text:shiml in 5860), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shiml)=1)"
}, {
"value" : 7.413075,
"description" : "idf(docFreq=729, maxDocs=445129)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=5860)"
} ]
} ]
}, {
"value" : 81.8187,
"description" : "weight(text:shimla in 5860), product of:",
"details" : [ {
"value" : 0.55164987,
"description" : "queryWeight(text:shimla), product of:",
"details" : [ {
"value" : 7.415818,
"description" : "idf(docFreq=727, maxDocs=445129)"
}, {
"value" : 0.074388266,
"description" : "queryNorm"
} ]
}, {
"value" : 148.31636,
"description" : "fieldWeight(text:shimla in 5860), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shimla)=1)"
}, {
"value" : 7.415818,
"description" : "idf(docFreq=727, maxDocs=445129)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=5860)"
} ]
} ]
} ]
}
}, {
"_shard" : 1,
"_node" : "KOebAnGhSJKUHLPNxndcpQ",
"_index" : "acqindex",
"_type" : "autocomplete_questions",
"_id" : "50efed1c38cc6fdabd8b8d2f",
"_score" : 268.29953, "_source" : {"_id":"50efed1c38cc6fdabd8b8d2f","ad":"himachal pradesh,IN","category":["Hill","See and Do","Destination","Mountain","Nature and Wildlife"],"ctype":"destination","eid":"503b2a64e4b032e338f0d0af","po":8.781970310391364,"text":"shimla","url":"/travel-guide/shimla"},
"_explanation" : {
"value" : 268.29953,
"description" : "sum of:",
"details" : [ {
"value" : 38.52957,
"description" : "weight(text:shi in 14769), product of:",
"details" : [ {
"value" : 0.37895453,
"description" : "queryWeight(text:shi), product of:",
"details" : [ {
"value" : 5.083667,
"description" : "idf(docFreq=7263, maxDocs=431211)"
}, {
"value" : 0.07454354,
"description" : "queryNorm"
} ]
}, {
"value" : 101.67334,
"description" : "fieldWeight(text:shi in 14769), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shi)=1)"
}, {
"value" : 5.083667,
"description" : "idf(docFreq=7263, maxDocs=431211)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=14769)"
} ]
} ]
}, {
"value" : 66.67524,
"description" : "weight(text:shim in 14769), product of:",
"details" : [ {
"value" : 0.49850821,
"description" : "queryWeight(text:shim), product of:",
"details" : [ {
"value" : 6.6874766,
"description" : "idf(docFreq=1460, maxDocs=431211)"
}, {
"value" : 0.07454354,
"description" : "queryNorm"
} ]
}, {
"value" : 133.74953,
"description" : "fieldWeight(text:shim in 14769), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shim)=1)"
}, {
"value" : 6.6874766,
"description" : "idf(docFreq=1460, maxDocs=431211)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=14769)"
} ]
} ]
}, {
"value" : 81.53204,
"description" : "weight(text:shiml in 14769), product of:",
"details" : [ {
"value" : 0.5512571,
"description" : "queryWeight(text:shiml), product of:",
"details" : [ {
"value" : 7.3951015,
"description" : "idf(docFreq=719, maxDocs=431211)"
}, {
"value" : 0.07454354,
"description" : "queryNorm"
} ]
}, {
"value" : 147.90204,
"description" : "fieldWeight(text:shiml in 14769), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shiml)=1)"
}, {
"value" : 7.3951015,
"description" : "idf(docFreq=719, maxDocs=431211)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=14769)"
} ]
} ]
}, {
"value" : 81.56268,
"description" : "weight(text:shimla in 14769), product of:",
"details" : [ {
"value" : 0.55136067,
"description" : "queryWeight(text:shimla), product of:",
"details" : [ {
"value" : 7.3964915,
"description" : "idf(docFreq=718, maxDocs=431211)"
}, {
"value" : 0.07454354,
"description" : "queryNorm"
} ]
}, {
"value" : 147.92982,
"description" : "fieldWeight(text:shimla in 14769), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(text:shimla)=1)"
}, {
"value" : 7.3964915,
"description" : "idf(docFreq=718, maxDocs=431211)"
}, {
"value" : 20.0,
"description" : "fieldNorm(field=text, doc=14769)"
} ]
} ]
} ]
}
}
}
}
The documents are :
{"_class":"com.ixigo.next.cms.model.AutoCompleteObject","_id":"50efec6c38cc6fdabd8653a3","ad":"rajasthan,IN","category":["Destination"],"ctype":"destination","eid":"503b2a65e4b032e338f0d24b","po":8.772307692307692,"text":"shimla","url":"/travel-guide/shimla"}
{"_id":"50efed1c38cc6fdabd8b8d2f","ad":"himachal
pradesh,IN","category":["Hill","See and
Do","Destination","Mountain","Nature and Wildlife"],"ctype":"destination","eid":"503b2a64e4b032e338f0d0af","po":8.781970310391364,"text":"shimla","url":"/travel-guide/shimla"}
Please guide me in understanding the reason for the difference in scores.
The Lucene score depends on several factors. With the TF-IDF similarity (the default one) it mainly depends on:
Term frequency: how frequent the matched terms are within the document
Inverse document frequency: how rare the matched terms are across the documents in the index
Field norms (including index-time boosting): shorter fields score higher than longer ones
In your case you have to take into account that your two documents come from different shards, so the score is computed separately on each of them, since every shard is in fact a separate Lucene index.
You might want to have a look at the more expensive DFS Query Then Fetch search type that Elasticsearch provides for more accurate scoring. The default is the simpler Query Then Fetch.
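To illustrate, the per-shard idf values in the explain output can be recomputed from each shard's local statistics, using the classic (pre-BM25) Lucene idf formula, 1 + ln(maxDocs / (docFreq + 1)):

```python
import math

# Classic Lucene TF-IDF idf. Each shard is its own Lucene index, so
# maxDocs and docFreq are local to the shard, which is why the same
# term gets a slightly different idf on each shard.
def idf(doc_freq, max_docs):
    return 1 + math.log(max_docs / (doc_freq + 1))

shard0 = idf(727, 445129)   # ~7.415818  for "shimla" on shard 0
shard1 = idf(718, 431211)   # ~7.3964915 for "shimla" on shard 1
```

The small gap between the two idf values, compounded over the shi/shim/shiml/shimla edge-ngram terms, accounts for the 268.85962 vs. 268.29953 difference.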
javanna clearly pointed out the problem, indicating that the difference in scores comes from the fact that scoring happens on multiple shards. Those shards may hold different numbers of documents, which affects the scoring algorithm.
However, the authors of Elasticsearch: The Definitive Guide note:
The differences between local and global IDF [inverse document frequency] diminish the more documents that you add to the index. With real-world volumes of data, the local IDFs soon even out. The problem is not that relevance is broken but that there is too little data.
You should not use dfs_query_then_fetch in production. For testing, put your index on a single primary shard or specify ?search_type=dfs_query_then_fetch.
