How to disable scoring in Elasticsearch for one query? - elasticsearch

Is it possible to disable score calculation for a particular query (not for a type or the whole index) in Elasticsearch?

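As stated in the comments, you can wrap your particular query in a constant_score query (Lucene's ConstantScoreQuery). In the 1.5 reference linked below, it wraps either a filter or a query, not both:
{
  "constant_score" : {
    "filter" : { your_filter_here },
    "boost" : 1.0
  }
}
To wrap a query instead, replace the "filter" entry with "query" : { your_query_here }. Every matching document then gets a score of 1.0. For reference, see http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-constant-score-query.html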
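As a minimal sketch, the wrapper can also be built programmatically. The helper name `constant_score` and the `user`/`kimchy` term are hypothetical placeholders, and the sketch assumes the filter form of the query. It only builds the request body as a dict and does not contact a cluster.

```python
import json


def constant_score(inner_filter, boost=1.0):
    """Wrap a filter clause so every matching document gets the same score.

    Hypothetical helper: it only builds the request body, nothing is sent.
    """
    return {"constant_score": {"filter": inner_filter, "boost": boost}}


# Placeholder filter; substitute the clause you actually want to wrap.
body = {"query": constant_score({"term": {"user": "kimchy"}})}
print(json.dumps(body, indent=2))
```

The resulting `body` can be passed to whichever client you use as the search request body.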

Related

Filter on score after rescore in Elasticsearch

I have been hunting the internet for days for this and am getting ready to give up. I need to filter on _score in Elasticsearch after the rescore phase has completed. Given an example query like this:
POST /_search
{
  "query" : {
    "match" : {
      "message" : {
        "operator" : "or",
        "query" : "the quick brown"
      }
    }
  },
  "rescore" : {
    "window_size" : 50,
    "query" : {
      "rescore_query" : {
        "match_phrase" : {
          "message" : {
            "query" : "the quick brown",
            "slop" : 2
          }
        }
      },
      "query_weight" : 0.7,
      "rescore_query_weight" : 1.2
    }
  }
}
Say, for simplicity's sake, that the above returns 5 documents with scores ranging from 0.0 to 1.0. I want the final result set to contain only the documents with a score above 0.90. In other words, take the newly rescored docs and hand them to a filter that drops every document scored below 0.90.
I have tried many different approaches, but nothing works. post_filter apparently runs after the main query but before rescore, so it doesn't help. min_score does not work with rescore at all; it only sees the original ES scores from the main query. Aggregations are the one feature I can get to run after rescore, but aggregating is not what I need here. At least that shows me ES is able to keep operating on the data after a rescore query.
Any thoughts on how to accomplish this seemingly simple task? I have also tried function_score and script_score, but those are just ways to modify the scores further, whereas I need to filter on the scores produced by the rescore. The requirement is to do it in the query; we can't do it as a post-processing step.

Elasticsearch : constant_score query vs bool.filter query

I am trying to get an exact-match result using Elasticsearch (so I don't care about scoring here).
I see that there are 2 ways to do this:
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : {
          "exact_match_field" : "hello world !"
        }
      }
    }
  }
}
or
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "exact_match_field": "hello world !"
        }
      }
    }
  }
}
Both work and give me the result I want. What's the difference between them? Are there performance benefits to using one over the other?
(I am using Elasticsearch 5.6.)
Thanks!
A constant_score query gives an equal score to every matching document, irrespective of scoring factors such as TF and IDF. Use it when you don't care how well a document matched, only whether it matched, but still want each hit to carry a score, unlike a plain filter.
A constant_score query takes a boost argument, which becomes the score of every returned document; this matters when the query is combined with other queries. By default, boost is 1.
If you are interested, the link below gives more insight:
https://www.compose.com/articles/elasticsearch-query-time-strategies-and-techniques-for-relevance-part-ii/
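To make the comparison concrete, here is a small sketch that builds both request bodies from the same filter clause. The helper names are hypothetical and nothing is sent to a cluster. As far as I know, the bool query documentation describes a bool query containing only filter clauses as returning a score of 0 for every hit, whereas constant_score returns the boost; treat that as an assumption to verify against the docs for your version.

```python
def as_constant_score(filter_clause, boost=1.0):
    """Body whose hits all receive `boost` as their score (hypothetical helper)."""
    return {"query": {"constant_score": {"filter": filter_clause, "boost": boost}}}


def as_bool_filter(filter_clause):
    """Body that runs the clause purely in filter context (hypothetical helper)."""
    return {"query": {"bool": {"filter": filter_clause}}}


# Same term clause as in the question; only the wrapper differs.
clause = {"term": {"exact_match_field": "hello world !"}}
cs_body = as_constant_score(clause)
bf_body = as_bool_filter(clause)

# Both bodies select documents with the identical filter clause.
same_clause = cs_body["query"]["constant_score"]["filter"] == bf_body["query"]["bool"]["filter"]
print(same_clause)
```

Since the matching clause is identical, the choice mainly affects the score attached to each hit rather than which documents are returned.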

Boost a document based on the existence of a field

Is it possible to boost a document's relevance based on the presence of a field? I've read about function score queries, but I'm wondering how existence is taken into account. From my understanding, field_value_factor applies to the content of the field, not to its presence.
A function_score query is one possibility, but a score function is computationally expensive and not necessary here (keep in mind that the exists query used below may not have been available in 2014).
You can do the following:
POST _search
{
  "query": {
    "bool" : {
      "should" : [
        {
          "match_all": {
            "boost": 10
          }
        },
        {
          "exists": {
            "field": "some_field_that_should_exist"
          }
        }
      ],
      "minimum_should_match" : 2
    }
  }
}
With a minimum_should_match of 2, both clauses must match for the bool query to match, so the boost is never applied to documents that do not have the field. Note that documents without the field are then excluded from the results altogether.
This can be simplified: exists is a constant-score query, so boost can be set on it directly.
{
  "query": {
    "exists": {
      "field": "some_field_that_should_exist",
      "boost": 10
    }
  }
}
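Both queries above return only documents that have the field. If the goal is to keep every document matched by some main query and merely rank those with the field higher, one common arrangement is to put the main query in `must` and the boosted exists clause in `should`. The sketch below is an assumption on my part, not the answer's own method; the helper name and the `match` clause on `title` are hypothetical.

```python
def boost_if_field_exists(main_query, field, boost=10):
    """Build a bool query: `main_query` decides which documents match, while
    the optional exists clause only adds to the score of documents that
    have `field` (hypothetical helper, builds a dict only)."""
    return {
        "query": {
            "bool": {
                "must": [main_query],
                "should": [{"exists": {"field": field, "boost": boost}}],
            }
        }
    }


# Hypothetical main query; the field name comes from the answer above.
body = boost_if_field_exists(
    {"match": {"title": "elasticsearch"}},
    "some_field_that_should_exist",
)
print(body)
```

Because a `must` clause is present, the `should` clause is optional: it contributes to the score when it matches but does not exclude documents when it doesn't.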

Elastic Search boost query corresponding to first search term

I am using PyElasticsearch (an Elasticsearch Python client library). I am searching for strings like "Arvind Kejriwal India Today Economic Times", and that gives me reasonable results. I was hoping I could give more weight to the first words in the search query. How can I do that?
res = es.search(index="article-index", fields="url", body={
    "query": {
        "query_string": {
            "query": "keywordstr",
            "fields": [
                "text",
                "title",
                "tags",
                "domain"
            ]
        }
    }
})
I am using the above command to search right now.
Split the given query into multiple terms. In your example, these will be Arvind, Kejriwal, and so on. Then form a query_string query (or a field query, or any other query that fits your need) for each of the terms. A query_string query looks like this:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-query-string-query.html
{
  "query_string" : {
    "default_field" : "content",
    "query" : "<one of the given term>",
    "boost": <any number>
  }
}
Now you have multiple queries like the one above, each with a different boost value (higher for the terms that should carry more weight). Combine all of them into one query using a bool query: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
If you want all of the terms to be present in the result, the query will look like this:
{
  "bool" : {
    "must" : [q1, q2, q3 ...]
  }
}
You can use the other options of the bool query as well. For example, if you want any 3 of the terms to be present in the result, the query will look like this:
{
  "bool" : {
    "should" : [q1, q2, q3 ...],
    "minimum_should_match" : 3
  }
}
In short:
split the query into terms using the API
query against the terms with different boosts
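The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper name `boosted_term_query` is hypothetical, the boost simply decreases by one per position so the first word weighs most, and terms are assumed to be plain words without query_string special characters. It only builds the request body.

```python
def boosted_term_query(query_text, fields):
    """Split `query_text` on whitespace and build one query_string clause per
    term, boosting earlier terms more than later ones (hypothetical helper)."""
    terms = query_text.split()
    clauses = []
    for position, term in enumerate(terms):
        # First term gets boost len(terms), last term gets boost 1.
        boost = float(len(terms) - position)
        clauses.append(
            {"query_string": {"query": term, "fields": fields, "boost": boost}}
        )
    # `should` means any term may match; swap in `must` to require all terms.
    return {"query": {"bool": {"should": clauses}}}


body = boosted_term_query("Arvind Kejriwal India", ["text", "title", "tags", "domain"])
boosts = [c["query_string"]["boost"] for c in body["query"]["bool"]["should"]]
print(boosts)  # [3.0, 2.0, 1.0]
```

A linear falloff is only one choice; any decreasing sequence of boosts expresses the same idea.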
Lucene Query Syntax does the trick. Thanks
http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Boosting%20a%20Term
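With the Lucene syntax linked above, a term is boosted by appending `^` and a number, so the whole thing fits in a single query string. A minimal sketch, assuming whitespace-separated plain words with no special characters that would need escaping; the helper name `caret_boosted` is hypothetical.

```python
def caret_boosted(query_text):
    """Append a Lucene `^boost` suffix to each term, highest boost first
    (hypothetical helper)."""
    terms = query_text.split()
    count = len(terms)
    return " ".join(f"{term}^{count - i}" for i, term in enumerate(terms))


boosted = caret_boosted("Arvind Kejriwal India Today")
print(boosted)  # Arvind^4 Kejriwal^3 India^2 Today^1
```

The resulting string can be used as the "query" value of a query_string query in place of the unboosted keywords.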

Elastic Search truncating hits.total via score

Is it possible to execute a query and filter it so that only elements with score > 1.0 are considered in the hits.total response?
I believe you can use min_score to achieve this (http://www.elasticsearch.org/guide/reference/api/search/min-score/). The ES docs example:
{
  "min_score": 0.5,
  "query" : {
    "term" : { "user" : "kimchy" }
  }
}
As the docs also say, this isn't usually practical because scoring is a relative calculation. If you're heavily influencing the scores, however, it might be what you need.
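For the threshold in the question, the same request body can be assembled in Python. This is only a sketch: the helper name `with_min_score` is hypothetical, the term query is the docs example from above, and nothing is sent to a cluster.

```python
def with_min_score(query, threshold):
    """Attach a top-level min_score to a query body (hypothetical helper)."""
    return {"min_score": threshold, "query": query}


# Keep only hits scoring at least 1.0, using the docs' example term query.
body = with_min_score({"term": {"user": "kimchy"}}, 1.0)
print(body)
```

Note that min_score sits at the top level of the request body, next to "query", not inside it.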
