Elasticsearch: constant_score query vs bool.filter query

I am trying to get exact-match results from Elasticsearch, so I don't care about scoring here.
I see that there are two ways to do this:
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"exact_match_field" : "hello world !"
}
}
}
}
}
or
{
"query": {
"bool": {
"filter": {
"term": {
"exact_match_field": "hello world !"
}
}
}
}
}
Both work and give me the result I want. What's the difference between them? Are there performance benefits to using one over the other?
(I am using Elasticsearch V 5.6)
Thanks !

A constant_score query gives every matching document the same score, ignoring scoring factors such as TF and IDF. Use it when you only care whether a document matched, not how well it matched, but still want each hit to carry a score. A bare bool filter, by contrast, returns its hits with a score of 0.
A constant_score query takes a boost parameter, which becomes the score of every returned document; this matters when the query is combined with other queries. The default boost is 1.
If you are interested, the link below gives more insight:
https://www.compose.com/articles/elasticsearch-query-time-strategies-and-techniques-for-relevance-part-ii/
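To compare the two request bodies side by side, here is a minimal Python sketch that only builds them as dictionaries; it does not talk to a cluster. The helper names constant_score_body and bool_filter_body are made up for illustration, and the field name is the one from the question.

```python
def constant_score_body(field, value, boost=1.0):
    """Exact match via constant_score: every hit is scored with `boost`."""
    return {
        "query": {
            "constant_score": {
                "filter": {"term": {field: value}},
                "boost": boost,
            }
        }
    }


def bool_filter_body(field, value):
    """Exact match via bool.filter: the term clause runs in filter context."""
    return {"query": {"bool": {"filter": {"term": {field: value}}}}}


# Both bodies wrap the same term clause; only the wrapper differs.
cs_body = constant_score_body("exact_match_field", "hello world !")
bf_body = bool_filter_body("exact_match_field", "hello world !")
```

Either dictionary can be passed as the body of a search request. The inner term clause is identical in both, so the choice mostly comes down to whether you want a configurable constant score on the hits.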

Related

Filter on score after rescore in Elasticsearch

I have been on an internet manhunt for days for this and getting ready to give up. I need to filter on _score in Elasticsearch after the rescore function has completed. So given an example query like this:
POST /_search
{
"query" : {
"match" : {
"message" : {
"operator" : "or",
"query" : "the quick brown"
}
}
},
"rescore" : {
"window_size" : 50,
"query" : {
"rescore_query" : {
"match_phrase" : {
"message" : {
"query" : "the quick brown",
"slop" : 2
}
}
},
"query_weight" : 0.7,
"rescore_query_weight" : 1.2
}
}
}
Say just for simplicity's sake that the above returns 5 documents with scores ranging from 0.0 to 1.0. I want the final returned results set to only be the documents with a score above 0.90. In other words, take those newly-rescored docs, and hand them off to a filter where it drops all documents scored below 0.90.
I have tried many, many different ways but nothing is working. post_filter is apparently meant to run after the main query but before rescore, so that one doesn't work. min_score does not work with rescore at all; it only applies to the original ES scores from the main query. Aggregations are the one feature I can get to run after rescore, but aggregating is not what I need here. At least it shows me that ES can keep operating on the data after a rescore query.
Any thoughts on how to get this seemingly simple task accomplished? I have also tried using function_score and script_score but really those are just ways to further modify the scores, whereas I need to filter on the scores generated by the rescore. The requirement here is to get it done in the query. We can't do it as a post-processing step.

Elasticsearch 6.5 query scoring changed, how do we get the ES 5 type results?

I am making a recommender with Elasticsearch. I know what people have bought and this forms the query. The index is of items and has a field that contains items bought in common.
We were using ES 5, and the following query found the highest scores, meaning the items that have the most in common with the query. But in ES 6 this query returns only score = 1.0, so it no longer finds the most similar items.
{
"query": {
"bool": {
"should": [
{
"terms": {
"bought": [
"iPad Pro",
"iPhone 8"
]
}
}
]
}
}
}
How do we get the same results with an ES 6 query?
It's listed as a breaking change in Elasticsearch 6.0: a terms query now always returns a score of 1.
Unfortunately, it would be very difficult to reproduce exactly the same behaviour, but given the logic in your question I would recommend a bool query with one term clause per item, e.g.
{
"query": {
"bool" : {
"should" : [
{ "term" : { "bought" : "iPad Pro" } },
{ "term" : { "bought" : "iPhone 8" } }
]
}
}
}
In this case you can mimic the behaviour of the old terms query while keeping a score that reflects the logic you want: if a person bought only 1 of the 2 items, the score will be lower.
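If the list of purchased items changes per user, the should clauses can be generated rather than written by hand. A minimal Python sketch, assuming the field is called bought as in the question; the function name bought_in_common_query is hypothetical and the code only builds the request body.

```python
def bought_in_common_query(items, field="bought"):
    """Build a bool query with one scored term clause per purchased item.

    Each matching clause contributes to the score, so documents that share
    more items with the input list rank higher.
    """
    return {
        "query": {
            "bool": {
                "should": [{"term": {field: item}} for item in items]
            }
        }
    }


body = bought_in_common_query(["iPad Pro", "iPhone 8"])
```

Note that term clauses match the indexed value exactly, so the item strings need to be spelled and cased the same way as in the index.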

Elasticsearch: Filter (or Query) by Term Frequency

How do I run an elasticsearch query that only returns results with the term X mentioned at least Y times in a document?
For example, suppose you had a footer in all of your indexed documents that says something like copyright 2013. Suppose that when the user searches for the term copyright, you want to be smart and show only those documents that contain the word copyright at least twice (otherwise you'll return all documents). I know there are multiple ways of accomplishing this, but one way would be to run a filter that returns only those documents that use the term copyright at least twice. Does such a filter exist?
I could envision something like this, but I don't see anything comparable in the docs:
"filter" : {
"term" : { "user" : "copyright"},
"frequency" : { "gt" : 1 }
}
Considering that Elasticsearch stores term frequencies, I would expect that this would be possible to implement.
Use a script filter in which you access the term frequency of copyright in the field user, using something like _index['user']['copyright'].tf():
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "_index['user'][term_to_lookup].tf() > occurrences",
"params": {
"term_to_lookup": "copyright",
"occurrences": 1
}
}
}
}
}
}

How to disable scoring in Elasticsearch for one query?

Is it possible to disable score calculation for a particular query (not for a type or the whole index) in Elasticsearch?
As stated in the comments, you could wrap your particular query in a constant_score query (ConstantScoreQuery). Supply either a query, as below, or a filter under the "filter" key instead, not both:
{
"constant_score" : {
"query" : { your_query_here },
"boost" : 1.0
}
}
All matched documents will get a score of 1.0. For more reference information, see http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-constant-score-query.html

Elastic Search boost query corresponding to first search term

I am using PyElasticsearch (elasticsearch python client library). I am searching strings like Arvind Kejriwal India Today Economic Times and that gives me reasonable results. I was hoping I could increase weight of the first words more in the search query. How can I do that?
res = es.search(index="article-index", fields="url", body={
"query": {
"query_string": {
"query": "keywordstr",
"fields": [
"text",
"title",
"tags",
"domain"
]
}
}
})
I am using the above command to search right now.
Split the given query into multiple terms; in your example these are Arvind, Kejriwal, and so on. Then form a query_string query (or a field query, or any other query that fits the need) for each of the terms. A query_string query looks like this:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/query-dsl-query-string-query.html
{
"query_string" : {
"default_field" : "content",
"query" : "<one of the given term>",
"boost": <any number>
}
}
Now you have multiple queries like the one above, each with a different boost value (depending on which terms should carry more weight). Combine all of them into one query using a bool query. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
If you want all of the terms to be present in the result, the query will look like this:
{
"bool" : {
"must" : [q1, q2, q3 ...]
}
}
You can use the other options of the bool query too. For example, if you want any one of the 3 terms to be present in the result, the query will look like this:
{
"bool" : {
"should" : [q1, q2, q3 ...],
"minimum_should_match" : 1
}
}
theoretically:
split into terms using api
query against terms with different boosting
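The steps above can be sketched in Python as follows. This is only an illustration: the function name boosted_bool_query is made up, splitting on whitespace stands in for "split into terms", and the linearly decreasing boost (first word highest) is an assumption, not something the answer prescribes. The field list is the one from the question.

```python
def boosted_bool_query(keywords, fields):
    """Build a bool/should query with one query_string clause per word.

    Earlier words get larger boosts: with n words, the first gets n,
    the second n - 1, and so on down to 1 (an assumed weighting scheme).
    Words containing query_string reserved characters would need escaping,
    which this sketch does not do.
    """
    terms = keywords.split()
    n = len(terms)
    clauses = [
        {
            "query_string": {
                "query": term,
                "fields": fields,
                "boost": float(n - position),
            }
        }
        for position, term in enumerate(terms)
    ]
    return {"query": {"bool": {"should": clauses}}}


body = boosted_bool_query(
    "Arvind Kejriwal India Today Economic Times",
    ["text", "title", "tags", "domain"],
)
```

The resulting dictionary can be passed as the body argument of the search call shown in the question. Swapping should for must would require every word to match.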
Lucene Query Syntax does the trick. Thanks
http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Boosting%20a%20Term
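For the Lucene-syntax route, the boost is attached to each term with a caret (term^boost) inside a single query string. A small Python sketch of building such a string; the helper name boost_leading_terms and the descending integer weights are assumptions for illustration, and terms containing reserved characters are not escaped here.

```python
def boost_leading_terms(keywords):
    """Return a Lucene query string where earlier words carry larger boosts.

    With n words, the first becomes word^n, the second word^(n-1), etc.
    """
    terms = keywords.split()
    n = len(terms)
    return " ".join(f"{term}^{n - i}" for i, term in enumerate(terms))


boosted = boost_leading_terms("Arvind Kejriwal India")
# boosted == "Arvind^3 Kejriwal^2 India^1"
```

The returned string can be used as the "query" value of the query_string query from the question, leaving the rest of the request unchanged.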

Resources