Elasticsearch: number of results changes when using filter and min_score

I am using a query like this:
Select all results for keyword "X" with min_score = 0.25. I also run aggregations on these results. But when I click on an aggregation, the number of documents differs from the aggregation count because of the min_score. When I remove min_score, everything is fine.
What can I do so that the counts on the aggregations and in the results always match?

Here is the answer:
https://www.elastic.co/guide/en/elasticsearch/reference/2.3/query-dsl-bool-query.html#_scoring_with_literal_bool_filter_literal
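Use filter instead of must: clauses in filter context do not contribute to the score, so they do not change which documents pass min_score.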
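One possible reading of this advice, sketched as a Python dict for the request body: the keyword query stays in must so it keeps producing the score that min_score is checked against, and the value picked from the aggregation goes into filter so it narrows the hits without touching any score. The field names description and category, the value "books", and the helper build_search_body are placeholders for illustration, not taken from the question.

```python
import json


def build_search_body(keyword, facet_field=None, facet_value=None, min_score=0.25):
    """Build a search body whose scores do not depend on the selected facet."""
    bool_query = {
        # Scored clause: the only contributor to _score.
        "must": [{"match": {"description": keyword}}],
    }
    if facet_field is not None:
        # Filter clause: restricts the hits but adds nothing to _score.
        bool_query["filter"] = [{"term": {facet_field: facet_value}}]
    return {
        "min_score": min_score,
        "query": {"bool": bool_query},
        # Hypothetical aggregation on an assumed "category" field.
        "aggs": {"by_category": {"terms": {"field": "category"}}},
    }


before_click = build_search_body("X")
after_click = build_search_body("X", "category", "books")
print(json.dumps(after_click, indent=2))
```

The scored part of the query is identical before and after the click, so the same score threshold applies in both requests.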

Related

Must returns more results than Filter

I checked the question "What is the difference between must and filter in Query DSL in elasticsearch?" and read the answers.
As far as I understand, must and filter should return the same results. Am I right? But when I change the filter query to must, I receive more results. What am I doing wrong?
I compared the filter and must queries and got different results.
A must query produces a score that is added to the total score of the document.
A filter query does not add any score. It only decides whether a document is returned in the result set.
Judging only from the attached screenshot of the query, when you change the filter query to a must query, it starts adding to the total score of each document.
Since you are using a min_score condition, the must clause makes more documents exceed the 0.2 score, and hence more documents are returned in the final result set.
The rest will become clearer when you share the complete query.
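The effect can be illustrated with a toy model in plain Python. This is only a sketch: the per-clause scores are made up, the clause names keyword and category are hypothetical, and real Elasticsearch scoring is more involved. It assumes every document matches both clauses, so the only difference between the two runs is whether the second clause adds to the score.

```python
def hits(docs, min_score, scored_clauses):
    """Return ids of docs whose summed score over the scored clauses passes min_score."""
    result = []
    for doc in docs:
        total = sum(doc["scores"][name] for name in scored_clauses)
        if total >= min_score:
            result.append(doc["id"])
    return result


# Made-up per-clause scores; every doc is assumed to match both clauses.
docs = [
    {"id": 1, "scores": {"keyword": 0.30, "category": 0.10}},
    {"id": 2, "scores": {"keyword": 0.15, "category": 0.10}},
    {"id": 3, "scores": {"keyword": 0.05, "category": 0.10}},
]

# "category" as a filter: it matches but contributes no score.
as_filter = hits(docs, 0.2, ["keyword"])
# "category" as a must clause: its score is added to the total.
as_must = hits(docs, 0.2, ["keyword", "category"])

print(as_filter, as_must)
```

With these numbers the must variant lets one additional document past the 0.2 threshold, which mirrors the behaviour described in the answer.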

How to search result set from a result set in elasticsearch

I'm trying to understand the DSL query I need if I want to search within a result set I already got. That is, I have an initial term search, and then I want to run another query on the previous result.
Let's say I have 10 documents that share an identifier, and each document has a description field. I want to first search for all documents containing the value 'Hello' in the description, then take their ids and search for the documents containing the value 'good by'.
Thanks.
No need to execute two queries; you can use filter context to narrow down the results. The filter parameter filters out documents that do not match and does not affect the score of matching documents.
Filter context is in effect whenever a query clause is passed to a
filter parameter, such as the filter or must_not parameters in the
bool query, the filter parameter in the constant_score query, or the
filter aggregation.
Refer to the documentation on query and filter contexts to learn more.
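As a rough sketch of the single-query approach, the request body could look like the Python dict below: the first condition goes into filter (no effect on score) and the second into must (scored). The helper search_within and the use of match and match_phrase are assumptions for illustration. Note that this matches documents containing both values in the same description; if the two values live in different documents that only share an identifier, this sketch alone would not cover that case.

```python
import json


def search_within(first_term, second_phrase, field="description"):
    """Combine an unscored narrowing condition with a scored search in one bool query."""
    return {
        "query": {
            "bool": {
                # Narrowing step: must match, but does not influence _score.
                "filter": [{"match": {field: first_term}}],
                # Actual search within the narrowed set: contributes to _score.
                "must": [{"match_phrase": {field: second_phrase}}],
            }
        }
    }


body = search_within("Hello", "good by")
print(json.dumps(body, indent=2))
```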

What is the default order of search result if search parameters absent in ElasticSearch

What is the default order for filtered results (and for all results)? By last update? By indexing date? I tried to get all the documents from my index, but I could not figure out what the default sort order is.
The default sort order is _score, but the score is always 1 when you do not specify a search query. With every score equal to 1, the order is then more or less arbitrary. You still get consistent results, as far as I remember. It is the same situation as in SQL when you fetch results without specifying ORDER BY.
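If a predictable order is needed, the usual remedy is the same as in SQL: ask for one explicitly. A minimal sketch of such a request body as a Python dict follows; the field name updated_at and the helper match_all_sorted are hypothetical, so substitute a field that exists in your mapping.

```python
import json


def match_all_sorted(sort_field, order="desc"):
    """Return every document, ordered by an explicit field instead of _score."""
    return {
        "query": {"match_all": {}},
        "sort": [{sort_field: {"order": order}}],
    }


body = match_all_sorted("updated_at")
print(json.dumps(body, indent=2))
```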

Elasticsearch subaggregation not working as expected

I am trying to perform an aggregation on a term and then a sub-aggregation on the result set to filter the results by a date range. But the sub-aggregation filter has no effect on the search response. The search response always returns all the documents without applying the filter.
For example:
TermsBuilder aggregationBuilders = AggregationBuilders.terms("form.id").field("form.id").size(0);
aggregationBuilders.subAggregation(AggregationBuilders.filter("indexDate").filter(QueryBuilders.rangeQuery("indexDate").lte(date)));
You need to use the filter aggregation the other way around, i.e. as the top-level aggregation, and then add the terms aggregation as a sub-aggregation.
TermsBuilder formBuckets = AggregationBuilders.terms("form.id")
    .field("form.id")
    .size(0);
// The filter aggregation is the parent; the terms buckets are computed only on documents that pass it.
FilterAggregationBuilder dateFilter = AggregationBuilders.filter("indexDate")
    .filter(QueryBuilders.rangeQuery("indexDate").lte(date))
    .subAggregation(formBuckets);
I see in your other question, you have somehow "solved" this issue by moving the filter on indexDate to the query section. That will also work in your case.
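For reference, the nesting described in this answer (filter aggregation on top, terms aggregation inside) would correspond to a request body roughly like the Python dict below. This is a sketch: the helper filtered_terms_agg and the example date are hypothetical, and "size": 0 on the terms aggregation simply mirrors the question's code and may not be accepted by newer Elasticsearch versions.

```python
import json


def filtered_terms_agg(date):
    """Filter aggregation as the parent, terms aggregation as its sub-aggregation."""
    return {
        "aggs": {
            "indexDate": {
                # Only documents with indexDate <= date reach the sub-aggregation.
                "filter": {"range": {"indexDate": {"lte": date}}},
                "aggs": {
                    "form.id": {"terms": {"field": "form.id", "size": 0}},
                },
            }
        }
    }


body = filtered_terms_agg("2016-01-01")  # placeholder date
print(json.dumps(body, indent=2))
```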

RethinkDB: custom scoring (like Elasticsearch)

I recently discovered RethinkDB and find its query language to be much simpler than Elasticsearch's. The only use case I haven't been able to find a solution for is specifying how to score results based on the document's fields, like you can do in Elasticsearch (http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/script-score.html). Is there a way to score the query results in RethinkDB and return only the top-n results?
If you have a query like r.table('comments').filter(r.row('name').eq('tldr')), then you can do something like r.table('comments').filter(r.row('name').eq('tldr')).map({score: CALCULATE_SCORE(r.row), row: r.row}).orderBy(r.desc('score')).limit(n) to return the top n results (orderBy sorts ascending by default, so r.desc is needed to get the highest scores first). Note that this does work proportional to the number of results in the original query. If that's too expensive, you can do something similar with an index by writing r.table('comments').indexCreate('score', CALCULATE_SCORE(r.row)) and then writing r.table('comments').orderBy({index: r.desc('score')}).limit(n).
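The filter, score, sort, limit pipeline can be mimicked in plain Python to make the logic concrete. This is an in-memory analogy, not RethinkDB code: calculate_score stands in for CALCULATE_SCORE, and its formula and the sample rows (with assumed upvotes and text fields) are invented for illustration.

```python
import heapq


def calculate_score(row):
    # Hypothetical scoring rule; replace with whatever your use case needs.
    return row["upvotes"] * 2 + len(row["text"])


def top_n(rows, predicate, n):
    """Filter rows, attach a score to each, and keep the n highest-scoring ones."""
    matching = (row for row in rows if predicate(row))
    scored = ({"score": calculate_score(row), "row": row} for row in matching)
    # Work is proportional to the number of matching rows, as noted above.
    return heapq.nlargest(n, scored, key=lambda item: item["score"])


comments = [
    {"name": "tldr", "upvotes": 3, "text": "short"},
    {"name": "tldr", "upvotes": 1, "text": "a longer comment"},
    {"name": "other", "upvotes": 9, "text": "ignored"},
    {"name": "tldr", "upvotes": 0, "text": "x"},
]

best = top_n(comments, lambda row: row["name"] == "tldr", 2)
print([item["score"] for item in best])
```

Every matching row is scored before the top n are selected, which is why the index-based variant can be cheaper when the filtered set is large.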
