elasticsearch: find the newest elements, return "asc"

Using Elasticsearch in Go, I need to search for the newest X elements, ordered by time.
I think something like this will accomplish the goal:
"query": {"constant_score": {}},
"sort": {"time": {"order": "desc"}},
"size": X
However, this would return the newest elements in reverse order, wouldn't it?
Is there a way to return the newest X elements in ascending order?

This request returns the newest X elements matching the query, but in descending order.
To get them in ascending order directly from Elasticsearch, you could run a count query (ordering not needed) and then an asc-sorted request with the from parameter set to count-X. This solution is ugly and very inefficient.
You'd be a lot better off keeping the desc sort and reversing the results in your Go app.
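Handling the order in the Go app can be as simple as reversing the slice of hits that the desc-sorted query returned. The following is a minimal sketch, not tied to any particular Elasticsearch client: the Hit type and its fields are hypothetical stand-ins for whatever your client decodes.

```go
package main

import "fmt"

// Hit is a hypothetical, minimal stand-in for a decoded search hit.
type Hit struct {
	ID   string
	Time int64 // e.g. Unix seconds taken from the "time" field
}

// reverseHits reverses the slice in place, turning newest-first
// results into oldest-first order.
func reverseHits(hits []Hit) {
	for i, j := 0, len(hits)-1; i < j; i, j = i+1, j-1 {
		hits[i], hits[j] = hits[j], hits[i]
	}
}

func main() {
	// Pretend these came back from a query sorted by time desc with size 3.
	hits := []Hit{{"c", 300}, {"b", 200}, {"a", 100}}
	reverseHits(hits)
	fmt.Println(hits) // the oldest of the three newest hits now comes first
}
```

Since X is usually small, the in-place reversal costs next to nothing compared with a second round trip to the cluster.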

Related

Search After (pagination) in Elasticsearch when sorting by score

Search after in Elasticsearch must match the sort parameters in count and order. So I was wondering how to get the score from the previous result (for example, page 1) to use as the search_after value for the next page.
I ran into an issue when using the score of the last document from the previous search. The score was 1.0, and since all documents have a score of 1.0, the result for the next page turned out to be empty.
That actually makes sense, since I am asking Elasticsearch for results that rank lower (by score) than 1.0, of which there are none. So which score do I use to get the next page?
Note:
I am sorting by score and then by TieBreakerID, so one possible solution is to use a high value (say 1000) for the score.
What you're doing sounds like it should work, as explained by an Elastic team member. It works for me (in ES 7.7) even with tied scores when using the document ID (copied into another indexed field) as a tiebreaker. It's true that indexing additional documents while paginating will make your scores slightly unstable, but not likely enough to cause a significant problem for an end user. If you need it to be reliable for a batch job, the Scroll API is the better choice.
{
"query": {
...
},
"search_after": [
12.276552,
14173
],
"sort": [
{ "_score": "desc" },
{ "id": "asc" }
]
}
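In Go, one way to avoid guessing a score is to copy the "sort" array of the last hit on the previous page verbatim into search_after. The sketch below assumes the standard response layout (hits.hits[].sort) and reuses the "_score" and "id" sort from the example; the match_all query is only a placeholder for your real query, and the function names are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchResponse decodes only the part of the response needed here.
type searchResponse struct {
	Hits struct {
		Hits []struct {
			Sort json.RawMessage `json:"sort"`
		} `json:"hits"`
	} `json:"hits"`
}

// lastSort returns the raw "sort" array of the last hit, or nil if
// the page was empty. Keeping it raw avoids float/int conversions.
func lastSort(respBody []byte) (json.RawMessage, error) {
	var r searchResponse
	if err := json.Unmarshal(respBody, &r); err != nil {
		return nil, err
	}
	n := len(r.Hits.Hits)
	if n == 0 {
		return nil, nil
	}
	return r.Hits.Hits[n-1].Sort, nil
}

// nextPageBody builds the request body for the next page. Pass nil
// as after to request the first page.
func nextPageBody(after json.RawMessage, size int) ([]byte, error) {
	body := map[string]interface{}{
		"size": size,
		// Placeholder query (assumption); use your own here.
		"query": map[string]interface{}{"match_all": map[string]interface{}{}},
		"sort": []interface{}{
			map[string]interface{}{"_score": "desc"},
			map[string]interface{}{"id": "asc"},
		},
	}
	if len(after) > 0 {
		body["search_after"] = after
	}
	return json.Marshal(body)
}

func main() {
	prev := []byte(`{"hits":{"hits":[{"sort":[12.276552,14173]}]}}`)
	after, err := lastSort(prev)
	if err != nil {
		panic(err)
	}
	body, err := nextPageBody(after, 10)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Because both sort values are passed along, tied scores are resolved by the tiebreaker field rather than by the score alone.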

ElasticSearch Score Function Depending on Neighbor Documents

I have an ElasticSearch index with 2 mappings (types).
In the app I need to display a paginated feed containing items of both types.
Currently the items are sorted just by creation date, but I also want to have control over how the items alternate with each other on the page.
For example, I want to set a rule for the sequence "3 items of type A, 1 item of type B, and so on".
I need it to make sure items of both types are displayed on each page and equally distributed across the pages.
But as far as I can see, it's not possible to access other documents in a custom score function script.
Of course it's easy to implement directly in the app logic, but it's not clear how to implement pagination that way.
Any ideas on how to achieve that?
I don't think you can do this.
One approach (that doesn't work) is to keep a global variable in a script, increment it each time a document is returned/processed, take this number modulo 3, and sort the docs based on the result. But "global" variables are not possible in scripts.
I can think of only two approaches. One is to use a script to generate a random number and sort based on that. This way, you get some chance of a "mixed" list of types.
Or, if you want a simple deterministic way of sorting the docs, still in a script, take the ID of the document (you said it is a number) modulo 3 and use that value to sort.
For the random approach:
"sort": [
{
"date": {
"order": "desc"
}
},
{
"_script": {
"script": "Math.random()",
"type": "number",
"order": "asc"
}
}
]
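If the mixing is done in the app instead, as the question suggests, pagination can stay deterministic by giving each type its own offset. The sketch below assumes two separate queries, one per type, each sorted by date; the Item type and the function names are hypothetical. For a zero-based page holding `groups` repetitions of the pattern, it computes the from/size pair for each query and then interleaves the two result lists.

```go
package main

import "fmt"

// Item is a hypothetical feed entry; only Type matters for the pattern.
type Item struct {
	ID   int
	Type string
}

// pageWindow returns the from/size values to request per type for a
// zero-based page holding `groups` repetitions of "3 of A, 1 of B".
func pageWindow(page, groups int) (fromA, sizeA, fromB, sizeB int) {
	sizeA, sizeB = 3*groups, groups
	return page * sizeA, sizeA, page * sizeB, sizeB
}

// interleave merges a and b into runs of runA items from a followed
// by runB items from b. When one list runs out, the rest of the other
// list is appended in order.
func interleave(a, b []Item, runA, runB int) []Item {
	out := make([]Item, 0, len(a)+len(b))
	if runA <= 0 || runB <= 0 {
		// No usable pattern: fall back to plain concatenation.
		return append(append(out, a...), b...)
	}
	for len(a) > 0 || len(b) > 0 {
		n := runA
		if n > len(a) {
			n = len(a)
		}
		out = append(out, a[:n]...)
		a = a[n:]

		m := runB
		if m > len(b) {
			m = len(b)
		}
		out = append(out, b[:m]...)
		b = b[m:]
	}
	return out
}

func main() {
	fromA, sizeA, fromB, sizeB := pageWindow(1, 2)
	fmt.Println(fromA, sizeA, fromB, sizeB)

	a := []Item{{1, "A"}, {2, "A"}, {3, "A"}, {4, "A"}, {5, "A"}, {6, "A"}}
	b := []Item{{101, "B"}, {102, "B"}}
	for _, it := range interleave(a, b, 3, 1) {
		fmt.Print(it.Type, it.ID, " ")
	}
	fmt.Println()
}
```

The trade-off is that a page is no longer strictly ordered by date across both types; each type keeps its own date order within the pattern.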

Possible to have a document always return above certain position

I've got a bunch of documents from a query which are sorted by a modified date. However I'd like certain documents (identified by a field value) to always return in the top ten results regardless of whether there are ten or more documents with a more recent modified date.
From what I've read about the various ways of sorting in Elasticsearch (score, boost, scripts) I don't think I have any way of determining the actual position of a document in the search results, let alone some way of manipulating the score to push a document into the top ten.
Assuming that you have a field called "important_field" which contains the value 1 for documents you want on top and, say, 0 for all other documents, you can use multi-field sorting as below:
{
"sort": [
{ "important_field": { "order": "desc" }},
{ "modified_date": { "order": "desc" }}
]
}
This way of sorting means results are ordered by important_field first and, where those values are equal, by modified_date. So all documents with important_field value 1 come first, and the rest are still sorted by modified_date.
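To see the effect of the two sort keys, here is a small Go sketch that applies the same comparison to in-memory documents. It only illustrates the ordering rule; the Doc type is hypothetical, and modified_date is simplified to Unix seconds.

```go
package main

import (
	"fmt"
	"sort"
)

// Doc is a hypothetical document with the two sort keys.
type Doc struct {
	ID        string
	Important int   // 1 for documents that should come first, 0 otherwise
	Modified  int64 // modified_date as Unix seconds (simplification)
}

// sortPinnedFirst orders docs by Important desc, then Modified desc,
// mirroring the multi-field sort above.
func sortPinnedFirst(docs []Doc) {
	sort.SliceStable(docs, func(i, j int) bool {
		if docs[i].Important != docs[j].Important {
			return docs[i].Important > docs[j].Important
		}
		return docs[i].Modified > docs[j].Modified
	})
}

func main() {
	docs := []Doc{
		{"new", 0, 300},
		{"pinned-old", 1, 100},
		{"mid", 0, 200},
		{"pinned-new", 1, 250},
	}
	sortPinnedFirst(docs)
	for _, d := range docs {
		fmt.Println(d.ID)
	}
}
```

The second key is only consulted when the first one ties, which is why an old but flagged document still outranks every unflagged one.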

How to sort parents by number of children in Elasticsearch

I have parent/child-related documents in my index and want to get a list of parents sorted by number of children. Is there any way to do it? I'm using Elasticsearch 1.5.1.
Right now I can easily get the number of child documents together with the parent query results by using the inner_hits feature, but there seems to be no way to access the inner_hits.{child_type_name}.hits.total value from a script or search/score function. Any ideas?
Well, I finally found the answer myself, thanks to hints from #doctorcal on #elasticsearch IRC.
As I mentioned in the question, we can get the list of children together with each parent using inner_hits in Elasticsearch 1.5.
To be able to sort parents by the number of their children, we need a small trick: put the number of children into the parent's score (which is used for sorting by default). For that, we just use the score mode sum in the has_child query:
{
"query": {
"has_child": {
"type": "comment",
"score_mode": "sum",
"query": {
"match_all": {}
},
"inner_hits": {}
}
}
}
NOTE: this query has a limitation: it seems you can't keep the initial scores (relevance scores for the query), since we replace them with the number of children.

Elastic Search Distinct values

I want to know how to get the distinct values of a field in Elasticsearch. I read an article here that shows how to do it with facets, but I read that facets are deprecated:
http://elasticsearch-users.115913.n3.nabble.com/Getting-Distinct-Values-td3830953.html
Is there any other way to do that? If not, could you tell me how to do it? It's a bit hard to understand solutions like this one: Elastic Search - display all distinct values of an array
Use aggregations:
GET /my_index/my_type/_search?search_type=count
{
"aggs": {
"my_fields": {
"terms": {
"field": "name",
"size": 1000
}
}
}
}
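The distinct values come back as the bucket keys of the aggregation. A minimal Go sketch for pulling them out of the raw response follows; it assumes the field holds strings (so each bucket key decodes as a string) and uses the "my_fields" aggregation name from the request above. The function name is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// termsResponse decodes only the aggregation part of the response.
type termsResponse struct {
	Aggregations map[string]struct {
		Buckets []struct {
			Key      string `json:"key"`
			DocCount int64  `json:"doc_count"`
		} `json:"buckets"`
	} `json:"aggregations"`
}

// distinctValues returns the bucket keys of the named terms
// aggregation, in the order Elasticsearch returned them.
func distinctValues(respBody []byte, aggName string) ([]string, error) {
	var r termsResponse
	if err := json.Unmarshal(respBody, &r); err != nil {
		return nil, err
	}
	agg, ok := r.Aggregations[aggName]
	if !ok {
		return nil, fmt.Errorf("aggregation %q not found in response", aggName)
	}
	values := make([]string, 0, len(agg.Buckets))
	for _, b := range agg.Buckets {
		values = append(values, b.Key)
	}
	return values, nil
}

func main() {
	resp := []byte(`{"aggregations":{"my_fields":{"buckets":[
		{"key":"alice","doc_count":3},
		{"key":"bob","doc_count":1}]}}}`)
	values, err := distinctValues(resp, "my_fields")
	if err != nil {
		panic(err)
	}
	fmt.Println(values)
}
```

Note that the request's "size" caps how many buckets come back, so a field with more than 1000 distinct values would be truncated with the settings shown.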
You can use the Cardinality metric if you only need the number of distinct values.
Although the counts returned aren't guaranteed to be 100% accurate, they almost always are for low cardinality terms and the precision is configurable via the precision_threshold param.
http://www.elastic.co/guide/en/elasticsearch/guide/current/cardinality.html
