Custom sort lexicographically as int - elasticsearch

I have some Elasticsearch documents with a string property that looks like 10/2021. It needs to be sorted as an integer, but when I run this query
"sort": [
{
"myProperty": {
"order": "asc"
}
},
I get the lexicographic order.
1/2021
10/2021
100/2021
101/2021
102/2021
But I need it to sort by the first number and the year like this:
1/2020
2/2020
...
1/2021
2/2021
I can't figure out how to custom sort, is it even possible?

Solution 1:
Use a scripted sort.
Not recommended for large data sets: the script runs for every matching document, so the sort gets slow.
GET <>/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script":{
"type":"number",
"script":{
"lang":"painless",
"source":"String s = doc['myProperty.keyword'].value; int i = s.indexOf('/'); return Long.parseLong(s.substring(i + 1)) * 1000000L + Long.parseLong(s.substring(0, i));"
}
}
}
}
Note: replace myProperty.keyword with your keyword field, or with a string field that has fielddata enabled. The script weights the year above the number (10/2021 becomes 2021000010); simply removing the "/" would put 1/2021 (12021) before 2/2020 (22020). I haven't added null checks in the script, so add them if any of your documents lack this field.
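Solution 2:
1. Store an additional numeric field in Elasticsearch that holds the value in sortable form, with the year weighted above the number (for example, 10/2021 stored as 2021000010). Just dropping the "/" is not enough, because 1/2021 (12021) would then sort before 2/2020 (22020).
2. Sort on that field.
3. Populate the field for existing documents using the update_by_query API.
This is the recommended approach.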
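The value for such an extra numeric field could be computed before indexing. Below is a minimal Python sketch, not part of the original answer: the function name `sort_key` and the multiplier are illustrative assumptions, and it weights the year above the number rather than only stripping the "/".

```python
def sort_key(value: str) -> int:
    """Turn "number/year" (e.g. "10/2021") into an integer that sorts by year, then number."""
    number, year = value.split("/")
    # Assumption: the number part stays below 1,000,000,
    # so it can never spill into the digits used for the year.
    return int(year) * 1_000_000 + int(number)


values = ["10/2021", "1/2021", "2/2020", "100/2021", "1/2020", "2/2021"]
ordered = sorted(values, key=sort_key)
print(ordered)
# ['1/2020', '2/2020', '1/2021', '2/2021', '10/2021', '100/2021']
```

If the key is stored in Elasticsearch, a `long` field is the safer choice, since the encoded values sit close to the upper limit of a 32-bit integer.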

Related

Elasticsearch: sort by copy_to target of two fields

I'm trying to create a fullName field using copy_to, exactly as the docs describe: https://www.elastic.co/guide/en/elasticsearch/reference/current/copy-to.html
But what I want to do in a query is sort by fullName. However, when I specify that field in the sort, I see that the results are actually sorted by forename, i.e. the first part of the copy_to:
{
"sort": [
{
"<nested>.fullName.keyword": {
"nested_path": "<nested>",
"order": "desc"
}
}
]
}
What I want is to sort by forename + surname, i.e. by the complete fullName.
Is it possible to do this using copy_to at all?

Sorting on a field of text data type that stores integers in Elasticsearch

We have a field in our index, TempNo, which has to be of type text, but all values in this field are numbers (integers).
When I sort (desc) on this field, the sort does not work correctly: the results are not in descending order of TempNo.
It seems to be because of the text type. How can I sort it correctly? (The type is text, but the sort should be numeric.)
Thanks,
Gopal
Actually, if the type is text, Elasticsearch does not support sort or aggregation operations on the field by default.
There are two ways to change this.
1. Change TempNo from text to integer directly. (It will then sort correctly.)
2. If you must keep the text type, add a raw sub-field for TempNo (https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html) and then use a Painless script to sort by number.
GET my_index/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script": {
"type": "number",
"order": "desc",
"script": {
"lang": "painless",
"source": """
String s = doc['TempNo'].value;
int tdvalue = Integer.parseInt(s);
return tdvalue;
"""
}
}
}
}
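To see why sorting digits as strings gives an unexpected order, it helps to compare string ordering with numeric ordering. A minimal Python sketch with made-up TempNo values (the sample values are hypothetical, not taken from the question):

```python
# Hypothetical TempNo values, stored as strings the way a text/keyword field holds them.
temp_nos = ["9", "10", "100", "25"]

# Descending sort on the raw strings compares character by character.
as_strings = sorted(temp_nos, reverse=True)

# Descending sort after parsing each value as an integer,
# which is what the Painless script does per document.
as_numbers = sorted(temp_nos, key=int, reverse=True)

print(as_strings)  # ['9', '25', '100', '10']
print(as_numbers)  # ['100', '25', '10', '9']
```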

Query DSL terms filtering with script for day by numeric value

Within aggs I am able to get buckets by day of the week that are represented in numeric (1-7) keys using something like this:
"aggs" : {
"group_by_day" :{
"terms": {
"script": "doc['#timestamp'].date.dayOfWeek",
"order": {
"_key": "asc"
}
}
}
}
However, I am looking for a way to add a filtering clause to the query, something like the following, so that only results for a Monday or Tuesday are shown, and I haven't been able to get it to work.
I have tried:
{
"terms": {
"script":"doc['#timestamp'].date.dayOfWeek"
}
}
The script tag doesn't seem to be supported in a terms query, at least not the way I am attempting to use it. Is there another way to filter with a script, or another (better) approach to achieve what I want? I am using 6.2. Thanks!
Here it is:
"script":{
"script": {
"source": "doc['#timestamp'].date.dayOfWeek == 1"
}
}
I handle the string-to-numeric conversion of the day outside of this query. The clause itself sits inside a query.bool.must clause.
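The script clause above checks a single day. One way to cover several days is to generate the Painless condition outside the query. This is a hedged Python sketch: the helper name `day_of_week_filter` is hypothetical, and it assumes 1 and 2 stand for Monday and Tuesday in the 1-7 numbering mentioned in the question.

```python
import json


def day_of_week_filter(field, days):
    """Build a script query clause that matches any of the given day-of-week numbers."""
    # Join one equality check per day with Painless's logical OR.
    checks = " || ".join(f"doc['{field}'].date.dayOfWeek == {day}" for day in days)
    return {"script": {"script": {"source": checks}}}


# Assumption: 1 = Monday and 2 = Tuesday.
query = {"query": {"bool": {"must": [day_of_week_filter("#timestamp", [1, 2])]}}}
print(json.dumps(query, indent=2))
```

The resulting dictionary only describes the request body; sending it is left to whichever client you already use.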

Elasticsearch:: Sorting giving weird results

When I search for the first time, it sorts all documents and gives me the first 5 records. However, if I run the same search query with the sort direction changed (ASC -> DESC), it does not sort all documents again. Instead it takes the 5 documents retrieved by the previous query, sorts them in descending order, and returns those. I was expecting it to sort all available documents in DESC order and then return the first 5 results.
Am I doing something wrong, or have I missed a concept?
My search query:
{
"sort": {
"taskid": {
"order": "ASC"
}
},
"from": 0,
"size": 5,
"query": {
"filtered": {
"query": {
"match_all": []
}
}
}
}
I have data with taskid 1 to 100. The query above fetched the records with taskid 1 to 5 on the first attempt. When I changed the sort direction to desc, I expected the documents with taskid 96-100 (in the sequence 100, 99, 98, 97, 96) to be returned, but I got the documents with taskid 5, 4, 3, 2, 1 in that sequence. This suggests that sorting was done only on the previously returned result.
Please note that taskid and _id are the same in my documents. I added a redundant field to my mapping that holds the same value as _id.
Just change the value of the order key to lowercase and you are good to go.
{
"sort": {
"taskid": {
"order": "asc" // or "desc"
}
},
"from": 0,
"size": 5,
"query": {
"filtered": {
"query": {
"match_all": []
}
}
}
}
Hope this helps..
In Elasticsearch, sorting is applied to the documents that match the query. For the query in your question, the documents are first filtered by the search criteria, the matching documents are then sorted, and from/size are applied to the sorted result.
If it looks like you are only getting results based on an old subset of your data, it may be that your newer data is not searchable yet. This can easily happen in an automated test; with manual testing it is less likely.
By default the index is refreshed every second, so adding a delay/sleep of about a second between indexing and searching should fix your test if this is the problem.
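The ordering the question expects (sort every matching document first, then cut out the requested page) can be modelled in a few lines. This is a toy Python simulation of those semantics, not the Elasticsearch client; the function `search` and its parameters are illustrative assumptions.

```python
def search(docs, order="asc", start=0, size=5):
    """Sort all matching documents by taskid, then return one page of results."""
    ranked = sorted(docs, key=lambda d: d["taskid"], reverse=(order == "desc"))
    # Pagination (from/size) is applied only after the full sort.
    return ranked[start:start + size]


docs = [{"taskid": i} for i in range(1, 101)]

print([d["taskid"] for d in search(docs, "asc")])   # [1, 2, 3, 4, 5]
print([d["taskid"] for d in search(docs, "desc")])  # [100, 99, 98, 97, 96]
```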

Sorting a match query with ElasticSearch

I'm trying to use ElasticSearch to find all records containing a particular string. I'm using a match query for this, and it's working fine.
Now, I'm trying to sort the results based on a particular field. When I try this, I get some very unexpected output, and none of the records even contain my initial search query.
My request is structured as follows:
{
"query":
{
"match": {"_all": "some_search_string"}
},
"sort": [
{
"some_field": {
"order": "asc"
}
}
] }
Am I doing something wrong here?
To sort on a string field, your mapping must contain a non-analyzed version of the field; otherwise the sort works on the analyzed terms instead of the whole string. Here's a simple blog post I found that describes how you can do this using the multi_field mapping type.
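As a rough illustration of that advice, the mapping and the sort could look like the Python dictionaries below. This is a sketch under assumptions: it uses the text/keyword types of Elasticsearch 5.x and later (older versions used a string field with "index": "not_analyzed" instead), and the sub-field name `raw` is a hypothetical choice.

```python
import json

# Assumed mapping: the analyzed field keeps full-text search working,
# while the "raw" keyword sub-field holds the whole, unanalyzed value.
mapping = {
    "properties": {
        "some_field": {
            "type": "text",
            "fields": {"raw": {"type": "keyword"}},
        }
    }
}

# The match query stays as in the question; only the sort targets the sub-field.
search_body = {
    "query": {"match": {"_all": "some_search_string"}},
    "sort": [{"some_field.raw": {"order": "asc"}}],
}

print(json.dumps(search_body, indent=2))
```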

Resources