Sorting Mongo records on a non-unique created date-time field - spring

I am sorting on a created field, which has the form 2022-03-26T03:56:13.176+00:00 and represents a java.time.LocalDateTime.
Sorting by only created is not consistent as the field is not unique, due to batch operations that run quickly enough to result in duplicates.
I've added a second sort, on _id, which is ObjectId.
The second sort seems to add quite a bit of time to the query, more so than the first, which is odd to me.
Why does it more than double the response time, and is there a preferred way to ensure the order?
Using MongoTemplate, I sort like this:
query.with(Sort.by(Sort.Order.desc("createdOn"), Sort.Order.desc("_id")));

If your use case is always to sort in descending order on both fields, it is best to create a compound index in the expected sort order, as follows:
db.collection.createIndex({ createdOn:-1,_id:-1 })
But in general the default _id field (an ObjectId) embeds the document's creation time and is unique, so you may be able to sort on _id alone; you most probably don't need to sort additionally on the createdOn date.
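To see what "contains the insertion date" means in practice, here is a minimal sketch in plain Python (no MongoDB driver) that reads the timestamp out of an ObjectId hex string. It relies on the documented ObjectId layout: a 4-byte timestamp in seconds, then a 5-byte per-process random value, then a 3-byte counter. The function name and the two ids are made up for illustration. Note that the timestamp has only one-second resolution, so ids generated by different clients within the same second are not guaranteed to sort in exact insertion order.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Return the creation time embedded in a 24-character ObjectId hex string."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex characters) are big-endian seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical ids built for illustration: same second, different trailing bytes.
created = datetime(2022, 3, 26, 3, 56, 13, tzinfo=timezone.utc)
prefix = format(int(created.timestamp()), "08x")
first = prefix + "aaaaaaaaaa" + "000001"
second = prefix + "bbbbbbbbbb" + "000002"

print(objectid_timestamp(first))
# Both ids carry the same timestamp; only the random/counter bytes order them.
print(objectid_timestamp(first) == objectid_timestamp(second))
```

If the createdOn value is set by the application rather than at insert time, it can differ from the ObjectId timestamp, in which case the compound index above is the safer choice.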

Related

Is ElasticSearch Auto-Generated Ids sequential?

If we do not specify an Id when inserting a document into Elasticsearch, the Id is generated automatically. I also understand that the Ids are Flake Ids, which have a predictable pattern.
My question is: are these generated Flake Ids sequential enough that I can sort on _id or _uid and be sure the results are in the same order as inserted?
The autogenerated _id is not sequential. It is a URL-safe, Base64-encoded GUID generated using a modified FlakeID algorithm. FlakeID is a decentralized algorithm that generates k-ordered unique IDs.
Note that Elasticsearch no longer generates the _id using random UUIDs.
For more details, see:
https://github.com/elastic/elasticsearch/issues/5941
https://github.com/elastic/elasticsearch/pull/7531
https://github.com/ppearcy/elasticflake
The Elasticsearch autogenerated _id is random, not sequential, and the same is true of _uid. If you want to sort in insertion order, an easy step is to enable _timestamp, so that _timestamp holds the time the document was inserted.
However, _timestamp is updated whenever the document is updated, so you may want to create a new date field and set it to the current time yourself.
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-timestamp-field.html

Sorting jsonb objects in Postgresql 9.4

The problem that I am facing is not that I'm not able to perform a sort, but rather a correct sort. That is, my objects that are stored as jsonb need to be sorted before getting displayed in a table. Part of the query that sorts is:
ORDER BY data ->> 'Name' ASC
However, the problem is that in its current state, psql returns the list of people ordered in two clusters: upper case and lower case. An ASC sort returns the sorted upper-case names followed by the sorted lower-case names, while DESC returns the reverse-sorted lower-case names followed by the reverse-sorted upper-case names.
Is there a trick for sorting the data in a case-insensitive order, or does the data need to be stored in a particular case to begin with?
ORDER BY lower(data ->> 'Name') ASC
This works as a temporary fix, but I would be glad to hear of other methods.
Sorting by a jsonb value works the same as sorting by a simple text field. If you get case-sensitive sorting, you have likely set an incorrect collation for your database.
See this issue, answer by Michał Niklas.
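The two-cluster ordering from the question is what a plain code-point comparison produces, since every ASCII upper-case letter sorts before every lower-case one. The sketch below reproduces it in Python with made-up names. It is only an analogy for a byte-wise database collation, not a statement about how any particular PostgreSQL installation is configured. Sorting on a case-folded key corresponds to the lower(...) workaround.

```python
# Hypothetical sample data standing in for the data ->> 'Name' values.
names = ["bob", "Alice", "carol", "Dave"]

# Code-point comparison: all upper-case initials first, then lower-case ones.
codepoint_order = sorted(names)

# Case-folded key: the same idea as ORDER BY lower(data ->> 'Name').
caseless_order = sorted(names, key=str.casefold)

print(codepoint_order)  # ['Alice', 'Dave', 'bob', 'carol']
print(caseless_order)   # ['Alice', 'bob', 'carol', 'Dave']
```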

Sorting Solr multivalue fields based on field values

I have multiple Solr instances with separate schemas.
I need to receive multivalue field in sorted order, e.g. by type: train_station, airport, city_district, and so on:
q=köln&sort=query({!v="type:(airport OR train_station)"}) desc
I would like to see airport type document before train_station type. For now I am always getting train_station type at the top.
How should I write the query?
You are getting train_stations at the top because of the IDF.
A quick hack to fix it would be to use a range query (which has the advantage of having constant scores) and query boosts: q=köln&sort=query({!v="type:([airport TO airport]^3 OR [train_station TO train_station]^2)"}) desc.
This way, documents which have airport in their type field will have a score of 3, documents which have train_station in their type field will have a score of 2, and documents which have both airport and train_station in their type field will have a score of 2+3=5 (up to a multiplicative constant).
A more elegant (and effective) way of doing this would be to write a custom query parser (or even a function query).
You can sort on a function only if it returns a single value per document. You definitely can't sort on a multiValued field or any field that is tokenized. Seems like you would need a function that returns "airport" if the field contains "airport" (even if it contains "train station") and "train station" if it contains "train station" but not "airport", and then sort on that.
Another option would be to handle this at index time. Add a single-valued field called "airport_train_station_sort" that holds 1 if the type field contains "airport", 2 if it contains "train station" but NOT "airport", and 3 if it contains neither. Then simply sort on that field in ascending order.
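The index-time approach could look roughly like the sketch below, run in whatever code prepares documents before they are sent to Solr. Only the field name airport_train_station_sort comes from the answer above. The function name, the sample documents, and the train_station spelling (taken from the question's query) are assumptions for illustration.

```python
def type_sort_rank(types):
    """Collapse a multivalued type field into one sortable integer."""
    if "airport" in types:
        return 1  # airport wins even if train_station is also present
    if "train_station" in types:
        return 2
    return 3  # neither type present


# Hypothetical documents as they might look just before indexing.
docs = [
    {"id": "a", "type": ["train_station"]},
    {"id": "b", "type": ["city_district"]},
    {"id": "c", "type": ["train_station", "airport"]},
]

for doc in docs:
    doc["airport_train_station_sort"] = type_sort_rank(doc["type"])

# Emulates sorting on the new field in ascending order.
ordered_ids = [d["id"] for d in sorted(docs, key=lambda d: d["airport_train_station_sort"])]
print(ordered_ids)  # ['c', 'a', 'b']
```

Because the rank is a single value per document, it avoids the multivalued-sort restriction entirely.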
You cannot solve this problem inside Solr. Check the documentation: Solr does not sort on multivalued fields. Older versions of Solr let you try, but the results were undefined and unpredictable.
You either change your schema and put this sort data into single value indexed fields, or you need to make several queries, first for airports, then city districts, then train stations.
To order items within the field itself you have to either index it in order you want, or do post processing. Solr's sort will sort only docs!

Lucene equivalent of SQL Server's ORDER BY [duplicate]

I got my lucene index with a field that needs to be sorted on.
I have my query and I can make my Sort object.
If I understand right from the javadoc I should be able to do query.SetSort(). But there seems to be no such method...
Sure I'm missing something vital.
Any suggestions?
There are actually two important points. First, the field must be indexed. Second, pass the Sort object into the overloaded search method.
Last time I looked, the docs didn't do a very good job of pointing out the indexing part, and certainly didn't explain why this is so. It took some digging to find out why.
When a field is sortable, the searcher creates an array with one element for each document in the index. It uses information from the term index to populate this array so that it can perform sorting very quickly. If you have a lot of documents, it can use a lot of memory, so don't make a field sortable unless there is a need.
One more caveat: a sortable field must have no more than one value per document. If there are multiple values, Lucene doesn't know which to use as the sort key.
It looks like the actual method you want is e.g. Searcher.search(Query query, Filter filter, int n, Sort sort). setSort is a method of Sort.
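The per-document array described above can be pictured with a small conceptual sketch. This is not Lucene's code or API, only a plain-Python illustration of the idea. The term index is modelled as a dict from term to document numbers, and all names and data here are invented.

```python
def build_sort_array(term_index, num_docs):
    """Build one sort value per document from a term -> doc ids mapping."""
    values = [None] * num_docs  # one slot per document, hence the memory cost
    for term, doc_ids in term_index.items():
        for doc_id in doc_ids:
            if values[doc_id] is not None:
                # Two terms for one document: there is no single sort key.
                raise ValueError(f"document {doc_id} has more than one value")
            values[doc_id] = term
    return values


# Hypothetical index over 4 documents with a single-valued field.
term_index = {"apple": [2], "banana": [0], "cherry": [1, 3]}
sort_values = build_sort_array(term_index, num_docs=4)

# Sorting search hits is then a cheap array lookup per document.
hits = [3, 0, 2]
sorted_hits = sorted(hits, key=lambda doc_id: sort_values[doc_id])
print(sorted_hits)  # [2, 0, 3]
```

The ValueError branch mirrors the caveat about multiple values: once a document maps to two terms, the array has no single entry to sort on.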
