I have seen quite a few questions here about the best way to store dates in MongoDB, with most of the answers boiling down to using JavaScript Date objects.
I have another question, however: which is the most performant way to store dates?
I am operating on a collection of about 5 million entries and perform about 500 ranged read operations per minute on it, asking for records $gt or $lt the current timestamp. How are indices built around the native JavaScript Date object? Are they more performant than storing an integer timestamp?
Dates are stored as 64-bit integers (milliseconds since the Unix epoch) in MongoDB.
See the BSON spec: http://bsonspec.org/#/specification
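To make the comparison with an integer timestamp concrete, here is a minimal sketch (in Python rather than JavaScript, purely for illustration; the function name is hypothetical). A BSON date is the same 64-bit milliseconds-since-epoch value you would compute yourself, so an indexed range comparison on a Date field reduces to the same integer comparison:

```python
from datetime import datetime, timezone

def to_bson_millis(dt: datetime) -> int:
    """Milliseconds since the Unix epoch -- the same 64-bit value
    BSON uses internally for its Date type."""
    return int(dt.timestamp() * 1000)

a = to_bson_millis(datetime(2013, 1, 1, tzinfo=timezone.utc))
b = to_bson_millis(datetime(2013, 1, 1, 0, 0, 1, tzinfo=timezone.utc))

# A $gt/$lt range comparison on an indexed Date field reduces to
# comparing these integers, just like an integer timestamp field would.
print(b - a)  # one second = 1000 ms
```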
Related
I'm working on a Firestore db and need to query on date ranges.
My question: in terms of query efficiency, how should I store my dates?
as numbers?
as firestore Timestamp?
(my guess is that a string format is bad!)
Knowing that my queries will only be at day precision (not hours or seconds), is it more efficient to query on a field that has just day precision, or can I keep better precision?
Thanks!
For performance, it doesn't really matter how you store the date. The index created for the date field will be able to find the range of documents equally well for each one. The performance of Firestore queries is based on the number of documents matched by the query, not the type of data used.
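As a sketch of why full precision costs nothing here (plain Python for illustration; the function name is hypothetical): a day-level query against a full-precision timestamp field is just a half-open range on the day's boundaries, so there is no need to truncate stored values to day precision.

```python
from datetime import datetime, timedelta, timezone

def day_bounds(year: int, month: int, day: int):
    """Return [start, end) UTC datetimes covering one calendar day.

    With these bounds, a range query (>= start, < end) on a
    full-precision timestamp field matches exactly that day.
    """
    start = datetime(year, month, day, tzinfo=timezone.utc)
    return start, start + timedelta(days=1)

start, end = day_bounds(2023, 7, 21)
ts = datetime(2023, 7, 21, 15, 30, tzinfo=timezone.utc)
print(start <= ts < end)  # True
```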
I need to store 2 columns in a SQLite db: a date (YYYY-MM-DD) and a time (HH-MM-SS).
Since SQLite can store date/time values only as TEXT, REAL, or INTEGER, which data type should I use?
Is it more efficient to store them as TEXT or INTEGER? I will be doing frequent lookups against these columns.
So my queries will have a WHERE condition on the date and time columns.
I want to make the WHERE comparison efficient.
The primary bottleneck when using databases is I/O.
If you care about performance, use the datatype which requires the least amount of I/O.
An INTEGER value needs less storage than a REAL one, which needs less than TEXT.
Whether this makes a noticeable difference is another question; you have to measure.
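A minimal sketch of the INTEGER approach using Python's built-in sqlite3 module (the schema and index name are made up for illustration): a single INTEGER Unix-timestamp column replaces the separate date and time columns, and a WHERE range condition becomes a plain indexed integer comparison.

```python
import sqlite3
import time

# Hypothetical schema: one INTEGER column holding a Unix timestamp
# instead of separate TEXT date and time columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts INTEGER)")
conn.execute("CREATE INDEX idx_events_ts ON events (ts)")

now = int(time.time())
conn.executemany(
    "INSERT INTO events (ts) VALUES (?)",
    [(now + i,) for i in range(1000)],
)

# Range lookup: a plain integer comparison that can use the index.
lo, hi = now + 100, now + 200
count = conn.execute(
    "SELECT COUNT(*) FROM events WHERE ts >= ? AND ts < ?", (lo, hi)
).fetchone()[0]
print(count)  # 100
```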
Interview question:
A store has n customers, and any customer can visit at any time throughout the year. The data is stored in a file. Design a data structure to determine whether a given person visited the store on a given date.
I think a HashMap would be fine to implement the above requirement.
Can someone give me a better solution? Thanks.
If n and the range of dates are large, then the file will be large and lookups may run slowly. You may not be able to load it all into memory at one time, and it will be slow even if you can. A 'better' approach probably means going faster and using fewer resources.

You could speed things up by building an index into the file by date and only reading the chunk of the file for the date in question. This significantly reduces the usually slowest part, getting the data from disk into memory, and then you only need a hash of the names within that chunk.
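A minimal in-memory sketch of that idea (function names are made up for illustration): one set of names per date, mirroring "chunk by date, then hash names within the chunk", so a lookup is two average-case O(1) steps.

```python
from collections import defaultdict

# Hypothetical in-memory index: one set of visitor names per date.
visits = defaultdict(set)

def record_visit(name: str, date: str) -> None:
    visits[date].add(name)

def visited(name: str, date: str) -> bool:
    # Two O(1) average-case lookups: date bucket, then name.
    return name in visits.get(date, set())

record_visit("alice", "2024-03-01")
print(visited("alice", "2024-03-01"))  # True
print(visited("bob", "2024-03-01"))    # False
```

On disk, the same structure corresponds to a per-date offset index into the file, with only the matching chunk loaded and hashed on demand.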
I see people storing and retrieving the server time, and times relative to it, using Date or getTime, which can be kept in the database as a string like "July 21, 1983 01:15:00".
Up until now I stored my server time as the difference between now and 1 January 2013. This gives a numeric value (in minutes), rounded down, between 1 January 2013 and the present, which I keep as the internal server time.
The advantages of this are that:
- querying the server involves a simple numeric comparison, while (I'm making an educated guess) comparing two dates implies internal conversion to objects and heavier comparison operations.
- storing a number of that size is more lightweight than a string of ~25 characters.
- converting back to "real" time is done by adding the offset to 1 January 2013, though second and millisecond values are lost due to the initial rounding.
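To make the precision loss concrete, here is a minimal sketch of the scheme described above (in Python for illustration; the 1 January 2013 epoch is from the question, the function names are hypothetical):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(2013, 1, 1, tzinfo=timezone.utc)  # the custom epoch

def to_offset_minutes(dt: datetime) -> int:
    # Rounds down to whole minutes, as described above.
    return int((dt - EPOCH).total_seconds() // 60)

def from_offset_minutes(minutes: int) -> datetime:
    return EPOCH + timedelta(minutes=minutes)

original = datetime(2013, 6, 1, 12, 30, 45, tzinfo=timezone.utc)
restored = from_offset_minutes(to_offset_minutes(original))
print(restored)             # 2013-06-01 12:30:00+00:00 -- seconds lost
print(original - restored)  # 0:00:45
```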
But still, other fellow programmers insist that the string version
- is easy for a human to read.
- is a universal format for most languages (especially Node.js, MongoDB, and AS3, which this project uses).
I am uncertain which is better for large scale databases and specifically, for a multiplayer socket based game. I am sure others with real experience in this could shed some light on my issue.
So which is better and why?
Store them as Mongo Date objects. Mongo stores dates as 8-byte integers holding a millisecond offset from the Unix epoch [1], and displays them in human-readable format. You are NOT storing 25 characters!
Therefore, all comparisons are just as fast. There is no string parsing except when the query is built, which is a one-time cost per query.
Your offset is stored as a 4-byte int, so you're saving ONLY 4 bytes over normal MongoDB date storage. That's a very small savings measured against the average size of your Mongo documents.
Consider all the disadvantages of your "offset since January 2013" method:
Time spent writing extra logic to offset the dates when updating or querying.
Time spent dealing with bugs that arise from having forgotten to offset a date.
Time spent shifting dates by hand or in your head when inspecting database output (when diagnosing a problem), instead of seeing the actual date right away.
Inability to use date operators (e.g. $dayOfMonth) in MongoDB aggregations without extra work, the extra work being a projection to shift your offsets back into real dates first.
Basically, more code, more headache, and more time spent, all to save 4 bytes per object in a database where the same 4 bytes could be saved by renaming your field from "updated" to "upd". I don't think that's a wise tradeoff.
Also,
Best way to store date/time in mongodb
Premature optimization is the root of all evil. Don't optimize unless you've determined something to be a problem.
1 - http://bsonspec.org/#/specification
I'm playing around with MongoDB 2.4.5 and I'm interested in the reading/querying performance.
Say I have two very large collections (about 1,500,000 documents each). The documents have about 40 fields. They differ in exactly one field, so they have the same indexes and so on.
One collection has a field Body where a string is stored. This string can be rather large, as it represents the content of a news item. The other collection does not have that field.
My question now is which of the two collections is faster to query, sort, and so on. Writing is not an issue here.
So what matters more when querying a MongoDB collection: the sheer number of documents, or the size of each document?
You have to measure it yourself:
1. db.coll1.find({}).explain()
2. db.coll2.find({}).explain()
and then you can compare the performance of the two queries.