What is the best way to format a date in JSON for MongoDB storage - Ruby

I have a date with a time. I'm using Ruby, but the language shouldn't matter.
d = "2010-04-01 13:00:00"
What is the best way to format this date for MongoDB? By 'best' I mean: is there a certain format I could use where MongoDB would recognize it as a date and might give me more advanced filtering options?
i.e.: If formatted correctly, could I ask MongoDB to return all records whose month is '04'?
Thanks!

You don't need to format dates at all; dates are a supported data type. Each client driver should support dates through its standard date type, including the Ruby one.
For advanced queries like your example, you can use a JavaScript expression as the find specifier:
{"$where": "this.date.getMonth() == 3"}

In Ruby you should use a Time instance, which will get stored as the BSON datetime type. You could use a $where clause like Coady mentions, but it is better to do a range query with $lt and $gt: there is less overhead, and it can leverage an index.
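For example, "everything from April 2010" as a range query (a sketch assuming the current mongo Ruby driver and a hypothetical events collection with a created_at Time field):
require 'mongo'
events = Mongo::Client.new(['127.0.0.1:27017'], database: 'mydb')[:events]
april = Time.utc(2010, 4, 1)
may   = Time.utc(2010, 5, 1)
# matches documents whose created_at falls in April 2010; can use an index on created_at
events.find(created_at: { '$gte' => april, '$lt' => may }).to_a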

I don't like relying on Mongo date objects. I think Mongo is slower with 'date' objects than it is with other data types (such as integers).
I tend to use integer timestamps (if you need a timezone, store a tz field too; then you have localized time):
document = {:some_timestamp => Time.now.to_i}
#collection.find({'some_timestamp' => {'$gte' => Time.now.to_i}})
Sometimes I just use the timestamp built into BSON::ObjectIds:
id = BSON::ObjectId.from_time(Time.now)
#collection.find({'_id' => {'$lte' => id}})
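For example, the month-based filtering from the original question expressed as an ObjectId range (a sketch; collection is hypothetical, and this only works when _id was auto-generated at insert time):
from_id = BSON::ObjectId.from_time(Time.utc(2010, 4, 1))
to_id   = BSON::ObjectId.from_time(Time.utc(2010, 5, 1))
# documents inserted during April 2010, judged by the timestamp inside _id
collection.find('_id' => {'$gte' => from_id, '$lt' => to_id})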

Related

Convert azure.timestamp to NiFi date data type in NiFi expression language

I am using the NiFi ListAzureBlobStorage processor to get the available blob objects. The processor creates a flowfile for each object, with attributes containing the object metadata. I want to filter on the azure.timestamp attribute, but I do not know what the numeric value represents or how it relates to NiFi's expression-language date data type. I want to compare it with a known date, so I need to convert it to a NiFi date-time variable first. How do I do this?
Thanks
According to the code, it is already in "NiFi format", which means a Unix timestamp.
Since it represents the number of milliseconds elapsed since 1/1/1970, you can compare it and the other timestamp using regular number comparison operators.
Example: ${azure.timestamp:ge(${now()})} will return true if azure.timestamp is later than (or equal to) the current timestamp (now).
If you'd like to compare it to another attribute you can do this:
${azure.timestamp:ge(${attribute.name})}
If you'd like to convert a different date into a Unix timestamp, you can use toDate and then toNumber; to go the other way around, just use format.

How to convert output timezone per query in sequel ruby?

I wanted to be able to set the timezone in each query depending on my user's preferred timezone, without adding timezone conversion to every raw SQL statement generated by my application.
I was able to query/retrieve the records in the 'Asia/Manila' TZ using this config:
Sequel.extension :named_timezones
Sequel.application_timezone = 'Asia/Manila'
Is it possible to set application_timezone per query, so that I can pass the current application user's timezone in each request?
You probably want to use Sequel's thread_local_timezones extension: http://sequel.jeremyevans.net/rdoc-plugins/files/lib/sequel/extensions/thread_local_timezones_rb.html
This is per thread, not per query, but should hopefully still work for your needs.
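For example (a sketch; current_user is hypothetical, and named_timezones is loaded as well so that zone names like 'Asia/Manila' are accepted):
Sequel.extension :named_timezones, :thread_local_timezones
# per request, e.g. in a Rack middleware or a controller before-filter
Sequel.thread_application_timezone = current_user.timezone  # e.g. 'Asia/Manila'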
Store everything in UTC and then convert in the UI/presentation layer.

For Hive partition based on date, why use string type? why not int?

If I'm defining a table in Hive, and will be partitioning based on date, and my dates are in the format YYYYMMDD, which should I choose for the type, int or string?
If it was just a field, and therefore in the files I'm supplying for the table, I could see using a string, even if only so that I can search for and identify malformed entries that might work their way into my data. But since I will be specifying the partition as part of the load process, I know I'll always have correctly formed values.
When used in a WHERE clause, the partition field will normally be compared with equality or less-than/greater-than logic.
Dates are typically treated as strings in Hive. If you look at all the date manipulation UDFs available, they use string types, so if you were using integers you would have to cast them every time.
Conceptually, I also think it makes more sense to use strings: your YYYYMMDD is just a literal representation of a date, implicitly equivalent to something like YYYY-MM-DD or DDMMYYYY. If you were using an integer here, such comparisons would become painful.
Note that you can also compare strings in Hive with equality/greater/lower-than operators, if you want to select a range of partitions you can easily do that with these operators.
The only case where I would store a "date" as an integer is a Unix-style timestamp, because it is a continuous value and represents a real, measurable quantity.
Because YYYY-MM-DD is the standard for date representation and is the output of Hive's to_date() UDF.
It also allows you to do lazy things like select * from foo where day > '2013'
http://xkcd.com/1179/

SQL column type from Arel::Attributes::Attribute object

tl;dr Given an Arel::Attributes::Attribute object, say Model.arel_table[:created_at], how do I get its SQL type?
Context: I'm bypassing the ActiveRecord infrastructure in favour of Arel to write some SQL reports that need to be generated really efficiently. Using Arel's to_sql method, I'm generating the final SQL and executing it directly via ActiveRecord::Base.connection.execute. However, I need to apply SQL transformations to certain columns (e.g. changing the timezone of timestamps stored in GMT). Since the number of columns is large (and varies based on user input), I don't want to hard-code these transformations. I'd like to look at the SQL type of the columns being selected and apply the transformation accordingly.
If you have the ActiveRecord class set up then you have access to its columns and columns_hash methods. Those will give you column objects (instances of ActiveRecord::ConnectionAdapters::Column) and there you should find type and sql_type methods. For example:
> Model.columns_hash['created_at'].type
=> :datetime
> Model.columns_hash['created_at'].sql_type
=> "timestamp without time zone"
The sql_type will be database-specific (that's PostgreSQL above); the type will match the type in your migrations, so you probably want to use that instead of sql_type.
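If the goal is to decide per column whether to apply a SQL transformation while still building the query with Arel, something along these lines may work (a sketch only, assuming PostgreSQL; the target zone and the AT TIME ZONE rewrite are illustrative, not part of Arel):
tz   = 'America/New_York'  # hypothetical user-supplied zone
conn = Model.connection
selects = Model.columns_hash.map do |name, column|
  if column.type == :datetime
    # stored in GMT/UTC; shift to the user's zone in SQL
    Arel.sql("#{conn.quote_column_name(name)} AT TIME ZONE 'UTC' AT TIME ZONE #{conn.quote(tz)}").as(name)
  else
    Model.arel_table[name]
  end
end
sql  = Model.arel_table.project(*selects).to_sql
rows = ActiveRecord::Base.connection.select_rows(sql)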
That said, you could probably get away with using the usual ActiveRecord relation methods (which should deal with conversions and time zones for you) and then calling to_sql at the end:
sql = Model.where('created_at > ?', some_time).select('pancakes').to_sql
and then feed that SQL into execute or select_rows. That will let you use most of the usual ActiveRecord stuff while avoiding the overhead of creating a bunch of ActiveRecord wrappers that you don't care about.
Something that might be helpful specifically in Arel is type_cast_for_database. This can be used on an Arel table:
Model.arel_table.type_cast_for_database(:id, 'test')
=> 0
Model.arel_table.type_cast_for_database(:id, '47test')
=> 47
While you don't get the type itself, you can see whether values like strings are going to be converted to a number, for example.
EDIT
It's important to note that this only works if the Arel table has a type caster (able_to_type_cast? returns true). If you get the table from the model as above, it should have a type caster.

hibernate JDBC type not found

Does Hibernate have any mapping for this Oracle (10g) data type:
TIMESTAMP(6) WITH TIME ZONE
I am getting:
No Dialect mapping for JDBC type: -101
My manager does not want to use registerHibernateType(-101, Hibernate.getText().getname())
He thinks it is too much. :)
What alternative can I have?
The answer you provided yourself is more of a workaround than a proper solution. For the sake of visitors looking for an answer, I'll give my view on this:
1) Database date-based fields should always be stored in UTC, never with a specific timezone. Date calculation with timezone information is an unneeded complexity. Remember that timezones usually change twice a year in a lot of countries ("daylight saving time"). There's a reason why only a few RDBMSs support this, and there's a reason why the Hibernate developers refuse to support this data type. The patch for Hibernate is simple enough (one line of code); the implications aren't.
2) Converting your "timestamp with timezone" to a String will only cause problems later. Once you retrieve it as a String, you'll need to convert it again to a Date/Calendar object, an unneeded overhead. Not to mention the risks associated with this operation.
3) If you need to know which timezone a user is in, just store the String representing the timezone ID (like "Europe/Prague"). You can use this in Java to build a Calendar with the date/time and timezone, as it'll take care of DST for you.
For now, I solved the problem with:
`select TO_CHAR(TRUNC(field)) from table` (field is the column of type TIMESTAMP WITH TIME ZONE)
This ensures that when the query returns, the field has the datatype String.
