Hive datatype for date - hadoop

Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2008;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2008;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
16/12/2008;17:26:00;5.374;0.498;233.290;23.000;0.000;2.000;17.000
16/12/2008;17:27:00;5.388;0.502;233.740;23.000;0.000;1.000;17.000
16/12/2008;17:28:00;3.666;0.528;235.680;15.800;0.000;1.000;17.000
16/12/2008;17:29:00;3.520;0.522;235.020;15.000;0.000;2.000;17.000
16/12/2008;17:30:00;3.702;0.520;235.090;15.800;0.000;1.000;17.000
16/12/2008;17:31:00;3.700;0.520;235.220;15.800;0.000;1.000;17.000
16/12/2008;17:32:00;3.668;0.510;233.990;15.800;0.000;1.000;17.000
This is the sample data. I am really confused about which datatype to use for Date and Time.
Please help.

You probably want to look at the TIMESTAMP datatype, which has this format: yyyy-mm-dd hh:mm:ss
There is also a DATE datatype, which takes this format: YYYY-MM-DD
There isn't a separate time datatype. If you can't change the sample data, you probably want to load the dates as strings, convert them with UDFs like unix_timestamp(string date, string pattern), and then copy the result into a new table.
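The pattern argument of unix_timestamp follows Java date-pattern conventions, where MM is the month and mm is the minute. As a rough sketch of what the conversion does to one sample row, here is the same reshaping in plain Java using java.time. The class and method names are made up for illustration, and this is not Hive code:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: shows the pattern needed for the sample rows.
class HivePatternSketch {
    // Layout of the raw Date and Time columns joined by one space.
    static final DateTimeFormatter SOURCE =
            DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss");
    // Layout of a timestamp string without fractional seconds.
    static final DateTimeFormatter TARGET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Joins the two columns, parses them, and prints them in the target layout.
    static String toTimestampText(String date, String time) {
        return LocalDateTime.parse(date + " " + time, SOURCE).format(TARGET);
    }

    public static void main(String[] args) {
        // prints 2008-12-16 17:24:00
        System.out.println(toTimestampText("16/12/2008", "17:24:00"));
    }
}
```

In Hive, the same pattern string ('dd/MM/yyyy HH:mm:ss') would be the second argument to unix_timestamp, applied to the concatenated Date and Time columns.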

Related

Changing format of date without using to_char - Oracle

I have to get the max payment date on an invoice and I am having trouble with the date format. I do not need the max in this formula as I am using the format in a reporting tool that is pulling the max from what it finds for me.
Using "to_char({datefield},'mm/dd/yyyy')" works for displaying that date the way we would like BUT when you use summary function MAX it does not pull the correct date because it is looking at a string and not a date (it will think 12/3/21 is larger than 3/2/22).
Another thing I have tried is trunc - "trunc({datefield})" which gives us the correct max date but it changes the formatting. For example if the date prior to the formula being applied is "8/12/21 12:00:00:000" the trunc formula will display it as 12-08-21 which is horribly wrong.
Long story short, I need a way to change a date/time to a date with the format 'mm/dd/yyyy' WITHOUT converting it to a string with something like to_char. Thank you!
A DATE is a binary data type consisting of 7 bytes representing: century, year-of-century, month, day, hour, minute and second. It ALWAYS has all of those components and it is NEVER stored with any (human-readable) format.
What you are seeing when a date is displayed is the client application you are using to access the database making a decision to be helpful to you, the user, and display the binary DATE provided by the database in a human-readable format.
If you want to change how the DATE is displayed then you need to either:
1) Change the settings on the client application that control how it formats dates when it displays them to you; or
2) Change the data type so that it is no longer a DATE (which does not have a format) but a data type where the values of the date can be formatted (such as a string). You can do this using TO_CHAR.
If you want to find the maximum then do it BEFORE applying the formatting:
SELECT TO_CHAR(MAX({datefield}),'mm/dd/yyyy')
FROM your_table;
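To see why the order matters, here is a small Java sketch (not Oracle code; the class and method names are invented for illustration) that uses the two dates from the question. Taking the maximum of the formatted strings picks the wrong value, while taking the maximum of the dates and formatting afterwards picks the right one:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo of "max, then format" versus "format, then max".
class MaxBeforeFormatSketch {
    static final DateTimeFormatter MDY = DateTimeFormatter.ofPattern("MM/dd/yyyy");

    // Formats first, then compares the text character by character.
    static String maxOfStrings(List<LocalDate> dates) {
        return dates.stream().map(MDY::format).max(String::compareTo).get();
    }

    // Compares the dates first, then formats only the latest one.
    static String formatOfMax(List<LocalDate> dates) {
        return Collections.max(dates).format(MDY);
    }

    public static void main(String[] args) {
        List<LocalDate> dates = Arrays.asList(
                LocalDate.of(2021, 12, 3), LocalDate.of(2022, 3, 2));
        System.out.println(maxOfStrings(dates)); // prints 12/03/2021 (wrong)
        System.out.println(formatOfMax(dates));  // prints 03/02/2022 (right)
    }
}
```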

Convert TimeStamp To Joda DateTime Without Any Timezone Conversion

I have a timestamp value in an Oracle database column, stored in UTC. I want to read it through Spring's jdbcTemplate and convert it to a Joda DateTime object without any timezone conversion, i.e. read it as is without converting or losing the timezone.
For example, if the input timestamp is 2019-03-08 15:07:37.232, I would like to have a DateTime object with the value 2019-03-08T15:07:37.232Z
How can I achieve this?
Note that this code, new DateTime(timestamp.getTime(), DateTimeZone.UTC), does not help, since it assumes that the input timestamp is in the local timezone and reconverts it to UTC. For the above input, 2019-03-08 15:07:37.232, the outcome is 2019-03-08T09:37:37.232Z
Thanks.
One solution that worked was to change the way the timestamp column value is retrieved from the database. The snippet below converts a timestamp column value to a Joda DateTime without losing or converting the timezone:
new DateTime(resultSet.getTimestamp("CreatedDateTime",
Calendar.getInstance(TimeZone.getTimeZone("UTC"))), DateTimeZone.UTC)
Hope that helps others too.
If you already have a DateTime object, there is a method
public DateTime withZone(DateTimeZone newZone)
which returns a copy of the datetime with a different time zone, preserving the millisecond instant.
For example: dateTime.withZone(DateTimeZone.UTC)
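For comparison, the same "relabel the wall-clock value as UTC" idea can be sketched with the JDK's java.time classes instead of Joda. This is a different technique from the answers above, and it assumes the driver built the Timestamp in the JVM's default zone, so its wall-clock fields still match what is stored in the column. The class and method names are hypothetical:

```java
import java.sql.Timestamp;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical java.time variant; not the Joda API used in the answers above.
class UtcWallClockSketch {
    // Takes the timestamp's wall-clock fields and labels them as UTC,
    // without shifting them by the JVM's default time zone.
    static OffsetDateTime asUtc(Timestamp timestamp) {
        return timestamp.toLocalDateTime().atOffset(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        Timestamp fromDb = Timestamp.valueOf("2019-03-08 15:07:37.232");
        System.out.println(asUtc(fromDb)); // prints 2019-03-08T15:07:37.232Z
    }
}
```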

How to convert the string '2014-12-31T05:00:00.000+00:00' into a date in Informatica

My date is coming in as a string: '2014-12-31T05:00:00.000+00:00'
How can I convert this into a date in Informatica?
Take a substring of the time string '2014-12-31T05:00:00.000+00:00' to get '2014-12-31'; after all, you need the date part only.
Then parse the date using the to_date function.
to_date('substring date','YYYY-MM-DD')
You can use this:
to_date('2014-12-31T05:00:00.000+00:00', 'YYYY-MM-DD.HH:MI:SS.MS......')
Keep in mind that it ignores the time zone information.
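The two approaches above (cut out the date part and parse it, or parse the whole string and discard the rest) can be sketched in Java as follows. This is only an illustration of the logic, not Informatica expression syntax, and the class and method names are invented:

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;

// Hypothetical demo of both ways to get the date part of an ISO-8601 string.
class IsoDatePartSketch {
    // Option 1: keep the first 10 characters (yyyy-MM-dd) and parse only those.
    static LocalDate bySubstring(String text) {
        return LocalDate.parse(text.substring(0, 10));
    }

    // Option 2: parse the full value, then drop the time and the offset.
    static LocalDate byFullParse(String text) {
        return OffsetDateTime.parse(text).toLocalDate();
    }

    public static void main(String[] args) {
        String input = "2014-12-31T05:00:00.000+00:00";
        System.out.println(bySubstring(input)); // prints 2014-12-31
        System.out.println(byFullParse(input)); // prints 2014-12-31
    }
}
```

Both options discard the offset, so they agree only as long as the date written in the string is the date you want.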

Filter by time and date in Hadoop

I have a table of data which has date and time as two separate fields, where the date format is
dd/mm/yyyy or dd-mm-yyyy and the time format is like hh:mm:ss (e.g. 6:52:53).
I need to filter the records for a particular time period, filtering on both time and date.
Is there any predefined filter available in Hive or Pig?
Hive does recognize certain strings as unixtime dates.
You might try a where condition while concatenating the time & date together into unixtime format.
Some documentation on Hive date functions and formats is located here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
I suppose you have one column with two date formats, i.e. dd/mm/yyyy and dd-mm-yyyy.
What you can try:
1) Replace '/' with '-' so that the whole column is in dd-mm-yyyy format.
2) Concatenate this field with the time field.
3) Filter by casting the concatenated field.
Hope this helps.
Just a possibility: have you tried casting that concatenated field to the date datatype and then using date functions to get the desired output?
e.g. to_date()
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
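The normalize, concatenate, and filter steps suggested above can be sketched in Java to show the logic on a single record. This is not Hive or Pig code; the class and method names are hypothetical, and the range is assumed to be inclusive at the start and exclusive at the end:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo of the replace / concatenate / compare steps.
class DateTimeFilterSketch {
    // A single H accepts one- or two-digit hours such as 6:52:53.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd-MM-yyyy H:mm:ss");

    // Steps 1 and 2: unify the separator, then join date and time.
    static LocalDateTime combine(String date, String time) {
        return LocalDateTime.parse(date.replace('/', '-') + " " + time, FORMAT);
    }

    // Step 3: keep the record if from <= value < to.
    static boolean inRange(String date, String time,
                           LocalDateTime from, LocalDateTime to) {
        LocalDateTime value = combine(date, time);
        return !value.isBefore(from) && value.isBefore(to);
    }

    public static void main(String[] args) {
        LocalDateTime from = LocalDateTime.of(2008, 12, 16, 0, 0, 0);
        LocalDateTime to = LocalDateTime.of(2008, 12, 17, 0, 0, 0);
        System.out.println(inRange("16/12/2008", "6:52:53", from, to)); // prints true
        System.out.println(inRange("17-12-2008", "6:52:53", from, to)); // prints false
    }
}
```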

Oracle - Fetch date/time in milliseconds from DATE datatype field

I have a last_update_date column defined as a DATE field.
I want to get the time in milliseconds.
Currently I have:
TO_CHAR(last_update_date,'YYYY-DD-MM hh:mi:ss am')
But I want to get milliseconds as well.
I googled a bit and think DATE fields will not have milliseconds; only TIMESTAMP fields will.
Is there any way to get milliseconds? I do not have the option to change the data type of the field.
DATE fields on Oracle only store the data down to a second so there is no way to provide anything more precise than that. If you want more precision, you must use another type such as TIMESTAMP.
Here is a link to another SO question regarding Oracle date and time precision.
As RC says, the DATE type only supports a granularity down to the second.
If converting to TIMESTAMP is truly not an option then how about the addition of another numerical column that just holds the milliseconds?
This option would be more cumbersome to deal with than a TIMESTAMP column but it could be workable if converting the type is not possible.
In a similar situation where I couldn't change the fields in a table (I couldn't afford to 'break' third-party software) but needed sub-second precision, I added a 1:1 supplemental table and an after-insert trigger on the original table to post the timestamp into the supplemental table.
If you only need to know the ORDER of records being added within the same second, you could do the same thing, only using a sequence as a data source for the supplemental field.
