Hive Current date function - hadoop

I want to get the current date in beeline.
I tried to use this:
FROM_UNIXTIME(UNIX_TIMESTAMP())
it outputs this:
16-03-21
What I was looking to get is:
2016-03-21 09:34
How do I do it? I see the beeline documentation here:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions
But it didn't work for me.

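You can get it by passing the expected format as the second parameter of the from_unixtime function.
Example:
select from_unixtime(unix_timestamp(),'yyyy-MM-dd HH:mm');
This returns a string such as 2016-03-21 16:34.
Note that minutes are lowercase mm. Uppercase MM is the month, so 'HH:MM' would print the hour followed by the month number (for example 16:03 in March).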
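The month/minute mix-up is easy to reproduce outside Hive. The sketch below is only an analogy in Python, whose strftime uses %m for month and %M for minutes (the reverse casing of the Java-style MM and mm that from_unixtime accepts). The epoch value is a hypothetical fixed instant chosen for illustration, not something taken from the question.

```python
from datetime import datetime, timezone

# Hypothetical fixed instant: 2016-03-21 09:34:00 UTC as epoch seconds.
ts = 1458552840
moment = datetime.fromtimestamp(ts, tz=timezone.utc)

# Correct: month in the date, minutes in the time
# (Hive/Java equivalent: 'yyyy-MM-dd HH:mm').
good = moment.strftime("%Y-%m-%d %H:%M")

# Mistake: month token reused in the minutes slot
# (Hive/Java equivalent: 'yyyy-MM-dd HH:MM').
bad = moment.strftime("%Y-%m-%d %H:%m")

print(good)  # 2016-03-21 09:34
print(bad)   # 2016-03-21 09:03
```

The second line looks plausible at a glance, which is why this kind of bug tends to go unnoticed until the minutes stop changing.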

Try this:
select to_date(from_unixtime(unix_timestamp())) from my_table ...
Results in '2016-03-21'

There are many date functions you can use in Hive (taken from http://atiblog.com/date-function-hive/):
1)from_unixtime:
This function converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a STRING that represents the TIMESTAMP of that moment in the current system time zone in the format of “1970-01-01 00:00:00”. The following example returns the current date including the time.
hive> SELECT FROM_UNIXTIME(UNIX_TIMESTAMP());
OK
2015-05-18 05:43:37
Time taken: 0.153 seconds, Fetched: 1 row(s)
2)from_utc_timestamp:
This function assumes that the string in the first argument is in UTC and converts it to the time zone given in the second argument. This function and the to_utc_timestamp function do time zone conversions. In the following example, the first argument is a string.
hive> SELECT from_utc_timestamp('1970-01-01 07:00:00', 'JST');
OK
1970-01-01 16:00:00
Time taken: 0.148 seconds, Fetched: 1 row(s)
3)to_utc_timestamp:
This function assumes that the string in the first expression is in the timezone that is specified in the second expression, and then converts the value to UTC format. This function and the from_utc_timestamp function do timezone conversions.
hive> SELECT to_utc_timestamp ('1970-01-01 00:00:00','America/Denver');
OK
1970-01-01 07:00:00
Time taken: 0.153 seconds, Fetched: 1 row(s)
4)unix_timestamp :
This function parses the date string using the specified pattern and returns the number of seconds between the specified date and the Unix epoch. If it fails, then it returns 0. The following example returns the value 1237487400 (the exact value depends on the system time zone).
hive> SELECT unix_timestamp ('2009-03-20', 'yyyy-MM-dd');
OK
1237487400
Time taken: 0.156 seconds, Fetched: 1 row(s)
5)unix_timestamp() :This function returns the current time as the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC). The following example converts that value back to a readable string.
hive> select FROM_UNIXTIME( UNIX_TIMESTAMP() );
6)unix_timestamp( string date ) :
This function converts a date string in the format 'yyyy-MM-dd HH:mm:ss' into a Unix timestamp, using the default time zone. It returns the number of seconds between the specified date and the Unix epoch. If it fails, then it returns 0.
hive> select UNIX_TIMESTAMP('2000-01-01 00:00:00');
OK
946665000
Time taken: 0.147 seconds, Fetched: 1 row(s)
7)unix_timestamp( string date, string pattern ) :
This function parses the date string using the specified pattern and returns the number of seconds between that date and the Unix epoch. If it fails, then it returns 0. Because the pattern below is 'yyyy-MM-dd', the time part of the input is ignored, so the result matches the previous example.
hive> select UNIX_TIMESTAMP('2000-01-01 10:20:30','yyyy-MM-dd');
OK
946665000
Time taken: 0.148 seconds, Fetched: 1 row(s)
8)from_unixtime( bigint number_of_seconds [, string format] ) :The FROM_UNIXTIME function converts the specified number of seconds since the Unix epoch into a date string in the format 'yyyy-MM-dd HH:mm:ss', or in the given format when one is supplied.
hive> SELECT FROM_UNIXTIME(UNIX_TIMESTAMP());
9)TO_DATE( string timestamp ) :
The TO_DATE function returns the date part of a timestamp string.
hive> select TO_DATE('2000-01-01 10:20:30');
OK
2000-01-01
10)WEEKOFYEAR( string date )
The WEEKOFYEAR function returns the week number of the date.
hive> SELECT WEEKOFYEAR('2000-03-01 10:20:30');
OK
9
11)DATEDIFF( string date1, string date2 )
The DATEDIFF function returns the number of days between the two given dates.
hive> SELECT DATEDIFF('2000-03-01', '2000-01-10');
OK
51
Time taken: 0.156 seconds, Fetched: 1 row(s)
12)DATE_ADD( string date, int days )
The DATE_ADD function adds the number of days to the specified date.
hive> SELECT DATE_ADD('2000-03-01', 5);
OK
2000-03-06
13)DATE_SUB( string date, int days )
The DATE_SUB function subtracts the number of days from the specified date.
hive> SELECT DATE_SUB('2000-03-01', 5);
OK
2000-02-25
14)DATE CONVERSIONS :Convert a string in MMddyyyy format to a date
Note: M must always be uppercase in the MMddyyyy pattern (lowercase mm means minutes).
select cast(substring(from_unixtime(unix_timestamp(dt, 'MMddyyyy')),1,10) as date) from sample;
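The date arithmetic shown for WEEKOFYEAR, DATEDIFF, DATE_ADD and DATE_SUB can be cross-checked outside Hive. The following is a small Python sketch, not Hive code; it assumes ISO week numbering, which agrees with the WEEKOFYEAR output quoted above for this particular date.

```python
from datetime import date, timedelta

d = date(2000, 3, 1)

# ISO week number (assumed to correspond to WEEKOFYEAR for this date).
week = d.isocalendar()[1]

# Whole days between the two dates, like DATEDIFF(date1, date2).
diff = (d - date(2000, 1, 10)).days

# DATE_ADD and DATE_SUB equivalents; 2000 is a leap year, so February has 29 days.
added = d + timedelta(days=5)
subtracted = d - timedelta(days=5)

print(week, diff, added, subtracted)  # 9 51 2000-03-06 2000-02-25
```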

Related

How Oracle internally deduces the difference between dates

select (current_date - TO_DATE('20210817124015','YYYYMMDDHH24MISS')) from dual;
Outputs:
0.1229282407407407407407407407407407407407
I want to know how Oracle internally arrives at this value.
PS: current_date and the hard-coded date fall on the same day; only the time differs.
CURRENT_DATE returns the current date and time in the user's session time zone.
TO_DATE('20210817124015','YYYYMMDDHH24MISS') returns the date 2021-08-17T12:40:15.
Note: A DATE data type always has year, month, day, hour, minute and second components. However, the user interface you are using may choose not to show all the components.
Subtracting one date from another returns the number of days between the two values.
0.1229282407407407407407407407407407407407 days is:
2.950277778 hours; or
177.016666667 minutes; or
10621 seconds; or
2 hours 57 minutes and 1 second.
So your current date was 2021-08-17T12:40:15 + 10621 seconds or 2021-08-17T15:37:16.
For example:
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD"T"HH24:MI:SS';
ALTER SESSION SET TIME_ZONE = 'Asia/Samarkand';
SELECT CURRENT_DATE,
TO_DATE('20210817124015','YYYYMMDDHH24MISS') As other_date,
CURRENT_DATE - TO_DATE('20210817124015','YYYYMMDDHH24MISS') as difference,
(CURRENT_DATE - TO_DATE('20210817124015','YYYYMMDDHH24MISS')) DAY TO SECOND
as interval_difference
FROM DUAL;
Outputs:
CURRENT_DATE | OTHER_DATE | DIFFERENCE | INTERVAL_DIFFERENCE
2021-08-17T15:40:01 | 2021-08-17T12:40:15 | .124837962962962962962962962962962962963 | +00 02:59:46.000000
Subtracting two dates returns a difference in days.
0.1229282407407407407407407407407407407407 days is
2.9502777777768 hours
177.016666666608 minutes
10621 seconds
Or, put another way, current_date is returning a date value that is 2 hours 57 minutes and 1 second after the hard-coded date. Since the hard-coded date has a time of 12:40:15, that means that current_date has a time of 15:37:16.
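The conversion from a fractional number of days to hours, minutes and seconds is plain arithmetic (one day is 86400 seconds). A short Python sketch of that calculation, using the value from the question, might look like this; it illustrates the arithmetic only and makes no claim about how Oracle implements it internally.

```python
from datetime import datetime, timedelta

# Difference in days, as reported in the question.
days = 0.1229282407407407407407407407407407407407

# One day is 86400 seconds; round to absorb floating-point noise.
seconds = round(days * 86400)

hours, rem = divmod(seconds, 3600)
minutes, secs = divmod(rem, 60)

# Add the difference back to the hard-coded date to recover CURRENT_DATE.
current = datetime(2021, 8, 17, 12, 40, 15) + timedelta(seconds=seconds)

print(seconds, hours, minutes, secs)  # 10621 2 57 1
print(current.isoformat())            # 2021-08-17T15:37:16
```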

Javascript Date conversion in Hive

I have a date column as a string data type in MMMM Do YYYY, HH:mm:ss.SSS
(December 16th 2019, 21:30:22.000) format.
I'm trying to convert this into a timestamp data type in Hive but haven't been able to, because unix_timestamp does not support this format.
Is there any way to convert this in hive?
This method will preserve millisecond precision. First extract only parts compatible with SimpleDateFormat pattern using regex, then convert to datetime, concat with milliseconds (milliseconds lost after unix_timestamp conversion) and convert to timestamp:
select timestamp(concat(from_unixtime(unix_timestamp(dt,'MMM dd yyyy HH:mm:ss.SSS')),'.',split(dt,'\\.')[1]))
from
(select regexp_replace('December 16th 2019, 21:30:22.001','([A-Za-z]+ \\d{1,2})[a-z]{0,2} (\\d{4}), (\\d{2}:\\d{2}:\\d{2}\\.\\d+)','$1 $2 $3') as dt --returns December 16 2019 21:30:22.001
) s;
OK
2019-12-16 21:30:22.001
Time taken: 0.09 seconds, Fetched: 1 row(s)
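The same two-step idea (strip the ordinal suffix and the comma with a regex, then parse what is left) can be tried out in Python before running it in Hive. This is only a sketch of the technique: Python's re module writes back-references as \1 rather than $1, and %B assumes English month names in the current locale.

```python
import re
from datetime import datetime

raw = "December 16th 2019, 21:30:22.001"

# Drop the ordinal suffix ("th", "st", "nd", "rd") and the comma,
# keeping "Month day", "year" and "time" as three captured groups.
pattern = r"([A-Za-z]+ \d{1,2})[a-z]{0,2} (\d{4}), (\d{2}:\d{2}:\d{2}\.\d+)"
cleaned = re.sub(pattern, r"\1 \2 \3", raw)

# %f accepts one to six fractional digits, so the milliseconds survive.
parsed = datetime.strptime(cleaned, "%B %d %Y %H:%M:%S.%f")

print(cleaned)  # December 16 2019 21:30:22.001
print(parsed)   # 2019-12-16 21:30:22.001000
```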
Try this
SELECT from_unixtime(unix_timestamp) as new_timestamp from data ...
That converts a unix timestamp into a YYYY-MM-DD HH:MM:SS format, then you can use the following functions to get the year, month, and day:
SELECT year(new_timestamp) as year, month(new_timestamp) as month, day(new_timestamp) as day

How to convert string date to big int in hive with milliseconds

I have a string 2013-01-01 12:00:01.546 which represents a timestamp with milliseconds that I need to convert to a bigint without losing the milliseconds.
I tried unix_timestamp but I lose the milliseconds:
unix_timestamp('2013-01-01 12:00:01.546','yyyy-MM-dd HH:mm:ss') ==> 1357059601
unix_timestamp('2013-01-01 12:00:01.786','yyyy-MM-dd HH:mm:ss') ==> 1357059601
I tried a format with milliseconds as well, but it made no difference:
unix_timestamp('2013-01-01 12:00:01.786','yyyy-MM-dd HH:mm:ss:SSS') ==> 1357059601
Is there any way to get milliseconds difference in hive?
This is what I came up with so far.
If all your timestamps have a fraction of 3 digits it can be simplified.
with t as (select timestamp '2013-01-01 12:00:01.546' as ts)
select cast ((to_unix_timestamp(ts) + coalesce(cast(regexp_extract(ts,'\\.\\d*',0) as decimal(3,3)),0)) * 1000 as bigint)
from t
1357070401546
Verification of the result:
select from_utc_timestamp (1357070401546,'UTC')
2013-01-01 12:00:01.546000
So apparently unix_timestamp doesn't convert milliseconds. You can use the following approach.
hive> select unix_timestamp(cast(regexp_replace('2013-01-01 12:00:01.546', '(\\d{4})-(\\d{2})-(\\d{2}) (\\d{2}):(\\d{2}):(\\d{2}).(\\d{3})', '$1-$2-$3 $4:$5:$6.$7' ) as timestamp));
OK
1357063201
Hive's unix_timestamp() function drops the millisecond part, so you can multiply the seconds by 1000 and add the milliseconds back (this assumes the fraction always has exactly three digits):
unix_timestamp('2013-01-01 12:00:01.546') * 1000 + cast(split('2013-01-01 12:00:01.546','\\.')[1] as int) => 1357066801546
unix_timestamp('2013-01-01 12:00:01.786') * 1000 + cast(split('2013-01-01 12:00:01.786','\\.')[1] as int) => 1357066801786
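The underlying recipe (parse the whole seconds, then add the fractional part back as milliseconds) can be sketched in Python with integer arithmetic only. The helper name to_epoch_millis is hypothetical, and the sketch assumes the string is in UTC, so its output differs from the values in the answers above, which were produced in other session time zones.

```python
import calendar
import time


def to_epoch_millis(text):
    """Parse 'yyyy-MM-dd HH:mm:ss[.SSS]' as UTC and return epoch milliseconds."""
    whole, _, fraction = text.partition(".")
    # timegm interprets the parsed struct_time as UTC (an assumption here).
    seconds = calendar.timegm(time.strptime(whole, "%Y-%m-%d %H:%M:%S"))
    # Pad or truncate the fraction to exactly three digits.
    millis = int((fraction + "000")[:3])
    return seconds * 1000 + millis


print(to_epoch_millis("2013-01-01 12:00:01.546"))  # 1357041601546
print(to_epoch_millis("2013-01-01 12:00:01.786"))  # 1357041601786
```

Staying in integers avoids the rounding surprises that can appear when a timestamp with a fraction is pushed through a floating-point value.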

Convert Seconds to HH:MM:SS in teradata

I have a decimal(6,2) field storing data in seconds.
I want those seconds to be converted in to HH:MM:SS.
For ex: '19,500' seconds should be shown as 05:25:00
Assuming that the datatype is a typo (you can't store 19500 in a decimal(6,2)):
col * INTERVAL '00:00:01' HOUR TO SECOND -- no fractional seconds
col * INTERVAL '00:00:01.00' HOUR TO SECOND -- fractional seconds
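For reference, the conversion itself is two integer divisions. A minimal Python sketch (the function name seconds_to_hms is hypothetical, and fractional seconds are simply dropped here) reproduces the example from the question:

```python
def seconds_to_hms(total_seconds):
    """Format a number of seconds as HH:MM:SS, discarding any fraction."""
    hours, rem = divmod(int(total_seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(seconds_to_hms(19500))  # 05:25:00
```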

How do I get millisecond precision in hive?

The documentation says that timestamps support the following conversion:
•Floating point numeric types: Interpreted as UNIX timestamp in seconds with decimal precision
First of all, I'm not sure how to interpret this. If I have a timestamp 2013-01-01 12:00:00.423, can I convert this to a numeric type that retains the milliseconds? Because that is what I want.
More generally, I need to do comparisons between timestamps such as
select maxts - mints as latency from mytable
where maxts and mints are timestamp columns. Currently, this gives me a NullPointerException using Hive 0.11.0. I am able to run the query if I do something like
select unix_timestamp(maxts) - unix_timestamp(mints) as latency from mytable
but this only works for seconds, not millisecond precision.
Any help appreciated. Tell me if you need additional information.
If you want to work with milliseconds, don't use the unix timestamp functions, because they treat a date as whole seconds since the epoch.
hive> describe function extended unix_timestamp;
unix_timestamp([date[, pattern]]) - Returns the UNIX timestamp
Converts the current or specified time to number of seconds since 1970-01-01.
Instead, convert the JDBC compliant timestamp to double.
E.g:
Given a tab delimited data:
cat /user/hive/ts/data.txt :
a 2013-01-01 12:00:00.423 2013-01-01 12:00:00.433
b 2013-01-01 12:00:00.423 2013-01-01 12:00:00.733
CREATE EXTERNAL TABLE ts (txt string, st Timestamp, et Timestamp)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/ts';
Then you may query the difference between startTime(st) and endTime(et) in milliseconds as follows:
select
  txt,
  cast(
    round(
      cast((e-s) as double) * 1000
    ) as int
  ) latency
from (select txt, cast(st as double) s, cast(et as double) e from ts) q;
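The round() in that query matters because a double cannot represent most millisecond fractions exactly, so the raw difference is usually slightly above or below the intended value. The Python sketch below mimics the calculation with ordinary floats; the epoch values are hypothetical and assume the sample rows are in UTC.

```python
def latency_ms(start, end):
    """Difference between two epoch-second doubles, in whole milliseconds."""
    return int(round((end - start) * 1000))


# Hypothetical doubles for 2013-01-01 12:00:00.423, .433 and .733 (UTC assumed).
s = 1357041600.423
e1 = 1357041600.433
e2 = 1357041600.733

# The unrounded product is typically not an exact integer.
print((e1 - s) * 1000)

print(latency_ms(s, e1))  # 10
print(latency_ms(s, e2))  # 310
```

Truncating instead of rounding could turn a value just below 10 into 9, which is the failure mode the round() guards against.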

Resources