Round the timestamp to hour in hive - hadoop

If we have a timestamp in a column like '2018-01-01 01:35:00.000', I want to round the timestamp down to the hour and get the value '2018-01-01 01:00:00.000'.

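So what you are asking for is not rounding but truncating the timestamp to the hour. The trunc function works only on date units (such as year and month), not on time. As a workaround, you can use the snippet below. Note the lowercase yyyy (calendar year; YYYY is the week-based year) and the uppercase HH (24-hour clock; hh is the 12-hour clock):
date_format('2018-01-01 01:35:00.000', 'yyyy-MM-dd HH:00:00.000')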
Result:
2018-01-01 01:00:00.000

You can use the from_unixtime and unix_timestamp functions to parse your input data and produce the required output format. In your case the output format would be
yyyy-MM-dd HH:00:00.000
Sample query:
hive> select from_unixtime(unix_timestamp('2018-01-01 01:35:00.000',"yyyy-MM-dd HH:mm:ss.SSS"),'yyyy-MM-dd HH:00:00.000');
+--------------------------+--+
| _c0 |
+--------------------------+--+
| 2018-01-01 01:00:00.000 |
+--------------------------+--+
(or)
2. If you just want the date, change the output format to yyyy-MM-dd
hive> select from_unixtime(unix_timestamp('2018-01-01 01:35:00.000',"yyyy-MM-dd HH:mm:ss.SSS"),'yyyy-MM-dd');
+-------------+--+
| _c0 |
+-------------+--+
| 2018-01-01 |
+-------------+--+
3. Extract the year and the hour: the output format is yyyy HH
hive> select from_unixtime(unix_timestamp('2018-01-01 01:35:00.000',"yyyy-MM-dd HH:mm:ss.SSS"),'yyyy HH');
+----------+--+
| _c0 |
+----------+--+
| 2018 01 |
+----------+--+

For the question asked:
select from_unixtime(unix_timestamp('2018-01-01 01:35:00.000',"yyyy-MM-dd HH:mm:ss.SSS"),'yyyy-MM-dd HH:00:00.000');
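The query above uses HH rather than hh. In Java-style date patterns, which Hive uses here, hh is the 12-hour clock and HH is the 24-hour clock, so the difference only shows up for afternoon times. A minimal Python sketch of the same distinction, assuming %I and %H as the Python counterparts of hh and HH (the afternoon sample value is made up for illustration):

```python
from datetime import datetime

# Hypothetical afternoon timestamp, chosen so the two clocks disagree.
afternoon = datetime(2018, 1, 1, 13, 35, 0)

# %I is Python's 12-hour field (like hh in Hive/Java patterns);
# %H is the 24-hour field (like HH).
twelve_hour = afternoon.strftime("%Y-%m-%d %I:00:00.000")
twenty_four_hour = afternoon.strftime("%Y-%m-%d %H:00:00.000")

print(twelve_hour)       # 2018-01-01 01:00:00.000
print(twenty_four_hour)  # 2018-01-01 13:00:00.000
```

With the 12-hour field, 13:35 and 01:35 would both truncate to 01:00, which is why the 24-hour field is the safer choice.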
To round down the time column to any granularity
General approach:
Hive time column name: end_time
Input date format of the column: "yyyy-MM-dd HH:mm:ss"
Required output format: "yyyy-MM-dd HH:mm:ss"
Round-off granularity: 15 min
Hive query pattern (900 = 15 * 60 seconds):
from_unixtime(unix_timestamp(<time-column>, '<input-date-format>') - unix_timestamp(<time-column>, '<input-date-format>') % 900, '<output-date-format>')
Let's say the end_time column value is 2019-11-29 08:23:27.
The following command converts end_time (e.g. 2019-11-29 08:23:27 to 2019-11-29 08:15:00), given a granularity of 15 min:
select from_unixtime(unix_timestamp(end_time,"yyyy-MM-dd HH:mm:ss")-unix_timestamp(end_time,"yyyy-MM-dd HH:mm:ss")%900, 'yyyy-MM-dd HH:mm:ss') from <table-name>;
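The subtraction of epoch % granularity can be checked outside Hive. The following is a rough Python sketch of the same arithmetic, not the Hive implementation; floor_timestamp is a hypothetical helper name, and it assumes the value is interpreted as UTC, whereas Hive uses the session time zone.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"

def floor_timestamp(value, granularity_seconds, fmt=FMT):
    """Round a timestamp string down to a multiple of granularity_seconds."""
    # Assumption: treat the value as UTC so the result is reproducible.
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    epoch = int(parsed.timestamp())
    # Same arithmetic as unix_timestamp(x) - unix_timestamp(x) % 900 in the query.
    floored = epoch - epoch % granularity_seconds
    return datetime.fromtimestamp(floored, tz=timezone.utc).strftime(fmt)

print(floor_timestamp("2019-11-29 08:23:27", 15 * 60))  # 2019-11-29 08:15:00
print(floor_timestamp("2018-01-01 01:35:00", 60 * 60))  # 2018-01-01 01:00:00
```

Because the modulo is taken on the epoch value, a time zone whose offset is not a multiple of the granularity could shift the bucket boundaries; that is worth checking before relying on it.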

Related

HIVE Time conversion issue

We are getting a time from the source: "2019-11-03 01:01:00". 2019-11-03 is a daylight saving time changeover day.
Let's say this is the end_time. We have another column in the Hive table, start_time.
The logic to derive the start time is:
start_time = (end_time - 3600)
Issue:
When we apply the same logic during job execution with unix_timestamp(), we get the following results.
Start_time =
select from_unixtime(unix_timestamp('2019-11-03 01:01:00') - 3600 ,'yyyy-MM-dd HH:mm:ss');
+----------------------+--+
| _c0 |
+----------------------+--+
| 2019-11-03 01:01:00 |
+----------------------+--+
also
End_time = select from_unixtime(unix_timestamp('2019-11-03 01:01:00') ,'yyyy-MM-dd HH:mm:ss');
+----------------------+--+
| _c0 |
+----------------------+--+
| 2019-11-03 01:01:00 |
+----------------------+--+
Both return the same result, so start_time = end_time, which is not expected.
We want Start_time = "2019-11-03 00:01:00".
Can someone help?
You are hitting issue HIVE-14305.
One solution is to calculate the date in bash and pass it to your script as a variable:
initial_date="2019-11-03 01:01:00"
datesec="$(date '+%s' --date="$initial_date")"
result_date=$( date --date="@$((datesec - 3600))" "+%Y-%m-%d %H:%M:%S")
echo $result_date
#result 2019-11-03 00:01:00
#call your script like this
hive -hiveconf result_date="$result_date" -f script_name
#In the script use '${hiveconf:result_date}'
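To see why subtracting 3600 seconds from the epoch value can give back the same wall-clock time, here is a small Python sketch. It assumes a zone that moves from UTC-4 to UTC-5 on that day (the question does not name the time zone) and that the input is read as the second occurrence of 01:01:00, after the clocks moved back.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"
# Assumed offsets; the source does not state the actual time zone.
BEFORE_CHANGE = timezone(timedelta(hours=-4))  # daylight time
AFTER_CHANGE = timezone(timedelta(hours=-5))   # standard time

# Second occurrence of 01:01:00, i.e. after the clocks moved back.
end_time = datetime(2019, 11, 3, 1, 1, 0, tzinfo=AFTER_CHANGE)

# Epoch arithmetic: one real hour earlier is still before the change,
# so the daylight-time offset applies when formatting the result.
epoch_start = datetime.fromtimestamp(end_time.timestamp() - 3600, tz=BEFORE_CHANGE)

# Wall-clock arithmetic: ignore time zones and subtract from the naive value.
wall_start = datetime.strptime("2019-11-03 01:01:00", FMT) - timedelta(hours=1)

epoch_result = epoch_start.strftime(FMT)
wall_result = wall_start.strftime(FMT)
print(epoch_result)  # 2019-11-03 01:01:00
print(wall_result)   # 2019-11-03 00:01:00
```

The epoch result matches what the question observed, while the wall-clock result is the value the asker expected.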

How to calculate the remaining time in years using Carbon in Laravel 5.8

I have a table named users with a column 'date'.
The 'date' column contains data like this:
users table
id | name | email | date |
1 | jhon | a@gmail.com | 2021-06-07 |
2 | phil | b@gmail.com | 2020-06-07 |
I want to show rows where 'date' is less than 1 year from now.
From this table, only this row should be shown:
2 | phil | b@gmail.com | 2020-06-07 |
because this date is less than 1 year from now. I am using Laravel.
You can use this query to get users whose date field is earlier than one year before now:
User::query()->where('date', '<', Carbon::now()->subYear())->get();
You want to show users whose "date" is less than a year from now. You can write a query for that quite easily.
$users = User::whereBetween('date', [now(), now()->addYear()])->get();
This filters the results down to rows whose "date" value lies between now and a year from now. now() is a helper function that returns a Carbon date instance.

Hive : get rows where difference between a date and date field is some value

This is my table 'ekko' and I need to get all rows where the difference between today's date and column aedat is greater than 65 days. How can I construct a hive query for the same? I use unix OS.
id | rfid  | aedat      |
---|-------|------------|
1  | 3122  | 2017-12-08 |
2  | 3423  | 2017-12-27 |
3  | 4564  | 2017-11-09 |
4  | 23442 | 2017-10-03 |
In Hive you can use the current_date function, which returns today's date (e.g. 2018-02-26), and then use the datediff function in the where clause to keep rows where the difference between current_date and aedat is greater than 65 days.
With aedat cast to the date type:
hive> select * from ekko where datediff(current_date,cast(aedat as date))>65;
(or)
Without casting aedat to the date type:
hive> select * from ekko where datediff(current_date,aedat)>65;
You can also use from_unixtime(unix_timestamp()) to get the current timestamp.
select * from ekko where datediff(from_unixtime(unix_timestamp()),aedat) > 65;
Or, if your aedat is a string type, use the one below.
select * from ekko where datediff(from_unixtime(unix_timestamp()),cast(aedat as date))>65;
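As a quick sanity check of which sample rows the filter should return, here is a Python sketch of the same day-difference comparison. It is only an illustration: older_than is a hypothetical helper, and today is fixed to 2018-02-26, the example date mentioned above.

```python
from datetime import date

# Sample rows from the question: (id, rfid, aedat).
rows = [
    (1, 3122, "2017-12-08"),
    (2, 3423, "2017-12-27"),
    (3, 4564, "2017-11-09"),
    (4, 23442, "2017-10-03"),
]

def older_than(rows, today, days):
    """Keep rows whose aedat lies more than `days` days before `today`."""
    return [row for row in rows
            if (today - date.fromisoformat(row[2])).days > days]

today = date(2018, 2, 26)  # assumed "current date" for the example
matching_ids = [row[0] for row in older_than(rows, today, 65)]
print(matching_ids)  # [1, 3, 4]
```

Row 2 is excluded because 2017-12-27 is only 61 days before 2018-02-26.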

Hive query to Extract Date and Hour separately from String

I need to extract Date and hour from the string column in hive.
Table:
select TO_DATE(from_unixtime(UNIX_TIMESTAMP(dates,'dd/MM/yyyy'))) from dates;
output:
0016-01-01
0016-01-01
select TO_DATE(from_unixtime(UNIX_TIMESTAMP(dates,'hh'))) from dates;
output:
1970-01-01
1970-01-01
Please advise how to extract the date and the hour separately from the table column.
I've changed the data sample to something more reasonable
with dates as (select explode(array('1/11/16 3:29','12/7/16 17:19')) as dates)
select from_unixtime(unix_timestamp(dates,'dd/MM/yy HH:mm'),'yyyy-MM-dd') as the_date
,from_unixtime(unix_timestamp(dates,'dd/MM/yy HH:mm'),'H') as H
,from_unixtime(unix_timestamp(dates,'dd/MM/yy HH:mm'),'HH') as HH
from dates
+------------+----+----+
| the_date | h | hh |
+------------+----+----+
| 2016-11-01 | 3 | 03 |
| 2016-07-12 | 17 | 17 |
+------------+----+----+
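The same parse-then-format idea can be sketched in Python to check the expected values. This assumes the day/month/two-digit-year layout used in the answer; split_date_and_hour is a hypothetical helper name.

```python
from datetime import datetime

def split_date_and_hour(value, fmt="%d/%m/%y %H:%M"):
    """Parse the string once, then format the date and the hour separately."""
    parsed = datetime.strptime(value, fmt)
    the_date = parsed.strftime("%Y-%m-%d")
    hour_unpadded = str(parsed.hour)      # like the 'H' pattern
    hour_padded = parsed.strftime("%H")   # like the 'HH' pattern
    return the_date, hour_unpadded, hour_padded

for sample in ("1/11/16 3:29", "12/7/16 17:19"):
    print(split_date_and_hour(sample))
# ('2016-11-01', '3', '03')
# ('2016-07-12', '17', '17')
```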

Hive : group column based on max value

I have a table with fields as
date value
10-02-1900 23
09-05-1901 22
10-03-1900 10
10-02-1901 24
....
I have to return maximum value for each year
i.e.,
1900 23
1901 24
I tried the query below but got the wrong answer.
SELECT YEAR(FROM_UNIXTIME(UNIX_TIMESTAMP(date,'dd-mm-yyyy'))) as date,MAX(value) FROM teb GROUP BY date;
Can anyone suggest me a query to do this?
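In the date pattern, mm means minutes; use MM for months. Also group by the extracted year rather than by the raw date column.
Option 1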
select year(from_unixtime(unix_timestamp(date,'dd-MM-yyyy'))) as year
,max(value) as max_value
from t
group by year(from_unixtime(unix_timestamp(date,'dd-MM-yyyy')))
;
Option 2
Before Hive 2.2.0:
set hive.groupby.orderby.position.alias=true;
As of Hive 2.2.0:
set hive.groupby.position.alias=true;
select year(from_unixtime(unix_timestamp(date,'dd-MM-yyyy'))) as date
,max(value)
from t
group by 1
;
+------+-----------+
| year | max_value |
+------+-----------+
| 1900 | 23 |
| 1901 | 24 |
+------+-----------+
P.s.
Another way to extract the year:
from_unixtime(unix_timestamp(date,'dd-MM-yyyy'),'yyyy')
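To double-check the expected result for the sample data, the grouping can be sketched in Python. This is an illustration of the group-by-year logic only; max_per_year is a hypothetical helper, and the rows are the four shown in the question.

```python
from datetime import datetime

# Sample rows from the question: (date as dd-MM-yyyy, value).
rows = [
    ("10-02-1900", 23),
    ("09-05-1901", 22),
    ("10-03-1900", 10),
    ("10-02-1901", 24),
]

def max_per_year(rows, fmt="%d-%m-%Y"):
    """Return the maximum value seen for each year."""
    result = {}
    for raw_date, value in rows:
        year = datetime.strptime(raw_date, fmt).year
        result[year] = max(value, result.get(year, value))
    return result

print(max_per_year(rows))  # {1900: 23, 1901: 24}
```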
