Hive : group column based on max value


I have a table with the following fields:
date value
10-02-1900 23
09-05-1901 22
10-03-1900 10
10-02-1901 24
....
I have to return the maximum value for each year,
i.e.:
1900 23
1901 24
I tried the query below, but it returns the wrong answer.
SELECT YEAR(FROM_UNIXTIME(UNIX_TIMESTAMP(date,'dd-mm-yyyy'))) as date,MAX(value) FROM teb GROUP BY date;
Can anyone suggest a query that does this?

Option 1
select year(from_unixtime(unix_timestamp(date,'dd-MM-yyyy'))) as year
,max(value) as max_value
from t
group by year(from_unixtime(unix_timestamp(date,'dd-MM-yyyy')))
;
Option 2
Group by column position, which has to be enabled first.
Before Hive 2.2.0:
set hive.groupby.orderby.position.alias=true;
As of Hive 2.2.0:
set hive.groupby.position.alias=true;
select year(from_unixtime(unix_timestamp(date,'dd-MM-yyyy'))) as date
,max(value)
from t
group by 1
;
+------+-----------+
| year | max_value |
+------+-----------+
| 1900 | 23 |
| 1901 | 24 |
+------+-----------+
P.S.
Another way to extract the year:
from_unixtime(unix_timestamp(date,'dd-MM-yyyy'),'yyyy')
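
A note on why the original query likely went wrong, as far as I can tell: in the pattern 'dd-mm-yyyy' the lowercase mm means minutes (MM is months), and GROUP BY date groups on the raw date column rather than on the extracted year, so every distinct date ends up in its own group. The Python sketch below only illustrates that difference in grouping; it is not Hive code. The max_by helper is a made-up name, and the rows are the four sample rows from the question.

```python
from datetime import datetime

# Sample rows from the question: (date string in dd-MM-yyyy form, value)
rows = [("10-02-1900", 23), ("09-05-1901", 22), ("10-03-1900", 10), ("10-02-1901", 24)]

def max_by(rows, key):
    """Return {key(date_string): max(value)} over the given rows (hypothetical helper)."""
    out = {}
    for d, v in rows:
        k = key(d)
        out[k] = max(out.get(k, v), v)
    return out

# Grouping on the extracted year, which is what Option 1 and Option 2 do
by_year = max_by(rows, lambda d: datetime.strptime(d, "%d-%m-%Y").year)

# Grouping on the raw date string, which is what grouping on the column itself amounts to
by_raw = max_by(rows, lambda d: d)

print(by_year)      # {1900: 23, 1901: 24}
print(len(by_raw))  # 4 groups, one per distinct date
```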

Related

Presto, how to duplicate a record based on a time validity interval

I'm trying to make the following transformation in Presto:
From:
id | valid_from | valid_until | value
12 | 2021/02/17 | 2021/05/17 | 150
To:
id | date | value
12 | 2021/02/17 | 150
12 | 2021/03/17 | 150
12 | 2021/04/17 | 150
12 | 2021/05/17 | 150
Is it possible?
Thanks,

You can use the sequence function to generate an array between valid_from and valid_until with the desired step and then unnest it:
select id, format_datetime(date,'yyyy/MM/dd') as date, value
from my_table
cross join unnest(sequence(parse_datetime(valid_from,'yyyy/MM/dd'),
parse_datetime(valid_until,'yyyy/MM/dd'),
interval '1' month)) t(date)
See the docs for the sequence function: https://prestodb.io/docs/current/functions/array.html
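
To make the one-month stepping concrete, here is a small Python sketch of the same idea (it is not Presto code). The month_sequence helper is a hypothetical name, and clamping the day in short months is an assumption of this sketch that may not match Presto's behaviour; the sample row above is not affected by it.

```python
import calendar
from datetime import date, datetime

def month_sequence(start, stop):
    """Yield dates from start to stop (inclusive) in one-month steps (hypothetical helper)."""
    i = 0
    while True:
        months = start.month - 1 + i
        year = start.year + months // 12
        month = months % 12 + 1
        # Assumption: clamp the day for short months (e.g. the 31st becomes the 28th in February)
        day = min(start.day, calendar.monthrange(year, month)[1])
        current = date(year, month, day)
        if current > stop:
            return
        yield current
        i += 1

# The row from the question: id, valid_from, valid_until, value
row = (12, "2021/02/17", "2021/05/17", 150)
valid_from = datetime.strptime(row[1], "%Y/%m/%d").date()
valid_until = datetime.strptime(row[2], "%Y/%m/%d").date()

expanded = [(row[0], d.strftime("%Y/%m/%d"), row[3])
            for d in month_sequence(valid_from, valid_until)]
for line in expanded:
    print(line)
```

Each generated date becomes its own output row, which is the role of cross join unnest in the query.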

Add extra column as a marker in laravel datatable

I want to select records from the database based on two conditions on each record's date field: records that have reached 30 days as of today, and records that have reached 25 days as of today. For each condition I want to add a column (status) that marks which one matched. The result should look like this:
+--+--------------------+-------------------+-------------------+
|No| Date | Name | Status |
+--+--------------------+-------------------+-------------------+
|1 | 2018-10-17 | John | 30 days |
|2 | 2018-10-22 | Daniel | 25 days |
+--+--------------------+-------------------+-------------------+
This is my query:
$data = DB::table('data_pemohon')->select('*')->whereRaw('tanggal_permohonan + INTERVAL 30 DAY <= NOW()')->orWhereRaw('tanggal_permohonan + INTERVAL 25 DAY <= NOW()')->where('status_paspor','Serahkan paspor')->get();
return Datatables::of($data)->addIndexColumn()->make(true);
How can I do that? Thanks.

Hive : get rows where difference between a date and date field is some value

This is my table 'ekko'. I need to get all rows where the difference between today's date and the column aedat is greater than 65 days. How can I construct a Hive query for this? I use a Unix OS.
id | rfid  | aedat
---|-------|-----------
1  | 3122  | 2017-12-08
2  | 3423  | 2017-12-27
3  | 4564  | 2017-11-09
4  | 23442 | 2017-10-03

In Hive, the current_date function returns today's date (for example, 2018-02-26). Use it with the datediff function in the WHERE clause to keep only the rows where the difference between current_date and aedat is greater than 65 days.
Casting aedat to the date type:
hive> select * from ekko where datediff(current_date,cast(aedat as date))>65;
(or)
Without casting aedat to date type
hive> select * from ekko where datediff(current_date,aedat)>65;
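
As a quick sanity check of the filter, here is the same comparison in Python (not Hive), with today fixed at 2018-02-26 as in the example date above. The variable names are only illustrative.

```python
from datetime import date

# Rows of the 'ekko' table from the question: (id, rfid, aedat)
rows = [
    (1, 3122, date(2017, 12, 8)),
    (2, 3423, date(2017, 12, 27)),
    (3, 4564, date(2017, 11, 9)),
    (4, 23442, date(2017, 10, 3)),
]

# Assumption: "today" is pinned to the example date so the result is reproducible
today = date(2018, 2, 26)

# Difference in whole days between today and aedat, like datediff(current_date, aedat)
diffs = {rid: (today - aedat).days for rid, _, aedat in rows}
old_ids = [rid for rid, d in diffs.items() if d > 65]

print(diffs)    # {1: 80, 2: 61, 3: 109, 4: 146}
print(old_ids)  # [1, 3, 4]
```

With that date, row 2 is only 61 days old and is filtered out.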

You can use from_unixtime(unix_timestamp()) to get the current timestamp; datediff only looks at its date part.
select * from ekko where datediff(from_unixtime(unix_timestamp()),aedat) > 65
Or, if your aedat is a string, cast it explicitly:
select * from ekko where datediff(from_unixtime(unix_timestamp()),cast(aedat as date))>65;

Hive query to Extract Date and Hour separately from String

I need to extract Date and hour from the string column in hive.
Table:
select TO_DATE(from_unixtime(UNIX_TIMESTAMP(dates,'dd/MM/yyyy'))) from dates;
output:
0016-01-01
0016-01-01
select TO_DATE(from_unixtime(UNIX_TIMESTAMP(dates,'hh'))) from dates;
output:
1970-01-01
1970-01-01
Please advise how to extract the date and the hour separately from the table column.

I've changed the sample data to something more reasonable:
with dates as (select explode(array('1/11/16 3:29','12/7/16 17:19')) as dates)
select from_unixtime(unix_timestamp(dates,'dd/MM/yy HH:mm'),'yyyy-MM-dd') as the_date
,from_unixtime(unix_timestamp(dates,'dd/MM/yy HH:mm'),'H') as H
,from_unixtime(unix_timestamp(dates,'dd/MM/yy HH:mm'),'HH') as HH
from dates
+------------+----+----+
| the_date | h | hh |
+------------+----+----+
| 2016-11-01 | 3 | 03 |
| 2016-07-12 | 17 | 17 |
+------------+----+----+
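
The key point is that the pattern has to describe the whole string, including the two-digit year and the time part. The attempts in the question likely failed because yyyy applied to a two-digit year reads 16 as the year 0016, and 'hh' on its own does not describe the full value. Here is the same parsing sketched in Python (not Hive), using the two sample strings from this answer; the dictionary keys mirror the column names in the query.

```python
from datetime import datetime

# Sample strings from the answer: day/month/two-digit year, then hour:minute
samples = ["1/11/16 3:29", "12/7/16 17:19"]

results = []
for s in samples:
    # %y is the two-digit year, %H the 24-hour clock hour
    dt = datetime.strptime(s, "%d/%m/%y %H:%M")
    results.append({
        "the_date": dt.strftime("%Y-%m-%d"),  # date part only
        "H": str(dt.hour),                    # hour without padding
        "HH": f"{dt.hour:02d}",               # hour padded to two digits
    })

for r in results:
    print(r)
```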

Oracle SQL: total days for contiguous ranges

I have a table with date ranges and I need to count the days only for the contiguous date ranges:
-----------------------------------
| table RANGES |
----------------------------------
| d_start | d_end | days |
| (date) | (date) | (num)|
-----------------------------------
| 2014-02-01 | 2014-02-05 | 4 |
| 2014-02-06 | 2014-02-11 | 5 |
| 2014-03-22 | 2014-03-25 | 3 |
| 2014-04-02 | 2014-04-10 | 8 |
| 2014-04-11 | 2014-04-20 | 9 |
-----------------------------------
I need to total the days, with a break wherever the date ranges are not contiguous, to get a result like this:
| 2014-02-01 | 2014-02-11 | 9 |
| 2014-03-22 | 2014-03-25 | 3 |
| 2014-04-02 | 2014-04-20 | 17 |
I tried using LEAD to check whether the next record's d_start follows the current d_end, but I couldn't achieve the goal.
Many thanks for any ideas!
Marco

The answer is quite tricky:
SQL> create table tmp$dates (d_start date, d_end date);
Table created
SQL> insert into tmp$dates values (DATE '2014-02-01', DATE '2014-02-05');
1 row inserted
SQL> insert into tmp$dates values (DATE '2014-02-06', DATE '2014-02-11');
1 row inserted
SQL> insert into tmp$dates values (DATE '2014-03-22', DATE '2014-03-25');
1 row inserted
SQL> insert into tmp$dates values (DATE '2014-04-02', DATE '2014-04-10');
1 row inserted
SQL> insert into tmp$dates values (DATE '2014-04-11', DATE '2014-04-20');
1 row inserted
SQL> select min(d_start), max(d_end), max(d_end) - min(d_start) + 1 n#
     from tmp$dates d
     start with d_start not in (select d_end + 1 from tmp$dates)
     connect by prior d_end = d_start - 1
     group by level - rownum
     order by 1;
MIN(D_START) MAX(D_END) N#
------------ ----------- ----------
01.02.2014 11.02.2014 11
22.03.2014 25.03.2014 4
02.04.2014 20.04.2014 19
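
Note that N# above is the inclusive length of each merged range (11, 4, 19), while the question expected the sum of the days column (9, 3, 17). The Python sketch below (not Oracle code) merges ranges whose start is the day after the previous end and computes both figures, so the two can be compared. The merge_contiguous helper is a hypothetical name and the rows are the sample data from the question.

```python
from datetime import date, timedelta

# Rows of the RANGES table from the question: (d_start, d_end, days)
ranges = [
    (date(2014, 2, 1), date(2014, 2, 5), 4),
    (date(2014, 2, 6), date(2014, 2, 11), 5),
    (date(2014, 3, 22), date(2014, 3, 25), 3),
    (date(2014, 4, 2), date(2014, 4, 10), 8),
    (date(2014, 4, 11), date(2014, 4, 20), 9),
]

def merge_contiguous(ranges):
    """Merge ranges where a start is exactly one day after the previous end (hypothetical helper)."""
    merged = []
    for start, end, days in sorted(ranges):
        if merged and start == merged[-1][1] + timedelta(days=1):
            prev_start, _, prev_days = merged[-1]
            merged[-1] = (prev_start, end, prev_days + days)  # extend the current island
        else:
            merged.append((start, end, days))                 # start a new island
    return merged

merged = merge_contiguous(ranges)
summed_days = [days for _, _, days in merged]                   # sum of the days column
inclusive_days = [(end - start).days + 1 for start, end, _ in merged]  # like N# above

print(summed_days)     # [9, 3, 17]
print(inclusive_days)  # [11, 4, 19]
```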
