hive get each month end date - hadoop

I want the last day of each month, e.g. 31 for January, 28 for February, and so on.
I tried the query below. It works with current_date, but when I use my date column it returns NULL:
SELECT datediff(CONCAT(y, '-', (m + 1), '-', '01'), CONCAT(y, '-', m, '-', '01')) FROM
(SELECT
month(from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'),'yyyy-MM-dd') ) as m,
year(from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'),'yyyy-MM-dd') ) as y,
day(from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'),'yyyy-MM-dd') )
from table2 ) t
returns :
_c0
NULL
SELECT
month(from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'),'yyyy-MM-dd') ) as m,
year(from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'),'yyyy-MM-dd') ) as y,
day(from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'),'yyyy-MM-dd') )
from table2
returns:
m | y | _c2|
3 |2017| 21|
Thanks in advance.

Use last_day() and day() functions:
hive> select day(last_day(current_date)) ;
OK
31
Time taken: 0.079 seconds, Fetched: 1 row(s)
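last_day() expects a date or a 'yyyy-MM-dd' string, so convert your column before applying last_day(). If C_date is stored as yyyyMMdd, as your query suggests, to_date() alone is unlikely to parse it; the from_unixtime(unix_timestamp(C_date, 'yyyyMMdd'), 'yyyy-MM-dd') expression from your query already produces the required format.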
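As a cross-check outside Hive, the same "parse the yyyyMMdd string, then take the last day of its month" logic can be sketched in Python. The function name last_day_of_month is made up for this illustration, and the sample values are assumed, not taken from table2.

```python
import calendar
from datetime import datetime


def last_day_of_month(yyyymmdd: str) -> int:
    """Return the last day number of the month containing a 'yyyyMMdd' date string."""
    d = datetime.strptime(yyyymmdd, "%Y%m%d")
    # monthrange returns (weekday of the 1st, number of days in the month)
    return calendar.monthrange(d.year, d.month)[1]


print(last_day_of_month("20170321"))  # 31
```

Leap years are handled by the calendar module, so a February date in 2016 gives 29.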

Related

Convert time from a format to an int in Oracle

I can't seem to figure this out. I have some rows with time in the format 00:00:00 (hh:mm:ss) and I need to calculate the total time it takes for a task.
I am unable to sum this data. Can someone advise on a way to convert this to a format I can sum, or a method to calculate the total time for the task?
Thanks for any assistance. This is in an Oracle DB.
Convert your time string to a date and subtract the equivalent date at midnight to give you a number as a fraction of a day. You can then sum this number and convert it to an interval:
Oracle Setup:
CREATE TABLE test_data( value ) AS
SELECT '01:23:45' FROM DUAL UNION ALL
SELECT '12:34:56' FROM DUAL UNION ALL
SELECT '23:45:00' FROM DUAL;
Query:
SELECT NUMTODSINTERVAL(
SUM( TO_DATE( value, 'HH24:MI:SS' ) - TO_DATE( '00:00:00', 'HH24:MI:SS' ) ),
'DAY'
) AS total_time_taken
FROM test_data;
Output:
| TOTAL_TIME_TAKEN |
| :---------------------------- |
| +000000001 13:43:41.000000000 |
Update: handling durations longer than 23:59:59.
Oracle Setup:
CREATE TABLE test_data( value ) AS
SELECT '1:23:45' FROM DUAL UNION ALL
SELECT '12:34:56' FROM DUAL UNION ALL
SELECT '23:45:00' FROM DUAL UNION ALL
SELECT '48:00:00' FROM DUAL;
Query:
SELECT NUMTODSINTERVAL(
SUM(
DATE '1970-01-01'
+ NUMTODSINTERVAL( SUBSTR( value, 1, HM - 1 ), 'HOUR' )
+ NUMTODSINTERVAL( SUBSTR( value, HM + 1, MS - HM - 1 ), 'MINUTE' )
+ NUMTODSINTERVAL( SUBSTR( value, MS + 1 ), 'SECOND' )
- DATE '1970-01-01'
),
'DAY'
) AS total_time
FROM (
SELECT value,
INSTR( value, ':', 1, 1 ) AS HM,
INSTR( value, ':', 1, 2 ) AS MS
FROM test_data
);
Output:
| TOTAL_TIME |
| :---------------------------- |
| +000000003 13:43:41.000000000 |
Even better, change your table to hold the durations as intervals rather than as strings; then everything becomes much simpler:
Oracle Setup:
CREATE TABLE test_data( value ) AS
SELECT INTERVAL '1:23:45' HOUR TO SECOND FROM DUAL UNION ALL
SELECT INTERVAL '12:34:56' HOUR TO SECOND FROM DUAL UNION ALL
SELECT INTERVAL '23:45:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT INTERVAL '48:00:00' HOUR TO SECOND FROM DUAL;
Query:
SELECT NUMTODSINTERVAL(
SUM( DATE '1970-01-01' + value - DATE '1970-01-01' ),
'DAY'
) AS total_time
FROM test_data;
Output:
| TOTAL_TIME |
| :---------------------------- |
| +000000003 13:43:41.000000000 |
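To sanity-check the total above, the same sum can be reproduced in Python by splitting each hh:mm:ss string on the colons, just as the INSTR/SUBSTR query does. This is only a sketch; parse_duration is a hypothetical helper name, and it assumes every value has exactly three colon-separated integer parts.

```python
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Turn 'h:mm:ss' (hours may exceed 23) into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


durations = ["1:23:45", "12:34:56", "23:45:00", "48:00:00"]
# sum() needs an explicit timedelta start value instead of the default 0
total = sum((parse_duration(d) for d in durations), timedelta())
print(total)  # 3 days, 13:43:41
```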

Oracle: Days between two dates excluding weekends: how to handle negative numbers

I have two date columns and am trying to measure the days between them, excluding weekends. I'm getting a negative number and need help fixing it.
Table
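| CalendarDate | DayNumber | FirstAssgn | FirstCnt | DayNumber2 | Id | BusinessDays |
|--------------|-----------|------------|----------|------------|----|--------------|
| 5/21/2017 | Sunday | 5/21/17 | 5/21/17 | Sunday | 1 | -1 |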
Query:
TRUNC(TO_DATE(A.FIRST_CONTACT_DT, 'DD/MM/YYYY')) - TRUNC(TO_DATE(A.FIRST_ASSGN_DT, 'DD/MM/YYYY'))
- ((((TRUNC(A.FIRST_CONTACT_DT,'D'))-(TRUNC(A.FIRST_ASSGN_DT,'D')))/7)*2)
- (CASE WHEN TO_CHAR(A.FIRST_ASSGN_DT,'DY','nls_date_language=english') ='SUN' THEN 1 ELSE 0 END)
- (CASE WHEN TO_CHAR(A.FIRST_CONTACT_DT,'DY','nls_date_language=english')='SAT' THEN 1 ELSE 0 END)
- (SELECT COUNT(1) FROM HUM.CALENDAR CAL
WHERE 1=1
AND CAL.CALENDAR_DATE >= A.FIRST_ASSGN_DT
AND CAL.CALENDAR_DATE < A.FIRST_CONTACT_DT
--BETWEEN A.FIRST_ASSGN_DT AND A.FIRST_CONTACT_DT
AND CAL.GRH_HOLIDAY_IND = 'Y'
) AS Business_Days
It looks like the piece below needs editing:
- (CASE WHEN TO_CHAR(A.FIRST_ASSGN_DT,'DY','nls_date_language=english')='SUN' THEN 1 ELSE 0 END)
Adapted from my answer here:
Get the number of days between the Mondays of both weeks (using TRUNC( datevalue, 'IW' ) as an NLS_LANGUAGE independent method of finding the Monday of the week) then add the day of the week (Monday = 1, Tuesday = 2, etc., to a maximum of 5 to ignore weekends) for the end date and subtract the day of the week for the start date. Like this:
SELECT ( TRUNC( end_date, 'IW' ) - TRUNC( start_date, 'IW' ) ) * 5 / 7
+ LEAST( end_date - TRUNC( end_date, 'IW' ) + 1, 5 )
- LEAST( start_date - TRUNC( start_date, 'IW' ) + 1, 5 )
AS WeekDaysDifference
FROM your_table
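The Monday-based formula above can be checked against a brute-force count. The Python sketch below mirrors it term by term for plain dates with no time part (monday_of plays the role of TRUNC( datevalue, 'IW' )); the function names are invented for this illustration. It counts the weekdays after start_date up to and including end_date.

```python
from datetime import date, timedelta


def monday_of(d: date) -> date:
    """Monday of the ISO week containing d (like TRUNC(d, 'IW'))."""
    return d - timedelta(days=d.weekday())


def weekdays_difference(start: date, end: date) -> int:
    # Whole weeks between the two Mondays, 5 weekdays each
    whole_weeks = (monday_of(end) - monday_of(start)).days * 5 // 7
    # Day of week (Monday = 1), capped at 5 so weekends add nothing
    end_part = min((end - monday_of(end)).days + 1, 5)
    start_part = min((start - monday_of(start)).days + 1, 5)
    return whole_weeks + end_part - start_part


def brute_force(start: date, end: date) -> int:
    """Count Monday-to-Friday dates in (start, end], assuming start <= end."""
    days = (start + timedelta(days=i) for i in range(1, (end - start).days + 1))
    return sum(1 for d in days if d.weekday() < 5)


print(weekdays_difference(date(2017, 5, 21), date(2017, 5, 21)))  # 0
```

For the Sunday-to-same-Sunday row from the question this gives 0 rather than -1.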
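Another approach generates every date in the range, then counts weekdays and weekend days separately:
WITH RANGE_TEMP AS (
SELECT
STARTPERIOD start_date,
ENDPERIOD end_date
FROM
TABLE_DATA -- your table holding the date range (this sketch assumes it returns a single row)
), DATE_TEMP AS (
SELECT
(start_date + LEVEL) DATE_ALL -- every date after start_date, up to and including end_date
FROM
RANGE_TEMP
CONNECT BY LEVEL <= (end_date - start_date)
), WORK_TMP AS (
SELECT
COUNT(DATE_ALL) WORK_DATE -- number of weekdays (Monday to Friday)
FROM
DATE_TEMP
WHERE
-- 'DY' is used instead of 'D' because the numeric day of week depends on NLS_TERRITORY
TO_CHAR(DATE_ALL,'DY', 'NLS_DATE_LANGUAGE=ENGLISH') NOT IN ('SAT','SUN')
), BUSINESS_TMP AS (
SELECT
COUNT(DATE_ALL) BUSINESS_DATE -- despite the name, this counts the weekend days
FROM
DATE_TEMP
WHERE
TO_CHAR(DATE_ALL,'DY', 'NLS_DATE_LANGUAGE=ENGLISH') IN ('SAT','SUN')
)
SELECT
L.WORK_DATE,
H.BUSINESS_DATE
FROM
BUSINESS_TMP H,
WORK_TMP L
;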

find nearest row of different type in oracle

My table looks like
__ Key type timeStamp flag
1 ) 1 B 2015-06-28 22:19:26 Y
2 ) 1 B 2015-06-28 22:20:22 Y
3 ) 1 C 2015-06-28 22:22:06 N
4 ) 1 A 2015-06-28 22:25:11 N
5 ) 1 B 2015-06-28 22:29:44 Y
6 ) 1 A 2015-06-28 22:33:33 N
7 ) 1 B 2015-06-28 22:35:21 N
8 ) 1 B 2015-06-28 22:39:34 Y
9 ) 1 B 2015-06-28 22:43:53 N
10) 1 A 2015-06-28 22:45:53 N
I need to find all rows of type A with flag='N' for which there exists a row of type B with timestamp(B) < timestamp(A), flag(B)='Y' and key(A)=key(B).
Note: if more than one B precedes an A, take the B with the latest timestamp (for rows 8, 9 and 10, row 9 is used rather than row 8).
OUTPUT
__ Key type timeStamp flag
4 ) 1 A 2015-06-28 22:25:11 N
6 ) 1 A 2015-06-28 22:33:33 N
My approach
SELECT *
FROM tab TAB_OUT
WHERE TAB_OUT.TYPE='A'
AND TAB_OUT.FLAG='N'
AND EXISTS(
SELECT *
FROM tab TAB_IN
WHERE TAB_IN.KEY = TAB_OUT.KEY
AND TAB_IN.TYPE='B'
AND TAB_OUT.FLAG='Y'
AND TAB_IN.timestamp<TAB_OUT.timestamp
AND TAB_IN.timestamp = (SELECT MAX(timestamp) from
tab where timestamp< `TAB_OUT.timestamp`)
);
But I cannot reference TAB_OUT.timestamp in the third-level subquery. Is there an alternative way to solve this problem?
My query also does not satisfy the note above: it skips row 9 and satisfies the condition with row 8.
A solution that only requires a single table scan:
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( Key, type, timeStamp, flag ) AS
SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:19:26' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:20:22' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'C', CAST( TIMESTAMP '2015-06-28 22:22:06' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'A', CAST( TIMESTAMP '2015-06-28 22:25:11' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:29:44' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'A', CAST( TIMESTAMP '2015-06-28 22:33:33' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:35:21' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:39:34' AS DATE ), 'Y' FROM DUAL
UNION ALL SELECT 1, 'B', CAST( TIMESTAMP '2015-06-28 22:43:53' AS DATE ), 'N' FROM DUAL
UNION ALL SELECT 1, 'A', CAST( TIMESTAMP '2015-06-28 22:45:53' AS DATE ), 'N' FROM DUAL
Query 1:
SELECT Key,
type,
timeStamp,
flag
FROM (
SELECT Key,
type,
timeStamp,
flag,
LAG( CASE WHEN type = 'B' THEN flag END ) IGNORE NULLS OVER ( PARTITION BY Key ORDER BY timeStamp ) AS prev_b_flag
FROM table_name t
WHERE type IN ( 'A', 'B' )
)
WHERE type = 'A'
AND flag = 'N'
AND prev_b_flag = 'Y'
Results:
| KEY | TYPE | TIMESTAMP | FLAG |
|-----|------|------------------------|------|
| 1 | A | June, 28 2015 22:25:11 | N |
| 1 | A | June, 28 2015 22:33:33 | N |
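The single-pass idea behind LAG( ... ) IGNORE NULLS can be illustrated in Python: walk the rows in timestamp order, remember the flag of the most recent B per key, and keep each A with flag 'N' whose remembered B flag is 'Y'. This is only a sketch of the logic on the sample data; the function name is made up for the example.

```python
rows = [
    (1, "B", "2015-06-28 22:19:26", "Y"),
    (1, "B", "2015-06-28 22:20:22", "Y"),
    (1, "C", "2015-06-28 22:22:06", "N"),
    (1, "A", "2015-06-28 22:25:11", "N"),
    (1, "B", "2015-06-28 22:29:44", "Y"),
    (1, "A", "2015-06-28 22:33:33", "N"),
    (1, "B", "2015-06-28 22:35:21", "N"),
    (1, "B", "2015-06-28 22:39:34", "Y"),
    (1, "B", "2015-06-28 22:43:53", "N"),
    (1, "A", "2015-06-28 22:45:53", "N"),
]


def a_rows_after_flagged_b(rows):
    last_b_flag = {}  # key -> flag of the most recent B row seen so far
    result = []
    # ISO-formatted timestamps sort correctly as plain strings
    for key, type_, ts, flag in sorted(rows, key=lambda r: (r[0], r[2])):
        if type_ == "B":
            last_b_flag[key] = flag
        elif type_ == "A" and flag == "N" and last_b_flag.get(key) == "Y":
            result.append((key, type_, ts, flag))
    return result


for row in a_rows_after_flagged_b(rows):
    print(row)
```

The A at 22:45:53 is dropped because the nearest earlier B (22:43:53) has flag 'N'.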
SELECT
*
FROM
tab A
WHERE
flag = 'N' AND type = 'A'
AND EXISTS (
SELECT
NULL
FROM
tab B
WHERE
type = 'B'
AND A.timestamp > timestamp AND A.Key = Key
GROUP BY
Key
HAVING
MAX(flag) KEEP (DENSE_RANK LAST ORDER BY timestamp) = 'Y'
);
There is no need for a correlated subquery to select the flag from the last record. The aggregate KEEP clause is more efficient: it sorts each group by timestamp and keeps only the last row (the latest timestamp you wanted) for the aggregation, so only that record reaches the MAX function and we simply take its FLAG value.
Here is a simple example:
WITH sample (value1, value2) AS (
SELECT 1, 'Y' FROM DUAL UNION ALL
SELECT 2, 'X' FROM DUAL
)
SELECT
MIN(value2) KEEP (DENSE_RANK LAST ORDER BY value1) value2
FROM
sample
This returns value2 from the record with highest value1.

PL/SQL - Calculate distinct days between overlapping time periods

Imagine this scenario (YYYY/MM/DD):
Start date: 2015/01/01 End date: 2015/08/10
Start date: 2014/10/03 End date: 2015/07/06
Start date: 2015/09/30 End date: 2016/04/28
Using PL/SQL can I calculate the distinct days between these overlapping dates?
Edit: My table has 2 DATE columns, Start_Date and End_Date. The result I'm expecting is 522 days ((2015/08/10 - 2014/10/03) + (2016/04/28 - 2015/09/30))
You can also do this with pure SQL (no need for PL/SQL):
with
minmax as (select min(start_date) min_dt, max(end_date) max_dt from myTable ),
dates as (
SELECT min_dt + rownum-1 dt1
FROM minmax CONNECT BY ROWNUM <= (max_dt - min_dt +1)
)
select count(*) from dates
where exists(
select 1 from MyTable T2
where dates.dt1 between T2.start_date and T2.end_date )
NOTE: this is an idea written from memory and not tested. Adapt the generated dates as needed, with your start date and required length.
Hope it helps.
EDIT: Using actual table dates
Oracle 11g R2 Schema Setup:
CREATE TABLE DATES ( start_date, end_date ) AS
SELECT DATE '2015-01-01', DATE '2015-08-10' FROM DUAL
UNION ALL SELECT DATE '2014-10-03', DATE '2015-07-06' FROM DUAL
UNION ALL SELECT DATE '2015-09-30', DATE '2016-04-28' FROM DUAL
Query 1:
SELECT COUNT( DISTINCT COLUMN_VALUE ) AS number_of_days
FROM DATES d,
TABLE(
CAST(
MULTISET(
SELECT d.START_DATE + LEVEL - 1
FROM DUAL
CONNECT BY d.START_DATE + LEVEL - 1 < d.END_DATE
)
AS SYS.ODCIDATELIST
)
)
ORDER BY 1
Results:
| NUMBER_OF_DAYS |
|----------------|
| 522 |
Query 2 - Check:
SELECT DATE '2015-08-10' - DATE '2014-10-03'
+ DATE '2016-04-28' - DATE '2015-09-30'
FROM DUAL
Results:
| DATE'2015-08-10'-DATE'2014-10-03'+DATE'2016-04-28'-DATE'2015-09-30' |
|---------------------------------------------------------------------|
| 522 |
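The 522 figure can be reproduced independently by collecting every covered day into a set, so overlapping days are counted once. The Python sketch below uses half-open ranges (start included, end excluded), matching the `< d.END_DATE` condition in Query 1; distinct_days is a name chosen for this example.

```python
from datetime import date, timedelta

periods = [
    (date(2015, 1, 1), date(2015, 8, 10)),
    (date(2014, 10, 3), date(2015, 7, 6)),
    (date(2015, 9, 30), date(2016, 4, 28)),
]


def distinct_days(periods):
    """Count distinct calendar days covered by the periods, end date excluded."""
    days = set()
    for start, end in periods:
        for offset in range((end - start).days):
            days.add(start + timedelta(days=offset))
    return len(days)


print(distinct_days(periods))  # 522
```

If both end points should count, use range((end - start).days + 1) instead; the total then grows by one day per non-overlapping block.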

Time difference in oracle

Hi, I have the following table, which contains start time, end time and total time:
STARTTIME | ENDTIME | TOTAL TIME TAKEN |
02-12-2013 01:24:00 | 02-12-2013 04:17:00 | 02:53:00 |
I need to update the TOTAL TIME TAKEN field as shown above using an UPDATE query in Oracle.
For that I have tried the following SELECT query:
select round((endtime-starttime) * 60 * 24,2),
endtime,
starttime
from purge_archive_status_log
but I'm getting 02.53 as the result, whereas the format I expect is 02:53:00. Please let me know how I can do this.
There is probably no reason to have the total_time_taken column in your table at all, since you can always calculate its value. But if you insist on keeping it, it would be better to recreate it as a column of the interval day to second data type, not varchar2 (assuming that is its current data type). Here are two queries to choose from: one returns a value of the interval day to second data type and the other a value of the varchar2 data type.
This query returns difference between two dates as a value of interval day to second data type:
with t1(starttime, endtime, total_time_taken) as (
  select to_date('02-12-2013 01:24:00', 'dd/mm/yyyy hh24:mi:ss')
       , to_date('02-12-2013 04:17:00', 'dd/mm/yyyy hh24:mi:ss')
       , '02:53:00'
    from dual
)
select starttime
     , endtime
     , (endtime - starttime) day(0) to second(0) as total_time_taken
  from t1;
Result:
STARTTIME            ENDTIME              TOTAL_TIME_TAKEN
-------------------  -------------------  ----------------
02-12-2013 01:24:00  02-12-2013 04:17:00  +0 02:53:00
This query returns difference between two dates as a value of varchar2 data type:
with t1(starttime, endtime, total_time_taken) as (
  select to_date('02-12-2013 01:24:00', 'dd/mm/yyyy hh24:mi:ss')
       , to_date('02-12-2013 04:17:00', 'dd/mm/yyyy hh24:mi:ss')
       , '02:53:00'
    from dual
)
select starttime
     , endtime
     , to_char(extract(hour from res), 'fm00') || ':' ||
       to_char(extract(minute from res), 'fm00') || ':' ||
       to_char(extract(second from res), 'fm00') as total_time_taken
  from (select starttime
             , endtime
             , total_time_taken
             , (endtime - starttime) day(0) to second(0) as res
          from t1
       );
Result:
STARTTIME            ENDTIME              TOTAL_TIME_TAKEN
-------------------  -------------------  ----------------
02-12-2013 01:24:00  02-12-2013 04:17:00  02:53:00
Try this too,
WITH TIME AS (
SELECT to_date('02-12-2013 01:24:00', 'dd-mm-yyyy hh24:mi:ss') starttime,
to_date('02-12-2013 04:17:00', 'dd-mm-yyyy hh24:mi:ss') endTime
FROM dual)
SELECT to_char(TRUNC ((endTime - startTime)* 86400 / (60 * 60)), 'fm09')||':'||
to_char(TRUNC (MOD ((endTime - startTime)* 86400, (60*60)) / 60), 'fm09')||':'||
to_char(MOD((endTime - startTime)* 86400, 60), 'fm09') time_diff
FROM TIME;
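The hours/minutes/seconds arithmetic in the last query (divide by 3600, take the remainder, divide by 60) can be checked with a few lines of Python. This is a sketch only: format_elapsed is a hypothetical name, and it assumes the end time is not earlier than the start time.

```python
from datetime import datetime


def format_elapsed(start: datetime, end: datetime) -> str:
    """Format end - start as HH:MM:SS (hours are not wrapped at 24)."""
    total_seconds = int((end - start).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


fmt = "%d-%m-%Y %H:%M:%S"
start = datetime.strptime("02-12-2013 01:24:00", fmt)
end = datetime.strptime("02-12-2013 04:17:00", fmt)
print(format_elapsed(start, end))  # 02:53:00
```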
