ORACLE: How to get all columns with GROUP BY on only 1 column?

I'm using an Oracle database.
How can I get all columns while grouping by only 1 column (EMP_ID)?
For example, I have the table ESD_RESULTS:
FIRST_NAME | LAST_NAME | EMP_ID | WRIST_STATUS | LFOOT_STATUS | DATE
Dodo | A | 0101 | Pass | Pass | 2016-01-18 10:00
Wedi | Wil | 0105 | Pass | Pass | 2016-01-18 10:05
Dodo | A | 0101 | Pass | Fail | 2016-01-18 10:11
What I want displayed is (for rows with the same EMP_ID, keep only the latest one by date):
FIRST_NAME | LAST_NAME | EMP_ID | WRIST_STATUS | LFOOT_STATUS | DATE
Dodo | A | 0101 | Pass | Fail | 2016-01-18 10:11
Wedi | Wil | 0105 | Pass | Pass | 2016-01-18 10:05
I tried DISTINCT and GROUP BY, but the result still shows all rows.

One option is to use ROW_NUMBER() to identify the latest record for each employee:
SELECT t.FIRST_NAME,
       t.LAST_NAME,
       t.EMP_ID,
       t.WRIST_STATUS,
       t.LFOOT_STATUS,
       t."DATE"
FROM
(
    -- DATE is a reserved word in Oracle, so the column name must be quoted
    SELECT FIRST_NAME, LAST_NAME, EMP_ID, WRIST_STATUS, LFOOT_STATUS, "DATE",
           ROW_NUMBER() OVER (PARTITION BY EMP_ID ORDER BY "DATE" DESC) rn
    FROM ESD_RESULTS
) t
WHERE t.rn = 1
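To try the ROW_NUMBER() approach without an Oracle instance, here is a minimal sketch in Python. It assumes SQLite (via the standard sqlite3 module, SQLite 3.25 or later for window functions) as a stand-in for Oracle. The column name dt and the ISO-formatted text dates are assumptions made for this sketch, not part of the original table.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE esd_results ("
    "first_name TEXT, last_name TEXT, emp_id TEXT, "
    "wrist_status TEXT, lfoot_status TEXT, dt TEXT)"
)
conn.executemany(
    "INSERT INTO esd_results VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Dodo", "A", "0101", "Pass", "Pass", "2016-01-18 10:00"),
        ("Wedi", "Wil", "0105", "Pass", "Pass", "2016-01-18 10:05"),
        ("Dodo", "A", "0101", "Pass", "Fail", "2016-01-18 10:11"),
    ],
)

# Number the rows of each employee from newest to oldest, then keep row 1.
# ISO-formatted text dates sort chronologically, so ORDER BY dt DESC is enough.
latest_rows = conn.execute(
    """
    SELECT first_name, last_name, emp_id, wrist_status, lfoot_status, dt
    FROM (
        SELECT e.*,
               ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY dt DESC) AS rn
        FROM esd_results e
    )
    WHERE rn = 1
    ORDER BY emp_id
    """
).fetchall()

for row in latest_rows:
    print(row)
```

If two rows of one employee can share the same timestamp, ROW_NUMBER() picks one of them arbitrarily unless a tie-breaking column is added to the ORDER BY.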

Since presumably the first name and the last name are determined by the emp_id (they don't change from one row to another), you might as well group by all three columns - resulting in less work. (On the other hand, it would make more sense to normalize your table design; one table shows the associated first name and last name for each emp_id, there is no need to repeat the first name and last name in "this" table, which you show in your post.)
Then: you can use the FIRST/LAST function, with keep (dense_rank ...), as demonstrated below, to eliminate the need for a subquery and an outer query. If there is the possibility of two rows having the exact same date and time for an emp_id, you may refine the query to accommodate "tie-breaks" of some kind. If there are no ties, then the query will work without modification.
DATE is a reserved word in Oracle and shouldn't be used for table or column names, so I changed it to DT.
with
test_data ( first_name, last_name, emp_id, wrist_status, lfoot_status, dt ) as (
select 'Dodo', 'A' , 0101, 'Pass', 'Pass', to_date('2016-01-18 10:00', 'yyyy-mm-dd hh24:mi') from dual union all
select 'Wedi', 'Wil', 0105, 'Pass', 'Pass', to_date('2016-01-18 10:05', 'yyyy-mm-dd hh24:mi') from dual union all
select 'Dodo', 'A' , 0101, 'Pass', 'Fail', to_date('2016-01-18 10:11', 'yyyy-mm-dd hh24:mi') from dual
)
-- end of test data (NOT part of the solution); SQL query begins BELOW THIS LINE
select first_name, last_name, emp_id,
min(wrist_status) keep (dense_rank last order by dt) as wrist_status,
min(lfoot_status) keep (dense_rank last order by dt) as lfoot_status,
max(dt) as dt
from test_data
group by first_name, last_name, emp_id
;
FIRST_NAME LAST_NAME EMP_ID WRIST_STATUS LFOOT_STATUS DT
---------- --------- ---------- ------------ ------------ ----------------
Dodo A 101 Pass Fail 2016-01-18 10:11
Wedi Wil 105 Pass Pass 2016-01-18 10:05
2 rows selected.

Related

Separating Overlapping Date Ranges in Oracle

I have data with overlapping date ranges. Example below:
Customer_ID | FAC_NUM | Start_Date | End_Date | New_Monies
12345 | ABC1234 | 26/NOV/2014 | 26/MAY/2015 | 100000
12345 | ABC1234 | 12/DEC/2014 | 12/JUN/2015 | 200000
12345 | ABC1234 | 15/JUN/2015 | 15/DEC/2015 | 500000
12345 | ABC1234 | 20/DEC/2015 | 20/JUN/2016 | 600000
I want to convert this table into data with non-overlapping ranges, such that for each overlapping period the New_Monies column is summed and shown as a new row. For the example above, I want the output to be as follows:
Customer_ID | FAC_NUM | Start_Date | End_Date | New_Monies
12345 | ABC1234 | 26/NOV/2014 | 11/DEC/2014 | 100000
12345 | ABC1234 | 12/DEC/2014 | 26/MAY/2015 | 300000
12345 | ABC1234 | 27/MAY/2015 | 12/JUN/2015 | 200000
12345 | ABC1234 | 15/JUN/2015 | 15/DEC/2015 | 500000
12345 | ABC1234 | 20/DEC/2015 | 20/JUN/2016 | 600000
Row 2 above being the overlapping period of 12 Dec 2014 to 26 May 2015 showing the total New_Monies as 300000 (100000+200000)
What would be the best way to do this in Oracle?
Thanks in advance for your support.
Regards,
Ani
with
prep (customer_id, fac_num, dt, amount) as (
select t.customer_id, t.fac_num,
case h.col when 's' then t.start_date else t.end_date + 1 end as dt,
case h.col when 's' then t.new_monies else - t.new_monies end as amount
from sample_data t
cross join
(select 's' as col from dual union all select 'e' from dual) h
)
, cumul_sums (customer_id, fac_num, dt, amount) as (
select distinct
customer_id, fac_num, dt,
sum(amount) over (partition by customer_id, fac_num order by dt)
from prep
)
, with_intervals (customer_id, fac_num, start_date, end_date, amount) as (
select customer_id, fac_num, dt,
lead(dt) over (partition by customer_id, fac_num order by dt) - 1,
amount
from cumul_sums
)
select customer_id, fac_num, start_date, end_date, amount
from with_intervals
where end_date is not null
order by customer_id, fac_num, start_date
;
The prep subquery unpivots the inputs, while at the same time changing the "end date" to the "start date" of the following interval and assigning a positive amount to the "start date" and the negative of the same amount to the following "start date". cumul_sums computes the cumulative sums; note that if two or more intervals begin on the same date (so the same date from prep appears multiple times for a customer and fac_num), the analytic sum will include the amounts from ALL the rows up to that date - the default windowing clause is range between...... After the cumulative sums are computed, this subquery also de-duplicates the output rows (to handle precisely that complication, of multiple intervals starting on the same date). with_intervals recovers the "start date" - "end date" intervals, and the final step simply removes the last interval ("to infinity") which would have an "amount" of zero.
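The same three steps (unpivot into signed events, cumulative sum per date, pair each date with the next one) can be sketched outside the database. The following Python version is only an illustration of the idea for a single customer_id and fac_num. The function name split_overlaps and the tuple layout are hypothetical, not taken from the query above.

```python
from collections import defaultdict
from datetime import date, timedelta


def split_overlaps(rows):
    """Split (start_date, end_date, amount) rows into non-overlapping intervals."""
    one_day = timedelta(days=1)
    deltas = defaultdict(int)
    for start, end, amount in rows:
        deltas[start] += amount          # the amount switches on at the start date
        deltas[end + one_day] -= amount  # and off on the day after the end date
    # Summing per date plays the role of the cumulative sum plus DISTINCT.
    points = sorted(deltas)
    result = []
    running = 0
    # Pairing each date with the next one drops the last, open-ended interval.
    for current, following in zip(points, points[1:]):
        running += deltas[current]
        result.append((current, following - one_day, running))
    return result


sample = [
    (date(2014, 11, 26), date(2015, 5, 26), 100000),
    (date(2014, 12, 12), date(2015, 6, 12), 200000),
    (date(2015, 6, 15), date(2015, 12, 15), 500000),
    (date(2015, 12, 20), date(2016, 6, 20), 600000),
]
intervals = split_overlaps(sample)
for start, end, amount in intervals:
    print(start, end, amount)
```

In this sketch, the gaps between ranges (for example 13 to 14 June 2015) come out as rows with an amount of 0. Filter on a non-zero amount if those rows are not wanted.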
EDIT This solution answers the OP's original question. After posting the solution, the OP changed the question. The solution can be changed easily to address the new formulation. I'm not going to chase shadows though; the solution will remain as is.
Here is a way to do this.
with all_data
as (select Customer_ID,FAC_NUM,start_date as dt,new_monies as calc_monies
from t
union all
select Customer_ID,FAC_NUM,end_date as dt,new_monies*-1 as calc_monies
from t
)
select x.customer_id
,x.fac_num
,x.start_date
,case when row_number() over(order by end_date desc)=1 then
x.end_date + 1
else x.end_date
end as new_end_date
,x.new_monies
from (
select t.customer_id
,t.fac_num
,t.dt as start_date
,lead(dt) over(order by dt)-1 as end_date
,sum(calc_monies) over(order by dt) as new_monies
from all_data t
)x
where end_date is not null
order by 3
db fiddle link
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=856c9ac0954e45429994f4ac45699e6f
+-------------+---------+------------+--------------+------------+
| CUSTOMER_ID | FAC_NUM | START_DATE | NEW_END_DATE | NEW_MONIES |
+-------------+---------+------------+--------------+------------+
| 12345 | ABC1234 | 26-NOV-14 | 11-DEC-14 | 100000 |
| 12345 | ABC1234 | 12-DEC-14 | 25-MAY-15 | 300000 |
| 12345 | ABC1234 | 26-MAY-15 | 12-JUN-15 | 200000 |
+-------------+---------+------------+--------------+------------+

Oracle - how to update a unique row based on MAX effective date which is part of the unique index

Oracle - Say you have a table that has a unique key on name, ssn and effective date. The effective date makes it unique. What is the best way to update a current indicator to show inactive for the rows with dates less than the max effective date? I can't really wrap my head around it since there are multiple rows with the same name and ssn combinations. I haven't been able to find this scenario on here for Oracle and I'm having developer's block. Thanks.
"All name/ssn having a max effective date earlier than this time yesterday:"
SELECT name, ssn
FROM t
GROUP BY name, ssn
HAVING MAX(eff_date) < SYSDATE - 1
Oracle supports multi-column IN, so:
UPDATE t
SET current_indicator = 'inactive'
WHERE (name,ssn,eff_date) IN (
SELECT name, ssn, max(eff_date)
FROM t
GROUP BY name, ssn
HAVING MAX(eff_date) < SYSDATE - 1
)
Use a MERGE statement using an analytic function to identify the rows to update and then merge on the ROWID pseudo-column so that Oracle can efficiently identify the rows to update (without having to perform an expensive self-join by comparing the values):
MERGE INTO table_name dst
USING (
SELECT rid,
max_eff_date
FROM (
SELECT ROWID AS rid,
effective_date,
status,
MAX( effective_date ) OVER ( PARTITION BY name, ssn ) AS max_eff_date
FROM table_name
)
WHERE ( effective_date < max_eff_date AND status <> 'inactive' )
OR ( effective_date = max_eff_date AND status <> 'active' )
) src
ON ( dst.ROWID = src.rid )
WHEN MATCHED THEN
UPDATE
SET status = CASE
WHEN src.max_eff_date = dst.effective_date
THEN 'active'
ELSE 'inactive'
END;
So, for some sample data:
CREATE TABLE table_name ( name, ssn, effective_date, status ) AS
SELECT 'aaa', 1, DATE '2020-01-01', 'inactive' FROM DUAL UNION ALL
SELECT 'aaa', 1, DATE '2020-01-02', 'inactive' FROM DUAL UNION ALL
SELECT 'aaa', 1, DATE '2020-01-03', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 2, DATE '2020-01-01', 'active' FROM DUAL UNION ALL
SELECT 'bbb', 2, DATE '2020-01-02', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 3, DATE '2020-01-01', 'inactive' FROM DUAL UNION ALL
SELECT 'bbb', 3, DATE '2020-01-03', 'active' FROM DUAL;
The query only updates the 3 rows that need changing and:
SELECT *
FROM table_name;
Outputs:
NAME | SSN | EFFECTIVE_DATE | STATUS
:--- | --: | :------------- | :-------
aaa | 1 | 01-JAN-20 | inactive
aaa | 1 | 02-JAN-20 | inactive
aaa | 1 | 03-JAN-20 | active
bbb | 2 | 01-JAN-20 | inactive
bbb | 2 | 02-JAN-20 | active
bbb | 3 | 01-JAN-20 | inactive
bbb | 3 | 03-JAN-20 | active

How to update table in Hive 0.13?

My Hive version is 0.13. I have two tables, table_1 and table_2
table_1 contains:
customer_id | items | price | updated_date
------------+-------+-------+-------------
10 | watch | 1000 | 20170626
11 | bat | 400 | 20170625
table_2 contains:
customer_id | items | price | updated_date
------------+----------+-------+-------------
10 | computer | 20000 | 20170624
I want to update the records of table_2 if the customer_id already exists in it; if not, the record should be appended to table_2.
As Hive 0.13 does not support UPDATE, I tried using a join, but it fails.
You can use row_number or a full join. This is an example using row_number:
insert overwrite table table_2
select customer_id, items, price, updated_date
from
(
  select customer_id, items, price, updated_date,
         row_number() over(partition by customer_id order by new_flag desc) rn
  from
  (
    -- table_1 holds the incoming rows, so they win over the existing rows of table_2
    select customer_id, items, price, updated_date, 1 as new_flag
    from table_1
    union all
    select customer_id, items, price, updated_date, 0 as new_flag
    from table_2
  ) all_data
) s where rn=1;
Also see this answer for update using FULL JOIN: https://stackoverflow.com/a/37744071/2700344

NULL values not found in cursor

I am trying to:
1. Create a cursor that gets all the current prices of items in a store.
2. Bulk collect the cursor and loop, upserting into the STORE_INVENTORY table with a MERGE statement.
3. NULL out the PRICE column of the rows in STORE_INVENTORY that are not in the cursor.
How can step 3 be done? I can already do steps 1 and 2, since I have already updated or inserted the items that are pulled from the cursor.
Here is some example data:
There are three source tables, which are updated by an external party. My objective is to merge these three sources of data into a single table.
SOURCE TABLES
ITEM_TYPES
DESC_ID | TYPE
A | Kitchen
B | Bath
ITEM_MANIFEST
LOC_ID | ORIGIN
U | USA
C | CHINA
ITEM_PRICE
ITEM_ID | PRICE | DESC_ID | LOC_ID | DATE
0 | 3.99 | A | U | 9/11/2015
1 | 2.99 | B | C | 9/11/2015
2 | 1.99 | A | U | 9/05/2015
DESTINATION TABLE
STORE_INVENTORY
ITEM_ID | TYPE | ORIGIN | PRICE
0 | Kitchen | CHINA | 3.99
8 | Bath | USA | 2.99
The SQL procedure takes a date as a parameter and only pulls rows from ITEM_PRICE that are dated after the given date.
If I execute the procedure with the passed-in date 9/10/2015:
Expected Output
STORE_INVENTORY
0 | Kitchen | USA | 3.99
1 | Bath | China | 2.99
8 | Bath | USA | NULL
So, something like this, then?
drop table item_description;
drop table item_manifest;
drop table item_price;
drop table store_inventory;
create table item_description
as
select 'A' desc_id, 'Kitchen' type from dual union all
select 'B' desc_id, 'Bath' type from dual;
create table item_manifest
as
select 'U' loc_id, 'USA' origin from dual union all
select 'C' loc_id, 'CHINA' origin from dual;
create table item_price
as
select 0 item_id, 3.99 price, 'A' desc_id, 'U' loc_id, to_date('11/09/2015', 'dd/mm/yyyy') dt from dual union all
select 1 item_id, 2.99 price, 'B' desc_id, 'C' loc_id, to_date('11/09/2015', 'dd/mm/yyyy') dt from dual union all
select 2 item_id, 1.99 price, 'A' desc_id, 'U' loc_id, to_date('05/09/2015', 'dd/mm/yyyy') dt from dual;
create table store_inventory
as
select 0 item_id, 'Kitchen' type, 'CHINA' origin, 3.99 price from dual union all
select 8 item_id, 'Bath' type, 'USA' origin, 2.99 price from dual;
select * from store_inventory;
ITEM_ID TYPE ORIGIN PRICE
---------- ------- ------ ----------
0 Kitchen CHINA 3.99
8 Bath USA 2.99
select coalesce(ip.item_id, si.item_id) item_id,
coalesce(id.type, si.type) type,
coalesce(im.origin, si.origin) origin,
ip.price
from item_description id
inner join item_price ip on (id.desc_id = ip.desc_id and ip.dt > to_date('10/09/2015', 'dd/mm/yyyy')) -- use a parameter for the date here
inner join item_manifest im on (ip.loc_id = im.loc_id)
full outer join store_inventory si on (si.item_id = ip.item_id);
ITEM_ID TYPE ORIGIN PRICE
---------- ------- ------ ----------
0 Kitchen USA 3.99
8 Bath USA
1 Bath CHINA 2.99
merge into store_inventory tgt
using (select coalesce(ip.item_id, si.item_id) item_id,
coalesce(id.type, si.type) type,
coalesce(im.origin, si.origin) origin,
ip.price
from item_description id
inner join item_price ip on (id.desc_id = ip.desc_id and ip.dt > to_date('10/09/2015', 'dd/mm/yyyy')) -- use a parameter for the date here
inner join item_manifest im on (ip.loc_id = im.loc_id)
full outer join store_inventory si on (si.item_id = ip.item_id)) src
on (src.item_id = tgt.item_id)
when matched then
update set tgt.type = src.type,
tgt.origin = src.origin,
tgt.price = src.price
when not matched then
insert (tgt.item_id, tgt.type, tgt.origin, tgt.price)
values (src.item_id, src.type, src.origin, src.price);
commit;
select * from store_inventory;
ITEM_ID TYPE ORIGIN PRICE
---------- ------- ------ ----------
0 Kitchen USA 3.99
8 Bath USA
1 Bath CHINA 2.99
Obviously, your procedure would have an input parameter of DATE datatype to pass into the query, and your query would use the parameter, rather than a hardcoded date like I did in my example. E.g. ip.dt > p_cutoff_date
I can do step 1 and 2 already as I have already updated or inserted
the items that are pulled from the cursor.
Hmm. These steps seem unnecessary - why not do them as part of the MERGE statement? What does the store_inventory table look like before you do your insert/update from the cursor? Also, what is the cursor you're using to do this?
Couldn't you do a date-limited subselect of ITEM_PRICE.PRICE, after pulling in the TYPE and ORIGIN via the main join to ITEM_PRICE, without limiting on date?
That is, something like:
select ITEM_ID, TYPE, ORIGIN
/* not selecting PRICE in the main join */
,(select PRICE from ITEM_PRICE where your join conditions
and DATE >= your param)
from ITEM_TYPES, ITEM_MANIFEST, ITEM_PRICE
where your join conditions, but no criteria on DATE
Sorry, would be clearer and easier to type up if you had provided your existing query.
From re-reading your question, I am unsure if you are inserting only 2 rows but want to get 3. Or if you have 3 rows, but you want to NULL out the missing price.
If the target table already has the 3 rows, then, instead of a CURSOR-based approach (which can be slow on high volumes and is fussy to write), why not do an UPDATE instead, with DATE as a criterion? NULL will be assigned to PRICE if there is no match; that's how UPDATEs with a scalar subquery work.
UPDATE STORE_INVENTORY set PRICE
= (select PRICE from ITEM_PRICE where your join conditions
and DATE >= your param)
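The point that an UPDATE driven by a scalar subquery assigns NULL when the subquery finds no row can be checked with a small sketch. It assumes SQLite through Python's sqlite3 module as a stand-in for Oracle. The simplified two-column store_inventory, the dt column name, and the join on item_id are assumptions made for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item_price (item_id INTEGER, price REAL, dt TEXT);
    INSERT INTO item_price VALUES (0, 3.99, '2015-09-11');
    INSERT INTO item_price VALUES (1, 2.99, '2015-09-11');
    INSERT INTO item_price VALUES (2, 1.99, '2015-09-05');

    CREATE TABLE store_inventory (item_id INTEGER, price REAL);
    INSERT INTO store_inventory VALUES (0, 3.99);
    INSERT INTO store_inventory VALUES (1, 2.99);
    INSERT INTO store_inventory VALUES (8, 2.99);
    """
)

cutoff = "2015-09-10"  # plays the role of the procedure's date parameter

# Every row of store_inventory is updated. Where the scalar subquery finds
# no matching price on or after the cutoff, it yields NULL.
conn.execute(
    """
    UPDATE store_inventory
    SET price = (SELECT ip.price
                 FROM item_price ip
                 WHERE ip.item_id = store_inventory.item_id
                   AND ip.dt >= ?)
    """,
    (cutoff,),
)

rows = conn.execute(
    "SELECT item_id, price FROM store_inventory ORDER BY item_id"
).fetchall()
print(rows)
```

Item 8 has no row in item_price, so its price ends up as NULL (None in Python), while items 0 and 1 keep their current prices.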

Using DISTINCT for specific columns

select distinct employee_id, first_name, commission_pct, department_id from
employees;
When I use the above query, it returns the distinct combinations of all the listed columns. As employee_id (the primary key of employees) is unique, the query returns every row in the table.
I want a result set that has the distinct combinations of commission_pct and department_id. How should the query be formed? When I tried to put DISTINCT in the middle, as in
select employee_id, first_name, distinct commission_pct, department_id from
employees;
it results in an error:
ORA-00936: missing expression
How do I form a query whose results have only the distinct combinations of commission_pct and department_id? The table is from the HR schema of Oracle.
What you request is impossible. You cannot select all the employee ids but have only distinct commission_pct and department_id.
So think it over, what you want to show:
All distinct commission_pct, department_id only?
All distinct commission_pct, department_id and the number of relevant employees?
All distinct commission_pct, department_id and the relevant employees comma separated?
All employees, but with nulls when commission_pct and department_id are the same as in the line before?
The first can be solved with DISTINCT. The second and third with GROUP BY (plus count or listagg). The last would be solved with the analytic function LAG.
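The first three options can be sketched as follows. This is a minimal illustration that assumes SQLite through Python's sqlite3 module instead of Oracle, with GROUP_CONCAT standing in for Oracle's LISTAGG. The three sample employees are made up for the example and are not rows of the HR schema. The LAG variant is not shown.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT,
                            commission_pct REAL, department_id INTEGER);
    INSERT INTO employees VALUES (1, 'Ann', 0.1, 80);
    INSERT INTO employees VALUES (2, 'Bob', 0.1, 80);
    INSERT INTO employees VALUES (3, 'Cy',  0.2, 80);
    """
)

# Option 1: only the distinct (commission_pct, department_id) pairs.
distinct_pairs = conn.execute(
    "SELECT DISTINCT commission_pct, department_id FROM employees "
    "ORDER BY commission_pct, department_id"
).fetchall()

# Options 2 and 3: one row per pair, with the employee count and the
# employee names collapsed into a single comma-separated string.
grouped = conn.execute(
    """
    SELECT commission_pct, department_id,
           COUNT(*) AS employee_count,
           GROUP_CONCAT(first_name, ', ') AS employee_names
    FROM employees
    GROUP BY commission_pct, department_id
    ORDER BY commission_pct, department_id
    """
).fetchall()

print(distinct_pairs)
print(grouped)
```

The order of names inside the concatenated string is not guaranteed here. In Oracle, LISTAGG takes a WITHIN GROUP (ORDER BY ...) clause to fix it.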
You have to remove the first two columns before using DISTINCT:
select distinct commission_pct, department_id from
employees;
Indeed, if your second query worked, what would you expect to see in the first two columns? Consider this example data:
| employee_id | first_name | commission_pct | department_id |
| 1 | "x" | "b" | 3 |
| 2 | "y" | "b" | 3 |
| 1 | "x" | "c" | 4 |
| 2 | "y" | "c" | 4 |
You expect to get a result with only two rows, like this:
| employee_id | first_name | commission_pct | department_id |
| ? | ? | "b" | 3 |
| ? | ? | "c" | 4 |
But what do you expect in the first two columns?
Can you try this one?
SELECT
NAME1,
PH
FROM
(WITH T
AS (SELECT
'mark' NAME1,
'1234567' PH
FROM
DUAL
UNION ALL
SELECT
'bailey',
'456789'
FROM
DUAL
UNION ALL
SELECT
'mark',
'987654'
FROM
DUAL)
SELECT
NAME1,
PH,
ROW_NUMBER ( ) OVER (PARTITION BY NAME1 ORDER BY NAME1) SEQ
FROM
T)
WHERE
SEQ = 1;
If you don't care which specific row is returned, use an aggregate function:
SELECT
NAME1,
MAX ( PH ) PH
FROM
T
GROUP BY
NAME1;
