I have the below table - AccountDetails:
Account_No  Request_Id  Issue_date  Amount  Details
1           567         20150607    $156    Loan
2           789         20170406    $765    Personal
3                       20170216    $897
3           987         20160525    $345    Loan
3           456         20170112    $556    Loan
4           234         20171118    $987    Loan
I have to update the Request_Id, where Request_Id is null or Details is null, for an account with the below logic.
I need to get the latest Request_Id for the account based on the Issue_date and update the Request_Id to that latest Request_Id + 1 where Request_Id is null or Details is null. So the result should be:
Account_No  Request_Id  Issue_date  Amount  Details
1           567         20150607    $156    Loan
2           789         20170406    $765    Personal
3           457         20170216    $897
3           987         20160525    $345    Loan
3           456         20170112    $556    Loan
4           234         20171118    $987    Loan
I tried with the below query
MERGE INTO AccountDetails a
USING ( select Request_Id + 1,ROW_NUMBER() OVER (PARTITION BY B.Account_No
ORDER BY B.Issue_date desc) AS RANK_NO
from AccountDetails ) b
ON ( a.Account_No = b.Account_No AND a.DETAILS IS NULL)
WHEN MATCHED THEN
UPDATE SET a.Request_Id = b.Request_Id
WHERE B.RANK_NO = 1;
Sounds like you need to use the analytic LAG function to determine the previous row's request_id, e.g.:
MERGE INTO accountdetails tgt
USING (SELECT account_no,
              CASE WHEN request_id IS NULL
                   THEN 1 + LAG(request_id) OVER (PARTITION BY account_no ORDER BY issue_date)
                   ELSE request_id
              END AS request_id,
              issue_date,
              amount,
              details,
              ROWID AS r_id
       FROM   accountdetails) src
ON (tgt.ROWID = src.r_id)
WHEN MATCHED THEN
  UPDATE SET tgt.request_id = src.request_id;
Of course, this design seems a little odd - why is request_id null in the first place? Is it a unique column? If so, what happens if you end up duplicating an existing request_id with your replacement id? Also, what should happen if the first row for an account number has a null request_id?
update accountdetails set request_id=(select max(request_id)+1 from accountdetails)
where request_id is null and details is null;
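If the replacement id should instead come from the same account's latest Issue_date (as the question describes) rather than the overall maximum, a correlated variant along these lines might be closer - a sketch only, restricted to rows whose request_id is null so existing ids are not overwritten:

UPDATE accountdetails a
SET    a.request_id = (SELECT MAX(b.request_id) KEEP (DENSE_RANK LAST ORDER BY b.issue_date) + 1
                       -- request_id of the latest dated row within the account, plus 1
                       FROM   accountdetails b
                       WHERE  b.account_no = a.account_no
                       AND    b.request_id IS NOT NULL)
WHERE  a.request_id IS NULL;

Here MAX(...) KEEP (DENSE_RANK LAST ORDER BY issue_date) picks the request_id belonging to the latest issue_date within the account.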
I have the table T_LOCATION_DATA on Oracle DB as follows:
Person_ID | Location | Role
----------------------------
101       | Delhi    | Manager
102       | Mumbai   | Employee
103       | Noida    | Manager
104       | Mumbai   | Employee
105       | Noida    | Employee
106       | Delhi    | Manager
107       | Mumbai   | Manager
108       | Delhi    | Employee
109       | Mumbai   | Employee
Another table is T_STATUS with the following data:
Person_ID | Status
-------------------
101       | Active
102       | Active
103       | Inactive
104       | Active
105       | Active
106       | Inactive
107       | Active
108       | Active
109       | Inactive
I am trying to get the counts of both Employees and Managers who are Active, grouped by location, in a single query, so that the result comes out as follows:
Location | MANAGER COUNT | EMPLOYEE COUNT
------------------------------------------
Delhi    |       1       |       1
Mumbai   |       1       |       1
Noida    |       0       |       1
I am trying with the following query, but with no result:
select location, count (a.person_id) as MANAGER COUNT,
count (b.person_id) as EMPLOYEE COUNT
from T_LOCATION_DATA a,T_LOCATION_DATA b
where a.person_id in (select person_id from t_status where status='Active')
... and I get lost here
Can someone guide me on this please?
From your data, I would query like this:
SELECT
  Location,
  COUNT(CASE WHEN Role = 'Manager'  THEN 1 END) as count_managers,
  COUNT(CASE WHEN Role = 'Employee' THEN 1 END) as count_employees,
  COUNT(*) count_everyone
FROM t_location_data l
INNER JOIN t_status s
        ON l.person_id = s.person_id
       AND s.status = 'Active'
GROUP BY location
Differences to your SQL:
We dump the awful old join syntax (SELECT * FROM a,b WHERE a.id=b.id) - please always use a JOIN b ON a.id = b.id
We join in the status table, but we only really want the active rows, hence why I stated that condition as another clause in the ON. I could have put it in a WHERE; with an INNER JOIN it makes no difference, but with an OUTER JOIN it can make a big difference: writing a LEFT JOIN b ON a.id = b.id WHERE b.status = 'Active' converts that LEFT JOIN back to INNER JOIN behaviour, unless you write a where clause like WHERE b.status = 'Active' OR b.status IS NULL - and that's just ugly. If the comparison to a constant is put in the ON clause instead, you can skip the OR ... IS NULL ugliness.
We group by location, but we don't necessarily count everything. If we count the result of a CASE WHEN role = 'Manager' THEN ..., the CASE WHEN produces a 1 for a manager and NULL for a non-manager (I didn't specify anything for the ELSE; this is the designed behaviour of CASE WHEN in such a scenario). The number didn't have to be a 1 either; it could be 'a', or Role, or anything that is non-null. COUNT counts anything non-null as a 1 and null as a 0. The following are thus equivalent; pick whichever one makes more sense to you:
COUNT(CASE WHEN Role='Employee' THEN 1 END) as count_employees,
COUNT(CASE WHEN Role='Employee' THEN 'a' END) as count_employees,
COUNT(CASE WHEN Role='Employee' THEN role END) as count_employees,
COUNT(CASE WHEN Role='Employee' THEN role ELSE null END) as count_employees,
SUM(CASE WHEN Role='Employee' THEN 1 ELSE 0 END) as count_employees,
They all work as counts, but in the SUM case you really do have to use 1 and 0 if you want the output number to be a count. Actually, the ELSE 0 is optional, as SUM doesn't sum nulls (but, as mathguy points out below, if you don't put ELSE 0, the SUM method produces a NULL rather than a 0 when there are no matching items. Whether this is helpful or hindering to you is a decision for you alone to make).
I wasn't clear whether managers are employees also. To me they are; maybe not to you. I added a COUNT(*) that literally counts everyone at the location. Any difference, i.e. count_employees + count_managers != count_everyone, means there was another role in the table that is neither manager nor employee. Pick your poison.
This COUNT/SUM(CASE WHEN...) pattern is really useful for turning data around - a PIVOT operation. It takes a column of data:
Manager
Employee
Manager
And turns it into two columns, for the count values:
Manager Employee
2 1
You can extend it as many times as you like. If you have 10 roles, write 10 CASE WHENs, and the result will have 10 columns with a grouped-up count. The data is pivoted from a row-wise representation to a column-wise representation.
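As an aside, on Oracle 11g and later the same result can be written with the PIVOT clause; a minimal sketch against the tables above (the column aliases are my own, not from the question):

SELECT *
FROM  (SELECT l.location, l.role
       FROM   t_location_data l
       JOIN   t_status s
         ON   s.person_id = l.person_id
        AND   s.status = 'Active')
PIVOT (COUNT(*) FOR role IN ('Manager' AS manager_count, 'Employee' AS employee_count))
ORDER BY location;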
I have a source table and a target table, and I want to do a merge such that there is always an insert into the target table. Each record inserted should have a flag set to 'Y', and when something in that record changes, the existing row's flag should be changed to 'N' and a new row for that record inserted into the target, so that the updated information is reflected. Basically I want to implement SCD Type 2. My input data is:
student_id name city state mobile
1 suraj bhopal m.p. 9874561230
2 ravi pune mh 9874563210
3 amit patna bihar 9632587410
4 rao banglore kr 9236547890
5 neel chennai tn 8301456987
and when my input changes:
student_id name city state mobile
1 suraj indore m.p. 9874561230
And my output should be like:
surr_key  student_id  name   city    state  mobile      insert_Date  end_date    Flag
1         1           suraj  bhopal  m.p.   9874561230  31/06/2015   1/09/2015   N
2         1           suraj  indore  m.p.   9874561230  2/09/2015    31/12/9999  Y
Can anyone help me with how I can do that?
You can do this with the use of triggers: you can create a BEFORE INSERT trigger on your target table which will update the flag column of your source table.
Or you can have an AFTER UPDATE trigger on the source table which will insert a record into your target table.
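A rough sketch of the second suggestion (an AFTER UPDATE trigger on the source that versions the changed row into the target); the table, sequence and column names here are assumptions, not your actual schema:

CREATE OR REPLACE TRIGGER trg_student_scd2
AFTER UPDATE ON student_src                 -- assumed source table name
FOR EACH ROW
BEGIN
  -- close the currently active version of this student in the target
  UPDATE student_tgt
     SET end_date = SYSDATE,
         flag     = 'N'
   WHERE student_id = :OLD.student_id
     AND flag = 'Y';

  -- insert the new version as the active row
  INSERT INTO student_tgt
         (surr_key, student_id, name, city, state, mobile, insert_date, end_date, flag)
  VALUES (student_tgt_seq.NEXTVAL,           -- assumed sequence
          :NEW.student_id, :NEW.name, :NEW.city, :NEW.state, :NEW.mobile,
          SYSDATE, DATE '9999-12-31', 'Y');
END;
/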
Hope this helps
Regards,
So this should be the outline of your procedure steps. I used different columns in source and target for simplification.
Source (tu_student) - STUDENT_ID, NAME, CITY
Target (tu_student_tgt)- SKEY, STUDENT_ID, NAME, CITY, INSERT_DATE, END_DATE, IS_ACTIVE
The basic idea here is:
Step 1 - Find the new records from source which are missing in target and insert them. Set start_date to sysdate, end_date to 9999 and IsActive to 1.
Step 2 - Find the records which were updated (like your Bhopal -> Indore case). We have to do two operations in target for these:
2a - Update the existing record in target, setting end_date to sysdate and IsActive to 0.
2b - Insert a new record with the new values. Set start_date to sysdate, end_date to 9999 and IsActive to 1.
-- Create a new Oracle sequence (test_utsav_seq in this example)

-- Step 1 - Find new inserts (records present in source but not in target)
insert into tu_student_tgt
select test_utsav_seq.nextval as skey,
       s.student_id           as student_id,
       s.name                 as name,
       s.city                 as city,
       sysdate                as insert_date,
       DATE '9999-12-31'      as end_date,
       1                      as is_active
from   tu_student s
       left outer join tu_student_tgt t
       on s.student_id = t.student_id
where  t.student_id is null;
-- Step 2 - Find the skeys which need to be updated due to a data change between source and target.
--          Get the active records from target and compare them with the source data. If a mismatch is found, we need to:
--          2a - update those records in target and mark them as inactive;
--          2b - insert a new record for the same student_id with the new data and mark it active.

-- Part 2a - find updates.
-- These records need an update. Save these skeys and use them one by one while updating.
select t.skey
from   tu_student s
       inner join tu_student_tgt t
       on s.student_id = t.student_id
where  t.is_active = 1
and    (s.name != t.name or s.city != t.city);
-- Part 2b - Find the student_ids which need to be inserted because they changed in source compared to target.
-- (The records found above will be marked inactive first.)
select s.student_id
from   tu_student s
       inner join tu_student_tgt t
       on s.student_id = t.student_id
where  t.is_active = 1
and    (s.name != t.name or s.city != t.city);
-- Implement 2a - the update.
-- Now loop over the skeys found in step 2a and run an update like the one below,
-- replacing the placeholder with each skey that needs to be updated.
update tu_student_tgt t
set    t.student_id = (select s.student_id from tu_student s where s.student_id = t.student_id),
       t.name       = (select s.name       from tu_student s where s.student_id = t.student_id),
       t.end_date   = sysdate,
       t.is_active  = 0
where  t.skey = -- skey from step 2a
-- Implement 2b - the insert, using the student_ids found in step 2b.
-- Insert these student_ids the same way as in step 1.
insert into tu_student_tgt
select test_utsav_seq.nextval as skey,
       s.student_id           as student_id,
       s.name                 as name,
       s.city                 as city,
       sysdate                as insert_date,
       DATE '9999-12-31'      as end_date,
       1                      as is_active
from   tu_student s
where  s.student_id = -- ID from step 2b; repeat for the other ids
I cannot give you a simple example of SCD-2. If you understand SCD-2, you should understand this implementation.
Suppose I have the following tables
Target table
sales
ID ItemNum DiscAmt OrigAmt
1 123 20.00 NULL
2 456 30.00 NULL
3 123 20.00 NULL
Source Table
prices
ItemNum OrigAmt
123 25.00
456 35.00
I tried to update the OrigAmt in the target table from the OrigAmt in the source table using
UPDATE
( SELECT s.OrigAmt dests
,p.OrigAmt srcs
FROM sales s
LEFT JOIN prices p
ON s.ItemNum = p.ItemNum
) amnts
SET amnts.dests = amnts.srcs
;
but I get: ORA-01779: cannot modify a column which maps to a non key-preserved table.
I also tried using MERGE, but I get: ORA-30926: unable to get a stable set of rows in the source tables.
You cannot generally UPDATE the result of an arbitrary SELECT.
Single statement, assuming ItemNum is a primary key for prices:
UPDATE sales
SET OrigAmt =
    (SELECT MAX(OrigAmt) FROM prices
     WHERE prices.ItemNum = sales.ItemNum)
WHERE (SELECT COUNT(prices.ItemNum) FROM prices
       WHERE prices.ItemNum = sales.ItemNum) > 0
You might get away with omitting the WHERE and/or MAX.
Less convoluted: loop a cursor over
SELECT ItemNum, OrigAmt FROM prices
performing one update per ItemNum from table prices:
UPDATE sales SET OrigAmt=? WHERE ItemNum=?
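In PL/SQL, that cursor loop could look roughly like this (a sketch using the table names from the question):

BEGIN
  FOR r IN (SELECT ItemNum, OrigAmt FROM prices) LOOP
    UPDATE sales
       SET OrigAmt = r.OrigAmt
     WHERE ItemNum = r.ItemNum;
  END LOOP;
END;
/

For what it's worth, if ItemNum really is unique in prices, a plain MERGE should also give a stable set of rows; a sketch, not a drop-in fix for the exact statement that raised ORA-30926:

MERGE INTO sales s
USING prices p
ON (s.ItemNum = p.ItemNum)
WHEN MATCHED THEN
  UPDATE SET s.OrigAmt = p.OrigAmt;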
I have a query requirement from ----. Trying to solve it with CONNECT BY, but can't seem to get the results I need.
Table (simplified):
create table CSS.USER_DESC (
USER_ID VARCHAR2(30) not null,
NEW_USER_ID VARCHAR2(30),
GLOBAL_HR_ID CHAR(8)
)
-- USER_ID is the primary key
-- NEW_USER_ID is a self-referencing key
-- GLOBAL_HR_ID is an ID field from another system
There are two sources of user data (datafeeds)... I have to watch for mistakes in either of them when updating information.
Scenarios:
A user is given a new User ID... The old record is set accordingly and deactivated (typically a rename for contractors who become fulltime)
A user leaves and returns sometime later. HR fails to send us the old user ID so we can connect the accounts.
The system screwed up and didn't set the new User ID on the old record.
The data can be bad in a hundred other ways
I need to know the following are the same user, and I can't rely on name or other fields... they differ among matching records:
ROOTUSER  NUMROOTS  NODELEVEL  ISLEAF  USER_ID   NEW_USER_ID  GLOBAL_HR_ID  USERTYPE    LAST_NAME        FIRST_NAME
---------------------------------------------------------------------------------------------------------------------
EX0T1100  2         1          0       EX0T1100  EX000005                   CONTRACTOR  VON DER HAAVEN   VERONICA
EX0T1100  2         2          1       EX000005               00126121      EMPLOYEE    HAAVEN, VON DER  VERONICA
GL110456  1         1          1       GL110456               00126121      EMPLOYEE    VONDERHAAVEN     VERONICA
EX0T1100 and EX000005 are connected properly by the NEW_USER_ID field. The rename occurred before there were global HR IDs, so EX0T1100 doesn't have one. EX000005 was given a new user ID, 'GL110456', and the two are only connected by having the same global HR ID.
Cleaning up the data isn't an option.
The query so far:
select connect_by_root cud.user_id RootUser,
count(connect_by_root cud.user_id) over (partition by connect_by_root cud.user_id) NumRoots,
level NodeLevel, connect_by_isleaf IsLeaf, --connect_by_iscycle IsCycle,
cud.user_id, cud.new_user_id, cud.global_hr_id,
cud.user_type_code UserType, cud.last_name, cud.first_name
from css.user_desc cud
where cud.user_id in ('EX000005','EX0T1100','GL110456')
-- Using this so I don't get sub-users in my list of root users...
-- It complicates the matches with GLOBAL_HR_ID, however
start with cud.user_id not in (select cudsub.new_user_id
from css.user_desc cudsub
where cudsub.new_user_id is not null)
connect by nocycle (prior new_user_id = user_id);
I've tried various CONNECT BY clauses, but none of them are quite right:
-- As a multiple CONNECT BY
connect by nocycle (prior global_hr_id = global_hr_id)
connect by nocycle (prior new_user_id = user_id)
-- As a compound CONNECT BY
connect by nocycle ((prior new_user_id = user_id)
or (prior global_hr_id = global_hr_id
and user_id != prior user_Id))
UNIONing two CONNECT BY queries doesn't work... I don't get the leveling.
Here is what I would like to see... I'm okay with a resultset that I have to distinct and use as a subquery. I'm also okay with any of the three user IDs in the ROOTUSER column... I just need to know they're the same users.
ROOTUSER  NUMROOTS  NODELEVEL  ISLEAF  USER_ID   NEW_USER_ID  GLOBAL_HR_ID  USERTYPE    LAST_NAME        FIRST_NAME
---------------------------------------------------------------------------------------------------------------------
EX0T1100  3         1          0       EX0T1100  EX000005                   CONTRACTOR  VON DER HAAVEN   VERONICA
EX0T1100  3         2          1       EX000005               00126121      EMPLOYEE    HAAVEN, VON DER  VERONICA
EX0T1100  3         (2 or 3)   1       GL110456               00126121      EMPLOYEE    VONDERHAAVEN     VERONICA
Ideas?
Update
Nicholas, your code looks very much like the right track... at the moment, the lead(user_id) over (partition by global_hr_id) gets false hits when the global_hr_id is null. For example:
USER_ID   NEW_USER_ID  CHAINNEWUSER  GLOBAL_HR_ID  LAST_NAME  FIRST_NAME
FP004468               FP004469                    AARON      TIMOTHY
FP004469                                           FOONG      KOK WAH
I've often wanted to treat nulls as separate records in a partition, but I've never found a way to make ignore nulls work. This did what I wanted:
decode(global_hr_id, null, null, lead(cud.user_id ignore nulls) over (partition by global_hr_id order by user_id))
... but there's got to be a better way. I haven't been able to get the query to finish yet on the full-blown user data (about 40,000 users). Both global_hr_id and new_user_id are indexed.
Update
The query returns after about 750 seconds... long, but manageable. It returns 93k records, because I don't have a good way of filtering level 2 hits out of the root - you have start with global_hr_id is null, but unfortunately, that isn't always the case. I'll have to think some more about how to filter those out.
I've tried adding more complex start with clauses before, but I find that separately, they run < 1 second... together, they take 90 minutes >.<
Thanks again for your help... plodding away at this.
You have provided sample data for only one user. It would be better to have a little bit more. Anyway, let's look at something like this.
SQL> with user_desc(USER_ID, NEW_USER_ID, GLOBAL_HR_ID)as(
2 select 'EX0T1100', 'EX000005', null from dual union all
3 select 'EX000005', null, 00126121 from dual union all
4 select 'GL110456', null, 00126121 from dual
5 )
6 select connect_by_root(user_id) rootuser
7 , count(connect_by_root(user_id)) over(partition by connect_by_root(user_id)) numroot
8 , level nodlevel
9 , connect_by_isleaf
10 , user_id
11 , new_user_id
12 , global_hr_id
13 from (select user_id
14 , coalesce(new_user_id, usr) new_user_id1
15 , new_user_id
16 , global_hr_id
17 from ( select user_id
18 , new_user_id
19 , global_hr_id
20 , decode(global_hr_id,null,null,lead(user_id) over (partition by global_hr_id order by user_id)) usr
21 from user_desc
22 )
23 )
24 start with global_hr_id is null
25 connect by prior new_user_id1 = user_id
26 ;
Result:
ROOTUSER    NUMROOT   NODLEVEL CONNECT_BY_ISLEAF USER_ID  NEW_USER_ID GLOBAL_HR_ID
-------- ---------- ---------- ----------------- -------- ----------- ------------
EX0T1100          3          1                 0 EX0T1100 EX000005
EX0T1100          3          2                 0 EX000005                   126121
EX0T1100          3          3                 1 GL110456                   126121
My Oracle table has the columns: message_id, status, status_date
I would like to return each message_id where, grouping by message_id, the record with the minimum value of status_date has a value of 'PC' in the status column.
In other words, do not return a message_id if the record with the minimum value of status_date for that message_id does not have a value of 'PC' in the status column.
Thanks,
Brian
I'm guessing this is what the source data may look like:
message_id status status_date
---------- ------ -----------
1 PC 01-JAN-12
1 QC 02-JAN-12
1 RC 03-JAN-12
2 AA 04-JAN-12
2 PC 05-JAN-12
2 CC 06-JAN-12
3 PC 07-JAN-12
3 PC 08-JAN-12
3 PC 09-JAN-12
And that the expected output would be something like this:
message_id
----------
1
3
The reason these two message_ids are returned is that, among all the records where message_id=1, the one with the minimum status_date is the 01-JAN-12 row, and that record has a status of 'PC'; similarly for message_id=3. Message_id=2 is not returned because the minimum status_date of its records is 04-JAN-12, and that record's status is NOT 'PC'.
We can build this query using inline views, but there's probably an easier way using analytic functions.
SELECT A.MESSAGE_ID
FROM MY_TABLE A,
     ( SELECT B.MESSAGE_ID, MIN(B.STATUS_DATE) MIN_DATE
       FROM MY_TABLE B
       GROUP BY B.MESSAGE_ID ) C
WHERE A.STATUS = 'PC'
AND A.STATUS_DATE = C.MIN_DATE
AND A.MESSAGE_ID = C.MESSAGE_ID;
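For reference, a sketch of the analytic-function variant hinted at above (same assumed table name MY_TABLE): number the rows within each MESSAGE_ID by STATUS_DATE and keep only the ids whose earliest row has status 'PC'.

SELECT MESSAGE_ID
FROM ( SELECT MESSAGE_ID,
              STATUS,
              ROW_NUMBER() OVER (PARTITION BY MESSAGE_ID ORDER BY STATUS_DATE) RN
       FROM MY_TABLE )
WHERE RN = 1
AND STATUS = 'PC';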