SAS: Extract every last unique combination ordered by date - sorting

I'm having difficulty extracting the latest record for each unique task performed by workers, from events ordered by time. A unique combination is defined by ID and Mode. The following dataset mimics the scenario:
ID Time Mode Event
23456 20120101 A Open
23456 20120101 B Closed
87690 20120311 G Closed
98000 20120201 B Open
98000 20120301 A Open
98000 20120101 A Open
87889 20121009 C Closed
87889 20120101 C Open
87900 20120411 A Closed
87900 20120102 A Closed
Hope for the following result:
ID Time Mode Event
23456 20120101 A Open
23456 20120101 B Closed
87690 20120311 G Closed
98000 20120201 B Open
98000 20120301 A Open
87889 20121009 C Closed
87900 20120411 A Closed
I will first sort by time in descending order:
proc sort data=df; by ID descending time; run;
Then I can use sort again to get unique combo by ID and Mode:
proc sort data=df dupout=nodup nodupkey;
by ID Mode; run;
In the last step, how do I ensure that the record kept for each ID and Mode (the non-duplicate one) is also the latest event?
Thanks!

You can do this using the first. and last. BY-group variables:
data have;
input ID Time:yymmdd8. Mode $ Event $;
format time yymmdd10.;
datalines;
23456 20120101 A Open
23456 20120101 B Closed
87690 20120311 G Closed
98000 20120201 B Open
98000 20120301 A Open
98000 20120101 A Open
87889 20121009 C Closed
87889 20120101 C Open
87900 20120411 A Closed
87900 20120102 A Closed
;
proc sort data = have out=have1;
by id mode time;
run;
data want;
set have1;
by id mode time;
if last.mode and last.time then output;
run;
Or you can use a simple PROC SQL query, as shown below:
proc sql;
create table want1 as
select id, time, mode, event from have
group by id, mode
having time = max(time);
quit;
For your own code to work, the first sort needs to order by Mode as well:
proc sort data=df; by ID mode descending time; run;
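For readers who want to check the expected result outside SAS, the same "keep the latest row per ID and Mode" rule can be sketched in Python. This is only an illustration under assumptions: the function name latest_per_group and the tuple layout (ID, Time, Mode, Event) are made up for the sketch and are not part of the original answer.

```python
# Rows mirror the question's sample data: (ID, Time, Mode, Event).
rows = [
    (23456, 20120101, "A", "Open"),
    (23456, 20120101, "B", "Closed"),
    (87690, 20120311, "G", "Closed"),
    (98000, 20120201, "B", "Open"),
    (98000, 20120301, "A", "Open"),
    (98000, 20120101, "A", "Open"),
    (87889, 20121009, "C", "Closed"),
    (87889, 20120101, "C", "Open"),
    (87900, 20120411, "A", "Closed"),
    (87900, 20120102, "A", "Closed"),
]


def latest_per_group(rows):
    """Keep, for each (ID, Mode) pair, the row with the greatest Time.

    Hypothetical helper: it plays the role of the sort by ID, Mode and
    descending Time followed by NODUPKEY on ID and Mode.
    """
    latest = {}
    for row in rows:
        key = (row[0], row[2])  # (ID, Mode)
        if key not in latest or row[1] > latest[key][1]:
            latest[key] = row
    return list(latest.values())


result = latest_per_group(rows)
for row in result:
    print(row)
```

The sketch returns seven rows, matching the hoped-for result in the question: for 98000/A, 87889/C and 87900/A only the later of the two records survives.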

Related

how to generate a deadlock scenario in Oracle?

I've been stuck on a Lab question for the last four hours because I generally don't understand what it wants, even with extensive research and flipping through endless slides. EDIT the prologue in question is a dbcreate.sql which creates a series of tables, and then a dbload.sql which inserts values into given tables.
The given question is
Implement in PL/SQL the database transactions that operate on the sample database created in the Prologue step, such that their concurrent processing leads to a deadlock situation. Save the transactions in SQL scripts solution1-1.sql and solution1-2.sql
I feel someone on this site could explain this in a way I can understand! Thank you for your help
EDIT there's a second part to this question
Simulate a concurrent processing of the transaction such that it will lead to a deadlock.
To simulate a concurrent processing of the database transactions use a PL/SQL procedure SLEEP from the standard PL/SQL package DBMS_LOCK. By "simulation of concurrent execution" we mean that the first transaction does a bit of work, then it is delayed for a certain period of time and in the same period of time another transaction is processed. Finally, after a delay the first transaction completes its job.
The simplest way (untested code):
CREATE OR REPLACE PROCEDURE doUpd ( id1 IN NUMBER, id2 IN NUMBER ) IS
BEGIN
UPDATE tableA set colA = 'upd1' where id = id1;
dbms_lock.sleep (20);
UPDATE tableA set colA = 'upd2' where id = id2;
END;
/
Then run in session 1:
execute doUpd( 21, 12 );
Immediately afterwards, in session 2:
execute doUpd( 12, 21 );
What we're doing is updating the same two rows, but in a different order in each session.
Normally we would hope that the time between the two updates is short enough to avoid a deadlock. But if we want to simulate a deadlock, we need to add a delay so that we have time to fire off the updates in another session.
In the example above, session 1 will update the row with id = 21, then wait for 20 seconds, then update the row with id = 12.
Session 2 will update the row with id = 12, then wait for 20 seconds, then update the row with id = 21. If session 2 starts whilst session 1 is 'sleeping', we should get a deadlock.
In time order, provided you are quick with starting the session 2 job, you should be aiming for this:
Session 1: UPDATE tableA set colA = 'upd1' where id = 21;
Session 1: sleep 20
Session 2: UPDATE tableA set colA = 'upd1' where id = 12;
Session 2: sleep 20
Session 1: UPDATE tableA set colA = 'upd2' where id = 12; -- blocked until session 2 commit/rollback
Session 2: UPDATE tableA set colA = 'upd2' where id = 21; -- blocked until session 1 commit/rollback
Session 1 and 2 are now deadlocked.
For the first part of your question you can also use this example, which does not need the DBMS_LOCK package:
CREATE TABLE T1 (c INTEGER, v INTEGER);
INSERT INTO T1 VALUES (1, 10);
INSERT INTO T1 VALUES (2, 10);
COMMIT;
Open session 1
Open session 2
In session 1 execute update t1 set v = v + 10 where c = 1;
In session 2 execute update t1 set v = v + 10 where c = 2;
In session 1 execute update t1 set v = v + 10 where c = 2;
In session 2 execute update t1 set v = v + 10 where c = 1;
Session 1 raises an ORA-00060: deadlock detected while waiting for resource
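The lock-ordering pattern behind both examples can also be sketched outside the database. The Python sketch below is an analogy under assumptions, not Oracle behaviour: two threading.Lock objects stand in for the two row locks, a barrier plays the role of the sleep, and a timeout stands in for Oracle's deadlock detection (Oracle raises ORA-00060 in one session, whereas here both attempts simply time out). All names are hypothetical.

```python
import threading

lock_a = threading.Lock()  # stands in for the row lock on the first row
lock_b = threading.Lock()  # stands in for the row lock on the second row
both_hold_first = threading.Barrier(2)    # plays the role of the sleep
both_tried_second = threading.Barrier(2)  # keep first lock until both have tried
outcome = {}


def session(name, first, second):
    """Lock `first`, wait for the other session, then try to lock `second`."""
    with first:                               # first UPDATE: row lock acquired
        both_hold_first.wait()                # both sessions now hold one lock
        got_it = second.acquire(timeout=0.2)  # second UPDATE: would block forever
        outcome[name] = got_it
        both_tried_second.wait()              # do not release before the other has tried
        if got_it:
            second.release()


t1 = threading.Thread(target=session, args=("session 1", lock_a, lock_b))
t2 = threading.Thread(target=session, args=("session 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(outcome)
```

Neither session can obtain its second lock, because each one is held by the other: that circular wait is the deadlock. If both sessions took the locks in the same order, the cycle could not form.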

Oracle rownum = 1 to select topmost row from the set fails [duplicate]

I need to select from two tables,
RATING_TABLE
RATING_TYPE RATING_PRIORITY
TITAN 1
PLATINUM(+) 1
PLATINUM 2
DIAMOND(+) 3
DIAMOND 3
GOLD 4
SILVER 4
RATING_STORAGE
RATING AMOUNT
SILVER 200
GOLD 510
DIAMOND 850
PLATINUM(+) 980
TITAN 5000
I want to select the rating from RATING_STORAGE table based on RATING_PRIORITY from RATING_TABLE.
I want to select the one row with the lowest rating priority. If two rating priorities are equal, I want to choose the one with the lowest amount.
So I used the query,
select s.rating,s.amount
from RATING_TABLE r, RATING_STORAGE s
where r.rating_type= s.rating_type
and rownum=1
order by r.rating_priority asc , s.amount asc ;
I am getting correct output when sorting the result but rownum=1 fails to give the topmost row.
Thanks in Advance.
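You need to apply rownum after the sorting is done. ROWNUM is assigned to rows before the ORDER BY is applied, so the rownum = 1 filter has to move out of the sorted query and into an outer query. In your case (joining on the RATING column of RATING_STORAGE, as shown in your table listing):
select *
from (select s.rating
,s.amount
from rating_table r
,rating_storage s
where r.rating_type = s.rating
order by r.rating_priority asc
,s.amount asc)
where rownum = 1;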
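The difference between limiting before sorting and sorting before limiting can be seen in a small Python sketch built from the question's sample data. It is an illustration only: the list and variable names are made up, and "the first joined row" merely stands in for whichever row Oracle happens to number first.

```python
# Sample data from the question.
rating_table = {
    "TITAN": 1, "PLATINUM(+)": 1, "PLATINUM": 2,
    "DIAMOND(+)": 3, "DIAMOND": 3, "GOLD": 4, "SILVER": 4,
}
rating_storage = [
    ("SILVER", 200), ("GOLD", 510), ("DIAMOND", 850),
    ("PLATINUM(+)", 980), ("TITAN", 5000),
]

# Join the two tables: (priority, amount, rating) for every stored rating.
joined = [
    (rating_table[rating], amount, rating)
    for rating, amount in rating_storage
    if rating in rating_table
]

# Like "where rownum = 1 ... order by": one row is picked first,
# then that single row is "sorted".
limit_then_sort = sorted(joined[:1])

# Like the nested query: sort everything, then keep the first row.
sort_then_limit = sorted(joined)[:1]

print(limit_then_sort)
print(sort_then_limit)
```

With this data the first variant returns the SILVER row simply because it was produced first, while the second returns PLATINUM(+) with amount 980: priority 1, and the lower amount of the two priority-1 ratings.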

Why is a newly inserted row displayed at the top of the table in Oracle Database 12c? [duplicate]

I previously worked with MS SQL Server and recently moved to Oracle.
I have created a Java app that inserts data into an Oracle table, but the inserted rows are displayed at the top of the table rather than at the bottom, unlike in SQL Server. I want them to be inserted at the bottom of the table. How can I do that?
Please refer to this Screenshot.
As you can see there, the user table contains a user id that is automatically incremented by the Java application.
I know how to sort the data using SQL Developer, but I need to fix this default storage order because the app I created takes 'U002' as the last record.
A plain table is stored as a heap. If you create a table without specifying a type, you get an ordinary heap table. That means new rows may be inserted into any free space in the table (there are some rules, but we can ignore them here). As a result, you cannot predict where new rows will appear in the output of a SELECT without an ORDER BY.
If you want sorted results, you should specify an ORDER BY clause:
select *
from user_tbl
order by userid
Here is a simple example showing that you have no guaranteed order without an ORDER BY:
SQL> create table unsortedTable(a number, b varchar2(1000));
Table created.
SQL> insert into unsortedTable
2 select level, lpad('X', 1000, 'X')
3 from dual
4 connect by level <=10;
10 rows created.
SQL> delete unsortedTable where a between 4 and 5;
2 rows deleted.
SQL> insert into unsortedTable
2 select -level, lpad('Y', 1000, 'Y')
3 from dual
4 connect by level <=4;
4 rows created.
SQL> select a, substr(b, 1, 5)
2 from unsortedTable;
A SUBSTR(B,1,5)
---------- --------------------
1 XXXXX
2 XXXXX
3 XXXXX
6 XXXXX
7 XXXXX
-1 YYYYY
-2 YYYYY
8 XXXXX
9 XXXXX
10 XXXXX
-3 YYYYY
-4 YYYYY
12 rows selected.
SQL>
The same sequence of operations, with an /*+ append */ hint added to the second insert statement, will give:
SQL> select a, substr(b, 1, 5)
2 from unsortedTable;
A SUBSTR(B,1,5)
---------- --------------------
1 XXXXX
2 XXXXX
3 XXXXX
6 XXXXX
7 XXXXX
8 XXXXX
9 XXXXX
10 XXXXX
-1 YYYYY
-2 YYYYY
-3 YYYYY
-4 YYYYY
12 rows selected.
Notice that this does NOT mean that APPEND gives you a reliable way of ordering.

What if the value of order field is the same for all the records [duplicate]

All, let's say the SQL looks like this:
Select a, b ,c from table1 order by c
Suppose all the rows in table1 have the same value in column c. I want to know whether the result comes back in the same order every time I execute the SQL.
Let's say data in the table1 looks like below.
a b c
-------------------------------------------
1 x1 2014-4-1
....
100 x100 2014-4-1
....
1000 x1000 2014-4-1
....
How does Oracle determine the sequence of rows that share the same ORDER BY value?
Added
Will they be in a random sequence each time?
The simple answer is NO. There is no guarantee that an ORDER BY on equal values will return the rows in the same order every time. It might seem to you that it is always stable; however, there are many reasons why it could change.
For example, the ordering of rows with equal values might differ after:
Gathering statistics
Adding an index on the column
For example,
Let's say I have a table t:
SQL> SELECT * FROM t ORDER BY b;
A B
---------- ----------
1 1
2 1
3 2
4 2
5 3
6 3
6 rows selected.
Now store the same rows in a second table, with rows that share the same value of b placed in a random physical order, and sort again:
SQL> CREATE TABLE t1 AS SELECT * FROM t ORDER BY b, DBMS_RANDOM.VALUE;
Table created.
SQL> SELECT * FROM t1 ORDER BY b;
A B
---------- ----------
1 1
2 1
4 2
3 2
5 3
6 3
6 rows selected.
So the data in both tables is the same; however, an ORDER BY on a column with equal values does not guarantee the same ordering.
The order need not be random (changing every time), but it is not guaranteed either (it may change sometimes).
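The usual way to get a repeatable result is to extend the ORDER BY with a column that is unique (for instance ORDER BY b, a in the example above); this remedy is not spelled out in the answer itself. A minimal Python sketch of the idea, assuming the two lists below stand in for the physical row order of t and t1:

```python
# Same six rows (a, b), stored in two different physical orders.
t = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3)]
t1 = [(1, 1), (2, 1), (4, 2), (3, 2), (5, 3), (6, 3)]


def by_b(row):
    """Sort key using only column b (ties are left unresolved)."""
    return row[1]


def by_b_then_a(row):
    """Sort key using b, then the unique column a as a tie-breaker."""
    return (row[1], row[0])


# Ordering by b alone: rows with equal b keep whatever order they were stored in.
same_without_tiebreak = sorted(t, key=by_b) == sorted(t1, key=by_b)

# Ordering by b and then a: the result no longer depends on storage order.
same_with_tiebreak = sorted(t, key=by_b_then_a) == sorted(t1, key=by_b_then_a)

print(same_without_tiebreak, same_with_tiebreak)
```

Python's sort happens to be stable, which makes the storage order visible in the first comparison; Oracle makes no such promise, but the conclusion is the same: only a sort key without ties pins the order down.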

Eliminate pairs of observations under the condition, that observations can have more than one possible partner observation

In my current project we have had several occasions where we had to implement matching based on varying conditions. First, a more detailed description of the problem.
We got a table test:
key Value
1 10
1 -10
1 10
1 20
1 -10
1 10
2 10
2 -10
Now we want to apply a rule, so that inside a group (defined by value of key) pairs with a sum of 0 should be eliminated.
The expected result would be:
key value
1 10
1 20
Sort order is not relevant.
The following code is an example of our solution.
We want to eliminate the observations with my_id 2 and 7 and, additionally, 2 of the 3 observations with amount 10.
data test;
input my_id alias $ amount;
datalines4;
1 aaa 10
2 aaa -10
3 aaa 8000
4 aaa -16000
5 aaa 700
6 aaa 10
7 aaa -10
8 aaa 10
;;;;
run;
/* get all possible matches represented by pairs of my_id */
proc sql noprint;
create table zwischen_erg as
select a.my_id as a_id,
b.my_id as b_id
from test as a inner join
test as b on (a.alias=b.alias)
where a.amount=-b.amount;
quit;
/* select ids of matches to eliminate */
proc sort data=zwischen_erg ;
by a_id b_id;
run;
data zwischen_erg1;
set zwischen_erg;
by a_id;
if first.a_id then tmp_id1 = 0;
tmp_id1 +1;
run;
proc sort data=zwischen_erg;
by b_id a_id;
run;
data zwischen_erg2;
set zwischen_erg;
by b_id;
if first.b_id then tmp_id2 = 0;
tmp_id2 +1;
run;
proc sql;
create table delete_ids as
select erg1.a_id as my_id
from zwischen_erg1 as erg1 left join
zwischen_erg2 as erg2 on
(erg1.a_id = erg2.a_id and
erg1.b_id = erg2.b_id)
where tmp_id1 = tmp_id2
;
quit;
/* use delete_ids as filter */
proc sql noprint;
create table erg as
select a.*
from test as a left join
delete_ids as b on (a.my_id = b.my_id)
where b.my_id=.;
quit;
The algorithm seems to work; at least nobody has found input data that caused an error.
But nobody could explain to me why it works, and I don't understand in detail how it works.
So I have a couple of questions.
Does this algorithm eliminate the pairs in a correct manner for all possible combinations of input data?
If it does work correct, how does the algorithm work in detail? Especially the part
where tmp_id1 = tmp_id2.
Is there a better algorithm to eliminate corresponding pairs?
Thanks in advance and happy coding
Michael
As an answer to your third question: the following approach seems simpler to me.
It is probably also more performant, since it uses no joins.
/*For every (absolute) value, find how many more positive/negative occurrences we have per key*/
proc sql;
create view V_INTERMEDIATE_VIEW as
select key, abs(Value) as Value_abs, sum(sign(value)) as balance
from INPUT_DATA
group by key, Value_abs
;
quit;
*The balance variable here means how many times more often did we see the positive than the negative of this value. I.e., how many of either the positive or the negative were we not able to eliminate;
/*Now output*/
data OUTPUT_DATA (keep=key Value);
set V_INTERMEDIATE_VIEW;
Value = sign(balance)*Value_abs; *Put the correct value back;
do i=1 to abs(balance) by 1;
output;
end;
run;
If you only want pure SAS (so no proc sql), you could do it as below. Note that the idea behind it remains the same.
data V_INTERMEDIATE_VIEW /view=V_INTERMEDIATE_VIEW;
set INPUT_DATA;
value_abs = abs(value);
run;
proc sort data=V_INTERMEDIATE_VIEW out=INTERMEDIATE_DATA;
by key value_abs; *we will encounter the negatives of each value and then the positives;
run;
data OUTPUT_DATA (keep=key value);
set INTERMEDIATE_DATA;
by key value_abs;
retain balance 0;
balance = sum(balance,sign(value));
if last.value_abs then do;
value = sign(balance)*value_abs; *set sign depending on what we have in excess;
do i=1 to abs(balance) by 1;
output;
end;
balance=0; *reset balance for next value_abs;
end;
run;
NOTE: thanks to Joe for some useful performance suggestions.
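The balance idea used in both variants above can be checked outside SAS with a short Python sketch, run on the key/value table from the question. The function name eliminate_pairs is hypothetical, and, like the SAS code (where sign(0) is 0), the sketch drops rows whose value is exactly 0.

```python
from collections import defaultdict

# The key/value table from the question.
data = [
    (1, 10), (1, -10), (1, 10), (1, 20), (1, -10), (1, 10),
    (2, 10), (2, -10),
]


def eliminate_pairs(rows):
    """Cancel +v/-v pairs within each key and return the leftover rows."""
    # balance[(key, |value|)] = (number of positives) - (number of negatives)
    balance = defaultdict(int)
    for key, value in rows:
        if value > 0:
            balance[(key, value)] += 1
        elif value < 0:
            balance[(key, -value)] -= 1
        # value == 0 contributes nothing, mirroring sign(0) = 0 in SAS

    result = []
    for (key, value_abs), bal in sorted(balance.items()):
        sign = 1 if bal > 0 else -1
        # Emit one row per unmatched occurrence, with the sign that is in excess.
        result.extend([(key, sign * value_abs)] * abs(bal))
    return result


result = eliminate_pairs(data)
print(result)
```

For key 1 the value 10 occurs three times and -10 twice, so one 10 remains alongside the unmatched 20; for key 2 the pair cancels completely, which reproduces the expected result from the question.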
I don't see any bugs after a quick read. But "zwischen_erg" could contain a lot of unnecessary many-to-many matches, which would be inefficient.
The following seems to work (though that is not guaranteed) and might be more efficient. It is also shorter, so it is perhaps easier to see what's going on.
data test;
input my_id alias $ amount;
datalines4;
1 aaa 10
2 aaa -10
3 aaa 8000
4 aaa -16000
5 aaa 700
6 aaa 10
7 aaa -10
8 aaa 10
;;;;
run;
proc sort data=test;
by alias amount;
run;
data zwischen_erg;
set test;
by alias amount;
if first.amount then occurrence = 0;
occurrence+1;
run;
proc sql;
create table zwischen as
select
a.my_id,
a.alias,
a.amount
from zwischen_erg as a
left join zwischen_erg as b
on a.alias = b.alias and a.amount = (-1)*b.amount and a.occurrence = b.occurrence
where b.my_id is missing;
quit;
