Dynamic column value to be set as the next row's other column value in Oracle [closed]

Closed. This question needs to be more focused. It is not currently accepting answers. Closed 5 years ago.
I am new to Oracle and I would like to form an Oracle query:
Id       CrLmt   Type  Unit  Price  Amount  Prev_Bal  NewBal
5-00001  100000  Sell   100    150   15000    100000   85000
                 Buy     75    600   45000     85000  130000
                 Buy     85    550   46750    130000  176750
                 Sell    60   1000   60000    176750  116750
5-00002   90000  Sell   100    400   40000     90000   50000
                 Buy    550    300  165000     50000  215000
                 Sell   300   1000  300000    215000  -85000
My conditions are as follows:
ID and CrLmt form a combination, and the subsequent rows belong to that ID/CrLmt combination.
For every ID/CrLmt combination, the CrLmt is assigned to the Prev_Bal column of the first row; the rest of the rows carry a calculated value.
Based on the Buy/Sell value in the Type column, the Amount is added to or subtracted from Prev_Bal, and the result is displayed in the (dynamic) NewBal column of the corresponding row: if Type is "Sell", the Amount is subtracted from Prev_Bal; if Type is "Buy", it is added.
The NewBal value obtained in row 1 becomes the Prev_Bal of row 2 for the second row's calculation, and so on.
If a negative value occurs in NewBal, it must be carried forward into the next calculations.
I tried using the LAG function to get previous values, but I don't know how to reference a dynamically calculated column (NewBal) on the fly.

Here is a little example that you will have to adapt to your current structure. You will need a date on your transactions for the ORDER BY clause of the sum.
All you need is a running sum: add the credit limit to it for the new balance, and exclude the current row from it for the old balance.
--TEST DATA
CREATE TABLE credit_limit ( id varchar2(10), crlmt number );
CREATE TABLE transactions (transaction_type varchar2(4), unit number, price number, amount number, crlmt_id varchar2(10), date_transaction date );
INSERT INTO credit_limit values ('5-00001',100000);
INSERT INTO credit_limit values ('5-00002',90000);
INSERT INTO transactions values ('Sell',100,150,15000,'5-00001',sysdate-4);
INSERT INTO transactions values ('Buy',75,600,45000,'5-00001',sysdate-3);
INSERT INTO transactions values ('Buy',85,550,46750,'5-00001',sysdate-2);
INSERT INTO transactions values ('Sell',60,1000,60000,'5-00001',sysdate-1);
INSERT INTO transactions values ('Sell',100,400,40000,'5-00002',sysdate-3);
INSERT INTO transactions values ('Buy',550,300,165000,'5-00002',sysdate-2);
INSERT INTO transactions values ('Sell',300,1000,300000,'5-00002',sysdate-1);
--The query
select cr.id, cr.crlmt, tr.transaction_type, tr.unit, tr.price, tr.amount,
       -- old balance: running sum of signed amounts up to, but excluding, the current row;
       -- NVL falls back to the credit limit for the first row of each id
       NVL(cr.crlmt + SUM(tr.amount * decode(tr.transaction_type, 'Sell', -1, 'Buy', 1))
                      OVER (partition by cr.id order by tr.date_transaction
                            rows between unbounded preceding and 1 preceding), cr.crlmt) old_bal,
       -- new balance: the same running sum, but including the current row
       cr.crlmt + SUM(tr.amount * decode(tr.transaction_type, 'Sell', -1, 'Buy', 1))
                  OVER (partition by cr.id order by tr.date_transaction
                        rows between unbounded preceding and current row) new_bal
from credit_limit cr
join transactions tr on cr.id = tr.crlmt_id
order by cr.id, tr.date_transaction;
Result:
ID       CRLMT   TYPE  UNIT  PRICE  AMOUNT  OLD_BAL  NEW_BAL
5-00001  100000  Sell   100    150   15000   100000    85000
5-00001  100000  Buy     75    600   45000    85000   130000
5-00001  100000  Buy     85    550   46750   130000   176750
5-00001  100000  Sell    60   1000   60000   176750   116750
5-00002   90000  Sell   100    400   40000    90000    50000
5-00002   90000  Buy    550    300  165000    50000   215000
5-00002   90000  Sell   300   1000  300000   215000   -85000
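If you prefer to carry NewBal into the next row's Prev_Bal explicitly rather than recomputing a window sum, a recursive WITH clause (Oracle 11gR2+) is another option. This is a minimal sketch against the same tables, not part of the original answer:
-- number the transactions per credit limit, then walk them row by row
with t as (
  select tr.*, cr.crlmt,
         row_number() over (partition by cr.id order by tr.date_transaction) rn
  from credit_limit cr
  join transactions tr on cr.id = tr.crlmt_id
),
bal (id, rn, transaction_type, amount, prev_bal, new_bal) as (
  -- first row: the previous balance is the credit limit itself
  select crlmt_id, rn, transaction_type, amount, crlmt,
         crlmt + decode(transaction_type, 'Sell', -amount, amount)
  from t where rn = 1
  union all
  -- each later row starts from the previous row's new balance
  select t.crlmt_id, t.rn, t.transaction_type, t.amount, b.new_bal,
         b.new_bal + decode(t.transaction_type, 'Sell', -t.amount, t.amount)
  from t
  join bal b on b.id = t.crlmt_id and t.rn = b.rn + 1
)
select * from bal order by id, rn;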

Related

Oracle - Insert x amount of rows with random data

I am currently doing some testing and am in need of a large amount of data (around 1 million rows).
I am using the following table:
CREATE TABLE OrderTable(
OrderID INTEGER NOT NULL,
StaffID INTEGER,
TotalOrderValue DECIMAL(8,2),
CustomerID INTEGER);
ALTER TABLE OrderTable ADD CONSTRAINT OrderID_PK PRIMARY KEY (OrderID);
CREATE SEQUENCE seq_OrderTable
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 10000;
and want to randomly insert 1000000 rows into it with the following rules:
OrderID needs to be sequential (1, 2, 3, etc.)
StaffID needs to be a random number between 1 and 1000
CustomerID needs to be a random number between 1 and 10000
TotalOrderValue needs to be a random decimal value between 0.00 and 9999.99
Is this even possible? I know I could generate each of these values using an update statement like the one below, but I am not sure how to generate a million rows in one go.
Thanks for any help on this matter.
This is how I would randomly generate a number in an update:
UPDATE StaffTable SET DepartmentID = DBMS_RANDOM.value(low => 1, high => 5);
For testing purposes I created the table and populated it in one shot, with this query:
CREATE TABLE OrderTable(OrderID, StaffID, CustomerID, TotalOrderValue)
as (select level,                                    -- sequential 1, 2, 3, ...
           ceil(dbms_random.value(0, 1000)),         -- 1..1000
           ceil(dbms_random.value(0, 10000)),        -- 1..10000
           round(dbms_random.value(0, 9999.99), 2)   -- 0.00..9999.99
    from dual
    connect by level <= 1000000)
/
A few notes: it is better to use NUMBER as the data type; NUMBER(8,2) is the Oracle equivalent of DECIMAL(8,2). It is much more efficient to populate this kind of table using the "hierarchical query without PRIOR" trick (the "connect by level <= ..." trick) to get the order IDs.
If your table is already created, insert into OrderTable (select level ...) (the same subquery as in my code) should work just as well. You may be better off adding the PK constraint only after you create the data, though, so as not to slow things down.
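A minimal sketch of that variant, assuming the OrderTable from the question already exists without its primary key (the same dbms_random calls as above):
insert into OrderTable (OrderID, StaffID, TotalOrderValue, CustomerID)
select level,                                    -- sequential 1, 2, 3, ...
       ceil(dbms_random.value(0, 1000)),         -- 1..1000
       round(dbms_random.value(0, 9999.99), 2),  -- 0.00..9999.99
       ceil(dbms_random.value(0, 10000))         -- 1..10000
from dual
connect by level <= 1000000;

-- add the primary key only after the bulk load, so it does not slow the inserts down
alter table OrderTable add constraint OrderID_PK primary key (OrderID);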
A small sample from the table created (total time to create the table on my cheap laptop - 1,000,000 rows - was 7.6 seconds):
SQL> select * from OrderTable where orderid between 500020 and 500030;
ORDERID STAFFID CUSTOMERID TOTALORDERVALUE
---------- ---------- ---------- ---------------
500020 666 879 6068.63
500021 189 6444 1323.82
500022 533 2609 1847.21
500023 409 895 207.88
500024 80 2125 1314.13
500025 247 3772 5081.62
500026 922 9523 1160.38
500027 818 5197 5009.02
500028 393 6870 5067.81
500029 358 4063 858.44
500030 316 8134 3479.47

How to generate each day of backlog for a ticket

Hi, I'm trying to create a procedure for calculating the backlog for each day.
For example: I have a ticket with ticket_submitdate on 12-sep-2015 and resolved_date on 15-sep-2015 in one table. This ticket should appear as a backlog in the backlog_table because it was not resolved on the same day as the ticket_submitdate.
I have another column, date_col, in the backlog_table which shows the date on which the ticket was a backlog; i.e., the ticket should be in the ticket_backlog table for the dates 13-sep-2015 and 14-sep-2015, and the date_col column should have this ticket for both of those dates.
Please help.
Thanks in advance.
Here is some test data:
create table backlog (ticket_no number, submit_date date, resolved_date date);
insert into backlog values (100, date '2015-09-12', date '2015-09-15');
insert into backlog values (200, date '2015-09-12', date '2015-09-14');
insert into backlog values (300, date '2015-09-13', date '2015-09-15');
insert into backlog values (400, date '2015-09-13', date '2015-09-16');
insert into backlog values (500, date '2015-09-13', date '2015-09-13');
This query generates a list of dates which spans the range of BACKLOG records, and joins them to the BACKLOG.
with dt as ( select min(submit_date) as st_dt
                  , greatest(max(resolved_date), max(submit_date)) as end_dt
             from backlog)
   , dt_range as ( select st_dt + (level - 1) as date_col
                   from dt
                   -- +1 so the generated range includes end_dt itself
                   connect by level <= ( end_dt - st_dt ) + 1 )
select b.ticket_no
     , d.date_col
from backlog b
cross join dt_range d
where d.date_col between b.submit_date and b.resolved_date
and b.submit_date != b.resolved_date
order by b.ticket_no
       , d.date_col
/
It produces a list of TICKET_NOs with all the dates on which they are live:
 TICKET_NO DATE_COL
---------- ---------
       100 12-SEP-15
       100 13-SEP-15
       100 14-SEP-15
       100 15-SEP-15
       200 12-SEP-15
       200 13-SEP-15
       200 14-SEP-15
       300 13-SEP-15
       300 14-SEP-15
       300 15-SEP-15
       400 13-SEP-15
       400 14-SEP-15
       400 15-SEP-15
       400 16-SEP-15
14 rows selected.
SQL>
The result set does not include ticket #500 because it was resolved on the day of submission. You will probably need to tweak the filters to fit your actual business rules.
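For instance, if a ticket should count as backlog only on the days strictly between submission and resolution, as the question's 13-sep/14-sep example suggests, the final SELECT could use exclusive bounds. A sketch reusing the dt and dt_range CTEs above:
select b.ticket_no
     , d.date_col
from backlog b
cross join dt_range d
where d.date_col > b.submit_date     -- strictly after the submit day
and d.date_col < b.resolved_date   -- strictly before the resolve day
order by b.ticket_no
       , d.date_col
/
With exclusive bounds, the b.submit_date != b.resolved_date filter is no longer needed, since same-day tickets produce no rows anyway.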
I'm not sure I understood your question; if you are looking for all dates between a two-date range (where date_col2 is the earlier date), then you can use the query below:
select trunc(date_col2 + lv) from
(select level lv from dual connect by level < (date_col1 - date_col2))
order by 1
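Applied to the backlog table above, that pattern would look something like this sketch (the inner connect by limit of 10 is an assumed maximum span, not from the original answer):
select b.ticket_no, trunc(b.submit_date + d.lv) as date_col
from backlog b
cross join (select level lv from dual connect by level < 10) d  -- 10 = assumed max span in days
where d.lv < b.resolved_date - b.submit_date   -- dates strictly between submit and resolve
order by b.ticket_no, 2;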

Sampling Issue with hive

"all_members" is a table in hive with 10m rows and 1 column: "membership_nbr". I want to sample 3000 rows. This is what I have done:
hive>create table sample_members as select * from all_members limit 1;
hive>insert overwrite table sample_members select membership_nbr from all_members tablesample(3000 rows);
hive>select count(*) from sample_members;
OK 45000
The result won't change if I replace 3000 rows with 300 rows.
Am I doing something wrong?
Table sampling with tablesample(3000 rows) won't fetch 3000 rows from the entire table; instead, it will fetch 3000 rows from each input split.
Your query probably ran 15 mappers, and each mapper fetched 3000 rows, for a total of 3000 * 15 = 45000 rows. Likewise, if you change 3000 rows to 300 rows, you will get 4500 rows as output.
So, as per your requirement, you have to specify tablesample(200 rows): each mapper will then fetch 200 rows, and 15 mappers together will fetch the 3000 sample rows.
Refer to the link below for the various types of sampling:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling
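If the number of mappers is not stable, a common alternative, offered here as a sketch rather than part of the original answer, is to shuffle the rows randomly and take an exact limit:
hive> insert overwrite table sample_members
    > select membership_nbr from all_members
    > distribute by rand() sort by rand()   -- random shuffle across reducers
    > limit 3000;                           -- exact sample size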

Split amount into multiple rows if amount>=$10M or <=$-10B

I have a table in an Oracle database which may contain amounts >= $10M or <= $-10B.
If the value is greater than or equal to $10M, I need to break it into one or more 9999999.99 chunks and also include the remainder.
If the value is less than or equal to $-10B, I need to break it into one or more 999999999.99 chunks and also include the remainder.
Your question is somewhat unreadable and you did not provide examples, but here is something for a start, which may help you or someone with a similar problem.
Let's say you have this data and you want to divide amounts into chunks not greater than 999:
id amount
-- ------
1 1500
2 800
3 2500
This query:
select id, amount,
case when level=floor(amount/999)+1 then mod(amount, 999) else 999 end chunk
from data
connect by level<=floor(amount/999)+1
and prior id = id and prior dbms_random.value is not null
...divides the amounts; the last row contains the remainder. Output is:
ID AMOUNT CHUNK
------ ---------- ----------
1 1500 999
1 1500 501
2 800 800
3 2500 999
3 2500 999
3 2500 502
SQLFiddle demo
Edit: full query according to additional explanations:
select id, amount,
case
when amount>=0 and level=floor(amount/9999999.99)+1 then mod(amount, 9999999.99)
when amount>=0 then 9999999.99
when level=floor(-amount/999999999.99)+1 then -mod(-amount, 999999999.99)
else -999999999.99
end chunk
from data
connect by ((amount>=0 and level<=floor(amount/9999999.99)+1)
or (amount<0 and level<=floor(-amount/999999999.99)+1))
and prior id = id and prior dbms_random.value is not null
SQLFiddle
Please adjust numbers for positive and negative borders (9999999.99 and 999999999.99) according to your needs.
There are more possible solutions (a recursive CTE query, a PL/SQL procedure, maybe others); this hierarchical query is one of them.
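For completeness, a sketch of the recursive CTE variant for the simple positive case with 999-sized chunks (matching the first example; it is an alternative, not the answer's original query):
with chunks (id, amount, remaining) as (
  select id, amount, amount from data
  union all
  -- peel one full 999 chunk off per recursion step
  select id, amount, remaining - 999
  from chunks
  where remaining > 999
)
select id, amount, least(remaining, 999) as chunk
from chunks
order by id, remaining desc;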

PIG Script How to

I am trying to clean up this employee volunteer data. There is no way to track whether an employee is already a registered volunteer, so he can sign up as a new volunteer and will get a new VOLUNTEER_ID. I have a data feed with which I can tie each VOLUNTEER_ID to its EMP_ID. The volunteer data needs to be cleaned up so we can figure out how the employee moved from one volunteer level to another, and when.
The business logic is that, when there are overlapping dates, we give the employee the highest level for the timeframe between START_DATE and END_DATE.
I posted an input sample of the data and what the output should be.
Is it possible to do this in a Pig script? Can someone please help me?
INPUT:
EMP_ID  VOLUNTEER_ID  V_LEVEL  STATUS  START_DATE  END_DATE
10001   100           1        A       1/1/2006    12/31/2007
10001   200           1        A       5/1/2006
10001   100           1        A       1/1/2008
10001   300           3        P       3/1/2008    3/1/2008
10001   300           3        A       3/2/2008    12/1/2008
10001   1001          2        A       5/1/2008    6/30/2008
10001   1001          3        A       7/1/2008
10001   300           2        A       12/2/2008
OUTPUT NEEDED (VOLUNTEER_ID is not needed in the output, but it is included below to show which IDs were selected and which were not):
EMP_ID  VOLUNTEER_ID  V_LEVEL  STATUS  START_DATE  END_DATE
10001   100           1        A       1/1/2006    12/31/2007
10001   300           3        P       3/1/2008    3/1/2008
10001   300           3        A       3/2/2008    12/1/2008
10001   1001          2        A       5/1/2008    6/30/2008
10001   1001          3        A       7/1/2008
It seems like you want the row in your data with the earliest start date for each EMP_ID, VOLUNTEER_ID, V_LEVEL, and STATUS.
First we add a unix-time column and then find the minimum of that column (ToUnixTime is in the latest version of Pig, so you may need to update your version).
data_with_unix = foreach data generate EMP_ID, VOLUNTEER_ID, V_LEVEL, STATUS, START_DATE, END_DATE,
                 ToUnixTime(ToDate(START_DATE, 'M/d/yyyy')) as unix_time;  -- parse the string date before converting
grp = group data_with_unix by (EMP_ID, VOLUNTEER_ID, V_LEVEL, STATUS);
min_date = foreach grp generate group, MIN(data_with_unix.unix_time) as min_unix;
Then join the start and end dates back into your dataset, since it doesn't look like there is currently a way to convert a unix time back to a date.
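A sketch of that join-back (an untested assumption built on the snippet above, not part of the original answer):
-- flatten the group key so it can be joined on directly
min_dates = foreach grp generate flatten(group) as (EMP_ID, VOLUNTEER_ID, V_LEVEL, STATUS),
            MIN(data_with_unix.unix_time) as min_unix;
-- keep only the rows whose start date equals their group's minimum
joined = join data_with_unix by (EMP_ID, VOLUNTEER_ID, V_LEVEL, STATUS, unix_time),
         min_dates by (EMP_ID, VOLUNTEER_ID, V_LEVEL, STATUS, min_unix);
earliest = foreach joined generate data_with_unix::EMP_ID, data_with_unix::V_LEVEL,
           data_with_unix::STATUS, data_with_unix::START_DATE, data_with_unix::END_DATE;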
