Tracking data changes on Oracle 9 without timestamps or indexing

We're building a data warehouse on BigQuery that includes data from an old Oracle 9 transactional database (still active), which has no indexing or timestamps.
Using Standard SQL, I would like to analyse changes in some tables imported from this database.
Simplifying the situation, imagine we have two versions of the same table, before and after, as follows:
with before as (
select
'U123' as user, 'Gum' as product, '3' as quantity
union all
select
'U456', 'Tissue', '20'
union all
select
'U123', 'Cream', '1'
)
and
with after as (
select
'U123' as user, 'Gum' as product, '3' as quantity
union all
select
'U456', 'Tissue', '20'
union all
select
'U123', 'Cream', '3'
union all
select
'U456', 'Tomato', '5'
)
So row 4 was added and row 3 was modified.
What is the correct approach to compare the data and locate changes, given that there are neither indexes nor timestamps?
The comparison should output:
user | product | quantity
U123 | Cream | 3
U456 | Tomato | 5
I don't even know where to start.

The query below is for BigQuery Standard SQL:
#standardSQL
SELECT user, product, IFNULL(a.quantity, 0) - IFNULL(b.quantity, 0) AS quantity
FROM after a
FULL OUTER JOIN before b
USING(user, product)
WHERE IFNULL(a.quantity, 0) != IFNULL(b.quantity, 0)
When applied to the sample data from your question, as in the example below:
#standardSQL
WITH before AS (
SELECT 'U123' AS user, 'Gum' AS product, 3 AS quantity UNION ALL
SELECT 'U456', 'Tissue', 20 UNION ALL
SELECT 'U123', 'Cream', 1
), after AS (
SELECT 'U123' AS user, 'Gum' AS product, 3 AS quantity UNION ALL
SELECT 'U456', 'Tissue', 20 UNION ALL
SELECT 'U123', 'Cream', 3 UNION ALL
SELECT 'U456', 'Tomato', 5
)
SELECT user, product, IFNULL(a.quantity, 0) - IFNULL(b.quantity, 0) AS quantity
FROM after a
FULL OUTER JOIN before b
USING(user, product)
WHERE IFNULL(a.quantity, 0) != IFNULL(b.quantity, 0)
the output is:
Row user product quantity
1 U123 Cream 2
2 U456 Tomato 5
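The query above reports the change in quantity. If the goal is instead the rows as they appear in the newer snapshot, which is the output sketched in the question, a set difference is one option. This is a minimal sketch, assuming both snapshots have the same columns in the same order with compatible types:

```sql
#standardSQL
WITH before AS (
  SELECT 'U123' AS user, 'Gum' AS product, 3 AS quantity UNION ALL
  SELECT 'U456', 'Tissue', 20 UNION ALL
  SELECT 'U123', 'Cream', 1
), after AS (
  SELECT 'U123' AS user, 'Gum' AS product, 3 AS quantity UNION ALL
  SELECT 'U456', 'Tissue', 20 UNION ALL
  SELECT 'U123', 'Cream', 3 UNION ALL
  SELECT 'U456', 'Tomato', 5
)
-- Rows present in "after" that have no identical row in "before":
-- newly added rows and rows with at least one changed column.
SELECT * FROM after
EXCEPT DISTINCT
SELECT * FROM before
ORDER BY user, product
```

On the sample data this keeps the rows (U123, Cream, 3) and (U456, Tomato, 5). It does not report rows deleted from the source; swapping the two operands would list those, along with the old versions of modified rows.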

Oracle tracks data changes with the SCN (System Change Number). With the ROWDEPENDENCIES option the SCN is recorded per row rather than per block, so any change made through DML (INSERT/UPDATE) can be mapped to an approximate timestamp. Note that the ORA_ROWSCN pseudocolumn and the SCN_TO_TIMESTAMP function are available from Oracle 10g onward, so this approach needs a release newer than Oracle 9.
How does it work?
Create the table with the ROWDEPENDENCIES option.
Use the SCN_TO_TIMESTAMP(ORA_ROWSCN) function to get the timestamp of row changes.
Example:
-- Create Table (USER is a reserved word in Oracle, so the column is named USER_ID here)
CREATE TABLE SCNTEST(USER_ID NUMBER, PRODUCT NUMBER, QUANTITY NUMBER) ROWDEPENDENCIES;
-- Insert Data
INSERT ...
-- Query Data
SELECT USER_ID, PRODUCT, QUANTITY, SCN_TO_TIMESTAMP(ORA_ROWSCN) FROM SCNTEST;
You can group the data on the SCN_TO_TIMESTAMP(ORA_ROWSCN) value to get the before and after records.
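As a hedged illustration of that idea, the sketch below lists the rows whose SCN is newer than a cutoff. It assumes the SCNTEST table from the example, Oracle 10g or later, and a one-day cutoff chosen only for illustration. TIMESTAMP_TO_SCN, like SCN_TO_TIMESTAMP, only works for points in time the database still holds an SCN mapping for, so an older cutoff can raise an error.

```sql
-- Rows of SCNTEST changed within roughly the last day.
-- The one-day window is an assumption made for this example.
SELECT s.*,
       s.ORA_ROWSCN AS row_scn
FROM   SCNTEST s
WHERE  s.ORA_ROWSCN > TIMESTAMP_TO_SCN(SYSTIMESTAMP - INTERVAL '1' DAY)
ORDER  BY s.ORA_ROWSCN;
```

Storing the highest SCN seen at each extract and filtering on it at the next one would avoid the timestamp conversion entirely.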

Related

How does the recursive WITH query work in oracle? When does it go into a cycle?

I have a scenario where I have to display a row 'n' number of times depending on the value in its quantity column.
Source table:
Item Qty
abc  2
cde  1
Desired result:
Item Qty
abc  1
abc  1
cde  1
I am looking to convert the first table to the second.
I read on a website that I should use a recursive WITH query.
My anchor member returns the original table.
SELECT ITEM, QTY
FROM lines
WHERE
JOB = TO_NUMBER ('1')
AND ITEM IN
(SELECT PART
FROM PICK
WHERE DELIVERY = '2')
My recursive member is as follows.
SELECT CTE.ITEM, (CTE.QTY - 1) QTY
FROM CTE
INNER JOIN
(SELECT ITEM, QTY
FROM LINES
WHERE JOB_ID = TO_NUMBER ('1')
AND ITEM IN
(SELECT PART
FROM PICK
WHERE DELIVERY = '2'
)) T
ON CTE.ITEM = T.ITEM
WHERE CTE.QTY > 1
My goal is to get all the parts and quantities first, and then, for every part with qty > 1, have the recursive step generate new rows to add to the original result set, where the qty shown in each new row is (original qty for that part - 1). The recursion would go on until qty becomes 1 for all the parts.
So this is what I had in the end.
WITH CTE (ITEM, QTY)
AS (
SELECT ITEM, QTY
FROM lines
WHERE
JOB = TO_NUMBER ('1')
AND ITEM IN
(SELECT PART
FROM PICK
WHERE DELIVERY = '2')
UNION ALL
SELECT CTE.ITEM, (CTE.QTY - 1) QTY
FROM CTE
INNER JOIN
(SELECT ITEM, QTY
FROM LINES
WHERE JOB_ID = TO_NUMBER ('1')
AND ITEM IN
(SELECT PART
FROM PICK
WHERE DELIVERY = '2'
)) T
ON CTE.ITEM = T.ITEM
WHERE CTE.QTY > 1)
SELECT ITEM, QTY
FROM CTE
ORDER BY 1, 2 DESC
I get the following error when I try the above
"ORA-32044: cycle detected while executing recursive WITH query"
How is it getting into a cycle? What did I miss in its working?
Also, after reading another website, I found that adding a "cycle clause" stopped the cycle.
The clause I used was:
CYCLE
QUANTITY
SET
END TO '1'
DEFAULT '0'
If I put this before the final select statement, I get the desired output, but I don't feel this is the right way of going about it. What exactly is the clause doing? What is the right way of using it?
Oracle Setup:
CREATE TABLE lines ( Item, Qty ) AS
SELECT 'abc', 2 FROM DUAL UNION ALL
SELECT 'cde', 1 FROM DUAL;
CREATE TABLE pick ( part, delivery ) AS
SELECT 'abc', 2 FROM DUAL UNION ALL
SELECT 'cde', 2 FROM DUAL;
Query 1: Using a hierarchical query:
SELECT Item,
COLUMN_VALUE AS qty
FROM lines l
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT 1
FROM DUAL
CONNECT BY LEVEL <= l.Qty
)
AS SYS.ODCINUMBERLIST
)
) t
WHERE item IN ( SELECT part FROM pick WHERE delivery = 2 )
Query 2: Using a recursive sub-query factoring clause:
WITH rsqfc ( item, qty ) AS (
SELECT item, qty
FROM lines l
WHERE item IN ( SELECT part FROM pick WHERE delivery = 2 )
UNION ALL
SELECT item, qty - 1
FROM rsqfc
WHERE qty > 1
)
SELECT item, 1 AS qty
FROM rsqfc;
Output:
ITEM | QTY
:--- | --:
abc | 1
abc | 1
cde | 1

Oracle month year temp table

I am trying to create a month, year temp table that I can relate to in calculations, however I am having some issues. I am unable to create global temp tables due to restrictions and have to rely on the following kind of query.
WITH Months AS
(
SELECT LEVEL -1 AS ID
FROM DUAL
CONNECT BY LEVEL <=264
)
(SELECT
ROWNUM AS MO_SYS_ID,
TO_CHAR(ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), ID), 'YYYY'||'MM') AS MO_NM,
TO_CHAR(ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), ID), 'MON') AS MO_ABBR_NM,
TO_CHAR(ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), ID), 'MONTH') AS MO_FULL_NM,
TO_CHAR(ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), ID), 'MM')AS MO_NBR,
TO_CHAR(ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), ID), 'YYYY') AS YR_NBR
from Months;
What I really need to do is have this inserted into the temp table that I can recall. I do not have any fields that I can use from other tables either unfortunately. I need it to show 264 months from 1999.
Thank you
You can calculate a date column within the table expression, like this:
WITH Months AS (
SELECT LEVEL -1 AS ID, ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), LEVEL -1) as dt
FROM DUAL
CONNECT BY LEVEL <=264
)
SELECT *
from Months
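Building on that, the columns from the question can all be derived from the single date column. This is a sketch only: the column names come from the question, the DATE literal replaces the TO_DATE call, and the 'fm' modifier (which strips the blank padding from the month name) is an addition that can be dropped if the padded form is wanted.

```sql
WITH Months AS (
  -- One row per month, starting January 1999, 264 months in total
  SELECT LEVEL AS MO_SYS_ID,
         ADD_MONTHS(DATE '1999-01-01', LEVEL - 1) AS dt
  FROM DUAL
  CONNECT BY LEVEL <= 264
)
SELECT MO_SYS_ID,
       TO_CHAR(dt, 'YYYYMM')  AS MO_NM,
       TO_CHAR(dt, 'MON')     AS MO_ABBR_NM,
       TO_CHAR(dt, 'fmMONTH') AS MO_FULL_NM,
       TO_CHAR(dt, 'MM')      AS MO_NBR,
       TO_CHAR(dt, 'YYYY')    AS YR_NBR
FROM Months;
```

Because the WITH clause is part of the statement, it can be reused wherever the month list is needed without creating a temporary table. The month names follow the session's NLS date language.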
If you are attempting to create date ranges, you could do this:
WITH Months AS (
SELECT LEVEL -1 AS ID
, ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), LEVEL -1) as start_dt
, ADD_MONTHS(TO_DATE('01/01/1999', 'DD/MM/YY'), LEVEL ) as end_dt
FROM DUAL
CONNECT BY LEVEL <=264
)
SELECT *
from yourtable t
inner join Months m on t.somecol >= m.start_dt and t.somecol < m.end_dt

Compare table content

I have 2 tables and I need to do a table compare:
TABLE A
LABEL
VALUE
TABLE B
LABEL
VALUE
Basically I want:
Records where the values are not equal on matching labels
Records in TABLE A that are not in TABLE B
Records in TABLE B that are not in TABLE A
With that information, I can record the historical data I need. It will show me where a value has changed, or where a label was added or deleted. You can say TABLE A is the "new" set of data, and TABLE B is the "old" set of data. So I can see what is being added, what was deleted, and what was changed.
Been trying with UNION & MINUS, but no luck yet.
Something like:
A LABEL   A VALUE   B LABEL   B VALUE
---------------------------------------
XXX       5         XXX       3
YYY       2
                    ZZZ       4
WWW       7         WWW       8
If the labels and values are the same, I do not need them in the result set.
Here is one way (and possibly the most efficient way) to solve this problem. The main part is the subquery that does a UNION ALL and GROUP BY on the result, keeping only groups consisting of a single row. (The groups with two rows are those where the same row exists in both tables.) This method was invented by Marco Stefanetti - first discussed on the AskTom discussion board. The benefit of this approach - over the more common "symmetric difference" approach - is that each base table is read just once, not twice.
Then, to put the result in the desired format, I use a PIVOT operation (available since Oracle 11.1); in earlier versions of Oracle, the same can be done with a standard aggregate outer query.
Note that I modified the inputs to show the handling of NULL in the VALUE column also.
Important: This solution assumes LABEL is primary key in both tables; if not, it's not clear how the required output would even make sense.
with
table_a ( label, value ) as (
select 'AAA', 3 from dual
union all select 'CCC', null from dual
union all select 'XXX', 5 from dual
union all select 'WWW', 7 from dual
union all select 'YYY', 2 from dual
union all select 'HHH', null from dual
),
table_b ( label, value ) as (
select 'ZZZ', 4 from dual
union all select 'AAA', 3 from dual
union all select 'HHH', null from dual
union all select 'WWW', 8 from dual
union all select 'XXX', 3 from dual
union all select 'CCC', 1 from dual
)
-- End of test data (NOT PART OF THE SOLUTION!) SQL query begins below this line.
select a_label, a_value, b_label, b_value
from (
select max(source) as source, label as lbl, label, value
from (
select 'A' as source, label, value
from table_a
union all
select 'B' as source, label, value
from table_b
)
group by label, value
having count(*) = 1
)
pivot ( max(label) as label, max(value) as value for source in ('A' as a, 'B' as b) )
;
Output:
A_LABEL A_VALUE B_LABEL B_VALUE
------- ------- ------- -------
YYY           2
CCC             CCC           1
WWW           7 WWW           8
                ZZZ           4
XXX           5 XXX           3
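For releases before 11.1, where PIVOT is not available, the same layout can be produced with conditional aggregation, as the answer mentions. The following is a sketch rather than the original author's code. It assumes table_a and table_b exist as real tables, since the column-list form of the WITH clause used for the test data above also requires a later release.

```sql
select max(case when source = 'A' then label end) as a_label,
       max(case when source = 'A' then value end) as a_value,
       max(case when source = 'B' then label end) as b_label,
       max(case when source = 'B' then value end) as b_value
from (
       -- Keep only (label, value) pairs that occur in exactly one table
       select max(source) as source, label, value
       from (
              select 'A' as source, label, value from table_a
              union all
              select 'B' as source, label, value from table_b
            )
       group by label, value
       having count(*) = 1
     )
-- One output row per label, with the A side and B side in separate columns
group by label;
```

As with the PIVOT version, this relies on LABEL being unique within each table.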

How to get minimum unused number from a column in Oracle?

In my database I have a table with a column that holds the code of each record (aside from the ID column). This field is unique, and each time the user tries to insert a record into the table, the first unused code should be assigned to the record. The table currently holds the following codes:
+------+
| code |
+------+
|  1   |
|  2   |
|  3   |
|  5   |
+------+
I want a query to return 4 as the result.
Note that this query runs very frequently in my system, so the query with the lowest execution time would be appreciated.
Is using a self-join acceptable? If so:
-- your test data:
WITH data AS (SELECT 1 AS code FROM DUAL
UNION SELECT 2 FROM DUAL
UNION SELECT 3 FROM DUAL
UNION SELECT 5 FROM DUAL)
-- request:
SELECT COALESCE(MIN(d1.code+1),1)
FROM data d1 LEFT JOIN data d2 ON d1.code+1 = d2.code
WHERE d2.code IS NULL;
This builds the list of data.code values that have no successor. Taking MIN(... + 1) then gives the first empty slot. I used COALESCE(...) to handle the specific case where the data table has no entries at all.
An alternate form using a sequence generator might perform better, as it does not require the whole table to be traversed in order to compute the aggregate function MIN():
-- your test data:
WITH data AS (SELECT 1 AS code FROM DUAL
UNION SELECT 5 FROM DUAL
UNION SELECT 2 FROM DUAL
UNION SELECT 3 FROM DUAL)
-- request:
SELECT T.code FROM (SELECT d1.code
FROM (SELECT LEVEL code FROM DUAL CONNECT BY LEVEL < 9999) d1 LEFT JOIN data d2
ON d1.code = d2.code
WHERE d2.code IS NULL
ORDER BY d1.code ASC
) T WHERE ROWNUM < 2
The drawback is that you now have a hard-coded upper limit. It could be inferred dynamically from the data table, though, so it is not really a blocker. I'll let you compare timings yourself.
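One way to avoid the hard-coded limit, offered as a sketch and not as part of the answer above, is to take the candidates from the table itself: the first unused code is either 1 or the successor of some existing code. This assumes the codes are positive integers.

```sql
-- your test data:
WITH data AS (SELECT 1 AS code FROM DUAL
        UNION SELECT 5 FROM DUAL
        UNION SELECT 2 FROM DUAL
        UNION SELECT 3 FROM DUAL)
-- request:
SELECT MIN(c.code) AS first_unused
FROM (SELECT 1 AS code FROM DUAL          -- covers an empty table or a missing code 1
      UNION
      SELECT code + 1 FROM data) c        -- successor of every existing code
WHERE NOT EXISTS (SELECT 1 FROM data d WHERE d.code = c.code);
```

With the sample codes 1, 2, 3 and 5 the candidates are 1, 2, 3, 4 and 6, of which 4 and 6 are unused, so the query returns 4. It also returns 1 when code 1 itself is missing.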
"this field is unique and each time the user tries to insert a record into the table, the first unused code should be assigned to the record"
Please note, however, that this leads to a race condition if two concurrent sessions try to insert a row at the same time. Given your example, both will try to insert a row with code = 4, and only one of them can succeed because your column is unique.
I recently used the code below:
SELECT t1.id+1
FROM table t1
LEFT OUTER JOIN table t2 ON (t1.id + 1 = t2.id)
WHERE t2.id IS NULL
/* and rownum = 1 Need to use a sub select if you want this to work */
ORDER BY t1.id;
I run it every time that I need to insert a new row and use the minimum unused id.
I hope it works for your purposes.
select min(unusedval)
from (select level unusedval from dual connect by level < 10
minus
select tno from t2);
You can change the LEVEL condition depending on the maximum value you expect.

How to group by category, year so as to include zero sums for each year per category?

My imaginary results would look like:
Category   Year   sum
--------   ----   ---
A          2008   200
A          2009     0
B          2008   100
B          2009     5
...        ...    ...
i.e. the sum of the transactions per year and per category.
There are cases where a category has no transactions for a year; in those cases a line such as the 2nd line of the results will not appear. How do I rewrite the query below so that 2008 and 2009 are included for every category?
select category, to_char(trans_date, 'YYYY') year, sum(trans_value)
from transaction
group by category, to_char(trans_date, 'YYYY')
order by 1, 2;
With a partitioned outer join, you don't need a categories table.
I used the same transactions table as "dcp" used:
SQL> create table transactions
  ( category varchar(1)
  , trans_date date
  , trans_value number(25,8)
  );
Table created.
SQL> insert into transactions values ('A',to_date('2008-01-01','yyyy-mm-dd'),100.0);
1 row created.
SQL> insert into transactions values ('A',to_date('2008-02-01','yyyy-mm-dd'),100.0);
1 row created.
SQL> insert into transactions values ('B',to_date('2008-01-01','yyyy-mm-dd'),50.0);
1 row created.
SQL> insert into transactions values ('B',to_date('2008-02-01','yyyy-mm-dd'),50.0);
1 row created.
SQL> insert into transactions values ('B',to_date('2009-08-01','yyyy-mm-dd'),5.0);
1 row created.
For the partitioned outer join you only need a set of years to partition outer join against. In the query below I used 2 years (2008 and 2009), but you can easily adjust that set.
SQL> with the_years as
  ( select 2007 + level year
         , trunc(to_date(2007 + level,'yyyy'),'yy') start_of_year
         , trunc(to_date(2007 + level + 1,'yyyy'),'yy') - interval '1' second end_of_year
      from dual
   connect by level <= 2
  )
  select t.category "Category"
       , y.year "Year"
       , nvl(sum(t.trans_value),0) "sum"
    from the_years y
         left outer join transactions t
           partition by (t.category)
           on (t.trans_date between y.start_of_year and y.end_of_year)
   group by t.category
       , y.year
   order by t.category
       , y.year
  /
Category       Year        sum
-------- ---------- ----------
A              2008        200
A              2009          0
B              2008        100
B              2009          5
4 rows selected.
Also note that I used start_of_year and end_of_year, so if you want to filter on trans_date and you have an index on that column, it can be used. Another option is to simply use extract(year from t.trans_date) = y.year as the on-condition.
Hope this helps.
Regards,
Rob.
You ideally need a table of categories and a table of years:
select c.category, y.year, nvl(sum(t.trans_value),0)
from categories c
cross join years y
left outer join transaction t
on to_char(t.trans_date, 'YYYY') = y.year
and t.category = c.category
group by c.category, y.year
order by 1, 2;
Hopefully you do have a table of categories, but you may well not have a table of years, in which case you can "fake" one like this:
with years as
( select 2007+rownum year
from dual
connect by rownum < 10) -- returns 2008, 2009, ..., 2016
select c.category, y.year, nvl(sum(t.trans_value),0)
from categories c
cross join years y
left outer join transaction t
on to_char(t.trans_date, 'YYYY') = y.year
and t.category = c.category
group by c.category, y.year
order by 1, 2;
Here's a complete, working example:
CREATE TABLE transactions (CATEGORY VARCHAR(1), trans_date DATE, trans_value NUMBER(25,8));
CREATE TABLE YEAR (YEAR NUMBER(4));
CREATE TABLE categories (CATEGORY VARCHAR(1));
INSERT INTO categories VALUES ('A');
INSERT INTO categories VALUES ('B');
INSERT INTO transactions VALUES ('A',to_date('2008-01-01','YYYY-MM-DD'),100.0);
INSERT INTO transactions VALUES ('A',to_date('2008-02-01','YYYY-MM-DD'),100.0);
INSERT INTO transactions VALUES ('B',to_date('2008-01-01','YYYY-MM-DD'),50.0);
INSERT INTO transactions VALUES ('B',to_date('2008-02-01','YYYY-MM-DD'),50.0);
INSERT INTO transactions VALUES ('B',to_date('2009-08-01','YYYY-MM-DD'),5.0);
INSERT INTO YEAR VALUES (2008);
INSERT INTO YEAR VALUES (2009);
SELECT b.category
, b.year
, SUM(nvl(a.trans_value,0))
FROM (SELECT to_char(a.trans_date,'YYYY') YEAR
, CATEGORY
, SUM(NVL(trans_value,0)) trans_value
FROM transactions a
GROUP BY to_char(a.trans_date,'YYYY')
, a.category ) a
, (SELECT
DISTINCT a.category
, b.year
FROM categories a
, YEAR b ) b
WHERE b.year = to_char(a.year(+))
AND b.category = a.category(+)
GROUP BY
b.category
, b.year
ORDER BY 1
,2;
Output:
CATEGORY  YEAR  SUM(NVL(A.TRANS_VALUE,0))
A         2008  200
A         2009  0
B         2008  100
B         2009  5
