How to compare two sets of rows in Oracle?

I have two result sets (e.g. numbers):
RES1:
10
11
RES2:
10
13
I need the values that are in RES1 but not in RES2, together with the values that are in RES2 but not in RES1.
I would like to get a result like:
RES3:
11
13
How do I do that?
I tried
(RES1 MINUS RES2)
UNION
(RES2 MINUS RES1)
but this approach is very slow because my table contains millions of rows.

Why not use one of the supplied packages, DBMS_COMPARISON?
The package lets you compare and synchronize tables. The only requirement is that the tables have an index.
1) Create the data sets to compare
create table to_compare2 as (select OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, case when mod(object_id,18) = 0 then CREATED +1 else CREATED end CREATED from all_objects where mod(object_id,6) = 0 );
CREATE table to_compare1 as (SELECT OBJECT_NAME, SUBOBJECT_NAME, OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, case when mod(object_id,12) = 0 then CREATED +1 else CREATED end CREATED FROM ALL_OBJECTS where mod(object_id,3) = 0 );
2) create indexes.
CREATE UNIQUE INDEX to_compare1_idx on to_compare1(object_id);
CREATE UNIQUE INDEX to_compare2_idx on to_compare2(object_id);
3) Prepare the comparison context
BEGIN
DBMS_COMPARISON.create_comparison (
comparison_name => 'MY_COMPARISION',
schema_name => user,
object_name => 'to_compare1',
dblink_name => NULL,
remote_schema_name => null,
remote_object_name => 'to_compare2');
END;
/
4) Execute comparison and check results.
DECLARE
v_scan_info DBMS_COMPARISON.comparison_type;
v_result BOOLEAN;
BEGIN
v_result := DBMS_COMPARISON.compare (
comparison_name => 'MY_COMPARISION',
scan_info => v_scan_info,
perform_row_dif => TRUE
);
IF NOT v_result THEN
DBMS_OUTPUT.put_line('Differences. scan_id=' || v_scan_info.scan_id);
ELSE
DBMS_OUTPUT.put_line('No differences.');
END IF;
END;
/
5) Results
SELECT *
FROM user_comparison_row_dif
WHERE comparison_name = 'MY_COMPARISION';
if local_rowid is not null and remote_rowid is null -> the record exists only in to_compare1
if local_rowid is null and remote_rowid is not null -> the record exists only in to_compare2
if local_rowid is not null and remote_rowid is not null -> the record exists in both tables but has different values

Solution 1:
Try UNION ALL instead of UNION.
For why UNION ALL is faster than UNION, see: What is the difference between UNION and UNION ALL?
Solution 2:
You can try a full outer join:
select coalesce(a.id,b.id)
from a
full outer join b on a.id = b.id
where a.id is null
or b.id is null
Example: http://www.sqlfiddle.com/#!4/88f81/3

Are the values unique within RES1 and within RES2? If so, you could try counting:
SELECT col
FROM (
SELECT col FROM RES1
UNION ALL
SELECT col FROM RES2
)
GROUP BY col
HAVING COUNT(1) = 1
If the values are not unique, you'd have to add a DISTINCT on both sides of the UNION ALL, which makes it a lot slower.
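The counting approach, including the DISTINCT variant for non-unique values, can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere, and the function name symmetric_difference is made up for this example. The SQL itself should carry over to Oracle largely unchanged.

```python
import sqlite3

def symmetric_difference(res1, res2):
    """Return the values that appear in exactly one of the two inputs."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE res1 (col INTEGER)")
    con.execute("CREATE TABLE res2 (col INTEGER)")
    con.executemany("INSERT INTO res1 VALUES (?)", [(v,) for v in res1])
    con.executemany("INSERT INTO res2 VALUES (?)", [(v,) for v in res2])
    rows = con.execute(
        """
        SELECT col
        FROM (
            SELECT DISTINCT col FROM res1   -- DISTINCT only needed if values repeat
            UNION ALL
            SELECT DISTINCT col FROM res2
        )
        GROUP BY col
        HAVING COUNT(*) = 1                 -- seen on one side only
        ORDER BY col
        """
    ).fetchall()
    con.close()
    return [r[0] for r in rows]

# 11 is duplicated in the first input on purpose.
print(symmetric_difference([10, 11, 11], [10, 13]))  # prints [11, 13]
```

Without the two DISTINCTs, the duplicated 11 would be counted twice and dropped from the result, which is why they are required when values repeat.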

Related

Oracle delete from tableA where a duplicate row is in tableB

As the title says, I am looking for a way to remove all rows from TableA where there is a matching row in TableB.
Tables A and B have about 30 columns each, so a WHERE A.col1 = B.col1 etc. would be a little problematic. Ideally I was hoping for something like
DELETE FROM tableA WHERE IN TableB
(overly simplified, but this type of thing)
An IN clause can compare all columns returned from a SELECT:
DELETE FROM tableA WHERE ( col1,col2,col3,.. ) IN ( select col1,col2,col3... FROM TableB );
The brute force way to establish if two records from each table are the same is to just compare every column:
DELETE
FROM tableA a
WHERE EXISTS (SELECT 1 FROM tableB b WHERE a.col1 = b.col1 AND a.col2 = b.col2 AND ...
a.col30 = b.col30);
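One thing to watch with column-by-column equality is NULL handling: NULL = NULL is not true in SQL, so a row that has a NULL in any compared column is never treated as a match. The sketch below shows the effect with Python's built-in sqlite3 module, used here only as a runnable stand-in for Oracle; the two-column tables and their data are invented for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE tableA (col1 INTEGER, col2 TEXT);
    CREATE TABLE tableB (col1 INTEGER, col2 TEXT);
    INSERT INTO tableA VALUES (1, 'x'), (2, NULL), (3, 'z');
    INSERT INTO tableB VALUES (1, 'x'), (2, NULL);
""")

# Delete every row of tableA that has an equal row in tableB.
con.execute("""
    DELETE FROM tableA
    WHERE EXISTS (SELECT 1 FROM tableB b
                  WHERE tableA.col1 = b.col1 AND tableA.col2 = b.col2)
""")

remaining = con.execute("SELECT col1, col2 FROM tableA ORDER BY col1").fetchall()
# Row (2, NULL) survives even though tableB holds an identical row,
# because the comparison NULL = NULL does not evaluate to true.
print(remaining)  # prints [(2, None), (3, 'z')]
```

If such rows should count as duplicates, each nullable column needs an explicit condition such as (a.col2 = b.col2 OR (a.col2 IS NULL AND b.col2 IS NULL)).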
You could create a function that checks the structures of the tables and, if they are the same, builds a string containing the conditions needed to compare them.
For example, here are two tables:
create table t1 (id, name, age) as (
select 1, 'Tom', 67 from dual union all
select 2, 'Tia', 42 from dual union all
select 3, 'Bob', 16 from dual );
create table t2 (id, name, age) as (
select 1, 'Tom', 51 from dual union all
select 3, 'Bob', 16 from dual );
Now use the function:
select generate_condition('T1', 'T2') from dual;
result:
T1.ID = T2.ID and T1.NAME = T2.NAME and T1.AGE = T2.AGE
Copy this, paste it into the delete query, and run it:
delete from t1 where exists (select 1 from t2 where <<PASTE_HERE>>)
Here is the function; adjust it if needed. I used user_tab_cols, so if the tables are in different schemas you need all_tab_cols and must compare owners too. On Oracle 11g Release 2 or later you can replace the loop with listagg(). The second table has to contain all columns of the first table, and they have to have the same type and length.
create or replace function generate_condition(i_t1 in varchar2, i_t2 in varchar2)
return varchar2 is
v varchar2(1000) := '';
begin
for rec in (select column_name, u2.column_id
from user_tab_cols u1
left join (select * from user_tab_cols where table_name = i_t2) u2
using (column_name, data_type, data_length)
where u1.table_name = i_t1 order by u1.column_id)
loop
if rec.column_id is null then
v := 'ERR: incompatible structures';
goto end_loop;
end if;
v := v||' and '||i_t1||'.'||rec.column_name
||' = '||i_t2||'.'||rec.column_name;
end loop;
<< end_loop >>
return(ltrim(v, ' and '));
end;
If you want to avoid running the process manually, you need dynamic PL/SQL.
create table tableA (a NUMBER, b VARCHAR2(5), c INTEGER);
create table tableB (a NUMBER, b VARCHAR2(5), c INTEGER);
As you said
WHERE A.col1 = B.col1 etc would be a little problematical
you could intersect the tables and mention all columns from tableA one time, like this:
delete tableA
where (a,b,c) in (select * from tableA
intersect
select * from tableB);

How to retrieve only columns which have at least one not null value in any row in Oracle

I have a table with the structure and data shown below:
https://ibb.co/mkGp67
I want a SQL query that retrieves data only for those columns which have at least one non-null value. In the case above, I want the result to be
https://ibb.co/mz9967
i.e. I don't need columns Col2, Col5 and Col6. Which columns contain only null values is not fixed.
Please let me know a SQL query that returns only the columns that have at least one non-null value, with data as above.
As far as I know, you cannot achieve this with a single SQL query. A strong assumption of SELECT statements is that the list of returned columns is static: it is defined by the query, not by the data. Even for PIVOT queries (available, as far as I know, since Oracle 11), the list of columns is defined in the query, because the values to be converted to columns have to be listed explicitly.
What you are looking for is some kind of code that generates the query dynamically. This can be PL/SQL returning a cursor reference, or any application code.
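As a rough illustration of the "application code" route, the sketch below first counts the non-NULL values per column and then builds a SELECT that lists only the columns whose count is above zero. It uses Python's built-in sqlite3 module as a stand-in so that it is runnable; the function name non_null_columns_query and the sample table are invented for this example, and against Oracle the column list would come from a dictionary view such as user_tab_columns instead of PRAGMA table_info.

```python
import sqlite3

def non_null_columns_query(con, table):
    """Build a SELECT over the columns that hold at least one non-NULL value.

    The table name is interpolated into SQL, so pass trusted names only.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in con.execute(f"PRAGMA table_info({table})")]
    if not columns:
        raise ValueError(f"unknown table: {table}")
    # COUNT(col) ignores NULLs, so a count of 0 means the column is all NULL.
    counts_sql = ", ".join(f"COUNT({c})" for c in columns)
    counts = con.execute(f"SELECT {counts_sql} FROM {table}").fetchone()
    keep = [c for c, n in zip(columns, counts) if n > 0]
    if not keep:
        return None  # empty table, or every column is entirely NULL
    return "SELECT " + ", ".join(keep) + f" FROM {table}"

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE my_table (col1 TEXT, col2 TEXT, col3 INTEGER);
    INSERT INTO my_table VALUES ('A', NULL, 10), ('B', NULL, NULL);
""")
query = non_null_columns_query(con, "my_table")
print(query)                          # prints SELECT col1, col3 FROM my_table
print(con.execute(query).fetchall())  # prints [('A', 10), ('B', None)]
```

The price is two round trips: one to inspect the data and one to run the generated statement.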
Edit:
What you can do with a query is get clear information on which columns contain NULLs and which do not. It could look something like this:
SELECT CASE
WHEN COUNT(*) = 0 THEN 'no rows'
WHEN COUNT(Col1) = 0 THEN 'all NULLs'
WHEN COUNT(Col1) = COUNT(*) THEN 'no NULLs'
ELSE 'some NULLs'
END Col1NullStatus,
CASE
WHEN COUNT(*) = 0 THEN 'no rows'
WHEN COUNT(Col2) = 0 THEN 'all NULLs'
WHEN COUNT(Col2) = COUNT(*) THEN 'no NULLs'
ELSE 'some NULLs'
END Col2NullStatus,
CASE
WHEN COUNT(*) = 0 THEN 'no rows'
WHEN COUNT(Col3) = 0 THEN 'all NULLs'
WHEN COUNT(Col3) = COUNT(*) THEN 'no NULLs'
ELSE 'some NULLs'
END Col3NullStatus,
CASE
WHEN COUNT(*) = 0 THEN 'no rows'
WHEN COUNT(Col4) = 0 THEN 'all NULLs'
WHEN COUNT(Col4) = COUNT(*) THEN 'no NULLs'
ELSE 'some NULLs'
END Col4NullStatus,
CASE
WHEN COUNT(*) = 0 THEN 'no rows'
WHEN COUNT(Col5) = 0 THEN 'all NULLs'
WHEN COUNT(Col5) = COUNT(*) THEN 'no NULLs'
ELSE 'some NULLs'
END Col5NullStatus,
CASE
WHEN COUNT(*) = 0 THEN 'no rows'
WHEN COUNT(Col6) = 0 THEN 'all NULLs'
WHEN COUNT(Col6) = COUNT(*) THEN 'no NULLs'
ELSE 'some NULLs'
END Col6NullStatus
FROM myTable
See SQL Fiddle for the above.
Edit 2:
And the output of this query would look something like this:
Col1NullStatus | Col2NullStatus | Col3NullStatus | Col4NullStatus | Col5NullStatus | Col6NullStatus
---------------+----------------+----------------+----------------+----------------+----------------
no NULLs | all NULLs | some NULLs | no NULLs | all NULLs | all NULLs
This is the format you could use to post your input data and expected results.
Since you give no formal table structure, and you seem to be mixing up numbers and chars, I will do my best to write a query that at least produces the results you want.
create table foo (
col1 varchar(10),
col2 varchar(10),
col3 varchar(10),
col4 varchar(10),
col5 varchar(10),
col6 varchar(10)
);
select
CASE WHEN col1 IS NULL THEN 'null' ELSE col1 END AS col1,
CASE WHEN col2 IS NULL THEN 'null' ELSE col2 END AS col2,
CASE WHEN col3 IS NULL THEN 'null' ELSE col3 END AS col3,
CASE WHEN col4 IS NULL THEN 'null' ELSE col4 END AS col4,
CASE WHEN col5 IS NULL THEN 'null' ELSE col5 END AS col5,
CASE WHEN col6 IS NULL THEN 'null' ELSE col6 END AS col6
from foo ;
With the query below, I was able to get the columns that are not entirely NULL (col1, col3 and col4), one value per row.
Query:
select 'col1' as "Name",col1 from temp
where exists (select 1
from temp
group by to_char(col1)
having (count(to_char(col1)))> 0)
union all
select 'col2' as "Name",to_char(col2) from temp
where exists (select 1
from temp
group by to_char(col2)
having (count(to_char(col2)))> 0)
union all
select 'col3' as "Name" , to_char(col3) from temp
where exists (select 1
from temp
group by to_char(col3)
having (count(to_char(col3)))> 0)
union all
select 'col4'as "Name" , to_char(col4) from temp
where exists (select 1
from temp
group by to_char(col4)
having (count(to_char(col4)))> 0)
union all
select 'col5' as "Name" , to_char(col5) from temp
where exists (select 1
from temp
group by to_char(col5)
having (count(to_char(col5)))> 0)
union all
select 'col6' as "Name" , to_char(col6) from temp
where exists (select 1
from temp
group by to_char(col6)
having (count(to_char(col6)))> 0)
output:
col1 A
col1 B
col1 C
col1 D
col3 10
col3 20
col3 -
col3 10
col4 12
col4 23
col4 34
col4 43
I tried to turn these rows into columns, but I couldn't do it in a single query. I hope this is helpful.
I would do this usually in three steps.
Firstly, make sure that the table statistics are up to date. Check if last_analyzed is later than the last change to the table.
SELECT last_analyzed FROM user_tables WHERE table_name = 'MYTABLE';
If in doubt, update the statistics with
BEGIN dbms_stats.gather_table_stats('MYSCHEMA','MYTABLE'); END;
/
Now, the view user_tab_columns has a column num_nulls. This is the number of rows where this column is NULL. If the value equals the number of rows in the table, the column is NULL in every row. This can be used to let Oracle generate the required SQL:
WITH
qtab AS (SELECT owner, table_name, num_rows
FROM all_tables
WHERE owner='SCOTT' -- change to your schema
AND table_name='EMPLOYEES' -- change to your table name
),
qcol AS (SELECT owner, table_name, column_name, column_id
FROM qtab t
JOIN all_tab_columns c USING (owner, table_name)
WHERE c.nullable = 'N' -- protected by NOT NULL constraint
OR c.num_nulls = 0 -- never NULL
OR c.num_nulls < t.num_rows -- at least 1 row is NOT NULL
)
SELECT 'SELECT '||LISTAGG(column_name,',') WITHIN GROUP (ORDER BY column_id)||
' FROM '||owner||'.'||table_name||';' AS my_query
FROM qcol
GROUP BY owner, table_name;
This will output a query like
SELECT col1,col3,col4 FROM myschema.mytable;
This query can now be executed to show the column values.

How to use 'EXIST' in a simple oracle query

I have a table called 'MainTable' with the following data
Another table called 'ChildTable' has the following data (foreign key Number)
Now I want to fetch records from 'ChildTable' if at least one record with status 'S' exists for the number.
But if any other record for this number is 'R', then I don't want to fetch it.
Something like this:
I tried the following
Select m.Number, c.Status from MainTable m, ChildTable c
where EXISTS (SELECT NULL
FROM ChildTable c2
WHERE c2.status = 'S' and c2.status <> 'R'
AND c2.number = m.number)
But here I am also getting records with 'R' status. What am I doing wrong?
You can try something like this
select num, status
from
(select id, num, status,
sum(decode(status, 'R', 1, 0)) over (partition by num) Rs,
sum(decode(status, 'S', 1, 0)) over (partition by num) Ss
from child_table) t
where t.Rs = 0 and t.Ss >= 1
-- and status = 'S'
Here is a sqlfiddle demo
The child records with 'R' might be associated with a maintable record that also has another child record with status 'S' -- that is what your query is asking for.
Select
m.Number,
c.Status
from MainTable m
join ChildTable c on c.number = m.number
where EXISTS (
SELECT NULL
FROM ChildTable c2
WHERE c2.status = 'S'
AND c2.number = m.number) and
NOT EXISTS (
SELECT NULL
FROM ChildTable c2
WHERE c2.status = 'R'
AND c2.number = m.number)
WITH ChildrenWithS AS (
SELECT Number
FROM ChildTable
WHERE Status = 'S'
)
,ChildrenWithR AS (
SELECT Number
FROM ChildTable
WHERE Status = 'R'
)
SELECT MainTable.Number
,ChildTable.Status
FROM MainTable
INNER JOIN ChildTable
ON MainTable.Number = ChildTable.Number
WHERE MainTable.Number IN (SELECT Number FROM ChildrenWithS)
AND MainTable.Number NOT IN (SELECT Number FROM ChildrenWithR)
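To see the EXISTS / NOT EXISTS combination on concrete rows, here is a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle, and because the question's sample data was posted as images, the rows below are invented for the illustration (column num is used instead of Number, which is a reserved word in Oracle).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE child_table (id INTEGER, num INTEGER, status TEXT);
    INSERT INTO child_table VALUES
        (1, 100, 'S'), (2, 100, 'R'),   -- has an 'R', so 100 is excluded
        (3, 200, 'S'), (4, 200, 'P'),   -- has an 'S' and no 'R', so 200 is kept
        (5, 300, 'P');                  -- has no 'S', so 300 is excluded
""")

rows = con.execute("""
    SELECT c.num, c.status
    FROM child_table c
    WHERE EXISTS (SELECT 1 FROM child_table s
                  WHERE s.num = c.num AND s.status = 'S')
      AND NOT EXISTS (SELECT 1 FROM child_table r
                      WHERE r.num = c.num AND r.status = 'R')
    ORDER BY c.id
""").fetchall()
print(rows)  # prints [(200, 'S'), (200, 'P')]
```

The two conditions have to live in separate subqueries: a single row can never satisfy status = 'S' and status <> 'R' in a way that says anything about the other rows for the same number.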

How to compare items in an array to those in a database column using regular expressions?

I'm trying to take a list of elements in an array like this:
['GRADE', 'GRATE', 'GRAPE', /*About 1000 other entries here ...*/ ]
and match them to their occurrences in a column in an Oracle database full of entries like this:
1|'ANTERIOR'
2|'ANTEROGRADE'
3|'INGRATE'
4|'RETROGRADE'
5|'REIGN'
...|...
/*About 1,000,000 other entries here*/
For each entry in that array of G words, I'd like to loop through the word column of the Oracle database and try to find the right-sided matches for each entry in the array. In this example, entries 2, 3, and 4 in the database would all match.
In any other programming language, it would look something like this:
for entry in array:
    for each in column:
        if entry.right_match(each):
            print entry
How do I do this in PL/SQL?
In PL/SQL it can be done in this way:
declare
SUBTYPE my_varchar2_t IS varchar2( 100 );
TYPE Roster IS TABLE OF my_varchar2_t;
names Roster := Roster( 'GRADE', 'GRATE', 'GRAPE');
begin
FOR c IN ( SELECT id, name FROM my_table )
LOOP
FOR i IN names.FIRST .. names.LAST LOOP
IF regexp_like( c.name, names( i ) || '$' ) THEN -- '$' anchors the match to the end of the name
DBMS_OUTPUT.PUT_LINE( c.id || ' ' || c.name );
END IF;
END LOOP;
END LOOP;
end;
/
but this is row-by-row processing, which would be very slow for a large table.
I think it might be better to do it the way shown below:
create table test123 as
select 1 id ,'ANTERIOR' name from dual union all
select 2,'ANTEROGRADE' from dual union all
select 3,'INGRATE' from dual union all
select 4,'RETROGRADE' from dual union all
select 5,'REIGN' from dual ;
create type my_table_typ is table of varchar2( 100 );
/
select *
from table( my_table_typ( 'GRADE', 'GRATE', 'GRAPE' )) x
join test123 y on regexp_like( y.name, x.column_value || '$' )
;
COLUMN_VALUE ID NAME
------------- ---------- -----------
GRADE 2 ANTEROGRADE
GRATE 3 INGRATE
GRADE 4 RETROGRADE
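A right-sided match can also be written without regular expressions by comparing the tail of each name with the word, which avoids surprises if a word contains regex metacharacters. The sketch below is runnable with Python's built-in sqlite3 module, used only as a stand-in for Oracle; the words table and the extra row 'GRADED' are additions made for this illustration. Oracle's SUBSTR also accepts a negative start position, so a similar join condition should be possible there.

```python
import sqlite3

words = ["GRADE", "GRATE", "GRAPE"]

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE test123 (id INTEGER, name TEXT);
    INSERT INTO test123 VALUES
        (1, 'ANTERIOR'), (2, 'ANTEROGRADE'), (3, 'INGRATE'),
        (4, 'RETROGRADE'), (5, 'REIGN'), (6, 'GRADED');
    CREATE TABLE words (word TEXT);
""")
con.executemany("INSERT INTO words VALUES (?)", [(w,) for w in words])

# substr(name, -n) returns the last n characters, so the join keeps
# only names that END with the word. 'GRADED' contains 'GRADE' but
# does not end with it, so it is not matched.
rows = con.execute("""
    SELECT w.word, t.id, t.name
    FROM words w
    JOIN test123 t
      ON substr(t.name, -length(w.word)) = w.word
    ORDER BY t.id
""").fetchall()
for row in rows:
    print(row)
# prints:
# ('GRADE', 2, 'ANTEROGRADE')
# ('GRATE', 3, 'INGRATE')
# ('GRADE', 4, 'RETROGRADE')
```

Either way, the database still has to test every word against every name, so with about 1000 words and 1,000,000 rows this remains an expensive join.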

how to make selecting random rows in oracle faster with table with millions of rows

Is there a way to make selecting random rows faster in Oracle with a table that has millions of rows? I tried to use sample(x) and dbms_random.value, and it's taking a long time to run.
Thanks!
Using an appropriate value for sample(x) is the fastest way to do this. It is block-random and row-random within blocks, so if you only want one random row:
select dbms_rowid.rowid_relative_fno(rowid) as fileno,
dbms_rowid.rowid_block_number(rowid) as blockno,
dbms_rowid.rowid_row_number(rowid) as offset
from (select rowid from [my_big_table] sample (.01))
where rownum = 1
I'm using a subpartitioned table, and I'm getting pretty good randomness even grabbing multiple rows:
select dbms_rowid.rowid_relative_fno(rowid) as fileno,
dbms_rowid.rowid_block_number(rowid) as blockno,
dbms_rowid.rowid_row_number(rowid) as offset
from (select rowid from [my_big_table] sample (.01))
where rownum <= 5
FILENO BLOCKNO OFFSET
---------- ---------- ----------
152 2454936 11
152 2463140 32
152 2335208 2
152 2429207 23
152 2746125 28
I suspect you should probably tune your SAMPLE clause to use an appropriate sample size for what you're fetching.
Start with Adam's answer first, but if SAMPLE just isn't fast enough, even with the ROWNUM optimization, you can use block samples:
....FROM [table] SAMPLE BLOCK (0.01)
This applies the sampling at the block level instead of for each row. This does mean that it can skip large swathes of data from the table so the sample percent will be very rough. It's not unusual for a SAMPLE BLOCK with a low percentage to return zero rows.
Here's the same question on AskTom:
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6075151195522
If you know how big your table is, use sample block as described above. If you don't, you can modify the routine below to get however many rows you want.
Copied from: http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6075151195522#56174726207861
create or replace function get_random_rowid
( table_name varchar2
) return urowid
as
sql_v varchar2(100);
urowid_t dbms_sql.urowid_table;
cursor_v integer;
status_v integer;
rows_v integer;
begin
for exp_v in -6..2 loop
exit when (urowid_t.count > 0);
if (exp_v < 2) then
sql_v := 'select rowid from ' || table_name
|| ' sample block (' || power(10, exp_v) || ')';
else
sql_v := 'select rowid from ' || table_name;
end if;
cursor_v := dbms_sql.open_cursor;
dbms_sql.parse(cursor_v, sql_v, dbms_sql.native);
dbms_sql.define_array(cursor_v, 1, urowid_t, 100, 0);
status_v := dbms_sql.execute(cursor_v);
loop
rows_v := dbms_sql.fetch_rows(cursor_v);
dbms_sql.column_value(cursor_v, 1, urowid_t);
exit when rows_v != 100;
end loop;
dbms_sql.close_cursor(cursor_v);
end loop;
if (urowid_t.count > 0) then
return urowid_t(trunc(dbms_random.value(0, urowid_t.count)));
end if;
return null;
exception when others then
if (dbms_sql.is_open(cursor_v)) then
dbms_sql.close_cursor(cursor_v);
end if;
raise;
end;
/
show errors
The solution below is not an exact answer to this question, but in many scenarios you select a row, use it for some purpose, and then update its status to "used" or "done" so that you do not select it again.
Solution:
The query below works, but when I tried it on a large table it had a clear performance problem.
SELECT * FROM
( SELECT * FROM table
ORDER BY dbms_random.value )
WHERE rownum = 1
If you restrict rownum as shown below, you can work around the performance problem. Raising the rownum limit reduces the chance of two sessions picking the same row. Note that in this case you always draw from the same pool of about 1000 rows. If you take a row from that pool and update its status to "USED", you will almost always get a different row the next time you query for "ACTIVE".
SELECT * FROM
( SELECT * FROM table
where rownum < 1000
and status = 'ACTIVE'
ORDER BY dbms_random.value )
WHERE rownum = 1
Update the row's status after selecting it. If the update does not succeed, another transaction has already used the row, so you should fetch a new row and update its status instead. Incidentally, the chance of two transactions picking the same row is about 0.001, since the pool holds roughly 1000 rows.
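The select-then-mark pattern can be sketched as below. It is a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for Oracle; the jobs table, the claim_random_row function and the retry count are invented for the example, and random() / LIMIT replace dbms_random.value / rownum.

```python
import sqlite3

def claim_random_row(con, attempts=5):
    """Pick a random ACTIVE row and mark it USED; return its id, or None."""
    for _ in range(attempts):
        row = con.execute(
            "SELECT id FROM jobs WHERE status = 'ACTIVE' "
            "ORDER BY random() LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to claim
        # Repeating the status check in the UPDATE makes the claim safe:
        # if another session marked the row first, no row is updated.
        cur = con.execute(
            "UPDATE jobs SET status = 'USED' WHERE id = ? AND status = 'ACTIVE'",
            (row[0],),
        )
        con.commit()
        if cur.rowcount == 1:
            return row[0]
        # Lost the race for this row; loop and pick another one.
    return None

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
con.executemany("INSERT INTO jobs VALUES (?, 'ACTIVE')", [(i,) for i in range(1, 6)])

claimed = [claim_random_row(con) for _ in range(6)]
print(sorted(c for c in claimed if c is not None))  # prints [1, 2, 3, 4, 5]
print(claimed[-1])                                  # prints None
```

Checking the number of updated rows is what tells the caller whether it actually won the row, so no row is handed out twice even when two sessions select the same candidate.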
Someone said that sample(x) is the fastest way.
But for me the method below works slightly faster than sample(x).
It should take a fraction of a second (0.2 s in my case) regardless of the size of the table. If it takes longer, hints (--+ leading(e) use_nl(e t) rowid(t)) may help.
SELECT *
FROM My_User.My_Table
WHERE ROWID = (SELECT MAX(t.ROWID) KEEP(DENSE_RANK FIRST ORDER BY dbms_random.value)
FROM (SELECT o.Data_Object_Id,
e.Relative_Fno,
e.Block_Id + TRUNC(Dbms_Random.Value(0, e.Blocks)) AS Block_Id
FROM Dba_Extents e
JOIN Dba_Objects o ON o.Owner = e.Owner AND o.Object_Type = e.Segment_Type AND o.Object_Name = e.Segment_Name
WHERE e.Segment_Name = 'MY_TABLE'
AND(e.Segment_Type, e.Owner, e.Extent_Id) =
(SELECT MAX(e.Segment_Type) AS Segment_Type,
MAX(e.Owner) AS Owner,
MAX(e.Extent_Id) KEEP(DENSE_RANK FIRST ORDER BY Dbms_Random.Value) AS Extent_Id
FROM Dba_Extents e
WHERE e.Segment_Name = 'MY_TABLE'
AND e.Owner = 'MY_USER'
AND e.Segment_Type = 'TABLE')) e
JOIN My_User.My_Table t
ON t.Rowid BETWEEN Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 0)
AND Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 32767))
Version with retries when no rows returned:
WITH gen AS ((SELECT --+ inline leading(e) use_nl(e t) rowid(t)
MAX(t.ROWID) KEEP(DENSE_RANK FIRST ORDER BY dbms_random.value) Row_Id
FROM (SELECT o.Data_Object_Id,
e.Relative_Fno,
e.Block_Id + TRUNC(Dbms_Random.Value(0, e.Blocks)) AS Block_Id
FROM Dba_Extents e
JOIN Dba_Objects o ON o.Owner = e.Owner AND o.Object_Type = e.Segment_Type AND o.Object_Name = e.Segment_Name
WHERE e.Segment_Name = 'MY_TABLE'
AND(e.Segment_Type, e.Owner, e.Extent_Id) =
(SELECT MAX(e.Segment_Type) AS Segment_Type,
MAX(e.Owner) AS Owner,
MAX(e.Extent_Id) KEEP(DENSE_RANK FIRST ORDER BY Dbms_Random.Value) AS Extent_Id
FROM Dba_Extents e
WHERE e.Segment_Name = 'MY_TABLE'
AND e.Owner = 'MY_USER'
AND e.Segment_Type = 'TABLE')) e
JOIN MY_USER.MY_TABLE t ON t.ROWID BETWEEN Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 0)
AND Dbms_Rowid.Rowid_Create(1, Data_Object_Id, Relative_Fno, Block_Id, 32767))),
Retries(Cnt, Row_Id) AS (SELECT 1, gen.Row_Id
FROM Dual
LEFT JOIN gen ON 1=1
UNION ALL
SELECT Cnt + 1, gen.Row_Id
FROM Retries
LEFT JOIN gen ON 1=1
WHERE Retries.Row_Id IS NULL AND Retries.Cnt < 10)
SELECT *
FROM MY_USER.MY_TABLE
WHERE ROWID = (SELECT Row_Id
FROM Retries
WHERE Row_Id IS NOT NULL)
Can you use pseudorandom rows?
select * from (
select * from ... where... order by ora_hash(rowid)
) where rownum<100
