I would like to query an Oracle DB table for the number of rows containing each distinct value in a CLOB column.
This returns all rows containing a value:
select * from mytable where dbms_lob.instr(mycol,'value') > 0;
Using DBMS_LOB, this returns the number of rows containing that value:
select count(*) from mytable where dbms_lob.instr(mycol,'value') > 0;
But is it possible to query for the number of times (rows in which) each distinct value appears?
Depending on what that column really contains, see whether TO_CHAR helps.
SQL> create table mytable (mycol clob);
Table created.
SQL> insert into mytable
2 select 'Query to count distinct values' from dual union all
3 select 'I have no idea which values are popular' from dual;
2 rows created.
SQL> select count(*), to_char(mycol) toc
2 from mytable
3 where dbms_lob.instr(mycol,'value') > 0
4 group by to_char(mycol);
COUNT(*) TOC
---------- ----------------------------------------
1 Query to count distinct values
1 I have no idea which values are popular
SQL>
If your CLOB values are more than 4000 bytes (and if not, why are they CLOBs?) then it's not perfect - collisions are possible, if unlikely - but you could hash the CLOB values.
If you want to count the number of distinct values:
select count(distinct dbms_crypto.hash(src=>mycol, typ=>2))
from mytable
where dbms_lob.instr(mycol,'value') > 0;
If you want to count how many times each distinct value appears:
select mycol, cnt
from (
select mycol,
count(*) over (partition by dbms_crypto.hash(src=>mycol, typ=>2)) as cnt,
row_number() over (partition by dbms_crypto.hash(src=>mycol, typ=>2) order by null) as rn
from mytable
where dbms_lob.instr(mycol,'value') > 0
)
where rn = 1;
Both are likely to be fairly expensive and slow with a lot of data.
(typ=>2 gives the numeric value for dbms_crypto.hash_md5, as you can't refer to the package constant in a SQL call, at least up to 12cR1...)
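If you want to check which number goes with which algorithm on your own database, an anonymous PL/SQL block can print the package constants, since PL/SQL (unlike the SQL call above) can refer to them by name. This is only a sketch: it assumes you have EXECUTE on DBMS_CRYPTO and that your client shows server output (the set serveroutput line is a SQL*Plus command).

```sql
set serveroutput on

begin
  -- Package constants are visible from PL/SQL, so print their numeric values
  -- and use the number in the typ=> argument of the SQL queries.
  dbms_output.put_line('hash_md4 = ' || dbms_crypto.hash_md4);
  dbms_output.put_line('hash_md5 = ' || dbms_crypto.hash_md5);
  dbms_output.put_line('hash_sh1 = ' || dbms_crypto.hash_sh1);
end;
/
```

The line for hash_md5 should show the same 2 that the queries above pass as typ.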
Rather more crudely, but possibly significantly quicker, you could base the count on just the first 4000 characters, which may or may not be plausible for your actual data:
select count(distinct dbms_lob.substr(mycol, 4000, 1))
from mytable
where dbms_lob.instr(mycol,'value') > 0;
select dbms_lob.substr(mycol, 4000, 1), count(*)
from mytable
where dbms_lob.instr(mycol,'value') > 0
group by dbms_lob.substr(mycol, 4000, 1);
Oracle cannot apply DISTINCT or GROUP BY directly to CLOB values. But if you have access to the DBMS_CRYPTO.HASH function, you can compare hashes of the CLOBs instead and get the desired output:
select myCol, h.num from
myTable t join
(select min(rowid) rid, count(rowid) num
from myTable
where dbms_lob.instr(mycol,'value') > 0
group by DBMS_CRYPTO.HASH(myCol, 3)) h
on t.rowid = h.rid;
Also note that there is a very small possibility of a hash collision. If that is acceptable to you, you can use this approach.
I have a function that returns the greatest of three dates read from my tables.
create or replace FUNCTION fn_max_date_val(
pi_user_id IN number)
RETURN DATE
IS
l_modified_dt DATE;
l_mod1_dt DATE;
l_mod2_dt DATE;
ret_user_id DATE;
BEGIN
SELECT MAX(last_modified_dt)
INTO l_modified_dt
FROM table1
WHERE id = pi_user_id;
-- this table contains a million records
SELECT nvl(MAX(last_modified_ts),sysdate-90)
INTO l_mod1_dt
FROM table2
WHERE table2_id=pi_user_id;
-- this table contains clob data, 800 000 records; table 3 does not have user_id, so it has to be fetched from table 2, as shown below
SELECT nvl(MAX(last_modified_dt),sysdate-90)
INTO l_mod2_dt
FROM table3
WHERE table2_id IN
(SELECT id FROM table2 WHERE table2_id=pi_user_id
);
execute immediate 'select greatest('''||l_modified_dt||''','''||l_mod1_dt||''','''||l_mod2_dt||''') from dual' into ret_user_id;
RETURN ret_user_id;
EXCEPTION
WHEN OTHERS THEN
return SYSDATE;
END;
This function works perfectly fine and executes within a second.
-- random user_id , just to test the functionality
SELECT fn_max_date_val(100) as max_date FROM DUAL
MAX_DATE
--------
27-02-14
For reference purposes I have used the table names table1, table2 and table3, but my business case is similar to what I describe below.
I need to get the details from table1 along with the highest modified date among the three tables.
I did something like this:
SELECT a.id,a.name,a.value,fn_max_date_val(id) as max_date
FROM table1 a where status_id ='Active';
The above query executes perfectly fine and returns results in milliseconds. But the problem came when I tried to use ORDER BY.
SELECT a.id,a.name,a.value,a.status_id,a.last_modified_dt,fn_max_date_val(a.id) as max_date
FROM table1 a where a.status_id ='Active'
order by a.status_id desc,a.last_modified_dt desc ;
-- It took almost 300 seconds to complete
I also tried adding an index on status_id and last_modified_dt, but no luck. Can this be done in a better way?
How about if your query is like this?
select a.*, fn_max_date_val(a.id) as max_date
from
(SELECT t.id,t.name,t.value,t.status_id,t.last_modified_dt
FROM table1 t where t.status_id ='Active'
order by t.status_id desc,t.last_modified_dt desc) a;
What if you don't use the function and do something like this:
SELECT a.id,a.name,a.value,a.status_id,last_modified_dt x.max_date
FROM table1 a
(
select max(max_date) as max_date
from (
SELECT MAX(last_modified_dt) as max_date
FROM table1 t1
WHERE t1.id = a.id
union
SELECT nvl(MAX(last_modified_ts),sysdate-90) as max_date
FROM table2 t2
WHERE t2.table2_id=a.id
...
) y
) x
where a.status_id ='Active'
order by status_id desc,last_modified_dt desc;
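The syntax above is not exact (for example, the derived table still has to be joined to table1 and a comma is missing in the select list), but the idea is something like that, with the third table added to the derived table too.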
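One way to write the function-free idea out in full is with scalar subqueries and GREATEST. This is only a sketch: it assumes the table and column names used in the question's function (table1.id, table2.table2_id, table2.id, table3.table2_id) and keeps the same 90-day fallback. The three values are compared as dates, whereas the dynamic SQL in the function concatenates them into string literals, so that comparison depends on the session's date format.

```sql
SELECT a.id, a.name, a.value, a.status_id, a.last_modified_dt,
       GREATEST(
           -- latest change in table1 for this id
           (SELECT MAX(t1.last_modified_dt)
              FROM table1 t1
             WHERE t1.id = a.id),
           -- latest change in table2, falling back to 90 days ago
           (SELECT NVL(MAX(t2.last_modified_ts), SYSDATE - 90)
              FROM table2 t2
             WHERE t2.table2_id = a.id),
           -- latest change in table3, reached through table2
           (SELECT NVL(MAX(t3.last_modified_dt), SYSDATE - 90)
              FROM table3 t3
              JOIN table2 t2 ON t3.table2_id = t2.id
             WHERE t2.table2_id = a.id)
       ) AS max_date
  FROM table1 a
 WHERE a.status_id = 'Active'
 ORDER BY a.status_id DESC, a.last_modified_dt DESC;
```

Note that GREATEST returns NULL if any argument is NULL, which can happen here if table1.last_modified_dt is NULL for a row. Whether this is faster than the function depends on your indexes, so compare the execution plans before relying on it.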
I created a dummy database for learning purposes, and I purposely created some duplicated records in one of the tables. For each pair of duplicates I want to flag one record as Latest='Y' and the other as 'N', and for every record that has no duplicate the Latest flag should be 'Y'.
I tried to use PL/SQL to go through all of my records, but when I try to use the previously calculated value (which tells me that it is a duplicated record) I get this error:
ORA-06550: line 20, column 17:
PLS-00201: identifier 'COUNTER' must be declared
Here is the statement I try to use:
DECLARE
CURSOR cur
IS
SELECT order_id, order_date, person_id,
amount, successfull_order, country_id, latest, ROWCOUNT AS COUNTER
FROM (SELECT order_id,
order_date,
person_id,
amount,
successfull_order,
country_id,
latest,
ROW_NUMBER () OVER (PARTITION BY order_id, order_date,
person_id, amount, successfull_order, country_id
ORDER BY order_id, order_date,
person_id, amount, successfull_order, country_id) ROWCOUNT
FROM orders) orders
FOR UPDATE OF orders.latest;
rec cur%ROWTYPE;
BEGIN
FOR rec IN cur
LOOP
IF MOD (COUNTER, 2) = 0
THEN
UPDATE orders
SET latest = 'N'
WHERE CURRENT OF cur;
ELSE
UPDATE orders
SET latest = 'Y'
WHERE CURRENT OF cur;
END IF;
END LOOP;
END;
I am new to PL/SQL, so I tried to modify the statements I found here:
http://www.adp-gmbh.ch/ora/plsql/cursors/for_update.html
What should I change in my statement, or should I use a different approach?
Thanks for your answers in advance!
Botond
You alias ROWCOUNT as COUNTER in your cursor.
While fetching, you should access it through the cursor record, like MOD (rec.COUNTER, 2).
You need to declare the variable COUNTER and then you need to maintain (i.e. increment) it in your loop.
I suspect that your example is just for learning PL/SQL. However, be aware that it is often much more performant to do things with a single SQL statement than with cursor loops.
Your issue is that COUNTER is an attribute of the cursor record rec and not a PL/SQL variable. So:
IF MOD (COUNTER, 2) = 0
Should be:
IF MOD (rec.COUNTER, 2) = 0
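A minimal, self-contained sketch of the same pattern, using a made-up cursor over DUAL instead of your orders table, shows how the aliased column is reached through the loop record:

```sql
set serveroutput on

DECLARE
  -- Hypothetical cursor for illustration: four rows with a COUNTER column.
  CURSOR cur IS
    SELECT LEVEL AS counter FROM dual CONNECT BY LEVEL <= 4;
BEGIN
  -- The cursor FOR loop declares rec implicitly.
  FOR rec IN cur LOOP
    -- COUNTER is a field of the record, so it must be qualified with rec.
    IF MOD(rec.counter, 2) = 0 THEN
      DBMS_OUTPUT.PUT_LINE(rec.counter || ': even');
    ELSE
      DBMS_OUTPUT.PUT_LINE(rec.counter || ': odd');
    END IF;
  END LOOP;
END;
/
```

Because the FOR loop declares its own record, the separate rec cur%ROWTYPE declaration in your block is not needed.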
However, you do not need to use PL/SQL or cursors; it can be done in a single MERGE statement:
Oracle Setup:
CREATE TABLE orders ( order_id, order_date, latest ) AS
SELECT 1, DATE '2017-01-01', CAST( NULL AS CHAR(1) ) FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-02', NULL FROM DUAL UNION ALL
SELECT 1, DATE '2017-01-03', NULL FROM DUAL UNION ALL
SELECT 2, DATE '2017-01-04', NULL FROM DUAL UNION ALL
SELECT 2, DATE '2017-01-01', NULL FROM DUAL UNION ALL
SELECT 3, DATE '2017-01-06', NULL FROM DUAL;
Update Statement:
MERGE INTO orders dst
USING ( SELECT ROW_NUMBER() OVER ( PARTITION BY order_id
ORDER BY order_date DESC ) AS rn
FROM orders
) src
ON ( src.ROWID = dst.ROWID )
WHEN MATCHED THEN
UPDATE SET latest = CASE src.rn WHEN 1 THEN 'Y' ELSE 'N' END;
Output:
SELECT * FROM orders;
ORDER_ID ORDER_DATE LATEST
-------- ---------- ------
1 2017-01-01 N
1 2017-01-02 N
1 2017-01-03 Y
2 2017-01-04 Y
2 2017-01-01 N
3 2017-01-06 Y
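As a variation on the same idea, a plain UPDATE with a correlated subquery can set the flag as well. This is a sketch against the sample orders table above, not a drop-in replacement: unlike the ROW_NUMBER version, it marks every row that shares the latest order_date as 'Y' when there are ties.

```sql
UPDATE orders o
   SET latest = CASE
                  -- 'Y' when this row carries the newest date for its order_id
                  WHEN o.order_date = ( SELECT MAX( i.order_date )
                                          FROM orders i
                                         WHERE i.order_id = o.order_id )
                  THEN 'Y'
                  ELSE 'N'
                END;
```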
I just recently found out that subqueries are not allowed in INSERT statements that are inside stored procedures. This is my script:
begin
execute immediate 'truncate table itcustadm.GL_DTPJ_TEST2';
insert into GL_DTPJ_TEST2
(rule_no,
posted_by_user_id,
transaction_id,
transaction_sr_no,
dr_amount,
cr_amount,
tran_crncy_code,
bkdt_tran_flg,
bank_desc
)
select
tq.rule_no,
tq.posted_by_user_id,
tq.transaction_id,
tq.transaction_sr_no,
tq.dr_amount,
tq.cr_amount,
tq.tran_crncy_code,
tq.bkdt_tran_flg,
(select ent.bank_desc from crmuser.end ent where ent.bank_id = gam.bank_id);
But since the (select ent.bank_desc from crmuser.end ent where ent.bank_id = gam.bank_id) at the bottom of the SELECT statement is not allowed by Oracle, what's the best way to accomplish this?
I actually have this code right before the INSERT statement, but I don't know exactly how to use it:
get_bank_desc := '(select ent.bank_desc from crmuser.end ent ' ||
'where ent.bank_id = gam.bank_id)';
I am not sure exactly what you are trying to do, but the code below may be useful. You can insert the output of a subquery into a table as in the sample below, but make sure the subquery returns a single row, so that you avoid the "ORA-01427: single-row subquery returns more than one row" error.
insert into test_ins1
values(1,(SELECT COL2 FROM TEST_INS WHERE COL1=1 ));
If the subquery can return several rows, you can add a ROWNUM condition to its WHERE clause to take a single value.
Please let me know if anything is unclear.
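For instance, the ROWNUM suggestion might look like the sketch below. The table definitions are assumptions made up for the example (the real test_ins and test_ins1 columns are not shown here), and which of the matching rows gets picked is arbitrary.

```sql
-- Hypothetical table definitions, only so the example is self-contained.
create table test_ins  (col1 number, col2 varchar2(10));
create table test_ins1 (id number, col2 varchar2(10));

-- Two rows match col1 = 1, so an unrestricted subquery would raise ORA-01427.
insert into test_ins values (1, 'first');
insert into test_ins values (1, 'second');

-- ROWNUM = 1 limits the subquery to one (arbitrary) matching row.
insert into test_ins1
values (1, (select col2 from test_ins where col1 = 1 and rownum = 1));
```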
declare
bank_desc_temp bank_desk_type; /* the type defined in crmuser.ent for bank_desc*/
begin
select ent.bank_desc into bank_desc_temp from crmuser.end ent where ent.bank_id = gam.bank_id;
execute immediate 'truncate table itcustadm.GL_DTPJ_TEST2';
insert into GL_DTPJ_TEST2
(rule_no,
posted_by_user_id,
transaction_id,
transaction_sr_no,
dr_amount,
cr_amount,
tran_crncy_code,
bkdt_tran_flg,
bank_desc
)
select
tq.rule_no,
tq.posted_by_user_id,
tq.transaction_id,
tq.transaction_sr_no,
tq.dr_amount,
tq.cr_amount,
tq.tran_crncy_code,
tq.bkdt_tran_flg,
bank_desc_temp;
end;
When you say "not allowed" what do you mean? Did you get an error?
I ask, because subqueries are definitely allowed inside an insert as select statement, providing you have the syntax correct (and the subquery returns at most one row), e.g.:
create table test_tab (col1 number, col2 varchar2(10));
begin
insert into test_tab
select 1,
(select 'Yes' from dual d2 where d.dummy = d2.dummy)
from dual d;
commit;
end;
/
select * from test_tab;
COL1 COL2
---------- ----------
1 Yes
There are some syntax issues with the code you provided - where is the from clause, and where are the tq and gam aliases defined?
There are two syntaxes you can use in your insert statement:
(I)
INSERT INTO table_name( column1, column2....columnN)
VALUES ( value1, value2....valueN);
(II)
INSERT INTO table (column1, column2, ... )
SELECT expression1, expression2, ...
FROM source_table(s)
WHERE conditions;
In your example, you should choose the second approach:
insert into GL_DTPJ_TEST2 (rule_no,
posted_by_user_id,
transaction_id,
transaction_sr_no,
dr_amount,
cr_amount,
tran_crncy_code,
bkdt_tran_flg,
bank_desc
)
select tq.rule_no,
tq.posted_by_user_id,
tq.transaction_id,
tq.transaction_sr_no,
tq.dr_amount,
tq.cr_amount,
tq.tran_crncy_code,
tq.bkdt_tran_flg,
ent.bank_desc
from crmuser.gam
join crmuser.end ent
on ent.bank_id = gam.bank_id
;
Basically, if you want to add records using an insert statement, you should get the full select statement working first. Here is how I would build it up:
(1)
select *
from table1;
(2)
select column1
,column2
,column3
from table1;
(3)
select t1.column1
,t1.column2
,t1.column3
,t2.column4
,t2.column5
from table1 t1
join table2 t2
on t2.id = t1.id
;
(4)
insert into table3 (col1
,col2
,col3
,col4
,col5)
select t1.column1
,t1.column2
,t1.column3
,t2.column4
,t2.column5
from table1 t1
join table2 t2
on t2.id = t1.id
;
Wondering if someone can help point me in the right direction with this challenge, or tell me I'm crazy for trying this via sql. If sql would be too challenging, are there any free or inexpensive tools that would help me automate this?
I'm working on testing some data between an old and new Oracle database. What I'd like to do is be able to dynamically generate this query for all tables in a schema.
Select Column_1, Column_2 FROM Table_1
MINUS
Select Column_1, Column_2 FROM Table_1@"OLD_SERVER"
One catch is that the columns selected for each table should only be columns that do not begin with 'ETL' since those are expected to change with the migration.
To keep this dynamic, can I use the all_tab_columns to loop through each table?
So for a simplified example, let's say this query returned the following results, and you can expect the results from ALL_TAB_COLUMNS to be identical between the OLD and NEW database:
select TABLE_NAME, COLUMN_NAME from ALL_TAB_COLUMNS where owner = 'OWNER1'
TABLE_NAME, COLUMN_NAME
-----------------------
TABLE1, COLUMN_1
TABLE1, COLUMN_2
TABLE1, ETLCOLUMN_3
TABLE2, COLUMN_A
TABLE2, COLUMN_B
TABLE2, ETLCOLUMN_C
How would I write a query that would run a minus between the same table and columns (that do not begin with ETL) on the old and new database, and output the results along with the table name and the date ran, and then loop through to the next table and do the same thing?
First - check out this:
http://docs.oracle.com/cd/E11882_01/server.112/e41481/spa_upgrade.htm#RATUG210
Second - you want to write a query that generates another query. The problem is that in user_tab_columns each column is a row.
To turn those rows into a single column list, I would recommend reading this: http://www.dba-oracle.com/t_converting_rows_columns.htm
The source table for you is USER_TAB_COLUMNS, and when running the query you can add a filter such as "where column_name not like 'ETL%'".
After that, the query would look something like:
select 'select '
|| listagg(column_name, ', ') within group (order by column_id)
|| ' from ' || table_name as sql_text
from user_tab_columns
where column_name not like 'ETL%'
and table_name = 'table name' -- placeholder: put the table to compare here
group by table_name
And by the way, you're not crazy: before changing a system you need to be able to sign off that the upgrade will succeed, and this is the only way to do it.
Also, if you describe the system and the upgrade in more depth, I'm sure the community will be able to help you find ways to test it more thoroughly and will point out things to test.
Testing only the output is not enough in many cases.
GOOD LUCK!
This testing can be automated with SQL and PL/SQL. You're not crazy for doing this. Comparison systems like this can be incredibly helpful for testing changes to complex systems. It's not as good as automated unit tests, but it can significantly enhance the typical database testing.
The code below is a fully working example. But in the real world there are many gotchas that could easily take several days to resolve. For example, dealing with CLOBs, large tables, timestamps and sequence-based values, etc.
Sample schemas and data differences
create user schema1 identified by schema1;
create user schema2 identified by schema2;
alter user schema1 quota unlimited on users;
alter user schema2 quota unlimited on users;
--Data in 1, not 2.
create table schema1.table1 as select 1 a, 1 b from dual;
create table schema2.table1(a number, b number);
--Data in 2, not 1.
create table schema1.table2(a number, b number);
create table schema2.table2 as select 1 a, 1 b from dual;
--Same data in both, excluding unused column.
create table schema1.table3 as select 1 a, 1 b, 'asdf' ETL_c from dual;
create table schema2.table3 as select 1 a, 1 b, 'fdsa' ETL_c from dual;
--Table DDL difference.
create table schema1.table4(a number);
create table schema2.table4(b number);
--Privileges can be tricky.
grant select on dba_tab_columns to <your schema>;
Procedure to print differences script
create or replace procedure print_differences(
p_old_schema in varchar2,
p_new_schema in varchar2) authid current_user
is
v_table_index number := 0;
v_row_count number;
begin
--Print header information.
dbms_output.put_line('--Comparison between '||p_old_schema||' and '||
p_new_schema||', at '||to_char(sysdate, 'YYYY-MM-DD HH24:MI')||'.'||chr(10));
--Create a SQL statement to return the differences for each table.
for differences in
(
--Return number of differences and SQL statements to view them.
select
'
with old_table as (select '||column_list||' from '||p_old_schema||'.'||table_name||')
, new_table as (select '||column_list||' from '||p_new_schema||'.'||table_name||')
select * from
(
select ''OLD'' old_or_new, old_table.* from old_table minus
select ''OLD'' old_or_new, new_table.* from new_table
)
union all
select * from
(
select ''NEW'' old_or_new, new_table.* from new_table minus
select ''NEW'' old_or_new, old_table.* from old_table
)
' difference_sql, table_name
from
(
select table_name
,listagg(column_name, ',') within group (order by column_id) column_list
from dba_tab_columns
where owner = p_old_schema
and column_name not like 'ETL%'
group by table_name
) column_lists
) loop
begin
--Print table information:
v_table_index := v_table_index+1;
dbms_output.put_line(chr(10)||'--'||lpad(v_table_index, 3, '0')||': '||differences.table_name);
--Count differences.
execute immediate 'select count(*) from ('||differences.difference_sql||')' into v_row_count;
--Print SQL statements to investigate differences.
if v_row_count = 0 then
dbms_output.put_line('--No differences.');
else
dbms_output.put_line('--Differences: '||v_row_count);
dbms_output.put_line(differences.difference_sql||';');
end if;
exception when others then
dbms_output.put_line('/*Error with this statement, possible DDL difference: '
||differences.difference_sql||dbms_utility.format_error_stack||
dbms_utility.format_error_backtrace||'*/');
end;
end loop;
end;
/
Running the procedure
begin
print_differences('SCHEMA1', 'SCHEMA2');
end;
/
Sample output
The procedure does not output the actual differences. If there are differences, it outputs a script that will display the differences. With a decent IDE this will be a much better way to view the data, and it also helps to further analyze the differences.
--Comparison between SCHEMA1 and SCHEMA2, at 2014-03-28 23:44.
--001: TABLE1
--Differences: 1
with old_table as (select A,B from SCHEMA1.TABLE1)
, new_table as (select A,B from SCHEMA2.TABLE1)
select * from
(
select 'OLD' old_or_new, old_table.* from old_table minus
select 'OLD' old_or_new, new_table.* from new_table
)
union all
select * from
(
select 'NEW' old_or_new, new_table.* from new_table minus
select 'NEW' old_or_new, old_table.* from old_table
)
;
--002: TABLE2
--Differences: 1
with old_table as (select A,B from SCHEMA1.TABLE2)
, new_table as (select A,B from SCHEMA2.TABLE2)
select * from
(
select 'OLD' old_or_new, old_table.* from old_table minus
select 'OLD' old_or_new, new_table.* from new_table
)
union all
select * from
(
select 'NEW' old_or_new, new_table.* from new_table minus
select 'NEW' old_or_new, old_table.* from old_table
)
;
--003: TABLE3
--No differences.
--004: TABLE4
/*Error with this statement, possible DDL difference:
with old_table as (select A from SCHEMA1.TABLE4)
, new_table as (select A from SCHEMA2.TABLE4)
select * from
(
select 'OLD' old_or_new, old_table.* from old_table minus
select 'OLD' old_or_new, new_table.* from new_table
)
union all
select * from
(
select 'NEW' old_or_new, new_table.* from new_table minus
select 'NEW' old_or_new, old_table.* from old_table
)
ORA-06575: Package or function A is in an invalid state
ORA-06512: at "JHELLER.PRINT_DIFFERENCES", line 48
*/