I'm planning to run a long-running update on a huge table (more than a billion rows). The update will multiply one column's values by a fixed number.
The problem is that during my update (which may last several hours) there will definitely be short transactions that update some rows. Those rows will then hold correct values that should not be updated again, even though they still satisfy my update's condition.
So the question is: how do I skip (not update) rows that were updated outside my long-running update's transaction?
One way is to use FOR UPDATE SKIP LOCKED, so that a session does not pick up rows that another session has already locked for update.
For example,
Session 1:
SQL> SELECT empno, deptno
2 FROM emp WHERE
3 deptno = 10
4 FOR UPDATE NOWAIT;
EMPNO DEPTNO
---------- ----------
7782 10
7839 10
7934 10
SQL>
Session 2:
SQL> SELECT empno, deptno
2 FROM emp WHERE
3 deptno in (10, 20)
4 FOR UPDATE NOWAIT;
FROM emp WHERE
*
ERROR at line 2:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
Now let's skip the rows which are locked by session 1.
SQL> SELECT empno, deptno
2 FROM emp WHERE
3 deptno IN (10, 20)
4 FOR UPDATE SKIP LOCKED;
EMPNO DEPTNO
---------- ----------
7369 20
7566 20
7788 20
7876 20
7902 20
SQL>
So the rows for department 10 were locked by session 1, and the rows for department 20 are now locked by session 2.
I have solved a similar problem, although my table is not as large as yours.
I redesigned my table and added two columns:
created_date: a trigger sets it to SYSDATE when a row is inserted.
modified_date: a trigger sets it to SYSDATE when a row is updated.
Then I can use created_date or modified_date in my WHERE clause.
Example:
UPDATE table_name
SET column_name = 'values'
WHERE created_date < SYSDATE;
I hope this will help you.
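The two audit columns described above could be maintained with a single trigger. This is only a minimal sketch: table_name stands in for the real table, and the trigger name trg_table_name_dates is made up for illustration.

```sql
-- Hypothetical table name; add the two audit columns first.
ALTER TABLE table_name ADD (created_date DATE, modified_date DATE);

-- Hypothetical trigger name. Stamps created_date on insert
-- and modified_date on every update of a row.
CREATE OR REPLACE TRIGGER trg_table_name_dates
  BEFORE INSERT OR UPDATE ON table_name
  FOR EACH ROW
BEGIN
  IF INSERTING THEN
    :NEW.created_date := SYSDATE;
  ELSE
    :NEW.modified_date := SYSDATE;
  END IF;
END;
/
```

Keep in mind that the long-running update fires this trigger too, so the rows it touches also get a new modified_date.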
Oh, I absolutely forgot about this question.
So, I ended up taking a snapshot of the current rows by saving copies of them to another table (I only had to update the rows that satisfied a given condition). After that, I updated the rows whose values had not changed since the snapshot (using MERGE).
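The snapshot-then-merge approach might look roughly like the sketch below. All names are assumptions, since the original post does not show its code: big_table(id, amount) is the table being updated, big_table_snap is the snapshot table, the condition amount > 0 and the factor 1.1 are placeholders.

```sql
-- 1. Snapshot the rows that satisfy the update condition (placeholder condition).
CREATE TABLE big_table_snap AS
  SELECT id, amount AS old_amount
  FROM   big_table
  WHERE  amount > 0;

-- 2. Update only the rows whose value is still the same as in the snapshot.
--    Rows changed by other transactions in the meantime are left alone.
MERGE INTO big_table t
USING big_table_snap s
ON (t.id = s.id)
WHEN MATCHED THEN
  UPDATE SET t.amount = s.old_amount * 1.1
  WHERE t.amount = s.old_amount;
```

Two limitations of this sketch: a concurrent transaction that writes the same value back is not detected, and rows where amount is NULL never pass the equality test.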
I have a query that selects 4 columns, and I want to put a grand total at the bottom of one of the columns without doing any grouping:
SELECT customer_id, email, total_amount, order_date
FROM...................
I want a grand total of TOTAL_AMOUNT at the bottom, without worrying about any grouping. I'm not seeing how to do this using GROUPING or ROLLUP. I don't want this as a running total in another column, but as a grand total at the bottom.
Many thanks.
You can add a grand total row with a UNION ALL and a column to track if the row is for the grand total.
select customer_id, email, total_amount, order_date, 0 is_grand_total
from orders
union all
select null, null, sum(total_amount), null, 1 is_grand_total
from orders
order by is_grand_total, customer_id;
SQL Fiddle example.
(In my opinion, this is often a good way to add summary logic to queries. I'd rather have a slightly more complicated solution with one language (SQL), than a solution that involves two or more languages or applications.)
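Since the question mentions ROLLUP, here is a sketch of that alternative for comparison. It assumes the same orders table as the UNION ALL example and groups by all four columns as one composite column, so ROLLUP adds only a single grand total row. The is_grand_total alias is reused from the example above.

```sql
-- Composite column in ROLLUP: one group per distinct detail row, plus one grand total row.
select customer_id, email, sum(total_amount) as total_amount, order_date,
       grouping(customer_id) as is_grand_total   -- 1 only on the grand total row
from   orders
group  by rollup ((customer_id, email, total_amount, order_date))
order  by grouping(customer_id), customer_id;
```

One caveat: rows that are identical in all four columns collapse into a single row whose total_amount is their sum, so this only matches the UNION ALL result when the detail rows are unique.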
A simple SQL*Plus option:
SQL> break on report
SQL> compute sum of sal on report
SQL>
SQL> select deptno, ename, sal
2 from emp
3 where deptno = 10;
DEPTNO ENAME SAL
---------- ---------- ----------
10 CLARK 2450
10 KING 5001
10 MILLER 1300
----------
sum 8751
SQL>
My goal is to get a distinct for Clm_Pd_Amt column only and return all other columns:
SELECT CLM_AMT, PAID_DATE, MBR, DISTINCT CLM_PD_AMT
FROM MY_CLAIMS
WHERE DATE >= '20200101'
AND STATUS = 'CURRENT'
A GROUP BY is equivalent to a DISTINCT. For example, the distinct list of departments in the EMP table (here together with the minimum salary of each) can be found with, say:
SQL> select deptno, min(sal)
2 from emp
3 group by deptno
4 /
DEPTNO MIN(SAL)
---------- ----------
30 951
10 1300
20 800
But what if I want to know which employee had that minimum salary? Then you can use the KEEP clause to pick that up as well, e.g.
SQL> select deptno, min(sal), min(empno) KEEP ( dense_rank FIRST order by sal) empno
2 from emp
3 group by deptno
4 /
DEPTNO MIN(SAL) EMPNO
---------- ---------- ----------
10 1300 7934
20 800 7369
30 951 7900
So using that approach, you should be able to adapt your query to get the distinct CLM_PD_AMT and then pick up the other columns with KEEP. This only works if you define which row should represent each distinct CLM_PD_AMT, i.e. the one with the smallest value of some column, the largest, etc.
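Adapted to the table from the question, it might look like the sketch below. It assumes that the row to keep for each CLM_PD_AMT is the one with the earliest PAID_DATE. That rule is my choice for illustration, not something the question states. The filter on the DATE column is left out because its real name is not clear from the question.

```sql
select clm_pd_amt,
       min(clm_amt) keep (dense_rank first order by paid_date) as clm_amt,
       min(paid_date)                                          as paid_date,
       min(mbr)     keep (dense_rank first order by paid_date) as mbr
from   my_claims
where  status = 'CURRENT'
group  by clm_pd_amt;
```

If several rows share the earliest PAID_DATE for one CLM_PD_AMT, CLM_AMT and MBR are each the minimum among those tied rows and may come from different rows.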
How can I create a temporary table in Oracle without knowing the number and names of the columns?
For example:
Select columnA,columnB,* into temp_table from tableA.
Here, tableA may not be a simple table name; it may be derived from several queries.
How can this be achieved? Is there any alternative to this?
In Oracle, you either create the table first and then insert into it, or create it directly from a query (as in my example).
Note that I've created a "normal" table; if it were temporary, you could choose between a global and a private temporary table (depending on the database version you use).
For this discussion, I guess a normal table is just fine:
SQL> create table temp_table as
2 select a.*
3 from (select d.deptno, d.dname, e.ename --> this SELECT is your "tableA"
4 from emp e join dept d
5 on e.deptno = d.deptno
6 where job = 'CLERK'
7 ) a;
Table created.
SQL> select * from temp_table;
DEPTNO DNAME ENAME
---------- -------------------- ----------
10 ACCOUNTING MILLER
20 RESEARCH SMITH
20 RESEARCH ADAMS
30 SALES JAMES
SQL>
Alternatively, create a view and work with it:
SQL> create or replace view v_temp as
2 select d.deptno, d.dname, e.ename
3 from emp e join dept d
4 on e.deptno = d.deptno
5 where job = 'CLERK'
6 ;
View created.
SQL> select * from v_temp;
DEPTNO DNAME ENAME
---------- -------------------- ----------
10 ACCOUNTING MILLER
20 RESEARCH SMITH
20 RESEARCH ADAMS
30 SALES JAMES
SQL>
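For completeness, the temporary variant mentioned in the note above could be created from the same query. This is a sketch using the same EMP and DEPT tables. ON COMMIT PRESERVE ROWS is my choice here so that the rows loaded by the CREATE statement stay visible for the rest of the session; with the default ON COMMIT DELETE ROWS the table would come out empty.

```sql
create global temporary table temp_table
on commit preserve rows
as
select d.deptno, d.dname, e.ename
from   emp e join dept d
       on e.deptno = d.deptno
where  e.job = 'CLERK';
```

The table definition is permanent and visible to everyone, but each session sees only its own rows.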
This statement creates temp_table, which contains all columns and data from tableA plus two additional empty columns, one VARCHAR2 and one NUMBER.
create table temp_table as
select cast (null as varchar2(10)) columnA,
cast (null as number(6)) columnB,
tableA.*
from tableA
If you need only structure, no data, then add:
where 1 = 0
I am not sure why you want a temp table when you do not know the columns, but you could use dynamic SQL to create a table with the columns required during your process and then drop it again. In my view, though, that is not a good design.
I would suggest considering a collection with 'x' number of columns of datatype VARCHAR2. During the transaction you can populate and process it as needed, and it will remain available for that session.
The situation is that when I import a file into the database, one of the first things I usually do is assign a unique ID to each record.
I normally do the following in T-SQL:
ALTER TABLE MyTable
ADD ID INT IDENTITY(1,1)
I am wondering if there is something similar in PL/SQL?
All my search results come back with multiple steps.
I'd also like to know what PL/SQL programmers typically do to ID records after importing a file. Do they do that at all?
My main purpose in assigning IDs to these records is to be able to trace them back after manipulation/copying.
Again, I understand that a solution exists; my further question is whether PL/SQL programmers actually do this, or whether there is an alternative that makes this step unnecessary in PL/SQL.
OK then, as you're on Oracle 11g, there's no identity column there so - back to multiple steps. Here's an example:
I'm creating a table that simulates your imported table:
SQL> create table tab_import as
2 select ename, job, sal
3 from emp
4 where deptno = 10;
Table created.
Add the ID column:
SQL> alter table tab_import add id number;
Table altered.
Create a sequence which will be used to populate the ID column:
SQL> create sequence seq_imp;
Sequence created.
Update current rows:
SQL> update tab_import set
2 id = seq_imp.nextval;
3 rows updated.
Create a trigger which will take care of future inserts (if any):
SQL> create or replace trigger trg_bi_imp
2 before insert on tab_import
3 for each row
4 begin
5 :new.id := seq_imp.nextval;
6 end;
7 /
Trigger created.
Check what's in the table at the moment:
SQL> select * from tab_import;
ENAME JOB SAL ID
---------- --------- ---------- ----------
CLARK MANAGER 2450 1
KING PRESIDENT 5000 2
MILLER CLERK 1300 3
Let's import some more rows:
SQL> insert into tab_import (ename, job, sal)
2 select ename, job, sal
3 from emp
4 where deptno = 20;
3 rows created.
The trigger has silently populated the ID column:
SQL> select * From tab_import;
ENAME JOB SAL ID
---------- --------- ---------- ----------
CLARK MANAGER 2450 1
KING PRESIDENT 5000 2
MILLER CLERK 1300 3
SMITH CLERK 800 4
JONES MANAGER 2975 5
FORD ANALYST 3000 6
6 rows selected.
SQL>
In short, you need to:
alter table and add the ID column
create a sequence
create a trigger
The end.
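For readers on Oracle 12c or later (so not the 11g case above), an identity column is the closest match to the T-SQL IDENTITY(1,1) in the question. To my understanding, a single statement like this sketch against the same tab_import table both adds the column and numbers the existing rows:

```sql
-- Oracle 12c and later only; not available on 11g.
alter table tab_import
  add (id number generated always as identity);
```

Rows inserted later get their ID automatically, without a separate sequence or trigger. The order in which existing rows are numbered should be treated as arbitrary.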
The answer given by @Littlefoot would be my recommendation too, but I thought I could still mention the following variant, which works only if you do not intend to add more rows to the table later.
ALTER TABLE MyTable add id number(38,0);
update MyTable set id = rownum;
commit;
My test:
SQL> create table tst as select * from all_tables;
Table created.
SQL> alter table tst add id number(38,0);
Table altered.
SQL> update tst set id = rownum;
3815 rows updated.
SQL> alter table tst add constraint tstPk primary key (id);
Table altered.
SQL>
SQL> select id from tst where id < 15;
ID
----------
1
2
3
4
5
6
7
8
9
10
11
ID
----------
12
13
14
14 rows selected.
But as mentioned initially, this only fixes the numbering for the rows you have at the time of the update; you're not going to get new id values for rows added later. If you need that, go for the sequence solution.
You can add an id column to a table with a single statement (Oracle 11g, see dbfiddle):
alter table test_
add id raw( 16 ) default sys_guid() ;
Example:
-- create a table without an id column
create table test_ ( str )
as
select dbms_random.string( 'x', 16 )
from dual
connect by level <= 10 ;
select * from test_ ;
STR
ULWL9EXFG6CIO72Z
QOM0W1R9IJ2ZD3DW
YQWAP4HZNQ57C2UH
EETF2AXD4ZKNIBBF
W9SECJYDER793MQW
alter table test_
add id raw( 16 ) default sys_guid() ;
select * from test_ ;
STR ID
ULWL9EXFG6CIO72Z 0x782C6EBCAE2D7B9FE050A00A02005D65
QOM0W1R9IJ2ZD3DW 0x782C6EBCAE2E7B9FE050A00A02005D65
YQWAP4HZNQ57C2UH 0x782C6EBCAE2F7B9FE050A00A02005D65
EETF2AXD4ZKNIBBF 0x782C6EBCAE307B9FE050A00A02005D65
W9SECJYDER793MQW 0x782C6EBCAE317B9FE050A00A02005D65
Testing
-- Are the id values unique and not null? Yes.
alter table test_
add constraint pkey_test_ primary key ( id ) ;
-- When we insert more rows, will the id be generated? Yes.
begin
for i in 1 .. 100
loop
insert into test_ (str) values ( 'str' || to_char( i ) ) ;
end loop ;
end ;
/
select * from test_ order by id desc ;
-- last 10 rows of the result
STR ID
str100 0x782C806E16A5E998E050A00A02005D81
str99 0x782C806E16A4E998E050A00A02005D81
str98 0x782C806E16A3E998E050A00A02005D81
str97 0x782C806E16A2E998E050A00A02005D81
str96 0x782C806E16A1E998E050A00A02005D81
str95 0x782C806E16A0E998E050A00A02005D81
str94 0x782C806E169FE998E050A00A02005D81
str93 0x782C806E169EE998E050A00A02005D81
str92 0x782C806E169DE998E050A00A02005D81
str91 0x782C806E169CE998E050A00A02005D81
Regarding your other questions:
{1} I'd also like to know what PL/SQL programmers typically do to ID records after importing a file. Do they do that at all? My main purpose in assigning IDs to these records is to be able to trace them back after manipulation/copying.
-> As you know, the purpose of an id is: to identify a row. We don't "do anything to IDs". Thus, your usage of IDs seems legit.
{2} Again, I understand that a solution exists; my further question is whether PL/SQL programmers actually do this, or whether there is an alternative that makes this step unnecessary in PL/SQL.
-> Not quite sure what you are asking here. Although there is a ROWID pseudocolumn (see documentation), we should not use it to identify rows.
"You should not use ROWID as the primary key of a table. If you delete
and reinsert a row with the Import and Export utilities, for example,
then its rowid may change. If you delete a row, then Oracle may
reassign its rowid to a new row inserted later."
I have two tables in my database, EXP1 and EXP2. I tried the query below. It works when both tables have the same number of columns, but my table EXP1 has 1000 columns and EXP2 has 1000 + 4.
select *
from
(
(select * from exp1
minus
select * from exp2)
union all
(select * from exp2
minus
select * from exp1)
);
INTRO: Below I show how one can do "by hand" what the tools (SQL Developer for example) can do much faster and much better. My interest in this (and yours!) is two-fold: learn and use some ideas that can help in many other problems; and understand what those tools do under the hood in the first place.
OK. Suppose you have two tables, and they have many columns in common (possibly not in the same order) and a few columns may be different - there may be a handful of columns in one table but not in the other. First you want to be able to look just at the common columns.
Then, suppose that's done. Now what's left of the two tables has many rows in common, but there are a few that are different. A row may exist in one table but not in the other, or two rows, one from each table, may be very similar but they may differ in just one or a small number of column values. Logically these are still one row in the first table but not the second, and the other row only in the second table but not in the first. However, let's say both tables have the same PK column - then you may have the same PK value in both tables, but at least one of the OTHER columns has different values for that PK value in the two tables. And, you want to find these differences between the two tables.
In what follows I will assume that if two columns, in the two tables, have the same name, they will also have the same data type. If that is not guaranteed in your case, it can be fixed with a little more work in the part where I identify the "common columns" - instead of matching them just by name, from the catalog views, they would have to be matched also by data type.
When you get to comparing rows in the two tables in the final step, (A minus B) union all (B minus A) works, but is not very efficient. Each table is read twice, and minus is an expensive operator. The more efficient solution, which I illustrate below, was discussed in a long thread on AskTom several years ago. Namely: collect all the rows from both tables (with union all), group by all the columns, and disregard the groups that have a count of 2. This means rows that were found in both tables, so they are duplicates in the union all! Actually, you will see a small additional trick to identify from which table the "non-duplicated" rows come. Add a column for "table_name" and in the final select, after grouping and keeping the groups with count(*) = 1, select max(table_name). You need an aggregate function (like max()) because you are grouping, but for these rows each group only has one row, so the max() is really just the table name.
The beauty of this approach is that it can be used to identify the common columns, too! In that case, we will compare rows from the USER_TAB_COLS view - we select column names that appear in either of the tables, and keep only the column names that are duplicates (so the column names appear in both tables). In that part of the solution, I also retrieve column_id, which is used to order the columns. Don't worry if you are not familiar with keep (dense_rank first...) - it's not really that complicated, but it's not that important either.
First let's set up a test case. I copy the EMP table from the SCOTT schema to my own schema, I replicate it (so now I have two copies, named EMP1 and EMP2), and I modify them slightly. I delete a different column from each, I delete a few (different) rows from each, and I modify one salary in one table. I will not show the resulting (slightly different) tables, but if you are following along, just select * from both and compare them before you continue reading.
Create the tables:
create table EMP1 as select * from scott.emp;
Table EMP1 created.
select * from EMP1;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
----- ---------- --------- ---- ------------------- ----- ------ -------
7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20
7499 ALLEN SALESMAN 7698 1981-02-20 00:00:00 1600 300 30
7521 WARD SALESMAN 7698 1981-02-22 00:00:00 1250 500 30
7566 JONES MANAGER 7839 1981-04-02 00:00:00 2975 20
7654 MARTIN SALESMAN 7698 1981-09-28 00:00:00 1250 1400 30
7698 BLAKE MANAGER 7839 1981-05-01 00:00:00 2850 30
7782 CLARK MANAGER 7839 1981-06-09 00:00:00 2450 10
7788 SCOTT ANALYST 7566 1987-04-19 00:00:00 3000 20
7839 KING PRESIDENT 1981-11-17 00:00:00 5000 10
7844 TURNER SALESMAN 7698 1981-09-08 00:00:00 1500 0 30
7876 ADAMS CLERK 7788 1987-05-23 00:00:00 1100 20
7900 JAMES CLERK 7698 1981-12-03 00:00:00 950 30
7902 FORD ANALYST 7566 1981-12-03 00:00:00 3000 20
7934 MILLER CLERK 7782 1982-01-23 00:00:00 1300 10
Modify them slightly:
create table EMP2 as select * from EMP1;
Table EMP2 created.
alter table emp1 drop column hiredate;
Table EMP1 altered.
alter table emp2 drop column comm;
Table EMP2 altered.
delete from EMP1 where ename like 'A%';
2 rows deleted;
delete from EMP2 where sal >= 3000;
3 rows deleted
update EMP2 set sal = 2950 where empno = 7698;
1 row updated
commit;
At this point you would do well to select * from EMP1; and select * from EMP2; and compare.
Now let's find out what columns the two tables have left in common.
select column_name,
min(column_id) keep(dense_rank first order by table_name) as col_id
from user_tab_cols
where table_name in ('EMP1', 'EMP2')
group by column_name
having count(*) = 2
order by col_id;
COLUMN_NAME COL_ID
----------- ------
EMPNO 1
ENAME 2
JOB 3
MGR 4
SAL 5
DEPTNO 7
6 rows selected
Perfect, so now we can compare the two tables, but only after we "project" them along the common columns only.
select max(table_name) as table_name, EMPNO, ENAME, JOB, MGR, SAL, DEPTNO
from (
select 'EMP1' as table_name, EMPNO, ENAME, JOB, MGR, SAL, DEPTNO from EMP1
union all
select 'EMP2' as table_name, EMPNO, ENAME, JOB, MGR, SAL, DEPTNO from EMP2
)
group by EMPNO, ENAME, JOB, MGR, SAL, DEPTNO
having count(*) = 1
order by EMPNO, ENAME, JOB, MGR, SAL, DEPTNO, table_name;
TABLE_NAME EMPNO ENAME JOB MGR SAL DEPTNO
---------- ----- ---------- --------- ------ ------ --------
EMP2 7499 ALLEN SALESMAN 7698 1600 30
EMP1 7698 BLAKE MANAGER 7839 2850 30
EMP2 7698 BLAKE MANAGER 7839 2950 30
EMP1 7788 SCOTT ANALYST 7566 3000 20
EMP1 7839 KING PRESIDENT 5000 10
EMP2 7876 ADAMS CLERK 7788 1100 20
EMP1 7902 FORD ANALYST 7566 3000 20
7 rows selected
The output is pretty much what we needed. Notice the first column, which tells us where the "unpaired" row comes from; and note BLAKE, who has different salary in the two tables (and the first column helps us to see what salary he has in which table).
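A side note on the assumption made earlier, that columns with the same name also have the same data type: if that is not guaranteed, the common-columns query can be tightened by grouping on the data type as well. This is a sketch only. It compares the DATA_TYPE column of USER_TAB_COLS, so differences in length or precision are deliberately ignored.

```sql
select column_name,
       min(column_id) keep (dense_rank first order by table_name) as col_id
from   user_tab_cols
where  table_name in ('EMP1', 'EMP2')
group  by column_name, data_type   -- same name AND same data type
having count(*) = 2
order  by col_id;
```

A column that has the same name but a different type in the two tables then forms two groups of one row each, and the HAVING clause drops it.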
This looks perfect so far, but what to do when you have 1000 columns? You could put it together in C or Java etc., using the result from the "common columns" query above - or you could do it all in Oracle, with dynamic SQL.
As far as I know, there is no set limit on the length of the text of an SQL statement in Oracle; the documentation says "The limit on how long a SQL statement can be depends on many factors, including database configuration, disk space, and memory" (and probably on your Oracle version, which they didn't mention). In any case, it will be more than 4000 characters, so we need to work with CLOB. In particular, we can't use listagg() - we need a workaround. I use xmlagg() below. Then, the documentation says if you concatenate text and at least one operand is CLOB the result will be CLOB; if that doesn't work for you, you may have to wrap the smaller text fragments within to_clob(). The "dynamic SQL" query below will produce the full text of the query I used above; you will simply copy it and paste it back into your front-end and execute it. You may have to delete wrapping double-quotes or such, depending on your front-end and settings.
First here is how we can create a (potentially very long) string, the list of common column names, which is repeated five times in the final query - just look again at the "final query" we used to compare the two tables above.
with
common_cols ( column_name, col_id ) as (
select column_name,
min(column_id) keep(dense_rank first order by table_name) as col_id
from user_tab_cols
where table_name in ('EMP1', 'EMP2')
group by column_name
having count(*) = 2
),
col_string ( str ) as (
select rtrim(xmlcast(xmlagg(xmlelement(e, column_name, ', ') order by col_id)
as clob), ', ') from common_cols
)
select * from col_string;
STR
-----------------------------------
EMPNO, ENAME, JOB, MGR, SAL, DEPTNO
And finally the full dynamic SQL query (the result is exactly the query I used to compare EMP1 and EMP2 on their common columns earlier):
with
common_cols ( column_name, col_id ) as (
select column_name,
min(column_id) keep(dense_rank first order by table_name) as col_id
from user_tab_cols
where table_name in ('EMP1', 'EMP2')
group by column_name
having count(*) = 2
),
col_string ( str ) as (
select rtrim(xmlcast(xmlagg(xmlelement(e, column_name, ', ') order by col_id)
as clob), ', ') from common_cols
)
select 'select max(table_name) as table_name, ' || str || chr(10) ||
'from (' || chr(10) ||
' select ''EMP1'' as table_name, ' || str || ' from EMP1' || chr(10) ||
' union all' || chr(10) ||
' select ''EMP2'' as table_name, ' || str || ' from EMP2' || chr(10) ||
' )' || chr(10) ||
'group by ' || str || chr(10) ||
'having count(*) = 1' || chr(10) ||
'order by ' || str || ', table_name;' as comp_sql_str
from col_string;