Bitmap indexes: are they not good for DML operations? - oracle

I have two partitoned tables on weekly interval called table1 and table2 and each table has columns like
SR_NO,
CARD_NUMBER,
TRANSACTION_DATE and
STATUS which contains only values like 'N' or 'P' or 'Y'.
I have created two indexes on both tables.
First one is normal local index on TRANSACTION_DATE and STATUS.
Second index is bitmap index on STATS.
I am comparing two tables like as shown in below select statement.
select a.sr_no, a.transacton_date, a.record_status
from table1 a inner join table2 b on a.card_number = b.card_number
and a.transaction_date = b.transaction_date
and a.Status = 'P'
and b.staus = 'P';
I am looping through above select satement data and updating STATUS column to 'Y' for matched data in table1 and updating 5000 records at a time using bulk insert concept and update statement is as follows i have used bind variables also in update statement.
update table1 set
status = 'Y'
where transaction_date = :date
and SR_NO = :SRNO
and STATUS = :status;
Bind varable assigned with select statement returned data.
Now my problem is above update statement is taking time then up on co ordination with our internal DBA team they suggested to drop bitmap indexes which slows down perfomance of DML statements.
I want to know is it true that bitmap indexes slows down performance of DML statements.

Related

Compare differences before insert into oracle table

Could you please tell me how to compare differences between table and my select query and insert those results in separate table? My plan is to create one base table (name RESULT) by using select statement and populate it with current result set. Then next day I would like to create procedure which will going to compare same select with RESULT table, and insert differences into another table called DIFFERENCES.
Any ideas?
Thanks!
You can create the RESULT_TABLE using CTAS as follows:
CREATE TABLE RESULT_TABLE
AS SELECT ... -- YOUR QUERY
Then you can use the following procedure which calculates the difference between your query and data from RESULT_TABLE:
CREATE OR REPLACE PROCEDURE FIND_DIFF
AS
BEGIN
INSERT INTO DIFFERENCES
--data present in the query but not in RESULT_TABLE
(SELECT ... -- YOUR QUERY
MINUS
SELECT * FROM RESULT_TABLE)
UNION
--data present in the RESULT_TABLE but not in the query
(SELECT * FROM RESULT_TABLE
MINUS
SELECT ... );-- YOUR QUERY
END;
/
I have used the UNION and the difference between both of them in a different order using MINUS to insert the deleted data also in the DIFFERENCES table. If this is not the requirement then remove the query after/before the UNION according to your requirement.
-- Create a table with results from the query, and ID as primary key
create table result_t as
select id, col_1, col_2, col_3
from <some-query>;
-- Create a table with new rows, deleted rows or updated rows
create table differences_t as
select id
-- Old values
,b.col_1 as old_col_1
,b.col_2 as old_col_2
,b.col_3 as old_col_3
-- New values
,a.col_1 as new_col_1
,a.col_2 as new_col_2
,a.col_3 as new_col_3
-- Execute the query once again
from <some-query> a
-- Outer join to detect also detect new/deleted rows
full join result_t b using(id)
-- Null aware comparison
where decode(a.col_1, b.col_1, 1, 0) = 0
or decode(a.col_2, b.col_2, 1, 0) = 0
or decode(a.col_3, b.col_3, 1, 0) = 0;

How to copy all constrains and data form one schema to another in oracle

I am using Toad for oracle 12c. I need to copy a table and data (40M) from one shcema to another (prod to test). However there is an unique key(not the PK for this table) called record_Id col which has something data like this 3.000*******19E15. About 2M rows has same numbers(I believe its because very large number) which are unique in prod. When I try to copy it violets the unique key of that col. I am using toad "export data to another schema" function to copy the data.
when I execute query in prod
select count(*) from table_name
OR
select count(distinct(record_id) from table_name
Both query gives the exact same numbers of data.
I don't have DBA permission. How do I copy all data without violating unique key of the table.
Thanks in advance!
You can use UPSERT for decisional INSERT or UPDATE or you may write small procedure for this.
you may consider to use NOT EXISTS, but your data is big and it might not be resource efficient.
insert into prod_tab
select * from other_tab t1 where NOT exists (
select 1 from prod_tab t2 where t1.id = t2.id
);
In Oracle you can use a MERGE query for that.
The following query proceeds as follows for each data row :
if the source record_id does not yet exist in the target table, a new record is inserted
else, the existing record is updated with source values
For the sake of the example, I assumed that there are two other columns in the table : column1 and column2.
MERGE INTO target_table t1
USING (SELECT * from source_table t2)
ON (t1.record_id = t2.record_id)
WHEN MATCHED THEN UPDATE SET
t1.column1 = t2.column1,
t1.column2 = t2.column2
WHEN NOT MATCHED THEN INSERT
(record_id, column1, column2) VALUES (t2.record_id, t2.column1, t2.column2)

Sorting a table with a column with SQL Server 2012

I have a table like this :
How can I sort it out in the form below :
Each set of records has been marked by a FlagID at end
Each set of records has been marked by a FlagID at end
I assume this means that each record is implicitly associated with the first non-null FlagID value that occurs at or after its position along the ID primary key. Thus, we can use a correlated subquery to project this implicit FlagID value for each record, sort by it, then sort by your Row column as tiebreaker for each set.
SELECT *
FROM YourTable T1
ORDER BY
(
SELECT TOP 1 T2.FlagID
FROM YourTable T2
WHERE T2.ID <= T1.ID
AND T2.FlagID IS NOT NULL
ORDER BY T2.ID DESC
),
T1.Row
However, if you're able to alter the database content, I would recommend you to explicitly populate all the FlagID fields, as this would make your life easier. If you do so, then the query becomes:
SELECT *
FROM YourTable T1
ORDER BY T1.FlagID,
T1.Row

T-SQL - wrong query execution plan behaviour

One of our queries degraded after generating load on the DB.
Our query is a join between 3 tables:
Base table which contain 10 M rows.
EventPerson table which contain 5000 rows.
EventPerson788 which is empty.
It seems that the optimizer scans the index on the EventPerson instead of seek, this the script for replicating the issue:
--Create Tables
CREATE TABLE [dbo].[BASE](
[ID] [bigint] NOT NULL,
[IsActive] BIT
PRIMARY KEY CLUSTERED ([ID] ASC)
)ON [PRIMARY]
GO
CREATE TABLE [dbo].[EventPerson](
[DUID] [bigint] NOT NULL,
[PersonInvolvedID] [bigint] NULL,
PRIMARY KEY CLUSTERED ([DUID] ASC)
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [EventPerson_IDX] ON [dbo].[EventPerson]
(
[PersonInvolvedID] ASC
)
CREATE TABLE [dbo].[EventPerson788](
[EntryID] [bigint] NOT NULL,
[LinkedSuspectID] [bigint] NULL,
[sourceid] [bigint] NULL,
PRIMARY KEY CLUSTERED ([EntryID] ASC)
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[EventPerson788] WITH CHECK
ADD CONSTRAINT [FK7A34153D3720F84A]
FOREIGN KEY([sourceid]) REFERENCES [dbo].[EventPerson] ([DUID])
GO
ALTER TABLE [dbo].[EventPerson788] CHECK CONSTRAINT [FK7A34153D3720F84A]
GO
CREATE NONCLUSTERED INDEX [EventPerson788_IDX]
ON [dbo].[EventPerson788] ([LinkedSuspectID] ASC)
GO
--POPOLATE BASE TABLE
DECLARE #I BIGINT=1
WHILE (#I<10000000)
BEGIN
begin transaction
INSERT INTO BASE(ID) VALUES(#I)
SET #I+=1
if (#I%10000=0 )
begin
commit;
end;
END
go
--POPOLATE EventPerson TABLE
DECLARE #I BIGINT=1
WHILE (#I<5000)
BEGIN
BEGIN TRANSACTION
INSERT INTO EventPerson(DUID,PersonInvolvedID) VALUES(#I,(SELECT TOP 1 ID FROM BASE ORDER BY NEWID()))
SET #I+=1
IF(#I%10000=0 )
COMMIT TRANSACTION ;
END
GO
This the query :
select
count(EventPerson.DUID)
from
EventPerson
inner loop join
Base on EventPerson.DUID = base.ID
left outer join
EventPerson788 on EventPerson.DUID = EventPerson788.sourceid
where
(EventPerson.PersonInvolvedID = 37909 or
EventPerson788.LinkedSuspectID = 37909)
AND BASE.IsActive = 1
Do you have any idea why the optimizer decides to use index scan instead of index seek?
Workaround that already done :
Analyze tables and build statistics.
Rebuild Indices.
Try the FORCESEEK hint
None of the above persuaded the optimizer to run an index seek on EventPerson and seek on the base tables.
Thanks for your help .
The scan is there because of the or condition and the outer join against EventPerson788.
Either it will return rows from EventPerson when EventPerson.PersonInvolvedID = 37909 or when the there exists rows in EventPerson788 where EventPerson788.LinkedSuspectID = 37909. The last part means that every row in EventPerson has to be checked against the join.
The fact that EventPerson788 is empty can not be used by the query optimizer since the query plan is saved to be reused later when there might be matching rows in EventPerson788.
Update:
You can rewrite your query using a union all instead of or to get a seek in EventPerson.
select count(EventPerson.DUID)
from
(
select EventPerson.DUID
from EventPerson
where EventPerson.PersonInvolvedID = 1556 and
not exists (select *
from EventPerson788
where EventPerson788.LinkedSuspectID = 1556)
union all
select EventPerson788.sourceid
from EventPerson788
where EventPerson788.LinkedSuspectID = 1556
) as EventPerson
inner join BASE
on EventPerson.DUID=base.ID
where
BASE.IsActive=1
Well, you're asking SQL Server to count the rows of the EventPerson table - so why do you expect a seek to be better than a scan here?
For a COUNT, the SQL Server optimizer will almost always use a scan - it needs to count the rows, after all - all of them... it will do a clustered index scan, if no other non-nullable columns are indexed.
If you have an index on a small, non-nullable column (e.g. on a ID INT or something like that), it would probably do a scan on that index instead (less data to read to count all rows).
But in general: seek is great for selecting one or a few rows - but it sucks if you're dealing with all rows (like for a count)
You can easily observe this behavior if you're using the AdventureWorks sample database.
When doing a COUNT(*) on the Sales.SalesOrderDetail table which has over 120000 rows like this:
SELECT COUNT(*) FROM Sales.SalesOrderDetail
then you'll get an index scan on IX_SalesOrderDetail_ProductID - it just doesn't pay off to do seeks on over 120000 entries!
However, if you do the same operation on a smaller set of data, like this:
SELECT COUNT(*) FROM Sales.SalesOrderDetail
WHERE ProductID = 897
then you get back 2 rows out of all of them - and SQL Server will now use an index seek on that same index.

Oracle, how update statement works

Question 1
Can anyone tell me if there is any difference between following 2 update statements:
UPDATE TABA SET COL1 = '123', COL2 = '456' WHERE TABA.PK = 1
UPDATE TABA SET COL1 = '123' WHERE TABA.PK = 1
where the original value of COL2 = '456'
how does this affect the UNDO?
Question 2
What about if I update a record in table TABA using ROWTYPE like the following snippet.
how's the performance, and how does it affect the UNDO?
SampleRT TABA%rowtype
SELECT * INTO SampleRT FROM TABA WHERE PK = 1;
SampleRT.COL2 = '111';
UPDATE TABA SET ROW = SampleRT WHERE PK = SampleRT.PK;
thanks
Is your question 1 asking whether UNDO (and REDO) is generated when you're running an UPDATE against a row but not actually changing the value?
Something like?
update taba set col2='456' where col2='456';
If this is the question, then the answer is that even if you're updating a column to the same value then UNDO (and REDO) is generated.
(An exception is when you're updating a NULL column to NULL - this doesn't generate any REDO).
For Question 1:
The outcome of the two UPDATEs for rows in your table where PK=1 and COL2='456' is identical. (That is, each such row will have its COL1 value set to '123'.)
Note: there may be rows in your table with PK=1 and COL2 <> '456'. The outcome of the two statements for these rows will be different. Both statements will alter COL1, but only the first will alter the value in COL2, the second will leave it unchanged.
For question 1:
There can be a difference as triggers can fire depending on which columns are updated. Even if you are updating column_a to the same value, the trigger will fire.
The UNDO shouldn't be different as, if you expand or shrink the length of a variable length column (eg VARCHAR or NUMBER), all the rest of the bytes of the record need to be shuffled along too.
If the columns don't change size, then you MAY get a benefit in not specifying the column. You can probably test it using v$transaction queries to look at undo generated.
For question 2:
I'd be more concerned about memory (especially if you are bulk collecting SELECT * ) and triggers firing than UNDO.
If you don't need SELECT *, specify the columns (eg as follows)
cursor c_1 is select pk, col1, col2 from taba;
SampleRT c_1%rowtype;
SELECT pk, col1, col2 INTO SampleRT FROM TABA WHERE PK = 1;
SampleRT.COL2 = '111';
UPDATE (select pk, col1, col2 from taba)
SET ROW = SampleRT WHERE PK = SampleRT.PK;

Resources