Procedure to alter and update table on hierarchical relationship to see if there are any children - oracle

I have a hierarchical table on Oracle pl/sql. something like:
create table hierarchical (
id integer primary key,
parent_id references hierarchical ,
name varchar(100));
I need to create a procedure to alter that table so I get a new field that tells, for each node, if it has any children or not.
Is it possible to do the alter and the update in one single procedure?
Any code samples would be much appreciated.
Thanks

You can not do the ALTER TABLE (DDL) and the UPDATE (DML) in a single step.
You will have to do the ALTER TABLE, followed by the UPDATE.
BEGIN
EXECUTE IMMEDIATE 'ALTER TABLE hierarchical ADD child_count INTEGER';
--
EXECUTE IMMEDIATE '
UPDATE hierarchical h
SET child_count = ( SELECT COUNT(*)
FROM hierarchical h2
WHERE h2.parent_id = h.id )';
END;
Think twice before doing this though. You can easily find out now if an id has any childs with a query.
This one would give you the child-count of all top-nodes for example:
SELECT h.id, h.name, COUNT(childs.id) child_count
FROM hierarchical h
LEFT JOIN hierarchical childs ON ( childs.parent_id = h.id )
WHERE h.parent_id IS NULL
GROUP BY h.id, h.name
Adding an extra column with redundant data will make changing your data more difficult, as you will always have to update the parent too, when adding/removing childs.

If you just need to know whether children exist, the following query can do it without the loop or the denormalized column.
select h.*, connect_by_isleaf as No_children_exist
from hierarchical h
start with parent_id is null
connect by prior id = parent_id;
CONNECT_BY_LEAF returns 0 if the row has children, 1 if it does not.
I think you could probably get the exact number of children through a clever use of analytic functions and the LEVEL pseudo-column, but I'm not sure.

Related

Right way to create index organized table

I am trying to create a index organized table in oracle 11. I create the index organized table and insert the row from another table.
create table salIOT (
mypk ,
cid ,
date,
CONSTRAINT sal_pk PRIMARY KEY (mypk)
) ORGANIZATION INDEX
AS Select * from another table;
But the leaf blocks are empty when I query
SQL> Select owner, index_name, table_name, leaf_blocks from all_indexes where table_name like 'SALIOT';
Am I missing something here?
You still need to gather statistics on the table, something like:
exec dbms_stats.gather_table_stats('ABC', 'IOTTableName')
(assuming 'ABC' is your username; change as needed) - then re-run your SELECT against ALL_INDEXES and you will see how many leaf blocks you have.

Does updating a view affects the base table?

When I updated a view which was created using a base table, the updation affected the base table as well. How is that possible? If view is considered as just a 'window' through which we can see a set of data of the base table then how can the base table change when I try to change the data inside a view.
You can make changes to the state of underlying table using the view as long as the you are targeting the change in single table.
View is a security layer on top of table object and allows most of the DML operation as long as you do not violet the base rule.
Example:
CREATE TABLE T1
(ID INT IDENTITY(1,1), [Value] NVARCHAR(50))
CREATE TABLE T2
(ID INT IDENTITY(1,1), [Value] NVARCHAR(50))
--Dummy Insert
INSERT INTO T1 VALUES ('TestT1')
INSERT INTO T2 VALUES ('TestT2')
--Create View
CREATE VIEW V1
AS
SELECT T1.ID AS T1ID, T2.ID AS T2ID, T1.Value AS T1Value, T2.Value AS T2Value FROM T1 INNER JOIN T2
ON T2.ID = T1.ID
--Check the result
SELECT * FROM V1
--Insert is possible via view as long as it affects only one table
INSERT INTO V1 (T1Value) VALUES
('TestT1_T1')
INSERT INTO V1 (T2Value) VALUES
('TestT2_T2')
--Change is possible only if target is only one table
UPDATE V1
SET T1Value = 'Changed'--**
WHERE T2ID = 1
--This is not allowed
INSERT INTO V1 (T1Value, T2Value) VALUES
('TestT1_T1','TestT2_T2')
--Msg 4405, Level 16, State 1, Line 1
--View or function 'V1' is not updatable because the modification affects multiple base tables.
--Check T1 and T2 with each statement to see how it gets affected
--
In some databases it's possible to update the source table(s) for a view if there is a one-to-one relationship between the rows in the view and the rows in the underlying table, that is, you cant have derived columns, aggregate functions or a distinct clause in your view for example.
In Oracle, even if a view is not inherently updatable, updates may be allowed if an INSTEAD OF DML trigger is defined.
If you use mysql, you can read a detailed description about this feature Updatable and insertable views.
" If view is considered as just a 'window' through which we can see a set of data of the base table "
- Where did you get this definition?
What oracle says about views:
A view is a logical representation of another table or combination of
tables. A view derives its data from the tables on which it is based.
These tables are called base tables. Base tables might in turn be
actual tables or might be views themselves. All operations performed
on a view actually affect the base table of the view. You can use
views in almost the same way as tables. You can query, update, insert
into, and delete from views, just as you can standard tables.
Such a view into which you can update or insert are fondly named as "Updatable and Insertable Views". Oracle documentation about them is here.
Also, this is how the purpose of an "insert" statement is defined by Oracle:
Use the INSERT statement to add rows to a table, the base table of a
view, a partition of a partitioned table or a subpartition of a
composite-partitioned table, or an object table or the base table of
an object view.
Yes we can achieve the DML Operation in Views like belows:
Create or replace view emp_dept_join as Select d.department_id,
d.department_name, e.first_name, e.last_name from employees
e, departments d where e.department_id = d.department_id;
SQL>CREATE OR REPLACE TRIGGER insert_emp_dept
INSTEAD OF INSERT ON emp_dept_join DECLARE v_department_id departments.department_id%TYPE;
BEGIN
BEGIN
SELECT department_id INTO v_department_id
FROM departments
WHERE department_id = :new.department_id;
EXCEPTION
WHEN NO_DATA_FOUND THEN
INSERT INTO departments (department_id, department_name)
VALUES (dept_sequence.nextval, :new.department_name)
RETURNING ID INTO v_department_id;
END;
INSERT INTO employees (employee_id, first_name, last_name, department_id)
VALUES(emp_sequence.nextval, :new.first_name, :new.last_name, v_department_id);
END insert_emp_dept;
/
if the viwe is defined through a simple query involving single base relation and either containing primary key or candidate key, so there will be change in base relation if changing the view. ( however there is restriction)
And updates are not allowed through view if there is multiple base relations or grouping operations.

T-SQL - wrong query execution plan behaviour

One of our queries degraded after generating load on the DB.
Our query is a join between 3 tables:
Base table which contain 10 M rows.
EventPerson table which contain 5000 rows.
EventPerson788 which is empty.
It seems that the optimizer scans the index on the EventPerson instead of seek, this the script for replicating the issue:
--Create Tables
CREATE TABLE [dbo].[BASE](
[ID] [bigint] NOT NULL,
[IsActive] BIT
PRIMARY KEY CLUSTERED ([ID] ASC)
)ON [PRIMARY]
GO
CREATE TABLE [dbo].[EventPerson](
[DUID] [bigint] NOT NULL,
[PersonInvolvedID] [bigint] NULL,
PRIMARY KEY CLUSTERED ([DUID] ASC)
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [EventPerson_IDX] ON [dbo].[EventPerson]
(
[PersonInvolvedID] ASC
)
CREATE TABLE [dbo].[EventPerson788](
[EntryID] [bigint] NOT NULL,
[LinkedSuspectID] [bigint] NULL,
[sourceid] [bigint] NULL,
PRIMARY KEY CLUSTERED ([EntryID] ASC)
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[EventPerson788] WITH CHECK
ADD CONSTRAINT [FK7A34153D3720F84A]
FOREIGN KEY([sourceid]) REFERENCES [dbo].[EventPerson] ([DUID])
GO
ALTER TABLE [dbo].[EventPerson788] CHECK CONSTRAINT [FK7A34153D3720F84A]
GO
CREATE NONCLUSTERED INDEX [EventPerson788_IDX]
ON [dbo].[EventPerson788] ([LinkedSuspectID] ASC)
GO
--POPOLATE BASE TABLE
DECLARE #I BIGINT=1
WHILE (#I<10000000)
BEGIN
begin transaction
INSERT INTO BASE(ID) VALUES(#I)
SET #I+=1
if (#I%10000=0 )
begin
commit;
end;
END
go
--POPOLATE EventPerson TABLE
DECLARE #I BIGINT=1
WHILE (#I<5000)
BEGIN
BEGIN TRANSACTION
INSERT INTO EventPerson(DUID,PersonInvolvedID) VALUES(#I,(SELECT TOP 1 ID FROM BASE ORDER BY NEWID()))
SET #I+=1
IF(#I%10000=0 )
COMMIT TRANSACTION ;
END
GO
This the query :
select
count(EventPerson.DUID)
from
EventPerson
inner loop join
Base on EventPerson.DUID = base.ID
left outer join
EventPerson788 on EventPerson.DUID = EventPerson788.sourceid
where
(EventPerson.PersonInvolvedID = 37909 or
EventPerson788.LinkedSuspectID = 37909)
AND BASE.IsActive = 1
Do you have any idea why the optimizer decides to use index scan instead of index seek?
Workaround that already done :
Analyze tables and build statistics.
Rebuild Indices.
Try the FORCESEEK hint
None of the above persuaded the optimizer to run an index seek on EventPerson and seek on the base tables.
Thanks for your help .
The scan is there because of the or condition and the outer join against EventPerson788.
Either it will return rows from EventPerson when EventPerson.PersonInvolvedID = 37909 or when the there exists rows in EventPerson788 where EventPerson788.LinkedSuspectID = 37909. The last part means that every row in EventPerson has to be checked against the join.
The fact that EventPerson788 is empty can not be used by the query optimizer since the query plan is saved to be reused later when there might be matching rows in EventPerson788.
Update:
You can rewrite your query using a union all instead of or to get a seek in EventPerson.
select count(EventPerson.DUID)
from
(
select EventPerson.DUID
from EventPerson
where EventPerson.PersonInvolvedID = 1556 and
not exists (select *
from EventPerson788
where EventPerson788.LinkedSuspectID = 1556)
union all
select EventPerson788.sourceid
from EventPerson788
where EventPerson788.LinkedSuspectID = 1556
) as EventPerson
inner join BASE
on EventPerson.DUID=base.ID
where
BASE.IsActive=1
Well, you're asking SQL Server to count the rows of the EventPerson table - so why do you expect a seek to be better than a scan here?
For a COUNT, the SQL Server optimizer will almost always use a scan - it needs to count the rows, after all - all of them... it will do a clustered index scan, if no other non-nullable columns are indexed.
If you have an index on a small, non-nullable column (e.g. on a ID INT or something like that), it would probably do a scan on that index instead (less data to read to count all rows).
But in general: seek is great for selecting one or a few rows - but it sucks if you're dealing with all rows (like for a count)
You can easily observe this behavior if you're using the AdventureWorks sample database.
When doing a COUNT(*) on the Sales.SalesOrderDetail table which has over 120000 rows like this:
SELECT COUNT(*) FROM Sales.SalesOrderDetail
then you'll get an index scan on IX_SalesOrderDetail_ProductID - it just doesn't pay off to do seeks on over 120000 entries!
However, if you do the same operation on a smaller set of data, like this:
SELECT COUNT(*) FROM Sales.SalesOrderDetail
WHERE ProductID = 897
then you get back 2 rows out of all of them - and SQL Server will now use an index seek on that same index.

Oracle: use index for searching null values

I've done some search but I prefer something like an hint or similar
http://www.dba-oracle.com/oracle_tips_null_idx.htm
http://www.oracloid.com/2006/05/using-index-for-is-null/
What about a functional index using NVL2, like;
CREATE TABLE foo (bar INTEGER);
INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (NULL);
CREATE INDEX baz ON foo (NVL2(bar,0,1));
and then;
DELETE plan_table;
EXPLAIN PLAN FOR SELECT * FROM foo WHERE NVL2(bar,0,1) = 1;
SELECT operation, object_name FROM plan_table;
should give you
OPERATION OBJECT_NAME
---------------- -----------
SELECT STATEMENT
TABLE ACCESS FOO
INDEX BAZ << yep
If you're asking, "How can I create an index that would allow it to be used when searching for NULL values on a particular field", my suggestion is to create an index on the field you're interested in PLUS the primary key field(s). Thus, if you've got a table called A_TABLE, with field VAL that you want to search for NULLs, and a primary key named PK, I'd create an index on (VAL, PK).
Share and enjoy.
I'm going to "answer" the non-question above.
The articles you link to are kinda right - Oracle's b-tree indexes will not capture when the leaf nodes are null. Take this example:
CREATE TABLE MYTABLE (
ID NUMBER(8) NOT NULL,
DAT VARCHAR2(100)
);
CREATE INDEX MYTABLE_IDX_1 ON MYTABLE (DAT);
/* Perform inserts into MYTABLE where some DAT are null */
SELECT COUNT(*) FROM MYTABLE WHERE DAT IS NULL;
The ending SELECT will not be able to use the index, because the leafs (right-most column) will not capture the nulls. Burleson's solution is stupid, because now you have to use a NVL in all your queries and have compromised the data in the tables. Gorbachev's method includes a known NOT NULL column for the leaves of the b-tree, but this expands the index for no reason. Maybe in his case the index made sense that way for tuning other queries, but if all you want to do is find the NULLs then the easiest solution is to make the leaf a constant.
CREATE INDEX MYTABLE_IDX_1 ON MYTABLE (DAT, 1);
Now, the leaves are all the constant (1), and by default the nulls will all be together (either at the top or bottom of the index, but it doesn't really matter as Oracle can use the index forwards or backwards). There is a slight storage penalty for that constant, but a single number is smaller than most other data fields in a typical table. Now the database can use the index when querying for nulls...if the optimizer finds that the best way to get the data.

Oracle: Insertion on an indexed table, avoiding duplicates. Looking for tips and advice

Im looking for the best solution (performance wise) to achieve this.
I have to insert records into a table, avoiding duplicates.
For example, take table A
Insert into A (
Select DISTINCT [FIELDS] from B,C,D..
WHERE (JOIN CONDITIONS ON B,C,D..)
AND
NOT EXISTS
(
SELECT * FROM A ATMP WHERE
ATMP.SOMEKEY = A.SOMEKEY
)
);
I have an index over A.SOMEKEY, just to optimize the NOT EXISTS query, but i realize that inserting on an indexed table will be a performance hit.
So I was thinking of duplicating Table A in a Global Temporary Table, where I would keep the index. Then, removing the index from Table A and executing the query, but modified
Insert into A (
Select DISTINCT [FIELDS] from B,C,D..
WHERE (JOIN CONDITIONS ON B,C,D..)
AND
NOT EXISTS
(
SELECT * FROM GLOBAL_TEMPORARY_TABLE_A ATMP WHERE
ATMP.SOMEKEY = A.SOMEKEY
)
);
This would solve the "inserting on an index table", but I would have to update the Global Temporary A with each insertion I make.
I'm kind of lost here,
Is there a better way to achieve this?
Thanks in advance,
if the column A.SOMEKEY is declared NOT NULL and if you insert a large amound of data, a NOT IN clause might be more efficient than your NOT EXISTS since it will be able to use a HASH ANTI-JOIN.
INSERT INTO A
(SELECT DISTINCT FIELDS
FROM B, C, D ..
WHERE (JOIN CONDITIONS ON B, C, D..)
AND [B].SOMEKEY NOT IN (SELECT SOMEKEY FROM A)
AND [B].SOMEKEY IS NOT NULL;
HASH ANTI-JOINS are brutally efficient with large data sets.
I don't think the temporary table is a good idea in that case because you will be in one of these two cases:
the temporary table is indexed on SOMEKEY, your point about inserting into an indexed table being therefore moot
the temporary table is unindexed and your anti-join will be inefficient
Which method is the most efficient will probably depends upon the volume of data.
How about having the index on the table A.
create table b (same structure as table a) with NOLOGGING
Insert /*+APPEND */ into b (
Select DISTINCT [FIELDS] from B,C,D..
WHERE (JOIN CONDITIONS ON B,C,D..)
AND
NOT EXISTS
(
SELECT * FROM A ATMP WHERE
ATMP.SOMEKEY = A.SOMEKEY
)
);
Then drop the index on A and INSERT INTO A SELECT * FROM B
You could make B a global temporary table, but make sure that the data is persistent for the session as dropping the index will implictly commit.

Resources