Wrong index is chosen by Oracle

I have an indexing problem in Oracle. I will try to explain it with an example.
I have a table TABLE1 with columns A, B, C, D
and another table TABLE2 with columns A, B, C, E, F, H.
I have created these indexes on TABLE1:
IX_1 A
IX_2 A,B
IX_3 A,C
IX_4 A,B,C
I have created these indexes on TABLE2:
IY_1 A,B,C
IY_2 A
When I run a query similar to this:
SELECT * FROM TABLE1 T1,TABLE2 T2
WHERE T1.A=T2.A
the explain plan shows that it uses neither IX_1 nor IY_2.
It uses IX_4 and IY_1 instead.
Why is it not picking the right index?
EDITED:
Can anyone explain the difference between INDEX RANGE SCAN, INDEX UNIQUE SCAN, and INDEX SKIP SCAN?
I guess SKIP SCAN means that Oracle skips a column of a composite index.
I have no idea about the others.

The main benefit of an index is that you can select a few rows from a table without scanning the entire table.
If you ask for too many rows (say 30%; it depends on many things), the engine will prefer to scan the entire table for those rows.
That's because reading a row through an index carries an overhead: reading some index blocks and, after that, reading the table blocks.
In your case, in order to join tables T1 and T2, Oracle needs all the rows from both tables. Fully reading the index would be a useless operation that adds unnecessary cost.
UPDATE: A step further: if you run:
SELECT T1.B, T2.B FROM TABLE1 T1,TABLE2 T2
WHERE T1.A=T2.A
Oracle will probably use indexes that contain both A and B (IX_2 on TABLE1 and IY_1 on TABLE2), because it then does not need to read anything from the tables: the values T1.B and T2.B are in the indexes.
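The idea that a query can be answered from an index alone can be tried out in a small sketch. It uses Python's built-in sqlite3 module as a stand-in, not Oracle, so the plan text and the optimizer's choices will differ from what Oracle shows; the table and index names mirror the question and the sample rows are made up.

```python
import sqlite3

# SQLite stand-in for the Oracle tables in the question (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.execute("CREATE INDEX ix_2 ON table1 (a, b)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(i % 50, i, i * 2, i * 3) for i in range(1000)],  # made-up rows
)


def plan(sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)


# Only columns a and b are needed, and both live in ix_2.
covered_plan = plan("SELECT b FROM table1 WHERE a = 7")
# SELECT * also needs c and d, which are only in the table.
star_plan = plan("SELECT * FROM table1 WHERE a = 7")

print(covered_plan)
print(star_plan)
```

In SQLite the first plan reports a covering index, meaning the table itself is never read; the second cannot, because columns c and d are only stored in the table.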

Related

How can I merge two tables using ROWID in oracle?

I know that ROWID is distinct for each row in different tables. But I have seen somewhere that two tables were being merged using rowid, so I tried it as well, and I am getting blank output.
I have a person table which looks as:
scrowid is the column which contains the rowid, added as:
alter table ot.person
add scrowid VARCHAR2(200) PRIMARY KEY;
I populated this person table as:
insert into ot.person(id,name,age,scrowid)
select id,name, age,a.rowid from ot.per a;
After this I also created another table ot.temp_person by the same steps. Both tables have the same structure and datatypes. I wanted to compare them using an inner join, so I tried:
select * from ot.person p inner join ot.temp_person tp ON p.scrowid=tp.scrowid
I got an empty result:
Is there any possible way to merge two tables using rowid, or have I forgotten some steps? If there is a way to join these two tables using rowid, please suggest it.
Define scrowid as datatype ROWID or UROWID; then it may work.
However, in general the ROWID may change at any time unless you lock the record, so it would be a poor key for joining your tables.
I think perhaps you misunderstood the merging of two tables via rowid, unless what you actually saw was a union, cross join, or full outer join. Any attempt to match on rowid, regardless of how you define it, is doomed to fail, because rowid is an internal definition. Rowid is not just a data type; it is an internal structure. (That is an older version of the description, but Oracle doesn't link documentation versions.) Its fields are basically:
- The data object number of the object
- The data block in the datafile in which the row resides
- The position of the row in the data block (first row is 0)
- The datafile in which the row resides (first file is 1). The file number is relative to the tablespace.
So while it's possible for rows in different tables to have the same rowid, it would be extremely unlikely, which is why an inner join on them practically always returns no rows.

Table joins with between clause performances

I want to fetch a column from one table using a BETWEEN condition, as in the query below. I joined the tables, but it takes a lot of time when the tables have 100k records. Is there any way to rewrite this?
I need a.grade in the result if my value lies between a.low and a.high. There can be many matches for one value.
Select a.grade, b.val
From tbl1 a, tbl2 b
Where b.val between a.low and a.high;
I also have an index on (low, high), but the optimiser is not using it.

Oracle Partition pruning not happening

I have a fact table with millions of records. The table is range partitioned on a date column.
FACT_AUM (ACCOUNT_ID VARCHAR2(30),MARKET_VALUE NUMBER(20,6), POSTING_DATE DATE);
I have another temp table
ACCOUNT_TMP (ACCOUNT_ID VARCHAR2(30), POSTING_DATE DATE);
When I run this query with a hard-coded date, I see that partition pruning happens and the results come back quickly:
SELECT A.ACCOUNT_ID, SUM(A.MARKET_VALUE) FROM
FACT_AUM A JOIN ACCOUNT_TMP B ON A.ACCOUNT_ID = B.ACCOUNT_ID
AND A.POSTING_DATE=TO_DATE('30-DEC-2016','DD-MON-YYYY') GROUP BY
A.ACCOUNT_ID;
When I run the following, I don't see partition pruning and the query keeps spinning:
SELECT A.ACCOUNT_ID, SUM(A.MARKET_VALUE) FROM
FACT_AUM A JOIN ACCOUNT_TMP B ON A.ACCOUNT_ID = B.ACCOUNT_ID
AND A.POSTING_DATE = B.POSTING_DATE GROUP BY
A.ACCOUNT_ID;
Any insights on this would be helpful.
Oracle used partition pruning when you hard-coded the value because it knew which partition to read and expected a benefit from pruning.
When you joined the fact table with your temporary (I would call it staging) table, Oracle could not tell in advance which partitions it would have to hit to compute the answer. Note that Oracle will assess the range of values available in the staging table.
But unless you provide statistics for the tables involved, I can't delve into the more important topics of table ordering and join methods. For a quick fix, use an ORDERED hint or a nested-loop (USE_NL) hint.

How can I speed up a diff between tables?

I am working on a diff between tables in PostgreSQL. It takes a long time, as each table is ~13GB.
My current query is:
SELECT * FROM tableA EXCEPT SELECT * FROM tableB;
and
SELECT * FROM tableB EXCEPT SELECT * FROM tableA;
When I do a diff on the two (unindexed) tables, it takes 1:40 hours (1 hour and 40 minutes). To get both the new and the removed rows I need to run the query twice, bringing the total time to about 3:20 hours.
I ran the Postgresql EXPLAIN query on it to see what it was doing. It looks like it is sorting the first table, then the second, then comparing them. Well that made me think that if I indexed the tables they would be presorted and the diff query would be much faster.
Indexing each table took 45 minutes. Once indexed, each diff took 1:35 hours.
Why do the indexes shave only 5 minutes off each diff? I would have assumed more than half, since in the unindexed queries I am sorting each table twice (I need to run the query twice).
Since one of these tables will not be changing much, it will only need to be indexed once; the other will be updated daily. So the total runtime for the indexed method is 45 minutes for the index plus 2 x 1:35 for the diff, giving a total of 3:55 hours, almost 4 hours.
What am I doing wrong here? I can't see why, with the index, my net diff time is larger than without it.
This is in slight reference to my other question here: Postgresql UNION takes 10 times as long as running the individual queries
EDIT:
Here is the schema for the two tables, they are identical except the table name.
CREATE TABLE bulk.blue
(
"partA" text NOT NULL,
"type" text NOT NULL,
"partB" text NOT NULL
)
WITH (
OIDS=FALSE
);
In the statements above you are not using the indexes.
You could do something like:
SELECT * FROM tableA a
FULL OUTER JOIN tableB b ON a.someID = b.someID
You could then use the same statement to show which rows are missing from either table:
SELECT * FROM tableA a
FULL OUTER JOIN tableB b ON a.someID = b.someID
WHERE a.someID IS NULL OR b.someID IS NULL
This should give you the rows that are missing from table A or table B.
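A minimal sketch of the key-based diff, assuming a column someID that uniquely identifies a row (the schema in the question has no such column, so this is hypothetical). It uses Python's built-in sqlite3 as a stand-in for PostgreSQL and two LEFT JOIN anti-joins instead of one FULL OUTER JOIN, since older SQLite versions lack the latter; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (someID INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE tableB (someID INTEGER PRIMARY KEY, payload TEXT);
    INSERT INTO tableA VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO tableB VALUES (2, 'y'), (3, 'z'), (4, 'w');
""")

# Keys present in tableA but not in tableB.
only_in_a = conn.execute("""
    SELECT a.someID FROM tableA a
    LEFT JOIN tableB b ON a.someID = b.someID
    WHERE b.someID IS NULL
""").fetchall()

# Keys present in tableB but not in tableA.
only_in_b = conn.execute("""
    SELECT b.someID FROM tableB b
    LEFT JOIN tableA a ON a.someID = b.someID
    WHERE a.someID IS NULL
""").fetchall()

# The whole-row comparison from the question, for reference.
except_a = conn.execute(
    "SELECT * FROM tableA EXCEPT SELECT * FROM tableB"
).fetchall()

print(only_in_a, only_in_b, except_a)
```

Note that the keyed join only finds rows whose key is missing on one side. A row whose key matches but whose other columns differ is reported by EXCEPT and not by the join, so the two approaches are equivalent only when the key determines the whole row.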
Confirm your indexes are being used (they likely are not in such a generic EXCEPT statement). You are not joining on specified columns, so that lack of an explicit join will likely not make for an optimized query:
http://www.postgresql.org/docs/9.0/static/indexes-examine.html
This will help you view the explain analyze more clearly:
http://explain.depesz.com
Also, make sure you run ANALYZE on the table after you create the index if you want it to perform well right away.
The queries as specified require a comparison of every column of the tables.
For example if tableA and tableB each have five columns then the query is having to compare tableA.col1 to tableB.col1, tableA.col2 to tableB.col2, . . . tableA.col5 to tableB.col5
If just a few columns uniquely identify a record, rather than all the columns in the table, then joining the tables on those specific columns will improve your performance.
The above statement assumes that a primary key has not been created. If a primary key has been defined to indicate which columns uniquely identify a record, then I believe the EXCEPT statement would take that into consideration.
What kind of index did you apply? Indexes are mainly useful for speeding up WHERE conditions. If you're doing a SELECT *, you're grabbing all the fields, and the index is probably not doing anything except taking up space and adding a little more processing behind the scenes as the db-engine compares the query to the index cache.
Instead of SELECT *, you can try selecting your unique fields and creating an index on those unique fields.
You can also use an OUTER JOIN to show results from both tables that did not match on the unique fields
You may also want to consider clustering your tables.
What version of Postgres are you running?
When was the last time you vacuumed?
Other than the above, 13GB is pretty large, so you'll want to check your config settings. It shouldn't take hours to run that, unless you don't have enough memory on your system.

Oracle 10g - optimize WHERE IS NOT NULL

We have Oracle 10g and we need to query 1 table (no joins) and filter out rows where 1 of the columns is null. When we do this - WHERE OurColumn IS NOT NULL - we get a full table scan on a very large table - BAD BAD BAD. The column has an index on it but it gets ignored in this instance. Are there any solutions to this?
Thanks
The optimizer thinks that the full table scan will be better.
If there are just a few NULL rows, the optimizer is right.
If you are absolutely sure that the index access will be faster (that is, you have more than 75% rows with col1 IS NULL), then hint your query:
SELECT /*+ INDEX (t index_name_on_col1) */
*
FROM mytable t
WHERE col1 IS NOT NULL
Why 75%?
Because using INDEX SCAN to retrieve values not covered by the index implies a hidden join on ROWID, which costs about 4 times as much as table scan.
If the index range includes more than 25% of rows, the table scan is usually faster.
As mentioned by Tony Andrews, the clustering factor is a more accurate way to measure this, but 25% is still a good rule of thumb.
The optimiser will make its decision based on the relative cost of the full table scan and using the index. This mainly comes down to how many blocks will have to be read to satisfy the query. The 25%/75% rule of thumb mentioned in another answer is simplistic: in some cases a full table scan will make sense even to get 1% of the rows - i.e. if those rows happen to be spread around many blocks.
For example, consider this table:
SQL> create table t1 as select object_id, object_name from all_objects;
Table created.
SQL> alter table t1 modify object_id null;
Table altered.
SQL> update t1 set object_id = null
2 where mod(object_id,100) != 0
3 /
84558 rows updated.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> select count(*) from t1 where object_id is not null;
COUNT(*)
----------
861
As you can see, only approximately 1% of the rows in T1 have a non-null object_id. But due to the way I built the table, these 861 rows will be spread more or less evenly around the table. Therefore, the query:
select * from t1 where object_id is not null;
is likely to visit almost every block in T1 to get data, even if the optimiser used the index. It makes sense then to dispense with the index and go for a full table scan!
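The block arithmetic behind this can be sketched in a few lines of Python. The total row count comes from the transcript (84558 updated plus 861 kept); the figure of 185 rows per block is an assumption chosen for illustration, and the model pretends that rows are stored in object_id order with consecutive ids, which the real ALL_OBJECTS data only approximates.

```python
TOTAL_ROWS = 85419     # 84558 rows set to NULL + 861 rows left non-null
ROWS_PER_BLOCK = 185   # assumed for illustration; not stated in the original
KEEP_EVERY = 100       # rows with mod(object_id, 100) = 0 keep their value

# Ceiling division: number of blocks needed to hold every row.
total_blocks = -(-TOTAL_ROWS // ROWS_PER_BLOCK)

# Simplified model: row i lives in block i // ROWS_PER_BLOCK.
kept_rows = range(0, TOTAL_ROWS, KEEP_EVERY)
touched_blocks = {row // ROWS_PER_BLOCK for row in kept_rows}

print(len(kept_rows), "of", TOTAL_ROWS, "rows are non-null")
print(len(touched_blocks), "of", total_blocks, "blocks hold at least one of them")
```

Under these assumptions every block contains at least one non-null row, because a block holds more rows than the spacing between kept rows. Fetching about 1% of the rows through the index would therefore still read every table block, in addition to the index blocks.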
A key statistic to help identify this situation is the index clustering factor:
SQL> select clustering_factor from user_indexes where index_name='T1_IDX';
CLUSTERING_FACTOR
-----------------
460
This value 460 is quite high (compared to the 861 rows in the index), and suggests that a full table scan will be used. See this DBAZine article on clustering factors.
If you are doing a select *, then it would make sense to do a table scan rather than use the index. If you know which columns you are interested in, you could create a covering index on those columns plus the one to which you are applying the IS NOT NULL condition.
It can depend on the type of index you have on the table.
Most B-tree indexes do not store null entries. Bitmap indexes do store null entries.
So, if you have:
select * from mytable
where mycolumn is null
and you have a standard B-tree index on mycolumn, then the query can't use the index as the "null" isn't in the index.
(If the index is against multiple columns, and one of the indexed columns is not null then there will be an entry in the index.)
Create an index on that column.
To make sure the index is used, it should cover that column and the other columns in the WHERE clause.
ocdecio answered:
If you are doing a select *, then it would make sense to do a table scan rather than using the index.
That's not strictly true; an index will be used if there is an index that fits your where clause, and the query optimizer decides using that index would be faster than doing a table scan. If there is no index, or no suitable index, only then must a table scan be done.
It's also worth checking whether Oracle's statistics on the table are up to date. It may not know that a full table scan will be slower.
Oracle Database doesn't index null values at all in regular (B-tree) indexes, so it can't use the index for them, nor can you force it to.
BR
Using hints should be done only as a work around rather than a solution.
As mentioned in other answers, the null value is not available in B-TREE indexes.
Since you know that you have mostly null values in this column, would you be able to replace the IS NOT NULL test with a range, for instance?
That really depends on your column and the nature of your data but typically, if your column is a date type for instance:
where mydatecolumn is not null
can be translated into a rule saying: I want all rows which have a date.
Then you can most definitely do this:
where mydatecolumn <=sysdate (in oracle)
This will return all rows with a date and omit null values, while taking advantage of the index on that column without using any hints.
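A small sketch of how the range predicate compares with IS NOT NULL, using Python's built-in sqlite3 as a stand-in for Oracle. The table t, the sample dates, and the fixed value of today (standing in for sysdate) are all made up for illustration. It checks only which rows come back, not whether an index is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, mydatecolumn TEXT)")
conn.executemany(
    "INSERT INTO t (mydatecolumn) VALUES (?)",
    [("2001-05-17",), (None,), ("2010-01-01",), (None,), ("2999-12-31",)],
)

today = "2024-01-01"  # fixed stand-in for sysdate (assumption)

not_null = conn.execute(
    "SELECT id FROM t WHERE mydatecolumn IS NOT NULL ORDER BY id"
).fetchall()

# A comparison with NULL is never true, so NULL rows drop out here too.
up_to_today = conn.execute(
    "SELECT id FROM t WHERE mydatecolumn <= ? ORDER BY id", (today,)
).fetchall()

print(not_null)
print(up_to_today)
```

The range predicate drops the NULL rows as intended, but it also drops the row dated in the future. The two conditions are interchangeable only if the column can never hold a value beyond the chosen upper bound.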
See http://www.oracloid.com/2006/05/using-index-for-is-null/
If your index is on one single field, it will NOT be used. Try to add a dummy field or a constant in the index:
create index tind on t(field_to_index, 1);

Resources