Oracle SQL Query Performance, Function-based Indexes

I have been trying to fine-tune a SQL query that takes 1.5 hours to process approximately 4,000 error records, and the run time increases with the number of rows.
I figured out that there is one condition in my SQL that is actually causing the issue:
AND (DECODE(aia.doc_sequence_value,
            NULL, DECODE(aia.voucher_num,
                         NULL, SUBSTR(aia.invoice_num, 1, 10),
                         aia.voucher_num),
            aia.doc_sequence_value) || '_' ||
     aila.line_number || '_' ||
     aida.distribution_line_number || '_' ||
     DECODE(aca.doc_sequence_value,
            NULL, DECODE(aca.check_voucher_num,
                         NULL, SUBSTR(aca.check_number, 1, 10),
                         aca.check_voucher_num),
            aca.doc_sequence_value)) = P_ID
(P_ID is a value from the first cursor's SQL.)
(Note that these are standard Oracle Applications (ERP) invoice tables.)
The P_ID column in the staging table is derived the same way as the derivation above and is compared here again in the second SQL to get the latest data for that record (basically reprocessing the error records). The value of P_ID is something like "999703_1_1_9995248".
Q1) Can I create a function-based index on the whole left-side derivation? If so, what is the syntax?
Q2) Would it be okay, or against Oracle's standard rules, to create a function-based index on standard Oracle tables? (Not modifying the tables themselves, just indexing them.)
Q3) If not, what is the best approach to solve this issue?

Briefly, no, you can't place a function-based index on that expression, because the input values are derived from four different tables (or table aliases).
What you might look into is a materialised view, but that's a big and potentially troublesome tool with which to solve a single query optimisation problem.
You might instead investigate decomposing the string "999703_1_1_9995248" and applying the relevant parts to the separate expressions:
DECODE(aia.doc_sequence_value,
       NULL, DECODE(aia.voucher_num,
                    NULL, SUBSTR(aia.invoice_num, 1, 10),
                    aia.voucher_num),
       aia.doc_sequence_value) = '999703' and
aila.line_number = '1' and
aida.distribution_line_number = '1' and
DECODE(aca.doc_sequence_value,
       NULL, DECODE(aca.check_voucher_num,
                    NULL, SUBSTR(aca.check_number, 1, 10),
                    aca.check_voucher_num),
       aca.doc_sequence_value) = '9995248'
Then you can use indexes on the expressions and columns.
You could separate the four components of the P_ID value using regular expressions, or a combination of InStr() and SubStr().
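For illustration, one way to peel off the four parts with REGEXP_SUBSTR (the :p_id bind name is hypothetical):
-- Hypothetical sketch: split a P_ID such as '999703_1_1_9995248' into its
-- four underscore-separated components (Oracle 10g+ regular expressions).
SELECT REGEXP_SUBSTR(:p_id, '[^_]+', 1, 1) AS invoice_part,   -- '999703'
       REGEXP_SUBSTR(:p_id, '[^_]+', 1, 2) AS line_part,      -- '1'
       REGEXP_SUBSTR(:p_id, '[^_]+', 1, 3) AS dist_line_part, -- '1'
       REGEXP_SUBSTR(:p_id, '[^_]+', 1, 4) AS check_part      -- '9995248'
FROM dual;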

Ad 1) Based on the SQL you've posted, you cannot create a function-based index on that expression. The reason is that a function-based index must be:
Deterministic, i.e. the function used in the index definition has to always return the same result for given input arguments, and
Defined only on columns from the table the index is created on. In your case (based on the aliases you're using) you have four tables (aia, aila, aida, aca).
Requirement #2 makes it impossible to build a function-based index for that expression.
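For reference, the syntax asked about in Q1 is only available when the whole expression reads from a single table. Here is a sketch of what it would look like on the aia expression alone; the table name AP_INVOICES_ALL is an assumption based on the alias:
-- Assumes alias aia maps to AP_INVOICES_ALL; DECODE and SUBSTR are
-- deterministic built-ins, so the single-table expression is indexable.
CREATE INDEX xx_ap_invoices_fbi ON AP_INVOICES_ALL
  (DECODE(doc_sequence_value,
          NULL, DECODE(voucher_num,
                       NULL, SUBSTR(invoice_num, 1, 10),
                       voucher_num),
          doc_sequence_value));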

Related

Teradata 3848 and small int with COMPRESS

I am learning Teradata and have run into an issue when I UNION two queries. The error I run into is 3848, "The ORDER BY clause must contain only integer constants".
I checked the table definition, and all the columns I have been retrieving are SMALLINTs with identical definitions, except for one, which uses COMPRESS with a long series of consecutive numbers starting from 3.
SELECT
    COALESCE(ContractType, 'InvalidType') AS "Contract",
    COALESCE(ContractStatus, 'InvalidStatus') AS "Status",
    COUNT(ContractType) AS "Contract_Type_Count",
    COUNT(ContractStatus) AS "Contract_Status_Count",
    NULL AS "negCodeErr_count"
FROM fund_inventory_db.ContractDetail
GROUP BY CUBE (ContractType, ContractStatus)
UNION
SELECT
    NULL,
    NULL,
    NULL,
    NULL,
    COUNT(*)
FROM fund_inventory_db.ContractDetail
WHERE ContractSource = -2
ORDER BY ContractType, ContractStatus;
The definitions for all those fields look like this:
[...columnName...] SMALLINT NOT NULL DEFAULT 0
Except for one column, which is:
[...columnName...] SMALLINT NOT NULL DEFAULT 0 COMPRESS (3,4,5,6,7,8...)
Does using COMPRESS like this make it possible that they are not able to order normally? As in, if one column uses COMPRESS(3,4,5,6,7...) and the other either uses COMPRESS (1,2,3,4,5...) or does not use COMPRESS at all, would that make a difference?
This might be embarrassing, but is it actually possible to use a UNION where one of the queries is using CUBE()?
Sorry, this is all new to me and my mentor is moving a little fast! I sincerely appreciate your time.
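For what it's worth, the quoted message itself points at the likely fix: when an ORDER BY applies to a set-operator result such as this UNION, Teradata accepts only ordinal column positions, not column names. COMPRESS affects storage only, not ordering, and a UNION branch may legally use GROUP BY CUBE. A sketch of the corrected final clause, assuming the rest of the statement stays unchanged:
-- Replaces the last line of the query above;
-- 1 = "Contract" and 2 = "Status" in the UNION's result set.
ORDER BY 1, 2;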

Oracle fuzzy text search with wildcards

I've got an SAP Oracle database full of customer data.
In our custom CRM it is quite common to search for customers using wildcards. In addition to the SAP standard search, we would like to do some fuzzy text searching for names that are similar to the entered name.
Currently we're using the UTL_MATCH.EDIT_DISTANCE function to search for similar names. The only disadvantage is that it is not possible to use wildcard patterns with it.
Is there any possibility to use wildcards in combination with the UTL_MATCH.EDIT_DISTANCE function, or are there different (or even better) approaches to do that?
Let's say, there are the following names in the database:
PATRICK NOR
ORVILLE ALEX
OWEN TRISTAN
OKEN TRIST
The query could look like OKEN*IST* and both OWEN TRISTAN and OKEN TRIST should be returned. OKEN would be a 100% match and OWEN less.
My current test-query looks like:
SELECT gp.partner, gp.bu_sort1, UTL_MATCH.edit_distance(gp.bu_sort1, ?) AS edit_distance
FROM but000 gp
WHERE UTL_MATCH.edit_distance(gp.bu_sort1, ?) < 4
This query works fine except when wildcards (*) are used within the search string (which is quite common).
Beware of the implications of your approach in terms of performance. Even if it "functionally" worked, with UTL_MATCH you can only filter the results of an internal full table scan.
What you likely need is an index on such data.
Head to Oracle Text, the text-indexing capability of Oracle. Bear in mind that it requires some effort to put to work.
You might juggle with the fuzzy operator, but handle it with care. Most Oracle Text features are language-dependent (they take into account the English dictionary, German, etc.).
For instance:
-- create and populate the table
create table xxx_names (name varchar2(100));
insert into xxx_names(name) values('PATRICK NOR');
insert into xxx_names(name) values('ORVILLE ALEX');
insert into xxx_names(name) values('OWEN TRISTAN');
insert into xxx_names(name) values('OKEN TRIST');
insert into xxx_names(name) values('OKENOR SAD');
insert into xxx_names(name) values('OKENEAR TRUST');
--create the domain index
create index xxx_names_ctx on xxx_names(name) indextype is ctxsys.context;
This query would return results that you'd probably like (the input is the string "TRST"):
select
SCORE(1), name
from
xxx_names n
where
CONTAINS(n.name, 'definescore(fuzzy(TRST, 1, 6, weight),relevance)', 1) > 0
;
SCORE(1) NAME
---------- --------------------
1 OWEN TRISTAN
22 OKEN TRIST
But with the input string "IST" it would likely return nothing (in my case this is what it does).
Also note that, in general, inputs of fewer than 3 characters are considered non-matching by default.
You'll possibly get a more "predictable" outcome if you drop the "fuzzy" requirement and stick to finding rows that simply "contain" the exact sequence you passed in.
In this case try using a ctxcat index, which, by the way, supports some wildcards (warning: it supports multiple columns, but a column cannot exceed 30 chars in size!).
-- create and populate the table
--max length is 30 chars, otherwise the catsearch index can't be created
create table xxx_names (name varchar2(30));
insert into xxx_names(name) values('PATRICK NOR');
insert into xxx_names(name) values('ORVILLE ALEX');
insert into xxx_names(name) values('OWEN TRISTAN');
insert into xxx_names(name) values('OKEN TRIST');
insert into xxx_names(name) values('OKENOR SAD');
insert into xxx_names(name) values('OKENEAR TRUST');
begin
ctx_ddl.create_index_set('xxx_names_set');
ctx_ddl.add_index('xxx_names_set', 'name');
end;
/
drop index xxx_names_cat;
CREATE INDEX xxx_names_cat ON xxx_names(name) INDEXTYPE IS CTXSYS.CTXCAT
PARAMETERS ('index set xxx_names_set');
The latter, with this query, would work nicely (the input is "*TRIST*"):
select
UTL_MATCH.edit_distance(name, 'TRIST') dist,
name
from
xxx_names
where
catsearch(name, '*TRIST*', 'order by name desc') > 0
;
DIST NAME
---------- --------------------
7 OWEN TRISTAN
5 OKEN TRIST
But with the input "*O*TRIST*" wouldn't return anything (for some reasons).
Bottom line: text indexes are probably the only way to go (for performance) but you have to fiddle quite a bit to understand all the intricacies.
References:
fuzzy search: Oracle Text CONTAINS Query Operators
catsearch: Oracle Text SQL Statements and Operators
Assuming "wildcard" means an asterisk, you want a name that matches all specified letters to rank highest, with more specified letters matching better than less, otherwise rank by edit distance similarity.
using the placeholder ? for your search term, try this:
select *
from mytable
order by case
           when name like '%' || replace(?, '*', '%') || '%' then 0 - length(replace(?, '*', ''))
           else 100 - UTL_MATCH.edit_distance_similarity(?, name)
         end
fetch first 10 rows only
FYI all "like" matches have a negative number for their ordering with magnitude the number of letters specified. All like misses have a non-negative ordering number with magnitude of the percentage difference. In all cases, a lower number is a better match.

oracle not using defined indexes

As seen below, there is a simple join between my tables A and B.
In addition, there is a condition on each table, combined with an OR operator.
SELECT /*+ NO_EXPAND */ *
FROM IIndustrialCaseHistory B,
     IIndustrialCaseProduct A
WHERE (
        A.ProductId IN ('9_2') OR
        contains(B.KeyWords, '%some text goes here%') <= 0
      )
  AND (B.Id = A.IIndustrialCaseHistoryId)
There is a b-tree index defined on ProductId, and a function-based index for KeyWords.
But I don't know why my execution plan does not use these indexes and performs a full table access.
As I found at this URL, the NO_EXPAND optimization hint should lead to indexes being used in the execution plan (the NO_EXPAND hint prevents the cost-based optimizer from considering OR-expansion for queries having OR conditions or IN-lists in the WHERE clause). But I didn't see any use of the defined indexes.
What is Oracle's problem with my query?!
Unless there is something magical about the contains() function that I don't know about, Oracle cannot use an index to find a matching value that leads with a wildcard, i.e. a text string within a varchar2 column that does not start at the first position. [OR B.KeyWords LIKE '%some text goes here%' -- as opposed to -- OR B.KeyWords LIKE 'Some text starts here%', which is optimizable via an index.] The optimizer will default back to the full table scan in that case.
Also, although it may not be material, why use IN() if there is only one value in the list? Why not A.ProductId = '9_2'?
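A quick way to see the point about anchored versus unanchored patterns; the index name and the standalone LIKE queries below are illustrative, not from the question:
-- Illustrative b-tree index on the keyword column.
CREATE INDEX ix_hist_keywords ON IIndustrialCaseHistory (KeyWords);

-- Leading wildcard: the match can start anywhere in the string,
-- so the optimizer falls back to a full table scan.
SELECT * FROM IIndustrialCaseHistory WHERE KeyWords LIKE '%some text goes here%';

-- Anchored at position 1: an index range scan becomes possible.
SELECT * FROM IIndustrialCaseHistory WHERE KeyWords LIKE 'some text starts here%';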

Re-using a query result in a PL/SQL package

I have some trouble writing SQL queries. Inside a package function, I am trying to reuse the result of a query in two other queries. Here's how it goes:
My schema stores Requests. Each Request concerns multiple destinations, and each Request is detailed in another table (Request_Detail). Requests are identified by their Ids.
So I am using mainly three tables: one for the Requests, another for the destinations, and the last one for the details. Each of these tables is indexed by the Request_Id column.
The query I want to optimize is for when a user wants to find all requests, plus their destinations and commands, sent between two dates.
I want to query the Request_Table first to get all Request_Ids, then use this Request_Id list to query the Command table and the Destination one.
I couldn't find how to do that... I can't use ref cursors as they can't be fetched twice... I just need some array-like or column-like variable to store the Request_Ids, then use this variable twice or more...
Here are the original queries I would like to optimize:
FUNCTION EXTRACT_REQUEST_WITH_DATE (ze_from_date DATE, ze_to_date DATE, x_request_list OUT cursor_type, x_destination_list OUT cursor_type,
x_command_list OUT cursor_type) RETURN VARCHAR2 AS
my_function_id VARCHAR2(80) := PACKAGE_ID || '.EXTRACT_REQUEST_WITH_DATE';
my_return_code VARCHAR2(2);
BEGIN
OPEN x_request_list FOR
SELECT NAME,DESTINATION_TYPE,
SUCCESS_CNT, STATUS, STATUS_DESCRIPTION,
REQUEST_ID, PARENT_REQUEST_ID, DEDUPLICATION_ID, SUBMIT_DATE, LAST_UPDATE_DATE
FROM APP_DB.REQUEST_TABLE
WHERE SUBMIT_DATE >= ze_from_date
AND SUBMIT_DATE < ze_to_date
ORDER BY REQUEST_ID;
OPEN x_destination_list FOR
SELECT REQUEST_ID, DESTINATION_ID
FROM APP_DB.DESTINATION_TABLE
WHERE SUBMIT_DATE >= ze_from_date
AND SUBMIT_DATE < ze_to_date
ORDER BY REQUEST_ID;
OPEN x_command_list FOR
SELECT SEQUENCE_NUMBER, NAME, PARAMS, DESTINATION_ID,
SEND_DATE, LAST_UPDATE_DATE, PROCESS_CNT, STATUS, STATUS_DESCRIPTION,
VALIDITY_PERIOD, TO_ABORT_FLAG
FROM APP_DB.REQUEST_DETAILS_TABLE
WHERE SUBMIT_DATE >= ze_from_date
AND SUBMIT_DATE < ze_to_date
ORDER BY REQUEST_ID, DESTINATION_ID, SEQUENCE_NUMBER;
return RETURN_OK;
END EXTRACT_REQUEST_WITH_DATE;
As you can see, we use the same predicate (the SUBMIT_DATE conditions) in all three queries. I think there may be some way to optimize this by getting the REQUEST_IDs first and then using them in the remaining queries.
Thanks for hearing me out!
Based on the queries you posted I'd just add a SUBMIT_DATE index to REQUEST_TABLE, DESTINATION_TABLE and REQUEST_DETAILS_TABLE and leave your SQL as is. All three queries will be optimized and will run just as fast as matching against a table of REQUEST_ID values.
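A sketch of those three indexes; the index names are made up, while the tables and column come from the question:
CREATE INDEX request_submit_date_ix     ON APP_DB.REQUEST_TABLE (SUBMIT_DATE);
CREATE INDEX destination_submit_date_ix ON APP_DB.DESTINATION_TABLE (SUBMIT_DATE);
CREATE INDEX details_submit_date_ix     ON APP_DB.REQUEST_DETAILS_TABLE (SUBMIT_DATE);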
So...
I found this method, which seems to be efficient enough.
First, define global types to use as arrays. Here's the code:
Object (record) type:
create or replace
TYPE "GENERIC_ID" IS OBJECT(ID VARCHAR2(64));
Variable-size array of GENERIC_ID:
create or replace
TYPE "GENERIC_ID_ARRAY" IS TABLE OF "GENERIC_ID";
Then populating is done via extend() in a FOR LOOP. The resulting array can be used as a table in SQL queries, using:
TABLE(CAST(my_array_of_ids AS GENERIC_ID_ARRAY))
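Putting the pieces together, a minimal end-to-end sketch might look like this (the date window and the processing bodies are placeholders):
DECLARE
  my_ids GENERIC_ID_ARRAY := GENERIC_ID_ARRAY(); -- start with an empty nested table
BEGIN
  -- Populate the array: extend() once per fetched id.
  FOR rec IN (SELECT REQUEST_ID
                FROM APP_DB.REQUEST_TABLE
               WHERE SUBMIT_DATE >= SYSDATE - 7   -- placeholder window
                 AND SUBMIT_DATE <  SYSDATE) LOOP
    my_ids.EXTEND;
    my_ids(my_ids.LAST) := GENERIC_ID(rec.REQUEST_ID);
  END LOOP;

  -- Reuse the same id list in as many queries as needed.
  FOR rec IN (SELECT d.REQUEST_ID, d.DESTINATION_ID
                FROM APP_DB.DESTINATION_TABLE d
               WHERE d.REQUEST_ID IN
                     (SELECT t.ID
                        FROM TABLE(CAST(my_ids AS GENERIC_ID_ARRAY)) t)) LOOP
    NULL; -- process each destination row here
  END LOOP;
END;
/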
Thanks,

How to write the following pl/sql block without using Cursor?

I had written a cursor in a PL/SQL block. This block takes a lot of time when there are many records.
How can I write this without a cursor, or is there any other alternative that will reduce the time?
Is there any way to insert into one table and delete from another table using a single query?
DECLARE
  MDLCursor SYS_REFCURSOR;
  -- Declarations below were missing from the posted block; types are
  -- anchored to the source columns via %TYPE.
  v_mdldest_id       DialCodes.dest_id%TYPE;
  v_mdldigits        DialCodes.digits%TYPE;
  v_mdlEffectiveDate DialCodes.Effectivedate%TYPE;
  v_mdlExpDate       DialCodes.expirydate%TYPE;
  v_CallType_ID      sysmdl_calltypes.call_type_id%TYPE; -- populated elsewhere in the real code
BEGIN
open MDLCursor for
select dc.dest_id, dc.digits, dc.Effectivedate, dc.expirydate
from DialCodes dc
INNER JOIN MDL d
ON dc.Dest_ID = d.Dest_ID
AND d.PriceEntity = 1
join sysmdl_calltypes s
on s.call_type_id = v_CallType_ID
and s.dest_id = dc.Dest_ID
and s.call_type_id not in
(select calltype_id from ignore_calltype_for_routing)
order by length(dc.digits) desc, dc.digits desc;
loop
    fetch MDLCursor
      into v_mdldest_id, v_mdldigits, v_mdlEffectiveDate, v_mdlExpDate;
    exit when MDLCursor%NOTFOUND; -- check immediately after the fetch
insert into tt_pendingcost_temp
(Dest_ID,
Digits,
CCASDigits,
Destination,
tariff_id,
NewCost,
Effectivedate,
ExpiryDate,
previous,
Currency)
select v_mdldest_id,
Digits,
v_mdldigits,
Destination,
tariff_id,
NewCost,
Effectivedate,
ExpiryDate,
previous,
Currency
FROM tt_PendingCost
where substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
and instr(Digits, v_MDLDigits) = 1
and v_mdlEffectiveDate <= effectivedate
and (v_mdlExpDate > effectivedate or v_mdlExpDate is null);
if SQL%ROWCOUNT > 0 then
delete FROM tt_PendingCost
where substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
and instr(Digits, v_MDLDigits) = 1
and v_mdlEffectiveDate <= effectivedate
and (v_mdlExpDate > effectivedate or v_mdlExpDate is null);
end if;
end loop;
close MDLCursor;
END;
I don't have your tables and your data so I can only guess at a couple of things that would be slowing you down.
Firstly, the query used in your cursor has an ORDER BY clause in it. If this query returns a lot of rows, Oracle has to fetch them all and sort them all before it can return the first row. If this query typically returns a lot of results, and you don't particularly need it to return sorted results, you may find your PL/SQL block speeds up a bit if you drop the ORDER BY. That way, you can start getting results out of the cursor without needing to fetch all the results, store them somewhere and sort them first.
Secondly, the following is the WHERE clause used in your INSERT INTO ... SELECT ... and DELETE FROM ... statements:
where substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
and instr(Digits, v_MDLDigits) = 1
and v_mdlEffectiveDate <= effectivedate
and (v_mdlExpDate > effectivedate or v_mdlExpDate is null);
I don't see how Oracle can make effective use of indexes with any of these conditions. It would therefore have to do a full table scan each time.
The last two conditions seem reasonable and there doesn't seem a lot that can be done with them. I'd like to focus on the first two conditions as I think there's more scope for improvement with them.
The second of the four conditions is
instr(Digits, v_MDLDigits) = 1
This condition holds if and only if Digits starts with the contents of v_MDLDigits. A better way of writing this would be
Digits LIKE v_MDLDigits || '%'
The advantage of using LIKE in this situation instead of INSTR is that Oracle can make use of indexes when using LIKE. If you have an index on the Digits column, Oracle will be able to use it with this query. Oracle would then be able to focus in on those rows that start with the digits in v_MDLDigits instead of doing a full table scan.
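If no index on Digits exists yet, it would just be (the name is illustrative):
-- A plain b-tree index on Digits lets Oracle range-scan for rows
-- whose Digits value starts with the contents of v_MDLDigits.
CREATE INDEX tt_pendingcost_digits_ix ON tt_PendingCost (Digits);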
The first of the four conditions is:
substr(Digits, 1, 2) = substr(v_MDLDigits, 1, 2)
If v_MDLDigits has length at least 2, and all entries in the Digits column also have length at least 2, then this condition is redundant, since it is implied by the previous one we looked at.
I'm not sure why you would have a condition like this. The only reason I can think of is that you have a function-based index on substr(Digits, 1, 2). If not, I would be tempted to remove this substr condition altogether.
I don't think the cursor is what is making this procedure run slowly, and there's no single statement I know of that can insert into one table and delete from another. To make this procedure speed up I think you just need to tune the queries a bit.
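That said, if you do want to drop the row-by-row loop entirely, the insert can usually be collapsed into one set-based statement by joining straight to the cursor's query, followed by a matching delete. A hedged sketch reusing the question's tables and the LIKE rewrite from above; note that the original loop's ORDER BY length(dc.digits) DESC combined with the delete gave longest-prefix-wins behaviour, which this version does not preserve when a row matches several dial codes:
-- Set-based replacement for the loop body; :v_CallType_ID as in the question.
INSERT INTO tt_pendingcost_temp
  (Dest_ID, Digits, CCASDigits, Destination, tariff_id,
   NewCost, Effectivedate, ExpiryDate, previous, Currency)
SELECT m.dest_id, p.Digits, m.digits, p.Destination, p.tariff_id,
       p.NewCost, p.Effectivedate, p.ExpiryDate, p.previous, p.Currency
  FROM tt_PendingCost p
  JOIN (SELECT dc.dest_id, dc.digits, dc.Effectivedate, dc.expirydate
          FROM DialCodes dc
          JOIN MDL d
            ON dc.Dest_ID = d.Dest_ID
           AND d.PriceEntity = 1
          JOIN sysmdl_calltypes s
            ON s.call_type_id = :v_CallType_ID
           AND s.dest_id = dc.Dest_ID
           AND s.call_type_id NOT IN
               (SELECT calltype_id FROM ignore_calltype_for_routing)) m
    ON p.Digits LIKE m.digits || '%'   -- replaces the instr() test
   AND m.Effectivedate <= p.Effectivedate
   AND (m.expirydate > p.Effectivedate OR m.expirydate IS NULL);

-- Then remove the copied rows using the same join conditions.
DELETE FROM tt_PendingCost p
 WHERE EXISTS (SELECT 1
                 FROM DialCodes dc
                 JOIN MDL d
                   ON dc.Dest_ID = d.Dest_ID
                  AND d.PriceEntity = 1
                 JOIN sysmdl_calltypes s
                   ON s.call_type_id = :v_CallType_ID
                  AND s.dest_id = dc.Dest_ID
                  AND s.call_type_id NOT IN
                      (SELECT calltype_id FROM ignore_calltype_for_routing)
                WHERE p.Digits LIKE dc.digits || '%'
                  AND dc.Effectivedate <= p.Effectivedate
                  AND (dc.expirydate > p.Effectivedate OR dc.expirydate IS NULL));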
