In Oracle, which one is faster for accessing a single value?
xmldom.getNodeValue(xmlNode);
Or
select extractvalue( XMLType('<nodeName>...'), '/nodeName/...') from dual
Or XMLTable?
Select *
From XMLTABLE('/nodeName'
       passing XMLType('<nodeName>...')
       columns val varchar2(4000) path '...');  -- columns: that one value
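For reference, here is a self-contained version of both SQL approaches on a concrete document (the sample XML and the column definition are just for illustration; I know extractvalue is deprecated in newer releases in favor of XMLTABLE/XMLQUERY):

select extractvalue(
         xmltype('<nodeName><child>42</child></nodeName>'),
         '/nodeName/child') as val
from dual;

select x.val
from xmltable('/nodeName'
       passing xmltype('<nodeName><child>42</child></nodeName>')
       columns val varchar2(10) path 'child') x;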
I'm using jOOQ to delete a variable number of rows from an Oracle database:
List<Integer> ids = Lists.newArrayList(1, 2, 3, 4);
db.deleteFrom(MESSAGE)
.where(MESSAGE.ID.in(ids))
.execute();
However, this means that a variable number of bind variables is used, so each distinct IN list size produces a different SQL text, and Oracle hard parses every one of them.
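For illustration (simplified statements, not the exact jOOQ output):

-- two distinct SQL texts, each needing its own hard parse
delete from MESSAGE where MESSAGE.ID in (?, ?, ?)
delete from MESSAGE where MESSAGE.ID in (?, ?, ?, ?)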
I have tried using the unnest or table function to create a statement with only one bind variable. Unfortunately, this does not seem to work. jOOQ creates statements with multiple bind variables and union all statements:
db.deleteFrom(MESSAGE)
.where(MESSAGE.ID.in(
select(field("*", Long.class))
.from(table(ids))
))
.execute();
LoggerListener DEBUG - Executing query : delete from "MESSAGE" where "MESSAGE"."ID" in (select * from ((select ? "COLUMN_VALUE" from dual) union all (select ? "COLUMN_VALUE" from dual) union all (select ? "COLUMN_VALUE" from dual) union all (select ? "COLUMN_VALUE" from dual)) "array_table")
LoggerListener DEBUG - -> with bind values : delete from "MESSAGE" where "MESSAGE"."ID" in (select * from ((select 1 "COLUMN_VALUE" from dual) union all (select 2 "COLUMN_VALUE" from dual) union all (select 3 "COLUMN_VALUE" from dual) union all (select 4 "COLUMN_VALUE" from dual)) "array_table")
The javadoc of the unnest function recommends using the table function for Oracle
Create a table from an array of values.
This is equivalent to the TABLE function for H2, or the UNNEST function in HSQLDB and Postgres
For Oracle, use table(ArrayRecord) instead, as Oracle knows only typed arrays
In all other dialects, unnesting of arrays is emulated using several UNION ALL connected subqueries.
Although I use the table function, I still get the emulated UNION ALL statement.
TABLE operator specifics
Oracle's TABLE(collection) function doesn't accept arbitrary arrays, only SQL table types, such as:
CREATE TYPE t AS TABLE OF NUMBER
You could create such a type and use the code generator to create a TRecord, which you could pass to that TABLE operator:
from(table(new TRecord(ids)))
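With such a type in place, the delete can be expressed with a single array bind, roughly in this shape (a sketch only, not actual jOOQ output; t is the type created above):

delete from MESSAGE
where MESSAGE.ID in (
  select column_value   -- COLUMN_VALUE is Oracle's pseudo-column for scalar collections
  from table(?)         -- bind a single array of type t here
)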
Unfortunately, such a type is required in Oracle. In its absence, the UNION ALL emulation seems to be the best way, but that doesn't solve your problem.
Note that the TABLE operator has caveats of its own, mainly that it produces poor cardinality estimates, so for small ID lists it may not be the best choice. Do measure yourself, though! I've documented this in a blog post.
Avoiding cursor cache contention
jOOQ has a nice feature called IN list padding, which mitigates most of the dynamic SQL problems that arise from such hard parses. In short, it prevents arbitrary IN list sizes by repeating the last element to pad the list up to 2^n elements, getting you:
id IN (?) -- For 1 element
id IN (?, ?) -- For 2 elements
id IN (?, ?, ?, ?) -- For 3-4 elements
id IN (?, ?, ?, ?, ?, ?, ?, ?) -- For 5-8 elements
It's a pragmatic workaround for when the TABLE operator is too cumbersome to set up. It reduces the number of distinct SQL strings (and thus hard parses) logarithmically, which might just be good enough.
Other alternatives
You can always also use:
A temporary table: batch insert the IDs into it and reference it in the delete (a sketch follows below)
An actual semi join: ID IN (SELECT original_id FROM original_table) (this is usually the best approach, if possible)
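A minimal sketch of the temporary table variant (table and column names are made up for illustration):

create global temporary table tmp_message_id (
  id number primary key
) on commit delete rows;

-- batch insert the IDs into tmp_message_id, then:
delete from MESSAGE
where MESSAGE.ID in (select id from tmp_message_id);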
I work primarily with SAS and Oracle and am still new to DB2. I need a hierarchical query to split a CLOB into chunks that can be pulled into SAS. SAS has a 32K limit for character variables, so I can't just pull the dataset in normally.
I found an old Stack Overflow question about the best way to pull a CLOB into a SAS data set, but it is written for Oracle.
Import blob through SAS from ORACLE DB
Since I am new to DB2 and the syntax for this type of query seems very different, I was hoping someone could help convert it and explain the syntax; I find the Oracle syntax much easier to understand. I'm not sure whether in DB2 you would use a recursive CTE like this https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/apsg/src/tpc/db2z_xmprecursivecte.html or hierarchical queries like this https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_71/sqlp/rbafyrecursivequeries.htm
Here is the Oracle query.
SELECT
id
, level as chunk_id
, regexp_substr(clob_value, '.{1,32767}', 1, level, 'n') as clob_chunk
FROM (
SELECT id, clob_value
FROM schema.table
WHERE id = 1
)
CONNECT BY LEVEL <= regexp_count(clob_value, '.{1,32767}',1,'n')
order by id, chunk_id;
The table has two fields, id and clob_value, and would look like this:
ID CLOB_VALUE
1 really large clob
2 medium clob
3 another large clob
This is the result I would want. I would only ever be doing this one row at a time, where id = whichever row I am processing.
ID CHUNK_ID CLOB
1 1 clob_chunk1of3
1 2 clob_chunk2of3
1 3 clob_chunk3of3
Thanks for any time spent reading and helping.
Here is a solution that should work in DB2 with few changes (but please be advised that I don't know DB2 at all; I am just using Oracle features that are in the SQL Standard, so they should be implemented identically - or almost so - in DB2).
Below I create a table with your sample data; then I show how to chunk it into substrings of length at most 8 characters. Although the strings are short, I defined the column as CLOB and I am using CLOB tools; this should work on much larger CLOBs.
You can make both the chunk size and the id into bind parameters, if needed. In my demo below I hardcoded the chunk size and I show the result for all IDs in the table. In case the CLOB is NULL, I do return one chunk (which is NULL, of course).
Note that touching CLOBs in a query is very expensive, so most of the work is done without touching the CLOBs; I work on them as little as possible.
PREP WORK
drop table tbl purge; -- If needed
create table tbl (id number, clob_value clob);
insert into tbl (id, clob_value)
select 1, 'really large clob' from dual union all
select 2, 'medium clob' from dual union all
select 3, 'another large clob' from dual union all
select 4, null from dual -- added to check handling
;
commit;
QUERY
with
prep(id, len) as (
select id, dbms_lob.getlength(clob_value)
from tbl
)
, rec(id, len, ord, pos) as (
select id, len, 1, 1
from prep
union all
select id, len, ord + 1, pos + 8
from rec
where len >= pos + 8
)
select id, ord, dbms_lob.substr(clob_value, 8, pos) as chunk
from tbl inner join rec using (id)
order by id, ord
;
ID ORD CHUNK
---- ---- --------
1 1 really l
1 2 arge clo
1 3 b
2 1 medium c
2 2 lob
3 1 another
3 2 large cl
3 3 ob
4 1
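As noted earlier, both the chunk size and the id can be turned into bind parameters; here is a sketch of the parameterized form of the same query (the bind names :id and :chunk_size are illustrative):

with
prep(id, len) as (
  select id, dbms_lob.getlength(clob_value)
  from tbl
  where id = :id
)
, rec(id, len, ord, pos) as (
  select id, len, 1, 1
  from prep
  union all
  select id, len, ord + 1, pos + :chunk_size
  from rec
  where len >= pos + :chunk_size
)
select id, ord, dbms_lob.substr(clob_value, :chunk_size, pos) as chunk
from tbl inner join rec using (id)
order by id, ord
;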
Another option is to enable Oracle compatibility mode in Db2 and just issue the hierarchical query as-is.
This GitHub repository has background information on SQL recursion in DB2, including the Oracle-style syntax and a side by side example (both work against the Db2 sample database):
-- both queries are against the SAMPLE database
-- and should return the same result
SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
START WITH DEPTNO = 'A00'
CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT;
WITH tdep(level, deptname, deptno) as (
SELECT 1, CAST( DEPTNAME AS VARCHAR(40)) AS DEPTNAME, deptno
FROM department
WHERE DEPTNO = 'A00'
UNION ALL
SELECT t.LEVEL+1, CAST(SPACE(t.LEVEL * 4) || '/' || d.DEPTNAME
AS VARCHAR(40)) AS DEPTNAME, d.deptno
FROM DEPARTMENT d, tdep t
WHERE d.admrdept=t.deptno and d.deptno<>'A00')
SELECT level, deptname
FROM tdep;
Consider the types:
CREATE OR REPLACE TYPE date_array AS TABLE OF DATE;
CREATE OR REPLACE TYPE number_array AS TABLE OF NUMBER;
CREATE OR REPLACE TYPE char_array AS TABLE OF VARCHAR2(80);
Queries:
WITH q AS
(SELECT LEVEL ID,
TRUNC(SYSDATE) + LEVEL MyDate,
to_char(LEVEL) STRING
FROM dual
CONNECT BY LEVEL < 5)
SELECT CAST(COLLECT(ID) AS number_array)
FROM q;
returns a collection of numbers
WITH q AS
(SELECT LEVEL ID,
TRUNC(SYSDATE) + LEVEL MyDate,
to_char(LEVEL) STRING
FROM dual
CONNECT BY LEVEL < 5)
SELECT CAST(COLLECT(STRING) AS char_array)
FROM q;
returns a collection of strings
WITH q AS
(SELECT LEVEL ID,
TRUNC(SYSDATE) + LEVEL MyDate,
to_char(LEVEL) STRING
FROM dual
CONNECT BY LEVEL < 5)
SELECT CAST(COLLECT(MyDate) AS date_array)
FROM q
returns an "invalid datatype" error.
Can anyone explain why the Date datatype behaves differently?
Here are my findings... it seems you are facing a bug caused by the fact that calculated dates seem to have a different internal representation from "database" dates. I did find a workaround, so keep on reading.
On my oracle dev installation (Oracle 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production) I am experiencing your same problem.
BUT... If I create a physical table containing your test data:
create table test_data as
SELECT LEVEL ID,
TRUNC(SYSDATE) + LEVEL MyDate,
to_char(LEVEL) STRING
FROM dual
CONNECT BY LEVEL < 5
and then run the "cast collect" operator on this physical table, it works as expected:
-- this one works perfectly
SELECT CAST(COLLECT(MyDate) AS date_array) from test_data
but all these examples still do NOT work:
-- here I just added 1 .. and it doesn't work
SELECT CAST(COLLECT(MyDate + 1) AS date_array)
from test_data
-- here I am extracting sysdate, instead of a physical column... and it doesn't work
SELECT CAST(COLLECT(sysdate) AS date_array)
from test_data
It feels like Oracle doesn't treat calculated dates as the same thing as physical dates.
So, I tried to "persuade" Oracle that the data I am providing is actually a normal DATE value, using an explicit cast... and eureka! This does the job correctly:
WITH q AS
(SELECT LEVEL ID,
-- this apparently unnecessary cast does the trick
CAST( TRUNC(SYSDATE) + LEVEL AS DATE) MyDate,
to_char(LEVEL) STRING
FROM dual
CONNECT BY LEVEL < 5)
SELECT CAST(COLLECT(MyDate) AS date_array)
FROM q
Yes... but why??
It really seems that these two values are not exactly the same thing, even if the values we see are actually the same:
select sysdate, cast (sysdate as date) from dual
So I dug in the internal representation of the two values applying the "dump" function to both of them:
select dump(sysdate), dump(cast (sysdate as date)) from dual
and these are the results I got:
DUMP(SYSDATE) -> Typ=13 Len=8: 226,7,11,9,19,20,47,0
DUMP(CAST(SYSDATEASDATE)) -> Typ=12 Len=7: 120,118,11,9,20,21,48
Internally they look like two totally different data types! One is type 13 and the other is type 12, and they have different lengths and representations.
Anyway, I discovered something more... it seems someone else has noticed this: https://community.oracle.com/thread/4122627
The question has an answer pointing to this document: http://psoug.org/reference/datatypes.html
which contains a lengthy note about dates... an excerpt of it reads:
"What happened? Is the information above incorrect or does the DUMP()
function not handle DATE values? No, you have to look at the "Typ="
values to understand why we are seeing these results. The datatype
returned is 13 and not 12, the external DATE datatype. This occurs
because we rely on the TO_DATE function! External datatype 13 is an
internal c-structure whose length varies depending on how the
c-compiler represents the structure. Note that the "Len=" value is 8
and not 7. Type 13 is not a part of the published 3GL interfaces for
Oracle and is used for date calculations mainly within PL/SQL
operations. Note that the same result can be seen when DUMPing the
value SYSDATE."
Anyway, I repeat: I think this is a bug, but at least I found a workaround: use an explicit cast to DATE.
CREATE TABLE ak_temp
(
misc_1 varchar2(4000),
misc_2 varchar2(4000),
time_of_insert timestamp
);
Query 1 (original):
select *
from ak_temp
where misc_1 in('ankush')
or misc_2 in('ankush')
Query 2:
select *
from ak_temp
where 'ankush' in (misc_1,misc_2)
Hi, I have a somewhat similar problem with these queries. I want to avoid Query 1, since its cost in our live environment is on the higher side, and I have come up with Query 2, which has a lower cost. Are the two functionally equivalent?
The two are functionally equivalent and should produce the same query plan, so one should not have a performance advantage over the other (or any such advantage would be minuscule). Oracle might be smart enough to take advantage of two indexes, one on ak_temp(misc_1) and one on ak_temp(misc_2).
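For reference, those would be two single-column indexes along these lines (the index names are made up):

create index ak_temp_misc_1_ix on ak_temp (misc_1);
create index ak_temp_misc_2_ix on ak_temp (misc_2);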
However, you might also consider:
select t.*
from ak_temp t
where misc_1 = 'ankush'
union
select t.*
from ak_temp t
where misc_2 = 'ankush';
This version will definitely take advantage of the indexes, but the UNION's duplicate elimination can slow things down if many rows match the two conditions.
EDIT:
To avoid the union, you can do:
select t.*
from ak_temp t
where misc_1 = 'ankush'
union all
select t.*
from ak_temp t
where misc_2 = 'ankush' and (misc_1 <> 'ankush' or misc_1 is null); -- or use `lnnvl(misc_1 = 'ankush')`
My NLS settings:
NLS_SORT POLISH
NLS_COMP BINARY
Simple test query:
select * from (
select '11117' as x from dual
union
select '12988' as x from dual
union
select '14659' as x from dual
union
select '1532' as x from dual
union
select '18017' as x from dual
) order by x;
Actual result:
x
-----
11117
12988
14659
1532
18017
Desired result:
x
-----
1532
11117
12988
14659
18017
Question:
Is there an NLS setting that will help me achieve the desired result? I know I can do order by to_number(x) or, even better, order by lpad(x, 5), but that's no good in this case: I need a system-wide solution that won't require query changes.
What I tried:
order by nlssort(x, 'nls_sort=binary');
alter session set nls_sort='binary';
Maybe a somewhat odd solution, but ...
Rename your TABLE_SOMETHING to TABLE_OTHER.
Create a view TABLE_SOMETHING on top of TABLE_OTHER that applies TO_NUMBER() to the troubled string column, exposing it under the same column name.
If the table is modified application-wide, create INSTEAD OF triggers over the view TABLE_SOMETHING.
Recompile possibly invalidated packages.
This won't be particularly performant when modifying large quantities of records, and it won't allow truncating the table, but it might solve the ordering problem. A sketch of the setup follows below.
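A minimal sketch, assuming the troubled column is called x and there is one other column y (the column names and the trigger name are placeholders):

alter table table_something rename to table_other;

create or replace view table_something as
select to_number(x) as x, y
from table_other;

create or replace trigger table_something_ioi
instead of insert on table_something
for each row
begin
  -- convert the numeric value back to its string form for storage
  insert into table_other (x, y)
  values (to_char(:new.x), :new.y);
end;
/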