Oracle 11g XMLType Experience

Hello community,
we are currently evaluating options for storing generic data structures. We found that, at least from a functional point of view, Oracle XMLType is a good alternative to the good old BLOB, because you can query and update individual fields of the XML and also create indexes on XPath expressions.
We are a bit worried about the performance of XMLType, though. The select performance is especially interesting to us: we have queries that select multiple data structures at once, and these need to be fast.
Such a query looks something like this:
SELECT d.DOC_VALUE.getClobVal() AS XML_VALUE FROM XML_TABLE d WHERE d.ID IN ('1','2',...);
Our XML documents are 7 to 8 KB in size. We are on Oracle 11g and create the XML column with type XMLTYPE.
Do you have experience with the performance of selects on XMLType columns? What overall experiences do you have with XMLType? Is this a robust and fast Oracle feature, or is it rather something immature and experimental?
Regards, Mathias

XML DB is a strong and reliable feature that has been around since 9i.
Flexible-schema XMLType columns are implemented over hidden CLOB columns, while fixed-schema XMLType columns are decomposed into hidden tables and views; in the end, XMLType is as reliable as standard objects.
Performance will vary with your usage, but simply reading the whole XML from an XMLType is as fast as reading the same content from a classical CLOB, because Oracle really just reads and returns the XML stored in the CLOB as-is.
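For illustration, a whole-document read (as in the question) versus a single-field extraction would look something like this. This is only a sketch: the element names are invented, not taken from the question.
-- whole document, as in the question
SELECT d.DOC_VALUE.getClobVal() AS XML_VALUE FROM XML_TABLE d WHERE d.ID = '1';
-- single field via XQuery (11g syntax; /order/status is a made-up path)
SELECT XMLCAST(XMLQUERY('/order/status' PASSING d.DOC_VALUE RETURNING CONTENT) AS VARCHAR2(30)) AS STATUS
FROM XML_TABLE d WHERE d.ID = '1';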

You can assign an XDB.XMLIndex to an XMLType column, and performance will be good after that. We had a table with only 13,000 records, each containing an XML CLOB that wasn't too big (maybe 40 pages each if printed as text), and running simple queries without an XMLIndex often took over 10 minutes, while more complex queries often took 2x or 3x longer!
With an XMLIndex, the performance was on par with a typical table (on the order of milliseconds for full table scans). The XMLIndex can be as complex as you need it to be. I tried this naive one and it worked fine (I say naive because I do not fully understand the inner workings of this index type):
CREATE INDEX myschema.my_idx_name ON myschema.mytable
(SYS_MAKEXML(0,"SYS_SOMETHING$"))
INDEXTYPE IS XDB.XMLINDEX
NOPARALLEL;
You should understand the various XMLIndex options available to you, and choose according to your needs. Documentation:
https://docs.oracle.com/cd/B28359_01/appdev.111/b28369/xdb_indexing.htm#CHDJECDA
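One of those options, for example, is restricting the index to the paths you actually query, which keeps the index small and cheap to maintain. A hedged sketch of the 11g syntax, reusing the question's table and column names (the paths themselves are invented for illustration):
CREATE INDEX my_path_idx ON xml_table (doc_value)
INDEXTYPE IS XDB.XMLINDEX
PARAMETERS ('PATHS (INCLUDE (/order/customer /order/status))');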

Related

Temporarily storing multiple values in Oracle [duplicate]

This question was closed as a duplicate of: Alternate method to global temp tables for Oracle Stored Procedure.
I need a way to temporarily store and use multiple values returned from an Oracle query. In SQL Server, I stored my values in a temp table, did my work, then dropped the table. I'm discovering the Oracle equivalent isn't as clear cut.
Here's a SQL Server example of what I'm trying to do:
select id into #temp from SomeTable where SomeColumn = 'Some Value'
-- ... do whatever I need to do with the #temp data ...
drop table #temp
I can code my way around SQL Server pretty well, but am almost clueless when it comes to Oracle syntax. I've been reading various Oracle references, and they haven't been very helpful. I did read that Oracle temp tables work differently than SQL Server's, and are often not recommended.
I'm looking into the temp table route, but if there's a better way to do this that doesn't use temp tables, I'm all ears. Anyone know a better way to do this in Oracle?
Thanks in advance.
As with many things, that depends. It depends on how much data you'll be retrieving and how you want to use it. If you don't have too much data to work with ("too much" meaning, oh, say, more than a couple thousand rows), you want to manipulate the data procedurally (i.e. in a PL/SQL procedure or script), AND you don't want to access it using DML (i.e. you don't want to say something like SELECT * FROM your_temp_data), then loading the data into a PL/SQL collection, as @EgorSkriptunoff mentions above, might be a workable solution.
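A minimal sketch of the collection approach, reusing the question's table and column names:
DECLARE
  TYPE t_id_tab IS TABLE OF SomeTable.id%TYPE;
  l_ids t_id_tab;
BEGIN
  SELECT id BULK COLLECT INTO l_ids
    FROM SomeTable
   WHERE SomeColumn = 'Some Value';
  FOR i IN 1 .. l_ids.COUNT LOOP
    NULL; -- do whatever you need with l_ids(i)
  END LOOP;
END;
/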
However, if the temp data is large (more than a couple thousand rows) and/or you need to be able to do something like SELECT * FROM your_temp_data, then your best bet is to use Oracle's Global Temporary Tables. A GTT is a table which is used to hold data which should only last as long as either a single transaction or a complete session (i.e. as long as you're attached to the database). They are covered in detail in the Oracle documentation.
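A hedged sketch of the GTT approach, again reusing the question's names:
CREATE GLOBAL TEMPORARY TABLE temp_ids (id NUMBER)
ON COMMIT PRESERVE ROWS; -- rows live for the session; use DELETE ROWS for per-transaction
INSERT INTO temp_ids
  SELECT id FROM SomeTable WHERE SomeColumn = 'Some Value';
-- ... do whatever you need with the data ...
-- No DROP needed: the definition is permanent, but the rows are private to the
-- session and vanish when it ends (or at COMMIT, with DELETE ROWS).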
Share and enjoy.

Improving performance of CLOB insert across a DBLINK in Oracle

I am seeing poor performance in Oracle (11g) when trying to copy CLOBs from one database to another. I have tried several things, but haven't been able to improve this.
The CLOBs are used for gathering report data, which can be quite large on a record-by-record basis. I am calling a procedure on the remote databases (across a WAN) to build the data, then copying the results back to the database at corporate headquarters for comparison. The general format is:
CREATE TABLE my_report(the_db VARCHAR2(30), object_id VARCHAR2(30),
final_value CLOB, CONSTRAINT my_report_pk PRIMARY KEY (the_db, object_id));
To gain performance, I accumulate the results for remote sites into remote copies of the table. At the end of the procedure run, I try to copy the data back. This query is very simple:
INSERT INTO my_report SELECT * FROM my_report@europe;
The performance that I am seeing is around 9 rows per second, with an average CLOB size of 3500 bytes. (I am using CLOBs as this size often goes above 4K, the VARCHAR2 limit.) For 70,000 records (not uncommon) this takes around 2 hours to transfer. I have tried the CREATE TABLE AS SELECT method, but it gets the same performance. I also spent more than a few hours tuning SQL*Net, but see no improvement from this. Changing the arraysize does not improve the performance (though it can reduce it if the value is lowered).
I am able to get a copy over using the old exp/imp method (export the table from the remote site, import it back in), which runs much faster, but this is fairly manual for my automated report. I have considered writing a pipelined function to select from, using it to split the CLOBs into BYTE/VARCHAR2 chunks (with an additional chunk-number column), but didn't want to do this if someone had tried it and found a problem.
Thanks for your help.
I was able to get better performance by increasing the arraysize to 1500 or higher. See also this document: http://www.fors.com/velpuri2/PERFORMANCE/SQLNET.pdf
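For reference, arraysize is a client-side fetch setting; in SQL*Plus the suggestion would look like this (a sketch of the answerer's tip, not a guaranteed fix, since the effect depends on where the copy is driven from):
SET ARRAYSIZE 1500
INSERT INTO my_report SELECT * FROM my_report@europe;
COMMIT;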

Equivalent spatial types for insert statement

I have a number of files containing vast amounts of INSERT statements (generated by Toad for Oracle) which I need to run on a PostgreSQL database.
Sounds simple, I know, but there are also Oracle-specific spatial data types in there which are hampering my efforts. I tried a number of tools for this, from SwisSQL to SDO2Shp, to migrate the data, and none have been any help whatsoever. So my only remaining plan is to write a C# program that opens each file, replaces the Oracle-specific types with types that will work in PostGIS, and saves the file again. The problem is I have no idea which types I could substitute for the Oracle ones, or the format or syntax I must use.
I am very new to PostgreSQL and PostGIS, and my Oracle knowledge is also limited, as I had previously used SQL Server.
Here is an example of the INSERT statements. They will all have the same format as this, as the tables are the same, just with different data for different zoom levels on a map.
Insert into CLUSTER_1000M
(CLUSTER_ID, CELL_GEOM, CELL_CENTROID)
Values
(4410928,
"MDSYS"."SDO_GEOMETRY"(2003,81989,NULL,
"MDSYS"."SDO_ELEM_INFO_ARRAY"(1,1003,3,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL),
"MDSYS"."SDO_ORDINATE_ARRAY"(80000,106280,81000,107280,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL)),
"MDSYS"."SDO_GEOMETRY"(2001,81989,
"MDSYS"."SDO_POINT_TYPE"(80500,106780,NULL),NULL,NULL));
How can I get this into a format that will work with postgis?
I have no idea how the Oracle GIS implementation works, but looking at the data, I don't think a direct conversion will be possible (it might be, but the effort could be huge).
Look at the way PostGIS defines Geometry
INSERT INTO geotable ( the_geom, the_name )
VALUES ( ST_GeomFromText('POINT(-126.4 45.32)', 312), 'A Place');
PostGIS follows standards for how the data is displayed/stored and offers functions to assist the developer with conversion. These are mostly the functions with From in their name. For example, creating line data and reading it back produces output like the aline column below:
SELECT ST_LineFromWKB(ST_AsBinary(ST_GeomFromText('LINESTRING(1 2, 3 4)'))) AS aline,
ST_LineFromWKB(ST_AsBinary(ST_GeomFromText('POINT(1 2)'))) IS NULL AS null_return;
aline | null_return
------------------------------------------------
010200000002000000000000000000F ... | t
Judging from your example output from Oracle, the format is quite different and might not be convertible (unless Oracle offers something that sticks to the standard).
On the other hand, when looking at the Oracle example
INSERT INTO cola_markets VALUES(
1,
'cola_a',
SDO_GEOMETRY(
2003, -- two-dimensional polygon
NULL,
NULL,
SDO_ELEM_INFO_ARRAY(1,1003,3), -- one rectangle (1003 = exterior)
SDO_ORDINATE_ARRAY(1,1, 5,7) -- only 2 points needed to
-- define rectangle (lower left and upper right) with
-- Cartesian-coordinate data
)
);
you might be able to replace some of the Oracle names with the PostGIS ones, so SDO_GEOMETRY(...) with SDO_ORDINATE_ARRAY(1,1, 5,7) might turn into something like ST_GeomFromText('LINESTRING(1 1, 5 7)').
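As a hedged sketch of that substitution, the rectangle in the question's first geometry could be rewritten as a PostGIS polygon like this (the SRID 0 is a placeholder; Oracle's 81989 would have to be mapped to a real EPSG code first):
INSERT INTO cluster_1000m (cluster_id, cell_geom, cell_centroid)
VALUES (4410928,
        ST_GeomFromText('POLYGON((80000 106280, 81000 106280, 81000 107280, 80000 107280, 80000 106280))', 0),
        ST_GeomFromText('POINT(80500 106780)', 0));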

Oracle DBMS package command to export table content as INSERT statement

Is there any subprogram similar to DBMS_METADATA.GET_DDL that can actually export the table data as INSERT statements?
For example, using DBMS_METADATA.GET_DDL('TABLE', 'MYTABLE', 'MYOWNER') will export the CREATE TABLE script for MYOWNER.MYTABLE. Any such things to generate all data from MYOWNER.MYTABLE as INSERT statements?
I know that, for instance, Toad for Oracle or SQL Developer can export as INSERT statements pretty fast, but I need a more programmatic way of doing it. Also, I cannot create any procedures or functions in the database I'm working with.
Thanks.
As far as I know, there is no Oracle supplied package to do this. And I would be skeptical of any 3rd party tool that claims to accomplish this goal, because it's basically impossible.
I once wrote a package like this, and quickly regretted it. It's easy to get something that works 99% of the time, but that last 1% will kill you.
If you really need something like this, and need it to be very accurate, you must tightly control what data is allowed and what tools can be used to run the script. Below is a small fraction of the issues you will face:
Escaping
Single inserts are very slow (especially if it goes over a network)
Combining inserts is faster, but can run into some nasty parsing bugs when you start inserting hundreds of rows
There are many potential data types, including custom ones. You may only have NUMBER, VARCHAR2, and DATE now, but what happens if someone adds RAW, BLOB, BFILE, nested tables, etc.?
Storing LOBs requires breaking the data into chunks because of VARCHAR2 size limitations (4000 or 32767, depending on how you do it).
Character set issues - This will drive you ¿¿¿¿¿¿¿ insane.
Environment limitations - For example, SQL*Plus does not allow more than 2500 characters per line, and will drop whitespace at the end of your line.
Referential Integrity - You'll need to disable these constraints or insert data in the right order.
"Fake" columns - virtual columns, XML lobs, etc. - don't import these.
Missing partitions - If you're not using INTERVAL partitioning you may need to manually create them.
Novalidated data - Just about any constraint can be violated, so you may need to disable everything.
If you want your data to be accurate, you really have to use the Oracle utilities, like Data Pump and export.
Why don't you use regular export?
If you must, you can generate the export script yourself.
Let's assume a table myTable(Name VARCHAR2(30), AGE NUMBER, Address VARCHAR2(60)).
select 'INSERT INTO myTable values(''' || Name || ''',' || AGE || ',''' || Address || ''');' from myTable;
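For a row ('John', 25, 'Main St') this produces: INSERT INTO myTable values('John',25,'Main St');. A hedged variant that also escapes embedded single quotes in the character columns (one of the pitfalls listed above):
select 'INSERT INTO myTable values('''
       || replace(Name, '''', '''''') || ''','
       || AGE || ','''
       || replace(Address, '''', '''''') || ''');'
  from myTable;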
Oracle SQL Developer does that with its Export feature, DDL as well as the data itself.
It can be a bit inconvenient for huge tables, and is likely to hit the issues mentioned above, but it works well 99% of the time.

How to reduce cost on select statement?

I have a table in Oracle 10g with around 51 columns and 25 million records in it. When I execute a simple select query on the table to extract 3 columns, the optimizer cost is very high, around 182k. Is there any possible way to reduce it?
Query:
select a,b,c
from X
The column types are: a - CHAR, b - VARCHAR2, c - VARCHAR2.
TIA
In cases like this it's difficult to give good advice without knowing why you would need to query 25 million records. As @Ryan says, normally you'd have a WHERE clause; or perhaps you're extracting the results into another table or something?
A covering index (i.e. over a,b,c) would probably be the only way to make any difference to the performance - the query could then do a fast full index scan, and would get many more records per block retrieved.
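A hedged sketch of that idea (the index name is invented):
CREATE INDEX x_abc_ix ON X (a, b, c);
-- For a fast full index scan to replace the table scan, the optimizer must
-- know every row appears in the index, e.g. at least one of a, b, c is NOT NULL.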
Well...if you know you only need a subset of those values, throwing a WHERE clause on there would obviously help out quite a bit. If you truly need all 25 million records, and the table is properly indexed, then I'd say there's really not much you can do.
Yes, as Jeffrey Kemp said, it's better to tell us the purpose of the select.
For a normal select, index the fields you query most, gather table statistics on the index (DBMS_STATS.GATHER_TABLE_STATS), and check the statistics of each field to be sure your index is right (read: http://bit.ly/qR12Ul).
If you need to load the results into another table, use a cursor, limit the number of records per execution, and load into the table via bulk insert (the FORALL technique), as sketched below.
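A minimal sketch of that cursor/FORALL pattern (target_table is hypothetical):
DECLARE
  CURSOR c IS SELECT a, b, c FROM X;
  TYPE t_rows IS TABLE OF c%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_rows LIMIT 1000; -- bounded memory per batch
    EXIT WHEN l_rows.COUNT = 0;
    FORALL i IN 1 .. l_rows.COUNT
      INSERT INTO target_table VALUES l_rows(i);
    COMMIT; -- optional: commit per batch
  END LOOP;
  CLOSE c;
END;
/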
