Tuning the below Sql query - oracle

I have an old Oracle 8.1.7 database on which I have been running the UNION query below:
select c_ordine_es
,c_ordine_salesnet
,v_oyov
,v_annuale
,v_oneoff
,v_canone
,c_operatore_tam
from v_ordine_cliente_easysell
where d_ultima_modifica>DataRif
union
select c_ordine_es
,c_ordine_salesnet
,v_oyov
,v_annuale
,v_oneoff
,v_canone
,c_operatore_tam
from v_ordine_cliente_easysell v
,scarti_interfaccia_easysell_o s
where s.c_codice_es=v.c_ordine_es
and s.t_tabella_es=pkType.K_SCARTO_ORDINE_CLIENTE;
Every time I run this query, the SQL client (I am using Toad) hangs. I should mention that v_ordine_cliente_easysell and scarti_interfaccia_easysell_o are views/synonyms whose data is fetched over a DB link to another (SIEBEL) database. I suspect the problem occurs while fetching data over the DB link, as the SIEBEL database is always very busy. Could you please suggest how I could tune the query above?
The explain plan is shown below:
OPERATION         OPTIONS         OBJECT_NODE  POSITION  COST  CARDINALITY  BYTES
SELECT STATEMENT                               28        28    478912       61779648
HASH JOIN                                      1         28    478912       61779648
INDEX             FAST FULL SCAN               1         2     11354        102186
REMOTE                            SIEB.WORLD   2         23    4218         506160

Make a local copy of the table v_ordine_cliente_easysell.
Make a local copy of the table scarti_interfaccia_easysell_o, filtered with t_tabella_es = pkType.K_SCARTO_ORDINE_CLIENTE.
Run the query on the local copies, using UNION ALL instead of UNION.
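The three steps above could look roughly like the sketch below. The local table names, the bind variable :data_rif (standing in for DataRif) and the literal 'X' are assumptions made for illustration; they do not come from the original post.

```sql
-- Sketch only: ordine_cliente_local and scarti_local are hypothetical names.
-- Step 1: copy the columns the query needs from the remote view.
CREATE TABLE ordine_cliente_local AS
SELECT c_ordine_es, c_ordine_salesnet, v_oyov, v_annuale,
       v_oneoff, v_canone, c_operatore_tam, d_ultima_modifica
FROM   v_ordine_cliente_easysell;

-- Step 2: copy only the rejected-order rows.
-- Plain SQL outside PL/SQL cannot reference a package constant, so the
-- placeholder 'X' stands in for the value of pkType.K_SCARTO_ORDINE_CLIENTE.
CREATE TABLE scarti_local AS
SELECT c_codice_es
FROM   scarti_interfaccia_easysell_o
WHERE  t_tabella_es = 'X';

-- Step 3: the same query against the local copies, with UNION ALL.
SELECT c_ordine_es, c_ordine_salesnet, v_oyov, v_annuale,
       v_oneoff, v_canone, c_operatore_tam
FROM   ordine_cliente_local
WHERE  d_ultima_modifica > :data_rif
UNION ALL
SELECT v.c_ordine_es, v.c_ordine_salesnet, v.v_oyov, v.v_annuale,
       v.v_oneoff, v.v_canone, v.c_operatore_tam
FROM   ordine_cliente_local v,
       scarti_local s
WHERE  s.c_codice_es = v.c_ordine_es;
```

Note that UNION ALL skips the duplicate-elimination step that UNION performs, so a row that satisfies both branches is returned twice. If the result must stay distinct, keep UNION on the local copies. The copies also have to be refreshed whenever the remote data changes.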

Related

Import massive table from Oracle to PostgreSQL with oracle-fdw return ORA-01406

I am working on a project to transfer data from an Oracle database to a PostgreSQL database in order to build a data warehouse, using bash and SQL scripts. To access the Oracle database, I use the PostgreSQL extension oracle-fdw.
One of my scripts imports data from a massive table (~100,000,000 new rows/day). The table is partitioned, and each partition contains one day of data. The query I use to import the data looks like this:
INSERT INTO postgre_target_table (some_fields)
SELECT some_aggregated_fields -- (~150 fields)
FROM oracle_source_table
WHERE partition_id = :v_partition_id AND some_others_filters
GROUP BY primary_key;
On the DEV server the query works fine (there is much less data on that server), but in PREPROD it returns the error ORA-01406: fetched column value was truncated.
In some posts, people say that the output fields may be too small, but I get the same error even if I send a simple SELECT query without the INSERT or GROUP BY.
Another idea I found in another post is to create a view on the Oracle side, but my query uses multiple parameters that I cannot use in a view.
The last idea I found is to create an Oracle stored procedure that fills a table with the aggregated data and then import from that table, but the Oracle database is critical and my customer prefers to avoid adding more data to it.
Now I'm starting to think there's no solution, which is not good...
PostgreSQL version : 12.4 / Oracle version : 11.2
UPDATE
It seems my problem is more complicated than I thought.
After applying the change suggested by Laurenz Albe, the query runs correctly in pgAdmin, but the problem still appears when I use the psql command.
Moreover, another query seems to have the same problem. It does not use the same source table as the first query; it uses four joined tables without any partitioning. What the two queries have in common is their structure.
The detail I omitted from the original post is that the purpose of both queries is to pivot a table. They look like this:
SELECT osr.id,
MIN(CASE osr.category
WHEN 123 THEN
1
END) AS field1,
MIN(CASE osr.category
WHEN 264 THEN
1
END) AS field2,
MIN(CASE osr.category
WHEN 975 THEN
1
END) AS field3,
...
FROM oracle_source_table osr
WHERE osr.category IN (123, 264, 975, ...)
GROUP BY osr.id;
Now that I have described what the queries look like, here are some results I got with the second one without changing the value of max_long (this query is lighter than the first one):
Sometimes it works (~10%) and sometimes it fails (~90%) in pgAdmin, but it never works with the psql command
If I delete the WHERE, it always works
I don't understand why deleting the WHERE changes anything: the field used in this clause is a NUMBER(6, 0) between 0 and 2500, and it is still used in the SELECT clause... Also, the four Oracle tables used by this query contain no LONG columns; only the NUMBER datatype is used.
Among the 20 queries I have, only these two have a problem; their structure is similar, and I don't believe in coincidences.
Don't despair!
Set the max_long option on the foreign table big enough that all your oversized data fit.
The documentation has the details:
max_long (optional, defaults to "32767")
The maximal length of any LONG, LONG RAW and XMLTYPE columns in the Oracle table. Possible values are integers between 1 and 1073741823 (the maximal size of a bytea in PostgreSQL). This amount of memory will be allocated at least twice, so large values will consume a lot of memory.
If max_long is less than the length of the longest value retrieved, you will receive the error message
ORA-01406: fetched column value was truncated
Example:
ALTER FOREIGN TABLE my_tab OPTIONS (ADD max_long '1000000');
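As a small companion to the example above: if max_long has already been set on the foreign table, PostgreSQL expects SET rather than ADD, and the current options can be read from the system catalog. The table name my_tab is carried over from the example; the value is only illustrative.

```sql
-- Change an option that already exists on the foreign table.
ALTER FOREIGN TABLE my_tab OPTIONS (SET max_long '1000000');

-- Inspect the options currently stored for the foreign table.
SELECT ftoptions
FROM   pg_foreign_table
WHERE  ftrelid = 'my_tab'::regclass;
```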

Performance balancing while querying in a remote database

We are working with two AIX 7 servers and two Oracle 12.1.0.2 databases.
One database (called DB1 in this topic) is our central PROD database.
The second database (called DB2 in this topic) is also a production database, but it is used for a non-critical application.
We want to isolate the processing executed on DB2 (with joins) from the central production database DB1, so that it impacts DB1 as little as possible.
This processing uses a DBLINK to read DB1 data.
So the question is:
If we perform a query like
select col1, col2 from table1@dblink_DB1, table2@dblink_DB1 where JOIN DB1/DB2
On which server is the join executed?
Do only the reads occur on DB1 (the low-impact case), with the join executed using the SGA/CPU of DB2?
Or is everything executed on DB1?
Such queries (which can be executed entirely remotely, without access to the local database) usually run on the remote site of the DB link, and that is much better than running them on the local database: in the latter case Oracle would read the leading table and then run (select * from table@dblink_DB1 where col=:a) as many times as there are rows returned from table1@dblink_DB1. Of course, you can force the query to run locally with the driving_site hint, but in this case that would be far less efficient for both databases. Read more about the driving_site hint. You should also know that DML statements (update/delete/merge/insert) always run on the database where you change data.
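To illustrate what the driving_site hint does, here is a minimal sketch of a mixed query. The tables local_orders (on DB2) and customers (on DB1) and their columns are hypothetical; only the link name dblink_DB1 comes from the question.

```sql
-- Hypothetical mixed query issued on DB2.
-- DRIVING_SITE(c) asks the optimizer to execute the statement at the site
-- of the table aliased c (DB1), so local_orders rows are shipped to DB1
-- and the join work is done there.
SELECT /*+ DRIVING_SITE(c) */
       o.order_id,
       c.customer_name
FROM   local_orders o,
       customers@dblink_DB1 c
WHERE  c.customer_id = o.customer_id;

-- Naming the local table instead keeps the join on DB2:
-- remote rows are then fetched over the link.
SELECT /*+ DRIVING_SITE(o) */
       o.order_id,
       c.customer_name
FROM   local_orders o,
       customers@dblink_DB1 c
WHERE  c.customer_id = o.customer_id;
```

In the explain plan, the steps that are sent to the other side of the link show up as REMOTE operations, which is a quick way to check where the work actually happens.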

Spoon run slow from Postgres to Oracle

I have a Spoon ETL transformation that reads a table from Postgres and writes it into Oracle.
No transformation, no sort: SELECT col1, col2, ... col33 from table.
350,000 rows in input. The throughput is 40-50 rec/sec.
If I read/write the same table from PS to PS with ALL columns (col1...col100), I get 4,000-5,000 rec/sec.
The same if I read/write from Oracle to Oracle: 4,000-5,000 rec/sec.
So, in my view, it is not a network problem.
If I try with another Postgres table with only 7 columns, the performance is good.
Thanks for the help.
The same happened in my case: while loading data from Oracle and running the job on my local machine (Windows), the processing rate was 40 r/s, whereas it was 3000 r/s for a Vertica database.
I couldn't figure out what the exact problem was, but I found a way to increase the throughput. It worked for me, and you can do the same.
Right-click on the Table Input step and you will see "Change Number Of Copies to Start".
Include the condition below in the where clause to avoid duplicates. When you choose "Change Number Of Copies to Start", the query is triggered N times and returns duplicates; with the condition below in the where clause, each copy reads only its own distinct records.
where ora_hash(v_account_number,10)=${internal.step.copynr}
v_account_number is the primary key in my case.
The 10 is the maximum bucket number: ora_hash(expr, 10) returns values from 0 to 10, so if, for example, you have chosen 11 copies to start, use 11 - 1 = 10. Adjust it to the number of copies you set.
Note that this works, but I suggest using it only on a local machine for testing purposes; on the server you should not face this issue, so comment the line out when deploying to servers.
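Before relying on the hash condition, it may help to check how evenly the rows spread over the buckets. The sketch below assumes a hypothetical table named accounts; v_account_number is the key column from the answer.

```sql
-- Hypothetical table name; run against the Oracle source.
-- ORA_HASH(expr, 10) yields bucket numbers 0..10, one bucket per step copy
-- when 11 copies are started.
SELECT ORA_HASH(v_account_number, 10) AS bucket,
       COUNT(*)                       AS rows_in_bucket
FROM   accounts
GROUP  BY ORA_HASH(v_account_number, 10)
ORDER  BY bucket;

-- Assumption, not from the answer: ORA_HASH is Oracle-only. On a PostgreSQL
-- source with a non-negative integer key, a modulo such as
--   mod(id, 11) = <copy number>
-- could play the same role.
```

Every row falls into exactly one bucket, which is why each copy of the step sees a disjoint slice of the table.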

After upgrading from Sql Server 2008 to Sql Server 2016 a stored procedure that was fast is now slow

We have a stored procedure that returns all of the records that fall within a geospatial region ("geography"). It uses a CTE (with), some unions, some inner joins and returns the data as XML; nothing controversial or cutting edge but also not trivial.
This stored procedure has served us well for many years on SQL Server 2008. It has been running within 1 sec on a relatively slow server. We have just migrated to SQL Server 2016 on a super fast server with lots of memory and super fast SSDs.
The entire database and associated application is going really fast on this new server and we are very happy with it. However this one stored procedure is running in 16 sec rather than 1 sec - against exactly the same parameters and exactly the same dataset.
We have updated the indexes and statistics on this database. We have also changed the compatibility level of the database from 100 to 130.
Interestingly, I have rewritten the stored procedure to use a temporary table and 'insert' rather than the CTE. This has brought the time down from 16 sec to 4 sec.
The execution plan does not provide any obvious insights into where a bottleneck may be.
We are a bit stuck for ideas. What should we do next? Thanks in advance.
--
I have now spent more time on this problem than I care to admit. I have boiled down the stored procedure to the following query to demonstrate the problem.
drop table #T
declare @viewport sys.geography=convert(sys.geography,0xE610000001041700000000CE08C22D7740C002370B7670F4624000CE08C22D7740C002378B5976F4624000CE08C22D7740C003370B3D7CF4624000CE08C22D7740C003378B2082F4624000CE08C22D7740C003370B0488F4624000CE08C22D7740C004378BE78DF4624000CE08C22D7740C004370BCB93F4624000CE08C22D7740C004378BAE99F4624000CE08C22D7740C005370B929FF4624000CE08C22D7740C005378B75A5F4624000CE08C22D7740C005370B59ABF462406F22B7698E7640C005370B59ABF462406F22B7698E7640C005378B75A5F462406F22B7698E7640C005370B929FF462406F22B7698E7640C004378BAE99F462406F22B7698E7640C004370BCB93F462406F22B7698E7640C004378BE78DF462406F22B7698E7640C003370B0488F462406F22B7698E7640C003378B2082F462406F22B7698E7640C003370B3D7CF462406F22B7698E7640C002378B5976F462406F22B7698E7640C002370B7670F4624000CE08C22D7740C002370B7670F4624001000000020000000001000000FFFFFFFF0000000003)
declare @outputControlParameter nvarchar(max) = 'a value passed in through a parameter to the stored that controls the nature of data to return. This is not the solution you are looking for'
create table #T
(value int)
insert into #T
select 136561 union
select 16482 -- These values are sourced from parameters into the stored proc
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location
inner join GeoServices_GeographicServicesGateway
on GeoServices_Location.GeographicServicesGatewayId = GeoServices_GeographicServicesGateway.GeographicServicesGatewayId
where
(
(len(@outputControlParameter) > 0 and GeoServices_Location.GeographicServicesGatewayId in (select value from #T))
or (len(@outputControlParameter) = 0 and GeoServices_Location.Coordinate.STIntersects(@viewport) = 1)
)
and GeoServices_GeographicServicesGateway.PrimarilyFoundOnLayerId IN (3,8,9,5)
GO
With the stored procedure boiled down to this, it runs in 0 sec on SQL Server 2008 and 5 sec on SQL Server 2016
http://www.filedropper.com/newserver-slowexecutionplan
http://www.filedropper.com/oldserver-fastexecutionplan
SQL Server 2016 is choking on the geospatial STIntersects call, with 94% of the time spent there. SQL Server 2008 spends its time on a bunch of other steps, including Hash Match and Parallelism and other standard operators.
Remember this is the same database. One has just been copied to a SQL Server 2016 machine and had its compatibility level increased.
To get around the problem I have actually rewritten the stored procedure so that SQL Server 2016 does not choke. I have it running in 250 msec. However, this should not have happened in the first place, and I am concerned that there are other previously finely tuned queries or stored procedures that are now not running efficiently.
Thanks in advance.
--
Furthermore, I had a suggestion to add the traceflag -T6534 to start up parameter of the service. It made no difference to the query time. Also I tried adding option(QUERYTRACEON 6534) to the end of the query too but again it made no difference.
From the query plans you provided, I see that the spatial index is not used on the newer server version.
Use a spatial index hint to make sure the query optimizer chooses the plan with the spatial index:
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location with (index ([spatial_index_name]))...
I see that the problem with the hint is the OR operation in the query predicate, so my suggestion with the hint actually won't help in this case.
However, I see that the predicate depends on @outputControlParameter, so rewriting the query to separate these two cases might help (see my proposal below).
Also, from your query plans I see that the query plan on SQL 2008 is parallel while on SQL 2016 it is serial. Use option (recompile, querytraceon 8649) to force a parallel plan (this should help if your new superfast server has more cores than the old one).
if (len(@outputControlParameter) > 0)
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location
inner join GeoServices_GeographicServicesGateway
on GeoServices_Location.GeographicServicesGatewayId = GeoServices_GeographicServicesGateway.GeographicServicesGatewayId
where
GeoServices_Location.GeographicServicesGatewayId in (select value from #T)
and GeoServices_GeographicServicesGateway.PrimarilyFoundOnLayerId IN(3,8,9,5)
option (recompile, querytraceon 8649)
else
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location with (index ([SPATIAL_GeoServices_Location]))
inner join GeoServices_GeographicServicesGateway
on GeoServices_Location.GeographicServicesGatewayId = GeoServices_GeographicServicesGateway.GeographicServicesGatewayId
where
GeoServices_Location.Coordinate.STIntersects(@viewport) = 1
and GeoServices_GeographicServicesGateway.PrimarilyFoundOnLayerId IN (3,8,9,5)
option (recompile, querytraceon 8649)
Check the growth settings of the data/log files on the new server versus the old server, both for the database the query runs on and for tempdb.
Check the log for I/O buffer errors.
Check the recovery model of the databases: simple vs. full/bulk-logged.
Is this behavior consistent? Maybe another process is running during the execution?
Regarding statistics/indexes: are you sure they are built on a correct data sample? (Look at the plan.)
Many more things can be checked, but there is not enough information in this question.

How can I see the SQL execution plan in Oracle?

I'm learning about database indexes right now, and I'm trying to understand the efficiency of using them.
I'd like to see whether a specific query uses an index.
I want to actually see the difference between executing the query using an index and without using the index (so I want to see the execution plan for my query).
I am using SQL*Plus.
How do I see the execution plan, and where in it can I find the information telling me whether my index was used or not?
Try using this code to first explain and then see the plan:
Explain the plan:
explain plan
for
select * from table_name where ...;
See the plan:
select * from table(dbms_xplan.display);
The estimated SQL execution plan
The estimated execution plan is generated by the Optimizer without executing the SQL query. You can generate the estimated execution plan from any SQL client using EXPLAIN PLAN FOR or you can use Oracle SQL Developer for this task.
EXPLAIN PLAN FOR
When using Oracle, if you prepend the EXPLAIN PLAN FOR command to a given SQL query, the database will store the estimated execution plan in the associated PLAN_TABLE:
EXPLAIN PLAN FOR
SELECT p.id
FROM post p
WHERE EXISTS (
SELECT 1
FROM post_comment pc
WHERE
pc.post_id = p.id AND
pc.review = 'Bingo'
)
ORDER BY p.title
OFFSET 20 ROWS
FETCH NEXT 10 ROWS ONLY
To view the estimated execution plan, you need to use DBMS_XPLAN.DISPLAY, as illustrated in the following example:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY (FORMAT=>'ALL +OUTLINE'))
The ALL +OUTLINE formatting option allows you to get more details about the estimated execution plan than using the default formatting option.
Oracle SQL Developer
If you have installed SQL Developer, you can easily get the estimated execution plan for any SQL query without having to prepend the EXPLAIN PLAN FOR command.
The actual SQL execution plan
The actual SQL execution plan is generated by the Optimizer when running the SQL query. So, unlike the estimated Execution Plan, you need to execute the SQL query in order to get its actual execution plan.
The actual plan should not differ significantly from the estimated one, as long as the table statistics have been properly collected by the underlying relational database.
GATHER_PLAN_STATISTICS query hint
To instruct Oracle to store the actual execution plan for a given SQL query, you can use the GATHER_PLAN_STATISTICS query hint:
SELECT /*+ GATHER_PLAN_STATISTICS */
p.id
FROM post p
WHERE EXISTS (
SELECT 1
FROM post_comment pc
WHERE
pc.post_id = p.id AND
pc.review = 'Bingo'
)
ORDER BY p.title
OFFSET 20 ROWS
FETCH NEXT 10 ROWS ONLY
To visualize the actual execution plan, you can use DBMS_XPLAN.DISPLAY_CURSOR:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST ALL +OUTLINE'))
Enable STATISTICS for all queries
If you want to get the execution plans for all queries generated within a given session, you can set the STATISTICS_LEVEL session configuration to ALL:
ALTER SESSION SET STATISTICS_LEVEL='ALL'
This will have the same effect as setting the GATHER_PLAN_STATISTICS query hint on every execution query. So, just like with the GATHER_PLAN_STATISTICS query hint, you can use DBMS_XPLAN.DISPLAY_CURSOR to view the actual execution plan.
You should reset the STATISTICS_LEVEL setting to the default mode once you are done collecting the execution plans you were interested in. This is very important, especially if you are using connection pooling, and database connections get reused.
ALTER SESSION SET STATISTICS_LEVEL='TYPICAL'
Take a look at Explain Plan. EXPLAIN works across many db types.
For SQL*Plus specifically, see the SQL*Plus AUTOTRACE facility.
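A minimal sketch of an AUTOTRACE session in SQL*Plus follows. The table post_comment and its columns are borrowed from the examples earlier in this thread and are only illustrative; the session also needs access to a PLAN_TABLE.

```sql
-- SQL*Plus commands (no trailing semicolon needed on the SET lines).
-- TRACEONLY EXPLAIN prints the optimizer's plan instead of the query result.
SET AUTOTRACE TRACEONLY EXPLAIN

SELECT pc.post_id
FROM   post_comment pc
WHERE  pc.review = 'Bingo';

-- Switch tracing off again for the rest of the session.
SET AUTOTRACE OFF
```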
Try this:
http://www.dba-oracle.com/t_explain_plan.htm
The execution plan will mention the index whenever it is used. Just read through the execution plan.
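One way to see the difference is to explain the same query twice, once as written and once with a FULL hint that asks for a full table scan. This is a sketch under assumptions: the index name idx_post_comment_review is hypothetical, and post_comment is the example table from the answer above. Whether the optimizer picks the index in the first plan depends on the data and statistics.

```sql
-- Hypothetical index on the filtered column.
CREATE INDEX idx_post_comment_review ON post_comment (review);

-- Plan 1: let the optimizer choose. If the index is used, its name appears
-- in the plan next to an INDEX operation (for example INDEX RANGE SCAN).
EXPLAIN PLAN FOR
SELECT pc.post_id
FROM   post_comment pc
WHERE  pc.review = 'Bingo';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan 2: request a full scan for comparison. The plan should then show
-- TABLE ACCESS FULL on POST_COMMENT and no index operation.
EXPLAIN PLAN FOR
SELECT /*+ FULL(pc) */ pc.post_id
FROM   post_comment pc
WHERE  pc.review = 'Bingo';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the Cost and Rows columns of the two plans gives a first impression of what the index saves.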

Resources