I'm pretty new to Oracle databases. I need to extract and compare read/write performance statistics between File System and ASM in Oracle 11g Database Infrastructure R2.
I currently have two virtual machines: one is configured with File System storage and the other with ASM (the diskgroups are OK and good to go). Both run Windows 7 Pro x64 on separate physical HDDs in my computer.
The test I came up with is to create a table in both DBs and, using a custom sequence I created, insert 100,000 records into each. Next, I need to see how well ASM performed compared to File System.
The problem? I don't know how to pull the performance statistics so that I can compare them. Can someone please help? It would be much appreciated.
SQL> create sequence SQX
start with 1
increment by 1;
SQL> create table test(
x number(9),
y date);
SQL> begin
for i in 1.. 100000 loop
insert into test(x,y) values(SQX.nextval,sysdate);
end loop;
end;
/
You are not using the right method. The performance of the I/O system can only be evaluated with specific tools such as ORION (and some others).
This article from Tim Hall gives you a good overview of the possibilities you have: http://oracle-base.com/articles/misc/measuring-storage-performance-for-oracle-systems.php
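If a rough database-side view is still wanted alongside a dedicated I/O tool, the cumulative counters in V$FILESTAT can be read on both VMs before and after the insert test and the differences compared. This is only a sketch; it assumes TIMED_STATISTICS is enabled so that the time columns are populated, and it does not replace a tool such as ORION.

```sql
-- Cumulative datafile I/O since instance startup, one row per datafile.
-- Run before and after the test on each VM and compare the deltas.
-- READTIM/WRITETIM are in hundredths of a second and stay 0 unless
-- TIMED_STATISTICS is TRUE (assumed here).
select d.name      as datafile,
       f.phyrds    as physical_reads,
       f.phywrts   as physical_writes,
       f.phyblkrd  as blocks_read,
       f.phyblkwrt as blocks_written,
       f.readtim   as read_time_cs,
       f.writetim  as write_time_cs
from   v$filestat f
join   v$datafile d on d.file# = f.file#
order  by d.name;
```

Keep in mind that inserted blocks are written to the datafiles later by the database writer, so the write counters may not have moved much immediately after the PL/SQL loop finishes.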
Regards,
Cyryl
I have several Oracle databases where my in-house applications are running. Those applications use both dba_jobs and dba_scheduler_jobs.
I want to write a monitoring function, check_my_jobs, which will be called periodically by Nagios to check whether everything is OK with my jobs. (Are they running? Are they broken? Is next_run_date delayed? And so on.)
Solutions: Because I have to monitor jobs on different databases, there are two ways of implementing a solution:
Create a monitoring function and configuration tables only on one database which will check jobs on every database using database links.
pros: Centralized functionality, easy to maintain.
cons: I have to do the checks using database links.
Create a monitoring function and configuration tables on every database where I want to check jobs.
pros: I don't have to use DB links
cons: Duplicated monitoring code on every database
Which solution is better?
I'd go with option #1 - centralized functionality that uses database links.
Database links have an undeserved bad reputation. One of the main reasons is that too many people use public database links, where anyone connecting to the database can use the link. That's obviously a security nightmare, but it's not the default setting and it's easy to avoid that trap.
Some other issues with database links:
They don't perform well for huge inserts of millions of rows. On the other hand they're great at many small SELECTs or INSERTs. I frequently have hundreds of links open and fetching data concurrently, on 10 year-old hardware, and it works great.
They make execution plans more difficult to troubleshoot.
Not all data types are natively supported. This is better in 12.2, but in earlier versions you will need to use an INSERT to move data types like CLOB into tables, and then read from those tables.
For DDL you'll need to use DBMS_UTILITY.EXEC_DDL_STATEMENT@LINK_NAME('create ...'); Make sure to only use DDL in there. Other types of commands will silently fail.
Links may hang indefinitely in a few rare situations, like if the database has an archiver error or a guaranteed restore point that's full. (This one is really a blessing in disguise - many tools like Oracle Enterprise Manager will not catch those issues. You may want to have a background job checking for database link queries that have been running longer than X minutes.)
Links should not be hard-coded or else they could invalidate the package. But this may not matter - you'll probably want to loop through the list of databases and use dynamic SQL anyway. And if the link doesn't exist it's pretty easy to create a new one. Here's an example:
declare
v_result varchar2(4000);
begin
--Loop through a configuration table of links.
for links in
(
select database_name, db_link
from dbs_to_monitor
left join user_db_links
on dbs_to_monitor.database_name = user_db_links.db_link
order by database_name
) loop
--Run the query if the link exists.
if links.db_link is not null then
begin
--Note the use of REPLACE and the alternative quoting mechanism, q'[...]';
--This looks a bit silly with this small example, but in a real-life query
--it avoids concatenation hell and makes the query much easier to read.
execute immediate replace(q'[
select dummy from dual@#DB_LINK#
]',
'#DB_LINK#', links.db_link)
into v_result;
dbms_output.put_line('Result: '||v_result);
--Catch errors if the links are broken or some other error happens.
exception when others then
dbms_output.put_line('Error with '||links.db_link||': '||sqlerrm);
end;
--Error if the link was not created.
--You will have to run:
--create database link LINK_NAME connect to USERNAME identified by "PASSWORD" using 'TNS_STRING';
else
dbms_output.put_line('ERROR - no database link exists for '||links.database_name||'!');
end if;
end loop;
end;
/
Despite all of that, database links are great because you can do everything in PL/SQL, on one database. In a single language you can create an agentless monitoring solution and don't have to worry about installing and fixing agents.
As an example, I built the open source program Method5 to do everything using database links. With that program installed you could gather results from hundreds of databases as simply as running select * from table(m5('select * from dba_jobs'));. That program is probably overkill for your scenario but it shows that database links are all you need for a full monitoring solution.
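For the checks named in the question (broken jobs, failures, a next run date that has slipped), the query sent to each database could look roughly like the sketch below. The link name MY_LINK and the 10-minute tolerance are placeholders rather than part of the original setup; in the loop shown earlier, the link name would be substituted in with REPLACE.

```sql
-- Hypothetical link name MY_LINK; 10 minutes is an arbitrary tolerance.
-- Old-style DBMS_JOB jobs that are broken, failing, or overdue.
select 'DBMS_JOB'   as job_type,
       to_char(job) as job_name,
       broken       as status,
       failures     as failure_count,
       next_date    as next_run
from   dba_jobs@MY_LINK
where  broken = 'Y'
   or  failures > 0
   or  next_date < sysdate - interval '10' minute
union all
-- Scheduler jobs that are broken or failed, or enabled but overdue.
select 'SCHEDULER',
       owner || '.' || job_name,
       state,
       failure_count,
       cast(next_run_date as date)
from   dba_scheduler_jobs@MY_LINK
where  state in ('BROKEN', 'FAILED')
   or  (enabled = 'TRUE'
        and next_run_date < systimestamp - interval '10' minute);
```

An empty result would map to a Nagios OK; any returned row could be reported as a warning or critical. The "overdue" predicates are a guess at what "delayed" should mean and will probably need tuning for long-running jobs.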
We have a stored procedure that returns all of the records that fall within a geospatial region ("geography"). It uses a CTE (with), some unions, some inner joins and returns the data as XML; nothing controversial or cutting edge but also not trivial.
This stored procedure has served us well for many years on SQL Server 2008. It has been running within 1 sec on a relatively slow server. We have just migrated to SQL Server 2016 on a super fast server with lots of memory and super fast SSDs.
The entire database and associated application is going really fast on this new server and we are very happy with it. However this one stored procedure is running in 16 sec rather than 1 sec - against exactly the same parameters and exactly the same dataset.
We have updated the indexes and statistics on this database. We have also changed the compatibility level of the database from 100 to 130.
Interestingly, I have rewritten the stored procedure to use a temporary table and 'insert' rather than the CTE. This has brought the time down from 16 sec to 4 sec.
The execution plan does not provide any obvious insights into where a bottleneck may be.
We are a bit stuck for ideas. What should we do next? Thanks in advance.
--
I have now spent more time on this problem than I care to admit. I have boiled down the stored procedure to the following query to demonstrate the problem.
drop table #T
declare @viewport sys.geography=convert(sys.geography,0xE610000001041700000000CE08C22D7740C002370B7670F4624000CE08C22D7740C002378B5976F4624000CE08C22D7740C003370B3D7CF4624000CE08C22D7740C003378B2082F4624000CE08C22D7740C003370B0488F4624000CE08C22D7740C004378BE78DF4624000CE08C22D7740C004370BCB93F4624000CE08C22D7740C004378BAE99F4624000CE08C22D7740C005370B929FF4624000CE08C22D7740C005378B75A5F4624000CE08C22D7740C005370B59ABF462406F22B7698E7640C005370B59ABF462406F22B7698E7640C005378B75A5F462406F22B7698E7640C005370B929FF462406F22B7698E7640C004378BAE99F462406F22B7698E7640C004370BCB93F462406F22B7698E7640C004378BE78DF462406F22B7698E7640C003370B0488F462406F22B7698E7640C003378B2082F462406F22B7698E7640C003370B3D7CF462406F22B7698E7640C002378B5976F462406F22B7698E7640C002370B7670F4624000CE08C22D7740C002370B7670F4624001000000020000000001000000FFFFFFFF0000000003)
declare @outputControlParameter nvarchar(max) = 'a value passed in through a parameter to the stored procedure that controls the nature of the data to return. This is not the solution you are looking for'
create table #T
(value int)
insert into #T
select 136561 union
select 16482 -- These values are sourced from parameters into the stored proc
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location
inner join GeoServices_GeographicServicesGateway
on GeoServices_Location.GeographicServicesGatewayId = GeoServices_GeographicServicesGateway.GeographicServicesGatewayId
where
(
(len(@outputControlParameter) > 0 and GeoServices_Location.GeographicServicesGatewayId in (select value from #T))
or (len(@outputControlParameter) = 0 and GeoServices_Location.Coordinate.STIntersects(@viewport) = 1)
)
and GeoServices_GeographicServicesGateway.PrimarilyFoundOnLayerId IN (3,8,9,5)
GO
With the stored procedure boiled down to this, it runs in 0 sec on SQL Server 2008 and 5 sec on SQL Server 2016
http://www.filedropper.com/newserver-slowexecutionplan
http://www.filedropper.com/oldserver-fastexecutionplan
SQL Server 2016 is choking on the geospatial STIntersects call, with 94% of the time spent there. SQL Server 2008 is spending its time on a bunch of other steps, including Hash Match, Parallelism and other standard stuff.
Remember this is the same database. One has just been copied to a SQL Server 2016 machine and had its compatibility level increased.
To get around the problem I have actually rewritten the stored procedure so that SQL Server 2016 does not choke. I have it running in 250 ms. However, this should not have happened in the first place, and I am concerned that there are other previously finely tuned queries or stored procedures that are now not running efficiently.
Thanks in advance.
--
Furthermore, I had a suggestion to add the trace flag -T6534 to the startup parameters of the service. It made no difference to the query time. I also tried adding option(QUERYTRACEON 6534) to the end of the query, but again it made no difference.
From the query plans you provided I see that the spatial index is not used on the newer server version.
Use a spatial index hint to make sure the query optimizer chooses the plan with the spatial index:
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location with (index ([spatial_index_name]))...
I see that the problem with the hint is the OR operation in the query predicate, so my suggestion with the hint actually won't help in this case.
However, I see that the predicate depends on @outputControlParameter, so rewriting the query to separate these two cases might help (see my proposal below).
Also, from your query plans I see that the plan on SQL 2008 is parallel while the one on SQL 2016 is serial. Use option (recompile, querytraceon 8649) to force a parallel plan (this should help if your new superfast server has more cores than the old one).
if (len(@outputControlParameter) > 0)
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location
inner join GeoServices_GeographicServicesGateway
on GeoServices_Location.GeographicServicesGatewayId = GeoServices_GeographicServicesGateway.GeographicServicesGatewayId
where
GeoServices_Location.GeographicServicesGatewayId in (select value from #T)
and GeoServices_GeographicServicesGateway.PrimarilyFoundOnLayerId IN(3,8,9,5)
option (recompile, querytraceon 8649)
else
select
[GeoServices_Location].[GeographicServicesGatewayId],
[GeoServices_Location].[Coordinate].Lat,
[GeoServices_Location].[Coordinate].Long
from GeoServices_Location with (index ([SPATIAL_GeoServices_Location]))
inner join GeoServices_GeographicServicesGateway
on GeoServices_Location.GeographicServicesGatewayId = GeoServices_GeographicServicesGateway.GeographicServicesGatewayId
where
GeoServices_Location.Coordinate.STIntersects(@viewport) = 1
and GeoServices_GeographicServicesGateway.PrimarilyFoundOnLayerId IN (3,8,9,5)
option (recompile, querytraceon 8649)
Check the growth settings of the data/log files on the new server vs the old server, both for the DB the query is running on and for tempdb.
Check the log for I/O buffer errors.
Check the recovery model of the DBs: simple vs full/bulk-logged.
Is this behavior consistent? Maybe another process is running during the execution?
Regarding statistics/indexes: are you sure they are built on a correct data sample? (Look at the plan.)
Many more things can be checked or done, but there is not enough info in this question.
The problem I am trying to solve:
I have a SAS dataset work.testData (in the work library) that contains 8 columns and around 1 million rows. All columns are text (i.e. no numeric data). This SAS dataset is around 100 MB in file size. My objective is a step that copies this entire SAS dataset into Oracle, i.e. something like a "copy and paste" of the SAS dataset from the SAS platform to the Oracle platform. The rationale is that, on a daily basis, this table in Oracle gets "replaced" by the one in SAS, which enables downstream Oracle processes.
My approach to solve the problem:
One-off initial setup in Oracle:
In Oracle, I created a table called testData with a table structure pretty much identical to the SAS dataset testData. (i.e. Same table name, same number of columns, same column names, etc.).
On-going repeating process:
In SAS, do a SQL pass-through to truncate ora.testData (i.e. remove all rows whilst keeping the table structure). This ensures that ora.testData is empty before inserting from SAS.
In SAS, a LIBNAME statement to assign the Oracle database as a SAS library (called ora). So I can "see" what's in Oracle and perform read/update from SAS.
In SAS, a PROC SQL procedure to "insert" data from the SAS dataset work.testData into the Oracle table ora.testData.
Sample code
One-off initial setup in Oracle:
Step 1: Run this Oracle SQL Script in Oracle SQL Developer (to create table structure for table testData. 0 rows of data to begin with.)
DROP TABLE testData;
CREATE TABLE testData
(
NODENAME VARCHAR2(64) NOT NULL,
STORAGE_NAME VARCHAR2(100) NOT NULL,
TS VARCHAR2(10) NOT NULL,
STORAGE_TYPE VARCHAR2(12) NOT NULL,
CAPACITY_MB VARCHAR2(11) NOT NULL,
MAX_UTIL_PCT VARCHAR2(12) NOT NULL,
AVG_UTIL_PCT VARCHAR2(12) NOT NULL,
JOBRUN_START_TIME VARCHAR2(19) NOT NULL
)
;
COMMIT;
On-going repeating process:
Step 2, 3 and 4: Run this SAS code in SAS
******************************************************;
******* On-going repeatable process starts here ******;
******************************************************;
*** Step 2: Truncate the temporary Oracle transaction dataset;
proc sql;
connect to oracle (user=XXX password=YYY path=ZZZ);
execute (
truncate table testData
) by oracle;
execute (
commit
) by oracle;
disconnect from oracle;
quit;
*** Step 3: Assign Oracle DB as a libname;
LIBNAME ora Oracle user=XXX password=YYY path=ZZZ dbcommit=100000;
*** Step 4: Insert data from SAS to Oracle;
PROC SQL;
insert into ora.testData
select NODENAME length=64,
STORAGE_NAME length=100,
TS length=10,
STORAGE_TYPE length=12,
CAPACITY_MB length=11,
MAX_UTIL_PCT length=12,
AVG_UTIL_PCT length=12,
JOBRUN_START_TIME length=19
from work.testData;
QUIT;
******************************************************;
**** On-going repeatable process ends here *****;
******************************************************;
The limitation / problem to my approach:
The PROC SQL step (which transfers 100 MB of data from SAS to Oracle) takes around 5 hours to perform, so the job takes too long to run!
The Question:
Is there a more sensible way to perform data transfer from SAS to Oracle? (i.e. updating an Oracle table from SAS).
First off, you can do the drop/recreate from SAS if that's a necessity. I wouldn't drop and recreate each time, since a truncate seems an easier way to get the same result, but if you have other reasons then that's fine. Either way, you can use execute (truncate table xyz) by oracle, or a similar statement to drop, over a pass-through connection.
Second, assuming there are no constraints or indexes on the table - which seems likely given you are dropping and recreating it - you may not be able to improve this, because it may be based on network latency. However, there is one area you should look in the connection settings (which you don't provide): how often SAS commits the data.
There are two ways to control this, the DBCOMMIT setting and the BULKLOAD setting. The former controls how frequently commits are executed (so if DBCOMMIT=100 then a commit is executed every 100 rows). More frequent commits = less data is lost if a random failure occurs, but much slower execution. DBCOMMIT defaults to 0 for PROC SQL INSERT, which means just make one commit (fastest option assuming no errors), so this is less likely to be helpful unless you're overriding this.
Bulkload is probably my recommendation; that uses SQLLDR to load your data, ie, it batches the whole bit over to Oracle and then says 'Load this please, thanks.' It only works with certain settings and certain kinds of queries, but it ought to work here (subject to other conditions - read the documentation page above).
If you're using BULKLOAD, then you may be up against network latency. 5 hours for 100 MB seems slow, but I've seen all sorts of things in my (relatively short) day. If BULKLOAD didn't work I would probably bring in the Oracle DBAs and have them troubleshoot this, starting from a .csv file and a SQL*LDR command file (which should be basically identical to what SAS is doing with BULKLOAD); they should know how to troubleshoot that and at least be able to monitor performance of the database itself. If there are constraints on other tables that are problematic here (ie, other tables that too-frequently recalculate themselves based on your inserts or whatever), they should be able to find out and recommend solutions.
You could look into PROC DBLOAD, which sometimes is faster than inserts in SQL (though all in all shouldn't really be, and is an 'older' procedure not used too much anymore). You could also look into whether you can avoid doing a complete flush and fill (ie, if there's a way to transfer less data across the network), or even simply shrinking the column sizes.
I am seeing poor performance in Oracle (11g) when trying to copy CLOBs from one database to another. I have tried several things, but haven't been able to improve this.
The CLOBs are used for gathering report data. This can be quite large on a record by record basis. I am calling a procedure on the remote databases (across a WAN) to build the data, then copying the results back to the database at the corporate headquarters for comparison. The general format is:
CREATE TABLE my_report(the_db VARCHAR2(30), object_id VARCHAR2(30),
final_value CLOB, CONSTRAINT my_report_pk PRIMARY KEY (the_db, object_id));
To gain performance, I accumulate the results for remote sites into remote copies of the table. At the end of the procedure run, I try to copy the data back. This query is very simple:
INSERT INTO my_report SELECT * FROM my_report@europe;
The performance that I am seeing is around 9 rows per second, with an average CLOB size of 3500 bytes. (I am using CLOBs as this size often goes above 4k, the VARCHAR2 limit.) For 70,000 records (not uncommon) this takes around 2 hours to transfer. I have tried using the create table as select method, but this gets the same performance. I also spent more than a few hours tuning SQL*NET, but see no improvement from this. Changing the Arraysize does not improve the performance (though it can reduce it if the value is reduced).
I am able to get a copy over using the old exp/imp methods (export the table from remote, import it back in), which runs much faster, but this is fairly manual for my automated report. I have considered trying to write a pipelined function to select this data from, using it to split the CLOBS into BYTE/VARCHAR2 chunks (with an additional chunk number column), but didn't want to do this if someone had tried it and found a problem.
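The chunking idea could also be sketched as a plain view on the remote side rather than a pipelined function. This is an unverified sketch under assumptions: the view name my_report_chunks and the 1000-character chunk size are made up for illustration, and the cap of 100 chunks per CLOB would need to cover the largest value actually stored.

```sql
-- Created on the remote database. Names and sizes are illustrative.
create or replace view my_report_chunks as
select r.the_db,
       r.object_id,
       c.chunk_no,
       -- dbms_lob.substr(lob, amount, offset): 1000 characters per chunk
       -- stays within the 4000-byte VARCHAR2 limit in SQL even when
       -- characters take up to 4 bytes each.
       dbms_lob.substr(r.final_value, 1000, (c.chunk_no - 1) * 1000 + 1) as chunk
from   my_report r
join   (select level as chunk_no from dual connect by level <= 100) c
  on   c.chunk_no <= ceil(dbms_lob.getlength(r.final_value) / 1000);
```

The local side would then select from my_report_chunks over the database link, which carries only VARCHAR2 columns, and reassemble each CLOB in chunk_no order. Rows with a NULL or empty CLOB produce no chunks and would have to be copied separately.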
Thanks for your help.
I was able to get better performance when increasing arraysize to 1500 or higher. See also attached document: http://www.fors.com/velpuri2/PERFORMANCE/SQLNET.pdf
Is there any subprogram similar to DBMS_METADATA.GET_DDL that can actually export the table data as INSERT statements?
For example, using DBMS_METADATA.GET_DDL('TABLE', 'MYTABLE', 'MYOWNER') will export the CREATE TABLE script for MYOWNER.MYTABLE. Any such things to generate all data from MYOWNER.MYTABLE as INSERT statements?
I know that tools such as TOAD for Oracle or SQL Developer can export as INSERT statements pretty fast, but I need a more programmatic way of doing it. Also, I cannot create any procedures or functions in the database I'm working in.
Thanks.
As far as I know, there is no Oracle supplied package to do this. And I would be skeptical of any 3rd party tool that claims to accomplish this goal, because it's basically impossible.
I once wrote a package like this, and quickly regretted it. It's easy to get something that works 99% of the time, but that last 1% will kill you.
If you really need something like this, and need it to be very accurate, you must tightly control what data is allowed and what tools can be used to run the script. Below is a small fraction of the issues you will face:
Escaping
Single inserts are very slow (especially if it goes over a network)
Combining inserts is faster, but can run into some nasty parsing bugs when you start inserting hundreds of rows
There are many potential data types, including custom ones. You may only have NUMBER, VARCHAR2, and DATE now, but what happens if someone adds RAW, BLOB, BFILE, nested tables, etc.?
Storing LOBs requires breaking the data into chunks because of VARCHAR2 size limitations (4000 or 32767, depending on how you do it).
Character set issues - This will drive you ¿¿¿¿¿¿¿ insane.
Environment limitations - For example, SQL*Plus does not allow more than 2500 characters per line, and will drop whitespace at the end of your line.
Referential Integrity - You'll need to disable these constraints or insert data in the right order.
"Fake" columns - virtual columns, XML lobs, etc. - don't import these.
Missing partitions - If you're not using INTERVAL partitioning you may need to manually create them.
Novalidated data - Just about any constraint can be violated, so you may need to disable everything.
If you want your data to be accurate you just have to use the Oracle utilities, like data pump and export.
Why don't you use a regular export?
If you must, you can generate the export script yourself:
Let's assume a table myTable(Name VARCHAR(30), AGE NUMBER, Address VARCHAR(60)).
select 'INSERT INTO myTable values(''' || Name || ''',' || AGE || ',''' || Address || ''');' from myTable
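A generator like this breaks as soon as a value contains a single quote or is NULL. A slightly more defensive sketch, still only suitable for simple VARCHAR and NUMBER columns like those of the hypothetical myTable here, doubles embedded quotes and writes NULL as a literal:

```sql
-- Generates one INSERT statement per row of the example table myTable.
-- replace(x, '''', '''''') doubles any single quote inside the value;
-- nvl2/nvl emit the keyword NULL instead of an empty string or gap.
select 'INSERT INTO myTable (Name, AGE, Address) VALUES ('
       || nvl2(Name, '''' || replace(Name, '''', '''''') || '''', 'NULL')
       || ', '
       || nvl(to_char(AGE), 'NULL')
       || ', '
       || nvl2(Address, '''' || replace(Address, '''', '''''') || '''', 'NULL')
       || ');' as insert_stmt
from   myTable;
```

to_char(AGE) follows the session's NLS settings, so non-integer numbers could come out with a decimal comma; dates, LOBs and the other cases listed earlier are not handled at all.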
Oracle SQL Developer does that with its Export feature, for DDL as well as the data itself.
It can be a bit inconvenient for huge tables and is likely to run into the issues mentioned above, but it works well 99% of the time.