I have to use a PL/SQL package as an API for importing data into an Oracle database. I'm doing this from a Java application with the latest ojdbc driver. All statements (PreparedStatements, of course) used during the import are initialized only once and reused for every set to import.
Now I'm facing the following problem: the first call to a procedure of the package takes over 90% of the time for one set. I have to call about 10 procedures during the import; the first one takes about 4 seconds, the rest about 0.4 seconds each. It doesn't matter whether it's the 10th or the 100,000th set to import, the first procedure call always takes that long.
Importantly, if I call a different procedure first, that one takes the 90%. So, maybe I'm wrong, but is this something about package initialization? If I'm (re)using prepared statements, though, shouldn't that happen only on the very first call?
The PL/SQL package has about 10,000 lines of code and also calls several other packages during the import.
So now my questions are:
What are possible reasons for this problem? And what are potential solutions?
Are there any tools I can use to identify the culprit?
EDIT: I was able to identify the cause of the slow import. It had nothing to do with faulty code; the reason was simply the kind of data I used in my test scenario. My mistake was always importing the same data.
If thread one updated a data set in the first procedure, it held a lock on that row until the commit at the end of the complete import. Threads two to n then tried to update exactly the same row. The result was effectively a serialization of all threads.
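For anyone hitting the same problem: while the import is running, this kind of row-lock serialization shows up as blocked sessions. A quick check (it requires SELECT access on v$session) is:

-- sessions waiting on another session, typically on "enq: TX - row lock contention"
SELECT sid, blocking_session, event, seconds_in_wait
FROM   v$session
WHERE  blocking_session IS NOT NULL;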
First of all, this is not normal, so there is definitely something awry with your code. But without being able to see your source there's no way we're going to spot the problem. And frankly I don't want to debug 10,000 LOC, not even my own, let alone yours. Sorry.
So the best we can do is give you some pointers.
One:
"The first call of an procedure of the package takes over 90% of the
time for one set. .... if I'm calling another procedure on first
position this on takes the 90%"
Perhaps there is some common piece of code which every procedure executes, and which behaves differently depending on whether the calling procedure is the first one to execute it in any given run. You need to locate that rogue code.
Two:
" I've used the profiler in pl/sql developer. The execution is very
fast there. "
Your program behaves differently depending on whether you call it from PL/SQL Developer or JDBC. So there is a strong possibility that the problem lies not in the PL/SQL code but in the JDBC code. Acquiring database connections is definitely one potential source of pain. Depending on your architecture, network traffic may be another problem: are you returning lots of data to the Java program which is then used in subsequent procedure calls?
In short: you either need to identify something common in your PL/SQL code which can cause the same outcome in different procedure calls, or identify what happens differently when you call the program from PL/SQL Developer versus JDBC.
Related
I have a ~2,300 line package, which is split into many procedures and functions. It is running slower than I would like. Many years ago, on a previous release of Oracle (9i or 11g), I had a similar problem, and I was able to build a hierarchical structure which contained everything that was executed in the procedure/package and how much time was spent on each item.
I cannot seem to find a tutorial/blog that shows how to accomplish this. It is probably done with the DBMS_STATS package, but I find Oracle's documentation unsuitable for task-oriented problem solving. It may be great if you want to learn everything there is to know about a subject, but generally all I need to know is how to solve the issue I am currently working on.
At any rate, can someone point me to how I can get the runtime statistics for a run of an Oracle procedure?
There are two options:
dbms_profiler - records the time spent on each statement executed. docs
dbms_hprof - similar, but collects statistics on the hierarchy of PL/SQL calls docs
Either method will require some setup using SYS (DBA) access. Setup Instructions for dbms_profiler
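As a minimal sketch (assuming the profiler tables have been created with proftab.sql as per those instructions; my_package.my_proc stands in for your own code):

BEGIN
  DBMS_PROFILER.start_profiler(run_comment => 'test run');
  my_package.my_proc;            -- the code you want to profile
  DBMS_PROFILER.stop_profiler;   -- ends the run and flushes the collected data
END;
/

-- time per line, slowest first (total_time is in nanoseconds)
SELECT u.unit_name, d.line#, d.total_occur, d.total_time
FROM   plsql_profiler_units u
JOIN   plsql_profiler_data d
  ON   d.runid = u.runid
 AND   d.unit_number = u.unit_number
ORDER  BY d.total_time DESC;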
Have you tried the dbms_utility.get_time method, as described in THIS post?
Link to the original question which is almost the same as your question.
https://www.quora.com/How-can-I-log-the-execution-time-of-a-stored-procedures-in-a-table-in-Oracle-database
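For reference, the technique in that post boils down to something like this (my_package.my_proc is a placeholder; get_time counts in hundredths of a second):

SET SERVEROUTPUT ON

DECLARE
  t0 PLS_INTEGER;
BEGIN
  t0 := DBMS_UTILITY.get_time;
  my_package.my_proc;   -- the call you want to time
  DBMS_OUTPUT.put_line('Elapsed: ' || (DBMS_UTILITY.get_time - t0) / 100 || ' seconds');
END;
/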
I am a relative beginner at SSIS so I may be doing something silly.
I have a process that involves looping over a heterogeneous queue and processing the objects one at a time. The process is currently done in 'set logic' and it's dropping records. I was asked to rework it in a looping manner, so that decision has been made for me.
I have chosen to implement queue logic in 1 package and the actual processing in another package.
This is all going relatively well considering...
I now have the process up and running, but it's slow: 9 seconds per item. Clearly I can't present this solution. :-)
One thing I notice: 1.5-2 seconds of each loop are spent on the Execute Package Task in the queue loop.
I can't figure out how to get a hard number; I am using the "flashing green box" method of performance tuning. The other steps seem to be very fast. Adding indexes, changing SQL to stored procedures, all the usual tricks have helped.
Is the UI reliable at all with regard to boxes turning white/yellow/green? Some tasks report times in the Progress tab, some don't seem to. So I am counting yellow time.
Should calling a subpackage be that expensive? One change I made was setting 'RunInASeparateProcess' to FALSE. I did that because the subpackage otherwise produces the following message:
Error: 0xC0012024 at Script Task: The task "Script Task" cannot run on this edition of Integration Services. It requires a higher level edition.
Task failed: Script Task
The reading I have done seems to advocate multiple packages. Does anyone have any counter-patterns? Should I stay the course? I started changing to one package, but copy/paste doesn't seem to work well with Sequence Containers, and I would also need to recreate all the variables in the parent package. Doable, but I'm not sure that is the answer.
Does anyone know of any tuning resources/websites/books they would be willing to share?
Update - I have been tearing things down in an effort to figure out what the problem is. I was thinking it was the package configurations passing variable values, but I don't think that is it: I can pass variables to another package with nothing in it and it is fast.
I can make the trivial subpackage slow by adding the two connection managers to it.
I suddenly realize I may be making and breaking connections to both an Oracle server and a SQL Server in the main package and then again in the subpackage.
Am I correct in this observation?
Is there any way I can reuse the connection between the two packages?
When I google it, most of what I see are suggestions for passing the connection string.
UPDATE - I combined the two packages into one. Performance is now about 1.25 seconds per item, down from about 9. The only thing I can point to that changed is that I am now reusing a single connection instead of making multiple connections.
Thanks, I appreciate any help you are kind enough to offer.
Greg
Once you enable logging, I'd suggest running the package from a command window using dtexec. While that doesn't perfectly duplicate the server environment, it does have the advantages of (a) eliminating BIDS as a potential performance issue and (b) being something you can do without jumping through change control hoops.
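For example (the package path is made up; substitute your own):

dtexec /F "C:\ETL\QueueLoop.dtsx"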
When running a sproc with SqlDataAdapter.Fill(), I noticed it was taking upwards of 90 seconds, while running the same sproc in Management Studio took only 1-2 seconds. I started messing around with the parameters to try to find the issue, and I eventually did, though it's a strange one. I discovered that if I simply declared three new variables in the sproc, directly copied the contents of the parameters into them, and then used those new variables in the body of the sproc, the Fill() method dropped to 1-2 seconds, just like running the sproc directly in Management Studio. In other words, doing this:
CREATE PROCEDURE [dbo].[TestProc]
    @location nvarchar(100), @startTime datetime, @endTime datetime
AS
declare @location2 nvarchar(100), @endTime2 datetime, @startTime2 datetime
set @location2 = @location
set @startTime2 = @startTime
set @endTime2 = @endTime
--... query using @location2, @startTime2, @endTime2
If I changed even just one of the references in the query body from @startTime2 back to @startTime (the actual parameter passed in from C#), the query jumped right back up to around 90s or even longer.
So... why in the world does SqlDataAdapter or SQL Server care what I do with its parameters once they're passed into the sproc? Why would this affect execution time? Any guidance on how to root out this issue further is greatly appreciated. Thanks!
Edit: Although I could've sworn there was a difference between running the query from C# using SqlDataAdapter and using Management Studio, as of right now I can't replicate the difference. Now Management Studio also takes > 90 seconds to run the sproc when I do NOT copy the parameters. This is a huge relief, because it means the problem isn't somehow with C#, and it's just a more run-of-the-mill (though still strange) SQL Server issue. One of the guys on my team who's an excellent SQL guy is looking at the execution plan of the sproc when run with and without first copying the parameters. If we figure it out, I'll post the answer here. Thanks for the help so far!
It's undoubtedly a case of parameter sniffing and improper reuse of execution plans that were created with a different set of parameters that had a very different optimal access pattern.
The sudden change, with the two access styles now performing the same (rather than one being quick), strongly suggests that the cached execution plan was updated to a version that performs slowly with both access methods, or that your data or your parameters changed.
In my experience, the usual culprit in this sort of small/huge difference in execution time is the use of a nested loop join where a hash match is actually required. (For a very small number of rows the nested loop is superior; past a certain fairly low barrier, the hash match becomes less expensive. Unless you're lucky and your inputs are both sorted by the join criteria, a merge join is rarely seen, as sorting large sets tends to be more expensive than hash matching.)
The reason that your parameter tweaking in the SP fixed the problem is that SQL Server then became aware you were doing something to the parameters by setting them to some value (regardless of what you set them to), so it had to compute a new execution plan. It threw out the old one and designed a new access path based on the current set of parameters, with better results.
If this problem persists, then playing with SP recompilation/clearing the plan cache, combined with using different parameters that deal with hugely different numbers of rows, may reveal where the problem is. Look at the execution plan used to run the SP with different parameters and see the effects of different access strategies being employed in the wrong conditions.
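A sketch of that experiment, using the procedure from the question (sys.dm_exec_procedure_stats needs SQL Server 2008 or later):

-- throw away the cached plan; the next call will be compiled
-- against whatever parameter values it happens to receive
EXEC sp_recompile N'dbo.TestProc';

-- inspect the plan that is actually cached for the procedure
SELECT qp.query_plan
FROM   sys.dm_exec_procedure_stats ps
CROSS APPLY sys.dm_exec_query_plan(ps.plan_handle) qp
WHERE  ps.object_id = OBJECT_ID(N'dbo.TestProc')
AND    ps.database_id = DB_ID();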
We're discussing the performance impact of putting a common function/procedure in a separate package or using a local copy in each package.
My thinking is that it would be cleaner to have the common code in a package, but others worry about the performance overhead.
Thoughts/experiences?
Put it in one place and call it from many - that's basic code re-use. Any overhead in calling one package from another will be minuscule. If they still doubt it, get them to demonstrate the performance difference.
The worriers are perfectly at liberty to prove the validity of their concerns by demonstrating a performance overhead. That ought to be trivial.
Meanwhile they should consider the memory usage and maintenance overhead in repeating code in multiple places.
Common code goes in one package.
Unless you are calling a procedure in a package situated on a different database over a DB link, the overhead of calling a procedure in another package is negligible.
There are some performance concerns, as well as memory concerns, but they are few and far between. Besides, they fall into the "Oracle black magic" category. For example, check this link. If you can clearly understand what that is about, consider yourself an accomplished Oracle professional. If not, don't worry, because it's really hardcore stuff.
What you should consider, however, is the question of dependencies.
An Oracle package consists of two parts: a spec and a body.
The spec is the header, where the public procedures and functions (that is, those visible outside the package) are declared.
The body is their implementation.
Although closely connected, they are two separate database objects.
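For example (util_pkg is purely illustrative):

CREATE OR REPLACE PACKAGE util_pkg AS        -- the spec: the public interface
  FUNCTION add_vat(p_net IN NUMBER) RETURN NUMBER;
END util_pkg;
/

CREATE OR REPLACE PACKAGE BODY util_pkg AS   -- the body: the implementation
  FUNCTION add_vat(p_net IN NUMBER) RETURN NUMBER IS
  BEGIN
    RETURN p_net * 1.2;                      -- assumes a 20% VAT rate
  END add_vat;
END util_pkg;
/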
Oracle uses a package status to indicate whether a package is VALID or INVALID. If a package becomes invalid, then all the other packages that depend on it become invalid too.
For example, if your program calls a procedure in package A, which calls a procedure in package B, that means your program depends on package A, and package A depends on package B. In Oracle this relation is transitive, which means that your program also depends on package B. Hence, if package B is broken, your program also breaks (terminates with an error).
That should be obvious. Less obvious is that Oracle also tracks dependencies at compile time via the package specs.
Let's assume that the specs and bodies of both package A and package B are successfully compiled and valid. Then you go and make a change to the body of package B. Because you only changed the body, not the spec, Oracle assumes that the way package B is called has not changed and doesn't do anything. But if, along with the body, you change package B's spec, then Oracle suspects that you might have changed some procedure's parameters or something like that, and marks the whole chain as invalid (that is, packages B and A and your program).
Please note that Oracle doesn't check whether the spec really changed; it just checks the timestamp. So it's enough just to recompile the spec to invalidate everything.
If invalidation happens, the next time you run your program it will fail.
But if you run it once more after that, Oracle will recompile everything automatically and execute it successfully.
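You can see this in action with the hypothetical packages A and B from above (at least on releases that follow the timestamp model described here):

ALTER PACKAGE b COMPILE BODY;            -- dependents of B stay VALID

ALTER PACKAGE b COMPILE SPECIFICATION;   -- invalidates everything depending on B

SELECT object_name, object_type, status
FROM   user_objects
WHERE  object_name IN ('A', 'B');        -- package A now shows INVALID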
I know it's confusing. That's Oracle. Don't try to wrap your brains too much around it.
You only need to remember a couple of things:
Avoid complex inter-package dependencies if possible. If one thing depends on another thing, which depends on one more thing, and so on, then the probability of invalidating everything by recompiling just one database object is extremely high.
One of the worst cases is a "circular" dependency, where package A calls a procedure in package B, and package B calls a procedure in package A. In that case it is almost impossible to compile one without breaking the other.
Keep the package spec and package body in separate source files. And if you only need to change the body, don't touch the spec!
I am not a DBA. However, I work on a web application that lives entirely in an Oracle database (Yes, it uses PL/SQL procedures to write HTML to clobs and then vomits the clob at your browser. No, it wasn't my idea. Yes, I'll wait while you go cry.).
We're having some performance issues, and I've been assigned to find some bottlenecks and remove them. How do I go about measuring Oracle performance and finding these bottlenecks? Our unhelpful sysadmin says that Grid Control wasn't helpful, and that he had to rely on "his experience" and queries against the data dictionary and "v$" views.
I'd like to run some tests against my local Oracle instance and see if I can replicate the problems he found so I can make sure my changes are actually improving things. Could someone please point me in the direction of learning how to do this?
Not too surprisingly, there are entire books written on this topic.
Really what you need to do is divide and conquer.
The first thing is to ask yourself some standard common-sense questions; for example, has performance slowly degraded, or was there a big drop in performance recently?
After the obvious, a good starting point is to narrow down where to spend your time - the top queries are a decent start. This will give you the particular queries which run for a long time.
If you know specifically which screens in your front end are slow and you know which stored procedures go with them, I'd put in some logging: simple DBMS_OUTPUT.PUT_LINE calls with some wall-clock information at key points. Then I'd run those interactively in SQL Navigator to see what part of the stored procedure is going slow.
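A sketch of what that logging might look like (the procedure name and the steps are placeholders):

CREATE OR REPLACE PROCEDURE render_page IS
BEGIN
  DBMS_OUTPUT.put_line('start  ' || TO_CHAR(SYSTIMESTAMP, 'HH24:MI:SS.FF3'));
  -- ... step 1: build the page header ...
  DBMS_OUTPUT.put_line('header ' || TO_CHAR(SYSTIMESTAMP, 'HH24:MI:SS.FF3'));
  -- ... step 2: build the page body ...
  DBMS_OUTPUT.put_line('body   ' || TO_CHAR(SYSTIMESTAMP, 'HH24:MI:SS.FF3'));
END render_page;
/
-- remember SET SERVEROUTPUT ON (or your IDE's equivalent) to see the output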
Once you start narrowing it down, you can evaluate why a particular query is going slow. EXPLAIN PLAN will be your best friend to start with.
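For example (the query is a stand-in for one of your slow ones):

EXPLAIN PLAN FOR
  SELECT * FROM employees WHERE department_id = 10;

SELECT * FROM TABLE(DBMS_XPLAN.display);   -- shows the optimizer's chosen plan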
It can be overwhelming to analyze database performance with Grid Control, and I would suggest starting with the simpler AWR report - you can find the scripts to generate it in $ORACLE_HOME/rdbms/admin on the DB host. This report ranks the SQL seen in the database by various categories (e.g. CPU time, disk I/O, elapsed time) and gives you an idea where the bottlenecks are on the database side.
One advantage of the AWR report is that it is a SQL*Plus script and can be run from any client - it will spool HTML or text files to your client.
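For example, from SQL*Plus connected as a suitably privileged user ("?" expands to ORACLE_HOME; the script prompts for the report format, snapshot range and output file name):

@?/rdbms/admin/awrrpt.sql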
edit:
There's a package called DBMS_PROFILER that lets you do what you want, I think. I found out my IDE will profile PL/SQL code, as I would guess many other IDEs do. They probably use this package.
http://www.dba-oracle.com/t_dbms_profiler.htm
http://www.databasejournal.com/features/oracle/article.php/2197231/Oracles-DBMSPROFILER-PLSQL-Performance-Tuning.htm
edit 2:
I just tried the Profiler out in PL/SQL Developer. It creates a report on the total time and occurrences of snippets of code during runtime, and gives the code location as a unit name and line number.
original:
I'm in the same boat as you, as far as the crazy PL/SQL generated pages go.
I work in a small office with no programmer particularly versed in the advanced features of Oracle. We don't have any established methods of measuring and improving performance, but the best bet, I'd guess, is to try out different PL/SQL IDEs.
I use PL/SQL Developer by Allround Automations. It has testing functionality that lets you debug your PL/SQL code, and it may have some benchmarking feature I haven't used yet.
Hope you find a better answer. I'd like to know too. :)
"I work on a web application that
lives entirely in an Oracle database
(Yes, it uses PL/SQL procedures to
write HTML to clobs and then vomits
the clob at your browser"
Is it the Apex product? That's the web application environment now included as a standard part of the Oracle database (although technically it doesn't spit out CLOBs).
If so, there is a whole bunch of instrumentation already built into the product/environment (e.g. it keeps a rolling two-week history of activity).