I'm looking to learn about possible ways of deploying a large number of PL/SQL packages, as dependencies seem to be quite a problem.
As it works now, packages are deployed in several iterations, redeploying those that couldn't be deployed in the previous pass because of a missing dependency.
I hope to hear about different approaches to the problem, and I will update my question to make it clearer if you have questions for me.
Would it even be OK to seek guidance this way on SO?
I would recommend installing all specs first, in the proper order.
Then install all bodies.
All dependencies need to be defined once, up front, in the master install script.
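To illustrate the "proper order" idea, here is a minimal sketch in Python, assuming the dependencies between specs are written down once as a map. The package names and the SPEC_DEPENDENCIES map are made up for the example; a real master install script would list your own packages.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each spec lists the specs it references.
SPEC_DEPENDENCIES = {
    "pkg_orders": {"pkg_customers", "pkg_util"},
    "pkg_customers": {"pkg_util"},
    "pkg_util": set(),
}


def install_order(dependencies):
    """Return spec names so that each spec comes after the specs it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


order = install_order(SPEC_DEPENDENCIES)
print(order)  # specs in an order that is safe to install
```

The resulting list could be used to generate the master script, so the order only has to be recomputed when the dependency map changes.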
Update:
Another thing you can do:
1) Load all package specs into a main list (I assume all specs and bodies are stored separately; if not, they need to be split first).
2) Loop over all specs in the main list.
3) Try to compile each one. Add it to a failed list if it fails.
4) When you reach the end of the main list, replace its items with the items from the failed list.
5) Go to step 2.
At the same time, you can save the results of the first run, so that a second run can order the items according to the results of the previous one. This will minimize the number of iterations.
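The steps above can be sketched roughly as follows in Python. This is only an illustration: try_compile is a stand-in for whatever actually runs the compile, the fake_compile function and package names are invented for the demo, and the "stop when a pass makes no progress" check is my addition so the loop cannot run forever.

```python
def deploy_in_passes(specs, try_compile):
    """Compile specs in repeated passes until all succeed or a pass makes no progress.

    specs: list of spec names in their initial order.
    try_compile: callable(name) -> bool, a stand-in for the real compile step.
    Returns (compiled_in_order, still_failing).
    """
    pending = list(specs)
    compiled = []
    while pending:
        failed = []
        for name in pending:
            if try_compile(name):
                compiled.append(name)
            else:
                failed.append(name)
        if len(failed) == len(pending):
            # Nothing compiled in this pass: give up instead of looping forever.
            return compiled, failed
        pending = failed
    return compiled, []


# Simulated compiler (hypothetical): a spec compiles only once its dependencies are installed.
DEPENDS_ON = {
    "pkg_orders": {"pkg_customers"},
    "pkg_customers": {"pkg_util"},
    "pkg_util": set(),
}
installed = set()


def fake_compile(name):
    if DEPENDS_ON[name] <= installed:
        installed.add(name)
        return True
    return False


order, failing = deploy_in_passes(["pkg_orders", "pkg_customers", "pkg_util"], fake_compile)
print(order, failing)
```

The returned order is the sequence in which specs actually compiled, so saving it and feeding it back in as the initial list on the next run should let everything compile in a single pass, as long as the dependencies have not changed.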
Bodies can be installed in any order...
However, you need to keep in mind dependencies on views and from views: specs can depend on views (view_name%TYPE, cursors, etc.) and views can depend on package specs (they can call package functions). This is not a trivial problem... Can you explain how it is solved currently, please?
I myself just install all the procedural code (in any order) and later (re)compile all invalid objects.
There are several ways to recompile all invalid objects:
UTL_RECOMP
DBMS_UTILITY.COMPILE_SCHEMA
Manually, as Tom Kyte suggests (this is what I use)
I am generating code for Oracle stored procedures (SPs) based on a dependency graph. In order to reduce the size of each recompilation unit, I have organised them into Oracle packages. But this results in a large number of packages (250+). There are 1000+ procedures.
My question: will this large number of packages create any performance issues with Oracle 11gR2+? Will there be any deployment or management issues? Can somebody share their experience of working with a large number of Oracle packages?
In one of the products that I've worked on, the schema had many thousands of stored procedures, functions and packages, totalling almost half a million lines of code. Oracle shouldn't have any issues with this at all. The biggest headache was maintenance and version control of the objects.
We stored each package header and body in separate files so that we could version them independently (the header typically changes much less frequently than the body), and we used an editor that supported ctags to make navigation within a package more manageable. When you have a hundred or more procedures and functions within a package, finding the right place to actually make changes takes as much time as actually doing the work! Another great tool was OpenGrok, which indexes the entire code base and makes searching for things super quick.
Deployment-wise, we just used a simple script that wrapped SQL*Plus to load the files and log any issues with compilation or connectivity. There are more advanced tools that sit on top of your source control system and "manage" deployment and dependencies, but we never found them necessary.
The purpose of writing packages in Oracle is to implement modular design, as follows:
Consolidate logically related procedures and functions under one package
Variables can be declared at package level (global) and accessed within the package or, if declared in the spec, from outside it
When a packaged program unit is first called, the whole package is loaded into memory at once, which reduces disk I/O on later calls to other units in the same package
More details are available at:
https://oracle-concepts-learning.blogspot.com/
I am taking my first steps in Go and am obviously reasoning from what I'm used to in other languages, rather than understanding Go's specifics and style yet.
I've decided to rewrite a Ruby background job that takes ages to execute. It iterates over a huge table in my database and processes the data for each row individually, so it's a good candidate for parallelization.
Coming from a Ruby on Rails task using an ORM, I thought of this as a quite simple two-file program: one file containing a struct type and its methods, to represent and work with a row, and the main file to run the database query and loop over the rows (maybe a third file to abstract the database access logic if it gets too heavy in my main file). This file separation was meant for codebase clarity rather than having any relevance in the final binary.
I've read and seen several things on the topic, including questions and answers here, and it always tends to come down to writing code as libraries, installing them, and then using them in a single-file program (in package main).
I've read that one may pass multiple files to go build/run, but it complains if there are several package names (so basically, everything should be in main), and it doesn't seem that common.
So, my questions are:
Did I get it right: is having code mostly as a library, with a single-file program importing it, the way to go?
If so, how do you deal with having to build libraries repeatedly? Do you build/install on each change in the library codebase before executing (which is way less convenient than what go run promises to be), or is there something common I don't know of for running a library-dependent program quickly while working on the libraries' code?
No.
Go and the go tool work on packages only (only go run works on files, but that is a different story): you should not think about files when organizing Go code, but about packages. A package may be split into several files, but that is used for keeping test code separate, limiting file size, or grouping types, methods, functions, etc.
Your questions:
Did I get it right: is having code mostly as a library, with a single-file program importing it, the way to go?
No. Sometimes this has advantages, sometimes not. Sometimes a split may be one lib + one short main; in other cases, just one large main might be better. Again: it is all about packages and never about files. There is nothing wrong with a single 12-file main package if this is a real standalone program. But maybe extracting some stuff into one or a few other packages might result in more readable code. It all depends.
If so, how do you deal with having to build libraries repeatedly? Do you build/install on each change in the library codebase before executing (which is way less convenient than what go run promises to be), or is there something common I don't know of for running a library-dependent program quickly while working on the libraries' code?
The go tool tracks the dependencies and recompiles whatever is necessary. Say you have a package main in main.go which imports a package foo. If you execute go run main.go it will recompile package foo transparently iff needed. So for quick hacks: No need for a two-step go install foo; go run main. Once you extract code into three packages foo, bar, and waz it might be a bit faster to install foo, bar and waz.
No. Look at the Go commands and Go standard packages for exemplars of good programming style.
Go Source Code
I am a relative beginner at SSIS, so I may be doing something silly.
I have a process that involves looping over a heterogeneous queue and processing the objects one at a time. The process is currently done in 'set logic' and it's dropping stuff. I was asked to rework it in a looping manner, so that decision has been made for me.
I have chosen to implement the queue logic in one package and the actual processing in another package.
This is all going relatively well considering...
I now have the process up and running, but it's slow: 9 seconds per item. Clearly I can't present this solution. :-)
One thing I notice: 1.5 to 2 seconds of each loop are spent on the ExecutePackage Task in the queue loop.
I can't figure out how to get a hard number; I am using the flashing-green-box method of performance tuning. The other steps seem to be very fast. Adding indexes, changing SQL to SPs, all the usual tricks have helped.
Is the UI reliable at all with regard to boxes turning white/yellow/green? Some tasks report times in the progress tab, some don't seem to. So I am counting yellow time.
Should calling a subpackage be that expensive? One change I made was setting 'RunInASeparateProcess' to FALSE. I did that because the subpackage otherwise produces the following message:
Error: 0xC0012024 at Script Task: The task "Script Task" cannot run on this edition of Integration Services. It requires a higher level edition.
Task failed: Script Task
The reading I have done seems to advocate multiple packages. Does anyone have any counter-patterns? Should I stay the course? I started changing to one package. Copy/paste doesn't seem to work well with SequenceContainers. I would also need to recreate all the variables in the parent package. Doable, but I'm not sure that is the answer.
Does anyone know of any tuning resources/websites/books they would be willing to share?
Update - I have been tearing things down in an effort to figure out what the problem is. I was thinking it was the package configurations passing variable values. I don't think that is it. I can pass variables to another package with nothing in it and it is fast.
I can make the trivial subpackage slow by adding the two connection managers to it.
I suddenly realize I may be making and breaking a connection to both an Oracle server and a SQL Server in the main package and then again in the subpackage.
Am I correct in this observation?
Is there any way I can reuse the connection between the two packages?
When I google it, most of what I see is suggestions for passing the connection string.
UPDATE - I combined the two packages into one. Performance is now about 1.25 seconds per item, down from about 9. The only thing I can point to that changed is that I am now reusing a single connection instead of making multiple connections.
Thanks, I appreciate any help you are kind enough to offer.
Greg
Once you enable logging, I'd suggest running the package from a command window using dtexec. While that doesn't perfectly duplicate the server environment, it does have the advantages of (a) eliminating BIDS as a potential performance issue and (b) being something you can do without jumping through change control hoops.
On a Maven project, in the process-test-resources phase, I set up the database schemas with sql-maven-plugin. In this project there are N database shards, which I set up with N repeated blocks that have exactly the same content bar the database name. Everything works as expected.
The problem is that as the number of shards grows, the number of similar blocks grows too, which is cumbersome and makes maintenance annoying (since, by definition, all of those databases are literally the same). I would like to be able to define a "list" of database names and let sql-maven-plugin run once for each, without having to define the whole block many times.
I'm not looking for changes in the test setup, as I positively want to set up as many shards as needed in the test environment. I only need some "Maven sugar" for conveniently defining the values over which the executions should "loop".
I understand that Maven itself does not support iteration, and I am looking for alternatives or ideas for how to achieve this better. Things that come to mind are:
Using/writing a "loop" plugin that manages the multiple parameterized executions
Extending sql-maven-plugin to support my use case
???
Does anyone have a better/cleaner solution?
Thanks in advance.
In this case I would recommend using the maven-antrun-plugin to handle this situation, but of course it is also possible to implement a dedicated Maven plugin for this purpose.
We're discussing the performance impact of putting a common function/procedure in a separate package or using a local copy in each package.
My thinking is that it would be cleaner to have the common code in a package, but others worry about the performance overhead.
Thoughts/experiences?
Put it in one place and call it from many - that's basic code re-use. Any overhead in calling one package from another will be minuscule. If they still doubt it, get them to demonstrate the performance difference.
The worriers are perfectly at liberty to prove the validity of their concerns by demonstrating a performance overhead. That ought to be trivial.
Meanwhile they should consider the memory usage and maintenance overhead in repeating code in multiple places.
Common code goes in one package.
Unless you are calling a procedure in a package located in a different database over a DB link, the overhead of calling a procedure in another package is negligible.
There are some performance concerns, as well as memory concerns, but they are few and far between. Besides, they fall into the "Oracle black magic" category. For example, check this link. If you can clearly understand what that is about, consider yourself an accomplished Oracle professional. If not - don't worry, because it's really hardcore stuff.
What you should consider, however, is the question of dependencies.
An Oracle package consists of 2 parts, spec and body:
Spec is a header, where public procedures and functions (that is, visible outside the package) are declared.
Body is their implementation.
Although closely connected, they are 2 separate database objects.
Oracle uses package status to indicate whether the package is VALID or INVALID. If a package becomes invalid, then all the other packages that depend on it become invalid too.
For example, if your programme calls a procedure in package A, which calls a procedure in package B, that means that your programme depends on package A, and package A depends on package B. In Oracle this relation is transitive, which means that your programme depends on package B. Hence, if package B is broken, your programme also breaks (terminates with an error).
That should be obvious. But less obvious is that Oracle also tracks dependencies at compile time via package specs.
Let's assume that the specs and bodies of both package A and package B are successfully compiled and valid.
Then you go and make a change to the package body of package B. Because you only changed the body, but not the spec, Oracle assumes that the way package B is called has not changed and doesn't do anything.
But if along with the body you change package B's spec, then Oracle suspects that you might have changed some procedure's parameters or something like that, and marks the whole chain as invalid (that is, package B and A and your programme).
Please note that Oracle doesn't check whether the spec has really changed; it just checks the timestamp. So it's enough just to recompile the spec to invalidate everything.
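To make the chain reaction described above concrete, here is a toy model in Python. It only mimics the behaviour as described in this answer (a body-only recompile touches nothing, a spec recompile invalidates everything that depends on it, directly or indirectly); it is not how Oracle is implemented, and the object names and the CALLS map are made up.

```python
from collections import deque

# Hypothetical call graph: each object maps to the packages it calls.
CALLS = {
    "my_program": {"pkg_a"},
    "pkg_a": {"pkg_b"},
    "pkg_b": set(),
}


def invalidated_by(changed, calls, spec_changed):
    """Return the set of objects marked invalid after recompiling `changed`."""
    if not spec_changed:
        return set()  # body-only change: dependents are left alone
    invalid = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        # Anything that calls an invalid object becomes invalid as well.
        for obj, callees in calls.items():
            if current in callees and obj not in invalid:
                invalid.add(obj)
                queue.append(obj)
    return invalid


print(invalidated_by("pkg_b", CALLS, spec_changed=False))
print(invalidated_by("pkg_b", CALLS, spec_changed=True))
```

In this model, recompiling only the body of pkg_b invalidates nothing, while recompiling its spec invalidates pkg_b, pkg_a and my_program.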
If invalidation happens, the next time you run your programme it will fail.
But if you run it one more time after that, Oracle will recompile everything automatically and execute it successfully.
I know it's confusing. That's Oracle. Don't try to wrap your brains too much around it.
You only need to remember a couple of things:
Avoid complex inter-package dependencies if possible. If one thing depends on another thing, which depends on one more thing and so on, then the probability of invalidating everything by recompiling just one database object is extremely high.
One of the worst cases is "circular" dependencies, when package A calls a procedure in package B, and package B calls a procedure in package A. In that case it is almost impossible to compile one without breaking the other.
Keep package spec and package body in separate source files. And if you need to change the body only, don't touch the spec!
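If you keep a list of which package calls which, the circular case mentioned above is easy to check for before deploying. A minimal sketch in Python, assuming such a map exists (the package names here are hypothetical):

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(dependencies):
    """Return a list of names forming a cycle, or None if there is none."""
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as err:
        # The second element of args holds the detected cycle;
        # its first and last entries are the same node.
        return err.args[1]
    return None


# Hypothetical example: pkg_a calls pkg_b and pkg_b calls pkg_a.
circular = {"pkg_a": {"pkg_b"}, "pkg_b": {"pkg_a"}}
layered = {"pkg_a": {"pkg_b"}, "pkg_b": set()}

print(find_cycle(circular))
print(find_cycle(layered))
```

A check like this could run as part of a build, so a new circular dependency is caught when it is introduced rather than at deployment time.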