Overhead for calling a procedure/function in another Oracle package

We're discussing the performance impact of putting a common function/procedure in a separate package or using a local copy in each package.
My thinking is that it would be cleaner to have the common code in a package, but others worry about the performance overhead.
Thoughts/experiences?

Put it in one place and call it from many - that's basic code re-use. Any overhead in calling one package from another will be minuscule. If they still doubt it, get them to demonstrate the performance difference.

The worriers are perfectly at liberty to prove the validity of their concerns by demonstrating a performance overhead; that ought to be trivial.
Meanwhile they should consider the memory usage and maintenance overhead in repeating code in multiple places.
Common code goes in one package.

Unless you are calling a procedure in a package situated on a different database over a DB link, the overhead of calling a procedure in another package is negligible.
There are some performance concerns, as well as memory concerns, but they are few and far between. Besides, they fall into the "Oracle black magic" category. For example, check this link. If you can clearly understand what it is about, consider yourself an accomplished Oracle professional. If not, don't worry, because it's really hardcore stuff.
What you should consider, however, is the question of dependencies.
An Oracle package consists of two parts: a spec and a body:
Spec is a header, where public procedures and functions (that is, visible outside the package) are declared.
Body is their implementation.
Although closely connected, they are 2 separate database objects.
Oracle uses package status to indicate if the package is VALID or INVALID. If a package becomes invalid, then all the other packages
that depend on it become invalid too.
For example, if your program calls a procedure in package A, which calls a procedure in package B, that means that your program depends on package A, and package A depends on package B. In Oracle this relation is transitive, which means that your program also depends on package B. Hence, if package B is broken, your program also breaks (terminates with an error).
That should be obvious. What is less obvious is that Oracle also tracks dependencies at compile time via package specs.
Let's assume that the specs and bodies for both package A and package B are successfully compiled and valid.
Then you go and make a change to the body of package B. Because you only changed the body, not the spec,
Oracle assumes that the way package B is called has not changed and doesn't do anything.
But if along with the body you change the package B's spec, then Oracle suspects that you might have changed some
procedure's parameters or something like that, and marks the whole chain as invalid (that is, package B and A and your programme).
Please note that Oracle doesn't check whether the spec really changed, it just checks the timestamp. So it's enough just to recompile the spec to invalidate everything.
If invalidation happens, the next time you run your program it will fail.
But if you run it one more time after that, Oracle will recompile everything automatically and execute it successfully.
I know it's confusing. That's Oracle. Don't try to wrap your brains too much around it.
You only need to remember a couple of things:
Avoid complex inter-package dependencies if possible. If one thing depends on another thing, which depends on yet another thing, and so on,
then the probability of invalidating everything by recompiling just one database object is extremely high.
One of the worst cases is a "circular" dependency, where package A calls a procedure in package B, and package B calls a procedure in package A.
In that case it is almost impossible to compile one without breaking the other.
Keep package spec and package body in separate source files. And if you need to change the body only, don't touch the spec!

Related

go/packages.Load() returns different types.Named for identical code

I'm trying to determine whether two types are identical with go/types.Identical, and surprisingly enough, the types for the same piece of code returned by different packages.Load calls are always different.
Am I making a wrong assumption on those APIs?
package main

import (
	"fmt"
	"go/types"

	"golang.org/x/tools/go/packages"
)

func getTimeTime() *types.Named {
	pkgs, err := packages.Load(&packages.Config{
		Mode: packages.NeedImports | packages.NeedSyntax | packages.NeedTypes | packages.NeedDeps | packages.NeedTypesInfo,
		Overlay: map[string][]byte{
			"/t1.go": []byte(`package t
import "time"
var x time.Time`),
		},
	}, "file=/t1.go")
	if err != nil {
		panic(err)
	}
	for _, v := range pkgs[0].TypesInfo.Types {
		return v.Type.(*types.Named) // named type of time.Time
	}
	panic("unreachable")
}

func main() {
	t1, t2 := getTimeTime(), getTimeTime()
	if !types.Identical(t1, t2) {
		fmt.Println(t1, t2, "are different")
	}
}
Apparently there is a piece of hidden documentation explaining all of this (it is attached to nothing, so it doesn't show up on godoc): https://cs.opensource.google/go/x/tools/+/master:go/packages/doc.go;l=75
Motivation and design considerations
The new package's design solves problems addressed by two existing
packages: go/build, which locates and describes packages, and
golang.org/x/tools/go/loader, which loads, parses and type-checks
them. The go/build.Package structure encodes too much of the 'go
build' way of organizing projects, leaving us in need of a data type
that describes a package of Go source code independent of the
underlying build system. We wanted something that works equally well
with go build and vgo, and also other build systems such as Bazel and
Blaze, making it possible to construct analysis tools that work in all
these environments. Tools such as errcheck and staticcheck were
essentially unavailable to the Go community at Google, and some of
Google's internal tools for Go are unavailable externally. This new
package provides a uniform way to obtain package metadata by querying
each of these build systems, optionally supporting their preferred
command-line notations for packages, so that tools integrate neatly
with users' build environments. The Metadata query function executes
an external query tool appropriate to the current workspace.
Loading packages always returns the complete import graph "all the way
down", even if all you want is information about a single package,
because the query mechanisms of all the build systems we currently
support ({go,vgo} list, and blaze/bazel aspect-based query) cannot
provide detailed information about one package without visiting all
its dependencies too, so there is no additional asymptotic cost to
providing transitive information. (This property might not be true of
a hypothetical 5th build system.)
In calls to TypeCheck, all initial packages, and any package that
transitively depends on one of them, must be loaded from source.
Consider A->B->C->D->E: if A,C are initial, A,B,C must be loaded from
source; D may be loaded from export data, and E may not be loaded at
all (though it's possible that D's export data mentions it, so a
types.Package may be created for it and exposed.)
The old loader had a feature to suppress type-checking of function
bodies on a per-package basis, primarily intended to reduce the work
of obtaining type information for imported packages. Now that imports
are satisfied by export data, the optimization no longer seems
necessary.
Despite some early attempts, the old loader did not exploit export
data, instead always using the equivalent of WholeProgram mode. This
was due to the complexity of mixing source and export data packages
(now resolved by the upward traversal mentioned above), and because
export data files were nearly always missing or stale. Now that 'go
build' supports caching, all the underlying build systems can
guarantee to produce export data in a reasonable (amortized) time.
Test "main" packages synthesized by the build system are now reported
as first-class packages, avoiding the need for clients (such as
go/ssa) to reinvent this generation logic.
One way in which go/packages is simpler than the old loader is in its
treatment of in-package tests. In-package tests are packages that
consist of all the files of the library under test, plus the test
files. The old loader constructed in-package tests by a two-phase
process of mutation called "augmentation": first it would construct
and type check all the ordinary library packages and type-check the
packages that depend on them; then it would add more (test) files to
the package and type-check again. This two-phase approach had four
major problems: 1) in processing the tests, the loader modified the
library package, leaving no way for a client application to see
both the test package and the library package; one would mutate
into the other. 2) because test files can declare additional methods
on types defined in the library portion of the package, the
dispatch of method calls in the library portion was affected by the
presence of the test files. This should have been a clue that the
packages were logically different. 3) this model of "augmentation"
assumed at most one in-package test per library package, which is
true of projects using 'go build', but not other build systems. 4)
because of the two-phase nature of test processing, all packages that
import the library package had to be processed before augmentation,
forcing a "one-shot" API and preventing the client from calling Load
in several times in sequence as is now possible in WholeProgram mode.
(TypeCheck mode has a similar one-shot restriction for a different
reason.)
Early drafts of this package supported "multi-shot" operation.
Although it allowed clients to make a sequence of calls (or concurrent
calls) to Load, building up the graph of Packages incrementally, it
was of marginal value: it complicated the API (since it allowed some
options to vary across calls but not others), it complicated the
implementation, it cannot be made to work in Types mode, as explained
above, and it was less efficient than making one combined call (when
this is possible). Among the clients we have inspected, none made
multiple calls to load but could not be easily and satisfactorily
modified to make only a single call. However, applications changes may
be required. For example, the ssadump command loads the user-specified
packages and in addition the runtime package. It is tempting to
simply append "runtime" to the user-provided list, but that does not
work if the user specified an ad-hoc package such as [a.go b.go].
Instead, ssadump no longer requests the runtime package, but seeks it
among the dependencies of the user-specified packages, and emits an
error if it is not found.
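Beyond the quoted design notes, the behaviour in the question seems to be expected: in practice types.Identical treats two named types as identical only when they are the same type object, and each packages.Load call builds its own type-checking universe, so even time.Time comes back as a distinct *types.Named per call. A minimal sketch using go/types directly (no packages.Load; the type name T is made up purely for illustration) shows the same effect:

package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// The same source, type-checked twice in two independent runs.
const src = `package t
type T struct{ X int }
var x T`

// typeOfX type-checks src in a fresh universe and returns the type of x.
func typeOfX() types.Type {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "t.go", src, 0)
	if err != nil {
		panic(err)
	}
	// No Importer is needed because src imports nothing.
	pkg, err := (&types.Config{}).Check("t", fset, []*ast.File{f}, nil)
	if err != nil {
		panic(err)
	}
	return pkg.Scope().Lookup("x").Type()
}

func main() {
	t1, t2 := typeOfX(), typeOfX()
	// Prints false: the two T's are distinct *types.Named objects,
	// even though they come from identical source code.
	fmt.Println(types.Identical(t1, t2))
}

The practical consequence appears to be: load everything you want to compare in a single packages.Load call, so that shared dependencies such as time are type-checked (or imported from export data) once and their named types are the same objects.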

Splitting client/server code

I'm developing a client/server application in Go, and there are certain logical entities that exist both on the client and the server (the list is limited).
I would like to ensure that certain code for these entities is included ONLY in the server part but NOT in the client (vice versa would be nice, but is not so important).
The naive thought would be to rely on dead code elimination, but from my brief research it's not a reliable way to handle the task... go build simply won't eliminate dead code if it might have been reached via reflection (it doesn't matter that it wasn't, and there is no option to tune this).
A more solid approach seems to be splitting the code into different packages and importing appropriately; this seems reliable but over-complicates the code, forcing you to physically split certain entities between different packages and constantly keep this in mind...
And finally there are build tags, allowing multiple files under the same package to be built conditionally for client and server.
My motivation for using build tags is that I want to keep the code as clean as possible without introducing any synthetic entities.
Use case:
there are certain cryptography routines; the client works with the public key, the server operates with the private key... The code logically belongs to the same entity.
What option would you choose and why?
This "dead code elimination" is already done–partly–by the go tool. The go tool does not include everything from imported packages, only what is needed (or more precisely: it excludes things that it can prove unreachable).
For example this application
package main; import _ "fmt"; func main() {}
results in an executable binary that is almost 300 KB smaller (on Windows amd64) than the following:
package main; import "fmt"; func main() {fmt.Println()}
Excludable things include functions, types and even unexported and exported variables. This is possible because even with reflection you can't call a function or "instantiate" types or refer to package variables just by having their names as a string value. So maybe you shouldn't worry about it that much.
Edit: With Go 1.7 released, it is even better: read blog post: Smaller Go 1.7 binaries
So if you design your types and functions well, and you don't create "giant" registries where you enumerate functions and types (which explicitly generates references to them and thus renders them unexcludable), compiled binaries will only contain what is actually used from imported packages.
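As an illustration of that registry caveat, here is a minimal, hypothetical sketch (all names made up): putting a function into a map creates an explicit reference, so the linker must keep the function in the binary even if nothing else ever calls it.

package main

import "fmt"

// rarelyUsed is never called directly anywhere in the program.
func rarelyUsed() { fmt.Println("still linked in") }

// registry references rarelyUsed explicitly, so the linker cannot prove it
// unreachable and has to keep it, even if this key is never looked up.
var registry = map[string]func(){
	"rarelyUsed": rarelyUsed,
}

func main() {
	if f, ok := registry["rarelyUsed"]; ok {
		f()
	}
}

Drop the map entry, and the linker can again prove rarelyUsed unreachable and exclude it.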
I would not suggest using build tags for this kind of problem. By using them, you take on an extra responsibility for maintaining package/file dependencies yourself, which is otherwise done by the go tool.
You should not design and separate code into packages to make your output executables smaller. You should design and separate code into packages based on logic.
I would go with separating things into packages where it is really needed, and importing appropriately. Because this is really what you want: some code intended only for the client, some only for the server. You may have to think a little more during your design and coding phase, but at least you will see the result (what actually belongs to, and gets compiled into, the client and the server).
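A minimal two-file sketch of that package split, assuming an Ed25519-style scheme (the module path, package names, and types below are illustrative, not from the question):

// Shared package, imported by both client and server
// (hypothetical path example.com/app/token):
package token

import "crypto/ed25519"

// Token is the entity both sides understand.
type Token struct {
	Payload   []byte
	Signature []byte
}

// Verify needs only the public key, so it is safe to link into the client.
func (t Token) Verify(pub ed25519.PublicKey) bool {
	return ed25519.Verify(pub, t.Payload, t.Signature)
}

// Server-only package (hypothetical path example.com/app/tokensign),
// imported solely by the server binary, so Sign and the private-key
// handling never end up in the client executable:
package tokensign

import (
	"crypto/ed25519"

	"example.com/app/token"
)

// Sign produces a signed token using the private key.
func Sign(priv ed25519.PrivateKey, payload []byte) token.Token {
	return token.Token{Payload: payload, Signature: ed25519.Sign(priv, payload)}
}

Because the client simply never imports the server-only package, the separation is enforced by the import graph rather than by build tags you have to maintain by hand.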

Large number of Oracle Packages

I am generating code for Oracle stored procedures (SPs) based on a dependency graph. In order to reduce the recompilation unit size I have organised them into Oracle packages. But this results in a large number of packages (250+). The number of procedures is 1000+.
My question: will this large number of packages create any performance issues with Oracle 11gR2+? Or will there be any deployment/management related issues? Can somebody share their experience working with a large number of Oracle packages?
In one of the products that I've worked on, the schema had many thousands of stored procedures, functions and packages, totalling almost half a million lines of code. Oracle shouldn't have any issues with this at all. The biggest headache was maintenance and version control of the objects.
We stored each package header and body in separate files so that we could version them independently (the header typically changes much less frequently than the body), and we used an editor that supported ctags to make navigation within a package more manageable. When you have a hundred or more procedures and functions within a package, finding the right place to actually make changes takes as much time as actually doing the work! Another great tool was OpenGrok, which indexes the entire code base and makes searching for things super quick.
Deployment wise, we just used a simple script that wrapped SQL*Plus to load the files and log any issues with compilation or connectivity. There are more advanced tools that sit on top of your source control system and "manage" deployment and dependencies, but we never found that it was necessary.
The purpose of writing packages in Oracle is to implement a modular methodology, which can be summed up as follows:
Consolidate logically related procedures and functions under one package
Package-level variables can be declared globally and accessed from within the package or from outside it
The program units defined in a package are loaded into memory at once, which reduces context-switching time
More details are provided at this link:
https://oracle-concepts-learning.blogspot.com/

Is it recommended to keep a program's sources (as opposed to library sources) in a single file?

I am making my first steps in Go and am obviously reasoning from what I'm used to in other languages rather than from an understanding of Go's specifics and style.
I've decided to rewrite a Ruby background job I have that takes ages to execute. It iterates over a huge table in my database and processes data individually for each row, so it's a good candidate for parallelization.
Coming from a Ruby on Rails task and using an ORM, this was meant to be, as I thought of it, a quite simple two-file program: one file that would contain a struct type and its methods to represent and work with a row, and the main file to run the database query and loop over the rows (maybe a third file to abstract the database access logic if it gets too heavy in my main file). This file separation, as I intended it, was meant for codebase clarity rather than having any relevance to the final binary.
I've read and seen several things on the topic, including questions and answers here, and it always tends to resolve into writing code as libraries, installing them and then using them in a single-file (package main) program.
I've read that one may pass multiple files to go build/run, but it complains if there are several package names (so basically, everything should be in main), and it doesn't seem that common.
So, my questions are:
did I get it right, and is having the code mostly as a library, with a single-file program importing it, the way to go?
if so, how do you deal with having to build the libraries repeatedly? Do you build/install on each change in the library codebase before executing (which is way less convenient than what go run promises to be), or is there something common I don't know of to run library-dependent programs quickly while working on the library code?
No.
Go and the go tool work on packages only (just go run works on files, but that is a different story): you should not think about files when organizing Go code, but about packages. A package may be split into several files, but that is used to keep test code separate, limit file size, or
group types, methods, functions, etc.
Your questions:
did I get it right, and is having the code mostly as a library, with a single-file program
importing it, the way to go?
No. Sometimes this has advantages, sometimes not. Sometimes a split may be one lib + one short main;
in other cases, just one large main might be better. Again: it is all about packages and never about files. There is nothing wrong with a single 12-file main package if it is a real standalone program. But maybe extracting some stuff into one or a few other packages might result in more readable code. It all depends.
if so, how do you deal with having to build the libraries repeatedly? Do you build/install on each change in the library codebase before executing (which is way less convenient than what go run promises to be), or is there something common I don't know of to run library-dependent programs quickly while working on the library code?
The go tool tracks the dependencies and recompiles whatever is necessary. Say you have a package main in main.go which imports a package foo. If you execute go run main.go, it will recompile package foo transparently if needed. So for quick hacks there is no need for a two-step go install foo; go run main.go. Once you extract code into three packages foo, bar, and waz, it might be a bit faster to install foo, bar and waz first.
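To make the file-vs-package point concrete, here is a minimal sketch (hypothetical file names) of one main package split across two files purely for readability; go run . (or go build) compiles both files together and transparently recompiles any imported packages as needed:

// row.go
package main

// Row models one database row to be processed.
type Row struct {
	ID   int64
	Data string
}

// Process does the per-row work (just a placeholder here).
func (r Row) Process() string { return r.Data }

// main.go
package main

import "fmt"

func main() {
	rows := []Row{{1, "a"}, {2, "b"}}
	for _, r := range rows {
		fmt.Println(r.ID, r.Process())
	}
}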
No. Look at the Go commands and Go standard packages for exemplars of good programming style.
Go Source Code

PL/SQL package call via JDBC performance issue

I have to use a PL/SQL package as an API for importing data into an Oracle database. I'm doing this within a Java application with the latest ojdbc driver. All statements (of course PreparedStatements) I'm using during the import are initialized only once and reused for every set to import.
Now I'm facing the following problem: the first call to a procedure of the package takes over 90% of the time for one set. I have to call about 10 procedures during the import; the first one takes about 4 seconds, the rest about 0.4 seconds. It doesn't matter whether it's the 10th or the 100,000th set to import, the first procedure call always takes that long.
Important to know: if I call another procedure in first position, that one takes the 90%. So, maybe I'm wrong, but is it something about package initialization? But if I'm (re)using prepared statements, shouldn't that happen only on the first call?
The PL/SQL package has about 10,000 lines of code and also calls several other packages during the import.
So now my questions are:
What are possible reasons for this problem? And what are potential solutions?
Are there any tools I can use to identify the causer?
EDIT: I was able to identify the cause of the slow import. It had nothing to do with faulty code or anything like that. The reason was simply the kind of data I used in my test scenario. My mistake was always importing the same data.
If thread one updated a data set in the first procedure, it held a lock on that row until the commit after the complete import. Threads two to n were trying to update exactly the same row. The result was effectively a serialization of all threads.
First of all, this is not normal. So there is definitely something awry with your code. But without being able to see your source there's no way we're going to be able to spot the problem. And frankly I don't want to debug 10000 LOC, not even mine let alone yours. Sorry.
So the best we can do is give you some pointers.
One:
"The first call of an procedure of the package takes over 90% of the
time for one set. .... if I'm calling another procedure on first
position this on takes the 90%"
Perhaps there is some common piece of coding which every procedure executes that behaves differently depending on whether the calling procedure is the first one to execute it in any given run. You need to locate that rogue code.
Two:
" I've used the profiler in pl/sql developer. The execution is very
fast there. "
Your program behaves differently depending on whether you call it from PL/SQL Developer or JDBC. So there is a strong possibility that the problem lies not in the PL/SQL code but in the JDBC code. Acquiring database connections is definitely one potential source of pain. Depending on your architecture, network traffic may be another problem: are you returning lots of data to the Java program which is then used in subsequent procedure calls?
In short: you either need to identify something common in your PL/SQL code which can cause the same outcome in different procedure calls, or identify what happens differently when you call the program from PL/SQL Developer and from JDBC.
