I have a client who runs an Oracle 11 database. The DBA for this client insists that every stored procedure added to the database contain an exception handler (even if the sproc just does a SELECT). Does this make any sense? Our data layer on the web site already has built-in exception handling and logging (which works well). The data layer is written in .NET.
I have told the DBA that an exception handler is only called for when there is some specific action that can be taken in case of an error (like a transaction rollback), or when some predictable error is likely to occur.
He insists that every sproc have an exception block.
Any opinions on whether this is a good or needed practice would be appreciated.
Thanks!
I can't speak to any Oracle-specific need that this may be satisfying. Maybe there's something the DBA knows that we don't?
But, in general, you're correct. An exception handler is needed only where one can meaningfully handle an exception. If there's nothing that can be logically done at that point in the code, then there's no reason to catch an exception. (I wonder... would the DBA's requirement be satisfied if the exception handling block did nothing but re-throw?)
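For illustration, a pass-through handler in Oracle PL/SQL might look something like the sketch below (the table and procedure names are made up); a bare RAISE re-raises the original error and its stack, so nothing gets swallowed:

    CREATE OR REPLACE PROCEDURE get_user_name (
        p_user_id IN  NUMBER,
        p_name    OUT VARCHAR2
    ) AS
    BEGIN
        SELECT name INTO p_name FROM users WHERE user_id = p_user_id;
    EXCEPTION
        WHEN OTHERS THEN
            -- Nothing useful can be done at this point, so preserve the original
            -- error and let the .NET data layer log and handle it as it already does.
            RAISE;
    END get_user_name;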
Now, having said that, there's also a certain amount of prudence in following the coding standards set forth by the team which owns a particular system. That is... Is this a fight worth having? (In my own personal and not universally applicable experience, this is a particularly grueling fight with DBAs. Especially DBAs in charge of dinosaurs like Oracle or DB2. But take that with a grain of salt.)
A good compromise in a situation like this would likely be to sit down with the DBA and discuss what he/she expects the exception handler to do. Logically determine, collaboratively, how to meaningfully handle exceptions in this particular stored procedure. The DBA may see that there's less need for it, or conversely you may find a need for it that you hadn't considered. Either way, it's a good opportunity to arrive at an understanding of an otherwise coarsely-expressed blanket rule as it applies to a specific scenario.
This is a similar argument as when to use a checked exception vs. an unchecked exception in Java. It can be a matter of opinion, but it seems to me that a good starting point is what you have already said: catch the exception if there's a way you can deal with it.
Ruby offers two ways to raise an exception programmatically: raise and fail, both being Kernel methods. According to the documentation, they are absolutely equivalent.
Out of habit, I have only used raise so far. Now I have found several recommendations (for example here) to use raise for exceptions to be caught, and fail for serious errors which are not meant to be handled.
But does it really make sense? When you write a class or module and signal a problem deep inside it with fail, colleagues reviewing the code might well understand your intention, but the person using the code will most likely never look at its source and has no way of knowing whether an exception was caused by a raise or by a fail. Hence, the careful choice between raise and fail can't have any influence on their decision about whether or not to handle it.
Can someone point out flaws in my argument? Or are there other criteria that might make me want to use fail instead of raise?
use 'raise' for exceptions to be caught, and 'fail' for serious errors which are not meant to be handled
This is not what the official style guide or the link you provided say on the matter.
What is meant here is: use raise only in rescue blocks. In other words, use fail when you want to signal that something is failing, and use raise when re-raising an exception.
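A contrived little sketch of that convention (the method is made up purely to show both keywords):

    def divide(a, b)
      # Signalling that this method is failing to meet its contract: use fail.
      fail ArgumentError, 'b must not be zero' if b.zero?

      a / b
    rescue TypeError
      # Re-raising an exception we caught but cannot handle here: use raise.
      raise
    end

    puts divide(10, 2)   # => 5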
As for the "does it matter" part - it is not one of the most hardcore strictly followed rules, but you could make the same argument for any convention. You should follow in that order:
Your project style guide
Your company style guide
The community style guide
Ideally, the three should be the same.
Update: As of this PR (December 2015), the convention is to always use raise.
I once had a conversation with Jim Weirich about this very thing; I have since always used fail when my method is explicitly failing for some reason, and raise to re-throw exceptions.
Here is a post with a message from Jim (almost verbatim to what he said to me in person):
http://www.virtuouscode.com/2014/05/21/jim-weirich-on-exceptions/
Here is the relevant text from the post, a quote attributed to Jim:
Here’s my basic philosophy (and other random thoughts) on exceptions.
When you call a method, you have certain expectations about what the method will accomplish. Formally, these expectations are called post-conditions. A method should throw an exception whenever it fails to meet its postconditions.
To effectively use this strategy, it implies you must have a small understanding of Design by Contract and the meaning of pre- and post-conditions. I think that’s a good thing to know anyways.
Here’s some concrete examples. The Rails model save method:
model.save!
-- post-condition: The model object is saved.
If the model is not saved for some reason, then an exception must be raised because the post-condition is not met.
model.save
-- post-condition: (the model is saved && result == true) ||
(the model is not saved && result == false)
If save doesn’t actually save, then the returned result will be false, but the post-condition is still met, hence no exception.
I find it interesting that the save! method has a vastly simpler post-condition.
On the topic of rescuing exceptions, I think an application should have strategic points where exceptions are rescued. There is little need for rescue/rethrows for the most part. The only time you would want to rescue and rethrow is when you have a job half-way done and you want to undo something to avoid a partially complete state. Your strategic rescue points should be chosen carefully so that the program can continue with other work even if the current operation failed. Transaction processing programs should just move on to the next transaction. A Rails app should recover and be ready to handle the next http request.
Most exception handlers should be generic. Since exceptions indicate a failure of some type, then the handler needs only make a decision on what to do in case of failure. Detailed recovery operations for very specific exceptions are generally discouraged unless the handler is very close (call graph wise) to the point of the exception.
Exceptions should not be used for flow control, use throw/catch for that. This reserves exceptions for true failure conditions.
(An aside, because I use exceptions to indicate failures, I almost always use the fail keyword rather than the raise keyword in Ruby. Fail and raise are synonyms so there is no difference except that fail more clearly communicates that the method has failed. The only time I use raise is when I am catching an exception and re-raising it, because here I’m not failing, but explicitly and purposefully raising an exception. This is a stylistic issue I follow, but I doubt many other people do).
There you have it, a rather rambling memory dump on my thoughts on exceptions.
I know that there are many style guides that do not agree (the style guide used by RuboCop, for example). I don't care. Jim convinced me.
I always tend to "protect" my persistance layer from violations via the service layer. However, I am beginning to wonder if it's really necessary. What's the point in taking the time in making my database robust, building relationships & data integrity when it never actually comes into play.
For example, consider a User table with a unique contraint on the Email field. I would naturally want to write blocker code in my service layer to ensure the email being added isn't already in the database before attempting to add anything. In the past I have never really seen anything wrong with it, however, as I have been exposed to more & more best practises/design principles I feel that this approach isn't very DRY.
So, is it correct to always ensure data going to the persistance layer is indeed "valid" or is it more natural to let the invalid data get to the database and handle the error?
Please don't do that.
Implementing even "simple" constraints such as keys is decidedly non-trivial in a concurrent environment. For example, it is not enough to query the database in one step and allow the insertion in another only if the first step returned empty result - what if a concurrent transaction inserted the same value you are trying to insert (and committed) in between your steps one and two? You have a race condition that could lead to duplicated data. Probably the simplest solution for this is to have a global lock to serialize transactions, but then scalability goes out of the window...
Similar considerations exist for other combinations of INSERT / UPDATE / DELETE operations on keys, as well as other kinds of constraints such as foreign keys and even CHECKs in some cases.
DBMSes have devised very clever ways over the decades to be both correct and performant in situations like these, yet still allow you to define constraints in a declarative manner, minimizing the chance for mistakes. And all the applications accessing the same database automatically benefit from these centralized constraints.
If you absolutely must choose which layer of code shouldn't validate the data, the database should be your last choice.
So, is it correct to always ensure data going to the persistence layer is indeed "valid" (service layer) or is it more natural to let the invalid data get to the database and handle the error?
Never assume correct data and always validate at the database level, as much as you can.
Whether to also validate in upper layers of code depends on a situation, but in the case of key violations, I'd let the database do the heavy lifting.
Even though there isn't a conclusive answer, I think it's a great question.
First, I am a big proponent of including at least basic validation in the database and letting the database do what it is good at. At minimum, this means foreign keys, NOT NULL where appropriate, strongly typed fields wherever possible (e.g. don't put a text field where an integer belongs), unique constraints, etc. Letting the database handle concurrency is also paramount (as @Branko Dimitrijevic pointed out), and transaction atomicity should be owned by the database.
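As a rough sketch of the kind of basic, declarative validation meant here (table and column names invented, generic SQL):

    CREATE TABLE orders (
        order_id    INTEGER        NOT NULL PRIMARY KEY,
        customer_id INTEGER        NOT NULL REFERENCES customers (customer_id),  -- foreign key
        quantity    INTEGER        NOT NULL CHECK (quantity > 0),                -- simple sanity check
        unit_price  DECIMAL(10, 2) NOT NULL,                                     -- strongly typed, not free text
        created_at  TIMESTAMP      NOT NULL
    );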
If this is moderately redundant, then so be it. Better too much validation than too little.
However, I am of the opinion that the business tier should be aware of the validation it is performing even if the logic lives in the database.
It may be easier to distinguish between exceptions and validation errors. In most languages, a failed data operation will probably manifest as some sort of exception. Most people (me included) are of the opinion that it is bad to use exceptions for regular program flow, and I would argue that email validation failure (for example) is not an "exceptional" case.
Taking it to a more ridiculous level, imagine hitting the database just to determine if a user had filled out all required fields on a form.
In other words, I'd rather call a method IsEmailValid() and receive a boolean than have to work out whether the database error that was thrown meant the email was already in use by someone else.
This approach may also perform better, and avoid annoyances like skipped IDs because an INSERT failed (speaking from a SQL Server perspective).
The logic for validating the email might very well live in a reusable stored procedure if it is more complicated than simply a unique constraint.
And ultimately, that simple unique constraint provides final protection in case the business tier makes a mistake.
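A sketch of that arrangement in C# against SQL Server (the helper methods IsEmailValid, IsEmailInUse and InsertUser are invented; SqlException comes from System.Data.SqlClient): validate up front for a friendly boolean result, but still treat the unique constraint as the backstop for the race described in the earlier answer.

    public bool TryRegisterUser(string email)
    {
        if (!IsEmailValid(email) || IsEmailInUse(email))
            return false;                    // ordinary validation failure, no exception involved

        try
        {
            InsertUser(email);               // the INSERT is still protected by the unique constraint
            return true;
        }
        catch (SqlException ex)
        {
            // SQL Server: 2627 = unique constraint violation, 2601 = duplicate key in a unique index
            if (ex.Number == 2627 || ex.Number == 2601)
                return false;                // another session won the race; report "already in use"
            throw;
        }
    }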
Some validation simply doesn't need to make a database call to succeed, even though the database could easily handle it.
Some validation is more complicated than can be expressed using database constructs/functions alone.
Business rules across applications may differ even against the same (completely valid) data.
Some validation is so critical or expensive that it should happen prior to data access.
Some simple constraints like field type/length can be automated (anything run through an ORM probably has some level of automation available).
Two reasons to do it. First, the DB may be accessed from another application.
Second, you might make a wee error in your code and put data in the DB which, because your service layer operates on the assumption that this could never happen, makes it fall over if you are lucky; silent data corruption is the worst case.
I've always looked at rules in the DB as a backstop for that exceptionally rare occasion when I make a mistake in the code. :)
The thing to remember is that you can always relax a constraint if you need to; tightening one after your users have spent a lot of effort entering data will be far more problematic.
Be really wary of that word never; in IT, it means "much sooner than you wished".
Are there any good reasons why one would not have transaction management in their code?
The question came up when talking with a DBA who gets very nervous when I bring up Spring/Hibernate. I mention that Spring can handle transactions, used with Hibernate mapping tables to objects etc., and the issue comes up that the database (Oracle 10g) already handles transaction management, so we should just use that. He even offered up the idea that we create a bunch of DB procedures to do inserts/updates so the database can handle things more efficiently, returning a 0/1 on whether the insert/update worked.
Are there any good reasons to not have your application deal with any transactions? Is my DBA clueless? I'm thinking he is, but I'm not a great speaker when I'm unsure of the answer... which is why I'm out looking for the answer.
I think there is some misunderstanding here.
The point is that the database doesn't manage transactions in the same sense as Spring/Hibernate.
Database "manages transactions" by providing transactional behaviour, and your application "manages transactions" by using that behaviour and defining transaction boundaries (in particular, with the help of Spring or Hibernate).
Since the boundaries of transactions are defined by business logic, implementing an application without transaction management would require you to move all your business logic to the database side. Even if you implement simple insert/update operations as stored procedures, that won't be enough to free the application from transaction management as long as the application needs to specify that several inserts/updates must be executed inside the same transaction.
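For example, a sketch with Spring annotations (the service and repository are invented): the application marks the transaction boundary, while the database still provides the underlying transactional behaviour.

    import java.math.BigDecimal;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Transactional;

    // Hypothetical repository abstraction; in a real project this would be a Hibernate-backed DAO.
    interface AccountRepository {
        void debit(long accountId, BigDecimal amount);
        void credit(long accountId, BigDecimal amount);
    }

    @Service
    public class TransferService {
        private final AccountRepository accounts;

        @Autowired
        public TransferService(AccountRepository accounts) {
            this.accounts = accounts;
        }

        // The application defines the boundary: both calls commit or roll back together.
        @Transactional
        public void transfer(long fromId, long toId, BigDecimal amount) {
            accounts.debit(fromId, amount);
            accounts.credit(toId, amount);
        }
    }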
I am not entirely sure whether you mean that there will be a bunch of CRUD stored procedures (that do single inserts or updates), or stored procedures encompassing business logic (transaction scripts). If you mean the CRUD stored procedures, that is an entirely bad idea. (Actually, even if you start with the CRUD approach you will end up with transaction scripts as business logic accretes, so it amounts to the same thing.) If you mean transaction scripts, that's an approach some places take. It is painful, there is no reuse, and you end up with a bunch of very complex stored procedures that are terribly hard to test. But DBAs like it because they know what's going on.
There is also an argument (applying to transaction scripts) that it's faster because there are fewer round trips: you have one call to the stored procedure that goes and does everything and returns a result, as opposed to your usual Spring/Hibernate application where you have multiple queries or updates and each statement goes over the network to the database (although Hibernate caches and reorders to try to minimize this). Minimizing network round trips is probably the most valid reason for this approach; you have to weigh whether it is worth sacrificing flexibility for the reduced network traffic, or whether it is a premature optimization.
Another argument made in favor of transaction scripts is that less competence is required to implement the system correctly. In particular Hibernate expertise is not required. You can hire a horde of code monkeys and have them bang out the code. All the hard stuff is removed from them and placed under the DBA's control.
So, to recap, here are the arguments for transaction scripts:
Less network traffic
Cheap developers
Total DBA control (from your point of view, he will be a total bottleneck)
As mentioned above, there's no way to "use transactions" from the database standpoint without making your application aware of it at some level. Although, if you're using Spring, you can make this fairly painless by using <tx:annotation-driven> and applying the @Transactional annotation to the relevant methods in the service implementation classes.
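For reference, a minimal sketch of that XML wiring, assuming the tx namespace is declared and a Hibernate 3 SessionFactory bean named sessionFactory already exists:

    <bean id="transactionManager"
          class="org.springframework.orm.hibernate3.HibernateTransactionManager">
        <property name="sessionFactory" ref="sessionFactory"/>
    </bean>

    <!-- Enables scanning for @Transactional and binds it to the transaction manager above -->
    <tx:annotation-driven transaction-manager="transactionManager"/>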
That said, there are times when you should bypass transactions and write directly to the database. Specifically, any time speed is more important than guaranteed data integrity.
I've recently upgraded my project to Visual Studio 2010 from Visual Studio 2008.
In Visual Studio 2008, this Code Analysis rule doesn't exist.
Now I'm not sure if I should use this rule or not.
I'm building an open source library, so it seems important to keep people safe from making mistakes. However, if all I'm going to do is throw ArgumentNullException when the parameter is null, it seems like writing useless code, since an exception will be thrown anyway even if I don't write that check.
EDIT: Also, there is a performance issue that needs to be addressed. Checking for null in every public method can cause performance issues.
Should I remove that rule or fix the violations?
That depends. The convention when using ArgumentNullException is to include the name of the null argument in the description. So the caller will know exactly what went wrong.
The source of a NullReferenceException (which is what will happen if you don't validate) may be easy to spot, but if the method is complex it may be more difficult. You may end up at a line of code where multiple references could be null.
Personally I prefer to have public methods throw ArgumentNullException if they cannot handle null input for the given argument, because it allows for a consistent behavior no matter how the method is implemented.
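For example, a typical guard clause (the method and parameter type are invented):

    public void Enqueue(Message message)
    {
        if (message == null)
            throw new ArgumentNullException("message");   // names the offending argument for the caller

        // ... the rest of the method can dereference message safely ...
    }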
In reply to edit: In my opinion it is more important to provide a consistent set of interfaces with few surprises than optimizing every line of code. My guess is that in most cases the performance overhead of the null check will not be significant. If it does turn out to be a problem (as in profiling says so) I would consider changing it. Otherwise I wouldn't consider it a problem at this point.
We keep this rule on because we decided that it was always best to check parameters on their first entry into your code. That way, when the person using your code gets the exception, the context in it will be better. Someone who does not have your source code, for example, will see that the exception was thrown in your code and may file a bug report against you for it, wasting your time and theirs.
IMHO No. Checking for nulls is almost never going to be the performance bottleneck in your application. (And in the one in a million case where it is significant, you'll find it with your profiler and remove that one case).
The other question that should form in your mind is "is throw new ArgumentNullException() really the best way to handle the error?" Often you can handle things better than that (even if only to provide a better error report to the user and/or yourself for debugging purposes). In many cases code can handle nulls gracefully, making it unnecessary for this to be an error at all.
edit
To answer your edit: Null checks really don't take long. The overhead for simply calling a method will be tens if not hundreds of times more than a null check. The only place where a null check will make any significant difference is in a large, tight loop where you are doing very little else. This situation doesn't happen very often - usually you will check for a null and then do something relatively expensive with that reference.
There is no situation where a crash or failure is a good thing. It is always better to "slow your application down" with null checks than to crash and lose your customer's data.
So don't prematurely optimise your code. Write it well, to be maintainable and robust, then profile it to see where the bottlenecks are. I've been programming for 28 years, being very liberal with null checks, and have never found that a null check was the cause of a performance problem. Usually it's things like doing lots of unnecessary work in a loop, using an O(n^3) algorithm where an O(n^2) approach is possible, failing to cache expensive-to-compute values, etc.
I was having a discussion with one of my colleagues about how defensive your code should be. I am all for defensive programming, but you have to know where to stop. We are working on a project that will be maintained by others, but this doesn't mean we have to check for ALL the crazy things a developer could do. Of course we could do that, but it would add a very big overhead to our code.
How do you know where to draw the line?
Anything a user enters directly or indirectly, you should always sanity-check. Beyond that, a few asserts here and there won't hurt, but you can't really do much about crazy programmers editing and breaking your code, anyway!-)
I tend to change the amount of defense I put in my code based on the language. Today I'm primarily working in C++ so my thoughts are drifting in that direction.
When working in C++ there cannot be enough defensive programming. I treat my code as if I'm guarding nuclear secrets and every other programmer is out to get them. Asserts, throws, compile-time error template hacks, argument validation, eliminating pointers, in-depth code reviews and general paranoia are all fair game. C++ is an evil, wonderful language that I both love and severely mistrust.
I'm not a fan of the term "defensive programming". To me it suggests code like this:
void MakePayment( Account * a, const Payment * p ) {
    if ( a == 0 || p == 0 ) {
        return;
    }
    // payment logic here
}
This is wrong, wrong, wrong, but I must have seen it hundreds of times. The function should never have been called with null pointers in the first place, and it is utterly wrong to quietly accept them.
The correct approach here is debatable, but a minimal solution is to fail noisily, either by using an assert or by throwing an exception.
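For instance, a minimal "fail noisily" variant of the example above (same hypothetical Account and Payment types), using both an assert and an exception:

    #include <cassert>
    #include <stdexcept>

    struct Account;   // hypothetical types from the example above
    struct Payment;

    void MakePayment( Account * a, const Payment * p ) {
        assert( a != 0 && p != 0 );                       // trips immediately in debug builds
        if ( a == 0 || p == 0 ) {
            throw std::invalid_argument( "MakePayment called with a null pointer" );
        }
        // payment logic here
    }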
Edit: I disagree with some other answers and comments here - I do not think that all functions should check their parameters (for many functions this is simply impossible). Instead, I believe that all functions should document the values that are acceptable and state that other values will result in undefined behaviour. This is the approach taken by the most successful and widely used libraries ever written - the C and C++ standard libraries.
And now let the downvotes begin...
I don't know that there's really any way to answer this. It's just something that you learn from experience. You just need to ask yourself how common a potential problem is likely to be and make a judgement call. Also consider that you don't necessarily have to always code defensively. Sometimes it's acceptable just to note any potential problems in your code's documentation.
Ultimately though, I think this is just something that a person has to follow their intuition on. There's no right or wrong way to do it.
If you're working on the public APIs of a component then it's worth doing a good amount of parameter validation. This led me into a habit of doing validation everywhere. That's a mistake. All that validation code never gets tested and potentially makes the system more complicated than it needs to be.
Now I prefer to validate by unit testing. Validation definitely happens for data coming from external sources, but not for calls from non-external developers.
I always Debug.Assert my assumptions.
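For instance (the method and type are hypothetical):

    using System.Diagnostics;

    void Ship(Order order)
    {
        Debug.Assert(order != null, "order should never be null here");   // removed in builds without the DEBUG symbol
        // ... shipping logic ...
    }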
My personal ideology: the defensiveness of a program should be proportional to the maximum naivety/ignorance of the potential user base.
Being defensive against developers consuming your API code is not that different from being defensive against regular users.
Check the parameters to make sure they are within appropriate bounds and of expected types
Verify that the number of API calls which could be made is within your Terms of Service. Generally called throttling, this usually only applies to web services and password-checking functions.
Beyond that there's not much else to do except make sure your app recovers well in the event of a problem and that you always give ample information to the developer so that they understand what's going on.
Defensive programming is only one way of honouring a contract in a design-by-contract manner of coding.
The other two are
total programming and
nominal programming.
Of course you shouldn't defend yourself against every crazy thing a developer could do, but you should state, using preconditions, in which context your code will do what is expected.
//precondition : par is so and so and so
function doSth(par)
{
    debug.assert(par is so and so and so)
    //do stuff with par
    return result
}
I think you have to bring in the question of whether you're creating tests as well. You should be defensive in your coding, but as pointed out by JaredPar, I also believe it depends on the language you're using. If it's unmanaged code, then you should be extremely defensive. If it's managed, I believe you have a little bit of wiggle room.
If you have tests, and some other developer tries to decimate your code, the tests will fail. But then again, it depends on the test coverage of your code (if there is any).
I try to write code that is not just defensive but downright hostile. If something goes wrong and I can fix it, I will. If not, throw or pass on the exception and make it someone else's problem. Anything that interacts with a physical device (file system, database connection, network connection) should be considered unreliable and prone to failure. Anticipating these failures and trapping them is critical.
Once you have this mindset, the key is to be consistent in your approach. Do you hand back status codes to communicate problems up the call chain, or do you prefer exceptions? Mixed models will kill you, or at least drive you to drink. Heavily. If you are using someone else's API, isolate it behind wrapping interfaces that trap and report problems in the terms you use.
If the discussion here is how to code defensively against future (possibly malevolent or incompetent) maintainers, there is a limit to what you can do. Enforcing contracts through test coverage and liberal use of asserts to state your assumptions is probably the best you can do, and it should be done in a way that ideally doesn't clutter the code or make the job harder for the future non-evil maintainers of the code. Asserts are easy to read and understand and make it clear what the assumptions of a given piece of code are, so they're usually a great idea.
Coding defensively against user actions is another issue entirely, and the approach that I use is to think that the user is out to get me. Every input is examined as carefully as I can manage, and I make every effort to have my code fail safe - try not to persist any state that isn't rigorously vetted, correct where you can, exit gracefully if you cannot, etc. If you just think about all the bozo things that could be perpetrated on your code by outside agents, it gets you in the right mindset.
Coding defensively against other code, such as your platform or other modules, is exactly the same as users: they're out to get you. The OS is always going to swap out your thread at an inopportune time, networks are always going to go away at the wrong time, and in general, evil abounds around every corner. You don't need to code against every potential problem out there - the cost in maintenance might not be worth the increase in safety - but it sure doesn't hurt to think about it. And it usually doesn't hurt to explicitly comment in the code if there's a scenario you thought of but regard as unimportant for some reason.
Systems should have well-designed boundaries where defensive checking happens. There should be a decision about where user input is validated (at what boundary) and where other potential defensive issues require checking (for example, third-party integration points, publicly available APIs, rules engine interaction, or different units coded by different teams of programmers). More defensive checking than that violates DRY in many cases, and just adds maintenance cost for very little benefit.
That being said, there are certain points where you cannot be too paranoid. Potential for buffer overflows, data corruption and similar issues should be very rigorously defended against.
I recently had a scenario in which user input data was propagated through a remote facade interface, then a local facade interface, then some other class, to finally reach the method where it was actually used. I asked myself: when should the value be validated? I added validation code only to the final class, where the value was actually used. Adding further validation snippets in the classes lying on the propagation path would have been too much defensive programming for me. One exception could be the remote facade, but I skipped that too.
Good question. I've flip-flopped between doing sanity checks and not doing them. It's a 50/50 situation; I'd probably take a middle ground where I would only "bullet-proof" routines that:
(a) are called from more than one place in the project
(b) have logic that is LIKELY to change
(c) cannot fall back on default values
(d) cannot 'fail' gracefully
Darknight