Is there a use case for recursive locks?

Is there a use case for recursive locks? Is there a scenario that absolutely requires recursive locking?
They seem complicated and dangerous to use. I can see that they may help avoid self-deadlocks (provided the lock count doesn't overflow), but don't we want to catch such problems?
Maybe I'm missing something here. Any pointers are appreciated.
Thanks in advance.

I like the title of this blog entry:
Recursive locks will kill you
I also like this quote:
http://www.thinkingparallel.com/2006/09/27/recursive-locks-a-blessing-or-a-curse/
Don’t use recursive mutexes. It’s akin to sex with used condoms.
Finally, here's an extremely interesting article about how recursive locks got into POSIX threads in the first place:
http://groups.google.com/group/comp.programming.threads/msg/d835f2f6ef8aed99?hl=en&pli=1
Recursive mutexes are a hack. There's nothing wrong with using them,
but they're a crutch. Got a broken leg or library? Fine, use the
crutch. But at least be aware that you're using a crutch, and why; and
once in a while check out the leg (or library) to be sure you still
need the crutch. And if it's not healing up, go see a doctor, because
that's just not OK. When you have no choice, there's no shame in using
a crutch... but you can't run very well on a crutch, and you'll also
be slowing down anyone who depends on you.
Recursive mutexes can be a great tool for prototyping thread support
in an existing library, exactly because it lets you defer the hard
part: the call path and data dependency analysis of the library. But
for that same reason, always remember that you're not DONE until
they're all gone, so you can produce a library you're proud of, that
won't unnecessarily constrain the concurrency of the entire
application.
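To make the discussion concrete, here's a minimal Java sketch of what a recursive (reentrant) lock permits. The Counter class is hypothetical, but java.util.concurrent.locks.ReentrantLock really is reentrant by design:

import java.util.concurrent.locks.ReentrantLock;

class Counter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    void increment() {
        lock.lock(); // re-acquiring on the owning thread just bumps a hold count
        try {
            value++;
        } finally {
            lock.unlock();
        }
    }

    void incrementTwice() {
        lock.lock();
        try {
            increment(); // works only because the lock is reentrant;
            increment(); // with a non-reentrant mutex this would self-deadlock
        } finally {
            lock.unlock();
        }
    }
}

In the spirit of the quote above, note that the reentrancy is papering over a design question: refactoring increment() into a private helper that assumes the lock is already held would remove the need for recursion entirely.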

Related

How do I do a For loop without freezing the GUI?

I would like to know how I could run the following loop in a way that doesn't freeze the GUI, as the loop can take minutes to complete. Thank you.
For i = 0 To imageCount
'code
Next
The short answer is you run the loop on another thread. The long answer is a whole book and a couple of semesters at university, because it entails resource access conflicts and various ways of addressing them such as locking and queueing.
Since you appear to be using VB.NET, I suggest you use the latest version of the .NET Framework and take advantage of Async and Await, which you can learn about from MSDN.
These keywords implement a very sophisticated canned solution that will allow you to achieve your goals in blissful ignorance of the nightmare behind them :)
Why experienced parallel coders would bother with async/await
Standout features of async/await are
automatic temporary marshalling back to the UI thread as required
scope of exception handlers (try/catch/finally) can span both setup and callback code
you write what is conceptually linear code with blocking calls on the UI thread, but because you declare the blocking calls using "await", the compiler rewrites your code as a state machine that makes the preceding points true
Linear code with blocking calls is easy to write and easy to read. So it's much better from a maintenance perspective. But it provides an atrocious UX. Async/await means you can have it both ways.
All this is built on the TPL; in a quite real sense it's nothing more than a compiler-supported design pattern for the TPL, which is why methods tagged as async generally return a Task or Task<T>. There's so much to love about this, and no technical downside that I've seen.
My only concern is that it's all too good, so a whole generation will have no idea how tall the giants are on whose shoulders they perch, just as most modern programmers have only a dim awareness of the mechanics of stack frames in call stacks (the magic behind local variables).
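For comparison outside the .NET world, the same shape (run the work on a background thread, marshal results back to the UI thread) looks roughly like this in Java/Swing. This is a hedged sketch only: SwingWorker plays the role that BackgroundWorker or async/await play in .NET, and imageCount, processImage and the progress bar are stand-ins for the asker's loop and UI.

import java.util.List;
import javax.swing.JProgressBar;
import javax.swing.SwingWorker;

class ImageJob {
    void start(int imageCount, JProgressBar progressBar) {
        new SwingWorker<Void, Integer>() {
            @Override
            protected Void doInBackground() {
                for (int i = 0; i <= imageCount; i++) {
                    processImage(i); // the slow per-image work, off the UI thread
                    publish(i);      // hand progress over to the UI thread
                }
                return null;
            }
            @Override
            protected void process(List<Integer> chunks) {
                // Runs on the UI thread, so touching Swing components is safe here.
                progressBar.setValue(chunks.get(chunks.size() - 1));
            }
        }.execute();
    }
    private void processImage(int i) {
        // placeholder for the real work on image i
    }
}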
You can run the loop on a separate thread. Read about using BackgroundWorker here http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx

How to use DoEvents() without being "evil"?

A simple search for DoEvents brings up lots of results that lead, basically, to:
DoEvents is evil. Don't use it. Use threading instead.
The reasons generally cited are:
Re-entrancy issues
Poor performance
Usability issues (e.g. drag/drop over a disabled window)
But some notable Win32 functions such as TrackPopupMenu and DoDragDrop perform their own message processing to keep the UI responsive, just like DoEvents does.
And yet, none of these seems to suffer from these issues (performance, re-entrancy, etc.).
How do they do it? How do they avoid the problems cited with DoEvents? (Or do they?)
DoEvents() is dangerous. But I bet you do lots of dangerous things every day. Just yesterday I set off a few explosive devices (future readers: note the original post date relative to a certain American holiday). With care, we can sometimes account for the dangers. Of course, that means knowing and understanding what the dangers are:
Re-entry issues. There are actually two dangers here:
Part of the problem here has to do with the call stack. If you call DoEvents() in a loop that itself handles messages that use DoEvents(), and so on, you're building a pretty deep call stack. It's easy to over-use DoEvents() and accidentally fill up your call stack, resulting in a StackOverflowException. If you're only using DoEvents() in one or two places, you're probably okay. If it's the first tool you reach for whenever you have a long-running process, you can easily find yourself in trouble here. Even one use in the wrong place can make it possible for a user to force a StackOverflowException (sometimes just by holding down the enter key), and that can be a security issue.
It is sometimes possible to find your same method on the call stack twice. If you didn't build the method with this in mind (hint: you probably didn't) then bad things can happen. If everything passed in to the method is a value type, and there is no dependence on things outside of the method, you might be fine. But otherwise, you need to think carefully about what happens if your entire method were to run again before control is returned to you at the point where DoEvents() is called. What parameters or resources outside of your method might be modified that you did not expect? Does your method change any objects, where both instances on the stack might be acting on the same object? (A minimal re-entrancy guard sketch follows this list of dangers.)
Performance Issues. DoEvents() can give the illusion of multi-threading, but it's not real multithreading. This has at least three real dangers:
When you call DoEvents(), you are giving control of your existing thread back to the message pump. The message pump might in turn give control to something else, and that something else might take a while. The result is that your original operation could take much longer to finish than if it were on a thread by itself that never yields control, and definitely longer than it needs to.
Duplication of work. Since it's possible to find yourself running the same method twice, and we already know this method is expensive/long-running (or you wouldn't need DoEvents() in the first place), even if you accounted for all the external dependencies mentioned above so there are no adverse side effects, you may still end up duplicating a lot of work.
The third issue is the extreme version of the first: a potential to deadlock. If something else in your program depends on your process finishing, and will block until it does, and that thing is called by the message pump from DoEvents(), your app will get stuck and become unresponsive. This may sound far-fetched, but in practice it's surprisingly easy to do accidentally, and the resulting hangs are very hard to find and debug later. This is at the root of some of the hung-app situations you may have experienced on your own computer.
Usability Issues. These are side-effects that result from not properly accounting for the other dangers. There's nothing new here, as long as you looked in other places appropriately.
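Returning to the re-entrancy danger above: a common mitigation is a simple guard flag, so a handler refuses to run again while an earlier activation is still on the stack. A minimal sketch (Java syntax here for neutrality; the idea applies directly to any DoEvents-style message pump, and all names are hypothetical):

class SaveHandler {
    private boolean running; // touched only from the UI thread, so no locking needed

    void onSaveClicked() {
        if (running) {
            return; // a re-entrant activation arrived while we were pumping messages
        }
        running = true;
        try {
            doLongRunningSave(); // may pump messages internally, like DoEvents()
        } finally {
            running = false;
        }
    }

    private void doLongRunningSave() {
        // placeholder for the long-running work
    }
}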
If you can be sure you accounted for all these things, then go ahead. But really, if DoEvents() is the first place you look to solve UI responsiveness/updating issues, you're probably not accounting for all of those issues correctly. If it's not the first place you look, there are enough other options that I would question how you made it to considering DoEvents() at all. Today, DoEvents() exists mainly for compatibility with older code that came into being before other credible options were available, and as a crutch for newer programmers who haven't yet gained enough experience to have been exposed to the other options.
The reality is that most of the time, at least in the .Net world, a BackgroundWorker component is nearly as easy, at least once you've done it once or twice, and it will do the job in a safe way. More recently, the async/await pattern or the use of a Task can be much more effective and safe, without needing to delve into full-blown multi-threaded code on your own.
Back in 16-bit Windows days, when every task shared a single thread, the only way to keep a program responsive within a tight loop was DoEvents. It is this non-modal usage that is discouraged in favor of threads. Here's a typical example:
' Process image
For y = 1 To height
    For x = 1 To width
        ProcessPixel x, y
    Next
    DoEvents ' <-- DON'T DO THIS -- just put the whole loop in another thread
Next
For modal things (like tracking a popup), it is likely to still be OK.
I may be wrong, but it seems to me that DoDragDrop and TrackPopupMenu are rather special cases, in that they take over the UI, so don't have the reentrancy problem (which I think is the main reason people describe DoEvents as "Evil").
Personally I don't think it's helpful to dismiss a feature as "Evil"; rather, explain the pitfalls so that people can decide for themselves. In the case of DoEvents there are rare cases where it's still reasonable to use it, for example while a modal progress dialog is displayed, where the user can't interact with the rest of the UI so there is no re-entrancy issue.
Of course, if by "Evil" you mean "something you shouldn't use without fully understanding the pitfalls", then I agree that DoEvents is evil.

Should you wrap 3rd party libraries that you adopt into your project? [closed]

A discussion I had with a colleague today.
He claims that whenever you use a 3rd-party library, you should always write a wrapper for it, so you can always change things later and accommodate things for your specific use.
I disagree with the word always. The discussion arose regarding log4j, and I claimed that log4j has a well-tested and time-proven API and implementation, that everything thinkable can be configured a posteriori, and that there is nothing you should wrap. Even if you wanted to wrap it, there are proven wrappers like commons-logging and log5j.
Another example that we touched on in our discussion is Hibernate. I claimed that it has far too big an API to be wrapped. Furthermore, it has a layered API which lets you tweak its insides if you need to. My friend claimed that he still believes it should be wrapped, but that he didn't do it because of the size of the API (this co-worker is far more veteran than me on our current project).
I claimed that wrapping should be done only in specific cases:
you are not sure how the library will fit your needs
you will only use a small portion of the library (in which case you may only expose a part of its API).
you are not sure of the quality of the library's API or implementation.
I also maintained that sometimes you can wrap your code instead of the library. For example, putting your database-related code in a DAO layer, instead of preemptively wrapping all of Hibernate.
Well, in the end this is not really a question, but your insights, experiences and opinions are highly appreciated.
It's a perfect example of YAGNI:
it is more work
it inflates your project
it may complicate your design
it has no immediate benefit
the scenario you write it for may never manifest
when it does, your wrapper most likely needs to be rewritten completely, because it is tied too closely to the concrete library you were using and the new one's API simply doesn't match yours.
Well, the obvious benefit is for switching technologies. If you have a library that becomes deprecated and you want to switch, you may end up rewriting a lot of code to accommodate the change, whereas if it were wrapped you'd have an easier time writing a new wrapper for the new lib than changing all your code.
On the other hand, it would mean that you have to write a wrapper for every trivial library that you include, which is probably an unacceptable amount of overhead.
My industry is all about speed, so the only time I'd be able to justify writing a wrapper is if it was around some critical library that was likely to change dramatically on a regular basis. Or, more commonly, if I need to take a new library and shoehorn it into old code, which is an unfortunate reality.
It's definitely not an "always" situation. It's something that may be desirable. But the time isn't always going to be there, and, in the end, if writing a wrapper takes hours and the long-term library changes are going to be few and trivial... why bother?
No. Java architects/wannabes are too busy designing against imaginary changes.
With a modern IDE, it's a piece of cake when you do need a change. Until then, keep it simple.
I agree with everything that's been said pretty much.
The only time wrapping third-party code is useful (without violating YAGNI) is for unit testing.
Mocking statics and so forth requires you to wrap the code, and this is a valid reason to write wrappers for third-party code.
In the case of logging code, it's not needed, though.
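A hedged sketch of that testing-driven wrapper: suppose the code under test reads the current time through a static call. Hiding the static behind a one-method interface lets tests substitute a fake (TimeSource, SystemTimeSource and SessionTimer are hypothetical names; System.currentTimeMillis is the real JDK call):

// Thin wrapper introduced purely for testability:
interface TimeSource {
    long now();
}

// Production implementation delegates to the static call:
class SystemTimeSource implements TimeSource {
    public long now() {
        return System.currentTimeMillis();
    }
}

// Code under test depends on the interface, not on the static:
class SessionTimer {
    private final TimeSource time;

    SessionTimer(TimeSource time) {
        this.time = time;
    }

    boolean expired(long startedAtMillis, long ttlMillis) {
        return time.now() - startedAtMillis > ttlMillis;
    }
}

A test can then inject a fixed clock, e.g. new SessionTimer(() -> 1_000L).expired(0L, 500L) is true, with no real time involved.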
The problem here is partially the word 'wrapper', partially a false dichotomy, and partially a false distinction between the JDK and everything else.
The word 'wrapper'
Wrapping all of Hibernate, as you say, is a completely impractical enterprise.
Restricting the Hibernate dependencies to an identified, controlled, set of source files, on the other hand, may well be practical and achieve the same results.
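One concrete form of that restriction is a small DAO boundary. A hedged sketch against the Hibernate 5-style Session API (Session and SessionFactory are real Hibernate types; User, UserDao and HibernateUserDao are hypothetical names):

import org.hibernate.Session;
import org.hibernate.SessionFactory;

// The only interface the rest of the application sees:
interface UserDao {
    User findById(long id);
    void save(User user);
}

// All Hibernate-specific code lives in this one class:
class HibernateUserDao implements UserDao {
    private final SessionFactory sessionFactory;

    HibernateUserDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    public User findById(long id) {
        Session session = sessionFactory.openSession();
        try {
            return session.get(User.class, id);
        } finally {
            session.close();
        }
    }

    public void save(User user) {
        Session session = sessionFactory.openSession();
        try {
            session.beginTransaction();
            session.persist(user);
            session.getTransaction().commit();
        } finally {
            session.close();
        }
    }
}

Swapping persistence technology later means rewriting only HibernateUserDao; everything else compiles against UserDao.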
The false dichotomy
The false dichotomy is the failure to recognize a third option: standards. If you use, say, JPA annotations, you can swap Hibernate for other things. If you are writing a web service and use JAX-WS annotations and JAX-B, you can swap between the JDK, CXF, Glassfish, or whatever.
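Those standard annotations look like this, a minimal sketch using the javax.persistence API (the User entity itself is hypothetical). Because only standard JPA annotations appear, the class works unchanged under Hibernate, EclipseLink, or any other JPA provider:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class User {
    @Id
    @GeneratedValue
    private Long id;

    private String name;

    protected User() {
        // JPA requires a no-argument constructor
    }

    public User(String name) {
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }
}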
The false distinction
Sure, the JDK changes slowly and is unlikely to die. But major open source packages also change slowly and are unlikely to die. Untold thousands of developers and projects use Hibernate. There's really no more risk of Hibernate disappearing or making radical incompatible API changes than there is of Java itself.
If the library you are planning to wrap is unique in its "access principles, metaphors and idioms" from other offerings in the same domain, then your wrapper is pretty much going to be similar to that library and won't do you any good if you one day switch to a different library since you will need a new wrapper.
If the library is accessed in a similar way to other libraries and the same wrapper can apply to these libraries, then they are probably written based on some existing standard and there is some common layer that already exists to access both of them.
I would only go with wrappers if I knew for sure that I would have to support multiple and substantially different libraries in production.
The main factor in deciding whether to wrap a library is the impact a library change will have on the code. When a library is only called from one class, the impact of changing libraries will be minimal. If, on the other hand, a library is called in all classes, a wrapper is much more likely to be worthwhile.
Any uncertainty around the choice of 3rd party library should be flushed out at the beginning of the project using prototypes to test the scalability/suitability/whatever of the 3rd party library.
If you decide to go ahead and provide full de-coupling/abstraction support it should be costed up and ultimately approved by the project sponsor - ultimately it's a commercial decision as someone has to pay for it and the work required to do it (unless it's absolutely trivial, in which case the api is probably low risk anyway).
Generally an experienced architect will choose a technology that they can be reasonably confident with and have experience of, and that they are confident will last the lifetime of the app; or else they will eliminate any risk in the decision early on in the project, thus removing any need to do this most of the time.
I'd tend to agree with most of your points. Using absolutes often gets you into trouble and saying you should "always" do something limits your flexibility. I'd add some more points to your list.
When you use wrapping code around a very common API, like Hibernate or log4j you make it more difficult to bring on new developers. New developers now have to learn a whole new API, where if you hadn't wrapped the code they would have been very familiar right away.
On the flip side of that, you also limit your developers' view into the API. Using an advanced feature of the API takes more time because you have to make sure that your wrapper is implemented in a way that can handle it.
Many of the wrapping layers I've seen also are very specific to the underlying implementation. So, if you write a log wrapper around log4j, you are thinking in log4j terms. If some new cool framework comes out, it may change the whole paradigm, so your wrapping code doesn't migrate as well as you had thought.
I'm definitely not saying wrapping code is always bad, but as you stated, there are a lot of factors you have to consider.
The purpose of wrapping even a well-tested and time-proven 3rd-party library is that you might decide to switch libraries at some point in the future. Wrapping it makes it easier to switch without changing any code in your core application. Only the wrapper needs to change.
If you're absolutely sure that you'll never (another absolute) use a different logging framework in your project, go ahead and skip the wrapper. Even having said that, I'd probably hold off on writing the wrapper until I knew I needed it, like the first time I need to switch.
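If you do write such a wrapper, the point is to keep it tiny. A hedged sketch against the log4j 1.x API (org.apache.log4j.Logger is the real log4j class; AppLog is a hypothetical name):

// The one logging type application code is allowed to import:
public final class AppLog {
    private final org.apache.log4j.Logger delegate;

    private AppLog(Class<?> owner) {
        this.delegate = org.apache.log4j.Logger.getLogger(owner);
    }

    public static AppLog forClass(Class<?> owner) {
        return new AppLog(owner);
    }

    public void info(String message) {
        delegate.info(message);
    }

    public void error(String message, Throwable cause) {
        delegate.error(message, cause);
    }
}

Switching frameworks later means re-implementing only this class, which is exactly the switch-cost trade-off described above.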
This is kind of a funny question.
I've worked in systems where we've found showstopper bugs in libraries we were using, and which upstream was either no longer maintaining, or not interested in fixing. In a language like Java, you usually can't fix internal bugs from a wrapper. (Fortunately, if they're open-source, you can at least fix them yourself.) So it's no help here.
But I'm often working in a language where you can easily modify libraries at any time, without seeing or even having their source code -- I commonly add new methods to existing classes, for example. So in this case, there's no point in wrapping: just make the change you want.
Also, does your colleague draw the line at things called "libraries"? What about Java itself? Does he wrap built-in classes? Does he wrap the filesystem? The thread scheduler? The kernel? (That is, with his own wrappers -- in a sense, everything is a wrapper around the CPU, but it sounds like he's talking about wrappers in your source repo that are completely under your control.) I've had built-in functionality change or disappear when new versions of it appear. Java is not immune from this.
So the idea to always write a wrapper comes down to a bet. Assuming he's only wrapping third-party libraries, he seems to be implicitly betting that:
"first-party" functionality (like Java itself, the kernel, etc.) will never change
when "third-party" functionality changes, it will always be done in a way that can be fixed in a wrapper
Is that true in your case? I don't know. Of the medium-large Java projects I've done, it's rarely true for me. I wouldn't spend effort wrapping all third-party libraries, because it seems like a poor bet, but your situation is certainly different from mine.
There is one situation where you can wrap with good reason: namely, if you need to test stuff and the default third-party object is heavyweight. Then having an interface can really make a difference.
Note, this is not to replace the library, but to make it manageable where it doesn't matter much.
Wrapping a whole library is boilerplate, ineffective, and wrong in most cases. It can be done in a much cleverer way. I'd say that wrapping a library is appropriate mostly in the case of UI component libraries, and again, you have to be adding some additional core functionality of yours to all the components for this to be needed.
If too many modifications and additions are needed, this is most likely not the library you are looking for.
If there is a moderate amount of additions and modifications, there are always design patterns that come in handy in those cases. The Decorator pattern (which allows new or additional behaviour to be added to an existing object dynamically), for example, is suitable for most cases.
IDE search/replace and refactoring capabilities offer an easy way to change your code in all required places if some important change is needed and a wrapping object appears. (Of course, unit tests would be helpful here ;) )
In my experience the question becomes fairly moot if you're using abstractions sufficiently. Coupling to a library is just like coupling to any other interface. Thus you want to reduce accidental coupling and the scope of rewrite necessary if you need to swap out the implementation. Don't bind your application logic to some construct, but don't just form a bunch of stupid (literally) wrappers around something and expect to gain any benefit.
A wrapper doesn't usually gain you anything unless it's answering a specific purpose (such as polymorphizing a non-polymorphic construct). They often show up in refactoring, but I wouldn't recommend forming an architecture on them. There's a few exceptions of course, but there is with any principle.
This doesn't speak toward adapters. An adapter can be a pretty important component for when you want to actually alter the interface of a library and its use to be in line with architecture, code, or domain concepts in your project.
You should do it always, often, sometimes, rarely, or never. Not even your colleague does it always, but the instructive cases are always and never. Suppose that it is sometimes necessary. If you never wrapped a library, the worst consequence is that one day you discover that it was necessary for a library that you have used all over the place. It would take you some time to wrap that library and to perform shotgun surgery on the clients. The question is whether that eventuality would take more time than habitually providing wrappers that are rarely necessary, but never having to perform the shotgun surgery.
My instinct is to appeal to the YAGNI (you ain't gonna need it) principle and opt for "rarely".
I would not wrap it as a one-to-one thing, but I would layer the app so that each part is replaceable as much as possible. The ISO OSI model works well for all types of software :-)

How "defensive" should my code be?

I was having a discussion with one of my colleagues about how defensive your code should be. I am all for defensive programming, but you have to know where to stop. We are working on a project that will be maintained by others, but this doesn't mean we have to check for ALL the crazy things a developer could do. Of course you could do that, but it will add a very big overhead to your code.
How do you know where to draw the line?
Anything a user enters directly or indirectly, you should always sanity-check. Beyond that, a few asserts here and there won't hurt, but you can't really do much about crazy programmers editing and breaking your code, anyway!-)
I tend to change the amount of defense I put in my code based on the language. Today I'm primarily working in C++ so my thoughts are drifting in that direction.
When working in C++ there cannot be enough defensive programming. I treat my code as if I'm guarding nuclear secrets and every other programmer is out to get them. Asserts, throws, compile-time error template hacks, argument validation, eliminating pointers, in-depth code reviews and general paranoia are all fair game. C++ is an evil, wonderful language that I both love and severely mistrust.
I'm not a fan of the term "defensive programming". To me it suggests code like this:
void MakePayment( Account * a, const Payment * p ) {
    if ( a == 0 || p == 0 ) {
        return;
    }
    // payment logic here
}
This is wrong, wrong, wrong, but I must have seen it hundreds of times. The function should never have been called with null pointers in the first place, and it is utterly wrong to quietly accept them.
The correct approach here is debatable, but a minimal solution is to fail noisily, either by using an assert or by throwing an exception.
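A minimal sketch of the fail-noisily option, shown in Java for illustration (Objects.requireNonNull is standard-library; Account and Payment are the types from the example above):

import java.util.Objects;

void makePayment(Account account, Payment payment) {
    // A violated precondition surfaces immediately as a NullPointerException
    // with a useful message, instead of being silently swallowed.
    Objects.requireNonNull(account, "account must not be null");
    Objects.requireNonNull(payment, "payment must not be null");
    // payment logic here
}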
Edit: I disagree with some other answers and comments here - I do not think that all functions should check their parameters (for many functions this is simply impossible). Instead, I believe that all functions should document the values that are acceptable and state that other values will result in undefined behaviour. This is the approach taken by the most successful and widely used libraries ever written - the C and C++ standard libraries.
And now let the downvotes begin...
I don't know that there's really any way to answer this. It's just something that you learn from experience. You just need to ask yourself how common a potential problem is likely to be and make a judgement call. Also consider that you don't necessarily have to always code defensively. Sometimes it's acceptable just to note any potential problems in your code's documentation.
Ultimately though, I think this is just something that a person has to follow their intuition on. There's no right or wrong way to do it.
If you're working on the public APIs of a component then it's worth doing a good amount of parameter validation. That led me into the habit of doing validation everywhere. That's a mistake. All that validation code never gets tested and potentially makes the system more complicated than it needs to be.
Now I prefer to validate by unit testing. Validation definitely happens for data coming from external sources, but not for calls from non-external developers.
I always Debug.Assert my assumptions.
My personal ideology: the defensiveness of a program should be proportional to the maximum naivety/ignorance of the potential user base.
Being defensive against developers consuming your API code is not that different from being defensive against regular users.
Check the parameters to make sure they are within appropriate bounds and of expected types
Verify that the number of API calls that could be made is within your Terms of Service. Generally called throttling, this usually only applies to web services and password-checking functions.
Beyond that there's not much else to do except make sure your app recovers well in the event of a problem and that you always give ample information to the developer so that they understand what's going on.
Defensive programming is only one way of honouring a contract in a design-by-contract manner of coding.
The other two are
total programming and
nominal programming.
Of course you shouldn't defend yourself against every crazy thing a developer could do, but you should state, using preconditions, in which context your code will do what is expected.
// Precondition: par is so-and-so (here, as a concrete example: non-negative)
static int doSth(int par) {
    assert par >= 0 : "precondition violated: par must be non-negative";
    // do stuff with par
    int result = par * 2; // placeholder for the real work
    return result;
}
I think you have to bring in the question of whether you're creating tests as well. You should be defensive in your coding, but as pointed out by JaredPar, I also believe it depends on the language you're using. If it's unmanaged code, then you should be extremely defensive. If it's managed, I believe you have a little bit of wiggle room.
If you have tests, and some other developer tries to decimate your code, the tests will fail. But then again, it depends on the test coverage of your code (if there is any).
I try to write code that is more than defensive, but downright hostile. If something goes wrong and I can fix it, I will. If not, throw or pass on the exception and make it someone else's problem. Anything that interacts with a physical device (file system, database connection, network connection) should be considered unreliable and prone to failure. Anticipating these failures and trapping them is critical.
Once you have this mindset, the key is to be consistent in your approach. Do you expect to hand back status codes to communicate problems up the call chain, or do you prefer exceptions? Mixed models will kill you, or at least drive you to drink. Heavily. If you are using someone else's API, then isolate those things into mechanisms that trap/report in the terms you use, and use those wrapping interfaces.
If the discussion here is how to code defensively against future (possibly malevolent or incompetent) maintainers, there is a limit to what you can do. Enforcing contracts through test coverage and liberal use of asserts to state your assumptions is probably the best you can do, and it should be done in a way that ideally doesn't clutter the code and make the job harder for the future non-evil maintainers of the code. Asserts are easy to read and understand and make it clear what the assumptions of a given piece of code are, so they're usually a great idea.
Coding defensively against user actions is another issue entirely, and the approach that I use is to think that the user is out to get me. Every input is examined as carefully as I can manage, and I make every effort to have my code fail safe - try not to persist any state that isn't rigorously vetted, correct where you can, exit gracefully if you cannot, etc. If you just think about all the bozo things that could be perpetrated on your code by outside agents, it gets you in the right mindset.
Coding defensively against other code, such as your platform or other modules, is exactly the same as users: they're out to get you. The OS is always going to swap out your thread at an inopportune time, networks are always going to go away at the wrong time, and in general, evil abounds around every corner. You don't need to code against every potential problem out there - the cost in maintenance might not be worth the increase in safety - but it sure doesn't hurt to think about it. And it usually doesn't hurt to explicitly comment in the code if there's a scenario you thought of but regard as unimportant for some reason.
Systems should have well-designed boundaries where defensive checking happens. There should be a decision about where user input is validated (at which boundary) and where other potential defensive issues require checking (for example, third-party integration points, publicly available APIs, rules-engine interaction, or different units coded by different teams of programmers). More defensive checking than that violates DRY in many cases, and just adds maintenance cost for very little benefit.
That being said, there are certain points where you cannot be too paranoid. Potential for buffer overflows, data corruption and similar issues should be very rigorously defended against.
I recently had a scenario in which user input data was propagated through a remote facade interface, then a local facade interface, then some other class, to finally get to the method where it was actually used. I asked myself: where should the value be validated? I added validation code only to the final class, where the value was actually used. Adding other validation code snippets to the classes lying on the propagation path would have been too defensive for me. One exception could be the remote facade, but I skipped that too.
Good question. I've flip-flopped between doing sanity checks and not doing them. It's a 50/50 situation; I'd probably take a middle ground where I would only "bullet-proof" routines that:
(a) are called from more than one place in the project
(b) have logic that is LIKELY to change
(c) cannot use default values
(d) cannot fail gracefully
Darknight

What helps you improve your ability to find bugs?

I want to know if there are methods to quickly find bugs in a program.
It seems that the more you master the architecture of your software, the more quickly you can locate the bugs.
How do programmers improve their ability to find bugs?
Logging, and unit tests. The more information you have about what happened, the easier it is to reproduce it. The more modular you can make your code, the easier it is to check that it really is misbehaving where you think it is, and then check that your fix solves the problem.
Divide and conquer. Whenever you are debugging, you should be thinking about cutting down the possible locations of the problem. Every time you run the app, you should be trying to eliminate a possible source and zero in on the actual location. This can be done with logging, with a debugger, assertions, etc.
Here's a prophylactic method after you have found a bug: I find it really helpful to take a minute and think about the bug.
What exactly was the bug, in essence?
Why did it occur?
Could you have found it earlier, more easily?
What else did you learn from the bug?
I find taking a minute to think about these things will make it far less likely that you will produce the same bug in the future.
I will assume you mean logic bugs. The best way I have found to capture logic bugs is to implement some sort of testing scheme. Check out JUnit as the standard. Pretty much, you define a set of accepted outputs for your methods. Every time you compile your system, it checks all of your test cases. If you have introduced new logic that breaks your tests, you will know about it instantly and know exactly what you have to fix.
Test driven design is a pretty big movement in programming right now. You will be hard pressed to find a language that doesn't support some kind of testing. Even JavaScript has a multitude of test suites.
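A minimal sketch of such a test, using the JUnit 4 API (MathUtil is a hypothetical class under test):

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class MathUtilTest {
    // Encodes an accepted output; if new logic regresses it, the build says so instantly.
    @Test
    public void addHandlesNegativeOperands() {
        assertEquals(-1, MathUtil.add(2, -3));
    }
}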
Experience makes you a better debugger. Pay close attention to the bugs that you AND others commonly make. Try to figure out if/how these bugs apply to ALL code that affects you, not the single instance of where the bug was seen.
Raymond Chen is famous for his powers of psychic debugging.
Most of what looks like psychic debugging is really just knowing what people tend to get wrong.
That means that you don't necessarily have to be intimately familiar with the architecture / system. You just need enough knowledge to understand the types of bugs that apply and are easy to make.
I personally take the approach of thinking about where the bug may be in the code before actually opening up the code and taking a look. When you first start with this approach, it may not actually work very well, especially if you are pretty unfamiliar with the code base. However, over time someone will be able to tell you the behavior they are experiencing and you'll have a good idea where the problem is located or you may even know what to fix in the code to remedy the problem before even looking at the code.
I was on a project for several years that was maintained by a vendor. They were not very good debuggers, and most of the time it was up to us to point them to the area of the code that had the problem. What made our problem worse was that we didn't have a nice way to view the source code, so a lot of our "debugging" was done by feel.
Error checking and reporting. The #1 newbie coder debugging mistake is to turn off error reporting, avoid checking whether what's going on makes sense, etc. In general, people feel that if they can't see anything going wrong, then nothing is going wrong. Which, of course, could not be further from the case.
Instead, your code should be chock full of error conditions that make lots of noise, with detailed reporting, someplace you will see it. (This doesn't mean inside a production web page.) Then, instead of having to trace an error all over the place because it got passed through sixteen layers of execution before it finally got someplace that broke, your errors start happening close to the actual issue.
It seems that the more you master the architecture of your software, the more quickly you can locate the bugs.
After understanding the architecture, one's ability to find bugs in the application increases with their ability to identify and write extensive tests.
Know your tools.
Make sure that you know how to use conditional breakpoints and watches in your debugger.
Use static analysis tools as well - they can point out the more obvious issues.
Sleep and rest.
Use programming methods that produce fewer bugs in the first place.
If implementing a single stand-alone functional requirement takes N separate point-edits to the source code, the number of bugs put into the code is roughly proportional to N, so find programming methods that minimize N. Ways to do this: DRY (don't repeat yourself), code generation, and DSLs (domain-specific languages).
Where bugs are likely, have unit tests.
Obviously. IMHO, the best unit tests are Monte Carlo.
Make intermediate results visible.
For example, compilers have intermediate representations, in the form of 4-tuples. If there is a bug, the intermediate code can be examined. That tells if the bug is in the first or second half of the compiler.
P.S. Most programmers are not aware that they have a choice of how much data structure to use. The less data structure you use, the less are the chances for bugs (and performance issues) caused by it.
I find tracepoints to be an invaluable debugging tool. They are a bit like logging, except you create them during a debugging session to solve a particular issue, like breakpoints.
Printing the stacktrace in a tracepoint can be especially useful. For example, you can print the hash code and stacktrace in the constructor of an object, and then later on when the object is used again you can search for its hashcode to see which client code created it. Same for seeing who disposed it or called a certain method etc.
They are also great for debugging issues related to window focus changes etc, where the debugger would interfere if you drop in break mode.
Static code analysis tools like FindBugs.
Assertions, assertions, and assertions.
Some areas of our code have 4 or 5 assertions for each line of real code. When we get a bug report, the first thing that happens is that the customer data is processed in our debug build; 99 times out of a hundred, an assert will fire near the cause of the bug.
Additionally, our debug build performs redundant calculations to ensure that an optimized algorithm is returning the correct result, and debug functions are used to examine the sanity of data structures.
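A sketch of that redundant-calculation idea in Java, where assert statements run only when assertions are enabled (java -ea), i.e. in debug-style runs, so release runs pay nothing (sumToN and sumToNSlow are hypothetical):

static long sumToN(long n) {
    long fast = n * (n + 1) / 2; // optimized closed form
    // Debug runs re-check against a straightforward reference implementation:
    assert fast == sumToNSlow(n) : "closed form disagrees with reference loop for n=" + n;
    return fast;
}

static long sumToNSlow(long n) {
    long total = 0;
    for (long i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}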
The hardest thing new developers have to contend with is getting their code to survive the assertions of the code they are calling.
Additionally, we do not allow any code to be put back to the top level that causes any integration or unit test to fail.
Stepping through the code, examining flow/state where unexpected behavior is occurring. (Then develop a test for it, of course).
Writing Debug.Write(message) in your code and using DebugView is another option. Then run your application and find out what is going on.
"Architecture" in software means something like:
Several components
The components interact across clearly-defined interfaces
Each component has a well-defined responsibility
The responsibility of one component is unlike the responsibilities of other components
So, as you said, the better the architecture the easier it is to find bugs.
First: knowing the bug, you can decide which functionality is broken, and therefore know which component implements that functionality. For example, if the bug is that something isn't being logged properly, then the bug should be in one of three places:
In the component that's responsible for logging (your logging library)
Or, above that in the application code which is using this library
Or, below that in the system code which this library is using
Second: examine the data transferred across the interfaces between components. To continue the previous example:
Set a debugger breakpoint on the application code which invokes the logger API, to verify whether the logger API is being used correctly (e.g. whether it's being invoked at all, whether parameters are as-expected, etc.).
Doing this tells you whether the bug is in the component above this interface, or in the component that's below this interface.
Repeat (perhaps using binary search if the call stack is very deep) until you've found which component is at fault.
When you come to the point that you think there must be a bug in the OS, check your assertions -- and put them into the code with "assert" statements.
Conversely, as you are writing the code, think of the range of valid inputs for your algorithms and put in assertions to make sure you have what you think you have. Same goes for output: Check that you produced what you think you produced.
E.g. if you expect a non-empty list:
l = getList(input)
assert l, "List was empty for input: %s" % str(input)
I'm part of the QA team at work, and knowing the product and how it is developed helps a lot in finding bugs. Also, when I make new QA tools I pass them to our dev team to test; finding bugs in your own code is just plain hard!
Some people say programmers are tainted, so we cannot see the bugs in our own products; and we are not talking only about code here, we are beyond that: usability and functionality itself.
Meanwhile, unit testing seems to be a nice solution for finding bugs in your own code, but it's totally pointless if you're wrong even before writing the unit test. How are you going to find the bugs then? You don't! Let your co-workers find them, or hire a QA guy.
Scientific debugging is what I always used, and it greatly helps.
Basically, if you can replicate a bug, you can track down its origin. You should then run some experiments, observe the results, and form hypotheses about why the bug happens.
Writing down all your hypotheses, attempts, expected results and observed results can help you track down bugs, particularly the nasty ones.
There are automated tools that can help you with that process, particularly git-bisect (and similar bisection tools on other revision systems) to quickly find which change introduced the bug, unit testing to reproduce a bug and prevent regressions in your code (can be used in combination with bisect), and delta debugging to find the culprit in your code (similar to git-bisect but whereas git-bisect works on the code history, delta debugging works on the code directly).
But whatever the tools you are using, the most important benefit is in the scientific methodology, as this is the formalization of what most experienced debuggers do.
