An error which does not present itself with a debugger attached - debugging

I am using Intel's Fortran compiler to compile a numerical library. The test case provided errors out within libc.so.6. When I attach Intel's debugger (IDB), the application runs through successfully. How do I debug a bug when the debugger prevents the bug from appearing? Note that the same bug arose with gfortran.
I am working on openSUSE 11.2 x64.
The error is:
forrtl: severe (408): fort: (3): Subscript #1 of the array B has value -534829264 which is less than the lower bound of 1

The error message is pretty clear to me: you are attempting to access a non-existent element of an array. I suspect that the value -534829264 is either junk picked up from an uninitialised variable used to index the array, or the result of an integer arithmetic overflow. Either way, you should switch on the compilation flag that forces array bounds checking and run some tests. I think the flag for the Intel compiler would be -CB, but check the documentation.
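Purely as an illustration (the question is Fortran, but the mechanism is language-neutral), here is a minimal C analogue of the overflow hypothesis: a 32-bit product wraps around and the resulting negative value would then be used as a subscript. Signed overflow is formally undefined in C; this only shows the wraparound you typically observe on two's-complement hardware.

    #include <stdio.h>

    int main(void)
    {
        /* Analogue of the suspected Fortran failure: an index computed
         * from values whose product exceeds INT_MAX wraps to a large
         * negative number (undefined behaviour in C, but wraparound is
         * what two's-complement machines typically produce). */
        int n = 50000, stride = 50000;
        int idx = n * stride;           /* 2.5e9 does not fit in 32 bits */

        printf("computed subscript: %d\n", idx);   /* a negative value */

        /* b[idx] here would be exactly the kind of out-of-bounds access
         * a bounds-checked build (ifort -CB, gfortran -fcheck=bounds)
         * traps at run time. */
        return 0;
    }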
As to why the program apparently runs successfully in the debugger I cannot help much, but perhaps the debugger imposes some default values on variables that the run-time system itself doesn't. Or some other factor entirely is responsible.
EDIT:
Doesn't the run-time system tell you which line of code causes the problem? Here are some more things to try to diagnose it. Use the compiler to warn you of:
use of variables before they are initialised;
integer arithmetic overflow (not sure whether the compiler can spot this);
any forced conversions from one type to another, or from one kind to another within the same type.
Also, check that the default integer size is what you expect it to be and, more importantly, what the rest of the code expects it to be.

Not an expert in the area, but a couple of things to consider:
1) Is the debugger initialising the variable used as the index to zero, while the non-debug run does not, so the variable starts with a "junk" value? (I had an old version of Pascal that used to do exactly that.)
2) Are you using threading? If so, is the debugger changing the order of execution so that some prep thread completes in time?

Related

How does a bytecode interpreter know what line a runtime error occurred on?

As of now, I am working on a language that compiles to bytecode, which is then run by a VM. My question is: when a runtime error occurs, how does the VM know what line of the source code caused the error, given that all whitespace is removed during the compilation process? One thing I can think of is to store a separate array of integers, parallel to the bytecode, holding the line numbers, but that sounds extremely memory-inefficient, especially when there are a lot of instructions.
Some forms of bytecode contain information about line numbers, method names, etc. which are included to provide better debugging information. In the JVM, for example, method bytecode contains a table that maps ranges of bytecode addresses to source line numbers. That’s a more efficient way of storing it than tagging each bytecode operation with a line number, since there are typically multiple operations per line. It does use extra space, though I wouldn’t classify it as extremely inefficient.
Absent this info, there really isn’t a way for the interpreter to report anything about the original program, since as you’ve noted all that information is otherwise discarded.
This is similar to how compiled executables handle debug info. With debug symbols included, the program has tables mapping code addresses to function names and line numbers. With symbols stripped out, you just have raw instructions and data and there’s no way to reference the original code.
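As a rough sketch of the table-based approach (names and layout hypothetical, not the JVM's actual format): keep one entry per source line, sorted by the bytecode offset where that line starts, and binary-search it when an error occurs.

    #include <stdio.h>
    #include <stddef.h>

    /* One entry per source line: the bytecode offset at which the
     * line's first instruction begins. Entries are sorted by start_pc,
     * so the lookup is a binary search for the last entry at or below
     * the faulting offset. */
    struct line_entry {
        unsigned start_pc;   /* first bytecode offset for this line */
        unsigned line;       /* source line number */
    };

    static unsigned line_for_pc(const struct line_entry *tab, size_t n,
                                unsigned pc)
    {
        size_t lo = 0, hi = n;
        while (lo + 1 < hi) {
            size_t mid = lo + (hi - lo) / 2;
            if (tab[mid].start_pc <= pc)
                lo = mid;
            else
                hi = mid;
        }
        return tab[lo].line;
    }

    int main(void)
    {
        /* e.g. three source lines compiled to eleven instructions */
        const struct line_entry table[] = { {0, 1}, {4, 2}, {9, 3} };

        printf("error at pc=6 -> line %u\n",
               line_for_pc(table, 3, 6));   /* prints: line 2 */
        return 0;
    }

The table costs one entry per source line rather than one per instruction, which is where the space saving over a parallel array comes from.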

Follow register changes with gdb

How can I follow changes to specific registers using GDB?
I want to log the address of each instruction that changes the value of this register.
How can I do that using GDB?
I want to log the address of each instruction that changes the value of this register.
The only way to do this is to single-step the program, compare the values of registers to previously saved values, and print the previous value of the instruction pointer if the value of the register of interest has changed.
You can automate this using GDB's embedded Python, but even with automation it will be impractically slow for any non-trivial program (as would single-stepping without actually doing anything between the steps).
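For the record, here is what that single-step-and-compare loop looks like when written out by hand with ptrace on Linux x86-64. This is a sketch, not GDB's machinery: it assumes RAX is the register of interest, and it is every bit as slow as the GDB version would be.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s /path/to/program [args...]\n", argv[0]);
            return 1;
        }

        pid_t pid = fork();
        if (pid == 0) {                        /* child: ask to be traced */
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);
            execv(argv[1], &argv[1]);
            perror("execv");
            _exit(127);
        }

        int status;
        waitpid(pid, &status, 0);              /* child stopped at entry */

        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, pid, NULL, &regs);
        unsigned long long prev_rax = regs.rax;
        unsigned long long prev_rip = regs.rip;

        while (1) {
            ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL);
            waitpid(pid, &status, 0);
            if (WIFEXITED(status))
                break;
            ptrace(PTRACE_GETREGS, pid, NULL, &regs);
            if (regs.rax != prev_rax) {
                /* prev_rip is the address of the instruction that just ran */
                printf("rax changed at %#llx: %#llx -> %#llx\n",
                       prev_rip, prev_rax,
                       (unsigned long long)regs.rax);
                prev_rax = regs.rax;
            }
            prev_rip = regs.rip;
        }
        return 0;
    }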
P.S. Depending on what actual problem you are trying to solve (see http://xyproblem.info), more practical solutions may exist.

C Program crashes when run, works in GDB

I am working on a program that crashes when it is run, but works just fine when debugged in GDB. I have seen this thread and removed optimizations and tried checking values of relevant local and global variables, with nothing seemingly out of place. It is not a concurrent program, so there shouldn't be issues with race conditions between threads. Windows Event Viewer logs the issue as a heap corruption (a problem with ntdll.dll), and I'm not sure what could be causing this. I am compiling with the 64-bit version of MinGW.
The program itself is rather large, and I'm not even sure which part to post. I don't really know how to proceed or what else I could check for. Any guidance on whether this is a known issue would be greatly appreciated; if there is any other information I could post, please let me know.
I was able to track down the issue: somewhere in the code, I was using fscanf to read in values of type int, but the variables they were being stored in (i.e., the third argument to fscanf) were of type char*. Changing the argument to one of type int* fixed the issue.
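In other words, a minimal reconstruction of the bug pattern (not the original code):

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("data.txt", "r");   /* hypothetical input file */
        if (!f)
            return 1;

        char bad;        /* what the code had */
        int  good;       /* what %d actually requires */

        /* Broken: %d makes fscanf store a full int through the pointer,
         * so passing the address of a char trashes the bytes next to
         * it. Corruption like this can surface much later, e.g. as the
         * ntdll.dll heap error above, and only outside the debugger. */
        /* fscanf(f, "%d", &bad);    <- undefined behaviour */

        /* Fixed: the argument type matches the conversion specifier. */
        if (fscanf(f, "%d", &good) == 1)
            printf("read %d\n", good);

        fclose(f);
        return 0;
    }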

Creating deliberately dirty heap in gdb?

I found that trying to debug accidentally uninitialized data in gdb can be annoying. The program will crash when executed directly from the command line, but not while under inspection in gdb. It seems like gdb's heap is often clean (all zeroes), whereas from the command line it clearly is not.
Is there a reason for this? If so, can I deliberately tell gdb or gcc to dirty the heap? I.e., is there a way to specify a "debug" allocator that will always hand random data to malloc() and new? I imagine this might involve a special libc? Ideally this would work without changing the linker options, so that the release version stays as similar as possible to the debug version.
I'm currently using MinGW-w64 (gcc 4.7 based), but I'd be interested in a general answer.
The Linux way of doing this would be to use valgrind. On Mac OS X there are environment variables that control allocation debugging; see the Mac OS X man page for malloc. Valgrind support for Mac OS X is starting to appear, but 10.8 support is not complete as of this writing.
As you're using MinGW-w64 I am assuming you're using Windows. It seems like this SO question talks about alternatives to valgrind on Windows. One solution would be to run your app in Wine on a Linux box under valgrind.
If your program is running under valgrind, it is not directly running on a CPU. Valgrind is simulating every instruction, hence you can't simply attach a debugger to it. To get this to work you need to use the valgrind GDB server, see this page for more details.
Another approach would be to use calloc instead of malloc, which would zero your heap allocations. This doesn't give you a deliberately dirty heap but at least gives you consistent behaviour with or without a debugger.
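A minimal illustration of that difference:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* malloc: contents are indeterminate. They may happen to be
         * zero under a debugger and garbage in a normal run. */
        int *a = malloc(16 * sizeof *a);

        /* calloc: contents are guaranteed to be zero, with or without
         * a debugger attached. */
        int *b = calloc(16, sizeof *b);

        if (!a || !b)
            return 1;

        printf("b[0] = %d (always 0); a[0] is whatever was in memory\n",
               b[0]);

        free(a);
        free(b);
        return 0;
    }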
Yes, GDB zeroes out everything, which is both useful and very annoying. Useful insofar as everything is guaranteed to be in a well-defined state (no random values in memory, just zero), which means, in theory, no nasty surprises while debugging.
In practice, and this is where it gets annoying, the theory sometimes fails spectacularly. The infamous "works fine, but crashes in debugger!?!" or "works fine in debugger, but crashes otherwise?!" issues are an example of this. Usually, this is a combination of an uninitialized pointer with a well-intended if(ptr != NULL) somewhere, which totally blows up for "no good reason" because the debugger initializes memory to zero, so the test fails to do what you intended.
About your question on deliberately garbling data allocated by malloc: glibc (not GCC itself) supports malloc hooks (see the glibc documentation and related questions on SO).
This lets you, in a very easy and unintrusive manner, redirect all calls to malloc to a function of your own. From there you can call the real malloc and fill the allocated block with garbage (or some invalid-pointer magic value like DEADBEEF), if you wish to do so.
As for operator new, it happens to be a wrapper around malloc (that's an implementation detail, but malloc hooks are non-portable already, so relying on it won't make things worse), so malloc hooking should already deal with this, too.
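The classic hook dance looks roughly like this. Treat it as a sketch for older glibc: __malloc_hook is long deprecated and was removed entirely in glibc 2.34, MinGW's C runtime never had it, and the hooks are not thread-safe.

    #include <malloc.h>
    #include <stdio.h>
    #include <string.h>

    static void *(*old_malloc_hook)(size_t, const void *);

    static void *dirty_malloc_hook(size_t size, const void *caller)
    {
        void *p;
        __malloc_hook = old_malloc_hook;    /* avoid recursing into ourselves */
        p = malloc(size);
        if (p)
            memset(p, 0xCD, size);          /* poison the fresh block */
        old_malloc_hook = __malloc_hook;
        __malloc_hook = dirty_malloc_hook;  /* re-arm the hook */
        return p;
    }

    static void install_dirty_hook(void)
    {
        old_malloc_hook = __malloc_hook;
        __malloc_hook = dirty_malloc_hook;
    }

    int main(void)
    {
        install_dirty_hook();
        unsigned char *p = malloc(16);
        printf("first byte: 0x%02x\n", p[0]);   /* 0xcd, never 0x00 */
        free(p);
        return 0;
    }

An uninitialised read then shows up as the 0xCD pattern instead of a benign zero, with or without a debugger attached.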

glibc Heap Consistency Checking

According to posts from 2008 (I can't find them right now), the glibc heap check doesn't work in a multithreaded environment. Is that still the situation now, in 2010?
Is heap checking enabled by default (gcc 4.1.2)? I don't set MALLOC_CHECK_ and am not aware of calling mcheck(), but I still sometimes receive a double-free glibc error with a backtrace. Maybe it's enabled by some compilation flag?
By default, without using MALLOC_CHECK_ or mcheck(), glibc does some small checks that don't hurt performance, such as detecting free() being called twice on the same memory chunk. That's why you are getting some of these messages, but you won't get all the messages provided by the malloc substitute API that MALLOC_CHECK_ enables (which does far more tests, but is far more CPU-intensive too). You can verify this by triggering an error and testing it with and without MALLOC_CHECK_. For example, for a simple double free, I get "double free or corruption (top)" or "free(): invalid pointer" errors depending on whether I set MALLOC_CHECK_ or not.
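For instance (the exact message varies with the glibc version):

    #include <stdlib.h>

    int main(void)
    {
        char *p = malloc(32);

        free(p);
        free(p);    /* glibc's default lightweight check aborts here;
                     * run the same binary with MALLOC_CHECK_=1 set in
                     * the environment to compare the more thorough
                     * checker's report. */
        return 0;
    }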
To answer the first question: mcheck has relied on malloc hooks for as long as they have existed (some 15 years), and those are not intended to be thread-safe.
Sources: glibc/malloc/malloc.c, http://sourceware.org/bugzilla/show_bug.cgi?id=9939
