glibc Heap Consistency Checking - gcc

According to posts from 2008 (I can't find them right now), the glibc heap check doesn't work in a multithreaded environment. Is that still the situation now, in 2010?
Is heap checking enabled by default (gcc 4.1.2)? I don't set MALLOC_CHECK_ and I'm not aware of calling mcheck(), but I still sometimes get a glibc double-free error with a backtrace. Maybe it's enabled by some compilation flag?

By default, without using MALLOC_CHECK_ or mcheck(), glibc performs a few lightweight checks that don't hurt performance, such as detecting free() being called twice on the same memory chunk. That's why you are getting some of these messages, but you won't get all the messages provided by the debugging malloc you enable with MALLOC_CHECK_ (which does far more tests, but is far more CPU-intensive too). You can verify this by triggering an error and testing it with and without MALLOC_CHECK_. For example, for a simple double free, I get a "double free or corruption (top)" or a "free(): invalid pointer" error depending on whether I set MALLOC_CHECK_ or not.
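To make that concrete, here is a minimal double-free reproducer (my own sketch, not from the original post); run it plain and then with MALLOC_CHECK_=1 to compare the diagnostics (the exact message varies by glibc version):

    // double_free.cpp -- build: g++ double_free.cpp -o double_free
    // Run it plain, then as: MALLOC_CHECK_=1 ./double_free
    #include <cstdlib>

    int main() {
        void* p = std::malloc(32);
        std::free(p);
        std::free(p);   // second free of the same chunk: glibc aborts here
        return 0;
    }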
To answer the first question: mcheck has relied on malloc hooks ever since those were introduced (some 15 years ago), and the hooks are not intended to be thread safe.
Sources: glibc/malloc/malloc.c, http://sourceware.org/bugzilla/show_bug.cgi?id=9939
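For reference, explicitly enabling mcheck looks roughly like this (my sketch; mcheck is glibc-specific and must be active before the first allocation, which in C++ can happen before main(), so linking with -lmcheck is the more reliable route):

    #include <mcheck.h>
    #include <cstdlib>

    int main() {
        mcheck(NULL);      // NULL: use the default handler (report and abort);
                           // must run before anything calls malloc
        char* p = static_cast<char*>(std::malloc(16));
        p[16] = 'x';       // off-by-one write clobbers mcheck's guard bytes
        std::free(p);      // mcheck notices the damage here and aborts
        return 0;
    }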

Related

Force malloc to pre-fault/MAP_POPULATE/MADV_WILLNEED all allocations for an entire program/process

For the sake of some user-space performance profiling, I'd like to cleanly separate the costs of allocating memory from operations that access it. The application does no over-allocation, so every page that gets mapped will be faulted in, probably in code that runs shortly after its allocation.
What I'd like to do is set some flag (an environment variable, say) to tell malloc that it should uniformly do the equivalent of calling mmap(..., MAP_POPULATE) or madvise(..., MADV_WILLNEED), or just touch every page of whatever it allocated. I haven't found any documentation, on any platform(!), that describes a way to do this. Is there some existing technique that's utterly undocumented, at least as far as my searching can tell? Is this a fundamentally misguided or bad idea?
If I wanted to implement this myself, I'm thinking of an LD_PRELOAD library containing just a reimplementation of malloc that calls the underlying malloc and then does the madvise thing (to be at least somewhat agnostic to huge-page behavior). Any reason that shouldn't work?
malloc is one of the most heavily used, yet comparatively slow, functions in common use. As a result, it has received a lot of optimization attention over the years. I seriously doubt that any serious implementation of malloc does anything as slow as the string parsing required to check an environment variable on every call; at best, such a variable would be read once at initialization.
LD_PRELOAD is not a bad idea; considering what you're doing, you wouldn't even need to recompile to switch between profile and release builds. If you're open to recompiling, I would suggest something like #define malloc(size) ({ void* p_ = malloc(size); /* madvise/touch the pages of p_ here */ p_; }) (a GCC statement expression, so the allocation's result is still returned; the inner malloc is not re-expanded by the preprocessor). You could even do this on the compile command line via -Dmalloc=... (so long as the system's malloc is not itself a macro, which would clash with the one from the command line).
Another option would be to find/implement a program that uses the debug interface to intercept and redirect calls to malloc. You could theoretically do this by patching the compiled (or loaded) program's import section to point to your dll/so file.
Edit: On second thought, the define might not cover every allocation, since allocation often happens without a literal malloc call in your code (e.g. new).
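Here is a sketch of the LD_PRELOAD approach (file and variable names are mine; it interposes malloc only, and uses a write rather than madvise so each page is genuinely materialized, not merely mapped to the shared zero page):

    // prefault.cpp -- build: g++ -shared -fPIC prefault.cpp -o prefault.so -ldl
    // usage:          LD_PRELOAD=./prefault.so ./your_program
    // (g++ on Linux predefines _GNU_SOURCE, which RTLD_NEXT needs)
    #include <dlfcn.h>
    #include <unistd.h>
    #include <cstddef>

    extern "C" void* malloc(size_t size) {
        typedef void* (*malloc_fn)(size_t);
        // dlsym may allocate internally via calloc; since only malloc is
        // interposed here, that doesn't recurse back into this wrapper.
        static malloc_fn real_malloc = (malloc_fn)dlsym(RTLD_NEXT, "malloc");
        void* p = real_malloc(size);
        if (p && size) {
            const size_t page = (size_t)sysconf(_SC_PAGESIZE);
            volatile char* c = (volatile char*)p;
            for (size_t i = 0; i < size; i += page)
                c[i] = 0;       // malloc'd contents are indeterminate,
                                // so zeroing one byte per page is harmless
            c[size - 1] = 0;    // cover the tail page too
        }
        return p;
    }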

Creating deliberately dirty heap in gdb?

I've found that trying to debug accidentally uninitialized data in gdb can be annoying. The program will crash when executed directly from the command line, but not while under inspection in gdb. It seems that gdb's heap is often clean (all zeroes), whereas from the command line it clearly is not.
Is there a reason for this? If so, can I deliberately tell gdb or gcc to dirty the heap? I.e., is there a way to specify a "debug" allocator that always hands random data out from malloc() and new? I imagine this might involve a special libc? It would obviously be great if this were possible without changing the linker options, so that the release version stays as similar as possible to the debug version.
I'm currently using MinGW-w64 (gcc 4.7 based), but I'd be interested in a general answer.
The Linux way of doing this would be to use valgrind. On Mac OS X there are environment variables that control allocation debugging; see the Mac OS X man page for malloc. Valgrind support for Mac OS X is starting to appear, but 10.8 support is not complete as of this writing.
As you're using MinGW-w64 I am assuming you're using Windows. It seems like this SO question talks about alternatives to valgrind on Windows. One solution would be to run your app in Wine on a Linux box under valgrind.
If your program is running under valgrind, it is not directly running on a CPU. Valgrind is simulating every instruction, hence you can't simply attach a debugger to it. To get this to work you need to use the valgrind GDB server, see this page for more details.
Another approach would be to use calloc instead of malloc, which would zero your heap allocations. This doesn't give you a deliberately dirty heap but at least gives you consistent behaviour with or without a debugger.
Yes, GDB zeroes out everything, which is both useful and very annoying. Useful insofar as everything is guaranteed to be in a well-defined state (no random values in memory, just zero), which in theory means no nasty surprises while debugging.
In practice, and this is where it gets annoying, the theory sometimes fails spectacularly. The infamous "works fine, but crashes in debugger!?!" or "works fine in debugger, but crashes otherwise?!" issues are an example of this. Usually, this is a combination of an uninitialized pointer with a well-intended if(ptr != NULL) somewhere, which totally blows up for "no good reason" because the debugger initializes memory to zero, so the test fails to do what you intended.
About your question on deliberately garbling data returned by malloc: glibc supports malloc hooks (see the glibc documentation and related questions here on SO).
This lets you, in a very easy and unintrusive manner, redirect all calls to malloc to a function of your own. From there you can call the real malloc and fill the allocated block with garbage (or some invalid-pointer magic value like DEADBEEF), if you wish to do so.
As for operator new, this happens to be a wrapper around malloc (that's an implementation detail, but malloc hooks are non-portable already, so relying on that won't make things worse), therefore malloc hooking should already deal with this, too.
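A sketch of that hook mechanism, filling every block with a recognizable garbage pattern (glibc-specific; the hooks are deprecated and were removed in glibc 2.34, so this matches the era of the question more than current systems):

    #include <malloc.h>   // glibc's hook declarations (deprecated API)
    #include <cstring>

    static void* (*old_malloc_hook)(size_t, const void*);

    static void* garbage_fill_hook(size_t size, const void* caller) {
        (void)caller;
        __malloc_hook = old_malloc_hook;   // restore so the real malloc runs
        void* p = malloc(size);
        if (p)
            std::memset(p, 0xCD, size);    // poison: uninitialized reads now
                                           // see a recognizable 0xCD pattern
        old_malloc_hook = __malloc_hook;
        __malloc_hook = garbage_fill_hook; // re-arm the hook
        return p;
    }

    // Install the hook at program start-up.
    __attribute__((constructor)) static void install_hook() {
        old_malloc_hook = __malloc_hook;
        __malloc_hook = garbage_fill_hook;
    }

And because libstdc++'s operator new calls malloc, blocks obtained with new get poisoned as well.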

Is UNALIGNED memory access required on LINUX (porting from Windows to Linux)

I am porting code from Windows to Linux (Red Hat Linux or Fedora). The existing code contains declarations of the form (datatype UNALIGNED*).
Can you please let me know:
1) Is UNALIGNED memory access required when porting to Linux?
2) If it is, how can I achieve the same?
I have looked around for a Linux equivalent and have come across the use of arm/unaligned.h, but when I try to include it, I get a "No such file or directory" error.
Thanks.
With recent gcc you might consider using __attribute__ ((__packed__))
But I suggest avoiding it when possible. The compiler does quite a good job of aligning fields, and the ABI may define rules for alignment.
You should understand why your source code uses UNALIGNED: is it because the data has an externally defined format, or is it for "performance" reasons? Leave that optimization to the compiler!
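For the externally-defined-format case, a small sketch (the field layout is invented): marking the struct packed tells gcc the members may be misaligned, and it emits access code that is safe even on strict-alignment CPUs:

    #include <cstdint>

    // A record with an externally fixed byte layout (fields invented):
    struct __attribute__((__packed__)) WireHeader {
        uint8_t  type;
        uint32_t length;    // at offset 1; normally padding would move it to 4
        uint16_t checksum;
    };
    // sizeof(WireHeader) == 7: no padding. Because gcc knows the members may
    // be misaligned, it generates access sequences that work on
    // strict-alignment architectures, at some cost in speed.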
Alignment is a CPU restriction, not an OS thing. x86 CPUs can do unaligned accesses (with some performance penalty); many others will raise a bus error under the very same Linux version if you try to load a word through anything other than an aligned pointer.
The UNALIGNED keyword in MSVC is, on x86, a no-op as far as I know. On other architectures it makes the compiler emit more complicated instruction sequences to ensure the access completes successfully. Are you trying to find a gcc equivalent? I don't believe one exists.

Memory leak in C++

How do I detect a memory leak? Is there a tool/utility available, or some piece of code (i.e. overloading operator new and delete), or do I just need to check every new and delete in the code?
If there is a tool/utility, please tell me which; and if it can be done in code, can anyone explain what that code would look like?
Tools that might help you:
Linux: valgrind
Win32: MemoryValidator
You have to check that every bit of memory that gets allocated (new, malloc, ...) is freed using the matching function (delete, free, ...).
Use e.g. boost::shared_ptr instead of naked pointers.
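For instance (a minimal sketch requiring Boost; on a C++11 compiler, std::shared_ptr works the same way):

    #include <boost/shared_ptr.hpp>

    struct Widget { /* ... */ };

    void demo() {
        boost::shared_ptr<Widget> w(new Widget);
        // no matching delete needed: the Widget is destroyed automatically
        // when the last shared_ptr referring to it goes away
    }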
Analyze your application with one of these: http://en.wikipedia.org/wiki/Memory_debugger
One way is to store the file name and line number of the allocating module (via pointers to the string literals) in the allocated block of data. The file and line number are captured using the standard __FILE__ and __LINE__ macros. When the memory is deallocated, that information is removed.
One of our systems has this feature; we call it a "memory hog report". At any time from our CLI we can print out all the allocated memory, along with a list of who allocated it, sorted by which code module has the most memory outstanding. Often we'll monitor memory usage this way over time, and eventually the memory hog (leak) bubbles up to the top of the list.
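A sketch of that idea (simplified: array forms, the matching placement delete, and thread safety are omitted, and all names are mine):

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>
    #include <new>

    // Hidden header stored in front of every allocation, linked into a list.
    // alignas keeps the user data after the header suitably aligned.
    struct alignas(std::max_align_t) AllocHeader {
        const char*  file;
        int          line;
        std::size_t  size;
        AllocHeader* prev;
        AllocHeader* next;
    };
    static AllocHeader* g_allocs = nullptr;

    void* operator new(std::size_t size, const char* file, int line) {
        AllocHeader* h =
            static_cast<AllocHeader*>(std::malloc(sizeof(AllocHeader) + size));
        if (!h) throw std::bad_alloc();
        h->file = file; h->line = line; h->size = size;
        h->prev = nullptr; h->next = g_allocs;
        if (g_allocs) g_allocs->prev = h;
        g_allocs = h;
        return h + 1;                    // user data begins after the header
    }

    // Plain new must use the same layout, or operator delete below would
    // walk off the front of a block it doesn't own.
    void* operator new(std::size_t size) {
        return operator new(size, "<unknown>", 0);
    }

    void operator delete(void* p) noexcept {
        if (!p) return;
        AllocHeader* h = static_cast<AllocHeader*>(p) - 1;
        if (h->prev) h->prev->next = h->next; else g_allocs = h->next;
        if (h->next) h->next->prev = h->prev;
        std::free(h);
    }

    // The "memory hog report": every block still live, with its origin.
    void report_allocations() {
        for (AllocHeader* h = g_allocs; h; h = h->next)
            std::fprintf(stderr, "%zu bytes from %s:%d\n",
                         h->size, h->file, h->line);
    }

    // In your own sources, after all #includes:
    //   #define new new(__FILE__, __LINE__)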
valgrind is a very powerful tool that you can use to detect memory leaks. Once installed, you can run
valgrind --leak-check=full path/to/program arguments...
and valgrind will run the program, finding leaks and reporting them to you.
I can also recommend UMDH: http://support.microsoft.com/kb/268343
Your best solution is probably to use valgrind, which is one of the better tools.
If you are on OS X with Xcode, you can use the Leaks tool. If you click "Run with performance tool" and select Leaks, it will show allocated and leaked memory.
Something to remember, though: most of the tools listed only catch memory leaks as they occur. So if some code leaks memory but is rarely called (or rarely enough that you don't hit it while testing for leaks), you can miss it. If you want something that actually reasons about your code, you'll need a static analyzer. The only one I know of is the Clang Static Analyzer, but it is for C and Obj-C (I don't know whether it supports C++).

Question about g++ generated code

Dear g++ hackers, I have the following question.
When some data of an object is overwritten by a faulty program, why does the program eventually fail with a double-free error on destruction of that object? How does it know whether the data is corrupted? And why does that cause a double free?
It's usually not the object's own memory that gets overwritten, but some memory outside of it. If the overrun hits malloc's control structures, free() will freak out once it accesses them and tries to do weird things based on the corrupted structures.
If you really only overwrote the object's own memory with silly stuff, there's no way malloc/free could know. Your program might crash, but for other reasons.
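To illustrate, a sketch of how overrunning one block can make a later free() abort (whether the two blocks end up adjacent is up to the allocator, so treat this as illustrative only):

    #include <cstring>

    int main() {
        char* a = new char[16];
        char* b = new char[16];
        // Write 32 bytes into a 16-byte block. With glibc this typically
        // scribbles over the bookkeeping that malloc keeps just before b.
        // Nothing fails at this point...
        std::memset(a, 0xFF, 32);
        delete[] b;   // ...but here free() reads the corrupted metadata and
                      // aborts, often with "double free or corruption"
        delete[] a;
        return 0;
    }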
Take a look at valgrind. It's a tool that emulates the CPU and watches every memory access for anomalies (like trying to overwrite malloc's control structures). It's really easy to use; most of the time you just start your program inside valgrind by prepending valgrind to the command line, and it saves you a lot of pain.
Regarding C++: always make sure that you use new in conjunction with delete and, respectively, new[] in conjunction with delete[]. Never mix them up. Bad things will happen, often similar to what you are describing (but valgrind would warn you).
