How do I interpret this error from GDB?


I feel pretty dumb right now, but how do I interpret this message in GDB?
Program received signal SIGSEGV, Segmentation fault.
0x00007fe2eb46073a in clearerr (fp=0x4359790) at clearerr.c:27
27 clearerr.c: No such file or directory.
in clearerr.c
What file is missing that's causing the segfault? Is it clearerr.c or the file that clearerr is trying to access?

What file is missing that's causing the segfault?
We don't know what is causing SIGSEGV, but it's unlikely that any missing file has anything to do with it.
First, this:
clearerr.c: No such file or directory.
simply means that GDB cannot show you the source where SIGSEGV occurred. That is because clearerr() is part of your libc, and you either didn't install sources for your libc (they may not even be available for your environment), or you didn't tell GDB how to find these sources.
Second, the actual cause of SIGSEGV is most likely because the fp that you invoked it with has been corrupted or is invalid in some other way.
Here are a few ways this could happen:
char c;
FILE *fp = (FILE*) &c; // fp is bogus: doesn't point to a FILE at all
clearerr(fp); // likely will crash
FILE *fp2; // fp2 contains uninitialized garbage
clearerr(fp2); // likely will crash
FILE *fp3 = fopen("/tmp/foo", "w");
fclose(fp3); // destroys fp3
clearerr(fp3); // accesses dangling memory, likely will crash
There are of course many other ways as well. You'll need to look at the caller of clearerr to see if it's doing something stupid. To find the caller, use GDB's where command (an alias for backtrace, or bt).
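One defensive habit that helps with the third case above (use after fclose) is to reset the pointer to NULL when the stream is closed, and to check for NULL before touching it. The sketch below is only an illustration: close_stream and clear_if_open are hypothetical helper names, not part of libc, and a NULL check cannot detect a pointer that holds arbitrary garbage (the first two cases).

```c
#include <stdio.h>

/* Hypothetical helper: close *fpp if it is open and reset it to NULL,
 * so that later code cannot reuse a dangling FILE pointer by accident.
 * Returns the result of fclose(), or 0 if there was nothing to close. */
static int close_stream(FILE **fpp)
{
    int rc = 0;
    if (*fpp != NULL) {
        rc = fclose(*fpp);
        *fpp = NULL;
    }
    return rc;
}

/* Hypothetical helper: call clearerr() only on a non-NULL stream.
 * Returns 1 if clearerr() was called, 0 if the stream was NULL. */
static int clear_if_open(FILE *fp)
{
    if (fp == NULL)
        return 0;
    clearerr(fp);
    return 1;
}
```

With this pattern, calling clear_if_open(fp) after close_stream(&fp) is a harmless no-op instead of an access to freed memory.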

The seg fault is being caused by a file that clearerr.c is trying to access (at line 27).

Related

Segmentation Fault Using LLDB

When I was debugging my .c file using lldb in the terminal on a Mac, I somehow could not find the location of the segmentation fault. I have debugged the code numerous times and it still produces the same error. Can someone help me understand why I can't find the location of the segmentation fault?

Use the bt command in lldb to see the call stack. You've called a libc function like scanf() and are most likely passing an invalid argument to it. When you see the call stack, you will see a stack frame with your own code on it, say it is frame #3. You can select that frame with f 3, and you can look at variables with the v command to understand what arguments were passed to the libc function that led to a crash.

Without knowing what your code is doing, I would suggest using a tool like valgrind instead of just a normal debugger. It's designed to look for memory issues for lower-level languages like C/C++/FORTRAN. For example, it will tell you if you're trying to use an index that is too large for an array.
From the quick start guide, try valgrind --leak-check=yes myprog arg1 arg2

Errors in OpenCL kernel code at runtime

I am new to Visual Studio and I am using it to write a simple parallel sorting program using OpenCL.
When I run it, I get a line before my output (i.e. from before I receive and print the result buffer) saying "5 Errors Generated.".
I assume this is telling me that I have errors in my kernel file, and if I deliberately write errors in my kernel file that number increases.
I would really like to know what those errors are so I can correct my program. Being unfamiliar with VS I simply cannot find them listed anywhere.
Does anyone know where I can find what errors are being generated?
Thanks

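You need to call clGetProgramBuildInfo, asking for CL_PROGRAM_BUILD_LOG, in order to get the error messages that the OpenCL compiler produced when it built your kernel at runtime.
char result[4096]; // fixed-size buffer; per the OpenCL spec the call fails with CL_INVALID_VALUE if the log is larger
size_t size;       // receives the actual size of the log in bytes
cl_int err = clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, sizeof(result), result, &size);
if (err == CL_SUCCESS)
    printf("%s\n", result);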

Challenge: Access violation reading/executing location after successful compile

A challenge emerged from the Danish Center for Cyber Security a few weeks ago.
See https://puzzling.stackexchange.com/questions/49702/programming-news-paper-puzzle/49757
A part of the challenge is to fix an Assembly code, load an .img file to process and then compile it. The file is called u5emu.asm.
A user called David J posted a cleaned-up version of the .asm code here: http://pastebin.com/TChuYF29
There's a minor bug where he wrote . instead of , on line 126, otherwise it looks good. What I did additionally was to change the getchar and putchar to _getchar and _putchar in the .asm code so the C lib would work. Also, I edited the U5_LE to _asm_main: since driver.c's main calls _asm_main.
I've gotten as far as to create an .exe by doing:
nasm -f win32 u5emu.asm
gcc -o u5emu u5emu.obj driver.c asm_io.obj
Which creates an executable file. I'm pretty sure that the program will ask me for an input (since there's a getchar) and it will then process the included file (a B64 encoded string which I've cleaned up and removed odd symbols like [, ; etc) and put out a clue for the next part of the challenge.
When I run the exe it crashes and I get two types of errors when I debug:
Unhandled exception at 0x546CD4A1 in u5emu.exe: 0xC0000005: Access violation reading location 0x00000000.
And
Exception thrown at 0x00000000 in u5emu.exe: 0xC0000005: Access violation executing location 0x00000000
I've hit a dead end here, so hoping someone can assist me in how to crack this.

Not an answer to your question, but I can tell you what I did: I rewrote the small program in C (using a switch for the 32 opcodes). This makes it MUCH easier to add debug printouts, etc. Hint #2: Remember to swap bytes, the emulated machine is big endian.

EXC_GUARD exception

An OS X app crashes when I try to close a socket handle. It worked fine on all the previous platforms, but it appears to crash in Yosemite.
The line where it crashes is
-(void)stopPacketReceiver
{
close(sd);
}
In Xcode it pauses all the threads and shows an EXC_GUARD exception. What kind of exception is this? Any ideas?
Thanks,
Ahmed
EDIT:
Here are the exception codes that I get
Exception Type: EXC_GUARD
Exception Codes: 0x4000000100000000, 0x08fd4dbfade2dead

From a post in Apple's old developer forums from Quinn "The Eskimo" (Apple Developer Relations, Developer Technical Support, Core OS/Hardware), edited by me to remove things which were specific to that specific case:
EXC_GUARD is a change in 10.9 designed to help you detect file
descriptor problems. Specifically, the system can now flag specific
file descriptors as being guarded, after which normal operations on
those descriptors will trigger an EXC_GUARD crash (when it wants to
operate on these file descriptors, the system uses special 'guarded'
private APIs).
We added this to the system because we found a lot of apps were
crashing mysteriously after accidentally closing a file descriptor
that had been opened by a system library. For example, if an app
closes the file descriptor used to access the SQLite file backing a
Core Data store, Core Data would then crash mysteriously much later
on. The guard exception gets these problems noticed sooner, and thus
makes them easier to debug.
For an EXC_GUARD crash, the exception codes break down as follows:
o The first exception code … contains three bit
fields:
The top three bits … indicate [the type of guard].
The remainder of the top 32 bits … indicate [which operation was disallowed].
The bottom 32 bits indicate the descriptor in question ….
o The second exception code is a magic number associated with the
guard. …
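The bit-field layout described in the quoted post can be sketched in C as follows. This is only an illustration of that description: the struct and function names are hypothetical, and the field widths (3 bits, 29 bits, 32 bits) are taken directly from the wording of the post rather than from a kernel header.

```c
#include <stdint.h>

/* Hypothetical container for the three fields of the first
 * EXC_GUARD exception code, as described in the quoted post. */
struct guard_code {
    unsigned type;    /* top 3 bits: the type of guard */
    uint32_t flavor;  /* rest of the top 32 bits: the disallowed operation */
    uint32_t target;  /* bottom 32 bits: the descriptor in question */
};

/* Split a 64-bit exception code into its three fields. */
static struct guard_code decode_guard_code(uint64_t code)
{
    struct guard_code g;
    g.type   = (unsigned)(code >> 61);
    g.flavor = (uint32_t)((code >> 32) & 0x1fffffffu);
    g.target = (uint32_t)(code & 0xffffffffu);
    return g;
}
```

Applied to the first code from the question, 0x4000000100000000, this gives type 2, operation bits 0x1, and descriptor 0. Mapping those numbers to names such as GUARD_TYPE_FD or kGUARD_EXC_CLOSE requires the constants from the kernel source.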
Your code is closing a socket it doesn't own. Maybe sd contains the descriptor number for a descriptor that you once owned but is now a dangling reference, because you already closed your descriptor and that number has now been reused for somebody else's descriptor. Or maybe sd just has a junk value somehow.
We can decode some more information from the exception codes, but most likely you just have to trace exactly what you're doing with sd over its life.
Update:
From the edited question, I see that you've posted the exception codes. Using the constants from the kernel source, the type of guard is GUARD_TYPE_FD, the operation that was disallowed was kGUARD_EXC_CLOSE (i.e. close()), and the descriptor was 0 (STDIN_FILENO).
So, in all probability, your stopPacketReceiver was called when the sd instance variable was uninitialized and had the default 0 value that all instance variables get when an object is first allocated.
The magic value is 0x08fd4dbfade2dead, which according to the original developer forums post, "indicates that the guard was applied by SQLite". That seems strange. Descriptor 0 would normally be open from process launch (perhaps referencing /dev/null). So, SQLite should not own that.
I suspect what has happened is that your code has actually closed descriptor 0 twice. The first time it was not guarded. It's legal to close STDIN_FILENO. Programs sometimes do it to reopen that descriptor to reference something else (such as /dev/null) if they don't want/need the original standard input. In your case, it would have been an accident but would not have raised an exception. Once it was closed, the descriptor would have been available to be reallocated to the next thing which opened a descriptor. I guess that was SQLite. At that time, SQLite put a guard on the descriptor. Then, your code tried to close it again and got the EXC_GUARD exception.
If I'm right, then it's somewhat random that your code got the exception (although it was always doing something bad). The fact that file descriptor 0 got assigned to a subsystem that applied a guard to it could be a race condition or it could be a change in order of operations between versions of the OS.
You need to be more careful to not close descriptors that you didn't open. You should initialize any instance variable meant to hold a file descriptor to -1, not 0. Likewise, if you close a descriptor that you did own, you should set the instance variable back to -1.
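A minimal sketch of that -1 convention in plain C follows. The helper name close_owned_fd is hypothetical; in the original Objective-C code the same logic would live in stopPacketReceiver, with sd set to -1 in the initializer.

```c
#include <unistd.h>

/* Hypothetical helper: close the descriptor stored in *fd only if it
 * holds a valid (non-negative) value, then mark it as "not open" by
 * resetting it to -1. Returns the result of close(), or 0 if there
 * was nothing to close. */
static int close_owned_fd(int *fd)
{
    int rc = 0;
    if (*fd >= 0) {
        rc = close(*fd);
        *fd = -1;  /* a second call is now a harmless no-op */
    }
    return rc;
}
```

A variable that starts at -1 and is only ever closed through a helper like this can never cause close(0) by accident, and it cannot close the same descriptor number twice.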

Firstly, that sounds awesome - it sounds like it caught what would have been EXC_BAD_ACCESS (but this is a guess).
My guess is that sd isn't a valid descriptor. It's possible an API changed in Yosemite that's causing the place you create the descriptor to return NULL, or it's possible a change in the event timeline in Yosemite causes it to have already been cleaned up.
Debugging tip here: trace back sd all the way to its creation.

How to debug stack-overwriting errors with Valgrind?

I just spent some time chasing down a bug that boiled down to the following. Code was erroneously overwriting the stack, and I think it wrote over the return address of the function call. Following the return, the program would crash and stack would be corrupted. Running the program in valgrind would return an error such as:
vex x86->IR: unhandled instruction bytes: 0xEA 0x3 0x0 0x0
==9222== valgrind: Unrecognised instruction at address 0x4e925a8.
I figure this is because the return jumped to a random location containing bytes that were not valid x86 opcodes. (Though I am somewhat suspicious that this address 0x4e925a8 happened to be in an executable page. I imagine valgrind would throw a different error if this wasn't the case.)
I am certain that the problem was of the stack-overwriting type, and I've since fixed it. Now I am trying to think how I could catch errors like this more effectively. Obviously, valgrind can't warn me if I rewrite data on the stack, but maybe it can catch when someone writes over a return address on the stack. In principle, it can detect when something like 'push EIP' happens (so it can flag where the return addresses are on the stack).
I was wondering if anyone knows if Valgrind, or anything else can do that? If not, can you comment on other suggestions regarding debugging errors of this type efficiently.

If the problem happens deterministically enough that you can point out a particular function that has its stack smashed (in one repeatable test case), you could, in gdb:
Break at entry to that function
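Find where the return address is stored. On 32-bit x86 with a standard frame-pointer prologue, %ebp holds the value of %esp just after the caller's %ebp was pushed, so the return address sits at 4(%ebp); GDB's info frame command also reports the address where eip was saved.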
Add watchpoint to that address. You have to issue the watch command with calculated number, not an expression, because with an expression gdb would try to re-evaluate it after each instruction instead of setting up a trap and that would be extremely slow.
Let the function run to completion.
I have not yet worked with the python support available in gdb7, but it should allow automating this.

In general, Valgrind detection of overflows in stack and global variables is weak to nonexistent. Arguably, Valgrind is the wrong tool for that job.
If you are on one of the supported platforms, building with -fmudflap and linking with -lmudflap will give you much better results for these kinds of errors.
Update:
Much has changed in the 6 years since this answer. On Linux, the tool to find stack (and heap) overflows is AddressSanitizer, supported by recent versions of GCC and Clang.
