What is RUST_BACKTRACE supposed to tell me? - debugging

My program is panicking so I followed its advice to run RUST_BACKTRACE=1 and I get this (just a little snippet).
1: 0x800c05b5 - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
2: 0x800c22ed - std::panicking::default_hook::{{closure}}::h59672b733cc6a455
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:351
If the program panics it stops the whole program, so how can I figure out which line it's panicking on?
Is this line telling me there is a problem at line 42 and line 351?
The whole backtrace is in this image; I felt it would be too messy to copy and paste it here.
I've never heard of a stack trace or a back trace. I'm compiling with warnings, but I don't know what debugging symbols are.

What is a stack trace?
If your program panics, you encountered a bug and would like to fix it; a stack trace wants to help you here. When the panic happens, you would like to know the cause of the panic (the function in which the panic was triggered). But the function directly triggering the panic is usually not enough to really see what's going on. Therefore we also print the function that called the previous function... and so on. We trace back all function calls leading to the panic up to main() which is (pretty much) the first function being called.
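To make the idea of "tracing back" concrete, here is a minimal sketch that records the call stack from inside a nested call. The function names load_config and parse_value are hypothetical and only serve as a call chain; std::backtrace::Backtrace is the standard library type (stable since Rust 1.65).

```rust
use std::backtrace::Backtrace;

// Hypothetical call chain: main -> load_config -> parse_value.
fn parse_value() -> String {
    // force_capture() records the current call stack regardless of
    // whether RUST_BACKTRACE is set in the environment.
    Backtrace::force_capture().to_string()
}

fn load_config() -> String {
    parse_value()
}

fn main() {
    // Prints one numbered entry per active function call, innermost first.
    println!("{}", load_config());
}
```

In a debug build the printed entries should include parse_value, load_config and main, surrounded by runtime functions from std, which is the same shape as the trace printed on a panic.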
What are debug symbols?
When the compiler generates the machine code, it pretty much only needs to emit instructions for the CPU. The problem is that it's virtually impossible to quickly see from which Rust-function a set of instructions came. Therefore the compiler can insert additional information into the executable that is ignored by the CPU, but is used by debugging tools.
One important part is file locations: the compiler annotates which instruction came from which file at which line. This also means that we can later see where a specific function is defined. If we don't have debug symbols, we can't.
In your stack trace you can see a few file locations:
1: 0x800c05b5 - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
The Rust standard library is shipped with debug symbols. As such, we can see where the function is defined (gcc_s.rs line 42).
If you compile in debug mode (rustc or cargo build), debug symbols are activated by default. If you, however, compile in release mode (rustc -O or cargo build --release), debug symbols are disabled by default as they increase the executable size and... usually aren't important for the end user. You can tweak whether or not you want debug symbols in your Cargo.toml in a specific profile section with the debug key.
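As a sketch of what that looks like, the following Cargo.toml fragment turns debug symbols on for release builds. It assumes you want to keep optimizations but still get file locations in backtraces; adjust the profile name to the one you actually build with.

```toml
# Cargo.toml
[profile.release]
debug = true   # emit debug symbols even in release builds
```

The same key under [profile.dev] controls debug builds, where it is already enabled by default.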
What are all these strange functions?!
When you first look at a stack trace you might be confused by all the strange function names you're seeing. Don't worry, this is normal! You are interested in what part of your code triggered the panic, but the stack trace shows all functions somehow involved. In your example, you can ignore the first 9 entries: those are just functions handling the panic and generating the exact message you are seeing.
Entry 10 is still not your code, but might be interesting as well: the panic was triggered in the index() function of Vec<T> which is called when you use the [] operator. And finally, entry 11 shows a function you defined. But you might have noticed that this entry is missing a file location... the above section describes how to fix that.
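The kind of bug described here can be reproduced with a small sketch. The functions third_element and third_element_checked are hypothetical stand-ins for your own code; the point is only that the [] operator on a Vec<T> goes through index() and panics when the index is out of bounds, while get() returns an Option instead.

```rust
// Hypothetical function: takes &Vec<i32> (rather than a slice) so that
// the [] operator goes through the Index implementation of Vec<T>.
fn third_element(values: &Vec<i32>) -> i32 {
    values[2] // panics if values.len() <= 2
}

// Non-panicking variant: get() returns None for an out-of-bounds index.
fn third_element_checked(values: &Vec<i32>) -> Option<i32> {
    values.get(2).copied()
}

fn main() {
    let values = vec![10, 20];
    println!("{:?}", third_element_checked(&values)); // prints "None"

    // catch_unwind is used here only so the demo keeps running after the
    // panic; the panic message is still printed to stderr.
    let result = std::panic::catch_unwind(|| third_element(&vec![10, 20]));
    println!("panicked: {}", result.is_err()); // prints "panicked: true"
}
```

Running this with RUST_BACKTRACE=1 in a debug build should show third_element a few entries below the panic-handling functions from std and core, together with its file location.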
What to do with a stack trace? (tl;dr)
Activate debug symbols if you haven't already (e.g. just compile in debug mode).
Ignore any functions from std and core at the top of the stack trace.
Look at the first function you defined, find the corresponding location in your file and fix the bug.
If you haven't already, change all camelCase function and method names to snake_case to stick to the community wide style guide.

Related

Wrong line number in stack-trace in debugging Linux kernel using kgdb

I am trying to debug a driver for an Ethernet-MAC in the Linux kernel using kgdb over serial.
I halt the execution by making a call to "kgdb_breakpoint()" at the desired location in the code and recompile the kernel.
But after the code halts, as you may notice in the screenshot, the backtrace shows the correct function graph and source filenames, but for some reason the corresponding line numbers are not correct.
Please note: I have compiled this kernel with "CONFIG_FRAME_POINTER" set and "nokaslr" in the boot-args.
Is there a way I can see a stack trace with correct line numbers here?
(I have used QtCreator during this screenshot, although behavior is similar with gdb over command-line or TUI)
Edit: No matter which function (inside the driver source) I put the `kgdb_breakpoint()` in, the stack trace always shows the same line number for the halted function.

When we get a runtime error in a Swift project, why does Xcode send us to thread output in assembly language? What's the point?

As you know, when something goes wrong while running a Swift project in Xcode, we are directed to the thread section of the debug navigator and faced with some assembly code like this:
I am wondering whether there is any reference, tutorial, or tool for understanding this code. There should be a reason that we are directed to it.
Let me be clear: I know how to fix the errors, but it bothers me when I do not understand something like this. I want to know what this code is and how we can use it, or at least understand it.
Thanks :)
Original question: what language is that? That's AT&T syntax assembly language for x86-64. See https://stackoverflow.com/tags/x86/info for manuals from Intel and other resources, and https://stackoverflow.com/tags/att/info for how AT&T syntax differs from the Intel syntax used in most manuals. (I think the x86 tag wiki has a few AT&T syntax tutorials.) Most AT&T-syntax disassemblers have an Intel-syntax mode, too, so you can use that if you want asm that matches Intel's manuals.
What's the point?
The point is so you can debug your program if you know asm. Or you can show the asm to someone who does understand it, or include it in a bug report.
Did you compile without debug symbols? Or did it crash in library code without symbols? It's normal for debuggers to show you asm if they can't show you source, or if you ask for asm.
If you have debug symbols for your own code, you can at least backtrace into parent functions for which you do have source. (Unless the stack is corrupted.)
Did your program fault on that instruction highlighted in pink? That's a bit odd, since it's loading from static data (a RIP-relative load means the address is a link-time constant).
Did you maybe munmap or mprotect that page of your program's data or text segment so a load would fault? Normally you only get faults when an addressing mode involves a pointer.
(The call *0x1234(%rip) right before it is calling through a function pointer, though. The function pointer is stored in memory, but code-fetch after the call executes would fault if it was pointing to an unmapped or non-executable page.) But your first image shows you got a SIGABRT, not SIGSEGV, so it looks more like the program aborted on purpose after failing an assertion.
I believe majority of swift coders don't know asm
There's nothing more useful a debugger can do without debug symbols and source files.
Also keep in mind that the majority of debugger authors do know asm, so for them it is an obviously-useful feature / behaviour. They know that many people won't be able to benefit from it, but that some will.
Asm is what's really running on the machine. Without asm, you couldn't find wrong-code compiler bugs, etc. etc. As far as software bugs, there is no lower level than asm, so it's not some arbitrary choice of some lower-level layer to stop at.
(Unless there's also a bug in your disassembler or debugger, in which case you need to check the hex machine code.)

Visual Studio: Debug before main() function call

I'm having an issue where my application is failing a debug assertion (_CrtIsValidHeapPointer) before anything is even executed. I know this because I added a breakpoint on the first statement of my main function, and it fails the assertion before the breakpoint is reached.
Is there a way to somehow "step through" everything that happens before my main function is called? Things like static member initializations, etc.
I should note that my program is written in C++/CLI. I recently upgraded to VS2015 and am targeting the v140 toolset. The C++ libraries I'm using (ImageMagick, libsquish, and one of my own C++ libraries) have been tested individually, and I do not receive the assertion failure with these libraries, so it has to be my main application.
I haven't changed any of the code since I upgraded from VS2013, so I'm a little stumped on what is going on.
EDIT:
Here is the call stack. This is after I click "Retry" in the assertion failed window. I then get a multitude of other exceptions being thrown, but they are different each time I run the program.
> ucrtbased.dll!527a6853()
[Frames below may be incorrect and/or missing, no symbols loaded for ucrtbased.dll]
ucrtbased.dll!527a7130()
ucrtbased.dll!527a69cb()
ucrtbased.dll!527c8116()
ucrtbased.dll!527c7eb3()
ucrtbased.dll!527c7fb3()
ucrtbased.dll!527c84b0()
PathCreator.exe!_onexit(int (void)* const function) Line 268 + 0xe bytes C++
PathCreator.exe!atexit(void (void)* const function) Line 276 + 0x9 bytes C++
PathCreator.exe!std::`dynamic initializer for '_Fac_tidy_reg''() Line 65 + 0xd bytes C++
[External Code]
mscoreei.dll!7401cd87()
mscoree.dll!741fdd05()
kernel32.dll!76c33744()
ntdll.dll!7720a064()
ntdll.dll!7720a02f()
You have to debug the C runtime initialization code. Not intuitive to do because the debugger tries hard to avoid it and get you into the main() entrypoint instead. But still possible, use Debug > New Breakpoint > Function Breakpoint.
Enter _initterm for the function name, Language = C.
Press F5 and the breakpoint will hit. You should see the C runtime source code. You can now single-step through the initialization functions of your program one-by-one, every call to (**it)() executes one.
That's exactly what you asked for. But not very likely what you actually want. The odds that your code produces this error are very low. Much more likely is that one of these libraries causes this problem. They are likely to be built targeting another version of the C runtime library. And therefore have their own _initterm() function.
Having more than one copy of the C runtime library in a process is generally very unhealthy. And highly likely to generate heap corruption. If you can't locate it from the stack trace (be sure to change the Debugger Type from Auto to Mixed, always post the stack trace in an SO question) then the next thing you should strongly consider is rebuilding those libraries with the VS version you use.

Why does eclipse debugger only show 1 or 2 lines of the stack followed by 0x0?

On Linux I get nice, healthy, full stack traces. On Windows, however, when something crashes (like a segfault), I only get the top one or two lines of the stack, followed by the entry 0x0 (which I cannot expand). This makes it very hard to debug.
You should probably start using WinDBG to debug your program instead of an IDE like Eclipse. It is a very powerful command-line tool and its functionality is very similar to GDB.
On Windows, "UnhandledExceptionFilter" function is called when no exception handler is defined to handle the exception that is raised. The function typically passes the exception up to the Ntdll.dll file, which catches and tries to handle it.
The EXCEPTION_POINTERS structure, which is passed as one of the parameters of the above function, contains the most useful information about what the exception is and where it occurred. This information is used by the .exr and .cxr commands in WinDBG to get the complete stack trace.
typedef struct _EXCEPTION_POINTERS {
    PEXCEPTION_RECORD ExceptionRecord;
    PCONTEXT ContextRecord;
} EXCEPTION_POINTERS, *PEXCEPTION_POINTERS;
ExceptionRecord: A pointer to an EXCEPTION_RECORD structure that contains a machine-independent description of the exception.
ContextRecord: A pointer to a CONTEXT structure that contains a processor-specific description of the state of the processor at the time of the exception.
For complete steps on how to get the full backtrace and analysis from a dump file (like GDB) or a debug session, you may want to read and follow the steps in the following link:
http://support.microsoft.com/kb/313109

How to debug stack-overwriting errors with Valgrind?

I just spent some time chasing down a bug that boiled down to the following. Code was erroneously overwriting the stack, and I think it wrote over the return address of the function call. Following the return, the program would crash and stack would be corrupted. Running the program in valgrind would return an error such as:
vex x86->IR: unhandled instruction bytes: 0xEA 0x3 0x0 0x0
==9222== valgrind: Unrecognised instruction at address 0x4e925a8.
I figure this is because the return jumped to a random location, containing stuff that was not valid x86 opcodes. (Though I am somehow suspicious that this address 0x4e925a8 happened to be in an executable page. I imagine valgrind would throw a different error if this wasn't the case.)
I am certain that the problem was of the stack-overwriting type, and I've since fixed it. Now I am trying to think how I could catch errors like this more effectively. Obviously, valgrind can't warn me if I rewrite data on the stack, but maybe it can catch when someone writes over a return address on the stack. In principle, it can detect when something like 'push EIP' happens (so it can flag where the return addresses are on the stack).
I was wondering if anyone knows whether Valgrind, or anything else, can do that? If not, can you comment on other suggestions for debugging errors of this type efficiently?
If the problem happens deterministically enough that you can point out a particular function that has its stack smashed (in one repeatable test case), you could, in gdb:
Break at entry to that function
Find where the return address is stored (on x86 it's relative to %ebp, which keeps the value of %esp at function entry; I am not sure whether there is any offset).
Add a watchpoint on that address. You have to issue the watch command with the calculated number, not an expression, because with an expression gdb would try to re-evaluate it after each instruction instead of setting up a trap, and that would be extremely slow.
Let the function run to completion.
I have not yet worked with the python support available in gdb7, but it should allow automating this.
In general, Valgrind detection of overflows in stack and global variables is weak to non-existent. Arguably, Valgrind is the wrong tool for that job.
If you are on one of the supported platforms, building with -fmudflap and linking with -lmudflap will give you much better results for these kinds of errors. Additional docs here.
Update:
Much has changed in the 6 years since this answer. On Linux, the tool to find stack (and heap) overflows is AddressSanitizer, supported by recent versions of GCC and Clang.

Resources