Deadlock during static initialization - gcc

I'm running into a deadlock during static initialization on Solaris. The situation strongly resembles this user's problem.
My environment is:
Solaris 10
gcc 5.4 installed to a non-standard location
all relevant shared libraries are linked against the libstdc++ and/or libgcc_s libraries from that installation
Boost 1.45 (we're moving away from it soon, but for the moment that cannot change)
I see this problem whether I link dynamically or statically against the Boost libraries
The symptoms:
Deadlocks while executing boost::system::generic_category()
generic_category() is being called to initialize global static references in boost/system/error_code.hpp
If I shuffle the link order, putting -lboost_system ahead of the other libraries being linked in, the problem goes away (see the command sketch below).
If I set a breakpoint in generic_category() and, after it is first hit, attempt to step over the first line, the breakpoint gets hit again while the same function executes in a different shared library's _init() -- that is, it never stops on the second line of generic_category() after I told it to step over the first.
Since stepping over the first line didn't work, I stepped into it, then stepped out, and again the breakpoint got hit.
I restarted the process, stepped in after the breakpoint got hit, and began stepping. Stepping over the call to boost::system::error_category::error_category(), I ran into the same problem.
I tried again, this time stepping one instruction at a time once I reached the error_category() call. The call goes through the PLT, which calls elf_rtbndr(); that is supposed to return the real function's address in %o0, but when I step over the call to elf_rtbndr() the breakpoint is hit again instead of execution resuming where it left off.
When the breakpoint gets hit for the second time, generic_category() is being called from some other shared library's _init(); that's when the deadlock occurs.
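For concreteness, the link-order workaround looks something like this (the other library names are placeholders, not from my actual build):
g++ -o app main.o -lboost_system -lmylib1 -lmylib2 -lpthread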
Thanks in advance for your time & help.

This has been reported several times (see this post in Boost and another in GCC). It appears to be a circular-dependency issue during Boost initialization which, for some reason, only manifests on Solaris. The usual advice is to work around it by adjusting library initialization, e.g. by shuffling the link order as you did with -lboost_system.
Another option is to disable thread-safe static guards (the -fno-threadsafe-statics flag), which gets rid of the deadlock but keeps the buggy nested constructor call, which is undesirable.
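To illustrate the mechanism (a minimal, hypothetical sketch -- not Boost's actual code): with thread-safe statics, g++ guards the construction of a function-local static, and re-entering the initializer while the guard is still held -- for example, from another shared library's _init() -- blocks on that guard.

struct Category {
    Category();   // suppose construction ends up calling category() again
};

const Category& category() {
    // With -fthreadsafe-statics (the default), g++ wraps this
    // initialization in __cxa_guard_acquire/__cxa_guard_release.
    static Category instance;
    return instance;
}

Category::Category() {
    category();   // re-enters the guarded initializer; depending on the
                  // runtime this deadlocks or aborts (the standard makes
                  // recursive re-entry undefined behavior)
}

int main() {
    category();   // first call: construction re-enters and hangs/aborts
}

Compiling with -fno-threadsafe-statics removes the guard, so the re-entrant call proceeds -- which is exactly the buggy nested constructor call mentioned above.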

What is RUST_BACKTRACE supposed to tell me?

My program is panicking, so I followed its advice to run with RUST_BACKTRACE=1, and I get this (just a small snippet):
1: 0x800c05b5 - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
2: 0x800c22ed - std::panicking::default_hook::{{closure}}::h59672b733cc6a455
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:351
When the program panics it stops entirely, so how can I figure out which line it's panicking on?
Are these lines telling me there is a problem at line 42 and line 351?
The whole backtrace is in this image; I felt it would be too messy to copy and paste it here.
I've never heard of a stack trace or a backtrace before. I'm compiling with warnings enabled, but I don't know what debugging symbols are.
What is a stack trace?
If your program panics, you encountered a bug and would like to fix it; a stack trace wants to help you here. When the panic happens, you would like to know the cause of the panic (the function in which the panic was triggered). But the function directly triggering the panic is usually not enough to really see what's going on. Therefore we also print the function that called the previous function... and so on. We trace back all function calls leading to the panic up to main() which is (pretty much) the first function being called.
What are debug symbols?
When the compiler generates the machine code, it pretty much only needs to emit instructions for the CPU. The problem is that it's virtually impossible to quickly see from which Rust-function a set of instructions came. Therefore the compiler can insert additional information into the executable that is ignored by the CPU, but is used by debugging tools.
One important part are file locations: the compiler annotates which instruction came from which file at which line. This also means that we can later see where a specific function is defined. If we don't have debug symbols, we can't.
In your stack trace you can see a few file locations:
1: 0x800c05b5 - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
The Rust standard library is shipped with debug symbols. As such, we can see where the function is defined (gcc_s.rs line 42).
If you compile in debug mode (rustc or cargo build), debug symbols are activated by default. If you, however, compile in release mode (rustc -O or cargo build --release), debug symbols are disabled by default as they increase the executable size and... usually aren't important for the end user. You can tweak whether or not you want debug symbols in your Cargo.toml in a specific profile section with the debug key.
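For example, to keep debug symbols in release builds, you can add this to your Cargo.toml (the profile section and key shown are standard Cargo configuration):

[profile.release]
debug = true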
What are all these strange functions?!
When you first look at a stack trace you might be confused by all the strange function names you're seeing. Don't worry, this is normal! You are interested in what part of your code triggered the panic, but the stack trace shows all functions somehow involved. In your example, you can ignore the first 9 entries: those are just functions handling the panic and generating the exact message you are seeing.
Entry 10 is still not your code, but might be interesting as well: the panic was triggered in the index() function of Vec<T> which is called when you use the [] operator. And finally, entry 11 shows a function you defined. But you might have noticed that this entry is missing a file location... the above section describes how to fix that.
What to do with a stack trace? (tl;dr)
Activate debug symbols if you haven't already (e.g. just compile in debug mode).
Ignore any functions from std and core at the top of the stack trace.
Look at the first function you defined, find the corresponding location in your file and fix the bug.
If you haven't already, change all camelCase function and method names to snake_case to stick to the community wide style guide.

Visual Studio: Debug before main() function call

I'm having an issue where my application is failing a debug assertion (_CrtIsValidHeapPointer) before anything is even executed. I know this because I added a breakpoint on the first statement of my main function, and it fails the assertion before the breakpoint is reached.
Is there a way to somehow "step through" everything that happens before my main function is called? Things like static member initializations, etc.
I should note that my program is written in C++/CLI. I recently upgraded to VS2015 and am targeting the v140 toolset. The C++ libraries I'm using (ImageMagick, libsquish, and one of my own C++ libraries) have been tested individually, and I do not receive the assertion failure with these libraries, so it has to be my main application.
I haven't changed any of the code since I upgraded from VS2013, so I'm a little stumped on what is going on.
EDIT:
Here is the call stack. This is after I click "Retry" in the assertion failed window. I then get a multitude of other exceptions being thrown, but they are different each time I run the program.
> ucrtbased.dll!527a6853()
[Frames below may be incorrect and/or missing, no symbols loaded for ucrtbased.dll]
ucrtbased.dll!527a7130()
ucrtbased.dll!527a69cb()
ucrtbased.dll!527c8116()
ucrtbased.dll!527c7eb3()
ucrtbased.dll!527c7fb3()
ucrtbased.dll!527c84b0()
PathCreator.exe!_onexit(int (void)* const function) Line 268 + 0xe bytes C++
PathCreator.exe!atexit(void (void)* const function) Line 276 + 0x9 bytes C++
PathCreator.exe!std::`dynamic initializer for '_Fac_tidy_reg''() Line 65 + 0xd bytes C++
[External Code]
mscoreei.dll!7401cd87()
mscoree.dll!741fdd05()
kernel32.dll!76c33744()
ntdll.dll!7720a064()
ntdll.dll!7720a02f()
You have to debug the C runtime initialization code. That's not intuitive because the debugger tries hard to avoid it and get you into the main() entrypoint instead, but it is still possible: use Debug > New Breakpoint > Function Breakpoint.
Enter _initterm for the function name, Language = C.
Press F5 and the breakpoint will hit. You should see the C runtime source code. You can now single-step through the initialization functions of your program one-by-one, every call to (**it)() executes one.
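For reference, _initterm is essentially a loop over a table of function pointers. The sketch below is a simplified paraphrase, not the exact CRT source:

typedef void (__cdecl *_PVFV)(void);

// Walks the table of initializers between 'it' and 'end' and calls each
// non-null entry; each call runs one static initializer.
static void _initterm(_PVFV* it, _PVFV* end)
{
    for (; it < end; ++it)
    {
        if (*it != nullptr)
            (**it)();   // stepping into this call lands in one initializer
    }
}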
That's exactly what you asked for, but not very likely what you actually want. The odds that your code produces this error are very low. Much more likely is that one of these libraries causes the problem: they were probably built targeting another version of the C runtime library, and therefore have their own _initterm() function.
Having more than one copy of the C runtime library in a process is generally very unhealthy, and highly likely to generate heap corruption. If you can't locate it from the stack trace (be sure to change the Debugger Type from Auto to Mixed; always post the stack trace in an SO question), then the next thing you should strongly consider is rebuilding those libraries with the VS version you use.

Avoiding source-level "jumping around" in gdb

With C++ code built for debugging with g++ (i.e. options "-O0 -ggdb"), and using the newest gcc (5.1.0) and gdb (7.9), the display of source code in gdb is still painfully non-linear when using the "next" command. As an example, this function call might be expected to step through with a single "next":
7757| SDValue NewRoot = TLI->LowerFormalArguments(
7758| DAG.getRoot(), F.getCallingConv(), F.isVarArg(), Ins, dl, DAG, InVals);
however it takes four, with the displayed execution line being first 7757, then 7758, then again 7757, then again 7758. If the function call is condensed to a single line, just one "next" is needed. If the call is absurdly inflated, then seven "next"s are needed (shown as the '#' annotations):
7757| SDValue
7758| NewRoot
7759| =
#1,6 7760| TLI
7761| ->
7762| LowerFormalArguments(
#5 7763| DAG.getRoot(),
7764| F.getCallingConv(),
#3 7765| F.isVarArg(),
7766| Ins,
7767| dl,
7768| DAG,
7769| InVals
#2,4,7 7770| );
So it's related to but not as simple as "each function call on a distinct line is a stepping point". This gets especially confusing with breakpoints in recursive functions, where I find myself checking the callstack to see whether it's really a new invocation or just a phony backwards step.
Since reflowing all of the LLVM source to contain function calls in a single line isn't really a viable option, is there some gcc/gdb option for controlling this behaviour?
EDIT: I have now checked with clang 3.5 and lldb 3.5: when built with clang, only three "next"s occur. And gdb and lldb show the same "next" behaviour in either case (i.e. 4 with gcc, 3 with clang).
This sort of behavior from the debugger is a "GIGO" situation -- normally gdb is just doing whatever the debug info tells it to do, so when there is odd behavior it is generally due to decisions made by the compiler. It may be a bug, and is probably worth a bug report, but I also wouldn't be surprised if it is intended to work this way for some reason.
You can investigate these kinds of problems by using readelf or objdump to examine the line table.
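For example (the binary name is a placeholder; both commands are standard binutils):

readelf --debug-dump=decodedline ./myprog
objdump --dwarf=decodedline ./myprog

Each row of the decoded line table maps an instruction address to a file and line, so the repeated 7757/7758 entries that gdb bounces between should be visible there.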

CUDA kernel launch failure

I am trying to call two kernels as shown below
for (t = 0; t <= time_total; t++)
{
    // kernel calls
    kernel1<<<noOfBlocks, noOfThreadsPerBlock>>>(** SOME PARAMETERS **);
    checkCudaError(cudaThreadSynchronize());
    kernel2<<<noOfBlocks, noOfThreadsPerBlock>>>(** SOME PARAMETERS **);
    checkCudaError(cudaThreadSynchronize());
}
And the structure of the second kernel is
var[index+0]=**SOME CALCULATION**
var[index+1]=**SOME CALCULATION**
var[index+2]=**SOME CALCULATION**
Now when I execute this code, checkCudaError does not report anything and the code runs, giving some output, but Visual Studio reports the following exception:
First-chance exception at 0x7640c41f in **.exe: Microsoft C++ exception: cudaError_enum at memory location 0x0039f9c4..
First-chance exception at 0x7640c41f in **.exe: Microsoft C++ exception: cudaError_enum at memory location 0x0039f9c4..
And when I check in Nsight, it says kernel 2 has the following error:
CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES
Now the problem is that the var array in kernel 2 ends up with some rows correct, some rows that are copies of other rows' values, and some garbage.
Also when I do this
var[index+0]=3
var[index+1]=3
var[index+2]=3
All the values of var are set to 3
A few side notes:
cudaThreadSynchronize() is deprecated in favor of cudaDeviceSynchronize().
The fact that nsight is reporting an error on the 2nd kernel launch, but your error checking code is not, leads me to believe your error checking code is broken.
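A more thorough pattern (a sketch -- the macro name is mine, not from the question) is to check the launch itself with cudaGetLastError() and then check execution with cudaDeviceSynchronize():

#include <cstdio>
#include <cstdlib>

#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// usage after each launch:
kernel2<<<noOfBlocks, noOfThreadsPerBlock>>>(** SOME PARAMETERS **);
CHECK_CUDA(cudaGetLastError());        // catches launch failures
CHECK_CUDA(cudaDeviceSynchronize());   // catches execution failures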
Now, regarding your issue, out of resources is frequently due to code requesting too many registers (too many registers per thread times the number of threads per threadblock requested). Try re-compiling your code specifying -Xptxas -v to get verbose output, and then recompiling again with -maxrregcount 20 (or something like that) to try to work around this for test purposes.
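For example (the file name is a placeholder):

nvcc -Xptxas -v mykernel.cu
nvcc -Xptxas -v -maxrregcount 20 mykernel.cu

The first prints per-kernel resource usage (lines like "ptxas info: Used N registers"); the second caps registers per thread at 20 so you can test whether register pressure is the cause.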
If this "fixes" your problem, you may then want to consider the following:
See if there is a way you can re-order or restructure your code to reduce the register pressure
If not, then adjust your maxrregcount value upwards to approximately the highest value that will allow your code to compile and run according to the launch configurations (number of threads per block) that you care about. You may also want to benchmark your code at different levels of this setting, as it can affect occupancy. Usually if you have it set to the highest value that will compile and run, then you are limiting yourself to one threadblock per SM at execution time. This may be OK, or there may be a lower setting that is better, allowing two threadblocks per SM residency, and possibly higher performance. Only benchmarking your code will tell.

XCode/gdb loses stack when debugging over calls to dynamic library functions on iOS

I've got an iOS project that links to an external static library written in C++. The static library makes calls to functions implemented by libstdc++, which is dynamically linked. For instance, I call the initialization function for this library (let's call it foo_init()) and it immediately calls setlocale().
The static library is compiled with -g, meaning debug symbols are around for me to step into code inside the debugger. I successfully step into foo_init(). When I attempt to Step Over the call to setlocale(), XCode doesn't quite do that. It ends up in a function called dyld_stub_setlocale. This function is a single jmp instruction to perform the dynamic load & function call.
I've tried Stepping Over/In/Out of dyld_stub_setlocale, but none of them get me where I want, which is back into foo_init(). Step Over and Step In end up in stub_helpers, and Step Out acts like continue. If I try Step Over/In inside stub_helpers, XCode single-steps and the stack window displaying foo_init() changes to ??. At this point the decision tree for stepping in/out kind of explodes, so I won't go into further detail, but no combination I've tried ends up back at the line after the call to setlocale().
I am able to set a breakpoint for the line, hit continue, and have it work, but this is not a scalable solution for debugging a static library with which I am not very familiar.
Note that I tried to find a way to link libstdc++ statically instead so I could avoid the dynamic loader issues, but Apple has removed the static library from newer SDKs and I don't have the older ones.
Is there a linker or compiler option to make the code easier for the debugger to decipher?
