I took the advice given in the comments of this question, "Gfortran does not tell me what sort of FPE it is", i.e. start up GDB, set a breakpoint at that line, and inspect the values involved in the operation. For context, my program is based on Fortran 77 code (I plan to migrate it to F90 after running this "test case", an idealized CFD data test) and uses NetCDF shared libraries on Ubuntu 16.04 LTS. I use the gfortran 4.8.5 compiler (I can upgrade to 5.x if required).
This is how the program is compiled:
gfortran -Wall -O0 -c -g -fbacktrace -ffpe-trap=invalid,denormal,zero,overflow,underflow ${tool}.f ${ncdf_incs}
Then I started gdb in the directory where the program is located and typed:
break inv_cart.f:1221
which is where the FPE is occurring (a divide-by-zero error). When I do this I get this message:
Make breakpoint based on future shared library load (y/n) ?
So I searched SO for this problem and found this previous Q/A, "How to set breakpoints with shared libraries", and this is what I did:
set breakpoint pending on
break inv_cart.f:1221
UPDATE
I overlooked something. After I run break I get this error message:
No symbol table is loaded. Use the "file" command
Breakpoint 1 (inv_cart.f:1221) is pending.
END UPDATE
After I do this I get the same error I got when I ran inv_cart, whether within gdb or standalone:
Program received signal SIGFPE - arithmetic exception
followed by a memory address and a couple of question marks followed by ().
So I quit gdb, and it tells me that there is a debugging session that is still active.
So my question still remains: how do I obtain the values where the FPE is occurring?
This turned out to be a straightforward problem once I noticed the error message described in the update.
I looked up this question, "gdb no symbol table is loaded", and went ahead and did this:
file inv_cart
and finally the symbol table was loaded. To my joy, I ran the program again via gdb and was able to print the values in the line of code where the FPE was occurring.
I'm having this "issue" with gcc and gdb, which by itself isn't a real problem but it still annoys me and I want to understand why it's happening and how to solve it. First I want to apologize because English is not my native language.
tl;dr: When I debug a file compiled with the MSYS2 MinGW-w64 gcc and I get to the last line of main and click 'Step Over' (in VS Code) or type the 'next' command (running gdb in the shell), I get an error indicating that the file 'crtexe.c' cannot be opened or found. It doesn't cause me any trouble but it's annoying. Also, it doesn't happen when the official MinGW-w64 gcc compiler is used instead.
To give you some context, I'm taking Harvard's CS50 course, but I always want to dig deeper and end up spending much more time on topics not covered by the course itself, so now I'm on Windows 10 with MSYS2, MinGW-w64, and VS Code installed. In the beginning, I started with only the MinGW-w64 that I downloaded from the official website, but then I realized that its gcc was outdated and that installing libraries was quite complicated. So after some Google searches, I discarded the 'official' MinGW-w64 and ended up with MSYS2 and the MinGW-w64 built by them. I already had tasks.json, launch.json, and c_cpp_properties.json set up for VS Code, so I only changed the paths to MSYS2's gcc and gdb and I was good to go.
But now I've noticed an error that wasn't happening before with the 'official' version of MinGW-w64. When I'm debugging a program (as simple as a 'helloworld') and I get to the last line of main (the final curly bracket) and click 'Step Over', an error message appears in VS Code saying that the file 'crtexe.c' cannot be opened or found.
I need to press 'Step Over' again (and receive the same error message again) two more times to finally end the program.
At first, I thought it was VS Code's fault, so I ran gdb directly from the shell and stepped over the code with the 'next' command, and I got the same error at the end:
(gdb) next
Hello world!6 }
(gdb) next
__tmainCRTStartup ()
at D:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:337
337 D:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c: No such file or
directory.
(gdb) next
338 in D:/mingwbuild/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c
(gdb) next
[Thread 4232.0x1a94 exited with code 0]
[Inferior 1 (process 4232) exited normally]
That made me think gdb was the one causing the problem. But finally, after testing with both gcc and gdb from both the official MinGW-w64 and MSYS2's MinGW-w64, I concluded that the one with the issue was the MSYS2 MinGW-w64 gcc. I can compile with the official MinGW-w64 gcc and debug with the MSYS2 gdb and it works fine. But in reverse, if I compile with the MSYS2 MinGW-w64 gcc and debug with the official MinGW-w64 gdb, the problem appears again.
When I compile using the official MinGW-w64 gcc and then debug it, the final gdb lines are these:
Hello world!6 }
(gdb) next
0x00000000004013c7 in __tmainCRTStartup ()
(gdb) next
Single stepping until exit from function __tmainCRTStartup,
which has no line number information.
[Thread 9436.0x1748 exited with code 0]
[Inferior 1 (process 9436) exited normally]
which doesn't translate into an error message in VS Code.
As I understand it, that function (__tmainCRTStartup) is the one that starts every C program and also ends the process when it's over. I know I can simply ignore that error. But I hate error messages hehe. Besides, if I'm stepping over the code, why does the debugger try to step into that function's source code? I'd understand if I were trying to step into it, but that's not the case. Why is this happening and what can I do to fix it (besides clicking 'Continue' instead of 'Step Over' when I'm at the end of main)?
Thank you!
TLDR: Getting fatal error 'failed to get process times' on a cross-native build of gcc. Can I remove the report_times code from gcc.c, OR use a gcc command line option to disable report_times, OR build gcc without libiberty (which contains pex_get_times, used by report_times)?
DETAIL
After beating my head against various problems I've (finally) successfully used the Android NDK standalone toolchain to build binutils 2.23 and gcc 4.7.0.
My current problem is getting it to run on my device.
I've written a standard 'hello world' (copied from here) to test gcc on my device. When I run:
arm-linux-eabi-gcc hello.c -o hello
or:
arm-linux-eabi-gcc hello.c
I get the following error:
arm-linux-eabi-gcc: fatal error: failed to get process times: No such file or directory.
Google did not return much except for links to gcc.c source. Examining the source, I found the error in a function (module? extension?) called report_times. The error is returned by the function (module? extension?) pex_get_times....I'm guessing it does so if it can't get the process times.
The pex_get_times function (module? extension? I'm not sure what it is) is defined in libiberty. I can use --disable-build-libiberty, but it doesn't help for the host (my NookHD) gcc build.
My question(s):
Can this portion of gcc.c be safely (and easily) removed...i.e. the report_times function and everything associated with it?
or
Is there a command line option to tell arm-linux-eabi-gcc NOT to use report_times?
or
Is there a way to disable build of libiberty for host/target for both gcc and binutils, and would that fix the error?
As always...I'll keep researching while awaiting an answer.
Found this about an hour after posting this question. Maybe two.
Apparently report_times is part of debugging symbols (?) for GCC. To exclude report_times (which causes the 'failed to get process times' error from the original question) you have to build the non-debug, i.e. release, version of gcc.
To do this, I used info from this link: http://www-gpsg.mit.edu/~simon/gcc_g77_install/build.html
BUT, I omitted the -g from the LIBCXXFLAGS and LIBCFLAGS and I added LIBCPPFLAGS without -g just in case. Ran make DESTDIR=/staging/install/path install-host, tarballed and transferred to device. No more 'failed to get process times' error.
I am seeing another error, but it is not related to this question.
I want to detect stack overflow or corruption in my code. Hence, I wrote a small program where a stack overflow is simulated. I compiled it using the command:
gcc overflow.c -g -fstack-protector-all
However, upon executing the binary I got a segmentation fault but no other information.
Can anybody please help me see where I went wrong?
If ulimit -c is set to a value much bigger than zero, a core dump named core is written; you can see the backtrace by running gdb program core and then typing backtrace at the prompt.
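As a rough sketch of the same idea done from inside the program (this is an illustration added here, not part of the original answer, and it assumes a POSIX system such as Linux): the limit that ulimit -c controls can also be raised at startup with getrlimit and setrlimit. The function name enable_core_dumps is made up for this example.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit to the hard
 * limit, roughly what `ulimit -c` does in the shell.
 * Returns 0 on success and -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    /* Read the current soft and hard limits for core files. */
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Calling enable_core_dumps() at the top of main, before the code that crashes, should have the same effect as setting ulimit -c in the shell, provided the hard limit is not zero. Where the core file ends up and what it is called still depends on the system configuration.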
I wrote a CUDA application that has some hardcoded parameters in it (via #defines). Everything seemed to work right, so I tried some other parameters. Now, the program doesn't work correctly anymore.
So, I want to debug it. I compile the application with -deviceemu -g -O0 options, because I read that I can then use gdb to debug it. In gdb, I set a breakpoint at the kernel start using break kernelstart.
However, gdb jumps to the start of my CUDA kernel, but I cannot step through it, because it doesn't let me inspect things within the kernel. I think it's best if I give the output of gdb:
Breakpoint 1, kernelstart (__cuda_0=0x100000, __cuda_1=0x101000, __cuda_2=0x102000, __cuda_3=0x102100) at cudatest.cu:287
(gdb) s
__device_stub__Z12kernelstartPjS_S_S_ (__par0=0x100000, __par1=0x101000, __par2=0x102000, __par3=0x102100) at /tmp/tmpxft_000003c4_00000000-1_cudatest.cudafe1.stub.c:7
7 /tmp/tmpxft_000003c4_00000000-1_cudatest.cudafe1.stub.c: No such file or directory.
in /tmp/tmpxft_000003c4_00000000-1_cudatest.cudafe1.stub.c
(gdb) s
cudaLaunch<char> (entry=0x804a98d "U\211\345\203\354\030\213E\024\211D$\f\213E\020\211D$\b\213E\f\211D$\004\213E\b\211\004$\350\r\377\377\377\311\303U\211\345\203\354\070\307\004$\340 \005\b\350\345\341\377\377\243P!\005\b\307\004$x\234\004\b\350\b\001") at /usr/local/cuda/bin/../include/cuda_runtime.h:773
(gdb) s
(gdb) s
cudatest (__cuda_0=0x100000, __cuda_1=0x101000, __cuda_2=0x102000, __cuda_3=0x102100) at cudatest.cu:354
(gdb) s
After this, it jumps back to my main procedure.
I know that my specifications are more than vague, but can anybody guess where the problem is? Is it possible to inspect kernels using gdb?
Use cuda-gdb
Compile: nvcc -g -G filename.cu
Invoke cuda-gdb on your a.out
You can set breakpoint inside your kernel function as usual.
Run the program, and it should stop inside your kernel function.
You can even get details of the current thread which is being executed using commands like cuda thread. Other commands like cuda block exist.
To switch between threads say cuda thread (x,y,z)
For more details, refer to the latest version of cuda-gdb's documentation. If you are using the latest version of the CUDA toolkit (i.e., 3.2 as of today), make sure you are looking at the latest version of the documentation (as the options have changed a lot).
And also make sure you are running cuda-gdb from a console (outside X11), since you are stopping your GPU for debugging.
Hope this helps.
Compiling with:
nvcc -g -G --keep
fixed this problem for me. This ensures all the intermediate files generated during compilation are not erased so that the debugger can find them.
I have a compiled .exe file (compiled with gfortran and -g option) that crashes. I can attach the WinDBG program to it using the WinDBG -I command.
Funnily enough, it generates a stack overflow:
(38f0.2830): Stack overflow - code c00000fd (!!! second chance !!!)
However, the output says that there is no debugging information in my program. It tries to search for either .dbg or .pdb files but they are not there. I would assume debugging information is included in the executable (coming from a unix-background).
Debug formats are compiler specific, so you need to use a debugger that understands the format produced by your compiler. Since by gfortran I assume you mean GNU Fortran, this would be the GNU debugger, gdb.
I circumvented the problem by starting the program via gdb. This way, gdb reports the error when the program crashes and you can issue the backtrace command.
It's not perfect, so I'm open to better solutions, but this works for now.