File doesn't open in Fortran [duplicate] - debugging

I get the following error when I execute Fortran code compiled with gfortran. The error is followed by a backtrace pointing to memory locations.
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x2b2f8e39da2d in ???
#1 0x2b2f8e39cc63 in ???
#2 0x311823256f in ???
#3 0x311827a7be in ???
#4 0x2b2f8e39cff2 in ???
#5 0x2b2f8e4adde6 in ???
#6 0x2b2f8e4ae047 in ???
#7 0x2b2f8e4a62d7 in ???
#8 0x40482a in instrument_
at /home/user/model/instrument.f:90
#9 0x406c1e in funcdet
at /home/user/model/funcsynth.f:67
#10 0x406c98 in main
at /home/user/model/funcsynth.f:78
Segmentation fault (core dumped)
I would like to know where the first instance of the error arises - is it the first line of the backtrace or the last line? I would also appreciate strategies that might help me debug the issue.
Update:
Following the backtrace, line 90 of instrument.f opens a file like so:
out_file3 = 'new_file'
OPEN(unit=3,file=out_file3,status='unknown')
To determine the underlying issue, I've incorporated error checking like this:
OPEN(unit=3,file=out_file3,status='unknown',iostat=io_status, err=100)
100 write(STDOUT,*) 'io status=', io_status
The code exits with the error:
Error: Invalid value for ERR specification at (1)
How do I determine the appropriate value for the ERR specification? This led me to suspect that unit=3 might be the cause of the error. I've changed the value for unit, but every time I get the "Segmentation fault (core dumped)" error.
Update 2:
OPEN(unit=3,file=out_file3,status='unknown',err=17)
17 write(STDOUT,*) 'Problem'
I still get the Segmentation fault (core dumped) error at the line corresponding to OPEN... I can only guess that unit=3 is the root cause of the problem.
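For reference, a minimal sketch of checking the OPEN with iostat alone, so that no ERR= label is needed at all; the unit number and file name here are just placeholders:

      program open_check
      implicit none
      integer io_status
      character*280 out_file3
      out_file3 = 'new_file'
! iostat returns 0 on success and a nonzero code on failure,
! so the failure can be handled without any ERR= label
      open(unit=3, file=out_file3, status='unknown', iostat=io_status)
      if (io_status .ne. 0) then
         write(*,*) 'open failed, iostat = ', io_status
         stop
      end if
      write(3,*) 'Test'
      close(3)
      end program open_check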
Update 3
Attempt at a self-sufficient example:
character*280 DIR,testfile,finalfile,outfile
DIR = '/storage/work/user/'
testfile = 'test.dat'
CALL getenv(DIR,outfile)
CALL sappend(outfile,testfile,finalfile)
OPEN(unit=3,file=finalfile,status='new')
write(3,*) 'Test'
END

Related

Code not running in Hackerrank though it is running in onlinegdb

The task is to write code in C++ for hashing with chaining (using a singly linked list). Here, the data is provided as an array of long values, and we have to store them in a hash table (size 13) in sorted order.
Here is my code for the same.
https://onlinegdb.com/B1pbgjxAI
There is no compiler error in the code, but while running it the following error arises.
*** buffer overflow detected ***: ./Solution terminated
Reading symbols from Solution...done.
[New LWP 86657]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./Solution'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig#entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
To enable execution of this file add
add-auto-load-safe-path /usr/local/lib64/libstdc++.so.6.0.25-gdb.py
line to your configuration file "//.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "//.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
Here, for the testcase, input is
201911169
and the output should be
93
In line 36 you're calling strcpy(p->name,Name), but Name is passed from x in main, and char x[4] isn't null-terminated as you only assign to x[j] for j from 2 downto 0. Add a statement x[3] = '\0';.
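To illustrate the fix, a self-contained sketch with hypothetical names (the real code lives at the onlinegdb link above):

#include <cstring>
#include <iostream>

struct Node { char name[8]; };

int main() {
    char x[4];
    // the original loop fills x[2] down to x[0] and never touches x[3]
    x[0] = '1'; x[1] = '6'; x[2] = '9';
    x[3] = '\0';                 // without this, strcpy reads past the array
    Node p;
    std::strcpy(p.name, x);      // safe now: the source is null-terminated
    std::cout << p.name << '\n';
    return 0;
}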

auto detect segmentation fault and re-run if segmentation fault occurs

I have a script get_new_GMC.sh that contains this line:
./casm.x < input_to_casm.txt
However, sometimes we might get a segmentation fault (and sometimes we do not)... Sometimes it would have the error:
get_new_GMC.sh: line 156: 72577 Segmentation fault: 11 ./casm.x <
input_to_casm.txt
How could I change my script so that whenever a segmentation fault occurs, I would just rerun ./casm.x < input_to_casm.txt again? Thank you.
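One possible approach is to loop until the run does not end in a segmentation fault (a sketch, assuming bash, where a process killed by SIGSEGV exits with status 128+11=139):

# keep re-running casm.x until it does not die with SIGSEGV (exit status 139)
while true; do
    ./casm.x < input_to_casm.txt
    status=$?
    if [ "$status" -ne 139 ]; then
        break                     # finished normally, or failed for another reason
    fi
    echo "Segmentation fault detected, re-running casm.x" >&2
done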

Freeswitch terminates with signal SIGSEGV, Segmentation fault

While FreeSWITCH is running, one client calls another; when the second client picks up the call, the code dumps core and throws the following error.
I am trying to allocate the memory for that function, but nothing happens.
Core was generated by `./freeswitch'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fb18e95a771 in H245NegLogicalChannels::FindChannelBySession (
this=<optimized out>, rtpSessionId=rtpSessionId#entry=0,
fromRemote=fromRemote#entry=false, anyState=anyState#entry=false)
at /root/opal/src/h323/h323neg.cxx:1097
warning: Source file is more recent than executable.
1097 if (channel != NULL && (rtpSessionId == 0 || channel->GetSessionID() == rtpSessionId) &&
(gdb)

XIO: fatal IO error 11 caused by 32-bit libxcb

Yes, this question has been asked before, but reading the answers didn't enlighten me much.
I wrote a C program that crashes after a few days of use. An important point is that it does NOT generate a core file, even though everything is set up so that it should (core_pattern, ulimit -c unlimited, etc.; I can trigger a core dump fine with kill -SIGQUIT).
The program logs extensively what it does, but there's no hint about the crash in the log.
The only message displayed at the crash (or before?) is:
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
after 2322 requests (2322 known processed) with 0 events remaining.
So two questions:
- how is it possible for a program to crash (returning $? = 1) without a core dump?
- what is this error message about, and what can I do?
System is RedHat Enterprise 6.4
Edit:
I managed to force a core dump by calling abort() from inside an atexit() callback:
(gdb) bt
#0 0x00bc8424 in __kernel_vsyscall ()
#1 0x0085a861 in raise () from /lib/libc.so.6
#2 0x0085c13a in abort () from /lib/libc.so.6
#3 0x0808f5cf in Unexpected () at MyCode.c:1378
#4 0x0085de9f in exit () from /lib/libc.so.6
#5 0x00c85701 in _XDefaultIOError () from /usr/lib/libX11.so.6
#6 0x00c85797 in _XIOError () from /usr/lib/libX11.so.6
#7 0x00c84055 in _XReply () from /usr/lib/libX11.so.6
#8 0x00c68b8f in XGetImage () from /usr/lib/libX11.so.6
#9 0x004fd6a7 in ?? () from /usr/local/lib/libcvi.so
#10 0x00478ad5 in ?? () from /usr/local/lib/libcvi.so
...
#29 0x001eed9d in ?? () from /usr/local/lib/libcvi.so
#30 0x001eee41 in RunUserInterface () from /usr/local/lib/libcvi.so
#31 0x0808fab4 in main (argc=2, argv=0xbfbdc984) at MyCode.c:1540
Can anyone enlighten me as to this X11 problem? libcvi.so is not mine; only MyCode.c is (LabWindows/CVI).
Edit 2014-12-05:
Here's an even more precise backtrace. Things definitely happen in X11, but I'm no X11 programmer, so looking at the X source code at the lines indicated tells me only that the X server (?) is temporarily unavailable. Is there any way to simply tell it to ignore this error if it's only temporary?
#4 0x00965eaf in __run_exit_handlers (status=1) at exit.c:78
#5 exit (status=1) at exit.c:100
#6 0x00c356b1 in _XDefaultIOError (dpy=0x88aeb80) at XlibInt.c:1292
#7 0x00c35747 in _XIOError (dpy=0x88aeb80) at XlibInt.c:1498
#8 0x00c340a6 in _XReply (dpy=0x88aeb80, rep=0xbf82fa90, extra=0, discard=0) at xcb_io.c:708
#9 0x00c18c0f in XGetImage (dpy=0x88aeb80, d=27263845, x=0, y=0, width=60, height=20, plane_mask=4294967295, format=2) at GetImage.c:75
#10 0x005f46a7 in ?? () from /usr/local/lib/libcvi.so
Corresponding lines:
XlibInt.c: _XDefaultIOError()
1292: exit(1);
XlibInt.c: _XIOError
1498: _XDefaultIOError(dpy);
xcb_io.c: _XReply()
708: if(!reply) _XIOError(dpy);
GetImage.c: XGetImage()
74: if (_XReply (dpy, (xReply *) &rep, 0, xFalse) == 0 || ...
OK, I finally found the cause (thanks to someone at National Instruments), a better diagnostic and a workaround.
The bug is in many versions of libxcb and is a 32-bit counter rollover problem that has been known for a few years: https://bugs.freedesktop.org/show_bug.cgi?id=71338
Not all versions of libxcb are affected: libxcb-1.9-5 has it, libxcb-1.5-1 doesn't. According to the bug report, 64-bit OSes shouldn't be affected, but I managed to trigger it on at least one version.
Which brings me to a better diagnostic. The following program will crash in less than 15 minutes on affected libraries (better than the entire week it previously took):
// Compile with: gcc test.c -lX11 && time ./a.out
#include <X11/Xlib.h>

int main(void) {
    Display *d = XOpenDisplay(NULL);
    if (d)
        for (;;)
            XNoOp(d);   /* hammer the connection until the 32-bit request counter rolls over */
    return 0;
}
And one final thing: the above program compiled and run on a 64-bit system works fine, and compiled and run on an old 32-bit system also works fine, but if I transfer the 32-bit binary to the 64-bit system, it crashes after a few minutes.
I just had a program that acted exactly like this, with exactly the same error message. I would expect the counter error to process 2^32 events before crashing.
The program was structured so that a worker thread had a separate X connection, which it used to send messages to the X thread to update the window.
In the end I traced the problem down to a place where the function sending the redraw events to the window was called by multiple threads, without a mutex on it, and since Xlib calls on the same X connection are not re-entrant, it crashed with this exact error. I put a mutex on the function and have had no problems since.
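A minimal sketch of that kind of fix (hypothetical names, assuming POSIX threads; the point is only that every call touching the shared Display goes through one lock):

#include <pthread.h>
#include <X11/Xlib.h>

/* one mutex guarding every use of the shared Display connection */
static pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

static void send_redraw_event(Display *dpy, Window win)
{
    pthread_mutex_lock(&x_lock);             /* serialise: Xlib on one connection is not re-entrant */
    XClearArea(dpy, win, 0, 0, 0, 0, True);  /* exposures=True generates an Expose event, forcing a redraw */
    XFlush(dpy);
    pthread_mutex_unlock(&x_lock);
}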

Ruby/Glibc coredump (double free or corruption)

I am using a distributed continuous-integration tool which I have written myself in Ruby. It uses a fork of Mike Perham's "politics" for distributing the tasks. The "politics" module uses threads for the mDNS part.
Every now and then I encounter a core dump which I don't understand:
*** glibc detected *** ruby: double free or corruption (fasttop): 0x086d8600 ***
======= Backtrace: =========
/lib/libc.so.6[0xb7cef494]
/lib/libc.so.6[0xb7cf0b93]
/lib/libc.so.6(cfree+0x6d)[0xb7cf3c7d]
/usr/lib/libruby18.so.1.8[0xb7e8adf8]
/usr/lib/libruby18.so.1.8(ruby_xmalloc+0x85)[0xb7e8b395]
/usr/lib/libruby18.so.1.8[0xb7e5065e]
...
/usr/lib/libruby18.so.1.8[0xb7e717f4]
/usr/lib/libruby18.so.1.8[0xb7e74296]
/usr/lib/libruby18.so.1.8(rb_yield+0x27)[0xb7e7fb57]
======= Memory map: ========
...
I am running on Gentoo and have rebuilt Ruby and glibc with "-ggdb" and turned off stripping to get a meaningful core:
...
Core was generated by `ruby /home/develop/dcc/bin/dcc-worker'.
Program terminated with signal 6, Aborted.
#0 0xb7f20410 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7f20410 in __kernel_vsyscall ()
#1 0xb7cacb60 in *__GI___open_catalog (cat_name=0x6 <Address 0x6 out of bounds>, nlspath=0xbf9d6f00 " ", env_var=0x0, catalog=0x1) at open_catalog.c:237
#2 0xb7cae498 in __sigdelset (set=0x6) from /lib/libc.so.6
#3 *__GI_sigfillset (set=0x6) at ../signal/sigfillset.c:42
#4 0xb7ce952d in freopen64 (filename=0x2 <Address 0x2 out of bounds>, mode=0xb7db02c8 "\" total=\"%zu\" count=\"%zu\"/>\n", fp=0x9) at freopen64.c:47
#5 0xb7cef494 in _IO_str_init_readonly (sf=0x86d8600, ptr=0xb7eef5a9 "te\213V\b\205\322\017\204\220", size=-1210273804) at strops.c:88
#6 0xb7cf0b93 in mALLINFo (av=0xb) at malloc.c:5865
#7 0xb7cf3c7d in __libc_calloc (n=141395456, elem_size=3214793136) at malloc.c:4019
#8 0xb7e8adf8 in ?? () at gc.c:1390 from /usr/lib/libruby18.so.1.8
#9 0x086d8600 in ?? ()
#10 0xb7e89400 in rb_gc_disable () at gc.c:256
#11 0xb7e8b395 in add_freelist () at gc.c:1087
#12 gc_sweep () at gc.c:1186
#13 garbage_collect () at gc.c:1524
#14 0xb7e5065e in ?? () from /usr/lib/libruby18.so.1.8
#15 0x00000340 in ?? ()
#16 0x00000000 in ?? ()
(gdb)
Hmm? To me this looks like it's entirely internal to Ruby. In other "double free or corruption" questions here on Stack Overflow I have seen that threads may be part of the problem.
Also, the problem does not occur at exactly the same position. I have another backtrace which is much longer; the crash is also in garbage_collect, but via a slightly different path:
(gdb) bt
#0 0xffffe430 in __kernel_vsyscall ()
#1 0xf7c8b8c0 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2 0xf7c8d1f5 in *__GI_abort () at abort.c:88
#3 0xf7cc7e35 in __libc_message (do_abort=2, fmt=0xf7d8daa8 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4 0xf7ccdd24 in malloc_printerr (action=2, str=0xf7d8dbec "double free or corruption (fasttop)", ptr=0x911f5d0) at malloc.c:6197
#5 0xf7ccf403 in _int_free (av=0xf7daa380, p=0x911f5c8) at malloc.c:4750
#6 0xf7cd24ad in *__GI___libc_free (mem=0x911f5d0) at malloc.c:3716
#7 0xf7e68768 in obj_free () at gc.c:1366
#8 gc_sweep () at gc.c:1174
#9 garbage_collect () at gc.c:1524
#10 0xf7e68be5 in rb_newobj () at gc.c:436
#11 0xf7eb9840 in str_alloc (klass=0) at string.c:67
... (150 lines of rb_eval/call/yield etc.)
Does anyone have a suggestion on how to isolate and maybe solve this problem?
Quick, easy, and not as helpful: export MALLOC_CHECK_=2. This causes glibc to do some extra checking during free() to catch heap corruption. It will abort() and give a core dump as soon as it detects corruption, instead of waiting until the corruption causes an actual problem.
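For example, with the worker command from the question:

export MALLOC_CHECK_=2
ruby /home/develop/dcc/bin/dcc-worker   # glibc now aborts with a core dump as soon as it detects a bad free()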
Not quite as quick and easy, but much more helpful (if you get it working): valgrind.
Valgrind makes it easy to find heap corruption issues. There are some spurious errors reported when using Ruby 1.8 under valgrind, but they can be eliminated using this ruby patch (and configuring with --enable-valgrind) or using a valgrind suppression file. To run your ruby program under valgrind, just prefix the command with valgrind:
valgrind ruby /home/develop/dcc/bin/dcc-worker
If the crashing process is a child of the process you are running, use valgrind --trace-children=yes. Look in particular for invalid writes, which are a sign of heap corruption.
I got this very same error in a simple 'C' program called rd_test; it would just read a given number of bytes using read(2) from a given input file (could be a device file).
The actual bug turned out to be a buffer overflow of 1 byte (as I did
...
buf[n]='\0';
...
where 'n' is the number of bytes read into the buffer 'buf', so buf[n] writes one byte past the end of an n-byte buffer).
Silly me.
BUT, the thing is I never caught that until I ran it with valgrind!
So IMHO valgrind is definitely worth running in cases like this.
The 'double free or corruption' error went away as soon as I got rid of the offending bug.
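A minimal sketch of the pattern and the fix (hypothetical file name and read size; assuming POSIX read(2)): allocate one byte more than you intend to read, so the terminator has room.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    size_t n = 512;
    char *buf = malloc(n + 1);                  /* one extra byte for the '\0' terminator */
    if (buf == NULL)
        return 1;

    int fd = open("/etc/hostname", O_RDONLY);   /* hypothetical input file */
    if (fd < 0) { free(buf); return 1; }

    ssize_t got = read(fd, buf, n);
    if (got < 0) { close(fd); free(buf); return 1; }

    buf[got] = '\0';                            /* safe: buf has n+1 bytes and got <= n */
    printf("%s", buf);

    close(fd);
    free(buf);
    return 0;
}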
I got the same error message, not in Ruby but in a zenity program.
I discovered it had something to do with closing an open pipe twice!
Check whether you free the same heap memory two or more times, or close already-closed files or pipes.
Good luck.
