`bash` is leaking memory, where do I report it? - bash

I have a super simple script to confirm this behavior:
leak.sh
#! /bin/bash
echo "Am I leaking?"
Execute under valgrind...
$ valgrind ./leak.sh
==1365336== Memcheck, a memory error detector
==1365336== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1365336== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==1365336== Command: ./leak.sh
==1365336==
Am I leaking?
==1365336==
==1365336== HEAP SUMMARY:
==1365336== in use at exit: 50,076 bytes in 766 blocks
==1365336== total heap usage: 858 allocs, 92 frees, 59,487 bytes allocated
==1365336==
==1365336== LEAK SUMMARY:
==1365336== definitely lost: 12 bytes in 1 blocks
==1365336== indirectly lost: 0 bytes in 0 blocks
==1365336== possibly lost: 0 bytes in 0 blocks
==1365336== still reachable: 50,064 bytes in 765 blocks
==1365336== suppressed: 0 bytes in 0 blocks
==1365336== Rerun with --leak-check=full to see details of leaked memory
==1365336==
==1365336== For lists of detected and suppressed errors, rerun with: -s
==1365336== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
If you change the script to use dash, the default shell (sh) on Pop!_OS (and, I assume, on all Debian-based distros), it runs without leaking.
no_leak.sh
#! /bin/dash
echo "Am I leaking?"
$ valgrind ./no_leak.sh
==1365800== Memcheck, a memory error detector
==1365800== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1365800== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==1365800== Command: ./no_leak.sh
==1365800==
Am I leaking?
==1365800==
==1365800== HEAP SUMMARY:
==1365800== in use at exit: 10,666 bytes in 77 blocks
==1365800== total heap usage: 80 allocs, 3 frees, 14,809 bytes allocated
==1365800==
==1365800== LEAK SUMMARY:
==1365800== definitely lost: 0 bytes in 0 blocks
==1365800== indirectly lost: 0 bytes in 0 blocks
==1365800== possibly lost: 0 bytes in 0 blocks
==1365800== still reachable: 10,666 bytes in 77 blocks
==1365800== suppressed: 0 bytes in 0 blocks
==1365800== Rerun with --leak-check=full to see details of leaked memory
==1365800==
==1365800== For lists of detected and suppressed errors, rerun with: -s
==1365800== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Where can I report this observation, or find out if it has already been addressed?
bash Version
$ bash --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
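To compare the two runs without reading the full reports, the "definitely lost" figure can be pulled out of saved Valgrind output. The following is only a minimal sketch: the function name definitely_lost_bytes and the log file name leak.log are made up for illustration, and the pattern assumes the summary format shown above.

```shell
# Read Valgrind output on stdin and print the "definitely lost" byte count.
# Assumes the "==PID== definitely lost: N bytes in M blocks" format shown above.
definitely_lost_bytes() {
    # Keep only the number (commas included), then strip the thousands separators.
    sed -n 's/^==[0-9]*== *definitely lost: \([0-9,]*\) bytes.*/\1/p' | tr -d ','
}
```

Valgrind's --log-file option writes the report to a file, so running valgrind --log-file=leak.log ./leak.sh and then definitely_lost_bytes < leak.log should print 12 for the bash run above and 0 for the dash run.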

As mentioned by @oguz_ismail in the comments, bug-bash@gnu.org is the appropriate place to report the bug.
However, bug reports are expected to follow a certain format.
All bug reports should include:
The version number of Bash.
The hardware and operating system.
The compiler used to compile Bash.
A description of the bug behaviour.
A short script or ‘recipe’ which exercises the bug and may be used to reproduce it.
You can find ALL the details at:
https://www.gnu.org/software/bash/manual/html_node/Reporting-Bugs.html
Finally, there is a helper script built into bash itself. Call bashbug from the command line, and it will populate most of the requirements, leaving you to fill out the description and the steps required to reproduce the bug.
$ bashbug
GNU nano 5.2 /tmp/bbug.1414628/bbug1
From: zak
To: bug-bash@gnu.org
Subject: [50 character or so descriptive subject here (for reference)]
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -fdebug-prefix-map=/build/bash-SmNvvg/bash-5.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Wno-parentheses -Wno-format-secu>
uname output: Linux pop-os 5.11.0-7614-generic #15~1618626693~20.10~ecb25cd-Ubuntu SMP Thu Apr 22 16:00:45 UTC x86_64 x86_64 x86_64 GNU/Linux
...
Once you have filled in the template, you will be asked whether you would like to send the email. It's okay if you don't have an email client set up: the completed form is stored at ~/dead.bashbug, and you can copy and paste it into your email client.
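If bashbug is not installed, the fields listed above can be collected by hand. This is a rough sketch rather than a replacement for the official template: the function name bash_report_header and the field labels are my own, and the compiler has to be filled in manually because the shell does not expose it in a variable.

```shell
# Print the fields a bug report should contain, ready to paste into an email.
# Run it with bash so that BASH_VERSION is set; other shells print "unknown".
bash_report_header() {
    echo "Bash version: ${BASH_VERSION:-unknown}"
    echo "Machine: $(uname -m)"
    echo "OS: $(uname -s) $(uname -r)"
    echo "Compiler: [compiler used to build bash]"
    echo "Description: [describe the bug here]"
    echo "Recipe: [short script that reproduces it]"
}
```

Redirecting the output to a file, for example bash_report_header > report.txt, gives a starting point that can be completed in any editor.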


Check if mac executable has debug info

I want to check whether my executable has debug info. Trying the Linux equivalent doesn't help:
$ file ./my_lovely_program
./my_lovely_program: Mach-O 64-bit executable arm64 # with debug info? without?
EDIT (from the answer of @haggbart)
It seems that my executable has no debug info (?)
$ dwarfdump --debug-info ./compi
./compi: file format Mach-O arm64
.debug_info contents: # <--- empty, right?
And with the other option, I'm not sure:
$ otool -hv ./compi
./compi:
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC_64 ARM64 ALL 0x00 EXECUTE 19 1816 NOUNDEFS DYLDLINK TWOLEVEL WEAK_DEFINES BINDS_TO_WEAK PIE
This is very weird because I can perfectly debug it with lldb
(lldb) b main
Breakpoint 1: where = compi`main + 24 at main.cpp:50:9, address = 0x0000000100018650
(lldb) run
Process 6067 launched: '/Users/oren/Downloads/compi' (arm64)
Process 6067 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x0000000100018650 compi`main(argc=3, argv=0x000000016fdff7b8) at main.cpp:50:9
47 /*****************/
48 int main(int argc, char **argv)
49 {
-> 50 if (argc == 3)
51 {
52 const char *input = argv[1];
53 const char *output = argv[2];
Target 0: (compi) stopped.
Mach-O isn't like ELF: its debug info is "sold separately" in a .dSYM bundle.
When you compile with -g, a .dSYM directory is generated alongside your output:
(~) gcc a.c -o /tmp/a -g2
(~) %ls -lFd /tmp/a /tmp/a.dSYM
-rwxr-xr-x 1 morpheus wheel 34078 Dec 6 12:56 /tmp/a*
drwxr-xr-x 3 morpheus wheel 96 Dec 6 12:56 /tmp/a.dSYM/
The .dSYM is a bundle (i.e. a directory structure) whose Contents/Resources/DWARF has the "companion file":
(~) %file /tmp/a.dSYM/Contents/Resources/DWARF/a
/tmp/a.dSYM/Contents/Resources/DWARF/a: Mach-O 64-bit dSYM companion file arm64
(~) %jtool2 -l /tmp/a.dSYM/Contents/Resources/DWARF/a | grep UUID
LC 00: LC_UUID UUID: BDD5C13E-F7B8-3B4D-BAF9-14DF3CD03724
(~) %jtool2 -l /tmp/a | grep UUID
LC 09: LC_UUID UUID: BDD5C13E-F7B8-3B4D-BAF9-14DF3CD03724
Tools like lldb locate the debug data by looking for the companion bundle (usually in the same location as the binary, or in a specified path) and then checking that the LC_UUID matches. This enables you to ship the binary without its dSYM and use the dSYM when symbolicating a crash report (this is what Apple does). The debug info includes all local variable names, as well as debug_aranges (addr2line), etc.:
(~) %jtool2 -l /tmp/a.dSYM/Contents/Resources/DWARF/a | grep DWARF
LC 07: LC_SEGMENT_64 Mem: 0x100009000-0x10000a000 __DWARF
Mem: 0x100009000-0x10000921f __DWARF.__debug_line
Mem: 0x10000921f-0x10000924f __DWARF.__debug_aranges
Mem: 0x10000924f-0x1000093dc __DWARF.__debug_info
Mem: 0x1000093dc-0x100009478 __DWARF.__debug_abbrev
Mem: 0x100009478-0x100009590 __DWARF.__debug_str
Mem: 0x100009590-0x1000095e8 __DWARF.__apple_names
Mem: 0x1000095e8-0x10000960c __DWARF.__apple_namespac
Mem: 0x10000960c-0x100009773 __DWARF.__apple_types
Mem: 0x100009773-0x100009797 __DWARF.__apple_objc
If you really want to get rid of any debug info, including, say, local function symbols (which are included in the binary by default), strip -d -x is your friend. It operates on the binary.
Note that running dsymutil (as suggested in other answers) can be a bit misleading, since in order to display information it will track down the accompanying dSYM, which will be present on your machine but not if you move the binary elsewhere.
If you run:
dsymutil -s ./my_lovely_program | grep N_OSO
and it shows output, it means there is debug info.
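The bundle layout described above (Contents/Resources/DWARF inside the .dSYM directory) can be checked with a few lines of shell. This is a minimal sketch; has_dsym is a hypothetical helper name, and it only looks next to the binary, whereas lldb can also search other paths.

```shell
# Succeed if a dSYM companion file sits next to the given binary.
# Layout checked: <binary>.dSYM/Contents/Resources/DWARF/<binary name>
has_dsym() {
    local binary=$1
    local name
    name=$(basename "$binary")
    [ -f "$binary.dSYM/Contents/Resources/DWARF/$name" ]
}
```

For the example above, has_dsym /tmp/a && echo "companion file found" would report the bundle. The check only confirms that the file exists; it does not compare the LC_UUID values the way lldb does.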

Loading an ELF File into Qemu

I have an .elf file created for a cortex-m3 processor. I want to run this in Qemu.
The .elf should start execution with this assembly file:
.thumb
.syntax unified
.global ResetHandler
ResetHandler:
LDR SP, =stack_top
NOP
BL main
B .
the associated linker script:
ENTRY(ResetHandler)
SECTIONS {
. = 0x08000000;
.startup : { startup.o(.text) }
.text : { *(.text) }
. = 0x20000000;
__bss_start__ = .;
.bss : { *(.bss) }
__bss_end__ = .;
.data : { *(.data) }
. = . + 0x100;
stack_top = .;
}
If I run the following command:
qemu-system-arm -s -S -machine stm32vldiscovery -cpu cortex-m3 -nographic -kernel myfile.elf
Qemu starts up and halts (as it should). However, when I connect gdb like so...
arm-none-eabi-gdb
(gdb) file myfile.elf
Reading symbols from myfile.elf...
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0xf002bf00 in ?? ()
(gdb) si
0x200001f8 in stack_top ()
You can see that GDB doesn't understand the .elf file. If I step through this, Qemu interprets my assembly language incorrectly and it will error and exit. But if I load the .elf file in GDB...
(gdb) load myfile.elf
Start address 0x08000000, load size 21891
Transfer rate: 16 KB/sec, 266 bytes/write.
(gdb) si
ResetHandler () at startup.s:7
7 NOP
(gdb) si
8 BL main
You can see that the .elf file is loaded correctly and can be stepped through.
My overall questions are:
What is load doing? The docs state:
Where it exists, it is meant to make filename (an executable) available for debugging on the remote system
But that is not clear to me. How assembly code is being executed changes, so I have to imagine "making a file available for debugging" is doing quite a bit.
edit (adding compilation steps and versions):
assembly and compilation...
arm-none-eabi-as -mcpu=cortex-m3 startup.s -g -o startup.o
arm-none-eabi-gcc \
-Tcortex-m3-tests.ld \
-mcpu=cortex-m3 \
-mthumb \
mysrcfile.c \
-g -o myfile.elf
versions...
qemu-system-arm --version
QEMU emulator version 6.2.0
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Toolchain for the Arm Architecture 11.2-2022.02 (arm-11.14)) 11.2.1 20220111
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
What is happening here is that your ELF file does not include a valid exception vector table at the correct address, and QEMU requires one. gdb is correctly showing you the results of the emulated CPU crashing as a result.
When QEMU starts for an M-profile Arm CPU, it tries to load the starting PC and SP values out of the vector table (this is how real hardware M-profile CPUs start). On this particular board, the vector table is at address 0x0000_0000, and address 0x0800_0000 is an alias for this. The initial SP and PC are at word offsets 0 and 1 in the table; it happens that your object file has words 0xd008f8df and 0xf002bf00 at those offsets, and you can see in gdb that gdb is correctly telling you that the initial PC is that bogus 0xf002bf00 value.
When you single-step, QEMU tries to load from 0xf002bf00, which has no memory there. It therefore takes a BusFault exception, which at this point in M-profile startup will always escalate to HardFault. That's exception number 3, whose entry point is stored at offset 3 in the vector table. As it happens with the way you've written your assembly, the word there is the address of stack_top, so QEMU will try to execute from there next. Since that's data and not a valid instruction, it goes downhill from there -- we will take another exception, which results in the CPU going into the Lockup state, which is fatal. You can see some of this if you tell QEMU to execute without talking to gdb and with some extra debug logging:
$ qemu-system-arm -machine stm32vldiscovery -cpu cortex-m3 -display none -serial stdio -kernel myfile.elf -d in_asm,cpu,exec,int
Taking exception 3 [Prefetch Abort] on CPU 0
...with CFSR.IACCVIOL
...BusFault with BFSR.STKERR
...taking pending nonsecure exception 3
----------------
IN:
0x20000558: 08000079 stmdaeq r0, {r0, r3, r4, r5, r6}
Trace 0: 0x7fd68be0e100 [00000401/20000558/00000130/ff000000]
R00=00000000 R01=00000000 R02=00000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=d008f8b8 R14=fffffff9 R15=20000558
XPSR=40000003 -Z-- A handler
Taking exception 18 [v7M INVSTATE UsageFault] on CPU 0
qemu: fatal: Lockup: can't escalate 3 to HardFault (current priority -1)
R00=00000000 R01=00000000 R02=00000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=d008f8b8 R14=fffffff9 R15=20000558
XPSR=40000003 -Z-- A handler
FPSCR: 00000000
Aborted (core dumped)
(You can't see all of the steps I describe above, you have to infer them, because QEMU doesn't currently log all the exception table loads or the initial PC/SP values. But you can see the BusFault, the attempt to execute at 0x20000558, the second exception and the Lockup. I've submitted some QEMU patches which improve the logging a little so that QEMU 7.0 and up should print the PC values being loaded from the vector table.)
The difference when you use the gdb 'load' command is that gdb both downloads the ELF file data into memory and also sets the initial PC value to the ELF file's entry-point address. So execution starts at 0x08000000 and continues from there.
Anyway, the way to fix this is to make sure that the start of your ELF file, which gets loaded at address 0 (or the 0x0800_0000 alias of 0), has a valid vector table. For examples, look at https://git.linaro.org/people/peter.maydell/semihosting-tests.git/tree/start-microbit.S or https://git.linaro.org/people/peter.maydell/m-profile-tests.git/tree/init-m.S
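The first two vector table entries can be inspected before handing an image to QEMU. What follows is a rough sketch and not part of the analysis above. It assumes the section at the start of the image has been dumped to a raw file (for example with arm-none-eabi-objcopy -O binary -j .startup myfile.elf vectors.bin, where vectors.bin is a made-up name), that the host is little-endian like the target, and the function name check_vector_table is hypothetical. It prints the words that would be loaded as the initial SP and PC and flags a reset vector whose lowest bit is clear, since M-profile cores only execute Thumb code and expect bit 0 of the vector to be set.

```shell
# Print the initial SP and PC stored in the first two words of a raw image
# and fail if the reset vector is not a Thumb address (bit 0 clear).
# Assumption: od runs on a little-endian host, matching the Cortex-M target.
check_vector_table() {
    local image=$1 sp pc
    # od prints the first 8 bytes as two 32-bit hex words in host byte order.
    set -- $(od -An -tx4 -N8 "$image")
    sp=$1
    pc=$2
    if [ -z "$pc" ]; then
        echo "image is shorter than 8 bytes"
        return 2
    fi
    echo "initial SP = 0x$sp"
    echo "initial PC = 0x$pc"
    if [ $(( 0x$pc & 1 )) -eq 0 ]; then
        echo "reset vector has bit 0 clear: not a valid Thumb entry point"
        return 1
    fi
}
```

With the words quoted above at offsets 0 and 1, this would print initial SP = 0xd008f8df and initial PC = 0xf002bf00 and then fail on the bit 0 test. A set bit 0 is necessary but not sufficient: the address still has to point at real code.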

Is there a way to 'Step Through' Fortran code like you can with VBA in excel?

I have some Fortran source code that I can understand the general idea behind, but I have never used Fortran before so I would like to see exactly what is happening as each line is being executed (like you can with VBA in Excel by stepping through the code line by line and observing what values variables and arrays have at any point in the code).
Is there a way to step through the source code with a user-interface of some kind so that I can see exactly what variables have been defined, what values they are taking etc..?
For some context: I work in Science and Engineering, but coding is not normally a significant part of my job (as you can probably tell from the content of my question); I normally only deal with simple scripts in VBA to manipulate data. I have the compiled version of the Fortran code and it works fine, but I know I will need to modify the source and recompile it for my purposes. Unfortunately, the person who wrote the original code is not available for advice/input. Another note: I'm not sure how to tell which version of Fortran was used...
Thanks!
Any respectable debugger will allow you to single step through a code. Below is a simple example showing one, gdb, in use. The important points:
Make sure you compile and link with the -g flag
run runs the code
break sets a breakpoint at a given line in the code, i.e. when the code is running it will stop at that line
step steps one line
step n steps n lines
finish runs until the current procedure returns; in the example below that is the end of the main program
Note that this is very much an oversimplified introduction to show that what you want to do is possible. In real life you'll have to learn a bit more about your debugger; debuggers are very powerful and useful pieces of software with many abilities not even hinted at here. For gdb there are a number of tutorials you can find by searching, e.g. https://sourceware.org/gdb/onlinedocs/gdb/index.html
ian@eris:~/work/stack$ gfortran -fcheck=all -Wall -Wextra -std=f2008 -g step.f90 -o step
ian@eris:~/work/stack$ gdb step
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from step...done.
(gdb) list
1 Program single_step
2
3 Implicit None
4
5 Integer :: i
6
7 Do i = 1, 10
8 Write( *, * ) 'I is now ', i
9 End Do
10
(gdb) run
Starting program: /home/ian/work/stack/step
I is now 1
I is now 2
I is now 3
I is now 4
I is now 5
I is now 6
I is now 7
I is now 8
I is now 9
I is now 10
[Inferior 1 (process 23965) exited normally]
(gdb) break 1
Breakpoint 1 at 0x5555555548c6: file step.f90, line 1.
(gdb) run
Starting program: /home/ian/work/stack/step
Breakpoint 1, single_step () at step.f90:1
1 Program single_step
(gdb) step
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step
I is now 1
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step
I is now 2
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step 5
I is now 3
I is now 4
I is now 5
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step 2
I is now 6
8 Write( *, * ) 'I is now ', i
(gdb) finish
Run till exit from #0 single_step () at step.f90:8
I is now 7
I is now 8
I is now 9
I is now 10
0x0000555555554a30 in main (argc=1, argv=0x7fffffffe2fa) at step.f90:11
11 End Program single_step
(gdb) quit
A debugging session is active.
Inferior 1 [process 23969] will be killed.
Quit anyway? (y or n) y
ian@eris:~/work/stack$
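The same kind of session can also be scripted, which helps when you mainly want to see variable values at a particular line. This is a small sketch and not part of the transcript above: the function name write_gdb_script and the file name step.gdb are invented for illustration, while break, run, step and info locals are standard gdb commands.

```shell
# Emit gdb commands that stop at a source location, show the local variables,
# step a few lines, and show them again.
write_gdb_script() {
    local location=$1 steps=$2
    printf 'break %s\n' "$location"
    printf 'run\n'
    printf 'info locals\n'
    printf 'step %s\n' "$steps"
    printf 'info locals\n'
}
```

For the program above one could run write_gdb_script step.f90:8 2 > step.gdb and then gdb -batch -x step.gdb ./step. In an interactive session, print i shows the value of a single variable.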

Stackoverflow error in Fortran with OpenMP [duplicate]

main program:
program main
use omp_lib
use my_module
implicit none
integer, parameter :: nmax = 202000
real(8) :: e_in(nmax) = 0.D0
integer i
call omp_set_num_threads(2)
!$omp parallel default(firstprivate)
!$omp do
do i=1,2
print *, e_in(i)
print *, eTDSE(i)
end do
!$omp end do
!$omp end parallel
end program main
module:
module my_module
implicit none
integer, parameter, private :: ntmax = 202000
double complex :: eTDSE(ntmax) = (0.D0,0.D0)
!$omp threadprivate(eTDSE)
end module my_module
compiled using:
ifort -openmp main.f90 my_module.f90
It gives a segmentation fault when executed. If I remove one of the print statements in the main program, it runs fine. It also runs fine if I remove the OpenMP function call and compile without the -openmp option.
The most probable cause for this behaviour is that your stack size limit is too small (for whatever reason). Since e_in is private to each OpenMP thread, one copy per thread is allocated on the thread stack (even if you have specified -heap-arrays!). 202000 elements of REAL(KIND=8) take 1616 kB (or 1579 KiB).
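The arithmetic above is easy to repeat for other array sizes. Here is a minimal sketch, with array_kib and fits_in_limit as made-up helper names; it only accounts for the array itself, so the real stack demand is somewhat higher.

```shell
# KiB needed for an array, rounded up to a whole KiB.
array_kib() {
    local elements=$1 bytes_per_element=$2
    echo $(( (elements * bytes_per_element + 1023) / 1024 ))
}

# Succeed if the array alone fits within a stack limit given in KiB.
# "unlimited" (a possible result of ulimit -s) always fits.
fits_in_limit() {
    local need_kib=$1 limit_kib=$2
    [ "$limit_kib" = unlimited ] || [ "$need_kib" -lt "$limit_kib" ]
}
```

array_kib 202000 8 prints 1579, and fits_in_limit "$(array_kib 202000 8)" "$(ulimit -s)" tests the array against the current limit of the shell. Treat a narrow pass as a likely failure, because the stack also holds everything else the thread needs.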
The stack size limit can be controlled by several mechanisms:
In standard Unix shells the stack size limit is controlled by ulimit -s <stacksize in KiB>. This is also the stack size limit for the main OpenMP thread. The value of this limit is also used by the POSIX threads (pthreads) library as the default thread stack size when creating new threads.
OpenMP supports control over the stack size limit of all additional threads via the environment variable OMP_STACKSIZE. Its value is a number with an optional suffix k/K for KiB, m/M for MiB, or g/G for GiB. This value does not affect the stack size of the main thread.
The GNU OpenMP run-time (libgomp) recognises the non-standard environment variable GOMP_STACKSIZE. If set it overrides the value of OMP_STACKSIZE.
The Intel OpenMP run-time recognises the non-standard environment variable KMP_STACKSIZE. If set it overrides the value of OMP_STACKSIZE and also overrides the value of GOMP_STACKSIZE if the compatibility OpenMP run-time is used (which is the default as currently the only available Intel OpenMP run-time library is the compat one).
If none of the *_STACKSIZE variables are set, the default for Intel OpenMP run-time is 2m on 32-bit architectures and 4m on 64-bit ones.
On Windows, the stack size of the main thread is part of the PE header and is embedded there by the linker. If using Microsoft's LINK to do the linking, the size is specified using the /STACK:reserve[,commit] option. The reserve argument specifies the maximum stack size in bytes while the optional commit argument specifies the initial commit size. Both can be specified as hexadecimal values using the 0x prefix. If re-linking the executable is not an option, the stack size could be modified by editing the PE header with EDITBIN. It takes the same stack-related argument as the linker. Programs compiled with MSVC's whole program optimisation enabled (/GL) cannot be edited.
The GNU linker for Win32 targets supports setting the stack size via the --stack argument. To pass the option directly from GCC, -Wl,--stack,<size in bytes> can be used.
Note that thread stacks are actually allocated with the size set by *_STACKSIZE (or to the default value), unlike the stack of the main thread, which starts small and then grows on demand up to the set limit. So don't set *_STACKSIZE to an arbitrarily large value; otherwise you may hit the process virtual memory size limit.
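Because ulimit works in KiB while the *_STACKSIZE variables take suffixed values, converting everything to KiB makes it easier to compare a setting with the 1579 KiB array. A minimal sketch follows; stacksize_to_kib is a made-up helper, and treating a value without a suffix as KiB is an assumption that matches OMP_STACKSIZE but should be checked for the vendor-specific variables.

```shell
# Convert a stack size such as 512m or 2048k to KiB.
# Assumption: a bare number means KiB; anything else unrecognised is rejected.
stacksize_to_kib() {
    local value=$1 number suffix
    number=${value%[kKmMgG]}      # strip one trailing unit letter, if present
    suffix=${value#"$number"}     # whatever was stripped
    case $number in
        ''|*[!0-9]*) return 1 ;;  # not a plain non-negative integer
    esac
    case $suffix in
        ''|k|K) echo "$number" ;;
        m|M)    echo $(( number * 1024 )) ;;
        g|G)    echo $(( number * 1024 * 1024 )) ;;
    esac
}
```

stacksize_to_kib 1m prints 1024, which is less than the 1579 KiB the array needs, while the 64-bit default of 4m gives 4096.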
Here are some examples:
$ ifort -openmp my_module.f90 main.f90
Set the main stack size limit to 1 MiB (the additional OpenMP thread would get 4 MiB as per default):
$ ulimit -s 1024
$ ./a.out
zsh: segmentation fault (core dumped) ./a.out
Set the main stack size limit to 1700 KiB:
$ ulimit -s 1700
$ ./a.out
0.000000000000000E+000
(0.000000000000000E+000,0.000000000000000E+000)
0.000000000000000E+000
(0.000000000000000E+000,0.000000000000000E+000)
Set the main stack size limit to 2 MiB and the stack size of the additional thread to 1 MiB:
$ ulimit -s 2048
$ KMP_STACKSIZE=1m ./a.out
zsh: segmentation fault (core dumped) KMP_STACKSIZE=1m ./a.out
On most Unix systems the stack size limit of the main thread is set by PAM or other login mechanism (see /etc/security/limits.conf). The default on Scientific Linux 6.3 is 10 MiB.
Another possible scenario that can lead to an error is if the virtual address space limit is set too low. For example, if the virtual address space limit is 1 GiB and the thread stack size limit is set to 512 MiB, then the OpenMP run-time would try to allocate 512 MiB for each additional thread. With two threads one would have 1 GiB for the stacks only, and when the space for code, shared libraries, heap, etc. is added up, the virtual memory size would grow beyond 1 GiB and an error would occur:
Set the virtual address space limit to 1 GiB and run with two additional threads with 512 MiB stacks (I have commented out the call to omp_set_num_threads()):
$ ulimit -v 1048576
$ KMP_STACKSIZE=512m OMP_NUM_THREADS=3 ./a.out
OMP: Error #34: System unable to allocate necessary resources for OMP thread:
OMP: System error #11: Resource temporarily unavailable
OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.
forrtl: error (76): Abort trap signal
... trace omitted ...
zsh: abort (core dumped) OMP_NUM_THREADS=3 KMP_STACKSIZE=512m ./a.out
In this case the OpenMP run-time library would fail to create a new thread and would notify you before aborting the program.
The segmentation fault is due to the stack memory limit when using OpenMP. The solutions from the previous answer did not solve the problem for me on Windows. Allocating the array on the heap rather than on the stack seems to work:
integer, parameter :: nmax = 202000
real(8), dimension(:), allocatable :: e_in   ! real(8) matches the declaration in the question
integer i
allocate(e_in(nmax))
e_in = 0
! rest of code
deallocate(e_in)
In addition, this does not involve changing any default environment parameters.
Credit goes to ohm314's solution; see: large array using heap memory allocation

deallocation and memory allocation problems in FORTRAN [closed]

I am having problems with the deallocate and allocate parts of my Fortran code. In particular, from a web search on my error message, I think that the issue has to do with memory allocation. The error message talks about invalid pointers; however, I am not using any pointers in my program.
After completing iteration #2 of my f loop (see below), the program crashes; or rather, most of the time it crashes and sometimes it just freezes. I am confident that this is where the bug is, as the program runs up to this point.
I have subroutines that are not shown, but since they work for other simulation combinations, I am reasonably confident that they are not the problem. I am using deallocate and allocate in other places within the program (successfully), so I am surprised that it is not working here.
I am only showing part of the program for ease of reading. In particular, I have removed the calls to the subroutines that I wrote. I hope that I have provided sufficient info for you to help me figure out the problem. If not, please specify what other info you want and I will be happy to comply. I have compiled the program using various compiler options, fixed some bugs and removed all warnings. At this point, however, the compiler options do not give me any more info.
allocate(poffvect(1:6))
allocate(phi1out(1:1))
allocate(phi2out(1:1))
allocate(phi1outs1(1:1))
allocate(phi2outs1(1:1))
! dummy allocation
allocate(phi1outind(1:1))
allocate(phi2outind(1:1))
allocate(phi1outinds1(1:1))
allocate(phi2outinds1(1:1))
do e = 1, 6
print *,"e", e
do f = 1, 3
print *,"f", f, iteratst1(f), trim(filenumcharimp)
deallocate(phi1outinds1, STAT = AllocateStatus)
if (AllocateStatus /= 0) stop "Error during deallocation of phi1outinds1"
print *, "Allocatestatus of phi1outinds1 is", AllocateStatus
deallocate(phi2outinds1, STAT = AllocateStatus)
print *, "DeAllocatestatus of phi1outinds2 is", AllocateStatus
if (AllocateStatus /= 0) stop "Error during deallocation of phi2outinds1"
print *, "we deallocate f loop ok", iteratst1(f)
allocate(phi1outinds1(1:iteratst1(f)), STAT = AllocateStatus)
if (AllocateStatus /= 0) stop "Error during allocation of phi1outinds1"
allocate(phi2outinds1(1:iteratst1(f)), STAT = AllocateStatus)
if (AllocateStatus /= 0) stop "Error during allocation of phi2outinds1"
end do
end do
compiler options
ifort -free -check -traceback -o adatptmultistage1new.out adatptmultistage1new.f90
output
e 1
f 1 5000 43
DeAllocatestatus of phi1outinds1 is 0
DeAllocatestatus of phi1outinds2 is 0
we deallocate f loop ok 5000
f loop done 1
f 2 10000 43
Allocatestatus of phi1outinds1 is 0
DeAllocatestatus of phi1outinds2 is 0
we deallocate f loop ok 10000
f loop done 2
f 3 15000 43
Allocatestatus of phi1outinds1 is 0
error message
*** glibc detected *** ./adatptmultistage1new.out: munmap_chunk(): invalid pointer: 0x0000000000d3ddd0 ***
======= Backtrace: =========
/lib/libc.so.6(+0x77806)[0x7f5863b7b806]
./adatptmultistage1new.out[0x43247c]
./adatptmultistage1new.out[0x404368]
./adatptmultistage1new.out[0x4031ec]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f5863b22c4d]
./adatptmultistage1new.out[0x4030e9]
======= Memory map: ========
00400000-004d4000 r-xp 00000000 08:03 9642201
/home/jgold/smwcv/error_infect/test/surfaces/multistage/adaptonly/adatptmultistage1new.out
006d4000-006dc000 rw-p 000d4000 08:03 9642201
[rest of error message not shown for brevity]
7fffb004d000-7fffb00bc000 rw-p 00000000 00:00 0 [stack]
7fffb01d7000-7fffb01d8000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted
That is a lot of code for us to try to figure out. Have you compiled it with as many compiler debugging options as possible? Especially, are you using array bounds checking? What compiler are you using? I don't see a "use" statement ... it would be better to put your subroutines into a module and "use" that module so that the compiler can check argument consistency between the actual and dummy arguments.
EDIT: "double free or corruption" suggests that memory has been corrupted. Since you don't appear to have any pointers there are three likely ways to corrupt memory:
Use an allocatable variable that hasn't been allocated. If the allocate statement fails the program would probably throw an error at that point. You might be using a variable that you forgot to allocate.
Having a disagreement between the arguments in the call to a procedure and what the procedure expects, i.e., between actual and dummy arguments. Using a module will allow the compiler to do better checking for this.
Writing outside the bounds of an array by using an illegal subscript value -- this will overwrite "random" memory, such as the internal structures describing the next array. Turning on run-time subscript or array-bound checking will test for this. With ifort use: -check bounds or -check all. For very thorough checking try: -O2 -stand f03 -check all -traceback -warn all -fstack-protector -assume protect_parens -implicitnone
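While the checks above are being added, the raw addresses in the glibc backtrace can sometimes narrow down where the crash happens. The following is only a sketch under assumptions: the executable must have been built with debug info (for example by adding -g) for the lookup to give source lines, crash.log is a made-up name for the saved error output, and backtrace_addresses is a hypothetical helper.

```shell
# Read a glibc backtrace on stdin and print the addresses that belong to the
# named executable, one per line.
backtrace_addresses() {
    local program=$1
    # Keep lines such as "./prog[0x43247c]" and extract the value in brackets.
    grep -F "${program}[" | sed 's/.*\[\(0x[0-9a-fA-F]*\)\].*/\1/'
}
```

The addresses can then be handed to addr2line from binutils, for example backtrace_addresses adatptmultistage1new.out < crash.log | xargs addr2line -f -e ./adatptmultistage1new.out, which may name the routine and line for each frame.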
