deallocation and memory allocation problems in FORTRAN [closed] - memory-management

Closed 10 years ago.
I am having problems with the deallocate and allocate statements in part of my Fortran code. In particular, after searching the web for my error message, I think the issue has to do with memory allocation. The error message talks about invalid pointers; however, I am not using any pointers in my program.
After completing iteration 2 of my f loop (see below), the program crashes most of the time, and sometimes it just freezes. I am confident that this is where the bug is, as the program runs fine up to this point.
I have subroutines that are not shown, but since they work for other simulation combinations, I am reasonably confident that they are not the problem. I am using deallocate and allocate successfully in other places within the program, so I am surprised that they are not working here.
I am only showing part of the program for ease of reading. In particular, I have removed the calls to the subroutines that I wrote. I hope that I have provided sufficient information to help figure out the problem. If not, please specify what other information you want and I will be happy to provide it. I have compiled the program using various compiler options, fixed some bugs, and removed all warnings. At this point, however, the compiler options do not give me any more information.
allocate(poffvect(1:6))
allocate(phi1out(1:1))
allocate(phi2out(1:1))
allocate(phi1outs1(1:1))
allocate(phi2outs1(1:1))
! dummy allocation
allocate(phi1outind(1:1))
allocate(phi2outind(1:1))
allocate(phi1outinds1(1:1))
allocate(phi2outinds1(1:1))
do e = 1, 6
   print *,"e", e
   do f = 1, 3
      print *,"f", f, iteratst1(f), trim(filenumcharimp)
      deallocate(phi1outinds1, STAT = AllocateStatus)
      if (AllocateStatus /= 0) stop "Error during deallocation of phi1outinds1"
      print *, "Allocatestatus of phi1outinds1 is", AllocateStatus
      deallocate(phi2outinds1, STAT = AllocateStatus)
      print *, "DeAllocatestatus of phi1outinds2 is", AllocateStatus
      if (AllocateStatus /= 0) stop "Error during deallocation of phi2outinds1"
      print *, "we deallocate f loop ok", iteratst1(f)
      allocate(phi1outinds1(1:iteratst1(f)), STAT = AllocateStatus)
      if (AllocateStatus /= 0) stop "Error during allocation of phi1outinds1"
      allocate(phi2outinds1(1:iteratst1(f)), STAT = AllocateStatus)
      if (AllocateStatus /= 0) stop "Error during allocation of phi2outinds1"
   end do
end do
compiler options
ifort -free -check -traceback -o adatptmultistage1new.out adatptmultistage1new.f90
output
e 1
f 1 5000 43
DeAllocatestatus of phi1outinds1 is 0
DeAllocatestatus of phi1outinds2 is 0
we deallocate f loop ok 5000
f loop done 1
f 2 10000 43
Allocatestatus of phi1outinds1 is 0
DeAllocatestatus of phi1outinds2 is 0
we deallocate f loop ok 10000
f loop done 2
f 3 15000 43
Allocatestatus of phi1outinds1 is 0
error message
*** glibc detected *** ./adatptmultistage1new.out: munmap_chunk(): invalid pointer: 0x0000000000d3ddd0 ***
======= Backtrace: =========
/lib/libc.so.6(+0x77806)[0x7f5863b7b806]
./adatptmultistage1new.out[0x43247c]
./adatptmultistage1new.out[0x404368]
./adatptmultistage1new.out[0x4031ec]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f5863b22c4d]
./adatptmultistage1new.out[0x4030e9]
======= Memory map: ========
00400000-004d4000 r-xp 00000000 08:03 9642201
/home/jgold/smwcv/error_infect/test/surfaces/multistage/adaptonly/adatptmultistage1new.out
006d4000-006dc000 rw-p 000d4000 08:03 9642201
[rest of error message not shown for brevity]
7fffb004d000-7fffb00bc000 rw-p 00000000 00:00 0 [stack]
7fffb01d7000-7fffb01d8000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

That is a lot of code for us to try to figure out. Have you compiled it with as many compiler debugging options as possible? Especially, are you using array bounds checking? What compiler are you using? I don't see a "use" statement ... it would be better to put your subroutines into a module and "use" that module so that the compiler can check argument consistency between the actual and dummy arguments.
EDIT: "double free or corruption" suggests that memory has been corrupted. Since you don't appear to have any pointers there are three likely ways to corrupt memory:
Using an allocatable variable that hasn't been allocated. If the allocate statement fails, the program would probably throw an error at that point. You might be using a variable that you forgot to allocate.
Having a disagreement between the arguments in the call to a procedure and what the procedure expects, i.e., between actual and dummy arguments. Using a module will allow the compiler to do better checking for this.
Writing outside the bounds of an array by using an illegal subscript value. This will overwrite "random" memory, such as the internal structures describing the next array. Turning on run-time subscript or array-bounds checking will test for this. With ifort use: -check bounds or -check all. For very thorough checking try: -O2 -stand f03 -check all -traceback -warn all -fstack-protector -assume protect_parens -implicitnone

Related

GDB: how to call functions with modified parameters during debugging

Consider the following trivial Fortran program that adds two integers via a subroutine and prints the result:
      PROGRAM MAIN
      INTEGER I, J, SUM
      I = 1
      J = 1
      CALL ADD(I, J, SUM)
      WRITE(*,*) SUM
      END
      SUBROUTINE ADD(I, J, SUM)
      INTEGER I, J, SUM
      SUM = I + J
      END
Compiling via gfortran -g -O0 gdb-mwe.f -o gdb-mwe and running in the GNU Debugger, I want to call ADD from the debugger with modified input arguments right before the write output. Here's what happens:
Reading symbols from gdb-mwe...done.
(gdb) break 10
Breakpoint 1 at 0x4007dd: file gdb-mwe.f, line 10.
(gdb) r
Starting program: /home/username/Documents/Fortran/gdb-mwe
Breakpoint 1, MAIN__ () at gdb-mwe.f:10
10 WRITE(*,*) SUM
(gdb) p j = j+1
$2 = 2
(gdb) call add(i,j,sum)
Program received signal SIGSEGV, Segmentation fault.
0x000000000040079a in add (
i=<error reading variable: Cannot access memory at address 0x1>,
j=<error reading variable: Cannot access memory at address 0x2>,
sum=<error reading variable: Cannot access memory at address 0x2>)
at gdb-mwe.f:18
18 SUM = I + J
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(add) will be abandoned.
When the function is done executing, GDB will silently stop.
How do I get this right?
As pointed out in the comments, open bugs in GDB currently prevent this.
A possible workaround is to debug a 32-bit version of the code. This results in some differences, but for simple debugging tasks it may be sufficient.
For Intel Fortran compilers, this requires only adding the -m32 flag (provided 32-bit libraries have been installed).
For gfortran, it seems that installing the multilib package first is necessary, as shown in this question.
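The addresses 0x1 and 0x2 in the crash output are consistent with GDB handing the integer values themselves to a routine that expects addresses. The C sketch below illustrates that calling convention, assuming the usual gfortran behaviour of passing scalar arguments by reference. The names add_by_reference and add_values are hypothetical and are not symbols produced by the compiler.

```c
/* Hypothetical C counterpart of the Fortran subroutine ADD(I, J, SUM).
   Assumption: scalar arguments arrive as addresses (pass by reference). */
void add_by_reference(const int *i, const int *j, int *sum)
{
    /* Dereferencing is only valid if the caller passed real addresses.
       Passing the values 1 and 2 instead would make this read from
       addresses 0x1 and 0x2, which matches the faults shown above. */
    *sum = *i + *j;
}

/* Calls the routine the way compiled Fortran code would: with addresses. */
int add_values(int i, int j)
{
    int sum = 0;
    add_by_reference(&i, &j, &sum);
    return sum;
}
```

Calling add_values(1, 2) returns 3 because the callee receives valid addresses. This is only an illustration of the mismatch and does not work around the GDB bugs mentioned above.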

Stackoverflow error in Fortran with OpenMP [duplicate]

main program:
program main
use omp_lib
use my_module
implicit none
integer, parameter :: nmax = 202000
real(8) :: e_in(nmax) = 0.D0
integer i
call omp_set_num_threads(2)
!$omp parallel default(firstprivate)
!$omp do
do i=1,2
print *, e_in(i)
print *, eTDSE(i)
end do
!$omp end do
!$omp end parallel
end program main
module:
module my_module
implicit none
integer, parameter, private :: ntmax = 202000
double complex :: eTDSE(ntmax) = (0.D0,0.D0)
!$omp threadprivate(eTDSE)
end module my_module
compiled using:
ifort -openmp main.f90 my_module.f90
It gives a segmentation fault when executed. If I remove one of the print statements in the main program, it runs fine. If I remove the OpenMP code and compile without the -openmp option, it also runs fine.
The most probable cause for this behaviour is that your stack size limit is too small (for whatever reason). Since e_in is private to each OpenMP thread, one copy per thread is allocated on the thread stack (even if you have specified -heap-arrays!). 202000 elements of REAL(KIND=8) take 1616 kB (or 1579 KiB).
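The size estimate above can be checked with a few lines of C. This is a rough sketch for POSIX systems only: array_bytes, soft_stack_limit_bytes and fits_under_limit are hypothetical helper names, and the comparison ignores everything else that lives on the stack, so a real program needs more headroom than this suggests.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Bytes occupied by an array of n elements of elem_size bytes each. */
size_t array_bytes(size_t n, size_t elem_size)
{
    return n * elem_size;
}

/* Soft stack limit of the current process in bytes (the value that
   `ulimit -s` reports in KiB). Returns 0 if the limit is unlimited
   or if the query fails. */
size_t soft_stack_limit_bytes(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (size_t)rl.rlim_cur;
}

/* 1 if `bytes` is below `limit_bytes`; a limit of 0 means "no limit". */
int fits_under_limit(size_t bytes, size_t limit_bytes)
{
    return limit_bytes == 0 || bytes < limit_bytes;
}
```

For the array in the question, array_bytes(202000, 8) gives 1616000 bytes. That does not fit under a 1 MiB limit (1048576 bytes) but does fit under 1700 KiB (1740800 bytes).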
The stack size limit can be controlled by several mechanisms:
In standard Unix shells the stack size limit is controlled by ulimit -s <stacksize in KiB>. This is also the stack size limit for the main OpenMP thread. The value of this limit is also used by the POSIX threads (pthreads) library as the default thread stack size when creating new threads.
OpenMP supports control over the stack size limit of all additional threads via the environment variable OMP_STACKSIZE. Its value is a number with an optional suffix k/K for KiB, m/M for MiB, or g/G for GiB. This value does not affect the stack size of the main thread.
The GNU OpenMP run-time (libgomp) recognises the non-standard environment variable GOMP_STACKSIZE. If set it overrides the value of OMP_STACKSIZE.
The Intel OpenMP run-time recognises the non-standard environment variable KMP_STACKSIZE. If set it overrides the value of OMP_STACKSIZE and also overrides the value of GOMP_STACKSIZE if the compatibility OpenMP run-time is used (which is the default as currently the only available Intel OpenMP run-time library is the compat one).
If none of the *_STACKSIZE variables are set, the default for Intel OpenMP run-time is 2m on 32-bit architectures and 4m on 64-bit ones.
On Windows, the stack size of the main thread is part of the PE header and is embedded there by the linker. If using Microsoft's LINK to do the linking, the size is specified using the /STACK:reserve[,commit] option. The reserve argument specifies the maximum stack size in bytes while the optional commit argument specifies the initial commit size. Both can be specified as hexadecimal values using the 0x prefix. If re-linking the executable is not an option, the stack size can be modified by editing the PE header with EDITBIN. It takes the same stack-related argument as the linker. Programs compiled with MSVC's whole program optimisation enabled (/GL) cannot be edited.
The GNU linker for Win32 targets supports setting the stack size via the --stack argument. To pass the option directly from GCC, the -Wl,--stack,<size in bytes> can be used.
Note that thread stacks are actually allocated with the size set by *_STACKSIZE (or the default value), unlike the stack of the main thread, which starts small and then grows on demand up to the set limit. So don't set *_STACKSIZE to an arbitrarily large value; otherwise you may hit the process virtual memory size limit.
Here are some examples:
$ ifort -openmp my_module.f90 main.f90
Set the main stack size limit to 1 MiB (the additional OpenMP thread would get 4 MiB as per default):
$ ulimit -s 1024
$ ./a.out
zsh: segmentation fault (core dumped) ./a.out
Set the main stack size limit to 1700 KiB:
$ ulimit -s 1700
$ ./a.out
0.000000000000000E+000
(0.000000000000000E+000,0.000000000000000E+000)
0.000000000000000E+000
(0.000000000000000E+000,0.000000000000000E+000)
Set the main stack size limit to 2 MiB and the stack size of the additional thread to 1 MiB:
$ ulimit -s 2048
$ KMP_STACKSIZE=1m ./a.out
zsh: segmentation fault (core dumped) KMP_STACKSIZE=1m ./a.out
On most Unix systems the stack size limit of the main thread is set by PAM or other login mechanism (see /etc/security/limits.conf). The default on Scientific Linux 6.3 is 10 MiB.
Another possible scenario that can lead to an error is if the virtual address space limit is set too low. For example, if the virtual address space limit is 1 GiB and the thread stack size limit is set to 512 MiB, then the OpenMP run-time would try to allocate 512 MiB for each additional thread. With two threads one would have 1 GiB for the stacks only, and when the space for code, shared libraries, heap, etc. is added up, the virtual memory size would grow beyond 1 GiB and an error would occur:
Set the virtual address space limit to 1 GiB and run with two additional threads with 512 MiB stacks (I have commented out the call to omp_set_num_threads()):
$ ulimit -v 1048576
$ KMP_STACKSIZE=512m OMP_NUM_THREADS=3 ./a.out
OMP: Error #34: System unable to allocate necessary resources for OMP thread:
OMP: System error #11: Resource temporarily unavailable
OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.
forrtl: error (76): Abort trap signal
... trace omitted ...
zsh: abort (core dumped) OMP_NUM_THREADS=3 KMP_STACKSIZE=512m ./a.out
In this case the OpenMP run-time library fails to create a new thread and notifies you before it aborts the program.
The segmentation fault is due to the stack memory limit when using OpenMP. The solutions from the previous answer did not solve the problem for me on Windows. Allocating the array on the heap rather than on the stack seems to work:
integer, parameter :: nmax = 202000
real(dp), dimension(:), allocatable :: e_in
integer i
allocate(e_in(nmax))
e_in = 0
! rest of code
deallocate(e_in)
Plus, this does not involve changing any default environment parameters.
Credit goes to ohm314's solution; see: large array using heap memory allocation

Is there a way to use math expressions in gnu assembly constants?

What is the correct gnu assembly syntax for doing the following:
.section .data2
.asciz "******* Output Data ********"
total_sectors_written: .word 0x0
max_buffer_sectors: .word ((0x9fc00 - $data_buffer) / 512) # <=== need help here
.align 512
data_buffer: .asciz "<The actual data will overwrite this>"
Specifically, I'm writing a toy OS. The code above is in 16-bit real mode. I'm setting up a data buffer that will be dumped back to the boot disk. I want to calculate the number of sectors there are between where data_buffer gets placed in memory, and the upper bound of that data buffer. (Address 0x9fc00 is where the buffer would run into RAM reserved for other purposes.)
I know I could write assembly code to calculate this; but, since it is a constant known at build time, I'm curious if I can get the assembler to calculate it for me.
I'm running into three specific problems:
(1) If I use $data_buffer I get this error:
os_src/boot.S: Assembler messages:
os_src/boot.S:497: Error: missing ')'
os_src/boot.S:497: Error: can't resolve `L0' {*ABS* section} - `$data_buffer' {*UND* section}
which I find confusing, because I should use $ when I want the memory address of a label, correct?
(2) If I use data_buffer instead of $data_buffer, I get this error:
os_src/boot.S: Assembler messages:
os_src/boot.S:497: Error: missing ')'
os_src/boot.S:497: Error: value of 653855 too large for field of 2 bytes at 31
make: *** [obj/boot/dd_test.o] Error 1
which seems to suggest that the assembler is complaining about the size of the intermediate value (which does not need to fit in a 16-bit word).
(3) And, of course, what's up with the missing ')'?
When you use expressions in the GNU assembler, they have to resolve to absolute values. The GNU assembler doesn't know what the origin point of the code will actually be. That is what the linker is for. Because of that, the absolute address of data_buffer isn't known until linking is done, so it is considered relocatable. If you take an absolute value like 0x9fc00 and subtract a relocatable value from it, you get a relocatable value. Relocatable values can't be used in constant (absolute) expressions.
All is not lost. The linker itself will know the absolute address once it arranges everything in memory. You seem to suggest you already use a linker script which means the work you have to do is minimal. You can use the linker to compute the value of max_buffer_sectors.
Your linker script will have a SECTIONS directive like:
SECTIONS
{
[your section contents here]
}
You can create a linker symbol max_buffer_sectors with something like:
SECTIONS
{
max_buffer_sectors = (0x9fc00 - (data_buffer)) / 512;
[your section contents here]
}
This allows the linker to compute the size, since it will know the absolute address of data_buffer in memory.
Your GNU assembly file will need a bit of tweaking:
.globl data_buffer
.section .data2
.asciz "******* Output Data ********"
total_sectors_written: .word 0x0
.align 512
data_buffer: .asciz "<The actual data will overwrite this>"
You'll notice I used .globl data_buffer. This exports the symbol and makes it global so that the linker can use it.
You can then use the symbol max_buffer_sectors in code like:
mov $max_buffer_sectors, %ax
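To sanity-check the value the linker produces, the same arithmetic can be mirrored in C. This is only a sketch: max_buffer_sectors_for is a hypothetical helper, and the clamp to zero for addresses at or above the limit is an added safeguard, not something the linker expression does.

```c
#include <stdint.h>

#define BUFFER_LIMIT 0x9fc00u  /* upper bound of the buffer, from the question */
#define SECTOR_SIZE  512u      /* bytes per sector */

/* Whole sectors that fit between data_buffer_addr and BUFFER_LIMIT.
   Integer division truncates, so a partial trailing sector is not counted.
   Returns 0 when the address is at or beyond the limit (added safeguard). */
uint32_t max_buffer_sectors_for(uint32_t data_buffer_addr)
{
    if (data_buffer_addr >= BUFFER_LIMIT)
        return 0;
    return (BUFFER_LIMIT - data_buffer_addr) / SECTOR_SIZE;
}
```

As an example, if data_buffer happened to land at 0x7e00 (an address chosen purely for illustration), the result would be (0x9fc00 - 0x7e00) / 512 = 1215 sectors, which fits comfortably in a 16-bit word.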

How does the `asm()` function work in the C language?

I am a beginner learning operating system development. I would like to build my system in the real-mode environment, which is a 16-bit environment, using the C language.
In C, I used asm() to switch the generated code to 16 bit as follows:
asm(".code16")
which tells GCC to generate 16-bit executables (though not exactly).
Question:
Suppose I have two header files head1.h and head2.h and a main.c file. The contents of main.c file are as follows:
asm(".code16");
#include<head1.h>
#include<head2.h>
int main(){
return 0;
}
Since I started my code with the directive to generate a 16-bit executable and then included head1.h and head2.h, do I need to do the same in every header file that I create, or is it sufficient to add the line asm(".code16"); once?
OS: Ubuntu
Compiler: Gnu CC
To answer your question: It suffices for the asm block to be present at the beginning of the translation unit.
So putting it once at the beginning will do.
But you can do better: you can avoid it altogether and use the -m16 command line option (available from 5.2.0) instead.
But you can do better: you can avoid it altogether.
The effect of -m16 and .code16 is to make 32-bit code executable in real mode; it is not to produce real-mode code.
Look
16.c
int main()
{
return 4;
}
Extracting the raw .text segment
>gcc -c -m16 16.c
>objcopy -j .text -O binary 16.o 16.bin
>ndisasm 16.bin
we get
00000000 6655 push ebp
00000002 6689E5 mov ebp,esp
00000005 6683E4F0 and esp,byte -0x10
00000009 66E800000000 call dword 0xf
0000000F 66B804000000 mov eax,0x4
00000015 66C9 o32 leave
00000017 66C3 o32 ret
Which is just 32-bit code filled with operand size prefixes.
On a real pre-386 machine this won't work as the 66h opcode is UD.
There are old 16-bit compilers, like Turbo C [1], that handle real-mode applications properly.
Alternatively, switch to protected mode as soon as possible or consider using UEFI.
[1] It is available online. This compiler is as old as me!
You do not need to add asm(".code16") to either head1.h or head2.h.
The main reason is how the C preprocessor works. It pastes the contents of head1.h and head2.h into main.c in place of the #include lines.
Please check How `#include' Works for further information.
Hope it helps!
Best regards,
Miguel Ángel

Finding where <<loop>> happened

If we get a <<loop>>, it means that Haskell has managed to detect an infinite loop. Is there a way to get GHC to tell us where this loop happened? It seems that Haskell should have this information somewhere.
Compile your app with -prof and -fprof-auto (if you're using Cabal, use --enable-executable-profiling and --ghc-options=-fprof-auto) and then run it with +RTS -xc. It'll print a stack trace when errors happen. This should help you narrow down where the loop is.
Example:
➜ haskell cat loop.hs
myFun :: Int
myFun =
  let g = g + 1
  in g + 10
main = print myFun
➜ haskell ghc loop.hs -prof -fprof-auto
[1 of 1] Compiling Main ( loop.hs, loop.o )
Linking loop ...
➜ haskell ./loop +RTS -xc
*** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace:
Main.myFun.g,
called from Main.myFun,
called from Main.CAF
*** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace:
Main.myFun.g,
called from Main.myFun,
called from Main.CAF
loop: <<loop>>
In addition to what has already been written: These loops are only detected at run-time. The detection is based on the code attempting to evaluate a value which is already being evaluated [by the same thread]. Clearly that should never happen.
If you're looking for a compiler switch to detect this at compile-time... you're out of luck. It's easy enough to statically spot recursion, but deciding whether the recursion is infinite or not isn't so easy.

Resources