I have a Fortran program where I specify the kind of the numeric data types in an attempt to retain a minimum level of precision, regardless of what compiler is used to build the program. For example:
integer, parameter :: rsp = selected_real_kind(4)
...
real(kind=rsp) :: real_var
The problem is that I have used MPI to parallelize the code and I need to make sure the MPI communications are specifying the same type with the same precision. I was using the following approach to stay consistent with the approach in my program:
call MPI_Type_create_f90_real(4,MPI_UNDEFINED,rsp_mpi,mpi_err)
...
call MPI_Send(real_var,1,rsp_mpi,dest,tag,MPI_COMM_WORLD,err)
However, I have found that this MPI routine is not particularly well supported across MPI implementations, so it's actually making my program non-portable. If I omit the MPI_Type_create routine, then I'm left to rely on the standard MPI_REAL and MPI_DOUBLE_PRECISION data types, but what if those types are not consistent with what selected_real_kind picks as the real type that will ultimately be passed around by MPI? Am I stuck just using the standard real declaration, with no kind attribute? And if I do that, am I guaranteed that MPI_REAL and real will always have the same precision, regardless of compiler and machine?
UPDATE:
I created a simple program that demonstrates the issue I see when my internal reals have higher precision than what is afforded by the MPI_DOUBLE_PRECISION type:
program main
   use mpi
   implicit none
   integer, parameter :: rsp = selected_real_kind(16)
   integer :: err
   integer :: rank
   real(rsp) :: real_var

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD,rank,err)
   if (rank.eq.0) then
      real_var = 1.123456789012345
      call MPI_Send(real_var,1,MPI_DOUBLE_PRECISION,1,5,MPI_COMM_WORLD,err)
   else
      call MPI_Recv(real_var,1,MPI_DOUBLE_PRECISION,0,5,MPI_COMM_WORLD,&
                    MPI_STATUS_IGNORE,err)
   end if
   print *, rank, real_var
   call MPI_Finalize(err)
end program main
If I build and run with 2 cores, I get:
0 1.12345683574676513672
1 4.71241976735884452383E-3998
Now change the 16 to a 15 in selected_real_kind and I get:
0 1.1234568357467651
1 1.1234568357467651
Is it always going to be safe to use selected_real_kind(15) with MPI_DOUBLE_PRECISION no matter what machine/compiler is used to do the build?
Use the Fortran 2008 intrinsic STORAGE_SIZE to determine the number of bytes that each number requires and send the data as bytes. Note that STORAGE_SIZE returns the size in bits, so you will need to divide by 8 to get the size in bytes.
This solution works for moving data but does not help you use reductions. For that you will have to implement a user-defined reduction operation. If that's important to you, I will update my answer with the details.
For example:
program main
   use mpi
   implicit none
   integer, parameter :: rsp = selected_real_kind(16)
   integer :: err
   integer :: rank
   real(rsp) :: real_var

   call MPI_Init(err)
   call MPI_Comm_rank(MPI_COMM_WORLD,rank,err)
   if (rank.eq.0) then
      real_var = 1.123456789012345
      call MPI_Send(real_var,storage_size(real_var)/8,MPI_BYTE,1,5,MPI_COMM_WORLD,err)
   else
      call MPI_Recv(real_var,storage_size(real_var)/8,MPI_BYTE,0,5,MPI_COMM_WORLD,&
                    MPI_STATUS_IGNORE,err)
   end if
   print *, rank, real_var
   call MPI_Finalize(err)
end program main
I confirmed that this change corrects the problem and the output I see is:
0 1.12345683574676513672
1 1.12345683574676513672
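As for the user-defined reduction operation mentioned above, a rough sketch of how one might pair it with the MPI_BYTE approach could look like the following (byte_sum, local_var and global_var are placeholder names, and rsp is assumed to match the kind used for the data being reduced):

subroutine byte_sum(invec, inoutvec, len, datatype)
   ! User-defined MPI reduction: the buffers arrive as raw bytes because the
   ! reduction is posted with MPI_BYTE, so reinterpret them as reals and
   ! convert the byte count back into a count of reals.
   implicit none
   integer, parameter :: rsp = selected_real_kind(16)
   integer :: len, datatype
   real(rsp) :: invec(*), inoutvec(*)
   integer :: i, nreals
   nreals = len / (storage_size(1.0_rsp) / 8)
   do i = 1, nreals
      inoutvec(i) = inoutvec(i) + invec(i)
   end do
end subroutine byte_sum

It would then be registered and used along these lines:

external :: byte_sum
integer :: byte_sum_op
call MPI_Op_create(byte_sum, .true., byte_sum_op, err)
call MPI_Allreduce(local_var, global_var, storage_size(local_var)/8, MPI_BYTE, &
                   byte_sum_op, MPI_COMM_WORLD, err)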
Not really an answer, but we have the same problem and use something like this:
!> Number of digits for single precision numbers
integer, parameter, public :: single_prec = 6
!> Number of digits for double precision numbers
integer, parameter, public :: double_prec = 15
!> Number of digits for extended double precision numbers
integer, parameter, public :: xdble_prec = 18
!> Number of digits for quadruple precision numbers
integer, parameter, public :: quad_prec = 33
integer, parameter, public :: rk_prec = double_prec
!> The kind to select for default reals
integer, parameter, public :: rk = selected_real_kind(rk_prec)
And then have an initialization routine where we do:
!call mpi_type_create_f90_real(rk_prec, MPI_UNDEFINED, rk_mpi, iError)
!call mpi_type_create_f90_integer(long_prec, long_k_mpi, iError)
! Workaround for MPI implementations with broken support for the routines above.
select case(rk_prec)
case(single_prec)
   rk_mpi = MPI_REAL
case(double_prec)
   rk_mpi = MPI_DOUBLE_PRECISION
case(quad_prec)
   rk_mpi = MPI_REAL16
case default
   write(*,*) 'unknown real type specified for mpi_type creation'
end select
long_k_mpi = MPI_INTEGER8
While this is not nice, it works reasonably well, and seems to be usable on Cray, IBM BlueGene and conventional Linux Clusters.
The best thing to do is to push sites and vendors to properly support this in MPI. As far as I know, it has been fixed in OpenMPI and is planned to be fixed in MPICH by 3.1.1. See OpenMPI Tickets 3432 and 3435 as well as MPICH Tickets 1769 and 1770.
How about:
integer, parameter :: DOUBLE_PREC = kind(0.0d0)
integer, parameter :: SINGLE_PREC = kind(0.0e0)
integer, parameter :: MYREAL = DOUBLE_PREC
if (MYREAL .eq. DOUBLE_PREC) then
   MPIREAL = MPI_DOUBLE_PRECISION
else if (MYREAL .eq. SINGLE_PREC) then
   MPIREAL = MPI_REAL
else
   print *, "Error: Can't figure out MPI precision."
   STOP
end if
and use MPIREAL instead of MPI_DOUBLE_PRECISION from then on.
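For instance, a send would then look something like this (a sketch reusing the variable names from the question at the top):

call MPI_Send(real_var, 1, MPIREAL, dest, tag, MPI_COMM_WORLD, err)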
I have this line in Fortran and I'm getting the compiler error in the title. dFeV is a 1-d array of reals.
dFeV(x)=R1*5**(15) * (a**2) * EXP(-(VmigFe)/kbt)
For the record, the variable names are inherited and not my fault. I think this is an issue with not having the memory space to compute the value on the right before I store it on the left as a real (which would have enough room), but I don't know how to allocate more space for that computation.
The problem arises as one part of your computation is done using integer arithmetic of type integer(4).
That type has an upper limit of 2^31-1 = 2147483647, whereas your intermediate result 5^15 = 30517578125 is larger (thanks to #evets' comment).
As pointed out in your question: you save the result in a real variable.
Therefore, you could just compute that exponentiation using real data types: 5.0**15.
Your formula will end up like the following
dFeV(x)= R1 * (5.0**15) * (a**2) * exp(-(VmigFe)/kbt)
Note that integer(4) need not be the same implementation for every processor (thanks #IanBush), which just means that on some specific machines the upper limit might differ from 2^31-1 = 2147483647.
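To see the actual limits on a given compiler, one can query them with the standard intrinsic huge; a small sketch (int32 and int64 come from the iso_fortran_env module):

program int_limits
   use iso_fortran_env, only: int32, int64
   implicit none
   print *, huge(0_int32)   ! typically 2147483647
   print *, huge(0_int64)   ! typically 9223372036854775807
end program int_limits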
As indicated in the comment, the value of 5**15 exceeds the range of 4-byte signed integers, which are the typical default integer type. So you need to instruct the compiler to use a larger type for these constants. This program example shows one method. The ISO_FORTRAN_ENV module provides the int64 type. UPDATE: corrected to what I meant, as pointed out in comments.
program test_program
   use ISO_FORTRAN_ENV
   implicit none
   integer (int64) :: i
   i = 5_int64 ** 15_int64
   write (*, *) i
end program
Although there does seem to be an additional point here that may be specific to gfortran:
integer(kind = 8) :: result
result = 5**15
print *, result
gives: Error: Result of exponentiation at (1) exceeds the range of INTEGER(4)
while
integer(kind = 8) :: result
result = 5**7 * 5**8
print *, result
gives: 30517578125
i.e. the constants 5 and 15 are default (4-byte) integers, so the exponentiation itself is evaluated in integer(4) arithmetic and is subject to that limit even if the variable to which the answer is being assigned has a larger capacity.
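One way around this, sketched below on the same snippet, is to give the base constant a wider kind so the power itself is computed in 8-byte arithmetic (assuming the compiler supports kind=8 integers):

integer(kind = 8) :: result
result = 5_8**15   ! the base has kind 8, so the exponentiation is done in integer(8)
print *, result    ! 30517578125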
Here is my program below; I use implicit none because I would like to eventually use these functions in a bigger program with more variables.
program testdrandm
   implicit none
   real, external :: drand, drandm, rand
   print *, 'drand', drand(0), drand(0)
   print *, 'drandm', drandm(0), drandm(0)
   print *, 'rand', rand(0), rand(0)
end program testdrandm
here is my output:
drand 4.3290930E-39 -686.1465
drandm -8.9381798E+10 1.7946890E+19
rand 0.9679557 0.1896898
The first number is within range but extremely small and will give me zero values when I use it to multiply other values. Rand works but I would like to use drandm. I would like to get random numbers between 0 to 1. Please let me know if I am using this function incorrectly.
You should use the intrinsic random_seed and random_number to generate random numbers in Fortran. The intrinsic random_number will give you real number(s) between 0 and 1.
See e.g.:
https://gcc.gnu.org/onlinedocs/gfortran/RANDOM_005fSEED.html#RANDOM_005fSEED
https://gcc.gnu.org/onlinedocs/gfortran/RANDOM_005fNUMBER.html#RANDOM_005fNUMBER
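A minimal sketch of the intrinsic approach (standard Fortran, no external functions needed):

program random_demo
   implicit none
   real :: r, arr(5)
   call random_seed()        ! initialise the generator (seeding is processor dependent)
   call random_number(r)     ! one value in [0,1)
   call random_number(arr)   ! fill a whole array with values in [0,1)
   print *, r
   print *, arr
end program random_demo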
#tim18 touched on the answer, but you might not have picked up on exactly WHY you got these results.
I modified the program to print the hex representation of the values returned. When run using ifort, I get:
drand 00000000002F23C0 00000000C42B8960
drandm 00000000D1A67C90 000000005F791029
rand 000000003F77CBF2 000000003E423E09
IEEE double precision is an 8-byte format, and drand/drandm return 8 bytes, but you declared them as real (single precision), so you get only the low 4 bytes and NOT a conversion. Because the size of the exponent field is different between these types (8 bits vs. 11 bits), interpreting the low 4 bytes of a double as a real will get you wrong values.
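Concretely, the fix is to declare these functions with the kind they actually return; a minimal sketch (drand and drandm are compiler extensions, so the interfaces assumed here may vary):

program testdrandm_fixed
   implicit none
   double precision, external :: drand, drandm   ! declare with the kind they actually return
   real, external :: rand
   print *, 'drand ', drand(0), drand(0)
   print *, 'drandm', drandm(0), drandm(0)
   print *, 'rand  ', rand(0), rand(0)
end program testdrandm_fixed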
Now see what happens if I declare drand and drandm as double precision:
drand 3EF791E0002F23C0 3FB5C4AFC42B8960
drandm 3FE33E47D1A67C90 3FEC88145F791029
rand 000000003F77CBF2 000000003E423E09
or if I go back to list-directed:
drand 2.247793601009899E-005 8.503244914348818E-002
drandm 0.601352605317418 0.891611277075303
rand 0.9679557 0.1896898
Better?
That said, I wholly agree with those who suggest using RANDOM_NUMBER instead. You would not have seen this sort of problem if you used the intrinsic procedure.
I'm running a Fortran code which performs a stochastic simulation of a marked Poisson cluster process. In practice, event properties (e.g. times of occurrence) are generated by the inversion method, i.e. by random sampling of the cumulative distribution function.
Because of the Poissonian randomness, I expect each generated sequence to be different, but this is not the case. I guess the reason is that the seed for the pseudorandom number generator is the same at each simulation.
I do not know Fortran, so I have no idea how to solve this issue. Here is the part of the code involved with the pseudorandom number generator, any idea?
      subroutine pseud0(r)
c     generation of pseudo-random numbers
c     data ir/584287/
      data ir/574289/
      ir=ir*48828125
      if(ir) 10,20,20
   10 ir=(ir+2147483647)+1
   20 r=float(ir)*0.4656613e-9
      return
      end

      subroutine pseudo(random)
c     wichmann+hill (1982) Appl. Statist 31
      data ix,iy,iz /1992,1111,1151/
      ix=171*mod(ix,177)-2*(ix/177)
      iy=172*mod(iy,176)-35*(iy/176)
      iz=170*mod(iz,178)-63*(iz/178)
      if (ix.lt.0) ix=ix+30269
      if (iy.lt.0) iy=iy+30307
      if (iz.lt.0) iz=iz+30323
      random=mod(float(ix)/30269.0+float(iy)/30307.0+
     &          float(iz)/30323.0,1.0)
      return
      end
First, I would review the modern literature on PRNGs and pick a modern implementation. Second, I would rewrite the code in modern Fortran.
You need to follow #francescalus's advice and have a method for updating the seed. Without attempting to modernize your code, here is one method for the pseud0 PRNG:
subroutine init0(i)
   integer, intent(in) :: i
   common /myseed0/iseed
   iseed = i
end subroutine init0

subroutine pseud0(r)
   common /myseed0/ir
   ir = ir * 48828125
   if (ir) 10, 20, 20
10 ir = (ir+2147483647)+1
20 r = ir*0.4656613e-9
end subroutine pseud0

program foo
   integer i
   real r1
   call init0(574289) ! Original seed
   do i = 1, 10
      call pseud0(r1)
      print *, r1
   end do
   print *
   call init0(289574) ! New seed
   do i = 1, 10
      call pseud0(r1)
      print *, r1
   end do
   print *
end program foo
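Along the lines of the first suggestion (a modern rewrite), a minimal sketch using the intrinsic generator with an explicit, repeatable seed might look like this (the seed values are arbitrary):

program modern_prng
   implicit none
   integer :: n, i
   integer, allocatable :: seed(:)
   real :: r
   call random_seed(size=n)              ! how many integers this compiler's seed needs
   allocate(seed(n))
   seed = [(574289 + 37*i, i = 1, n)]    ! deterministic fill; change it to get a new stream
   call random_seed(put=seed)
   do i = 1, 5
      call random_number(r)
      print *, r
   end do
end program modern_prng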
I'm writing a checkpoint function in my Monte Carlo simulation in Fortran 90/95; the compiler I'm using is ifort 18.0.2. Before going through the details, just to clarify the version of the pseudo-random generator I'm using:
A C-program for MT19937, with initialization, improved 2002/1/26.
Coded by Takuji Nishimura and Makoto Matsumoto.
Code converted to Fortran 95 by José Rui Faustino de Sousa
Date: 2002-02-01
See mt19937 for the source code.
The general structure of my Monte Carlo simulation code is given below:
program montecarlo
call read_iseed(...)
call mc_subroutine(...)
end
Within the read_iseed subroutine:
subroutine read_iseed(...)
   use mt19937
   if (Restart == 'n') then
      call system('od -vAn -N4 -td4 < /dev/urandom > '//trim(IN_ISEED))
      open(unit=7,file=trim(IN_ISEED),status='old')
      read(7,*) i
      close(7)
      !This is only used to initialise the PRNG sequence
      iseed = abs(i)
   else if (Restart == 'y') then
      !Taking seed value from the latest iteration of previous simulation
      iseed = RestartSeed
   endif
   call init_genrand(iseed)
   print *, 'first pseudo-random value ',genrand_real3(), 'iseed ',iseed
   return
end subroutine
Based on my understanding, if the seed value is held constant, the PRNG should be able to reproduce the same pseudo-random sequence every time?
In order to prove this is the case, I ran two individual simulations using the same seed value, and they reproduced the exact same sequence. So far so good!
Based on the previous test, I'd further assume that regardless of the number of times init_genrand() is called within one individual simulation, the PRNG should also be able to reproduce the pseudo-random value sequence? So I made a small modification to my read_iseed() subroutine:
subroutine read_iseed(...)
   use mt19937
   if (Restart == 'n') then
      call system('od -vAn -N4 -td4 < /dev/urandom > '//trim(IN_ISEED))
      open(unit=7,file=trim(IN_ISEED),status='old')
      read(7,*) i
      close(7)
      !This is only used to initialise the PRNG sequence
      iseed = abs(i)
   else if (Restart == 'y') then
      !Taking seed value from the latest iteration of the previous simulation
      iseed = RestartSeed
   endif
   call init_genrand(iseed)
   print *, 'first time initialisation ',genrand_real3(), 'iseed ',iseed
   call init_genrand(iseed)
   print *, 'second time initialisation ',genrand_real3(), 'iseed ',iseed
   return
end subroutine
The output is surprisingly not what I thought it would be: the iseed outputs are identical between the two initializations, but the genrand_real3() outputs are not.
Because of this unexpected result, I struggled with resuming the simulation at an arbitrary state of the system since the simulation is not reproducing the latest configuration state of the system I'm simulating.
I'm not sure if I've provided enough information, please let me know if any part of this question needs to be more specific?
From the source code you've provided (see mt19937ar.f90 at http://web.mst.edu/~vojtat/class_5403/mt19937/mt19937ar.f90), init_genrand does not clear the whole state.
There are 3 critical state variables:
integer( kind = wi ) :: mt(n) ! the array for the state vector
logical( kind = wi ) :: mtinit = .false._wi ! means mt[N] is not initialized
integer( kind = wi ) :: mti = n + 1_wi ! mti==N+1 means mt[N] is not initialized
The first one is the "array for the state vector", the second one is a flag that ensures we don't start with an uninitialized array, and the third one is a position marker, as I guess from the condition stated in the comment.
Looking at subroutine init_genrand( s ), it sets the mtinit flag and fills the mt() array from 1 up to n. Alright.
Looking at genrand_real3 it's based on genrand_int32.
Looking at genrand_int32, it starts up with
if ( mti > n ) then ! generate N words at one time
! if init_genrand() has not been called, a default initial seed is used
if ( .not. mtinit ) call init_genrand( seed_d )
and does its arithmetic magic and then starts getting the result:
y = mt(mti)
mti = mti + 1_wi
So mti is a positional index into the 'state array', and it is incremented by 1 after each integer read from the generator.
Back to init_genrand - remember? It resets the array mt(), but it does not reset mti back to its starting value mti = n + 1_wi.
I bet this is the cause of the phenomenon you've observed: after re-initializing with the same seed, the array is filled with the same set of values, but the int32 generator then reads from a different starting point. I doubt it was intended, so it's probably a tiny bug that's easy to overlook.
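A sketch of the kind of one-line fix this points to, assuming the module layout described above (the exact placement inside the linked source is an assumption, not a verified patch):

! inside init_genrand( s ), after mt(1:n) has been filled and mtinit has been set:
mti = n + 1_wi   ! reset the read position so genrand_int32 regenerates the block from the new state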
Suppose that k processes compute the elements of a matrix A, whose dimension is (n,m), where n is the number of rows and m is the number of columns. I am trying to use MPI_GATHER to gather these local matrices into the matrix B at the root process, where the dimension of B is (n,km). To be more specific, I wrote the example Fortran code below. Here, I am passing over the columns of the matrix A (not the entire matrix) to the matrix B, but this doesn't work. When I run the executable using mpirun -n 2 a.out, I get the error:
malloc: *** error for object 0x7ffa89413fb8: incorrect checksum for freed object - object was probably modified after being freed.
1) Why do I get this error message?
2) Can someone please explain conceptually why I have to use MPI_TYPE_VECTOR?
3) How should I correct the MPI_GATHER part of the code? Can I pass over the entire matrix A?
PROGRAM test
   IMPLICIT NONE
   INCLUDE "mpif.h"
   INTEGER, PARAMETER :: n=100, m=100
   INTEGER, ALLOCATABLE, DIMENSION(:,:) :: A
   INTEGER, DIMENSION(n,m) :: B
   INTEGER :: ind_a, ind_c
   INTEGER :: NUM_PROC, PROC_ID, IERROR, MASTER_ID=0
   INTEGER :: c
   INTEGER, DIMENSION(m) :: cvec

   CALL MPI_INIT(IERROR)
   CALL MPI_COMM_RANK(MPI_COMM_WORLD, PROC_ID, IERROR)
   CALL MPI_COMM_SIZE(MPI_COMM_WORLD, NUM_PROC, IERROR)

   ALLOCATE(A(n,m/NUM_PROC))

   DO ind_c=1,m
      cvec(ind_c)=ind_c
   END DO

   ! Fill in matrix A
   DO ind_a=1,n
      DO ind_c=1,m/NUM_PROC
         c=cvec(ind_c+PROC_ID*m/NUM_PROC)
         A(ind_a,ind_c)=c*ind_a
      END DO
   END DO

   ! Gather the elements at the root process
   DO ind_a=1,n
      CALL MPI_GATHER(A(ind_a,:),m/NUM_PROC,MPI_INTEGER, &
                      B(ind_a,PROC_ID*m/NUM_PROC+1:(PROC_ID+1)*m/NUM_PROC), &
                      m/NUM_PROC,MPI_INTEGER,MASTER_ID,MPI_COMM_WORLD,IERROR)
   END DO

   CALL MPI_FINALIZE(IERROR)
END PROGRAM
There are two types of gather operation that can be performed on a 2-dimensional array:
1. gathering the elements from dimension-2 of all the processes and collecting them in dimension-2 of one process; and
2. gathering the elements from dimension-2 of all the processes and collecting them in dimension-1 of one process.
That said, in this example:
n = dimension-1 and m = dimension-2, and we know that Fortran is column major. Hence, dimension-1 is contiguous in memory in Fortran.
In your gather statement you are trying to gather dimension-2 of array A from all the processes and collect it into dimension-2 of array B on the MASTER_ID proc (type 1 above). Since dimension-2 is non-contiguous in memory, this causes the memory corruption you see.
A single MPI_Gather call, as shown below, will achieve the required operation without the looping trick used above:
CALL MPI_GATHER(A, n*(m/NUM_PROC), MPI_INTEGER, &
B, n*(m/NUM_PROC), MPI_INTEGER, MASTER_ID, &
MPI_COMM_WORLD, IERROR)
But if you are attempting to gather elements from dimension-2 of array A from all the processes into dimension-1 of array B on the MASTER_ID proc, that is when we need to make use of MPI_TYPE_VECTOR, which creates a new type out of the non-contiguous elements. Let me know if that is the intention.
That said, the current code logic doesn't look like it needs MPI_TYPE_VECTOR; a sketch is included below only for completeness.
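For completeness, a rough sketch of how such a vector type could describe one row of A, whose m/NUM_PROC elements are spaced n apart in memory because Fortran stores columns contiguously (rowtype is a placeholder name):

INTEGER :: rowtype
CALL MPI_TYPE_VECTOR(m/NUM_PROC, 1, n, MPI_INTEGER, rowtype, IERROR)
CALL MPI_TYPE_COMMIT(rowtype, IERROR)
! rowtype now describes the non-contiguous elements A(ind_a, 1:m/NUM_PROC);
! using it in collectives typically also requires adjusting its extent with
! MPI_TYPE_CREATE_RESIZED so consecutive rows start at the right offsets.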