After a search without any result I want to ask you a question regarding subroutines containing optional arguments and how they are treated by the compiler (run time/compile time). Consider the following example program.
module CONTAINS_ABC
contains
subroutine ABC( a, b, c)
implicit none
real, intent(in) :: a
real, intent(in), optional :: b, c
if( present(b) )then ! PATH_B
write(*,*) 'Provide c'
else if( present(c) )then "! PATH_C
write(*,*) 'Provide b'
end if
! Do all the computations with a, b and c
end subroutine ABC
end module CONTAINS_ABC
program CALL_ABC
use CONTAINS_ABC
real :: aa, bb, cc
call ABC( a = aa, b=bb )
call ABC( a = aa, c=cc )
end program CALL_ABC
I wonder how the compiler treats subroutines with optional arguments in terms of optimization. Does the compiler produce implicitly two interfaces for the subroutine FCN and chooses then the right one during compile time in the main program? Furthermore, is the present(b)/present(c) statement evaluated at runtime or during compile time?
In case I understand things right, the compiler might know that the first call to ABC leads to the path B, while the second call to ABC must lead to path C.
I ask this question, since I have a subroutine that is called million-million of times.
I want to avoid that during runtime the decision is made wether to go along path B or path C. It would be possible of course two write simply two subroutines, however, this would produce a lot of additional lines that would actually do the same thing.
Thank you all in advance for your help!
The Fortran standards are silent on pretty much every aspect of the implementation of the language other than the syntax and semantics. For example, people often think that Fortran passes arguments by reference but the standard only requires that programs behave as if arguments are passed by reference. Implementers are free to use pass-by-pixies if they wish and if they can thereby produce a convincing simulation of pass by reference.
That's just a long-winded introduction to telling you that how a particular compiler implements a language feature is, in general, compiler-specific. Implementation details are also know to vary from version to version of the same compiler.
I think that you are wrong to think that the presence of an optional argument will be checked at compile-time even if it is provable that it could be -- I just don't think that that is something that the current crop of compilers do. I expect the compiler to generate a single implementation of the subroutine with optional arguments. It does, after all, rest with the programmer to ensure that a procedure does not attempt to process absent optional arguments.
If you, the programmer, know that a procedure will be executed many times and if you suspect that different implementations, one with argument b and without c, one vice-versa, then it's on you to determine if separate implementations will be faster than one implementation with optional arguments. If you are concerned about code duplication you could always have yet a third procedure to implement the common code, and call it from both variants.
Equally, it is on you to inspect the assembler generated by your compiler to see just what your compiler (version) does with the variations in code that you write.
I will add an example. This is what gfortran-4.8 does with your code with -Ofast (-fdump-tree-optimized). There is only one copy of the subroutine and the if (present()) calls are changed to if (b_1(D) != 0B). Maybe some smaller subroutine would be inlined. Then I would expect such an optimization to kick in, but not here.
;; Function abc (__contains_abc_MOD_abc, funcdef_no=0, decl_uid=1875, cgraph_uid=0)
abc (real(kind=4) & restrict a, real(kind=4) * b, real(kind=4) * c)
{
struct __st_parameter_dt dt_parm.1;
struct __st_parameter_dt dt_parm.0;
<bb 2>:
if (b_1(D) != 0B)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
dt_parm.0.common.filename = &"opt.f90"[1]{lb: 1 sz: 1};
dt_parm.0.common.line = 13;
dt_parm.0.common.flags = 128;
dt_parm.0.common.unit = 6;
_gfortran_st_write (&dt_parm.0);
_gfortran_transfer_character_write (&dt_parm.0, &"Provide c"[1]{lb: 1 sz: 1}, 9);
_gfortran_st_write_done (&dt_parm.0);
dt_parm.0 ={v} {CLOBBER};
goto <bb 6>;
<bb 4>:
if (c_11(D) != 0B)
goto <bb 5>;
else
goto <bb 6>;
<bb 5>:
dt_parm.1.common.filename = &"opt.f90"[1]{lb: 1 sz: 1};
dt_parm.1.common.line = 15;
dt_parm.1.common.flags = 128;
dt_parm.1.common.unit = 6;
_gfortran_st_write (&dt_parm.1);
_gfortran_transfer_character_write (&dt_parm.1, &"Provide b"[1]{lb: 1 sz: 1}, 9);
_gfortran_st_write_done (&dt_parm.1);
dt_parm.1 ={v} {CLOBBER};
<bb 6>:
return;
}
;; Function main (main, funcdef_no=2, decl_uid=1889, cgraph_uid=2) (executed once)
main (integer(kind=4) argc, character(kind=1) * * argv)
{
real(kind=4) aa;
real(kind=4) bb;
real(kind=4) cc;
static integer(kind=4) options.2[7] = {68, 1023, 0, 0, 1, 1, 0};
<bb 2>:
_gfortran_set_args (argc_2(D), argv_3(D));
_gfortran_set_options (7, &options.2[0]);
abc (&aa, &bb, 0B);
abc (&aa, 0B, &cc);
aa ={v} {CLOBBER};
bb ={v} {CLOBBER};
cc ={v} {CLOBBER};
return 0;
}
I checked this is also the case for the final assembly, but it is much more clearly visible here in GIMPLE.
Related
I am new to OpenMP. I want to solve a stiff ODE system for a range of parameter values using parallel do loops. I use the following code in Fortran given below. However, I do not know whether calling a stiff solver(as a subroutine) inside a parallel do loop is allowed or not? Also, I want to write the time series data into files with filenames such as "r_value_s__value.txt" in the subroutine before the return to the main program. Can anyone help. Below is the code and the error. I used gfortran with flags -fopenmp to compile.
PROGRAM OPENMP_PARALLEL_STIFF
USE omp_lib
IMPLICIT NONE
INTEGER :: I, J
INTEGER, PARAMETER :: RTOT=10, STOT=15
INTEGER :: TID
INTEGER, PARAMETER :: NUM_THREADS=8
DOUBLE PRECISION :: T_INITIAL, T_FINAL
CALL OMP_SET_NUM_THREADS(NUM_THREADS)
CALL CPU_TIME(T_INITIAL)
PRINT*, "TIME INITIAL ",T_INITIAL
!$OMP PARALLEL DO PRIVATE(I,J,TID)
DO I=1,RTOT
DO J=1,STOT
TID=OMP_GET_THREAD_NUM()
CALL STIFF_DRIVER(TID,I,J,RTOT,STOT)
END DO
END DO
!$OMP END PARALLEL DO
CALL CPU_TIME(T_FINAL)
PRINT*, "TIME FINAL ",T_FINAL
PRINT*, "TIME ELAPSED ",(T_FINAL-T_INITIAL)/NUM_THREADS
END PROGRAM OPENMP_PARALLEL_STIFF
SUBROUTINE STIFF_DRIVER(TID,II,JJ,RTOT,STOT)
USE USEFUL_PARAMETERS_N_FUNC
USE DVODE_F90_M
! Type declarations:
IMPLICIT NONE
! Number of odes for the problem:
INTEGER :: SERIAL_NUMBER, TID
INTEGER :: II, JJ, RTOT, STOT, IND
INTEGER :: J, NTOUT
INTEGER :: ITASK, ISTATE, ISTATS, I
! parameters : declaration
DOUBLE PRECISION, PARAMETER :: s0=0.450D0, dr=1.0D-4, ds=1.0D-2
DOUBLE PRECISION, DIMENSION(NEQ) :: Y, YOUT
DOUBLE PRECISION :: ATOL, RTOL, RSTATS, T, TOUT, EPS, TFINAL, DELTAT
DIMENSION :: RSTATS(22), ISTATS(31)
DOUBLE PRECISION :: bb, cc, ba, ba1, eta
CHARACTER(len=45) :: filename
TYPE (VODE_OPTS) :: OPTIONS
SERIAL_NUMBER=3011+II+(JJ-1)*RTOT
IND=TID+3011+II+(JJ-1)*RTOT
WRITE (*,12)SERIAL_NUMBER,TID
12 FORMAT ("SL. NO. ",I5," THREAD NO.",I3)
r=(II-1)*dr
s=s0+JJ*ds
EPS = 1.0D-9
! Open the output file:
WRITE (filename,93)r,s
93 FORMAT ("r_",f6.4,"_s_",f4.2,".txt")
OPEN (UNIT=IND,FILE=filename,STATUS='UNKNOWN',ACTION='WRITE')
! Parameters for the stiff ODE system
q0 = 0.60D0; v = 3.0D0
Va = 20.0D-4; Vs = 1.0D-1
e1 = 1.0D-1; e2 = 1.10D-5; e3 = 2.3D-3; e4=3.0D-4
del = 1.7D-4; mu = 5.9D-4
al = 1.70D-4; be = 8.9D-4; ga = 2.5D-1
! S and r dependent parameters
e1s = e1/s; e2s = e2/(s**2); e3s = e3/s; e4s = e4/s
dels = del*s; rs = r*s
e1v = e1/v; e2v = e2/(v**2); e3v = e3/v; e4v = e4/v
delv = del*v; rv = r*v
! SET INITIAL PARAMETERS for INTEGRATION ROUTINES
T = 0.0D0
TFINAL = 200.0D0
DELTAT = 0.10D0
NTOUT = INT(TFINAL/DELTAT)
RTOL = EPS
ATOL = EPS
ITASK = 1
ISTATE = 1
! Set the initial conditions: USING MODULE USEFUL_PARAMETERS_N_FUNC
CALL Y_INITIAL(NEQ,Y)
! Set the VODE_F90 options:
OPTIONS = SET_OPTS(DENSE_J=.TRUE.,USER_SUPPLIED_JACOBIAN=.FALSE., &
RELERR=RTOL,ABSERR=ATOL,MXSTEP=100000)
! Integration:
DO I=1,NTOUT
TOUT = (I-1)*DELTAT
CALL DVODE_F90(F_FUNC,NEQ,Y,T,TOUT,ITASK,ISTATE,OPTIONS)
! Stop the integration in case of an error
IF (ISTATE<0) THEN
WRITE (*,*)"ISTATE ", ISTATE
STOP
END IF
! WRITE DATA TO FILE
WRITE (IND,*) TOUT,T, Y(NEQ-2)
END DO
CLOSE(UNIT=IND)
RETURN
END SUBROUTINE STIFF_DRIVER
At line ** of file openmp_parallel_stiff.f90 (unit = 3013)
Fortran runtime error: File already opened in another unit
The issue is the format that you chose: f6.4 for r will overflow for r>=10. Then, the output will be six asterisks ****** (depending on the compiler) for all values of r>=10 on all threads. The same holds true for s.
I would suggest to either limit/check the range of these values or extend the format to honor more digits.
As #francescalus mentioned, another possibility is hit a combination of II and JJ where r and s are identical.
Just for the fun of it - let's do the math:
r=(II-1)*dr
s=s0+JJ*ds
From r=s follows
(II-1)*dr = s0+JJ*ds
or
II = 1 + s0/dr + JJ*ds/dr
Using the constants s0=0.450D0, dr=1.0D-4, ds=1.0D-2 yields
II = 4501 + JJ*10
So, whenever this combination is true for two (or more) threads at a time, you run into the observed issue.
Simple solution for this case: add the thread number to the file name.
I took an example of data transfer between Host and Device for CUDA Fortran and found this:
Host Code:
program incTest
use cudafor
use simpleOps_m
implicit none
integer, parameter :: n = 256
integer :: a(n), b, i
integer, device :: a_d(n)
a = 1
b = 3
a_d = a
call inc<<<1,n>>>(a_d, b)
a = a_d
if (all(a == 4)) then
write(*,*) 'Success'
endif
end program incTest
Device Code:
module simpleOps_m
contains
attributes(global) subroutine inc(a, b)
implicit none
integer :: a(:)
integer, value :: b
integer :: i
i = threadIdx%x
a(i) = a(i)+b
end subroutine inc
end module simpleOps_m
The expected outcome is the console presenting "Success", but this did not happen. Nothing appears in the screen, nothing errors or messages.
This happen because don't enter in if, because a_d has the same value that before call inc subroutine.
I'm using:
OS: Linux - Ubuntu 16
Cuda 8
PGI to compile
Commands to compile:
pgf90 -Mcuda -c Device.cuf
pgf90 -Mcuda -c Host.cuf
pgf90 -Mcuda -o HostDevice Device.o Host.o
./HostDevice
I tried other examples and they did not work too.
I tried using simple Fortran (.f90) code with the same commands to compile and it works!
How can I fix this problem?
What type of device are you using? (If you don't know, post the output from the "pgaccelinfo" utility).
My best guess is that you have a Pascal based device in which case you need to compile with "-Mcuda=cc60".
For example, if I add error checking to the example code, we see that we get an invalid device kernel error when running on a Pascal without the "cc60" as part of the compilation.
% cat test.cuf
module simpleOps_m
contains
attributes(global) subroutine inc(a, b)
implicit none
integer :: a(:)
integer, value :: b
integer :: i
i = threadIdx%x
a(i) = a(i)+b
end subroutine inc
end module simpleOps_m
program incTest
use cudafor
use simpleOps_m
implicit none
integer, parameter :: n = 256
integer :: a(n), b, i, istat
integer, device :: a_d(n)
a = 1
b = 3
a_d = a
call inc<<<1,n>>>(a_d, b)
istat=cudaDeviceSynchronize()
istat=cudaGetLastError()
a = a_d
if (all(a == 4)) then
write(*,*) 'Success'
else
write(*,*) 'Error code:', cudaGetErrorString(istat)
endif
end program incTest
% pgf90 test.cuf -Mcuda
% a.out
Error code:
invalid device function
% pgf90 test.cuf -Mcuda=cc60
% a.out
Success
I am trying out various ways of IPC to do the following:
Master starts.
Master starts a slave.
Master passes an array to slave.
Slave processes the array.
Slave sends the array back to master.
I have tried using OpenMPI to solve this by having the parent process spawn a child which in turn does the aforementioned processing. However, I have also tried - what I thought would be the worst possible way to do this - letting master write the data to a file and have slave read and write back to that file. The result is stunning.
Below is the two ways in which I achieve this. The first way is the "file" way, the second one is by using OpenMPI.
Master.f90
program master
implicit none
integer*4, dimension (10000) :: matrix
integer :: length, i, exitstatus, cmdstatus
logical :: waistatus
! put integers in matrix and output data into a file
open(1, file='matrixdata.dat', status='new')
length = 10000
do i=1,length
matrix(i) = i
write(1,*) matrix(i)
end do
close(1)
call execute_command_line("./slave.out", wait = .true., exitstat=exitstatus)
if(exitstatus .eq. 0) then
! open and read the file changed by subroutine slave
open(1, file= 'matrixdata.dat', status='old')
do i = 1, length
read(1,*) matrix(i)
end do
close(1)
endif
end program master
Slave.f90
program slave
implicit none
integer*4, dimension (10000) :: matrix
integer :: length, i
! Open and read the file made by master into a matrix
open (1, file= 'matrixdata.dat', status = 'old')
length = 10000
do i = 1, length
read(1,*) matrix(i)
end do
close(1)
! Square all numbers and write over the file with new data
open(1, file= 'matrixdata.dat', status = 'old')
do i=1,length
matrix(i) = matrix(i)**2
write(1,*) matrix(i)
end do
close(1)
end program slave
* OpenMPI *
Master.f90
program master
use mpi
implicit none
integer :: ierr, num_procs, my_id, intercomm, i, siz, array(10000000), s_tag, s_dest, siffra
CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_id, ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
siz = 10000
!print *, "S.Rank =", my_id
!print *, "S.Size =", num_procs
if (.not. (ierr .eq. 0)) then
print*, "S.Unable to initilaize bös!"
stop
endif
do i=1,size(array)
array(i) = 2
enddo
if (my_id .eq. 0) then
call MPI_Comm_spawn("./slave.out", MPI_ARGV_NULL, 1, MPI_INFO_NULL, my_id, &
& MPI_COMM_WORLD, intercomm, MPI_ERRCODES_IGNORE, ierr)
s_dest = 0 !rank of destination (integer)
s_tag = 1 !message tag (integer)
call MPI_Send(array(1), siz, MPI_INTEGER, s_dest, s_tag, intercomm, ierr)
call MPI_Recv(array(1), siz, MPI_INTEGER, s_dest, s_tag, intercomm, MPI_STATUS_IGNORE, ierr)
!do i=1,10
! print *, "S.Array(",i,"): ", array(i)
!enddo
endif
call MPI_Finalize(ierr)
end program master
Slave.f90
program name
use mpi
implicit none
! type declaration statements
integer :: ierr, parent, my_id, n_procs, i, siz, array(10000000), ctag, csource, intercomm, siffra
logical :: flag
siz = 10000
! executable statements
call MPI_Init(ierr)
call MPI_Initialized(flag, ierr)
call MPI_Comm_get_parent(parent, ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, my_id, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, n_procs, ierr)
csource = 0 !rank of source
ctag = 1 !message tag
call MPI_Recv(array(1), siz, MPI_INTEGER, csource, ctag, parent, MPI_STATUS_IGNORE, ierr)
!do i=1,10
! print *, "C.Array(",i,"): ", array(i)
!enddo
do i=1,size(array)
array(i) = array(i)**2
enddo
!do i=1,10
! print *, "C.Array(",i,"): ", array(i)
!enddo
call MPI_Send(array(1), siz, MPI_INTEGER, csource, ctag, parent, ierr)
call MPI_Finalize(ierr)
end program name
Now, the interesting part is that by using the time program I have measured that it takes 19.8 ms to execute the "file version of the program". The OpenMPI version takes 60 ms. Why? Is there really so much overhead in OpenMPI that it is faster to read/write to file if you're working with <400 KiB?
I tried increasing the array to 10^5 integers. The file version executes in 114ms, OpenMPI in 53ms. When increasing to 10^6 integers file: 1103 ms, OpenMPI: 77ms.
Is the overhead really that much?
Fundamentally, it doesn't make sense to use distributed processing for problem sizes that fit in cache (except in some trivially parallel cases). The typical usage scenario is for data transfer much larger than LLC. Even you biggest case (10^6) fits in modern caches.
Firstly, for the method of writing to disk, you have to be aware of the influence of a page cache in your operating system. If your MPI processes are on the same chip, the operating system just hears 'do a write' then 'do a read'. If, in the interim, nothing pollutes the page cache then it will just fetch the data from RAM as oppose to the disk. A better experiment would be to flush the page cache between the write and read (this is possible, at least on linux, via a shell command). In effect you are performing shared memory processing if you're grabbing the data from the page cache.
Also, you are using time on the command line so you're incorporating the time it takes for MPI to initialize and establish communication interfaces with a few function calls. This is not a good benchmark because the interface provided for disk IO method has already been initialized by the operating system. Also for such a small problem size, the initialization of MPI is nontrivial compared to the runtime of the body of the program. The proper way to do this is to do the timing in the code.
For both methods, you should expect linear scaling biased by the overhead of the method. In fact, you should see a few regimes as the data size surpasses LLC and page cache. The best way to do this is to repeat your runs with ARRAY_SIZE=2^n for n=12,13,..24 and check out the curve.
After a search without any result I want to ask you a question regarding subroutines containing optional arguments and how they are treated by the compiler (run time/compile time). Consider the following example program.
module CONTAINS_ABC
contains
subroutine ABC( a, b, c)
implicit none
real, intent(in) :: a
real, intent(in), optional :: b, c
if( present(b) )then ! PATH_B
write(*,*) 'Provide c'
else if( present(c) )then "! PATH_C
write(*,*) 'Provide b'
end if
! Do all the computations with a, b and c
end subroutine ABC
end module CONTAINS_ABC
program CALL_ABC
use CONTAINS_ABC
real :: aa, bb, cc
call ABC( a = aa, b=bb )
call ABC( a = aa, c=cc )
end program CALL_ABC
I wonder how the compiler treats subroutines with optional arguments in terms of optimization. Does the compiler produce implicitly two interfaces for the subroutine FCN and chooses then the right one during compile time in the main program? Furthermore, is the present(b)/present(c) statement evaluated at runtime or during compile time?
In case I understand things right, the compiler might know that the first call to ABC leads to the path B, while the second call to ABC must lead to path C.
I ask this question, since I have a subroutine that is called million-million of times.
I want to avoid that during runtime the decision is made wether to go along path B or path C. It would be possible of course two write simply two subroutines, however, this would produce a lot of additional lines that would actually do the same thing.
Thank you all in advance for your help!
The Fortran standards are silent on pretty much every aspect of the implementation of the language other than the syntax and semantics. For example, people often think that Fortran passes arguments by reference but the standard only requires that programs behave as if arguments are passed by reference. Implementers are free to use pass-by-pixies if they wish and if they can thereby produce a convincing simulation of pass by reference.
That's just a long-winded introduction to telling you that how a particular compiler implements a language feature is, in general, compiler-specific. Implementation details are also know to vary from version to version of the same compiler.
I think that you are wrong to think that the presence of an optional argument will be checked at compile-time even if it is provable that it could be -- I just don't think that that is something that the current crop of compilers do. I expect the compiler to generate a single implementation of the subroutine with optional arguments. It does, after all, rest with the programmer to ensure that a procedure does not attempt to process absent optional arguments.
If you, the programmer, know that a procedure will be executed many times and if you suspect that different implementations, one with argument b and without c, one vice-versa, then it's on you to determine if separate implementations will be faster than one implementation with optional arguments. If you are concerned about code duplication you could always have yet a third procedure to implement the common code, and call it from both variants.
Equally, it is on you to inspect the assembler generated by your compiler to see just what your compiler (version) does with the variations in code that you write.
I will add an example. This is what gfortran-4.8 does with your code with -Ofast (-fdump-tree-optimized). There is only one copy of the subroutine and the if (present()) calls are changed to if (b_1(D) != 0B). Maybe some smaller subroutine would be inlined. Then I would expect such an optimization to kick in, but not here.
;; Function abc (__contains_abc_MOD_abc, funcdef_no=0, decl_uid=1875, cgraph_uid=0)
abc (real(kind=4) & restrict a, real(kind=4) * b, real(kind=4) * c)
{
struct __st_parameter_dt dt_parm.1;
struct __st_parameter_dt dt_parm.0;
<bb 2>:
if (b_1(D) != 0B)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
dt_parm.0.common.filename = &"opt.f90"[1]{lb: 1 sz: 1};
dt_parm.0.common.line = 13;
dt_parm.0.common.flags = 128;
dt_parm.0.common.unit = 6;
_gfortran_st_write (&dt_parm.0);
_gfortran_transfer_character_write (&dt_parm.0, &"Provide c"[1]{lb: 1 sz: 1}, 9);
_gfortran_st_write_done (&dt_parm.0);
dt_parm.0 ={v} {CLOBBER};
goto <bb 6>;
<bb 4>:
if (c_11(D) != 0B)
goto <bb 5>;
else
goto <bb 6>;
<bb 5>:
dt_parm.1.common.filename = &"opt.f90"[1]{lb: 1 sz: 1};
dt_parm.1.common.line = 15;
dt_parm.1.common.flags = 128;
dt_parm.1.common.unit = 6;
_gfortran_st_write (&dt_parm.1);
_gfortran_transfer_character_write (&dt_parm.1, &"Provide b"[1]{lb: 1 sz: 1}, 9);
_gfortran_st_write_done (&dt_parm.1);
dt_parm.1 ={v} {CLOBBER};
<bb 6>:
return;
}
;; Function main (main, funcdef_no=2, decl_uid=1889, cgraph_uid=2) (executed once)
main (integer(kind=4) argc, character(kind=1) * * argv)
{
real(kind=4) aa;
real(kind=4) bb;
real(kind=4) cc;
static integer(kind=4) options.2[7] = {68, 1023, 0, 0, 1, 1, 0};
<bb 2>:
_gfortran_set_args (argc_2(D), argv_3(D));
_gfortran_set_options (7, &options.2[0]);
abc (&aa, &bb, 0B);
abc (&aa, 0B, &cc);
aa ={v} {CLOBBER};
bb ={v} {CLOBBER};
cc ={v} {CLOBBER};
return 0;
}
I checked this is also the case for the final assembly, but it is much more clearly visible here in GIMPLE.
Now I am facing the problem that in a module, with a seed I am generating random numbers to be used in a loop of a function but each time I call that function, the same random numbers are generated (because the seed is obviously the same) but it's supposed that it must continue the series or at least it must be different between calls. One solution could be that the main program gives a new seed to be used in the module but I think it there could be another elegant solution.
I am using Mersenne Twister generator by suggestion of many people.
Added
My function in my module (it is a package of functions) escentially makes such a Metropolis test using random numbers generated by a seed, for some reason compilation complains if I put
module mymod
uses mtmod
call sgrnd(4357)!<-- this line causes compilation error
contains
myfunc(args)
implicit none
// declarations etc
!call sgrnd(4357) <-- if I put this call here compilator says ok,
!but re-start random number series each time this function is called :(
....
!the following part is inside a loop
if (prob < grnd()) then
!grnd() is random number generated
return
else continue testing to the end of the loop cycle
end myfunc
But if I put that function in the contains of the main program (using mtmod too) and call sgrnd(4357) before contains section and the calls to myfunc, now everything compile and run nicely. For clarity, I didn't want to put that long function in the main program, it has 70 lines of code, but it seems I have no escape. Notice that so, the seed is once called. The simulations have now physical meanings but with that price payed.
I always used this subroutine (I'm running a MonteCarlo simulation), call it in the beginning of your main program and tis should do the job:
(Source: gfortran 4.6.1)
c initialize a random seed from the system clock at every run (fortran 95 code)
subroutine init_random_seed()
INTEGER :: i, n, clock
INTEGER, DIMENSION(:), ALLOCATABLE :: seed
CALL RANDOM_SEED(size = n)
ALLOCATE(seed(n))
CALL SYSTEM_CLOCK(COUNT=clock)
seed = clock + 37 * (/ (i - 1, i = 1, n) /)
CALL RANDOM_SEED(PUT = seed)
DEALLOCATE(seed)
end
You can find here a subroutine that uses system time to re-seed the random number generator. You shouldn't have to do this every time you call random_number(), just each time you re-start the program.
Honestly, it didn't take me more than ten minutes to find this with Google.
In order to recover my points taken, I was obliged to find my own answer, here it is (after one hour of tries)
main program is
program callrtmod
use mymod
implicit none
real::x
x=1.0
write(*,*) x+writerandnum()
write(*,*) x+writerandnum()
write(*,*) x+writerandnum()
end program callrtmod
here's my module
module mymod
implicit none
!-------------mt variables-------------
! Default seed
integer, parameter :: defaultsd = 4357
! Period parameters
integer, parameter :: N = 624, N1 = N + 1
! the array for the state vector
integer, save, dimension(0:N-1) :: mt
integer, save :: mti = N1
!--------------------------------------
contains
function writerandnum
implicit none
real(8)::writerandnum
writerandnum = grnd()
!if you please, you could perform a Metropolis test here too
end function writerandnum
!Initialization subroutine
subroutine sgrnd(seed)
implicit none
integer, intent(in) :: seed
mt(0) = iand(seed,-1)
do mti=1,N-1
mt(mti) = iand(69069 * mt(mti-1),-1)
enddo
!
return
end subroutine sgrnd
!---------------------------------------------------------------------------
!the function grnd was here
!---------------------------------------------------------------------------
subroutine mtsavef( fname, forma )
character(*), intent(in) :: fname
character, intent(in) :: forma
select case (forma)
case('u','U')
open(unit=10,file=trim(fname),status='UNKNOWN',form='UNFORMATTED', &
position='APPEND')
write(10)mti
write(10)mt
case default
open(unit=10,file=trim(fname),status='UNKNOWN',form='FORMATTED', &
position='APPEND')
write(10,*)mti
write(10,*)mt
end select
close(10)
return
end subroutine mtsavef
subroutine mtsaveu( unum, forma )
integer, intent(in) :: unum
character, intent(in) :: forma
select case (forma)
case('u','U')
write(unum)mti
write(unum)mt
case default
write(unum,*)mti
write(unum,*)mt
end select
return
end subroutine mtsaveu
subroutine mtgetf( fname, forma )
character(*), intent(in) :: fname
character, intent(in) :: forma
select case (forma)
case('u','U')
open(unit=10,file=trim(fname),status='OLD',form='UNFORMATTED')
read(10)mti
read(10)mt
case default
open(unit=10,file=trim(fname),status='OLD',form='FORMATTED')
read(10,*)mti
read(10,*)mt
end select
close(10)
return
end subroutine mtgetf
subroutine mtgetu( unum, forma )
integer, intent(in) :: unum
character, intent(in) :: forma
select case (forma)
case('u','U')
read(unum)mti
read(unum)mt
case default
read(unum,*)mti
read(unum,*)mt
end select
return
end subroutine mtgetu
!===============================================
!Random number generator
! real(8) function grnd()
function grnd !agregue yo
implicit integer(a-z)
real(8) grnd !agregue yo
! Period parameters
integer, parameter :: M = 397, MATA = -1727483681
! constant vector a
integer, parameter :: LMASK = 2147483647
! least significant r bits
integer, parameter :: UMASK = -LMASK - 1
! most significant w-r bits
! Tempering parameters
integer, parameter :: TMASKB= -1658038656, TMASKC= -272236544
dimension mag01(0:1)
data mag01/0, MATA/
save mag01
! mag01(x) = x * MATA for x=0,1
TSHFTU(y)=ishft(y,-11)
TSHFTS(y)=ishft(y,7)
TSHFTT(y)=ishft(y,15)
TSHFTL(y)=ishft(y,-18)
if(mti.ge.N) then
! generate N words at one time
if(mti.eq.N+1) then
! if sgrnd() has not been called,
call sgrnd( defaultsd )
! a default initial seed is used
endif
do kk=0,N-M-1
y=ior(iand(mt(kk),UMASK),iand(mt(kk+1),LMASK))
mt(kk)=ieor(ieor(mt(kk+M),ishft(y,-1)),mag01(iand(y,1)))
enddo
do kk=N-M,N-2
y=ior(iand(mt(kk),UMASK),iand(mt(kk+1),LMASK))
mt(kk)=ieor(ieor(mt(kk+(M-N)),ishft(y,-1)),mag01(iand(y,1)))
enddo
y=ior(iand(mt(N-1),UMASK),iand(mt(0),LMASK))
mt(N-1)=ieor(ieor(mt(M-1),ishft(y,-1)),mag01(iand(y,1)))
mti = 0
endif
y=mt(mti)
mti = mti + 1
y=ieor(y,TSHFTU(y))
y=ieor(y,iand(TSHFTS(y),TMASKB))
y=ieor(y,iand(TSHFTT(y),TMASKC))
y=ieor(y,TSHFTL(y))
if(y .lt. 0) then
grnd=(dble(y)+2.0d0**32)/(2.0d0**32-1.0d0)
else
grnd=dble(y)/(2.0d0**32-1.0d0)
endif
return
end function grnd
end module mymod
test my solution and vote me up ;) [of course, as you see, I modified mt.f90 code to be included conveniently in my module, so I can keep separately the main program from the randon numbers generation part, so I can do a Metropolis test aside the main program. The main program just wants to know if a trial was accepted or not. My solution does give more clarity to the main progam]