Random number generator in PGI Fortran not so random - random

The following code just generates a simple triple of random numbers:
program testrand
integer, parameter :: nz = 160, nf = 160, nlt = 90
real :: tmpidx(3)
integer :: idxarr(3), idx1, idx2, idx3, seed_size, ticks
integer, allocatable :: seed(:)
call random_seed(size=seed_size)
allocate(seed(seed_size))
call system_clock(count=ticks)
seed = ticks+37*(/(i-1, i=1,seed_size)/)
call random_seed(put=seed)
deallocate(seed)
call random_number(tmpidx)
idxarr = tmpidx * (/nz, nf, nlt/)
idx1 = max(1,idxarr(1))
idx2 = max(1,idxarr(2))
idx3 = max(1,idxarr(3))
print *,idx1, idx2, idx3
end program
I compile this with gfortran and run a few times and I get:
> gfortran testrand.f90
> ./a.out
74 98 86
> ./a.out
113 3 10
> ./a.out
44 104 27
Looks pretty random. Now I compile with PGI Fortran and run a few times:
> pgf90 testrand.f90
> ./a.out
1 1 1
> ./a.out
1 1 1
> ./a.out
1 1 1
Of course, there's no way to be completely sure, but I suspect this is not random. :) Anyone know what is going on here? Anyone know the right way to get random numbers with PGI Fortran?

Somehow, PGI does not implement system_clock as in GNU compilers. I do not know why, I found it recently by doing similar stuff like you.
To see what I am talking about, just print ticks after calling system_clock. Chances are that you get 0 all the time with PGI and varying numbers with GNU compilers. To solve your problem, you can adapt the code bellow. It is a slightly modified version of a code that you can get at GNU fortran web site
program testrand
use iso_fortran_env, only: int64
integer, parameter :: nz = 160, nf = 160, nlt = 90
real :: tmpidx(3)
integer :: idxarr(3), idx1, idx2, idx3, seed_size, ticks
integer, allocatable :: seed(:)
call random_seed(size=seed_size)
allocate(seed(seed_size))
! call system_clock(count=ticks)
! seed = ticks+37*(/(i-1, i=1,seed_size)/)
! call random_seed(put=seed)
!
! deallocate(seed)
call init_random_seed()
call random_number(tmpidx)
idxarr = tmpidx * (/nz, nf, nlt/)
idx1 = max(1,idxarr(1))
idx2 = max(1,idxarr(2))
idx3 = max(1,idxarr(3))
print *,idx1, idx2, idx3
contains
!
subroutine init_random_seed()
implicit none
integer, allocatable :: seed(:)
integer :: i, n, istat, dt(8), pid
integer(int64) :: t
integer, parameter :: un=703
call random_seed(size = n)
allocate(seed(n))
! First try if the OS provides a random number generator
open(unit=un, file="/dev/urandom", access="stream", &
form="unformatted", action="read", status="old", iostat=istat)
if (istat == 0) then
read(un) seed
close(un)
else
! The PID is
! useful in case one launches multiple instances of the same
! program in parallel.
call system_clock(t)
if (t == 0) then
call date_and_time(values=dt)
t = (dt(1) - 1970) * 365_int64 * 24 * 60 * 60 * 1000 &
+ dt(2) * 31_int64 * 24 * 60 * 60 * 1000 &
+ dt(3) * 24_int64 * 60 * 60 * 1000 &
+ dt(5) * 60 * 60 * 1000 &
+ dt(6) * 60 * 1000 + dt(7) * 1000 &
+ dt(8)
end if
pid = getpid()
t = ieor( t, int(pid, kind(t)) )
do i = 1, n
seed(i) = lcg(t)
end do
end if
call random_seed(put=seed)
!print*, "optimal seed = ", seed
end subroutine init_random_seed
!
function lcg(s)
integer :: lcg
integer(int64), intent(in out) :: s
if (s == 0) then
s = 104729
else
s = mod(s, 4294967296_int64)
end if
s = mod(s * 279470273_int64, 4294967291_int64)
lcg = int(mod(s, int(huge(0), 8)), kind(0))
end function lcg
!
!this option is especially used for pgf90 to provide a getpid() function
!> #brief Returns the process ID of the current process
!! #todo write the actual code, for now returns a fixed value
!<
function getpid()result(pid)
integer pid
pid = 53 !just a prime number, no special meaning
end function getpid
end program

Related

Unique random numbers series in fortran

I have this problem in Fortran where I need to generate a sequence of random numbers (precisely, 729 random numbers between 1 and 1000) and to do I use the intrinsic function random_number().
The problem, of course, is that it returns pseudorandom numbers and it happens to often that I get same values repeated. I'm not an expert in programming with Fortran, but how can I get 729 unique random numbers without repetitions? Can someone help me?
Here you go
ijb#LAPTOP-GUG8KQ9I:~/work/stack$ cat the_chosen.f90
Program possibly_homework
Use, Intrinsic :: iso_fortran_env, Only : wp => real64
Implicit None( Type, External )
Integer, Parameter :: from = 1000
Integer, Parameter :: to_choose = 729
Real( wp ) :: rand
Integer, Dimension( : ), Allocatable :: the_many
Integer, Dimension( : ), Allocatable :: the_chosen
Integer :: the_chosen_one
Integer :: unit
Integer :: i
Allocate( the_many( 1:from ) )
the_many = [ ( i, i = 1, from ) ]
Allocate( the_chosen( 1:0 ) )
Do i = 1, to_choose
Call Random_number( rand )
the_chosen_one = 1 + Int( Size( the_many ) * rand )
the_chosen = [ the_chosen, the_many( the_chosen_one ) ]
the_many = [ the_many( :the_chosen_one - 1 ), the_many( the_chosen_one + 1: ) ]
End Do
Open( newunit = unit, file = 'the_chosen.dat' )
Do i = 1, Size( the_chosen )
Write( unit, '( i0, t20, i0 )' ) i, the_chosen( i )
End Do
Close( unit )
End Program possibly_homework
ijb#LAPTOP-GUG8KQ9I:~/work/stack$ gfortran --version
GNU Fortran (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ijb#LAPTOP-GUG8KQ9I:~/work/stack$ gfortran -std=f2018 -Wall -Wextra -fcheck=all -O -g the_chosen.f90 -o the_chosen
ijb#LAPTOP-GUG8KQ9I:~/work/stack$ ./the_chosen
ijb#LAPTOP-GUG8KQ9I:~/work/stack$ head the_chosen.dat
1 303
2 916
3 414
4 452
5 116
6 962
7 392
8 675
9 153
10 458
ijb#LAPTOP-GUG8KQ9I:~/work/stack$ sort +1 -n the_chosen.dat | awk '{ print $2 }' | uniq -D
ijb#LAPTOP-GUG8KQ9I:~/work/stack$
This shows how you can in an inelegant way solve your problem.
program p
implicit none
integer, parameter :: n = 10
logical :: a(n) = .false.
integer b(n), cnt
real x
cnt = 1
do
call random_number(x)
b(cnt) = n * x + 1
if (.not. a(b(cnt))) then
if (b(cnt) > n) cycle
a(b(cnt)) = .true.
cnt = cnt + 1
if (cnt > n) exit
end if
end do
print '(*(1X,I0))', b
end program p
Order not important (as per Marta's question at How to remove random rows from a matrix in Fortran90 ). No shuffle required.
program test
implicit none
integer Nmax, Nsample, Nchoose
logical start
logical, allocatable :: mask(:)
integer cnt, i
real r
Nmax = 10 ! or 1000
Nsample = 7 ! or 729
start = .false.
Nchoose = Nsample
if ( Nsample > Nmax / 2 ) then
start = .true. ! go for the SMALLER likelihood of already taken
Nchoose = Nmax - Nsample
end if
allocate ( mask(Nmax) )
mask = start
cnt = 0
call random_seed()
do while ( cnt < Nchoose )
call random_number( r )
i = 1 + r * Nmax
if ( mask(i) .eqv. start ) then
mask(i) = .not. mask(i)
cnt = cnt + 1
end if
end do
do i = 1, Nmax
if ( mask(i) ) write( *, * ) i
end do
end program test

Wrong results when reading a txt file into an allocatable array using Fortran 90

I'm trying to solve problems related to retaining values, when I use de/allocate in the code shown below (fortran), making a copy array, but the problem persists. I've already seen links related with the topic:
Fortran array automatically growing when adding a value
How to get priorly-unknown array as the output of a function in Fortran
It would be easy and it doesn't make sense (for the purpose of this code) if I know array dimension (input from a txt file).
Possibly I make some mistakes (one of them is obvious: minute dimension against expected total dimension). I will be grateful if someone specify them. Despite of this I can't understand how making a copy array can solve the problem, because I need to de/allocate both the temporary and the main variables.
So, is it possible to read a txt without "variable dimension" info using reallocate (de/allocate)?
That's the code (using f90):
program prueba
implicit none
integer, dimension(:), allocatable :: minuto, temp
integer :: iounit, ierr
integer :: i = 1
integer :: n = 1
open(newunit = iounit, file = 'datos.txt')
read(iounit,*)
allocate(minuto(n), temp(n))
minuto = 0; temp = 0
!-------------------------------------------
do
read(unit = iounit, fmt = '(i2)',iostat = ierr) temp(i)
if (ierr/=0) exit
if (mod(size(temp,1),5)==0) then
deallocate(minuto)
allocate(minuto(i))
minuto((i-4):i) = temp((i-4):i)
end if
i = i+1
deallocate(temp)
allocate(temp(i))
end do
close(iounit)
print*,minuto
end program prueba
(I know better ways to achieve the same goal, that's only an exercise to deepen)
I use this data example (from a txt):
min
5
10
15
20
25
30
35
40
45
50
55
0
That's the result:
-2144186072 1 -2144186072 1 25 0 35 40 45 50
In the reallocation process you deallocate minuto and don't save its old data.
This is a sample program which could work out for you
program prueba
implicit none
integer, allocatable :: minuto(:)
integer, parameter :: n = 2
integer :: iounit, ierr, temp(n), i
open (newunit = iounit, file = 'datos.txt')
read (iounit, *)
! init minuto. needed for move_alloc in first call
allocate (minuto(0))
i = 1
do
read (unit = iounit, fmt = '(i2)', iostat = ierr) temp(i)
! exit loop. still save temp(1:i-1)
if (ierr /= 0) then
if (i > 1) call save_temp(i-1)
exit
end if
! save all of temp
if (i == n) call save_temp(n)
i = mod(i, n) +1
end do
close (iounit)
print *, minuto
contains
subroutine save_temp(n_temp)
!! append temp(1:n_temp) to minuto
integer, intent(in) :: n_temp
integer, allocatable :: temp_reloc(:)
! save old data from minuto into temp_reloc
call move_alloc(minuto, temp_reloc)
allocate (minuto(size(temp_reloc) + n_temp))
! init first part of minuto by its old data
minuto(:size(temp_reloc)) = temp_reloc
! append temp's data
minuto(size(temp_reloc)+1:) = temp(1:n_temp)
end subroutine
end program
Output
$ gfortran -g3 -Wall -fcheck=all a.f90 && ./a.out
5 10 15 20 25 30 35 40 45 50 55 0

Fortran OMP parallel do loop scales differently for gfortran and Intel

I abstracted some code from a much larger coding project I'm working on. The code has an OMP parallel do loop which scales well with processor number when compiled with gfortran but badly when compiled with Intel. With gfortran, the code takes 18 seconds to run with 1 processor and 5 seconds to run with 4 processors. With Intel it takes 7 seconds to run with 1 processor and 14 seconds to run with 4 processors. I don't understand what's going on here. The code is below.
MODULE test
TYPE walker
DOUBLE PRECISION, DIMENSION(:,:), ALLOCATABLE :: R
END TYPE walker
TYPE walkerlist
INTEGER :: nwlkr
TYPE(walker), DIMENSION(:), ALLOCATABLE :: W
END TYPE walkerlist
CONTAINS
FUNCTION step( dTau, nelec, ndim ) RESULT ( dR )
DOUBLE PRECISION, INTENT(IN) :: dTau
INTEGER, INTENT(IN) :: nelec, ndim
DOUBLE PRECISION :: dR(ndim,nelec), rand1, rand2, N2DTau
INTEGER :: d, k
DOUBLE PRECISION, PARAMETER :: twopi = 8.d0 * atan(1.d0)
N2DTau = -2 * dTau
DO k = 1, nelec
DO d = 1, ndim
CALL RANDOM_NUMBER(rand1)
CALL RANDOM_NUMBER(rand2)
dR(d,k) = SQRT( N2DTau * LOG( rand1 ) ) * COS( twopi * rand2 )
END DO
END DO
END FUNCTION step
END MODULE test
PROGRAM walk
USE test
TYPE(walkerlist), TARGET :: Wl
DOUBLE PRECISION :: dTau
INTEGER :: istp, i, t1, t2, clock_rate, clock_max
Wl % nwlkr = 10000
ALLOCATE( Wl % W ( Wl % nwlkr ) )
DO i = 1, Wl % nwlkr
ALLOCATE( Wl % W(i) % R(3,2) )
END DO
dTau = 0.001
CALL SYSTEM_CLOCK ( t1, clock_rate, clock_max )
!$OMP PARALLEL DO SHARED( W ) DEFAULT( FIRSTPRIVATE )
DO i = 1, Wl % nwlkr
DO istp = 1, 4000
Wl % W(i) % R = Wl % W(i) % R + step( dTau, 2, 3 )
END DO
END DO
!$OMP END PARALLEL DO
CALL SYSTEM_CLOCK ( t2, clock_rate, clock_max )
Print*, "time:", REAL ( t2 - t1 ) / REAL ( clock_rate )
END PROGRAM walk
The issue was the random_number calls, where I'm guessing the threads were sharing seeds. I solved it by instead using the random number generating function ran.
rand1 = ran(s)
rand2 = ran(s)
Ran lets you input the seed s, which I made thread_private and of the save type. Ran changes the seed only for ifort and not gfortran, so I can't use it for the latter. Ran also sometimes outputs 0, which I personally need to always check for and discard. I also need to Ensure all threads start with a different seed.

Slow random seed generator--why?

I've been playing around with various random seed generators. Here is a simple one:
subroutine init_random_seed()
integer :: i, n, clock
integer, dimension(:), allocatable :: seed
call random_seed(size = n)
allocate(seed(n))
call system_clock(count=clock)
seed = clock + 37 * (/ (i - 1, i = 1, n) /)
call random_seed(put = seed)
deallocate(seed)
end
...and a more robust one:
SUBROUTINE init_random_seed()
USE ISO_Fortran_env, ONLY: INT64
IMPLICIT NONE
INTEGER, ALLOCATABLE :: seed(:)
INTEGER :: i, n, un, istat, dt(8), pid
INTEGER(INT64) :: t
CALL RANDOM_SEED(size = n)
ALLOCATE(seed(n))
OPEN(newunit=un, file='/dev/urandom', access='stream', status='old', action='read', form='unformatted', iostat=istat)
IF (istat == 0) THEN
READ(un) seed
CLOSE(un)
ELSE
CALL SYSTEM_CLOCK(t)
IF (t == 0) THEN
CALL DATE_AND_TIME(values = dt)
t = (dt(1) - 1970) * 365_INT64 * 24 * 60 * 60 * 1000 + dt(2) * 31_INT64 * 24 * 60 * 60 * 1000 + dt(3) * 24_INT64 * 60 * 60 &
* 1000 + dt(5) * 60 * 60 * 1000 + dt(6) * 60 * 1000 + dt(7) * 1000 + dt(8)
END IF
pid = GETPID()
t = IEOR(t, INT(pid, KIND(t)))
DO i = 1, n
seed(i) = lcg(t)
END DO
END IF
CALL RANDOM_SEED(put = seed)
DEALLOCATE(seed)
CONTAINS
FUNCTION lcg(s)
INTEGER :: lcg
INTEGER(INT64) :: s
IF (s == 0) THEN
s = 104729
ELSE
s = MOD(s, 4294967296_INT64)
END IF
s = MOD(s * 279470273_INT64, 4294967291_INT64)
lcg = INT(MOD(s, INT(HUGE(0), INT64)), KIND(0))
END FUNCTION lcg
END SUBROUTINE init_random_seed
The second one generates higher-quality random numbers, but is comparatively slow. Does anyone see why?
The second subroutine is performing many more algebriac operations than the first. In addition, the second subroutine is doing file I/O by making a call to OPEN. This is likely the performance killer. Reading from disk is usually orders of magnitude slower than reading from memory. You could try commenting out the call to OPEN, replacing it with a hard coded value, and benchmarking both subroutines again.

partition a 2D array column-wise and use allgather

I have a fortran MPI code in which a compute intensive function is invoked on every element of a 2D array. I'm trying to split the tasks among the ranks. For example if there are 30 columns and 10 ranks, then each rank gets 3 columns. The following code does this split and gathers the results using allgather. But the final array doesn't have the values from all ranks.
program allgather
include 'mpif.h'
!create a 2 x 30 myarray
integer :: x=2,y=30
integer :: numprocs,myid
integer :: i,j,k,myelements,mycolumns,jb,je
integer*4,dimension(:),allocatable :: displacement,recvcnt
real :: checksum
real,dimension(:,:),allocatable :: myarr,combinedarr
call MPI_INIT(IERR)
call MPI_COMM_SIZE(MPI_COMM_WORLD,NUMPROCS,IERR)
call MPI_COMM_RANK(MPI_COMM_WORLD,MYID,IERR)
mycolumns = y/numprocs
myelements = x * mycolumns
allocate(displacement(numprocs),recvcnt(numprocs))
jb = 1 + ( myid * mycolumns )
je = ( myid + 1 ) * mycolumns
allocate(myarr(x,mycolumns))
allocate(combinedarr(x,y))
myarr(:,:) =0
do j=jb,je
do i=1,x
myarr(i,j) = 1
enddo
enddo
!myarr(:,:)=1
if(mod(y,numprocs) > 0) then
if(myid==numprocs-1) then
jb=(myid + 1) * mycolumns + 1
do j=jb,y
do i=1,x
myarr(i,j) = 1
enddo
enddo
endif
endif
combinedarr(:,:) =0
recvcnt(:)=myelements
do k=1,numprocs
displacement(k) = (k-1) *myelements
enddo
call MPI_ALLGATHERV(myarr,myelements,MPI_REAL,combinedarr,recvcnt,displacement,MPI_REAL,MPI_COMM_WORLD,IERR)
if(mod(y,numprocs) > 0) then
recvcnt(:) = 0
recvcnt(numprocs) = (x*y) - myelements * (numprocs)
displacement(numprocs) = displacement(numprocs) + myelements
call MPI_ALLGATHERV(myarr,recvcnt(numprocs),MPI_REAL,combinedarr,recvcnt,displacement,MPI_REAL,MPI_COMM_WORLD,IERR)
endif
if (myid==0) then
checksum=0
write(6,*) "mycolumns:",mycolumns,"myelements:",myelements
do j=1,y
do i=1,x
checksum = checksum + combinedarr(i,j)
enddo
enddo
write(6,*) checksum
endif
end
First of all, you are using MPI_ALLGATHERV() just as MPI_ALLGATHER() and get no benefit from its ability to send different number of elements from/to each process. But that's not the error in your program. The error lies in the way it fills myarr. You allocate it as myarr(x,mycolumns) but when filling it from column jb to column je, you go past the end of the array in all processes but rank 0 since jb and je are greater than mycolumns there. Thus myarr contains ones only in rank 0 and zeroes in all other ranks. So, yes, the final array does not have the values that you expect but that's because you filled them wrong, not because of the way MPI subroutines are used.
Writing past the end of an allocatable array destroys the hidden structures that are used to manage heap allocation and usually crashes the program. In your case you are just lucky - I run your code with Open MPI and it crashed with core dumps each time.
And you are also missing a call to MPI_FINALIZE() at the end of your code.
Hint: use the Fortran 90 interface if available - replace include 'mpif.h' with use mpi
here is the final version of the code. I have implemented the fixes suggested by "Hristo Iliev" and also fixed the part where the # or ranks does not equally divide the # of columns. Here the last rank does the computation on the leftover columns.
program allgather
include 'mpif.h'
!create a 2 x 30 myarray
integer :: x=4,y=6
integer :: numprocs,myid
integer :: i,j,k,myelements,mycolumns,jb,je,jbb
integer*4,dimension(:),allocatable :: displacement,recvcnt
real :: checksum
real,dimension(:,:),allocatable :: myarr,combinedarr
call MPI_INIT(IERR)
call MPI_COMM_SIZE(MPI_COMM_WORLD,NUMPROCS,IERR)
call MPI_COMM_RANK(MPI_COMM_WORLD,MYID,IERR)
mycolumns = y/numprocs
myelements = x * mycolumns
allocate(displacement(numprocs),recvcnt(numprocs))
jb = 1 + ( myid * mycolumns )
je = ( myid + 1 ) * mycolumns
allocate(myarr(x,y))
allocate(combinedarr(x,y))
myarr(:,:) =0
do j=jb,je
do i=1,x
myarr(i,j) = (j-1) * x + i
enddo
enddo
if(mod(y,numprocs) > 0) then
if(myid==numprocs-1) then
jbb=(myid + 1) * mycolumns + 1
do j=jbb,y
do i=1,x
myarr(i,j) = (j-1) * x + i
enddo
enddo
endif
endif
combinedarr(:,:) =0
recvcnt(:)=myelements
do k=1,numprocs
displacement(k) = (k-1) *myelements
enddo
call MPI_ALLGATHERV(myarr(1,jb),myelements,MPI_REAL,combinedarr,recvcnt,displacement,MPI_REAL,MPI_COMM_WORLD,IERR)
if(mod(y,numprocs) > 0) then
recvcnt(:) = 0
recvcnt(numprocs) = (x*y) - myelements * (numprocs)
displacement(numprocs) = displacement(numprocs) + myelements
call MPI_ALLGATHERV(myarr(1,jbb),recvcnt(numprocs),MPI_REAL,combinedarr,recvcnt,displacement,MPI_REAL,MPI_COMM_WORLD,IERR)
endif
if (myid==0) then
checksum=0
write(6,*) "mycolumns:",mycolumns,"myelements:",myelements
do j=1,y
do i=1,x
checksum = checksum + combinedarr(i,j)
enddo
enddo
write(6,*) checksum
endif
end

Resources