How to improve the performance of Fortran intrinsic function calls when function argument is a variable

How to improve the performance of Fortran intrinsic function calls when function argument is a variable - performance

I am writing a Fortran program which involves a very large number of exponential function calls. I found that when the argument of exp() function is a variable, the calculation speed is more than 20 times slower than using constant value as the function argument. Eg, in the following two sample programs, program A is much slower than program B.
program A
real a,b
integer i
a=1.234
do i=1,100000000
b=exp(a)
end do
stop
end program A
=====================
program B
integer i
real b
do i=1,100000000
b=exp(1.234)
end do
stop
end program B
When using variable as the exp() function argument is unavoidable, how can I improve the efficiency of doing exp() calculations?

Some compilers can evaluate intrinsic functions applied to constants at compile time. Thus there is no run time cost to evaluating the intrinsic function in this case. Obviously this can't be done for true variables since the values won't be known until runtime.

See if you can get some optimisation report out of the compiler...
Example for Intel:
program A
real a,b
integer i
a=1.234
!DIR$ SIMD
do i=1,100000000
b=exp(a)
end do
stop
end program A
Example using OpenMP:
program A2
USE OMP_LIB
real a,b
integer i
a=1.234
!DIR$ SIMD
do i=1,100000000
b=exp(a)
end do
stop
end program A2
And of course in this trivial case...
program B2
real a,c
real, DIMENSION(100000000) ::b
a=1.234
c=exp(a)
b=c
! which is mathematically the same as
!b(1:100000000) = c
! or
!b(:) = c
stop
end program B2
I usually put these types of functions into a library and then compile that using !DIR# or !$OMP and get it tuned up, independent of the main program's optimisation level.
MODULE MyFuncs
PRIVATE
PUBLIC, B2
CONTAINS
PURE SUBROUTINE B2(a,n,b)
IMPLICIT NONE
real , INTENT(IN ) :: a
INTEGER , INTENT(IN ) :: n
real, DIMENSION(n), INTENT( OUT) :: b
c=exp(a)
!DIR$ SIMD
b=c
RETURN
END SUBROUTINE myB2
END MODULE MyFuncs
program A
IMPLICIT NONE
USE myFuncs
INTEGER :: n=100000000
real :: a
real, DIMENSION(n) :: b
a=1.234
CALL MyB2(a, n, b)
end program A

Related

Racing condition in assignment to variable of unknown scope

a parallel procedure computes the scalar product of two vectors in the following way
Compute the contribution of the current MPI process using OpenMP reduction. The result is stored in the SHARED variable s_loc
OpenMP master calls MPI_Allreduce to compute the sum of s_loc over all MPI processes. The result is stored in the SHARED variable s_glob
Finally, s_glob is assigned to the procedure output argument:
s = s_glob
Note that the scope of s is unknown, because it is defined outside the procedure. In the PRIVATE case, the assignment is just per thread and all is fine. However, if s is SHARED, a racing condition occurs, since s_glob is SHARED. I wonder if this racing condition may pose any problem and, if so, how it could be avoided.
In the following example the MPI part is removed, because it is important only for the scope of variables, but not for the problem itself.
program Scalar_Product
implicit none
integer, parameter :: n = 10000000
real, save :: a(n), b(n)
real :: ab
call random_number(a)
call random_number(b)
!$omp parallel private(ab)
ab = ScalarProduct(a, b)
!$omp end parallel
contains
real function ScalarProduct(a, b) result(s)
real, intent(in) :: a(:) ! left operand
real, intent(in) :: b(:) ! right operand
real, save :: s_loc, s_glob
integer :: i
!$omp master
s_loc = 0
!$omp end master
!$omp do reduction(+:s_loc) schedule(static)
do i = 1, size(a)
s_loc = s_loc + a(i) * b(i)
end do
!$omp master ! in the real application
s_glob = s_loc ! s_glob is computed using
!$omp end master ! MPI_Allreduce
! this is the assignment to which the question refers
s = s_glob
end function ScalarProduct
end program Scalar_Product
Thanks for any helpful comments!
Joerg

I revised the example and think that answers my question.
program Scalar_Product
integer, parameter :: n = 10000000
real :: a(n), b(n), ab
! skipping initialization of a and b ...
!$omp parallel shared(a,b) private(ab)
ab = ScalarProduct(a, b) ! would give a racing condition, if ab is shared
!$omp end parallel
contains
function ScalarProduct(a, b) result(s)
real, intent(in) :: a(:) ! left operand
real, intent(in) :: b(:) ! right operand
real, save :: s ! force result to be shared
integer :: i
!$omp master
s = 0
!$omp end master
!$omp do reduction(+:s) schedule(static)
do i = 1, size(a)
s = s + a(i) * b(i)
end do
end function ScalarProduct
end program Scalar_Product
Observations
By declaring the function result save it becomes shared
This allows to eliminate the problematic assignment inside the function
However, as the function is called inside a parallel region, its result must be assigned to a private variable to avoid a racing condition
Cheers, Joerg

QR decomposition Fortran error

I have a problem with QR decomposition method. I use dgeqrf subroutine for decomposition but there is no error in the compiler but it gives a problem after that. I haven't found where is the mistake.
Another question is, A=Q*R=> if A matrix has zero, Can decomposition be zero or lose the rank.
program decomposition
!CONTAINS
!subroutine Qrdecomposition(A_mat, R)
real,dimension(2,2) :: A_mat !real,dimension(2,2),intent(inout)
:: A_mat
real,dimension(2,2) :: R !real,dimension(2,2),intent(out)
:: R
real,dimension(2,2) :: A
integer :: M,N,LDA,LWORK,INFO
real,allocatable, dimension(:,:) :: TAU
real,allocatable, dimension(:,:) :: WORK
external dgeqrf
M=2
N=2
LDA=2
LWORK=2
INFO=0
A_mat(1,1)=4
A_mat(1,2)=1
A_mat(2,1)=3
A_mat(2,2)=1
A=A_mat
call dgeqrf(M,N,A,TAU,WORK,LWORK,INFO)
R=A
print *,R,WORK,LWORK
!end subroutine Qrdecomposition
end program decomposition

I see three mistakes in your code:
1) You forgot the LDA argument to dgeqrf,
2) TAU and WORK must be explicitly allocated,
3) All arrays should be declared with double precision to be consistent with dgeqrf interface:
program decomposition
!CONTAINS
!subroutine Qrdecomposition(A_mat, R)
! Note: using '8' for the kind parameter is not the best style but I'm doing it here for brevity.
real(8),dimension(2,2) :: A_mat !real,dimension(2,2),intent(inout)
real(8),dimension(2,2) :: R !real,dimension(2,2),intent(out)
real(8),dimension(2,2) :: A
integer :: M,N,LDA,LWORK,INFO
real(8),allocatable, dimension(:,:) :: TAU
real(8),allocatable, dimension(:,:) :: WORK
external dgeqrf
M=2
N=2
LDA=2
LWORK=2
INFO=0
A_mat(1,1)=4
A_mat(1,2)=1
A_mat(2,1)=3
A_mat(2,2)=1
A=A_mat
allocate(tau(M,N), work(M,N))
call dgeqrf(M,N,A,LDA,TAU,WORK,LWORK,INFO)
R=A
print *,R,WORK,LWORK
!end subroutine Qrdecomposition
end program decomposition
In certain situations, Fortran does perform automatic allocation of arrays, but it should generally not be counted on and it is not the case here.
EDIT Point 3 was pointed out by roygvib, see below.

Type-independent memory reallocation in fortran

I try to construct a subroutine to reallocate memory for a type-independent allocatable array like this:
subroutine reallocate(vector, num)
implicit none
class(*), dimension(:), allocatable, intent(inout) :: vector
integer :: num
class(*), dimension(:), allocatable :: tmp
integer :: lb, ub, ii_
if (allocated(vector)) then
ub = max(ubound(vector, 1), ub)
lb = min(lbound(vector, 1), lb)
if (ub .GT. ubound(vector, 1) .OR. lb .LT. lbound(vector, 1)) then
allocate(tmp(ub:lb), source=vector)
tmp(lbound(vector,1):ubound(vector,1)) = vector
call move_alloc(tmp, vector)
else
return
end if
else
allocate(vector(num:num), source=vector)
return
end if
return
end subroutine
For example, let's say I have a type(type1), allocatable :: v allocated within the indices -1 and 4, and I call reallocate(v, 6). After that I want v to be allocated between -1 and 6.
So, problem here comes when vector is already allocated, and I want to keep the information already stored in the vector by copying it to a newly reallocated temporal array (line that reads tmp(lbound(vector,1):ubound(vector,1)) = vector). gfortran complains: "Error: Nonallocatable variable must not be polymorphic in intrinsic assignment at (1) - check that there is a matching specific subroutine for '=' operator."
Is what I intent allowed in the Fortran 2003 standard? What would be the way to do this?

There is no way to write such a type agnostic reallocation procedure in Fortran 2003, or Fortran 2008.
Your options are:
push the allocate statements that do the (re-)allocation, back up into the scope where the type is known, effectively repeating code for each reallocation;
explicitly rewrite the reallocation procedure for each type of interest; or
write the source code for the reallocation once generically, but then use INCLUDE tricks or similar to explicitly instantiate the procedure for each type of interest.
CLASS(*) is typically not appropriate as a generic programming facility. Even if it was possible to write the body of that procedure, it is impossible to usefully call.
(Note that the example code shown references undefined and potentially unallocated variables.)

Make a subroutine faster

In the main part of my Fortran code I have this lines
Gmat=0
do i=1,indCompMax
do j=(i-1)*useSymFlag+1,nsit-(i-1)*useSymFlag
l=1
do while (G0poles(l,2)/=0)
Gmat(i,j)=Gmat(i,j)+real(G0int(i,j,l))/(omega(k)-G0poles(l,1))**G0poles(l,3)
l=l+1
enddo
enddo
enddo
call ExtendBySymmetry(Gmat)
This part is repeated several times in the code so I defined this subroutine
!=============================================================================
SUBROUTINE EvaluateFunc(matrixPol,matrixInt,z,matrix)
use NAGmodule
integer i,j,k
REAL*8, DIMENSION(Npoles,3) :: matrixPol
COMPLEX*16, DIMENSION(nsit,nsit,Npoles) :: matrixInt
COMPLEX*16, DIMENSION(nsit,nsit) :: matrix
COMPLEX*16 :: z
do i=1,indCompMax
do j=(i-1)*useSymFlag+1,nsit-(i-1)*useSymFlag
k=1
do while (matrixPol(k,2)/=0)
matrix(i,j)=matrix(i,j)+real(matrixInt(i,j,k))/(z-matrixPol(k,1))**matrixPol(k,3)
k=k+1
enddo
enddo
enddo
call ExtendBySymmetry(matrix)
end
The problem is that if I use this subroutine the evaluation of the output matrix takes much longer (around 5 times slower) than the same evaluation but made directly in the main part of the code.
How can I optimize the code and make the evaluation made with the subroutine faster?
UPDATE: Thanks for the reply. First, the operation **matrixPol(k,3) is present also in the main code, I forgot to write it in the post.
For the comparison (matrixPol(k,2)/=0) there is no problem because actually starting from a certain position of the vector all the elements are exactly zero.
Computing the prefactor outside the i,j loop helped to speed the subroutine. And switching the two indices i and j has practically no effect. Here are the running times of the subroutine
All in the main part
1.688s
my old subroutine
19.063s
with factor outside the loop i,j
5.193s
Switching the indices i and j
5.281s
with dot_product
4.958s
But the subroutine is still more than 2.5 time slower.
Here is a minimal example:
module NAGmodule
implicit none
real*8, allocatable :: hmat(:,:),eval(:),eigmat(:,:)
real*8, allocatable :: G0poles(:,:)
complex*16, allocatable :: G0int(:,:,:)
complex*16, allocatable :: Gmat(:,:)
real*8, allocatable :: omega(:)
integer :: numpoles, halffillingflag, iter, indCompMax
complex*16 :: omegaComplex
real*8, parameter :: pi=3.141592653589793
integer, parameter :: out_unit=10
integer, parameter :: timeFag=1
integer :: counti, countf, count_rate
real :: dt
integer, parameter :: Npoles=1000
real*8, parameter :: U=4
real*8, parameter :: omegamin=-20
real*8, parameter :: omegamax=20
integer, parameter :: Nomega=1500000
integer, parameter :: nsit = 4
integer, parameter :: nup = 1
integer, parameter :: ndw = 1
integer, parameter :: PBCflag=1
integer, parameter :: useSymFlag=1
end module NAGmodule
use nagmodule
integer :: i,j,k,l,m,n,p,q
REAL*8 t1,t2
allocate(hmat(nsit,nsit),eigmat(nsit,nsit),eval(nsit))
allocate(G0poles(Npoles,3),G0int(nsit,nsit,Npoles))
allocate(omega(Nomega))
allocate(Gmat(nsit,nsit))
indCompMax=1
hmat=0.
do i=1,(nsit-1)
hmat(i,i+1)=-1
hmat(i+1,i)=-1
enddo
if (PBCflag==1) then
hmat(1,nsit)=-1
hmat(nsit,1)=-1
end if
call NAGdiag(nsit)
eigmat=hmat
do k=1,Nomega
omega(k)=(omegamax-omegamin)/(Nomega-1)*(k-1)+omegamin
enddo
do k=1,nup
G0poles(k,1)=eval(k)
G0poles(k,2)=-1
G0poles(k,3)=1
enddo
do k=(nup+1),nsit
G0poles(k,1)=eval(k)
G0poles(k,2)=1
G0poles(k,3)=1
enddo
do k=1,nsit
G0int(k,k,k)=1
if ((k==nup).AND.(abs(eval(k)-eval(k+1))<1e-15)) then
G0int(k,k,k)=0.5
G0int(k+1,k+1,k)=0.5
else if ((k==nup+1).AND.(abs(eval(k)-eval(k-1))<1e-15)) then
G0int(k,k,k)=0.5
G0int(k-1,k-1,k)=0.5
end if
enddo
do k=1,nsit
G0int(:,:,k)=matmul(eigmat,matmul(G0int(:,:,k),transpose(eigmat)))
enddo
t1=0
t2=0
do k=1,Nomega
omegaComplex=CMPLX(omega(k),0)
call system_clock(counti,count_rate)
Gmat=0
call EvaluateFunc3(G0poles,G0int,omegaComplex,Gmat)
call system_clock(countf)
dt=REAL(countf-counti)/REAL(count_rate)
t1=t1+dt
call system_clock(counti,count_rate)
Gmat=0
do i=1,indCompMax
do j=(i-1)*useSymFlag+1,nsit-(i-1)*useSymFlag
l=1
do while (G0poles(l,2)/=0)
Gmat(i,j)=Gmat(i,j)+real(G0int(i,j,l))/(omega(k)-G0poles(l,1))
l=l+1
enddo
enddo
enddo
call ExtendBySymmetry(Gmat)
call system_clock(countf)
dt=REAL(countf-counti)/REAL(count_rate)
t2=t2+dt
enddo
write(*,*)'time with subroutine',t1
write(*,*)'time main',t2
stop
end
!=================================================================================
SUBROUTINE EvaluateFunc3(matrixPol,matrixInt,z,matrix)
use NAGmodule
integer i,j,k
REAL*8, DIMENSION(Npoles,3) :: matrixPol
COMPLEX*16, DIMENSION(nsit,nsit,Npoles) :: matrixInt
COMPLEX*16, DIMENSION(nsit,nsit) :: matrix
COMPLEX*16 :: z
integer :: maxPoles
COMPLEX*16, DIMENSION(Npoles) :: factor
maxPoles=0
do while (matrixPol(maxPoles+1,2)/=0)
maxPoles=maxPoles+1
enddo
factor(:maxPoles) = (1.,0.)/(z-matrixPol(:maxPoles,1))**matrixPol(:maxPoles,3)
do j=1,indCompMax
do i=(j-1)*useSymFlag+1,nsit-(j-1)*useSymFlag
matrix(i,j)=matrix(i,j)+dot_product(matrixInt(i,j,1:maxPoles),factor(1:maxPoles))
enddo
enddo
call ExtendBySymmetry2(matrix)
end
!=================================================================================
SUBROUTINE ExtendBySymmetry2(matrix)
use NAGmodule
COMPLEX*16, DIMENSION(nsit,nsit) :: matrix
integer k,i,j,l,m
if ((PBCflag==1).AND.(useSymFlag==1)) then
do i=2,nsit
matrix(2:nsit,i)=matrix(1:nsit-1,i-1)
matrix(1,i)=matrix(nsit,i-1)
enddo
else if ((PBCflag==0).AND.(useSymFlag==1)) then
do j=1,nsit/2
do i=j,nsit-j+1
matrix(j,i)=matrix(i,j)
matrix(nsit-i+1,nsit-j+1)=matrix(i,j)
matrix(nsit-j+1,nsit-i+1)=matrix(i,j)
enddo
enddo
end if
end
!=================================================================================
SUBROUTINE ExtendBySymmetry(matrix)
use NAGmodule
COMPLEX*16, DIMENSION(nsit,nsit) :: matrix
integer k,i,j,l,m
if ((PBCflag==1).AND.(useSymFlag==1)) then
do i=2,nsit
matrix(i,2:nsit)=matrix(i-1,1:nsit-1)
matrix(i,1)=matrix(i-1,nsit)
enddo
else if ((PBCflag==0).AND.(useSymFlag==1)) then
do i=1,nsit/2
do j=i,nsit-i+1
matrix(j,i)=matrix(i,j)
matrix(nsit-i+1,nsit-j+1)=matrix(i,j)
matrix(nsit-j+1,nsit-i+1)=matrix(i,j)
enddo
enddo
end if
end
!=================================================================================
SUBROUTINE NAGdiag(nsit1)
use NAGmodule
real*8, allocatable :: WORK(:)
integer, allocatable :: IWORK(:)
CHARACTER JOB, UPLO
EXTERNAL dsyevd
NMAX=nsit1
LDA=NMAX
LWORK=4*NMAX*NMAX+100
LIWORK=5*NMAX
LIWORK=10*NMAX
ALLOCATE(WORK(LWORK),IWORK(LIWORK))
JOB='V'
UPLO='L'
CALL dsyevd(JOB,UPLO,nsit1,hmat,LDA,eval,WORK,LWORK,IWORK,LIWORK,INFO)
IF (INFO.GT.0) THEN
WRITE (*,*) 'Failure to converge.'
stop
endif
deALLOCATE(WORK,IWORK)
return
end`

Due to several edits of the original question, the answer is partially superfluous by now. However, the optimization part is still valid.
The real issue with you code is that you are passing z as a complex number to the subroutine (omegaComplex), while omega(k) is real. This results in the exponentiation and division being performed as complex operations instead of real ones.
Fixing z to be real (and factor in the optimization as well) leads to the expected results. With optimizations I get:
time with subroutine 0.24000001139938831
time main 0.35700001695659012
Original answer:
First of all, the subroutine does not do the same operations that your inline code does. The operation **matrixPol(k,3) is the power to a complex number which involves a heavy computational effort. No wonder the subroutine is a lot slower.
I see a few possibilities to accelerate your code:
The divisor (z-matrixPol(k,1))**matrixPol(k,3) is independent of i and j and can be taken out of the loop.
Divisions are more expensive than multiplications. Therefore, you should pre-compute factor = 1/div outside the loop, and multiply with factor in the loop.
The comparison (matrixPol(k,2)/=0) will almost never be true, unless you set the corresponding values to exactly zero. I assume you know the order of your poles before you call the subroutine, so why not pass it along and save yourself this comparison? If that is not possible, determine the number of poles inside the subroutine before the main loop. Then, the inner loop over k is much simpler.
Inside the loop, you convert the input matrix to real over and over again. This can be done outside the loop in one go. Or, even better, just pass only the real part to the function!
At this point your code looks something like:
!=============================================================================
SUBROUTINE EvaluateFunc(matrixPol,matrixInt,z,matrix)
use NAGmodule
integer i,j,k
REAL*8, DIMENSION(Npoles,3) :: matrixPol
COMPLEX*16, DIMENSION(nsit,nsit,Npoles) :: matrixInt
COMPLEX*16, DIMENSION(nsit,nsit) :: matrix
COMPLEX*16 :: z, factor(Npoles)
REAL*8, DIMENSION(nsit,nsit,Npoles) :: matrixInt_re
integer :: maxPoles
! Determine maximum number of poles
do k=1,Npoles
! Only valid if the poles are set to exactly zero outside. If not,
! use ( abs(matrixPol(k,2)) <= someEpsilon )
if ( matrixPol(k,2) == 0 ) then
maxPoles = k-1
exit
endif
enddo
! Pre-compute factors
factor(:maxPoles) = (1.,0.)/(z-matrixPol(:maxPoles,1))**matrixPol(:maxPoles,3)
! Convert input to real
matrixInt_re = real(matrixInt)
do i=1,indCompMax
do j=i,nsit-i+1
do k=1,maxPoles
matrix(i,j)=matrix(i,j)+matrixInt_re(i,j,k)*factor(k)
enddo
enddo
enddo
call ExtendBySymmetry(Gmat)
end
Further optimization:
Rewriting the code like this it becomes apparent that the inner loop over k is nothing more than a dot product. This could potentially be sped up by the compiler. The main loop would then look like:
do i=1,indCompMax
do j=i,nsit-i+1
matrix(i,j)=matrix(i,j) + &
dot_product( matrixInt_re(i,j,:maxPoles), factor(:maxPoles) )
enddo
enddo
As chw21 noted, Fortran uses a column major memory layout and you are accessing it in a row major fashion. This potentially loses you a lot of memory. You should consider transposing your arrays in the main program or at least switch the loops over i and j. I would prefer the first option, as the inner dot product would then be performed on contiguous memory chunks.

Try to see whether you can swap the loops around. Since Fortran stores the arrays in the order
(1, 1), (2, 1), (3, 1), ..., (n, 1), (1, 2), (2, 2), ...
Memory access is far faster if you loop along that dimension.

Avoid succession of blocking MPI_BCAST

I am trying to improve one of the code I am using for numerical simulations. One of the computation step requires to compute several large arrays, whose computation is complex and costly. What is done now is that each array is computed by a specific MPI-process, and then broadcasted to the other ones. The subroutine doing that is called very often during a run of the program, so it needs to be as fast as possible.
However, I suspect that the five successive MPI_BCAST are deleterious for the program performances... I did some tests using a non-blocking BCAST (MPI_IBCAST) and I saw an improvement of the performances.
Unfortunately, I cannot use it at it does not seem to be available in some MPI implementations (at least the versions installed on the clusters I'm using...).
Do you have any ideas on how to improve this situation ? Below is a simplified version of the code I'm trying to optimize...
program test
use mpi
integer, parameter :: dp = kind(0.d0)
real(dp), dimension(:), allocatable :: a, b, c, d, e
integer, dimension(5) :: kproc
integer :: myid, numprocs, ierr
integer :: i,n
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
n = 5000 ! Can be much greater
allocate(a(n), b(n), c(n), d(n), e(n))
do i=1,5
kproc(i)=mod(i-1,numprocs)
enddo
if(myid==kproc(1)) then
a = 1.0_dp ! Complex computation for a
endif
if(myid==kproc(2)) then
b = 2.0_dp ! Complex computation for b
endif
if(myid==kproc(3)) then
c = 3.0_dp ! Complex computation for c
endif
if(myid==kproc(4)) then
d = 4.0_dp ! Complex computation for d
endif
if(myid==kproc(5)) then
e = 5.0_dp ! Complex computation for e
endif
call MPI_BCAST(a,n,MPI_DOUBLE_PRECISION,kproc(1),MPI_COMM_WORLD, ierr)
call MPI_BCAST(b,n,MPI_DOUBLE_PRECISION,kproc(2),MPI_COMM_WORLD ,ierr)
call MPI_BCAST(c,n,MPI_DOUBLE_PRECISION,kproc(3),MPI_COMM_WORLD ,ierr)
call MPI_BCAST(d,n,MPI_DOUBLE_PRECISION,kproc(4),MPI_COMM_WORLD ,ierr)
call MPI_BCAST(e,n,MPI_DOUBLE_PRECISION,kproc(5),MPI_COMM_WORLD ,ierr)
d = d+e
call MPI_FINALIZE(ierr)
end program test
In this example, you can see that the computation of the five arrays a, b, c, d and e is splitted between the MPI processes. Notice also that d and e are in fact two parts of the same array : at the end, what matters is only the value of d = d+e.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

How to improve the performance of Fortran intrinsic function calls when function argument is a variable - performance

Some compilers can evaluate intrinsic functions applied to constants at compile time. Thus there is no run time cost to evaluating the intrinsic function in this case. Obviously this can't be done for true variables since the values won't be known until runtime.

Related

Racing condition in assignment to variable of unknown scope

QR decomposition Fortran error

Type-independent memory reallocation in fortran

Make a subroutine faster

Avoid succession of blocking MPI_BCAST

Categories

Resources