Error: DUMMY attribute conflicts with RESULT attribute [duplicate] - debugging

I have a Fortran 90 program formatted in the following way
program main
<bulk of code that uses update_board>
end program main
integer function update_board(board, tochange, N) result(tochange)
integer, dimension(:, :) :: board, tochange
integer N
do i=2,N+1
do j=2,(N+2)/2
nna = board(i+1, j+1) + board(i, j+1) + board(i-1, j+1) &
+ board(i-1, j) + board(i-1, j-1) + board(i, j-1) &
+ board(i+1, j-1) + board(i+1,j)
if (nna .eq. 3) then
tochange(i, j) = 1
else if (nna .eq. 2) then
tochange(i, j) = board(i, j)
else
tochange(i, j) = 0
end if
end do
end do
end function update_board
However, when I compile the code, I get the following errors:
integer function update_board(board, tochange, N) result(tochange)
1
Error: DUMMY attribute conflicts with RESULT attribute in 'tochange' at (1)
parallel_game_of_life.f90:1.12:
program main
1
parallel_game_of_life.f90:144.47:
integer, dimension(:, :) :: board, tochange
2
Error: Two main PROGRAMs at (1) and (2)
I'm sure the latter "Two main PROGRAMs" error is a direct result of the first. However, I am pretty stuck here. I would like to update an input variable and use that as the result, since I can't declare a function like integer, dimension(:, :) function ...

You are declaring the function result incorrectly. The result type is declared twice (once in the integer function prefix and once in the body), and tochange appears both as a dummy argument and as the result, which is what the compiler is complaining about. If you want to pass it in as an argument, use a subroutine instead of a function. Here is a fix to your code, based on what I understood from it:
function update_board(board, n) result(tochange)
  integer, intent(in) :: n
  ! The loops read board(i+1, j+1) with i up to n+1 and j up to (n+2)/2,
  ! so both arrays need n+2 rows and (n+2)/2+1 columns.
  integer, intent(in) :: board(n+2,(n+2)/2+1)
  integer :: tochange(n+2,(n+2)/2+1)
  integer :: i, j, nna
  tochange = 0   ! cells not visited by the loops would otherwise be undefined
  do i=2,n+1
    do j=2,(n+2)/2
      nna = board(i+1, j+1) + board(i, j+1) + board(i-1, j+1) &
          + board(i-1, j) + board(i-1, j-1) + board(i, j-1) &
          + board(i+1, j-1) + board(i+1,j)
      if (nna .eq. 3) then
        tochange(i, j) = 1
      else if (nna .eq. 2) then
        tochange(i, j) = board(i, j)
      else
        tochange(i, j) = 0
      end if
    end do
  end do
end function update_board
An equivalent subroutine implementation would be the following:
subroutine update_board(n, board, tochange)
  integer, intent(in) :: n
  ! The loops read board(i+1, j+1) with i up to n+1 and j up to (n+2)/2,
  ! so both arrays need n+2 rows and (n+2)/2+1 columns.
  integer, intent(in) :: board(n+2,(n+2)/2+1)
  integer, intent(out) :: tochange(n+2,(n+2)/2+1)
  integer :: i, j, nna
  tochange = 0   ! cells not visited by the loops would otherwise be undefined
  do i=2,n+1
    do j=2,(n+2)/2
      nna = board(i+1, j+1) + board(i, j+1) + board(i-1, j+1) &
          + board(i-1, j) + board(i-1, j-1) + board(i, j-1) &
          + board(i+1, j-1) + board(i+1,j)
      if (nna .eq. 3) then
        tochange(i, j) = 1
      else if (nna .eq. 2) then
        tochange(i, j) = board(i, j)
      else
        tochange(i, j) = 0
      end if
    end do
  end do
end subroutine update_board
Then use call update_board(n, board, tochange) instead of tochange = update_board(board, n).
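Note that a function returning an array needs an explicit interface at the call site, so it cannot simply sit after end program as an external procedure. Here is a minimal sketch, assuming the board carries a one-cell border and that border cells of the result may be set to 0 (neither is stated in the question). The module name life_rules is made up for illustration.

```fortran
module life_rules   ! hypothetical module name
  implicit none
contains
  ! Array-valued function: the result is declared once, in the body,
  ! and is not a dummy argument.
  function update_board(board) result(tochange)
    integer, intent(in) :: board(:, :)
    integer :: tochange(size(board, 1), size(board, 2))
    integer :: i, j, nna
    tochange = 0                      ! assumption: border cells become 0
    do j = 2, size(board, 2) - 1
      do i = 2, size(board, 1) - 1
        ! Sum of the 3x3 block minus the centre = number of live neighbours
        nna = sum(board(i-1:i+1, j-1:j+1)) - board(i, j)
        if (nna == 3) then
          tochange(i, j) = 1
        else if (nna == 2) then
          tochange(i, j) = board(i, j)
        else
          tochange(i, j) = 0
        end if
      end do
    end do
  end function update_board
end module life_rules
```

A caller would then add use life_rules and write tochange = update_board(board). The module supplies the explicit interface automatically.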

Related

OpenMP in Fortran 90 takes almost as long (if not longer) as the non-parallelized program

I'm trying to parallelize a simulation of an Ising 2D model to get some expected values as a function of the temperature of the system. For L=48, the one-threaded version takes about 240 seconds to run 20 temperatures and 1 seed each, but the parallelized version takes about 268 seconds, which is similar.
If you take the time per seed per temperature, it results in 12 seconds for the one-threaded version and 13.4 seconds for the parallelized version. I'm looking for help with my code because I don't understand these durations. I thought that the parallelized version would split one temperature among all threads and therefore should take about 30 seconds to complete.
I need to run the simulation for 50 temperatures and 200 seeds each, for 5 values of L. It would be helpful to reduce the compute time, because otherwise it could take 20 hours for L=48 and some days for L=72.
I'm using an i7-10700KF (8 cores, 16 logical threads).
program Ising
use omp_lib
implicit none
integer L, seed, i, j, seed0, nseed,k
parameter (L=48)
integer s(1:L, 1:L)
integer*4 pbc(0:L+1), mctot, N, mcd, mcini, difE
real*8 genrand_real2, magne, energ, energia, temp, temp1, DE
real*8 mag, w(-8:8)
real*8 start, finish
real*8 sum, sume, sume2, summ, summ2, sumam, vare, varm, maxcv, maxx
real*8 cv, x, Tmaxcv, Tmaxx
integer irand, jrand
11 format(10(f20.6))
! Initialize variables
mctot = 80000
mcd = 20
mcini = 8000
N = L*L
seed0 = 20347880
nseed = 20
maxcv=0.d0
maxx=0.d0
! Initialize vector pbc
pbc(0) = L
pbc(L+1) = 1
do i = 1, L
pbc(i) = i
end do
! Initialize matrix s with random values
do i = 1, L
do j = 1, L
if (genrand_real2() < 0.5) then
s(i,j) = 1
else
s(i,j) = -1
endif
end do
end do
! Metropolis algorithm
open(1, file='Expectation values.dat')
start = omp_get_wtime()
write(1,*) '#Temp, ','E, ','E2, ','M, ','M2, ','|M|, ','VarE, ','VarM, ',&
'Cv, ','X, '
!Start loop to calculate for different temperatures
!$OMP PARALLEL PRIVATE(s,seed,w,energia,difE,irand,jrand,temp,mag,sum,sume,sume2,summ,summ2,sumam,vare,varm,cv,x)
temp1 = 1.59d0
!$OMP DO ordered schedule(dynamic)
do k = 1, 10
temp = temp1 + (0.01d0*k)
!Define the matrix w, which contains the values of the Boltzmann function for each temperature, so as not to have to calculate them each iteration
do i = -8, 8
w(i) = dexp(-i/temp)
end do
write(*,*) "Temperature: ", temp, "Thread", omp_get_thread_num()
sum = 0.d0
sume = 0.d0
sume2 = 0.d0
summ = 0.d0
summ2 = 0.d0
sumam = 0.d0
do seed = seed0, seed0 + nseed-1, 1
call init_genrand(seed)
call reinicia(s,l)
energia = energ(s,l,pbc)
do i = 1, mctot
do j = 1, N
irand = int(genrand_real2()*L) + 1
jrand = int(genrand_real2()*L) + 1
difE = int(DE(s,l,irand,jrand,pbc))
if (difE < 0) then
s(irand,jrand) = -s(irand,jrand)
energia = energia + difE
else if (genrand_real2() < w(int(difE))) then
s(irand,jrand) = -s(irand,jrand)
energia = energia + difE
endif
end do
if ((i > mcini).and.(mcd*(i/mcd)==i)) then
mag= magne(s,l)
sum = sum + 1.d0
sume = sume + energia
sume2 = sume2 + energia**2
summ = summ + mag
summ2 = summ2 + mag**2
sumam = sumam + abs(mag)
endif
end do
end do
!Energy
sume=sume/(sum*N)
sume2=sume2/(sum*N*N)
!Magnetitzation
summ = summ/(sum*N)
sumam=sumam/(sum*N)
summ2=summ2/(sum*N*N)
!Variances
vare = dsqrt(sume2-sume*sume)/dsqrt(sum)
varm = dsqrt(summ2-summ*summ)/dsqrt(sum)
!Cv
cv = (N*(sume2-sume*sume))/temp**2
if (cv.gt.maxcv) then
maxcv=cv
Tmaxcv=temp
endif
!X
x = (N*(summ2-summ*summ))/temp
if (x.gt.maxx) then
maxx=x
Tmaxx=temp
endif
write(1,11) temp,sume,sume2,summ,summ2,sumam,vare,varm,cv,x
end do
!$OMP END DO
!$OMP END PARALLEL
finish = omp_get_wtime()
close(1)
print*, "Time: ",(finish-start),"Seconds"
end program Ising
! Functions
!Function that calculates the energy of the matrix s
real*8 function energ(S,L, pbc)
implicit none
integer s(1:L, 1:L), i, j, L
integer*4 pbc(0:L+1)
real*8 ene
ene = 0.0d0
do i = 1, L
do j = 1, L
ene = ene - s(i,j) * s(pbc(i+1),j) - s(i,j) * s(i,pbc(j+1))
end do
end do
energ = ene
return
end function energ
!Function that calculates the difference in energy that occurs when the spin of position (i, j) is changed
real*8 function DE(S,L,i,j,pbc)
implicit none
integer s(1:L, 1:L), i, j, L, difE
integer*4 pbc(0:L+1)
real*8 suma
difE = 0
suma = 0.0d0
suma = suma + s(pbc(i-1),j) + s(pbc(i+1),j) + s(i,pbc(j-1)) + s(i,pbc(j+1))
difE = difE + int(2 * s(i,j) * suma)
DE = difE
return
end function DE
!Function that calculates the magnetization of the matrix s
real*8 function magne(S,L)
implicit none
integer s(1:L, 1:L),L
magne = sum(s)
return
end function magne
! SUBRUTINES
!Subroutine that resets the matrix s with random values
subroutine reinicia(S,L)
implicit none
integer s(1:L, 1:L), i,j,L
real*8 genrand_real2
do i = 1, L
do j = 1, L
if (genrand_real2() < 0.5) then
s(i,j) = 1
else
s(i,j) = -1
endif
end do
end do
return
end subroutine
I have tried parallelizing the seeds loop instead of the temperatures loop, but it takes almost the same time, so I think I'm not parallelizing it correctly, because this looks like code that should parallelize nicely.
The other option I thought of is to manually parallelize the simulation. I could do this by compiling 16 programs, each of which handles a different range of temperatures. Then I could run all of the programs concurrently, so each program would get its own thread. However, this approach would require a lot of extra RAM.

Efficient (Fast) Binary Tree in Fortran

I am using the procedure in the following code (which I took from here) in a program that I am trying to make run as fast as possible. The procedure is, however, very slow, probably because it was written for teaching purposes rather than for speed.
program tree_sort
! Sorts a file of integers by building a
! tree, sorted in infix order.
! This sort has expected behavior n log n,
! but worst case (input is sorted) n ** 2.
implicit none
type node
integer :: value
type (node), pointer :: left, right
end type node
type (node), pointer :: t ! A tree
integer :: number, ios
nullify (t) ! Start with empty tree
do
read (*, *, iostat = ios) number
if (ios < 0) exit
call insert (t, number) ! Put next number in tree
end do
! Print nodes of tree in infix order
call print_tree (t)
contains
recursive subroutine insert (t, number)
type (node), pointer :: t ! A tree
integer, intent (in) :: number
! If (sub)tree is empty, put number at root
if (.not. associated (t)) then
allocate (t)
t % value = number
nullify (t % left)
nullify (t % right)
! Otherwise, insert into correct subtree
else if (number < t % value) then
call insert (t % left, number)
else
call insert (t % right, number)
end if
end subroutine insert
recursive subroutine print_tree (t)
! Print tree in infix order
type (node), pointer :: t ! A tree
if (associated (t)) then
call print_tree (t % left)
print *, t % value
call print_tree (t % right)
end if
end subroutine print_tree
end program tree_sort
Is there any way to speed it up? I am using the procedure to sequentially add elements to a vector without adding repeated ones (so I changed the else in the insert subroutine to else if (number > t % value) then). Other than that, instead of printing, I store the values in a global variable.
Edit:
Here is the actual code:
MODULE MOD_PARAMETERS
USE, INTRINSIC :: ISO_FORTRAN_ENV
IMPLICIT NONE
SAVE
INTEGER(INT32), PARAMETER :: d = 10 ! number of dimensions
INTEGER(INT32), PARAMETER :: L_0 = 5 ! after this adaptive grid kicks in, for L <= L_0 usual sparse grid
INTEGER(INT32), PARAMETER :: L_max = 5 ! maximum level
INTEGER(INT32), PARAMETER :: bound = 1 ! 0 -> for f = 0 at boundary
! 1 -> adding grid points at boundary
! 2 -> extrapolating close to boundary
INTEGER(INT32), PARAMETER :: testing_sample = 10**4
INTEGER(INT32), PARAMETER :: error_sample = 10**2
REAL(REAL64), PARAMETER :: eps = 0.001D0 ! epsilon for adaptive grid
TYPE NODE
INTEGER :: value
TYPE (NODE), POINTER :: left, right
END TYPE NODE
INTEGER(INT32), DIMENSION(:), ALLOCATABLE :: tree_vector
INTEGER(INT32) :: iii
END MODULE MOD_PARAMETERS
SUBROUTINE FF(x,output)
USE MOD_PARAMETERS
IMPLICIT NONE
REAL(REAL64), DIMENSION(d), INTENT(IN) :: x
REAL(REAL64) , INTENT(OUT) :: output
output = 1.0D0/(ABS(0.5D0-SUM(x(:)**4.0D0))+0.1D0)
END SUBROUTINE
SUBROUTINE XX(n,L,i,output)
USE MOD_PARAMETERS
IMPLICIT NONE
INTEGER(INT32) , INTENT(IN) :: n
INTEGER(INT32), DIMENSION(n), INTENT(IN) :: L, i
REAL(REAL64), DIMENSION(n), INTENT(OUT) :: output
INTEGER(INT32) :: j
DO j = 1,n
IF ((bound .EQ. 0) .OR. (bound .EQ. 2)) THEN
output(j) = REAL(i(j),REAL64)/REAL(2**L(j),REAL64)
ELSEIF (bound .EQ. 1) THEN
output(j) = REAL(i(j),REAL64)/REAL(2**MAX(L(j)-1,1),REAL64)
ENDIF
ENDDO
END SUBROUTINE
SUBROUTINE XX_INV(L,x,output)
USE MOD_PARAMETERS
IMPLICIT NONE
INTEGER(INT32), DIMENSION(d), INTENT(IN) :: L
REAL(REAL64), DIMENSION(d), INTENT(IN) :: x
INTEGER(INT32), DIMENSION(d), INTENT(OUT) :: output
INTEGER(INT32) :: j
DO j = 1,d
IF ((bound .EQ. 0) .OR. (bound .EQ. 2)) THEN
output(j) = 2*FLOOR(x(j)*REAL(2**(L(j)-1),REAL64))+1
ELSEIF (bound .EQ. 1) THEN
IF (L(j) .EQ. 2) THEN
IF (x(j) .LT. 0.5D0) THEN
output(j) = 0
ELSE
output(j) = 2
ENDIF
ELSE
output(j) = 2*FLOOR(x(j)*(REAL(2**MAX(L(j)-2,0),REAL64)))+1
ENDIF
ENDIF
ENDDO
END SUBROUTINE
SUBROUTINE BASE(x,L,i,output)
USE MOD_PARAMETERS
IMPLICIT NONE
REAL(REAL64), INTENT(IN) :: x
INTEGER(INT32), INTENT(IN) :: L,i
REAL(REAL64), INTENT(OUT) :: output
IF (bound .EQ. 0) THEN
output = MAX((1.0D0-ABS(x*REAL(2**L,REAL64)-REAL(i,REAL64))),0.0D0)
ELSEIF (bound .EQ. 1) THEN
IF ((L .EQ. 1) .AND. (i .EQ. 1)) THEN
output = 1.0D0
ELSEIF ((L .EQ. 2) .AND. (i .EQ. 0)) THEN
output = MAX(1.0D0-2.0D0*x,0.0D0)
ELSEIF ((L .EQ. 2) .AND. (i .EQ. 2)) THEN
output = MAX(2.0D0*x-1.0D0,0.0D0)
ELSE
output = MAX((1.0D0-ABS(x*REAL(2**(L-1),REAL64)-REAL(i,REAL64))),0.0D0)
ENDIF
ELSEIF (bound .EQ. 2) THEN
IF ((L .EQ. 1) .AND. (i .EQ. 1)) THEN
output = 1.0D0
ELSEIF ((L .GT. 1) .AND. (i .EQ. 1)) THEN
output = MAX(2.0D0-REAL(2**L,REAL64)*x,0.0D0)
ELSEIF ((L .GT. 1) .AND. (i .EQ. (2**L)-1)) THEN
output = MAX(REAL(2**L,REAL64)*x+REAL(1-i,REAL64),0.0D0)
ELSE
output = MAX((1.0D0-ABS(x*REAL(2**L,REAL64)-REAL(i,REAL64))),0.0D0)
ENDIF
ENDIF
END SUBROUTINE
PROGRAM MAIN
USE MOD_PARAMETERS
IMPLICIT NONE
INTEGER(INT32), DIMENSION(d,d) :: ident
REAL(REAL64), DIMENSION(1) :: x1
REAL(REAL64), DIMENSION(d) :: xd
INTEGER(INT32), DIMENSION(2*d) :: temp
INTEGER(INT32), DIMENSION(:,:), ALLOCATABLE :: grid_index, temp_grid_index, grid_index_new, J_index, &
adj_list, temp_adj_list
INTEGER(INT32), DIMENSION(:), ALLOCATABLE :: to_do, to_do_new, to_add_ind
REAL(REAL64), DIMENSION(:), ALLOCATABLE :: coeff, temp_coeff, J_coeff
REAL(REAL64) :: temp_min, temp_max, V, T, B, F
INTEGER(INT32) :: i, k, k1, k2, h, j, L, n, dd, dsize, count, count1, count2, count3, flag, &
first, repeated, add, ind, adj_list_ind
INTEGER(INT32) :: time1, time2, time3, time4, clock_rate, clock_max
INTEGER(INT32), DIMENSION(d) :: LL, ii
REAL(REAL64), DIMENSION(error_sample,d) :: sample_x
REAL(REAL64), DIMENSION(error_sample) :: sample_e, interp1
REAL(REAL64) :: max_error, L2_error
REAL(REAL64), DIMENSION(testing_sample,d) :: x_rand
REAL(REAL64), DIMENSION(testing_sample) :: interp2
TYPE(NODE), POINTER :: tree
! ============================================================================
! EXECUTABLE
! ============================================================================
ident = 0
DO i = 1,d
ident(i,i) = 1
ENDDO
! Initial grid point
dsize = 1
ALLOCATE(grid_index(dsize,2*d),grid_index_new(dsize,2*d),adj_list(dsize,2*d))
grid_index(1,:) = 1
grid_index_new = grid_index
adj_list = 0
ALLOCATE(coeff(0:dsize))
coeff(0) = 0.0D0
xd = 0.5D0
CALL FF(xd,coeff(1))
L = 1
n = SIZE(grid_index_new,1)
ALLOCATE(J_index(n*2*d,2*d))
ALLOCATE(J_coeff(n*2*d))
ALLOCATE(to_add_ind(1))
to_add_ind = 1
CALL RANDOM_NUMBER(sample_x)
sample_e = 0.0D0
CALL SYSTEM_CLOCK (time1,clock_rate,clock_max)
DO WHILE (L .LT. L_max)
CALL SYSTEM_CLOCK (time3,clock_rate,clock_max)
L = L+1
n = SIZE(grid_index_new,1)
count = 0
first = 1
DEALLOCATE(J_index,J_coeff)
ALLOCATE(J_index(n*2*d,2*d))
ALLOCATE(J_coeff(n*2*d))
J_index = 0
J_coeff = 0.0D0
DO k = 1,n
adj_list_ind = 0
DO i = 1,d
DO j = 1,2
IF ((bound .EQ. 0) .OR. (bound .EQ. 2)) THEN
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(grid_index_new(k,d+i)-(-1)**j)/)
ELSEIF (bound .EQ. 1) THEN
IF (grid_index_new(k,i) .EQ. 1) THEN
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(-(-1)**j)/)
ELSE
temp = grid_index_new(k,:)+(/ident(i,:),ident(i,:)*(grid_index_new(k,d+i)-(-1)**j)/)
ENDIF
ENDIF
CALL XX(d,temp(1:d),temp(d+1:2*d),xd)
temp_min = MINVAL(xd)
temp_max = MAXVAL(xd)
IF ((temp_min .GE. 0.0D0) .AND. (temp_max .LE. 1.0D0)) THEN
IF (first .EQ. 1) THEN
first = 0
count = count+1
J_index(count,:) = temp
V = 0.0D0
DO k1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k2 = 1,d
CALL XX(1,temp(k2),temp(d+k2),x1)
CALL BASE(x1(1),grid_index(k1,k2),grid_index(k1,k2+d),B)
T = T*B
ENDDO
V = V+coeff(k1)*T
ENDDO
CALL FF(xd,F)
J_coeff(count) = F-V
adj_list(to_add_ind(k),adj_list_ind+1) = dsize+count
adj_list_ind = adj_list_ind+1
ELSE
repeated = 0
DO h = 1,count
IF (SUM(ABS(J_index(h,:)-temp)) .EQ. 0) THEN
repeated = 1
adj_list(to_add_ind(k),adj_list_ind+1) = dsize+h
adj_list_ind = adj_list_ind+1
ENDIF
ENDDO
IF (repeated .EQ. 0) THEN
count = count+1
J_index(count,:) = temp
V = 0.0D0
DO k1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k2 = 1,d
CALL XX(1,temp(k2),temp(d+k2),x1)
CALL BASE(x1(1),grid_index(k1,k2),grid_index(k1,k2+d),B)
T = T*B
ENDDO
V = V+coeff(k1)*T
ENDDO
CALL FF(xd,F)
J_coeff(count) = F-V
adj_list(to_add_ind(k),adj_list_ind+1) = dsize+count
adj_list_ind = adj_list_ind+1
ENDIF
ENDIF
ENDIF
ENDDO
ENDDO
ENDDO
ALLOCATE(temp_grid_index(dsize,2*d))
ALLOCATE(temp_coeff(dsize))
temp_grid_index = grid_index
temp_coeff = coeff
DEALLOCATE(grid_index,coeff)
ALLOCATE(grid_index(dsize+count,2*d))
ALLOCATE(coeff(0:dsize+count))
grid_index(1:dsize,:) = temp_grid_index
coeff(0:dsize) = temp_coeff
DEALLOCATE(temp_grid_index,temp_coeff)
grid_index(dsize+1:dsize+count,:) = J_index(1:count,:)
coeff(dsize+1:dsize+count) = J_coeff(1:count)
IF (L .LT. L_max) THEN ! put this after error threshhold when implemented
ALLOCATE(temp_adj_list(dsize,2*d))
temp_adj_list = adj_list
DEALLOCATE(adj_list)
ALLOCATE(adj_list(dsize+count,2*d))
adj_list = 0
adj_list(1:dsize,:) = temp_adj_list
DEALLOCATE(temp_adj_list)
ENDIF
dsize = dsize + count
IF (L .LE. L_0) THEN
DEALLOCATE(grid_index_new)
ALLOCATE(grid_index_new(count,2*d))
grid_index_new = J_index(1:count,:)
DEALLOCATE(to_add_ind)
ALLOCATE(to_add_ind(count))
to_add_ind = dsize-count + (/ (h,h=1,count) /)
ELSE
DEALLOCATE(to_add_ind)
ALLOCATE(to_add_ind(count))
add = 0
to_add_ind = 0
DO h = 1,count
IF (ABS(J_coeff(h)) .GT. eps) THEN
add = add + 1
J_index(add,:) = J_index(h,:)
to_add_ind(add) = dsize-count+h
ENDIF
ENDDO
DEALLOCATE(grid_index_new)
ALLOCATE(grid_index_new(add,2*d))
grid_index_new = J_index(1:add,:)
ENDIF
DO i = 1,error_sample
V = 0.0D0
DO k1 = 1,SIZE(grid_index,1)
T = 1.0D0
DO k2 = 1,d
CALL BASE(sample_x(i,k2),grid_index(k1,k2),grid_index(k1,k2+d),B)
T = T*B
ENDDO
V = V+coeff(k1)*T
ENDDO
CALL FF(sample_x(i,:),F)
sample_e(i) = F-V
interp1(i) = V
ENDDO
max_error = MAXVAL(ABS(sample_e))
L2_error = (SUM(sample_e**2.0D0)/REAL(error_sample,REAL64))**0.5D0
CALL SYSTEM_CLOCK (time4,clock_rate,clock_max)
WRITE(*,'(A,I5,A,F10.5,A,I8,A,F15.10,A,F15.10)') ' level = ', L,&
' time = ',REAL(time4-time3,REAL64)/REAL(clock_rate,REAL64),&
' grid points = ',SIZE(grid_index,1),&
' max error = ',max_error,&
' L2 error = ',L2_error
ENDDO
!PRINT *, ' '
!PRINT *, ' '
!PRINT *, ' '
!DO i = 1,SIZE(adj_list,1)
! PRINT *, i, adj_list(i,:)
!ENDDO
!PRINT *, ' '
!PRINT *, ' '
!PRINT *, ' '
!DO i = 1,dsize
! PRINT *, i, grid_index(i,:), coeff(i)
!ENDDO
!PRINT *, ' '
!PRINT *, ' '
!PRINT *, ' '
ALLOCATE (to_do(dsize),to_do_new(dsize),tree_vector(dsize))
CALL SYSTEM_CLOCK (time2,clock_rate,clock_max)
PRINT *, ' '
WRITE(*,'(A,F10.5)') ' total time for setup = ', REAL(time2-time1,REAL64)/REAL(clock_rate,REAL64)
! ============================================================================
! Compute interpolated values:
! ============================================================================
IF (testing_sample .EQ. error_sample) THEN
! x_rand = sample_x
ELSE
CALL RANDOM_NUMBER(x_rand)
ENDIF
count1 = 0
count2 = 0
count3 = 0
CALL SYSTEM_CLOCK (time1,clock_rate,clock_max)
DO i = 1,testing_sample
V = 0.0D0
to_do = 0
to_do(1) = 1
to_do_new = 0
k = 1
DO L = 1,L_max
NULLIFY (tree)
tree_vector = 0
CALL SYSTEM_CLOCK (time3,clock_rate,clock_max)
DO j = 1,k
ind = to_do(j)
T = 1.0D0
DO dd = 1,d
CALL BASE(x_rand(i,dd),grid_index(ind,dd),grid_index(ind,d+dd),B)
T = T*B
ENDDO
V = V + coeff(ind)*T
ENDDO
CALL SYSTEM_CLOCK (time4,clock_rate,clock_max)
count1 = count1 + time4-time3
IF (L .LT. L_max) THEN
n = k
k = 0
DO j = 1,n
IF (adj_list(to_do(j),1) .GT. 0) THEN
DO h = 1,2*d
CALL SYSTEM_CLOCK (time3,clock_rate,clock_max)
LL = grid_index(adj_list(to_do(j),h),1:d)
ii = grid_index(adj_list(to_do(j),h),d+1:2*d)
flag = 0
k1 = 1
DO WHILE ((flag .EQ. 0) .AND. (k1 .LE. d))
IF ((bound .EQ. 0) .OR. (bound .EQ. 2)) THEN
k2 = 2*FLOOR(x_rand(i,k1)*REAL(2**(LL(k1)-1),REAL64))+1
ELSEIF (bound .EQ. 1) THEN
IF (LL(k1) .EQ. 2) THEN
IF (x_rand(i,k1) .LT. 0.5D0) THEN
k2 = 0
ELSE
k2 = 2
ENDIF
ELSE
k2 = 2*FLOOR(x_rand(i,k1)*(REAL(2**MAX(LL(k1)-2,0),REAL64)))+1
ENDIF
ENDIF
IF (k2 .NE. ii(k1)) THEN
flag = 1
ENDIF
k1 = k1 +1
ENDDO
CALL SYSTEM_CLOCK (time4,clock_rate,clock_max)
count2 = count2 + time4-time3
! CALL SYSTEM_CLOCK (time3,clock_rate,clock_max)
IF (flag .EQ. 0) THEN
!IF (MINVAL(ABS(to_do_new(1:MAX(k,1))-adj_list(to_do(j),h))) .GT. 0) THEN
to_do_new(k+1) = adj_list(to_do(j),h)
k = k+1
CALL SYSTEM_CLOCK (time3,clock_rate,clock_max)
CALL INSERT(tree,to_do_new(k))
CALL SYSTEM_CLOCK (time4,clock_rate,clock_max)
count3 = count3 + time4-time3
!ENDIF
ENDIF
! CALL SYSTEM_CLOCK (time4,clock_rate,clock_max)
! count3 = count3 + time4-time3
ENDDO
ENDIF
ENDDO
CALL SYSTEM_CLOCK (time3,clock_rate,clock_max)
iii = 0
CALL PRINT_TREE(tree)
to_do = tree_vector
CALL SYSTEM_CLOCK (time4,clock_rate,clock_max)
count3 = count3 + time4-time3
!to_do = to_do_new
to_do_new = 0
ENDIF
ENDDO
interp2(i) = V
ENDDO
CALL SYSTEM_CLOCK (time2,clock_rate,clock_max)
PRINT *, ' '
WRITE(*,'(A,F10.5,A,I10)') ' time for interpolation = ', REAL(time2-time1,REAL64)/REAL(clock_rate,REAL64),&
' points = ', testing_sample
PRINT *, ' '
WRITE(*,'(A,F10.5)') ' time for base = ', REAL(count1,REAL64)/REAL(clock_rate,REAL64)
PRINT *, ' '
WRITE(*,'(A,F10.5)') ' time for x_inv = ', REAL(count2,REAL64)/REAL(clock_rate,REAL64)
PRINT *, ' '
WRITE(*,'(A,F10.5)') ' time for repeated = ', REAL(count3,REAL64)/REAL(clock_rate,REAL64)
!PRINT *, ' '
!WRITE(*,'(A,F20.15)') ' check = ', MAXVAL(ABS(interp2-interp1))
DEALLOCATE(grid_index,grid_index_new,J_index,coeff,J_coeff,adj_list,to_do,to_do_new,to_add_ind,tree_vector)
CONTAINS
RECURSIVE SUBROUTINE INSERT(tree,number)
TYPE(NODE), POINTER :: tree
INTEGER(INT32), INTENT(IN) :: number
IF (.NOT. ASSOCIATED(tree)) THEN
ALLOCATE(tree)
tree%value = number
NULLIFY(tree%left)
NULLIFY(tree%right)
ELSEIF (number .LT. tree%value) THEN
CALL INSERT (tree%left,number)
ELSEIF (number .GT. tree%value) THEN
CALL INSERT(tree%right,number)
ENDIF
END SUBROUTINE INSERT
RECURSIVE SUBROUTINE PRINT_TREE(tree)
TYPE (NODE), POINTER :: tree
IF (ASSOCIATED(tree)) THEN
CALL PRINT_TREE(tree%left)
iii = iii+1
tree_vector(iii) = tree%value
CALL PRINT_TREE (tree%right)
END IF
END SUBROUTINE PRINT_TREE
END PROGRAM
I am using optimization O3 but otherwise no flags. On my computer the time for repeated (which is where I am using the binary tree) is 18.3 seconds, whereas if I use the alternative method that is commented out in this version (with MINVAL), it takes only 3.6 seconds.

Morris Pratt table in Fortran

I have been trying to build the Morris-Pratt table, and the code is basically this one in C:
void preMp(char *x, int m, int mpNext[]) {
int i, j;
i = 0;
j = mpNext[0] = -1;
while (i < m) {
while (j > -1 && x[i] != x[j])
j = mpNext[j];
mpNext[++i] = ++j;
}
}
Here is how far I have got in Fortran:
program MP_ALGORITHM
implicit none
integer, parameter :: m=4
character(LEN=m) :: x='abac'
integer, dimension(4) :: T
integer :: i, j
i=0
T(1)=-1
j=-1
do while(i < m)
do while((j > -1) .AND. (x(i+1:i+1) /= (x(j+i+1:j+i+1))))
j=T(j)
end do
i=i+1
j=j+1
T(i)=j
end do
print *, T(1:)
end program MP_ALGORITHM
The problem is that I think I am getting the wrong output.
For x=abac it should be (?):
a b a c
-1 0 1 0
and my code is returning 0 1 1 1
So, what have I done wrong?
The problem here is that C indices start from zero, but Fortran indices start from one. You can try to adjust the index for every array access by one, but this will get unwieldy.
The Morris-Pratt table itself is an array of indices, so it should look different in C and Fortran: The Fortran array should have one-based indices and it should use zero as invalid index.
Together with the error that chw21 pointed out, your function might look like this:
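subroutine kmp_table(x, t)
  implicit none
  character(*), intent(in) :: x
  integer, dimension(:), intent(out) :: t
  integer :: m
  integer :: i, j
  m = len(x)
  i = 1
  t(1) = 0          ! 0 marks "no valid index" in the one-based table
  j = 0
  do while (i < m)
    ! Fortran's .and. is not guaranteed to short-circuit, so j > 0 is
    ! tested before x(j:j) is referenced.
    do while (j > 0)
      if (x(i:i) == x(j:j)) exit
      j = t(j)
    end do
    i = i + 1
    j = j + 1
    t(i) = j
  end do
end subroutine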
You can then use it in the Morris-Pratt algorithm as taken straight from the Wikipedia page with adjustment for Fortran indices:
function kmp_index(S, W) result(res)
  implicit none
  integer :: res
  character(*), intent(in) :: S ! text to search
  character(*), intent(in) :: W ! word to find
  integer :: m ! zero-based offset in S
  integer :: i ! one-based offset in W and T
  integer, dimension(len(W)) :: T ! KMP table
  call kmp_table(W, T)
  i = 1
  m = 0
  do while (m + i <= len(S))
    if (W(i:i) == S(m + i:m + i)) then
      if (i == len(W)) then
        res = m + 1
        return
      end if
      i = i + 1
    else
      if (T(i) > 0) then
        m = m + i - T(i)
        i = T(i)
      else
        i = 1
        m = m + 1
      end if
    end if
  end do
  res = 0
end function
(The index m is zero-based here, because it is only ever used in conjunction with i in S(m + i:m + i). Adding two one-based indices will yield an offset of one, whereas keeping m zero-based makes this a neutral addition. m is a local variable that isn't exposed to code from the outside.)
Alternatively, you could make your Fortran arrays zero-based by specifying a lower bound of zero for your string and array. That will clash with the useful character(*) notation, though, which always uses one-based indexing. In my opinion, it is better to think about the whole algorithm in the typical one-based indexing scheme of Fortran.
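The zero-based alternative can be sketched as follows. This is only an illustration, not code from the answer: the module name mp_zero_based and the subroutine name pre_mp are made up. The table gets a lower bound of zero and one more entry than the string has characters (as in the C version), while the string itself stays one-based, so every character access is shifted by one.

```fortran
module mp_zero_based   ! hypothetical module name
  implicit none
contains
  ! Builds the Morris-Pratt table with the same zero-based layout as the
  ! C code: t(0) = -1, and t must have len(x)+1 entries, t(0:len(x)).
  subroutine pre_mp(x, t)
    character(*), intent(in) :: x
    integer, intent(out) :: t(0:)
    integer :: i, j, m
    m = len(x)
    i = 0
    j = -1
    t(0) = -1
    do while (i < m)
      ! j is tested on its own first, because .and. need not short-circuit
      ! and x(j+1:j+1) is invalid for j = -1.
      do while (j > -1)
        if (x(i+1:i+1) == x(j+1:j+1)) exit   ! C: x[i] == x[j]
        j = t(j)
      end do
      i = i + 1
      j = j + 1
      t(i) = j
    end do
  end subroutine pre_mp
end module mp_zero_based
```

For x = 'abac' and a table declared as integer :: t(0:4), this fills t with -1 0 0 1 0, the same values the C routine stores in mpNext.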
This site isn't really a debugging site. Normally I would suggest you have a look at how to debug code. It didn't take me very long to go through your code with pen and paper and verify that this is indeed the table it produces.
Still, here are a few pointers:
The C code compares x[i] and x[j], but you compare x[i] and x[i+j] in your Fortran code, more or less.
Integer arrays usually also start at index 1 in Fortran. So, just like adding one to the index into the string x, you also need to add 1 every time you access T anywhere.

3n+1 implementing with Haskell, compile error

Hi everyone. I'm a newcomer to Haskell and have just implemented the '3n + 1' problem with it. I have checked the code many times, but the type error still seems strange to me. Could you please help me find the problem?
import qualified Data.Vector as V
import qualified Data.Matrix as M
nMax = 1000000
table = V.fromList $ 0 : 1 : [cycleLength x | x <- [2 .. nMax]] where
cycleLength x = if x' <= nMax then table V.! x' + 1 else cycleLength x' + 1 where
x' = if even x then x `div` 2 else 3 * x + 1
sparseTable = M.fromLists $ [] : [[f i j | j <- [0 .. ceiling $ logBase 2 nMax]] | i <- [1 .. nMax]] where
f i 0 = table V.! i
f i j = maxValue i j
maxValue i j = max $ (leftValue i j) (rightValue i j) where
leftValue i j = sparseTable M.! (i, j - 1)
rightValue i j = sparseTable M.! (i + 2 ^ (j - 1), j - 1)
I used the Vector and Matrix modules (downloaded with cabal) to implement the functions. I believe the first function (table) is correct, so the mistakes are probably in the last two functions, which I use to implement the sparse table algorithm.
Since I just signed up and don't have enough reputation yet, I'll just paste the error message here:
[1 of 1] Compiling Main ( 001.hs, interpreted )
001.hs:14:39:
Occurs check: cannot construct the infinite type: s0 ~ s0 -> s0
Relevant bindings include
leftValue :: Int -> Int -> s0 -> s0 (bound at 001.hs:15:9)
rightValue :: Int -> Int -> s0 -> s0 (bound at 001.hs:16:9)
maxValue :: Int -> Int -> s0 -> s0 (bound at 001.hs:14:1)
In the third argument of ‘leftValue’, namely ‘(rightValue i j)’
In the second argument of ‘($)’, namely
‘(leftValue i j) (rightValue i j)’
Failed, modules loaded: none.
The problem is the $ in max $ (leftValue i j) (rightValue i j).
The ($) operator binds less tightly than any other operator, including the 'normal' function application you get when you just use a space.
So with the $, it parses as
max ((leftValue i j) (rightValue i j))
If you remove it, that should parse as you intended, which was presumably
max (leftValue i j) (rightValue i j)
You can get a hint of this from the error message, where it talks about the "third argument of leftValue".
There's some more information about ($) in When should I use $ (and can it always be replaced with parentheses)?

searching a prime number

I hope I am not duplicating an existing question, but the suggested topics did not show any similar problem. I have a function that checks whether a number is prime. Now, this is the slowest possible way to test for a prime.
subroutine is_prime_slow(num, stat)
implicit none
logical :: stat
integer :: num
integer :: i
if ((num .le. 3) .and. (num .gt. 1)) then
stat = .true.
return
end if
! write(*,*) 'limit = ',limit
do i = 2,num - 1
! write(*,*) 'mod(',limit,i,') = ',mod(limit,i)
if (mod(num,i) == 0) then
stat = .false.
return
end if
end do
stat = .true.
return
end
Now let's say that I make an improvement to it.
subroutine is_prime_slow(num, stat)
implicit none
logical :: stat
integer :: num
integer :: i
if ((num .le. 3) .and. (num .gt. 1)) then
stat = .true.
return
end if
! IMPROVEMENT
if ((mod(num,2) == 0) .or. (mod(num,3) == 0) .or. (mod(num,5) == 0) .or. (mod(num,7) == 0)) then
stat = .false.
return
end if
! write(*,*) 'limit = ',limit
do i = 2,num - 1
! write(*,*) 'mod(',limit,i,') = ',mod(limit,i)
if (mod(num,i) == 0) then
stat = .false.
return
end if
end do
stat = .true.
return
end
Now the second version immediately excludes more than half of all numbers, i.e. all those divisible by 2, 3, 5, or 7. How is it possible that when I time the execution with the Linux 'time' program, the 'improved' version performs just as slowly? Am I missing some obvious trick?
Searching the first 900000 numbers:
1st: 4m28sec
2nd 4m26sec
The multiples of 2, 3, 5, and 7 are quickly rejected by the original algorithm anyway, so jumping over them does not improve the performance at all. Where the algorithm spends most of its time is in proving that numbers with large prime factors are composite. To radically improve the performance you should use a better primality test, such as Miller-Rabin.
A simpler improvement is to test factors only up to sqrt(num), not num-1. If that doesn't make immediate sense, think about how big the smallest prime factor of a composite number can be. Also, if you are looking for primes from 1 to N, it may be more efficient to use a prime number sieve, or to test divisibility only by primes you have already found.
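Here is a minimal sketch of the square-root bound, assuming a function interface instead of the subroutine from the question (the names prime_check and is_prime are illustrative). The loop condition i <= num / i is equivalent to i*i <= num but cannot overflow.

```fortran
module prime_check   ! hypothetical module name
  implicit none
contains
  ! Trial division that stops once i*i would exceed num.
  pure logical function is_prime(num) result(stat)
    integer, intent(in) :: num
    integer :: i
    stat = .false.
    if (num < 2) return
    if (num < 4) then          ! 2 and 3 are prime
      stat = .true.
      return
    end if
    if (mod(num, 2) == 0) return
    i = 3
    do while (i <= num / i)    ! only odd candidates up to sqrt(num)
      if (mod(num, i) == 0) return
      i = i + 2
    end do
    stat = .true.
  end function is_prime
end module prime_check
```

With this bound, a number near 900000 needs at most a few hundred divisions instead of several hundred thousand.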
I just recently coded something similar ;-)
! Algorithm taken from https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
subroutine eratosthenes_sieve(n, primes)
implicit none
integer,intent(in) :: n
integer,allocatable,intent(out) :: primes(:)
integer :: i, j, maxPrime, stat
logical :: A(n)
maxPrime = floor(sqrt(real(n)))
A = .true.
do i=2,maxPrime
j = i*i
do
A(j) = .false.
j = j + i ; if ( j .gt. n ) exit
enddo
enddo !i
allocate( primes( count(A)-1 ), stat=stat )
if ( stat /= 0 ) stop 'Cannot allocate memory!'
j = 1
do i=2,n ! Skip 1
if ( .not. A(i) ) cycle
primes( j ) = i
j = j + 1 ; if ( j > size(primes) ) exit
enddo
end subroutine
This algorithm gives you all prime numbers up to a given number n, so you can easily check whether your number is among them:
if ( any(number == primes) ) write(*,*) 'Prime found:',number

Resources