I am working on a large Fortran code, parts of which are written in FORTRAN 77.
One piece of code makes a build with debugging options abort with errors like:
Fortran runtime error:
Index '2' of dimension 1 of array 'trigs' above upper bound of 1
but when compiled without debugging options, it runs and does not crash. Debugging options used:
-g -ggdb -w -fstack-check -fbounds-check\
-fdec -fmem-report -fstack-usage
The logic of the problematic piece of code is as follows. In file variables.cmn I declare:
implicit none
integer factors,n
real*8 triggers
parameter (n=32)
common /fft/ factors(19), triggers(6*n)
Variables factors and triggers are initialized in procedure initialize:
include 'variables.cmn'
...
CALL FFTFAX(n,factors,triggers)
...
FFTFAX is declared in another procedure as:
SUBROUTINE FFTFAX(N,IFAX,TRIGS)
implicit real*8(a-h,o-z)
DIMENSION IFAX(13),TRIGS(1)
CALL FAX (IFAX, N, 3)
CALL FFTRIG (TRIGS, N, 3)
RETURN
END
Now let's look at procedure FFTRIG:
SUBROUTINE FFTRIG(TRIGS,N,MODE)
implicit real*8(a-h,o-z)
DIMENSION TRIGS(1)
PI=2.0d0*ASIN(1.0d0)
NN=N/2
DEL=(PI+PI)/dFLOAT(NN)
L=NN+NN
DO 10 I=1,L,2
ANGLE=0.5*FLOAT(I-1)*DEL
TRIGS(I)=COS(ANGLE)
TRIGS(I+1)=SIN(ANGLE)
10 CONTINUE
DEL=0.5*DEL
NH=(NN+1)/2
L=NH+NH
LA=NN+NN
DO 20 I=1,L,2
ANGLE=0.5*FLOAT(I-1)*DEL
TRIGS(LA+I)=COS(ANGLE)
TRIGS(LA+I+1)=SIN(ANGLE)
20 CONTINUE
In both FFTFAX and FFTRIG, the dummy arguments are declared with smaller bounds than the arrays actually passed in: TRIGS is declared with 1 element but receives triggers(6*n), and IFAX is declared with 13 elements but receives factors(19).
I printed TRIGS after calling FFTFAX in a build without debugging options:
trigs: 1.0000000000000000 0.0000000000000000\
0.99144486137381038 0.13052619222005157 0.96592582628906831\
0.25881904510252074 0.92387953251128674 0.38268343236508978\
...
My questions are:
Is the notation:
DIMENSION TRIGS(1)
anything more than setting the bound of an array?
Why does the program even work in no-debugger mode?
Is setting:
DIMENSION TRIGS(*)
a good fix if I want the variable trigs to be a result of the procedure?
In F77, a declaration such as DIMENSION TRIGS(1), or (*), or any other number, when it applies to a dummy argument of a procedure, only tells the compiler the rank of the array. The storage itself belongs to the array passed in the call, and F77 compilers normally do not check its length.
My recommendation: either use (*) or, better, convert the F77 sources to F90 where necessary (the bits shown would compile without change) and declare the dimension from n inside the subroutines.
Fortran typically passes arguments by address, i.e. trigs(i) in the subroutine simply refers to the memory location at the address of trigs(1) + (i-1)*size(real*8).
A more consistent way to write the subroutine code could be:
SUBROUTINE FFTRIG(TRIGS,N,MODE)
! implicit real*8(a-h,o-z)
integer, intent(in) :: n
real(kind=8) :: trigs(6*n)
integer :: mode
! DIMENSION TRIGS(1)
.....
PI=2.0d0*ASIN(1.0d0)
.....
or, with less ability for the compiler to check:
SUBROUTINE FFTRIG(TRIGS,N,MODE)
! implicit real*8(a-h,o-z)
integer, intent(in) :: n
real(kind=8) :: trigs(:)
integer :: mode
! DIMENSION TRIGS(1)
.....
PI=2.0d0*ASIN(1.0d0)
.....
To answer your question: I would change TRIGS(1) to TRIGS(*), if only to identify TRIGS more clearly as an array whose dimension is not provided. TRIGS(1) is a carry-over from pre-F77 code, where it was the usual way to express this.
Using TRIGS(:) is incorrect here, as declaring TRIGS this way requires every routine that calls FFTRIG to have an INTERFACE definition for it. The change would lead to other errors.
Your question mixes up two things: the bounds checker's need to know the array size, and a syntax that deliberately leaves the size out. To overcome this, you could pass the declared dimension of TRIGS as an extra argument and declare the array with it, so that the check has something to compare against. In debug mode, some compilers also pass hidden properties, including the declared size of all arrays.
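A minimal sketch of that last suggestion, assuming the routine can be moved into a module. The module name fft_tables, the routine name fill_trigs and the extra argument ntrigs are hypothetical, and only the first loop of FFTRIG is reproduced. With the size passed in, TRIGS becomes an explicit-shape array, so -fbounds-check compares indices against the real extent instead of 1.

```fortran
module fft_tables
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Hypothetical variant of the first loop of FFTRIG.
  ! The caller passes the declared size ntrigs (assumed >= 2*(n/2)),
  ! so a bounds checker knows the true extent of trigs.
  subroutine fill_trigs(trigs, ntrigs, n)
    integer,  intent(in)  :: ntrigs, n
    real(dp), intent(out) :: trigs(ntrigs)
    real(dp) :: pi, del, angle
    integer  :: i, nn

    pi  = 2.0_dp*asin(1.0_dp)
    nn  = n/2
    del = (pi + pi)/real(nn, dp)
    trigs = 0.0_dp                 ! define every element of the intent(out) array
    do i = 1, 2*nn, 2
      angle      = 0.5_dp*real(i - 1, dp)*del
      trigs(i)   = cos(angle)
      trigs(i+1) = sin(angle)
    end do
  end subroutine fill_trigs
end module fft_tables
```

A caller that has `use fft_tables` would write something like `call fill_trigs(triggers, size(triggers), n)`. Because the routine sits in a module, the compiler can also check the argument types at every call site.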
I was looking at some Ruby code somewhere, and I saw the following line:
def do_something a, b, c, &callback
xyz = a + b + c
callback.call(xyz)
end
and then when it was called, they did something like this:
do_something a, b, c do |xyz|
puts xyz
end
Is this better practice to use this sort of callback as opposed to just returning the value made by the function? I can understand why it would be done if there are multiple values that need to be transferred, but this one has just one return.
Analysis
There is insufficient information in your original post to determine if this is useful or not. The intent of your first example seems to be that the method will be passed a block, which is then called as a Proc inside the method rather than yielded to. There might be a valid use case for this, but your example isn't one.
If the block is already there, why not just yield to the block? And what happens if no block is given?
Passing Proc or lambda objects around can certainly be a useful technique in certain cases, but unless it simplifies your code or makes it more readable you are creating additional complexity. The examples in your original post don't make a valid case for why it might be needed. Even if you update your post with better examples, "Is a Proc object necessary?" is almost certainly a subjective question based on the needs of the larger program.
Unless you need the features of a Proc or lambda (e.g. you need a closure or access to a specific Binding) then you are generally better off yielding to a block or returning a value. Your mileage may certainly vary.
Yield or Return
In the general case, you can choose to yield to a block or return a value depending on whether or not a block was given. For example:
def do_something(a, b, c)
xyz = a + b + c
block_given? ? yield(xyz) : xyz
end
Unless you need to pass around a closure, this is likely to be a more useful technique. However, as previously stated, your mileage (and code base) may vary.
I would call this bad practice since this method requires a block (you'll get a NoMethodError without one). It can be useful to have a mechanism for immediately passing the return value to a block, but I wouldn't make it mandatory.
A simple improvement would be to make the block optional:
def do_something a, b, c
xyz = a + b + c
return yield(xyz) if block_given?
xyz
end
As the title says I'm curious about the difference between "call-by-reference" and "call-by-value-return". I've read about it in some literature, and tried to find additional information on the internet, but I've only found comparison of "call-by-value" and "call-by-reference".
I do understand the difference at memory level, but not at the "conceptual" level, between the two.
The called subroutine will have its own copy of the actual parameter value to work with, but will, when it finishes executing, copy the new local value (bound to the formal parameter) back to the actual parameter of the caller.
When is call-by-value-return actually preferable to "call-by-reference"? Any example scenario? All I can see is that it takes extra memory and execution time due to the copying of values between memory cells.
As a side question, is "call-by-value-return" implemented in 'modern' languages?
Call-by-value-return, from Wikipedia:
This variant has gained attention in multiprocessing contexts and Remote procedure call: if a parameter to a function call is a reference that might be accessible by another thread of execution, its contents may be copied to a new reference that is not; when the function call returns, the updated contents of this new reference are copied back to the original reference ("restored").
So, in more practical terms, it's entirely possible that a variable is in some undesired state in the middle of the execution of a function. With parallel processing this is a problem, since you can attempt to access the variable while it has this value. Copying it to a temporary value avoids this problem.
As an example:
policeCount = 0

everyTimeSomeoneApproachesOrLeaves()
    calculatePoliceCount(policeCount)

calculatePoliceCount(count)
    count = 0
    for each police official
        count++

goAboutMyDay()
    if policeCount == 0
        doSomethingIllegal()
    else
        doSomethingElse()
Assume everyTimeSomeoneApproachesOrLeaves and goAboutMyDay are executed in parallel.
So if you pass by reference, you could end up getting policeCount right after it was set to 0 in calculatePoliceCount, even if there are police officials around, then you'd end up doing something illegal and probably going to jail, or at least coughing up some money for a bribe. If you pass by value return, this won't happen.
Supported languages?
In my search, I found that Ada and Fortran support this. I don't know of others.
Suppose you have a call by reference function (in C++):
void foobar(int &x, int &y) {
while (y-->0) {
x++;
}
}
and you call it thusly:
int z = 5;
foobar(z, z);
It will never terminate, because x and y are the same reference: each time you decrement y, that is immediately undone by the increment of x (since they are both really z under the hood).
By contrast using call-by-value-return (in rusty Fortran):
subroutine foobar(x,y)
integer, intent(inout) :: x,y
do while (y > 0)
y = y - 1
x = x + 1
end do
end subroutine foobar
If you call this routine with the same variable:
integer :: z = 5
call foobar(z,z)
it will still terminate, and at the end z will have a value of either 10 or 0, depending on the order in which the results are copied back (I don't remember if a particular order is required and I can't find any quick answers to the question online).
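The copy-in and copy-back steps can also be written out by hand, which avoids relying on how a particular compiler treats the same variable passed twice (standard Fortran does not actually allow a variable to be modified through two dummy arguments at once). This is only a sketch: the module name value_return_demo, the wrapper call_by_value_return and its x_last flag are hypothetical.

```fortran
module value_return_demo
  implicit none
contains
  ! Same loop as foobar above: moves the count held in y over to x.
  subroutine foobar(x, y)
    integer, intent(inout) :: x, y
    do while (y > 0)
      y = y - 1
      x = x + 1
    end do
  end subroutine foobar

  ! Emulates "call foobar(z, z)" under call-by-value-return.
  ! x_last selects which temporary is copied back last.
  subroutine call_by_value_return(z, x_last)
    integer, intent(inout) :: z
    logical, intent(in)    :: x_last
    integer :: tx, ty

    tx = z; ty = z            ! copy in: two independent temporaries
    call foobar(tx, ty)       ! terminates, since tx and ty are distinct
    if (x_last) then          ! copy back: the last store wins
      z = ty; z = tx
    else
      z = tx; z = ty
    end if
  end subroutine call_by_value_return
end module value_return_demo
```

Starting from z = 5, the temporaries end up as tx = 10 and ty = 0, so z becomes 10 when x is copied back last and 0 when y is.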
Please see the following link; the program there gives a practical idea of the difference between the two:
Difference between call-by-reference and call-by-value
I'm very new to Fortran, and for my research I need to get a monster of a model running, so I am learning as I am going along. So I'm sorry if I ask a "stupid" question.
I'm trying to compile (Mac OSX, from the command line) and I've already managed to solve a few things, but now I've come across something I am not sure how to fix. I think I get the idea behind the error, but again, not sure how to fix.
The model is huge, so I will only post the code sections that I think are relevant (though I could be wrong). I have a file with several subroutines, that starts with:
!==========================================================================================!
! This subroutine simply updates the budget variables. !
!------------------------------------------------------------------------------------------!
subroutine update_budget(csite,lsl,ipaa,ipaz)
use ed_state_vars, only : sitetype ! ! structure
implicit none
!----- Arguments -----------------------------------------------------------------------!
type(sitetype) , target :: csite
integer , intent(in) :: lsl
integer , intent(in) :: ipaa
integer , intent(in) :: ipaz
!----- Local variables. ----------------------------------------------------------------!
integer :: ipa
!----- External functions. -------------------------------------------------------------!
real , external :: compute_water_storage
real , external :: compute_energy_storage
real , external :: compute_co2_storage
!---------------------------------------------------------------------------------------!
do ipa=ipaa,ipaz
!------------------------------------------------------------------------------------!
! Computing the storage terms for CO2, energy, and water budgets. !
!------------------------------------------------------------------------------------!
csite%co2budget_initialstorage(ipa) = compute_co2_storage(csite,ipa)
csite%wbudget_initialstorage(ipa) = compute_water_storage(csite,lsl,ipa)
csite%ebudget_initialstorage(ipa) = compute_energy_storage(csite,lsl,ipa)
end do
return
end subroutine update_budget
!==========================================================================================!
!==========================================================================================!
I get error messages along the lines of
budget_utils.f90:20.54:
real , external :: compute_co2_storage
1
Error: Dummy argument 'csite' of procedure 'compute_co2_storage' at (1) has an attribute that requires an explicit interface for this procedure
(I get a bunch of them, but they are essentially all the same). Now, looking at ed_state_vars.f90 (which is "used" in the subroutine), I find
!============================================================================!
!============================================================================!
!---------------------------------------------------------------------------!
! Site type:
! The following are the patch level arrays that populate the current site.
!---------------------------------------------------------------------------!
type sitetype
integer :: npatches
! The global index of the first cohort in all patches
integer,pointer,dimension(:) :: paco_id
! The number of cohorts in each patch
integer,pointer,dimension(:) :: paco_n
! Global index of the first patch in this vector, across all patches
! on the grid
integer :: paglob_id
! The patches containing the cohort arrays
type(patchtype),pointer,dimension(:) :: patch
Etc etc - this goes on for another 500 lines or so.
So to get to the point, it seems like the original subroutine needs an explicit interface for its procedures in order to be able to use the (dummy) argument csite. Again, I am SO NEW to Fortran, but I am really trying to understand how it "thinks". I have been searching what it means to have an explicit interface, when (and how!) to use it etc. But I can't figure out how it applies in my case. Should I maybe use a different compiler (Intel?). Any hints?
Edit: So csite is declared a target in all procedures, and its declaration type(sitetype) contains a whole bunch of pointers, as specified in sitetype. But sitetype is being properly used from another module (ed_state_vars.f90) in all procedures. So I am still confused about why it gives me the explicit interface error.
"explicit interface" means that the interface to the procedure (subroutine or function) is declared to the compiler. This allows the compiler to check consistency of arguments between calls to the procedure and the actual procedure. This can find a lot of programmer mistakes. You can do this writing out the interface with an interface statement but there is a far easier method: place the procedure into a module and use that module from any other entity that calls it -- from the main program or any procedure that is itself not in the module. But you don't use a procedure from another procedure in the same module -- they are automatically known to each other.
Placing a procedure into a module automatically makes its interface known to the compiler and available for cross-checking when it is used. This is easier and less prone to mistakes than writing an interface. With an interface, you have to duplicate the procedure argument list. Then, if you revise the procedure, you have to revise not only the calls (of course!) but also the interface.
An explicit interface (interface block or module) is required when you use "advanced" arguments, for example dummy arguments with the target, pointer, allocatable or optional attribute, or assumed-shape arrays. Otherwise the compiler doesn't know how to generate the correct call.
If you have a procedure that is used from a module, you shouldn't also declare it with external. There are very few uses of external in modern Fortran. So remove the external attributes, put all of your procedures into a module, and use them.
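As a minimal sketch of the module approach, assuming a drastically cut-down sitetype: the component co2_store, the module names budget_types and budget_utils, and the simplified argument lists are all hypothetical and not taken from the model in the question. Because compute_co2_storage lives in the same module as update_budget, its interface is explicit and no external declaration is needed, even though csite has the target attribute.

```fortran
module budget_types
  implicit none
  ! Hypothetical, heavily reduced stand-in for the real sitetype.
  type :: sitetype
    integer       :: npatches = 0
    real, pointer :: co2_store(:) => null()
  end type sitetype
end module budget_types

module budget_utils
  use budget_types, only: sitetype
  implicit none
contains
  ! The target attribute on csite is what demands an explicit interface.
  real function compute_co2_storage(csite, ipa)
    type(sitetype), target, intent(in) :: csite
    integer,                intent(in) :: ipa
    compute_co2_storage = csite%co2_store(ipa)
  end function compute_co2_storage

  ! No "real, external ::" line: procedures in one module see each other.
  subroutine update_budget(csite, ipaa, ipaz, total)
    type(sitetype), target, intent(in)  :: csite
    integer,                intent(in)  :: ipaa, ipaz
    real,                   intent(out) :: total
    integer :: ipa

    total = 0.0
    do ipa = ipaa, ipaz
      total = total + compute_co2_storage(csite, ipa)
    end do
  end subroutine update_budget
end module budget_utils
```

Code outside the module would add `use budget_utils` before calling update_budget, which gives the compiler the same interface information there.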
I ran into the same problems you encountered whilst I was trying to install ED2 on my mac 10.9. I fixed it by including all the subroutines in that file in a module, that is:
module mymodule
contains
subroutine update_budget(csite,lsl,ipaa,ipaz)
! ... body unchanged ...
end subroutine update_budget
! ... other subroutines, etc. ...
end module mymodule
The same thing had to be done to some 10 to 15 other files in the package.
I have compiled all the files and produced the corresponding object files but now I am getting errors about undefined symbols. However I suspect these are independent of the modifications so if someone has the patience this might be a way to solve at least the interface problem.
After searching for a while in books, here on stackoverflow and on the general web, I have found that it is difficult to find a straightforward explanation to the real differences between the fortran argument intents. The way I have understood it, is this:
intent(in) -- The actual argument is copied to the dummy argument at entry.
intent(out) -- The dummy argument points to the actual argument (they both point to the same place in memory).
intent(inout) -- the dummy argument is created locally, and then copied to the actual argument when the procedure is finished.
If my understanding is correct, then I also want to know why one ever wants to use intent(out), since the intent(inout) requires less work (no copying of data).
Intents are just hints for the compiler, and you can throw that information away and violate it. Intents exist almost entirely to make sure that you only do what you planned to do in a subroutine. A compiler might choose to trust you and optimize something.
This means that intent(in) is not pass by value. You can still overwrite the original value.
program xxxx
integer i
i = 9
call sub(i)
print*,i ! will print 7 on all compilers I checked
end
subroutine sub(i)
integer,intent(in) :: i
call sub2(i)
end
subroutine sub2(i)
implicit none
integer i
i = 7 ! This works since the "intent" information was lost.
end
The same happens with intent(out):
program xxxx
integer i
i = 9
call sub(i)
end
subroutine sub(i)
integer,intent(out) :: i
call sub2(i)
end
subroutine sub2(i)
implicit none
integer i
print*,i ! will print 9 on all compilers I checked, even though intent was "out" above.
end
intent(in) - looks like pass by value (as if changes were not reflected in the outside code), but is in fact pass by reference, and the compiler prohibits changing it. It can still be changed, though.
intent(out) - also passed by reference; in effect a return argument.
intent(inout) - pass by reference, a normal in/out parameter.
Use intent(out) if the argument is plain out, to document your design. Do not worry about the very small performance gain, if any. (The comments suggest there is none, as intent(in) is technically also pass by reference.)
It's not clear if parts of the OP's questions were actually answered. In addition, certainly there seems to be much confusion and various errors in the ensuing answers/discussions that may benefit from some clarifications.
A) The OP's question Re
" then I also want to know why one ever wants to use intent(out), since the intent(inout) requires less work (no copying of data)."
may not have been answered, or at least not directly/correctly.
First, to be clear the Intent attributes have at least TWO purposes: "safety/hygiene", and "indirect performance" issues (not "direct performance" issues).
1) Safety/Hygiene: To assist in producing "safe/sensible" code with reduced opportunity to "mess things" up. Thus, an Intent(In) cannot be overwritten (at least locally, or even "globally" under some circumstances, see below).
Similarly, Intent(Out) signals that the Arg is to be assigned an "explicit answer", thus helping to reduce "rubbish" results.
For example, in the solution of perhaps the most common problem in computational mathematics, i.e. the so-called "Ax=b problem", the "direct result/answer" one is looking for is the values for the vector x. Those should be Intent(Out) to ensure x is assigned an "explicit" answer. If x was declared as, say, Intent(InOut) or "no Intent", then x would simply carry whatever values it held on entry (perhaps "zero's" in Debug mode, but likely "rubbish" in Release mode, being whatever is in memory at the Arg's location), and if the user did not then assign the correct values to x explicitly, it would return "rubbish". The Intent(Out) "reminds" the user to assign values explicitly to x (some compilers, e.g. gfortran with warnings enabled, report an Intent(Out) Arg that is never set), thus obviating this kind of "(accidental) rubbish".
During the solution process, one would (almost surely) produce the inverse of matrix A. The user may wish to return that inverse to the calling s/r in place of A, in which case A should be Intent(InOut).
Alternatively, the user may wish to ensure that no changes are made to the matrix A or the vector b, in which case they would be declared Intent(In), and thus ensuring that critical values are not overwritten.
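The Ax=b discussion can be sketched for the smallest interesting case, a 2x2 system solved by Cramer's rule. The module name linear_solve, the routine solve2 and the ok flag are hypothetical, and a real solver would use a factorisation rather than Cramer's rule. The point is only the Intents: A and b are protected as Intent(In), and x is pure output.

```fortran
module linear_solve
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Solves the 2x2 system a*x = b by Cramer's rule.
  ! a and b cannot be assigned to here; x and ok are set on every path.
  subroutine solve2(a, b, x, ok)
    real(dp), intent(in)  :: a(2,2), b(2)
    real(dp), intent(out) :: x(2)
    logical,  intent(out) :: ok
    real(dp) :: det

    det = a(1,1)*a(2,2) - a(1,2)*a(2,1)
    ok  = abs(det) > tiny(1.0_dp)
    if (ok) then
      x(1) = (b(1)*a(2,2) - a(1,2)*b(2))/det
      x(2) = (a(1,1)*b(2) - b(1)*a(2,1))/det
    else
      x = 0.0_dp               ! still give the intent(out) Arg a defined value
    end if
  end subroutine solve2
end module linear_solve
```

Adding a line such as `a(1,1) = 0.0_dp` inside solve2 would be rejected at compile time, which is the local "safety/hygiene" described above.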
2 a) "Indirect Performance" (and "global safety/hygiene"): Although the Intents are not directly for influencing performance, they do so indirectly. Notably, certain types of optimisation, and particularly the Fortran Pure and Elemental constructs, can produce much improved performance. These settings typically require all Args to have their Intent's declared explicitly.
Roughly speaking, if the compiler knows in advance the Intent's of all vars, then it can optimise and "stupidity check" the code with greater ease and effectiveness.
Crucially, if one uses Pure etc constructs, then, with high probability, there will be a "kind of global safety/hygiene" as well, since Pure/Elemental s/p's can only call other Pure/Elemental s/p's and so one CANNOT arrive at a situation of the sort indicated in "The Glazer Guy's" example.
For example, if Sub1() is declared as Pure, then Sub2() must also be declared as Pure, and then it will be required to declare the Intents at all levels, and so the "garbage out" produced in "The Glazer Guy's" example could NOT happen. That is, the code would be:
Pure subroutine sub_P(i)
integer,intent(in) :: i
call sub2_P(i)
end subroutine sub_P
Pure subroutine sub2_P(i)
implicit none
! integer i ! not permitted to omit Intent in a Pure s/p
integer,intent(in) :: i
i = 7 ! This WILL NOT WORK/HAPPEN, since Pure obviates the possibility of omitting Intent, and Intent(In) prohibits assignment ... so "i" remains "safe".
end subroutine sub2_P
... on compile, this would produce something like
" ||Error: Dummy argument 'i' with INTENT(IN) in variable definition context (assignment) at (1)| "
Of course, sub2 need not be Pure to have i declared as Intent(In), which, again would provide the "safety/hygiene" one is looking for.
Notice that even if i was declared Intent(InOut) it would still fail with Pure's. That is:
Pure subroutine sub_P(i)
integer,intent(in) :: i
call sub2_P(i)
end subroutine sub_P
Pure subroutine sub2_P(i)
implicit none
integer,intent(inOut) :: i
i = 7 ! This WILL NOT WORK, since Pure obviates the possibility of "mixing" Intent's.
end subroutine sub2_P
... on compile, this would produce something like
"||Error: Dummy argument 'i' with INTENT(IN) in variable definition context (actual argument to INTENT = OUT/INOUT) at (1)|"
Thus, strict or wide reliance on Pure/Elemental constructs will ensure (mostly) "global safety/hygiene".
It will not be possible to use Pure/Elemental etc in all cases (e.g. many mixed language settings, or when relying on external libs beyond your control, etc).
Still, consistent usage of Intents, and whenever possible Pure etc, will produce much benefit, and eliminate much grief.
One can simply get into the habit of declaring Intents everywhere all the time when it is possible, whether Pure or not ... that is the recommended coding practice.
... this also brings to the fore another reason for the existence of BOTH Intent(InOut) and Intent(Out), since Pure's must have all Arg's Intent's declared, there will be some Args that are Out only, while others are InOut (i.e. it would be difficult to have Pure's without each of In, InOut, and Out Intents).
2 b) The OP's comments expecting "performance improvements" since "no copying is required" indicate a misunderstanding of Fortran and its extensive use of pass by reference. Pass by reference means, essentially, that only the pointers are required, and in fact often only the pointer to the first element of an array (plus a little hidden array info) is required.
Indeed, some insight may be offered by considering "the old days" (e.g. Fortran IV, 77, etc), when passing an array might have been coded as follows:
Real*8 A(1000)
Call Sub(A)
Subroutine Sub(A)
Real*8 A(1) ! this was workable since Fortran only passes the pointer/by ref to the first element of A(1000)
! modern Fortran may well throw a bounds check warning
In modern Fortran, the "equivalent" is to declare A as Real(DP) A(:) in the s/r (though strictly speaking there are various settings that benefit from passing the array's bounds and declaring explicitly with the bounds, but that would be a lengthy digression for another day).
That is, Fortran does not pass by value, nor "make copies" for Args/Dummy vars. The A() in the calling s/r is the "same A" as that used in the s/r (Of course, in the s/r, one could make a copy of A() or whatever, which would create additional work/space requirements, but that is another matter).
It is for this reason primarily that the Intent's do not directly impact performance to a great extent, even for large array Arg's etc.
B) Regarding the "pass by Value" confusion: Although the various responses above do confirm that using Intent is "not pass by value", it may be helpful to clarify the matter.
It may help to change the wording to "Intent is always pass by reference". This is not the same as "not pass by value", and it is an important subtlety. Notably, not only are Intents "byRef", Intent can PREVENT pass by value.
Although there are special/much more complex settings (e.g. mixed-language Fortran DLL's etc) where much additional discussion is required, for the most part of "standard Fortran", Args are passed by Ref. A demonstration of this "Intent subtlety" can be seen in a simple extension of "The Glazer Guy's" example, as:
subroutine sub(i, j, iV, jV)
integer, intent(in) :: i, j
integer, value :: iV, jV
call sub2(i)
call sub3(i, j, jV, iV)
end
subroutine sub2(i)
implicit none
integer i
i = 7 ! This works since the "intent" information was lost.
end
subroutine sub3(i, j, jV, iV)
implicit none
integer, value, Intent(In) :: i ! This will work, since passed in byRef, but used locally as byVal
integer, value, Intent(InOut) :: j ! This will FAIL, since ByVal/ByRef collision with calling s/r,
! ||Error: VALUE attribute conflicts with INTENT(INOUT) attribute at (1)|
integer, value, Intent(InOut) :: iV ! This will FAIL, since ByVal/ByRef collision with calling s/r,
! ... in spite of "byVal" in calling s/r
! ||Error: VALUE attribute conflicts with INTENT(INOUT) attribute at (1)|
integer, value, Intent(Out) :: jV ! This will FAIL, since ByVal/ByRef collision with calling s/r
! ... in spite of "byVal" in calling s/r
! ||Error: VALUE attribute conflicts with INTENT(OUT) attribute at (1)|
jV = -7
iV = 7
end
That is, anything with an "Out" aspect to it must be "byRef" (at least in normal settings), since the calling s/r is expecting "byRef". Thus, even if all the s/r's declare Args as "Value", they are "byVal" only locally (again in standard settings). So, any attempt by the called s/r to return an Arg that is declared as Value with any sort of Out Intent, will FAIL due to the "collision" of the passing styles.
If it must be "Out" or "InOut" and "Value", then one cannot use Intent, which is somewhat more than simply saying "it is not pass by value".
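For contrast with the failing declarations above, here is a small sketch of a combination that does compile: a Value Arg with no Intent, next to an ordinary Intent(InOut) Arg. The names value_demo, bump_value, bump_ref and seen are hypothetical. The procedures are placed in a module because Value requires an explicit interface.

```fortran
module value_demo
  implicit none
contains
  ! i is a local copy: the assignment is legal but never reaches the caller.
  subroutine bump_value(i, seen)
    integer, value       :: i
    integer, intent(out) :: seen
    i = i + 1
    seen = i                 ! report what the local copy became
  end subroutine bump_value

  ! Ordinary intent(inout) argument: the caller's variable is updated.
  subroutine bump_ref(i)
    integer, intent(inout) :: i
    i = i + 1
  end subroutine bump_ref
end module value_demo
```

With k = 5, `call bump_value(k, seen)` leaves k at 5 and sets seen to 6, while `call bump_ref(k)` changes k itself to 6.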