Function defined in main program and public to other unit - pascal

In Pascal, is there a way to define a function in the main program so that it can be called from other units? I know how to define a function in a unit so that it can be called by the main program and by other units. For some reason, I can only have two program files: one main program and one unit. One of the functions cannot be defined in the unit. Thanks

No and yes. No, it is not possible in Pascal, but with many compilers (Free Pascal, maybe also Delphi) this can be circumvented by using the support for calling external (non-Pascal) code.
This is done by declaring the function as external in the unit with a certain linker name, and adding this linker name to the function's declaration in the main module. The code will only meet up at the linker, so you are responsible for making the declarations match.
Free Pascal uses this technique e.g. to export certain OS dependent routines from the System unit without making them visible.
E.g. for Free Pascal:
Declaration of the function in the main program:
function Fpmkdir(path : pchar; mode: mode_t):cint; [public, alias : 'FPC_SYSC_MKDIR'];
begin
...
end;
Declaration of the function in the unit:
Function FpMkdir (path : pChar; Mode: TMode):cInt; external name 'FPC_SYSC_MKDIR';
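The same "meet at the linker" idea can be sketched in C++, where extern "C" plays the role of the fixed linker name. This is only an analogy under stated assumptions: the namespaces main_program and unit and the routine demo_sysc_pathlen are hypothetical names invented for the illustration, and in a real project the two halves would live in separate source files.

```cpp
#include <cstring>

// "Main program" side: defines the routine under a fixed, unmangled
// linker name. demo_sysc_pathlen is a hypothetical name for this sketch.
namespace main_program {
extern "C" int demo_sysc_pathlen(const char* path) {
  return static_cast<int>(std::strlen(path));
}
}  // namespace main_program

// "Unit" side: only declares the routine. Nothing checks that this
// declaration matches the definition above; the two meet at link time
// purely by symbol name, so keeping them in sync is the author's job.
namespace unit {
extern "C" int demo_sysc_pathlen(const char* path);

int PathLen(const char* path) { return demo_sysc_pathlen(path); }
}  // namespace unit
```

Calling unit::PathLen("abc") ends up in the definition supplied by the "main program" side, even though the unit never sees that definition.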

Related

Make-array in SBCL

How does make-array work in SBCL? Are there some equivalents of new and delete operators in C++, or is it something else, perhaps assembler level?
I peeked into the source, but didn't understand anything.
When using SBCL compiled from source and an environment like Emacs/Slime, it is possible to navigate the code quite easily using M-. (meta-point). Basically, the make-array symbol is bound to multiple things: deftransform definitions, and a defun. The deftransforms are used mostly for optimization, so it is better to just follow the function first.
The make-array function delegates to an internal make-array% one, which is quite complex: it checks the parameters, and dispatches to different specialized implementations of arrays, based on those parameters: a bit-vector is implemented differently than a string, for example.
If you follow the case for simple-array, you find a function which calls allocate-vector-with-widetag, which in turn calls allocate-vector.
Now, allocate-vector is bound to several objects: multiple defoptimizer forms, a function and a define-vop form.
The function is only:
(defun allocate-vector (type length words)
  (allocate-vector type length words))
Even if it looks like a recursive call, it isn't.
The define-vop form is a way to define how to compile a call to allocate-vector. In the function, and anywhere where there is a call to allocate-vector, the compiler knows how to write the assembly that implements the built-in operation. But the function itself is defined so that there is an entry point with the same name, and a function object that wraps over that code.
define-vop relies on a Domain Specific Language in SBCL that abstracts over assembly. If you follow the definition, you can find different vops (virtual operations) for allocate-vector, like allocate-vector-on-heap and allocate-vector-on-stack.
Allocation on the heap translates into a call to calc-size-in-bytes, followed by calls to allocation and put-header, which most likely allocate memory and tag it (I followed the definition to src/compiler/x86-64/alloc.lisp).
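To make the three steps concrete, here is a rough C++ sketch of "compute the size, allocate, write the header". It is not SBCL's code: the VectorHeader layout, the 16-byte rounding, the example widetag value and the use of std::calloc in place of the runtime's allocator are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical two-word header placed in front of the vector payload.
struct VectorHeader {
  std::uintptr_t widetag;  // identifies the specialized array type
  std::uintptr_t length;   // number of elements
};

// Counterpart of calc-size-in-bytes: header plus payload words,
// rounded up to a 16-byte boundary (an assumed alignment).
std::size_t calc_size_in_bytes(std::size_t words) {
  const std::size_t raw =
      sizeof(VectorHeader) + words * sizeof(std::uintptr_t);
  return (raw + 15) & ~static_cast<std::size_t>(15);
}

// Counterpart of put-header: tag the freshly allocated block.
void put_header(VectorHeader* v, std::uintptr_t widetag, std::size_t length) {
  v->widetag = widetag;
  v->length = length;
}

// Counterpart of the heap allocation path. std::calloc stands in for the
// runtime's allocator, so the caller releases the block with std::free.
VectorHeader* allocate_vector(std::uintptr_t widetag, std::size_t length,
                              std::size_t words) {
  void* mem = std::calloc(1, calc_size_in_bytes(words));
  if (mem == nullptr) return nullptr;
  auto* v = static_cast<VectorHeader*>(mem);
  put_header(v, widetag, length);
  return v;
}
```

In the real system the block is owned by the garbage collector rather than freed by hand, which is why the allocation step is a separate story.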
How memory is allocated (and garbage collected) is another problem.
allocation emits assembly code using %alloc-tramp, which in turn executes the following:
(invoke-asm-routine 'call (if to-r11 'alloc-tramp-r11 'alloc-tramp) node)
There are apparently assembly routines called alloc-tramp-r11 and alloc-tramp, which are predefined assembly instructions. A comment says:
;;; Most allocation is done by inline code with sometimes help
;;; from the C alloc() function by way of the alloc-tramp
;;; assembly routine.
There is a base of C code for the runtime, see for example /src/runtime/alloc.c.
The -tramp suffix stands for trampoline.
Also have a look at src/runtime/x86-assem.S.

Clean way to separate functions/subroutine declaration from definition in Fortran 90

I am working on a big Fortran 90 code, with a lot of modules. What bothers me is that when I modify the inner code of a function inside a module (without changing its interface), my Makefile (whose dependencies are based on "use") recompiles every file that "use"s that modified module, and so on recursively.
But when modifying the inner code of a function without touching its input/output, recompiling other files than the modified one is useless, no?
So I would like to separate the function declaration from their definition, like with the .h files in C or C++. What is the clean way to do this? Do I have to use Fortran include/preprocessor #include, or is there a "module/use" way of doing this?
I have tried something like this, but it seems to be quite nonsense...
main.f90
program prog
use foomod_header
integer :: i
bar=0
i=42
call foosub(i)
end program prog
foomod_header.f90
module foomod_header
integer :: bar
interface
subroutine foosub(i)
integer :: i
end subroutine
end interface
end module foomod_header
foomod.f90
module foomod
use foomod_header
contains
subroutine foosub(i)
integer ::i
print *,i+bar
end subroutine foosub
end module foomod
If submodules aren't an option (and they are ideal for this), then what you can do is make the procedure an external procedure and provide an interface for that procedure in a module. For example:
! Program.f90
PROGRAM p
USE Interfaces
IMPLICIT NONE
...
CALL SomeProcedure(xyz)
END PROGRAM p
! Interfaces.f90
MODULE Interfaces
IMPLICIT NONE
INTERFACE
SUBROUTINE SomeProcedure(some_arg)
USE SomeOtherModule
IMPLICIT NONE
TYPE(SomeType) :: some_arg
END SUBROUTINE SomeProcedure
END INTERFACE
END MODULE Interfaces
! SomeProcedure.f90
SUBROUTINE SomeProcedure(some_arg)
USE SomeOtherModule
IMPLICIT NONE
TYPE(SomeType) :: some_arg
...
END SUBROUTINE SomeProcedure
Some important notes:
There must only ever be one interface definition for a procedure accessible in a scope. Inside a subprogram the interface for the procedure defined by the subprogram is also considered defined - hence inside the subprogram you must not permit an interface block for procedures defined by the subprogram to be accessible. In terms of the example, this means that you must not have a USE Interfaces statement without an only clause inside the SomeProcedure external procedure.
If you do change the arguments or similar of the procedure inside SomeProcedure.f90 you had better make sure that you change the corresponding interface block inside the module!
If you can use F2003, the IMPORT statement can make life easier. Otherwise you might have to have additional modules (such as SomeOtherModule in the example) to share type definitions and the like between the Interfaces module and the external procedure.
If you have private entities or components relevant to the procedure then Fortran's rules on entity and component accessibility may prevent you from using this approach.
Typically some sort of whole program analysis is done at high levels of optimization. That analysis is typically much slower than the actual parsing of the code - splitting out procedures in this manner may not actually shorten build times significantly under these conditions.
Maybe the cleanest solution is to change the build system.
The real dependency introduced by a USE statement is not the source-code file, but the generated .mod file, which acts as a sort of "binary header file". I.e. where makefiles typically contain something like
MyProgram.o: MyModule.f90
what they really should contain is
MyProgram.o: MyModule.mod
MyModule.mod: MyModule.f90
with the creation of the .mod file being done in a way that ensures an unchanged file-system timestamp if the interface hasn't actually changed.
Sadly, compiler support is awkward. Most compilers will overwrite the .mod file anyway, so the build process must detect that the .mod file hasn't changed, e.g. by restoring the old modification time if the contents are unchanged. At the same time it needs to avoid recompiling the source file unnecessarily, which requires updating the modification time of the .mod file.
Additionally, some compilers (Intel, *cough*) add a binary time-stamp to the contents of the .mod files, which needs to be manually excluded from the comparison and has changed binary position across releases. This adds effort when supporting multiple compilers.
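One way to get the "unchanged timestamp" behaviour is a small helper that the build runs after each compile. The C++17 sketch below assumes the compiler has been told to write its fresh .mod file to a scratch location, and that a hypothetical install_if_changed step then copies it over the real one only when the bytes differ. It compares whole files, so it does not handle the embedded time-stamp problem mentioned above.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Read a whole file in binary mode so .mod contents are compared byte for byte.
std::string read_all(const fs::path& p) {
  std::ifstream in(p, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}

// Copy `fresh` over `installed` only when the contents differ.
// Returns true if `installed` was written, i.e. dependents must rebuild.
bool install_if_changed(const fs::path& fresh, const fs::path& installed) {
  if (fs::exists(installed) && read_all(fresh) == read_all(installed)) {
    return false;  // interface unchanged: the old timestamp is kept
  }
  fs::copy_file(fresh, installed, fs::copy_options::overwrite_existing);
  return true;
}
```

Because the installed file is left untouched when nothing changed, make sees no newer prerequisite and skips the objects that depend on it.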

dlopen and dylib : main application and dylib address space

My main application statically links to a static library A with a function ABC and my dynamic library xyz.dylib also statically links to the same static library A which has the same function ABC. The function ABC uses a globally defined variable.
Now the main application loads xyz.dylib using dlopen at runtime. The initializer gets called, and in it I call the ABC function. This function ABC then uses the global variable from the main application's address space.
On OS X, for functions which are inline, the dynamic linker will use the first copy that is used. So for example, if an inline function is used in your main executable first, and then used in the loaded dylib, it will use the one in the main executable.
This is normally fine, unless your inline function makes reference to a global symbol, in which case you are now using one of your globals for both the dylib and your executable.
Again this is usually fine, since the same version is used consistently.
The problem happens when you have 2 inline functions that reference a global that is in both executable and dylib, and one function gets used first in the executable, and another one used first in the dylib. Then you have a mismatched pair. For example:
class MagicAlloc
{
void* Alloc() { return gAlloc.get(); }
void Free( void* v ) { gAlloc.free( v ); }
static RealAllocator gAlloc;
};
Suppose you call MagicAlloc::Alloc in the executable, then call it in the dylib; now for all allocations in both you will use the gAlloc in the executable. Then the first call to MagicAlloc::Free happens in the dylib. Now you will try to free something that was allocated by the executable's gAlloc using the gAlloc from the dylib.
There are two solutions:
Don't use inlines to reference globals/statics. Move the global structure and the function definitions into the same translation unit (object file). Mark the globals "static" so they aren't even visible outside the translation unit. Now your functions will be resolved statically in the link step, and bound to the right global.
Hide all the symbols in the executable except the plugin api. Link as normal, but when linking the binary itself pass the following to the linker:
-Wl,-exported_symbols_list,export_file
Here export_file is a file listing the link symbols that should be exported. E.g. you will need to have at least "_main" in that file. Now when your dylib runs it won't be able to dynamically link to the wrong inlines, because they won't be in the dynamic symbol table. The second solution is also more secure, since a malicious plugin won't be able to access globals as easily.
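The first solution could look roughly like the sketch below. The split into magic_alloc.h and magic_alloc.cpp is hypothetical, RealAllocator is a minimal stand-in (its real definition is not shown in this thread), and MagicAllocLiveCount is a helper added only so the behaviour can be observed.

```cpp
#include <cstdlib>

// --- magic_alloc.h (hypothetical): declarations only, no inline bodies ---
class MagicAlloc {
 public:
  static void* Alloc();
  static void Free(void* v);
};
int MagicAllocLiveCount();  // illustration-only helper

// --- magic_alloc.cpp (hypothetical): global and functions together ---
namespace {
// Stand-in for the real allocator; it just counts live blocks.
class RealAllocator {
 public:
  void* get() { ++live_; return std::malloc(16); }
  void free(void* v) {
    if (v != nullptr) { --live_; std::free(v); }
  }
  int live() const { return live_; }

 private:
  int live_ = 0;
};

// Internal linkage: each image that links this file gets its own gAlloc,
// and the symbol is not visible outside this translation unit.
RealAllocator gAlloc;
}  // namespace

// Out-of-line definitions, bound to this file's gAlloc at static link time.
void* MagicAlloc::Alloc() { return gAlloc.get(); }
void MagicAlloc::Free(void* v) { gAlloc.free(v); }
int MagicAllocLiveCount() { return gAlloc.live(); }
```

With this layout a block obtained from MagicAlloc::Alloc in one image is only valid for MagicAlloc::Free in the same image, so pointers should not be handed across the executable/dylib boundary for freeing.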

How to assign a function to an operator?

So I have a really simple function in my unit:
Function AzonosE(Const n1,n2:TNap):Boolean;
Begin
AzonosE:=n1=n2;
End;
I'd like to assign the '=' operator to this function, so that I can use it in my main program this way: if n1=n2 (with n1,n2:TNap).
That's not standard Pascal functionality. OTOH, afaik neither is "CONST". You need to better specify your dialect/compiler.
In the case of Free Pascal, Niculare's reference to the relevant manual page is correct. It is afaik FPC specific though. For more practical applications it is best to have a look at the ucomplex unit in the RTL that defines a complex type.
Delphi afaik only allows it as part of a structured type:
http://docwiki.embarcadero.com/RADStudio/XE3/en/Operator_Overloading_%28Delphi%29

Compiling Fortran external symbols

When compiling Fortran code into object files, how does the compiler determine the symbol names?
When I use the intrinsic function "getarg", the compiler converts it into a symbol called "_getarg@12".
I looked in the external libraries and found that the symbol name inside is "_getarg@16". What is the significance of the "@[number]" at the end of "getarg"?
_name@length is highly Windows-specific name mangling applied to the names of routines that obey the stdcall (or __stdcall, by the name of the keyword used in C) calling convention, a variant of the Pascal calling convention. This is the calling convention used by all Win32 API functions, and if you look at the export tables of DLLs like KERNEL32.DLL and USER32.DLL you'd see that all symbols are named like this.
The _...@length decoration gives the number of bytes occupied by the routine arguments. This is necessary since in the stdcall calling convention it is the callee who cleans up the arguments from the stack and not the caller, as is the case with the C calling convention. When the compiler generates a call to func with two 4-byte arguments, it puts a reference to _func@8 in the object code. If the real func happens to have a different number or size of arguments, its decorated name would be something different, e.g. _func@12, and hence a link error would occur. This is very useful with dynamic libraries (DLLs). Imagine that a DLL was replaced with another version where func takes one additional argument. If it wasn't for the name mangling (the technical term for prepending _ and adding @length to the symbol name), the program would still call into func with the wrong arguments and then func would increment the stack pointer by more bytes than the size of the passed argument list, thus breaking the caller. With name mangling in place the loader would not launch the executable at all since it would not be able to resolve the reference to _func@8.
In your case it looks like the external library is not really intended to be used with this compiler, or you are missing some pragma or compiler option. The getarg intrinsic takes two arguments: one integer and one assumed-size character array (string). Some compilers pass the character array size as an additional argument. With 32-bit code this would result in 2 pointers and 1 integer being passed, totalling 12 bytes of arguments, hence the _getarg@12. The _getarg@16 could be, for example, a 64-bit routine with strings being passed by some kind of descriptor.
As IanH reminded me in his comment, another reason for this naming discrepancy could be that you are calling getarg with fewer arguments than expected. Fortran has this peculiar feature of "prototypeless" routine calls: Fortran compilers can generate calls to routines without actually knowing their signature, unlike in C/C++ where an explicit signature has to be supplied in the form of a function prototype. This is possible since in Fortran all arguments are passed by reference and pointers are always the same size, no matter the actual type they point to. In this particular case the stdcall name mangling plays the role of a very crude argument checking mechanism. If it wasn't for the mangling (e.g. on Linux with GNU Fortran where such decorations are not employed, or if the default calling convention was cdecl) one could call a routine with a different number of arguments than expected and the linker would happily link the object code into an executable that would then most likely crash at run time.
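As a small worked example of the decoration rule, the C++ sketch below builds the stdcall-style name from a list of argument sizes, using the form `_name@N` that 32-bit Windows toolchains emit. The helper stdcall_decorate is hypothetical and written only for illustration, and the rounding of each argument up to a multiple of 4 bytes is an assumption about how 32-bit x86 stack slots are counted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build "_name@N", where N is the total number of bytes the arguments
// occupy on the stack. Each argument is assumed to take a whole number
// of 4-byte stack slots (32-bit x86).
std::string stdcall_decorate(const std::string& name,
                             const std::vector<std::size_t>& arg_sizes) {
  std::size_t total = 0;
  for (std::size_t size : arg_sizes) {
    total += (size + 3) / 4 * 4;  // round up to the next multiple of 4
  }
  return "_" + name + "@" + std::to_string(total);
}
```

With two 4-byte arguments, stdcall_decorate("func", {4, 4}) yields "_func@8". Adding a hidden third 4-byte argument, such as a string length, gives "_getarg@12", which matches the 12-byte case described above.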
This is totally implementation dependent. You did not say which compiler you use. The (nonstandard) intrinsic can exist in several versions for different integer or character kinds. There can also be several versions of the runtime libraries for different computer architectures (e.g. 32 bit and 64 bit).

Resources