_closure and _info symbols in ghc dynamic libraries - macos

I'm wondering why some _closure symbols do not have corresponding _info symbols.
On OSX I have installed ghc-7.8.3 via https://ghcformacosx.github.io/
If I run:
nm -gU /Applications/ghc-7.8.3.app/Contents/lib/ghc-7.8.3/bin/../directory-1.2.1.0/libHSdirectory-1.2.1.0-ghc7.8.3.dylib | grep findExecut
I get the following output:
0000000000010348 D _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable1_closure
000000000000a3a8 T _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable1_info
000000000000fe90 D _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable2_closure
000000000000fe78 D _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable3_closure
000000000000fe58 D _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable4_closure
00000000000046c8 T _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable4_info
00000000000105a8 D _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable_closure
000000000000d6f0 T _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutable_info
0000000000010338 D _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutablezuzdsa_closure
000000000000a030 T _directoryzm1zi2zi1zi0_SystemziDirectory_findExecutablezuzdsa_info
Note that not all of the _closure symbols have corresponding _info symbols.
I have a situation where tar-0.4.1.0 references the findExecutable3_info symbol, and linking fails because it isn't found. But first I'd like to understand the whys and wherefores of the _info symbols.

See the diagram of a closure at https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects.
Every ordinary ("boxed") Haskell value is represented in memory by an object called a closure. The first word of the closure is called the "info pointer" and identifies what sort of value it is, while the rest of the closure contains data that determines the specific value (for instance, the fields of an ADT). Most closures are dynamically allocated on the heap, but a compiled Haskell program can also contain so-called static closures in its data sections. The _closure symbols are these closures that live inside the object file, and the _info symbols are the pointers to the end of info tables and the start of entry code.
For instance, if your program contains the source
x :: Integer
x = 123
then it will be compiled into the core
x :: Integer
x = S# 123# -- S# is the "small integer" constructor for Integer,
-- and 123# is an unboxed Int# literal
and in the object file there will be a symbol with a name like x_closure which is two words long, whose first word points to S#_info (via an ELF relocation) and whose second word is the value 123. In this case, there is no need for an x_info because x is an S# value.
For a function f, GHC will generate both an f_info which can be called directly when f is used in a context in which it is supplied enough arguments, and an f_closure with info pointer f_info which can be used otherwise (for example if f is used as an argument to a higher-order function). See https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObjects#FunctionClosures.
As for your linker error, you probably have some interface files that are out of sync with their corresponding object files. There is no particular meaning to the name findExecutable3; it's just some auxiliary definition that got lifted to the top level when compiling findExecutable. I would guess that somewhere in the interface file for System.Directory (or a module which depends on it) you have some unfolding that refers to a function findExecutable3, but when System.Directory was compiled, findExecutable3 actually ended up being some other sort of value.

Related

How to debug variables / recursive data types in Haskell

https://www.inf.ed.ac.uk/teaching/courses/inf1/fp/exams/exam-2016-paper1-answers.pdf
-- 3b
trace :: Command -> State -> [State]
trace Nil s = [s]
trace (com :#: mov) s = t ++ [state mov (last t)]
where t = trace com s
I am having trouble understanding section 3b. I try to debug the variables one by one, but I always end up violating the defined data types. The code confuses me and I want to see what the variables contain. How can I do that using Debug.Trace?
https://downloads.haskell.org/~ghc/latest/docs/html/libraries/base-4.12.0.0/Debug-Trace.html
Thank you.
It looks like section 3b is teaching "how recursion works". Those slides cover only conventional Haskell lists. What you have with :#: is the opposite, sometimes called a snoc list. A sensible design for trace would then be to produce a snoc list result as well, but it doesn't (presumably because the lecturer thinks that by torturing beginners like this, they'll learn something). Haskell list comprehensions only work for lists, not snoc lists, so half the content in those slides is useless for this exercise. So 3b is teaching you structural inversion via recursion (which is the useful half of the slides).
"violating the defined data types"
Variable t in the code you give is local to function trace, so it seems difficult to access. But its definition
t = trace com s
is not:
We know trace :: Command -> State -> [State]
We can see in the equation for t that trace is applied to two arguments.
So the type of t must be the type of the result of trace, that is [State].
Are you unsure about the types of the arguments to trace in the equation for t? In particular, about com, which is unpacked from the Command argument to trace at the top level?
Then we need to understand the type for :#:. We have for Question 3
data Command =
Nil
| Command :#: Move
That makes (:#:) an infix operator (which is why I've put it in parens).
Then we can ask GHCi for its type, to make sure.
Use the :type command, make sure to put parens.
For contrast, also ask for the :type of the usual Haskell List constructor (:) -- see the left-right inversion?
The term to the left of (:#:) is of type Command.
Then variable com in the equation for t must be type Command; and that fits what the call to trace is expecting.

Reduction of output array dimension in Fortran77 procedure

I am working on a large Fortran code, where parts are written in FORTRAN77.
There is a piece of code which causes the debugger to raise errors like:
Fortran runtime error:
Index '2' of dimension 1 of array 'trigs' above upper bound of 1
but when compiled without debugging options, the code runs and does not crash the program. Debugging options used:
-g -ggdb -w -fstack-check -fbounds-check\
-fdec -fmem-report -fstack-usage
The logic of the problematic piece of code is the following: in file variables.cmn I declare
implicit none
integer factors,n
real*8 triggers
parameter (n=32)
common /fft/ factors(19), triggers(6*n)
Variables factors and triggers are initialized in procedure initialize:
include 'variables.cmn'
...
CALL FFTFAX(n,factors,triggers)
...
FFTFAX is declared in another procedure as:
SUBROUTINE FFTFAX(N,IFAX,TRIGS)
implicit real*8(a-h,o-z)
DIMENSION IFAX(13),TRIGS(1)
CALL FAX (IFAX, N, 3)
CALL FFTRIG (TRIGS, N, 3)
RETURN
END
and let's look at procedure FFTRIG:
SUBROUTINE FFTRIG(TRIGS,N,MODE)
implicit real*8(a-h,o-z)
DIMENSION TRIGS(1)
PI=2.0d0*ASIN(1.0d0)
NN=N/2
DEL=(PI+PI)/dFLOAT(NN)
L=NN+NN
DO 10 I=1,L,2
ANGLE=0.5*FLOAT(I-1)*DEL
TRIGS(I)=COS(ANGLE)
TRIGS(I+1)=SIN(ANGLE)
10 CONTINUE
DEL=0.5*DEL
NH=(NN+1)/2
L=NH+NH
LA=NN+NN
DO 20 I=1,L,2
ANGLE=0.5*FLOAT(I-1)*DEL
TRIGS(LA+I)=COS(ANGLE)
TRIGS(LA+I+1)=SIN(ANGLE)
20 CONTINUE
In both FFTFAX and FFTRIG, the declared bounds of the dummy arguments differ from the actual sizes of the arrays passed in: TRIGS is declared with size 1 while triggers has 6*n = 192 elements, and IFAX is declared with size 13 while factors has 19.
I printed out TRIGS after calling FFTFAX in no-debugger compilation setup:
trigs: 1.0000000000000000 0.0000000000000000\
0.99144486137381038 0.13052619222005157 0.96592582628906831\
0.25881904510252074 0.92387953251128674 0.38268343236508978\
...
My questions are:
Is the notation:
DIMENSION TRIGS(1)
something more than setting the bound of an array?
Why is the program even working in no-debugger mode?
Is setting:
DIMENSION TRIGS(*)
a good fix if I want the variable trigs to be a result of the procedure?
In F77, a statement like DIMENSION TRIGS(1) (or the same with any other number, or with (*)), when it applies to a dummy argument of the procedure, only tells the compiler the rank of the array. The storage itself must belong to the array that is passed in the call of the subroutine, and F77 compilers normally do not check this.
My recommendation: either use (*) or, better, convert the F77 sources to F90 where necessary (the bits shown would compile without change...) and declare the dimension inside the subroutines using n.
Fortran compilers typically pass arguments by address, i.e. trigs(i) in the subroutine simply refers to the memory location at the address of trigs(1) + (i-1)*size(real*8).
A more consistent way to write the subroutine code could be:
SUBROUTINE FFTRIG(TRIGS,N,MODE)
! implicit real*8(a-h,o-z)
integer, intent(in) :: n
real(kind=8) :: trigs(6*n)
integer :: mode
! DIMENSION TRIGS(1)
.....
PI=2.0d0*ASIN(1.0d0)
.....
or with less ability for the compiler to check
SUBROUTINE FFTRIG(TRIGS,N,MODE)
! implicit real*8(a-h,o-z)
integer, intent(in) :: n
real(kind=8) :: trigs(:)
integer :: mode
! DIMENSION TRIGS(1)
.....
PI=2.0d0*ASIN(1.0d0)
.....
To answer your question, I would change TRIGS(1) to TRIGS(*), only to identify more clearly that array TRIGS does not have its dimension provided. TRIGS(1) is a carry-over from pre-F77 code for how to indicate this.
Using TRIGS(:) is incorrect, as defining array TRIGS in this way requires any routine calling FFTRIG to have an INTERFACE definition. This change would lead to other errors.
Your question is mixing the debugger's need for the array size vs the syntax excluding the size being provided. To overcome this you could pass the array TRIGS's declared dimension, as an extra declared argument, for the debugger to check. When using "debugger" mode, some compilers do provide hidden properties including the declared size of all arrays.

What is the purpose of the pipe character in Go's os.OpenFile flag argument?

When using the OpenFile function in Go's os package, what exactly is the purpose of the pipe character?
Example:
os.OpenFile("foo.txt", os.O_RDWR|os.O_APPEND, 0660)
Does it serve as a logical OR? If so, does Go choose the first one that is "truthy"? Given that the constants behind those flags are, at heart, just integers written in hexadecimal, how does Go choose which flag to apply when the code is compiled?
After all, if the function call were to go by the largest number, os.O_APPEND would take precedence over all other flags passed in as seen below:
os.O_RDWR == syscall.O_RDWR == 0x2 == 2
os.O_APPEND == syscall.O_APPEND == 0x400 == 1024
os.O_CREATE == syscall.O_CREAT == 0x40 == 64
UPDATE 1
To follow up on the comment below, if I have a bitwise operator calculation using os.O_APPEND|os.O_CREATE will that error if the file exists, or simply create/append as needed?
UPDATE 2
My question is two fold, one to understand the purpose of the bitwise operator, which I understand now is being used more as a bitmask operation; and two, how to use the os.OpenFile() function as a create or append operation. In my playing around I have found the following combination to work best:
file, _ := os.OpenFile("foo.txt", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0660)
file.WriteString("Hello World\n")
file.Sync()
Is this the correct way or is there a more succinct way to do this?
It is a bitwise, not a logical OR.
If you write out the numbers in binary, and assign each a truth value 0/1, and apply the logical OR to each of the bits in place i between the arguments, and then reassemble the result into an integer by binary expansion - that's the | operator.
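To make the bit-by-bit description concrete, here is a minimal sketch. It uses the numeric values quoted in the question as plain illustrative literals (the names flagRDWR, flagCreate, flagAppend and the helper combine are hypothetical, and the real os.O_* values differ between operating systems):

```go
package main

import "fmt"

// Illustrative literals copied from the values quoted in the question.
// They are assumptions for this sketch; the real os.O_* values are
// platform dependent.
const (
	flagRDWR   = 0x2
	flagCreate = 0x40
	flagAppend = 0x400
)

// combine ORs all given flags together, bit by bit.
func combine(flags ...int) int {
	mask := 0
	for _, f := range flags {
		mask |= f
	}
	return mask
}

func main() {
	mask := combine(flagRDWR, flagCreate, flagAppend)
	// Print each operand and the result in binary so the set bits line up.
	fmt.Printf("%012b\n", flagRDWR)
	fmt.Printf("%012b\n", flagCreate)
	fmt.Printf("%012b\n", flagAppend)
	fmt.Printf("%012b = %#x\n", mask, mask)
}
```

With these literals the combined value is 0x442: no flag "wins", every bit that is set in any operand is set in the result.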
It is often used in a way that is commonly described as a "bitmask" - you use a bitmask when you want a single int value to represent a (small) set of switches that could be turned on or off. One bit per switch.
You should see in this context, A | B means "all the switches in A that are on, as well as all the switches in B that are on". In your case, the switches define the exact behavior of the file open/creation function, as described by the Go manual. (And probably in more detail by the Unix manpage I linked above).
In a bitmask, constants are typically defined that represent each switch - that's how those O_* constants are determined. Each is an int with exactly one bit set and represents a particular switch. (though, be careful, because sometimes they represent combinations of switches!).
Also:
^A // All of the "switches" not currently on in A
A&^B // All of the "switches" on in A but not on in B
A^B // All of the "switches" on in exactly one of A or B
, etc.
The operator | itself is described in the Go manual here.
It is a bitwise OR operator. Its purpose being used here is to allow for multiple values to be passed as a bitmask. Thus you can combine flags to create a desired result such as using the OpenFile() function to create a file if it does not exist or append to it if it does.
os.OpenFile("foo.txt", os.O_RDWR|os.O_CREATE|os.O_APPEND, 0660)
The constants being passed as arguments from the os package are assigned values from the syscall package. This package contains low-level, operating-system-dependent values.
Package syscall contains an interface to the low-level operating system primitives. The details vary depending on the underlying system, and by default, godoc will display the syscall documentation for the current system. If you want godoc to display syscall documentation for another system, set $GOOS and $GOARCH to the desired system. For example, if you want to view documentation for freebsd/arm on linux/amd64, set $GOOS to freebsd and $GOARCH to arm. The primary use of syscall is inside other packages that provide a more portable interface to the system, such as "os", "time" and "net".
https://golang.org/pkg/syscall/
As noted by @BadZen, a bitwise OR operator, in this case the '|' character, acts at the binary level: each bit of the result is 1 if that bit is 1 in either operand.
You should see in this context, A | B means "all the switches in A that are on, as well as all the switches in B that are on".
By doing this as in the call above, you are telling the function to create the file foo.txt if it does not exist (os.O_CREATE), to open it for reading and writing (os.O_RDWR), and to append any data written to it (os.O_APPEND). Alternatively, you could pass os.O_TRUNC to truncate the file when it is opened.
The bitwise OR operator gives you a powerful way to combine different behaviors in order to get the result you want from the function.

lldb: how to call a function from a specific library/framework

Problem: In our project we have localization functions which are specific to a framework/dynamic library. That is, they have identical names but fetch resources from different bundles/folders.
I'd want to call a function from a specific library, something similar to:
lldb> p my_audio_engine.framework::GetL10nString( stringId );
lldb> expr --shlib my_audio_engine.framework -- GetL10nString();
lldb> p my_audio_engine`L10N_Utils::GetString(40000)
but all these variants don't work.
I'm adding the gdb tag in the hope that equivalent syntax, if it exists there, will work in lldb as well.
lldb's expression parser doesn't currently have the equivalent of gdb's foo.c::function meta-symbol to encode a function from a specific source file.
Please feel free to file a bug requesting this at bugreporter.apple.com. It will get duped to the one I filed a while ago, but dups are votes for features, and we haven't gotten around to this one yet 'cause nobody but me asked for it...
For the nonce, you will have to do this by hand. Here's a silly example for calling printf, which I happen to know is in libsystem_c.dylib on OS X. First, I find the address in the shared library I am interested in:
(lldb) image lookup -vn printf libsystem_c.dylib
1 match found in /usr/lib/system/libsystem_c.dylib:
Address: libsystem_c.dylib[0x0000000000042948] (libsystem_c.dylib.__TEXT.__text + 266856)
Summary: libsystem_c.dylib`printf
Module: file = "/usr/lib/system/libsystem_c.dylib", arch = "x86_64"
Symbol: id = {0x00000653}, range = [0x00007fff91307948-0x00007fff91307a2c), name="printf"
The first address (the one under Address) is the address of the function in the dylib, not where it got loaded in the running program. That's not immediately useful. I could calculate the library's load offset if I wanted to and apply it to the file address, but fortunately the first address in the Symbol's address range is the address in the running program so I don't have to. 0x00007fff91307948 is the address I want.
Now I want to call that address. I do this in two steps because it makes the casting easier, like:
(lldb) expr typedef int (*$printf_type)(const char *, ...)
(lldb) expr $printf_type $printf_function = ($printf_type) 0x00007fff91307948
Now I have a function I can call over and over:
(lldb) expr $printf_function("Hello world %d times.\n", 400)
Hello world 400 times.
(int) $2 = 23
If you are going to do this over and over, you can write a Python function that finds the symbol in the library of interest and constructs the expression that calls the right function. The Python APIs include calls to get symbols from a particular module (lldb-speak for loadable binary images), get their addresses, evaluate expressions, etc.

Fortran - explicit interface

I'm very new to Fortran, and for my research I need to get a monster of a model running, so I am learning as I am going along. So I'm sorry if I ask a "stupid" question.
I'm trying to compile (Mac OSX, from the command line) and I've already managed to solve a few things, but now I've come across something I am not sure how to fix. I think I get the idea behind the error, but again, not sure how to fix.
The model is huge, so I will only post the code sections that I think are relevant (though I could be wrong). I have a file with several subroutines, that starts with:
!==========================================================================================!
! This subroutine simply updates the budget variables. !
!------------------------------------------------------------------------------------------!
subroutine update_budget(csite,lsl,ipaa,ipaz)
use ed_state_vars, only : sitetype ! ! structure
implicit none
!----- Arguments -----------------------------------------------------------------------!
type(sitetype) , target :: csite
integer , intent(in) :: lsl
integer , intent(in) :: ipaa
integer , intent(in) :: ipaz
!----- Local variables. ----------------------------------------------------------------!
integer :: ipa
!----- External functions. -------------------------------------------------------------!
real , external :: compute_water_storage
real , external :: compute_energy_storage
real , external :: compute_co2_storage
!---------------------------------------------------------------------------------------!
do ipa=ipaa,ipaz
!------------------------------------------------------------------------------------!
! Computing the storage terms for CO2, energy, and water budgets. !
!------------------------------------------------------------------------------------!
csite%co2budget_initialstorage(ipa) = compute_co2_storage(csite,ipa)
csite%wbudget_initialstorage(ipa) = compute_water_storage(csite,lsl,ipa)
csite%ebudget_initialstorage(ipa) = compute_energy_storage(csite,lsl,ipa)
end do
return
end subroutine update_budget
!==========================================================================================!
!==========================================================================================!
I get error messages along the lines of
budget_utils.f90:20.54:
real , external :: compute_co2_storage
1
Error: Dummy argument 'csite' of procedure 'compute_co2_storage' at (1) has an attribute that requires an explicit interface for this procedure
(I get a bunch of them, but they are essentially all the same). Now, looking at ed_state_vars.f90 (which is "used" in the subroutine), I find
!============================================================================!
!============================================================================!
!---------------------------------------------------------------------------!
! Site type:
! The following are the patch level arrays that populate the current site.
!---------------------------------------------------------------------------!
type sitetype
integer :: npatches
! The global index of the first cohort in all patches
integer,pointer,dimension(:) :: paco_id
! The number of cohorts in each patch
integer,pointer,dimension(:) :: paco_n
! Global index of the first patch in this vector, across all patches
! on the grid
integer :: paglob_id
! The patches containing the cohort arrays
type(patchtype),pointer,dimension(:) :: patch
Etc etc - this goes on for another 500 lines or so.
So to get to the point, it seems like the original subroutine needs an explicit interface for its procedures in order to be able to use the (dummy) argument csite. Again, I am SO NEW to Fortran, but I am really trying to understand how it "thinks". I have been searching what it means to have an explicit interface, when (and how!) to use it etc. But I can't figure out how it applies in my case. Should I maybe use a different compiler (Intel?). Any hints?
Edit: So csite is declared as a target in all procedures, and from the declaration type(sitetype) it contains a whole bunch of pointers, as specified in sitetype. But sitetype is being properly used from another module (ed_state_vars.f90) in all procedures. So I am still confused about why I get the explicit interface error.
"Explicit interface" means that the interface to the procedure (subroutine or function) is declared to the compiler. This allows the compiler to check the consistency of arguments between calls to the procedure and the actual procedure, which can catch a lot of programmer mistakes. You can do this by writing out the interface with an interface statement, but there is a far easier method: place the procedure in a module and use that module from any other entity that calls it -- from the main program or from any procedure that is not itself in the module. But you don't use a procedure from another procedure in the same module -- they are automatically known to each other.
Placing a procedure in a module automatically makes its interface known to the compiler and available for cross-checking wherever the module is accessed with a use statement. This is easier and less prone to mistakes than writing an interface. With an interface, you have to duplicate the procedure argument list. Then if you revise the procedure, you have to revise not only the calls (of course!) but also the interface.
An explicit interface (interface statement or module) is required when you use "advanced" arguments, such as the dummy argument csite here, which has the target attribute. Otherwise the compiler doesn't know how to generate the correct call.
If you access a procedure through a use statement, you shouldn't also describe it with external. There are very few uses of external in modern Fortran -- so, remove the external attributes, put all of your procedures into a module, and use them.
I ran into the same problems you encountered whilst I was trying to install ED2 on my mac 10.9. I fixed it by including all the subroutines in that file in a module, that is:
module mymodule
contains
subroutine update_budget(csite,lsl,ipaa,ipaz)
! ... body unchanged ...
end subroutine update_budget
! ... other subroutines, etc. ...
end module mymodule
The same thing had to be done to some 10 to 15 other files in the package.
I have compiled all the files and produced the corresponding object files, but now I am getting errors about undefined symbols. However, I suspect these are independent of the modifications, so if someone has the patience, this might be a way to solve at least the interface problem.

Resources