Compiling multiple OCaml files

I am new to OCaml and trying to write a small example application. I am using ocamlc version 3.11.2 under Linux Ubuntu 10.04. I want to compile two files:
a.ml
b.ml
File b.ml uses definitions from a.ml. As far as I understand, I can use ocamlc -c to perform compilation only. I can call ocamlc one final time when I have all the .cmo files to link them to an executable. Also, when compiling a file that uses definitions from another file, I have to tell the compiler in which .cmi file to find the external definitions.
So my idea was to use:
ocamlc -i -c a.ml > a.mli
ocamlc -c a.mli b.ml
ocamlc -o b a.cmo b.cmo
The first step works and produces files a.mli and a.cmo, but when running the second step I get
File "b.ml", line 1, characters 28-31:
Error: Unbound value foo
where foo is a function that is defined in a.ml and called in b.ml.
So my question is: how can I compile each source file separately and specify the interfaces to be imported on the command line? I have been looking in the documentation and as far as I can understand I have to specify the .mli files to be included, but I do not know how.
EDIT
Here some more details. File a.ml contains the definition
let foo = 5;;
File b.ml contains the expression
print_string (string_of_int foo) ^ "\n";;
The real example is bigger but with these files I already have the error I reported above.
EDIT 2
I have edited file b.ml and replaced foo with A.foo and this works (foo is visible in b.ml even though I have another compilation error which is not important for this question). I guess it is cleaner to write my own .mli files explicitly, as suggested by

It would be clearer if you showed the code that's not working. As Kristopher points out, though, the most likely problem is that you're not specifying which module foo is in. You can specify the module explicitly, as A.foo. Or you can open A and just use the name foo.
For a small example it doesn't matter, but for a big project you should be careful not to use open too freely. You want the freedom to use good names in your modules, and if you open too many of them, the good names can conflict with each other.
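As a minimal sketch of the two options above, the script below writes a.ml plus two variants of b.ml, one using the qualified name A.foo and one using open A, and builds each if ocamlc is on the PATH. The file names b_qualified.ml and b_open.ml and the scratch directory are assumptions made for illustration, and the parentheses in the print expression are placed so that it type-checks.

```shell
# Scratch directory so nothing in the current project is touched.
workdir=$(mktemp -d)

printf 'let foo = 5\n' > "$workdir/a.ml"

cat > "$workdir/b_qualified.ml" <<'EOF'
(* Qualified access: name the module explicitly. *)
let () = print_string (string_of_int A.foo ^ "\n")
EOF

cat > "$workdir/b_open.ml" <<'EOF'
(* open A brings every name of A into scope, so foo needs no prefix. *)
open A
let () = print_string (string_of_int foo ^ "\n")
EOF

if command -v ocamlc >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    set -e
    # a.ml must come before the file that uses module A.
    ocamlc -o qualified a.ml b_qualified.ml
    ocamlc -o opened a.ml b_open.ml
    ./qualified   # prints 5
    ./opened      # prints 5
  )
else
  echo "ocamlc not found; sources written to $workdir"
fi
```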

First fix the unbound value issue, as explained by Jeffrey's answer.
This is a comment about the commands you're using.
Decomposing compilation in several steps is a good way to understand what's going on.
If you want to write your own a.mli, most likely to hide some values of the module A, then your command ocamlc -i -c a.ml > a.mli is a good way to get a first version of this file and then edit it. But if you're not touching a.mli, then you don't need to generate it: you can also directly enter
ocamlc -o foo a.ml b.ml
which will produce a.cmo, b.cmo and the executable foo.
(It will also generate a.cmi, which is the compiled version of a.mli, that you get by issuing ocamlc -c a.mli. Likewise it will also generate b.cmi).
Note that order matters: you need to provide a.ml before b.ml on the command line. This way, when compiling b.ml, the compiler has already seen a.ml and knows where to find the module A.
Some more comments:
You're right in your "As far as I understand" paragraph.
you don't really include a separate file, it's more like import in Python: the values of module A are available, but under the name A.foo. The contents of a.ml have not been copy-pasted into b.ml; rather, values of the module A, defined in a.ml and its compiled version a.cmo, have been accessed.
if you're using this module A in b.ml, you can pass any of the following on the command line before b.ml:
a.mli, which will get compiled into a.cmi
a.cmi if you've already compiled a.mli into a.cmi
a.ml or its compiled version a.cmo if you don't need to write your own a.mli, i.e. if the default interface of module A suits you. (This interface is simply every value of a.ml).
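The separate-compilation route with a hand-written interface can be sketched as follows. This is only an illustration under assumptions: the hidden value secret and the scratch directory are made up, and the script does nothing beyond writing the sources if ocamlc is not installed.

```shell
workdir=$(mktemp -d)

cat > "$workdir/a.mli" <<'EOF'
(* Hand-written interface: only foo is exported. *)
val foo : int
EOF

cat > "$workdir/a.ml" <<'EOF'
let secret = 37   (* hypothetical value, hidden because a.mli does not list it *)
let foo = 5
EOF

cat > "$workdir/b.ml" <<'EOF'
let () = print_string (string_of_int A.foo ^ "\n")
EOF

if command -v ocamlc >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    set -e
    ocamlc -c a.mli            # a.mli -> a.cmi
    ocamlc -c a.ml             # a.ml  -> a.cmo, checked against a.cmi
    ocamlc -c b.ml             # b.ml  -> b.cmo, reads a.cmi to resolve A.foo
    ocamlc -o b a.cmo b.cmo    # link, dependencies first
    ./b                        # prints 5
  )
else
  echo "ocamlc not found; sources written to $workdir"
fi
```

With this layout, referring to A.secret from b.ml would be rejected, since the compiler only sees what a.cmi exposes.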

Related

Does "-Wl,-soname" work on MinGW or is there an equivalent?

I'm experimenting a bit with building DLLs on Windows using MinGW.
A very good summary (in my opinion) can be found at:
https://www.transmissionzero.co.uk/computing/building-dlls-with-mingw/
There is even a basic project which can be used for the purpose of this discussion:
https://github.com/TransmissionZero/MinGW-DLL-Example/releases/tag/rel%2Fv1.1
Note there is a cosmetic mistake in this project which will make it fail out of the box: the Makefile does not create an "obj" directory - Either adjust the Makefile or create it manually.
So here is the real question.
How to change the Windows DLL name so it differs from the actual DLL file name ??
Essentially I'm trying to achieve on Windows, the effect which is very well described here on Linux:
https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
Initially I tried changing "InternalName" and "OriginalFilename" in the resource file used to create the DLL but that does not work.
In a second step, I tried adding "-Wl,-soname,SoName.dll" on the command that performs the final link, to change the Windows DLL name.
However, that does not seem to have the expected effect (I'm using MinGW 7.3.0, x86_64-posix-seh-rev0).
Two things make me say that:
1/ The test executable still works (I would expect it to fail, because it tries to locate SoName.dll but can't find it).
2/ "pexports.exe AddLib.dll" produces the output below, where the library name hasn't changed:
LIBRARY "AddLib.dll"
EXPORTS
Add
bar DATA
foo DATA
Am I doing anything wrong ? Are my expectations wrong perhaps ?
Thanks for your help !
David

First of all, I would like to say it's important to use either a .def file for specifying the exported symbols or use __declspec(dllexport) / __declspec(dllimport), but never mix these two methods. There is also another method using the -Wl,--export-all-symbols linker flag, but I think that's ugly and should only be used when quick and dirty is what you want.
It is possible to tell MinGW to use a DLL filename that does not match the library name. In the link step use -o to specify the DLL and use -Wl,--out-implib, to specify the library file.
Let me illustrate by showing how to build chebyshev as both a static and a shared library. Its sources consist of only 2 files: chebyshev.h and chebyshev.c.
1. Compile
gcc -c -o chebyshev.o chebyshev.c -I. -O3
2. Create static library
ar cr libchebyshev.a chebyshev.o
3. Create a .def file (as it wasn't supplied and __declspec(dllexport) / __declspec(dllimport) wasn't used either). Note that this file doesn't contain a line with LIBRARY allowing the linker to specify the DLL filename later.
There are several ways to do this if the .def file wasn't supplied by the project:
3.1. Get the symbols from the .h file(s). This may be hard as sometimes you need to distinguish for example between type definitions (like typedef, enum, struct) and actual functions and variables that need to be exported;
echo "EXPORTS" > chebyshev.def
sed -n -e "s/^.* \**\(chebyshev_.*\) *(.*$/\1/p" chebyshev.h >> chebyshev.def
3.2. Use nm to list symbols in the library file and filter out the type of symbols you need.
echo "EXPORTS" > chebyshev.def
nm -f posix --defined-only -p libchebyshev.a | sed -n -e "s/^_*\([^ ]*\) T .*$/\1/p" >> chebyshev.def
4. Link the static library into the shared library.
gcc -shared -s -mwindows -def chebyshev.def -o chebyshev-0.dll -Wl,--out-implib,libchebyshev.dll.a libchebyshev.a
If you have a project that uses __declspec(dllexport) / __declspec(dllimport) things are a lot easier. And you can even have the link step generate a .def file using the -Wl,--output-def, linker flag like this:
gcc -shared -s -mwindows -o myproject.dll -Wl,--out-implib,myproject.dll.a -Wl,--output-def,myproject.def myproject.o
This answer is based on my experiences with C. For C++ you really should use __declspec(dllexport) / __declspec(dllimport).

I believe I have found one mechanism to achieve on Windows, the effect described for Linux in https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
This involves dlltool.
In the example Makefile there was originally this line:
gcc -o AddLib.dll obj/add.o obj/resource.o -shared -s -Wl,--subsystem,windows,--out-implib,libaddlib.a
I simply replaced it with the 2 lines below instead:
dlltool -e obj/exports.o --dllname soname.dll -l libAddLib.a obj/resource.o obj/add.o
gcc -o AddLib.dll obj/resource.o obj/add.o obj/exports.o -shared -s -Wl,--subsystem,windows
Really, the key seems to be the creation with dlltool of an exports file in conjunction with --dllname. This exports file is linked with the object files that make up the body of the DLL and it handles the interface between the DLL and the outside world. Note that dlltool also creates the "import library" at the same time.
Now I get the expected effect, and I can see that the "Internal DLL name" (not sure what the correct terminology is) has changed:
First evidence:
>> dlltool.exe -I libAddLib.a
soname.dll
Second evidence:
>> pexports.exe AddLib.dll
LIBRARY "soname.dll"
EXPORTS
Add
bar DATA
foo DATA
Third evidence:
>> AddTest.exe
Error: the code execution cannot proceed because soname.dll was not found.
Although the desired effect is achieved, this still seems to be some sort of workaround. My understanding (but I could well be wrong) is that the gcc option "-Wl,-soname" should achieve exactly the same thing. At least it does on Linux, but is this broken on Windows perhaps ??

Make inconsistent assumptions over interface

I want to optimise the compilation time of my makefile. One problem that wastes my time is that, after modifying one single file, make returns, for instance,
File "frontend/parser_e.ml", line 1:
Error: The files expression/rc.cmi and frontend/gen/lexer_ref.cmi
make inconsistent assumptions over interface Utility
make: *** [frontend/parser_e.cmx] Error 2
rm frontend/parser_name.ml
Note that the files in trouble may change, but it happens quite often. What I have to do is make clean and then make; as a consequence, it is not an incremental build and takes time.
So does anyone know what I should check in my makefile to reduce the chance of having this kind of error?
Edit 1:
Actually, all my ml-related files are at depth 1, except frontend/gen/*, which are at depth 2. Following the answer of @camlspotter, I modified the ocamldep part of my makefile a little. Now it looks as follows:
DIRS= -I frontend -I frontend/gen -I lib ...
depend: $(AUTOGEN)
	# ocamldep -native $(DIRS) */*.ml */*.mli > depend # this is what was written before, I don't think it is correct
	ocamldep -native $(DIRS) *.ml *.mli > depend
As a consequence, running make right after another make immediately gives an inconsistency error.
One remark is I don't have AUTOGEN, is it normal?
Another remark is that make depend generates a depend that has 0 character, is it normal?
Edit 2:
I modified depend: by following Makefile of OCaml source code:
beforedepend:: */*.ml
depend: beforedepend
	(for d in \
	frontend frontend/gen lib ... ; \
	do ocamldep $(DIRS) $$d/*.mli $$d/*.ml; \
	done) > depend
I actually have around 20 folders, each with 1-5 ml files. This time, make hangs on for d in ... and never finishes. But if I remove 3-4 folders, it succeeds in creating a depend after several seconds.

Your Makefile does not cover all the necessary dependencies between modules.
The meaning of
File "frontend/parser_e.ml", line 1:
Error: The files expression/rc.cmi and frontend/gen/lexer_ref.cmi
make inconsistent assumptions over interface Utility
is:
frontend/parser_e.ml depends on expression/rc.ml and frontend/gen/lexer_ref.ml
Both expression/rc.ml and frontend/gen/lexer_ref.ml use module named Utility
expression/rc.ml and frontend/gen/lexer_ref.ml must agree on the type (interface) of Utility, but they do not.
I can think of two possible causes of this state:
There may be two different utility.ml, for example dir_a/utility.ml and dir_b/utility.ml. OCaml does not allow linking modules with the same name. You can work around this using packed modules (see the -pack compiler option). Your case is not this.
Both modules use the same utility.ml, but the dependencies may not be completely known to your Makefile. This is your case.
A possible scenario of the second case is:
You have modified utility.ml or utility.mli and its interface (.cmi file) has been changed.
One of expression/rc.ml and frontend/gen/lexer_ref.ml is recompiled against this new interface of Utility, but the other IS NOT, since the dependency is not known.
The compiler has found the inconsistency between the two modules when they are used together in frontend/parser_e.ml.
To fix this, you have to run ocamldep to capture all the necessary module dependencies and pass them to make. Note that:
Give proper options and arguments. Since you work with nested directories, you need -I option several times.
Make sure that the auto-generated .ml and .mli files are really generated before ocamldep runs. Since you seem to have .mly and .mll files and you have the issue around them, I suspect you miss something around here.
A good example of dependency analysis for OCaml modules is found in the OCaml compiler source code itself. It is worth checking the lines of its Makefile around beforedepend, depend and include .depend.
General hints:
Add include .depend to your Makefile and capture all the module dependencies into this .depend file, using ocamldep
Note that all the .ml and .mli files of your project must be scanned by ocamldep. Do not forget to add -I options properly or it misses some dependencies.
Before running ocamldep, make sure auto-generated .ml and .mli files, such as the output of .mly and .mll, are generated. Otherwise it misses some dependencies.
Typical Makefile looks like:
beforedepend:: x.ml
x.ml: x.mly
	ocamlyacc x.mly
beforedepend:: y.ml
y.ml: y.mll
	ocamllex y.mll
depend: beforedepend
	ocamldep -I <dir1> -I <dir2> <all the ml and mli paths> > .depend
include .depend
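The stale-interface scenario described in this answer can be reproduced in a scratch directory. The sketch below is an assumption-laden miniature: it puts all files in one directory, reuses the module names from the error message, and simply skips everything if ocamlc is not installed. The last step prints the dependency lines that ocamldep would hand to make, if ocamldep is available.

```shell
workdir=$(mktemp -d)

if command -v ocamlc >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    set -e
    # 1. Compile Utility and Rc against the first version of the interface.
    echo 'let x = 1' > utility.ml
    echo 'let a = Utility.x' > rc.ml
    ocamlc -c utility.ml rc.ml

    # 2. Change Utility's interface, but recompile only Utility and Lexer_ref,
    #    as a Makefile with a missing dependency on rc.ml would do.
    echo 'let x = 1 let y = "two"' > utility.ml
    echo 'let b = Utility.x' > lexer_ref.ml
    ocamlc -c utility.ml lexer_ref.ml

    # 3. Parser_e loads rc.cmi (old Utility) and lexer_ref.cmi (new Utility).
    echo 'let _ = Rc.a + Lexer_ref.b' > parser_e.ml
    if ocamlc -c parser_e.ml 2> err.txt; then
      echo "compiled" > result
    else
      echo "stale" > result
      cat err.txt
    fi

    # The dependencies that make needs to know about in order to avoid step 2.
    if command -v ocamldep >/dev/null 2>&1; then
      ocamldep utility.ml rc.ml lexer_ref.ml parser_e.ml
    fi
  )
else
  echo "ocamlc not found; nothing to demonstrate"
fi
```

Recompiling rc.ml after step 2 (which is what a complete .depend file would make happen automatically) lets step 3 succeed.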

Makefile -L command

If I have this line in the makefile:
libpqxx_Libs = -L/share/home/cb -lpqxx-2.6.9 -lpq
Does this tell the compiler to use the lpqxx-2.6.9.so shared object file, or does it tell the compiler to use all the .so files in the folder lpqxx-2.6.9? Or is this something else altogether?
Thanks for the help!

-L in this context is an argument to the linker, that adds the specified directory to the list of directories that the linker will search for necessary libraries, e.g. libraries that you've specified using -l.
It isn't a makefile command, even though it's usually seen in makefiles for C projects.

The -L is actually not a makefile command (as you state it in the title of your question).
What actually happens in this line is an assignment of a value to the variable libpqxx_Libs -- nothing more and nothing less. You will have to search in your makefile where that variable is used via $(libpqxx_Libs) or ${libpqxx_Libs}. That is most likely as an argument in a link command, or a compile command that includes linking.
In that context, the meaning of -L and -l can be found in, for example, the gcc man pages, which state that
-llibrary
Use the library named library when linking.
The linker searches a standard list of directories for the library, which is actually a file named `liblibrary.a'. The linker then uses this file as if it had been specified precisely by name.
The directories searched include several standard system directories plus any that you specify with `-L'.
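To make the name mapping concrete, here is a small sketch in plain shell. The function link_candidates is a hypothetical helper, not part of any toolchain: it only prints the file names a GNU-style linker on Linux would try for a given -l name in the given -L directories (the shared library first, then the static archive), and it leaves out the standard system directories that are searched as well.

```shell
# Hypothetical helper: list the files tried for -l<name>, given the -L
# directories in search order (lib<name>.so first, then lib<name>.a).
link_candidates() {
  name=$1
  shift
  for dir in "$@"; do
    printf '%s\n' "$dir/lib$name.so" "$dir/lib$name.a"
  done
}

# -L/share/home/cb -lpqxx-2.6.9: one library, looked up by name in that directory.
link_candidates pqxx-2.6.9 /share/home/cb
```

So -lpqxx-2.6.9 names a single library file (libpqxx-2.6.9.so or libpqxx-2.6.9.a), not a folder, and -L only adds a place to look for it.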

How to force gcc to link like g++?

In this episode of "let's be stupid", we have the following problem: a C++ library has been wrapped with a layer of code that exports its functionality in a way that allows it to be called from C. This results in a separate library that must be linked (along with the original C++ library and some object files specific to the program) into a C program to produce the desired result.
The tricky part is that this is being done in the context of a rigid build system that was built in-house and consists of literally dozens of include makefiles. This system has a separate step for the linking of libraries and object files into the final executable but it insists on using gcc for this step instead of g++ because the program source files all have a .c extension, so the result is a profusion of undefined symbols. If the command line is manually pasted at a prompt and g++ is substituted for gcc, then everything works fine.
There is a well-known (to this build system) make variable that allows flags to be passed to the linking step, and it would be nice if there were some incantation that could be added to this variable that would force gcc to act like g++ (since both are just driver programs).
I have spent quality time with the gcc documentation searching for something that would do this but haven't found anything that looks right, does anybody have suggestions?

Given such a terrible build system, write a wrapper around gcc that execs gcc or g++ depending on the arguments. Replace /usr/bin/gcc with this script, or modify your PATH to use this script in preference to the real binary.
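#!/bin/sh
# "wibble wobble" is a placeholder: replace it with a test on the arguments that should go to gcc.
if [ "$1" = "wibble wobble" ]
then
    exec /usr/bin/gcc-4.5 "$@"
else
    exec /usr/bin/g++-4.5 "$@"
fi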

The problem is that C compilation produces object files with unmangled C symbol names, while C++ compilation produces object files with C++ name mangling.
Your best bet is to use
extern "C"
before declarations in your C++ builds, and no prefix on your C builds.
You can detect C++ using
#if __cplusplus

Many thanks to bmargulies for his comment on the original question. By comparing the output of running the link line with both gcc and g++ using the -v option and doing a bit of experimenting, I was able to determine that "-lstdc++" was the magic ingredient to add to my linking flags (in the appropriate order relative to other libraries) in order to avoid the problem of undefined symbols.
For those of you who wish to play "let's be stupid" at home, I should note that I have avoided any use of static initialization in the C++ code (as is generally wise), so I wasn't forced to compile the translation unit containing the main() function with g++ as indicated in item 32.1 of FAQ-Lite (http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html).
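The whole arrangement (a C++ wrapper exported with extern "C", a C main program, and gcc as the link driver with -lstdc++ added) can be tried out with a sketch like the one below. The file names and the function greeting_length are made up for illustration, and the script only writes the sources if gcc and g++ are not both installed.

```shell
workdir=$(mktemp -d)

cat > "$workdir/wrapper.h" <<'EOF'
/* C-callable interface to the C++ code (hypothetical name). */
#ifdef __cplusplus
extern "C" {
#endif
int greeting_length(void);
#ifdef __cplusplus
}
#endif
EOF

cat > "$workdir/wrapper.cpp" <<'EOF'
#include <string>
#include "wrapper.h"
/* Uses std::string, so this object needs the C++ runtime at link time. */
int greeting_length(void) { return static_cast<int>(std::string("hello").size()); }
EOF

cat > "$workdir/main.c" <<'EOF'
#include <stdio.h>
#include "wrapper.h"
int main(void) { printf("%d\n", greeting_length()); return 0; }
EOF

if command -v gcc >/dev/null 2>&1 && command -v g++ >/dev/null 2>&1; then
  (
    cd "$workdir" || exit 1
    set -e
    g++ -c wrapper.cpp -o wrapper.o
    gcc -c main.c -o main.o
    # gcc as the link driver: -lstdc++ goes after the objects that need it.
    gcc -o prog main.o wrapper.o -lstdc++
    ./prog   # prints 5
  )
else
  echo "gcc/g++ not found; sources written to $workdir"
fi
```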

How do I strip local symbols from linux kernel module without breaking it?

If I do --strip-debug or --strip-unneeded, I have the .ko that lists all function names with nm, if I do just strip foo.ko I have a kernel module that refuses to load.
Does anyone know a quick shortcut to remove all symbols that are not needed for module loading, so that people cannot reverse engineer the APIs as easily?
PS: For all you open source missionaries; this is something that general public will never be using in any case so no need to turn the question into a GPL flame war.

With no answer to my previous questions, here are some guesses that could also be some clues, and a step to an answer:
From what I recall, a .ko is nothing but an .o file resulting from the merge of all the .o files generated by your source module, and the addition of a .modinfo section.
At the end of any .ko building Makefile, there is an LD call: from what I recall, ld is called with the -r option, and this is what creates that .o file that the Makefile calls a .ko. This resulting file is not to be confused with an archive or object library (.a file), which is just a format for archiving / packaging multiple .o files as one. A merged object is the result of a link that produces yet another .o module, but in the resulting module, all sections that could be merged have been, and all public / external pairs that could be resolved have been resolved inside those sections.
So I assume that you end up with your .ko file containing all your "local" extern definitions:
Those that are extern because they are used to call across the .o modules in your .ko (but are not needed anymore since they are not supposed to be called from outside the .ko), and
those that the .ko module DOES need to properly communicate with the loader and kernel.
The former have most likely already been resolved by ld during the merge, but ld has no way to know whether you intend to have them also callable from outside the .ko.
So the extraneous symbols you see are those that are extern for each of your .o files, but are not needed as extern for the resulting .ko.
And what you are looking for is a way to strip only those.
Does this last paragraph properly describe the symbols you want to get rid of?

I think this is exactly what we are talking about here.
OK, then it looks like one solution is to "manually" remove the extraneous symbols. The "strip" utility seems to allow individually stripping (or keeping) of symbols, so you would have to use one --strip-all and a small bunch of --keep-symbol= . Note that --wildcard might help a bit, too. You can do the opposite, of course, keep all and individually strip, depending on what's the most convenient.
A good start could be to remove all the symbols that you explicitly defined in your module for cross-module linking and don't want to appear - just leaving the obvious useful ones, things like init and exit. And to not touch those that have been generated by / belong to the kernel dev software infrastructure. Then trial and error until you find the right recipe... In fact, I would think that about all your own symbols might be removable, apart from those you explicitly defined yourself as EXPORT_SYMBOL (and init / exit, of course).
Good luck! :)
PS:
In fact, it seems that the required source information exists in all .ko projects to perform the required stripping automatically: Unless I'm missing something, it seems that anything that's not EXPORT_SYMBOL or explicitly inserted by the build software could theoretically be stripped by default at the end of "ld -r" time that ends a .ko build. It's just that I don't think the toolchain (compiler / linker) have provision / directives / options to individually designate "strip or keep" syms for the relocatable link / merge. Otherwise, some modifications in the EXPORT_SYMBOL macro and in a few other places could probably achieve the result you're after, and shave some bytes from most .ko files in any Linux system.

I just built a kernel without realizing the kernel config had debug symbols enabled, so the resulting modules were quite large. This worked for me:
# du -sh /lib/modules/3.1.0/
1.9G /lib/modules/3.1.0/
# find /lib/modules/3.1.0/ -iname "*.ko" -exec strip --strip-debug {} \;
# du -sh /lib/modules/3.1.0/
134M /lib/modules/3.1.0/
This finds all files in /lib/modules/3.1.0 named *.ko and executes strip --strip-debug on each of them.

I'm not sure I understand what the problem really is:
When developing a .ko, if I don't explicitly add something like
ccflags-y += -ggdb -O0 -Wall
into my Makefile, I don't get any symbols except those that I export or reference externally myself. I'm sure I don't get any other symbols for several good reasons:
the resulting .ko file is considerably smaller,
dumping the file and analyzing the ELF shows the tables are not there,
I can't see nor access the symbols in kgdb.
So I'm a little puzzled at your question, actually?... What are those symbols you do see in your .ko (and don't want to)?
How are they declared in your source file?
In which ELF sections do they end up?
And (sorry, dumb question ahead): Did you declare as static everything that didn't need to be seen outside of its own module?
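The effect of static on the symbol table can be checked on an ordinary object file. The sketch below is only an illustration on a plain .o, not a kernel module build; the names helper and api are made up, and the compile step is skipped if gcc or nm is missing.

```shell
workdir=$(mktemp -d)

cat > "$workdir/demo.c" <<'EOF'
/* static: internal linkage, not visible to other object files. */
static int helper(int x) { return x + 1; }
/* non-static: global symbol that other object files can link against. */
int api(int x) { return helper(x); }
EOF

if command -v gcc >/dev/null 2>&1 && command -v nm >/dev/null 2>&1; then
  gcc -O0 -c -o "$workdir/demo.o" "$workdir/demo.c"
  # In nm output, lowercase t marks a local text symbol (helper) and
  # uppercase T marks a global text symbol (api).
  nm "$workdir/demo.o"
else
  echo "gcc or nm not found; source written to $workdir/demo.c"
fi
```

Note that the static function still shows up as a local symbol until the object is stripped; static only keeps it from being exported to other object files.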

In addition to filofel's post:
The reason stripping userspace shared libraries keeps them functioning is that their exported symbols are in the .dynsym section, which is never stripped. .ko files, however, do not use .dynsym.

people have reported success with
strip --strip-unneeded

strip -g XXX.
A similar problem of mine was solved by this command on an embedded device with Linux kernel 3.0.8.
