How to compile multiple Chicken Scheme files?

I need to compile a Chicken Scheme project containing multiple source files, but I'm getting errors.
According to the manual and this SO answer, I need to put (declare)s in my sources. Why the compiler can't just see that I'm importing the other source is beyond me, but meh.
The problem is, even if I put the (declare)s in, the compiler complains about the (import)s and (use)s.
infinity.filesystem.scm:
(use bindings filepath posix)
(declare (uses infinity.general.scm))
(load-relative "infinity.general.scm")
(module infinity.filesystem (with-open-file make-absolute-path with-temporary-directory with-chdir)
  (import scheme filepath posix infinity.general)
  (begin-for-syntax
    (use bindings chicken)
    (import infinity.general))
  ...etc...
infinity.general.scm:
(declare (unit infinity.general.scm))
(require-extension srfi-1 srfi-13 format data-structures ansi-escape-sequences basic-sequences)
(module infinity.general (bind+ format-ansi repeat-string join-strings pop-chars! inc! dec!
                          take* drop* take-right* drop-right* ends-with? take-where)
  (import scheme chicken srfi-1 srfi-13 data-structures ansi-escape-sequences basic-sequences bindings ports format)
  ...etc...
Command:
$ csc -uses bindings.o -uses infinity.general.o -c infinity.filesystem.scm -o infinity.filesystem.o
Compiler says:
Syntax error (import): cannot import from undefined module
and
unbound variable: use
If I just remove the import and use declarations for "infinity.general", the file compiles. However, I have two problems with this:
Will the resulting .o file actually work, in the absence of import and use clauses? Or will it complain about missing code at runtime?
csi requires that my code contains (import) and (use) declarations, whereas csc requires that it does not. I, however, require that my code works in both csi and csc!
How can I solve this, please?

Why the compiler can't just see that I'm importing the other source is beyond me, but meh.
Declares are used to determine dependencies: the compiler needs to know in what order (and if at all) to invoke a particular toplevel, to ensure the right code is initialized before any of the globals from that unit can be used. When everything is being compiled separately, the compiler wouldn't know when to insert calls to toplevels. The -uses switch you pass to csc is redundant: csc -uses foo is equivalent to putting (declare (uses foo)) in the source code. Passing -uses foo.o doesn't do anything with the file foo.o as far as I can tell.
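For illustration, here is a minimal sketch of the unit mechanism, with hypothetical file and procedure names:
foo.scm:
(declare (unit foo))   ; this file provides the unit "foo"
(define (greet) (print "hello from foo"))
main.scm:
(declare (uses foo))   ; ensure foo's toplevel runs before ours
(greet)
Compiling and linking:
$ csc -c foo.scm -o foo.o
$ csc -c main.scm -o main.o
$ csc foo.o main.o -o main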
In your code snippet, you're using load (via load-relative), which is not the correct way to include code at compile time: load reads and evaluates the target file at run time. Instead, you should omit the load completely: the declare already takes care of the dependency; you just need to link the files together.
Also, it's not very common to use filenames as module/unit names, though it should work.
If I just remove the import and use declarations for "infinity.general", the file compiles. However, I have two problems with this:
1) Will the resulting .o file actually work, in the absence of import and use clauses? Or will it complain about missing code at runtime?
You'll need to keep the import expressions, or the program won't compile. If it does compile, there's something strange going on. You don't need use when you link everything together statically. If you're using dynamic linking, you will get a runtime error.
The error you get about unbound variable: use is because you're using use in a begin-for-syntax block. You'll probably just need to (import-for-syntax chicken), as per your other SO question.
2) csi requires that my code contains (import) and (use) declarations, whereas csc requires that it does not. I, however, require that my code works in both csi and csc!
It looks like you're approaching this too quickly: you are writing a complete program and at the same time trying to make it run both compiled and interpreted, without first building an understanding of how the system works.
At this point, it's probably a good idea to experiment first with a tiny project consisting of two files. Then you can figure out how to compile an executable that works from code that also works in the interpreter. Then, use this knowledge to build the actual program. If at any point something breaks, you can always go back to the minimal case and figure out what you're doing differently.
This will also help in getting support, as you would be able to present a complete, but minimal set of files, and people will be able to tell you much quicker where you went wrong, or whether you've found a bug.
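As a hedged starting point for such an experiment (Chicken 4, all names hypothetical), one common layout compiles the module as a shared extension with an import library, so the same sources work interpreted and compiled:
foo.scm:
(module foo (greet)
  (import scheme chicken)
  (define (greet) (print "hello from foo")))
main.scm:
(use foo)
(greet)
$ csc -s -J foo.scm      ; builds foo.so plus the import library foo.import.scm
$ csi -s main.scm        ; interpreted: use finds and loads foo.so
$ csc main.scm -o main   ; compiled: foo.so is loaded dynamically at run time
Exact extension search paths vary with your installation, so you may need to run these from the same directory.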

Related

Common Lisp best practices for splitting code between files

I'm moderately new to Common Lisp, but have extensive experience with other 'separate compilation' languages (think C/C++/FORTRAN and such).
I know how to do an ASDF system definition. I know how to separate stuff in packages. I'm using SBCL, by the way.
The question is this: what's the best practice for splitting code (large packages) between .lisp files? I mean, in C there are include files, while Lisp lives with the current image state. So with multiple files I need to handle dependencies or serial order in the system definition. But without something like forward declarations it's painful.
A simple example of what I want to do: I have, say, two defstructs that are part of the same bigger data structure (struct1 is a parent of some set of struct2). Some functions work on one, some on the other, and some use both.
So I would have: a packages.lisp, a fun1.lisp (with the first defstruct and related functions), a fun2.lisp (with the other defstruct and functions) and a funmix.lisp (with functions that use both). In an ideal world everything would be sealed and compiling these in this order would be fine. As most of you know, in practice this almost never happens.
If I need to use struct2 functions from the struct1 ones I would need to either reorder or add a dependency. But then if there's some kind of back call (that can't be done with a closure) I would have struct1.lisp depending on struct2.lisp and vice versa, which is obviously not valid. So what? I could break the loop by putting the defstructs in a separate file (say, structs.lisp), but what if either struct's functions need to access the common functions in the third file? I would like to avoid style notes.
What's the common way to solve this, i.e. keeping loosely related code in the same file but still being able to interface with the other files? Is the correct solution to seal everything in a compilation unit (a single file)? Use a package for every file with exports?
Lisp dependencies are simple, because in many cases, a Lisp implementation doesn't need to process the definition of something in order to compile its use.
Some exceptions to the rule are:
Macros: macros must be loaded in order to be expanded. There is a compile-time dependency between a file which uses macros and the files which define them.
Packages: a package foo must be defined in order to use symbols like foo:bar or foo::priv. If foo is defined by a defpackage form in some foo.lisp file, then that file has to be loaded (either in source or compiled form).
Constants: constants defined with defconstant should be seen before their use. Similar remarks apply to inline functions and compiler macros.
Any custom things in a "domain specific language" which enforces definition before use. E.g. if Whizbang Inference Engine needs rules to be defined when uses of the rules are compiled, you have to arrange for that.
For certain diagnostics to be suppressed, like calls to undefined functions, the defining and using files must be treated as a single compilation unit. (See below.)
All the above remarks also have implications for incremental recompilation.
When there is a dependency like the above between files, so that one is a prerequisite of the other, then whenever the prerequisite is touched, the dependent one must be recompiled.
How to split code into files is going to be influenced by all the usual things: cohesion, coupling and what have you. A Common-Lisp-specific reason to keep certain things together in one file is inlining: the call to a function which is in the same file as the caller may be inlined. If your program supports any in-service upgrade, the granularity of code loading is individual files. If some functions foo and bar should be independently redefinable, don't put them in the same file.
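As a hedged illustration of the inlining point (names invented):
;; same-file callers may expand SQUARE in place if the compiler honors this
(declaim (inline square))
(defun square (x) (* x x))
(defun sum-of-squares (a b)
  (+ (square a) (square b)))   ; these calls may be inlined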
Now about compilation units. Suppose you have a file foo.lisp which defines a function called foo and bar.lisp which calls (foo). If you just compile bar.lisp, you will likely get a warning that an undefined function foo has been called. You could compile foo.lisp first and then load it, and then compile bar.lisp. But that will not work if there is a circular reference between the two: say foo.lisp also calls (bar) which bar.lisp defines.
In Common Lisp, you can defer such warnings to the end of a compilation unit, and what defines a compilation unit isn't a single file, but a dynamic scope established by a macro called with-compilation-unit. Simply put, we can do this:
(with-compilation-unit ()     ; the macro takes an (often empty) options list first
  (compile-file "foo.lisp")   ; contains (defun foo () (bar))
  (compile-file "bar.lisp"))  ; contains (defun bar () (foo))
If a compile-file isn't surrounded by with-compilation-unit then there is a compilation unit spanning that file. Otherwise, the outermost nesting of the with-compilation-unit macro determines the scope of what is in the compilation unit.
Warnings about undefined functions (and such) are deferred to the end of the compilation unit. So by putting foo.lisp and bar.lisp compilation into one unit, we suppress the warnings about either foo or bar not being defined and we can compile the two in any order.
Build systems use with-compilation-unit under the hood, as appropriate.
The compilation unit isn't about dependencies but diagnostics. Above, we don't have a compile time dependency. If we touch foo.lisp, bar.lisp doesn't have to be recompiled or vice versa.
By and large, Lisp codebases don't have a lot of hard dependencies among the files. Incremental compilation often means that just the affected files that were changed have to be recompiled. The C or C++ problem that everything has to be rebuilt because a core header file was touched is essentially nonexistent.
but what if
No matter how you first organize your code, if you change it significantly you are going to have to refactor. IMO there is no ideal way of grouping dependencies in advance.
As a rule of thumb it is generally safe to define generic functions first, then types, then actual methods, for example. For non-generic functions, you can cut circular dependencies by adding forward declarations:
(declaim (ftype function ...))
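For instance, a sketch of cutting a cycle between two mutually recursive functions (hypothetical names):
;; forward-declare BAR so the compiler accepts calls to it inside FOO
(declaim (ftype (function (integer) integer) bar))
(defun foo (n) (if (zerop n) n (bar (1- n))))
(defun bar (n) (foo (1- n)))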
Having too much circular dependency is a bit of a code smell.
Is the correct solution to seal everything in a compilation unit
Yes, if you group the definitions in the same compilation unit (the same file), the file compiler will be able to hold the style notes until it reaches the end of the file: at that point it knows whether there are still missing references or whether all the cross-references are resolved.
But then if there's some kind of back call (that can't be done with a closure)
If you have a specific example in mind please share, but typically you can define struct1 and its functions in a way that is self-contained; maybe it can accept a list that maps event names to callbacks:
(make-struct-1 :callbacks (list :on-empty #'one-is-empty
                                :on-full #'one-is-full))
Similarly, struct2 can accept callbacks too (dependency injection), and the main struct can tie them together using closures.
Alternatively, you can design your data structures so that they signal conditions, and then in the caller code you intercept them to bind things together.
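As a hedged sketch of the callback approach (every name here is hypothetical), struct1 stores a plist of callbacks and invokes them without knowing anything about struct2:
(defstruct (struct1 (:constructor make-struct-1))
  callbacks)   ; plist mapping event names to functions
(defun struct1-notify (s1 event &rest args)
  (let ((cb (getf (struct1-callbacks s1) event)))
    (when cb (apply cb args))))   ; call back only if a handler was supplied
;; wiring it up, e.g. in funmix.lisp:
(defun one-is-empty () (format t "struct1 ran empty~%"))
(make-struct-1 :callbacks (list :on-empty #'one-is-empty))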

how to check for a macro defined in a c file in Makefile?

I have a #define ONB in a C file which (with several #ifndef...#endifs) changes many aspects of a program's behavior. Now I want to change the project Makefile (or even better, Makefile.am) so that if ONB is defined, and some other options are set accordingly, it runs some special commands.
I searched the web, but all I found was how to check for environment variables... So is there a way to do this? Or must I change the C code to check environment variables instead? (I'd prefer not to change the code, because it is a really big project and I don't know everything about it.)
Questions: My level is insufficient to ask in comments so I will have to ask here:
How and when is the define added to the target in the first place?
Do you essentially want a way to query the binaries after compilation to determine whether a particular define was used?
It would be helpful if you could give a concrete example, i.e. what are the special commands you want run, and what are the .c .h files involved?
Possible solution: Depending on what you need you could use LLVM tools to maybe generate and examine the AST of your code to see if a define is used. But this seems a little like over engineering.
Possible solution: You could also use #include to pull in the .c or header files, have a conditional #error generated, and compile (to a .o); if the compile fails, you know whether the macro is defined. But this has its own issues, depending on how things are set up in your Makefile.
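As a sketch of that last idea (file names hypothetical; GNU Make assumed), you can compile a tiny probe that fails exactly when ONB is missing, and branch on the result:
/* probe.c: include whatever may define ONB; you may need the project's usual -I flags */
#include "big_file.c"
#ifndef ONB
#error ONB is not defined
#endif
# Makefile fragment:
ONB_DEFINED := $(shell $(CC) -fsyntax-only probe.c 2>/dev/null && echo yes)
ifeq ($(ONB_DEFINED),yes)
    # ... run the special commands here ...
endif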

Use vs Import vs Require vs Require-extension in Chicken Scheme

I'm a little hazy on the differences between (use) and (import) in Chicken. Similarly, how do (load), (require) and (require-extension) differ?
These things don't seem to be mentioned much on the website.
Load and require are purely run-time, procedural actions. Load accepts a string argument and loads the file with that name (which can be source or compiled code) into the running Scheme, so that whatever it defines becomes available. Require does the same thing, but checks if the file has already been loaded by seeing if provide has been called on the same name (typically by the file as it is loaded). They are relatively rare in Scheme programming, corresponding to plug-ins in other languages, where code unknown at compile time needs to be loaded. See the manual page for unit eval for more details.
Import is concerned with modules rather than files. It looks for the named module, which must have already been loaded (but see below for Chicken 5), and makes the names exported from that module visible in the current context. In order to successfully import a module, there must be an import library for it. It is syntax, so the module name must appear explicitly in the call and cannot be computed at run time. See the manual page on modules for more details.
Require-library does the Right Thing to load code. If the code is already part of the running Scheme, either because it is built into Chicken or because it has already been loaded, it does nothing. Otherwise it will load a core library unit if there is one, or call require as a last resort. At compile time, it does analogous things to make sure that the environment will be correct at run time. See the "Non-standard macros and special forms" page in the manual for more details.
Use does a require-library followed by an import on the same name. It is the most common way to add functionality to your Chicken program. However, we write (import scheme) and (import chicken) because the functionality of these modules is already loaded. Require-extension is an exact synonym for use, provided for SRFI 55 compatibility. In R7RS mode, import is also a synonym for use.
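For example, in Chicken 4 these three forms are roughly equivalent ways to get srfi-1:
(use srfi-1)
(require-extension srfi-1)                        ; SRFI 55 spelling of the same thing
(begin (require-library srfi-1) (import srfi-1))  ; what use does, in effect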
Update for Chicken 5
Use has been removed from the language, and import now does what use used to do: it loads (if necessary) and then imports. Require-extension is consequently now a synonym for import.
In addition, Chicken-specific procedures and macros have been broken into modules with names like (chicken base) and (chicken bitwise).
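So in Chicken 5 one writes, for example:
(import (chicken base)   ; loads the library if necessary, then imports it
        (chicken bitwise)
        srfi-1)          ; an egg, assuming it has been chicken-installed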

Boost.Tests: Specify function name instead of main

I am working on a project using the cmake build system. By default CMake has a nice framework for generating a single executable from a set of C/C++ code. The cmake function is called create_test_sourcelist. What it does is generate a C/C++ dispatcher with a single main entry point which will call other C/C++ code.
Therefore I have a bunch of C/C++ files with a function signature such as: int TestFunctionality1(int argc, char *argv[]), which I'd like to keep as-is, unless of course it means even more work.
How can I keep this system in place and start using BOOST_CHECK? I could not figure out how to tell Boost.Test that the actual entry point is not called int main(int argc, char *argv[]).
My goal is to have a framework for integration with Jenkins; since the project already uses Boost, I believe this should be doable without rewriting the existing CMake test suite and changing all tests into independent main functions.
Unfortunately, it seems there is no straightforward and clean way to do that.
On one side, the only useful function of create_test_sourcelist is to generate a test driver: a pretty simple, naive C/C++ translation unit with little room for hacking or extension, based on ${cmake-prefix}/share/cmake/Templates/TestDriver.cxx.in (and there is no way to select some other template).
On the other side, Boost UTF offers its own test runner (the equivalent of a test driver in CMake's terminology), but in every variant (static, dynamic, single-header) it contains a definition of main() one way or another (even in the case of the external test runner).
…so you end up with two main() functions and no ability to choose a single one.
Digging a little into the sources of create_test_sourcelist, I really wonder why they implemented it as a command and not as an ordinary (external) CMake module: it does nothing special that couldn't be implemented in the CMake language. The command is also rather simplistic: it doesn't check that the required functions really exist (you'll get compile errors if something is wrong), and there is no way to flexibly customize the output source file at all. All it does is strip paths and extensions from a list of source files and substitute them into the mentioned template using an ordinary configure_file()…
So, personally, I see no reason to use it at all; that is why I've done the same (but better ;)) job in the module mentioned in the comment above.
If you still want to use that command: the generated test driver is completely useless if you want to use Boost UTF. You need to provide your own initialization function anyway (and it is not main()), where you can manually register your test cases into a master test suite (or organize your tests into some more complex tree). In that case there is absolutely no reason to use create_test_sourcelist! All you can get from it is a list of sources that needs to be given to add_executable()… but that is much easier to do with set()… The command can't even help you with the list of test functions to call (actually a list of filenames without extensions), because it is used internally and not exported. Do you still want to use that command?
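For reference, a hedged sketch of that manual-registration route with the single-header variant (the test name comes from the question; the wrapper and its check are invented):
// driver.cpp: Boost supplies main(); we only supply the init function.
#include <boost/test/included/unit_test.hpp>
using namespace boost::unit_test;
int TestFunctionality1(int argc, char* argv[]);    // existing test, linked in
static void run_test_functionality1()
{
    char name[] = "TestFunctionality1";
    char* args[] = { name, nullptr };
    BOOST_CHECK(TestFunctionality1(1, args) == 0); // treat the exit code as a check
}
test_suite* init_unit_test_suite(int /*argc*/, char* /*argv*/[])
{
    framework::master_test_suite().add(BOOST_TEST_CASE(&run_test_functionality1));
    return 0;
}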

How many times does a Common Lisp compiler recompile?

While not all Common Lisp implementations do compilation to machine code, some of them do, including SBCL and CCL.
In C/C++, if the source files don't change, the binary output of a C/C++ compiler will also not change, assuming the underlying system remains the same.
In a Common Lisp compiler, the compilation is not under the user's direct control, unlike C/C++. My question is: if the Lisp source files haven't changed, under what circumstances will a CL compiler compile the code more than once, and why? If possible, a simple illustrative example would be helpful.
I think that the question is based on some misconceptions. The compiler doesn't compile files, and it's not something that the user has no control over. The compiler is quite readily available through the compile function. The compiler operates on code, not on files. E.g., you can type at the REPL
CL-USER> (compile nil (list 'lambda (list 'x) (list '+ 'x 'x)))
#<FUNCTION (LAMBDA (X)) {100460E24B}>
NIL
NIL
There's no file involved at all. However, there is also a compile-file function, but notice that its description is:
compile-file transforms the contents of the file specified by input-file into implementation-dependent binary data which are placed in the file specified by output-file.
The contents of the file are compiled. Then that compiled file can be loaded. (You can load uncompiled source files, too.) I think your question might boil down to asking under what circumstances compile-file would generate a file with different contents. I think that's really implementation-dependent, and it's not really predictable. I don't know that your characterization of compilers for other languages necessarily holds, either:
In C/C++, if the source files don't change, the binary output of a C/C++ compiler will also not change, assuming the underlying system remains the same.
What if the compiler happens to include a timestamp into the output in some data segment? Then you'd get different binary output every time. It's true that some common scripted compilation/build systems (e.g., make and similar) will check whether previous output can be reused based on whether the input files have changed in the meantime. That doesn't really say what the compiler does, though.
The rules are pretty much the same, but in Common Lisp it's not common practice to separate declarations from implementation, so usually you must recompile every dependency to be sure. This is a practical consequence shared by dynamic environments.
Imagining such separation were in place, the following are blatant examples (clearly not exhaustive) of changes that require recompiling specific dependent files, as the output may be different:
A changed package definition
A changed macro character or a change in its code
A changed macro
Adding or removing an inline or notinline declaration
A change in a global type or function type declaration
A changed function used in #., defvar, defparameter, defconstant, load-time-value, an eql specializer, make-load-form generated code, defmacro et al. (e.g. setf expanders)...
A change in the Lisp compiler, or in the base image
I mean, you can see it's not trivial to determine which files need to be recompiled. Sometimes the answer is "all subsequent files", e.g. after changing the " (double-quote) macro character, which might affect every literal string, or after the compiler evolved in a non-backwards-compatible way. In essence, we end where we started: you can only be sure with a full recompile, not reusing fasls across compilations. And sometimes that's faster than determining the minimum set of files that need to be recompiled.
In practice, you end up compiling single definitions a lot in development (e.g. with SLIME) and not recompiling files when there's a fasl at least as new as the source file. Many times you reuse files from e.g. Quicklisp. But for testing and deployment, I advise clearing all fasls and recompiling everything.
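With ASDF, for example, that full rebuild can be a one-liner (system name hypothetical):
(asdf:load-system :my-system :force t)   ; recompiles everything, ignoring existing fasls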
There have been efforts to automate minimum-dependency compilation with SBCL, but I think it's too slow when you change the interim projects more often than not (it involves a lot of forking, so on Windows it's either infeasible or very slow). However, it may be a time saver for base libraries that rarely change, if at all.
Another approach is to make custom base images with base libraries built-in, i.e. those you always load. It'll save both compilation and load times.
