Compilation of Z.cpp fails. I modify a private member of class Z in file Z.h. I rerun make. Whereupon Make compiles all files that depend on Z.h: A.cpp, B.cpp, ..., no human would be so stupid and pedantic, ..., X.cpp, Y.cpp, before it finally arrives at Z.cpp, and fails again because of a trivial typo in my edit of Z.h ...
Is there a way to make Make adaptive? To let it start compilation with targets that did not compile lately?
[PS] Let me clarify: Of course I am aware that any modification of Z.h may break any of A.cpp, ..., Y.cpp. Of course those sources need to be recompiled too. This question is about a heuristic improvement of the compilation order.
I do not expect Make to learn anything about C++. It would be fully sufficient for me, and immensely helpful for many, if Make could be taught to reason as follows: On my last attempt to compile A, .., Z, everything passed except Z. Even if my user worked hard to fix Z, Z is still more likely to be broken than any other single source file. Therefore I will now compile the sources in the order Z, A, ..., Y. Thereby, on average, my user will have to wait less for me to terminate with error.
Make doesn't really understand C++, or even C. It understands file dependencies as a directed graph, in abstract terms. A.o depends on Z.h, and so does Z.o.
Also, the fact that you changed a private member (even if make would magically know) does not mean that A.o is unaffected. A lot of C++ code is inlined, and private members can be accessed by inlined code. private checks if access is from within the same class Z, not within the same file Z.o.
Usually the better fix is make -j 8. Why build only a single file at a time?
Related
I'm moderately new to common lisp, but have extended experience with other 'separate compilation' languages (think C/C++/FORTRAN and such)
I know how to do an ASDF system definition. I know how to separate stuff in packages. I'm using SBCL, by the way.
The question is this: what's the best practice for splitting code (large packages) between .lisp files? I mean, in C there are include files, while lisp lives with the current image state. So with multiple files I need to handle dependencies or serial order in the system definition. But without something like forward declarations it's painful.
Simple example on what I want to do: I have, for example, two defstructs that are part of the same bigger data structure (like struct1 is a parent of some set of struct2). Some functions works on one, some other works on the other and some other use both.
So I would have: a packages.lisp, a fun1.lisp (with the first defstruct and related functions), a fun2.lisp (with the other defstruct and functions) and a funmix.lisp (with functions that use both). In an ideal world everything is sealed and compiling these in this order would be fine. As most of you know, this in practice almost never happen.
If I need to use struct2 functions from the struct1 ones I would need to either reorder or add a dependency. But then if there's some kind of back call (that can't be done with a closure) I would have struct1.lisp depending on struct2.lisp and vice-versa which is obviously not valid. So what? I could break the loop putting the defstruct in a separate file (say, structs.lisp) but what if either of the struct's function need to access the common functions in the third file? I would like to avoid style notes.
What's the common way to solve this, i.e. keeping loosely related code in the same file but still be able to interface to other ones. Is the correct solution to seal everything in a compilation unit (a single file)? use a package for every file with exports?
Lisp dependencies are simple, because in many cases, a Lisp implementation doesn't need to process the definition of something in order to compile its use.
Some exceptions to the rule are:
Macros: macros must be loaded in order to be expanded. There is a compile-time dependency between a file which uses macro and the file which defines them.
Packages: a package foo must be defined in order to use symbols like foo:bar or foo::priv. If foo is defined by a defpackage form in some foo.lisp file, then that file has to be loaded (either in source or compiled form).
Constants: constants defined with defconstant should be seen before their use. Similar remarks apply to inline functions, compiler macros.
Any custom things in a "domain specific language" which enforces definition before use. E.g. if Whizbang Inference Engine needs rules to be defined when uses of the rules are compiled, you have to arrange for that.
For certain diagnostics to be suppressed like calls to undefined functions, the defining and using files must be taken to be as a single compilation unit. (See below.)
All the above remarks also have implications for incremental recompilation.
When there is dependency like the above between files so that one is a prerequisite of the other, when the prerequisite is touched, the dependent one must be recompiled.
How to split code into files is going to be influenced by all the usual things: cohesion, coupling and what have you. Common-Lisp-specific reasons to keep certain things together in one file is inlining. The call to a function which is in the same file as the caller may be inlined. If your program supports any in-service upgrade, the granularity of code loading is individual files. If some functions foo and bar should be independently redefinable, don't put them in the same file.
Now about compilation units. Suppose you have a file foo.lisp which defines a function called foo and bar.lisp which calls (foo). If you just compile bar.lisp, you will likely get a warning that an undefined function foo has been called. You could compile foo.lisp first and then load it, and then compile bar.lisp. But that will not work if there is a circular reference between the two: say foo.lisp also calls (bar) which bar.lisp defines.
In Common Lisp, you can defer such warnings to the end of a compilation unit, and what defines a compilation unit isn't a single file, but a dynamic scope established by a macro called with-compilation-unit. Simply put, if we do this:
(with-compilation-unit
(compile-file "foo.lisp") ;; contains (defun foo () (bar))
(compile-file "bar.lisp")) ;; contains (defun bar () (foo))
If a compile-file isn't surrounded by with-compilation-unit then there is a compilation unit spanning that file. Otherwise, the outermost nesting of the with-compilation-unit macro determines the scope of what is in the compilation unit.
Warnings about undefined functions (and such) are deferred to the end of the compilation unit. So by putting foo.lisp and bar.lisp compilation into one unit, we suppress the warnings about either foo or bar not being defined and we can compile the two in any order.
Build systems use with-compilation-unit under the hood, as appropriate.
The compilation unit isn't about dependencies but diagnostics. Above, we don't have a compile time dependency. If we touch foo.lisp, bar.lisp doesn't have to be recompiled or vice versa.
By and large, Lisp codebases don't have a lot of hard dependencies among the files. Incremental compilation often means that just the affected files that were changed have to be recompiled. The C or C++ problem that everything has to be rebuilt because a core header file was touched is essentially nonexistent.
but what if
No matter how you first organize your code, if you change it significantly you are going to have to refactor. IMO there is no ideal way of grouping dependencies in advance.
As a rule of thumb it is generally safe to define generic functions first, then types, then actual methods, for example. For non-generic functions, you can cut circular dependencies by adding forward declarations:
(declaim (ftype function ...))
Having too much circular dependency is a bit of a code smell.
Is the correct solution to seal everything in a compilation unit
Yes, if you group the definitions in the same compilation unit (the same file), the file compiler will be able to silence the style notes until it reaches the end of file: at this point it knows if there are still missing references or if all the cross-references are resolved.
But then if there's some kind of back call (that can't be done with a closure)
If you have a specific example in mind please share, but typically you can define struct1 and its functions in a way that can be self-contained; maybe it can accept a map that binds event names to callbacks:
(make-struct-1 :callbacks (list :on-empty one-is-empty
:on-full one-is-full))
Similarly, struct2 can accept callbacks too (Dependency Injection) and the main struct ties them using closures (?).
Alternatively, you can design your data-structures so that they signal conditions, and the in the caller code you intercept them to bind things together.
I have been lately looking into GoLang -- coming from C++ background-- I am reading a paper which allegedly explains the reasoning behind making Golang, here is its link: https://talks.golang.org/2012/splash.article
One of the claims being is, handling Dependencies (Package) in C and C++ is pain and takes on a #ifndef guard instance to state
The intent is that the C preprocessor reads in the file but disregards
the contents on the second and subsequent readings of the file...
I referred a GCC page for the same, https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html.
so that if the header file appears in a subsequent #include directive
and FOO is defined, then it is ignored and it doesn’t preprocess or
even re-open the file a second time
Go: "Reads in and disregard"
vs
GCC: it doesn’t preprocess or even re-open the file a second time.
Doesn't contradict?
your thoughts are appreciated. Thanks for Reading my question.
The first passage is talking about a generic compiler, which, conceptually speaking, should read the contents of the file and disregard the contents (because they are #ifdefd out). That is, roughly, what the C standard specifies a compiler should do.
But practically everything in the C standard is under the "as if" rule - a compiler does not actually have to be implemented in the way suggested in the standard, so long as the end result it produces is exactly the same in every case. As such, GCC's particular implementation adds an optimization where, in cases where it can tell with certainty that the contents of the file would be disregarded, it doesn't actually read it. This is perfectly fine because it still behaves as if it has read the file but disregarded it.
Note that other compilers do not necessarily do the same.
Background information (you do not need to repeat these steps to answer the question, this just gives some background):
I am trying to compile a rather large set of generated modules. These files are the output of a prototype Modelica to OCaml compiler and reflect the Modelica class structure of the Modelica Standard Library.
The main feature is the use of polymorphic, open recursion: Every method takes a this argument which contains the final superclass hierarchy. So for instance the model:
model A type T = Real type S = T end A;
is translated into
let m_A = object
method m_T this = m_Modelica_Real
method m_S this = this#m_T this
end
and has to be closed before usage:
let _ = m_A#m_T m_A
This seems to postpone a lot of typechecking until the superclass hierarchy is actually fixed, which in turn makes it impossible to compile the final linkage module (try ocamlbuild Linkage.cmo after editing the comments in the corresponding file to see what I mean).
Unfortunately, since the code base is rather large and uses a lot of objects, the type-structure might not be the root cause after all, it might as well be some optimization or a flaw in the code-generation (although I strongly suspect the typechecker). So my question is: Is there any way to profile the ocaml compiler in a way that signals when a certain phase (typechecking, intermediate code generation, optimization) is over and how long it took? Any further insights into my particular use case are also welcome.
As of right now, there isn't.
You can do it yourself though, the compiler source are open and you can get those and modify them to fit your needs.
Depending on whether you use ocamlc or ocamlopt, you'll need to modify either driver/compile.ml or driver/optcompile.ml to add timers to the compilation process.
Fortunately, this already has been done for you here. Just compile with the option -dtimings or environment variable OCAMLPARAM=timings=1,_.
Even more easily, you can download the opam Flambda switch:
opam switch install 4.03.0+pr132
ocamlopt -dtimings myfile.ml
Note: Flambda itself changes the compilation time (most what happens after typing) and its integration into the OCaml compiler is not confirmed yet.
OCaml compiler is an ordinary OCaml program in that regard. I would use poorman's profiler for a quick inspection, using e.g. pmp script.
While not all Common Lisp implementations do compilation to machine code, some of them do, including SBCL and CCL.
In C/C++, if the source files don't change, the binary output of a C/C++ compiler will also not change, assuming the underlying system remains the same.
In a Common Lisp compiler, the compilation is not under the user's direct control, unlike C/C++. My question is that if the Lisp source files haven't changed, under what circumstances will a CL compiler compile the code more than once, and why? If possible, a simple illustrative example would be helpful.
I think that the question is based on some misconceptions. The compiler doesn't compile files, and it's not something that the user has no control over. The compiler is quite readily available through the compile function. The compiler operates on code, not on files. E.g., you can type at the REPL
CL-USER> (compile nil (list 'lambda (list 'x) (list '+ 'x 'x)))
#<FUNCTION (LAMBDA (X)) {100460E24B}>
NIL
NIL
There's no file involved at all. However, there is also a compile-file function, but notice that its description is:
compile-file transforms the contents of the file specified by
input-file into implementation-dependent binary data which are placed
in the file specified by output-file.
The contents of the file are compiled. Then that compiled file can be loaded. (You can also load uncompiled source files, too.) I think your question might boil down to asking under what circumstances would compile-file generate a file with different contents. I think that's really implementation dependent, and it's not really predictable. I don't know that your characterization of compilers for other languages necessarily holds either:
In C/C++, if the source files don't change, the binary output of a
C/C++ compiler will also not change, assuming the underlying system
remains the same.
What if the compiler happens to include a timestamp into the output in some data segment? Then you'd get different binary output every time. It's true that some common scripted compilation/build systems (e.g., make and similar) will check whether previous output can be reused based on whether the input files have changed in the meantime. That doesn't really say what the compiler does, though.
The rules are pretty much the same, but in Common Lisp, it's not a practice to separate declarations from implementation, so usually you must recompile every dependency to be sure. This is a shared practical consequence of dynamic environments.
Imagining there was such separation in place, the following are blantant examples (clearly not exhaustive) of changes that require recompiling specific dependent files, as the output may be different:
A changed package definition
A changed macro character or a change in its code
A changed macro
Adding or removing a inline or notinline declaration
A change in a global type or function type declaration
A changed function used in #., defvar, defparameter, defconstant, load-time-value, eql specializer, make-load-form generated code, defmacro et al (e.g. setf expanders)...
A change in the Lisp compiler, or in the base image
I mean, you can see it's not trivial to determine which files need to be recompiled. Sometimes, the answer is "all subsequent files", e.g. changing the " (double-quotes) macro-character, which might affect every literal string, or the compiler evolved in a non-backwards compatible way. In essence, we end where we started: you can only be sure with a full recompile and not reusing fasls across compilations. And sometimes it's faster than determining the minimum set of files that need to be recompiled.
In practice, you end up compiling single definitions a lot in development (e.g. with Slime) and not recompiling files when there's a fasl as old or younger than the source file. Many times, you reuse files from e.g. Quicklisp. But for testing and deployment, I advise clearing all fasls and recompiling everything.
There have been efforts to automate minimum dependency compilation with SBCL, but I think it's too slow when you change the interim projects more often that not (it involves a lot of forking, so in Windows it's either infeasible or very slow). However, it may be a time saver for base libraries that rarely change, if at all.
Another approach is to make custom base images with base libraries built-in, i.e. those you always load. It'll save both compilation and load times.
Background
I have a (large) project A and a (large) project B, such that A depends on B.
I would like to have two separate makefiles -- one for project A and one for project B -- for performance and maintainability.
Based on the comments to an earlier question, I have decided to entirely rewrite B's makefile such that A's makefile can include it. This will avoid the evils of recursive make: allow parallelism, not remake unnecessarily, improve performance, etc.
Current solution
I can find the directory of the currently executing makefile by including at the top (before any other includes).
TOP := $(dir $(lastword $(MAKEFILE_LIST)))
I am writing each target as
$(TOP)/some-target: $(TOP)/some-src
and making changes to any necessary shell commands, e.g. find dir to find $(TOP)/dir.
While this solves the problems it has a couple disadvantages:
Targets and rules are longer and a little less readable. (This is likely unavoidable. Modularity has a price).
Using gcc -M to auto-generate dependencies requires post-processing to add $(TOP) everywhere.
Is this the usual way to write makefiles that can be included by others?
If by "usual" you mean, "most common", then the answer is "no". The most common thing people do, is to improvise some changes to the includee so the names do not clash with the includer.
What you did, however, is "good design".
In fact, I take your design even futher.
I compute a stack of directories, if the inclusion is recursive, you need to keep the current directories on a stack as you parse the makefile tree. $D is the current directory - shorter for people to type than $(TOP)/,
and I prepend everything in the includee, with $D/, so you have variables:
$D/FOOBAR :=
and phony targets:
$D/phony: