How to discover (possibly approximate) dependencies between SML source files? - compilation

I have software written in Standard ML (henceforth SML) which I'd like to make portable across three compilers.
Standard ML of New Jersey is the easiest, as their Compilation Manager does the dependency analysis for you.
MLton uses a similar format to SML/NJ, but it requires that files be listed in dependency order, similar to Unix libraries.
Moscow ML uses an ordinary Makefile, but requires that dependencies be listed on the command line (and so ideally go into the Makefile).
What I'd like to do is take a list of SML source files and discover dependencies between files. Here are the ways I've tried and failed:
The SML/NJ Compilation Manager produces a dependency graph, but it is not a graph of files. Instead it contains a big collection of undocumented compiler internals. Turning that into a dependency graph on files has been tried since at least 2009; nobody has done it.
Once the files are in dependency order, I can get def/use information, which includes source code positions, from MLton. So if I find a partial solution that gets the files in order for MLton, I can use that to get a complete solution.
I'm looking for ideas along the lines of simple tools that would give approximate answers. My code has two special properties:
It never uses open.
Every top-level definition is a structure, functor, or signature definition.
It seems to me that in this situation, a simple syntactic tool might be able to find external dependencies of a file. But I don't quite know how to write one.
Ideas?

In the MLton repository, there is a cm2mlb tool that will convert a .cm file into an MLB file (util/cm2mlb of the git repo). If you put all of your files into a .cm file you should be able to use this to generate an MLB file that will be in dependency order.

Based on this thread, which is asking a similar question, I was able to write a small program that worked for me (I copied the produced list of file names, pasted it inside an .mlb file, and MLton was able to compile my little project). So, I can't guarantee anything, but it's probably a good start. Hope it helps.
sources.cm:

group
  structure Deps
is
  $/basis.cm
  $/pgraph.cm
  $smlnj/cm.cm
  deps.sml

deps.sml:

structure Deps =
struct
  structure PG = PortableGraph

  fun list () =
    let
      fun println s = print (s ^ "\n")
      fun extractFiles def =
        case def of
          PG.DEF { rhs = PG.COMPILE { src, ... }, ... } => SOME (#1 src)
        | _ => NONE
      val PG.GRAPH { defs, ... } =
        (* Replace "sources.cm" with your project's .cm file path. *)
        #graph (valOf (CM.Graph.graph "sources.cm"))
      val files = List.mapPartial extractFiles defs
    in
      List.app println files
    end
end
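The question's idea of a simple syntactic tool can also be sketched outside SML. The following Python sketch is only an approximation and relies on the two properties stated in the question (no open, and only structure, functor, or signature definitions at top level). The function names are hypothetical, `graphlib` needs Python 3.9 or later, and definitions chained with `and` are not recognized. A file is treated as depending on every other file that defines a module name it mentions, so the result may contain spurious edges.

```python
import re
from graphlib import TopologicalSorter

DEF_RE = re.compile(r"\b(?:structure|functor|signature)\s+([A-Za-z][A-Za-z0-9_']*)")
IDENT_RE = re.compile(r"[A-Za-z][A-Za-z0-9_']*")


def strip_comments_and_strings(text):
    """Blank out (* nested comments *) and "string literals" in SML source."""
    out, depth, i, n = [], 0, 0, len(text)
    while i < n:
        two = text[i:i + 2]
        if two == "(*":
            depth += 1
            out.append(" ")          # keep neighbouring tokens separated
            i += 2
        elif two == "*)" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1                   # inside a comment: skip
        elif text[i] == '"':
            i += 1
            while i < n and text[i] != '"':
                i += 2 if text[i] == "\\" else 1   # skip escaped characters
            i += 1
            out.append(" ")
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def file_dependencies(sources):
    """Map each file name to the set of other files whose modules it mentions.

    `sources` maps file name -> SML source text.
    """
    cleaned = {name: strip_comments_and_strings(text) for name, text in sources.items()}
    defined = {name: set(DEF_RE.findall(text)) for name, text in cleaned.items()}
    owner = {}
    for name, modules in defined.items():
        for module in modules:
            owner.setdefault(module, name)   # first definition wins
    deps = {}
    for name, text in cleaned.items():
        external = set(IDENT_RE.findall(text)) - defined[name]
        deps[name] = {owner[m] for m in external if m in owner}
    return deps


def dependency_order(sources):
    """Return file names with dependencies first; raises CycleError on a cycle."""
    return list(TopologicalSorter(file_dependencies(sources)).static_order())
```

The resulting order could be pasted into an .mlb file as a first attempt; MLton's def/use output can then be used to check it.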

Related

Implement a SCons source code formater. Modify a source file before compiling it

I want to format my C/C++ source code before every compilation.
I found no information on how to do it in SCons.
Ideas I tried:
What I would need: a Builder that has the same files for source and target. Impossible in SCons because cyclic dependency. env.FormatCode(target='bla.c', source='bla.c')
Use env.AddPreAction(source, format_action) on the objects resulted in compilation. Partially works, but not incremental with variant dirs.
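import sys

def StyleFormatCCode(env, source):
    sys.path.append('somePath/clangStyleChecker')
    import styleChecker

    def format_action(target, source, env):
        # Run the style checker in place on every source file.
        for file in source:
            styleChecker.main(['-f', '-i', str(file)])

    return env.AddPreAction(source, format_action)

# Assumed: the function has to be registered for env.StyleFormatCCode to exist.
env.AddMethod(StyleFormatCCode)
sources1 = env.Glob('*.cpp')
sources = env.Object(sources1)
env.StyleFormatCCode(sources)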
The style formatter problem is the same as Modify a source file before compilation.
Any idea how to do it or locations where I can find something like this in SCons?
First, don't do this in process.
If you do it in process, you will not be able to run a parallel build anywhere near as efficiently.
Python has the GIL, so multithreading is very limited if all the logic is in Python.
Second, modifying the source file before each compile will cause a recompile the next time around, since you'll be modifying the sources.
If you're okay with either or both of the above, then something similar to your method would work for you.
You can reach into the Object and SharedObject builders and prepend an action to their list of actions. You will be reaching into the innards of SCons, so proceed carefully.
Another option would be to do this logic in plain Python (no Builder()s involved) in your SConscript/SConstruct. That would prevent the rebuilds, as the sources are scanned and checked for changes after all the SConscripts have been processed.
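A minimal sketch of the plain-Python option, run out of process as recommended above. It assumes a formatter that reads source from stdin and writes the formatted result to stdout (clang-format invoked without a file argument is assumed to behave this way); the helper name `format_in_place` is hypothetical. The file is rewritten only when the formatter actually changes it, so an already formatted file is left untouched.

```python
import subprocess
from pathlib import Path


def format_in_place(path, command):
    """Pipe `path` through `command`; rewrite the file only if the output differs.

    Returns True when the file was rewritten, False when it was already formatted.
    """
    path = Path(path)
    original = path.read_bytes()
    result = subprocess.run(command, input=original,
                            stdout=subprocess.PIPE, check=True)
    if result.stdout == original:
        return False
    path.write_bytes(result.stdout)
    return True


# Possible use near the top of an SConscript, before any builder is declared
# (assumption: Glob and clang-format are available in that context):
#
#     for node in Glob('*.cpp'):
#         format_in_place(str(node), ['clang-format'])
```

Because the formatter runs as a separate process, the Python side only waits on it; whether this fits a given build still depends on how many files are involved.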

'Vertex' redeclared in this package

I have a Go project in JetBrains GoLand where all files are runnable yet independent of each other.
But to make each of them runnable, I need to declare them all as package main.
I have several "Vertex" definitions in other files, and GoLand complains about them.
The files are still runnable, though; the complaint comes purely from GoLand.
Questions:
Is there a better way to organize the files?
If not, is there a way to turn off the complaint from GoLand?
Working with multiple files that declare the main() function in the same directory is not recommended in general, mainly due to problems similar to yours.
However, there are several ways to solve this.
You can use build constraints, also known as build tags, to separate the binaries at build time. When using them, the IDE will also need to be adjusted using the Settings/Preferences | Build Tags & Vendoring. And, depending how you build your application, you might also need to adjust the build command to add the corresponding tags to it.
The other option, which I'd recommend in this case, is to move each main() defining file into a structure such as this:
/repository_root
    /cmd
        /command1
            command1.go (file holds the `main()` func)
        /command2
            command2.go (file holds the `main()` func)
        /command3
            command3.go (file holds the `main()` func)
    /some
        /package
            some_file.go
            some_other_file.go
            ....
            some_other_file.go
As an example of this layout, you can have a look at Delve, which uses a similar structure, but only has a single "command" in the cmd folder.
Lastly, sometimes it's possible to remove the duplication and move it to a common file which holds the data type, but it's not always ideal and can make the build command more complex, since you need to specify all the files that should be included in the build process.
Edit:
These articles explain how to organize your Go packages and applications:
https://rakyll.org/style-packages/
https://medium.com/@benbjohnson/standard-package-layout-7cdbc8391fc1#.ds38va3pp
https://peter.bourgon.org/go-best-practices-2016/#repository-structure
To understand more about the design philosophy for Go packages: https://www.goinggo.net/2017/02/design-philosophy-on-packaging.html

Linking against an external object file (.o) with autoconf

For work purposes I need to link against an object file that is generated by another program and found in its folder, but I could not find any information about this kind of linkage. I think that hardcoding the paths and putting name-of-obj.o in front of the package_LDADD variable should work, but I don't want to do it that way.
If the object is not found, I want configure to fail and tell the user that name-of-obj.o is missing.
I tried AC_LIBOBJ([name-of-obj.o]), but this looks for a name-of-obj.c in the root directory and tries to compile it.
Any tips or solutions for this issue?
Thank you!
"I need to link against an object file generated by another program and found in its folder"
What you describe is a very unusual requirement, not among those that the Autotools are designed to handle cleanly or easily. In particular, Autoconf has no mechanisms specifically applicable to searching for bare object files, as opposed to libraries, and Automake has no particular automation around including such objects when it links. Nevertheless, these tools do have enough general purpose functionality to do what you want; it just won't be as tidy as you might like.
"I think that if I hardcode the paths and put the name-of-obj.o in front of the package_LDADD variable should work, but the case is that I don't want to do it that way."
I take it that it is the "hardcode the paths" part that you want to avoid. Adding an item to an appropriate LDADD variable is not negotiable; it is the right way to get your object included in the link.
"If the object is not found I want the configure to fail and tell the user that the name-of-obj.o is missing."
Well, then, the key thing appears to be to get configure to perform a search for your object file. Autoconf does not have a built-in mechanism to perform such a search, but it's just a macro-based shell-script generator, so you can write such a search in shell script + Autoconf, maybe something like this:
AC_MSG_CHECKING([for name-of-obj.o])
OTHER_LOCATION=
for my_dir in \
    /some/location/other_program/src \
    /another/location/other_program.12345/src \
    $srcdir/../relative/location/other_program/src; do
  AS_IF([test -r "${my_dir}/name-of-obj.o"], [
    # optionally, perform any desired test to check that the object is usable
    # ... perhaps one using AC_LINK_IFELSE ...
    # if it passes, then
    OTHER_LOCATION=${my_dir}
    break
  ])
done

# Check whether the object was in fact discovered, and act appropriately
AS_IF([test "x${OTHER_LOCATION}" = x], [
  # Not found
  AC_MSG_RESULT([not found])
  AC_MSG_ERROR([Cannot configure without name-of-obj.o])
], [
  AC_MSG_RESULT([${OTHER_LOCATION}/name-of-obj.o])
  AC_SUBST([OTHER_LOCATION])
])
That's functional, but of course you could embellish, such as by providing for the package builder to specify a location to use via a command-line argument (AC_ARG_WITH(...)). And if you want to do this for multiple objects, then you would probably want to wrap up at least some of that into a custom macro.
The Automake side is much less involved. To get the object linked, you just need to add it to the appropriate LDADD variable, using the output variable created by the above, such as:
foo_LDADD = $(OTHER_LOCATION)/name-of-obj.o
Note that if you're building just one program target then you can use the general LDADD instead of foo_LDADD, but note that by default these are alternatives not complements.
With that said, this is a bad idea overall. If you want to link something that is not part of your project, then you should get it from an installed library. That can be a local, custom-built library, of course, so long as it is a library, not a bare object file, and it is installed. It can be a static library if you don't want to rely on or distribute a separate shared library.
On the other hand, if your project is part of a larger build, then the best approach is probably to integrate it into that build, maybe as a subproject. It would still be best to link a library instead of a bare object file, but in a subproject context it might make sense to use a lib that was not installed to the build system. In conjunction with a command-line argument that tells it where to find the wanted lib, this could make the needed Autoconf code much cleaner and clearer.

Can gdb set break at every function inside a directory?

I have a large source tree with a directory that has several files in it. I'd like gdb to break every time any of those functions are called, but don't want to have to specify every file. I've tried setting break /path/to/dir/:*, break /path/to/dir/*:*, rbreak /path/to/dir/.*:* but none of them catch any of the functions in that directory. How can I get gdb to do what I want?
There seems to be no direct way to do it:
rbreak file:. does not seem to accept directories, only files. Also note that you would want a dot ., not asterisk *
there seems to be no way to loop over symbols in the Python API, see https://stackoverflow.com/a/30032690/895245
The best workaround I've found is to loop over the files with the Python API, and then call rbreak with those files:
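import os
import gdb  # provided by gdb's embedded Python interpreter

class RbreakDir(gdb.Command):
    def __init__(self):
        super().__init__(
            'rbreak-dir',
            gdb.COMMAND_BREAKPOINTS,
            gdb.COMPLETE_NONE,
            False
        )

    def invoke(self, arg, from_tty):
        # Walk the directory and run rbreak on every file found in it.
        for root, dirs, files in os.walk(arg):
            for basename in files:
                path = os.path.abspath(os.path.join(root, basename))
                gdb.execute('rbreak {}:.'.format(path), to_string=True)

# Instantiating the class registers the command with gdb.
RbreakDir()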
Sample usage:
source a.py
rbreak-dir directory
This is ugly because of the gdb.execute call, but seems to work.
It is however too slow if you have a lot of files under the directory.
My test code is in my GitHub repo.
You could probably do this using the Python scripting that comes with modern versions of gdb. One option is to list all the symbols and then, for each one whose path contains the required directory, create an instance of the Breakpoint class at the appropriate place to set the breakpoint. (Sorry, I can't recall offhand how to get a list of all the symbols, but I think you can do this.)
You haven't said why exactly you need to do this, but depending on your use-case an alternative may be to use reversible debugging - i.e. let it crash, and then step backwards. You can use gdb's inbuilt reversible debugging, or for radically improved performance, see UndoDB (http://undo-software.com/)
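The "list all the symbols, then filter by directory" idea can be sketched by parsing the text that gdb prints for `info functions`. This is an assumption-heavy sketch: it assumes the output groups functions under `File <path>:` headers and prefixes each function with `<line>:`, which is not true of every gdb version, and that the reported paths share a prefix with the directory you pass in. The function names are hypothetical; `gdb.execute(..., to_string=True)` and `gdb.Breakpoint` are the only gdb calls used.

```python
import os
import re

FILE_RE = re.compile(r"^File (.+):$")
LINE_RE = re.compile(r"^(\d+):\s")


def locations_under(info_functions_text, directory):
    """Return "file:line" locations for functions in files below `directory`."""
    prefix = os.path.join(os.path.normpath(directory), "")
    current = None
    locations = []
    for line in info_functions_text.splitlines():
        header = FILE_RE.match(line)
        if header:
            current = os.path.normpath(header.group(1))
            continue
        entry = LINE_RE.match(line)
        if entry and current is not None and current.startswith(prefix):
            locations.append("{}:{}".format(current, entry.group(1)))
    return locations


def break_in_directory(directory):
    """Set one breakpoint per function found below `directory` (run inside gdb)."""
    import gdb  # only importable inside gdb's embedded interpreter
    text = gdb.execute("info functions", to_string=True)
    for location in locations_under(text, directory):
        gdb.Breakpoint(location)
```

Inside gdb, after sourcing the file, `python break_in_directory("/path/to/dir")` would set the breakpoints. Whether this is faster than calling rbreak per file has not been measured here.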

What does @...@ mean in this Makefile snippet?

Can somebody briefly explain (just the general idea) what the following fragment means?
- I'm new to the C language, so I don't understand the meaning of the @...@ sign:
@SET_MAKE@
VPATH = @srcdir@
pkgdatadir = $(datadir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
pkglibexecdir = $(libexecdir)/@PACKAGE@
or:
build_triplet = @build@
host_triplet = @host@
If more code is needed, let me know.
Thanks in advance.
The convention of enclosing names in @ is used by autoconf to mark strings that should be replaced by the configure script.
These appear to be build-system variables of some sort, as the @ symbol is not (I believe) used in C at all. Considering the names, this seems even more likely. The package and source directory will be inserted in the corresponding places.
Perhaps more interesting are the $(var)s, which are used often in Visual Studio project files (but not source, and a VS proj is a make file of sorts itself).
My guess is you have two make/build system variable types being used here. Whether they're from two systems, I do not know. As Brian Roach pointed out in a comment, at least GNU autoconf is involved here.
What file did this come from, and what other text surrounds it? That may shed more light, if a well known name is used. It is possible this isn't a code file at all, and just a make file; or it could be a code file with build system variables in (for at-build replace).
This is not C at all, looks more like a makefile of some sort. Take a look at the filename where you found this, I doubt it ends in .c.
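To make the substitution idea concrete, here is a small Python sketch of what the configure step conceptually does to a template such as Makefile.in: each placeholder written as `@NAME@` (the at sign is autoconf's delimiter) is replaced by the value configure determined for NAME. This is an illustration only, not autoconf's implementation; the function name and the example values are hypothetical.

```python
import re

PLACEHOLDER_RE = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)@")


def substitute(template, values):
    """Replace each @NAME@ with values[NAME]; unknown names are left untouched."""
    def replace(match):
        return values.get(match.group(1), match.group(0))
    return PLACEHOLDER_RE.sub(replace, template)


template = "VPATH = @srcdir@\npkgdatadir = $(datadir)/@PACKAGE@\n"
# Hypothetical values; a real configure run computes them for the build machine.
print(substitute(template, {"srcdir": ".", "PACKAGE": "hello"}))
```

The `$(datadir)` references are left alone by this step; they are ordinary make variables that make expands later.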

Resources