generate diagram of "include" relationships between source code files [closed] - include

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Improve this question
I want to do some refactoring of code, especially the "include"-like relationships between files. There are quite a few of them, and to get started, it would be helpful to have a list, diagram, or even a columnar graph, so that I can see at a glance what is included from where.
(In many cases, a given file is included by multiple other files, so the graph would be a DAG, not a tree. There are no cycles.)
I'm working with TeX (actually ConTeXt), but the question would seem to apply to any programming languages that has a facility like that of #include in C.
The obvious, easy answer is to do a grep or "Find in Files" on all the .tex files for the relevant keywords (\usemodule, \input, and a couple of other macros we've defined). This is better than nothing, but the output is long, and it's still difficult to see patterns in what includes what. For example, is file A usually included before file B? Is file C ever included multiple times by the same file?
I guess that brings out an additional, but optional feature: that such a tool would be able to show the sequence of includes from a particular file. So in that case the DAG could be a multigraph, i.e. there could be multiple arcs from one file to another.
Ideally, it would be nice to be able to annotate each file, giving a very brief summary of what's in it. This would form part of the text on the graph node for that file.
Probably this sort of thing could be done with a script that generates graphviz dot language. But I wanted to know if it has already been done, rather than reinvent the wheel.

As it is friday in my country right now, and I'm waiting for my colleagues to go to have a beer, I thought I'd do a little programming.
Here http://www.luki.webzdarma.cz/up/IncludeGraph.zip you can download source of a really simple utility that looks for all files in one folder, parses #includes and generates a .dot file for that.
It supports and correctly handles relative paths, and works on windows and should work on linux as well. It is written in very spartan way. My version of dot is not parsing the generated files, there is some bug, but I really need to go drinking now, see if you can fix it. I'm not a regular dot user and I dont see it, though I'm sure it is pretty obvious.
Enjoy ...
PS - If you run into trouble compiling and/or running, please let me know. Thanks.
EDIT
Ok, my bad, there was a few glitches on linux. The dot problem was it was using "graph" instead of "digraph". But it is working like charm now. Here is the link. Just type make, and if that goes, make test should generate the following diagram of the program itself:
It ignores preprocessor directives in the C++ files so it is not very useful for that directly (could be fixed by simply calling g++ with preprocessor output flag and processing that instead of the actual files). I didn't get to regexp today, but if you have any programming experience, you will find that modifying DotGraph.cpp shouldn't be very hard to hard-code your inclusion token, and to change list of file extensions. Might get to regexp tomorrow or something.

A clever and general solution would be to trace the build system (using something like strace, LD_PRELOAD, patching the binaries, or some other debugging facility).
Once you'd collected the sequence of file open/close operations, you'd just have to filter out the uninteresting stuff, it should be easy to build the dependency tree for any language as long as the following assumptions are true:
The build system will open each file when it is included.
The build system will close each file when it reaches its end.
Unfortunately, a well-written or poorly-written compiler might violate these assumptions by for instance only opening a file the first time it is included, or never closing any files.
Perhaps because of these limitations, I'm not aware of any implementation of this idea.
On the other hand, clever build systems may include functionality to compute or extract dependencies themselves. gcc has the -M option to output dependencies, and javac figures out dependencies on its own (although I don't know how to get it to output them).
As far as TeX goes, I don't know enough TeX to actually implement this, but conceptually it seems like it should be possible to redefine the low-level include to command to:
write a log of what is about to be included
call the original include command to include it
write a log of what was included
You could then build your tree from the log output.

Related

What's the easiest way in Xcode to search exclusively for all writes to (or reads from) a variable

I find myself doing this from time to time and I end up just doing a search on the variable name and then looking through all the hits, skipping the ones that aren't what I'm looking for. But it occurred to me that if Xcode could reliably filter into one set or the other, it would make the process more efficient. Because quite often one only needs to know where a variable is changed in particular, so reads don't really matter then.
I can think of some ways eg. searching for "setXXX" and "XXX =" etc. but that all seems a bit clunky and imprecise. Is there a better way?
I am mostly coding in Objective-C and am looking for a way to do a static search of the source code.
What I do when I want to search for a symbol in Xcode I use the built-in search tool as you suggested in your question and that is good enough for me. However, if you want better tools for static analysis of obj-c code I'd suggest XClarify from CodeGears. It became free for open source contributors and research.

How to automate the correctness-checking of makefiles? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
Improve this question
I know there are many linters for programming languages, like pep8 for python, but I have never come across one for a makefile. Are there any such linters for makefiles?
Any other ways to programmatically detect errors or problems in makefiles without actually running them?
As I have grown into using a makefile, it keeps on getting more complicated and long, and for me it would make sense to have a linter to keep the makefile more readable.
Things have apparently changed. I found the following:
Checkmake
Mint
Of the two, Checkmake has (as of 2018/11) more recent development, but I haven't tried either.
I also do not know where to find make file lint (web search for "make file lint" got me here), but here is a incomplete list of shallow idea fragments for implementing a make file lint utility...
White spaces are one aspect of the readability, as tabs and spaces have distinct semantics within make file. Emacs makefile-mode by default warns you of "suspicious" lines when you try to save make file with badly spaced tabs. Maybe it would be feasible to run emacs in batch mode and invoke the parsing and verification functions from that mode. If somebody were to start implementing such a make file lint utility, the emacs lisp mode could be an interesting to check.
About checking for correctness, #Mark Galeck in his answer already mentioned --warn-undefined-variables. The problem is that there is a lot of output from undefined but standardized variables. To improve on this idea, a simple wrapper could be added to filter out messages about those variables so that the real typos would be spotted. In this case make could be run with option --just-print (aka -n or --dry-run) so as to not run the actual commands to build the targets.
It is not good idea to perform any changes when make is run with --just-print option. It would be useful to grep for $(shell ...) function calls and try to do ensure nothing is changed from within them. First iteration of what we could check for: $(shell pwd) and some other common non-destructive uses are okay, anything else should invoke a warning for manual check.
We could grep for $ not followed by ( (maybe something like [$][^$(][[:space:]] expressed with POSIX regular expressions) to catch cases like $VARIABLE which parses as $(V)ARIABLE and is possibly not what the author intended and also is not good style.
The problem with make files is that they are so complex with all the nested constructs like $(shell), $(call), $(eval) and rule evaluations; the results can change from input from environment or command line or calling make invocations; also there are many implicit rules or other definitions that make any deeper semantic analysis problematic. I think all-compassing make lint utility is not feasible (except perhaps built-in within make utility itself), but certain codified guidelines and heuristic checks would already prove useful indeed.
The only thing resembling lint behaviour is the command-line option --warn-undefined-variables.

Prolog SWI : Logtalk, How do I load my own project files?

so this week consisted of me installing Logtalk, one of the extensions for Prolog. In this case I'm using Prolog SWI, and I've run into a little snag. I'm not sure how to actually consult my own projects using Logtalk. I have taken a look at the examples that Logtalk comes with in order to understand the code itself, and in doing so I've been able to load them and execute them perfectly. What I don't understand though is what is actually going on when logtalk is loading a file, and how I can load my own projects.
I'll take the "hello_world" example as the point of discussion. The file called hello_world, is located in the examples folder of the Logtalk files. and yet it is consulted like so:
| ?- logtalk_load(hello_world(loader)).
First thing I thought was "that is a functor", looking at what it was doing using trace, I found that it was being called from the library and was being told how to get to the examples folder, where it then opened the "hello_world" folder and then the "loader" file. After which normal compiling happened.
I took a look at the library and couldn't figure out what was going on. I also thought that this can't possibly be the practical route to load user created projects in Logtalk. There was another post that was asking how to do this with SWI as well, but it didn't have any replies and didn't look like any effort had been made to figure the problem out.
Now let me be clear on something, I can use the "consult('...')." command just fine, I can even use "consult" to open my projects, however if I do this the logtalk console doesn't seem to be using any of the logtalk extensions and so is just vanilla prolog. I've used an installer for windows to install logtalk and I know that it is working as I've been looking at the examples that it comes with.
I've tried to find a tutorial but it is very difficult to find much of anything for Logtalk, the most I have found is this documentation on loading from within your project:
logtalk_load/1.
logtalk_load/2.
which I understand like so:
logtalk_load(file). % Top level loading
logtalk_load(folder(file). % Bottom level loading
So to save a huge manual load each time I would have a loader file that will load the other components of my project (which is what the examples for Logtalk do). This bit makes sense to me, I think, how I get to my loader file, doesn't.
Whether or not I have been understanding it correctly or not remains to be seen, but even if I have been understanding it correctly, I'm still lost as to how I load my own projects. Thanks for any help you can give, if you could give an example that'll be best as I do learn from examples quite quickly.
LITTLE UPDATE
You asked if I was using a logtalk console for my program running, and I am, I'm using the one that is provided and referred to during the "QUICK_START" file [Start > Programs > Logtalk > "Logtalk - Prolog-SWI (console)"] I thought to double check if the logtalk add ons were working and tested the "birds" example since it uses objects and is a nice familiar example. Yet again, everything works fine when using the logtalk_load/2 functor.
I took a look at what the library path was referring to a bit more given the feedback given so far. Looking into how logtalk loads files. Set up as it is so far, without changing things logtalk consults a folder which contains a prolog file called libpaths. It is basically how the examples are found, all it is is a part way description for where to get a file from. So when I say "logtalk_load/2" from what I can tell at least I'm going to this file and finding where the folder is that I'm asking for.
Now since I have already placed my own project folder in the examples folder, I promptly added my own folder to the list to test if this would at least be a part way solution to help me understand things a bit more. I added the following to the libpaths.pl file.
logtalk_library_path(my_project, examples('my_project/')).
% The path must end in a / so I have done so
So, I've got my folder path declared, got my folder, and the loader file is what I'll be calling when I use the loader. Without thinking about setting my own lib path folder, I should have enough to get things working and do some practical learning. But alas no, seems my investigation failed and I was returned the following:
ERROR: Unhandled exception: existence_error(library,project_aim)
Not what I wanted to see, I'm back to this library error business. I'm missing a reference to my project folder somewhere but I don't know where else it could need referencing. Running trace on the matter didn't help I simply had the following occur:
Call: (17) logtalk_library_path(my_project, _G943) ? creep
Fail: (17) logtalk_library_path(my_project, _G943) ? creep
ERROR: Unhandled exception: existence_error(library,my_project)
The call is failing, I'm simply not finding a reference where ever it is logtalk is looking. And I'm a novice at best when it comes to these sorts of issues, I've been using computers now for only 3 years and programming for the past 2 in visual studios using c# and c++. At least I've shone some more light on the matter, any more helpful advice given this information?
Please use the official Logtalk support channels for help in the future. You will get timely replies there. Daniel, thanks for providing help to this user.
I assume that you're using Logtalk 2.x. Note that Logtalk 3.x supports relative and full source file paths. In Logtalk 2.x, the logtalk_compile/1-2 (compile to disk) and logtalk_load/1-2 (compile and load into memory) predicates take either the name of a source file (without the .lgt extension) OR the location of the source file to be loaded using "library notation". To use the former you need first to change the current working directory to the directory containing the file. This makes the second option more flexible. As you mention, the hello_world example you cite, can be loaded by typing:
?- logtalk_load(hello_world(loader)).
or:
?- {hello_world(loader)}.
Logtalk 2.x and 3.x also provide integration with some SWI-Prolog features such as consult/1, make/0, edit/0-1, the graphical tracer and the graphical profiler. For example:
?- [hello_world(loader)].
********** Hello World! **********
% [ /Users/pmoura/logtalk/examples/hello_world/hello_world.lgt loaded ]
% [ /Users/pmoura/logtalk/examples/hello_world/loader.lgt loaded ]
% (0 warnings)
true.
To load your own examples and projects, the easiest way is to add a library path to the directory holding your files to the $LOGTALKUSER/settings.lgt file (%LOGTALKUSER%\settings.lgt on Windows) as Daniel explained. The location of the Logtalk user directory is defined by you when using the provided installer. The default is My Documents\Logtalk in Windows. Editing the libpaths.pl file is not a good idea. Use the settings.lgt file preferentially to define your own library paths. Assuming, as it seems to be your case, that you have created a %LOGTALKUSER%\examples\project_aim directory, add the following lines to your %LOGTALKUSER%\settings.lgt file:
:- multifile(logtalk_library_path/2).
:- dynamic(logtalk_library_path/2).
logtalk_library_path(project_aim, examples('project_aim/').
If you have a %LOGTALKUSER%\examples\project_aim\loader.lgt file, you can then load it by typing:
?- {project_aim(loader)}.
Hope this helps.
What makes me uncertain of my answer is just that you claim the usual consult works but not logtalk_load. You do have to run a different program to get to Logtalk than Prolog. In Unix it would be something like swilgt for SWI-Prolog or gplgt for GNU Prolog. I don't have Windows so I can't really tell you what you need to do there, other than maybe make sure you're running a binary named Logtalk and not simply Prolog.
Otherwise I think your basic problem is that in Windows it's hard to control your working directory. In a Unix environment, you'd navigate your terminal over to the directory with your files in it and launch Logtalk or Prolog from there. Then when you name your files they would be in the current directory, so Prolog would have no trouble finding them. If you're running a command-line Prolog, you can probably configure the menu item so that it will do this for you, but you have to know where you want to send it.
You can use the functor notation either to get at subdirectories (so, e.g., foo(bar(baz(bat(afile)))) finds foo\bar\baz\bat\afile.lgt). This you seemed to have figured out, and I can at least corroborate it. This will search in its predefined list of functors, and also in the current directory. But you could launch Logtalk from anywhere and then run, say, assertz(logtalk_library_path(foo, 'C:\foo\bar\baz\bat')). after which logtalk_load(foo(afile)) is going to be expanded to C:\foo\bar\baz\bat\afile.lgt.
Building on that technique, you could put your files in the Logtalk user directory and use $LOGTALKUSER as demonstrated in the documentation. I can't find a definitive reference on where the Logtalk user directory will be on Windows, but I would expect it to be in the Documents and Settings folder for your user. So you could put stuff in there and reference it by defining a new logtalk_library_path like this.
It's nice, but it still leaves you high and dry if you have to keep on re-entering these assertions every time you launch. Fortunately, there is a Logtalk settings file named settings.lgt in your Logtalk user directory which has a chunk of commented-out code near the top:
% To define a "library" path for your projects, edit and uncomment the
% following lines (the library path must end with a slash character):
/*
:- multifile(logtalk_library_path/2).
:- dynamic(logtalk_library_path/2).
logtalk_library_path(my_project, '$HOME/my_project/').
logtalk_library_path(my_project_examples, my_project('examples/')).
*/
You can simply uncomment those lines and insert your own stuff to get a persistent shortcut.
You can also write a plrc file for SWI Prolog to define other things to happen at startup. The other option seems cleaner since it's Logtalk-specific, but a plrc is more general.
Once you have that machinery in place, having a loader file will be a lot more helpful.
NOTE: I don't have Windows to test any of this stuff on, so you may need to make either or both of the following changes to the preceeding:
You may need to use / instead of \ in your paths (or maybe either will work, who knows?). I'd probably try / first because that's how all other systems work.
You may need to use %LOGTALKUSER% instead of $LOGTALKUSER, depending on how Logtalk expands variables.
Hope this helps and I hope you stick with Logtalk, it could use some passionate users like yourself!

How can one get a list of Mathematica's built-in global rewrite rules?

I understand that over a thousand built-in rewrite rules in Mathematica populate the global rules table by default. Is there any way to get Mathematica to give a full or even partial list of those rules?
The best way is to get a job at Wolfram Research.
Failing that, I think that for things not completely compiled into the kernel you can recover most of the rules/definitions. Look at
Attributes[fn]
where fn is the command that you're interested in. If it returns
{Protected, ReadProtected}
then there's something you can get a look at (although often it's just a MakeBoxes (formatting) definition or a AutoLoad/Stub type definition). To see what's there run
Unprotect[fn];
ClearAttributes[fn, ReadProtected];
??fn
Quite often you'll have to run an example of the command to load it if it was a stub. You'll also have to dig down from the user-facing commands to the back-end implementations.
Eventually you'll most likely reach a core command that is compiled into the kernel that you can not see the details of.
I previously mentioned this in tips for creating Graph diagrams and it got a mention in What is in your Mathematica tool bag?.
An good example, with a nice bite-sized and digestible bit of code is Experimental`AngularSlider[] mentioned in Circular/Angular slider. I'll leave it up to you to look at the code produced.
Another example is something like BoxWhiskerChart, where you need to call it once in order to load all of the code. Then you see that BoxWhiskerChart proceeds to call Charting`iBoxWhiskerChart which you'll have to unprotect to look at, etc...

Which tools can help in translating (as in french -> english and not C++ -> java) source code? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
I have some code that is written in french, that is the variables, class, function all have french names. The comments are also in french. I'd like to translate the code to english. This will be quite a challenge, since it's a 18K lines project and I'd like to know if there is any tool that could help me, especially with the variables/class/function names, since it will be error prone to rename them all.
Is there any tools that can help me? Advices?
edit : I'm not looking for machine translation. I'm looking for a tool that would help me translate the code. Let's say there is class name C and this class has a method named TraverserLaRue and I rename it CrossTheRoad I'd like all references to TraverserLaRue in all files to be translated as CrossTheRoad. However I don't want the method TraverserLaRue of class B to be translated.
I assume the langauge in question is one of the common ones, such as C, C++, C#, Java, ...
(You don't have a language with French keywords? I once encountered an entirely Swedish version of Pascal, and I gave up on working that).
So you have two problems:
Translating identifiers in the source code
Translating comments
Since comments contain arbitrary natural language text, you'll need an arbitrary translation of them. I don't think you can find an automated tool to do that.
Unlike others, however, I think you have a decent chance at translating the identifiers
and changing them en masse.
SD makes a line of source code "obfuscator" products. These tools don't process the code as raw text, rather they process the source code in terms of the targeted language; they accurately distinguish identifiers from operators, numbers, comments etc. In particular, they
operate reliably as need on just the identifiers.
One of the things these tools do is to replace one identifier name by another (usually a nonsense name) to make the code really hard to understand. Think abstractly of a map of identifier names I -> N. (They do other things, but that's not interesting here). Because you often want to re-obfuscate a file that has changed, the same way as an original, these tools allow you to reuse a previous cycle's identifier map, which is represented as list of I -> N pairs.
I think you can abuse this to do what you want.
Step 1: Run such an obfuscator on your original French code. This will produce a text file containing all the identifiers in the code as a map of the form
I1 -> N1
I2 -> N2
....
You don't care about the Ns, just the I's.
Step 2: Manually translate each French I to an English name E you think fits best.
(I have no specific suggestions about how to do this; some of the other answers here
have suggestions).
Some of the I's are likely to be library calls and are thus already correct.
You can modify the text obfuscation map file to be:
I1 -> E1
I2 -> E2
Step 3: Run the obfuscation tool, and make it use your modified obfuscation map.
It can be told to do that.
Viola, all the identifiers in your code will be changed the way you specify.
[You may get, as a freebie, the re-formatting of your original text. These tools can also format code nicely. Your name changes are likely to screw up the indentation/spacing in the original text so this is a nice bonus].
Any refactoring tool has a rename feature. Many questions on SO address language specific refactoring tools.
For the comments, you will have to handle them manually.
I did this with German code a while ago, but had mixed results because of abbreviations in names, etc. Using regular expressions, I wrote a parser that removed all of the language specific keywords and characters, then separated comments from the rest of the code, and now I had a lot of words that didn't necessarily mean anything to me by themselves. So I wrote a unique word finder that added them all to a ordered text file. Next stop was Google's language tools that attempted to translate every word in the list. I ran through the list to see if each word really translated, and if it did, I did a replace all in the code with the english equivalent. The comments I put back in with the complete translation, if it worked. What I found was that I ended up having to talk with someone who understood "Germish" to translate the abbreviations, slang terms, and mixed language pieces. So in short, regular expressions with a dictionary, unless someone has a real tool for this, which I would be interested in also.
You should definitely look into https://launchpad.net/rosetta
Ubuntu uses this to translate thousands of its packages written in hundreds of programming languages into hundreds of human languages, with updates for each new version. Truly herculean task.
edit: ...to clarify how Rosetta is used at Ubuntu: it modifies all natural language strings occuring in source code of the open-source apps, creating a language-specific source packages, which upon compiling create given kinds of binaries. Of course it does not edit binaries themselves.
First maintainers create "template files" which are something like "Patch with wildcards" - a set of rules what and where in the source tree needs to be translated, but not to what. Then Rosetta displays strings to be translated, and allows volunteering translators to provide translations to their language for each entry. Each entry can be discussed, modified, suggestions submitted and moderated. Stats are provided how much needs to be translated, which translations are unsure, which are missing etc. When the translation is complete, patch of given language is applied to the source creating its version for given language. Then a distribution is compiled from the modified sources.
This allows translation both for sources that use some external resources for multilingual allowing for language change on the fly, and for ones that have literal native language strings right in the source code, mixed with business logic.
When a new version of the package is released, template must be edited to include all new strings but it has quite good automation for preserving the existing ones. Of course only translations for new strings are required.
IMHO automatic tools won't be of any help here. Just translating variable and function names is not enough and will make the code worse because they cannot infer the original programmer intent when he choose a variable name.
Depending on what programming language this code is written to there are modern IDEs that might ease the refactoring but if you want to have good results manual code review is a must.
A good IDE will be able to list classes, methods, variables. There's also documentation generation tools that'll do that such as Javadoc for Java, Doxygen for many languages, etc.
To do the actual translation, there will be no tool that will perform well, or even to a satisfactory level. The only way to get something worthwile is to have a bilingual translator translate the terms. I've been doing freelance translations for many years, and can tell you that trying to have some machine do the translating is a waste of time. Many examples, choice of words, will be relevant to your culture and not the other. And that's just the tip of the iceberg.
Unless you find someone that can do the translation, I suggest you abandon the idea. Leave the source code as is. If a non-French speaker reads it, and needs to understand something, let them do the Google lookup. If they are native English speakers they'll probably do a better job of understanding the automatic translated stuff than you would, being French. When translating, you always want to translate into your native language.
For translating only comments you may try this simple utility I wrote (it's using Microsoft's Translator API): transource.

Resources