How to write gnatcheck rules - coding-style

Is it possible to write your own gnatcheck rules, and if so, can someone point me to a good reference? I am searching for a particular "style" that is being used, and would love if I could simply write a rule that says if you see said style, it will throw up a warning, or an error, this way we can flag when this isn't following a particular standard.

A bit of background may be helpful here. While the style checks hold out a lot of promise for enforcing user style guidelines, that isn't exactly what they are for.
The main purpose of those checks is to enforce Ada Core's (The folks who maintain the compiler) style on the sources of the Ada compiler itself. You may notice that the checks get automatically turned on if you try compiling one of the compiler's own source files.
It doesn't really serve AdaCore's purposes at all if the styles enforced by the checks themselves are user-configurable, so they added no feature like that.
Your first option if you want to use it yourself is to just stick to AdaCore's coding style. I haven't found it horrible in the past, so you may just look at doing that.
Still, making some kind of configurability would be a really cool feature for somebody to add. If you go this route, you probably would have to make it configurable (with the current behavior as the default), rather than just changing the checks. The reason is that you'd have to modify the compiler sources to accomplish this, and as I mentioned above, the compiler turns the checks on when compiling itself. You really don't want to have to reformat a ton of working Gnat compiler source files.
I'd really like to see someone do this at some point, as it would make the checks much more useful to those of us who work for someone besides AdaCore.

In addition to trashgod's reference, I think Section 7.1 of this PDF might be of some help:
http://extranet.eu.adacore.com/articles/HighIntegrityAda.pdf

For reference, the existing GNAT style checking is described in the GNAT User's Guide under §3.2.5 Style Checking. As the rules are enforced by the compiler, additional rules would require corresponding modifications.

Related

Handling comments in code when doing i18n

I'm in the process of translating a Open Source project from Chinese to English, and I've used i18n (in this case babel) to separate the code from both English and Chinese translations.
Everything's done, except for a rather large number of inline comments in the code.
Obviously, babel can't translate comments inline (and it would be rather obnoxious if it did, anyway. Since code would not be unique across languages and therefore less easily verifiable.)
The way I see it, there are a number of options:
Leave comments in -
Pro: Helps original author, etc.
Con: Makes it distracting for ongoing translation and anyone who doesn't speak the language
Strip out all the comments -
Pro: Code is native-language-agnostic, so it makes sense. Who needs comments anyway? Use the source, Luke!
Con: Goes against SE principles. Could lose something important in understanding how the code works - maybe something's been done to avoid a security risk, etc.
Place English comments near Chinese comments
(Possibly moved to lines above and prefixed with "EN" and "ZH", for example).
Pro: Best of both worlds, comments kept close to code
Con: Not conducive to dictionary-style translation. Can get bulky with more languages.
Create a comment dictionary / notes
Pro: Keeps the comments in a separate file for easy translation.
Con: Difficult to keep synced with code. Not intuitive to remember to update comments related to code when changing coe.
Use a different preprocessor for i18n before/after each development cycle.
Pro: Comments et al would be in your language. Could link this to git pull/push so you only ever see the code in your language.
Con: Bulky, non-obvious process. Could result in code-verification or even compilation errors.
None of these seem like really great solutions.
If you do alot of this, and the code is shared publicly between developers who don't share a native tongue, is there a recommended way to handle translating (or not) comments in the code itself?
I am not sure I understand... You say you separated the code from the languages part. So now you should have code (with comments) + English resources + Chinese resources (i used resources for whatever your programming language use to store localizable content)
Translators only see the resources, not the code, nor the comments. The comments stay untranslated, for the developers.
Short Answer
It seems to be a mixture of:
Strip out all the comments, and
Place English comments near Chinese comments.
Inline comments are almost always trivial - Strip them
Functional comments are not as intrusive - Translate them (possibly with a i18n prefix e.g. "[cn]:" or "[en]:").
Explanation
My meagre amount of research tends to suggest that larger projects make strong attempts to reduce comments and let the code explain itself, instead focusing on code quality which reduces the need for comments.
e.g. From the Linux Kernel Coding guidelines:
NEVER try to explain HOW your code works in a comment: it's much
better to write the code so that the working is obvious, and it's
a waste of time to explain badly written code.
...and from the MySQL coding standards:
Comment your code when you do something that someone else may think is
not trivial.
Both of these standards (and others) recommend minimal function descriptions also, so that's not as obtrusive to understanding the code, and, since function descriptions are generally multi-lined and above the code itself, multiple languages can be included as necessary.
Maybe someone, somewhere has built an interface that can isolate comments into the readers language, but I couldn't (yet) find any that do so.
I always think that API comments exported in the project and private comments in open source projects should be internationalized, which is very convenient for developers in other countries.
On Github, there are actually many developers who use their own national language to comment on some well-known open source projects and some of their own annotations. Most of the reason is that if they do not translate, the efficiency of developers reading comments very low.
Similar to .d.ts in TypeScript, I think function annotation translation can also take a similar form, which is more convenient for the community to feedback translation content, because in fact many developers are willing to do so.

Suggestions for using attributes beyond [[noreturn]]?

Coming from the discussions about the use of vendor specific attributes in another question I asked myself, "what rules should we tell people for using attributes that are not listed in the standard"?
The two attributes that are defined are [[ noreturn ]] and [[ carries_dependencies ]]. The standard leaves open how compilers should react on unknown attributes -- thus, by the standard they may stop with an error message. This is not what e.g. GCC does, it emits a warning and continues. This is probably a behavior to be expected by the most-common compilers. For this reason I would have like to read a "should" in the standard, but we don't have it.
The paper N2553 brings up flexible attributes. It lists further attributes used by GCC (
unused, weak) and MSVC (dllimport). for OpenMP, the widely supported parallelizing framework, scoped attributes are suggested, eg. omp::for(clause, clause), omp::parallel(clause,clause). So, it is very likely that we will se some vendor specific attributes very soon after they support the syntax at all, indeed.
Therefore, when we now go "out in the world" and tell people about C++11, what should the advice be about using attributes?
Only use noreturn and carries_dependencies
Use your compilers old syntax instead, eg. __attribute__((noreturn)) and define a macro when you port the code (the current situation)
Use those attributes your favorite compiler supports freely, knowing this code might not be portable to another standard-conforming compiler, because if the standard allows a compiler to stop with an error, you have to consider this will happen. This sounds a bit like advocating writing non-portable code.
Or, my guess, expect the most-used compilers to warn about unknown attributes, so you can use vendor-specific attributes, keeping in mind that in rare cases you may get problems.
Note the slight difference in the last two bullet-items. While both say "use those attributes you need", item3's message is "do not care about other compilers", while item4 implicitly rephrases the standard texts "implementation defined behavior" to "the compiler should emit a diagnostic message".
What could be the suggestion for an upcoming Best Practice here?
The best practice — the only one that is reasonably portable in practical terms, never mind ambiguity in the Standard — is to use macros. It will be many years before we can forget about compilers that don't support attributes.
The number of compilers and the number of custom __keywords__ defined by those compilers will always be increasing, and it makes sense for the language to define a way to contain the damage. It doesn't need to revolutionize the way people write unportable code, or make unportable code portable (although standard attributes do that). There is a benefit simply to giving caffeine-addled compiler backend engineers a sandbox for when they want to extend the grammar.
It is a bit alarming, though, that no attribute tokens are reserved to the implementation, or to the language besides the ones currently standard. So there will be trouble when they decide to standardize more of them.

How can one get a list of Mathematica's built-in global rewrite rules?

I understand that over a thousand built-in rewrite rules in Mathematica populate the global rules table by default. Is there any way to get Mathematica to give a full or even partial list of those rules?
The best way is to get a job at Wolfram Research.
Failing that, I think that for things not completely compiled into the kernel you can recover most of the rules/definitions. Look at
Attributes[fn]
where fn is the command that you're interested in. If it returns
{Protected, ReadProtected}
then there's something you can get a look at (although often it's just a MakeBoxes (formatting) definition or a AutoLoad/Stub type definition). To see what's there run
Unprotect[fn];
ClearAttributes[fn, ReadProtected];
??fn
Quite often you'll have to run an example of the command to load it if it was a stub. You'll also have to dig down from the user-facing commands to the back-end implementations.
Eventually you'll most likely reach a core command that is compiled into the kernel that you can not see the details of.
I previously mentioned this in tips for creating Graph diagrams and it got a mention in What is in your Mathematica tool bag?.
An good example, with a nice bite-sized and digestible bit of code is Experimental`AngularSlider[] mentioned in Circular/Angular slider. I'll leave it up to you to look at the code produced.
Another example is something like BoxWhiskerChart, where you need to call it once in order to load all of the code. Then you see that BoxWhiskerChart proceeds to call Charting`iBoxWhiskerChart which you'll have to unprotect to look at, etc...

Configuration Management - History in Code Comments

Let me pose a bit of background information before asking my question:
I recently joined a new software development group that uses Rational tools for configuration management, including a source control and change management system.
In addition to these tools, the team has a standard practice of noting any code changes as a comment in the code, such as:
///<history>
[mt] 3/15/2009 Made abc changes to fix xyz
///</history>
Their official purpose for the commenting standard is that "the comments provide traceability from requirement to code modification".
I am preparing to pose an argument that this practice is unnecessary and redundant; that the team should get rid of this standard immediately.
To wit - the change management system is the place to build traceability from requirement to code modification, and source control can provide detailed history of changes by performing a Diff between versions. When source code is checked in, the corresponding change management ticket is noted. When a CM ticket is resolved, we note which source code files were modified. I believe this provides a sufficient cross-reference for the desired traceability.
I would like to know if anyone disagrees with my argument. Am I missing some benefit of commented source code history that change management and source control systems cannot provide?
For myself, I have always found such comments to be more trouble than they're worth: they can cause merge conflicts, can appear as 'false positives' when you're trying to isolate the diffs between two versions, and may reference code changes that have since been obsoleted by later changes.
It's often (not always, but often) possible to change version-control systems without losing metadata. If you were to move your code to a system that doesn't support this, it would not be hard to write a script to convert the change history into comments before the cutover.
A comment allows you to find all the changes and their reasons in the code right where they are relevant without having to dig into diffs and version control system intricacies. Furthermore, should you decide to change of version control system, the comments will stay.
I worked on a large project with similar practice that had changed of source control system twice. There wasn't a day when I wasn't glad to have these comments.
Is it redundant? Yes.
Is it unnecessary? No.
I've always thought that code should be, of course, under version control, and that the current source code (the one that you can open and read today) should be valid only in present tense.
It doesn't matter if a report could have up to 3 axis in the past and last month you updated it to support up to 6 axis. It doesn't matter if you expanded some function or fixed some bug, as long as the current version can be easily understood. When you fix a bug, just leave the fixed code.
There's an exception, though. If (and only if) the fixed code looks less intuitive to you than the previous, incorrect one; if you feel that someone might come tomorrow and, just by reading the code, be tempted to change it back to what "seems more correct", then it's good to add a comment: "This is done this way to avoid... blah blah blah." Also, if the problem behind is an infamous war story inside the team's culture, or if for some reason the bug report database contains very interesting information about this part of the code, I wouldn't find it incorrect to add "(see Bug Id 10005)" to the explaining comment.
The one that jumps to mind to me is vendor lockin. If you ever moved away from Rational, you'd need to make sure that the full change history was maintained during the migration - not just the version of the artifacts.
When you're in the code you need to know why it's structured like that, hence in code commenting. Tools that sit outside the code, good though they may be, require far too much of a context shift in your brain to be useful. As well as that, trying to reverse engineer the code intent from documentation and a diff is pretty damn hard, I'd much rather read a line of comment any day.
There was a phase in the code I work on, back in the 1994-96 time frame, where there was a tendency to insert change history comments at the top of the file. Those comments are now meaningless and useless, and one of the many standard cleanups I perform when editing files containing such comments is to remove them.
In contrast, there are also some comments with a bug number at the location where the change is made, typically explaining why the ridiculous code is as it is. These can be very helpful. The bug number gives you somewhere else to look for information, and fingers the culprit (or victim - it varies).
On the other hand, items like this one - genuine; cleaned up last week - make me grit my teeth.
if (ctab->tarray && ctab->tarray[i])
#ifndef NT
prt_theargs(*(ctab->tarray[i]));
#else
/* Correct the parameter type mismatch in the line above */
prt_theargs(ctab->tarray[i]);
#endif /* NT */
The NT team got the call correct; why they thought it was a platform-specific fix is beyond me. Of course, if the code had used prototypes instead of just parameterless declarations before now, then the Unix team would have had to fix the code too. The comment was a help - assuring me that the bug was genuine - but exasperating.

Standards Document

I am writing a coding standards document for a team of about 15 developers with a project load of between 10 and 15 projects a year. Amongst other sections (which I may post here as I get to them) I am writing a section on code formatting. So to start with, I think it is wise that, for whatever reason, we establish some basic, consistent code formatting/naming standards.
I've looked at roughly 10 projects written over the last 3 years from this team and I'm, obviously, finding a pretty wide range of styles. Contractors come in and out and at times, and sometimes even double the team size.
I am looking for a few suggestions for code formatting and naming standards that have really paid off ... but that can also really be justified. I think consistency and shared-patterns go a long way to making the code more maintainable ... but, are there other things I ought to consider when defining said standards?
How do you lineup parenthesis? Do you follow the same parenthesis guidelines when dealing with classes, methods, try catch blocks, switch statements, if else blocks, etc.
Do you line up fields on a column? Do you notate/prefix private variables with an underscore? Do you follow any naming conventions to make it easier to find particulars in a file? How do you order the members of your class?
What about suggestions for namespaces, packaging or source code folder/organization standards? I tend to start with something like:
<com|org|...>.<company>.<app>.<layer>.<function>.ClassName
I'm curious to see if there are other, more accepted, practices than what I am accustomed to -- before I venture off dictating these standards. Links to standards already published online would be great too -- even though I've done a bit of that already.
First find a automated code-formatter that works with your language. Reason: Whatever the document says, people will inevitably break the rules. It's much easier to run code through a formatter than to nit-pick in a code review.
If you're using a language with an existing standard (e.g. Java, C#), it's easiest to use it, or at least start with it as a first draft. Sun put a lot of thought into their formatting rules; you might as well take advantage of it.
In any case, remember that much research has shown that varying things like brace position and whitespace use has no measurable effect on productivity or understandability or prevalence of bugs. Just having any standard is the key.
Coming from the automotive industry, here's a few style standards used for concrete reasons:
Always used braces in control structures, and place them on separate lines. This eliminates problems with people adding code and including it or not including it mistakenly inside a control structure.
if(...)
{
}
All switches/selects have a default case. The default case logs an error if it's not a valid path.
For the same reason as above, any if...elseif... control structures MUST end with a default else that also logs an error if it's not a valid path. A single if statement does not require this.
In the occasional case where a loop or control structure is intentionally empty, a semicolon is always placed within to indicate that this is intentional.
while(stillwaiting())
{
;
}
Naming standards have very different styles for typedefs, defined constants, module global variables, etc. Variable names include type. You can look at the name and have a good idea of what module it pertains to, its scope, and type. This makes it easy to detect errors related to types, etc.
There are others, but these are the top off my head.
-Adam
I'm going to second Jason's suggestion.
I just completed a standards document for a team of 10-12 that work mostly in perl. The document says to use "perltidy-like indentation for complex data structures." We also provided everyone with example perltidy settings that would clean up their code to meet this standard. It was very clear and very much industry-standard for the language so we had great buyoff on it by the team.
When setting out to write this document, I asked around for some examples of great code in our repository and googled a bit to find other standards documents that smarter architects than I to construct a template. It was tough being concise and pragmatic without crossing into micro-manager territory but very much worth it; having any standard is indeed key.
Hope it works out!
It obviously varies depending on languages and technologies. By the look of your example name space I am going to guess java, in which case http://java.sun.com/docs/codeconv/ is a really good place to start. You might also want to look at something like maven's standard directory structure which will make all your projects look similar.

Resources