I know that #defines, etc. are normally never indented. Why?
I'm working in some code at the moment which has a horrible mixture of #defines, #ifdefs, #elses, #endifs, etc. All these often mixed in with normal C code. The non-indenting of the #defines makes them hard to read. And the mixture of indented code with non-indented #defines is a nightmare.
Why are #defines typically not indented? Is there a reason one wouldn't indent (e.g. like this code below)?
#ifdef SDCC
#if DEBUGGING == 1
#if defined (pic18f2480)
#define FLASH_MEMORY_END 0x3DC0
#elif defined (pic18f2580)
#define FLASH_MEMORY_END 0x7DC0
#else
#error "Can't set up flash memory end!"
#endif
#else
#if defined (pic18f2480)
#define FLASH_MEMORY_END 0x4000
#elif defined (pic18f2580)
#define FLASH_MEMORY_END 0x8000
#else
#error "Can't set up flash memory end!"
#endif
#endif
#else
#if DEBUGGING == 1
#define FLASH_MEMORY_END 0x7DC0
#else
#define FLASH_MEMORY_END 0x8000
#endif
#endif
Pre-ANSI C preprocessor did not allow for space between the start of a line and the "#" character; the leading "#" had to always be placed in the first column.
Pre-ANSI C compilers are non-existent these days. Use which ever style (space before "#" or space between "#" and the identifier) you prefer.
http://www.delorie.com/gnu/docs/gcc/cpp_48.html
As some have already said, some Pre-ANSI compilers required the # to be the first char on the line but they didn't require de preprocessor directive to be attached to it, so indentation was made this way.
#ifdef SDCC
# if DEBUGGING == 1
# if defined (pic18f2480)
# define FLASH_MEMORY_END 0x3DC0
# elif defined (pic18f2580)
# define FLASH_MEMORY_END 0x7DC0
# else
# error "Can't set up flash memory end!"
# endif
# else
# if defined (pic18f2480)
# define FLASH_MEMORY_END 0x4000
# elif defined (pic18f2580)
# define FLASH_MEMORY_END 0x8000
# else
# error "Can't set up flash memory end!"
# endif
# endif
#else
# if DEBUGGING == 1
# define FLASH_MEMORY_END 0x7DC0
# else
# define FLASH_MEMORY_END 0x8000
# endif
#endif
I've often seen this style in old Unix headers but I hate it as the syntax coloring often fails on such code. I use a very visible color for pre-processor directive so that they stand out (they are at a meta-level so should not be part of the normal flow of code).
You can even see that SO does not color the sequence in a useful manner.
Regarding the parsing of preprocessor directives, the C99 standard (and the C89 standard before it) were clear about the sequence of operations performed logically by the compiler. In particular, I believe it means that this code:
/* */ # /* */ include /* */ <stdio.h> /* */
is equivalent to:
#include <stdio.h>
For better or worse, GCC 3.4.4 with '-std=c89 -pedantic' accepts the comment-laden line, at any rate. I'm not advocating that as a style - not for a second (it is ghastly). I just think that it is possible.
ISO/IEC 9899:1999 section 5.1.1.2 Translation phases says:
[Character mapping, including trigraphs]
[Line splicing - removing backslash newline]
The source file is decomposed into preprocessing tokens and sequences of
white-space characters (including comments). A source file shall not end in a
partial preprocessing token or in a partial comment. Each comment is replaced by
one space character. New-line characters are retained. Whether each nonempty
sequence of white-space characters other than new-line is retained or replaced by
one space character is implementation-defined.
Preprocessing directives are executed, macro invocations are expanded, [...]
Section 6.10 Preprocessing directives says:
A preprocessing directive consists of a sequence of preprocessing tokens that begins with
a # preprocessing token that (at the start of translation phase 4) is either the first character
in the source file (optionally after white space containing no new-line characters) or that
follows white space containing at least one new-line character, and is ended by the next
new-line character.
The only possible dispute is the parenthetical expression '(at the start of translation phase 4)', which could mean that the comments before the hash must be absent since they are not otherwise replaced by spaces until the end of phase 4.
As others have noted, the pre-standard C preprocessors did not behave uniformly in a number of ways, and spaces before and in preprocessor directives was one of the areas where different compilers did different things, including not recognizing preprocessor directives with spaces ahead of them.
It is noteworthy that backslash-newline removal occurs before comments are analyzed.
Consequently, you should not end // comments with a backslash.
I don't know why it's not more common. There are certainly times when I like to indent preprocessor directives.
One thing that keeps getting in my way (and sometimes convinces me to stop trying) is that many or most editors/IDEs will throw the directive to column 1 at the slightest provocation. Which is annoying as hell.
These days I believe this is mainly a choice of style. I think at one point in the distant past, not all compilers supported the notion of indenting preprocessor defines. I did some research and was unable to back up that assertion. But in any case, it appears that all modern compilers support the idea of indenting pre-processor macro. I do not have a copy of the C or C++ standard though so I do not know if this is standard behavior or not.
As to whether or not it's good style. Personally, I like the idea of keeping them all to the left. It gives you a consistent place to look for them. Yeah it can get annoying when there are very nested macros. But if you indent them, you'll eventually end up with even weirder looking code.
#if COND1
void foo() {
#if COND2
int i;
#if COND3
i = someFunction()
cout << i << eol;
#endif
#endif
}
#endif
For the example you've given it may be appropriate to use indentation to make it clearer, seeing as you have such a complex structure of nested directives.
Personally I think it is useful to keep them not indented most of the time, because these directives operate separately from the rest of your code. Directives such as #ifdef are handled by the pre-processor, before the compiler ever sees your code, so a block of code after an #ifdef directive may not even be compiled.
Keeping directives visually separated from the rest of your code is more important when they are interspersed with code (rather than a dedicated block of directives, as in the example you give).
In almost all the currently available C/CPP compilers it is not restricted. It's up to the user to decide how you want to align code.
So happy coding.
I'm working in some code at the moment which has a horrible mixture of #defines, #ifdefs, #elses, #endifs, #etc. All these often mixed in with normal C code. The non-indenting of the #defines makes them hard to read. And the mixture of indented code with non-indented #defines is a nightmare.
A common solution is to comment the directives, so that you easily know what they refer to:
#ifdef FOO
/* a lot of code */
#endif /* FOO */
#ifndef FOO
/* a lot of code */
#endif /* not FOO */
I know this is old topic but I wasted couple of days searching for solution. I agree with initial post that intending makes code cleaner if you have lots of them (in my case I use directives to enable/disable verbose logging). Finally, I found solution here which works Visual Studio 2017
If you like to indent #pragma expressions, you can enable it under: Tools > Options > Text Editor > C/C++ > Formatting > Indentation > Position of preprocessor directives > Leave indented
The only problem left is that auto code layout fixed that formatting =(
Related
I am trying to emit a global SYMBOL based on a #define VALUE. My attempt is as follows:
__asm__ (".globl SYMBOL");
__asm__ (".set SYMBOL, %0" :: "i" (VALUE));
What is emitted by gcc to the assembler is the following:
.globl SYMBOL
.set SYMBOL, #VALUE
How can I get rid of the hash in the .set before VALUE. FWIW, my target is ARM.
armclang defines various template modifiers that can be used with inline assembly. gcc supports them, in every instance I've checked, although it doesn't document this.
In particular there is
c
Valid for an immediate operand. Prints it as a plain value without a preceding #. Use this template modifier when using the operand in .word, or another data-generating directive, which needs an integer without the #.
So you can do
__asm__ (".set SYMBOL, %c0" : : "i" (VALUE));
Try on godbolt
(There's a few open bugs on the gcc bugzilla suggesting that template / operand modifiers should be documented. The main one seems to be 30527, where I've just posted a comment. The developers' view seems to be that operand modifiers are "compiler internals" that are not meant for end users, but for arm/aarch64 in particular, there are simple things that you just can't do any other way. They made an exception for x86, so why not here?)
You can use stringizing.
#define VALUE 89
#define xstr(s) str(s)
#define str(s) #s
__asm__ (".globl SYMBOL");
__asm__ (".set SYMBOL, " str(VALUE));
The 'VALUE' must conform to something that gas will take as working with set. They could be fixed addresses from some vendor documentation or a listing output that is parsed. If you want 'VALUE' use str(s), if you want '89' then use xstr(s). You did not describe the actual use case.
I'm trying to build a code which defines a method howmany().
In OSX there is a function-like macro with the same name in /usr/include/sys/types.h
#define howmany(x, y) __DARWIN_howmany(x, y)
Following a previous question, I've tried including #undef in the header file as
#ifdef howmany
#undef howmany /*undefining '/usr/include/sys/types.h' */
#endif
But the function-like macro remains active.
/usr/include/sys/types.h:184:9: note: macro 'howmany' defined here #define howmany(x, y) __DARWIN_howmany(x, y) /* # y's == x bits? */
How should I deactivate this function-like macro?
This is incredibly fragile. Yes, you can probably get it to work, but very small changes to how things are imported will cause it to fail again (or worse, just call the wrong code). You need to make sure that #undef is included in every file that might include types.h and that it's undefined after it's defined. This is many ways it can break. Rename your function. The name is taken.
I'm trying to implement a fork-merge parser for C using Java. I need to fork the parser whenever I find an #if directive. For example:
int x = #if 3; #else 4; #endif
The above statement should be parsed as follows:
First I create a new parser for #if and read-in everything under #if statement, in the above case, reading value 3 directly wold throw a syntax error, in that case I should read back all the tokens that were already read. How do I do this?
Well, you could take any standard parser, and replicate its state on encountering an #if. Then one version goes down the fork of the #if, the other down the fork of the #else. (Doesn't have a #else? Pretend you saw #if cond lexemes #else #endif).
Life gets messy if the preprocessor conditionals don't occur in nice places:
void p()
{ if (x<2)
{ y =
#if cond
3; } else { x=
#else
7;
#endif
}
The parse can end up in really different states at the end of each conditional subpart. A consequence: you need to expand preprocessor conditionals so they include clean language structures.
A serious problem with this approach, is that if you fork every time you see #if, you end up with 2^N forks for N #ifs. It is easy to find C code with dozens of conditionals; 2^24 --> 16 million so this gets out of hand fast.
So you need a way to merge the parses back together again when you hit the #endif. That's not so easy; we have done this with a GLR parser but the answer is very complicated and won't fit easily here. This technical paper discusses how to do this merge for LR parsers: http://www.lirmm.fr/~ducour/Doc-objets/ECOOP2012/PLDI/pldi/p323.pdf
There's a second complication: imagine
#if cond1
stuff1
#else
#if cond2
stuff4
#else
stuff5
#endif
#endif
Now you need to fork parsers inside parsers. Worse, stuff4 has a condition, which is the conjunction of cond1 and cond2, but stuff5 has a condition of cond1 & ~ cond2. Realistically, you're going to need to compute and retain the condition under which each parse (and generated subtree) occurs. You'll need some kind of symbolic condition computation to do this, and you'll need to handle the case where the composed condition is entirely false specially (just skip the content). Interestingly, our GLR solution and the above technical paper both agree that using BDDs for this is a good idea.
If you want to do refactoring, you'll need to determine the meaning of names in the presence of conditionals:
#if cond1
float x;
#else
char* x;
#endif
...
x=
#if cond1
3.7
#else
"foobar"
#endif
;
This requires having a symbol table that carries conditional information with the symbols. See my technical paper http://www.rcost.unisannio.it/mdipenta/papers/scam2002.pdf for details on how to approach this.
To do refactoring, you're going to need control and data flow analysis on top of all this.
Check out my bio for a tool where we are attempting to do all this. We have the conditional parsing part done right, we think. The rest is still up in the air.
Usually when using include guards I write them like so:
#ifndef FILENAME_H
#define FILENAME_H
...
#endif // FILENAME_H
Now in some librarys I've seen something like:
#ifndef FILENAME_H
#define FILENAME_H 1
...
#endif // FILENAME_H
After some reserach I didn't find any reason as to why the include-gurad would be needed to be initialized.
Is there any reason for doing this?
Though I've never seen such a compiler, I've been told an "empty" define could be regarded as not defined.
I'm very interested in which compiler behaves like so.
Even C89 states:
3.8.1 Conditional inclusion
Constraints
The expression that controls conditional inclusion shall be an integral constant expression [...] of the form
defined identifier
defined ( identifier )
which evaluate to 1 if the identifier is currently defined as a macro name (that is, if it is predefined or if it has been the subject of a #define preprocessing directive without an intervening #undef directive with the same subject identifier), 0 if it is not.
I have been writing C++ for a while but have very little macro experience. I have read some of the other questions on this topic but I just can't quite translate them to my problem.
I want to define a macro such that coding ENUM_PRAGMA(foo) produces _Pragma("enum(foo)") which I intend to have the effect of #pragma enum(foo)
(The compiler supports _Pragma("string").)
I have tried multiple variations of
#define ENUM_PRAGMA(siz) \
_Pragma( "enum(" #siz ")" )
but can't get any of them to work.
Based on How do I implement a macro that creates a quoted string for _Pragma? I tried
#define HELPER1(x) enum( x )
#define HELPER2(y) HELPER1(#y)
#define ENUM_PRAGMA(siz) _Pragma(HELPER2(siz))
but I'm still not quite there. (Error is string literal was expected but enum was found so I guess my HELPER2 is not quoting the string.
Can anyone please humor me on this? Thanks much.
Okay, I got it.
I defined a general purpose macro STRINGIFY:
#define STRINGIFY(str) #str
Now the real macro comes down simply to
#define ENUM_PRAGMA(siz) _Pragma(STRINGIFY(enum(siz)))
Thanks for your patience.