"warning : missing terminating " character" in false condition - gcc

I'm compiling my game engine code on VS2015 and Xcode using gcc.
I used many compilers for my code and only gcc shows the warning:
warning : missing terminating " character
The code is like following:
#if 1
...
#else
asm __volatile__("
some assembly code
...
"::: );
#endif
I know recent gcc does not accept newlines in a string literal, but I don't know why the gcc preprocessor emits the warning for a false block.
I know my style of writing inline assembly is old, but these blocks are inside a false conditional block. I don't want to touch them because there are so many.
How can I avoid this warning in a false conditional block without suppressing all warnings?
Edit:
I compiled my code with Armcc, Vc++(2005,2008,2012,2013,2015) and clang. They don't show this kind of warning, only GCC does.
If the warnings were for code in the TRUE conditional block, I would fix them. But these warnings are for FALSE conditional blocks, which should not be evaluated.

The C preprocessor is string-literal aware. Per the language grammar, a string literal is one of the preprocessing tokens:
preprocessing-token:
header-name
identifier
pp-number
character-constant
string-literal
punctuator
each non-white-space character that cannot be one of the above
This is, of course, perfectly expected, since otherwise the preprocessor wouldn't be able to tell MACRO_NAME from "MACRO_NAME" (the latter being just a string).
For this reason, your string literals are supposed to be formatted properly for preprocessing purposes. I presume that GCC sees the first " and begins interpreting the sequence as a string literal, but then abandons the attempt (due to the missing terminating ") and files it under the generic "cannot be one of the above" category, accompanying it with a warning.
You seem to be assuming that conditional inclusion works at a higher preprocessing level than the recognition of tokens such as string literals. This assumption is incorrect. The preprocessor has no choice but to tokenize the "disabled" stretch of code as well.
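As a rough sketch of what "formatted properly" can look like, the skipped group below puts a complete string literal on each line, so every " is terminated before the line ends. The function names and the nop instructions are placeholders chosen for illustration, not taken from the original code, and the sketch assumes that editing the old blocks is acceptable after all.

```c
#if 1
/* Active group: the only code the compiler proper ever sees. */
static const char *selected_branch(void) { return "active"; }
#else
/* Skipped group: never compiled, but still split into preprocessing tokens.
   Each line carries a complete string literal; adjacent literals would be
   concatenated into one template if this group were ever enabled. */
static void legacy_asm(void)
{
    __asm__ __volatile__(
        "nop\n\t"   /* placeholder instruction (assumption) */
        "nop\n\t"
        ::: );
}
static const char *selected_branch(void) { return "skipped"; }
#endif
```

Because each quoted piece is closed on its own line, the tokenizer has no unterminated " to complain about, whichever branch is selected.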

Related

Suppressing format and extra args warnings for custom printf implementation

I've written a custom printf implementation for an embedded environment. In this effort I have also added some additional specifiers for printing unique types and timestamps, among other things:
printf("[%T] time\n");
Perhaps this one is unusual in that it does not take any arguments, but rather has special handling in the parser. This could easily be remedied by using a macro that always passes dummy data as needed. Or, if it causes too much trouble, I could change it so that it no longer looks like the other specifiers that do take arguments.
All other custom types I have implemented take arguments as usual. The only trouble I'm having is with the compiler's warnings:
test.c:130:32: warning: unknown conversion type character ‘v’ in format [-Wformat=]
printf("%v\n", type);
^
test.c:130:16: error: too many arguments for format [-Werror=format-extra-args]
printf("%v\n", type);
^~~~~~~~~~~~~~~~~~~~
These warnings can be suppressed by adding -Wno-format-extra-args -Wno-format compiler arguments, which I do (for now). But this can mask genuine errors like passing integers where pointers are expected, or legitimately not providing enough arguments for a given specifier list.
Is it possible to add new semantic checks to printf-style functions?
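The question above is left open here. One workaround that keeps the compiler's checks, sketched with hypothetical names (version_t, version_str, log_fmt), is to convert the custom type to text first and print it with a standard %s, while marking the wrapper with the GCC/Clang format attribute so its arguments are still checked. This sidesteps custom specifiers rather than teaching the compiler about them.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical custom type standing in for whatever %v would print. */
typedef struct { int major, minor; } version_t;

/* Render the custom type into buf so callers can pass it to a plain %s. */
static const char *version_str(char *buf, size_t size, version_t v)
{
    snprintf(buf, size, "%d.%d", v.major, v.minor);
    return buf;
}

/* The format attribute (GCC/Clang extension) asks the compiler to check
   the variadic arguments against fmt, as it does for printf itself.
   Argument 3 is the format string; checking starts at argument 4. */
__attribute__((format(printf, 3, 4)))
static int log_fmt(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int written = vsnprintf(out, size, fmt, ap);
    va_end(ap);
    return written;
}
```

A call such as log_fmt(out, sizeof out, "v=%s n=%d", version_str(tmp, sizeof tmp, v), 7) uses only standard specifiers, so -Wformat can stay enabled.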

Making a customized error message in yacc tool

Hi I just started working on lex and yacc tools.
I realized that yyerror receives only the string "syntax error" from yacc.
I was wondering if I can customize this string.
Oh, and can I also differentiate between types of errors? (I'm trying to report a missing token and an additional token as different errors.)
If so, how should I..?
Many thanks.
You're free to print any message you want to in yyerror (or even no message at all), so you can customise messages as you see fit. A common customisation is to add the line number (and possibly column number) of the token which triggered the error. You can certainly change the text if you want to, but if you just want to change it to a different language, you should probably use the gettext mechanism. You'll find .po files in the runtime-po subdirectory of the source distribution. If this facility is enabled, bison will arrange for the string to be translated before it is passed to yyerror, but of course you could do the translation yourself in yyerror if that is more convenient for you.
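A minimal sketch of the line-number customisation mentioned above. The helper format_parse_error is a made-up name, and yylineno is declared here only as a stand-in; with flex, %option yylineno would normally supply that variable.

```c
#include <stdio.h>

/* Stand-in for the scanner's line counter (assumption: a real project
   would get this from flex via %option yylineno). */
int yylineno = 1;

/* Build "line N: message" into buf; returns what snprintf returns. */
int format_parse_error(char *buf, size_t size, int line, const char *msg)
{
    return snprintf(buf, size, "line %d: %s", line, msg);
}

/* The parser calls yyerror with its message, "syntax error" by default. */
void yyerror(const char *msg)
{
    char buf[256];
    format_parse_error(buf, sizeof buf, yylineno, msg);
    fprintf(stderr, "%s\n", buf);
}
```

Keeping the formatting in a separate helper makes it easy to change the wording, or to add a column number, without touching the function the parser calls.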
I suspect that what you actually want is for bison to produce a more informative error message. Bison only has one alternative error message format, which includes a list of "expected" tokens. You can ask Bison to produce such an error message by including
%define parse.error verbose
in your prologue. As the manual indicates, the bison parsing algorithm can sometimes produce an incorrect list of expected tokens (since it was not designed for this particular purpose); you can get a more precise list by enabling lookahead correction by also including
%define parse.lac full
This does have a minor performance penalty. See the linked manual section for details.
The list of tokens produced by this feature uses the name of the token as supplied in the bison file. These names are usually not very user-friendly, so you might find yourself generating error messages such as the infamous PHP error
syntax error, unexpected T_CONSTANT_ENCAPSED_STRING
(Note: more recent PHP versions produce a different but equally mysterious message.)
To avoid this, define double-quoted aliases for your tokens. This can also make your grammar a lot more readable:
%token <string> TOK_ID "identifier"
%token TOK_IF "if" TOK_ELSE "else" TOK_WHILE "while"
%token TOK_LSH "<<"
/* Etc. */
%%
stmt: expr ';'
| while
| if
| /* ... */
while: "while" '(' expr ')' stmt
expr: "identifier"
| expr "<<" expr
/* ... */
The quoted names will not be passed through gettext. That's appropriate for names which are keywords, but it might be desirable to translate descriptive token aliases. A procedure to do so is outlined in this answer.

why does the VFP code ENDIF-*9 work

I have inherited VFP code that has an IF Endif statement where the endif is coded ENDIF-*9 For whatever reason this gets past the compiler, and generates no run time errors. Anyone know why this works?
It is not specific to ENDIF. It would also work for endfor, enddo, endscan ... I think VFP only cares about seeing the word "endif" and discards rest as comment.
From the Help for DO WHILE ... ENDDO:
Comments can be placed after DO WHILE and ENDDO on the same line. The comments are ignored during program compilation and execution.
I've known that and always assumed that was the case. In your example, though, you're not leaving any space between the end of the keyword and the start of the 'extra' text. My guess would be that the lexer in VFP recognizes - (and I tested + as well) as a terminator for the ENDIF (or ENDDO, etc) and treats the rest of the line as a comment. If you just have extra stuff immediately after ENDIF (like ENDIFblah), VFP doesn't recognize the keyword and treats it as junk, resulting in a syntax error.

What is the difference between "hello".length and "hello" .length?

I am surprised when I run the following examples in ruby console. They both produce the same output.
"hello".length
and
"hello" .length
How does the ruby console remove the space and provide the right output?
You can put spaces wherever you want, the interpreter looks for the end of the line. For example:
Valid
"hello".
length
Invalid
"hello"
.length
The interpreter sees the dot at the end of the line and knows something has to follow it up. While in the second case it thinks the line is finished. The same goes for the amount of spaces in one line. Does it matter how the interpreter removes the spaces? What matters is that you know the behavior.
If you want you can even
"hello" . length
and it will still work.
I know this is not an answer to you question, but does the "how" matter?
EDIT: I was corrected in the comments below. The multi-line examples given above are both valid when run in a script instead of IRB. I mixed them up with operators, for which the following applies even when running a script:
Valid
result = true || false
Valid
result = true ||
false
Invalid
result = true
|| false
This doesn't have as much to do with the console as it has to do with how the language itself is parsed by the compiler.
Most languages are parsed in such a way that items to be parsed are first grouped into TOKENS. Then the compiler is defined to expect a certain SEQUENCE of tokens in order to interpret each programming statement.
Because the compiler is only looking for a TOKEN SEQUENCE, it doesn't matter if there is space in between or not.
In this case the compiler is looking for:
STRING DOT METHOD_NAME
So it won't matter if you write "hello".length, or even "hello" . length. The same sequence of tokens is present in both, and that is all that matters to the compiler.
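To make the idea concrete, here is a toy tokenizer written in C. It is only an illustration of the principle, not Ruby's actual lexer (which is more whitespace-sensitive than this, as the newline examples earlier show), and the names tok and tokenize are invented for the sketch.

```c
#include <ctype.h>
#include <stddef.h>

enum tok { TOK_STRING, TOK_DOT, TOK_IDENT, TOK_OTHER };

/* Split src into token kinds, skipping spaces and tabs between tokens.
   Writes at most max kinds into out and returns how many were written. */
size_t tokenize(const char *src, enum tok *out, size_t max)
{
    size_t n = 0;
    while (*src && n < max) {
        if (*src == ' ' || *src == '\t') {
            src++;                              /* whitespace yields no token */
        } else if (*src == '"') {
            src++;
            while (*src && *src != '"') src++;  /* scan to the closing quote */
            if (*src == '"') src++;
            out[n++] = TOK_STRING;
        } else if (*src == '.') {
            src++;
            out[n++] = TOK_DOT;
        } else if (isalpha((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)*src) || *src == '_') src++;
            out[n++] = TOK_IDENT;
        } else {
            src++;
            out[n++] = TOK_OTHER;
        }
    }
    return n;
}
```

Feeding it "hello".length and "hello" . length gives the same three kinds in the same order (string, dot, identifier), which is all a parser working on token sequences gets to see.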
If you are curious how these token sequences are defined in the Ruby source code, you can look at parse.y starting around line 1042:
https://github.com/ruby/ruby/blob/trunk/parse.y#L1042
This file is written in the YACC language, which is used to define parsers.
Even without knowing anything about YACC, you should already be able to get some clues on how it works by just looking around the file a bit.

One more difference between gcc's and MS preprocessor

One more difference between gcc preprocessor and that of MS VS cl. Consider the following snippet:
# define A(x) L ## x
# define B A("b")
# define C(x) x
C(A("a" B))
For 'gcc -E' we get the following:
L"a" A("b")
For 'cl /E' the output is different:
L"a" L"b"
The MS preprocessor somehow performs an additional macro expansion. Its algorithm is obviously different from gcc's, but it also seems to be a secret. Does anyone know how the observed difference can be explained, and what preprocessing scheme MS cl follows?
GCC is correct. The standard specifies:
C99 6.10.3.4/2 (and also C++98/11 16.3.4/2): If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced.
So, when expanding A("a" B), we first replace B to give A("a" A("b")).
A("b") is not replaced, according to the quoted rule, so the final result is L"a" A("b").
Mike's answer is correct, but he actually elides the critical part of the standard that shows why this is so:
6.10.3.4/2 If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file’s preprocessing tokens), it is not replaced. Furthermore, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These nonreplaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.
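Note the last sentence of the quote (the one beginning "These nonreplaced macro name preprocessing tokens"), which is the part I want to emphasize.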
So both gcc and MSVC expand the macro A("a" B) to L"a" A("b"), but the interesting case (where MSVC screws up) is when the macro is wrapped by the C macro.
When expanding the C macro, its argument is first examined for macros to expand, and A is expanded. The result is then substituted into the body of C, and that body is scanned AGAIN for macros to replace. Now you might think that since this is the expansion of C, only the name C will be skipped, but this last clause means that the tokens from the expansion of A will also skip re-expansion of A.
There are basically two ways in which one might expect the remaining occurrence of the A macro to be replaced:
The first would be the processing of macro arguments before they are inserted in place of the corresponding parameter in the macro's replacement list. Usually each argument is completely macro-replaced as if it formed the rest of the input file, as described in section 6.10.3.1 of the standard. However, this is not done if the parameter (here: x) occurs next to the ##; in this case the parameter is simply replaced with the argument according to 6.10.3.3, without any recursive macro replacement.
The second way would be the "rescanning and further replacement" of section 6.10.3.4, but this is not done recursively for a macro that has already been replaced once.
So neither applies in this case, which means that gcc is correct in leaving that occurrence of A unreplaced.
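For anyone who wants to check this without reading -E output, the expansion can be captured as a string at compile time. The helpers STR, XSTR and expansion_text are additions for this sketch and not part of the original snippet; the two-level stringification forces the argument to be fully macro-expanded before # is applied.

```c
#define A(x) L ## x
#define B A("b")
#define C(x) x

/* Hypothetical helpers: XSTR expands its argument, then STR quotes it. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Returns the text that C(A("a" B)) expands to on this preprocessor. */
const char *expansion_text(void)
{
    return XSTR(C(A("a" B)));
}
```

With a preprocessor that follows the quoted rules, the returned text should still contain an unexpanded A("b"); a preprocessor that performs the extra expansion would produce L"b" in that position instead.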

Resources