scanf not working as expected in Frama-C - static-analysis

In the program below, function dec uses scanf to read an arbitrary input from the user.
dec is called from main and depending on the input it returns 1 or 0 and accordingly an operation will be performed. However, the value analysis indicates that y is always 0, even after the call to scanf. Why is that?

Note: the comments below apply to versions earlier than Frama-C 15 (Phosphorus, 20170501); in Frama-C 15, the Variadic plugin is enabled by default (and its short name is now -variadic).
Solution
Enable Variadic (-va) before running the value analysis (-val); this eliminates the warning and the program behaves as expected.
Detailed explanation
Strictly speaking, Frama-C itself (the kernel) only does the parsing; it's up to the plug-ins themselves (e.g. Value/EVA) to evaluate the program.
From your description, I believe you must be using Value/EVA to analyze a program. I do not know exactly which version you are using, so I'll describe the behavior with Frama-C Silicon.
One limitation of ACSL (the specification language used by Frama-C) is that it is not currently possible to specify contracts for variadic functions such as scanf. Therefore, the specifications shipped with the Frama-C standard library are insufficient. You can notice this in the following program:
#include <stdio.h>
int d;
int main() {
scanf("%d", &d);
Frama_C_show_each(d);
return 0;
}
Running frama-c -val file.c will output, among other things:
...
[value] using specification for function scanf
FRAMAC_SHARE/libc/stdio.h:150:[value] warning: no \from part for clause 'assigns *__fc_stdin;' of function scanf
[value] Done for function scanf
[value] Called Frama_C_show_each({0})
...
That warning means that the specification is incorrect, which explains the odd behavior.
The solution in this case is to use the Variadic plug-in (-va, or -va-help for more details), which will specialize variadic calls and add specifications to them, thus avoiding the warning and behaving as expected. Here's the resulting code (-print) after running the Variadic plug-in on the example above:
$ frama-c -va file.c -print
[... lots of definitions from stdio.h ...]
/*@ requires valid_read_string(format);
requires \valid(param0);
ensures \initialized(param0);
assigns \result, *__fc_stdin, *param0;
assigns \result
\from (indirect: *__fc_stdin), (indirect: *(format + (0 ..)));
assigns *__fc_stdin
\from (indirect: *__fc_stdin), (indirect: *(format + (0 ..)));
assigns *param0
\from (indirect: *__fc_stdin), (indirect: *(format + (0 ..)));
*/
int scanf_0(char const *format, int *param0);
int main(void)
{
int __retres;
scanf_0("%d",& d);
Frama_C_show_each(d);
__retres = 0;
return __retres;
}
In this example, scanf was specialized to scanf_0, with a proper ACSL annotation. Running EVA on this program will not emit any warnings and produce the expected output:
$ frama-c -va file.c -val
...
[value] Done for function scanf_0
[value] Called Frama_C_show_each([-2147483648..2147483647])
...
Note: the GUI in Frama-C 14 (Silicon) does not allow the Variadic plug-in to be enabled (even after ticking it in the Analyses panel), so you must use the command-line in this case to obtain the expected result and avoid the warning. Starting from Frama-C 15 (Phosphorus, to be released in 2017), this won't be necessary: Variadic will be enabled by default and so your example would work from the start.

Related

How many bits does an enum value need?

#include <stdint.h>
enum state : uint8_t {
NONE,
USA,
CAN,
MEX
};
struct X {
state st : 2; // compiles with uint8_t st : 2
};
Clang 3.9.0 compiles successfully.
GCC 4.8.4 and 5.3.0 complain with:
warning: ‘X::st’ is too small to hold all values of ‘enum state’
Who is right?
TL;DR
Both are correct.
The value of an enumeration is limited by the underlying type, not by the enumerators!
C++14, 7.2 Enumeration declarations, paragraph 8:
It is possible to define an enumeration that has values not defined by any of its enumerators.
Which means it is possible to:
state x = static_cast< state >(5);
That is what GCC is warning you about: enum state could have values that do not fit into 2 bits.
However, as long as you don't try to actually do that to X::st, everything is shiny.
That's (probably) why Clang is not warning you about it.
Since the standard does not demand a diagnostic either way, there is nothing wrong with warning, or with not warning.
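To make the point concrete, here is a minimal C++11 sketch. The helper fits_in_bits is hypothetical (it is not part of the question or of any library); it only checks whether a given value of state would be representable in an unsigned bit-field of a given width.

```cpp
#include <cstdint>
#include <type_traits>

enum state : std::uint8_t { NONE, USA, CAN, MEX };

// The underlying type, not the enumerator list, bounds the legal values.
static_assert(std::is_same<std::underlying_type<state>::type, std::uint8_t>::value,
              "state is stored as uint8_t");

// Hypothetical helper: true if the value of s is representable in an
// unsigned bit-field that is `bits` wide.
constexpr bool fits_in_bits(state s, unsigned bits) {
    return static_cast<unsigned>(s) < (1u << bits);
}

// Every named enumerator fits in 2 bits...
static_assert(fits_in_bits(MEX, 2), "MEX == 3 fits in 2 bits");
// ...but 5 is also a valid value of state, and it does not.
static_assert(!fits_in_bits(static_cast<state>(5), 2), "5 needs 3 bits");
```

This is the situation GCC warns about: the 2-bit field is wide enough for the named enumerators, but not for every value the type can hold.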

Can evaluation of functions happen during compile time?

Consider the below function,
public static int foo(int x){
return x + 5;
}
Now, let us call it,
int in = /*Input taken from the user*/;
int x = foo(10); // ... (1)
int y = foo(in); // ... (2)
Here, can the compiler change
int x = foo(10); // ... (1)
to
int x = 15; // ... (1)
by evaluating the function call during compile time, since the input to the function is available during compile time?
I understand this is not possible for the call marked (2), because the input is available only during run time.
I do not want to know a way of doing it in any specific language. I would like to know why this can or cannot be a feature of a compiler itself.
C++ does have a method for this:
Have a read up on the 'constexpr' keyword in C++11; it allows compile-time evaluation of functions.
There is a limitation: in C++11 the body of a constexpr function must consist of a single return statement (not multiple statements), although it can call other constexpr functions. C++14 relaxes this restriction.
static constexpr int foo(int x){
return x + 5;
}
EDIT:
Why a compiler might not evaluate a function (just my guess):
It might not be appropriate to remove a function by evaluating it without being told.
The function could be used in different compilation units, and with static/dynamic inputs: thus evaluating it in some circumstances and adding a call in other places.
This use would provide inconsistent execution times (especially on a deterministic platform like AVR) where timing may be important, or at least need to be predictable.
Also interrupts (and how the compiler interacts with them) may come into play here.
EDIT:
constexpr is actually stronger: when the call appears where a constant expression is required (an array bound, a template argument, a static_assert), the compiler must evaluate it at compile time. The compiler is free to fold away functions without constexpr, but the programmer can't rely on it doing so.
Can you give an example of a case where the user would have benefited from this but the compiler chose not to do it?
inline functions may, or may not resolve to constant expressions which could be optimized into the end result.
However, constexpr guarantees it whenever the result is used as a constant expression. A call to a merely inline function cannot be used as a compile-time constant, whereas constexpr lets you write compile-time functions and even objects.
A basic example where constexpr makes a guarantee that inline cannot.
constexpr int foo( int a, int b, int c ){
return a+b+c;
}
int array[ foo(1, 2, 3) ];
And the same as a simple object.
struct Foo{
constexpr Foo( int a, int b, int c ) : val(a+b+c){}
int val;
};
constexpr Foo foo( 1,2,4 );
int array[ foo.val ];
Unless foo.val is a compile time constant, the code above will not compile.
Even as just a function, an inline function has no guarantee. And the linker can also do inlining over multiple compilation units, after the syntax has been compiled (array bounds checked for integer constants).
This is kind of like meta-programming, but without the templates. Of course these examples do not do the topic justice, however very complex solutions would benefit from the ability to use objects and functional programming to achieve a result.
Yes, evaluation can happen during compile time. This comes under the heading of constant folding and function inlining, both of which are common optimizations for optimizing compilers.
Many languages do not have strong distinction between "compile time" and "run time", but the general rule is that the language defines an "execution model" which defines the behavior of any particular program with any particular input (or specifies that it is undefined). The compiler must produce an executable that can read any input and produce the corresponding output as defined by the execution model. What happens inside the executable doesn't matter -- as long as the externally viewed behavior is correct.
Here "input", "output" and "behavior" includes all possible interactions with the environment that are defined in the execution model, including timing effects.
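As a minimal sketch of the two cases from the question, assuming C++11 or later: the function foo is the one from the question rewritten as constexpr, and add_five_at_runtime is a hypothetical wrapper added only for illustration.

```cpp
// Same function as in the question, written as a C++11 constexpr function.
constexpr int foo(int x) {
    return x + 5;
}

// Case (1): the argument is a constant, and static_assert requires a
// constant expression, so the compiler has to evaluate foo(10) itself.
static_assert(foo(10) == 15, "evaluated at compile time");

// Case (2): the argument is only known at run time, so this is an ordinary
// call (which an optimizer may still choose to inline).
int add_five_at_runtime(int in) {
    return foo(in);
}
```

Outside a context that requires a constant expression, whether foo(10) is folded to 15 is up to the optimizer, as described above.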

strcmp() returns different values for the same string comparison [duplicate]

This question already has an answer here:
Why does strcmp() in a template function return a different value?
(1 answer)
Closed 2 years ago.
char s1[] = "0";
char s2[] = "9";
printf("%d\n", strcmp(s1, s2)); // Prints -9
printf("%d\n", strcmp("0", "9")); // Prints -1
Why does strcmp return different values when it receives the same arguments?
Both values are legal, since strcmp's man page says that the return value can be less than, equal to, or greater than 0, but I don't understand why they differ in this example.
I assume you are using GCC when compiling this; I tried it on 4.8.4. The trick here is that GCC understands the semantics of certain standard library functions (strcmp being one of them). In your case, the compiler will completely eliminate the second strcmp call, because it knows that the result of strcmp given the string constants "0" and "9" will be negative, and a standard-conforming value (-1) will be used instead of doing the call. It cannot do the same with the first call, because s1 and s2 might have been changed in memory (imagine an interrupt, or multiple threads, etc.).
You can do an experiment to validate this. Add the const qualifier to the arrays to let GCC know that they cannot be changed:
const char s1[] = "0";
const char s2[] = "9";
printf("%d\n", strcmp(s1, s2)); // Now this will print -1 as well
printf("%d\n", strcmp("0", "9")); // Prints -1
You can also look at the assembler output from the compiler (use the -S flag).
The best way to check, however, is to use -fno-builtin, which disables this optimization. With this option, your original code will print -9 in both cases.
The difference is due to the implementation of strcmp. As long as it conforms to the (<0, 0, >0) contract, it shouldn't matter to the developer. You cannot rely on anything else. For all you know, the implementation could be determining that the result should be negative and then randomly generating a negative number to throw you off.
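If code needs a stable value rather than just a sign, one option is to normalize the result yourself. A minimal sketch follows; compare_sign is a hypothetical helper, not a standard function.

```cpp
#include <cstring>

// Hypothetical helper: collapse strcmp's result to -1, 0 or +1, so callers
// depend only on the sign, which is all the standard guarantees.
int compare_sign(const char* a, const char* b) {
    const int r = std::strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

With this wrapper, both calls from the question yield the same value regardless of whether the compiler folds the call or the library computes a byte difference.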

Is it possible to inject values in the frama-c value analyzer?

I'm experimenting with the frama-c value analyzer to evaluate C code that is actually threaded.
I want to ignore any threading problems that might occur and just inspect the possible values for a single thread. So far this works by setting the entry point to where the thread starts.
Now to my problem: inside one thread I read values that are written by another thread. Because frama-c does not (and should not?) currently consider threading, it assumes my variable is in some broad range, but I know that the range is in fact much smaller.
Is it possible to tell the value analyzer the value range of this variable?
Example:
volatile int x = 0;
void f() {
while(x==0)
sleep(100);
...
}
Here frama-c detects that x is volatile and thus has range [--..--], but I know what the other thread will write into x, and I want to tell the analyzer that x can only be 0 or 1.
Is this possible with frama-c, especially in the gui?
Thanks in advance
Christian
This is currently not possible automatically. The value analysis considers that volatile variables always contain the full range of values included in their underlying type. There is, however, a proprietary plug-in that transforms accesses to volatile variables into calls to user-supplied functions. In your case, your code would be transformed into essentially this:
int x = 0;
void f() {
while(1) {
x = f_volatile_x();
if (x == 0)
sleep(100);
...
}
}
By specifying f_volatile_x correctly, you can ensure it returns values between 0 and 1 only.
If the variable 'x' is not modified in the thread you are studying, you could also initialize it at the beginning of the 'main' function with:
x = Frama_C_interval (0, 1);
This is a function defined by Frama-C in ...../share/frama-c/builtin.c so you have to add this file to your inputs when you use it.

Retrieving the ZF in GCC inline assembly

I need to use some x86 instructions that have no GCC intrinsics, such as BSF and BSR.
With GCC inline assembly, I can write something like the following
__INTRIN_INLINE unsigned char bsf64(unsigned long* const index, const uint64_t mask)
{
__asm__("bsf %[mask], %[index]" : [index] "=r" (*index) : [mask] "mr" (mask));
return mask ? 1 : 0;
}
Code like if (bsf64(x, y)) { /* use x */ } is translated by GCC to something like
0x000000010001bf04 <bsf64+0>: bsf %rax,%rdx
0x000000010001bf08 <bsf64+4>: test %rax,%rax
0x000000010001bf0b <bsf64+7>: jne 0x10001bf44 <...>
However if mask is zero, BSF already sets the ZF flag, so the test after bsf is redundant.
Instead of returning mask ? 1 : 0, is it possible to retrieve the ZF flag and return it, so that GCC does not generate the test?
EDIT: made the if example more clear
EDIT: In response to Damon, __builtin_ffsl generates even less optimal code. If I use the following code
int b = __builtin_ffsl(mask);
if (b) {
*index = b - 1;
return true;
} else {
return false;
}
GCC generates this assembly
0x000000000044736d <+1101>: bsf %r14,%r14
0x0000000000447371 <+1105>: cmove %r12,%r14
0x0000000000447375 <+1109>: add $0x1,%r14d
0x0000000000447379 <+1113>: je 0x4471c0 <...>
0x000000000044737f <+1119>: lea -0x1(%r14),%ecx
So the test is gone, but redundant conditional move, increment and decrement are generated.
A couple of remarks:
This is an "anti-optimization". You're trying to do a micro-optimization on something that the compiler already supports.
Your code does not generate the bsf instruction at all with my version of gcc with all optimization switches turned on. Looking at the code, that is not surprising, because you return mask, which is the source operand, not the destination operand (gcc uses AT&T syntax!). The compiler is intelligent enough to figure this out and drops the assembler code (which doesn't do anything) altogether.
There is an intrinsic function __builtin_ffsl which does exactly the same as your inline assembly (though, correctly). An intrinsic is no less portable than inline assembler, but easier for the compiler to optimize.
Using the intrinsic function results in a bsf cmov sequence on my compiler (assuming the calling code forces it to actually emit the instruction), which shows that the compiler uses the zero-flag just fine without an additional test instruction.
Returning a char when you want a bool is not the best possible hint for the compiler, though it will probably figure it out anyway most of the time. However, telling the compiler to use a bitscan instruction when you are really only interested in "zero or not zero" is certainly sub-optimal. if(x) and if(!x) work perfectly well for that matter. It would be different if you returned the result as reference, so you could reuse it in another place, but as it is, your code is only a very complicated way of writing if(x).
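For comparison, here is a minimal sketch of the same interface written with a builtin instead of inline assembly. It assumes GCC or Clang (which provide __builtin_ctzll) and reuses the name bsf64 from the question purely for illustration. Whether the generated code avoids the extra test depends on the compiler version and flags, so inspect the -S output rather than assuming.

```cpp
#include <cstdint>

// Sketch of the question's bsf64 using a builtin: returns false when mask
// is zero, otherwise stores the index of the lowest set bit in *index.
// __builtin_ctzll has an undefined result for 0, hence the check first.
inline bool bsf64(unsigned long* index, std::uint64_t mask) {
    if (mask == 0) {
        return false;
    }
    *index = static_cast<unsigned long>(__builtin_ctzll(mask));
    return true;
}
```

Returning bool and leaving *index untouched for a zero mask keeps the contract explicit, which gives the optimizer the same information the hand-written version tried to convey.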
