I want to explore Frama-C to apply Assertion-based Slicing (using ACSL notation).
I have found that there are several different versions of Frama-C with some different features.
My question is which version is best suited to develop a slicing plugin for Frama-C and to manipulate the AST created by Frama-C.
There already is a slicing plug-in in Frama-C (in all versions).
This plug-in uses the results of the value analysis plug-in, which assumes the properties written inside ACSL assertions (after having attempted to verify them).
So, depending on what you call “assertion-based slicing” (and be aware that the article that comes up first on Google is behind a paywall), what you propose to do may already exist as a Frama-C plug-in (and one that works pretty well as of the last two or three Frama-C versions).
To answer your question anyway, the best version to use is the latest one, which is Fluorine 20130601 as of this writing.
Example of existing slicing features in Frama-C:
$ cat t.c
int f(unsigned int x)
{
int y;
/*@ assert x == 0 ; */
if (x)
y = 9;
else
y = 10;
return y;
}
$ frama-c -sparecode t.c -main f
...
t.c:4:[value] Assertion got status unknown.
...
/* Generated by Frama-C */
int f(unsigned int x)
{
int y;
/*@ assert x ≡ 0; */
;
y = 10;
return (y);
}
Is the above what you have in mind when you speak of “assertion-based slicing”?
Note: Frama-C's option -sparecode is a slicing option for the criterion “preserve all results of the program”. It still removes any statement that is without consequences, such as y=3; in y=3; y=4;, and, being based on Frama-C's value analysis, it also removes anything that the value analysis determines to be unreachable or without consequences.
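For instance, here is a minimal input sketching the dead-store case above (my own example, not from the original answer; the exact pretty-printed output may vary between versions):
int g(void)
{
    int y;
    y = 3;   /* without consequences: expected to be removed by -sparecode */
    y = 4;
    return y;
}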
Another example to illustrate:
$ cat t.c
int f(unsigned int x)
{
int y;
int a, b;
int *p[2] = {&a, &b};
/*@ assert x == 0 ; */
a = 100;
b = 200;
if (x)
y = 9;
else
y = 10;
return y + *(p[x]);
}
$ frama-c -sparecode t.c -main f
...
t.c:6:[value] Assertion got status unknown.
...
/* Generated by Frama-C */
int f(unsigned int x)
{
int __retres;
int y;
int a;
int *p[2];
p[0] = & a;
/*@ assert x ≡ 0; */
;
a = 100;
y = 10;
__retres = y + *(p[x]);
return (__retres);
}
Analysing the code below in the GUI, it is possible to check the input values of the function div0.
int div0(int x, int y)
{
return (x/y);
}
int main()
{
int res;
int a = 4;
int b = 2;
res = div0(a,b);
return 0;
}
Is it possible to get these values through the command line?
The simplest approach in your case is to insert calls to Frama_C_show_each, which is a special Frama-C builtin function that prints the internal Eva state for the given expressions, each time the interpreter passes through the program point. For instance:
int div0(int x, int y)
{
Frama_C_show_each_div0(x, y);
return (x/y);
}
Running frama-c -eva on the modified program will print:
[eva] file.c:3: Frama_C_show_each_div0: {4}, {2}
You can choose the suffix after Frama_C_show_each for each line you want. For instance, if you prefer to print each variable separately:
int div0(int x, int y)
{
Frama_C_show_each_x(x);
Frama_C_show_each_y(y);
return (x/y);
}
This will print instead:
[eva] file.c:3: Frama_C_show_each_x: {4}
[eva] file.c:4: Frama_C_show_each_y: {2}
For a more complex situation, or to avoid modifying the source code, other alternatives are possible, but they may require writing some OCaml code, either to modify Eva directly, or to add e.g. a new abstract domain which will print the expressions. But it's overkill for simple cases.
By the way, if you want your code to still compile normally, simply protect the call to Frama_C_show_each with #ifdef __FRAMAC__ guards:
int div0(int x, int y)
{
#ifdef __FRAMAC__
Frama_C_show_each_div0(x, y);
#endif
return (x/y);
}
Consider the following code:
void foo(float* __restrict__ a)
{
int i; float val;
for (i = 0; i < 100; i++) {
val = 2 * i;
a[i] = val;
}
}
void bar(float* __restrict__ a)
{
int i; float val = 0.0;
for (i = 0; i < 100; i++) {
a[i] = val;
val += 2.0;
}
}
They're based on Examples 7.26a and 7.26b in Agner Fog's Optimizing software in C++ and should do the same thing; bar is more "efficient" as written in the sense that we don't do an integer-to-float conversion at every iteration, but rather a float addition which is cheaper (on x86_64).
Here are the clang and gcc results on these two functions (with vectorization and unrolling disabled).
Question: It seems to me that the optimization of replacing a multiplication by the loop index with an addition of a constant value - when this is beneficial - should be carried out by compilers, even if (or perhaps especially if) there's a type conversion involved. Why is this not happening for these two functions?
Note that if we use ints rather than floats:
void foo(int* __restrict__ a)
{
int i; int val = 0;
for (i = 0; i < 100; i++) {
val = 2 * i;
a[i] = val;
}
}
void bar(int* __restrict__ a)
{
int i; int val = 0;
for (i = 0; i < 100; i++) {
a[i] = val;
val += 2;
}
}
Both clang and gcc perform the expected optimization, albeit not quite in the same way (see this question).
You are looking for enabling induction variable optimization for floating point numbers. This optimization is generally unsafe in floating point land as it changes program semantics. In your example it'll work because both initial value (0.0) and step (2.0) can be precisely represented in IEEE format but this is a rare case in practice.
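To make the semantic difference concrete, here is a small illustration of my own (not from the original answer): with a step of 0.1f, which is not exactly representable in binary floating point, the “recompute from the index” form and the “accumulate” form drift apart, which is why the compiler may not substitute one for the other without -ffast-math.
#include <math.h>
#include <stdio.h>
int main(void) {
    float acc = 0.0f;              /* bar-style: running accumulation      */
    float max_diff = 0.0f;
    for (int i = 0; i < 1000000; i++) {
        float mul = 0.1f * i;      /* foo-style: recomputed from the index */
        float d = fabsf(mul - acc);
        if (d > max_diff) max_diff = d;
        acc += 0.1f;               /* rounding errors accumulate over time */
    }
    printf("maximum divergence: %f\n", max_diff);  /* clearly nonzero */
    return 0;
}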
It could be enabled under -ffast-math, but it seems this wasn't considered an important case in GCC, which rejects non-integral induction variables early on (see tree-scalar-evolution.c).
If you believe that this is an important use case, you might consider filing a request in GCC Bugzilla.
I am trying to slice code using Frama-C.
The source code is
static uint8_T ALARM_checkOverInfusionFlowRate(void)
{
uint8_T ov;
ov = 0U;
if (ALARM_Functional_B.In_Therapy) {
if (ALARM_Functional_B.Flow_Rate > ALARM_Functional_B.Flow_Rate_High) {
ov = 1U;
} else if (ALARM_Functional_B.Flow_Rate >
ALARM_Functional_B.Commanded_Flow_Rate * div_s32
(ALARM_Functional_B.Tolerance_Max, 100) +
ALARM_Functional_B.Commanded_Flow_Rate) {
ov = 1U;
} else {
if (ALARM_Functional_B.Flow_Rate > ALARM_Functional_B.Commanded_Flow_Rate * div_s32(ALARM_Functional_B.Tolerance_Min, 100) + ALARM_Functional_B.Commanded_Flow_Rate) {
ov = 2U;
}
}
}
return ov;
}
When I sliced the code using Frama-C, I got the following. I don't know what this “undefined sequence” means.
static uint8_T ALARM_checkOverInfusionFlowRate(void)
{
uint8_T ov;
ov = 0U;
if (ALARM_Functional_B.In_Therapy)
if ((int)ALARM_Functional_B.Flow_Rate > (int)ALARM_Functional_B.Flow_Rate_High)
ov = 1U;
else {
int32_T tmp_0;
{
/*undefined sequence*/
tmp_0 = div_s32((int)ALARM_Functional_B.Tolerance_Max,100);
}
if ((int)ALARM_Functional_B.Flow_Rate > (int)ALARM_Functional_B.Commanded_Flow_Rate * tmp_0 + (int)ALARM_Functional_B.Commanded_Flow_Rate)
ov = 1U;
else {
int32_T tmp;
{
/*undefined sequence*/
tmp = div_s32((int)ALARM_Functional_B.Tolerance_Min,100);
}
if ((int)ALARM_Functional_B.Flow_Rate > (int)ALARM_Functional_B.Commanded_Flow_Rate * tmp + (int)ALARM_Functional_B.Commanded_Flow_Rate)
ov = 2U;
}
}
return ov;
}
Appreciate any help in explaining why this happens.
/* undefined sequence */ in a block simply means that the block has been generated during code normalization at parsing time, but that, with respect to C semantics, there is no sequence point between the statements composing it. For instance, x++ + x++ will be normalized as
{
/*undefined sequence*/
tmp = x;
x ++;
tmp_0 = x;
x ++;
;
}
Internally, each statement in such a sequence is decorated with lists of locations that are accessed for writing or reading (use -kernel-debug 1 with -print to see them in the output). Option -unspecified-access, used together with -val, will check that such accesses are correct, i.e., that there is at most one statement inside the sequence that writes to a given location and, if this is the case, that there is no read access to it (except for building the value it is assigned to). Note, however, that this option does not take into account side effects occurring in a function call inside the sequence. There is a special plug-in for that, but it has not been released yet.
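As a hedged illustration of that rule (my own example, not Frama-C output):
int x, y, z;
void ok(void)  { y = x++ + z; }   /* x is written once and only read to build
                                     its own new value: accepted              */
void bad(void) { y = x++ + x; }   /* x is written and also read by another
                                     statement of the sequence: expected to be
                                     reported by -unspecified-access (this is
                                     undefined behavior in C99)               */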
Finally, note that since Frama-C Neon the comment reads only /*sequence*/, which seems less daunting to the user. Indeed, the original code may be correct or may exhibit undefined behavior, but syntactic analysis is too weak to decide in the general case. For instance, (*p)++ + (*q)++ is correct as long as p and q do not overlap. This is why the normalization phase only points out the sequences and leaves it up to more powerful analysis plug-ins to check whether there might be an issue.
I'd like to know if it's possible to do some kind of forward conditioned slicing with Frama-C and I'm playing with some examples to understand how one could achieve this.
I've got this simple example which seems to result in an imprecise slice, and I can't understand why. Here is the function I'd like to slice:
int f(int a){
int x;
if(a == 0)
x = 0;
else if(a != 0)
x = 1;
return x;
}
If I use this specification:
/*@ requires a == 0;
  @ ensures \old(a) == a;
  @ ensures \result == 0;
*/
then Frama-C returns the following slice (which is precise), using the -slice-return f criterion and f as the entry point:
/*@ ensures \result ≡ 0; */
int f(void){
int x;
x = 0;
return x;
}
But when using this specification:
/*@ requires a != 0;
  @ ensures \old(a) == a;
  @ ensures \result == 1;
*/
then all instructions (and annotations) remain, whereas I was expecting this slice to be returned:
/*@ ensures \result ≡ 1; */
int f(void){
int x;
x = 1;
return x;
}
Is the slice imprecise in the latter case? If so, what could be the cause?
Regards,
Romain
Edit: I wrote "else if(a != 0) ...", but the problem remains with "else ...".
In Frama-C, the slicing plug-in relies on the result of a preliminary static analysis plug-in called the value analysis.
This value analysis can represent the values of variable a when a == 0 (the set of values is in this case { 0 }), but it has a hard time representing the values of a when it is only known that a != 0. In the latter case, if a is not already known to be positive or negative, the value analysis plug-in needs to approximate the set of values for a. If a were known to be positive, for instance if it were an unsigned int, then the nonzero values could be represented as an interval, but the value analysis plug-in cannot represent “all values of type int except 0”.
If you are willing to change the pre-condition, you can write it in a form that is more easily understood by the value analysis plug-in (together with value analysis option -slevel):
$ cat t.c
/*@ requires a < 0 || a > 0 ;
  @ ensures \old(a) == a;
  @ ensures \result == 0;
*/
int f(int a){
int x;
if(a == 0)
x = 0;
else if(a != 0)
x = 1;
return x;
}
$ frama-c -slevel 10 t.c -main f -slice-return f -then-on 'Slicing export' -print
…
/* Generated by Frama-C */
/*@ ensures \result ≡ 0; */
int f(void)
{
int x;
x = 1;
return x;
}
This has no relevance whatsoever to your main question, but your ensures \old(a) == a clause is not doing what you expect. If you pretty-print your source code with option -print, you will see that it has been silently transformed into ensures \old(a) == \old(a).
The ACSL language does not permit referring to the value of formal variables in the post-state, mostly because this is meaningless from the point of view of the caller. (The stack frame of the callee is popped after the call terminates.)
Is there a way to make modulo by 511 (and 127) faster than using the "%" operator?
int c = 758 % 511;
int d = 423 % 127;
Here is a way to do a fast modulo by 511, assuming that x is at most 32767. It's about twice as fast as x%511. It does the modulo in five operations: two multiplications, an addition, a subtraction, and a shift.
inline int fast_mod_511(int x) {
int y = (513*x+64)>>18;
return x - 511*y;
}
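A quick exhaustive check of the claimed input range (my own throwaway test, not part of the original answer):
#include <stdio.h>
static inline int fast_mod_511(int x) {
    int y = (513*x+64)>>18;
    return x - 511*y;
}
/* Compare against the % operator over the whole claimed range. */
int main(void) {
    for (int x = 0; x <= 32767; x++) {
        if (fast_mod_511(x) != x % 511) {
            printf("mismatch at x = %d\n", x);
            return 1;
        }
    }
    printf("fast_mod_511 agrees with %% for 0..32767\n");
    return 0;
}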
Here is the theory of how I arrived at this. The code I used to test it is posted at the end.
Let's consider
y = x/511 = x/(512-1) = (x/512) * 1/(1-1/512).
Let's define z = 512, then
y = x/z*1/(1-1/z).
Using Taylor expansion
y = (x/z)(1 + 1/z + 1/z^2 + 1/z^3 + ...).
Now if we know that x has a limited range we can cut the expansion. Let's assume x is always less than 2^15=32768. Then we can write
512*512*y = (1+512)*x = 513*x.
After looking at the digits which are significant we arrive at
y = (513*x+64)>>18 //512^2 = 2^18.
We can compute x/511 (assuming x is less than 32768) in three steps:
multiply,
add,
shift.
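The question also asks about 127, and the same technique works there with a different constant. Here is a sketch under the same assumption that x is at most 32767 (my own derivation, using 33027 = ceil(2^22/127); it is not part of the original answer, so verify it before relying on it):
inline int fast_mod_127(int x) {
    int y = (33027 * x) >> 22;  /* equals x / 127 for 0 <= x <= 32767 */
    return x - 127 * y;         /* remainder */
}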
Here is the code I used to profile this in MSVC 2013 64-bit release mode on an Ivy Bridge core.
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
inline int fast_mod_511(int x) {
int y = (513*x+64)>>18;
return x - 511*y;
}
int main() {
unsigned int i, x;
volatile unsigned int r;
double dtime;
dtime = omp_get_wtime();
for(i=0; i<100000; i++) {
for(int j=0; j<32768; j++) {
r = j%511;
}
}
dtime =omp_get_wtime() - dtime;
printf("time %f\n", dtime);
dtime = omp_get_wtime();
for(i=0; i<100000; i++) {
for(int j=0; j<32768; j++) {
r = fast_mod_511(j);
}
}
dtime =omp_get_wtime() - dtime;
printf("time %f\n", dtime);
}
You can use a lookup table with the solutions pre-stored. If you create an array of a million integers, looking up the result is about twice as fast as actually computing the modulo in my C# app.
// fill an array
var mod511 = new int[1000000];
for (int x = 0; x < 1000000; x++) mod511[x] = x % 511;
and instead of using
c = 758 % 511;
you use
c = mod511[758];
This will cost you (possibly a lot of) memory, and it obviously will not work if you also want to use it for very large numbers. But it is faster.
If you have to repeat those two modulo operations on a large amount of data, and your CPU supports SIMD (for example Intel's SSE/AVX/AVX2), then you can vectorize the operations, i.e., perform them on many data elements in parallel. You can do this using intrinsics or inline assembly. Yes, the solution will be platform-specific, but maybe that is fine...
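For instance, here is a minimal sketch of how the multiply-shift trick for 511 above could be vectorized with SSE4.1 intrinsics, processing four values at a time (my own illustration, assuming 0 <= x <= 32767 in every lane; not from the original answer):
#include <smmintrin.h>  /* SSE4.1, needed for _mm_mullo_epi32 */
/* Computes x % 511 for four ints at once, valid for 0 <= x <= 32767 per lane. */
static inline __m128i fast_mod_511_sse(__m128i x) {
    __m128i y = _mm_srli_epi32(_mm_add_epi32(_mm_mullo_epi32(x, _mm_set1_epi32(513)),
                                             _mm_set1_epi32(64)), 18);
    return _mm_sub_epi32(x, _mm_mullo_epi32(y, _mm_set1_epi32(511)));
}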