Understanding Frama-C slicer results - slice

I'd like to know if it's possible to do some kind of forward conditioned slicing with Frama-C and I'm playing with some examples to understand how one could achieve this.
I've got this simple example which seems to result in an imprecise slice and I can't understand why. Here is the function I'd like to slice :
int f(int a){
    int x;
    if(a == 0)
        x = 0;
    else if(a != 0)
        x = 1;
    return x;
}
If I use this specification :
/*@ requires a == 0;
  @ ensures \old(a) == a;
  @ ensures \result == 0;
*/
then Frama-C returns the following slice (which is precise), using the "-slice-return f" criterion and f as the entry point:
/*@ ensures \result ≡ 0; */
int f(void){
    int x;
    x = 0;
    return x;
}
But when using this specification :
/*@ requires a != 0;
  @ ensures \old(a) == a;
  @ ensures \result == 1;
*/
then all instructions (and annotations) remain, whereas I was expecting this slice to be returned:
/*@ ensures \result ≡ 1; */
int f(void){
    int x;
    x = 1;
    return x;
}
In this last case, is the slice imprecise? If so, what could be the cause?
Regards,
Romain
Edit : I wrote "else if(a != 0) ..." but the problem remains with "else ..."

In Frama-C, the slicing plug-in relies on the result of a preliminary static analysis plug-in called the value analysis.
This value analysis can represent the values for variable a when a == 0 (the set of values is in this case { 0 }) but has a hard time representing the values for a when it is known that a != 0. In the latter case, if a is not already known to be positive or negative, the value analysis plug-in needs to approximate the set of values for a. If a were known to be positive, for instance if it were an unsigned int, then the nonzero values could be represented as an interval, but the value analysis plug-in cannot represent “all values of type int except 0”.
If you are willing to change the pre-condition, you can write it in a form that is more easily understood by the value analysis plug-in (together with value analysis option -slevel):
$ cat t.c
/*@ requires a < 0 || a > 0 ;
  @ ensures \old(a) == a;
  @ ensures \result == 0;
*/
int f(int a){
    int x;
    if(a == 0)
        x = 0;
    else if(a != 0)
        x = 1;
    return x;
}
$ frama-c -slevel 10 t.c -main f -slice-return f -then-on 'Slicing export' -print
…
/* Generated by Frama-C */
/*@ ensures \result ≡ 0; */
int f(void)
{
    int x;
    x = 1;
    return x;
}

This has no bearing on your main question, but your ensures a == \old(a) clause is not doing what you expect. If you pretty-print your source code with option -print, you will see it has been silently transformed into ensures \old(a) == \old(a).
The ACSL language does not permit referring to the value of formal parameters in the post-state, mostly because this is meaningless from the point of view of the caller. (The stack frame of the callee is popped after the call terminates.)
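To see why the post-state value of a formal parameter means nothing to the caller, here is a minimal plain-C sketch. The names bump and caller_view are made up for illustration and do not come from the question. The callee works on its own copy of the argument, so whatever it does to a is invisible once the call returns.

```c
#include <assert.h>

/* Hypothetical callee: it modifies its formal parameter, which is a
   private copy living in the callee's stack frame. */
int bump(int a)
{
    a = a + 1;
    return a;
}

/* Hypothetical caller: the actual argument v is unchanged after the call,
   so a post-condition can only say something useful about \old(a). */
int caller_view(void)
{
    int v = 41;
    int r = bump(v);
    assert(v == 41);  /* the caller never observes the callee's copy */
    return r;
}
```

Calling caller_view() returns 42 while v stays at 41, which is the situation the \old(a) rewriting reflects.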

Related

How much can we trust warnings generated by static analysis tools for vulnerability detection?

I am running flawfinder on a set of libraries written in C/C++, and it generates a lot of warnings. My question is: how much can I rely on these warnings? For example, consider the following function from the numpy library (https://github.com/numpy/numpy/blob/4ada0641ed1a50a2473f8061f4808b4b0d68eff5/numpy/f2py/src/fortranobject.c):
static PyObject *
fortran_doc(FortranDataDef def)
{
char *buf, *p;
PyObject *s = NULL;
Py_ssize_t n, origsize, size = 100;
if (def.doc != NULL) {
size += strlen(def.doc);
}
origsize = size;
buf = p = (char *)PyMem_Malloc(size);
if (buf == NULL) {
return PyErr_NoMemory();
}
if (def.rank == -1) {
if (def.doc) {
n = strlen(def.doc);
if (n > size) {
goto fail;
}
memcpy(p, def.doc, n);
p += n;
size -= n;
}
else {
n = PyOS_snprintf(p, size, "%s - no docs available", def.name);
if (n < 0 || n >= size) {
goto fail;
}
p += n;
size -= n;
}
}
else {
PyArray_Descr *d = PyArray_DescrFromType(def.type);
n = PyOS_snprintf(p, size, "'%c'-", d->type);
Py_DECREF(d);
if (n < 0 || n >= size) {
goto fail;
}
p += n;
size -= n;
if (def.data == NULL) {
n = format_def(p, size, def) == -1;
if (n < 0) {
goto fail;
}
p += n;
size -= n;
}
else if (def.rank > 0) {
n = format_def(p, size, def);
if (n < 0) {
goto fail;
}
p += n;
size -= n;
}
else {
n = strlen("scalar");
if (size < n) {
goto fail;
}
memcpy(p, "scalar", n);
p += n;
size -= n;
}
}
if (size <= 1) {
goto fail;
}
*p++ = '\n';
size--;
/* p now points one beyond the last character of the string in buf */
#if PY_VERSION_HEX >= 0x03000000
s = PyUnicode_FromStringAndSize(buf, p - buf);
#else
s = PyString_FromStringAndSize(buf, p - buf);
#endif
PyMem_Free(buf);
return s;
fail:
fprintf(stderr, "fortranobject.c: fortran_doc: len(p)=%zd>%zd=size:"
" too long docstring required, increase size\n",
p - buf, origsize);
PyMem_Free(buf);
return NULL;
}
There are two memcpy() API calls, and flawfinder tells me that:
vul_fortranobject.c:216: [2] (buffer) memcpy:
  Does not check for buffer overflows when copying to destination (CWE-120).
  Make sure destination can always hold the source data.
  memcpy(p, "scalar", n);
I am not sure whether the report is true.
To answer your question: static analysis tools (like FlawFinder) can generate a LOT of "false positives".
I Googled to find some quantifiable information for you, and found an interesting article about "DeFP":
https://arxiv.org/pdf/2110.03296.pdf
Static analysis tools are frequently used to detect potential
vulnerabilities in software systems. However, an inevitable problem of
these tools is their large number of warnings with a high false
positive rate, which consumes time and effort for investigating. In
this paper, we present DeFP, a novel method for ranking static analysis warnings.
Based on the intuition that warnings which have
similar contexts tend to have similar labels (true positive or false
positive), DeFP is built with two BiLSTM models to capture the
patterns associated with the contexts of labeled warnings. After that,
for a set of new warnings, DeFP can calculate and rank them according
to their likelihoods to be true positives (i.e., actual
vulnerabilities).
Our experimental results on a dataset of 10
real-world projects show that using DeFP, by investigating only 60% of
the warnings, developers can find
+90% of actual vulnerabilities. Moreover, DeFP improves the state-of-the-art approach 30% in both Precision and Recall.
Apparently, the authors built a neural network to analyze FlawFinder results, and rank them.
I doubt DeFP is a practical "solution" for you. But yes: if you think that specific "memcpy()" warning is a "false positive" - then I'm inclined to agree. It very well could be :)
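To make the "false positive" reasoning concrete: as far as I know, flawfinder flags calls by matching the names of risky functions in the source text, without tracking how large the buffers are. The snippet above already checks size < n right before memcpy(p, "scalar", n). Here is a minimal sketch of that check-then-copy pattern; append_bytes and its parameters are hypothetical names, not numpy code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the pattern in fortran_doc: check the
   remaining room first, then copy and advance the cursor.
   Returns 0 on success, -1 if the destination cannot hold n more bytes. */
int append_bytes(char **cursor, size_t *room, const char *src, size_t n)
{
    assert(cursor != NULL && *cursor != NULL && room != NULL && src != NULL);
    if (*room < n) {
        return -1;            /* same role as "goto fail" in the original */
    }
    memcpy(*cursor, src, n);  /* safe: n <= *room was just checked */
    *cursor += n;
    *room -= n;
    return 0;
}
```

A tool that only sees the memcpy call would still warn here, even though the guard makes an overflow through this call impossible. The warning is best read as "go and check that a guard like this exists".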

/*undefined sequence*/ in sliced code from Frama-C

I am trying to slice code using Frama-C.
The source code is
static uint8_T ALARM_checkOverInfusionFlowRate(void)
{
uint8_T ov;
ov = 0U;
if (ALARM_Functional_B.In_Therapy) {
if (ALARM_Functional_B.Flow_Rate > ALARM_Functional_B.Flow_Rate_High) {
ov = 1U;
} else if (ALARM_Functional_B.Flow_Rate >
ALARM_Functional_B.Commanded_Flow_Rate * div_s32
(ALARM_Functional_B.Tolerance_Max, 100) +
ALARM_Functional_B.Commanded_Flow_Rate) {
ov = 1U;
} else {
if (ALARM_Functional_B.Flow_Rate > ALARM_Functional_B.Commanded_Flow_Rate * div_s32(ALARM_Functional_B.Tolerance_Min, 100) + ALARM_Functional_B.Commanded_Flow_Rate) {
ov = 2U;
}
}
}
return ov;
}
When I sliced the code using Frama-C, I got the following. I don't know what this “undefined sequence” means.
static uint8_T ALARM_checkOverInfusionFlowRate(void)
{
uint8_T ov;
ov = 0U;
if (ALARM_Functional_B.In_Therapy)
if ((int)ALARM_Functional_B.Flow_Rate > (int)ALARM_Functional_B.Flow_Rate_High)
ov = 1U;
else {
int32_T tmp_0;
{
/*undefined sequence*/
tmp_0 = div_s32((int)ALARM_Functional_B.Tolerance_Max,100);
}
if ((int)ALARM_Functional_B.Flow_Rate > (int)ALARM_Functional_B.Commanded_Flow_Rate * tmp_0 + (int)ALARM_Functional_B.Commanded_Flow_Rate)
ov = 1U;
else {
int32_T tmp;
{
/*undefined sequence*/
tmp = div_s32((int)ALARM_Functional_B.Tolerance_Min,100);
}
if ((int)ALARM_Functional_B.Flow_Rate > (int)ALARM_Functional_B.Commanded_Flow_Rate * tmp + (int)ALARM_Functional_B.Commanded_Flow_Rate)
ov = 2U;
}
}
return ov;
}
Appreciate any help in explaining why this happens.
/* undefined sequence */ in a block simply means that the block has been generated during the code normalization at parsing time but that with respect to C semantics there is no sequence point between the statements composing it. For instance x++ + x++ will be normalized as
{
    /*undefined sequence*/
    tmp = x;
    x ++;
    tmp_0 = x;
    x ++;
    ;
}
Internally, each statement in such a sequence is decorated with lists of locations that are accessed for writing or reading (use -kernel-debug 1 with -print to see them in the output). Option -unspecified-access used together with -val will check that such accesses are correct, i.e. that there is at most one statement inside the sequence that writes to a given location and, if there is one, that there is no read access to that location (except for building the value that is assigned to it). Note, however, that this option does not take into account side effects occurring in a function call inside the sequence. There is a special plug-in for that, but it has not been released yet.
Finally note that since Frama-C Neon, the comment reads only /*sequence*/, which seems to be less daunting for the user. Indeed, the original code may be correct or may show undefined behavior, but syntactic analysis is too weak to decide in the general case. For instance, (*p)++ + (*q)++ is correct as long as p and q do not overlap. This is why the normalization phase only points out the sequences and leaves it up to more powerful analysis plug-ins to check whether there might be an issue.
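As an illustration of the (*p)++ + (*q)++ case, here is a hand-written C sketch (not Frama-C output) in which every read and write is its own statement, in the same spirit as the normalized block shown earlier. The function name sum_and_increment is hypothetical, and the left-to-right order is a choice made for this sketch; the C standard does not impose one on the original expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hand-written equivalent of (*p)++ + (*q)++ where each
   access is a separate statement, so the evaluation order is fixed. */
int sum_and_increment(int *p, int *q)
{
    int tmp, tmp_0;
    assert(p != NULL && q != NULL);
    tmp = *p;      /* read the first operand */
    (*p)++;        /* then update it */
    tmp_0 = *q;    /* read the second operand (sees the update if q == p) */
    (*q)++;
    return tmp + tmp_0;
}
```

When p and q point to different objects this gives the same result as the original expression. When they alias, the original expression has undefined behavior, whereas this version simply picks one order. This is the kind of distinction the syntactic normalization cannot settle by itself.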

Learning how to prove Frama-C pre-condition goals

I have the following sample code:
typedef struct {
BYTE fs_type; /* FAT sub-type (0:Not mounted) */
BYTE drv; /* Physical drive number */
} FATFS_temp;
FATFS_temp *FatFs_temp[1]; /* Pointer to the file system objects (logical drives) */
/*@
  @ requires (vol <= 0) && (fs != \null) ==> \valid((fs)) ; // problematic one
  @ behavior mount:
  @   //assumes \valid(fs) && vol <= 0;
  @   assumes fs != \null && vol <= 0;
  @   ensures (vol <= 0) ==> (FatFs_temp[vol] == \old(fs));
  @   ensures fs->fs_type == 0;
  @ behavior unmount:
  @   assumes fs == \null && vol <= 0;
  @   ensures (vol <= 0) ==> (FatFs_temp[vol] == \null);
  @ behavior error:
  @   assumes vol > 0;
  @   ensures \result == 88;
  @ complete behaviors mount, unmount, error;
  @ disjoint behaviors mount, unmount, error;
*/
int f_mount_temp (
BYTE vol, /* Logical drive number to be mounted/unmounted */
FATFS_temp *fs /* Pointer to new file system object (NULL for unmount)*/
)
{
FATFS_temp *rfs;
if (vol >= 1) /* Check if the drive number is valid */
return 88;
rfs = FatFs_temp[vol]; /* Get current fs object */
if (rfs) {
rfs->fs_type = 0; /* Clear old fs object */
}
if (fs) {
fs->fs_type = 0; /* Clear new fs object */
}
FatFs_temp[vol] = fs; /* Register new fs object */
return 22;
}
But Frama-C / Why3 couldn't prove one of the 'requires' as commented in the code.
the .Why file states the following:
goal WP "expl:Pre-condition (file src/ff_temp.c, line 12) in 'f_mount_temp'":
forall vol_0 : int.
forall malloc_0 : map int int.
forall fatFs_temp_0 : map int addr.
forall fs_0 : addr.
(fs_0 <> null) ->
(vol_0 <= 0) ->
((linked malloc_0)) ->
((is_uint8 vol_0)) ->
(forall k_0 : int. (0 <= k_0) -> (k_0 <= 0) -> (null = fatFs_temp_0[k_0])) ->
((valid_rw malloc_0 fs_0 2))
end
for the sake of learning, my questions are:
1) what is wrong with that pre-condition?
2) based on the outputs in the .Why file, what should my approach be to find out whats wrong?
3) can someone point me to resources to learn how to go about debugging my function contracts?
EDIT:
i ran Frama-c with the following flags: "-wp -wp-rte -wp-fct f_mount_temp"
i did not call this f_mount_temp from elsewhere. i ran Frama-c to check on this f_mount_temp() directly.
Now it's clearer to me: it's likely the additional assertions that caused the pre-condition to fail. The processed function contracts are the following, with the comments indicating the status of each assertion:
/*@ requires vol ≤ 0 ∧ fs ≢ \null ⇒ \valid(fs); // unknown
    behavior mount: // unknown
      assumes fs ≢ \null ∧ vol ≤ 0;
      ensures \old(vol) ≤ 0 ⇒ FatFs_temp[\old(vol)] ≡ \old(fs);
      ensures \old(fs)->fs_type ≡ 0;
    behavior unmount: //unknown
      assumes fs ≡ \null ∧ vol ≤ 0;
      ensures \old(vol) ≤ 0 ⇒ FatFs_temp[\old(vol)] ≡ \null;
    behavior error: //unknown
      assumes vol > 0;
      ensures \result ≡ 88;
    complete behaviors mount, unmount, error; // green
    disjoint behaviors mount, unmount, error; // green
*/
The inline assertions added by the -wp-rte flag are:
int f_mount_temp(BYTE vol, FATFS_temp *fs) {
    int __retres;
    FATFS_temp *rfs;
    if ((int)vol >= 1) {
        __retres = 88;
        goto return_label;
    }
    /*@ assert rte: index_bound: vol < 1; */ // ok
    rfs = FatFs_temp[vol];
    if (rfs) {
        /*@ assert rte: mem_access: \valid(&rfs->fs_type); */ //unknown
        rfs->fs_type = (unsigned char)0;
    }
    if (fs) {
        /*@ assert rte: mem_access: \valid(&fs->fs_type); */ // unknown
        fs->fs_type = (unsigned char)0;
    }
    /*@ assert rte: index_bound: vol < 1; */ // unknown
    FatFs_temp[vol] = fs;
    __retres = 22;
    return_label: return __retres;
}
1) what is wrong with that pre-condition?
You are using && and ==> as if their relative precedences were well-known. This is wrong from a human point of view: since ==> does not appear in many languages other than ACSL, only ACSL specialists can know what a formula that depends on its precedence means.
Apart from that, there can never be anything wrong with a pre-condition in a code snippet that does not involve a call to the function. The pre-condition is not a property that is proved with respect to the implementation of the function but with respect to the context in which the function is used. You could have made a mistake and written the logical equivalent of \false and the pre-condition would still be fine for your snippet (it would only mean that all calls to the function are invalid and must themselves be proved to be unreachable).
For your question to make sense, it would have to either:
involve the proof (or lack of proof) of a post-condition of f_mount_temp and provide this function's implementation, or
involve the proof (or lack of proof) of a pre-condition of f_mount_temp and the code of the function in which f_mount_temp is called, including that function's pre-conditions, so that it is possible to tell whether this calling function respects f_mount_temp's pre-condition. In this latter case it is not necessary to provide f_mount_temp's code or post-condition, unless it is called several times in the caller. Also the code of the other functions called from the caller need not be provided, but their contracts should be.
What you have done here, providing f_mount_temp's code and asking why f_mount_temp's pre-condition is not proved, is not coherent.
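A minimal sketch of the second situation, with made-up names (reset and caller are not from the question): the requires clause of reset is an obligation on each call site, so it is while verifying caller that a prover has to establish \valid(&v). Inside reset itself the same clause is simply assumed.

```c
#include <assert.h>

/*@ requires \valid(p);
  @ assigns *p;
  @ ensures *p == 0;
  */
void reset(int *p)
{
    *p = 0;          /* \valid(p) is assumed here, not proved */
}

/* Hypothetical caller: this is where the pre-condition must hold. */
int caller(void)
{
    int v = 5;
    reset(&v);       /* proof obligation at the call site: \valid(&v) */
    assert(v == 0);  /* runtime mirror of reset's ensures clause */
    return v;
}
```

With only reset in the file and no caller, there is no call site at which the requires clause could be proved or refuted.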
2) based on the outputs in the .Why file, what should my approach be to find out whats wrong?
This is not a bad place to ask, and I think that you could receive help if you ask again with the right bits of information.
3) can someone point me to resources to learn how to go about debugging my function contracts?
I am not aware of many of these, but this site could become a resource explaining the most common debugging tricks if you ask again…

Issue in making a String Algorithm

Given a string made up of 'a' and 'b' only, the allowed operation is to remove a substring "abb" if it is present in the string. My question is: after applying this operation any number of times, can I make the string empty? I need an O(n) algorithm.
Example ,
abbabb-->yes
aabbbb->yes since aabbbb->abb->empty
aaabbb->no since aaabbb->aab
All that I can think of so far is an O(n^2) algorithm in which I successively find the position of the substring using substr() or find() and then remove it, until the string is empty or no "abb" is found in it.
Here is an example of what I suggested in the comment:
for i = 0 to word.length-1
    if word[i] == 'b'
        if stack.empty() //no corresponding a
            return false
        if stack.top() == 'a' //first b after an a
            stack.push('b')
        else //second b after an a
            stack.pop() //pop last two letters
            stack.pop()
    else
        stack.push('a')
return stack.empty()
There might be some boundary conditions that need to be checked, and of course if pop() fails at any point you need to return false. It seems to work for the possible inputs that occur to me.
The point that needs to be mathematically proved, I think, is the part where I commented "second b after an a". With the assumption that stack was empty at the beginning, if I did not miss anything that point looks correct.
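The pseudocode above can be turned into C roughly as follows. This is a sketch under stated assumptions: the function name can_reduce_to_empty is made up, the stack is a plain heap buffer, and any character other than 'a' or 'b' is rejected. On the "second b after an a" step, the stack only ever holds units of the form a or ab, so when the top is 'b' the element below it is an 'a' and popping two letters cannot underflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C version of the stack pseudocode.
   Returns true if repeated removal of "abb" can empty the word. */
bool can_reduce_to_empty(const char *word)
{
    size_t n = strlen(word);
    char *stack = malloc(n ? n : 1);  /* at most one push per character */
    size_t top = 0;                   /* number of elements on the stack */
    bool ok = true;

    if (stack == NULL) {
        return false;                 /* allocation failure, reported as "no" */
    }
    for (size_t i = 0; i < n && ok; i++) {
        if (word[i] == 'a') {
            stack[top++] = 'a';
        } else if (word[i] == 'b') {
            if (top == 0) {
                ok = false;           /* no corresponding a */
            } else if (stack[top - 1] == 'a') {
                stack[top++] = 'b';   /* first b after an a */
            } else {
                assert(top >= 2);     /* invariant: a 'b' always sits on an 'a' */
                top -= 2;             /* second b: pop the "ab" unit */
            }
        } else {
            ok = false;               /* not a string over {a, b} */
        }
    }
    ok = ok && top == 0;
    free(stack);
    return ok;
}
```

Each character is pushed or popped at most once, so the running time is O(n) with O(n) extra space.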
It is not necessary to store anything but the count of unused pairs of b's at the end of the string, as you read it right to left. (It is solved by reading the input only once, so O(n) time and O(1) space.) This is very reminiscent of finding a deterministic finite automaton for a regular language. If you see two b's, increase the count. If you see a single b, add half a pair (update a boolean variable and possibly increment the count). If you see an a and have no pair of b's, fail; otherwise decrement the count. If you reach the end of the input and there are no extra b's, the string is valid.
Make use of two counters to avoid using a stack. Here is a C++ implementation; I hope it works.
bool canBeDone(string s)
{
    int aCount = 0;
    int bCount = 0;
    for(int i=0;i<s.length();++i)
    {
        if(s[i] == 'a')
        {
            aCount++;
            continue;
        }
        if(s[i] == 'b' && aCount == 0)
            return false;
        else
        {
            bCount += 1;
            if(bCount == 2)
            {
                bCount = 0;
                aCount--;
            }
        }
    }
    if(!aCount && !bCount)return true;
    return false;
}
Very simple and straightforward implementation in Erlang O(n) space and time (unfortunately even clwhisk's algorithm needs O(n) space in Erlang because of lists:reverse/1):
-module(abb).
-export([check/1, clwhisk/1, test/0]).
check(L) when is_list(L) ->
check(L, []).
check(L, "bba" ++ T) -> check(L, T);
check([H|T], S) -> check(T, [H|S]);
check([], S) -> S =:= [].
clwhisk(L) when is_list(L) ->
clwhisk(lists:reverse(L), 0).
clwhisk([$b|T], C) -> clwhisk(T, C+1);
clwhisk([$a|T], C) -> C >= 2 andalso clwhisk(T, C-2);
clwhisk(L, C) -> L =:= [] andalso C =:= 0.
test() ->
true = abb:check("abbabb"),
true = abb:check("aabbbb"),
false = abb:check("aaabbb"),
true = abb:check("ababbb"),
true = abb:clwhisk("abbabb"),
true = abb:clwhisk("aabbbb"),
false = abb:clwhisk("aaabbb"),
true = abb:clwhisk("ababbb"),
ok.
And here is a C implementation of clwhisk's algorithm as a filter:
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <stdlib.h>
#include <stdio.h>
/* Return a pointer to the last character before the newline or NUL. */
static inline const char *last(const char* s){
    for(;*s && *s!='\n';s++);
    return s-1;
}
/* Scan right to left: each 'b' adds one, each 'a' consumes two. */
static int check(const char* s){
    int count=0;
    const char *ptr = last(s);
    for(; ptr >= s; ptr--)
        if(*ptr == 'b') {
            count++;
        }
        else if(*ptr == 'a') {
            count -= 2;
            if(count < 0)
                return 0;
        }
        else return 0;
    return count == 0;
}
int main(void) {
    char *line = NULL;
    size_t len = 0;
    while( getline(&line, &len, stdin) != -1 )
        if(*line && *line != '\n' && check(line))
            fputs(line, stdout);
    free(line);  /* getline() allocated the buffer */
    return EXIT_SUCCESS;
}

Which Frama-C version is best suited to develop a slicing plugin?

I want to explore Frama-C to apply Assertion-based Slicing (using ACSL notation).
I have found that there are several different versions of Frama-C with some different features.
My question is: which version is best suited for developing a slicing plug-in for Frama-C and for manipulating the AST created by Frama-C?
There already is a slicing plug-in in Frama-C (in all versions).
This plug-in uses the results of the value analysis plug-in, which assumes the properties written inside ACSL assertions (after having attempted to verify them).
So, depending on what you call “assertion-based slicing” (and be aware that the article that comes up first in Google is behind a paywall), what you propose to do may already exist as a Frama-C plug-in (and one that works pretty well as of the last two or three Frama-C versions).
To answer your question anyway, the best version to use is the latest one, which is Fluorine 20130601 as of this writing.
Example of existing slicing features in Frama-C:
$ cat t.c
int f(unsigned int x)
{
    int y;
    /*@ assert x == 0 ; */
    if (x)
        y = 9;
    else
        y = 10;
    return y;
}
$ frama-c -sparecode t.c -main f
...
t.c:4:[value] Assertion got status unknown.
...
/* Generated by Frama-C */
int f(unsigned int x)
{
    int y;
    /*@ assert x ≡ 0; */
    ;
    y = 10;
    return (y);
}
Is the above what you have in mind when you speak of “assertion-based slicing”?
Note: Frama-C's option -sparecode is a slicing option for the criterion “preserve all results of the program”. It still removes any statement that is without consequences, such as y=3; in y=3; y=4;, and being based on Frama-C's value analysis, it removes anything that is considered unreachable or without consequences because of the value analysis' results.
Another example to illustrate:
$ cat t.c
int f(unsigned int x)
{
    int y;
    int a, b;
    int *p[2] = {&a, &b};
    /*@ assert x == 0 ; */
    a = 100;
    b = 200;
    if (x)
        y = 9;
    else
        y = 10;
    return y + *(p[x]);
}
$ frama-c -sparecode t.c -main f
...
t.c:6:[value] Assertion got status unknown.
...
/* Generated by Frama-C */
int f(unsigned int x)
{
    int __retres;
    int y;
    int a;
    int *p[2];
    p[0] = & a;
    /*@ assert x ≡ 0; */
    ;
    a = 100;
    y = 10;
    __retres = y + *(p[x]);
    return (__retres);
}
