How much can we trust warnings generated by static analysis tools for vulnerability detection?

I am running flawfinder on a set of libraries written in C/C++, and it generates a lot of warnings. How much can I rely on these warnings? For example, consider the following function from the numpy library (https://github.com/numpy/numpy/blob/4ada0641ed1a50a2473f8061f4808b4b0d68eff5/numpy/f2py/src/fortranobject.c):
static PyObject *
fortran_doc(FortranDataDef def)
{
    char *buf, *p;
    PyObject *s = NULL;
    Py_ssize_t n, origsize, size = 100;

    if (def.doc != NULL) {
        size += strlen(def.doc);
    }
    origsize = size;
    buf = p = (char *)PyMem_Malloc(size);
    if (buf == NULL) {
        return PyErr_NoMemory();
    }
    if (def.rank == -1) {
        if (def.doc) {
            n = strlen(def.doc);
            if (n > size) {
                goto fail;
            }
            memcpy(p, def.doc, n);
            p += n;
            size -= n;
        }
        else {
            n = PyOS_snprintf(p, size, "%s - no docs available", def.name);
            if (n < 0 || n >= size) {
                goto fail;
            }
            p += n;
            size -= n;
        }
    }
    else {
        PyArray_Descr *d = PyArray_DescrFromType(def.type);
        n = PyOS_snprintf(p, size, "'%c'-", d->type);
        Py_DECREF(d);
        if (n < 0 || n >= size) {
            goto fail;
        }
        p += n;
        size -= n;
        if (def.data == NULL) {
            n = format_def(p, size, def) == -1;
            if (n < 0) {
                goto fail;
            }
            p += n;
            size -= n;
        }
        else if (def.rank > 0) {
            n = format_def(p, size, def);
            if (n < 0) {
                goto fail;
            }
            p += n;
            size -= n;
        }
        else {
            n = strlen("scalar");
            if (size < n) {
                goto fail;
            }
            memcpy(p, "scalar", n);
            p += n;
            size -= n;
        }
    }
    if (size <= 1) {
        goto fail;
    }
    *p++ = '\n';
    size--;

    /* p now points one beyond the last character of the string in buf */
#if PY_VERSION_HEX >= 0x03000000
    s = PyUnicode_FromStringAndSize(buf, p - buf);
#else
    s = PyString_FromStringAndSize(buf, p - buf);
#endif
    PyMem_Free(buf);
    return s;

 fail:
    fprintf(stderr, "fortranobject.c: fortran_doc: len(p)=%zd>%zd=size:"
                    " too long docstring required, increase size\n",
            p - buf, origsize);
    PyMem_Free(buf);
    return NULL;
}
There are two memcpy() API calls, and flawfinder tells me that:
vul_fortranobject.c:216: [2] (buffer) memcpy:
  Does not check for buffer overflows when copying to destination (CWE-120).
  Make sure destination can always hold the source data.
  memcpy(p, "scalar", n);
I am not sure whether this report is a true positive.

To answer your question: static analysis tools (like FlawFinder) can generate a LOT of "false positives".
I Googled to find some quantifiable information for you, and found an interesting article about "DeFP":
https://arxiv.org/pdf/2110.03296.pdf
Static analysis tools are frequently used to detect potential vulnerabilities in software systems. However, an inevitable problem of these tools is their large number of warnings with a high false positive rate, which consumes time and effort for investigating. In this paper, we present DeFP, a novel method for ranking static analysis warnings. Based on the intuition that warnings which have similar contexts tend to have similar labels (true positive or false positive), DeFP is built with two BiLSTM models to capture the patterns associated with the contexts of labeled warnings. After that, for a set of new warnings, DeFP can calculate and rank them according to their likelihoods to be true positives (i.e., actual vulnerabilities). Our experimental results on a dataset of 10 real-world projects show that using DeFP, by investigating only 60% of the warnings, developers can find +90% of actual vulnerabilities. Moreover, DeFP improves the state-of-the-art approach 30% in both Precision and Recall.
Apparently, the authors built a neural network to analyze FlawFinder results, and rank them.
I doubt DeFP is a practical "solution" for you. But yes: if you think that specific "memcpy()" warning is a "false positive" - then I'm inclined to agree. It very well could be :)
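For some intuition about why this particular warning is very likely a false positive: FlawFinder matches risky function names lexically and does not follow data flow, so it flags every memcpy() it sees, including ones guarded by a bounds check. A minimal sketch (hypothetical code, not from numpy):
#include <cstring>

// Hypothetical example: the copy is guarded the same way fortran_doc()
// guards its "scalar" copy, yet a lexical scanner reports CWE-120 anyway,
// because it only matches the call to memcpy() and never looks at the
// bounds check that precedes it.
void copy_scalar_label(char *p, std::size_t size)
{
    std::size_t n = std::strlen("scalar");
    if (size < n) {                   // guard: destination too small, bail out
        return;
    }
    std::memcpy(p, "scalar", n);      // flagged regardless of the guard
}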

Related

Find word in string buffer/paragraph/text

This was asked in an Amazon telephonic interview: "Can you write a program (in your preferred language C/C++/etc.) to find a given word in a big string buffer, i.e., count the number of occurrences?"
I am still looking for the perfect answer which I should have given to the interviewer. I tried to write a linear search (char by char comparison) and obviously I was rejected.
Given 40-45 minutes for a telephonic interview, what was the perfect algorithm he/she was looking for?
The KMP (Knuth–Morris–Pratt) algorithm is a popular string matching algorithm.
Checking char by char is inefficient: if the string has 1000 characters and the keyword has 100 characters, you don't want to perform unnecessary comparisons. The KMP algorithm handles many cases which can occur, but I imagine the interviewer was looking for this one: when you begin (pass 1), the first 99 characters match, but the 100th character doesn't. Now, for pass 2, instead of performing the entire comparison again from character 2, you have enough information to deduce where the next possible match can begin.
// C program for implementation of the KMP pattern searching algorithm
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void computeLPSArray(char *pat, int M, int *lps);

void KMPSearch(char *pat, char *txt)
{
    int M = strlen(pat);
    int N = strlen(txt);

    // create lps[] that will hold the longest prefix suffix
    // values for the pattern
    int *lps = (int *)malloc(sizeof(int) * M);
    int j = 0;  // index for pat[]

    // Preprocess the pattern (calculate lps[] array)
    computeLPSArray(pat, M, lps);

    int i = 0;  // index for txt[]
    while (i < N)
    {
        if (pat[j] == txt[i])
        {
            j++;
            i++;
        }
        if (j == M)
        {
            printf("Found pattern at index %d \n", i - j);
            j = lps[j - 1];
        }
        // mismatch after j matches
        else if (i < N && pat[j] != txt[i])
        {
            // Do not match lps[0..lps[j-1]] characters,
            // they will match anyway
            if (j != 0)
                j = lps[j - 1];
            else
                i = i + 1;
        }
    }
    free(lps);  // to avoid memory leak
}

void computeLPSArray(char *pat, int M, int *lps)
{
    int len = 0;  // length of the previous longest prefix suffix
    int i;

    lps[0] = 0;  // lps[0] is always 0
    i = 1;

    // the loop calculates lps[i] for i = 1 to M-1
    while (i < M)
    {
        if (pat[i] == pat[len])
        {
            len++;
            lps[i] = len;
            i++;
        }
        else  // (pat[i] != pat[len])
        {
            if (len != 0)
            {
                // This is tricky. Consider the example
                // AAACAAAA and i = 7.
                len = lps[len - 1];
                // Also, note that we do not increment i here
            }
            else  // if (len == 0)
            {
                lps[i] = 0;
                i++;
            }
        }
    }
}

// Driver program to test the above functions
int main()
{
    char *txt = "ABABDABACDABABCABAB";
    char *pat = "ABABCABAB";
    KMPSearch(pat, txt);
    return 0;
}
This code is taken from a really good site that teaches algorithms:
Geeks for Geeks KMP
Amazon and similar companies expect knowledge of the Boyer–Moore and/or Knuth–Morris–Pratt string search algorithms.
Those are good if you want to show perfect knowledge. Otherwise, try to be creative and write something relatively elegant and efficient.
Did you ask about delimiters before you wrote anything? The interviewer might have simplified your task by providing some extra information about the string buffer.
Even the code below could be OK (it's really not) if you provide enough information in advance and properly explain the runtime, the space requirements, and the choice of data containers.
#include <sstream>
#include <string>
#include <unordered_map>

int find( const std::string & the_word, const std::string & text )
{
    std::stringstream ss( text ); // !!! could be a really bad idea if 'text' is really big
    std::string word;
    std::unordered_map< std::string, int > umap;
    while( ss >> word ) ++umap[word]; // assumes words are separated by whitespace
    return umap[the_word];
}
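For what it's worth, a quick usage sketch (assuming the find() above is in scope; the sample sentence is made up):
#include <iostream>
#include <string>

int find( const std::string & the_word, const std::string & text ); // as defined above

int main()
{
    std::string text = "the quick brown fox jumps over the lazy dog";
    std::cout << find("the", text) << '\n'; // prints 2: "the" occurs twice
    return 0;
}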

Sorting too slow

So, I'm doing a project for my programming languages class. I have to create a structure, sort it, and then show the time it takes to do so. The thing is, bubble sort (case 1) takes 60 s, insertion sort (case 2) 5 s, and selection sort (case 4) 10 s, all sorting 100000 elements. Shell sort (case 3) only takes 0.03 s, so I started thinking I might have something wrong with my algorithms. Can someone help me?
void ordenesc(compleja *vd, int tam)
{
    int i = 0, j = 0, k = 0, aux = 0, op = 0, inc = 0, minimo = 0;
    char auxcad[20];
    clock_t start, end;
    double tiempo;

    op = menus(3);
    start = clock();
    switch (op)
    {
    case 1: // Bubble sort
        for (i = 1; i <= tam; i++)
        {
            for (j = 0; j < tam - 1; j++)
            {
                if (vd[j].nro > vd[j+1].nro)
                {
                    aux = vd[j].nro;
                    vd[j].nro = vd[j+1].nro;
                    vd[j+1].nro = aux;
                    strcpy(auxcad, vd[j].cad);
                    strcpy(vd[j].cad, vd[j+1].cad);
                    strcpy(vd[j+1].cad, auxcad);
                }
            }
        }
        break;
    case 2: // Insertion sort
        for (i = 1; i < tam; i++)
        {
            aux = vd[i].nro;
            strcpy(auxcad, vd[i].cad);
            for (j = i - 1; j >= 0 && vd[j].nro > aux; j--)
            {
                vd[j+1].nro = vd[j].nro;
                strcpy(vd[j+1].cad, vd[j].cad);
                j--; // note: j is decremented here and again by the for header each iteration
            }
            vd[j+1].nro = aux;
            strcpy(vd[j+1].cad, auxcad);
        }
        break;
    case 3: // Shell sort
        inc = (tam / 2);
        while (inc > 0)
        {
            for (i = 0; i < tam; i++)
            {
                j = i;
                aux = vd[i].nro;
                strcpy(auxcad, vd[i].cad);
                while ((j >= inc) && (vd[j-inc].nro > aux))
                {
                    vd[j].nro = vd[j-inc].nro;
                    strcpy(vd[j].cad, vd[j-inc].cad);
                    j = j - inc;
                }
                vd[j].nro = aux;
                strcpy(vd[j].cad, auxcad);
            }
            if (inc == 2)
                inc = 1;
            else
                inc = inc * 5 / 11;
        }
        break;
    case 4: // Selection sort
        for (i = 0; i < tam - 1; i++)
        {
            minimo = i;
            for (j = i + 1; j < tam; j++)
            {
                if (vd[minimo].nro > vd[j].nro) minimo = j;
            }
            aux = vd[minimo].nro;
            vd[minimo].nro = vd[i].nro;
            vd[i].nro = aux;
            strcpy(auxcad, vd[minimo].cad);
            strcpy(vd[minimo].cad, vd[i].cad);
            strcpy(vd[i].cad, auxcad);
        }
        break;
    case 9:
        break;
    default:
        break;
    }
    end = clock();
    tiempo = ((double)(end - start)) / CLOCKS_PER_SEC;
    //system("cls");
    i = 0;
    for (i = 0; i < tam; i++) {
        printf("%d %s \n", vd[i].nro, vd[i].cad);
    }
    printf("\n Tardo %f segundos \n", tiempo); /* "Took %f seconds" */
    return;
}
P.S.: I edited the text; sorry, English is not my first language and my brain is failing due to this.
To make sure your sort algorithm works as expected, you could add a check to the final loop that the elements are actually ordered when you print them. It's relatively unlikely that the algorithm has a fundamental error and still sorts correctly.
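A sketch of such a check (the compleja definition is my guess from the code above; only the two fields used there are included):
#include <stdio.h>

// Guessed shape of the question's struct; only nro and cad appear above.
typedef struct { int nro; char cad[20]; } compleja;

// Returns 1 if vd[0..tam-1] is ordered by nro; otherwise reports the
// first out-of-order position and returns 0.
int check_sorted(const compleja *vd, int tam)
{
    for (int i = 1; i < tam; i++) {
        if (vd[i-1].nro > vd[i].nro) {
            printf("Not sorted at index %d (%d > %d)\n", i, vd[i-1].nro, vd[i].nro);
            return 0;
        }
    }
    return 1;
}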
One point of the exercise may be to show that sorting algorithms really matter, and Shell sort is the only algorithm in your list that performs better than O(n^2). So I wouldn't be too surprised by wide differences in performance.
One improvement you could make to bubble sort is that the inner loop only needs to run up to tam - i (instead of tam - 1), as after i passes the i largest elements will already have bubbled up to the end; see the sketch below.
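A sketch of that change (reusing the guessed compleja above; whole-struct assignment also stands in for the three strcpy calls):
// Bubble sort with a shrinking inner bound: after pass i, the i largest
// elements already sit in their final places at the end of the array.
void burbujeo(compleja *vd, int tam)
{
    for (int i = 1; i <= tam; i++) {
        for (int j = 0; j < tam - i; j++) {   // was: j < tam - 1
            if (vd[j].nro > vd[j+1].nro) {
                compleja aux = vd[j];         // struct assignment copies nro and cad
                vd[j] = vd[j+1];
                vd[j+1] = aux;
            }
        }
    }
}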
Another improvement may be to just copy the pointers instead of the contents of the char arrays (note this only works if the cad member is declared as a char pointer rather than a char array), e.g.
instead of
char auxcad[20];
...
strcpy(auxcad, vd[j].cad);
strcpy(vd[j].cad, vd[j+1].cad);
strcpy(vd[j+1].cad, auxcad);
you may want to write
char* auxcad;
...
auxcad = vd[j].cad;
vd[j].cad = vd[j+1].cad;
vd[j+1].cad = auxcad;

Issue in making a String Algorithm

Given a string made up of 'a' and 'b' only, the allowed operation is to remove the substring "abb" wherever it is present. My question: after applying this operation any number of times, can I make the string empty? I need an O(n) algorithm.
For example:
abbabb --> yes
aabbbb --> yes, since aabbbb -> abb -> empty
aaabbb --> no, since aaabbb -> aab
All I can think of up to now is an O(n^2) algorithm, in which I successively find the position of the substring using substr() or find() and then remove it, until the string is empty or no "abb" is found in it.
Here is an example of what I suggested in the comment:
for i = 0 to word.length-1
    if word[i] == 'b'
        if stack.empty()       // no corresponding a
            return false
        if stack.top() == 'a'  // first b after an a
            stack.push('b')
        else                   // second b after an a
            stack.pop()        // pop last two letters
            stack.pop()
    else
        stack.push('a')
return stack.empty()
There might be some boundary conditions that need to be checked, and of course at any point where pop() fails you need to return false. It seems to work for the inputs that occur to me.
The point that needs to be mathematically proved, I think, is the part where I commented "second b after an a". With the assumption that the stack was empty at the beginning, if I did not miss anything, that point looks correct.
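In case a concrete version helps, here is a direct C++ rendering of that pseudocode (a sketch; the function name is mine):
#include <stack>
#include <string>

bool reducible(const std::string &word)
{
    std::stack<char> st;
    for (char ch : word) {
        if (ch == 'b') {
            if (st.empty())          // no corresponding a
                return false;
            if (st.top() == 'a') {   // first b after an a
                st.push('b');
            } else {                 // second b after an a: pop the "ab" pair
                st.pop();
                st.pop();
            }
        } else {                     // ch == 'a'
            st.push('a');
        }
    }
    return st.empty();
}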
It is not necessary to store anything but the count of unused pairs of b's at the end of the string, as you read it right to left. (And it's solved reading the input only once, so O(n) time and O(1) space.) This is very reminiscent of finding a discrete finite automaton for a regular language. If you see two b's, increase the count. If you see a single b, add half a pair (update a boolean variable and possibly increment the count). If you see an a and have no pair of b's, fail; otherwise decrement the count. If you reach the end of the string and there were no extra b's, the string was valid.
Make use of a counter to avoid using a stack. Note that a single left-to-right pass with two counters (aCount/bCount) loses the interleaving information and wrongly rejects reducible strings such as "ababbb", so the count is best maintained scanning right to left, as clwhisk describes above. Here is a C++ implementation, hope it works:
#include <string>
using std::string;

bool canBeDone(const string &s)
{
    int bCount = 0;  // unused b's seen so far, scanning right to left
    for (int i = (int)s.length() - 1; i >= 0; --i)
    {
        if (s[i] == 'b')
        {
            ++bCount;
        }
        else  // s[i] == 'a': it needs two b's to its right
        {
            if (bCount < 2)
                return false;
            bCount -= 2;
        }
    }
    return bCount == 0;
}
Very simple and straightforward implementation in Erlang, O(n) space and time (unfortunately even clwhisk's algorithm needs O(n) space in Erlang because of lists:reverse/1):
-module(abb).
-export([check/1, clwhisk/1, test/0]).

check(L) when is_list(L) ->
    check(L, []).

check(L, "bba" ++ T) -> check(L, T);
check([H|T], S)      -> check(T, [H|S]);
check([], S)         -> S =:= [].

clwhisk(L) when is_list(L) ->
    clwhisk(lists:reverse(L), 0).

clwhisk([$b|T], C) -> clwhisk(T, C+1);
clwhisk([$a|T], C) -> C >= 2 andalso clwhisk(T, C-2);
clwhisk(L, C)      -> L =:= [] andalso C =:= 0.

test() ->
    true  = abb:check("abbabb"),
    true  = abb:check("aabbbb"),
    false = abb:check("aaabbb"),
    true  = abb:check("ababbb"),
    true  = abb:clwhisk("abbabb"),
    true  = abb:clwhisk("aabbbb"),
    false = abb:clwhisk("aaabbb"),
    true  = abb:clwhisk("ababbb"),
    ok.
And here is a C implementation of clwhisk's algorithm, as a filter:
#include <stdlib.h>
#include <stdio.h>

/* Return a pointer to the last character before the newline/NUL. */
static inline const char *last(const char *s) {
    for (; *s && *s != '\n'; s++);
    return s - 1;
}

static int check(const char *s) {
    int count = 0;
    const char *ptr = last(s);
    for (; ptr >= s; ptr--)
        if (*ptr == 'b') {
            count++;
        }
        else if (*ptr == 'a') {
            count -= 2;
            if (count < 0)
                return 0;
        }
        else return 0;
    return count == 0;
}

int main(void) {
    char *line = NULL;
    size_t len = 0;
    while (getline(&line, &len, stdin) != -1)
        if (*line && *line != '\n' && check(line))
            fputs(line, stdout);
    free(line);  /* release the buffer getline() allocated */
    return EXIT_SUCCESS;
}

Understanding Frama-C slicer results

I'd like to know if it's possible to do some kind of forward conditioned slicing with Frama-C, and I'm playing with some examples to understand how one could achieve this.
I've got this simple example which seems to result in an imprecise slice, and I can't understand why. Here is the function I'd like to slice:
int f(int a){
    int x;
    if(a == 0)
        x = 0;
    else if(a != 0)
        x = 1;
    return x;
}
If I use this specification:
/*@ requires a == 0;
  @ ensures \old(a) == a;
  @ ensures \result == 0;
  */
then Frama-C returns the following slice (which is precise), using the "f -slice-return" criterion and f as the entry point:
/*@ ensures \result ≡ 0; */
int f(void){
    int x;
    x = 0;
    return x;
}
But when using this specification:
/*@ requires a != 0;
  @ ensures \old(a) == a;
  @ ensures \result == 1;
  */
then all instructions (and annotations) remain, whereas I expected the following slice to be returned:
/*@ ensures \result ≡ 1; */
int f(void){
    int x;
    x = 1;
    return x;
}
Is the slice imprecise in the latter case? If so, what could be the cause?
Regards,
Romain
Edit: I wrote "else if(a != 0) ..." but the problem remains with a plain "else ...".
In Frama-C, the slicing plug-in relies on the result of a preliminary static analysis plug-in called the value analysis.
This value analysis can represent the values of variable a when a == 0 (the set of values is in this case { 0 }), but it has a hard time representing the values of a when it is only known that a != 0. In the latter case, if a is not already known to be positive or negative, the value analysis plug-in needs to approximate the set of values for a. If a were known to be positive, for instance if it were an unsigned int, then the nonzero values could be represented as an interval, but the value analysis plug-in cannot represent "all values of type int except 0".
If you are willing to change the pre-condition, you can write it in a form that is more easily understood by the value analysis plug-in (together with value analysis option -slevel):
$ cat t.c
/*@ requires a < 0 || a > 0;
  @ ensures \old(a) == a;
  @ ensures \result == 0;
  */
int f(int a){
    int x;
    if(a == 0)
        x = 0;
    else if(a != 0)
        x = 1;
    return x;
}
$ frama-c -slevel 10 t.c -main f -slice-return f -then-on 'Slicing export' -print
…
/* Generated by Frama-C */
/*@ ensures \result ≡ 0; */
int f(void)
{
    int x;
    x = 1;
    return x;
}
This has no relevance whatsoever to your main question, but your ensures a == \old(a) clause is not doing what you expect. If you pretty-print your source code with option -print, you will see that it has been silently transformed into ensures \old(a) == \old(a).
The ACSL language does not permit referring to the value of formal variables in the post-state, mostly because this is meaningless from the point of view of the caller. (The stack frame of the callee is popped after the call terminates.)

break overhead vs control flag

I was using a naive prime generation function. This code takes about 5.25 seconds to generate 10k prime numbers (device_primes[0] holds the number of primes already found; the remaining positions hold the prime numbers found).
__global__ void getPrimes(int *device_primes, int n)
{
    int c = 0;
    int thread_id = blockIdx.x * blockDim.x + threadIdx.x;
    int num = thread_id + 2;

    if (thread_id == 0) device_primes[0] = 1;
    __syncthreads();

    while (device_primes[0] < n)
    {
        for (c = 2; c <= num - 1; c++)
        {
            if (num % c == 0) //not prime
            {
                break;
            }
        }
        if (c == num) //prime
        {
            int pos = atomicAdd(&device_primes[0], 1);
            device_primes[pos] = num;
        }
        num += blockDim.x * gridDim.x; // next number for this thread
    }
}
I was just starting to optimize the code, and I made the following modification. Instead of:
for (c = 2; c <= num - 1; c++)
{
    if (num % c == 0) //not prime
        break;
}
if (c == num) {...}
I now have:
int prime = 1;
...
for (c = 2; c <= num - 1 && prime; c++)
{
    if (num % c == 0) prime = 0; // not prime
}
if (prime) {...} // if prime
Now I can generate 10k primes in 0.707 s. I was just wondering why this simple modification gives such a speedup. Is break that bad?
As Tony suggested, divergent code execution can cause major slowdowns in GPU code, forcing some code to run serially rather than in parallel. In the slow version of the code above, threads that hit the break diverge from those that continue.
The CUDA C Programming Guide is a good resource for GPU programming techniques. Here is what it says about control flow:
Any flow control instruction (if, switch, do, for, while) can significantly impact the effective instruction throughput by causing threads of the same warp to diverge (i.e., to follow different execution paths). If this happens, the different execution paths have to be serialized, increasing the total number of instructions executed for this warp. When all the different execution paths have completed, the threads converge back to the same execution path.
Newer NVIDIA hardware and CUDA versions can handle some branching a little better than older versions, but it's still best to avoid branching whenever possible.
