Selection Sort in Dafny - sorting

I am trying to implement selection sort in Dafny.
My sorted and FindMin functions do work, but selectionsort itself contains assertions which Dafny will not prove, even if they are correct.
Here is my program:
predicate sorted(a:array<int>,i:int)
requires a != null;
requires 0 <= i <= a.Length;
reads a;
{
forall k :: 0 < k < i ==> a[k-1] < a[k]
}
method FindMin(a:array<int>,i:int) returns(m:int)
requires a != null;
requires 0 <= i < a.Length;
ensures i <= m < a.Length;
ensures forall k :: i <= k < a.Length ==> a[k] >= a[m];
{
var j := i;
m := i;
while(j < a.Length)
decreases a.Length - j;
invariant i <= j <= a.Length;
invariant i <= m < a.Length;
invariant forall k :: i <= k < j ==> a[k] >= a[m];
{
if(a[j] < a[m]){m := j;}
j := j + 1;
}
}
method selectionsort(a:array<int>) returns(s:array<int>)
requires a != null;
modifies a;
ensures s != null;
ensures sorted(s,s.Length);
{
var c,m := 0,0;
var t;
s := a;
assert s != null;
assert s.Length == a.Length;
while(c<s.Length)
decreases s.Length-c;
invariant 0 <= c <= s.Length;
invariant c-1 <= m <= s.Length;
invariant sorted(s,c);
{
m := FindMin(s,c);
assert forall k :: c <= k < s.Length ==> s[k] >= s[m];
assert forall k :: 0 <= k < c ==> s[k] <= s[m];
assert s[c] >= s[m];
t := s[c];
s[m] := t;
s[c] := s[m];
assert s[m] >= s[c];
assert forall k :: c <= k < s.Length ==> s[k] >= s[c];
c := c+1;
assert c+1 < s.Length ==> s[c-1] <= s[c];
}
}
Why is this wrong? What does "postcondition may not hold" mean? Could Dafny give a counter-example?

You seem to understand the basic idea behind loop invariants, which is needed to verify programs using Dafny.
Your program is not correct. One way to discover this is to use the verification debugger inside the Dafny IDE in Visual Studio. Click on the last error reported (the assertion on the line before the increment of c) and you will see that the upper half of the array contains an element that is smaller than both s[c] and s[m]. Then select the program points around your 3-statement swap operation and you will notice that your swap does not actually swap.
To fix the swap, exchange the second and third statement of the 3-statement swap. Better yet, make use of Dafny's multiple assignment statement, which makes the code easier to get right:
s[c], s[m] := s[m], s[c];
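To see concretely why the original statement order fails, here is a small sketch in Python rather than Dafny. The function names are made up for illustration; the first mirrors the three statements from the question and the second mirrors the simultaneous assignment.

```python
def broken_swap(s, c, m):
    # Same order as in the question: s[m] is overwritten before its
    # old value has been copied anywhere, so that value is lost.
    t = s[c]
    s[m] = t
    s[c] = s[m]


def simultaneous_swap(s, c, m):
    # Both right-hand sides are evaluated before either element is written.
    s[c], s[m] = s[m], s[c]


xs = [5, 1]
broken_swap(xs, 0, 1)        # both slots now hold the old s[c]

ys = [5, 1]
simultaneous_swap(ys, 0, 1)  # the two values really are exchanged
```

After the broken version both positions hold the same value, which is why an element smaller than s[c] can survive in the upper half of the array.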
There are two other problems. One is that the second assertion inside the loop does not verify:
assert forall k :: 0 <= k < c ==> s[k] <= s[m];
While s[m] is the smallest element in the upper part of the array, the loop invariant needs to document that the elements in the lower part of the array are no greater than the elements in the upper part--an essential property of the selection sort algorithm. The following loop invariant does the trick:
invariant forall k, l :: 0 <= k < c <= l < a.Length ==> s[k] <= s[l];
Finally, the complaint about the property sorted(s,c) not being maintained by the loop stems from the fact that you defined sorted as strictly increasing, which swapping will never achieve unless the array's elements are initially all distinct. Dafny thus points out a design decision that you have to make about your sorting routine. You can either decide that your selectionsort method will apply only to arrays with no duplicate elements, which you do by adding
forall k, l :: 0 <= k < l < a.Length ==> a[k] != a[l];
as a precondition to (and loop invariant in) selectionsort. Or, more conventionally, you can fix your definition of sorted by replacing a[k-1] < a[k] with a[k-1] <= a[k].
To clean up your code a little, you can now delete all assert statements and the declaration of t. Since m is used only inside the loop, you can move the declaration of m to the statement that calls FindMin, which also makes it evident that the loop invariant c-1 <= m <= s.Length is not needed. The two decreases clauses can be omitted; for your program, Dafny will supply these automatically. Lastly, your selectionsort method modifies the given array in place, so there is no real reason to return the reference a in the out-parameter s; instead, you can just omit the out-parameter and replace s by a everywhere.
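As a rough cross-check of the corrected algorithm, the invariants can be tested at runtime. The following is only a Python sketch, not the Dafny program: the asserts check the particular inputs you run it on, whereas Dafny proves the properties for all inputs. The names find_min, is_sorted and selection_sort are illustrative, and is_sorted uses the non-strict (<=) definition.

```python
def find_min(a, i):
    """Return the index of a smallest element of a[i:], assuming 0 <= i < len(a)."""
    m = i
    for j in range(i, len(a)):
        if a[j] < a[m]:
            m = j
    return m


def is_sorted(a, i):
    """Non-strict counterpart of `sorted`: a[0..i) is non-decreasing."""
    return all(a[k - 1] <= a[k] for k in range(1, i))


def selection_sort(a):
    """Sort the list a in place, checking the loop invariants on every iteration."""
    c = 0
    while c < len(a):
        # Invariant: the prefix a[0..c) is sorted.
        assert is_sorted(a, c)
        # Invariant: nothing in the lower part exceeds anything in the upper part.
        assert all(a[k] <= a[l] for k in range(c) for l in range(c, len(a)))
        m = find_min(a, c)
        a[c], a[m] = a[m], a[c]  # simultaneous swap
        c += 1
    # Postcondition: the whole list is sorted.
    assert is_sorted(a, len(a))


data = [3, 1, 2, 3, 0]  # contains a duplicate on purpose
selection_sort(data)
```

The duplicate in the sample input is the case that the strict definition of sorted can never satisfy.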

Related

Use semantics to prove that the postcondition is true following the execution of the program assuming the precondition is true

I am trying to study for a test in my programming language concepts class.
I am trying to understand how to solve this problem. Our professor said we don't need to use formal notation to prove the problem as long as he can understand what we are saying.
I missed the lecture where he solved the problem and I'm having a very hard time finding resources to help me solve it on my own.
Would be so thankful for an explanation.
Problem
Use axiomatic semantics to prove that the postcondition is true following the execution of the program assuming the precondition is true
Precondition: n ≥ 0 and A contains n elements indexed from 0
bound = n;
while (bound > 0) {
t = 0;
for (i = 0; i < bound-1; i++) {
if (A[i] > A[i+1]) {
swap = A[i];
A[i] = A[i+1];
A[i+1] = swap;
t = i+1;
}
}
bound = t;
}
Postcondition: A[0] ≤ A[1] ≤ ...≤ A[n-1]
Let's number the lines for reference:
1. bound = n;
2. while (bound > 0) {
3. t = 0;
4. for (i = 0; i < bound-1; i++) {
5. if (A[i] > A[i+1]) {
6. swap = A[i];
7. A[i] = A[i+1];
8. A[i+1] = swap;
9. t = i+1;
10. }
11. }
12. bound = t;
13. }
Consider the following assertions:
1. Before entering line 12: t < bound
2. Before entering line 11: A[i] <= A[t] for all i such that 0 <= i < t
3. Before entering line 13: A[k] <= A[j] for all indexes k and j such that bound <= k <= j <= n-1
4. After leaving line 12: bound has decreased
Let's see now why the assertions are true:
1. This is true because t = 0 before the for loop and, if t is set inside the if, it is t = i + 1 < (bound - 1) + 1 = bound.
2. This is true because otherwise a swap would have happened.
3. This is true because of assertion 2 and because the for loop doesn't change entries with indexes j from bound to n-1.
4. This is true because of assertion 1.
From assertion 4 we deduce that the while loop, and so the algorithm, finishes in n steps at most, when bound = 0.
The postcondition now follows from assertion 3 for bound = 0.
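The argument can be sanity-checked by running the program with the assertions turned into runtime checks. This is a Python sketch of the loop from the problem, not a proof: the function name and the passes counter are additions made for illustration, and the checks sit at the end of each pass of the while loop.

```python
def bubble_sort_with_bound(A):
    """Sort A in place with the bound-based bubble sort and return the number of passes."""
    n = len(A)
    bound = n
    passes = 0
    while bound > 0:
        t = 0
        for i in range(bound - 1):
            if A[i] > A[i + 1]:
                A[i], A[i + 1] = A[i + 1], A[i]
                t = i + 1
        # t is either 0 or the position of the last swap, so it is below bound.
        assert t < bound
        # Everything left of the last swap position is no greater than A[t].
        assert all(A[i] <= A[t] for i in range(t))
        bound = t
        # The tail starting at the new bound is already in order.
        assert all(A[k] <= A[j] for k in range(bound, n) for j in range(k, n))
        passes += 1
    return passes


A = [4, 2, 5, 1, 3]
passes = bubble_sort_with_bound(A)
```

Because bound strictly decreases on every pass, passes can never exceed len(A), which matches the termination argument.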

How to normalize for loop?

I really want to know how to normalize for loops. In other words, I want to change for loops
from
for (i = k; i <= n; i += c) // k>=0 n>=0 c>0
to
for (j = 0; j <= itNumber-1; j++) // itNumber is the number of iterations
How can I find itNumber = f(n, k, c)?
Thanks in advance.
How to find:
You are looking for a transformation that gives you a "canonical" loop, one that starts from zero and increments the iterator by 1 in each step.
First of all, let's offset the loop for (i = k; i <= n; i += c) so that i starts from 0: for (i = 0; i <= n - k; i += c), and replace each read occurrence of i inside the loop by i + k. (This transformation is not valid if i is written inside the loop.)
Now let's handle the stride. Currently i is incremented by c in each iteration, so we divide the bound by c in order to increment by 1: for (i = 0; i <= (n - k)/c; i += 1), and replace each read occurrence of i left over from the previous step by i*c.
You are asking for a bound of the form i <= itNumber-1, so we write the condition as i <= (n - k)/c + 1 - 1, and we can summarize:
The transformation of the condition is:
f(n, k, c) = (n - k)/c + 1
and the transformation for i inside the loop is g(i) = c*i + k (scale by the stride, then add the offset back).
The calculations are between integers (as in C), so x/y is actually the integer part: floor(x/y). This assumes k <= n; if k > n, the original loop does not execute at all and itNumber is 0.
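A quick way to convince yourself of the iteration count is to compare both loop shapes directly. The sketch below is in Python; the function names are illustrative. It assumes c > 0, treats k > n as zero iterations, and recovers the original index from the normalized counter j as k + c*j.

```python
def original_values(k, n, c):
    """Values taken by i in: for (i = k; i <= n; i += c)."""
    values = []
    i = k
    while i <= n:
        values.append(i)
        i += c
    return values


def normalized_values(k, n, c):
    """Same values, produced by a loop counting j = 0 .. itNumber-1."""
    # Floor division; the k > n case is handled separately because the
    # original loop does not run at all there.
    it_number = (n - k) // c + 1 if k <= n else 0
    return [k + c * j for j in range(it_number)]


example = normalized_values(3, 20, 4)
```

For k = 3, n = 20, c = 4 this gives itNumber = 17 // 4 + 1 = 5.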
Ah, I finally understand :).
Well, I think that it should be the difference between n and k divided by the increment, plus one because the bound i <= n is inclusive, therefore
itNumber = (n-k)/c + 1

Understanding how to use post- condition and loop-invariant correctly with Frama-c

I am trying to prove on this example that the return value will be either 0 (if 8 is not in the array) or 1 (if 8 is in the array).
int fi8(int *array, int size) {
int fi8 = 0;
int i = 0;
for(i = 0; i < length; ++i)
{
if(array[i] == 8)
fi8 = 1;
}
return fi8;
}
And I created pre- and post conditions:
/*@ requires 0 <= size <= 100;
@ requires \valid(array+(0..size-1));
@ assigns \nothing;
@ ensures (\forall integer i; 0<= i < size && array[i] != 8) ==> (\result == 0);
@ ensures (\exists integer i; (0<= i < size && array[i] == 8)) && (\result == 1);
@*/
and loop invariants, because Frama-C is based on Hoare Logic:
/*@ loop invariant 0 <= i <= length;
@ loop invariant fi8 == 0 || fi8 == 1;
@ loop invariant (\forall integer i; 0<= i < size && array[i] != 8)
==> (fi8 == 0);
@ loop invariant (\exists integer i; (0<= i < size && array[i] == 8))
&& (fi8 == 1);
@ loop assigns i, fi8;
@*/
I'm pretty sure that I'm missing something on the lines, where I'm trying to use forall
and exists.
I spent hours trying to understand how I can correctly check whether a value is present in the array or not, but I feel like I'm stuck here.
I really appreciate your help :)
There are a few issues with your code. First, you seem to have mixed up size and length. I've taken the liberty of using size everywhere; otherwise this code won't even be accepted by a C compiler, let alone by Frama-C.
Second, \forall integer i; 0<= i < size && array[i] != 8 (and the corresponding loop invariant) is incorrect. Literally, it means that every integer i satisfies both that i is between 0 and size and that array[i] is not 8. Taking 101 for i gives a trivial counter-example to this proposition. What you want to express is that for any integer i, if i is between 0 and size, then array[i] is not 8, which is written \forall integer i; 0<= i < size ==> array[i] != 8.
On the other hand, the && connective is correct when used under the \exists: this time we do want to find an i such that i is within the bounds of the array and array[i]==8. However, the second && in your last ensures is not correct: what you want to say is that if there is such an i, then \result == 1, hence you must have an implication here: (\exists integer i; 0<= i < size && array[i] == 8) ==> (\result == 1).
The last issue is with your loop invariants. You are re-using i as a quantified variable while it is already a C variable, which is often not a good idea. In fact, it is a real problem here, since the property you want to express is that as long as we have not seen an 8 between 0 and i, fi8==0 (and dually that if we have seen an 8, fi8==1), i being the C variable. If you use j in your quantification, as in
loop invariant (\forall integer j; 0<= j < i ==> array[j] != 8) ==> (fi8 == 0);
loop invariant (\exists integer j; 0<= j < i && array[j] == 8) ==> (fi8 == 1);
all proof obligations are discharged.
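The difference between && and ==> under a universal quantifier, and the role of the separate variable j, can be illustrated with runtime checks. This is a Python sketch, not ACSL: the helper names are made up, a finite range stands in for "all integers", and the asserts only check the inputs you run, whereas Frama-C proves the invariants for every input.

```python
def forall_with_and(array, domain):
    # Reads as: every i in the domain is in bounds AND array[i] != 8.
    return all(0 <= i < len(array) and array[i] != 8 for i in domain)


def forall_with_implies(array, domain):
    # Reads as: for every i in the domain, IF i is in bounds THEN array[i] != 8.
    return all(not (0 <= i < len(array)) or array[i] != 8 for i in domain)


def implies(p, q):
    return (not p) or q


def fi8(array):
    """Return 1 if 8 occurs in array and 0 otherwise, checking the invariants."""
    size = len(array)
    found = 0
    i = 0
    while True:
        # Invariants, quantified over j in [0, i) rather than over i itself.
        assert 0 <= i <= size
        assert implies(all(array[j] != 8 for j in range(i)), found == 0)
        assert implies(any(array[j] == 8 for j in range(i)), found == 1)
        if i >= size:
            break
        if array[i] == 8:
            found = 1
        i += 1
    return found


domain = range(-5, 200)  # finite stand-in for "integer i"
and_version = forall_with_and([1, 2, 3], domain)
implies_version = forall_with_implies([1, 2, 3], domain)
```

Even though the sample array contains no 8, the && version is false because out-of-range values of i fail the bounds test, which is the same effect as the i = 101 counter-example.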

Performance penalty using anonymous function in Julia

I have noticed that there is a performance penalty associated with using anonymous functions in Julia. To illustrate I have two implementations of quicksort (taken from the micro performance benchmarks in the Julia distribution). The first sorts in ascending order
function qsort!(a,lo,hi)
i, j = lo, hi
while i < hi
pivot = a[(lo+hi)>>>1]
while i <= j
while a[i] < pivot; i += 1; end
while pivot < a[j]; j -= 1; end
if i <= j
a[i], a[j] = a[j], a[i]
i, j = i+1, j-1
end
end
if lo < j; qsort!(a,lo,j); end
lo, j = i, hi
end
return a
end
The second takes an additional parameter: an anonymous function that can be used to specify ascending or descending sort, or comparison for more exotic types
function qsort_generic!(a,lo,hi,op=(x,y)->x<y)
i, j = lo, hi
while i < hi
pivot = a[(lo+hi)>>>1]
while i <= j
while op(a[i], pivot); i += 1; end
while op(pivot, a[j]); j -= 1; end
if i <= j
a[i], a[j] = a[j], a[i]
i, j = i+1, j-1
end
end
if lo < j; qsort_generic!(a,lo,j,op); end
lo, j = i, hi
end
return a
end
There is a significant performance penalty when sorting Arrays of Int64, with the default version an order of magnitude faster. Here are times for sorting arrays of length N in seconds:
N        qsort_generic   qsort
2048     0.00125         0.00018
4096     0.00278         0.00029
8192     0.00615         0.00061
16384    0.01184         0.00119
32768    0.04482         0.00247
65536    0.07773         0.00490
The question is: Is this due to limitations in the compiler that will be ironed out in time, or is there an idiomatic way to pass functors/anonymous functions that should be used in cases like this?
Update: From the answers it looks like this is something that will be fixed in the compiler.
In the meantime, there were two suggested workarounds. Both approaches are fairly straightforward, though they do start to feel like the sort of jiggery-pokery that you have to use in C++ (though not on the same scale of awkward).
The first is the FastAnonymous package suggested by @Toivo Henningsson. I didn't try this approach, but it looks good.
I tried out the second method suggested by @simonstar, which gave me equivalent performance to the non-generic qsort! implementation:
abstract OrderingOp
immutable AscendingOp<:OrderingOp end
immutable DescendingOp<:OrderingOp end
evaluate(::AscendingOp, x, y) = x<y
evaluate(::DescendingOp, x, y) = x>y
function qsort_generic!(a,lo,hi,op=AscendingOp())
i, j = lo, hi
while i < hi
pivot = a[(lo+hi)>>>1]
while i <= j
while evaluate(op, a[i], pivot); i += 1; end
while evaluate(op, pivot, a[j]); j -= 1; end
if i <= j
a[i], a[j] = a[j], a[i]
i, j = i+1, j-1
end
end
if lo < j; qsort_generic!(a,lo,j,op); end
lo, j = i, hi
end
return a
end
Thanks everyone for the help.
It's a problem and will be fixed with an upcoming type system overhaul.
Update: This has now been fixed in the 0.5 version of Julia.
As others have noted, the code you've written is idiomatic Julia and will someday be fast, but the compiler isn't quite there yet. Besides using FastAnonymous, another option is to pass types instead of anonymous functions. For this pattern, you define an immutable with no fields and a method (let's call it evaluate) that accepts an instance of the type and some arguments. Your sorting function would then accept an op object instead of a function and call evaluate(op, x, y) instead of op(x, y). Because functions are specialized on their input types, there is no runtime overhead to the abstraction. This is the basis for reductions and specification of sort order in the standard library, as well as NumericExtensions.
For example:
immutable AscendingSort; end
evaluate(::AscendingSort, x, y) = x < y
function qsort_generic!(a,lo,hi,op=AscendingSort())
i, j = lo, hi
while i < hi
pivot = a[(lo+hi)>>>1]
while i <= j
while evaluate(op, a[i], pivot); i += 1; end
while evaluate(op, pivot, a[j]); j -= 1; end
if i <= j
a[i], a[j] = a[j], a[i]
i, j = i+1, j-1
end
end
if lo < j; qsort_generic!(a,lo,j,op); end
lo, j = i, hi
end
return a
end
Yes, it's due to limitations in the compiler, and there are plans to fix it, see e.g. this issue. In the meantime, the FastAnonymous package might provide a workaround.
The way that you have done it looks pretty idiomatic; unfortunately, there's no magic trick that you are missing (except possibly the FastAnonymous package).

How Does This String Permutation Work

Need help understanding the correctness of the second swap call.
/* Function to print permutations of string
This function takes three parameters:
1. String
2. Starting index of the string
3. Ending index of the string. */
void permute(char *a, int i, int n)
{
int j;
if (i == n)
printf("%s\n", a);
else
{
for (j = i; j <= n; j++)
{
swap((a+i), (a+j));
permute(a, i+1, n);
swap((a+i), (a+j)); // how does this work here?
}
}
}
It seems like the second swap is to undo the first swap. But I don't see a proof why the in-between permute call would preserve that the original *(a+i) would remain at a+j.
Notes:
[1] Code was found at http://www.geeksforgeeks.org/write-a-c-program-to-print-all-permutations-of-a-given-string/
Proposition: For all a of length > n (so that n is a valid index) and 0 <= i <= n, when
permute(a, i, n)
returns, a is the same as when permute was called.
Proof: (Induction start) If i == n, then permute(a, n, n); only prints the string and doesn't change it, hence the proposition holds in that case.
(Induction step) Let 0 <= k < n, and suppose the statement of the proposition holds for all i with k < i <= n. Consider the call permute(a, k, n), so that i == k.
Then in the loop
for (j = i; j <= n; j++)
{
swap((a+i), (a+j));
permute(a, i+1, n);
swap((a+i), (a+j)); // how does this work here?
}
for each j, the recursive call permute(a, i+1, n) doesn't change the contents [more precisely, it undoes all changes it makes along the way] per the hypothesis. Therefore, before the second swap, the contents of a[i] and a[j] are exactly what they were after the first swap, hence at the end of the loop body, the contents of a are exactly what they were when the loop body was entered. Since this holds for every iteration of the loop, a is unchanged when the call returns, which completes the induction.
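The proposition can also be checked empirically. Below is a Python sketch of the same algorithm, not the original C code: it works on a list of characters, collects results in a list called out instead of printing (an addition for testability), and asserts after every iteration that the array is back in the state it had when the loop body was entered.

```python
def permute(a, i, n, out):
    """Append every permutation of a[i..n] (with a[0..i) fixed) to out."""
    if i == n:
        out.append("".join(a))
        return
    for j in range(i, n + 1):
        before = list(a)            # snapshot at entry to the loop body
        a[i], a[j] = a[j], a[i]     # first swap
        permute(a, i + 1, n, out)   # leaves a unchanged, per the proposition
        a[i], a[j] = a[j], a[i]     # second swap undoes the first
        assert a == before


chars = list("ABC")
out = []
permute(chars, 0, len(chars) - 1, out)
```

If the recursive call did not restore the array, the assert would fire, because the second swap would then exchange the wrong pair of characters.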

Resources