Invariant induction over horn-clauses with Z3py - logic

I am currently using Z3py to deduce some invariants, which are encoded as a conjunction of Horn clauses, while also providing a template for the invariant. I'm starting with a simple example, shown in the code snippet below.
x = 0;
while(x < 5){
    x += 1
}
assert(x == 5)
This translates into the horn clauses
x = 0 => Inv(x)
x < 5 /\ Inv(x) => Inv(x +1)
Not( x < 5) /\ Inv(x) => x = 5
The invariant here is x <= 5.
I have provided a template for the invariant of the form a*x + b <= c
so that all the solver has to do is guess a set of values for a,b and c that can reduce to x <= 5.
However, when I encode it, I keep getting unsat. If I try to assert Not(x == 5), I get a = 2, b = 1/8 and c = 2, which makes little sense to me as a counterexample.
I provide my code below and would be grateful for any help on correcting my encoding.
x = Real('x')
x_2 = Real('x_2')
a = Real('a')
b = Real('b')
c = Real('c')
s = Solver()
s.add(ForAll([x], And(
    Implies(x == 0, a*x + b <= c),
    Implies(And(x_2 == x + 1, x < 5, a*x + b <= c), a*x_2 + b <= c),
    Implies(And(a*x + b <= c, Not(x < 5)), x == 5)
)))
if s.check() == sat:
    print(s.model())
Edit: it gets stranger. If I remove the x_2 definition, replace x_2 with (x + 1) in the second Horn clause, and delete the x_2 == x + 1 conjunct, I get unsat whether I write Not(x == 5) or x == 5 in the final Horn clause.

There were two things preventing your original encoding from working:
1) It's not possible to satisfy x_2 == x + 1 for all x for a single value of x_2. Thus, if you're going to write x_2 == x + 1, both x and x_2 need to be universally quantified.
2) Somewhat surprisingly, this problem is satisfiable in the integers but not in the reals. You can see the problem with the clause x < 5 /\ Inv(x) => Inv(x + 1). If x is an integer, then this is satisfied by x <= 5. However, if x is allowed to be any real value, then you could have x == 4.5, which satisfies both x < 5 and x <= 5, but not x + 1 <= 5, so Inv(x) = (x <= 5) does not satisfy this problem in the reals.
Also, you might find it helpful to define Inv(x); it cleans up the code quite a bit. Here is the encoding of your problem with those changes:
from z3 import *

# Changing these from 'Int' to 'Real' changes the problem from sat to unsat.
x = Int('x')
x_2 = Int('x_2')
a = Int('a')
b = Int('b')
c = Int('c')

def Inv(x):
    return a*x + b <= c

s = Solver()

# I think this is the simplest encoding for your problem.
clause1 = Implies(x == 0, Inv(x))
clause2 = Implies(And(x < 5, Inv(x)), Inv(x + 1))
clause3 = Implies(And(Inv(x), Not(x < 5)), x == 5)
s.add(ForAll([x], And(clause1, clause2, clause3)))

# Alternatively, if clause2 is specified with x_2, then x_2 needs to be
# universally quantified. Note the ForAll([x, x_2]...
#clause2 = Implies(And(x_2 == x + 1, x < 5, Inv(x)), Inv(x_2))
#s.add(ForAll([x, x_2], And(clause1, clause2, clause3)))

# Print result all the time, to avoid confusing unknown with unsat.
result = s.check()
print(result)
if result == sat:
    print(s.model())
One more thing: it's a bit strange to me to write a*x + b <= c as a template, because this is the same as a*x <= d for some integer d.
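The integers-versus-reals point can be spot-checked without a solver. The following is a minimal sketch in plain Python, not a replacement for the Z3 query: it evaluates the three Horn clauses for the candidate invariant x <= 5 at individual values of x. The names inv and clauses_hold are made up for this illustration, and checking a finite range is evidence, not a proof.

```python
from fractions import Fraction


def inv(x):
    # Candidate invariant from the question: x <= 5.
    return x <= 5


def clauses_hold(x):
    """Evaluate the three Horn clauses at one concrete value of x."""
    init = (x != 0) or inv(x)                         # x = 0 => Inv(x)
    step = not (x < 5 and inv(x)) or inv(x + 1)       # x < 5 /\ Inv(x) => Inv(x + 1)
    done = not (inv(x) and not (x < 5)) or (x == 5)   # not(x < 5) /\ Inv(x) => x = 5
    return init and step and done


# Every integer in a sample range satisfies all three clauses.
print(all(clauses_hold(x) for x in range(-100, 101)))

# A non-integer such as 9/2 breaks the inductive step: 9/2 <= 5 but 11/2 > 5.
print(clauses_hold(Fraction(9, 2)))
```

The second call is the x == 4.5 counterexample from point 2, written as an exact fraction.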

Related

Algorithm to precisely compare two exponentiations for very large integers (order of 1 billion)

We want to compare a^b to c^d, and tell if the first is smaller, greater, or equal (where ^ denotes exponentiation).
Obviously, for very large numbers, we cannot explicitly compute these values.
The most common approach in this situation is to apply log on both sides and compare b * log(a) to d * log(c). The issue here is that logs are floating-point operations, and as such we cannot trust our answer with 100% confidence (there might be some values which are incredibly close, and because of floating-point error we get a wrong answer).
Is there an algorithm for solving this problem? I've been scouring the internet for this, but I can only find solutions which work for particular cases (e.g. in which one exponent is a multiple of another), or which use floating point in some way (logarithms, division, etc.).
This is sort of two questions in one:
Are they equal?
If not, which one is greater?
As Peter O. observes, it's easiest to build in a language that provides an arbitrary-precision fraction type. I'll use Python 3.
Let's assume without loss of generality that a ≤ c (swap if necessary) and b is relatively prime to d (divide both by the greatest common divisor).
To get at the core of the question, I'm going to assume that a, c > 0 and b, d ≥ 0. Removing this assumption is tedious but not difficult.
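The exponent normalisation can be sketched as follows. This is only an illustration under the stated assumptions (positive bases, non-negative exponents); reduce_exponents and sign are hypothetical helper names, not part of the answer's code. Dividing both exponents by their gcd g keeps the comparison unchanged because taking a g-th root is monotone on positive numbers.

```python
from math import gcd


def reduce_exponents(b, d):
    """Divide both exponents by their gcd; (0, 0) is returned unchanged."""
    g = gcd(b, d)
    if g == 0:  # only happens when b == d == 0
        return b, d
    return b // g, d // g


def sign(n):
    """Return -1, 0 or 1 according to the sign of n."""
    return (n > 0) - (n < 0)


# Comparing 2^6 with 4^4 is the same as comparing 2^3 with 4^2.
print(reduce_exponents(6, 4))
print(sign(2**6 - 4**4), sign(2**3 - 4**2))
```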
Equality test
There are some easy cases where a = 1 or b = 0 or c = 1 or d = 0.
Separately, necessary conditions for a^b = c^d are
i. b ≥ d, since otherwise b < d, which together with a ≤ c implies a^b < c^d;
ii. a is a divisor of c, since we know from (i) that a^b = c^d is a divisor of c^b = c^(b−d) c^d.
When these conditions hold, we can divide through by a^d to reduce the problem to testing whether a^(b−d) = (c/a)^d.
In Python 3:
def equal_powers(a, b, c, d):
    while True:
        lhs_is_one = a == 1 or b == 0
        rhs_is_one = c == 1 or d == 0
        if lhs_is_one or rhs_is_one:
            return lhs_is_one and rhs_is_one
        if a > c:
            a, b, c, d = c, d, a, b
        if b < d:
            return False
        q, r = divmod(c, a)
        if r != 0:
            return False
        b -= d
        c = q


def test_equal_powers():
    for a in range(1, 25):
        for b in range(25):
            for c in range(1, 25):
                for d in range(25):
                    assert equal_powers(a, b, c, d) == (a ** b == c ** d)


test_equal_powers()
Inequality test
Once we've established that the two quantities are not equal, it's time to figure out which one is greater. (Without the equality test, the code here could run forever.)
If you're doing this for real, you should consult an actual reference on computing elementary functions. I'm just going to try to do the simplest thing that works.
Time for a calculus refresher. We have the Taylor series
−log x = (1−x) + (1−x)^2/2 + (1−x)^3/3 + (1−x)^4/4 + ...
To get a lower bound, truncate the series. To get an upper bound, we can truncate but replace the final term (1−x)^n/n with (1−x)^n/n (1/x), since
(1−x)^n/n (1/x)
= (1−x)^n/n (1 + (1−x) + (1−x)^2 + ...)
= (1−x)^n/n + (1−x)^(n+1)/n + (1−x)^(n+2)/n + ...
> (1−x)^n/n + (1−x)^(n+1)/(n+1) + (1−x)^(n+2)/(n+2) + ...
To get a good convergence rate, we're going to want 0.5 ≤ x < 1, which we can achieve by dividing x by a power of two.
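To make the two bounds concrete, here is a small sketch that evaluates them with exact fractions for a fixed number of terms. The function name neg_log_bounds is made up for this illustration, and the comparison against math.log is only a floating-point sanity check, not part of the exact method.

```python
from fractions import Fraction
import math


def neg_log_bounds(x, terms):
    """Return exact (lower, upper) bounds on -log(x) for 0 < x < 1."""
    x = Fraction(x)
    power = Fraction(1)
    lower = Fraction(0)
    for n in range(1, terms + 1):
        power *= 1 - x        # power == (1 - x)^n
        lower += power / n    # truncated Taylor series
    # Upper bound: replace the final term (1-x)^n/n by (1-x)^n/n * (1/x).
    last = power / terms
    upper = lower - last + last / x
    return lower, upper


lower, upper = neg_log_bounds(Fraction(1, 2), 10)
# -log(1/2) == log(2) should lie strictly between the two bounds.
print(float(lower), math.log(2), float(upper))
```

Increasing terms narrows the interval, which is what the generator-based code relies on.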
In Python, we'll represent a real number as an infinite generator of shrinking intervals that contain the true value. Once the intervals for b log a and d log c are disjoint, we can determine how they compare.
import fractions


def minus(x, y):
    while True:
        x_lo, x_hi = next(x)
        y_lo, y_hi = next(y)
        yield x_lo - y_hi, x_hi - y_lo


def times(b, x):
    for lo, hi in x:
        yield b * lo, b * hi


def restricted_log(a):
    series = 0
    n = 0
    numerator = 1
    while True:
        n += 1
        numerator *= 1 - a
        series += fractions.Fraction(numerator, n)
        yield -(series + fractions.Fraction(numerator * (1 - a), (n + 1) * a)), -series


def log(a):
    n = 0
    while a >= 1:
        a = fractions.Fraction(a, 2)
        n += 1
    return minus(restricted_log(a), times(n, restricted_log(fractions.Fraction(1, 2))))


def less_powers(a, b, c, d):
    lhs = times(b, log(a))
    rhs = times(d, log(c))
    while True:
        lhs_lo, lhs_hi = next(lhs)
        rhs_lo, rhs_hi = next(rhs)
        if lhs_hi < rhs_lo:
            return True
        if rhs_hi < lhs_lo:
            return False


def test_less_powers():
    for a in range(1, 10):
        for b in range(10):
            for c in range(1, 10):
                for d in range(10):
                    if a ** b != c ** d:
                        assert less_powers(a, b, c, d) == (a ** b < c ** d)


test_less_powers()

Algorithm to determine if A is within X of a multiple of B

This is not language specific, but I'm not very good with maths!
What is the most efficient way to test if A is within X of a multiple of B? The method I'm using at the moment is:
# when (A mod B) < (B/2)
if (A modulus B) < X THEN TRUE
# when (A mod B) > (B/2)
if ABS((A modulus B) - B) < X THEN TRUE
Example 1: A is 100 more than 3 x B
A=30100
B=10000
X=150
(30100 mod 10000) = 100 < (10000/2)
(30100 modulus 10000) = 100
Example 2: A is 100 less than 3 x B
A=29900
B=10000
X=150
(29900 mod 10000) = 9900 > (10000/2)
ABS((29900 modulus 10000) - 10000) = 100
Is there a better way to do this?
(To avoid an XY Problem: I'm writing a script to monitor some industrial machinery, and I want to fire an alert when a lifetime counter metric is within a range of a periodic maintenance value. When the service interval is 10000 and the alert range is 150, I want to know when that counter is between 9850 and 10150, or 19850 and 20150, etc)
Your way is not bad, but it's a little faster to do:
if ((A + X) % B) <= X*2 then
    TRUE
else
    FALSE
That's for distance <= X. If you need distance < X, then:
if ((A + X + B - 1) % B) <= X*2-2 then
    TRUE
else
    FALSE
Note that A + X + B - 1 is like A + X - 1, but protected against anything weird your language might do with the modulus operator and negative operands in the case that X == 0.
I think the best approach is whatever you find clearest; but personally I would write the equivalent of ((A + X) MOD B ≤ 2X). For example, in C / Java / Perl / JavaScript / etc.:
if ((a + x) % b <= 2 * x) {
// A is within X of a multiple of B
}
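For comparison, here is a minimal Python sketch of the same test, checked against the two examples from the question. The function names are made up for this illustration, and it assumes B > 0, X >= 0 and 2*X < B (otherwise every value counts as "within range").

```python
def within_x_of_multiple(a, b, x):
    """True if a is at most x away from some multiple of b."""
    # Python's % yields a result in [0, b) for b > 0, even when a is negative.
    return (a + x) % b <= 2 * x


def within_x_naive(a, b, x):
    """Reference version: distance to the nearest multiple, computed directly."""
    r = a % b
    return min(r, b - r) <= x


print(within_x_of_multiple(30100, 10000, 150))  # 100 above 3 x B
print(within_x_of_multiple(29900, 10000, 150))  # 100 below 3 x B
print(within_x_of_multiple(30151, 10000, 150))  # 151 above 3 x B
```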

Drawing concentric tiling circles with even diameter

I need to draw circles using pixels with these constraints:
the total number of pixels across the diameter is even,
there are no empty pixels between two circles of radius R and R+1 (R is an integer).
The midpoint algorithm can’t be used, but I found out that Eric Andres wrote the exact thing I want. The algorithm can be found in this article under the name “half integer centered circle”. For those who don’t have access to it, I put the interesting part at the end of the question.
I am having difficulty implementing the algorithm. I copied it into Processing using the Python syntax (for ease of visualisation):
def half_integer_centered_circle(xc, yc, R):
    x = 1
    y = R
    d = R
    while y >= x:
        point(xc + x, yc + y)
        point(xc + x, yc - y + 1)
        point(xc - x + 1, yc + y)
        point(xc - x + 1, yc - y + 1)
        point(xc + y, yc + x)
        point(xc + y, yc - x + 1)
        point(xc - y + 1, yc + x)
        point(xc - y + 1, yc - x + 1)
        if d > x:
            d = d - x
            x = x + 1
        elif d < R + 1 - y:
            d = d + y - 1
            y = y - 1
        else:
            d = d + y - x - 1
            x = x + 1
            y = y - 1
The point() function just plots a pixel at the given coordinates. Please also note that in the article, x is initialised as S, which is strange because there is no S elsewhere (it’s not explained at all); however, it is said that the circle begins at (x, y) = (1, R), so I wrote x = 1.
Here is the result I get for radii between 1 pixel and 20 pixels:
As you can see, there are holes between circles and the circle with R = 3 is different from the given example (see below). Also, the circles are not really round compared to what you get with the midpoint algorithm.
How can I get the correct result?
Original Eric Andres’ algorithm:
I don't understand the way in which the algorithm has been presented in that paper. As I read it, the else if clause associated with case (b) doesn't have a preceding if. I get the same results as you when transcribing it as written.
Looking at the text, rather than the pseudocode, the article seems to be suggesting an algorithm of the following form:
x = 1
y = R
while x is less than or equal to y:
    draw(x, y)
    # ...
    if the pixel to the right has radius between R - 1/2 and R + 1/2:
        move one pixel to the right
    else if the pixel below has radius between R - 1/2 and R + 1/2:
        move one pixel down
    else:
        move one pixel diagonally down and right
Which seems plausible. In python:
#!/usr/bin/python3
import numpy as np
import matplotlib.pyplot as pp

fg = pp.figure()
ax = fg.add_subplot(111)

def point(x, y, c):
    xx = [x - 1/2, x + 1/2, x + 1/2, x - 1/2, x - 1/2 ]
    yy = [y - 1/2, y - 1/2, y + 1/2, y + 1/2, y - 1/2 ]
    ax.plot(xx, yy, 'k-')
    ax.fill_between(xx, yy, color=c, linewidth=0)

def half_integer_centered_circle(R, c):
    x = 1
    y = R
    while y >= x:
        point(x, y, c)
        point(x, - y + 1, c)
        point(- x + 1, y, c)
        point(- x + 1, - y + 1, c)
        point(y, x, c)
        point(y, - x + 1, c)
        point(- y + 1, x, c)
        point(- y + 1, - x + 1, c)
        def test(x, y):
            rSqr = x**2 + y**2
            return (R - 1/2)**2 < rSqr and rSqr < (R + 1/2)**2
        if test(x + 1, y):
            x += 1
        elif test(x, y - 1):
            y -= 1
        else:
            x += 1
            y -= 1

for i in range(1, 5):
    half_integer_centered_circle(2*i - 1, 'r')
    half_integer_centered_circle(2*i, 'b')

pp.axis('equal')
pp.show()
This seems to work as intended. Note that I removed the circle centre for simplicity. It should be easy enough to add in again.
Edit: I realised I could match the radius 3 image if I tweaked the logic a bit.
I have been looking into this matter and observed three issues in the original paper:
The arithmetic circle copied here (Figure 10.a in the paper) is not consistent with the formal definition of the "half integer centered circle". In one case the distance to the center must be between R-1/2 and R+1/2 and in the other between integer values. The consequence is that this specific algorithm, if properly implemented, can never generate the circle of Figure 10.a.
There is a mistake in one of the inequalities of the algorithm pseudo code: the test for case (b) should be d <= (R + 1 - y) instead of d < (R + 1 - y).
All those pixels that satisfy x==y have only 4-fold symmetry (not 8-fold) and are generated twice by the algorithm. Although producing duplicated pixels may not be a problem for a drawing routine, it is not acceptable for the application that I am interested in. However this can be easily fixed by adding a simple check of the x==y condition and skipping the four duplicated pixels.
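The formal definition mentioned in the first point (distance to the centre between R-1/2 and R+1/2) can also be evaluated by brute force, which gives a reference to compare any incremental implementation against. The sketch below is not Andres' algorithm; ring_pixels is a made-up name, and it assumes the centre sits at (1/2, 1/2) in pixel coordinates, matching the x / -x + 1 mirroring used in the code here. Coordinates are doubled so that everything stays in integers.

```python
def ring_pixels(R):
    """Pixels whose centre lies at distance in [R - 1/2, R + 1/2) from (1/2, 1/2)."""
    lo = (2 * R - 1) ** 2
    hi = (2 * R + 1) ** 2
    pixels = set()
    for x in range(-R, R + 2):
        for y in range(-R, R + 2):
            # 4 * ((x - 1/2)^2 + (y - 1/2)^2), kept as an integer.
            # It is a sum of two odd squares, hence even, while lo and hi
            # are odd, so no pixel ever sits exactly on a boundary.
            d2 = (2 * x - 1) ** 2 + (2 * y - 1) ** 2
            if lo <= d2 < hi:
                pixels.add((x, y))
    return pixels


# First-quadrant pixels of the R = 3 ring.
print(sorted(p for p in ring_pixels(3) if p[0] >= 1 and p[1] >= 1))
```

Because every pixel falls into exactly one distance band, rings built this way cannot overlap or leave holes, and each ring spans 2R pixels across.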
The Python code of the original question includes the inequality error mentioned above. I also parenthesised one update as d = d + (y - x - 1) for readability; with integer arithmetic this evaluates the same as d + y - x - 1.
The following implementation fixes all this and is compatible with python2 and python3 (no integer division issues in the point() function):
import numpy as np
import matplotlib.pyplot as pp

fg = pp.figure()
ax = fg.add_subplot(111)

def point(x, y, c):
    xx = [x - 0.5, x + 0.5, x + 0.5, x - 0.5, x - 0.5 ]
    yy = [y - 0.5, y - 0.5, y + 0.5, y + 0.5, y - 0.5 ]
    ax.plot(xx, yy, 'k-')
    ax.fill_between(xx, yy, color=c, linewidth=0)

def half_integer_centered_circle(R, c):
    x = 1
    y = R
    d = R
    while y >= x:
        point(x, y, c)
        point(x, - y + 1, c)
        point(- x + 1, y, c)
        point(- x + 1, - y + 1, c)
        if y != x:
            point(y, x, c)
            point(y, - x + 1, c)
            point(- y + 1, x, c)
            point(- y + 1, - x + 1, c)
        if d > x:
            d = d - x
            x = x + 1
        elif d <= R + 1 - y:
            d = d + y - 1
            y = y - 1
        else:
            d = d + (y - x - 1)
            x = x + 1
            y = y - 1

for i in range(1, 5):
    half_integer_centered_circle(2*i - 1, 'r')
    half_integer_centered_circle(2*i, 'b')

pp.axis('equal')
pp.show()

More efficient algorithm performs worse in Haskell

A friend of mine showed me a homework exercise from a C++ course he is attending. Since I already know C++ but have just started learning Haskell, I tried to solve the exercise the "Haskell way".
These are the exercise instructions (I translated from our native language so please comment if the instructions aren't clear):
Write a program which reads non-zero coefficients (A,B,C,D) from the user and places them in the following equation:
A*x + B*y + C*z = D
The program should also read from the user N, which represents a range. The program should find all possible integral solutions for the equation in the range -N/2 to N/2.
For example:
Input: A = 2,B = -3,C = -1, D = 5, N = 4
Output: (-1,-2,-1), (0,-2, 1), (0,-1,-2), (1,-1, 0), (2,-1,2), (2,0, -1)
The most straight-forward algorithm is to try all possibilities by brute force. I implemented it in Haskell in the following way:
triSolve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve a b c d n =
    let equation x y z = (a * x + b * y + c * z) == d
        minN = div (-n) 2
        maxN = div n 2
    in [(x,y,z) | x <- [minN..maxN], y <- [minN..maxN], z <- [minN..maxN], equation x y z]
So far so good, but the exercise instructions note that a more efficient algorithm can be implemented, so I thought about how to improve it. Since the equation is linear, and assuming that Z is always the first variable to be incremented, once a solution has been found there is no point in incrementing Z further. Instead, I should increment Y, reset Z to the minimum value of the range and keep going. This way I can skip redundant evaluations.
Since there are no loops in Haskell (to my understanding at least) I realized that such algorithm should be implemented by using a recursion. I implemented the algorithm in the following way:
solutions :: (Integer -> Integer -> Integer -> Bool) -> Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solutions f maxN minN x y z
    | solved = (x,y,z):nextCall x (y + 1) minN
    | x >= maxN && y >= maxN && z >= maxN = []
    | z >= maxN && y >= maxN = nextCall (x + 1) minN minN
    | z >= maxN = nextCall x (y + 1) minN
    | otherwise = nextCall x y (z + 1)
    where solved = f x y z
          nextCall = solutions f maxN minN

triSolve' :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
triSolve' a b c d n =
    let equation x y z = (a * x + b * y + c * z) == d
        minN = div (-n) 2
        maxN = div n 2
    in solutions equation maxN minN minN minN minN
Both yield the same results. However, trying to measure the execution time yielded the following results:
*Main> length $ triSolve' 2 (-3) (-1) 5 100
3398
(2.81 secs, 971648320 bytes)
*Main> length $ triSolve 2 (-3) (-1) 5 100
3398
(1.73 secs, 621862528 bytes)
This means that the dumb algorithm actually performs better than the more sophisticated one. Assuming that my algorithm is correct (which I hope is the case :) ), I suspect that the second algorithm suffers from overhead created by the recursion, which the first algorithm avoids since it is implemented using a list comprehension.
Is there a way to implement in Haskell a better algorithm than the dumb one?
(Also, I'd be glad to receive general feedback about my coding style.)
Of course there is. We have:
a*x + b*y + c*z = d
and as soon as we assume values for x and y, we have that
a*x + b*y = n
where n is a number we know.
Hence
c*z = d - n
z = (d - n) / c
And we keep only integral zs.
It's worth noticing that list comprehensions are given special treatment by GHC, and are generally very fast. This could explain why your triSolve (which uses a list comprehension) is faster than triSolve' (which doesn't).
For example, the solution
solve :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
-- "Buffalo buffalo buffalo buffalo Buffalo buffalo buffalo..."
solve a b c d n =
    [(x,y,z) | x <- vals, y <- vals
             , let p = a*x + b*y
             , let z = (d - p) `div` c
             , z >= minN, z <= maxN, c * z == d - p ]
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
runs fast on my machine:
> length $ solve 2 (-3) (-1) 5 100
3398
(0.03 secs, 4111220 bytes)
whereas the equivalent code written using do notation:
import Control.Monad (guard)  -- needed for guard; belongs at the top of the module

solveM :: Integer -> Integer -> Integer -> Integer -> Integer -> [(Integer,Integer,Integer)]
solveM a b c d n = do
    x <- vals
    y <- vals
    let p = a * x + b * y
        z = (d - p) `div` c
    guard $ z >= minN
    guard $ z <= maxN
    guard $ z * c == d - p
    return (x,y,z)
  where
    minN = negate (n `div` 2)
    maxN = (n `div` 2)
    vals = [minN..maxN]
takes twice as long to run and uses twice as much memory:
> length $ solveM 2 (-3) (-1) 5 100
3398
(0.06 secs, 6639244 bytes)
The usual caveats about testing within GHCi apply: if you really want to see the difference, you need to compile the code with -O2 and use a decent benchmarking library (like Criterion).

Haskell - list comprehension can't enumerate N × N

I have to write a function which returns a list of all pairs (x,y) where x, y ∈ N, and:
x is the product of two natural numbers (x = a • b, where a, b ∈ N) and
x is strictly greater than 5 but strictly less than 500, and
y is a square number (y = c² where c ∈ N) not greater than 1000, and
x is a divisor of y.
My attempt:
listPairs :: [(Int, Int)]
listPairs = [(a*b, y) | y <- [0..], a <- [0..], b <- [0..],
                        (a*b) > 5, (a*b) < 500, (y*y) < 1001,
                        mod y (a*b) == 0]
But it doesn't return anything and the computer works a lot on it.
However, if I choose a smaller range for a, b and y, e.g. [0..400], it takes up to a minute but it returns the right result.
So how could I solve the performance issue?
So, of course nested list comprehensions on infinite lists do not terminate.
Fortunately, your lists are not infinite. There's a limit. If x = a*b < 500, then (with a, b ≥ 1) we know that a < 500 and b < 500. Also, y = c*c < 1001 is just c < 32. So,
listPairs :: [(Int, Int)]
listPairs =
    [(x, c*c) | c <- [1..31], a <- [1..499],   -- a*b < 500 ==> b <= 499/a
                b <- [a..div 499 a],           -- a*b==b*a ==> b >= a
                let x = a*b, x > 5,
                -- (a*b) < 500, (c*c) < 1001,  -- no need to test this
                rem (c*c) x == 0]
mod 0 n == 0 trivially holds, so I'm excluding 0 from "natural numbers" here.
There are still some duplicates produced here, even though we've limited the b value to b >= a in x=a*b, because x can have several representations (e.g. 1*6 == 2*3).
You can use Data.List.nub to get rid of them.
