I have a problem which I encode in an SMT-LIB file for input into Z3. There are many constraints, but really I am only interested in minimizing one variable:
(minimize totalCost)
(check-sat)
The solver runs, and runs, for hours. Much longer than if I were to simply use an assert to set totalCost below some value and run check-sat. Is there any way to get Z3 to periodically print out the lowest value it has achieved for totalCost, along with all the variable values to achieve that? That would be very helpful. Thanks in advance!
Looking at the verbose mode as Patrick described is your best bet if you want to stick with the optimizing solver. Of course, the output might be hard to understand and may not even have all the data you need. You can "instrument" the z3 source code to print out more if you dig deep into it, but that's a lot more work and would require studying the source.
But it sounds like z3 is handling your constraints quite well on their own; it's the optimization engine that slows things down. That's not surprising, as the optimizing solver is not as performant as the regular solver, for obvious reasons. If you suspect this is the case, you might want to optimize with an outer loop instead: do a check-sat, get the value of totalCost, then ask again with the extra constraint that totalCost must be less than the value just found. This can converge quickly for certain problems: if the solution space is small enough and you use many different theories, this technique can outperform the optimizing solver. You can also implement a "binary" search of sorts: for instance, if the solver gives you a solution with cost 100, you can ask if there's one with cost less than 50; if sat, you'd then ask for 25; if unsat, you'd ask for 75. Depending on your problem, this can be very effective.
Note that if you implement the above trick, you can also use the solver in incremental mode, and it will re-use all the lemmas it learned before. The optimizing solver itself is not incremental, so that's another advantage of the looping technique. On the other hand, if there are too many solutions to your constraints, or if there is no global minimum, then the looping technique can run forever, so you probably want to watch the loop count and stop after a certain threshold.
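If it helps, here is a minimal sketch of that looping technique using the z3 Python API (z3py). The variable and the single constraint are placeholders for your actual problem; the loop structure is the point:

from z3 import Int, Solver, sat

# Outer-loop minimization sketch: repeatedly tighten the bound on totalCost.
totalCost = Int('totalCost')
s = Solver()
s.add(totalCost >= 0)              # placeholder; add your real constraints here

best = None
MAX_ITERS = 1000                   # stop after a threshold, per the caveat above
for _ in range(MAX_ITERS):
    if s.check() != sat:
        break                      # nothing cheaper than `best` exists
    m = s.model()
    best = m[totalCost]
    print("best totalCost so far:", best)   # print(m) for the full assignment
    s.add(totalCost < best)        # incremental: learned lemmas are re-used

print("minimum found:", best)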
If you run z3 via the command-line, you may try the option -v:1, which makes the OMT solver print any update to the lower/upper bounds of the cost function.
e.g.
~$ z3 -v:1 optimathsat/tools/release_files/examples/optimization/smtlib2_binary.smt2
...
(optsmt upper bound: 57)
(optsmt upper bound: 54)
(optsmt upper bound: 157/3)
(optsmt upper bound: 52)
(optsmt upper bound: 154/3)
(optsmt upper bound: 50)
(optsmt upper bound: 149/3)
(optsmt upper bound: 97/2)
(optsmt upper bound: 145/3)
(optsmt upper bound: 48)
(optsmt upper bound: 95/2)
(optsmt upper bound: 140/3)
(optsmt upper bound: 46)
(optsmt upper bound: 181/4)
(optsmt upper bound: 45)
(optsmt upper bound: 134/3)
(optsmt upper bound: 89/2)
(optsmt upper bound: 177/4)
(optsmt upper bound: 44)
(optsmt upper bound: 43)
(optsmt upper bound: 171/4)
(optsmt upper bound: 128/3)
(optsmt upper bound: 85/2)
(optsmt upper bound: 42)
(optsmt upper bound: 81/2)
(optsmt upper bound: 202/5)
(optsmt upper bound: 40)
(optsmt upper bound: 199/5)
(optsmt upper bound: 193/5)
(optsmt upper bound: 77/2)
(optsmt upper bound: 192/5)
(optsmt upper bound: 115/3)
(optsmt upper bound: 191/5)
(optsmt upper bound: 189/5)
(optsmt upper bound: 217/6)
(optsmt upper bound: 36)
(optsmt upper bound: 69/2)
(optsmt upper bound: 137/4)
(optsmt upper bound: 34)
(optsmt upper bound: 65/2)
(optsmt upper bound: 223/7)
(optsmt upper bound: 63/2)
(optsmt upper bound: 218/7)
(optsmt upper bound: 216/7)
(optsmt upper bound: 123/4)
(optsmt upper bound: 61/2)
(optsmt upper bound: 211/7)
(optsmt upper bound: 241/8)
(optsmt upper bound: 30)
(optsmt upper bound: 208/7)
(optsmt upper bound: 59/2)
(optsmt upper bound: 115/4)
(optsmt upper bound: 57/2)
(optsmt upper bound: 113/4)
(optsmt upper bound: 253/9)
(optsmt upper bound: 251/9)
(optsmt upper bound: 250/9)
(optsmt upper bound: 221/8)
(optsmt upper bound: 55/2)
(optsmt upper bound: 192/7)
(optsmt upper bound: 191/7)
(optsmt upper bound: 109/4)
(optsmt upper bound: 217/8)
(optsmt upper bound: 27)
sat
(objectives
(objective 27)
)
This is only useful when the optimization algorithm being used advances the search starting from the satisfiable region. Some optimization engines, e.g. MaxRes, approach the optimal solution starting from the unsatisfiable (i.e. more-than-optimal) region, and therefore do not provide any partial solution along the way (however, they may be considerably faster on some instances).
Here's the equation:
F(n) = 2log(3n + n^2)
Upper Bound:
Without the log, I understand the upper bound on this would be O(n^2), but with the log will the upper bound be O(log n^2)? Or is the log negated?
Lower Bound:
If we assume that this is only run once, then shouldn't this be lower bounded by O(1)?
log(n^2) = 2*log(n). That means O(log n^2) = O(log n).
First of all, a lower bound is written with Ω, not O.
Also, Ω(1) is a lower bound, but it's not a tight one, since for n >= 3:
2log(3n + n^2) > log(n) = Ω(log(n))
and for the upper bound:
2log(3n + n^2) < 2 * log(n^3) = 6log(n) = O(log(n))
And since F(n) = O(log(n)) and F(n) = Ω(log(n))
it means it's a tight bound and it's marked as: Θ(log(n))
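A quick numeric sanity check of those bounds (my own addition, not part of the proof):

import math

# Check that log(n) <= 2*log(3n + n^2) <= 6*log(n) for a few values of n >= 3.
for n in [3, 10, 100, 10**6]:
    f = 2 * math.log(3*n + n*n)
    print(n, math.log(n) <= f <= 6 * math.log(n), round(f / math.log(n), 2))
# All checks print True; the ratio in the last column settles near 4,
# consistent with F(n) = Θ(log(n)).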
I've come across a sentence in CLRS (Introduction to Algorithms) which states:
"Distinguishing asymptotic Upper Bounds from asymptotically tight bounds is standard in the algorithms literature"
While I understand the essence of what the text wants to convey, it would be easier to understand with an example illustrating the difference.
O-notation gives us an asymptotic upper bound.
Consider a function f(x),
We can define a function g(x), such that f(x) = O(g(x)).
Here g(x) is an asymptotic upper bound of f(x), meaning that beyond some point, and up to a constant factor, f(x) grows at the same rate as or slower than g(x) as x increases.
Another thing to notice is that if h(x) is an asymptotic upper bound of g(x), then h(x) is also an asymptotic upper bound of f(x). After all, if f(x) can only grow at an equal or smaller rate than g(x), it is bound to grow at an equal or smaller rate than h(x), since g(x) cannot grow any faster than h(x).
E.g. if f(x) = 10x + 2,
g(x) = 12x + 1 and h(x) = 2x^2.
We can safely say that f(x) = O(g(x)), g(x) = O(h(x)) and f(x) = O(h(x)).
Here g(x) is said to be an asymptotically tight upper bound of f(x), and h(x) is simply an asymptotic upper bound of f(x).
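As a quick check of that example (my own addition), with constant factor 1 these orderings hold once x is large enough:

f = lambda x: 10*x + 2
g = lambda x: 12*x + 1
h = lambda x: 2*x*x

print(all(f(x) <= g(x) for x in range(1, 10000)))   # True: f(x) = O(g(x)) from x >= 1
print(all(g(x) <= h(x) for x in range(7, 10000)))   # True: g(x) = O(h(x)) from x >= 7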
If f(n) = ϴ(n), g(n) = ϴ(n), and h(n) = Ω(n), then how do we evaluate f(n)g(n) + h(n)?
My approach: f(n)g(n) = ϴ(n^2), so the question becomes what Ω(n) + ϴ(n^2) is. I think the lower bound of this expression should be Ω(n) and the upper bound should be O(n^2), but what is the tightest bound for this expression?
For some constants k1, k2, l1, l2 and m > 0, we have:
f(n) is ϴ(n)
=> k1*n < f(n) < k2*n, for n sufficiently large
g(n) is ϴ(n)
=> l1*n < g(n) < l2*n, for n sufficiently large
h(n) is Ω(n)
=> m*n < h(n), for n sufficiently large
Then, for f(n)*g(n):
k1*l1*n^2 < f(n)*g(n) < k2*l2*n^2, for n sufficiently large
So we can just write p(n) = f(n)*g(n) and use constants c1=k1*l1 and c2=k2*l2, and we have:
p(n) (= f(n)*g(n)) is in ϴ(n^2), since
c1*n^2 < p(n) < c2*n^2
Then, finally, what complexity does p(n) + h(n) have? We have:
c1*n^2 + m*n < p(n) + h(n), for n sufficiently large
Since we never got an upper bound on h(n), we can't really say anything regarding the upper bound on p(n) + h(n). This is important: h(n) in Ω(n) only says that h(n) grows at least as fast as n (linear) asymptotically, but we don't know if this is a tight lower bound. It might be a very sloppy lower bound for an exponential-time function.
Consequently, we can only state something regarding the lower bound:
p(n) + h(n) = f(n)*g(n) + h(n) is in Ω(n^2)
I.e., f(n)*g(n) + h(n) grows at least as fast as n^2 asymptotically (i.e., it is in Ω(n^2)).
A note as to your approach: you are right (as shown above) that f(n)g(n) is in ϴ(n^2), but note that this implies that a tight lower bound of f(n)g(n) + h(n) can never be less than k*n^2: i.e., f(n)g(n) + h(n) in Ω(n^2) is a given, and a better (tighter) lower bound than the one you suggested, Ω(n). Recall that the fastest-growing terms dominate asymptotic behavior.
For reference, see e.g.
https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/asymptotic-notation
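To make the missing upper bound concrete, here is a small illustration with example functions of my own choosing:

f = lambda n: n          # ϴ(n)
g = lambda n: 2 * n      # ϴ(n)
h1 = lambda n: n         # Ω(n), and in fact tight
h2 = lambda n: 2 ** n    # also Ω(n), but exponential

# Both h1 and h2 satisfy the premises, yet the sums grow very differently:
for n in [10, 20, 30]:
    print(n, f(n) * g(n) + h1(n), f(n) * g(n) + h2(n))
# With h1 the sum stays around 2n^2 (ϴ(n^2)); with h2 it blows up exponentially.
# The only bound that holds in every case is the lower one, Ω(n^2).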
The question "Is the execution time of this unique string function reduced from the naive O(n^2) approach?" has a lot of interesting discussion, which leads me to wonder: if we put some threshold on the algorithm, would it change the Big-O running time complexity? For example:
void someAlgorithm(int n) {
    if (n < SOME_THRESHOLD) {
        // do O(n^2) algorithm
    }
}
Would it be O(n^2) or would it be O(1)?
This would be O(1), because there's a constant such that no matter how big the input is, your algorithm will finish in a time that is smaller than that constant.
Technically, it is also O(n^2), because there's a constant c such that no matter how big your input is, your algorithm will finish within c * n^2 time units. Since big-O gives you an upper bound, everything that is O(1) is also O(n^2).
If SOME_THRESHOLD is constant, then you've hard-coded a constant upper bound on the growth of the function (and f(x) = O(g(x)) gives an upper bound of g(x) on the growth of f(x)).
By convention, O(k) for some constant k is just O(1) because we don't care about constant factors.
Note that the lower bound is unknown, at least theoretically, because we don't know anything about the lower bound of the O(n^2) function. We know that for f(x) = Omega(h(x)), h(x) can be at most constant, because f(x) = O(1). Sub-constant-time functions are possible in theory, although in practice h(x) = 1, so f(x) = Omega(1).
What all this means is by forcing a constant upper bound on the function, the function now has a tight bound: f(x) = Theta(1).
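A small sketch (Python, with hypothetical names) that makes the constant bound tangible by counting basic operations:

SOME_THRESHOLD = 1000

def some_algorithm(n):
    ops = 0
    if n < SOME_THRESHOLD:
        # O(n^2) work, but only ever reached for n < SOME_THRESHOLD
        for i in range(n):
            for j in range(n):
                ops += 1
    return ops

# The operation count can never exceed (SOME_THRESHOLD - 1)^2, a constant,
# no matter how large n gets; hence Theta(1).
print(some_algorithm(10))      # 100
print(some_algorithm(10**9))   # 0: the quadratic branch is skipped entirely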
We use Ө-notation to write the worst-case running time of insertion sort. But I'm not able to relate the properties of Ө-notation to insertion sort, i.e. why Ө-notation is suitable for insertion sort. How does the insertion sort running time f(n) lie between c1*n^2 and c2*n^2 for all n >= n0?
Writing the running time of insertion sort as Ө(n^2) implies that it has upper bound O(n^2) and lower bound Ω(n^2). I'm confused about whether the lower bound of insertion sort is Ω(n^2) or Ω(n).
The use of Ө-notation:
If a function has the same upper bound and lower bound, we can use Ө-notation to describe its time complexity. Both bounds can then be specified with a single notation, which tells us more about the behaviour of the function.
Example: suppose we have the function
f(n) = 4logn + loglogn
Its upper bound and lower bound are O(logn) and Ω(logn) respectively, which are the same, so it is legal to write this function as
f(n) = Ө(logn)
Proof:
Finding upper bound:
f(n) = 4logn + loglogn
For all sufficiently large n >= 2:
4logn <= 4logn
loglogn <= logn
Thus,
f(n) = 4logn + loglogn <= 4logn + logn
<= 5logn
= O(logn) // where c1 can be 5 and n0 = 2
Finding lower bound:
f(n) = 4logn + loglogn
For all sufficiently large n >= 2:
f(n) = 4logn + loglogn >= logn
Thus, f(n) = Ω(logn) // where c2 can be 1 and n0 = 2
So,
f(n) = Ө(logn)
Similarly, in the case of insertion sort:
Suppose the running time of insertion sort is described by a simple function f(n).
In particular, if f(n) = 2n^2 + n + 1, then:
Finding upper bound:
For all sufficiently large n >= 1:
2n^2 <= 2n^2 ------------------- (1)
n <= n^2 -------------------- (2)
1 <= n^2 -------------------- (3)
Adding (1), (2) and (3), we get:
2n^2 + n + 1 <= 2n^2 + n^2 + n^2
that is,
f(n) <= 4n^2
f(n) = O(n^2), where c = 4 and n0 = 1
Finding lower bound:
For all sufficiently large n >= 1:
2n^2 + n + 1 >= 2n^2
that is,
f(n) >= 2n^2
f(n) = Ω(n^2), where c = 2 and n0 = 1
Because the upper bound and lower bound are the same,
f(n) = Ө(n^2)
(For f(n) = 2n^2 + n + 1, a diagram would show f(n) lying between c1*g(n) = 2n^2 and c2*g(n) = 4n^2 for all n >= n0.)
In the worst case, the upper bound and lower bound of insertion sort are O(n^2) and Ω(n^2), therefore in the worst case it is legal to write the running time of insertion sort as Ө(n^2).
In the best case, it would be Ө(n).
The best-case running time of insertion sort is Ө(n) and the worst case is Ө(n^2), to be precise. So the running time of insertion sort taken over all inputs is O(n^2), not Ө(n^2). O(n^2) means that the running time grows no faster than n^2, whereas Ө(n^2) means it grows exactly as fast as n^2.
The worst-case running time, however, is never less than n^2 (up to a constant), so for the worst case we use Ө(n^2) because it is more accurate.
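For anyone who wants to see the best-case/worst-case gap directly, here is a small comparison-counting sketch of insertion sort (my own, not from the answers above):

def insertion_sort(a):
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]   # shift the larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 1000
print(insertion_sort(list(range(n))))         # already sorted: 999 comparisons, about n
print(insertion_sort(list(range(n, 0, -1))))  # reverse sorted: 499500 comparisons, about n^2/2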
Insertion Sort Time "Computational" Complexity: O(n^2), Ω(n)
O(SUM{1..n}) = O(1/2 n(n+1)) = O(1/2 n^2 + 1/2 n) ~ O(n^2)
Ө(SUM{1..(n/2)}) = Ө(1/8 n(n+2)) = Ө(1/8 n^2 + 1/4 n) ~ Ө(n^2)
Here is a paper that shows that Gapped Insertion Sort, an optimal variant of insertion sort, is O(n log n): Gapped Insertion Sort
But if you are looking for a faster sorting algorithm, there's Counting Sort, which has Time: O(3n) in the worst case when k = n (all symbols are unique), and Space: O(n).
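For reference, a minimal counting sort sketch (mine, not from the linked material) for integer keys in range(k), showing the two linear passes behind that running time:

def counting_sort(a, k):
    counts = [0] * k
    for x in a:               # pass 1: count each key, O(n)
        counts[x] += 1
    out = []
    for key in range(k):      # pass 2: emit keys in sorted order, O(n + k)
        out.extend([key] * counts[key])
    return out

print(counting_sort([3, 1, 4, 1, 5, 2, 0], k=6))  # [0, 1, 1, 2, 3, 4, 5]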