Writing a proof for an algorithm [closed]

I am trying to compare 2 algorithms. I thought I might try and write a proof for them. (My math sucks, hence the question.)
Normally in our math lessons last year we would be given a question like:
Prove: sum from r = 1 to n of (2r + 3) = n(n + 4)
Then I would do the needed 4 stages and get the answer at the end.
Where I am stuck is proving Prim's and Kruskal's - how can I get these algorithms into a form like the mathematical one above, so I can proceed to prove?
Note: I am not asking people to answer it for me - just help me get it into a form where I can have a go myself.

To prove the correctness of an algorithm, you typically have to show (a) that it terminates and (b) that its output satisfies the specification of what you're trying to do. These two proofs will be rather different from the algebraic proofs you mention in your question. The key concept you need is mathematical induction. (It's recursion for proofs.)
Let's take quicksort as an example.
To prove that quicksort always terminates, you would first show that it terminates for input of length 1. (This is trivially true.) Then show that if it terminates for input of length up to n, then it will terminate for input of length n+1. Thanks to induction, this is sufficient to prove that the algorithm terminates for all input.
To prove that quicksort is correct, you must convert the specification of comparison sorting to precise mathematical language. We want the output to be a permutation of the input such that if i ≤ j then a_i ≤ a_j. Proving that the output of quicksort is a permutation of the input is easy, since it starts with the input and just swaps elements. Proving the second property is a little trickier, but again you can use induction.
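A minimal sketch (Python; the list-building variant and names are my own, not the in-place quicksort the answer describes) with comments marking where the two induction arguments attach:

import random

def quicksort(a):
    # Base case of both induction proofs: length 0 or 1 terminates
    # immediately and is trivially sorted.
    if len(a) <= 1:
        return a
    pivot = a[0]
    # Each recursive call gets a strictly shorter list, so if quicksort
    # terminates for all inputs of length <= n, it terminates for length n+1.
    left = [x for x in a[1:] if x <= pivot]
    right = [x for x in a[1:] if x > pivot]
    # Correctness: left + [pivot] + right uses every element of a exactly once,
    # so the result is a permutation of the input; by the induction hypothesis
    # the two recursive results are sorted, so i <= j implies result[i] <= result[j].
    return quicksort(left) + [pivot] + quicksort(right)

data = [random.randint(0, 100) for _ in range(20)]
assert quicksort(data) == sorted(data)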

You don't give many details, but there is a community of mathematicians (Mathematical Knowledge Management, MKM) who have developed tools to support computer proofs of mathematics. See, for example:
http://imps.mcmaster.ca/
and the latest conference
http://www.orcca.on.ca/conferences/cicm09/mkm09/

Where I am stuck is proving Prim's and Kruskal's - how can I get these algorithms into a form like the mathematical one above so I can proceed to prove?
I don't think you can directly. Instead, prove that both generate a minimum spanning tree (MST), then prove that any two MSTs of the same graph have the same total weight (they need not be identical, since some graphs have more than one MST). If both algorithms generate MSTs which are shown to be equivalent in this sense, then the algorithms are equivalent.
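A minimal sketch (Python; the edge representation and function names are my own) of that idea on a small connected graph: both algorithms should produce spanning trees of equal total weight. This is only a spot check, not a proof.

import heapq

def kruskal_weight(n, edges):
    # edges: list of (weight, u, v); vertices are 0..n-1
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    total = 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:          # take the edge only if it joins two components
            parent[ru] = rv
            total += w
    return total

def prim_weight(n, edges):
    # assumes the graph is connected; grows the tree from vertex 0
    adj = [[] for _ in range(n)]
    for w, u, v in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    seen = [False] * n
    heap = [(0, 0)]
    total = 0
    while heap:
        w, u = heapq.heappop(heap)
        if seen[u]:
            continue
        seen[u] = True
        total += w
        for wv, v in adj[u]:
            if not seen[v]:
                heapq.heappush(heap, (wv, v))
    return total

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
assert kruskal_weight(4, edges) == prim_weight(4, edges) == 7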

From my maths classes at Uni I (vaguely) remember proving Prim's and Kruskal's algorithms - and you don't attack it by rewriting them in a mathematical form. Instead, you take proven theorems about graphs and combine them, e.g. http://en.wikipedia.org/wiki/Prim%27s_algorithm#Proof_of_correctness, to build the proof.
If you're looking to prove the complexity, then simply by the working of the algorithm it's O(V^2) with an adjacency matrix. For the special case where the graph is sparse, there are optimisations (a binary heap with adjacency lists) which reduce this to O(E log V).

Most of the time the proof depends on the problem you have at hand. A simple argument can suffice at times; at other times you might need a rigorous proof. I once used a corollary and the proof of an already-proven theorem to justify that my algorithm was right, but that was for a college project.

Maybe you want to try out a semi-automatic proof method. Just to go for something different ;) For example, if you have a Java specification of Prim's and Kruskal's algorithms, optimally building upon the same graph model, you can use the KeY Prover to prove the equivalence of the algorithms.
The crucial part is to formalize your proof obligation in Dynamic Logic (this is an extension of first-order logic with types and means of symbolic execution of Java programs). The formula to prove could match the following (sketchy) pattern:
\forall Graph g. \exists Tree t.
(<{KRUSKAL_CODE_HERE}>resultVar1=t) <-> (<{PRIM_CODE_HERE}>resultVar2=t)
This expresses that for all graphs, both algorithms terminate and the result is the same tree.
If you're lucky and your formula (and algorithm implementations) are right, then KeY can prove it automatically for you. If not, you might need to instantiate some quantified variables which makes it necessary to inspect the previous proof tree.
After having proven the thing with KeY, you can either be happy about having learned something or try to reconstruct a manual proof from the KeY proof - this can be a tedious task since KeY knows a lot of rules specific to Java which are not easy to comprehend. However, maybe you can do something like extracting an Herbrand disjunction from the terms that KeY used to instantiate existential quantifiers at the right-hand side of sequents in the proof.
Well, I think that KeY is an interesting tool and more people should get used to prove critical Java code using tools like that ;)

Related

General approach to proof of correctness for wait-free algorithms

I have used universal construction to design an algorithm for wait-free binary search trees. I have got linearization points for each of the methods, but I'm not sure how to formally prove the correctness of this algorithm.
On searching for similar papers, I found that they prove that the algorithm is wait-free and only generates linearizable executions. Is this condition necessary or sufficient?
Are there any other formal methods to prove correctness for wait-free algorithms?
To formally prove something, you need the following:
An exact definition of the thing you're going to prove. In your case, that might be some kind of predicates that always hold. I'll give an example (a small runnable sketch of these checks follows the list):
For a BlockingQueue capped at size N, the number of elements in the queue should always be in [0, N].
If the BlockingQueue is empty, the number of times an item was added to it equals the number of times one was removed from it.
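A minimal sketch (Python; a single-threaded toy model of my own, not a real concurrent BlockingQueue) that states those two predicates as runtime checks:

N = 3  # capacity

class BoundedQueue:
    def __init__(self):
        self.items = []
        self.added = 0
        self.removed = 0

    def check_invariants(self):
        # Invariant 1: the number of elements is always in [0, N].
        assert 0 <= len(self.items) <= N
        # Invariant 2: when the queue is empty, adds equal removes.
        if not self.items:
            assert self.added == self.removed

    def put(self, x):
        if len(self.items) == N:
            raise RuntimeError("full")   # a real BlockingQueue would block here
        self.items.append(x)
        self.added += 1
        self.check_invariants()

    def take(self):
        if not self.items:
            raise RuntimeError("empty")  # a real BlockingQueue would block here
        x = self.items.pop(0)
        self.removed += 1
        self.check_invariants()
        return x

q = BoundedQueue()
q.put(1); q.put(2)
q.take(); q.take()   # queue is empty again: added == removed == 2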
Okay, we have the definition and/or exact description. Now, there are several ways to prove something about it:
The proof-by-contradiction method: assume the statement is false and end up with some impossible conclusion. For example, say we want to prove the statement "integer numbers can be negative". Assume the opposite:
All integer numbers are positive.
That leads us to the fact that -1 is a positive number, which is false, so the assumption is incorrect and the original statement is proved.
Prove that the statement is correct for all possible cases: by checking each of them (for finite sets), by using mathematical induction (for countable sets), or by using more sophisticated logical arguments. A more concrete example depends on the theorem we're going to prove.
I would also add that, based on my experience, the proof-by-contradiction method described above is used very often and is suitable in many cases, so you should probably consider it first.

What is the "cut-and-paste" proof technique?

I've seen references to cut-and-paste proofs in certain texts on algorithms analysis and design. It is often mentioned within the context of Dynamic Programming when proving optimal substructure for an optimization problem (see Chapter 15.3 of CLRS). It also shows up in graph manipulation.
What is the main idea of such proofs? How do I go about using them to prove the correctness of an algorithm or the convenience of a particular approach?
The term "cut and paste" shows up in algorithms sometimes when doing dynamic programming (and other things too, but that is where I first saw it). The idea is that in order to use dynamic programming, the problem you are trying to solve probably has some kind of underlying redundancy. You use a table or similar technique to avoid solving the same optimization problems over and over again. Of course, before you start trying to use dynamic programming, it would be nice to prove that the problem has this redundancy in it, otherwise you won't gain anything by using a table. This is often called the "optimal subproblem" property (e.g., in CLRS).
The "cut and paste" technique is a way to prove that a problem has this property. In particular, you want to show that when you come up with an optimal solution to a problem, you have necessarily used optimal solutions to the constituent subproblems. The proof is by contradiction. Suppose you came up with an optimal solution to a problem by using suboptimal solutions to subproblems. Then, if you were to replace ("cut") those suboptimal subproblem solutions with optimal subproblem solutions (by "pasting" them in), you would improve your optimal solution. But, since your solution was optimal by assumption, you have a contradiction. There are some other steps involved in such a proof, but that is the "cut and paste" part.
The 'cut-and-paste' technique can be used to prove both the correctness of greedy algorithms (both the optimal-substructure and greedy-choice properties) and the correctness of dynamic programming algorithms.
Greedy Correctness
These lecture notes on the correctness of MST from the MIT 2005 undergraduate algorithms class exhibit the 'cut-and-paste' technique to prove both optimal substructure and the greedy-choice property.
These lecture notes on the correctness of MST from MIT 6.046J / 18.410J (spring 2015) use the 'cut-and-paste' technique to prove the greedy-choice property.
Dynamic Programming Correctness
A more authoritative explanation of 'cut-and-paste' is given in CLRS (3rd edition), Chapter 15.3 "Elements of dynamic programming", page 379:
"4. You show that the solutions to the subproblems used within the optimal solution to the problem must themselves be optimal by using a “cut-and-paste” technique. You do so by supposing that each of the subproblem solutions is not optimal and then deriving a contradiction. In particular, by “cutting out” the nonoptimal subproblem solution and “pasting in” the optimal one, you show that you can get a better solution to the original problem, thus contradicting your supposition that you already had an optimal solution. If there is more than one subproblem, they are typically so similar that the cut-and-paste argument for one can be modified for the others with little effort."
Cut-and-paste is a way of proving statements in graph theory. The idea is this: assume you have a solution to problem A, and you want to show that some edge/node must be part of any solution. Assume you have a solution without that specified edge/node; then reconstruct a solution by cutting out an edge/node and pasting in the specified one, and show that the new solution is at least as good as the previous one.
One of the most important examples is proving properties of MSTs (proving that the greedy choice is good enough); see the presentation on MSTs based on the CLRS book.
Proof by contradiction
P is assumed to be false, that is !P is true.
It is shown that !P implies two mutually contradictory assertions, Q and !Q.
Since Q and !Q cannot both be true, the assumption that P is false must be wrong, and P must be true.
It is not a new proof technique per se. It is just a fun way to articulate a proof by contradiction.
An example of cut-and-paste:
Suppose you are solving the shortest path problem in graphs with vertices x_1, x_2... x_n. Suppose you find a shortest path from x_1 to x_n and it goes through x_i and x_j (in that order). Then, clearly, the sub path from x_i to x_j must be a shortest path between x_i and x_j as well. Why? Because Math.
Proof using Cut and Paste:
Suppose there exists a strictly shorter path from x_i to x_j. "Cut" that path, and "Paste" into the overall path from x_1 to x_n. Then, you will have another path from x_1 to x_n that is (strictly) shorter than the previously known path, and contradicts the "shortest" nature of that path.
Plain Old Proof by Contradiction:
Suppose P: The presented path from x_1 to x_n is a shortest path.
Q: The subpath from x_i to x_j is a shortest path.
Then, not Q => not P (using the argument presented above).
Hence, P => Q.
So, "Cut" and "Paste" makes it a bit more visual and easy to explain.

Interpreting results of algorithm experiment?

I have a quick sort algorithm and a counter that I increment every time a compare or swap is performed. Here are my results for random integer arrays of different sizes -
Array size --- number of operations
10000 --- 238393
20000 --- 511260
40000 --- 1120512
80000 --- 2370145
Edit:
I have removed the incorrect question I was asking in this post. What I am actually asking is: do these results stack up with the theoretical complexity of quicksort, O(N log N)? Basically, how do I interpret those results so I can determine the Big-O complexity of quicksort?
By definition, it is impossible to determine the asymptotic complexity of algorithms by considering their behavior for any (finite) set of inputs and extrapolating.
If you want to try anyway, what you should do is what you do in any science: look at the data, come up with a hypothesis (e.g., "these data are approximated by the curve ...") and then try to disprove it (by checking more numbers, for instance). If you can't disprove the hypothesis through further experiments aimed at disproving it, then it can stand. You'll never really know whether you've got it right using this method, but then again, that's true of all empirical science.
As others have pointed out, the preferred (this is an understatement; universally accepted and sole acceptable may be a better phrasing) method of determining the asymptotic bounds of an algorithm is, well, to analyze it mathematically, and produce a proof that it obeys the bound.
EDIT:
This is ignoring the intricacies involved in fitting curves to data, as well as the fact that designing an effective experiment is hard to do. I assume you know how to fit curves (it would be no different here than in any other data analysis; you just need to know what you're looking for and how to look) and that you have designed your experiment in such a way that (a) you can answer the questions you want to answer and (b) the answers you get will have some kind of validity. These are separate issues that require literally years of formal education and training to properly use and understand.
Though you cannot get the asymptotic bound of your method by experimenting alone, sometimes you can evaluate its behavior by plotting functions with complexities similar to your measurements and comparing the shapes.
You can do this by drawing graphs of some functions y = f(n) such that f(10000) ~= g(10000) [where g is your measured function], and checking how the behaviors diverge.
In your example, plotting the measured counts against such reference curves, we can clearly see that:
The behavior of your results is sub-quadratic.
The behavior is above linear.
It is very close to n log n behavior, but just a bit "higher".
From this, we can deduce that your algorithm is probably O(n^2) [not strict! remember, big O is an upper bound, not a tight bound], and could also be O(n log n), if we decide the difference from the n log n curve is noise.
Notes:
This method proves nothing about the algorithm, and in particular doesn't give you any worst-case [or even average-case] bound.
This method is usually used to compare two algorithms, rather than to match against some predefined functions, to check which is better for which inputs.
EDIT:
I drew all the graphs as y1(x) = f(x), y2(x) = g(x), ... because I found it easier to explain this way, but usually when you compare two algorithms [as you often actually do with this method], the function is y(x) = f(x) / g(x), and you check whether y(x) stays close to 1, grows, or shrinks.
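A minimal sketch (Python) of that ratio check against the numbers posted above:

import math

data = {10000: 238393, 20000: 511260, 40000: 1120512, 80000: 2370145}

for n, ops in sorted(data.items()):
    # If ops ~ c * n log n, the first ratio stays roughly constant;
    # if ops ~ c * n^2, the second ratio would stay constant instead.
    print(n, ops / (n * math.log2(n)), ops / (n * n))

# For these numbers the n log n ratio sits near 1.8 throughout, while the
# n^2 ratio roughly halves each time n doubles, which is consistent with
# the expected O(n log n) behaviour of quicksort on random input.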

Recursion Tree, Solving Recurrence Equations

As far as I know, there are 4 ways to solve recurrence equations:
1- Recursion trees
2- Substitution
3 - Iteration
4 - Derivative
We are asked to use substitution, for which we will need to guess a formula for the output. I read in the CLRS book that there is no magic way to do this; I was curious whether there are any heuristics for it.
I can certainly get an idea by drawing a recurrence tree or using iteration, but because the output will be in Big-O or Theta form, the formulas don't necessarily match.
Does any one have any recommendation for solving recurrence equations using substitution?
Please note that the list of possible ways to solve recurrence equations is definitely not complete; it's merely the set of tools they teach computer scientists, because it will most likely solve most of your problems.
For exact solutions of recurrence equations mathematicians use a tool called generating functions. Generating functions give you exact solutions, and in general are more powerful than the master theorem.
There is a great resource online to learn about them here: http://www.math.upenn.edu/~wilf/DownldGF.html
If you go through the first couple examples you should get the hang of it in no time.
You need some math background and an understanding of rudimentary Taylor series: http://en.wikipedia.org/wiki/Taylor_series
Generating functions are also extremely useful in probability.
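As a small worked example of the generating-function approach (my own choice of recurrence: the Tower of Hanoi count a_n = 2a_{n-1} + 1 with a_0 = 0), in LaTeX:

\begin{aligned}
A(x) &= \sum_{n \ge 0} a_n x^n, \qquad a_n = 2a_{n-1} + 1,\ a_0 = 0, \\
A(x) &= 2x\,A(x) + \frac{x}{1-x}
 \;\Longrightarrow\; A(x) = \frac{x}{(1-x)(1-2x)} = \frac{1}{1-2x} - \frac{1}{1-x}, \\
a_n  &= 2^n - 1 \quad \text{(exact, by reading off coefficients).}
\end{aligned}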
For simple ones, just take a "reasonable" guess.
For more complicated ones, I would go ahead and use a recurrence tree — it seems to me to be the easiest "algorithm" for generating a guess. Note that it can be difficult to use a recurrence tree to prove a bound (the details are tough to get right). Recurrence trees are highly useful for forming guesses which are then proven by substitution.
I'm not sure why you're saying the formulas won't match the output in Big-O or Theta. They typically don't match exactly, but that's part of the point of Big-O. Part of the trick of going back to substitution is knowing how to plug in the Big-O solution to make the substitution algebra work out. IIRC, CLRS does work out an example or two of this.
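As a small worked example of that guess-and-substitute step (the standard T(n) = 2T(n/2) + n recurrence, as covered in CLRS): guess T(n) <= c n log n and push the guess through the recurrence, in LaTeX:

\begin{aligned}
T(n) &= 2\,T(n/2) + n
 \le 2c\,\tfrac{n}{2}\log\tfrac{n}{2} + n \\
 &= c\,n\log n - c\,n + n
 \le c\,n\log n \quad \text{for } c \ge 1,
\end{aligned}

which, together with a base case for small n, confirms the recursion-tree guess T(n) = O(n log n).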

formulation of general dynamic programming problem

I wonder if the objective function of a general dynamic programming problem can always be formulated as in the dynamic programming article on wiki, where the objective function is a sum of terms for the action and state at every stage? Or is that just a special case, and what is the general formulation?
EDIT:
By "dynamic programming problem", I mean a problem that can be solved by dynamic programming technique. Such kind of problems possess the property of optimal problem and optimal structure.
But at lease for me it is sometimes not easy to identify such problems, perhaps because I have not become used to that kind of verbal description. When I came across the WIKI page for Bellman equation, I do feel mathematical formulation of the cost function will help somehow. I suspect the overall cost/gain function can always be represented as accumulation of cost/gain from all the stages? and the accumulation can be additive or multiplitive or something else?
When I posted my question, I did realize that it is more proper to discuss dynamic programming in some place more oriented to mathematical optimization. But there are quite a lot of discussion of computer algorithms in Stackoverflow.com. So I did not feel improper to ask my question here either.
That's not how I would characterize an arbitrary optimization problem (or a dynamic programming algorithm). In particular, the discount factor β^t looks like an electrical engineering hack that programmers wouldn't usually want. More subtly, it seems like it won't always be obvious what the function F is for a given problem.
But yes, set β to 1 and any arbitrary objective function can be formulated that way. Generally the objective function may be any function of the initial state and all the actions taken; given such a function, it's easy to define a function F to plug into that formula.
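For concreteness, a hedged rendering of the additive, discounted objective being referred to (notation loosely following the Wikipedia Bellman-equation article: state x_t, action a_t, per-stage payoff F, transition T, discount factor β), in LaTeX:

V(x_0) = \max_{\{a_t\}} \sum_{t=0}^{\infty} \beta^{t} F(x_t, a_t)
\quad \text{subject to } x_{t+1} = T(x_t, a_t).

Setting β = 1, as suggested above, reduces this to a plain sum of per-stage terms.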
Whether that's a useful thing to do or not depends on the problem, I suppose.
In computer science, dynamic programming denotes building an algorithm by recursively splitting a problem into subproblems, when the same subproblems appear many times in this recursive expansion. As a simple textbook example, Fibonacci numbers can be calculated using dynamic programming.
From the generic recurrence F(n) = F(n-1) + F(n-2) you could implement the following algorithm:
def fibonacci(n):
    if n < 2:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)
Now this is of course not efficient at all, because it creates a huge number of recursive calls, e.g.
F(8) = F(7) + F(6) = [F(6) + F(5)] + [F(5) + F(4)] = ...
So here we already see that fibonacci(5) is computed twice by the implementation. The dynamic programming paradigm is now to "memoize" or "cache" the results, like this:
store = {}  # memoization table mapping n -> F(n)

def memofibo(n):
    if n < 2:
        return 1
    if n in store:
        return store[n]
    f = memofibo(n - 1) + memofibo(n - 2)
    store[n] = f
    return f
This implementation ensures that the recursive step is executed at most once for every argument value of n, so it calculates the nth Fibonacci number in O(n log n) time (assuming a standard O(log n) implementation of the associative array 'store'; with a hash table such as a Python dict it is O(n) expected time).
So from the computer science perspective, the link you provided is the operations research / optimization problem version of the same idea (dividing problem into subproblems), but the idea has been abstracted in practice to this recursion + memoization pattern in the domain of general computer science. I hope this helps to clear some of the clouds.
Folks,
There's a new(ish) website that concentrates on operations research questions here, but the low volume of traffic there may not get you a good answer very quickly.
Soapbox time:
For those who care to debate what's appropriate for Stack Overflow, let us note that an algorithm is an algorithm regardless of who claims it as part of their field. The simplex method, Dijkstra's algorithm, branch and bound, and Lagrangian relaxation are all algorithms or methods for solving certain types of problems. Many of these are taught and applied in both fields, so the border between OR and CS is pretty blurry in this area.
For instance (and a very strong instance it is) the undergrad course in algorithms at MIT includes all of the following - Randomized Competitive Algorithm, Dynamic Programming, Greedy Algorithms, Minimum Spanning Trees, Shortest Paths, Dijkstra's Algorithm, Bellman-Ford, Linear Programming, Depth-first Search, Topological Sort, and All-pairs Shortest Paths among other topics. I'll defer to MIT in this case.
I like stack overflow because many programmers recognize an optimization problem when they encounter it, but often they just need a little help in deciding how to formulate the problem or even what the problem is called by name.
