Eligibility for solving using Dynamic Programming - algorithm

For a problem to be solvable with dynamic programming, must it satisfy both optimal substructure and overlapping subproblems, or does either condition alone make it eligible for DP techniques?
If a problem P1 has optimal substructure but its subproblems aren't overlapping, and a problem P2 has overlapping subproblems but lacks optimal substructure, can I still solve P1 and P2 using DP?

It depends on the problem, but both P1 and P2 seem to be a poor match for dynamic programming:
P1 - you can use DP, but you won't get any performance improvement, because the subproblems don't overlap and you can't reuse their solutions.
P2 - if there is no optimal substructure, then having a solution to a subproblem does not help you find a solution for the larger problem.
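For intuition, here is a minimal sketch (Python, my own example, not from the original posts) of a problem that has both properties, so DP pays off: Fibonacci numbers have optimal substructure (F(n) is built from optimal answers to F(n-1) and F(n-2)) and heavily overlapping subproblems:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Overlapping subproblems: without the cache, fib(n - 2) is recomputed
    # inside fib(n - 1), blowing up to exponential time. With the cache,
    # each subproblem is solved once, so the whole computation is linear.
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```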

Related

Bipartite matchings in multi-graphs

I am trying to find literature for a combinatorial optimization problem, in order to prove the NP-hardness (?) of another problem by reduction. The problem can be defined as a maximum weighted matching (assignment) problem in a k-regular complete balanced weighted bipartite multi-graph with integer weights. I know that it can be reduced to a known problem, but I can't find a solution. I would appreciate it if someone gave me a hint.

Optimal substructure and Greedy choice

I was reading about the two properties of a greedy problem and I'm trying to understand the difference between the two below:
Optimal substructure property: an optimal global solution contains the optimal solutions of all its subproblems.
Greedy choice property: a global optimal solution can be obtained by greedily selecting a locally optimal choice.
Aren't the two equivalent? They seem like the same thing; could you give me an example where optimal substructure is satisfied but the greedy choice property is not? And an example where the greedy choice property is satisfied but optimal substructure is not?
They are not equivalent:
Let's assume that we want to find the minimum-cost vertex cover in a tree where each node has a cost (the cost of a cover is the sum of the costs of all nodes in the cover). Dynamic programming can be used here: f(v, taken) is the minimum cost of covering the subtree rooted at v in such a way that v is in the cover, and f(v, not taken) is the minimum cost of covering this subtree without taking v. The optimal substructure property holds because we can solve the subproblems optimally (that is, find an optimal solution for each subtree) and then combine them to find the global optimal solution. However, the greedy choice property does not hold here: picking the vertex with the smallest cost until all edges are covered does not always yield an optimal result.
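A minimal sketch of that recursion (Python; representing the tree as a dict of child lists is my own choice, not part of the original answer):

```python
def min_cost_vertex_cover(tree, cost, root):
    """tree: dict mapping node -> list of children; cost: dict of node costs.
    Returns the minimum cost of a vertex cover of the rooted tree."""
    def f(v):
        # taken   = min cost covering v's subtree with v in the cover
        # skipped = min cost covering v's subtree with v excluded
        taken, skipped = cost[v], 0
        for c in tree.get(v, []):
            c_taken, c_skipped = f(c)
            taken += min(c_taken, c_skipped)  # v covers edge (v, c) either way
            skipped += c_taken                # edge (v, c) forces c into the cover
        return taken, skipped
    return min(f(root))
```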
It is possible that the greedy choice property holds but the optimal substructure property does not, if it is not possible to define what a subproblem is. For example, the Huffman code construction algorithm always merges the two smallest subtrees (and yields an optimal solution), so it is a greedy algorithm, but it is not clear what a subproblem is, so it doesn't make much sense to talk about the first property at all.
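For reference, a minimal sketch of that greedy merge loop using Python's heapq (the nested-tuple tree encoding and the counter tiebreaker are my own details, not from the answer):

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: dict mapping symbol -> frequency.
    Greedily merges the two lightest subtrees until one tree remains."""
    tie = count()  # tiebreaker so the heap never compares tree nodes
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # the two smallest subtrees...
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))  # ...merged
    return heap[0][2]  # nested tuples encoding the code tree
```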
For future readers who may not be familiar with vertex cover or dynamic programming, the phrasing of those definitions does make them sound similar.
I think a useful way to rephrase the greedy choice property is that some optimal solution will always contain the first choice made by the greedy algorithm, although it doesn't necessarily have to be the first choice within that optimal solution. This is why the two are different: even if a solution is optimal and displays the greedy choice property, you haven't proven that the locally best choice was made at every step. Think of Prim's MST algorithm on a weighted graph: you can start at any vertex, which means two runs of the algorithm may choose different edges at each step, but each run always chooses the lowest-weight edge leaving the tree built so far, so both have the greedy choice property. Yet you haven't proven that the partial solution at each step is absolutely optimal, only that the greediest option was chosen.
That's why they're different: although the greedy choice property can lead to optimal substructure, it doesn't by itself prove that the problem has optimal substructure. Common arguments for proving optimal substructure are the exchange argument and the stay-ahead argument, which build on the knowledge that the algorithm displays the greedy choice property.

Greedy algorithms and optimal substructure

The Wikipedia page says that greedy algorithms are ideal only for problems which have optimal substructure.
Questions:
What is optimal/non-optimal substructure?
What is local and global optimum?
How to prove that a greedy algorithm yields the global optimum?
I have found the answers and am glad to share them:
What is optimal/non-optimal substructure? A problem is said to have optimal substructure if an optimal solution can be constructed efficiently from optimal solutions of its subproblems. This property is used to determine the usefulness of dynamic programming and greedy algorithms for a problem.
What is local and global optimum? A local optimum of an optimization problem is a solution that is optimal (either maximal or minimal) within a neighboring set of candidate solutions.
A global optimum is the optimal solution among all possible solutions, not just those in a particular neighborhood of values.
How to prove that a greedy algorithm yields the global optimum?
Usually, the global optimum can be proven by induction. Typically, a greedy algorithm is used to solve a problem with optimal substructure if it can be proved by induction that the greedy choice is optimal at each step. Otherwise, provided the problem also exhibits overlapping subproblems, dynamic programming is used.
To prove that an optimization problem can be solved using a greedy algorithm, we need to prove that the problem has the following:
Optimal substructure property: an optimal global solution contains the optimal solutions of all its subproblems.
Greedy choice property: a global optimal solution can be obtained by greedily selecting a locally optimal choice.
Matroids can also be used in some cases to mechanically prove that a particular problem can be solved with a greedy approach.
And finally, some good examples of greedy algorithms.
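As one such classic example, here is a minimal sketch of interval scheduling (activity selection), a standard problem where greedily taking the earliest-finishing interval is provably optimal via an exchange argument (the example is mine, not from the original answer):

```python
def max_non_overlapping(intervals):
    """intervals: list of (start, end) pairs.
    Greedy choice: always keep the interval that finishes earliest; an
    exchange argument shows this choice is part of some optimal solution."""
    chosen, last_end = [], float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:  # compatible with everything chosen so far
            chosen.append((start, end))
            last_end = end
    return chosen
```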

max-weight k-clique in a complete k-partite graph

My Problem
Is there an efficient algorithm to find a max-weight (or min-weight) k-clique in a complete k-partite graph (a graph in which vertices are adjacent if and only if they belong to different partite sets, according to Wikipedia)?
More Details about the Terms
Max-weight Clique: Every edge in the graph has a weight. The weight of a clique is the sum of the weights of all edges in the clique. The goal is to find a clique with the maximum weight.
Note that the size of the clique is k, which is the largest possible clique size in a complete k-partite graph.
What I have tried
I encountered this problem during a project. Since I am not a CS person, I am not sure about its complexity, etc.
I have googled several related papers, but none of them deals with the same problem. I have also programmed a greedy algorithm plus simulated annealing to deal with it (the results do not seem good), and I have tried something like dynamic programming (but it does not seem efficient). So I wonder whether the exact optimum can be computed efficiently. Thanks in advance.
EDIT Since my input can be really large (e.g. the number of vertices in each partite set is 2^k), I hope to find a really fast algorithm (e.g. polynomial in k) that works out the optimal result. If that's not possible, can we prove some lower bound on the complexity?
Generalized Maximum Clique Problem (GMCP)
I understand that you are looking for the Generalized Maximum/Minimum Clique Problem (GMCP), where finding the clique with maximum score or minimum cost is the optimization problem.
This problem is NP-hard, as shown in Generalized network design problems, so there is currently no known polynomial-time exact solution to your problem.
Since there is no known polynomial solution to your problem, you have two choices: reduce the problem size to find the exact solution, or relax the problem and settle for an approximation of the optimal solution.
Example and solution for the small problem size
In small k-partite graphs (in our case k is 30 and each partite set has 92 nodes), we were able to get the optimal solution in a reasonable time with a heavy branch-and-bound algorithm. We converted the problem into another NP-hard problem (Mixed Integer Programming), reduced the number of integer variables, and used the IBM CPLEX optimizer to find the optimal solution to GMCP.
You may find our project page and paper useful. I can also share the code with you.
How to estimate the solution
One straightforward way to approximate this NP-hard problem is to relax the Mixed Integer Programming formulation and solve it as a linear programming problem. Of course this only gives an approximation of the solution, but you might still get a reasonable answer in practice.
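To make the relaxation idea concrete, here is a minimal sketch of one possible formulation in Python with PuLP; the variable layout, the `partites`/`w` inputs, and the assumption of nonnegative weights are my own illustrative choices, not the authors' actual CPLEX model:

```python
from itertools import combinations
import pulp  # pip install pulp

def gmcp_lp_relaxation(partites, w):
    """partites: list of lists of vertex ids (one list per partite set).
    w[(u, v)]: weight of the edge between u and v (assumed nonnegative,
    keyed with u from the earlier partite set). Relaxes the natural binary
    program for GMCP to a continuous LP; its value is an upper bound on
    the max-weight k-clique."""
    prob = pulp.LpProblem("gmcp_relaxed", pulp.LpMaximize)
    x = {v: pulp.LpVariable(f"x_{v}", 0, 1)        # x_v ~ "vertex v is picked"
         for part in partites for v in part}
    y = {}                                         # y_uv ~ "edge (u, v) is used"
    for pi, qi in combinations(range(len(partites)), 2):
        for u in partites[pi]:
            for v in partites[qi]:
                y[u, v] = pulp.LpVariable(f"y_{u}_{v}", 0, 1)
    prob += pulp.lpSum(w[e] * y[e] for e in y)     # objective: total edge weight
    for part in partites:                          # exactly one vertex per partite
        prob += pulp.lpSum(x[v] for v in part) == 1
    for (u, v), yv in y.items():                   # an edge counts only if both
        prob += yv <= x[u]                         # endpoints are picked (enough,
        prob += yv <= x[v]                         # since weights are >= 0)
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return pulp.value(prob.objective)
```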
More general problem (Generalized Maximum Multi Clique Problem)
In another work, we solve the Generalized Maximum Multi Clique Problem (GMMCP), where maximizing the score or minimizing the cost of selecting multiple k-cliques in a complete k-partite graph is of interest. You can find the project page by searching for GMMCP Tracking.
The maximum clique problem in a weighted graph is intractable in general. In your case, if the graph contains N nodes, you can enumerate all possible k-cliques in O(N^k) time. If k is fixed (I don't know if it is), your problem is trivially polynomially solvable, since this is a polynomial in N. I don't believe the problem is tractable if k is a free parameter, because I can't see how the assumption of a k-partite graph would make the problem significantly simpler than the general one.
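A sketch of that enumeration (my own code; `partites` holds one vertex list per partite set and `weight(u, v)` is an assumed edge-weight lookup):

```python
from itertools import product, combinations

def max_weight_k_clique(partites, weight):
    """Every k-clique in a complete k-partite graph picks exactly one vertex
    from each partite set, so itertools.product enumerates them all.
    With N vertices per set this is N^k candidates."""
    best_clique, best_score = None, float("-inf")
    for clique in product(*partites):
        score = sum(weight(u, v) for u, v in combinations(clique, 2))
        if score > best_score:
            best_clique, best_score = clique, score
    return best_clique, best_score
```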
How hard your problem is in practice also depends on how the weights are distributed. If all the weights are very near each other, i.e. the difference between "best" and "good" is relatively small, the problem is very hard. If you have wildly different weights on the edges, the problem can be easier, because a greedy algorithm can give you a good initial solution, and you can use that and subsequent good solutions to limit your combinatorial search using the well-known branch-and-bound method.

clique number of a graph

I would like to know a fast algorithm to find only the clique number (without actually finding the clique) of a graph with about 100 vertices.
I am trying to solve the following problem.
http://uva.onlinejudge.org/external/1/193.html
This is NP-complete, and you can't do it much better than actually finding the maximum clique and counting its vertices. From Wikipedia:
Clique problems include:
solving the decision problem of testing whether a graph contains a clique larger than N
These problems are all hard: the clique decision problem is NP-complete (one of Karp's 21 NP-complete problems),
If you can find the clique number in P, then the decision problem is answerable in P (you simply compute the clique number and compare it with N).
Since the decision problem is NP-complete, finding the clique number of a general graph must be NP-hard.
As already stated by others, this is probably really hard.
But like many theoretically hard problems, it can be pretty fast in practice with a good algorithm and suitable data. If you implement something like Bron-Kerbosch for finding cliques while keeping track of the largest clique size you've found so far, you can backtrack out of fruitless search trees.
For example, if you find a clique with 20 nodes in it, and your network has a large number of nodes with degree less than 20, you can immediately rule out those nodes from further consideration. On the other hand, if the degree distribution is more uniform, then this might be as slow as finding all the cliques.
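A minimal sketch of that idea (plain Bron-Kerbosch without pivoting, plus the size-bound pruning described above; representing the graph as a dict of neighbor sets is my own assumption):

```python
def clique_number(adj):
    """adj: dict mapping each vertex to the set of its neighbors.
    Returns the size of the maximum clique."""
    best = [0]  # largest clique size found so far, shared across calls

    def expand(r, p, x):
        if len(r) + len(p) <= best[0]:
            return                             # even taking all of p can't win
        if not p and not x:
            best[0] = max(best[0], len(r))     # r is a maximal clique
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return best[0]
```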
Although the problem is NP-hard, the size of graph you mention is no problem for today's fastest exact maximum clique solvers (for any configuration).
If you are ready to implement the code yourself, then I recommend reading the papers on the MCQ, MCR, and MCS family of algorithms, as well as the BBMC, BBMCL, and BBMCX family. An interesting starting point is the comparison survey by Prosser [Prosser 12]. It includes an explanation of a Java implementation of these algorithms.
