Dynamic programming question - algorithm

I am stuck on one of my algorithm homework problems. Can anyone give me a hint to solve it? Here is the question:
Consider a chain-structured computation represented by a weighted graph G = (V, E) where
V = {v1, v2, ..., vn} and E = {(vi, vi+1) such that 1 <= i <= n-1}. We are also given a chain of m identical processors P = {P1, ..., Pm} (i.e., there exists a communication link between Pk and Pk+1 for 1 <= k <= m-1).
The set of vertices V represents computation modules, and the set of edges E represents
communication between the two modules. Each node vi is assigned a weight wi denoting the
execution time of the module on a single processor. Each edge (vi, vi+1) is assigned a weight ci denoting the communication time between the two modules if they are assigned to different processors. If multiple modules are assigned to the same processor, they must be consecutive. Suppose modules va, va+1, ..., vb are assigned to processor Pk. Then the time taken by Pk, denoted Tk, is the time to compute the assigned modules plus the time to communicate with the neighboring processors. Hence, Tk = wa + ... + wb + ca-1 + cb. Note here that ca-1 = 0 if a = 1 and cb = 0 if b = n.
The objective of the problem is to find an assignment of V to P such that max over 1 <= k <= m of Tk
is minimized, where we assume that each processor must take at least one module. (This
assumption can be relaxed by adding m dummy modules with zero weight on computational
and communication time.)
Develop a dynamic programming algorithm to solve this problem in polynomial time (i.e., O(mn)).
I tried to find the minimum execution time for each Pk and then find the max, but I doubt my solution is dynamic programming since there is no recursive formula. Please give me some hints!
Thanks!

I think you might be able to modify the Viterbi algorithm to solve this problem.

Okay, this one decomposes nicely.
Define the function you need to minimise as F(n, k): the minimum achievable bottleneck (maximum processor time) when the first n nodes are assigned to the first k processors.
Then derive the recurrence by choosing i, the number of nodes handled by the first k-1 processors, so that nodes i+1, ..., n land on the kth processor:
F(n, k) = min[i=k-1..n-1]( max( F(i, k-1), w[i+1] + ... + w[n] + c[i] + c[n] ) )
c[0] = 0, c[n] = 0
F(0, 0) = 0
F(n, 0) = inf for n > 0
F(0, k) = inf for k > 0
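As a concrete sketch of this recurrence in Python (the function and variable names are mine, with 0-based lists: w has n entries and c[i] is the weight of the edge between modules i+1 and i+2). This straightforward version is O(m*n^2); getting it down to the required O(mn) needs an extra observation about the monotonicity of the best split point:

```python
def min_bottleneck(w, c, m):
    # w: execution times of the n modules; c: the n-1 edge communication costs
    # m: number of processors, each must take at least one module
    n = len(w)
    INF = float('inf')
    prefix = [0]
    for x in w:
        prefix.append(prefix[-1] + x)

    def comm(a, b):
        # communication paid by the processor holding modules a..b (1-based)
        left = c[a - 2] if a > 1 else 0
        right = c[b - 1] if b < n else 0
        return left + right

    # F[j][k]: min over assignments of modules 1..j to processors 1..k
    # of the maximum processor time
    F = [[INF] * (m + 1) for _ in range(n + 1)]
    F[0][0] = 0
    for k in range(1, m + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                # modules i+1..j go on processor k
                seg = prefix[j] - prefix[i] + comm(i + 1, j)
                F[j][k] = min(F[j][k], max(F[i][k - 1], seg))
    return F[n][m]
```

For example, with w = [1, 2, 3], c = [10, 1] and m = 2, splitting as {v1, v2 | v3} gives processor times 3+1 = 4 and 3+1 = 4, so the bottleneck is 4.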


Choose the best cluster partition based on a cost function

I've a string that I'd like to cluster:
s = 'AAABBCCCCC'
I don't know in advance how many clusters I'll get. All I have is a cost function that can take a clustering and give it a score.
There is also a constraint on the cluster sizes: they must be in a range [a, b].
In my example, for a=3 and b=4, all possible clusterings are:
[
['AAA', 'BBC', 'CCCC'],
['AAA', 'BBCC', 'CCC'],
['AAAB', 'BCC', 'CCC'],
]
The concatenation of the clusters in each clustering must give back the string s.
The cost function is something like this
cost(clustering) = alpha*l + beta*e + gamma*d
where:
l = variance(cluster_lengths)
e = mean(clusters_entropies)
d = 1 - (nb_characters_in_b_that_are_not_in_a / size_of_b), for b the
consecutive cluster of a
alpha, beta, gamma are weights
This cost function gives a low cost (0) for the best case:
Where all clusters have the same size.
Content inside each cluster is the same.
Consecutive clusters don't have the same content.
Theoretically, the solution is to calculate the cost of every possible clustering of the string and choose the cheapest, but that would take too much time.
Is there any clustering algorithm that can find the best clustering according to this cost function in a reasonable time?
A dynamic programming approach should work here.
Imagine, first, that cost(clustering) equals the sum of cost(cluster) over all clusters that constitute the clustering.
Then, a simple DP function is defined as follows:
F[i] = minimal cost of clustering the substring s[0:i]
and calculated in the following way:
for i = 0..length(s)-1:
    for j = a..min(b, i+1):
        last_cluster = s[i-j+1..i]   // the j characters ending at position i
        F[i] = min(F[i], F[i-j] + cost(last_cluster))
Of course, first you have to initialize the values of F to some infinite value (with F[i-j] taken as zero when the last cluster starts at position 0) to correctly apply the min function.
To actually restore the answer, you can store additional values P[i], which would contain the lengths of the last cluster with optimal clustering of string s[0..i].
When you update F[i], you also update P[i].
Then, restoring the answer is little trouble:
current_pos = length(s) - 1
while (current_pos >= 0):
    current_cluster_length = P[current_pos]
    current_cluster = s[(current_pos - current_cluster_length + 1)..current_pos]
    // grab current_cluster to the answer
    current_pos -= current_cluster_length
Note that in this approach you will get the clusters in reverse order, from the last cluster all the way back to the first one.
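The simple additive DP above can be sketched in Python like this (a minimal sketch; the per-cluster cost function is passed in as a parameter, and the names are mine rather than from the question):

```python
def min_cost_clustering(s, a, b, cost):
    # cost(cluster) -> float for a single cluster; the total is assumed additive
    L = len(s)
    INF = float('inf')
    F = [INF] * (L + 1)   # F[i]: min cost of clustering the prefix of length i
    P = [0] * (L + 1)     # P[i]: length of the last cluster in that optimum
    F[0] = 0.0
    for i in range(1, L + 1):
        for j in range(a, min(b, i) + 1):
            cand = F[i - j] + cost(s[i - j:i])
            if cand < F[i]:
                F[i], P[i] = cand, j
    # restore the clusters from the back, then reverse
    clusters, pos = [], L
    while pos > 0:
        clusters.append(s[pos - P[pos]:pos])
        pos -= P[pos]
    return F[L], clusters[::-1]
```

For instance, with s = 'AAABBCCCCC', a=3, b=4 and cost(cluster) = number of distinct characters in it, the optimum cost is 4 (e.g. ['AAA', 'BBCC', 'CCC']).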
Let's now apply this idea to the initial problem.
What we would like is to make cost(clustering) more or less linear, so that we can compute it cluster by cluster instead of computing it for the whole clustering.
The first parameter of our DP function F will be, as before, i, the number of chars in the substring s[0:i] we have found optimal answer to.
The meaning of the F function is, as usual, the minimal cost we can achieve with the given parameters.
The parameter e = mean(clusters_entropies) of the cost function is already linear and can be computed cluster by cluster, so this is not a problem.
The parameter l = variance(cluster_lengths) is a little bit more complex.
The variance of n values is defined as Sum[(x[i] - mean)^2] / n.
mean is expected value, namely mean = Sum[x[i]] / n.
Note also that Sum[x[i]] is the sum of the lengths of all clusters; in our case it is always fixed and equals length(s).
Therefore, mean = length(s) / n.
Okay, we have more or less made our l part of cost function linear except the n parameter. We will add this parameter, namely the number of clusters in the desired clustering, as a parameter to our F function.
We will also have a parameter cur which will mean the number of clusters currently assembled in the given state.
The parameter d of the cost function also requires adding an additional parameter to our DP function F, namely sz, the size of the last cluster in our partition.
Overall, we have come up with a DP function F[i][n][cur][sz] that gives us the minimal cost function of partitioning string s[0:i] into n clusters of which cur are currently constructed with the size of the last cluster equal to sz. Of course, our responsibility is to make sure that a<=sz<=b.
The answer in terms of the minimal cost function will be the minimum among all possible n and a<=sz<=b values of DP function F[length(s)-1][n][n][sz].
Now notice that this time we do not even require the companion P function to store the length of the last cluster as we already included that information as the last sz parameter into our F function.
We will, however, store in P[i][n][cur][sz] the length of the next to last cluster in the optimal clustering with the specified parameters. We will use that value to restore our solution.
Thus, we will be able to restore an answer in the following way, assuming the minimum of F is achieved in the parameters n=n0 and sz=sz0:
current_pos = length(s) - 1
current_n = n0
current_cluster_size = sz0
while (current_n > 0):
current_cluster = s[(current_pos - current_cluster_size + 1)..current_pos]
next_cluster_size = P[current_pos][n0][current_n][current_cluster_size]
current_n--;
current_pos -= current_cluster_size;
current_cluster_size = next_cluster_size
Let's now get to the computation of F.
I will omit the corner cases and range checks, but it will be enough to just initialize F with some infinite values.
// initialize for the case of one constructed cluster
// d = 0; the entropy and length-variance terms of that cluster still apply
for i = a-1..min(b, length(s))-1:
    for n = 1..length(s):
        sz = i + 1
        F[i][n][1][sz] = beta*cluster_entropy(s[0..i])/n + alpha*(sz - length(s)/n)^2/n
        P[i][n][1][sz] = -1  // fake value, as there is no previous cluster
// general case computation
for i = 0..length(s)-1:
    for n = 1..length(s):
        for cur = 2..n:
            for sz = a..b:
                for prev_sz = a..b:
                    cur_cluster = s[i-sz+1..i]
                    prev_cluster = s[i-sz-prev_sz+1..i-sz]
                    F[i][n][cur][sz] = min(F[i][n][cur][sz],
                        F[i-sz][n][cur-1][prev_sz]
                        + gamma*calc_d(prev_cluster, cur_cluster)
                        + beta*cluster_entropy(cur_cluster)/n
                        + alpha*(sz - length(s)/n)^2/n)
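For what it's worth, here is one hedged Python sketch of the full DP (without solution reconstruction). The helpers cluster_entropy and calc_d are my reading of the question's cost components, and the alpha term is divided by n so that the per-cluster length contributions sum to the variance; the outer loop fixes n, so the dictionary key (i, cur, sz) plays the role of F[i][n][cur][sz] in the text:

```python
import math
from collections import Counter

def cluster_entropy(cl):
    # Shannon entropy of the character distribution inside one cluster
    n = len(cl)
    return -sum((c / n) * math.log2(c / n) for c in Counter(cl).values())

def calc_d(prev_cl, cur_cl):
    # d-term from the question: 1 - (chars of cur_cl not in prev_cl)/len(cur_cl)
    new_chars = sum(1 for ch in cur_cl if ch not in prev_cl)
    return 1 - new_chars / len(cur_cl)

def best_cost(s, a, b, alpha, beta, gamma):
    L, INF = len(s), float('inf')
    best = INF
    # try every feasible cluster count n (need n*a <= L <= n*b)
    for n in range((L + b - 1) // b, L // a + 1):
        mean = L / n
        F = {}  # F[(i, cur, sz)] for this fixed n
        for sz in range(a, min(b, L) + 1):
            cl = s[:sz]
            F[(sz - 1, 1, sz)] = (beta * cluster_entropy(cl) / n
                                  + alpha * (sz - mean) ** 2 / n)
        for i in range(L):
            for cur in range(2, n + 1):
                for sz in range(a, b + 1):
                    if i - sz + 1 < 1:
                        continue
                    cl = s[i - sz + 1:i + 1]
                    step = (beta * cluster_entropy(cl) / n
                            + alpha * (sz - mean) ** 2 / n)
                    for psz in range(a, b + 1):
                        prev = F.get((i - sz, cur - 1, psz), INF)
                        if prev == INF:
                            continue
                        pcl = s[i - sz - psz + 1:i - sz + 1]
                        cand = prev + step + gamma * calc_d(pcl, cl)
                        if cand < F.get((i, cur, sz), INF):
                            F[(i, cur, sz)] = cand
        for sz in range(a, b + 1):
            best = min(best, F.get((L - 1, n, sz), INF))
    return best
```

As a sanity check, best_cost('AAAA', 2, 2, 1, 1, 1) forces the clustering ['AA', 'AA']: both entropy and variance terms are zero and the single d-term is 1, so the total is 1.0.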

Dijkstra's algorithm: is my implementation flawed?

In order to train myself both in Python and graph theory, I tried to implement the Dijkstra algo using Python 3, and submitted it against several online judges, to see if it was correct.
It works well in many cases, but not always.
For example, I am stuck with this one: the sample test case works fine, and my own custom test cases pass as well, but when I submit the following solution, the judge keeps telling me "wrong answer", and the expected result is indeed very different from my output.
Notice that the judge tests it against quite a complex graph (10000 nodes with 100000 edges), while all the cases I tried before never exceeded 20 nodes and around 20-40 edges.
Here is my code.
Given an adjacency list al in the following form:
al[n] = [(a1, w1), (a2, w2), ...]
where
n is the node id;
a1, a2, etc. are its adjacent nodes and w1, w2, etc. the respective weights for the given edge;
and supposing that maximum distance never exceeds 1 billion, I implemented Dijkstra's algorithm this way:
import queue

# distance[i]: the shortest path length found from node 1 to node i
distance = [1000000000] * (N+1)
distance[1] = 0  # starting from node 1 with distance 0
pq = queue.PriorityQueue()
pq.put((0, 1))  # same as above
visited = [False] * (N+1)
while not pq.empty():
    n = pq.get()[1]
    if visited[n]:
        continue
    visited[n] = True
    for edge in al[n]:
        if distance[edge[0]] > distance[n] + edge[1]:
            distance[edge[0]] = distance[n] + edge[1]
            pq.put((distance[edge[0]], edge[0]))
Could you please help me understand whether my implementation is flawed, or if I simply ran into some buggy online judge?
Thank you very much.
UPDATE
As requested, I'm providing the snippet I use to populate the adjacency list al for the linked problem.
N, M = input().split()
N, M = int(N), int(M)
al = [[] for n in range(N+1)]
for m in range(M):
    try:
        a, b, w = input().split()
        a, b, w = int(a), int(b), int(w)
        al[a].append((b, w))
        al[b].append((a, w))
    except:
        pass
(Please don't mind the ugly "except: pass", I was using it just for debugging purposes... :P)
Primary problem in interpreting the question:
According to your parsing code, you are treating the input data as an undirected graph, i.e. each edge from A to B also is an edge from B to A.
It seems like this premise is not valid and it should instead be a directed graph, i.e. you have to remove this line:
al[b].append((a, w)) # no back reference!
Previous problem, now already fixed in the code:
Currently, you are using the never-changing weight of the edges in your queue:
pq.put((edge[1], edge[0]))
This way, the nodes always end up at the same position in the queue, no matter at what stage of the algorithm and how far the path to reach that node actually is.
Instead, you should use the new distance to the target node edge[0], i.e. distance[edge[0]] as the priority in the queue:
pq.put((distance[edge[0]], edge[0]))
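Putting both fixes together (directed edges only, and the current tentative distance as the priority), a sketch of the corrected loop using heapq (functionally equivalent to queue.PriorityQueue, just lighter) might look like this; the function wrapper and node numbering from 1 to N mirror the question's setup:

```python
import heapq

def dijkstra(al, N, source=1):
    # al[n] = [(neighbor, weight), ...] for a DIRECTED graph, nodes 1..N
    INF = 10**9
    distance = [INF] * (N + 1)
    distance[source] = 0
    visited = [False] * (N + 1)
    pq = [(0, source)]
    while pq:
        d, n = heapq.heappop(pq)
        if visited[n]:
            continue
        visited[n] = True
        for nxt, w in al[n]:
            # relax the edge using the distance to n, not the raw edge weight
            if distance[nxt] > d + w:
                distance[nxt] = d + w
                heapq.heappush(pq, (distance[nxt], nxt))
    return distance
```

On the directed graph 1->2 (weight 1), 1->3 (weight 4), 2->3 (weight 1), this returns distance 2 to node 3, going through node 2.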

In a reduction with additive but nonlinear blow up can I still get APX-Hardness?

I'm not entirely sure how to phrase this, so first I'll give an example and then, by analogy with the example, try to state my question.
A standard example of an L-reduction shows that bounded-degree independent set (for concreteness, instantiated as G = (V, E) with B $\ge$ 3 the bound on the degree) is APX-complete via an L-reduction to MAX 2-SAT. It works by creating a clause for each edge and a clause for each vertex, the idea being that in our simulation by MAX 2-SAT every edge clause will be satisfied, and its optimum is:
OPT = |E| + |Maximum Ind. Set|
Since the degree is bounded, |Max Ind. Set| is $\Theta(|E|)$, and we get an L-reduction.
Now my question is: suppose I have two problems, A, which is APX-complete, and B, which is my target problem. Let the optimum of A be $\Theta(m)$ and my solution in B satisfy
OPT_B = p(m) + OPT_A
where p is some polynomial with deg(p) > 1. I no longer have an L-reduction; do I still get anything? Can it be a PTAS reduction? I hope the question is clear, thanks.

Numerical optimization and minimization

Let's say I have a reference node R and several test nodes T1, T2, ..., Tn.
Any particular node has a set of properties Rp1, Rp2, ... Rpn and T1p1, T1p2, T1p3, ... T1pn, and T2p1, T2p2, T2p3, ... T2pn, and so on. So, any node can have n properties, each of a particular type.
I have my own method of defining the distance between any two properties of the same kind between any two nodes. Furthermore, I would weigh the distances between properties and then sum them up. Thus, the distance between R and T1 would be:
dRT1 = w1*dRT1p1 + w2*dRT1p2 + w3*dRT1p3 + w4*dRT1p4 + ... wn*dRT1pn.
Now, given the reference node R and the test nodes T1, T2, ..., Tn, suppose that the distance is least between R and a particular node Tm (1 <= m <= n). If the weights are variables and the per-property distances are constants, how do I calculate the weights such that dRTm is minimal among the distances between R and every test node?
We have the distances dRT1, dRT2, dRT3, dRT4, ... dRTn and we know that dRTm is minimum. What algorithm should we use to determine the weights?
It seems what you want to do is to set the weights so that a particular distance (dRTm) gets a lower numerical value than any other distance, i.e. set the weights so that the inequalities
dRTm <= dRT1
...
dRTm <= dRTn
are all fulfilled. Setting all the weights to zero as mentioned in one of the comments is a trivial solution because all distances will be identically zero and all the inequalities trivially fulfilled (with equality replacing inequality) so it makes more sense to consider the stronger problem
dRTm < dRT1
...
dRTm < dRT(m-1)
dRTm < dRT(m+1)
...
dRTm < dRTn
In any case, this is a simple linear programming problem. Put in the above constraints and then minimize
min : dRTm
It is linear because the calculated individual distances are constants at the time of solving for the minimal dRTm. You can solve this with any linear programming package, or, if you want to cook up a slow but easy-to-implement solution yourself, with e.g. Fourier-Motzkin elimination. Simplex would be the actual method of choice, however.
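As an illustration (not part of the original answer), the LP could be set up with scipy.optimize.linprog roughly like this. The distance matrix d, the epsilon, and the normalization constraint sum(w) = 1 (which rules out the trivial all-zero solution) are my own assumptions:

```python
import numpy as np
from scipy.optimize import linprog

# d[j][k]: fixed per-property distance between R and test node T(j+1)
# (hypothetical numbers for two nodes and two properties)
d = np.array([[1.0, 3.0],   # distances to T1
              [2.0, 1.0]])  # distances to T2
m = 1        # we want T2 (index 1) to be the closest node
eps = 0.01   # strict-inequality margin

# inequality constraints: d[m].w + eps <= d[j].w for every j != m
A_ub = np.array([d[m] - d[j] for j in range(len(d)) if j != m])
b_ub = np.full(len(A_ub), -eps)
# normalization: the weights sum to 1 (avoids the all-zero solution)
A_eq = np.ones((1, d.shape[1]))
b_eq = [1.0]

# objective: minimize dRTm itself, as suggested in the answer
res = linprog(c=d[m], A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * d.shape[1])
```

With these numbers the solver puts all the weight on the second property (w = [0, 1]), since T2 beats T1 there, giving dRTm = 1.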
This looks like http://en.wikipedia.org/wiki/Linear_regression: where the article says "Thus the model takes the form", y is dRT1, the x variables are e.g. dRT1p1, and the beta variables they are trying to work out (their parameter vector) correspond to your w1, w2, ...

Combinatorial Optimization - Variation on Knapsack

Here is a real-world combinatorial optimization problem.
We are given a large set of value propositions for a certain product. The value propositions are of different types, but each type is independent and adds equal benefit to the overall product. In building the product, we can include any non-negative integer number of "units" of each type. However, after adding the first unit of a certain type, the marginal benefit of additional units of that type continually decreases. In fact, the marginal benefit of a new unit is the inverse of the number of units of that type after adding the new unit. Our product must have at least one unit of some type, and there is a small correction that we must make to the overall value because of this requirement.
Let T[] be an array representing the number of each type in a certain production run of the product. Then the overall value V is given by (pseudo code):
V = 1
For Each t in T
    V = V * (t + 1)
Next t
V = V - 1 // correction
On the cost side, units of the same type have the same cost, but units of different types each have unique, irrational costs. The number of types is large, but we are given an array of type costs C[] that is sorted from smallest to largest. Let's further assume that the type quantity array T[] is also sorted by cost from smallest to largest. Then the overall cost U is simply the sum of the unit costs:
U = 0
For i = 0, i < NumOfValueTypes
    U = U + T[i] * C[i]
Next i
So far so good. So here is the problem: Given product P with value V and cost U, find the product Q with the cost U' and value V', having the minimal U' such that U' > U, V'/U' > V/U.
The problem you've described is a nonlinear integer programming problem because it contains a product of the integer variables t. Its feasible set is not closed because of the strict inequalities, which can be worked around by using non-strict inequalities and adding a small positive number (epsilon) to the right-hand sides. The problem can then be formulated in AMPL as follows:
set Types;
param Costs{Types}; # C
param GivenProductValue; # V
param GivenProductCost; # U
param Epsilon;
var units{Types} integer >= 0; # T
var productCost = sum {t in Types} units[t] * Costs[t];
minimize cost: productCost;
s.t. greaterCost: productCost >= GivenProductCost + Epsilon;
s.t. greaterValuePerCost:
prod {t in Types} (units[t] + 1) - 1 >=
productCost * GivenProductValue / GivenProductCost + Epsilon;
This problem can be solved using a nonlinear integer programming solver such as Couenne.
Honestly, I don't think there is an easy way to solve this. The best approach is to write down the system and hand it to a solver (the Excel solver will do the trick, but you can also use AMPL to solve this nonlinear program).
The program:
Define: U; V; C = [c1, ..., cn]
Variables: T = [t1, t2, ..., tn]
Objective function: minimize SUM(ti * ci)
Constraints:
    ti integer, for all i
    SUM(ti * ci) > U
    (PROD(ti + 1) - 1) * U > V * SUM(ti * ci)
It works well with Excel (you just replace > U by >= U + d, where d is the smallest significant increment of the costs, i.e. if C = [1.1, 1.8, 3.0, 9.3] then d = 0.1), since Excel doesn't allow strict inequalities in the solver.
I guess with a real solver like AMPL it will work perfectly.
Hope it helps.
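To make the problem statement concrete, here is a small brute-force sketch in Python (my own illustration, feasible only for tiny instances): it enumerates unit counts up to an arbitrary cap and picks the cheapest Q with U' > U and V'/U' > V/U:

```python
from itertools import product as iproduct

def value(T):
    # V = PROD(t + 1) - 1, per the pseudocode in the question
    v = 1
    for t in T:
        v *= t + 1
    return v - 1

def cost(T, C):
    # U = SUM(t_i * c_i)
    return sum(t * c for t, c in zip(T, C))

def better_product(C, U, V, max_units=5):
    # brute-force search over all unit vectors with entries in 0..max_units;
    # returns (U', V', T') with minimal U' such that U' > U and V'/U' > V/U
    best = None
    for T in iproduct(range(max_units + 1), repeat=len(C)):
        u, v = cost(T, C), value(T)
        if u > U and v / u > V / U:
            if best is None or u < best[0]:
                best = (u, v, T)
    return best
```

For example, with C = [1.0, 2.0] and a given product T = [1, 0] (so U = 1, V = 1), the cheapest improvement is T' = (2, 1): U' = 4 and V' = (3)(2) - 1 = 5, giving a ratio of 1.25 > 1.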
