Solving knapsack with fractional knapsack approach - algorithm

There are two well-known knapsack problems:
1) Given n items, each with a weight and a cost, we need to select items that will fit in our knapsack and have maximal total cost. It can be easily solved using dynamic programming.
2) Fractional knapsack: same as the first, but we may take a fraction of an item rather than only whole items. This problem can be easily solved with a greedy algorithm.
Imagine we use the greedy algorithm from the second problem to solve the first one. How can I prove that the solution we get is no more than two times worse than the optimal one?

As far as I can see, the greedy solution can be as inefficient as you want.
Imagine that you have a knapsack with capacity 1 and two (n = 2) items:
item  weight  cost  density
---------------------------
A     ε       ε     1        <- greedy choice
B     1       1-ε   1-ε      <- optimal choice
so the greedy algorithm takes A with cost ε while the optimal solution is to take B with cost 1-ε. The chosen (greedy) solution is
(1-ε)/ε = 1/ε - 1
times worse than the optimal one. Make ε as small as you want (say, ε = 1e-100) and you have an arbitrarily inefficient greedy solution.
Edit: if only integer values are allowed, just scale the example above: you have a knapsack with capacity X and two (n = 2) items
item  weight  cost  density
---------------------------
A     1       1     1        <- greedy choice
B     X       X-1   1-1/X    <- optimal choice
in this case the greedy solution is
(X - 1) / 1 = X - 1
times worse than the optimal one. Finally, make X large enough (say, X = 1e100).
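A minimal sketch in Python of this greedy rule applied to the 0/1 problem, reproducing the first counterexample (the function name is mine):

```python
def greedy_01_knapsack(items, capacity):
    """Greedy-by-density rule from the fractional problem, restricted to whole items.

    items: list of (weight, cost) pairs. Returns the total cost collected."""
    total = 0.0
    # Sort by cost density, highest first, exactly as the fractional greedy does.
    for weight, cost in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight <= capacity:      # 0/1 restriction: take the item whole or not at all
            capacity -= weight
            total += cost
    return total

eps = 1e-6
items = [(eps, eps), (1, 1 - eps)]      # item A and item B from the table above
print(greedy_01_knapsack(items, 1))     # ~1e-06: greedy takes only A
# The optimum is 1 - eps (take B), so the ratio is roughly 1/eps.
```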

Related

Dynamic Programming - "maximize" matrix chain multiplication

I'm now practicing dynamic programming by myself. The classic problem "matrix-chain multiplication" asks for the minimum number of scalar multiplications, via the recurrence
M[i,j] = 0                                                            if i = j
M[i,j] = min over i <= k < j of  M[i,k] + M[k+1,j] + P[i-1]*P[k]*P[j]  otherwise
(matrix i has dimensions P[i-1] x P[i]),
and its time complexity is O(n^3)
But I'm just curious: what if I want to find the maximum (instead of the minimum) number of scalar multiplications? Does an optimal substructure still exist, and is it possible to solve it in polynomial time?
The exact same reasoning as the minimization applies:
If you multiply a1 ... ai, the dimensions of the resulting matrix do not depend on the internal parenthesization.
It follows that if the optimal - that is, most expensive - partition of a1 ... ai ... an is to multiply the matrices from 1 to i and from i + 1 to n, then it is composed of the optimal solutions to a1 ... ai and ai+1 ... an.
Since the optimal substructure remains, you can use the same algorithm as for minimization (of course, changing the criterion for optimality from minimum to maximum).
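A short sketch of that maximization DP in Python (names are mine; p is the usual dimension array where matrix i is p[i-1] x p[i]):

```python
def max_scalar_multiplications(p):
    """Most expensive parenthesization of a matrix chain.

    Same O(n^3) DP as the classic minimization, with min replaced by max."""
    n = len(p) - 1
    M = [[0] * (n + 1) for _ in range(n + 1)]      # M[i][j]: max cost of A_i..A_j
    for length in range(2, n + 1):                 # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            M[i][j] = max(M[i][k] + M[k + 1][j] + p[i - 1] * p[k] * p[j]
                          for k in range(i, j))
    return M[1][n]

# Three matrices: 10x100, 100x5, 5x50.
print(max_scalar_multiplications([10, 100, 5, 50]))   # 75000 (the minimum would be 7500)
```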

How many operations are needed for sorting?

This is a 2016 entrance exam question:
We have n balls with distinct and unknown weights, labeled 1 to n. We are given a two-pan balance and want to use it to weigh the balls in pairs, writing down the results in order to sort all of the balls. In the worst case, how many weighing operations are needed? Choose the best answer.
a) Ceil[ n log2 n ]
b) Floor[ n log2 n ]
c) n − 1
d) Ceil[ log2 n! ]
According to the answer sheet, the correct solution is: Ceil[ log2 n! ]
My question is: how is this solution achieved (how does this algorithm work; is there any pseudocode?)?
If you look at Number of Comparisons in Merge-Sort you will find my answer there arguing that the total number of comparisons for mergesort (which is known to have good asymptotic behavior) is
n ⌈log2 n⌉ − 2^⌈log2 n⌉ + 1
Of course 2^⌈log2 n⌉ ≥ n, so this count is at most n ⌈log2 n⌉ − n + 1, and a short calculation (along the lines of the one below) shows that this never exceeds ⌈n log2 n⌉; so for n ≥ 1 this confirms answer (a) as an upper bound.
Is (b) a tighter upper bound? If you write ⌈log2 n⌉ = log2 n + d for some 0 ≤ d < 1, then you get
n (log2 n + d) − 2^d n + 1 = n (log2 n + d − 2^d) + 1 = (n log2 n) + n (d − 2^d + 1/n)
and if you write m := ⌈log2 n⌉ and n = 2^(m − d), that last parenthesis becomes (d − 2^d + 2^(d − m)).
Plotting this for some values of m shows that for integers m ≥ 1 this will very likely be below zero. You get m = 0 for n = 1, which means d = 0, so the whole parenthesis becomes zero. So when you work out the details of the proof, this will show that the comparison count is at most n log2 n; since the count is an integer, it is at most ⌊n log2 n⌋, so (b) is indeed an upper bound for mergesort.
How about (c)? There is an easy counterexample for n = 3. If you know that ball 1 is lighter than 2 and lighter than 3, this doesn't tell you how to order 2 and 3. You can also show that comparing 1 against both 2 and 3 was not a suboptimal choice of algorithm; due to the symmetry of the problem this is a generic situation. So (c) is not an upper bound. Can it be a lower bound? Sure: even to confirm that the balls are already ordered you have to weigh each consecutive pair, resulting in n − 1 comparisons. Even with the best algorithm you can't do better than guessing the correct order and then confirming your guess.
Is (d) a tighter lower bound? Plots again suggest that it is at least as great as (c), with the exception of a small region with no integer values. So if it is a lower bound, it will be tighter. Now think of a decision tree. Every algorithm to order these n balls can be written as a binary decision tree: you compare the two balls named in a given node, and depending on the result of the comparison you proceed with one of two possible next steps. That decision tree has to have n! leaves, since every permutation has to be a distinct leaf so you know the exact permutation once you have reached a leaf. And a binary tree with n! leaves has to have a depth of at least ⌈log2 n!⌉. So yes, this is a lower bound as well.
Summarizing all of this you have (c) ≤ (d) ≤ x ≤ (b) ≤ (a), where x denotes the number of comparisons an optimal algorithm would need to order all the balls. As a comment by Mark Dickinson pointed out, A036604 on OEIS gives explicit lower bounds for a few small n, and for n = 12 the inequality (d) ≤ x is strict. So (d) does not describe the optimal algorithm exactly either.
By the way (and to answer your “how does this algorithms work”), finding the optimal algorithm for a given n is fairly easy, at least in theory: compute all possible decision trees for those n! sortings, and choose one with minimal depth. Of course this approach becomes impractical fairly quickly.
Now that we know that none of the answers gives the correct count of the optimal sorting algorithm, which answer is “best”? That depends a lot on context. In many applications, knowing an upper bound to the worst time behavior is more valuable than knowing a lower limit, so (b) would be superior to (d). But apparently the person creating the solution sheet had a different opinion, and went for (d), either because it is closer to the optimum (which I assume but have not proven) or because a lower bound is more useful to the application at hand. If you wanted to, you could likely challenge the whole question on the grounds that “best” wasn't adequately defined in the scope of the question.
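For a concrete feel of the chain (c) ≤ (d) ≤ x ≤ (b) ≤ (a), here is a small Python sketch tabulating the four options together with the mergesort comparison count from above for n = 2..12 (it says nothing about the optimal x itself):

```python
from math import ceil, floor, log2, factorial

def mergesort_comparisons(n):
    """Worst-case comparisons of top-down mergesort: n*ceil(lg n) - 2^ceil(lg n) + 1."""
    return n * ceil(log2(n)) - 2 ** ceil(log2(n)) + 1 if n > 1 else 0

print(" n   (c)  (d)  merge  (b)  (a)")
for n in range(2, 13):
    a = ceil(n * log2(n))               # answer (a)
    b = floor(n * log2(n))              # answer (b)
    c = n - 1                           # answer (c)
    d = ceil(log2(factorial(n)))        # answer (d), the decision-tree lower bound
    print(f"{n:2}  {c:4} {d:4}  {mergesort_comparisons(n):5} {b:4} {a:4}")
```

In this range the mergesort count always lies between (d) and (b), consistent with the ordering above.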

Rewrite O(N W) in terms of N

I have a question that asks me to rewrite the complexity of the subset sum problem in terms of N only.
For those unfamiliar: the problem is, given a set of weights, each with cost 1, how would you find the optimal selection that achieves a given maximum weight?
O(NW) is the time and space cost, where the space is for the 2D matrix used in the dynamic-programming solution. This problem is a special case of the knapsack problem.
I'm not sure how to approach this; the only thing I could think of was to take the sum of all weights and use that as a general worst case. Thanks
If the weight is not bounded, so the complexity must depend solely on N, there is at least an O(2^N) approach, which is trying all possible subsets of the N elements and computing their sums.
If you are willing to use exponential space rather than polynomial space, you can solve the problem in O(n 2^(n/2)) time and O(2^(n/2)) space: split your set of n weights into two sets A and B of roughly equal size, compute the sum of weights for every subset of each half, hash all the subset sums of A, and then look up W - x for every subset sum x of B. A hit in the hash table means you have found a subset of A and a subset of B that together sum to W.
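A sketch of that meet-in-the-middle approach in Python (function names are mine):

```python
def subset_sum_meet_in_middle(weights, target):
    """Meet-in-the-middle subset sum: O(n * 2^(n/2)) time, O(2^(n/2)) space.

    Returns True if some subset of `weights` sums to `target`."""
    def all_subset_sums(ws):
        sums = [0]
        for w in ws:
            sums += [s + w for s in sums]      # double the list: without/with w
        return sums

    half = len(weights) // 2
    sums_a = set(all_subset_sums(weights[:half]))     # hash all subset sums of A
    return any(target - x in sums_a                   # look up W - x for each sum x of B
               for x in all_subset_sums(weights[half:]))

print(subset_sum_meet_in_middle([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5)
print(subset_sum_meet_in_middle([3, 34, 4, 12, 5, 2], 30))  # False
```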

Decision problems that can't even be decided efficiently?

How do these problems fall into the tapestry of the P, NP, NP-hard, etc. sets? I don't know if any such problems even exist, but what initiated my thought process was thinking of a decision version of the travelling salesman problem:
Given a list of cities and the distances between each pair of cities, and a
Hamiltonian path P, is P the shortest Hamiltonian path?
I suspect that we cannot verify the "shortestness" of P in polynomial time, in which case this decision problem is not even in NP. So where does it fall in this case?
This problem is in co-NP. You can think of NP as the class of problems where if the answer is yes, there is a small amount of information I could give you that would convince you of this. For example, the problem
Is there a Hamiltonian cycle in G with cost at most k?
is in NP, because if the answer is yes, I could just give you the cycle and you could check it to see whether it's valid. Coming up with that cycle is hard, but once you have the Hamiltonian cycle it's really easy to use it to check the answer.
The class co-NP consists of problems where if the answer is no, there's a small amount of information I could give you that would convince you of this. In your case, suppose that no, P is not the shortest Hamiltonian path. That means that there's some shorter path P'. If I gave you P', you could easily check that P wasn't ideal. Coming up with P' might be really hard (in fact, it's co-NP-hard!), but once you have it it's pretty straightforward to use it to confirm the answer is no.
Hope this helps!
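To make the co-NP certificate concrete, here is a small Python sketch (the helper names and the toy distance matrix are mine) of how a witness path P' is checked against P:

```python
def path_length(dist, path):
    """Length of a path under a symmetric distance matrix dist[u][v]."""
    return sum(dist[path[i]][path[i + 1]] for i in range(len(path) - 1))

def refutes_shortest(dist, p, p_prime):
    """Does the witness p_prime prove that p is NOT the shortest Hamiltonian path?

    p_prime must visit every city exactly once and be strictly shorter than p."""
    n = len(dist)
    is_hamiltonian = len(p_prime) == n and set(p_prime) == set(range(n))
    return is_hamiltonian and path_length(dist, p_prime) < path_length(dist, p)

# A toy 4-city instance (symmetric distances).
dist = [[0, 1, 9, 9],
        [1, 0, 1, 9],
        [9, 1, 0, 1],
        [9, 9, 1, 0]]
print(refutes_shortest(dist, p=[0, 2, 1, 3], p_prime=[0, 1, 2, 3]))   # True: 3 < 19
```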
Given two integers n and m, are there exactly m prime numbers p <= n?
This can be solved in about O(n^(2/3)) and possibly slightly faster, but the problem size is of course not n but log(n), so it takes sub-linear time in n, but exponential time in the problem size. That's not worse than you'd expect from a problem in NP. However, I cannot see any possible information that would allow you to check this quicker.
(Actually, there is an algorithm which determines the number of primes <= n in about O(n^(2/3)) steps, but there is no known algorithm that can check an answer faster than finding the answer.)
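Just to make that decision problem concrete, here is a plain sieve-based check in Python; note this is not the O(n^(2/3)) method mentioned above, and it runs in time roughly linear in n, i.e. exponential in the input size:

```python
def exactly_m_primes_up_to(n, m):
    """Decide 'are there exactly m primes p <= n?' with a simple sieve of Eratosthenes."""
    if n < 2:
        return m == 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, n + 1, p):
                is_prime[q] = False
    return sum(is_prime) == m

print(exactly_m_primes_up_to(100, 25))   # True: there are 25 primes <= 100
print(exactly_m_primes_up_to(100, 24))   # False
```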
Given integers n and k, is 2^n - 1 the k-th Mersenne prime?
It is possible to prove that p is prime in time polynomial in the size of p if a complete factorisation of p + 1 is known, and if p = 2^n - 1 then the complete factorisation of p + 1 is trivial.
However, that is polynomial in the size of p. 2^n - 1 can be checked for primality in time that is polynomial in n. However, that is not polynomial in the size of the problem, which would be roughly the number of digits in n and k. And it would just answer the question whether 2^n - 1 is a Mersenne prime. To prove that it is the k-th Mersenne prime, we would have to check 2^m - 1 for 1 <= m < n and prove that exactly k-1 of these are primes.
Currently the answer to the question is not known for k >= 44 and many 8-digit values n.
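For the primality part alone, one standard way to check 2^n - 1 in time polynomial in n is the Lucas-Lehmer test (this is exactly the p + 1 situation described above, since p + 1 = 2^n factors trivially). A sketch, assuming the exponent n is prime:

```python
def lucas_lehmer(n):
    """Lucas-Lehmer test: for an odd prime exponent n, 2^n - 1 is prime
    iff s_(n-2) == 0, where s_0 = 4 and s_(i+1) = s_i^2 - 2 (mod 2^n - 1)."""
    if n == 2:
        return True                      # 2^2 - 1 = 3 is prime
    m = 2 ** n - 1
    s = 4
    for _ in range(n - 2):
        s = (s * s - 2) % m
    return s == 0

# Prime exponents up to 31; the Mersenne-prime ones are 2, 3, 5, 7, 13, 17, 19, 31.
print([n for n in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31) if lucas_lehmer(n)])
```

As the answer notes, this only settles whether 2^n - 1 is a Mersenne prime; establishing that it is the k-th one still requires testing all smaller exponents.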

Difficult algorithm: Optimal Solution to 1

This is one of those problems that are difficult just because there are so many options. Imagine a number 'N' and a set of primes under 10, i.e. {2, 3, 5, 7}. The goal is to keep dividing N till we reach 1. If at any step N is not divisible by any of the given primes, then you can do one of the following operations:
i) N = N - 1
OR ii) N = N + 1
This will ensure that N is even and we can continue.
The goal should be achieved using the minimum number of operations.
Please note that this may sound trivial, i.e. you can implement a step in your algorithm that says "if N is divisible by any prime, then divide it". But this does not always produce the optimal solution.
E.g. if N = 134: 134 is divisible by 2. If you divide by 2, you get 67. 67 is not divisible by any prime, so you do an operation and N becomes 66 or 68, both of which require another operation. So, 2 operations in total.
Alternatively, if N = 134 and you do an operation N = N + 1, i.e. N = 135, then the total number of operations needed to reach 1 is 1. So this is the optimal solution.
Unless there is some mathematical solution for this problem (if you are looking for a mathematical solution, math.SE is a better place for this question), you could reduce the problem to a shortest path problem.
Represent the problem as a graph G=(V,E) where V = N (all natural numbers) and E = {(u,v) | you can get from u to v in a single step } (1).
Now, you need to run a classic search algorithm from your source (the input number) to your target (the number 1). Some of the choices to get an optimal solution are:
BFS - since the reduced graph is not weighted, BFS is guaranteed to be both complete (find a solution if one exists) and optimal (finds the shortest solution).
heuristic A* - which is also complete and optimal (2), and if you have a good heuristic function - should be faster than an uninformed BFS.
Optimization note:
The graph can be constructed "on the fly", no need to create it as pre-processing. To do so, you will need a next:V->2^V function (from a node to a set of nodes), such that next(v) = {u | (v,u) is in E}
P.S. complexity comment: The BFS solution is pseudo-polynomial (linear in the input number worst case), since the "highest" vertex you will ever develop is n+1, so the solution is basically O(n) worst case - though I believe deeper analysis can restrict it to a better limit.
(1) If only +1/-1 moves are to be counted as operations, you can create the edges based on the number reached after finishing the divisions.
(2) If an admissible heuristic function is used.
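A minimal sketch of this reduction in Python, under the reading of footnote (1): divisions are free, each +1/-1 counts as one operation, and the BFS advances one operation per level (names are mine):

```python
PRIMES = (2, 3, 5, 7)

def division_closure(values):
    """All numbers reachable from `values` by dividing (zero or more times) by the primes."""
    stack, seen = list(values), set(values)
    while stack:
        v = stack.pop()
        for p in PRIMES:
            if v % p == 0 and v // p not in seen:
                seen.add(v // p)
                stack.append(v // p)
    return seen

def min_operations(n):
    """BFS where one level = one +1/-1 operation; divisions are free moves."""
    frontier = division_closure({n})
    visited = set(frontier)
    ops = 0
    while 1 not in frontier:
        ops += 1
        neighbours = {v + d for v in frontier for d in (-1, 1) if v + d >= 1}
        frontier = division_closure(neighbours) - visited
        visited |= frontier
    return ops

print(min_operations(134))   # 1: 134 -> 135, then divide by 5, 3, 3, 3 to reach 1
print(min_operations(67))    # 2: dividing 134 by 2 first would force two +1/-1 operations
```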
