I can solve the Copying Books problem using binary search, since it is easy to implement. But I have just started solving dynamic programming problems, and I wanted to know the dynamic programming solution for this problem.
Before the invention of book-printing, it was very hard to make a copy of a book. All the contents had to be re-written by hand by so-called scribers. The scriber was given a book, and after several months he finished its copy. One of the most famous scribers lived in the 15th century and his name was Xaverius Endricus Remius Ontius Xendrianus (Xerox). Anyway, the work was very annoying and boring, and the only way to speed it up was to hire more scribers.
Once upon a time, there was a theater ensemble that wanted to play
famous Antique Tragedies. The scripts of these plays were divided into
many books and actors needed more copies of them, of course. So they
hired many scribers to make copies of these books. Imagine you have m books (numbered 1, 2, ..., m) that may have different numbers of pages (p_1, p_2, ..., p_m) and you want to make one copy of each of them. Your task is to divide these books among k scribes, k <= m. Each book can be assigned to a single scriber only, and every scriber must get a continuous sequence of books. That means there exists an increasing sequence of numbers 0 = b_0 < b_1 < b_2 < ... < b_{k-1} <= b_k = m such that the i-th scriber gets the sequence of books with numbers between b_{i-1}+1 and b_i. The time needed to make a copy of all the books is determined by the scriber who was assigned the most work. Therefore, our goal is to minimize the maximum number of pages assigned to a single scriber. Your task is to find the optimal assignment.
For binary search I am doing the following.
Low = 1 and High = sum of pages of all books.
Run binary search.
For Mid (the maximum pages assigned to a scribe), assign books greedily such that no scribe gets more than Mid pages.
If all books are assigned and scribes remain without work, the actual answer is at most Mid; if books remain unassigned, the actual answer is greater than Mid; and I update the bounds accordingly.
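In code, that feasibility check and search might look like this (a sketch with made-up names, not my exact code):

def feasible(pages, k, cap):
    # can the books be split into at most k contiguous runs of <= cap pages each?
    if max(pages) > cap:
        return False
    scribes, current = 1, 0
    for p in pages:
        if current + p > cap:   # start a new scribe for this book
            scribes += 1
            current = 0
        current += p
    return scribes <= k

def min_max_pages_bsearch(pages, k):
    low, high = 1, sum(pages)
    while low < high:
        mid = (low + high) // 2
        if feasible(pages, k, mid):
            high = mid          # Mid is achievable, try smaller
        else:
            low = mid + 1       # books left over, need a bigger cap
    return low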
Here is a possible dynamic programming solution written in Python 3. I use indexing starting from 0.

k = 2  # number of scribes
# number of pages per book: 11 pages for the first book, 1 for the second, etc.
pages = [11, 1, 1, 10, 1, 1, 3, 3]
m = len(pages)  # number of books

def find_score(assignment):
    # the score of an assignment is the maximum number of pages one scribe copies
    max_pages = -1
    for scribe in assignment:
        max_pages = max(max_pages, sum(pages[book] for book in scribe))
    return max_pages

def find_assignment(assignment, scribe, book):
    if book == m:
        return find_score(assignment), assignment
    assign_current = [x[:] for x in assignment]  # deep copy
    assign_current[scribe].append(book)
    current = find_assignment(assign_current, scribe, book + 1)
    if scribe == k - 1:
        return current
    assign_next = [x[:] for x in assignment]  # deep copy
    assign_next[scribe + 1].append(book)
    nxt = find_assignment(assign_next, scribe + 1, book + 1)  # 'next' would shadow a builtin
    return min(current, nxt)

initial_assignment = [[] for x in range(k)]
print(find_assignment(initial_assignment, 0, 0))
The function find_assignment returns the assignment as a list where the ith element is a list of book indexes assigned to the ith scribe. The score of the assignment is returned as well (the max number of pages a scribe has to copy in the assignment).
The key to dynamic programming is to first identify the subproblem. In this case, the books are ordered and can only be assigned sequentially. Thus the subproblem is to find an optimal assignment for the last n books using the last s scribes (where n < m and s < k). A subproblem can be solved from smaller subproblems using the following relation: min(assigning the current book to the "current" scribe, assigning it to the next scribe).
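Note that the code above revisits the same (book, scribe) states many times. A memoized version of the same recurrence runs in polynomial time (a sketch; min_max_pages is a made-up name, and it returns only the score, not the assignment):

from functools import lru_cache
from itertools import accumulate

def min_max_pages(pages, k):
    m = len(pages)
    prefix = [0] + list(accumulate(pages))  # prefix[j] = pages[0] + ... + pages[j-1]

    @lru_cache(maxsize=None)
    def best(b, s):
        # minimal possible maximum load when books b..m-1 are shared by scribes s..k-1
        if s == k - 1:
            return prefix[m] - prefix[b]  # the last scribe takes everything left
        # scribe s takes books b..j, leaving at least one book per remaining scribe
        return min(
            max(prefix[j + 1] - prefix[b], best(j + 1, s + 1))
            for j in range(b, m - (k - s - 1))
        )

    return best(0, 0)

print(min_max_pages([11, 1, 1, 10, 1, 1, 3, 3], 2))  # 18, matching the score above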
The problem is the famous LeetCode problem (also seen in other contexts), Best Time to Buy and Sell Stock with at most k transactions.
I am trying to make sense of this DP solution. You can ignore the first part, which handles large k greedily; it is the DP part I don't understand: why does it work?
class Solution(object):
    def maxProfit(self, k, prices):
        """
        :type k: int
        :type prices: List[int]
        :rtype: int
        """
        # for large k, a greedy approach suffices (this is the part to ignore)
        if k >= len(prices) / 2:
            profit = 0
            for i in range(1, len(prices)):
                profit += max(0, prices[i] - prices[i-1])
            return profit
        # the part I don't understand
        dp = [[0]*2 for i in range(k+1)]
        for i in reversed(range(len(prices))):
            for j in range(1, k+1):
                dp[j][True] = max(dp[j][True], -prices[i]+dp[j][False])
                dp[j][False] = max(dp[j][False], prices[i]+dp[j-1][True])
        return dp[k][True]
I was able to derive a similar solution, but it uses two rows (dp and dp2) instead of just one row (dp in the solution above). To me it looks like the solution is overwriting values on itself, which doesn't seem right for this kind of DP. However, the answer works and passes LeetCode.
Let's put it into words:
for i in reversed(range(len(prices))):
For each price, going backwards in time, so that by the time we consider price i we have already processed every later price.
for j in range(1, k+1):
For each possibility of this price taking part in one of the k two-price transactions.
dp[j][True] = max(dp[j][True], -prices[i]+dp[j][False])
If we consider this price as a purchase -- since we are going backwards in time, a purchase completes a transaction -- we choose the better of (1) the jth purchase we have already considered (dp[j][True]) or (2) subtracting this price as a purchase and adding the best result we already have that includes the jth sale (-prices[i] + dp[j][False]).
dp[j][False] = max(dp[j][False], prices[i]+dp[j-1][True])
Otherwise, we might consider this price as a sale (the first half of a transaction, since we're going backwards), so we choose the better of (1) the jth sale already considered (dp[j][False]) or (2) adding this price as a sale plus the best result we have so far for the first j-1 completed transactions (prices[i] + dp[j-1][True]).
Note that the first dp[j][False] refers to the jth "half-transaction" (the sale, since we are going backwards in time) that was set in an earlier iteration on a later price. We may then overwrite it with our consideration of this price as a jth sale.
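As for the overwriting: dp[j][False] is computed from dp[j-1][True], which may already have been updated for the same price i, but the only extra option that introduces is buying and selling at the very same price, a zero-profit transaction that never improves the max. That is why the single-row version gives the same answer as an explicit two-row version. Here is a sketch of what I believe your dp/dp2 variant looks like, where nothing is read after being overwritten:

def max_profit_two_rows(k, prices):
    dp = [[0] * 2 for _ in range(k + 1)]
    for i in reversed(range(len(prices))):
        dp2 = [row[:] for row in dp]  # the new row is built only from the old one
        for j in range(1, k + 1):
            dp2[j][True] = max(dp[j][True], -prices[i] + dp[j][False])
            dp2[j][False] = max(dp[j][False], prices[i] + dp[j - 1][True])
        dp = dp2
    return dp[k][True]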
I am trying to find out how a given resource (e.g. a budget) is best distributed among different options, each of which yields a different result for the amount of resource it receives.
Let's say I have N = 1200 and some functions. (a, b, c, d are some unknown variables)
f1(x) = a * x
f2(x) = b * x^c
f3(x) = a*x + b*x^2 + c*x^3
f4(x) = d^x
f5(x) = log x^d
...
Also, let's say there are n such functions, each yielding a different result based on its input x, where x = 0 or x >= m, for some constant m.
Although I am not able to find exact formulas for the given functions, I am able to evaluate them. This means that I can compute:
X = f1(N1) + f2(N2) + f3(N3) + ... + fn(Nn) where N1 + ... + Nn = N, once for every way of distributing N into n numbers, and find a specific case where X is the greatest.
How would I actually go about finding the best distribution of N with the least computation, using whatever libraries are currently available?
If you are happy with allocations constrained to be whole numbers then there is a dynamic programming solution whose table has O(nN) entries (filled in about O(nN^2) time in the straightforward way) - so you can increase accuracy by scaling if you want, but this will increase CPU time.
For each i=1 to n, maintain an array where element j gives the maximum yield using only the first i functions, given a total allowance of j units.
For i=1 this is simply the result of f1().
For i=k+1, when working out the result for j, consider each possible way of splitting the j units between f_{k+1}() and the first k functions; the table created for i=k tells you the best return from any distribution among the first k functions, so you can calculate the table for i=k+1 from the table for i=k.
At the end you get the best possible return for n functions and N resources. It is easier to find out what that best distribution actually is if you keep all of these arrays - the best way to distribute j units among the first i functions, for all possible values of i and j. Then you can look up the best allocation for f100(), subtract the amount it allocates to f100() from N, look up the best allocation for f99() given the resulting resources, and carry on like this until you have worked out the best allocations for all f().
As an example suppose f1(x) = 2x, f2(x) = x^2 and f3(x) = 3 if x>0 and 0 otherwise. Suppose we have 3 units of resource.
The first table is just f1(x) which is 0, 2, 4, 6 for 0,1,2,3 units.
The second table is the best you can do using f1(x) and f2(x) for 0,1,2,3 units and is 0, 2, 4, 9, switching from f1 to f2 at x=2.
The third table is 0, 3, 5, 9. I can get 3 and 5 by using 1 unit for f3() and the rest for the best solution in the second table. 9 is simply the best solution in the second table - there is no better solution using 3 resources that gives any of them to f3().
So 9 is the best answer here. One way to work out how to get there is to keep the tables around and recalculate that answer. 9 comes from f3(0) + 9 from the second table, so all 3 units are available to f2() + f1(). The second table's 9 comes from f2(3), so there are no units left for f1() and we get f1(0) + f2(3) + f3(0).
When you are working out the resources to use at stage i=k+1, you have a table from i=k that tells you exactly the result to expect from whatever resources are left over after you have decided how much to use at stage i=k+1. The best distribution does not become incorrect, because at stage i=k you have already worked out the result of the best distribution for every possible number of remaining resources.
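A minimal sketch of this table-filling in Python (names are made up; it reproduces the example above):

def best_allocation(funcs, N):
    # table[j] = best yield distributing j units among the functions seen so far
    table = [funcs[0](j) for j in range(N + 1)]
    for f in funcs[1:]:
        # for each budget j, try every split of j units between f and the rest
        table = [max(f(u) + table[j - u] for u in range(j + 1))
                 for j in range(N + 1)]
    return table[N]

f1 = lambda x: 2 * x
f2 = lambda x: x * x
f3 = lambda x: 3 if x > 0 else 0
print(best_allocation([f1, f2, f3], 3))  # 9, as in the tables above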
Consider a set of 13 Danish, 11 Japanese and 8 Polish people. It is well known that the number of different ways of dividing this set of people into groups is the (13+11+8) = 32nd Bell number (the number of set partitions). However, we are asked to find the number of possible set partitions under a given constraint. The question is as follows:
A set partition is said to be good if it has no group consisting of at least two people that includes only a single nationality. How many good partitions are there for this set? (A group may include only one person.)
The brute-force approach requires going through about 10^26 partitions and checking which ones are good. This seems pretty unfeasible, especially if the groups are larger or one introduces other nationalities. Is there a smart way instead?
EDIT: As a side note, there probably is no hope for a really nice solution. A highly esteemed expert in combinatorics answered a related question, which, I think, basically says that the related problem, and thus this problem too, is very difficult to solve exactly.
Here's a solution using dynamic programming.
It starts from an empty set, then adds one element at a time and calculates all the valid partitions.
The state space is huge, but notice that to calculate the next step we only need to know the following things about a partition:
For each nationality, how many sets it contains that consist of only a single member of that nationality. (e.g.: {a})
How many sets it contains with mixed elements. (e.g.: {a, b, c})
For each of these configurations I only store the total count. Example:
[0, 1, 2, 2] -> 3
{a}{b}{c}{mixed}
i.e.: 3 partitions with no lone {a} set, one lone {b} set, two lone {c} sets, and two mixed sets - for instance: {b}, {c}, {c}, {a,c}, {b,c}
Here's the code in Python 3:

import collections
from operator import mul
from fractions import Fraction
from functools import reduce  # reduce is no longer a builtin in Python 3

def nCk(n, k):
    return int(reduce(mul, (Fraction(n - i, i + 1) for i in range(k)), 1))

def good_partitions(l):
    n = len(l)
    i = 0
    prev = collections.defaultdict(int)
    while l:
        # any more people of this nationality?
        if l[0] == 0:
            l.pop(0)
            i += 1
            continue
        l[0] -= 1
        curr = collections.defaultdict(int)
        for solution, total in prev.items():
            for idx, item in enumerate(solution):
                my_solution = list(solution)
                if idx == i:
                    # add the element as a new set
                    my_solution[i] += 1
                    curr[tuple(my_solution)] += total
                elif my_solution[idx]:
                    if idx != n:
                        # add to a set consisting of one element,
                        # or merge several one-element sets into a mixed set
                        cnt = my_solution[idx]
                        c = cnt
                        while c > 0:
                            my_solution = list(solution)
                            my_solution[n] += 1
                            my_solution[idx] -= c
                            curr[tuple(my_solution)] += total * nCk(cnt, c)
                            c -= 1
                    else:
                        # add to a mixed set
                        cnt = my_solution[idx]
                        curr[tuple(my_solution)] += total * cnt
        if not prev:
            # one set with one element
            lone = [0] * (n + 1)
            lone[i] = 1
            curr[tuple(lone)] = 1
        prev = curr
    return sum(prev.values())

print(good_partitions([1, 1, 1, 1]))        # 15
print(good_partitions([1, 1, 1, 1, 1]))     # 52
print(good_partitions([2, 1]))              # 4
print(good_partitions([13, 11, 8]))         # 29811734589499214658370837
It produces correct values for the test cases. I also tested it against a brute-force solution (for small values), and it produces the same results.
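For reference, the brute-force check for small inputs can enumerate set partitions directly (a sketch; people are represented just by their nationality index, and a partition is good when every block of two or more people mixes nationalities):

def partitions(seq):
    # yield every set partition of the (distinguishable) elements of seq
    if not seq:
        yield []
        return
    first, rest = seq[0], seq[1:]
    for part in partitions(rest):
        yield [[first]] + part                  # first goes into its own block
        for i in range(len(part)):              # or joins an existing block
            yield part[:i] + [part[i] + [first]] + part[i+1:]

def brute_good_partitions(counts):
    people = [nat for nat, c in enumerate(counts) for _ in range(c)]
    return sum(
        all(len(set(block)) > 1 for block in part if len(block) > 1)
        for part in partitions(people)
    )

print(brute_good_partitions([1, 1, 1, 1]))  # 15
print(brute_good_partitions([2, 1]))        # 4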
An exact analytic solution is hard, but a polynomial time+space dynamic programming solution is straightforward.
First of all, we need an absolute order on groups. We get one by comparing how many Danes, Japanese, and Poles a group contains.
Next, the function to write is this one.
m is the maximum group size we can emit
p is the number of people of each nationality that we have left to split
max_good_partitions_of_maximum_size(m, p) is the number of "good partitions"
we can form from p people, with no group being larger than m
Clearly you can write this as a somewhat complicated recursive function that always selects the next group to emit, calls itself with that group as the new maximum size, and subtracts the group from p. If you had this function, then your answer is simply max_good_partitions_of_maximum_size(p, p) with p = [13, 11, 8]. But that is going to be a brute-force search that won't run in reasonable time.
Finally, apply memoization (https://en.wikipedia.org/wiki/Memoization) by caching every call to this function, and it will run in polynomial time - though it will also have to cache a polynomial number of calls.
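Here is a concrete variant of this idea as a sketch. It is not the exact recursion above: instead of ordering groups by size, it fixes one specific remaining person and enumerates the "good" group containing them, which avoids double-counting; it is memoized on the tuple of remaining counts. All names are made up.

from functools import lru_cache
from math import comb

def count_good_partitions(counts):
    def compositions(p, first):
        # all group compositions that contain the fixed person and are good:
        # size 1, or at least two nationalities represented
        comps = [[]]
        for i, c in enumerate(p):
            lo = 1 if i == first else 0
            comps = [comp + [x] for comp in comps for x in range(lo, c + 1)]
        for comp in comps:
            if sum(comp) == 1 or sum(1 for x in comp if x) >= 2:
                yield tuple(comp)

    @lru_cache(maxsize=None)
    def f(p):
        # number of good partitions of the remaining people p (a tuple of counts)
        if sum(p) == 0:
            return 1
        first = next(i for i, c in enumerate(p) if c)  # fix one remaining person
        total = 0
        for comp in compositions(p, first):
            ways = 1
            for i, c in enumerate(comp):
                avail = p[i] - (1 if i == first else 0)
                take = c - (1 if i == first else 0)
                ways *= comb(avail, take)              # choose the rest of the group
            rest = tuple(p[i] - comp[i] for i in range(len(p)))
            total += ways * f(rest)
        return total

    return f(tuple(counts))

print(count_good_partitions([2, 1]))       # 4
print(count_good_partitions([13, 11, 8])) # should match the other answer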
Let's say we have a news website with 100 pages, each displaying several articles, and we want to parse the website regularly to keep statistics on the number of comments per article.
The number of comments on an article will change rapidly on new articles (so on the first pages), and really slowly on very old articles (on the last pages).
So I will want to parse the first pages much more often than the last pages.
A solution I imagined would be to generate, on each run, an interval of pages to parse, with the additional requirement that page n appears in this interval with probability 1/n.
For example, we would parse page 1 every time.
Page 2 would appear in the interval half of the time.
Page 3, 1/3 of the time...
Our algorithm would then generate the interval [1,1] most of the time. The interval [1,2] would be less likely, [1,3] even less... and [1,100] would be really rare.
Do you see a way to implement this algorithm with the usual random() function available in most languages?
Is there another way to solve the problem (parsing recent content on a website more often) that makes more sense?
Thanks for your help.
Edit:
Here is an implementation in Python based on the answer provided by David Eisenstat below.
I tried to implement the version with random() generating integers, but I obtained strange results.
from math import floor
from random import random

# return the last page to parse, between 1 and n
def randPage(n):
    # clamping with min() instead of rejecting r > n and retrying:
    # rejection would skew the distribution, while clamping keeps
    # P(page k is parsed) = 1/k for every k <= n
    r = floor(1 / (1 - random()))
    return min(int(r), n)
If you have a function random() that returns doubles in the interval [0, 1), then you look at pages 1 to floor(1 / (1 - random())). Page n is examined if and only if the output of random() is in the interval [1 - 1/n, 1), which has length 1/n.
If you're using an integer random() function in the interval [0, RAND_MAX], then let k = random() and look at RAND_MAX / k pages if k != 0 or all of them if k == 0.
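In Python terms, a sketch of that integer variant might look like this (randrange stands in for C's rand(), and the RAND_MAX value is made up):

from random import randrange

RAND_MAX = 2**31 - 1  # stand-in for C's RAND_MAX

def pages_to_parse(total_pages):
    # k is uniform on [0, RAND_MAX]; page n is included exactly when
    # k <= RAND_MAX / n, which happens with probability roughly 1/n
    k = randrange(RAND_MAX + 1)
    last = total_pages if k == 0 else min(total_pages, RAND_MAX // k)
    return range(1, last + 1)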
I'm playing a game that has a weapon-forging component, where you combine two weapons to get a new one. The sheer number of weapon combinations (see "6.1. Blade Combination Tables" at http://www.gamefaqs.com/ps/914326-vagrant-story/faqs/8485) makes it difficult to figure out what you can ultimately create out of your current weapons through repeated forging, so I tried writing a program that would do this for me. I give it a list of weapons that I currently have, such as:
francisca
tabarzin
kris
and it gives me the list of all weapons that I can forge:
ball mace
chamkaq
dirk
francisca
large crescent
throwing knife
The problem is that I'm using a brute-force algorithm that scales extremely poorly; it takes about 15 seconds to calculate all possible weapons for seven starting weapons, and a few minutes to calculate for eight starting weapons. I'd like it to be able to calculate up to 64 weapons (the maximum that you can hold at once), but I don't think I'd live long enough to see the results.
function find_possible_weapons(source_weapons)
{
    for (i in source_weapons)
    {
        for (j in source_weapons)
        {
            if (i != j)
            {
                result_weapon = combine_weapons(source_weapons[i], source_weapons[j]);
                new_weapons = array();
                new_weapons.add(result_weapon);
                for (k in source_weapons)
                {
                    if (k != i && k != j)
                        new_weapons.add(source_weapons[k]);
                }
                find_possible_weapons(new_weapons);
            }
        }
    }
}
In English: I attempt every combination of two weapons from my list of source weapons. For each of those combinations, I create a new list of all weapons that I'd have following that combination (that is, the newly-combined weapon plus all of the source weapons except the two that I combined), and then I repeat these steps for the new list.
Is there a better way to do this?
Note that combining weapons in the reverse order can change the result (Rapier + Firangi = Short Sword, but Firangi + Rapier = Spatha), so I can't skip those reversals in the j loop.
Edit: Here's a breakdown of the test example that I gave above, to show what the algorithm is doing. A line in brackets shows the result of a combination, and the following line is the new list of weapons that's created as a result:
francisca,tabarzin,kris
    [francisca + tabarzin = chamkaq]
    chamkaq,kris
        [chamkaq + kris = large crescent]
        large crescent
        [kris + chamkaq = large crescent]
        large crescent
    [francisca + kris = dirk]
    dirk,tabarzin
        [dirk + tabarzin = francisca]
        francisca
        [tabarzin + dirk = francisca]
        francisca
    [tabarzin + francisca = chamkaq]
    chamkaq,kris
        [chamkaq + kris = large crescent]
        large crescent
        [kris + chamkaq = large crescent]
        large crescent
    [tabarzin + kris = throwing knife]
    throwing knife,francisca
        [throwing knife + francisca = ball mace]
        ball mace
        [francisca + throwing knife = ball mace]
        ball mace
    [kris + francisca = dirk]
    dirk,tabarzin
        [dirk + tabarzin = francisca]
        francisca
        [tabarzin + dirk = francisca]
        francisca
    [kris + tabarzin = throwing knife]
    throwing knife,francisca
        [throwing knife + francisca = ball mace]
        ball mace
        [francisca + throwing knife = ball mace]
        ball mace
Also, note that duplicate items in a list of weapons are significant and can't be removed. For example, if I add a second kris to my list of starting weapons so that I have the following list:
francisca
tabarzin
kris
kris
then I'm able to forge the following items:
ball mace
battle axe
battle knife
chamkaq
dirk
francisca
kris
kudi
large crescent
scramasax
throwing knife
The addition of a duplicate kris allowed me to forge four new items that I couldn't before. It also increased the total number of forge tests to 252 for a four-item list, up from 27 for the three-item list.
Edit: I'm getting the feeling that solving this would require more math and computer science knowledge than I have, so I'm going to give up on it. It seemed like a simple enough problem at first, but then, so does the Travelling Salesman. I'm accepting David Eisenstat's answer since the suggestion of remembering and skipping duplicate item lists made such a huge difference in execution time and seems like it would be applicable to a lot of similar problems.
Start by memoizing the brute force solution, i.e., sort source_weapons, make it hashable (e.g., convert to a string by joining with commas), and look it up in a map of input/output pairs. If it isn't there, do the computation as normal and add the result to the map. This often results in big wins for little effort.
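A sketch of that memoization in Python (combine_weapons is assumed to return the forged weapon, or None when the pair can't be combined; sorting canonicalizes the inventory while preserving duplicates):

def find_possible_weapons(source_weapons, seen=None, results=None):
    if seen is None:
        seen, results = set(), set()
    key = ','.join(sorted(source_weapons))   # canonical form of this inventory
    if key in seen:                          # already expanded: skip the whole branch
        return results
    seen.add(key)
    for i, a in enumerate(source_weapons):
        for j, b in enumerate(source_weapons):
            if i == j:
                continue
            forged = combine_weapons(a, b)   # order matters: (i, j) and (j, i) both tried
            if forged is None:
                continue
            results.add(forged)
            rest = [w for t, w in enumerate(source_weapons) if t != i and t != j]
            find_possible_weapons(rest + [forged], seen, results)
    return results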
Alternatively, you could do a backward search. Given a multiset of weapons, form predecessors by replacing one of the weapons with two weapons that forge it, in all possible ways. Starting with the singleton list consisting of the singleton multiset consisting of the goal weapon, repeatedly expand the list by predecessors of list elements and then cull multisets that are supersets of others. Stop when you reach a fixed point.
If linear programming is an option, then there are systematic ways to prune search trees. In particular, let's make the problem easier by (i) allowing an infinite supply of "catalysts" (maybe not needed here?) (ii) allowing "fractional" forging, e.g., if X + Y => Z, then 0.5 X + 0.5 Y => 0.5 Z. Then there's an LP formulation as follows. For all i + j => k (i and j forge k), the variable x_{ijk} is the number of times this forge is performed.
minimize sum_{i, j => k} x_{ijk} (to prevent wasteful cycles)
for all i: sum_{j, k: j + k => i} x_{jki}
- sum_{j, k: j + i => k} x_{jik}
- sum_{j, k: i + j => k} x_{ijk} >= q_i,
for all i + j => k: x_{ijk} >= 0,
where q_i is 1 if i is the goal item, else minus the number of i initially available. There are efficient solvers for this easy version. Since the reactions are always 2 => 1, you can always recover a feasible forging schedule for an integer solution. Accordingly, I would recommend integer programming for this problem. The paragraph below may still be of interest.
I know shipping an LP solver may be inconvenient, so here's an insight that will let you do without. This LP is feasible if and only if its dual is bounded. Intuitively, the dual problem is to assign a "value" to each item such that, however you forge, the total value of your inventory does not increase. If the goal item is valued at more than the available inventory, then you can't forge it. You can use any method that you can think of to assign these values.
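As a toy check of the primal formulation above with scipy's linprog (one reaction 0 + 1 => 2, one of each input on hand, goal item 2; everything here is made up for illustration):

from scipy.optimize import linprog

items = 3
reactions = [(0, 1, 2)]          # (i, j, k) means i + j forges k
have = [1, 1, 0]                 # initial inventory
goal = 2
q = [1 if i == goal else -have[i] for i in range(items)]

A_ub, b_ub = [], []
for i in range(items):
    row = []
    for (a, b, c) in reactions:
        net = (1 if c == i else 0) - (1 if a == i else 0) - (1 if b == i else 0)
        row.append(-net)         # flip sign: net >= q_i becomes -net <= -q_i
    A_ub.append(row)
    b_ub.append(-q[i])

res = linprog(c=[1] * len(reactions), A_ub=A_ub, b_ub=b_ub)
print(res.status, res.x)         # status 0 (feasible) and x = [1.]: forge once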
I think you are unlikely to get a good general answer to this question, because if there were an efficient algorithm to solve your problem, it would also be able to solve NP-complete problems.
For example, consider the problem of finding the maximum number of independent rows in a binary matrix.
This is a known NP-complete problem (e.g. by showing equivalence to the maximum independent set problem).
We can reduce this problem to your question in the following manner:
We start by holding one weapon for each column in the binary matrix, and then we imagine each row describes an alternative way of making a new weapon (say, a battle axe).
We construct the weapon translation table such that to make the battle axe using method i, we need all weapons j such that M[i,j] is equal to 1 (this may involve inventing some additional weapons).
Then we construct a series of super weapons which can be made by combining different numbers of our battle axes.
For example, the mega ultimate battle axe may require 4 battle axes to be combined.
If we are able to work out the best weapon that can be constructed from your starting weapons, then we have solved the problem of finding the maximum number of independent rows in the original binary matrix.
It's not a huge saving; however, looking at the source document, there are times when combining weapons produces the same weapon as one of the ones that was combined. I assume that you won't want to do this, as you'll end up with fewer weapons.
So if you added a check for whether result_weapon is the same type as one of the inputs, and didn't go ahead and recursively call find_possible_weapons(new_weapons), you'd trim the search down a little.
The other thing I can think of is that you are not keeping track of work done, so if find_possible_weapons(new_weapons) returns the same weapon that you already got by combining other weapons, you may well be performing the same search branch multiple times.
e.g. if you have a, b, c, d, e, f, g, and if a + b = x and c + d = x, then your algorithm will perform two lots of comparing x against e, f, and g. So if you keep track of what you've already computed, you'll be onto a winner...
Basically, you have to trim the search tree. There are loads of different techniques to do this: it's called search. If you want more advice, I'd recommend going to the computer science stack exchange.
If you are still struggling, then you could always start weighting items/resulting items, and only focus on doing the calculation on 'high gain' objects...
You might want to start by creating a Weapon[][] matrix, to show the results of forging each pair. You could map the name of the weapon to the index of the matrix axis, and lookup of the results of a weapon combination would occur in constant time.
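Equivalently in Python, a dict keyed by ordered pairs does the same job (illustrated with the example pairs from the question; since order matters, each ordering gets its own entry):

forge = {
    ('rapier', 'firangi'): 'short sword',
    ('firangi', 'rapier'): 'spatha',
    # ... one entry per row of the game's combination tables
}

def combine_weapons(a, b):
    return forge.get((a, b))  # None when the pair can't be forged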