What is a good non-recursive algorithm for deciding whether a passed in amount can be built additively from a set of numbers? - algorithm

What is a non recursive algorithm for deciding whether a passed in amount can be built additively from a set of numbers.
In my case I'm determining whether a certain currency amount (such as $40) can be met by adding up some combination of a set of bills (such as $5, $10 and $20 bills). That is a simple example, but the algorithm needs to work for any currency set (some currencies use funky bill amounts and some bills may not be available at a given time).
So $50 can be met with a set of ($20 and $30), but cannot be met with a set of ($20 and $40). The non-recursive requirement is due to the target code base being for SQL Server 2000 where the support of recursion is limited.
In addition this is for supporting a multi currency environment where the set of bills available may change (think a foreign currency exchange teller for example).

You have twice stated that the algorithm cannot be recursive, yet that is the natural solution to this problem. One way or another, you will need to perform a search to solve this problem. If recursion is out, you will need to backtrack manually.
Pick the largest currency value below the target value. If it's match, you're done. If not, push the current target value on a stack and subtract from the target value the picked currency value. Keep doing this until you find a match or there are no more currency values left. Then use the stack to backtrack and pick a different value.
Basically, it's the recursive solution inside a loop with a manually managed stack.

If you treat each denomination as a point on a base-n number, where n is the maximum number of notes you would need, then you can increment through that number until you've exhausted the problem space or found a solution.
The maximum number of notes you would need is the Total you require divided by the lowest denomination note.
It's a brute force response to the problem, but it'll definitely work.
Here's some p-code. I'm probably all over the place with my fence posts, and it's so unoptimized to be ridiculous, but it should work. I think the idea's right anyway.
Denominations = [10,20,50,100]
Required = 570
Denominations = sort(Denominations)
iBase = integer (Required / Denominations[1])
BumpList = array [Denominations.count]
BumpList.Clear
repeat
iTotal = 0
for iAdd = 1 to Bumplist.size
iTotal = iTotal + bumplist [iAdd] * Denominations[iAdd]
loop
if iTotal = Required then exit true
//this bit should be like a mileometer.
//We add 1 to each wheel, and trip over to the next wheel when it gets to iBase
finished = true
for iPos from bumplist.last to bumplist.first
if bumplist[iPos] = (iBase-1) then bumplist[iPos] = 0
else begin
finished = false
bumplist[iPos] = bumplist[iPos]+1
exit for
end
loop
until (finished)
exit false

That's a problem that can be solved by an approach known as dynamic programming. The lecture notes I have are too focused on bioinformatics, unfortunately, so you'll have to google for it yourself.

This sounds like the subset sum problem, which is known to be NP-complete.
Good luck with that.
Edit: If you're allowed arbitrary number of bills/coins of some denomination (as opposed to just one), then it's a different problem, and is easier. See the coin problem. I realized this when reading another answer to a (suspiciously) similar question.

I agree with Tyler - what you are describing is a variant of the Subset Sum problem which is known to be NP-Complete. In this case you are a bit lucky as you are working with a limited set of values so you can use dynamic programming techniques here to optimize the problem a bit. In terms of some general ideas for the code:
Since you are dealing with money, there are only so many ways to make change with a given bill and in most cases some bills are used more often than others. So if you store the results you can keep a set of the most common solutions and then just check them before you try and find the actual solution.
Unless the language you are working with doesn't support recursion there is no reason to completely ignore the use of recursion in the solution. While any recursive problem can be solved using iteration, this is a case where recursion is likely going to be easier to write.
Some of the other users such as Kyle and seanyboy point you in the right direction for writing your own function so you should take a look at what they have provided for what you are working on.

You can deal with this problem with Dynamic Programming method as MattW. mentioned.
Given limited number of bills and maximum amount of money, you can try the following solution. The code snippet is in C# but I believe you can port it to other language easily.
// Set of bills
int[] unit = { 40,20,70};
// Max amount of money
int max = 100000;
bool[] bucket = new bool[max];
foreach (int t in unit)
bucket[t] = true;
for (int i = 0; i < bucket.Length; i++)
if (bucket[i])
foreach (int t in unit)
if(i + t < bucket.Length)
bucket[i + t] = true;
// Check if the following amount of money
// can be built additively
Console.WriteLine("15 : " + bucket[15]);
Console.WriteLine("50 : " + bucket[50]);
Console.WriteLine("60 : " + bucket[60]);
Console.WriteLine("110 : " + bucket[110]);
Console.WriteLine("120 : " + bucket[120]);
Console.WriteLine("150 : " + bucket[150]);
Console.WriteLine("151 : " + bucket[151]);
Output:
15 : False
50 : False
60 : True
110 : True
120 : True
150 : True
151 : False

There's a difference between no recursion and limited recursion. Don't confuse the two as you will have missed the point of your lesson.
For example, you can safely write a factorial function using recursion in C++ or other low level languages because your results will overflow even your biggest number containers within but a few recursions. So the problem you will face will be that of storing the result before it ever gets to blowing your stack due to recursion.
This said, whatever solution you find - and I haven't even bothered understanding your problem deeply as I see that others have already done that - you will have to study the behaviour of your algorithm and you can determine what is the worst case scenario depth of your stack.
You don't need to avoid recursion altogether if the worst case scenario is supported by your platform.

Edit: The following will work some of the time. Think about why it won't work all the time and how you might change it to cover other cases.
Build it starting with the largest bill towards the smallest. This will yeild the lowest number of bills.
Take the initial amount and apply the largest bill as many times as you can without going over the price.
Step to the next largest bill and apply it the same way.
Keep doing this until you are on your smallest bill.
Then check if the sum equals the target amount.

Algorithm:
1. Sort currency denominations available in descending order.
2. Calculate Remainder = Input % denomination[i] i -> n-1, 0
3. If remainder is 0, the input can be broken down, otherwise it cannot be.
Example:
Input: 50, Available: 10,20
[50 % 20] = 10, [10 % 10] = 0, Ans: Yes
Input: 50, Available: 15,20
[50 % 20] = 10, [10 % 15] = 15, Ans: No

Related

Dynamic programming from cormen's book

When reading about dynamic programming in "Introduction to algorithms" By cormen, Chapter 15: Dynamic Programming , I came across this statement
When developing a dynamic-programming algorithm, we follow a sequence of
four steps:
Characterize the structure of an optimal solution.
Recursively define the value of an optimal solution.
Compute the value of an optimal solution, typically in a bottom-up fashion.
Construct an optimal solution from computed information.
Steps 1–3 form the basis of a dynamic-programming solution to a problem. If we
need only the value of an optimal solution, and not the solution itself, then we
can omit step 4. When we do perform step 4, we sometimes maintain additional
information during step 3 so that we can easily construct an optimal solution.
I did not understand the difference in step 3 and 4.
computing the value of optimal solution
and
constructing the optimal solution.
I was expecting to understand this by reading even further, but failed to understand.
Can some one help me understanding this by giving an example ?
Suppose we are using dynamic programming to work out whether there is a subset of [1,3,4,6,10] that sums to 9.
The answer to step 3 is the value, in this case "TRUE".
The answer to step 4 is working out the actual subset that sums to 9, in this case "3+6".
In dynamical programming we most of the time end up with a huge results hash. However initially it only contains the result obtained from the first, smallest, simplest (bottom) case and by using these initial results and calculating on top of them we eventually merge to the target. At this point the last item in the hash most of the time is the target (step 3 completed). Then we will have to process it to get the desired result.
A perfect example could be finding the minimum number of cubes summing up to a target. Target is 500 and we should get [5,5,5,5] or if the target is 432 we must get [6,6].
So we can implement this task in JS as follows;
function getMinimumCubes(tgt){
var maxi = Math.floor(Math.fround(Math.pow(tgt,1/3))),
hash = {0:[[]]},
cube = 0;
for (var i = 1; i <= maxi; i++){
cube = i*i*i;
for (var j = 0; j <= tgt - cube; j++){
hash[j+cube] = hash[j+cube] ? hash[j+cube].concat(hash[j].map(e => e.concat(i)))
: hash[j].map(e => e.concat(i));
}
}
return hash[tgt].reduce((p,c) => p.length < c.length ? p:c);
}
var target = 432,
result = [];
console.time("perf:");
result = getMinimumCubes(target);
console.timeEnd("perf:");
console.log(result);
So in this code, hash = {0:[[]]}, is step 1; the nested for loops which eventually prepare the hash[tgt] are in fact step 3 and the .reduce() functor at the return stage is step 4 since it shapes up the last item of the hash (hash[tgt]) to give us the desired result by filtering out the shortest result among all results that sum up to the target value.
To me the step 2 is somewhat meaningless. Not because of the mention of recursion but also by meaning. Besides I have never used nor seen a recursive approach in dynamical programming. It's best implemented with while or for loops.

Reducing the time complexity of this algorithm

I'm playing a game that has a weapon-forging component, where you combine two weapons to get a new one. The sheer number of weapon combinations (see "6.1. Blade Combination Tables" at http://www.gamefaqs.com/ps/914326-vagrant-story/faqs/8485) makes it difficult to figure out what you can ultimately create out of your current weapons through repeated forging, so I tried writing a program that would do this for me. I give it a list of weapons that I currently have, such as:
francisca
tabarzin
kris
and it gives me the list of all weapons that I can forge:
ball mace
chamkaq
dirk
francisca
large crescent
throwing knife
The problem is that I'm using a brute-force algorithm that scales extremely poorly; it takes about 15 seconds to calculate all possible weapons for seven starting weapons, and a few minutes to calculate for eight starting weapons. I'd like it to be able to calculate up to 64 weapons (the maximum that you can hold at once), but I don't think I'd live long enough to see the results.
function find_possible_weapons(source_weapons)
{
for (i in source_weapons)
{
for (j in source_weapons)
{
if (i != j)
{
result_weapon = combine_weapons(source_weapons[i], source_weapons[j]);
new_weapons = array();
new_weapons.add(result_weapon);
for (k in source_weapons)
{
if (k != i && k != j)
new_weapons.add(source_weapons[k]);
}
find_possible_weapons(new_weapons);
}
}
}
}
In English: I attempt every combination of two weapons from my list of source weapons. For each of those combinations, I create a new list of all weapons that I'd have following that combination (that is, the newly-combined weapon plus all of the source weapons except the two that I combined), and then I repeat these steps for the new list.
Is there a better way to do this?
Note that combining weapons in the reverse order can change the result (Rapier + Firangi = Short Sword, but Firangi + Rapier = Spatha), so I can't skip those reversals in the j loop.
Edit: Here's a breakdown of the test example that I gave above, to show what the algorithm is doing. A line in brackets shows the result of a combination, and the following line is the new list of weapons that's created as a result:
francisca,tabarzin,kris
[francisca + tabarzin = chamkaq]
chamkaq,kris
[chamkaq + kris = large crescent]
large crescent
[kris + chamkaq = large crescent]
large crescent
[francisca + kris = dirk]
dirk,tabarzin
[dirk + tabarzin = francisca]
francisca
[tabarzin + dirk = francisca]
francisca
[tabarzin + francisca = chamkaq]
chamkaq,kris
[chamkaq + kris = large crescent]
large crescent
[kris + chamkaq = large crescent]
large crescent
[tabarzin + kris = throwing knife]
throwing knife,francisca
[throwing knife + francisca = ball mace]
ball mace
[francisca + throwing knife = ball mace]
ball mace
[kris + francisca = dirk]
dirk,tabarzin
[dirk + tabarzin = francisca]
francisca
[tabarzin + dirk = francisca]
francisca
[kris + tabarzin = throwing knife]
throwing knife,francisca
[throwing knife + francisca = ball mace]
ball mace
[francisca + throwing knife = ball mace]
ball mace
Also, note that duplicate items in a list of weapons are significant and can't be removed. For example, if I add a second kris to my list of starting weapons so that I have the following list:
francisca
tabarzin
kris
kris
then I'm able to forge the following items:
ball mace
battle axe
battle knife
chamkaq
dirk
francisca
kris
kudi
large crescent
scramasax
throwing knife
The addition of a duplicate kris allowed me to forge four new items that I couldn't before. It also increased the total number of forge tests to 252 for a four-item list, up from 27 for the three-item list.
Edit: I'm getting the feeling that solving this would require more math and computer science knowledge than I have, so I'm going to give up on it. It seemed like a simple enough problem at first, but then, so does the Travelling Salesman. I'm accepting David Eisenstat's answer since the suggestion of remembering and skipping duplicate item lists made such a huge difference in execution time and seems like it would be applicable to a lot of similar problems.
Start by memoizing the brute force solution, i.e., sort source_weapons, make it hashable (e.g., convert to a string by joining with commas), and look it up in a map of input/output pairs. If it isn't there, do the computation as normal and add the result to the map. This often results in big wins for little effort.
Alternatively, you could do a backward search. Given a multiset of weapons, form predecessors by replacing one of the weapon with two weapons that forge it, in all possible ways. Starting with the singleton list consisting of the singleton multiset consisting of the goal weapon, repeatedly expand the list by predecessors of list elements and then cull multisets that are supersets of others. Stop when you reach a fixed point.
If linear programming is an option, then there are systematic ways to prune search trees. In particular, let's make the problem easier by (i) allowing an infinite supply of "catalysts" (maybe not needed here?) (ii) allowing "fractional" forging, e.g., if X + Y => Z, then 0.5 X + 0.5 Y => 0.5 Z. Then there's an LP formulation as follows. For all i + j => k (i and j forge k), the variable x_{ijk} is the number of times this forge is performed.
minimize sum_{i, j => k} x_{ijk} (to prevent wasteful cycles)
for all i: sum_{j, k: j + k => i} x_{jki}
- sum_{j, k: j + i => k} x_{jik}
- sum_{j, k: i + j => k} x_{ijk} >= q_i,
for all i + j => k: x_{ijk} >= 0,
where q_i is 1 if i is the goal item, else minus the number of i initially available. There are efficient solvers for this easy version. Since the reactions are always 2 => 1, you can always recover a feasible forging schedule for an integer solution. Accordingly, I would recommend integer programming for this problem. The paragraph below may still be of interest.
I know shipping an LP solver may be inconvenient, so here's an insight that will let you do without. This LP is feasible if and only if its dual is bounded. Intuitively, the dual problem is to assign a "value" to each item such that, however you forge, the total value of your inventory does not increase. If the goal item is valued at more than the available inventory, then you can't forge it. You can use any method that you can think of to assign these values.
I think you are unlikely to get a good general answer to this question because
if there was an efficient algorithm to solve your problem, then it would also be able to solve NP-complete problems.
For example, consider the problem of finding the maximum number of independent rows in a binary matrix.
This is a known NP-complete problem (e.g. by showing equivalence to the maximum independent set problem).
We can reduce this problem to your question in the following manner:
We can start holding one weapon for each column in the binary matrix, and then we imagine each row describes an alternative way of making a new weapon (say a battle axe).
We construct the weapon translation table such that to make the battle axe using method i, we need all weapons j such that M[i,j] is equal to 1 (this may involve inventing some additional weapons).
Then we construct a series of super weapons which can be made by combining different numbers of our battle axes.
For example, the mega ultimate battle axe may require 4 battle axes to be combined.
If we are able to work out the best weapon that can be constructed from your starting weapons, then we have solved the problem of finding the maximum number of independent rows in the original binary matrix.
It's not a huge saving, however looking at the source document, there are times when combining weapons produces the same weapon as one that was combined. I assume that you won't want to do this as you'll end up with less weapons.
So if you added a check for if the result_weapon was the same type as one of the inputs, and didn't go ahead and recursively call find_possible_weapons(new_weapons), you'd trim the search down a little.
The other thing I could think of, is you are not keeping a track of work done, so if the return from find_possible_weapons(new_weapons) returns the same weapon that you already have got by combining other weapons, you might well be performing the same search branch multiple times.
e.g. if you have a, b, c, d, e, f, g, and if a + b = x, and c + d = x, then you algorithm will be performing two lots of comparing x against e, f, and g. So if you keep a track of what you've already computed, you'll be onto a winner...
Basically, you have to trim the search tree. There are loads of different techniques to do this: it's called search. If you want more advice, I'd recommend going to the computer science stack exchange.
If you are still struggling, then you could always start weighting items/resulting items, and only focus on doing the calculation on 'high gain' objects...
You might want to start by creating a Weapon[][] matrix, to show the results of forging each pair. You could map the name of the weapon to the index of the matrix axis, and lookup of the results of a weapon combination would occur in constant time.

Divide a group of people into two disjoint subgroups (of arbitrary size) and find some values

As we know from programming, sometimes a slight change in a problem can
significantly alter the form of its solution.
Firstly, I want to create a simple algorithm for solving
the following problem and classify it using bigtheta
notation:
Divide a group of people into two disjoint subgroups
(of arbitrary size) such that the
difference in the total ages of the members of
the two subgroups is as large as possible.
Now I need to change the problem so that the desired
difference is as small as possible and classify
my approach to the problem.
Well,first of all I need to create the initial algorithm.
For that, should I make some kind of sorting in order to separate the teams, and how am I suppose to continue?
EDIT: for the first problem,we have ruled out the possibility of a set being an empty set. So all we have to do is just a linear search to find the min age and then put it in a set B. SetA now has all the other ages except the age of setB, which is the min age. So here is the max difference of the total ages of the two sets, as high as possible
The way you described the first problem, it is trivial in the way that it requires you to find only the minimum element (in case the subgroups should contain at least 1 member), otherwise it is already solved.
The second problem can be solved recursively the pseudo code would be:
// compute sum of all elem of array and store them in sum
min = sum;
globalVec = baseVec;
fun generate(baseVec, generatedVec, position, total)
if (abs(sum - 2*total) < min){ // check if the distribution is better
min = abs(sum - 2*total);
globalVec = generatedVec;
}
if (position >= baseVec.length()) return;
else{
// either consider elem at position in first group:
generate(baseVec,generatedVec.pushback(baseVec[position]), position + 1, total+baseVec[position]);
// or consider elem at position is second group:
generate(baseVec,generatedVec, position + 1, total);
}
And now just start the function with generate(baseVec,"",0,0) where "" stand for an empty vector.
The algo can be drastically improved by applying it to a sorted array, hence adding a test condition to stop branching, but the idea stays the same.

Efficient (fastest) way to sum elements of matrix in matlab

Lets have matrix A say A = magic(100);. I have seen 2 ways of computing sum of all elements of matrix A.
sumOfA = sum(sum(A));
Or
sumOfA = sum(A(:));
Is one of them faster (or better practise) then other? If so which one is it? Or are they both equally fast?
It seems that you can't make up your mind about whether performance or floating point accuracy is more important.
If floating point accuracy were of paramount accuracy, then you would segregate the positive and negative elements, sorting each segment. Then sum in order of increasing absolute value. Yeah, I know, its more work than anyone would do, and it probably will be a waste of time.
Instead, use adequate precision such that any errors made will be irrelevant. Use good numerical practices about tests, etc, such that there are no problems generated.
As far as the time goes, for an NxM array,
sum(A(:)) will require N*M-1 additions.
sum(sum(A)) will require (N-1)*M + M-1 = N*M-1 additions.
Either method requires the same number of adds, so for a large array, even if the interpreter is not smart enough to recognize that they are both the same op, who cares?
It is simply not an issue. Don't make a mountain out of a mole hill to worry about this.
Edit: in response to Amro's comment about the errors for one method over the other, there is little you can control. The additions will be done in a different order, but there is no assurance about which sequence will be better.
A = randn(1000);
format long g
The two solutions are quite close. In fact, compared to eps, the difference is barely significant.
sum(A(:))
ans =
945.760668102446
sum(sum(A))
ans =
945.760668102449
sum(sum(A)) - sum(A(:))
ans =
2.72848410531878e-12
eps(sum(A(:)))
ans =
1.13686837721616e-13
Suppose you choose the segregate and sort trick I mentioned. See that the negative and positive parts will be large enough that there will be a loss of precision.
sum(sort(A(A<0),'descend'))
ans =
-398276.24754782
sum(sort(A(A<0),'descend')) + sum(sort(A(A>=0),'ascend'))
ans =
945.7606681037
So you really would need to accumulate the pieces in a higher precision array anyway. We might try this:
[~,tags] = sort(abs(A(:)));
sum(A(tags))
ans =
945.760668102446
An interesting problem arises even in these tests. Will there be an issue because the tests are done on a random (normal) array? Essentially, we can view sum(A(:)) as a random walk, a drunkard's walk. But consider sum(sum(A)). Each element of sum(A) (i.e., the internal sum) is itself a sum of 1000 normal deviates. Look at a few of them:
sum(A)
ans =
Columns 1 through 6
-32.6319600960983 36.8984589766173 38.2749084367497 27.3297721091922 30.5600109446534 -59.039228262402
Columns 7 through 12
3.82231962760523 4.11017616179294 -68.1497901792032 35.4196443983385 7.05786623564426 -27.1215387236418
Columns 13 through 18
When we add them up, there will be a loss of precision. So potentially, the operation as sum(A(:)) might be slightly more accurate. Is it so? What if we use a higher precision for the accumulation? So first, I'll form the sum down the columns using doubles, then convert to 25 digits of decimal precision, and sum the rows. (I've displayed only 20 digits here, leaving 5 digits hidden as guard digits.)
sum(hpf(sum(A)))
ans =
945.76066810244807408
Or, instead, convert immediately to 25 digits of precision, then summing the result.
sum(hpf(A(:))
945.76066810244749807
So both forms in double precision were equally wrong here, in opposite directions. In the end, this is all moot, since any of the alternatives I've shown are far more time consuming compared to the simple variations sum(A(:)) or sum(sum(A)). Just pick one of them and don't worry.
Performance-wise, I'd say both are very similar (assuming a recent MATLAB version). Here is quick test using the TIMEIT function:
function sumTest()
M = randn(5000);
timeit( #() func1(M) )
timeit( #() func2(M) )
end
function v = func1(A)
v = sum(A(:));
end
function v = func2(A)
v = sum(sum(A));
end
the results were:
>> sumTest
ans =
0.0020917
ans =
0.0017159
What I would worry about is floating-point issues. Example:
>> M = randn(1000);
>> abs( sum(M(:)) - sum(sum(M)) )
ans =
3.9108e-11
Error magnitude increases for larger matrices
i think a simple way to understand is apply " tic_ toc "function in first and last of your code.
tic
A = randn(5000);
format long g
sum(A(:));
toc
but when you used randn function ,elements of it are random and time of calculation can
different in each cycle CPU calculation .
This better you used a unique matrix whit so large elements to compare time of calculation.

Forming Dynamic Programming algorithm for a variation of Knapsack Problem

I was thinking,
I wanted to do a variation on the Knapsack Problem.
Imagine the original problem, with items with various weights/value.
My version will, along with having the normal weights/values, contain a "group" value.
eg.
Item1[5kg, $600, electronic]
Item2[1kg, $50, food]
Now, having a set of items like this, how would I code up the knapsack problem to make sure that a maximum of 1 item from each "group" is selected.
Notes:
You don't need to choose an item from that group
There are multiple items in each group
You're still minimizing weight, maximizing value
The amount of groups are predefined, along with their values.
I'm just writing a draft of the code out at this stage, and I've chosen to use a dynamic approach. I understand the idea behind the dynamic solution for the regular knapsack problem, how do I alter this solution to incorporate these "groups"?
KnapSackVariation(v,w,g,n,W)
{
for (w = 0 to W)
V[0,w] = 0;
for(i = 1 to n)
for(w = 0 to W)
if(w[i] <= w)
V[i,w] = max{V[i-1, w], v[i] + V[i-1, w-w[i]]};
else
V[i,w] = V[i-1, w];
return V[n,W];
}
That's what I have so far, need to add it so that it will remove all corresponding items from the group it is in each time it solves this.
just noticed your question trying to find an answer to a question of my own. The problem you've stated is a well-known and well-studied problem called the Multiple Choice Knapsack Problem. If you google that you'll find all sorts of information, and I can also recommend this book: http://www.amazon.co.uk/Knapsack-Problems-Hans-Kellerer/dp/3642073115/ref=sr_1_1?ie=UTF8&qid=1318767496&sr=8-1, which dedicates a whole chapter to the problem. In the classic formulation of MCKP, you have to choose one item from each group. However, you can easily convert that version of the problem to your version by adding a dummy item to each group with profit and weight = 0, and the same algorithms will work. I would caution you against trying to adapt code for the binary knapsack problem to the MCKP with a few tweaks--this approach is likely to lead you to a solution whose performance degrades unacceptably as the number of items in each group increases.
Assume
c[i] : The category of the ith element
V[i,w,S] : Maximum value of the knapsack such that it contains at max one item from each category in S
Recursive Formulation
V[i,w,S] = max(V[i-1,w,S],V[i,w-w[i],S-{c[i]}] + v[i])
Base Case
V[0,w,S] = -`infinity if w!=0 or S != {}`

Resources