Distribution of a sum S over N different operands - algorithm

I'm trying to figure out a way to distribute a sum S over N different operands (b1, b2, ..., bN), where b1, b2, ..., bN are in a fixed ratio determined by another set of operands (a1, a2, ..., aN).
Consider a situation where:
Candidate A gets a total of Ta votes from N constituencies, with distribution {a1, a2, a3, ..., aN}.
Candidate B gets a total of Tb votes (Ta and Tb are unrelated, so Ta < Tb, Ta = Tb and Ta > Tb are all possible) from M constituencies (important: M <= N), distribution unknown.
What is the best approach to allot the Tb votes to the constituencies b1, b2, b3, ..., bM such that they are distributed in the same ratio as a1, a2, a3, ..., aN?
Some Cases:
1. Ideal
Ta = 20 (8, 6, 4, 2), Tb = 10
Then we get: Tb (4, 3, 2, 1)
2. Somewhat less ideal
Ta = 20 (8, 6, 4, 1, 1), Tb = 10
Then we get (4, 3, 2, 1, 0), which actually means (4, 3, 2, 1) (M < N), and is still tolerable.

Is your a_i sorted always? Assuming that's the case, one way to start is to start assigning b_i from the first value of a_i.

One simple solution:
br = ar * (Tb / Ta)
This doesn't really work for complex ranges or for mismatched Ta and Tb, where the shares are not whole numbers,
e.g. Ta = 22 (5, 5, 4, 2, 1, 1, 1, 1, 1, 1)
and Tb = 7
UPDATE:
I followed these rules to get to the best solution:
Keep the ratio as (Tb / Ta) and keep on distributing until you run out.
Whenever you round, round up, i.e. 3.24 -> 4 and 3.68 -> 4 as well.
E.g. here: b1 = 5 * 7 / 22 = 1.59 -> 2, b2 = 5 * 7 / 22 = 1.59 -> 2, b3 = 4 * 7 / 22 = 1.27 -> 2, b4 = 1 (since just one vote remains).
So we have Tb = 7 (2, 2, 2, 1), which is closest to (5, 5, 4, 2).
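The rules in the update can be sketched in a few lines of Python. This is only one reading of them: the function name `distribute` is made up for illustration, the a_i are assumed to be listed largest first, and the ceiling is computed with integer arithmetic to avoid floating-point surprises.

```python
def distribute(a, tb):
    """Hand out tb votes in roughly the ratio of a, rounding each share up
    and stopping as soon as the votes run out (a is assumed sorted descending)."""
    ta = sum(a)
    remaining = tb
    shares = []
    for ai in a:
        if remaining <= 0:
            break  # nothing left, the remaining constituencies get no votes
        ideal_rounded_up = (ai * tb + ta - 1) // ta  # ceil(ai * tb / ta)
        share = min(ideal_rounded_up, remaining)
        shares.append(share)
        remaining -= share
    return shares


print(distribute([5, 5, 4, 2, 1, 1, 1, 1, 1, 1], 7))  # [2, 2, 2, 1]
print(distribute([8, 6, 4, 2], 10))                   # [4, 3, 2, 1]
```

Because every share is rounded up, the early (largest) constituencies are favoured and the tail may receive nothing, which matches the M <= N behaviour described in the question.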

Related

Find the best set among the many sets based on its items' cost

I have items in sets, as in the example below. Each item has a particular cost.
I have a max budget. I need to form combinations such that each combination contains at least one item from each set and the sum of the costs equals my budget.
Example
A = [a1, a2, a3, a4, ... , a10]
B = [b1, b2, b3, b4, ... , b10]
C = [c1, c2, c3, c4, ... , c10] and so on, possibly up to G
Max budget = 10
cost of a1 = 2
a2 = 8
b1 = 1
b2 = 7
c1 = 3
c2 = 1
etc
Output can be
[a1, b2, c2] i.e. 2+7+1 = 10
[a2, b1, c2] i.e. 8+1+1 = 10
[a1, b1, c1] i.e. 2+1+3 = 6 Eliminated (since 6 != 10)
and so on.
I can have a maximum of 7 sets with 10 items in each, so there are at most 10^7 combinations. Is there an algorithm that achieves this easily? I tried brute force and it is too expensive.
Thank you.
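One possible way to avoid testing all 10^7 combinations is to precompute which partial sums are still reachable and only descend into branches that can hit the budget exactly. The sketch below assumes exactly one item is taken from each set (as in the example output) and that costs are non-negative integers; the function name and the dict-based input format are invented for illustration.

```python
def combos_matching_budget(sets, budget):
    """sets: list of {item_name: cost} dicts. Yields one name per set such
    that the costs add up to exactly `budget`."""
    n = len(sets)
    # reachable[i] = totals (<= budget) obtainable with one item from each of sets[i:]
    reachable = [set() for _ in range(n + 1)]
    reachable[n] = {0}
    for i in range(n - 1, -1, -1):
        for cost in sets[i].values():
            for t in reachable[i + 1]:
                if cost + t <= budget:
                    reachable[i].add(cost + t)

    def walk(i, remaining, chosen):
        if i == n:
            yield tuple(chosen)  # remaining is 0 here by construction
            return
        for name, cost in sets[i].items():
            # only follow choices from which the rest can still be completed
            if remaining - cost in reachable[i + 1]:
                chosen.append(name)
                yield from walk(i + 1, remaining - cost, chosen)
                chosen.pop()

    if budget in reachable[0]:
        yield from walk(0, budget, [])


example = [{"a1": 2, "a2": 8}, {"b1": 1, "b2": 7}, {"c1": 3, "c2": 1}]
print(list(combos_matching_budget(example, 10)))
# [('a1', 'b2', 'c2'), ('a2', 'b1', 'c2')]
```

The precomputation costs about (number of sets) x (items per set) x (budget) steps, and the walk never enters a dead end, so the work is proportional to the number of valid combinations rather than to 10^7.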

Generate number with equal probability

You are given a function let’s say bin() which will generate 0 or 1 with equal probability. Now you are given a range of contiguous integers say [a,b] (a and b inclusive).
Write a function say rand() using bin() to generate numbers within range [a,b] with equal probability
The insight you need is that your bin() function returns a single binary digit, or "bit". Invoking it once gives you 0 or 1. If you invoke it twice you get two bits b0 and b1 which can be combined as b1 * 2 + b0, giving you one of 0, 1, 2 or 3 with equal probability. If you invoke it thrice you get three bits b0, b1 and b2. Put them together and you get b2 * 2^2 + b1 * 2 + b0, giving you a member of {0, 1, 2, 3, 4, 5, 6, 7} with equal probability. And so on, as many as you want.
Your range [a, b] has m = b-a+1 values. You just need enough bits to generate a number between 0 and 2^n-1, where n is the smallest value that makes 2^n greater than or equal to m. Then just scale that set to start at a and you're good.
So let's say you are given the range [20, 30]. There are 11 numbers there from 20 to 30 inclusive. 11 is greater than 8 (2^3), but less than 16 (2^4), so you'll need 4 bits. Use bin() to generate four bits b0, b1, b2, and b3. Put them together as x = b3 * 2^3 + b2 * 2^2 + b1 * 2 + b0. You'll get a result, x, between 0 and 15. If x > 10 then discard it and generate another four bits. When x <= 10, your answer is x + 20.
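A minimal Python sketch of this rejection approach follows. The helper `bin_` is only a stand-in for the given bin() (here backed by the standard `random` module), and `rand_range` is an illustrative name; the bit source is passed in as a parameter so it can be swapped out.

```python
import random


def bin_():
    """Stand-in for the given bin(): returns 0 or 1 with equal probability."""
    return random.getrandbits(1)


def rand_range(a, b, bit_source=bin_):
    """Uniform integer in [a, b], built only from calls to bit_source()."""
    m = b - a + 1                   # how many values the range holds
    n_bits = (m - 1).bit_length()   # smallest n with 2**n >= m
    while True:
        x = 0
        for _ in range(n_bits):
            x = (x << 1) | bit_source()
        if x < m:                   # keep only 0 .. m-1, otherwise redraw
            return a + x


print(rand_range(20, 30))
```

Every accepted x is equally likely, and since at least half of the 2^n candidates are accepted, the expected number of redraws is below one.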
Help, but no code:
You can shift the range [0, 2 ** n - 1] easily to [a, a + 2 ** n - 1]
You can easily produce an equal probability from [0,2**n-1]
If you need a number that isn't a power of 2, just generate a number up to 2 ** n and re-roll if it exceeds the number you need
Subtract the numbers to work out your range:
Decimal: 20 - 10 = 10
Binary : 10100 - 01010 = 1010
Work out how many bits you need to represent this: 4.
For each of these, generate a random 1 or 0:
num_bits = 4
rand[num_bits]
for (x = 0; x < num_bits; ++x)
rand[x] = bin()
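Let's say rand[] = [0,1,0,0] after this. If the generated number is larger than the difference (10 here), throw it away and generate a new one; otherwise add this number back to the start of your range.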
Binary: 1010 + 0100 = 1110
Decimal: 10 + 4 = 14
You can always change the range [a,b] to [0,b-a], denote X = b - a. Then you can define a function rand(X) as follows:
function int rand(X) {
    int i = 1;
    // determine how many bits you need (see above answer for why):
    // the smallest i with 2^i > X
    while (2^i <= X) {
        i++;
    }
    // generate the random numbers
    Boolean cont = true;
    int num = 0;
    while (cont == true) {
        num = 0; // start from scratch on every attempt
        for (j = 0 to i-1) {
            // this generates num in range [0,2^i -1] with equal prob
            // but we need to discard if num is larger than X
            num = num + bin() * 2^j;
        }
        if (num <= X) { cont = false; }
    }
    return num;
}

how many n digit numbers are there with product p

What algorithm should we use to get the count of n-digit numbers such that the product of their digits is p? The special condition here is that none of the digits may be 1.
What I have thought of so far is to do a prime factorization of p. Say n=3 and p=24.
We first do a prime factorization of 24 to get 2*2*2*3.
Now I have a problem determining the combinations of these, which are
4*2*3, 2*4*3, ... etc.
Even if I can do so, how will it scale when n is much smaller than the number of prime factors?
I am not too sure if that's the right direction... any input is welcome.
First, you don't really need full prime decomposition, only decomposition into primes smaller than your base (I guess you mean 10 here but the problem can be generalized to any base). So we only need factorization into the first 4 primes: 2, 3, 5 and 7. If the remaining factor (prime or not) is anything bigger than 1, then the problem has 0 solutions.
Now, let's assume that the number p is factored into:
p = 2^d1 * 3^d2 * 5^d3 * 7^d4
and is also composed from the n digits:
p = d(n-1)d(n-2)...d2d1d0
Then, rearranging the digits, it will also be:
p = 2^q2 * 3^q3 * 4^q4 * 5^q5 * ... * 9^q9
where qi >= 0 and q2 + q3 + ... + q9 = n
and also (due to the factorization):
for prime=2: d1 = q2 + 2*q4 + q6 + 3*q8
for prime=3: d2 = q3 + q6 + 2*q9
for prime=5: d3 = q5
for prime=7: d4 = q7
So the q5 and q7 are fixed and we have to find all non-negative integer solutions to the equations:
(where the unknowns are the rest qi: q2, q3, q4, q6, q8 and q9)
d1 = q2 + 2*q4 + q6 + 3*q8
d2 = q3 + q6 + 2*q9
n - d3 - d4 = q2 + q3 + q4 + q6 + q8 + q9
For every one of the above solutions, there are several rearrangements of the digits, which can be found by the formula:
X = n! / ( q2! * q3! * ... q9! )
which have to be summed up.
There may be a closed formula for this, using generating functions, you could post it at Math.SE
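The counting scheme above can be sketched directly in Python: strip out the exponents of 2, 3, 5 and 7, loop over the free unknowns (q8, q9, q6, q4), derive q2 and q3 from the two exponent equations, and add up the multinomial coefficients. The function name is made up for illustration, and p is assumed to be a positive integer.

```python
from math import factorial


def count_numbers(p, n):
    """Count n-digit numbers with digits 2..9 whose digit product is p (p >= 1)."""
    if p < 1:
        return 0
    exps = []
    for prime in (2, 3, 5, 7):
        e = 0
        while p % prime == 0:
            p //= prime
            e += 1
        exps.append(e)
    if p != 1:
        return 0                      # a prime factor above 7 is left over
    d1, d2, d3, d4 = exps             # exponents of 2, 3, 5, 7
    rest = n - d3 - d4                # digits that are not 5 or 7
    total = 0
    for q8 in range(d1 // 3 + 1):
        for q9 in range(d2 // 2 + 1):
            for q6 in range(min(d1 - 3 * q8, d2 - 2 * q9) + 1):
                q3 = d2 - q6 - 2 * q9
                for q4 in range((d1 - 3 * q8 - q6) // 2 + 1):
                    q2 = d1 - 2 * q4 - q6 - 3 * q8
                    if q2 + q3 + q4 + q6 + q8 + q9 != rest:
                        continue
                    denom = 1
                    for q in (q2, q3, q4, d3, q6, d4, q8, q9):
                        denom *= factorial(q)
                    total += factorial(n) // denom
    return total


print(count_numbers(24, 3))  # 9
```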
Example for p=24, n=3:
p = 2^3 * 3^1 * 5^0 * 7^0
and we have:
d1=3, d2=1, d3=0, d4=0
The integer solutions to:
3 = q2 + 2*q4 + q6 + 3*q8
1 = q3 + q6 + 2*q9
3 = q2 + q3 + q4 + q6 + q8 + q9
are (q2, q3, q4, q6, q8, q9) =:
(2, 0, 0, 1, 0, 0)
(1, 1, 1, 0, 0, 0)
which give:
3! / ( 2! * 1! ) = 3
3! / ( 1! * 1! * 1! ) = 6
and 3+6 = 9 total solutions.
Example for p=725760, n=10:
p = 2^8 * 3^4 * 5^1 * 7^1
and we have:
d1=8, d2=4, d3=1, d4=1
The integer solutions to:
8 = q2 + 2*q4 + q6 + 3*q8
4 = q3 + q6 + 2*q9
8 = q2 + q3 + q4 + q6 + q8 + q9
are (q2, q3, q4, q6, q8, q9) (along with the corresponding digits and the rearrangements per solution):
(5, 0, 0, 0, 1, 2) 22222899 57 10! / (5! 2!) = 15120
(4, 0, 2, 0, 0, 2) 22224499 57 10! / (4! 2! 2!) = 37800
(4, 1, 0, 1, 1, 1) 22223689 57 10! / (4!) = 151200
(3, 2, 1, 0, 1, 1) 22233489 57 10! / (3! 2!) = 302400
(4, 0, 1, 2, 0, 1) 22224669 57 10! / (4! 2!) = 75600
(3, 1, 2, 1, 0, 1) 22234469 57 10! / (3! 2!) = 302400
(2, 2, 3, 0, 0, 1) 22334449 57 10! / (3! 2! 2!) = 151200
(2, 4, 0, 0, 2, 0) 22333388 57 10! / (4! 2! 2!) = 37800
(3, 2, 0, 2, 1, 0) 22233668 57 10! / (3! 2! 2!) = 151200
(2, 3, 1, 1, 1, 0) 22333468 57 10! / (3! 2!) = 302400
(1, 4, 2, 0, 1, 0) 23333448 57 10! / (4! 2!) = 75600
(4, 0, 0, 4, 0, 0) 22226666 57 10! / (4! 4!) = 6300
(3, 1, 1, 3, 0, 0) 22234666 57 10! / (3! 3!) = 100800
(2, 2, 2, 2, 0, 0) 22334466 57 10! / (2! 2! 2! 2!) = 226800
(1, 3, 3, 1, 0, 0) 23334446 57 10! / (3! 3!) = 100800
(0, 4, 4, 0, 0, 0) 33334444 57 10! / (4! 4!) = 6300
which is 2043720 total solutions, if I haven't made any mistakes.
I don't think I'd start by tackling what is known to be a 'hard' problem, computing the prime decomposition. By "I don't think" I mean that this is my gut feeling rather than the result of any rigorous computation of complexity.
Since you are ultimately only interested in the single-digit divisors of p I'd start by dividing p by 2, then by 3, then 4, all the way up to 9. Of course, some of these divisions won't produce an integer result in which case you can discard that digit from further consideration.
For your example of p = 24 you'll get {{2},12}, {{3},8}, {{4},6}, {{6},4}, {{8},3} (i.e. tuples of divisor and remaining quotient). Now apply the approach again, though this time you are looking for the 2-digit numbers whose digits multiply to the quotient. That is, for {{2},12} you would get {{2,2},6},{{2,3},4},{{2,4},3},{{2,6},2}. As it happens all of these results deliver 3-digit numbers whose digits multiply to 24, but in general it is possible that some of the quotients will still have 2 or more digits and you'll need to trim the search tree at those points. Now go back to {{3},8} and carry on.
Note that this approach avoids having to separately calculate how many permutations of a set of digits you need to consider because it enumerates them all. It also avoids having to consider 2*2 and 4 as separate candidates for inclusion.
I expect you could speed this up with a little memoisation too.
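The divide-and-recurse idea can be sketched as a small Python generator that lists the numbers themselves. The function name is made up for illustration, digits are restricted to 2..9 as in the question, and p is assumed to be at least 1.

```python
def numbers_with_product(p, n):
    """Yield, as strings, all n-digit numbers with digits 2..9 whose digits multiply to p."""
    if n == 0:
        if p == 1:
            yield ""          # all of p has been used up: one valid (empty) tail
        return
    for d in range(2, 10):
        if p % d == 0:        # d can be the next digit only if it divides what is left
            for tail in numbers_with_product(p // d, n - 1):
                yield str(d) + tail


print(list(numbers_with_product(24, 3)))
# ['226', '234', '243', '262', '324', '342', '423', '432', '622']
```

Since every permutation is produced explicitly, the running time grows with the size of the answer; if only the count is needed, caching results per (quotient, digits left) avoids the repeated work.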
Now I look forward to someone more knowledgeable in combinatorics telling us the closed-form solution to this problem.
You can use dynamic programming approach based on the following formula:
f[ n ][ p ] = 9 * ( 10^(n-1) - 9^(n-1) ), if p = 0
0, if n = 1 and p >= 10
1, if n = 1 and p < 10
sum{ f[ n - 1 ][ p / k ] for 0 < k < 10, p mod k = 0 }, if n > 1
The first case is a separate case for p = 0. It is computed in O(1) and also lets us exclude k = 0 from the 4th case.
The 2nd and 3rd cases are the base cases of the recurrence.
In the 4th case, k sequentially takes all possible values of the last digit, and we sum up the counts of numbers with product p and last digit k by reducing to the same problem of smaller size.
This will have O( n * p ) running time if you implement the dp with memoization.
PS: My answer is for a more general problem than the OP described. If the condition that no digit may equal 1 must be satisfied, the formulas can be adjusted as follows:
f[ n ][ p ] = 8 * ( 9^(n-1) - 8^(n-1) ), if p = 0
0, if n = 1 and p >= 10 or p = 1
1, if n = 1 and 1 < p < 10
sum{ f[ n - 1 ][ p / k ] for 1 < k < 10, p mod k = 0 }, if n > 1
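Both variants of the recurrence can be sketched with a memoized Python function. The name `count_by_dp` and the `allow_one` switch are illustrative choices, not part of the answer; n >= 1 and p >= 0 are assumed.

```python
from functools import lru_cache


def count_by_dp(n, p, allow_one=True):
    """Count n-digit numbers whose digit product is p.
    allow_one=False applies the 'no digit equals 1' variant."""
    lo = 1 if allow_one else 2        # smallest non-zero digit permitted
    nonzero = 10 - lo                 # how many non-zero digits are permitted
    if p == 0:
        # numbers containing at least one 0: all numbers minus the zero-free ones
        return nonzero * ((nonzero + 1) ** (n - 1) - nonzero ** (n - 1))

    @lru_cache(maxsize=None)
    def f(digits, product):
        if digits == 1:
            return 1 if lo <= product < 10 else 0
        # k runs over the possible values of the last digit
        return sum(f(digits - 1, product // k)
                   for k in range(lo, 10) if product % k == 0)

    return f(n, p)


print(count_by_dp(3, 24))                   # 21, digits equal to 1 allowed
print(count_by_dp(3, 24, allow_one=False))  # 9
```

Only quotients of p by products of digits are ever visited, so the cache usually holds far fewer than n * p entries.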
For n-digit numbers whose digit product is p,
for example n = 3 and p = 24,
the arrangements (permutations of n items out of p) would be as follows:
= p! / (p-n)!
= 24! / (24 - 3)!
= (24 * 23 * 22 * 21!) / 21!
= 24 * 23 * 22
= 12144
So 12144 arrangements can be made.
And the combinations are as follows:
= p! / (n! * (p-n)!)
= 24! / (3! * (24 - 3)!)
= (24 * 23 * 22 * 21!) / (3! * 21!)
= (24 * 23 * 22) / 6
= 2024
Hope this helps.
The problem seems contrived, but in any case there are bounds on what you will see. For example, p can have no prime divisor > 7, since every factor needs to be a single digit ("such that the product of its digits").
Hence suppose p = 1 * 2^a * 3^b * 5^c * 7^d.
2^a can come from ceil(a/3) to 'a' digits. 3^b can come from ceil(b/2) to 'b' digits. 5^c and 7^d can come from 'c' and 'd' digits respectively. The remaining digits can be filled with 1s.
Hence n can range from ceil(a/3)+ceil(b/2)+c+d to infinity while p has a set of fixed values.
Prime factorization feels like the right direction, though you don't need any prime greater than 7, so you can just divide by 2, 3, 5 and 7 repeatedly. (There is no solution if anything other than 1 is left over, i.e. if p has a prime factor > 7.)
Once we have the prime factors, p % x and p / x can be implemented as constant time operations (you don't actually need p, you can just keep the prime factors).
My idea is to calculate the combinations with the algorithm below; getting the permutations from there is easy.
getCombinations(map<int, int> primeCounts, int numSoFar, string str)
  if (numSoFar > n)
    ; // do nothing, overshot the digit count
  else if (numSoFar == n)
    if (primeCounts == allZeroes)
      addCombination(str);
    else
      ; // do nothing, too many digits
  else if (primeCounts[7] >= 1) // p % 7
    getCombinations(primeCounts - [7]->1, numSoFar+1, str + "7")
  else if (primeCounts[5] >= 1) // p % 5
    getCombinations(primeCounts - [5]->1, numSoFar+1, str + "5")
  else if (primeCounts[3] >= 2) // p % 9
    getCombinations(primeCounts - [3]->2, numSoFar+1, str + "9")
    getCombinations(primeCounts - [3]->2, numSoFar+2, str + "33")
  else if (primeCounts[2] >= 3) // p % 8
    getCombinations(primeCounts - [2]->3, numSoFar+1, str + "8")
    getCombinations(primeCounts - [2]->3, numSoFar+2, str + "24")
    getCombinations(primeCounts - [2]->3, numSoFar+3, str + "222")
  else if (primeCounts[3] >= 1 && primeCounts[2] >= 1) // p % 6
    getCombinations(primeCounts - {[2]->1,[3]->1}, numSoFar+1, str + "6")
    getCombinations(primeCounts - {[2]->1,[3]->1}, numSoFar+2, str + "23")
  else if (primeCounts[2] >= 2) // p % 4
    getCombinations(primeCounts - [2]->2, numSoFar+1, str + "4")
    getCombinations(primeCounts - [2]->2, numSoFar+2, str + "22")
  else if (primeCounts[3] >= 1) // p % 3
    getCombinations(primeCounts - [3]->1, numSoFar+1, str + "3")
  else if (primeCounts[2] >= 1) // p % 2
    getCombinations(primeCounts - [2]->1, numSoFar+1, str + "2")
  else ; // do nothing, too few digits
Given the order in which things are done, I don't think there would be duplicates.
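For comparison, here is a different and simpler way to list each digit combination exactly once, written as a Python sketch: choose the digits in non-decreasing order, so no combination can be reached by two routes, and then count the permutations of each combination. This is not the pseudocode above but a swapped-in technique; both function names are invented for illustration.

```python
from collections import Counter
from math import factorial


def digit_multisets(p, n, smallest=2):
    """Yield non-decreasing tuples of n digits (2..9) whose product is p."""
    if n == 0:
        if p == 1:
            yield ()
        return
    for d in range(smallest, 10):
        if p % d == 0:
            # later digits may not be smaller than d, which rules out duplicates
            for rest in digit_multisets(p // d, n - 1, d):
                yield (d,) + rest


def arrangements(digits):
    """Number of distinct orderings of a digit combination."""
    total = factorial(len(digits))
    for count in Counter(digits).values():
        total //= factorial(count)
    return total


combos = list(digit_multisets(24, 3))
print(combos)                                # [(2, 2, 6), (2, 3, 4)]
print(sum(arrangements(c) for c in combos))  # 9
```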
Improvement:
You needn't look at p%7 again (deeper down the stack) once you've looked at p%5, since we know it can't be divisible by 7 any more, so a lot of those checks can be optimised away.
primeCounts needn't be a map, it can just be an array of length 4, and it needn't be copied, one can just increase and decrease the values appropriately. Something similar can be done with str as well (character array).
If there were too many digits for getCombinations(..., str + "8"), there's no point in checking "24" or "222". This and similar checks shouldn't be too difficult to implement (just have the function return a bool).

How to decompose an integer in two for grid creation

Given an integer N I want to find two integers A and B that satisfy A × B ≥ N with the following conditions:
The difference between A × B and N is as low as possible.
The difference between A and B is as low as possible (to approach a square).
Example: 23. Possible solutions 3 × 8, 6 × 4, 5 × 5. 6 × 4 is the best since it leaves just one empty space in the grid and is "less" rectangular than 3 × 8.
Another example: 21. Solutions 3 × 7 and 4 × 6. 3 × 7 is the desired one.
A brute force solution is easy. I would like to see if a clever solution is possible.
Easy.
In pseudocode
a = b = floor(sqrt(N))
if (a * b >= N) return (a, b)
a += 1
if (a * b >= N) return (a, b)
return (a, b+1)
It will always terminate, and the difference between a and b is at most 1.
It will be much harder if you relax the second constraint, but that's another question.
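The pseudocode translates almost line for line into Python. The sketch below uses `math.isqrt` (Python 3.8+) to get an exact integer square root; the function name is chosen for illustration.

```python
from math import isqrt


def near_square(n):
    """Return (a, b) with a * b >= n and a - b at most 1."""
    a = b = isqrt(n)          # floor(sqrt(n)), exact for integers
    if a * b >= n:
        return a, b           # n is a perfect square
    a += 1
    if a * b >= n:
        return a, b
    return a, b + 1           # (floor + 1)^2 always exceeds n


print(near_square(20))   # (5, 4)
print(near_square(23))   # (5, 5)
print(near_square(100))  # (10, 10)
```

Note that this version favours squareness over wasted cells: for 23 it returns 5 x 5 rather than the 6 x 4 preferred in the question.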
Edit: as it seems that the first condition is more important, you have to attack the problem a bit differently. You have to specify some method to measure the badness of not being square enough (the 2nd condition), because even prime numbers can be factorized as 1*number, which fulfills the first condition. Assume we have a badness function (say a >= b && a <= 2 * b), then factorize N and try different combinations to find the best one. If none is good enough, try with N+1 and so on.
Edit2: after thinking a bit more I came up with this solution, in Python:
from math import sqrt
def isok(a, b):
    """accept difference of five - 2nd rule"""
    return a <= b + 5
def improve(a, b, N):
    """improve result:
    if a == b:
        (a+1)*(b-1) = a^2 - 1 < a*a
    otherwise (a - 1 >= b as a is always larger)
        (a+1)*(b-1) = a*b - a + b - 1 <= a*b
    On each iteration the new a*b will be smaller;
    continue while it still covers N and the 2nd condition is still met
    """
    while (a+1) * (b-1) >= N and isok(a+1, b-1):
        a, b = a + 1, b - 1
    return (a, b)
def decomposite(N):
    a = int(sqrt(N))
    b = a
    # N is square, result is ok
    if a * b >= N:
        return (a, b)
    a += 1
    if a * b >= N:
        return improve(a, b, N)
    return improve(a, b+1, N)
def test(N):
    (a, b) = decomposite(N)
    # print() with a single formatted string works in Python 2 and 3
    print("%d decomposed as %d * %d = %d" % (N, a, b, a*b))
for x in [99, 100, 101, 20, 21, 22, 23]:
    test(x)
which outputs
99 decomposed as 11 * 9 = 99
100 decomposed as 10 * 10 = 100
101 decomposed as 13 * 8 = 104
20 decomposed as 5 * 4 = 20
21 decomposed as 7 * 3 = 21
22 decomposed as 6 * 4 = 24
23 decomposed as 6 * 4 = 24
I think this may work (your conditions are somewhat ambiguous). This solution is somewhat similar to the other one: it basically produces a rectangular matrix which is almost square.
You may need to prove that A+2 is never the optimal choice.
A0 = B0 = ceil (sqrt N)
A1 = A0+1
B1 = B0-1
if A1*B1 >= N and A0*B0-N > A1*B1-N: return (A1,B1)
return (A0,B0)
This is the solution if the first condition is dominant (and the second condition is not used):
A0 = B0 = ceil (sqrt N)
if A0*B0==N: return (A0,B0)
return (N,1)
Other variations of the conditions will fall in between
A = B = ceil (sqrt N)

Compress two or more numbers into one byte

I think this is not really possible but worth asking anyway. Say I have two small numbers (each ranges from 0 to 11). Is there a way that I can compress them into one byte and get them back later? How about with four numbers of similar sizes?
What I need is something like: a1 + a2 = x. I only know x and from that get a1, a2
For the second part: a1 + a2 + a3 + a4 = x. I only know x and from that get a1, a2, a3, a4
Note: I know you cannot unadd, just illustrating my question.
x must be one byte. a1, a2, a3, a4 range [0, 11].
That's trivial with bit masks. The idea is to divide the byte into smaller units and dedicate them to different elements.
For 2 numbers, it can be like this: first 4 bits are number1, rest are number2. You would use number1 = (x & 0b11110000) >> 4, number2 = (x & 0b00001111) to retrieve values, and x = (number1 << 4) | number2 to compress them.
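As a small Python sketch of the same masks and shifts (the function names are invented for illustration, and each value is assumed to fit in 4 bits, i.e. 0..15):

```python
def pack_nibbles(number1, number2):
    """Store two values of 0..15 in one byte: number1 in the high 4 bits."""
    if not (0 <= number1 <= 15 and 0 <= number2 <= 15):
        raise ValueError("each value must fit in 4 bits")
    return (number1 << 4) | number2


def unpack_nibbles(x):
    """Recover (number1, number2) from a packed byte."""
    return (x & 0b11110000) >> 4, x & 0b00001111


packed = pack_nibbles(11, 7)
print(packed)                  # 183, i.e. 0xB7
print(unpack_nibbles(packed))  # (11, 7)
```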
For two numbers, sure. Each one has 12 possible values, so the pair has a total of 12^2 = 144 possible values, and that's less than the 256 possible values of a byte. So you could do e.g.
x = 12*a1 + a2
a1 = x / 12
a2 = x % 12
(If you only have signed bytes, e.g. in Java, it's a little trickier)
For four numbers from 0 to 11, there are 12^4 = 20736 values, so you couldn't fit them in one byte, but you could do it with two.
x = 12^3*a1 + 12^2*a2 + 12*a3 + a4
a1 = x / 12^3
a2 = (x / 12^2) % 12
a3 = (x / 12) % 12
a4 = x % 12
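The base-12 arithmetic above can be sketched in Python as follows; `pack12` and `unpack12` are illustrative names, and integer division (`//`) plays the role of `/` in the formulas.

```python
def pack12(values):
    """Treat the values (each 0..11) as digits of one base-12 number, a1 first."""
    x = 0
    for v in values:
        if not 0 <= v < 12:
            raise ValueError("values must be in 0..11")
        x = x * 12 + v
    return x


def unpack12(x, count):
    """Inverse of pack12 for a known number of values."""
    out = []
    for _ in range(count):
        out.append(x % 12)
        x //= 12
    return out[::-1]           # digits come out last-first, so reverse them


print(pack12([11, 11]))              # 143, fits in one byte
x = pack12([3, 0, 11, 7])
print(x, x.to_bytes(2, "big"))       # four values need two bytes
print(unpack12(x, 4))                # [3, 0, 11, 7]
```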
EDIT: the other answers talk about storing one number per four bits and using bit-shifting. That's faster.
The 0-11 example is pretty easy -- you can store each number in four bits, so putting them into a single byte is just a matter of shifting one 4 bits to the left, and oring the two together.
Four numbers of similar sizes won't fit -- four bits apiece times four gives a minimum of 16 bits to hold them.
Let's say it in general: suppose you want to mix N numbers a1, a2, ... aN, a1 ranging from 0..k1-1, a2 from 0..k2-1, ... and aN from 0 .. kN-1.
Then, the encoded number is:
encoded = a1 + k1*a2 + k1*k2*a3 + ... k1*k2*..*k(N-1)*aN
The decoding is then more tricky, stepwise:
rest = encoded
a1 = rest mod k1
rest = rest div k1
a2 = rest mod k2
rest = rest div k2
...
a(N-1) = rest mod k(N-1)
rest = rest div k(N-1)
aN = rest # rest is already < kN
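The general mixed-radix scheme can be sketched in Python like this, with `radices` standing for the list (k1, ..., kN); the function names are chosen for illustration.

```python
def encode(values, radices):
    """encoded = a1 + k1*a2 + k1*k2*a3 + ..., with each ai in 0..ki-1."""
    encoded, weight = 0, 1
    for a, k in zip(values, radices):
        if not 0 <= a < k:
            raise ValueError("value out of range for its radix")
        encoded += weight * a
        weight *= k              # weight is k1*k2*...*k(i-1) for the next value
    return encoded


def decode(encoded, radices):
    """Peel the values off again with repeated mod / div."""
    values = []
    rest = encoded
    for k in radices:
        values.append(rest % k)
        rest //= k
    return values


code = encode([2, 0, 5], [3, 2, 6])
print(code)                     # 32
print(decode(code, [3, 2, 6]))  # [2, 0, 5]
```

The largest possible code is k1*k2*...*kN - 1, so the result fits in one byte exactly when the product of the radices is at most 256.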
If the numbers 0-11 aren't evenly distributed you can do even better by using shorter bit sequences for common values and longer ones for rarer values. It costs at least one bit to code which length you are using so there is a whole branch of CS devoted to proving when it's worth doing.
So a byte can hold up to 256 values, or FF in hex. So you can encode two numbers from 0-15 in a byte.
byte a1 = 0xf;
byte a2 = 0x9;
byte compress = a1 << 4 | (0x0F & a2); // should yield 0xf9 in one byte.
Four numbers you can do if you reduce them to the 0-3 range only.
Since a single byte is 8 bits, you can easily subdivide it, with smaller ranges of values. The extreme limit of this is when you have 8 single bit integers, which is called a bit field.
If you want to store two 4-bit integers (which gives you 0-15 for each), you simply have to do this:
value = a * 16 + b;
As long as you do proper bounds checking, you will never lose any information here.
To get the two values back, you just have to do this:
a = floor(value / 16)
b = value MOD 16
MOD is modulus, it's the "remainder" of a division.
If you want to store four 2-bit integers (0-3), you can do this:
value = a * 64 + b * 16 + c * 4 + d
And, to get them back:
a = floor(value / 64)
b = floor(value / 16) MOD 4
c = floor(value / 4) MOD 4
d = value MOD 4
I leave the last division as an exercise for the reader ;)
@Mike Caron
your last example (4 integers between 0-3) is much faster with bit-shifting. No need for floor().
value = (a << 6) | (b << 4) | (c << 2) | d;
a = (value >> 6);
b = (value >> 4) % 4;
c = (value >> 2) % 4;
d = (value) % 4;
Use bit masking or bit shifting. The latter is faster.
Test out BinaryTrees for some fun. (It will come in handy later on in dev life regarding data and all sorts of dev voodoo lol)
Packing four values into one number will require at least 15 bits. This doesn't fit in a single byte, but in two.
What you need to do is a conversion from base 12 to base 65536 and conversely.
B = A1 + 12.(A2 + 12.(A3 + 12.A4))
A1 = B % 12
A2 = (B / 12) % 12
A3 = (B / 144) % 12
A4 = B / 1728
As this takes 2 bytes anyway, conversion from base 12 to (packed) base 16 is by far preferable.
B1 = A1 + 16.A2
B2 = A3 + 16.A4
A1 = B1 % 16
A2 = B1 / 16
A3 = B2 % 16
A4 = B2 / 16
The modulos and divisions are implemented by maskings and shifts.
0-9 works much easier. You can easily store 11 random-order decimals in 4 1/2 bytes, which is tighter compression than log(256)÷log(10). Just by creative mapping. Remember, not all compression has to do with dictionaries, redundancies, or sequences.
If you are talking of random numbers 0 - 9 you can have 4 digits per 14 bits not 15.
