How to divide a pie with constraints - algorithm

How do I divide a pie with constraints?
Hi, I have a round pie and would like to divide it, but I am not able to figure out how to do it.
I have four friends: A, B, C, D.
I want to divide the pie based on how much I like them, so based on opinion.
Slice sizes:
A= 1/22 of pie
B= 10/22 of pie
C= 1/22 of pie
D= 10/22 of pie.
How do I divide the pie when there are some constraints?
For example, B will tell me he wants < 10% of the whole pie, and
D must have at least 85% of the pie.
A and C don't care.
In this case I can say: OK, D wants at least 85%, and 22 * 0.85 = 18.7, so he will get that.
Now I have only the rest of the pie, 22 - 18.7 = 3.3 (= 15%), to divide, and I don't want to give B a slice bigger than 10%. And I still want to apply the ratios I proposed, but only to the rest of the pie, since D must have at least 85%.
I think the ratios should be applied after the constraints are resolved.
A has no constraint, so he can get 0-100%.
B wants 0-10%.
C has no constraint, so 0-100%.
D wants at least 85%, so 85-100%.
I can apply the ratios to the slices under these constraints:
when B wants 0-10%, the ratio will influence the size within 0-10%,
and for D the ratio will influence the size within 85-100%.
| friends:     | A      | B        | C      | D         |
| constraints: |        | <= 0.1   |        | >= 0.85   |
| ranges:      | 0 to 1 | 0 to 0.1 | 0 to 1 | 0.85 to 1 |
| ratios:      | 1/22   | 10/22    | 1/22   | 10/22     |
Hopefully the problem is understandable. In the end I want the whole pie divided among A, B, C, D with no constraints violated, and with the ratios somehow applied.

Let me propose a formalization of this problem as a quadratic program.
Let x be the desired result. We want to minimize the ratio-weighted sum of squares sum_i x_i^2 / ratio_i (the squared L2 norm of x / sqrt(ratio), element-wise) subject to lower ≤ x ≤ upper (element-wise) and x·1 = 1.
The idea behind this objective is that, by examining the optimality
conditions, we can show that there exists some scalar z such that x
= median(lower, ratio·z, upper) element-wise; in other words, each share is its ratio scaled by a common factor z and then clamped to its bounds.
Below is some very poorly tested Python 3 code to approximately solve
this quadratic program.
from fractions import Fraction

ratio = [1, 10, 2, 10]
lower = [0, 0, 0, 85]
upper = [100, 10, 100, 100]

# Want to minimize the L2 norm of x / ratio subject to lower <= x <= upper and
# sum(x) == 100

# Validation
assert 0 < len(ratio) == len(lower) == len(upper)
assert all(0 < r for r in ratio)
assert all(0 <= l <= u <= 100 for (l, u) in zip(lower, upper))
assert sum(lower) <= 100 <= sum(upper)

# Binary search for the critical scaling factor z such that the clamped
# shares sum to 100
n = len(ratio)
critical = sorted(
    {Fraction(bound[i], ratio[i]) for bound in [lower, upper] for i in range(n)}
)
a = 0
b = len(critical)
while b - a > 1:
    m = (a + b) // 2
    z = critical[m]
    if sum(sorted([lower[i], ratio[i] * z, upper[i]])[1] for i in range(n)) <= 100:
        a = m
    else:
        b = m

# Clamp the shares that hit their bounds; distribute the remainder
# proportionally among the unclamped ones
x = [0] * n
z = critical[a]
divisor = 0
for i in range(n):
    value = ratio[i] * z
    if value < lower[i]:
        x[i] = lower[i]
    elif upper[i] <= value:
        x[i] = upper[i]
    else:
        divisor += ratio[i]
dividend = 100 - sum(x)
for i in range(n):
    if lower[i] <= ratio[i] * z < upper[i]:
        x[i] = Fraction(ratio[i], divisor) * dividend
print(x)
Output:
[Fraction(5, 3), 10, Fraction(10, 3), 85]
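As a quick sanity check (my addition, run right after the code above), the result can be verified against the constraints:

assert sum(x) == 100
assert all(l <= v <= u for v, l, u in zip(x, lower, upper))

Both hold here: the shares sum to the whole pie and every bound is respected.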

Related

Finding natural numbers having n Trailing Zeroes in Factorial

I need help with the following problem.
Given an integer m, I need to find the number of positive integers n, and the integers themselves, such that the factorial of n ends with exactly m zeroes.
I wrote this code; it works fine and I get the right output, but it takes way too much time as the numbers increase.
a = input()
while a:
    x = []
    m, n, fact, c, j = input(), 0, 1, 0, 0
    z = 10*m
    t = 10**m
    while z - 1:
        fact = 1
        n = n + 1
        for i in range(1, n + 1):
            fact = fact * i
        if fact % t == 0 and ((fact / t) % 10) != 0:
            x.append(int(n))
            c = c + 1
        z = z - 1
    for p in range(c):
        print x[p],
    a -= 1
    print c
Could someone suggest a more efficient way to do this? Presently, it takes 30 seconds for a test case asking for the numbers with 250 trailing zeroes in their factorials.
Thanks
To get the number of trailing zeroes of n! efficiently, you can use:

def zeroes(value):
    result = 0
    d = 5
    while d <= value:
        result += value // d   # integer division
        d *= 5
    return result
...
# 305: 1234! has exactly 305 trailing zeroes
print zeroes(1234)
In order to solve the problem (which numbers have m trailing zeroes in their factorials) you can use these facts:
the number of zeroes is a monotonic function: f(x + a) >= f(x) if a >= 0;
if f(x) = y then x <= 5*y + 4 (counting only the factors of 5 gives f(x) >= floor(x/5));
if f(x) = y then x >= 4*y (let me leave this for you to prove).
Then implement a binary search (on the monotonic function), as sketched below.
E.g. in the case of 250 zeroes we have the initial range to test [4*250 .. 5*250+4] == [1000..1254]. The binary search narrows the range down to [1005..1009]:
1005, 1006, 1007, 1008, 1009 are all the numbers that have exactly 250 trailing zeroes in their factorials.
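A minimal sketch of that binary search (my addition, reusing the zeroes() function from above):

def numbers_with_m_zeroes(m):
    # find the smallest n with zeroes(n) >= m, by binary search on the
    # monotonic function zeroes() over the range [4*m, 5*m]
    lo, hi = 4 * m, 5 * m
    while lo < hi:
        mid = (lo + hi) // 2
        if zeroes(mid) < m:
            lo = mid + 1
        else:
            hi = mid
    if zeroes(lo) != m:      # some counts are skipped entirely, e.g. m = 5
        return []
    # at most 5 consecutive n share the same count of trailing zeroes
    return [n for n in range(lo, lo + 5) if zeroes(n) == m]

print(numbers_with_m_zeroes(250))  # [1005, 1006, 1007, 1008, 1009]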
Edit: I hope I don't spoil the fun if I (after 2 years) prove the last conjecture (see the comments below).
Each 5**n within the factorial, when multiplied by 2**n, produces 10**n and thus n zeroes; that's why f(x) is
f(x) = [x / 5] + [x / 25] + [x / 125] + ... + [x / 5**n] + ...
where [...] stands for floor, or integer part (e.g. [3.1415926] == 3). Let's perform some easy manipulations:
f(x) = [x / 5] + [x / 25] + [x / 125] + ... + [x / 5**n] + ...    # removing [...]
    <= x / 5 + x / 25 + x / 125 + ... + x / 5**n + ...
     = x * (1/5 + 1/25 + 1/125 + ... + 1/5**n + ...)              # geometric series
     = x * (1/5 * 1/(1 - 1/5))
     = x * 1/5 * 5/4
     = x / 4
So far so good:
f(x) <= x / 4
Or, if y = f(x), then x >= 4 * y. Q.E.D.
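A quick empirical check of both bounds (my addition, again using zeroes() from above):

for x in range(1, 100000):
    y = zeroes(x)
    assert 4 * y <= x        # f(x) <= x/4, proved above
    assert x <= 5 * y + 4    # from f(x) >= floor(x/5)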
Focus on the number of 2s and 5s that make up a number. E.g. 150 is made up of 2*3*5*5; there is one pair of 2 and 5, so there's one trailing zero. Each time you increase the tested number, figure out how many 2s and 5s are in it. From that, adding up the previous results, you can easily know how many zeros its factorial contains.
For example, for 15! = 15*...*5*4*3*2*1, starting from 2:
Number  2s  5s  trailing zeros of factorial
   2     1   0   0
   3     1   0   0
   4     2   0   0
   5     2   1   1
   6     3   1   1
  ...
  10     5   2   2
  ...
  15     7   3   3
  ...
  24    12   4   4
  25    12   6   6   <- 25 counts for two 5s: 25 == 5 * 5 == 5**2
  26    13   6   6
  ...
Refer to Peter de Rivaz's and Dmitry Bychenko's comments; they have some good advice.

Is there any way of optimizing a multiplication loop?

Let's say I have to repeat the process of multiplying a variable by a constant and taking the result modulo another constant, n times, to get my desired result.
The obvious solution is iterating n times, but it gets time-consuming as n grows.
Code example:
const N = 1000000;
const A = 123;
const B = 456;
var c = 789;
for (var i = 0; i < N; i++)
{
    c = (c * A) % B;
}
console.log("Total: " + c);
Is there any algebraic solution to optimize this loop?
% has two useful properties:
1) (x % b) % b = x % b
2) (c*a) % b = ((c%b) * (a%b))%b
This implies that e.g.
(((c*a)%b)*a) % b = ((((c*a)%b)%b) * (a%b)) % b
= (((c*a) % b) * (a%b)) % b
= (c*a*a) % b
= (c*a^2) % b
Hence, in your case the final c that you compute is equivalent to
(c*a^n)%b
This can be computed efficiently using exponentiation by squaring.
To illustrate this equivalence:
def f(a, b, c, n):
    for i in range(n):
        c = (c*a) % b
    return c

def g(a, b, c, n):
    return (c * pow(a, n, b)) % b

a = 123
b = 456
c = 789
n = 10**6
print(f(a, b, c, n), g(a, b, c, n))  # prints 261 261
First, note that c * A^n is never an exact multiple of B = 456 since the former is always odd and the latter is always even. You can generalize this by considering the prime factorizations of the numbers involved and see that no repetition of the factors of c and A will ever give you something that contains all the factors of B. This means c will never turn into 0 as a result of the iterated multiplication.
There are only 456 possible values for c * a mod B = 456; therefore, if you iterate the loop 456 times, you will see at least one value of c repeated. Suppose the first value of c to repeat is c', at i = i', and say we first saw c' at i = i''. By continuing to iterate the multiplication, we would expect to see c' again:
we saw it at i''
we saw it at i'
we should see it at i' + (i' - i'')
we should see it at i' + k(i' - i'') as well
Once you detect a repeat, you know the pattern repeats forever. Therefore, you can compute how many full cycles fit below N and the offset within the repeating pattern for i = N - 1, and then you know the answer without actually performing the multiplications. A sketch is given after the example below.
A simpler example:
A = 2
B = 3
C = 5
c[0] = 5
c[1] = 5 * 2 % 3 = 1
c[2] = 1 * 2 % 3 = 2
c[3] = 2 * 2 % 3 = 1 <= duplicate
i' = 3
i'' = 1
cycle length: i' - i'' = 2
repeating pattern: 1, 2
c[1 + 2k] = 1
c[2 + 2k] = 2
10,000 = 2 + 2k for k = 4,999
c[10,000] = 2
c[10,001] = 1
c[10,002] = 2
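Here is that idea as Python code (my own sketch; seen maps each value of c to the first index at which it appeared):

def iterated_mulmod(a, b, c, n):
    # compute c * a**n % b; at most b distinct values of c exist,
    # so a repeat must occur within b steps
    seen = {c: 0}
    seq = [c]
    for i in range(1, n + 1):
        c = (c * a) % b
        if c in seen:                    # cycle detected
            start = seen[c]              # i'' in the notation above
            period = i - start           # i' - i''
            return seq[start + (n - start) % period]
        seen[c] = i
        seq.append(c)
    return c                             # reached n before any repeat

print(iterated_mulmod(2, 3, 5, 10000))        # 2, as computed above
print(iterated_mulmod(123, 456, 789, 10**6))  # 261, matching the pow() answer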

Bradley Adaptive Thresholding -- Confused (questions)

I have some questions, probably stupid, about the implementation of adaptive thresholding by Bradley. I have read the paper about it, http://people.scs.carleton.ca:8008/~roth/iit-publications-iti/docs/gerh-50002.pdf, and I am a bit confused, mainly about this statement:
if ((in[i,j]*count) ≤ (sum*(100−t)/100)) then
Let's assume that we have this input:
              width, i
             [0] [1] [2]
            +---+---+---+
height  [0] | 1 | 2 | 2 |
   j        +---+---+---+
        [1] | 3 | 4 | 3 |
            +---+---+---+
        [2] | 5 | 3 | 2 |
            +---+---+---+
and let's say that:
s = 2
s/2 = 1
t = 15
i = 1
j = 1 (we are at the center pixel)
So that means we have a window 3x3, right? Then:
x1 = 0, x2 = 2, y1 = 0, y2 = 2
What is count then? If it is the number of pixels in the window, why is it 2*2 = 4 instead of 3*3 = 9 according to the algorithm? Further, why is the original value of the pixel multiplied by the count?
The paper says that the value is compared to the average value of the surrounding pixels, so why isn't it
in[i,j] <= (sum/count) * ((100 - t) / 100)
then?
Can somebody please explain this to me? It is probably a very stupid question, but I can't figure it out.
Before we start, let's present the pseudocode of the algorithm written in their paper:
procedure AdaptiveThreshold(in, out, w, h)
 1: for i = 0 to w do
 2:   sum ← 0
 3:   for j = 0 to h do
 4:     sum ← sum + in[i, j]
 5:     if i = 0 then
 6:       intImg[i, j] ← sum
 7:     else
 8:       intImg[i, j] ← intImg[i−1, j] + sum
 9:     end if
10:   end for
11: end for
12: for i = 0 to w do
13:   for j = 0 to h do
14:     x1 ← i − s/2 {border checking is not shown}
15:     x2 ← i + s/2
16:     y1 ← j − s/2
17:     y2 ← j + s/2
18:     count ← (x2 − x1) × (y2 − y1)
19:     sum ← intImg[x2, y2] − intImg[x2, y1−1] − intImg[x1−1, y2] + intImg[x1−1, y1−1]
20:     if (in[i, j] × count) ≤ (sum × (100 − t)/100) then
21:       out[i, j] ← 0
22:     else
23:       out[i, j] ← 255
24:     end if
25:   end for
26: end for
intImg is the integral image of the input image to threshold, assuming grayscale.
I've implemented this algorithm with success, so let's talk about your doubts.
What is count then? If it is the number of pixels in the window, why is it 2*2 = 4 instead of 3*3 = 9 according to the algorithm?
There is an underlying assumption in the paper that they don't talk about. The value of s needs to be odd, and the windowing should be:
x1 = i - floor(s/2)
x2 = i + floor(s/2)
y1 = j - floor(s/2)
y2 = j + floor(s/2)
count is certainly the total number of pixels in the window, but you also need to make sure that you don't go out of bounds. What you have there should certainly be a 3 x 3 window, and so s = 3, not 2. Now, with s = 3, if we were to choose i = 0, j = 0, we would have x and y values that are negative. We can't have this, so the total number of valid pixels within the 3 x 3 window centred at i = 0, j = 0 is 4, and so count = 4. For windows that are fully within the bounds of the image, count would be 9.
Further, why is the original value of the pixel multiplied by the count? The paper says that the value is compared to the average value of the surrounding pixels, so why isn't it:
in[i,j] <= (sum/count) * ((100 - t) / 100)
then?
The condition you're looking at is at line 20 of the algorithm:
20: (in[i, j]×count) ≤ (sum×(100−t)/100)
The reason we look at in[i,j] × count is that we pretend every pixel in the s x s window has the intensity in[i,j]; under that assumption, adding up all of the intensities in the window gives exactly in[i,j] × count. The algorithm is quite ingenious: it compares this assumed total (in[i,j] × count) against t% less than the actual total within the s x s window (sum × (100−t)/100). If the assumed total is less, the output is set to black; if it is larger, the output is set to white. However, you have eloquently stated that it should be this instead:
in[i,j] <= (sum/count) * ((100 - t) / 100)
This is essentially the same as line 20, just with both sides of the inequality divided by count, so it's still the same comparison. I would say it explicitly states what I talked about above: the multiplication by count is certainly confusing, and what you have written makes more sense.
Therefore, you're just seeing it a different way, and that's totally fine! So to answer your question: what you have stated is certainly correct and equivalent to the expression seen in the actual algorithm.
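For reference, here is a minimal NumPy sketch of this pseudocode (my own illustration, not the paper's reference code). It indexes rows first, i.e. in[row, col], pads the integral image with a zero row and column so the y1−1 / x1−1 lookups need no special-casing, and clamps the window at the borders so count shrinks exactly as discussed above:

import numpy as np

def bradley_threshold(img, s=3, t=15):
    img = img.astype(np.int64)
    h, w = img.shape
    half = s // 2                              # s is assumed odd
    # padded integral image: I[y+1, x+1] = sum of img[:y+1, :x+1]
    I = np.zeros((h + 1, w + 1), dtype=np.int64)
    I[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    out = np.zeros((h, w), dtype=np.uint8)
    for i in range(h):
        y1, y2 = max(i - half, 0), min(i + half, h - 1)
        for j in range(w):
            x1, x2 = max(j - half, 0), min(j + half, w - 1)
            count = (y2 - y1 + 1) * (x2 - x1 + 1)     # 4 at corners, 9 inside
            total = (I[y2 + 1, x2 + 1] - I[y1, x2 + 1]
                     - I[y2 + 1, x1] + I[y1, x1])
            # in[i,j] * count <= sum * (100 - t) / 100, kept in exact integers
            if img[i, j] * count * 100 <= total * (100 - t):
                out[i, j] = 0
            else:
                out[i, j] = 255
    return out

Both sides of the comparison are multiplied by 100 so the threshold test stays in integer arithmetic.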
Hope this helps!

Segmented Least Squares

Give an algorithm that takes a sequence of points in the plane (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) and an integer k as input and returns the best piecewise linear function f consisting of at most k pieces that minimizes the sum squared error. You may assume that you have access to an algorithm that computes the sum squared error for one segment through a set of n points in Θ(n) time. The solution should use O(n^2 k) time and O(nk) space.
Can anyone help me with this problem? Thank you so much!
(This is too late for your homework, but I hope it helps anyway.)
First is dynamic programming in Python / NumPy for k = 4 only, to help you understand how dynamic programming works; once you understand that, writing a loop for any k should be easy (a general-k sketch appears at the end of this thread).
Also, Cost[] is a 2d matrix, space O(n^2); see the notes at the end for getting down to space O(nk).
#!/usr/bin/env python
""" split4.py: min-cost split into 4 pieces, dynamic programming k=4 """

from __future__ import division
import numpy as np

__version__ = "2014-03-09 mar denis"

#...............................................................................
def split4( Cost, verbose=1 ):
    """ min-cost split into 4 pieces, dynamic programming k=4
        min Cost[0:a] + Cost[a:b] + Cost[b:c] + Cost[c:n]
        Cost[a,b] = error in least-squares line fit to xy[a] .. xy[b] *including b*
            or error in lsq horizontal lines, sum (y_j - av y)^2 for each piece --
        o--
           o-
             o---
                 o----
        |  |  |  |
        0  2  5  9
        (Why 4 ? to walk through step by step, then put in a loop)
    """
    # speedup: maxlen 2 n/k or so
    Cost = np.asanyarray(Cost)
    n = Cost.shape[1]
    # C2 C3 ... costs, J2 J3 ... indices of best splits
    J2 = - np.ones(n, dtype=int)  # -1, NaN mark undefined / bug
    C2 = np.ones(n) * np.NaN
    J3 = - np.ones(n, dtype=int)
    C3 = np.ones(n) * np.NaN

    # best 2-splits of the left 2 3 4 ...
    for nleft in range( 1, n ):
        J2[nleft] = j = np.argmin([ Cost[0,j-1] + Cost[j,nleft] for j in range( 1, nleft+1 )]) + 1
        C2[nleft] = Cost[0,j-1] + Cost[j,nleft]
        # an idiom for argmin j, min value c together

    # best 3-splits of the left 3 4 5 ...
    for nleft in range( 2, n ):
        J3[nleft] = j = np.argmin([ C2[j-1] + Cost[j,nleft] for j in range( 2, nleft+1 )]) + 2
        C3[nleft] = C2[j-1] + Cost[j,nleft]

    # best 4-split of all n --
    j4 = np.argmin([ C3[j-1] + Cost[j,n-1] for j in range( 3, n )]) + 3
    c4 = C3[j4-1] + Cost[j4,n-1]
    j3 = J3[j4]
    j2 = J2[j3]
    jsplit = np.array([ 0, j2, j3, j4, n ])
    if verbose:
        print "split4: len %s pos %s cost %.3g" % (np.diff(jsplit), jsplit, c4)
        print "split4: J2 %s C2 %s" % (J2, C2)
        print "split4: J3 %s C3 %s" % (J3, C3)
    return jsplit

#...............................................................................
if __name__ == "__main__":
    import random
    import sys
    import spread

    n = 10
    ncycle = 2
    plot = 0
    seed = 0
    # run this.py a=1 b=None c=[3] 'd = expr' ... in sh or ipython
    for arg in sys.argv[1:]:
        exec( arg )
    np.set_printoptions( 1, threshold=100, edgeitems=10, linewidth=100, suppress=True )
    np.random.seed(seed)
    random.seed(seed)

    print "\n", 80 * "-"
    title = "Dynamic programming least-square horizontal lines %s n %d seed %d" % (
        __file__, n, seed)
    print title
    x = np.arange( n + 0. )
    y = np.sin( 2*np.pi * x * ncycle / n )  # synthetic time series ?
    print "y: %s av %.3g variance %.3g" % (y, y.mean(), np.var(y))
    print "Cost[j,k] = sum (y - av y)^2 --"  # len * var y[j:k+1]
    Cost = spread.spreads_allij( y )
    print Cost  # .round().astype(int)
    jsplit = split4( Cost )
    # split4: len [3 2 3 2] pos [ 0 3 5 8 10]
    if plot:
        import matplotlib.pyplot as pl
        title += "\n lengths: %s" % np.diff(jsplit)
        pl.title( title )
        pl.plot( y )
        for js, js1 in zip( jsplit[:-1], jsplit[1:] ):
            if js1 <= js:  continue
            yav = y[js:js1].mean() * np.ones( js1 - js + 1 )
            pl.plot( np.arange( js, js1 + 1 ), yav )
        # pl.legend()
        pl.show()
Then, the following code computes Cost[] for horizontal lines only, slope 0; extending it to line segments of any slope, in time O(n), is left as an exercise.
""" spreads( all y[:j] ) in time O(n)
define spread( y[] ) = sum (y - average y)^2
e.g. spread of 24 hourly temperatures y[0:24] i.e. y[0] .. y[23]
around a horizontal line at the average temperature
(spread = 0 for constant temperature,
24 c^2 for constant + [c -c c -c ...],
24 * variance(y) )
How fast can one compute all 24 spreads
1 hour (midnight to 1 am), 2 hours ... all 24 ?
A simpler problem: compute all 24 averages in time O(n):
N = np.arange( 1, len(y)+1 )
allav = np.cumsum(y) / N
= [ y0, (y0 + y1) / 2, (y0 + y1 + y2) / 3 ...]
An identity:
spread(y) = sum(y^2) - n * (av y)^2
Voila: the code below, all spreads() in time O(n).
Exercise: extend this to spreads around least-squares lines
fit to [ y0, [y0 y1], [y0 y1 y2] ... ], not just horizontal lines.
"""
from __future__ import division
import sys
import numpy as np
#...............................................................................
def spreads( y ):
""" [ spread y[:1], spread y[:2] ... spread y ] in time O(n)
where spread( y[] ) = sum (y - average y )^2
= n * variance(y)
"""
N = np.arange( 1, len(y)+1 )
return np.cumsum( y**2 ) - np.cumsum( y )**2 / N
def spreads_allij( y ):
""" -> A[i,j] = sum (y - av y)^2, spread of y around its average
for all y[i:j+1]
time, space O(n^2)
"""
y = np.asanyarray( y, dtype=float )
n = len(y)
A = np.zeros((n,n))
for i in range(n):
A[i,i:] = spreads( y[i:] )
return A
So far we have an n x n cost matrix, space O(n^2).
To get down to space O(nk), look closely at the pattern of Cost[i,j] accesses in the dyn-prog code:

for nleft .. to n:
    Cost_nleft = Cost[j,nleft]  -- time nleft or nleft^2
    for k in 3 4 5 ...:
        min [ C[k-1, j-1] + Cost_nleft[j] for j .. to nleft ]

Here Cost_nleft is one row of the full n x n cost matrix, ~ n segments, generated as needed.
This can be done in time O(n) for line segments.
But if "the error for one segment through a set of n points takes O(n) time",
it seems we're up to time O(n^3). Comments, anyone?
If you can do least squares for some segment in n^2, it's easy to do what you want in n^2 k^2 with dynamic programming. You might be able to optimize that down to a single factor of k.
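For reference, here is a minimal sketch of the general-k dynamic program in O(n^2 k) time and O(nk) space (my own illustration, using the same convention as split4 above: Cost[i, j] is the one-segment error over points i..j inclusive):

import numpy as np

def segmented_ls(Cost, k):
    Cost = np.asarray(Cost)
    n = Cost.shape[1]
    # C[p, j] = best cost covering points 0..j with exactly p pieces
    C = np.full((k + 1, n), np.inf)
    J = np.zeros((k + 1, n), dtype=int)   # J[p, j] = start of the last piece
    C[1] = Cost[0]                        # one piece covering 0..j
    for p in range(2, k + 1):
        for j in range(p - 1, n):
            for i in range(p - 1, j + 1):
                c = C[p - 1, i - 1] + Cost[i, j]
                if c < C[p, j]:
                    C[p, j], J[p, j] = c, i
    # best over "at most k" pieces, then backtrack the split positions
    p = int(np.argmin(C[1:, n - 1])) + 1
    splits, j = [n], n - 1
    while p > 1:
        i = int(J[p, j])
        splits.append(i)
        j, p = i - 1, p - 1
    splits.append(0)
    return splits[::-1]      # e.g. [0, j2, j3, j4, n] for k = 4

The DP tables are the O(nk) part; the Cost matrix itself is O(n^2), which is exactly what generating one row of Cost at a time, as discussed above, avoids.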

How to decompose an integer in two for grid creation

Given an integer N I want to find two integers A and B that satisfy A × B ≥ N with the following conditions:
The difference between A × B and N is as low as possible.
The difference between A and B is as low as possible (to approach a square).
Example: 23. Possible solutions: 3 × 8, 6 × 4, 5 × 5. 6 × 4 is the best since it leaves just one empty space in the grid and is "less" rectangular than 3 × 8.
Another example: 21. Solutions: 3 × 7 and 4 × 6. 3 × 7 is the desired one.
A brute-force solution is easy; I would like to see if a clever solution is possible.
Easy.
In pseudocode
a = b = floor(sqrt(N))
if (a * b >= N) return (a, b)
a += 1
if (a * b >= N) return (a, b)
return (a, b+1)
and it will always terminate; the distance between a and b is at most 1. A direct Python transcription is below.
It will be much harder if you relax the second constraint, but that's another question.
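For concreteness, a direct Python transcription of the pseudocode (my addition; isqrt is the floor square root):

from math import isqrt

def near_square(N):
    a = b = isqrt(N)            # floor(sqrt(N))
    if a * b >= N:              # N is a perfect square
        return (a, b)
    a += 1                      # try (floor+1) x floor
    if a * b >= N:
        return (a, b)
    return (a, b + 1)           # (floor+1)^2 >= N always holds

print(near_square(23))  # (5, 5): squarest cover, though 6 x 4 wastes less

Note it optimizes squareness first, so for N = 23 it returns (5, 5) rather than the question's preferred 6 × 4; the edit below addresses exactly this trade-off.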
Edit: as it seems that the first condition is more important, you have to attack the problem a bit differently. You have to specify some method to measure the badness of not being square enough (the 2nd condition), because even a prime number can be factorized as 1 * number, which fulfills the first condition perfectly. Assume we have a badness predicate (say a >= b && a <= 2 * b); then factorize N and try different combinations to find the best one. If there aren't any good enough, try with N + 1, and so on.
Edit 2: after thinking a bit more, I came up with this solution, in Python:
from math import sqrt

def isok(a, b):
    """accept difference of five - 2nd rule"""
    return a <= b + 5

def improve(a, b, N):
    """improve result:
       if a == b:
           (a+1)*(b-1) = a^2 - 1 < a*a
       otherwise (a - 1 >= b as a is always larger):
           (a+1)*(b-1) = a*b - a + b - 1 <= a*b
       On each iteration the new a*b will be less;
       continue while we can and the 2nd condition is still met
    """
    while (a+1) * (b-1) >= N and isok(a+1, b-1):
        a, b = a + 1, b - 1
    return (a, b)

def decomposite(N):
    a = int(sqrt(N))
    b = a
    # N is a square, result is ok
    if a * b >= N:
        return (a, b)
    a += 1
    if a * b >= N:
        return improve(a, b, N)
    return improve(a, b+1, N)

def test(N):
    (a, b) = decomposite(N)
    print "%d decomposed as %d * %d = %d" % (N, a, b, a*b)

[test(x) for x in [99, 100, 101, 20, 21, 22, 23]]
which outputs
99 decomposed as 11 * 9 = 99
100 decomposed as 10 * 10 = 100
101 decomposed as 13 * 8 = 104
20 decomposed as 5 * 4 = 20
21 decomposed as 7 * 3 = 21
22 decomposed as 6 * 4 = 24
23 decomposed as 6 * 4 = 24
I think this may work (your conditions are somewhat ambiguous). This solution is somewhat similar to the other one; it basically produces a rectangular grid which is almost square.
You may need to prove that A+2 is never an optimal choice:
A0 = B0 = ceil(sqrt N)
A1 = A0 + 1
B1 = B0 - 1
if A0*B0 - N > A1*B1 - N: return (A1, B1)
return (A0, B0)
This is the solution if the first condition is dominant (and the second condition is not used):
A0 = B0 = ceil(sqrt N)
if A0*B0 == N: return (A0, B0)
return (N, 1)
Variations with other conditions will fall somewhere in between, starting from A = B = ceil(sqrt N).
