Magic bouncing ball problem - algorithm

Mary got a magic ball for her birthday. The ball, when thrown from
some height, bounces to double that height. Mary threw the
ball from her balcony, which is x above the ground. Help her
calculate how many bounces are needed for the ball to reach the
height w.
Input: One integer z (1 ≤ z ≤ 10^6) as the number of test cases. For
every test, integers x and w (1 ≤ x ≤ 10^9, 0 ≤ w ≤ 10^9).
Output: For every case, one integer equal to the number of bounces
needed for the ball to reach w should be printed.
OK, so, though it looks unspeakably easy, I can't find a more efficient way to solve it than a simple, dumb, brute-force approach: a loop multiplying x by 2 until it is at least w. For a maximal test, that will take a horrific amount of time, of course. Then I thought of using previous cases, which saves quite a bit of time provided we can get the closest smaller result from the previous cases quickly (O(1)?), which, however, I can't implement (and don't know if it's even possible). How should this be done?

You are essentially trying to solve the problem
2^i * x = w
and then finding the smallest integer greater than or equal to i. Solving, we get
2^i = w / x
i = log2(w / x)
So one approach would be to compute this value explicitly and then take the ceiling. Of course, you'd have to watch out for numerical instability when doing this. For example, if you are using floats to encode the values and then let w = 8,000,001 and x = 1,000,000, you will end up getting the wrong answer (3 instead of 4). If you use doubles to hold the value, you will also get the wrong answer when x = 1 and w = 536870912 (reporting 30 instead of 29, since 1 × 2^29 = 536870912, but due to inaccuracies in the double the answer is erroneously rounded up to 30). It looks like we'll have to switch to a different approach.
Let's revisit your initial solution of just doubling the value of x until it exceeds w; it should be perfectly fine here. The maximum number of times you can double x until it reaches w is log2(w / x), and since w / x is at most one billion, this iterates at most log2(10^9) times, which is about thirty. Doing thirty iterations of a multiply-by-two is going to be extremely fast. More generally, if the upper bound of w / x is U, then this will take at most O(log U) time to complete. If you have k (x, w) pairs to check, this takes time O(k log U).
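As a minimal sketch of that loop (Python; the function name bounces is mine, and treating w <= x as "zero bounces needed" is an assumption, since the problem statement doesn't spell that case out):
def bounces(x, w):
    # Double the height until it reaches w, counting the doublings.
    count, height = 0, x
    while height < w:
        height *= 2
        count += 1
    return count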
If you're not satisfied with doing this, though, there's another very fast algorithm you could try. Essentially, you want to compute log2(w / x). You could start off by creating a table that lists all powers of two along with their logarithms. For example, your table might look like
T[1] = 0
T[2] = 1
T[4] = 2
T[8] = 3
...
You could then compute w / x, then do a binary search to figure out in which range the value lies. The upper bound of this range is then the number of times the ball must bounce. This means that if you have k different pairs to inspect, and if you know that the maximum ratio of w / x is U, creating this table takes O(log U) time and each query then takes time proportional to the log of the size of the table, which is O(log log U). The overall runtime is then O(log U + k log log U), which is extremely good. Given that you're dealing with at most one million problem instances and that U is one billion, k log log U is just under five million, and log U is about thirty.
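A hedged sketch of that table lookup (Python; the names POWERS and bounces_table are mine, and the ratio is rounded up in integer arithmetic so no floating point is involved):
import bisect
# Precompute powers of two up to the maximum possible ratio U (here 10^9).
POWERS = [1]
while POWERS[-1] < 10**9:
    POWERS.append(POWERS[-1] * 2)
def bounces_table(x, w):
    if w <= x:
        return 0                      # assumption: no bounce needed in this case
    ratio = (w + x - 1) // x          # ceil(w / x), kept in integers
    # Smallest i with 2^i >= ceil(w / x), i.e. x * 2^i >= w.
    return bisect.bisect_left(POWERS, ratio)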
Finally, if you're willing to do some perversely awful stuff with bitwise hackery, since you know for a fact that w / x fits into a 32-bit word, you can use bit-level trickery with IEEE doubles (storing the value in a double and reading off its exponent field) to compute the logarithm in a very small number of machine operations. This would probably be faster than the above two approaches, though I can't necessarily guarantee it.
Hope this helps!

Use this formula to calculate the number of bounces for each test case.
ceil( log(w/x) / log(2) )
This is pseudo-code, but it should be pretty simple to convert it to any language. Just replace log with a function that finds the logarithm of a number in some specific base and replace ceil with a function that rounds up a given decimal value to the next int above it (for example, ceil(2.3) = 3).
See http://www.purplemath.com/modules/solvexpo2.htm for why this works (in your case, you're trying to solve the equation x * 2 ^ n = w for an integer n, and you should start by dividing both sides by x).
EDIT:
Before using this method, you should check that w > x and return 1 if it is not (the ball always has to bounce at least once).
Also, it has been pointed out that inaccuracies in floating point values may cause this method to sometimes fail. You can work around this by checking whether x * 2 ^ (n - 1) >= w, where n is the result of the equation above, and if so returning (n - 1) instead of n.
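Putting both edits together, a rough Python sketch (following this answer's convention of returning 1 when w <= x; the function name bounces_formula is mine):
import math
def bounces_formula(x, w):
    if w <= x:
        return 1                      # this answer's convention: at least one bounce
    n = math.ceil(math.log(w / x) / math.log(2))
    # Floating-point safety net: if n - 1 doublings already suffice, return n - 1.
    if x * 2 ** (n - 1) >= w:
        n -= 1
    return n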

Related

Is this shortcut for modulo by a constant (C) valid? IF (A mod 2^n) > C: { -C}

Looking to do modulo operator, A mod K where...
K is a uint32_t constant, is not a power of two, and I will be using it over and over again.
A is a uint32_t variable, possibly as much as ~2^13 times larger than K.
The ISA does not have single cycle modulo or division instructions. (8-bit micro)
The naive approach seems to coincide with the naive approach to division; repeat subtraction until underflow, then keep the remainder. This would obviously have fairly bad worst case performance, but would work for any A and K.
A known fast approach which works well for a K that is some power of two, is to logical AND with that power of two -1.
From Wikipedia...
A % 2^n == A & (2^n - 1)
My knee jerk reaction is to use these two things together, and I'm wondering if that is valid?
Specifically, I figure I can use the power of two mod trick to narrow the worst case for the above subtraction method. In other words, quickly mod to the nearest power of two above my constant, then subtract my constant if necessary. Here's the code that is in the actual question, fully expanded.
A = A AND (2^n - 1)   # MOD A to the next higher power of two
if A > K:             # See if we are still larger than our constant
    A -= K            # If so, subtract. We now must be lower.
##################
# A = A MOD K ???
##################
On inspection, this should always work, and should always be fast, since the next power of two greater than K is always smaller than 2K. That is, K < 2^n < 2K, meaning I should only ever need one extra test, and then possibly one subtraction.
...but this seems too simple. If it worked, I'd expect to have seen it before. But I can't find an example. I can't find a counter example either though. I have checked the usual places. What am I missing?
You can't combine the two approaches. First, understand why the equation below holds true:
A % p == A & (p - 1), where p = 2^n
p has exactly one set bit in its binary representation; say its position is x.
Every bit of A at a position greater than or equal to x contributes a multiple of p to the value, so those bits do not affect the remainder. Keeping only the bits below position x, which is what ANDing with p - 1 does, therefore gives exactly A mod p.
But that isn't the case when p is not a power of 2.
If that didn't make sense, then take for example:
A = 18 = 10010,
K = 6 = 110,
A % K = 0
According to your approach, you would AND A with 7 (= 2^3 - 1), giving 2; since 2 is not greater than K = 6, no subtraction happens, and you end up with 2, which is not the value of A % K.
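A quick way to run the counterexample (Python sketch):
A, K = 18, 6
n = K.bit_length()            # smallest n with 2^n > K, here 2^3 = 8
masked = A & ((1 << n) - 1)   # 18 AND 7 = 2
if masked > K:
    masked -= K
print(masked, A % K)          # prints "2 0": the shortcut disagrees with the real mod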

Average Case Complexity of a trivial algorithm

P(x, y, z) {
    print x
    if (y != x) print y
    if (z != x && z != y) print z
}
This is a trivial algorithm; the values x, y, z are chosen randomly from {1, ..., r} with r >= 1.
I'm trying to determine the average case complexity of this algorithm and I measure complexity based on the number of print statements.
The best case here is T(n) = 1 or O(1), when x=y=z and the probability of that is 1/3.
The worst case here is still T(n) = 3 or still O(1) when x!=y!=z and the probability is 2/3.
But when it comes to mathematically deriving the average case:
Sample Space is n possible inputs, Probability over Sample Space is 1/n chance
So, how do I calculate average case complexity? (This is where I draw a blank..)
Your algorithm has three cases, classified by how many print statements execute:
Only x is printed (cost 1). This requires y == x and z == x, and each of those
happens with probability 1/r, so the probability of this case is (1/r) * (1/r) = 1/r^2.
Exactly two values are printed (cost 2). This happens when y == x but z != x
(probability (1/r) * ((r - 1)/r)), or when y != x and z equals one of x, y
(probability ((r - 1)/r) * (2/r)). Together that is 3(r - 1)/r^2.
All three values are printed (cost 3). This requires y != x and z distinct from
both, so the probability is ((r - 1)/r) * ((r - 2)/r) = (r - 1)(r - 2)/r^2.
Thus, the average case can be computed as:
1 * 1/r^2 + 2 * 3(r - 1)/r^2 + 3 * (r - 1)(r - 2)/r^2 = 3 - 3/r + 1/r^2 == O(1)
Edit: The above expression is O(1), since it is bounded by the constant 3 for every r.
The average case will be somewhere between the best and worst cases; for this particular problem, that's all you need (at least as far as big-O).
1) Can you program the general case at least? Write the (pseudo-)code and analyze it; the answer might be readily apparent. You may actually program it suboptimally and there may exist a better solution. This is very typical, and it's part of the puzzle-solving of the mathematics end of computer science; e.g. it's hard to discover quicksort on your own if you're just trying to code up a sort.
2) If you can, then run a Monte Carlo simulation and graph the results, i.e. for N = 1, 5, 10, 20, ..., 100, 1000, or whatever sample is realistic, run 10,000 trials and plot the average time. If you're lucky, plotting X = sample size against Y = average time over the 10,000 runs at that size will give a nice line, or parabola, or some other easy-to-model curve.
So I'm not sure if you need help on (1) finding or coding the algorithm or (2) analyzing it; you will probably want to revise your question to specify this.
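A minimal Monte Carlo sketch in Python for this particular P(x, y, z) (the formula in the final comment is taken from the case analysis in the answer above):
import random
def cost(x, y, z):
    # Number of print statements P(x, y, z) would execute.
    c = 1
    if y != x:
        c += 1
    if z != x and z != y:
        c += 1
    return c
def average_cost(r, trials=100000):
    total = sum(cost(random.randint(1, r), random.randint(1, r), random.randint(1, r))
                for _ in range(trials))
    return total / trials
for r in (1, 2, 5, 10, 100):
    print(r, average_cost(r))   # hovers around 3 - 3/r + 1/r**2, i.e. O(1)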
P(x, y, z) {
1. print x
2. if (y != x)
3.     print y
4. if (z != x && z != y)
5.     print z
}
Line 1: takes a constant time c1 (c1: print x)
Line 2: takes a constant time c2 (c2: condition test)
Line 3: takes a constant time c3 (c3: print y)
Line 4: takes a constant time c4 (c4: condition test)
Line 5: takes a constant time c5 (c5: print z)
Analysis:
Since the function P(x, y, z) does not depend on the input size r, the program takes a constant amount of time to run: the total time is T(c1) + T(c2 + c3) + T(c4 + c5), and each of c1, ..., c5 is a constant. Summing these up, the big O of the function P(x, y, z) is O(1), which indicates a constant amount of time. If the function instead iterated from 1 to r, then the complexity of the snippet would change and would be expressed in terms of the input size r.
Best case: O(1)
Average case: O(1)
Worst case: O(1)

Finding even numbers in an array without using feedback

I saw this post: Finding even numbers in an array and I was thinking about how you could do it without feedback. Here's what I mean.
Given an array of length n containing at most e even numbers and a
function isEven that returns true if the input is even and false
otherwise, write a function that prints all the even numbers in the
array using the fewest number of calls to isEven.
The answer on the post was to use a binary search, which is neat since the array doesn't have to be in order. The number of times you have to check if a number is even is e log n instead of n, because you do a binary search (log n) to find one even number each time (e times).
But that idea means that you divide the array in half, test for evenness, then decide which half to keep based on the result.
My question is whether or not you can beat n calls on a fixed testing scheme where you check all the numbers you want for evenness without knowing the outcome, and then figure out where the even numbers are after you've done all the tests based on the results. So I guess it's no-feedback or blind or some term like that.
I was thinking about this for a while and couldn't come up with anything. The binary search idea doesn't work at all with this constraint, but maybe something else does? Even getting down to n/2 calls instead of n (yes, I know they are the same big-O) would be good.
The technical term for "no-feedback or blind" is "non-adaptive". O(e log n) calls still suffice, but the algorithm is rather more involved.
Instead of testing the evenness of products, we're going to test the evenness of sums. Let E ≠ F be distinct subsets of {1, …, n}. If we have one array x_1, …, x_n with even numbers at positions E and another array y_1, …, y_n with even numbers at positions F, how many subsets J of {1, …, n} satisfy
(∑_{i in J} x_i) mod 2 ≠ (∑_{i in J} y_i) mod 2?
The answer is 2^(n-1). Let i be an index such that x_i mod 2 ≠ y_i mod 2. Let S be a subset of {1, …, i - 1, i + 1, …, n}. Either J = S is a solution or J = S ∪ {i} is a solution, but not both.
For every possible outcome E, we need to make calls that eliminate every other possible outcome F. Suppose we make 2e log n calls at random. For each pair E ≠ F, the probability that we still cannot distinguish E from F is (2^(n-1)/2^n)^(2e log n) = n^(-2e), because there are 2^n possible calls and only 2^(n-1) fail to distinguish. There are at most n^e + 1 choices of E and thus at most (n^e + 1) n^e / 2 pairs. By a union bound, the probability that there exists some indistinguishable pair is at most n^(-2e) (n^e + 1) n^e / 2 < 1 (assuming we're looking at an interesting case where e ≥ 1 and n ≥ 2), so there exists a sequence of 2e log n calls that does the job.
Note that, while I've used randomness to show that a good sequence of calls exists, the resulting algorithm is deterministic (and, of course, non-adaptive, because we chose that sequence without knowledge of the outcomes).
You can use the Chinese Remainder Theorem to do this. I'm going to change your notation a bit.
Suppose you have N numbers of which at most E are even. Choose a sequence of distinct prime powers q_1, q_2, ..., q_k such that their product is at least N^E, i.e.
q_i = p_i^e_i
where p_i is prime and e_i > 0 is an integer, and
q_1 * q_2 * ... * q_k >= N^E
Now make a bunch of 0-1 matrices. Let M_i be the q_i x N matrix where the entry in row r and column c is 1 if c = r mod q_i and 0 otherwise. For example, if q_i = 3^2, then row 2 has ones in columns 2, 11, 20, ..., 2 + 9j and 0 elsewhere.
Now stack these matrices vertically to get a Q x N matrix M, where Q = q_1 + q_2 + ... + q_k. The rows of M tell you which numbers to multiply together (the nonzero positions). This gives a total of Q products that you need to test for evenness. Call each row a "trial", and say that a "trial involves j" if the jth entry of that row is nonzero. The theorem you need is the following:
THEOREM: The number in position j is even if and only if all trials involving j are even.
So you do a total of Q trials and then look at the results. If you choose the prime powers intelligently, then Q should be significantly smaller than N. There are asymptotic results that show you can always get Q on the order of
(2E log N)^2 / (2 log(2E log N))
This theorem is actually a corollary of the Chinese Remainder Theorem. The only place that I've seen this used is in Combinatorial Group Testing. Apparently the problem originally arose when testing soldiers coming back from WWII for syphilis.
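Here is a small Python sketch of the construction (the helper names are mine; it greedily uses plain primes as the prime powers, and note that for toy sizes like this the number of trials Q actually exceeds N — the savings only appear for large N):
import math
def is_prime(m):
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))
def crt_trials(N, E):
    # Pick primes (the simplest distinct prime powers) until their product is >= N**E,
    # then build one trial per (modulus, residue) pair: the positions congruent to r mod q.
    qs, prod, p = [], 1, 2
    while prod < N ** E:
        if is_prime(p):
            qs.append(p)
            prod *= p
        p += 1
    return [[j for j in range(N) if j % q == r] for q in qs for r in range(q)]
def find_evens(A, E):
    # Non-adaptive: every parity test is fixed up front; each trial tests one product.
    N = len(A)
    trials = crt_trials(N, E)
    trial_even = [math.prod(A[j] for j in trial) % 2 == 0 for trial in trials]
    # The theorem: position j holds an even number iff all trials involving j are even.
    return [j for j in range(N)
            if all(trial_even[t] for t, trial in enumerate(trials) if j in trial)]
print(find_evens([3, 8, 5, 7, 6, 9, 11], 2))   # -> [1, 4], the positions of 8 and 6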
The problem you are facing is a form of group testing: a type of problem whose objective is to reduce the cost of identifying certain elements of a set (up to d elements out of a set of N elements).
As you've already stated, there are two basic principles via which the testing may be carried out:
Non-adaptive Group Testing, where all the tests to be performed are decided a priori.
Adaptive Group Testing, where we perform several tests, basing each test on the outcome of previous tests. Obviously, adaptive testing has a potential to reduce the cost, compared to non-adaptive testing.
Theoretical bounds for both principles have been studied, and are available in this Wiki article, or this paper.
For adaptive testing, the upper bound is O(d*log(N)) (as already described in this answer).
For non-adaptive testing, it can be shown that the upper bound is O(d*d/log(d)*log(N)), which is obviously larger than the upper bound for adaptive testing by a factor of d/log(d).
This upper bound for non-adaptive testing comes from an algorithm which uses disjunct matrices: matrices of dimension T x N ("number of tests" x "number of elements"), where each entry is either true (if an element was included in a test) or false (if it wasn't), with the property that no column is covered by the union of any d other columns; in other words, every column has at least one row (test) where it is true but none of those d columns are. This allows linear-time decoding (there are also "d-separable" matrices, for which fewer tests are needed, but the time complexity of their decoding is exponential and not computationally feasible).
Conclusion:
My question is whether or not you can beat n calls on a fixed testing scheme [...]
For such a scheme and a sufficiently large value of N, a disjunct matrix can be constructed which would have less than K * [d*d/log(d)*log(N)] rows. So, for large values of N, yes, you can beat it.
The underlying question (challenge) is kind of silly. If the binary search answer is acceptable (where it sums subarrays and sends them to IsEven), then I can think of a way to do it with E or fewer calls to IsEven (assuming the numbers are integers, of course).
JavaScript to demonstrate
// sort the array by only the first bit of the number
A.sort(function(x,y) { return (x & 1) - (y & 1); });
// all of the evens will be at the beginning
for (var i = 0; i < E && i < A.length; i++) {
    if (IsEven(A[i]))
        Print(A[i]);
    else
        break;
}
Not exactly a solution, but just a few thoughts.
It is easy to see that if a solution exists for an array of length n that takes fewer than n tests, then for any array length m > n there is always a solution with fewer than m tests. So, if you have a solution for n = 2 or 3 or 4, then the problem is solved.
You can split the array into pairs of numbers and, for each pair: if the sum is odd, then exactly one of them is even; otherwise, if one of the numbers is even, then both of them are even. This way each pair takes either one or two tests. Best case: n/2 tests, worst case: n tests; if even and odd numbers are chosen with equal probability, then 3n/4 tests on average.
My hunch is there is no solution with less than n tests. Not sure how to prove it.
UPDATE: The second solution can be extended in the following way.
Check if the sum of two numbers is even. If it is odd, then exactly one of them is even. Otherwise, label the pair as a "homogeneous set of size 2". Take two homogeneous sets of the same size n. Pick one number from each set and check if their sum is even. If it is even, combine the two sets into a "homogeneous set of size 2n". Otherwise, one of the sets consists purely of even numbers and the other purely of odd numbers.
Best case: n/2 tests. Average case: about 3n/4. The worst case is still n, and it occurs only when all the numbers are even or all the numbers are odd.
If we can add and multiply array elements, then we can compute every Boolean function (up to complementation) on the low-order bits. Simulate a circuit that encodes the positions of the even numbers as a number from 0 to C(n,0) + C(n,1) + ... + C(n,e) - 1 represented in binary and use calls to isEven to read off the bits.
Number of calls used: within 1 of the information-theoretic optimum.
See also fully homomorphic encryption.

Bijection on the integers below x

I'm working on image processing, and I'm writing a parallel algorithm that iterates over all the pixels in an image and changes the surrounding pixels based on each pixel's value. In this algorithm, minor non-determinism is acceptable, but I'd rather minimize it by only querying distant pixels simultaneously. Could someone give me an algorithm that bijectively maps the integers below n to the integers below n, in a fast and simple manner, such that two integers that are close to each other before mapping are likely to be far apart after the mapping is applied.
For simplicity let's say n is a power of two. Could you simply reverse the order of the least significant log2(n) bits of the number?
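For example (Python sketch; the helper name bit_reverse is mine, and it assumes n is a power of two as stated above):
def bit_reverse(i, n):
    # Reverse the log2(n) low-order bits of i; a bijection on range(n) when n is a power of 2.
    bits = n.bit_length() - 1
    return int(format(i, '0{}b'.format(bits))[::-1], 2)
print([bit_reverse(i, 16) for i in range(16)])
# [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15] -- consecutive inputs land far apart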
Considering the pixels to be a one-dimensional array, you could use a hash function j = i*p % n, where n is the zero-based index of the last pixel and p is a prime number chosen to place the pixel far enough away at each step. % is the remainder operator in C; mathematically I'd write j(i) = i p (mod n).
So if you want to jump at least 10 rows at each iteration, choose p > 10 * w where w is the screen width. You'll want to have a lookup table for p as a function of n and w of course.
Note that j hits every pixel as i goes from 0 to n.
CORRECTION: Use (mod (n + 1)), not (mod n). The last index is n, which can never be produced using mod n, since every result of mod n is less than n. For the mapping to be a bijection, p must also not divide n + 1.
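A small sketch of that traversal in Python (the concrete sizes and the prime 6421 are illustrative assumptions; 6421 > 10 * 640 and it does not divide the pixel count):
w, h = 640, 480                  # image width and height
n = w * h - 1                    # zero-based index of the last pixel
p = 6421                         # prime, > 10 * w, and coprime to n + 1
order = [(i * p) % (n + 1) for i in range(n + 1)]
assert len(set(order)) == n + 1  # every pixel is visited exactly once
# Consecutive iterations land about p // w = 10 rows apart (modulo wraparound).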
Apart from reversing the bit order, you can use modular multiplication. Say N is a prime number (like 521); then for all x = 0..520 you define the function:
f(x) = x * fac mod N
which is a bijection on 0..520. fac is an arbitrary number different from 0 and 1. For example, for N = 521 and fac = 122 you get a mapping which (as the scatter plot in the original post shows) is quite uniform: not many numbers land near the diagonal - there are some, but only a small proportion.
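A quick check of that mapping in Python, using the N = 521 and fac = 122 from above:
N, fac = 521, 122
f = [(x * fac) % N for x in range(N)]
assert len(set(f)) == N    # a bijection on 0..520, since gcd(fac, N) = 1
print(f[:8])               # [0, 122, 244, 366, 488, 89, 211, 333]
print(min(abs(f[x + 1] - f[x]) for x in range(N - 1)))   # 122: neighbours end up far apart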

Programming Pearls - Random Select algorithm

Page 120 of Programming Pearls 1st edition presents this algorithm for selecting M equally probable random elements out of a population of N integers.
InitToEmpty
Size := 0
While Size < M do
    T := RandInt(1, N)
    if not Member(T)
        Insert(T)
        Size := Size + 1
It is stated that the expected number of Member tests is less than 2M, as long as M < N/2.
I'd like to know how to prove it, but my algorithm analysis background is failing me.
I understand that the closer M is to N, the longer the program will take, because the result set will have more elements and the likelihood of RandInt selecting an existing one will increase proportionally.
Can you help me figuring out this proof?
I am not a math wizard, but I will give it a rough shot. This is NOT guaranteed to be right though.
For each additional member of M, you pick a number, see if it's there, and if it isn't, add it. Otherwise, you try again. Repeating something until you're successful is described by the geometric probability distribution.
http://en.wikipedia.org/wiki/Geometric_distribution
So you are running M geometric trials. A geometric trial with success probability p takes an expected 1/p tries, so it takes an expected 1/p tries to get a number not already in the set. Here p is (N minus the number of values already picked) divided by N, i.e. how many unpicked items there are out of the total. So for the fourth number, p = (N - 3) / N, which is the probability of picking an unused number, and the expected number of picks for the fourth number is N / (N - 3).
The expected value of the run time is all of these added together. So something like
E(run time) = N/N + N/(N - 1) + N/(N - 2) + ... + N/(N - M + 1)
Now if M < N/2, then the last term in that summation is bounded above by 2 (since N - M + 1 > N/2, so N/(N - M + 1) < 2). It's also obviously the largest term in the whole summation. So if the biggest term is less than two picks, and there are M terms being summed, the expected run time is bounded above by 2M.
Ask me if any of this is unclear. Correct me if any of this is wrong :)
Say we have chosen K elements out of N. Then our next try has probability (N - K)/N of succeeding, so the number of tries that it takes to find the (K + 1)st element is geometrically distributed with mean N/(N - K).
So if 2M < N we expect it to take less than two tries to get each element, and the total expected number of Member tests is therefore less than 2M.
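A rough simulation of that bound in Python (counting one Member test per RandInt draw; the concrete M, N, and trial count are arbitrary choices):
import random
def member_tests(M, N):
    # Programming Pearls' scheme: draw until we hold M distinct values, counting tests.
    chosen, tests = set(), 0
    while len(chosen) < M:
        chosen.add(random.randint(1, N))
        tests += 1                 # each draw triggers one Member(T) test
    return tests
M, N, runs = 500, 2000, 1000
avg = sum(member_tests(M, N) for _ in range(runs)) / runs
print(avg, 2 * M)                  # the average stays comfortably below 2M when M < N/2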
