Algorithm complexity

Suppose an oracle knows a natural number n that you wish to know.
The oracle answers only Yes/No, to the following three types of queries:
Is the number greater than x?
Is the number less than x?
Is the number equal to x?
(where x can be an arbitrary natural number, which can change across queries).
Describe a method for posing queries to the oracle, which is asymptotically efficient in the number of queries posed.
Perform the analysis and write a proof of correctness. Note that the number of queries posed will be a function of n.

This question is not entirely fair, as it demands asymptotic efficiency without giving any hint about the goal. An informal information-theoretic argument supplies one: the answer conveys i = lg n bits of information, and each yes/no reply yields at most one bit, so Omega(i) = Omega(lg n) queries are required.
The algorithm
Phase 1: find the number of significant bits.
Ask x<1b, x<10b, x<100b, x<1000b, x<10000b, x<100000b... (all powers of 2)
until you get a yes.
Phase 2: find all bits.
Take the bound at which phase 1 stopped and divide it by 2; call this value Q.
Then, going from the second most significant bit down to the least significant bit,
tentatively set the next bit of Q and ask whether x is less than this new value. Keep the bit set if you get a no.
Example
Let us assume x = 10110b; your questions will go as follows:
x<1b ? no
x<10b ? no
x<100b ? no
x<1000b ? no
x<10000b ? no
x<100000b ? yes
Q=10000b
x<11000b ? yes
Q=10000b
x<10100b ? no
Q=10100b
x<10110b ? no
Q=10110b
x<10111b ? yes
Q=10110b
For 5 bits, 10 questions.
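The two phases can be sketched as follows, with the oracle modeled as a closure over the unknown number:

```python
def guess_number(is_less_than):
    """Recover an unknown natural number using only "is it < q?" queries.
    Returns (value, number_of_queries)."""
    # Phase 1: find the smallest power of two strictly above the number.
    p, queries = 1, 0
    while True:
        queries += 1
        if is_less_than(p):
            break
        p *= 2
    if p == 1:               # the number is < 1, i.e. it is 0
        return 0, queries
    # Phase 2: halve the last bound and fix the remaining bits,
    # from the second most significant down to the least significant.
    q, bit = p // 2, p // 4
    while bit >= 1:
        queries += 1
        if not is_less_than(q + bit):   # "no" means the bit belongs in q
            q += bit
        bit //= 2
    return q, queries
```

For x = 10110b = 22 this asks exactly the ten questions listed above.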
Correctness
In phase 1, the queried intervals form a partition of the natural numbers, so the search is guaranteed to stop. When it stops, P <= x < 2P holds, where P is a power of 2, or equivalently 2^k <= x < 2^(k+1).
In phase 2, we keep the invariant Q <= x < Q + 2^(k+1) by iterative halving (initially Q = 0): given Q <= x < Q + 2^(k+1), we ask whether x < Q + 2^k and conclude either Q <= x < Q + 2^k (keep Q and continue with k - 1) or Q + 2^k <= x < Q + 2^(k+1), which we turn into Q' <= x < Q' + 2^k by setting Q' = Q + 2^k. In the end, Q <= x < Q + 1, so x = Q.
Efficiency
Phase 1 takes one query per power of 2 up to the first that exceeds x, i.e. k + 2 queries when 2^k <= x < 2^(k+1).
Phase 2 takes one query per remaining bit, i.e. k queries.
The total, 2k + 2, is twice the number of significant bits of x, which is O(log x) and matches the information-theoretic lower bound.

Check out the Wikipedia post on binary search algorithm. That can be a starting point for you.


Big O runtime analysis for 3-way recursion with memoization

I'm doing some practice interview questions and came across this one:
Given a list of integers which represent hedge heights, determine the minimum number of moves to make the hedges pretty - that is, compute the minimum number of changes needed to make the array alternate between increasing and decreasing. For example, [1,6,6,4,4] should return 2 as you need to change the second 6 to something >6 and the last 4 to something <4. Assume the min height is 1 and the max height is 9. You can change to any number that is between 1 and 9, and that counts as 1 move regardless of the diff to the current number.
My solution is here: https://repl.it/#plusfuture/GrowlingOtherEquipment
I'm trying to figure out the big O runtime for this solution, which is memoized recursion. I think it's O(n^3) because for each index, I need to check against 3 possible states for the rest of the array, changeUp, noChange, and changeDown. My friend maintains that it's O(n) since I'm memoizing most of the solutions and exiting branches where the array is not "pretty" immediately.
Can someone help me understand how to analyze the runtime for this solution? Thanks.
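Since the linked repl isn't reproduced here, the following is a hypothetical reconstruction of such a 3-way memoized recursion (the function name and details are assumptions). The memoization key is (index, value given to the previous hedge), and the previous value can only be the original height, 1, or 9 - at most three values per index - so there are O(n) cached subproblems, each doing constant work, which supports the O(n) estimate rather than O(n^3).

```python
from functools import lru_cache

def min_moves(heights):
    """Hypothetical reconstruction of the 3-way memoized recursion.
    A hedge may be left alone, changed down to the minimum (1), or
    changed up to the maximum (9); the extremal values are always
    the best possible changes."""
    n = len(heights)

    def solve(start_up):
        @lru_cache(maxsize=None)
        def go(i, prev):
            if i == n:
                return 0
            # direction required between elements i-1 and i
            up = start_up if i % 2 == 1 else not start_up
            best = float("inf")
            for h, cost in ((heights[i], 0), (1, 1), (9, 1)):
                if i == 0 or (h > prev if up else h < prev):
                    best = min(best, cost + go(i + 1, h))
            return best
        return go(0, 0)

    # the alternation may start with either a rise or a fall
    return min(solve(True), solve(False))
```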

What is meant by 'Polynomial in number of bits in the input'?

So I have a piece of coursework which is asking a question about complexity theory, where I have some problem, PRIMES, which is in the NPTime complexity class.
All good up to that point.
The thing that is throwing me is a question to come up with a polynomial-time algorithm for computing (a^p)mod(b). It has to be polynomial in the size of the input (number of bits).
It's the latter sentence that confuses me.
This is where it loses me! Surely a brute force attempt (all values between 2 and sqrt(n)) would give 2^NoBits, which is exponential?!
Now I don't want the answer! This my coursework so I can't ask for that. I just want clarity on what is meant by 'polynomial in the number of input bits'. Explain it like you would to a child ;)
You have 3 numbers, a, p, and b. Each one can be of any size. 1. 10. 123_456_789. No limits.
When we write them in base 2, each becomes a string of bits: 1, 1010, and 111010110111100110100010101. Each one is some number of bits long: 1, 4, and 27. The sum of the bit counts is the total size of your input: 32.
Your algorithm should be efficient enough that there is some polynomial q(x), where x is the sum of the numbers of bits in the inputs, which is an upper bound on how long your computation will take.
In particular note that you do not want to multiply a by itself p times!
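Since the asker doesn't want the full solution spelled out, this is only a sketch of the standard technique the last remark points at (square-and-multiply, a.k.a. binary exponentiation), which does one squaring per bit of p:

```python
def power_mod(a, p, b):
    """Compute (a^p) mod b with O(log p) multiplications:
    one squaring per bit of the exponent, i.e. polynomial in the
    bit-size of the input rather than in its value."""
    result = 1 % b                     # the "1 % b" handles b == 1
    base = a % b
    while p > 0:
        if p & 1:                      # low bit of the exponent is set
            result = (result * base) % b
        base = (base * base) % b       # square for the next bit
        p >>= 1
    return result
```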

finding power of a number

I have a very big number which is a product of several small primes. I know the number and also I know the prime factors but I don't know their powers. for example:
(2^a)x(3^b)x(5^c)x(7^d)x(11^e)x .. = 2310
Now I want to recover the exponents in a very fast and efficient manner. I want to implement it in an FPGA.
The issue is that you are doing a linear search for the right power when you should be doing a binary search. Below is an example showing the case where the power of p is 10 (p^10). This method finds the power in O(log N) divisions rather than O(N), where N is the exponent.
First, find the upper limit by increasing the power quickly until it is too high, which happens at step 5. Then use a binary search to find the actual power.
Check divisibility by p. Works.
Check divisibility by p^2. Works.
Check divisibility by p^4. Works.
Check divisibility by p^8. Works.
Check divisibility by p^16. Doesn't work. Undo/ignore this one.
Check divisibility by p^((8+16)/2)=p^12. Doesn't work. Undo/ignore this one.
Check divisibility by p^((8+12)/2)=p^10. Works, but might be too low.
Check divisibility by p^((10+12)/2)=p^11. Doesn't work. Undo/ignore this one.
Since ((10+11)/2)=10.5 is not an integer, the power must be the low end, which is 10.
Note, there is a method where you actually divide by p, and at step 4, you've actually divided the number by p^(1+2+4+8)=p^15, but it's a bit more difficult to explain the binary search part. However, the size of the number being divided gets smaller, so division operations are faster.
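The walkthrough above can be sketched as follows (the function name is illustrative; n is the given number, p the prime factor):

```python
def prime_power(n, p):
    """Largest e such that p**e divides n, found in O(log e)
    divisibility checks: double the candidate exponent, then
    binary-search the bracketed range."""
    if n % p != 0:
        return 0
    # Doubling phase: find hi such that p**hi does NOT divide n.
    lo, hi = 1, 2
    while n % p**hi == 0:
        lo, hi = hi, hi * 2
    # Binary search with invariant: p**lo divides n, p**hi does not.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if n % p**mid == 0:
            lo = mid
        else:
            hi = mid
    return lo
```

For n = 3 * p**10 this performs exactly the sequence of divisibility checks listed above (p, p^2, p^4, p^8, p^16, p^12, p^10, p^11).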

Algorithm to find an arbitrarily large number

Here's something I've been thinking about: suppose you have a number, x, that can be infinitely large, and you have to find out what it is. All you know is if another number, y, is larger or smaller than x. What would be the fastest/best way to find x?
An evil adversary chooses a really large number somehow ... say:
int x = 9^9^9^9^9^9^9^9^9^9^9^9^9^9^9
and provides isX, isBiggerThanX, and isSmallerThanX functions. Example code might look something like this:
int c = 2
int y = 2
while(true)
if isX(y) return true
if(isBiggerThanX(y)) fn()
else y = y^c
where fn() is a function that, once a number y has been found (that's bigger than x) does something to determine x (like divide the number in half and compare that, then repeat). The thing is, since x is arbitrarily large, it seems like a bad idea to me to use a constant to increase y.
This is just something that I've been wondering about for a while now, I'd like to hear what other people think
Use a binary search as in the usual "try to guess my number" game. But since there is no finite upper end point, we do a first phase to find a suitable one:
Initially set the upper end point arbitrarily (e.g. 1000000, though 1 or 10^100 would also work -- given the infinite space to work in, all finite values are equally disproportionate).
Compare the mystery number X with the upper end point.
If it's not big enough, double it, and try again.
Once the upper end point is bigger than the mystery number, proceed with a normal binary search.
The first phase is itself similar to a binary search. The difference is that instead of halving the search space with each step, it's doubling it! The cost of each phase is O(log X). A small improvement would be to set the lower end point at each doubling step: we know X is at least as high as the previous upper end point, so we can reuse it as the lower end point. The size of the search space still doubles at each step, but in the end it will be half as large as it would otherwise have been. This shortens the binary search by only one step, so the overall complexity remains the same.
Some notes
A couple of notes in response to other comments:
It's an interesting question, and computer science is not just about what can be done on physical machines. As long as the question can be defined properly, it's worth asking and thinking about.
The range of numbers is infinite, but any possible mystery number is finite. So the above method will eventually find it. Eventually is defined such that, for any possible finite input, the algorithm will terminate within a finite number of steps. However, since the input is unbounded, the number of steps is also unbounded (it's just that, in every particular case, it will "eventually" terminate).
If I understand your question correctly (advise if I do not), you're asking about how to solve "pick a number from 1 to 10", except that instead of 10, the upper bound is infinity.
If your number space is truly infinite, the following are true:
The value will never be held in an int (or any other data type) on any physical hardware
You will NEVER find your number
If the space is immensely large but bound, I think the best you can do is a binary search. Start at the middle of the number range. If the desired number turns out to be higher or lower, divide that half of the number space, and repeat until the desired number is found.
In your suggested implementation you raise y to the power c. However, no matter how large c is chosen to be, it will not even move the needle in infinite space.
Infinity isn't a number. Thus you can't find it, even with a computer.
That's funny. I've wondered the same thing for years, though I've never heard anyone else ask the question.
As simple as your scenario is, it still seems to provide insufficient information to allow the choice of an optimal strategy. All one can choose is a suitable heuristic. My heuristic had been to double y, but I think that I like yours better. Yours doubles log(y).
The beauty of your heuristic is that, so long as the integer fits in the computer's memory, it finds a suitable y in logarithmic time.
Counter-question. Once you find y, how do you proceed?
I agree with using binary search, though I believe that a ONE-SIDED binary search would be more suitable, since here the complexity would NOT be O( log n ) [ Where n is the range of allowable numbers ], but O( log k ) - where k is the number selected by your adversary.
This would work as follows : ( Pseudocode )
k = 1;
while ( isSmallerThanX( k ) )
{
    k = k * 2;
}
// At this point, once the loop is exited, k is bigger than x
// Now do normal binary search on the range [ k/2, k ] to find your number :)
So even if the allowable range is infinity, as long as your number is finite, you should be able to find it :)
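The pseudocode can be completed into a runnable sketch, with the adversary's oracle functions modeled as comparisons against a hidden x:

```python
def find_x(is_smaller_than_x, is_bigger_than_x):
    """One-sided (exponential) search: double k until it reaches or
    passes x, then binary-search the range [k // 2, k].
    Uses about 2 * log2(x) comparisons in total."""
    k = 1
    while is_smaller_than_x(k):   # k < x: keep doubling
        k *= 2
    lo, hi = k // 2, k
    while lo < hi:
        mid = (lo + hi) // 2
        if is_smaller_than_x(mid):
            lo = mid + 1
        elif is_bigger_than_x(mid):
            hi = mid - 1
        else:
            return mid
    return lo
```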
Your method of tetration is guaranteed to take longer than the age of the universe to find an answer, if the opponent merely uses a paradigm which is better (for example, pentation). This is how you should do it:
You can only do this with symbolic representations of numbers, because it is trivial to name a number your computer cannot store in floating-point representation, even if it used arbitrary-precision arithmetic and all its memory.
Required reading: http://www.scottaaronson.com/writings/bignumbers.html - that pretty much sums it up
How do you represent a number then? You represent it by a program which will, if run to completion, print out that number. Even then, your computer is incapable of computing BusyBeaver(10^100): if you dictated a program 1 terabyte in size, that is well over the maximum number of finite clock cycles it could run without looping forever. You can see that we could easily have the computer print out 1 0 0 ... each clock cycle, so the maximum number it could say (if we waited nearly an eternity) would be 10^BusyBeaver(10^100). If you allowed it to say more complicated expressions like eval(someprogram), power towers, Ackermann's function, whatever, then I believe that would be no better than increasing the original 10^100 by some constant proportional to the complexity of what you described (plus some logarithmic interpreter factor; see Kolmogorov complexity).
So let's frame this another way:
Your opponent picks a finite computable number, and gives you a function that tells you if the number is smaller/larger/equal by computing it. He also gives you a representation for the output (in a sane world this would be "you can only print numbers like 99999", but he can make it more complicated; it actually doesn't matter). Proceed to measure the size of this function in bits.
Now, answer with your own function, which is twice the size of his function (in bits), and prints out the largest number it can while keeping the code to less than 2N bits in length. (You use the same representation he chose: In a world where you can only print out numbers like "99999", that's what you do. If you can define functions, it gets slightly more complicated.)
I do not understand the purpose here, but this is what I thought of:
Reading your comments, I suppose you aren't looking for an infinitely large number, but a "super large number" instead. Whatever the number is, it will have a large number of digits; how you got them isn't the concern. Keeping this in mind:
No complex computation is required. Just type random keys on your numeric keypad to get a super large number, then have a program randomly add/remove/modify digits of that number. You get a list of very large numbers - select any one of them.
e.g: 3672036025039629036790672927305060260103610831569252706723680972067397267209
and keep modifying/adding digits to get more numbers
PS: If you state the purpose in your question clearly, we might be able to give better answers.

Guessing a number knowing only if the number proposed is lower or higher?

I need to guess a number. I can only see if the number I'm proposing is lower or higher. Performance matters a whole lot, so I thought of the following algorithm:
Let's say the number I'm trying to guess is 600.
I start out with the number 1000 (or for even higher performance, the average result of previous numbers).
I then check if 1000 is higher or lower than 600. It is higher.
I then divide the number by 2 (so that it is now 500), and check if it is lower or higher than 600. It is lower.
I then find the difference and divide it by 2 in the following way to retrieve a new number: (1000 + 500) / 2. The result is 750. I then check that number.
And so on.
Is this the best approach or is there a smarter way of doing this? For my case, every guess takes approximately 500 milliseconds, and I need to guess quite a lot of numbers in as low time as possible.
I can roughly assume that the average result of previous guesses is close to the upcoming numbers too, so there's a pattern there which I can use for my own advantage.
Yes, binary search is the most effective way of doing this; it is what you described. For a number between 1 and N, binary search runs in O(log N) time.
So here is the algorithm to find a number between 1-N
int a = 1, b = n, guess = average of previous answers;
while (guess is wrong) {
    if (guess lower than answer) { a = guess; }
    else if (guess higher than answer) { b = guess; }
    guess = (a + b) / 2;
}
Well, you're taking the best possible approach without the extra information - it's a binary search, basically.
Exactly how you use the "average result of previous guesses" is up to you; I would suggest biasing the results towards that average, but you'd need to analyse just how indicative previous results are in order to work out the best approach. Don't just use the average: use the complete distribution.
For example, if all the results have been in the range 600-700 (even though the hypothetical range is up to 1000) with an average of 670, you might start with 670 but if it says "guess higher" then you would probably want to choose a value between 670 and 700 as your next guess, rather than 835 (which is very likely to be higher than the real result).
I suggest you log all the results from previous enquiries, so you can then use that as test data for alternative approaches.
In general, binary search starting at the middle point of the range is the optimal strategy. However, you have additional specific information which may make this a suboptimal strategy. This depends critically in what exactly "close to the average of the previous results" means.
If numbers are close to the previous average then dividing by 2 in the second step is not optimal.
Example: Previous numbers 630, 650, 620, 660. You start with 640.
Your number is actually closer. Imagine that it is 634.
The number is lower. If in the second step you divide by 2, you get 320, thus losing any advantage about the previous average numbers.
You should analyze the behaviour further. It may be optimal, in your specific case, to start at the mean of the N previous numbers and then add or subtract some quantity related to the standard deviation of the previous numbers.
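One way to act on that suggestion, as a sketch under assumptions not in the question (a known 1..1000 range, and a compare(g) helper returning -1/0/1 for too-low/exact/too-high):

```python
from statistics import mean, pstdev

def guess(compare, history, lo=1, hi=1000):
    """Distribution-aware guessing sketch.  compare(g) returns -1 if g
    is too low, 0 if exact, 1 if too high; the 1..1000 range and the
    compare protocol are assumptions for illustration."""
    # Phase 1: with history, start at its mean and step outward by one
    # standard deviation, doubling the step, until the target is passed.
    if history:
        g = round(mean(history))
        step = max(1, round(pstdev(history)))
        while lo <= g <= hi:
            c = compare(g)
            if c == 0:
                return g
            if c < 0:
                lo, g = g + 1, g + step
            else:
                hi, g = g - 1, g - step
            step *= 2
    # Phase 2: ordinary bisection on whatever interval remains.
    while lo < hi:
        g = (lo + hi) // 2
        c = compare(g)
        if c == 0:
            return g
        if c < 0:
            lo = g + 1
        else:
            hi = g - 1
    return lo
```

For the example above (history 630, 650, 620, 660, target 634) this needs 5 comparisons instead of roughly 10 for a plain bisection of 1..1000.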
Yes, binary search (your algorithm) is correct here. However there is one thing missing in the standard binary search:
For binary search you normally need to know the maximum and minimum between which you are searching. In case you do not know this, you have to iteratively find the maximum in the beginning, like so:
Start with zero
if it is higher than the number searched, zero is your maximum and you have to find a minimum
if it is lower than the number searched, zero is your minimum and you have to find a maximum
You can search for your maximum/minimum by starting at 1 or -1 and always multiplying by two until you find a number which is greater/smaller
When you always multiply by two, you will be much faster than when you search linearly.
Do you know the range of possible values? If yes, always start in the middle and do exactly what you describe.
A standard binary search between 0 and N (where N is the given number) will give you the answer in O(log N) time.
int a = 1, b = n + 1, guess = average of previous answers;
while (guess is wrong) {
    if (guess lower than answer) { a = guess; }
    else if (guess higher than answer) { b = guess; }
    guess = (a + b) / 2;
}
You have to add 1 to n; otherwise you can never reach n itself, since integer division in (a + b) / 2 rounds down.
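The off-by-one can be seen in a runnable version using the half-open upper bound (names are illustrative):

```python
def guess_1_to_n(n, answer):
    """Binary search over [1, n] with upper sentinel n + 1, so that
    integer division can still produce n.  Returns (value, guesses).
    With b = n alone, guess = (a + b) // 2 could never equal n."""
    a, b = 1, n + 1
    guesses = 0
    guess = (a + b) // 2
    while True:
        guesses += 1
        if guess == answer:
            return guess, guesses
        if guess < answer:
            a = guess
        else:
            b = guess
        guess = (a + b) // 2
```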
I gave an answer to a similar question, "Optimal algorithm to guess any random integer without limits?"
The algorithm there does not just search for the chosen number; it estimates the median of the distribution of a number that you may re-choose at each step! And the number can even be drawn from the real domain ;)
