Checking if an integer is an integer power of another? - algorithm

This is identical to the question found on Check if one integer is an integer power of another, but I am wondering about the complexity of a method that I came up with to solve this problem.
Given an integer n and another integer m, is n = m^p for some integer p? Note that ^ here denotes exponentiation, not xor.
There is a simple O(log_m n) solution based on dividing n repeatedly by m until it's 1 or until there's a non-zero remainder.
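For reference, that repeated-division check is just the following (a minimal Python sketch; the function name is illustrative):

def is_power_naive(n, m):
    # Repeatedly divide n by m: at most log_m(n) divisions.
    if m < 2 or n < 1:
        return n == m
    while n > 1:
        if n % m != 0:
            return False
        n //= m
    return True   # n reached 1, so the original n was m^p for some p >= 0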
I'm thinking of a method inspired by binary search, and it's not clear to me how complexity should be calculated in this case.
Essentially you start with m, then you go to m^2, then m^4, m^8, m^16, ...
When you find that m^{2^k} > n, you check the range bounded by m^{2^{k-1}} and m^{2^k}. Is this solution O(log_2 (log_m(n)))?
Somewhat related to this, if I do something like
m^2 * m^2
vs.
m * m * m * m
Do these two have the same complexity? If they do, then I think the algorithm I came up with is still O(log_m(n)).

Not quite. First of all, let's assume that multiplication is O(1), and exponentiation a^b is O(log b) (using exponentiation by squaring).
Now using your method of doubling the exponent p_candidate and then doing a binary search, you can find the real p in log(p) steps (or observe that p does not exist). But each probe of the binary search requires you to compute m^p_candidate, where p_candidate is bounded by p, so by assumption each such exponentiation costs O(log(p)). So the overall time complexity is O(log^2(p)).
But we want to express the time complexity in terms of the inputs n and m. From the relationship n = m^p, we get p = log(n)/log(m), and hence log(p) = log(log(n)/log(m)). Hence the overall time complexity is
O(log^2(log(n)/log(m)))
If you want to get rid of the m, you can provide a looser upper bound by using
O(log^2(log(n)))
which is close to, but not quite O(log(log(n))).
(Note that you can always omit the logarithmic bases in the O-notation since all logarithmic functions differ only by a constant factor.)
Now, the interesting question is: is this algorithm better than one that is O(log(n))? I haven't proved it, but I'm pretty certain that O(log^2(log(n))) is contained in O(log(n)) but not vice versa. Anyone care to prove it?
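For what it's worth, here is a minimal Python sketch of the doubling-plus-binary-search approach described above (names are illustrative; Python's integer ** uses exponentiation by squaring internally, and the analysis above additionally assumes each multiplication is O(1)):

def is_power(n, m):
    # Returns True iff n == m**p for some integer p >= 0.
    if m < 2 or n < 1:
        return n == m
    # Phase 1: keep doubling the exponent until m**hi >= n.
    hi = 1
    while m ** hi < n:
        hi *= 2
    # Phase 2: binary-search the exponent in [hi//2, hi].
    lo = hi // 2
    while lo <= hi:
        mid = (lo + hi) // 2
        v = m ** mid
        if v == n:
            return True
        elif v < n:
            lo = mid + 1
        else:
            hi = mid - 1
    return False

For example, is_power(4096, 8) finds p = 4 after about log(p) probes, and is_power(4097, 8) reports False.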

Related

Can one compute the nth Fibonacci number in time O(n) or O(1)? Why?

I asked myself whether one can compute the nth Fibonacci number in O(n) or even O(1) time, and why.
Can someone explain, please?
Yes. It is called Binet's formula, or sometimes, incorrectly, De Moivre's formula (the real De Moivre's formula is a different one, but De Moivre did discover Binet's formula before Binet), and it involves the golden ratio Phi. The mathematical reasoning behind it (see link) is a bit involved, but doable:
F(n) = (Phi^n - psi^n) / sqrt(5), where Phi = (1 + sqrt(5))/2 and psi = (1 - sqrt(5))/2 = 1 - Phi.
While evaluating it in floating point is approximate, Fibonacci numbers are integers -- so, once you achieve a high enough precision (depends on n), you can just round the value from Binet's formula to the closest integer.
Precision, however, depends on constants, so you basically have two versions, one with float numbers and one with double-precision numbers, the second also running in constant time, but slightly slower. For large n you will need an arbitrary-precision number library, and those have processing times that do depend on the numbers involved; as observed by @MattTimmermans, you'll then probably end up with an O(log^2 n) algorithm. This should happen for values of n large enough that you'd be stuck with a large-number library no matter what (but I'd need to test this to be sure).
Otherwise, the Binet formula is mainly made up of two exponentiations and one division (the three sums and divisions by 2 are probably negligible), while the recursive formula mainly employs function calls and the iterative formula uses a loop. While the first formula is O(1) and the other two are O(n), the actual times are more like a, b n + c and d n + e, with values for a, b, c, d and e that depend on the hardware, compiler, implementation, etc. With a modern CPU it is very likely that a is not much larger than b or d, which means that the O(1) formula should be faster for almost every n. But most implementations of the iterative algorithm start with
if (n < 2) {
    return n;
}
which is very likely to be faster for n = 0 and n = 1. I feel confident that Binet's formula is faster for any n beyond the single digits.
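As a sketch of the rounding trick (in Python with double-precision floats, so only reliable for roughly n <= 70; the function name is illustrative):

import math

def fib_binet(n):
    # Binet's formula with the psi^n term dropped: since |psi| < 1,
    # rounding phi^n / sqrt(5) to the nearest integer recovers F(n)
    # exactly, as long as double precision holds up.
    sqrt5 = math.sqrt(5.0)
    phi = (1.0 + sqrt5) / 2.0
    return round(phi ** n / sqrt5)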
Instead of thinking about the recursive method, think of building the sequence from the bottom up, starting at 1+1.
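A minimal sketch of that bottom-up approach in Python:

def fib_iterative(n):
    # Build the sequence from the bottom up: O(n) additions, O(1) state.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a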
You can also use a matrix m like this:
1 1
1 0
and raise it to the power n; then output (m^n)[0][1], which is F(n) (the [0][0] entry is F(n+1)).
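A sketch of that matrix method in Python (helper names are illustrative); exponentiation by squaring keeps it to O(log n) multiplications of 2x2 matrices:

def fib_matrix(n):
    # [[1,1],[1,0]]**n == [[F(n+1), F(n)], [F(n), F(n-1)]], so the [0][1]
    # entry of the power is F(n).
    def mul(a, b):
        return [[a[0][0]*b[0][0] + a[0][1]*b[1][0],
                 a[0][0]*b[0][1] + a[0][1]*b[1][1]],
                [a[1][0]*b[0][0] + a[1][1]*b[1][0],
                 a[1][0]*b[0][1] + a[1][1]*b[1][1]]]
    result = [[1, 0], [0, 1]]          # identity matrix
    base = [[1, 1], [1, 0]]
    while n > 0:                       # square-and-multiply on the exponent
        if n & 1:
            result = mul(result, base)
        base = mul(base, base)
        n >>= 1
    return result[0][1]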

Algorithm that sorts n numbers from 0 to n^m in O(n)? where m is a constant

So I came upon this question:
we have to sort n numbers between 0 and n^3, and the claimed time complexity is O(n). The author solved it this way:
first we convert these numbers to base n in O(n); now each number has at most 3 digits (because the maximum value is n^3)
now we use radix sort, and therefore the total time is O(n)
So I have three questions:
1. Is this correct? And is it the best time possible?
2. How is it possible to convert the base of n numbers in O(n), i.e. O(1) per number? Some previous topics on this website said it's O(M(n) log(n))?!
3. And if this is true, does that mean we can sort any n numbers from 0 to n^m in O(n)?!
(I searched about converting the base of n numbers; some said it's O(log n) per number and some said it's O(n) for n numbers, so I got confused about this too.)
1) Yes, it's correct. It is the best complexity possible, because any sort would have to at least look at the numbers and that is O(n).
2) Yes, each number is converted to base-n in O(1). Simple ways to do this take O(m^2) in the number of digits, under the usual assumption that you can do arithmetic operations on numbers up to O(n) in O(1) time. m is constant so O(m^2) is O(1)... But really this step is just to say that the radix you use in the radix sort is in O(n). If you implemented this for real, you'd use the smallest power of 2 >= n so you wouldn't need these conversions.
3) Yes, if m is constant. The simplest way takes m passes in an LSB-first radix sort with a radix of around n. Each pass takes O(n) time, and the algorithm requires O(n) extra memory (measured in words that can hold n).
So the author is correct. In practice, though, this is usually approached from the other direction. If you're going to write a function that sorts machine integers, then at some large input size it's going to be faster if you switch to a radix sort. If W is the maximum integer size, then this tradeoff point will be when n >= 2^(W/m) for some constant m. This says the same thing as your constraint, but makes it clear that we're thinking about large-sized inputs only.
There is a wrong assumption here: radix sort is not O(n) in general.
As described, e.g., on Wikipedia:
if all n keys are distinct, then w has to be at least log n for a random-access machine to be able to store them in memory, which gives at best a time complexity O(n log n).
So the answer is no: the "author implementation" is (at best) O(n log n). Also, converting these numbers can probably take more than O(n).
is this correct?
Yes it's correct. If n is used as the base, then it will take 3 radix sort passes, where 3 is a constant, and since time complexity ignores constant factors, it's O(n).
and the best time possible?
Not always. Depending on the maximum value of n, a larger base could be used so that the sort is done in 2 radix sort passes or 1 counting sort pass.
how is it possible to convert the base of n numbers in O(n)? like O(1) for each number?
O(1) just means a constant time complexity == fixed number of operations per number. It doesn't matter if the method chosen is not the fastest if only time complexity is being considered. For example, using a, b, c to represent most to least significant digits and x as the number, then using integer math: a = x/(n^2), b = (x-(a*n^2))/n, c = x%n (assumes x >= 0). (side note - if n is a constant, then an optimizing compiler may convert the divisions into a multiply and shift sequence).
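As a sketch in Python (the function name is illustrative), the three formulas above translate directly:

def base_n_digits_3(x, n):
    # Split 0 <= x < n**3 into three base-n digits, most significant first:
    # a fixed number of operations per number, i.e. O(1).
    a = x // (n * n)
    b = (x - a * n * n) // n
    c = x % n
    return a, b, c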
and if this is true, then it means we can sort any n numbers from 0 to n^m in O(n) ?!
Only if m is considered a constant. Otherwise it's O(m n).
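Putting the pieces together, here is a minimal Python sketch of the base-n LSD radix sort discussed above (the function name is illustrative, not from the original answer):

def sort_small_keys(a, m):
    # Sort n integers in [0, n**m) in O(m*n) time: m stable counting-sort
    # passes, one per base-n digit, least significant digit first.
    n = len(a)
    if n <= 1:
        return a
    for d in range(m):
        divisor = n ** d
        buckets = [[] for _ in range(n)]
        for x in a:
            buckets[(x // divisor) % n].append(x)
        a = [x for bucket in buckets for x in bucket]   # stable concatenation
    return a

For example, sort_small_keys([37, 0, 63, 5, 21], 3) sorts five keys below 5^3 = 125 in three passes.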

How to handle Big O when one variable is known to be smaller than another one?

We have 4 algorithms, all of them with complexity depending on m and n, like:
Alg1: O(m+n)
Alg2: O(mlogm + nlogn)
Alg3: O(mlogn + nlogm)
Alg4: O(m+n!) (ouch, this one sucks, but whatever)
Now, how do we handle this if we know that n > m? My first thought is: Big O notation discards constants and smaller terms because, sooner or later, the bigger term will overwhelm all the others, making them irrelevant to the computation cost.
So, can we rewrite Alg1 as O(n), or Alg2 as O(mlogm)? If so, what about the others?
Yes, you can rewrite it if you know that it is always the case that n > m. Formally, have a look at this:
if we know that n>m (always) then it follows that
O(m+n) < O(n+n) which is O(2n) = O(n) (we don't really care about the 2)
We can say the same thing about the other algorithms:
O(mlogm + nlogn) < O(nlogn + nlogn) = O(2nlogn) = O(nlogn)
I think you can see where the rest of them are going. But if you do not know that n > m then you cannot say the above.
EDIT: as @rici nicely pointed out, you also need to be careful, since it will always depend on the given function. (Note that O((2n)!) cannot be simplified to O(n!).)
With a bit of playing around you can see why this simplification is not valid:
(2n)! = (2n) * (2n-1) * (2n-2) * ... * 2 * 1 > (2n) * (2n-2) * (2n-4) * ... * 2 (dropping the odd factors)
=> (2n)! > 2^n * n! (after pulling a factor of 2 out of each of the n even terms)
Thus we can see that O((2n)!) is more like O(2^n * n!), which cannot be collapsed to O(n!). For a more accurate calculation you can see this thread: Are the two complexities O((2n + 1)!) and O(n!) equal?
Consider the problem of finding the k largest elements of an array. There's a nice O(n log k)-time algorithm for solving this using only O(k) space that works by maintaining a binary heap of at most k elements. In this case, since k is guaranteed to be no bigger than n, we could have rewritten these bounds as O(n log n) time with O(n) memory, but that ends up being less precise. The direct dependency of the runtime and memory usage on k makes clear that this algorithm takes more time and uses more memory as k changes.
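A minimal Python sketch of that heap-based selection (the function name is illustrative):

import heapq

def k_largest(xs, k):
    # Maintain a min-heap of the k largest elements seen so far: each of
    # the n elements costs at most O(log k), so O(n log k) time total and
    # O(k) extra space.
    heap = []
    for x in xs:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)   # drop the current smallest, insert x
    return sorted(heap, reverse=True)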
Similarly, consider many standard graph algorithms, like Dijkstra's algorithm. If you implement Dijkstra's algorithm using a Fibonacci heap, the runtime works out to O(m + n log n), where n is the number of nodes and m is the number of edges. If you assume your input graph is connected (so m >= n - 1), then the runtime also happens to be O(m + m log m), but that's a less precise bound than the one that we had.
On the other hand, if you implement Dijkstra's algorithm with a binary heap, then the runtime works out to O(m log n + n log n). In this case (again, assuming the graph is connected), the m log n term strictly dominates the n log n term, and so rewriting this as O(m log n) doesn't lose any precision.
Generally speaking, you'll want to give the most precise bounds that you can in the course of documenting the runtime and memory usage of a piece of code. If you have multiple terms where one clearly strictly dominates the other, you can safely discard those lower terms. But otherwise, I wouldn't recommend discarding one of the variables, since that loses precision in the bound you're giving.

Ambiguity about the Big-oh notation

I am currently trying to learn time complexity of algorithms, big-O notation and so on. However, some point confuses me a lot. I know that most of the time the input size of an array, or whatever we are dealing with, determines the running time of the algorithm. Let's say I have an unsorted array of size N and I want to find the maximum element of this array without using any special algorithm: I just iterate over the array and find the maximum. Since the size of my array is N, this process runs in O(N), or linear time. Now let M be an integer that is the square root of N, so that N can be written as M*M, or M^2. I think there is nothing wrong with replacing N by M^2: M^2 is also the size of my array, so my big-O notation could be written as O(M^2). Suddenly, my running time looks quadratic. Why does this happen?
You are correct, if it happens to be that you have some variable M such that M^2 ~= N is always true, you could say the algorithm runs in O(M^2).
But note that the algorithm now runs in time quadratic in M, not quadratic in the input: it is still linear in the size of the input.
The important thing here is defining linear/quadratic, etc.
More precisely, you have to say linear/quadratic, etc. with respect to something (N or M in your example). The most natural choice is to study the complexity with respect to the size of the input (N in your example).
Another trap with big integers is that the size of the input n is log(n) bits. So, for instance, if you loop over all integers smaller than n, your algorithm is not polynomial in the input size.

complexity of combinatorial function

How is the complexity of an algorithm involving combinatorial operations classified?
Let's say the input is m, n, and the complexity is determined by C(m,n). (C is the combination function of choosing m from n). The question is how the complexity should be categorized instead of just giving C(m,n).
I mean, to give an idea of the running time of an algorithm, you can say the algorithm has polynomial or exponential time complexity. But what to do with C(m,n)?
I know factorials can be approximated by using Stirling's approximation, but the result is still too complex to put it in a complexity class.
If you insist on retaining both m and n, then it's going to be hard to do better than Stirling's approximation. The upper bound for m alone is C(m, m/2), which is asymptotic to 2^m/√m (up to a constant factor) and thus exponential.
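For reference, applying Stirling's formula k! ~ sqrt(2 pi k) (k/e)^k to the central binomial coefficient gives, in LaTeX:

\binom{m}{m/2} = \frac{m!}{\left((m/2)!\right)^2}
  \sim \frac{\sqrt{2\pi m}\,(m/e)^m}{\pi m\,\left(m/(2e)\right)^m}
  = \sqrt{\frac{2}{\pi m}}\; 2^m

which is exactly the 2^m/√m growth (up to a constant factor) quoted above.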
You can't "put it into a complexity class" because it isn't a single-variable running time.
It's "asymptotic behaviour" is undefined, because which variable should we consider to approach infinite? Even if we said both approach infinite lim {n->inf, m->inf} nCm is undefined because their relative values are undefined. I.e. the behaviour depends not just on n and m being greater than a certain value, but their relative values as well.
The complexity depends on two variables, and nCm is a perfectly valid complexity function.
If you have a reasonable approximation for m relative to n then you can classify it more easily. Maybe it's worthwhile working out cases, such as m = O(n) and m = O(1).
Or the case m = ⌊kn⌋ where 0 <= k <= 1 is constant: together with Stirling's formula, that gives you a nice relation in one variable while still being able to consider values of m relative to n (through k).
