Algorithm for product of n numbers - algorithm

I have a question about running time of an algorithm that express the product of n numbers.. I think the best solution is divide and conquer which is based on halving the n element by 2 recursively and multiply 2 elements. The confusing part is the number of simple operations. In case of Divide and Conquer the complexity should be O(logn) So if we have 8 numbers to multiply we should end up with 3 basic steps E.g we have 8 numbers... we can halve 8 until we reach to 2 and start multiplying it.. (a1 a2 a3 a4 a5 a6 a7 a8) ... (a1*a2=b1) (a3*a4=b2) (a5*a6=b3) (a7*a8=b4) (b1*b2=c1) (b3*b4=c2) (c1*c2=final result)..However, in this result we need 7 simple multiplication. Can someone clarify this to me..?

Divide and conquer is for cases when you can divide your original set into multiple subsets which, after you've identified and created them, do not interact anymore (or only in a way that's neglectably cheap compared to the operation on each subset). In your case, you're violating the "subsets do not interact after identifying them" rule.
Also, O(x) does not mean the number of operations is less than x. It just means, that for any concrete data set of size x, there is a finite value d so that the number of operations needed is smaller than d*x. (My native language is german, i hope i didn't change the meaning when translating). So the fact that you need 8 operations on 8 data items does not, per se, mean the complexity is larger than O(log n).

If I understand your objective correctly, then the complexity should be O(n), as you can just multiply the n values in sequence. For n values, you need n-1 multiplications. No need for any divide and conquer.

Whichever way you choose to perform multiplication of n numbers, you will need to multiply all of them, making n-1 multiplications. Thus, the time complexity of this multiplication will always be O(n).
Here is how I can explain that the number of steps required to multiply n numbers by your approach is close to n, not log(n). To multiply n numbers you need to make n/2 multiplications first - first number with the second, third with the fourth and so on, till the n-1-th and n-th number. After that you have n/2 numbers and to multiply them all you need to make n/4 multiplications - first result of previous multiplication with the second, third with the fourth and so on. After that you have n/4 numbers, and you'll make n/8 multiplications. This process will end when you'll have only two numbers left - their multiplication will give you the result.
Now, let's count the total number of multiplications needed. On the first stage you've made n/2 multiplications, on the second - n/4 and so on, till the only one multiplication. You have the following sequence: n/2 + n/4 + n/8 + ... + 1. It was shown, that the sum of such sequence equals to n.

Related

Is the algorithm that involves enumerating n choose k exponetial

Say if we have an algorithm needs to list out all possibilities of choosing k elements from n elements (k<=n), is the time complexity of the particular algorithm exponential and why?
No.
There are n choose k = n!/(k!(n-k)!) possibilities [1].
Consider that, n choose k = n^k / (k!). [2].
Assuming you are keeping k constant, as n grows, the amount of possibilities increases in polynomial time.
For this example, ignore the (1/(k!)) term because it is constant. If k = 2, and you increase n from 2 to 3, then you have a 2^2 to 3^2 change. An exponential change would be from 2^2 to 2^3. This is not the same.
Keeping k constant and changing n results in a big O of O(n^k) (the 1/(k!) term is constant and you ignore it).
Thinking carefully about the size of the input instance is required since the input instance contains numbers - a basic familiarity with weak NP-hardness can also be helpful.
Assume that we fix k=1 and encode n in binary. Since the algorithm must visit n choose 1 = n numbers, it takes at least n steps. Since the magnitude of the number n may be exponential in the size of the input (the number of bits used to encode n), the algorithm in the worst case consumes exponential time.
You can get a feel for this exponential-time behavior by writing a simple C program that prints all the numbers from 1 to n with n = 2^64 and see how far you get in a minute. While the input is only 64 bits long, it would take you about 600 years to print all the numbers assuming that your device can print a million numbers per second.
An algorithm that finds all possibilities of choosing k elements from n unique elements (k<=n), does NOT have an exponential time complexity, O(K^n), because it instead has a factorial time complexity, O(n!). The relevant formula is:
p = n!/(k!(n-k)!)

big-Theta running time in terms of input size N

while N-bit integer a > 1,
a = a / 2
I was thinking it is log(n) because each time you go through the while loop you are dividing a by two but my friend thinks its 2 logn(n).
Clearly your algorithm is in big-Theta(log(a)), where a is your number
But as far as I understand your problem, you want to know the asymptotic runtime depending on the amount of bits of your number
That's really difficult to say and depends on your number:
Let's say you have an n-bit integer and the Most significant bit is 1. You have to divide it n-times, to get a number smaller than 1.
Now let's look at a integer where only the least siginficant bit is 1 (so it equals the number 1 in decimal system). There you need just one division.
So i would say, it'll take n/2 in average which makes it big-Theta(n) where n is the amount of bits of your number. The worst-case is also in big-Theta(n) and the best-case is in big-Theta(1)
NOTE: Dividing a number by two in binary system has a similar effect as dividing a number by ten in decimal system
Dividing an integer by two can be efficiently implemented by taking a number in binary notation and shifting the bits. In the worst case, all the bits are set at you have to shift (n-1) bits for the first division, (n-2) bits for the second, etc. etc. until you shift 1 bit on the last iteration and find the number becomes equal to 1, at which point you stop. This means your algorithm must shift 1+2+...+(n-1) = n(n-1)/2 bits, making your algorithm O(n^2) in the number of bits of input.
A more efficient algorithm that will leave a with the same value is a = (a == 0 ? 0 : 1). This generates the same answer in linear time (equality checking is linear in the number of bits) and it works because your code will only leave a = 0 if a is originally zero; in all other cases, the highest-order bit ends up in the unit's place.

Understanding the running time analysis from an exercise of CLRS

Here's the problem I am looking for an answer for:
An array A[1...n] contains all the integers from 0 to n except one. It would be easy to determine the missing
integer in O(n) time by using an auxiliary array B[0...n] to record which numbers appear in A. In this
problem, however, we cannot access an entire integer in A with a single operation. The elements of A are
represented in binary, and the only operation we can use to access them is ”fetch the jth bit of A[i],” which
takes constant time.
Show that if we use only this operation, we can still determine the missing integer in O(n) time.
I have this approach in mind:
If I didn't have the above fetching restriction, I would have taken all the numbers and did an XOR of them together. Then XOR the result with all numbers from 1..n. And the result of this would be my answer.
In this problem similarly I can repeatedly XOR the bits of different numbers at a distance of log(n+1) bits with each other for all elements in array and then XOR them with the elements 1...n but the complexity comes out to be O(nlogn) in my opinion.
How to achieve the O(n) complexity?
Thanks
You can use a variation of radix sort:
sort numbers according to MSb (Most Significant bit)
You get two lists of sizes n/2, n/2-1. You can 'drop' the list with n/2 elements - the missing number is not there.
Repeat for the second MSb and so on.
At the end, the 'path' you have chosen (the bit with the smaller list for each bit) will represent the missing number.
Complexity is O(n + n/2 + ... + 2 + 1), and since n + n/2 + .. + 1 < 2n - this is O(n)
This answer assumes for simplicity that n=2^k for some integer k (this relaxation can be later dropped off, by doing s 'special' handle for the MSb).
You have n integers with range [0..n]. You can inspect every number's most significant bit and divide these number into to groups C(with MSB 0) and D(with MSB 1). Since you know the range is [0..n], you can calculate how many numbers with MSB 0 in this range, called S1, and how many number with MSB 1 in this range, called S2. If the size of C is not equal to S1, then you know the missing number has MSB 0. Otherwise, you know the missing number has MSB 1. Then, you can recursively solve this problem. Since each recursive call takes linear time and each recursive call except the first one can half the problem size, the total running time is linear.

Bit cost of bit shift

I updated a question I asked before with this but as the original question was answered I'm guessing I should ask it seperately in a new question.
Take for example the simple multiplication algorithm. I see in numerous places the claim that this is a Log^2(N) operation. The given explanation is that this is due to it consisting of Log(N) additions of a Log(N) number.
The problem I have with this is although that is true it ignores the fact that each of those Log(N) numbers will be the result of a bit shift and by the end we will have bitshifted at least Log(N) times. As bitshift by 1 is a Log(N) operation the bitshifts considered alone give us Log^2(N) operations.
It therefore makes no sense to me when I see it further claimed that in practice multiplication doesn't in fact use Log^2(N) operations as various methods can reduce the number of required additions. Since the bitshifting alone gives us Log^2(N) I'm left confused as to how that claim can be true.
In fact any shift and add method would seem to have this bit cost irrespective of how many addition's there are.
Even if we use perfect minimal bit coding any Mbit by Nbit multiplication will result in an approx M+Nbit Number so M+N bits will have to have been shifted at least N times just to output/store/combine terms, meaning a minimum N^2 bit operations.
This seems to contradict the claimed number of operations for Toom-Cook etc so can someone please point out where my reasoning is flawed.
I think the way around this issue is the fact that you can do the operation
a + b << k
without having to perform any shifts at all. If you imagine what the addition would look like, it would look something like this:
bn b(n - 1) ... b(n-k) b(n-k-1) b0 0 ... 0 0 0 0
an a(n-1) ... ak a(k-1) ... a3 a2 a1 a0
In other words, the last k digits of the number will just be the last k digits of the number a, the middle digits will consist of the sum of a subset of b's digits and a subset of a's digits, and the leading digits can be formed by doing a ripple propagation of any carries up through the remaining digits of b. In other words, the total runtime will be proportional to the number of digits in a and b, plus the number of places to do the shift.
The real trick here is realizing that you can shift by k places without doing k individual shifts by one place over. Rather than shuffling everything down k times, you can just figure out where the bits are going to end up and write them there directly. In other words, the cost of a shift by k bits is not k times the cost of a shift by 1 bit. It's O(N + k), where N is the number of bits in the number.
Consequently, if you can implement multiplication in terms of some number of "add two numbers with a shift" operations, you will not necessarily have to do O((log n)2) bit operations. Each addition does O(log n + k) total bit operations, so if k is small (say, O(log n)) and you only do a small number of additions, then you can do better than O((log n)2) bit operations.
Hope this helps!

Why is Sieve of Eratosthenes more efficient than the simple "dumb" algorithm?

If you need to generate primes from 1 to N, the "dumb" way to do it would be to iterate through all the numbers from 2 to N and check if the numbers are divisable by any prime number found so far which is less than the square root of the number in question.
As I see it, sieve of Eratosthenes does the same, except other way round - when it finds a prime N, it marks off all the numbers that are multiples of N.
But whether you mark off X when you find N, or you check if X is divisable by N, the fundamental complexity, the big-O stays the same. You still do one constant-time operation per a number-prime pair. In fact, the dumb algorithm breaks off as soon as it finds a prime, but sieve of Eratosthenes marks each number several times - once for every prime it is divisable by. That's a minimum of twice as many operations for every number except primes.
Am I misunderstanding something here?
In the trial division algorithm, the most work that may be needed to determine whether a number n is prime, is testing divisibility by the primes up to about sqrt(n).
That worst case is met when n is a prime or the product of two primes of nearly the same size (including squares of primes). If n has more than two prime factors, or two prime factors of very different size, at least one of them is much smaller than sqrt(n), so even the accumulated work needed for all these numbers (which form the vast majority of all numbers up to N, for sufficiently large N) is relatively insignificant, I shall ignore that and work with the fiction that composite numbers are determined without doing any work (the products of two approximately equal primes are few in number, so although individually they cost as much as a prime of similar size, altogether that's a negligible amount of work).
So, how much work does the testing of the primes up to N take?
By the prime number theorem, the number of primes <= n is (for n sufficiently large), about n/log n (it's n/log n + lower order terms). Conversely, that means the k-th prime is (for k not too small) about k*log k (+ lower order terms).
Hence, testing the k-th prime requires trial division by pi(sqrt(p_k)), approximately 2*sqrt(k/log k), primes. Summing that for k <= pi(N) ~ N/log N yields roughly 4/3*N^(3/2)/(log N)^2 divisions in total. So by ignoring the composites, we found that finding the primes up to N by trial division (using only primes), is Omega(N^1.5 / (log N)^2). Closer analysis of the composites reveals that it's Theta(N^1.5 / (log N)^2). Using a wheel reduces the constant factors, but doesn't change the complexity.
In the sieve, on the other hand, each composite is crossed off as a multiple of at least one prime. Depending on whether you start crossing off at 2*p or at p*p, a composite is crossed off as many times as it has distinct prime factors or distinct prime factors <= sqrt(n). Since any number has at most one prime factor exceeding sqrt(n), the difference isn't so large, it has no influence on complexity, but there are a lot of numbers with only two prime factors (or three with one larger than sqrt(n)), thus it makes a noticeable difference in running time. Anyhow, a number n > 0 has only few distinct prime factors, a trivial estimate shows that the number of distinct prime factors is bounded by lg n (base-2 logarithm), so an upper bound for the number of crossings-off the sieve does is N*lg N.
By counting not how often each composite gets crossed off, but how many multiples of each prime are crossed off, as IVlad already did, one easily finds that the number of crossings-off is in fact Theta(N*log log N). Again, using a wheel doesn't change the complexity but reduces the constant factors. However, here it has a larger influence than for the trial division, so at least skipping the evens should be done (apart from reducing the work, it also reduces storage size, so improves cache locality).
So, even disregarding that division is more expensive than addition and multiplication, we see that the number of operations the sieve requires is much smaller than the number of operations required by trial division (if the limit is not too small).
Summarising:
Trial division does futile work by dividing primes, the sieve does futile work by repeatedly crossing off composites. There are relatively few primes, but many composites, so one might be tempted to think trial division wastes less work.
But: Composites have only few distinct prime factors, while there are many primes below sqrt(p).
In the naive method, you do O(sqrt(num)) operations for each number num you check for primality. Ths is O(n*sqrt(n)) total.
In the sieve method, for each unmarked number from 1 to n you do n / 2 operations when marking multiples of 2, n / 3 when marking those of 3, n / 5 when marking those of 5 etc. This is n*(1/2 + 1/3 + 1/5 + 1/7 + ...), which is O(n log log n). See here for that result.
So the asymptotic complexity is not the same, like you said. Even a naive sieve will beat the naive prime-generation method pretty fast. Optimized versions of the sieve can get much faster, but the big-oh remains unchanged.
The two are not equivalent like you say. For each number, you will check divisibility by the same primes 2, 3, 5, 7, ... in the naive prime-generation algorithm. As you progress, you check divisibility by the same series of numbers (and you keep checking against more and more as you approach your n). For the sieve, you keep checking less and less as you approach n. First you check in increments of 2, then of 3, then 5 and so on. This will hit n and stop much faster.
Because with the sieve method, you stop marking mutiples of the running primes when the running prime reaches the square root of N.
Say, you want to find all primes less than a million.
First you set an array
for i = 2 to 1000000
primetest[i] = true
Then you iterate
for j=2 to 1000 <--- 1000 is the square root of 10000000
if primetest[j] <--- if j is prime
---mark all multiples of j (except j itself) as "not a prime"
for k = j^2 to 1000000 step j
primetest[k] = false
You don't have to check j after 1000, because j*j will be more than a million.
And you start from j*j (you don't have to mark multiples of j less than j^2 because they are already marked as multiples of previously found, smaller primes)
So, in the end you have done the loop 1000 times and the if part only for those j's that are primes.
Second reason is that with the sieve, you only do multiplication, not division. If you do it cleverly, you only do addition, not even multiplication.
And division has larger complexity than addition. The usual way to do division has O(n^2) complexity, while addition has O(n).
Explained in this paper: http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
I think it's quite readable even without Haskell knowledge.
the first difference is that division is much more expensive than addition. Even if each number is 'marked' several times, it's trivial when compared with the huge number of divisions needed for the 'dumb' algorithm.
A "naive" Sieve of Eratosthenes will mark non-prime numbers multiple times.
But, if you have your numbers on a linked list and remove numbers taht are multiples (you will still need to walk the remainder of the list), the work left to do after finding a prime is always smaller than it was before finding the prime.
http://en.wikipedia.org/wiki/Prime_number#Number_of_prime_numbers_below_a_given_number
the "dumb" algorithm does i/log(i) ~= N/log(N) work for each prime number
the real algorithm does N/i ~= 1 work for each prime number
Multiply by roughly N/log(N) prime numbers.
It was a time when I was trying to find an efficient way of finding sum of primes less than x:
There I decided to use N by N square table and started checking if numbers with a unit digits in [1,3,7,9]
But Eratosthenes Method of prime made it a little easier: How
Let you want to know if N is prime or Not
You started finding factorization. So you will realize when N is factorized
when you divide N with the highest factor quotient will be less.
So, You take a number: int(sqrt(N)) = K(say) divides N you get somewhat same and close number to K
Now let's say you divide N with u<K, but if "U" is not prime and one of the prime factors of U is V(prime) then will obviously be less than U (V<U) and V will also divide N
then
why not divide and check if N is prime or not by DIVIDING 'N' WITH ONLY PRIMES LESS THAN K=int(sqrt(N))
Number of times for which loop Keeps executing = π(√n)
This is how the brilliant idea of Eratosthenes starts taking pictures and will start giving you intuition behind this all.
Btw using the Sieve of Eratosthenes one can find sum of primes less than a multiple of 10.
because for a given column you just check need to check their unit digits[1,3,7,9] and for how many times a particular unit digit is repeating.
Being new to Stack Overflow Community! Would like to know suggestions on the same if anything is wrong.

Resources