Big-O of while loop with an indeterminate number of iterations - random

I can't seem to rationalize the Big-O notation for a while loop that has an indeterminate number of iterations.
In my code for a personal project, I have a list containing all 0s. I then implement a while loop that will generate a random integer between 0 and 9. If the value in the list at the index of the random number is a 0, then the value is written to a 1 and the while loop exits. Otherwise, a random number is generated again and the process repeats.
I'm not entirely sure what the time complexity of this would be, however. For example, if after 9 iterations of the algorithm, every single value in the list except index 9 is 1, and if the random number generator just happens to not generate the number 9 for, say, 99 iterations, then it would exit after 99 + 9 iterations. Wouldn't the worst-case be O(infinity)? I don't think this is possible, but I figured I'd ask since I wasn't sure.
My textbooks and online resources don't seem to provide much insight on examples such as this. I'm sure that the best-case would be O(1), but the average and worst cases I'm a bit unsure about.
I found a similar problem that has the same premise. Here's the pseudocode, where n is some integer of arbitrary size:
sample_found = false
while(!sample_found) {
if (rand(0,n) == 0) {
sample_found = true
}
}
In the worst case, this would run infinitely, right? I'm not sure about average case, either.

It sounds like you're using IID Bernoulli trials to control looping, with a probability p=0.1 of continuing. Assuming that's the case, you can just use the Geometric distribution.
The mean of this distribution is just 1/p, so 10, and I'd use quantiles to further understand how many draws would be needed to finish. For example:
10% of the time you'd expect to finish immediately
50% of the runs you'd need to loop 6 times or less
90% of runs finish by 21 loops
99% of runs finish by 43 loops
calculated in R, using qgeom(c(0.1, 0.5, 0.9, 0.99), 0.1).
The worst case obviously goes out to infinity, but in practice you'd be unlikely to loop 200 times. 1-pgeom(200, 0.1) gives 6e-10, so you can expect to iterate your loop more than a billion times before needing to wait this many iterations.

Related

Big O Notation O(n^2) what does it mean?

For example, it says that in 1 sec 3000 number are sorted with selection sort. How can we predict how many numbers are gonna be sorted in 10 sec ?
I checked that selection sort needs O(n^2) but I dont understand how I am gonna calculate how many numbers are gonna be sorted in 10 sec.
We cannot use big O to reliably extrapolate actual running times or input sizes (whichever is the unknown).
Imagine the same code running on two machines A and B, different parsers, compilers, hardware, operating system, array implementation, ...etc.
Let's say they both can parse and run the following code:
procedure sort(reference A)
declare i, j, x
i ← 1
n ← length(A)
while i < n
x ← A[i]
j ← i - 1
while j >= 0 and A[j] > x
A[j+1] ← A[j]
j ← j - 1
end while
A[j+1] ← x[3]
i ← i + 1
end while
end procedure
Now system A spends 0.40 seconds on the initialisation part before the loop starts, independent on what A is, because on that configuration the initiation of the function's execution context including the allocation of the variables is a very, very expensive operation. It also needs to spend 0.40 seconds on the de-allocation of the declared variables and the call stack frame when it arrives at the end of the procedure, again because on that configuration the memory management is very expensive. Furthermore, the length function is costly as well, and takes 0.19 seconds. That's a total overhead of 0.99 seconds
On system B this memory allocation and de-allocation is cheap and takes 1 microsecond. Also the length function is fast and needs 1 microsecond. That's a total overhead that is 2 microseconds.
System A is however much faster on the rest of the statements in comparison with system B.
Both implementations happen to need 1 second to sort an array A having 3000 values.
If we now take the reasoning that we could predict the array size that can be sorted in 10 seconds based on the results for 1 second, we would say:
𝑛 = 3000, and the duration is 1 second which corresponds to 𝑛² = 9 000 000 operations. So if 9 000 000 operations corresponds to 1 second, then 90 000 000 operations correspond to 10 seconds, and 𝑛 = √(𝑛²) ~= 9 487 (the size of the array that can be sorted in 10 seconds).
However, if we follow the same reasoning, we can look at the time needed for completing the outer loop only (without the initialisation overhead), which also is O(𝑛²) and thus the same reasoning can be followed:
𝑛 = 3000, and the duration in system A is 0.01 second which corresponds to 𝑛² = 9 000 000 operations. So if 9 000 000 operations can be executed in 0.01 second then in 10 - 0.99 seconds (overhead is subtracted) we can execute 9.01 / 0.01 operations, i.e 𝑛² = 8 109 000 000 operations, and now 𝑛 = √(𝑛²) ~= 90 040.
The problem is that using the same reasoning on big O, the predicted outcomes differ by a factor of about 10!
We may be tempted to think that this is now only a "problem" of constant overhead, but similar things can be said about operations in the outer loop. For instance it might be that x ← A[i] has a relatively high cost for some reason on some system. These are factors that are not revealed in the big O notation, which only retains the most significant factor, omitting linear and constant factors that play a role.
The actual running time for an actual input size is dependent on a more complex function that is likely close to polynomial, like 𝑛² + 𝑎𝑛 + 𝑏. These coefficients 𝑎, and 𝑏 would be needed to make a more reasonable prediction possible. There might even be function components that are non-polynomial, like 𝑛² + 𝑎𝑛 + 𝑏 + 𝑐√𝑛... This may seem unlikely, but systems on which the code runs may do all kinds of optimisations while code runs which may have such or similar effect on actual running time.
The conclusion is that this type of reasoning gives no guarantee that the prediction is anywhere near the reality -- without more information about the actual code, system on which it runs,... etc, it is nothing more than a guess. Big O is a measure for asymptotic behaviour.
As the comments say, big-oh notation has nothing to do with specific time measurements; however, the question still makes sense, because the big-oh notation is perfectly usable as a relative factor in time calculations.
Big-oh notation gives us an indication of how the number of elementary operations performed by an algorithm varies as the number of items to process varies.
Simple algorithms perform a fixed number of operations per item, but in more complicated algorithms the number of operations that need to be performed per item varies as the number of items varies. Sorting algorithms are a typical example of such complicated algorithms.
The great thing about big-oh notation is that it belongs to the realm of science, rather than technology, because it is completely independent of your hardware, and of the speed at which your hardware is capable of performing a single operation.
However, the question tells us exactly how much time it took for some hypothetical hardware to process a certain number of items, so we have an idea of how much time that hardware takes to perform a single operation, so we can reason based on this.
If 3000 numbers are sorted in 1 second, and the algorithm operates with O( N ^ 2 ), this means that the algorithm performed 3000 ^ 2 = 9,000,000 operations within that second.
If given 10 seconds to work, the algorithm will perform ten times that many operations within that time, which is 90,000,000 operations.
Since the algorithm works in O( N ^ 2 ) time, this means that after 90,000,000 operations it will have sorted Sqrt( 90,000,000 ) = 9,486 numbers.
To verify: 9,000,000 operations within a second means 1.11e-7 seconds per operation. Since the algorithm works at O( N ^ 2 ), this means that to process 9,486 numbers it will require 9,486 ^ 2 operations, which is roughly equal to 90,000,000 operations. At 1.11e-7 seconds per operation, 90,000,000 operations will be done in roughly 10 seconds, so we are arriving at the same result via a different avenue.
If you are seriously pursuing computer science or programming I would recommend reading up on big-oh notation, because it is a) very important and b) a very big subject which cannot be covered in stackoverflow questions and answers.

How to efficiently compute average on the fly (moving average)?

I come up with this
n=1;
curAvg = 0;
loop{
curAvg = curAvg + (newNum - curAvg)/n;
n++;
}
I think highlights of this way are:
- It avoids big numbers (and possible overflow if you would sum and then divide)
- you save one register (not need to store sum)
The trouble might be with summing error - but I assume that generally there shall be balanced numbers of round up and round down so the error shall not sum up dramatically.
Do you see any pitfalls in this solution?
Have you any better proposal?
Your solution is essentially the "standard" optimal online solution for keeping a running track of average without storing big sums and also while running "online", i.e. you can just process one number at a time without going back to other numbers, and you only use a constant amount of extra memory. If you want a slightly optimized solution in terms of numerical accuracy, at the cost of being "online", then assuming your numbers are all non-negative, then sort your numbers first from smallest to largest and then process them in that order, the same way you do now. That way, if you get a bunch of numbers that are really small about equal and then you get one big number, you will be able to compute the average accurately without underflow, as opposed to if you processed the large number first.
I have used this algorithm for many years.
The loop is any kind of loop. Maybe it is individual web sessions or maybe true loop. The point is all you need to track is the current count (N) and the current average (avg). Each time a new value is received, apply this algorithm to update the average. This will compute the exact arithmetic average. It has the additional benefit that it is resistant to overflow. If you have a gazillion large numbers to average, summing them all up may overflow before you get to divide by N. This algorithm avoids that pitfall.
Variables that are stored during the computation of the average:
N = 0
avg = 0
For each new value: V
N=N+1
a = 1/N
b = 1 - a
avg = a * V + b * avg

Distinct digit count

Is is possible to count the distinct digits in a number in constant time O(1)?
Suppose n=1519 output should be 3 as there are 3 distinct digits(1,5,9).
I have done it in O(N) time but anyone knows how to find it in O(1) time?
I assume N is the number of digits of n. If the size of n is unlimited, it can't be done in general in O(1) time.
Consider the number n=11111...111, with 2 trillion digits. If I switch one of the digits from a 1 to a 2, there is no way to discover this without in some way looking at every single digit. Thus processing a number with 2 trillion digits must take (of the order of) 2 trillion operations at least, and in general, a number with N digits must take (of the order of) N operations at least.
However, for almost all numbers, the simple O(N) algorithm finishes very quickly because you can just stop as soon as you get to 10 distinct digits. Almost all numbers of sufficient length will have all 10 digits: e.g. the probability of not terminating with the answer '10' after looking at the first 100 digits is about 0.00027, and after the first 1000 digits it's about 1.7e-45. But unfortunately, there are some oddities which make the worst case O(N).
After seeing that someone really posted a serious answer to this question, I'd rather repeat my own cheat here, which is a special case of the answer described by #SimonNickerson:
O(1) is not possible, unless you are on radix 2, because that way, every number other than 0 has both 1 and 0, and thus my "solution" works not only for integers...
EDIT
How about 2^k - 1? Isn't that all 1s?
Drat! True... I should have known that when something seems so easy, it is flawed somehow... If I got the all 0 case covered, I should have covered the all 1 case too.
Luckily this case can be tested quite quickly (if addition and bitwise AND are considered an O(1) operation): if x is the number to be tested, compute y this way: y=(x+1) AND x. If y=0, then x=2^k - 1. because this is the only case when all the bits needed to be flipped by the addition. Of course, this is quite a bit flawed, as with bit lengths exceeding the bus width, the bitwise operators are not O(1) anymore, but rather O(N).
At the same time, I think it can be brought down to O(logN), by breaking the number into bus width size chunks, and AND-ing together the neighboring ones, repeating until only one is left: if there were no 0s in the number tested, the last one will be full 1s too...
EDIT2: I was wrong... This is still O(N).

Big O confusion

I'm testing out some functions I made and I am trying to figure out the time complexity.
My problem is that even after reading up on some articles on Big O I can't figure out what the following should be:
1000 loops : 15000 objects : time 6
1000 loops : 30000 objects : time 9
1000 loops : 60000 objects : time 15
1000 loops : 120000 objects : time 75
The difference between the first 2 is 3 ms, then 6 ms, and then 60, so the time doubles up with each iteration. I know this isn't O(n), and I think it's not O(log n).
When I try out different sets of data, the time doesn't always go up. For example take this sequence (ms): 11-17-26-90-78-173-300
The 78 ms looks out of place. Is this even possible?
Edit:
NVM, I'll just have to talk this over with my college tutor.
The output of time differs too much with different variables.
Thanks for those who tried to help though!
Big O notation is not about how long it takes exactly for an operation to complete. It is a (very rough) estimation of how various algorithms compare asymptotically with respect to changing input sizes, expressed in generic "steps". That is "how many steps does my algorithm do for an input of N elements?".
Having said this, note that in the Big O notation constants are ignored. Therefore a loop over N elements doing 100 calculations at each iteration would be 100 * N but still equal to O(N). Similarly, a loop doing 10000 calculations would still be O(N).
Hence in your example, if you have something like:
for(int i = 0; i < 1000; i++)
for(int j = 0; j < N; j++)
// computations
it would be 1000 * N = O(N).
Big O is just a simplified algorithm running time estimation, which basically says that if an algorithm has running time O(N) and another one has O(N^2) then the first one will eventually be faster than the second one for some value of N. This estimation of course does not take into account anything related to the underlying platform like CPU speed, caching, I/O bottlenecks, etc.
Assuming you can't get O(n) from theory alone, then I think you need to look over more orders of magnitude in O(n) -- at least three, preferably six or more (you will just need to experiment to see what variation in n is required). Leave the thing running overnight if you have to. Then plot the results logarithmically.
Basically I suspect you are looking at noise right now.
Without seeing your actual algorithm, I can only guess:
If you allow a constant initialisation overhead of 3ms, you end up with
1000x15,000 = (OH:3) + 3
1000x30,000 = (OH:3) + 6
1000x60,000 = (OH:3) + 12
This, to me, appears to be O(n)
The disparity in your timestamping of differing datasets could be due to any number of factors.

Get 10000+ unique random numbers (performance) [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Create Random Number Sequence with No Repeats
I'd like to write an URL shortener that only uses numbers as short string.
I don't want to count up, I want the next new number to be random (or pseudo random).
At first thought algorithm would then look like this (pseudo code):
do
{
number = random(0,10000)
}
while (datastore.contains(number))
datastore.store(number, url)
The problem with this implementation is: As the datastore contains more numbers, the more likely it is that the loop will be executed multiple times. The performance will decrease over time.
Isn't there a better way to get a random number that is not already in use?
1) fill an array with sequential values
2) shuffle the array
Use an encryption. Since encryption is reversible, unique inputs generate unique outputs. For 64 bit numbers use a cypher with a 64 bit blocksize. For smaller block sizes, such as 32 bit or 16 bit, have a look at the Hasty Pudding Cypher.
Whatever block size you need, just encrypt the numbers 0, 1, 2, ... (in the appropriate block size) to generate as many unique non-sequential numbers as you need.
Some related questions: # 2394246, # 54059, # 158716, # 196017, and # 1608181.
The proper approach depends on how many numbers you will generate and on if realtime performance is required. If you draw no more than a small fraction of the numbers available in a range, average time per number for your code snippet is O(1), with slight increase of time per later number but still O(1). See, for example, question #1608181 answer in which I show that getting k numbers from a range of more than 2*k numbers with such code is O(k). (That answer also has C code to generate M numbers from a range of N numbers, in O(M) time when M<N/2, and explains how to use it for O(M) time when M>=N/2.)
If you want O(1) performance with a hard time limit, you can use the program just mentioned to pre-load an array, or can shuffle the whole range of integers, as mentioned by Justin. After that preprocessing, each access is O(1). Buf if you know you won't draw more than say 3000 numbers from your 1...10000 range, and don't have a hard time limit, the code you have will run in O(1) time on average, with probability of k passes decreasing like 0.3 ^ k; i.e., at worst about 70% chance of 1 pass, 21% for 2, 6% for 3, 2% for 4, 0.6% for 5, and so forth.

Resources