Generate infinite stream of unique numbers between 0 and 1 - random

Came across this question previously on an interview. The requirements are to write a function that
Generates a number between 0..1
Never returns the same number
Can scale (called every few milliseconds and continuously for years)
Can use only 1mb of heap memory
Does not need to return as a decimal, can render directly to stdout
My idea was hacky at best: manipulate a string, starting with "0.1", then "0.11", then "0.12", and so on. Since the requirements did not say the output had to be uniformly distributed, it does not need to be random. Another idea is to generate a timestamp of the form yyyyMMddhhmmssSSS (where SSS is milliseconds), convert it to a string, and prefix it with "0.". This way the values will always be unique.
It's a pretty open ended question and I'm curious how other people would tackle it.

Pseudocode that does what you want, except that it cannot guarantee no repeats:
1. Take your 1 MB allocation.
2. Randomly set every byte.
3. Echo to stdout as "0.<bytes as integer string>" (will be very long).
4. Go to #2.
"Never returns the same number" is not guaranteed, but a repeat is extremely unlikely: a 1 MB buffer holds 2^23 = 8,388,608 random bits, so any two given outputs match with probability 1 in 2^8388608, assuming a good implementation of Random.

Allocate about a million characters and set them initially to all 0.
Then each call to the function simply increments the number and returns it, something like:
# Gives you your 1MB heap space.
num = new digit/byte/char/whatever[about a million]
# Initialise all digits to zero (1-based arrays).
def init():
    for posn ranges from 1 to size(num):
        set num[posn] to 0
# Print next value.
def printNext():
    # Carry-based add-1-to-number.
    # Last non-zero digit stored for truncated output.
    set carry to 1
    set posn to size(num)
    set lastposn to posn
    # Keep going until no more carry or out of digits.
    while posn is greater than 0 and carry is 1:
        # Detect carry and continue, or increment and stop.
        if num[posn] is '9':
            set num[posn] to '0'
            set lastposn to posn minus 1
        else:
            set num[posn] to num[posn] + 1
            set carry to 0
        set posn to posn minus one
    # Carry set after all digits means you've exhausted all numbers.
    if carry is 1:
        exit badly
    # Output the number.
    output "0."
    for posn ranges from 1 to lastposn
        output num[posn]
The use of lastposn prevents the output of trailing zeros. If you don't care about that, you can remove every line with lastposn in it and run the output loop from 1 to size(num) instead.
Calling this every millisecond will give you well over 10^(some big number resulting in a runtime older than the age of the universe) years of run time.
I wouldn't go with your time-based solution because the time may change - think daylight savings or summer time and people adjusting clocks due to drift.
Here's some actual Python code which demonstrates it:
import sys
num = "00000"
def printNext():
    global num
    # Add one to the digit string, propagating the carry from the right.
    carry = 1
    posn = len(num) - 1
    lastposn = posn
    while posn >= 0 and carry == 1:
        if num[posn] == '9':
            num = num[:posn] + '0' + num[posn+1:]
            lastposn = posn - 1
        else:
            num = num[:posn] + chr(ord(num[posn]) + 1) + num[posn+1:]
            carry = 0
        posn = posn - 1
    # A carry out of the first digit means every value has been used.
    if carry == 1:
        print("URK!")
        sys.exit(0)
    # Print up to the last non-zero digit only.
    s = "0."
    for posn in range(0, lastposn + 1):
        s = s + num[posn]
    print(s)
for i in range(0, 15):
    printNext()
And the output:
0.00001
0.00002
0.00003
0.00004
0.00005
0.00006
0.00007
0.00008
0.00009
0.0001
0.00011
0.00012
0.00013
0.00014
0.00015

Your method would eventually use more than 1 MB of heap memory. However you represent numbers, if you are constrained to 1 MB of heap then there is only a finite number of values. I would take the maximum amount of memory possible and increment the least significant bit by one on each call. That would ensure running as long as possible before returning a repeated number.

Yes, because there is no random requirement, you have a lot of flexibility.
The idea here I think is very close to that of enumerating all strings over the regular expression [0-9]* with a couple modifications:
the real string starts with the sequence 0.
you cannot end with a 0
So how would you enumerate? One idea is
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.11 0.12 0.13 0.14 0.15 ... 0.19 0.21 0.22 ... 0.29 0.31 ... 0.99 0.101 0.102 ...
The only state you need here is an integer I think. Just be clever in skipping those zeros at the end (not difficult really). 1 MB of memory should be fine. It stores a massive massive integer, so I think you would be good here.
(It is different from yours because I generate all one character strings, then all two character strings, then all three character strings, ... so I believe there is no need for state other than the last number generated.)
Then again I may be wrong; I haven't tried this.
ADDENDUM
Okay I will try it. Here is the generator in Ruby
i = 0
while true
  puts "0.#{i}" if i % 10 != 0
  i += 1
end
Looks okay to me....

If you are programming in C, the nextafter() family of functions are Posix-compatible functions useful for producing the next double after or before any given value. This will give you about 2^64 different values to output, if you output both positive and negative values.
If you are required to print out the values, use the %a or %A format for exact representation. From the printf(3) man page: "For 'a' conversion, the double argument is converted to hexadecimal notation (using the letters abcdef) in the style [-]0xh.hhhhp±d..." "The default precision suffices for an exact representation of the value if an exact representation in base 2 exists..."
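For readers who want to experiment without a C toolchain, here is a minimal Python sketch of the same idea. It assumes Python 3.9 or later, where math.nextafter is available, and uses float.hex() as the counterpart of the %a format. The helper name next_unique is made up for this illustration and is not part of any library.

```python
import math

def next_unique(previous):
    """Return the next representable double after `previous`, moving towards 1.0.

    Once `previous` reaches 1.0 the value stops changing, which signals
    that every double in the interval has been used.
    """
    return math.nextafter(previous, 1.0)

value = 0.0
for _ in range(3):
    value = next_unique(value)
    # float.hex() gives an exact hexadecimal representation, like printf's %a.
    print(value.hex())
```

Because each call moves exactly one representable step upwards, no value can repeat until 1.0 is reached.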
If you want to generate random numbers rather than sequentially ascending ones, perhaps do a google search for 64-bit KISS RNG. Implementations in Java, C, Ada, Fortran, et al are available on the web. The period of 64-bit KISS RNG itself is ~ 2^250, but there are not that many 64-bit double-precision numbers, so some numbers will re-appear within 2^64 outputs, but with different neighbor values. On some systems, long doubles have 128-bit values; on others, only 80 or 96. Using long doubles, you could accordingly increase the number of different values output by combining two randoms into each output.
It may be that the point of this question in an interview is to figure out if you can recognize a silly spec when you see it.

Related

Check the Number of Powers of 2

I have a number X and want to find how many factors of 2 make up the largest power of 2 that does not exceed it.
For example:
N=7: the answer is 2 (2*2 = 4)
N=20: the answer is 4 (2*2*2*2 = 16)
Similarly, I want to find the next power of 2.
For example:
N=14: the answer is 16
Is there a bit hack for this that avoids for loops?
Something like the one-line check for whether X is a power of 2, (X & (X-1)) == 0?
GCC has a built-in function called __builtin_clz() that returns the number of leading zeros in an integer. So for example, assuming a 32-bit int, the expression p = 32 - __builtin_clz(n) will tell you how many bits are needed to store the integer n, and 1 << p will give you the next highest power of 2 (provided p<32, of course). Note that n must be nonzero, because the result of __builtin_clz(0) is undefined.
There are also equivalent functions that work with long and long long integers.
Alternatively, math.h defines a function called frexp() that returns the base-2 exponent of a double-precision number. This is likely to be less efficient because your integer will have to be converted to a double-precision value before it is passed to this function.
A number is a power of two if it has only a single '1' in its binary value. For example, 2 = 00000010, 4 = 00000100, 8 = 00001000 and so on. So you can check it by counting the number of 1s in its bit value. If the count is 1 then the number is a power of 2, and vice versa.
You can take help from here and here to avoid for loops for counting set bits.
If the count is not 1 (meaning the value is not a power of 2), then take the position of its first set bit from the MSB; the next power of 2 is the value whose only set bit is at position + 1. For example, number 3 = 00000011. Its first set bit from the MSB is the 2nd bit. Therefore the next power of 2 is the value with only the 3rd bit set, i.e. 00000100 = 4.
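The same bit tricks can be sketched in Python, where int.bit_length() plays the role of 32 - __builtin_clz(n). The function names below are illustrative only, and the sketch assumes n is a positive integer.

```python
def is_power_of_two(n):
    # Exactly one bit is set when clearing the lowest set bit leaves zero.
    return n > 0 and (n & (n - 1)) == 0

def floor_log2(n):
    # Number of factors of 2 in the largest power of 2 not exceeding n.
    return n.bit_length() - 1

def next_power_of_two(n):
    # Smallest power of 2 strictly greater than n (1 << p in the answer above).
    return 1 << n.bit_length()

print(floor_log2(7), floor_log2(20), next_power_of_two(14))  # prints: 2 4 16
```

Note that next_power_of_two returns a value strictly greater than n, so an exact power of 2 such as 16 maps to 32; use 1 << (n - 1).bit_length() if n itself should be accepted.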

How many times does a zero occur on an odometer

I am trying to work out how many times a zero occurs on an odometer. I count +1 every time I see a zero.
10 -> +1
100 -> +2 because in 100 I see 2 zeros
10004 -> +3 because I see 3 zeros
So I get,
1 - 100 -> +11
1 - 500 -> +91
1 - 501 -> +92
0 - 4294967295-> +3825876150
I used rubydoctest for it. I am not doing anything with begin_number yet. Can anyone explain how to calculate it without a brute force method?
I made many attempts. They work for numbers like 10, 1000, 10,000 and 100,000,000, but not for numbers like 522 or 2280. If I run rubydoctest, it fails on # >> algorithm_count_zero(1, 500)
# doctest: algorithm_count_zero(begin_number, end_number)
# >> algorithm_count_zero(1, 10)
# => 1
# >> algorithm_count_zero(1, 1000)
# => 192
# >> algorithm_count_zero(1, 10000000)
# => 5888896
# >> algorithm_count_zero(1, 500)
# => 91
# >> algorithm_count_zero(0, 4294967295)
# => 3825876150
def algorithm_count_zero(begin_number, end_number)
  power = Math::log10(end_number) - 1
  if end_number < 100
    return end_number/10
  else
    end_number > 100
    count = (9*(power)-1)*10**power+1
  end
  answer = ((((count / 9)+power)).floor) + 1
end
end_number = 20000
begin_number = 10000
puts "Algorithm #{algorithm_count_zero(begin_number, end_number)}"
As noted in a comment, this is a duplicate of another question, where the solution gives you correct guidelines.
However, if you want to test your own solution for correctness, I'll put here a one-liner in the parallel array processing language Dyalog APL (which, by the way, I think everyone modelling mathematics and numbers should use).
Using tryapl.org you'll be able to get a correct answer for any integer value as argument. Tryapl is a web page with a backend that executes simple APL code statements ("one-liners", which are very typical of the APL language and its extremely compact code).
The APL one-liner is here:
{+/(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 142857
Copy that and paste it into the edit row at tryapl.org, and press enter - you will quickly see an integer, which is the answer to your problem. In the code row above, you can see the argument rightmost; it is 142857 this time but you can change it to any integer.
As you have pasted the one-liner once, and executed it with Enter once, the easiest way to get it back for editing is to press [Up arrow]. This returns the most recently entered statement; then you can edit the number sitting rightmost (after the curly brace) and press Enter again to get the answer for a different argument.
Pasting the code row above will return 66765; that many zeroes appear in all numbers between 1 and 142857.
If you paste this 2 characters shorter row below, you will see the individual components of the result - the sum of these components make up the final result. You will be able to see a pattern, which possibly makes it easier to understand what happens.
Try for example
{(c×1+d|⍵)+d×(-c←0=⌊(a|⍵)÷d←a×+0.1)+⌊⍵÷a←10*⌽⍳⌈10⍟⍵} 1428579376
0 100000000 140000000 142000000 142800000 142850000 142857000 142857900 142857930 142857937
... and see how the intermediate results contain segments of the argument 1428579376, starting from left! There are as many intermediate results as there are numbers in the argument (10 this time).
The result for 1428579376 will be 1239080767, ie. the sum of the 10 numbers above. This many zeroes appear in all numbers between 1 and 1428579376 :-).
Consider each odometer position separately. The position x places from the far right changes once every 10^x times. By looking at the numbers to its right, you know how long it will be until it next changes. It will then hold each value for 10^x times before changing, until it reaches the end of the range you are considering, when it will hold its value at that time for some number of times that you can work out given the value at the very end of the range.
Now you have a sequence of the form x...0123456789012...y where you know the length and you know the values of x and y. One way to count the number of 0s (or any other digit) within this sequence is to clip off the prefix from x.. to just before the first 0, and clip off the suffix from just after the last 9 to y. Look for 0s in this suffix, and measure the length of the long sequence from prefix to suffix. This will be of a length divisible by 10, and will contain each digit the same number of times.
Based on this you should be able to work out, for each position, how often within the range it will assume each of its 10 possible values. By summing up the values for 0 from each of the odometer positions you get the answer you want.
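The per-position reasoning above can be written down compactly. The following Python sketch is one possible formulation, not the code of either answer: it counts zeros for the range 1 to n, and the name count_zeros is chosen here for illustration. For each digit position it splits n into the digits to the left (higher), the digit itself (current) and the digits to the right (lower).

```python
def count_zeros(n):
    """Count the zero digits written when listing every integer from 1 to n."""
    total = 0
    place = 1  # 1 for units, 10 for tens, and so on
    while place <= n:
        higher = n // (place * 10)
        current = (n // place) % 10
        lower = n % place
        if current == 0:
            # The last block of zeros at this position is only partly reached.
            total += (higher - 1) * place + lower + 1
        else:
            # Every complete block of zeros at this position has been passed.
            total += higher * place
        place *= 10
    return total

print(count_zeros(100), count_zeros(500), count_zeros(142857))  # prints: 11 91 66765
```

The values for 100, 500 and 142857 agree with the figures quoted in the question and in the APL answer.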

Complex Numbers Seemingly Arising from Non-Complex Logarithms

I have a simple program written in TI-BASIC that converts from base 10 to base 2
0->B
1->E
Input "DEC:",D
Repeat D=0
int(round(log(D)/log(2),1))->E
round(E)->E
B+10^E->B
D-2^E->D
End
Disp B
This will sometimes return the error 'ERR: DATA TYPE'. I checked, and this is because the variable D will sometimes become a complex number. I am not sure how this happens.
This happens with seemingly random numbers, like 5891570. It happens with this number, but not something close to it like 5891590 Which is strange. It also happens with 1e30, But not 1e25. Another example is 1111111111111111, and not 1111111111111120.
I haven't tested this thoroughly, and don't see any pattern in these numbers. Any help would be appreciated.
The error happens because you round the logarithm to one decimal place before taking the integer part; therefore, if log(D)/log(2) is something like 8.99, you will round E up rather than down, and 2^9 will be subtracted from D instead of 2^8, causing, in the next iteration, D to become negative and its logarithm to be complex. Let's walk through your code when D is 511, which has base-2 logarithm 8.9971:
Repeat D=0 ;Executes first iteration without checking whether D=0
log(D)/log(2 ;8.9971
round(Ans,1 ;9.0
int(Ans ;9.0
round(Ans)->E ;E = 9.0
B+10^E->B ;B = 1 000 000 000
D-2^E->D ;D = 511-512 = -1
End ;loops again, since D≠0
---next iteration:----
log(D ;log(-1) = 1.364i; throws ERR:NONREAL ANS in Real mode
Rounding the logarithm any more severely than nine decimal places (nine digits is the default for round( without a "digits" argument) is completely unnecessary, as on my TI-84+ rounding errors do not accumulate: round(int(log(2^X-1)/log(2)) returns X-1 and round(int(log(2^X)/log(2)) returns X for all integer X≤28, which is high enough that precision would be lost anyway in other parts of the calculation.
To fix your code, simply round only once, and only to nine places. I've also removed the unnecessary double-initialization of E, removed your close-parens (it's still legal code!), and changed the Repeat (which always executes one loop before checking the condition D=0) to a While loop to prevent ERR:DOMAIN when the input is 0.
0->B
Input "DEC:",D
While D
int(round(log(D)/log(2->E
B+10^E->B
D-2^E->D
End
B ;on the last line, so it prints implicitly
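To see the rounding problem outside the calculator, here is a small Python sketch that mimics both versions of the exponent calculation. It is only an analogy: Python floats are not the TI-84's decimal numbers, and the function names are invented for this example.

```python
import math

def buggy_exponent(d):
    # Mirrors int(round(log(D)/log(2),1)): rounding to ONE decimal place
    # can push 8.997... up to 9.0 before the integer part is taken.
    return int(round(math.log10(d) / math.log10(2), 1))

def fixed_exponent(d):
    # Rounds to nine decimal places only, which absorbs floating-point
    # noise without promoting 8.997... to 9.
    return int(round(math.log10(d) / math.log10(2), 9))

def to_binary_digits(d):
    # Builds the decimal number whose digits are the binary form of d.
    b = 0
    while d:
        e = fixed_exponent(d)
        b += 10 ** e
        d -= 2 ** e
    return b

print(buggy_exponent(511), 511 - 2 ** buggy_exponent(511))  # prints: 9 -1
print(fixed_exponent(511), to_binary_digits(511))           # prints: 8 111111111
```

The first line of output shows D going negative for 511, which is the step that leads to the logarithm of a negative number on the next iteration.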
Don't expect either your code or my fix to work correctly for D > 2^13 or so, because your calculator can only store 14 digits in its internal representation of any number. You'll lose the digits while you store the result into B!
Now for a trickier, optimized way of computing the binary representation (still only works for D < 2^13):
Input D
int(2fPart(D/2^cumSum(binomcdf(13,0
.1sum(Ans10^(cumSum(1 or Ans

Random Numbers based on the ANU Quantum Random Numbers Server

I have been asked to use the ANU Quantum Random Numbers Service to create random numbers and use Random.rand only as a fallback.
module QRandom
  def next
    RestClient.get('http://qrng.anu.edu.au/API/jsonI.php?type=uint16&length=1'){ |response, request, result, &block|
      case response.code
      when 200
        _json=JSON.parse(response)
        if _json["success"]==true && _json["data"]
          _json["data"].first || Random.rand(65535)
        else
          Random.rand(65535) #fallback
        end
      else
        puts response #log problem
        Random.rand(65535) #fallback
      end
    }
  end
end
Their API service gives me a number between 0-65535. In order to create a random for a bigger set, like a random number between 0-99999, I have to do the following:
(QRandom.next.to_f*(99999.to_f/65535)).round
This strikes me as the wrong way of doing it: if I were to use a service (quantum or not) that creates numbers from 0-3 and map them into the space of 0-9999, there are only 4 numbers that I could ever get. How can I use the service that produces numbers between 0-65535 to create random numbers for a larger number set?
Since 65535 is 1111111111111111 in binary, you can just think of the random number server as a source of random bits. The fact that it gives the bits to you in chunks of 16 is not important, since you can make multiple requests and you can also ignore certain bits from the response.
So after performing that abstraction, what we have now is a service that gives you a random bit (0 or 1) whenever you want it.
Figure out how many bits of randomness you need. Since you want a number between 0 and 99999, you just need to find a binary number that is all ones and is greater than or equal to 99999. Decimal 99999 is equal to binary 11000011010011111, which is 17 bits long, so you will need 17 bits of randomness.
Now get 17 bits of randomness from the service and assemble them into a binary number. The number will be between 0 and 2**17-1 (131071), and it will be evenly distributed. If the random number happens to be greater than 99999, then throw away the bits you have and try again. (The probability of needing to retry should be less than 50%.)
Eventually you will get a number between 0 and 99999, and this algorithm should give you a totally uniform distribution.
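A minimal Python sketch of this retry scheme follows. The callable next_uint16 is a hypothetical stand-in for whatever fetches one 16-bit value from the service (for example a wrapper around QRandom.next); here it is simulated with the random module so the example runs offline.

```python
import random

def uniform_below(limit, next_uint16):
    """Return an unbiased integer in 0..limit-1 from a source of 16-bit chunks."""
    bits_needed = max(1, (limit - 1).bit_length())
    while True:
        value = 0
        collected = 0
        # Gather whole 16-bit chunks until there are enough bits.
        while collected < bits_needed:
            value = (value << 16) | next_uint16()
            collected += 16
        # Ignore the surplus bits, keeping exactly bits_needed of them.
        value >>= collected - bits_needed
        if value < limit:
            return value
        # Otherwise throw the bits away and try again.

def simulated_source():
    # Assumption: stands in for one request to the remote service.
    return random.randrange(65536)

print(uniform_below(100000, simulated_source))
```

For a limit of 100000 this uses 17 of the 32 fetched bits per attempt; a real implementation would probably buffer the unused bits instead of discarding them.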
How about asking for more numbers? Using the length parameter of that API you can just ask for extra numbers and sum them so you get bigger numbers like you want.
http://qrng.anu.edu.au/API/jsonI.php?type=uint16&length=2
You can use inject for the sum and the modulo operation to make sure the number is not bigger than you want.
json["data"].inject(:+) % MAX_NUMBER
I made some other changes to your code like using SecureRandom instead of the regular Random. You can find the code here:
https://gist.github.com/matugm/bee45bfe637f0abf8f29#file-qrandom-rb
Think of the individual numbers you are getting as 16 bits of randomness. To make larger random numbers, you just need more bits. The tricky bit is figuring out how many bits is enough. For example, if you wanted to generate numbers from an absolutely fair distribution from 0 to 65000, then it should be pretty obvious that 16 bits are not enough; even though you have the range covered, some numbers will have twice the probability of being selected than others.
There are a couple of ways around this problem. Using Ruby's Bignum (technically that happens behind the scenes, it works well in Ruby because you won't overflow your Integer type) it is possible to use a method that simply collects more bits until the result of a division could never be ambiguous - i.e. the difference when adding more significant bits to the division you are doing could never change the result.
This is what it might look like, using your QRandom.next method to fetch bits in batches of 16:
def QRandom.rand max
  max = max.to_i # This approach requires integers
  power = 1
  sum = 0
  loop do
    sum = 2**16 * sum + QRandom.next
    power *= 2**16
    lower_bound = sum * max / power
    break lower_bound if lower_bound == ( (sum + 1) * max ) / power
  end
end
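For comparison, here is a rough Python rendering of the method just shown, useful for trying the idea without network access. The parameter next_uint16 is a hypothetical replacement for QRandom.next, simulated below with the random module; Python integers grow as needed, just like Ruby's.

```python
import random

def rand_below(max_value, next_uint16):
    """Collect 16-bit chunks until the scaled result can no longer change."""
    power = 1
    total = 0
    while True:
        total = (total << 16) + next_uint16()
        power <<= 16
        lower_bound = total * max_value // power
        # If the next possible value of `total` scales to the same integer,
        # further bits cannot change the answer, so it is safe to stop.
        if lower_bound == (total + 1) * max_value // power:
            return lower_bound

def simulated_source():
    # Assumption: stands in for one request to the remote service.
    return random.randrange(65536)

print(rand_below(100000, simulated_source))
```

Most calls finish after one or two chunks; the loop only continues when the bits collected so far sit right next to a boundary between two output values.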
Because it costs you quite a bit to fetch random bits from your chosen source, you may benefit from taking this to the most efficient form possible, which is similar in principle to Arithmetic Coding and squeezes out the maximum possible entropy from your source whilst generating unbiased numbers in 0...max. You would need to implement a method QRandom.next_bits( num ) that returned an integer constructed from a bitstream buffer originating with your 16-bit numbers:
def QRandom.rand max
  max = max.to_i # This approach requires integers
  # I prefer this: start_bits = Math.log2( max ).floor
  # But this also works (and avoids suggestions the algo uses FP):
  start_bits = max.to_s(2).length
  sum = QRandom.next_bits( start_bits )
  power = 2 ** start_bits
  # No need for fractional bits if max is power of 2
  return sum if power == max
  # Draw 1 bit at a time to resolve fractional powers of 2
  loop do
    lower_bound = (sum * max) / power
    break lower_bound if lower_bound == ((sum + 1) * max)/ power
    sum = 2 * sum + QRandom.next_bits(1) # 0 or 1
    power *= 2
  end
end
This is the most efficient use of bits from your source possible. It is always as efficient or better than re-try schemes. The expected number of bits used per call to QRandom.rand( max ) is 1 + Math.log2( max ) - i.e. on average this allows you to draw just over the fractional number of bits needed to represent your range.

"interval is empty", Lua math.random isn't working for large numbers?

I didn't know if this is a bug in Lua itself or if I was doing something wrong. I couldn't find anything about it anywhere. I am using Lua for Windows (Lua 5.1.4):
>return math.random(0, 1000000000)
1251258
This returns a random integer between 0 and 1000000000, as expected. This seems to work for all other values. But if I add a single 0:
>return math.random(0, 10000000000)
stdin:1: bad argument #2 to 'random' (interval is empty)
Any number higher than that does the same thing.
I tried to figure out exactly how high a number has to be to cause this and found something even weirder:
>return math.random(0, 2147483647)
-75617745
If the value is 2147483647 then it gives me negative numbers. Any higher than that and it throws an error. Any lower than that and it works fine.
That's 0b1111111111111111111111111111111 in binary, 31 binary digits exactly. I am not sure what that means though.
This unexpected behavior (bug?) is due to how math.random treats the input arguments passed in Lua 5.1. From lmathlib.c:
case 2: { /* lower and upper limits */
  int l = luaL_checkint(L, 1);
  int u = luaL_checkint(L, 2);
  luaL_argcheck(L, l<=u, 2, "interval is empty");
  lua_pushnumber(L, floor(r*(u-l+1))+l); /* int between `l' and `u' */
  break;
}
As you may know in C, a standard int can represent values -2,147,483,648 to 2,147,483,647. Adding +1 to 2,147,483,647, like in your use-case, will overflow and wrap around the value giving -2,147,483,648. The end result is negative since you're multiplying a positive with a negative number.
Furthermore, anything above 2,147,483,647 will fail the luaL_argcheck due to overflow wraparound.
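The wraparound in u-l+1 can be reproduced with a short Python sketch. Python integers do not overflow, so the helper to_int32 below emulates a 32-bit C int that wraps, which is the typical behaviour on common platforms even though the C standard leaves signed overflow undefined. Both function names are made up for this illustration.

```python
import math

def to_int32(value):
    """Wrap an integer into the signed 32-bit range, as a typical C int would."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

def lua51_random_range(r, l, u):
    """Emulate floor(r*(u-l+1))+l from lmathlib.c, with r in [0, 1)."""
    span = to_int32(u - l + 1)  # this is the step that overflows
    return math.floor(r * span) + l

print(to_int32(2147483647 + 1))                # prints: -2147483648
print(lua51_random_range(0.5, 0, 2147483647))  # prints: -1073741824
print(lua51_random_range(0.5, 0, 9))           # prints: 5
```

With the upper limit at 2147483647 the span becomes negative, so every result is zero or negative, matching the output seen in the question.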
There are a few ways to address this problem:
Upgrade to Lua 5.2. That one has since fixed this issue by treating the input arguments as lua_Number instead.
Switch to LuaJIT which does not have this integer overflow issue.
Patch the Lua 5.1 source yourself with the fix and recompile.
Modify your random range so it does not overflow.
If you need a range that is larger than what the random function supports (32-bit signed integers, so just under 2^31 because of the sign bit, since math.random works at the C level), but smaller than the range of the Lua "number" type (based on What is the maximum value of a number in Lua?, 2^52, or maybe even 2^53), you could try generating two random numbers: scale the first to the range desired, then add the second to "fill the gap". For example, say you want a range of 0 to 2^36 - 1. In Lua 5.1 the size of the interval, u - l + 1, must itself fit in a C int, so draw at most 2^30 values at a time. So you could do:
-- 2^36 = 2^30 * 2^6 so
scale = 2^6
baseRand = scale * math.random(0, 2^30 - 1)
-- baseRand is now between 0 and 2^36 - 2^6 but there are gaps of 2^6 in the set
-- of possible values; fill the gaps with second random number:
fillGap = math.random(0, 2^6 - 1)
randNum = baseRand + fillGap
This will work as long as the desired range is less than the Lua interpreter's maximum for Lua numbers, which is a configurable compile time parameter but if you use stock build it is 2^52, a very large number (although not as large as largest long integer, 2^63).
Note also that the largest positive N-bit integer is 2^N-1 (not 2^N). The same technique can be applied to any range; for instance, with scale = 10^6 you could use randNum = 10^6 * math.random(0, 10^8 - 1) + math.random(0, 10^6 - 1), which covers 0 to 10^14 - 1.
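The scale-and-fill idea is easy to check in Python. The sketch below is an illustration only: wide_random and its parameters are invented names, and random.randrange stands in for Lua's math.random. Each draw covers a half-open range, so the two pieces tile the combined range with no gaps and no overlaps.

```python
import random

def wide_random(high_bits, low_bits, rng=random):
    """Combine two small draws into one integer in 0 .. 2**(high_bits + low_bits) - 1."""
    high = rng.randrange(2 ** high_bits)  # 0 .. 2**high_bits - 1
    low = rng.randrange(2 ** low_bits)    # 0 .. 2**low_bits - 1, fills the gaps
    return high * 2 ** low_bits + low

print(wide_random(30, 6))  # some integer below 2**36
```

Keeping each individual draw at 30 bits or fewer means the same split would also stay within the limits of a 32-bit math.random.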

Resources