Remove the inferior digits of a number - algorithm

Given a number n of x digits, how do you remove y digits so that the remaining digits form the greatest possible number?
Examples:
1)x=7 y=3
n=7816295
-8-6-95
=8695
2)x=4 y=2
n=4213
4--3
=43
3)x=3 y=1
n=888
=88
Just to state: x > y > 0.

For each digit to remove: iterate through the digits left to right; if you find a digit that's less than the one to its right, remove it and stop, otherwise remove the last digit.
If the number of digits x is greater than the actual length of the number, it means there are leading zeros. Since those will be the first to go, you can simply reduce the count y by a corresponding amount.
Here's a working version in Python:
def remove_digits(n, x, y):
    s = str(n)
    if len(s) > x:
        raise ValueError
    elif len(s) < x:
        y -= x - len(s)
    if y <= 0:
        return n
    for r in range(y):
        for i in range(len(s)):
            if s[i] < s[i+1:i+2]:
                break
        s = s[:i] + s[i+1:]
    return int(s)
>>> remove_digits(7816295, 7, 3)
8695
>>> remove_digits(4213, 4, 2)
43
>>> remove_digits(888, 3, 1)
88
I hesitated to submit this, because it seems too simple. But I wasn't able to think of a case where it wouldn't work.

If x = y, we have to remove all the digits.
Otherwise, find the maximum digit among the first y + 1 digits (the leftmost one if it occurs more than once). Remove the y0 digits before this maximum, append the maximum to the answer, and repeat the task on the digits after it, now with y - y0 digits left to remove.
A straightforward implementation runs in O(x^2) time in the worst case.
But finding the maximum in a given range can be done efficiently with a segment tree, which gives O(x * log(x)) time in the worst case.
P.S. I just realized that it is also possible to solve this in O(x), using the fact that there are only 10 digits (but the algorithm may be a little more complicated). We need to find the maximum in a given range [L, R], but the ranges in this task only move from left to right (L and R always increase). So we just need to store 10 pointers (one per digit value), each pointing to the first position >= L that holds that digit. To find the maximum, we only need to check the 10 pointers. To update the pointers, we move them to the right.
So the time complexity will be O(10 * x) = O(x).
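A minimal Python sketch of the quadratic version of this idea, assuming the number is passed as a string of digits. The function name `largest_after_removal` is made up for this illustration and does not come from the answer.

```python
def largest_after_removal(digits: str, y: int) -> str:
    """Greedy sketch: repeatedly keep the leftmost maximum of the next y+1 digits."""
    result = []
    start = 0
    keep = len(digits) - y  # number of digits that survive
    while len(result) < keep:
        # Only y removals remain, so the next kept digit lies in this window.
        window = digits[start:start + y + 1]
        best = max(window)
        offset = window.index(best)  # leftmost occurrence of the maximum
        y -= offset                  # the digits before it are removed
        result.append(best)
        start += offset + 1
    return "".join(result)


print(largest_after_removal("7816295", 3))
print(largest_after_removal("4213", 2))
```

Scanning the window with `max` is what makes this O(x^2) in the worst case; the segment tree or the ten-pointer trick would replace only that step.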

Here's an O(x) solution. It builds an index that maps (i, d) to j, the smallest number > i such that the j'th digit of n is d. With this index, one can easily find the largest possible next digit in the solution in O(1) time.
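def index(digits):
    # nxt[d] is the first position at or after the current one that holds digit d
    nxt = [len(digits)+1] * 10
    for i in range(len(digits), 0, -1):
        nxt[ord(digits[i-1])-ord('0')] = i-1
        # Reversed so that the entries for larger digits come first
        yield nxt[::-1]

def minseq(n, y):
    n = str(n)
    idx = list(index(n))[::-1]
    i, r = 0, []
    for ry in range(len(n)-y):
        # Largest digit whose next occurrence is still within reach
        i = next(j for j in idx[i] if j <= y+ry) + 1
        r.append(n[i - 1])
    return ''.join(r)

print(minseq(7816295, 3))
print(minseq(4213, 2))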

Pseudocode:
Number.toDigits().filter (sortedSet (Number.toDigits()). take (y))
Imho you don't need to know x.
For efficiency, Number.toDigits () could be precalculated
digits = Number.toDigits()
digits.filter (sortedSet (digits).take (y))
Depending on language and context, you either output the digits and are done or have to convert the result into a number again.
Working Scala-Code for example:
def toDigits (l: Long) : List [Long] = if (l < 10) l :: Nil else (toDigits (l /10)) :+ (l % 10)
val num = 734529L
val dig = toDigits (num)
dig.filter (_ > ((dig.sorted).take(2).last))
A sorted set is a set which is sorted, which means, every element is only contained once and then the resulting collection is sorted by some criteria, for example numerical ascending. => 234579.
We take two of them (23) and from that subset the last (3) and filter the number by the criteria, that the digits have to be greater than that value (3).
Your question does not explicitly say, that each digit is only contained once in the original number, but since you didn't give a criterion, which one to remove in doubt, I took it as an implicit assumption.
Other languages may of course have other expressions (x.sorted, x.toSortedSet, new SortedSet (num), ...) or lack certain classes, functions, which you would have to build on your own.
You might need to write your own filter method, which takes a predicate P and a collection C, and returns a new collection of all elements which satisfy P, P being a method which takes one T and returns a Boolean. Very useful stuff.

Related

Given integers X and Y, how do you find the largest permutation of X that is less than or equal to Y?

Given two positive integers X and Y, find the largest permutation of X
that is less than or equal to Y. Return the largest permutation that is
less than or equal to Y as an integer. If there is no permutation of X
that is less than or equal to Y, return -1.
Example 1:
Input: X = 123, Y = 321
Output: 321
Example 2:
Input: X = 1733, Y = 3311
Output: 3173
Example 3:
Input: X = 999, Y = 111
Output: -1
Got this problem for an online assessment earlier yesterday, couldn't find an efficient solution for it and have been thinking about it but still can't think of the right approach. I first tried greedy, in which I would iterate Y from left to right and I create a permutation of X by appending the largest digit in X that is less than or equal to the digit in Y. But for X = 1733 and Y = 3311, my implementation would return -1 because the greedy algorithm rearranged X to 3317. So I turned to recursion, but as you'd expect this very quickly reached stack limit.
I've read this thread that seems to discuss a similar problem, but I believe the top solution fails for example 2. How do you approach this problem?
A recursive solution.
Sort the digits of X decreasingly. Then, as long as you find no solution
take in turn every digit in X that is not larger than the leading digit of Y;
if those digits are equal, recurse on X less this digit and the tail of Y;
if the digit of X is smaller (or X is empty), you are done;
if there is no such digit, you reached a dead-end.
This works because you are trying the permutations of X by decreasing value.
321 vs. 321
3 21 vs. 3 21
21 vs. 21
1 vs. 1
Done
7331 vs. 3311
3 731 vs. 3 311
3 71 vs. 3 11
1 7 vs. 1 1
Dead end
1 73 vs. 3 11
Done
999 vs. 111
Dead end
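The recursive procedure traced above could be sketched in Python roughly as follows. This is an illustration under assumptions rather than the answerer's code: the name `largest_perm_at_most` is invented here, repeated digits are tried only once per position, and leading zeros are not treated specially.

```python
def largest_perm_at_most(x: int, y: int) -> int:
    """Largest permutation of the digits of x that is <= y, or -1 if none exists."""
    xs, ys = sorted(str(x), reverse=True), str(y)
    if len(xs) > len(ys):
        return -1
    if len(xs) < len(ys):
        return int("".join(xs))

    def search(digits, target):
        # digits: remaining digits of X, largest first; target: remaining tail of Y
        if not digits:
            return ""
        tried = set()
        for i, d in enumerate(digits):
            if d > target[0] or d in tried:
                continue
            tried.add(d)
            rest = digits[:i] + digits[i + 1:]
            if d < target[0]:
                # Strictly smaller digit: the rest goes in decreasing order.
                return d + "".join(rest)
            tail = search(rest, target[1:])
            if tail is not None:
                return d + tail
        return None  # dead end

    result = search(xs, ys)
    return -1 if result is None else int(result)


print(largest_perm_at_most(1733, 3311))
```

At each position the search recurses at most once (on the digit equal to the one in Y) before falling back to a smaller digit, so the recursion depth is bounded by the number of digits.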
A non-recursive efficient solution, hinted at by @Stef.
The permutations of X can be ordered increasingly by sorting the digits, then picking every first digit in turn and recursing on the remaining ones. This establishes a bijection between the permutations and the integers in [0, d!) for d digits.
For an integer m, you can retrieve the corresponding permutation using a conversion from the factorial basis (take the quotient by (d-1)! and proceed recursively with the remainder). This takes d operations, and you can compare the permutation to Y in O(d) operations.
Now just implement a dichotomic search on the d! permutations, which takes O(d.log(d!)) = O(d².log(d)) operations.
Update: the second solution only works for distinct digits otherwise the permutations do not yield increasing numbers. I hope that there is a workaround.
If X has more digits then there is no solution. If Y has more digits then a descending sort of the digits of X is the solution. Assuming X and Y have the same number of digits:
Put the digits of X in a counting hash.
For each digit of Y going in descending order (left-to-right), take the max digit of X that isn't greater than it and use that in your permutation.
If you ever place a digit lower than its counterpart in Y, place all remaining digits in descending order.
If there ever isn't a non-greater digit available then do the following: repeatedly unwind your prior move until you get to a digit where a lower digit was available. Select the max such lower digit. Then, all remaining digits can be placed in descending order from the map. If there is no such digit (where a lower digit could have been chosen) then there is no solution.
If you get through all the digits then you've produced the max solution.
This is linear in the number of digits if this is limited to base 10. If your base can vary, this is O(num_digits * base)
Here's Ruby code for this.
def get_perm(x, y)
  # hist keeps a count of each of the digits of x
  hist = Hash.new 0; x.digits.each { |d| hist[d] += 1 }
  # output_digits is the answer we're building
  output_digits = []
  y_digits = y.digits
  x_digits = x.digits
  # If x has fewer digits then all permutations are good so pick the largest
  if x.digits.length < y.digits.length
    9.downto(0) do |digit|
      output_digits += [digit] * hist[digit]
    end
    return output_digits
  end
  # If y has fewer digits then no permutation is good, return -1
  if y.digits.length < x.digits.length
    return -1
  end
  # parse the digits of y
  (y_digits.length - 1).downto(0) do |i|
    cur_y_digit = y_digits[i]
    # use the current digit of y if possible
    if hist[cur_y_digit] > 0
      hist[cur_y_digit] -= 1
      output_digits.append(cur_y_digit)
      return output_digits if i == 0
    # otherwise, use the largest smaller digit available if possible
    else
      (cur_y_digit - 1).downto(0) do |smaller_digit|
        if hist[smaller_digit] > 0
          # place the smaller digit, then all remaining digits in descending order
          hist[smaller_digit] -= 1
          output_digits.append(smaller_digit)
          9.downto(0) do |digit|
            output_digits += [digit] * hist[digit]
          end
          return output_digits
        end
      end
      # If we make it here then no digit was available; we need to unwind moves until we
      # can replace a digit of our solution with a smaller digit
      smallest_digit = hist.keys.min
      while i < (y.digits.length - 1) do
        i += 1
        cur_y_digit = y_digits[i]
        cur_unwound_digit = output_digits.pop
        hist[cur_unwound_digit] += 1
        smallest_digit = [smallest_digit, cur_unwound_digit].min
        if cur_y_digit > smallest_digit
          (cur_y_digit - 1).downto(smallest_digit) do |d|
            if hist[d] >= 1
              output_digits.append(d)
              hist[d] -= 1
              9.downto(0) do |digit|
                output_digits += [digit] * hist[digit]
              end
              return output_digits
            end
          end
        end
      end
      return -1
    end
  end
end
Outputs for OP sample cases:
> get_perm(123, 321)
=> [3, 2, 1]
> get_perm(1733, 3311)
=> [3, 1, 7, 3]
> get_perm(999, 111)
=> -1
If Z is the answer, and the numbers have n digits, you can show that there is an index i such that Z[:i] = Y[:i], Z[i]<Y[i], and Z[i+1:] is as large as possible given digits of X \ Z[:i+1] (I use python array slice notation, and the last expression means "the set of digits of X minus those already chosen in Z up to i+1").
Given this, you can easily loop over each candidate i, and efficiently check whether it is feasible to choose such an i as above. The solution is the one with the largest possible i.
The solution should be O(n*log(n)).
I'll leave out the proof and implementation details, as I understand it's homework :)

Number of Ways To arrange Sequence

I have M characters, and from these characters I need to make a sequence of length N such that no two consecutive characters are the same and the first and last characters of the sequence are fixed. I need to find the total number of ways.
My Approach:
Dynamic programming.
If first and last character are '0' and '1'
dp[1][0]=1 , dp[1][1]=1
for(int i=2;i<N;i++)
    for(int j=0;j<M;j++)
        for(int k=0;k<M;k++)
            if(j!=k) dp[i][j]+=dp[i-1][k]
So the final answer would be the summation of dp[N-1][i], i!=1
Problem:
Here the length N is too large, around 10^15, and M is around 128. How do I find the number of permutations without using arrays?
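For small N, a direct dynamic program like the one outlined in the question can serve as a reference to check faster methods against. The sketch below is one possible way to write it in Python; the function name `count_sequences` and the 0-based symbol numbering are assumptions made for this example.

```python
def count_sequences(n, m, first, last):
    """Count length-n sequences over m symbols with fixed ends and no equal neighbours.

    Runs in O(n * m) time, so it is only usable as a cross-check for small n.
    """
    # ways[c] = number of valid prefixes of the current length that end in symbol c
    ways = [0] * m
    ways[first] = 1
    for _ in range(n - 1):
        total = sum(ways)
        # The next symbol c may follow any prefix that does not already end in c.
        ways = [total - w for w in ways]
    return ways[last]


print(count_sequences(4, 128, 0, 1))
print(count_sequences(4, 128, 0, 0))
```

The subtraction `total - w` replaces the innermost loop over k, which is why a single pass over the M symbols per position is enough.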
Assume M is fixed. Let D(n) be the number of sequences of length n with no repeated characters where the first and last character differ (but are fixed). Let S(n) be the number of sequences of length n where the first and last characters are the same (but are fixed).
For example, D(6) is the number of strings of the form a????b (for some a and b -- noting that for counting it doesn't matter which two characters we chose, and where the ? represent other characters). Similarly, S(6) is the number of strings of the form a????a.
Consider a sequence of length n>3 of the form a....?b. The ? can be any of M-1 characters (anything except b). One of these is a. So D(n) = S(n-1) + (M-2)D(n-1). Using a similar argument, one can figure out that S(n) = (M-1)D(n-1).
For example, how many strings are there of the form a??b? Well, the character just before the b could be a or something else. How many strings are there when it's a? Well, it's the same as the number of strings of the form a?a. How many strings are there when it's something else? Well, it's the same as the number of strings of the form a?c multiplied by the number of choices we had for c (namely: M-2, that is, everything except for a, which we've already counted, and b, which is excluded by the rules).
If n is odd, we can consider the middle character. Consider a sequence of length n of the form a...?...b. The ? (which is in the center of the string) can be a, b, or one of the other M-2 characters. Thus D(2n+1) = S(n+1)D(n+1) + D(n+1)S(n+1) + (M-2)D(n+1)D(n+1). Similarly, S(2n+1) = S(n+1)S(n+1) + (M-1)D(n+1)D(n+1).
For small n, S(2)=0, S(3)=M-1, D(2)=1, D(3)=M-2.
We can use the above equations (the first set for even n>3, the second set for odd n>3, and the base cases for n=2 or 3) to compute the result you need in O(log N) arithmetic operations. Presumably the question asks you to compute the result modulo something (since the result grows like O(M^(N-2))), but that's easy to incorporate into the results.
Working code that uses this approach:
def C(n, m, p):
    if n == 2:
        return 0, 1
    if n == 3:
        return (m-1)%p, (m-2)%p
    if n % 2 == 0:
        S, D = C(n-1, m, p)
        return ((m-1) * D)%p, (S + (m-2) * D)%p
    else:
        S, D = C((n-1)//2+1, m, p)
        return (S*S + (m-1)*D*D)%p, (2*S*D + (m-2)*D*D)%p
Note that in this code, C(n, m, p) returns two numbers -- S(n)%p and D(n)%p.
For example:
>>> p = 2**64 - 59 # Some large prime
>>> print(C(4, 128, p))
(16002, 16003)
>>> print(C(5, 128, p))
(2032381, 2032380)
>>> print(C(10**15, 128, p))
(12557489471374801501, 12557489471374801502)
Looking at these examples, it seems like D(n) = S(n) + (-1)^n. If that's true, the code can be simplified a bit I guess.
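The observed pattern can be checked against the first pair of recurrences, D(n) = S(n-1) + (M-2)D(n-1) and S(n) = (M-1)D(n-1), together with the base cases D(2)=1 and S(2)=0. A short derivation, written as a sketch:

```latex
\begin{aligned}
D(n) - S(n) &= \bigl(S(n-1) + (M-2)\,D(n-1)\bigr) - (M-1)\,D(n-1) \\
            &= -\bigl(D(n-1) - S(n-1)\bigr), \\
D(2) - S(2) &= 1 - 0 = 1
\quad\Longrightarrow\quad
D(n) - S(n) = (-1)^{n-2} = (-1)^{n}.
\end{aligned}
```

So the difference flips sign with every extra character, which matches the sample outputs for n=4 and n=5.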
Another, perhaps easier, way to do it efficiently is to use a matrix and the first set of equations. (Sorry for the ascii art -- this diagram is a vector = matrix * vector):
(D(n)) = (M-2 1) * (D(n-1))
(S(n)) = (M-1 0) (S(n-1))
Telescoping this, and using that D(2)=1, S(2)=0:
(D(n)) = (M-2 1)^(n-2) (1)
(S(n)) = (M-1 0) (0)
You can perform the matrix power using exponentiation by squaring in O(log n) time.
Here's working code, including the examples (which you can check produce the same values as the code above). Most of the code is actually matrix multiply and matrix power -- you can probably replace a lot of it with numpy code if you use that package.
def mat_mul(M, N, p):
    R = [[0, 0], [0, 0]]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                R[i][j] += M[i][k] * N[k][j]
            R[i][j] %= p
    return R

def mat_pow(M, n, p):
    if n == 0:
        return [[1, 0], [0, 1]]
    if n == 1:
        return M
    if n % 2 == 0:
        R = mat_pow(M, n//2, p)
        return mat_mul(R, R, p)
    return mat_mul(M, mat_pow(M, n-1, p), p)

def Cmat(n, m, p):
    M = [((m-2), 1), (m-1, 0)]
    M = mat_pow(M, n-2, p)
    return M[1][0], M[0][0]

p = 2**64 - 59
print(Cmat(4, 128, p))
print(Cmat(5, 128, p))
print(Cmat(10**15, 128, p))
You only need to count the number of acceptable sequences, not find them explicitly. It turns out that it doesn't matter what the majority of the characters are. There are only 4 kinds of characters that matter:
The first character
The last character
The last-used character, so you don't repeat characters consecutively
All other characters
In other words, you don't need to iterate over all 10^15 characters. You only need to consider the four cases above, since most characters can be lumped together into the last case.

What is the logic behind the algorithm

I am trying to solve a problem from codility
"Even sums"
but am unable to do so. Here is the question below.
Even sums is a game for two players. Players are given a sequence of N positive integers and take turns alternately. In each turn, a player chooses a non-empty slice (a subsequence of consecutive elements) such that the sum of values in this slice is even, then removes the slice and concatenates the remaining parts of the sequence. The first player who is unable to make a legal move loses the game.
You play this game against your opponent and you want to know if you can win, assuming both you and your opponent play optimally. You move first.
Write a function:
string solution(vector< int>& A);
that, given a zero-indexed array A consisting of N integers, returns a string of format "X,Y" where X and Y are, respectively, the first and last positions (inclusive) of the slice that you should remove on your first move in order to win, assuming you have a winning strategy. If there is more than one such winning slice, the function should return the one with the smallest value of X. If there is more than one slice with the smallest value of X, the function should return the shortest. If you do not have a winning strategy, the function should return "NO SOLUTION".
For example, given the following array:
A[0] = 4 A[1] = 5 A[2] = 3 A[3] = 7 A[4] = 2
the function should return "1,2". After removing a slice from positions 1 to 2 (with an even sum of 5 + 3 = 8), the remaining array is [4, 7, 2]. Then the opponent will be able to remove the first element (of even sum 4) or the last element (of even sum 2). Afterwards you can make a move that leaves the array containing just [7], so your opponent will not have a legal move and will lose. One of possible games is shown on the following picture
Note that removing slice "2,3" (with an even sum of 3 + 7 = 10) is also a winning move, but slice "1,2" has a smaller value of X.
For the following array:
A[0] = 2 A[1] = 5 A[2] = 4
the function should return "NO SOLUTION", since there is no strategy that guarantees you a win.
Assume that:
N is an integer within the range [1..100,000];
each element of array A is an integer within the range [1..1,000,000,000].
Complexity:
expected worst-case time complexity is O(N);
expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.
I have found a solution online in python.
def check(start, end):
    if start>end:
        res = 'NO SOLUTION'
    else:
        res = str(start) + ',' + str(end)
    return res

def trans( strr ):
    if strr =='NO SOLUTION':
        return (-1, -1)
    else:
        a, b = strr.split(',')
        return ( int(a), int(b) )

def solution(A):
    # write your code in Python 2.7
    odd_list = [ ind for ind in range(len(A)) if A[ind]%2==1 ]
    if len(odd_list)%2==0:
        return check(0, len(A)-1)
    odd_list = [-1] + odd_list + [len(A)]
    res_cand = []
    # the numbers at the either end of A are even
    count = odd_list[1]
    second_count = len(A)-1-odd_list[-2]
    first_count = odd_list[2]-odd_list[1]-1
    if second_count >= count:
        res_cand.append( trans(check( odd_list[1]+1, len(A)-1-count )))
    if first_count >= count:
        res_cand.append( trans(check( odd_list[1]+count+1, len(A)-1 )))
    twosum = first_count + second_count
    if second_count < count <= twosum:
        res_cand.append( trans(check( odd_list[1]+(first_count-(count-second_count))+1, odd_list[-2] )))
    ###########################################
    count = len(A)-1-odd_list[-2]
    first_count = odd_list[1]
    second_count = odd_list[-2]-odd_list[-3]-1
    if first_count >= count:
        res_cand.append( trans(check( count, odd_list[-2]-1 )))
    if second_count >= count:
        res_cand.append( trans(check( 0, odd_list[-2]-count-1)) )
    twosum = first_count + second_count
    if second_count < count <= twosum:
        res_cand.append( trans(check( count-second_count, odd_list[-3])) )
    res_cand = sorted( res_cand, key=lambda x: (-x[0],-x[1]) )
    cur = (-1, -2)
    for item in res_cand:
        if item[0]!=-1:
            cur = item
    return check( cur[0], cur[1] )
This code works, and I can follow the code and the flow from one function to the other. However, I don't understand the logic of the algorithm: how it approaches the problem and solves it. This might be a long task, but could anybody please explain the algorithm to me? Thanks in advance.
So far I have figured out that the number of odd numbers is crucial to the result. In particular, the indices of the first and the last odd number are needed to calculate the important values.
Now I need to understand the logic behind comparisons such as "if first_count >= count" and "if second_count < count <= twosum".
Update:
Hey guys, I found the solution to my question and finally understood the logic of the algorithm.
The idea relies on the symmetry of the array. We can never win the game if the array is symmetrical. Here, symmetrical means an array with exactly one odd number in the middle and an equal number of evens on either side of it.
If there is an even number of odds, we can win the game directly (by removing the whole array).
If there is an odd number of odds, we should always try to make the array symmetrical. That is what the algorithm is trying to do.
Now there are two cases: either the last odd remains or the first odd remains. I will be happy to explain more if you guys didn't understand it. Thanks.
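To experiment with these observations on small inputs, an exhaustive game search can be used as a reference. The sketch below is not the O(N) algorithm from the question: it is a brute force over parities, exponential in the array length, and the names `wins` and `first_winning_move` are invented for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def wins(parities):
    """True if the player to move can force a win on this tuple of parities (1 = odd)."""
    n = len(parities)
    for i in range(n):
        total = 0
        for j in range(i, n):
            total ^= parities[j]  # parity of the sum of the slice i..j
            if total == 0 and not wins(parities[:i] + parities[j + 1:]):
                return True
    return False


def first_winning_move(a):
    """Winning slice "X,Y" with the smallest X, then the shortest, else "NO SOLUTION"."""
    p = tuple(v % 2 for v in a)
    for i in range(len(p)):
        total = 0
        for j in range(i, len(p)):
            total ^= p[j]
            if total == 0 and not wins(p[:i] + p[j + 1:]):
                return f"{i},{j}"
    return "NO SOLUTION"


print(first_winning_move([4, 5, 3, 7, 2]))
print(first_winning_move([2, 5, 4]))
```

Only the parity of each element matters, so the search state is a tuple of 0s and 1s; that keeps the memoisation table small enough for arrays of a dozen or so elements.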

Find the minimum number of operations required to compute a number using a specified range of numbers

Let me start with an example -
I have a range of numbers from 1 to 9. And let's say the target number that I want is 29.
In this case the minimum number of operations required is 2: (9*3)+2 = 29. Similarly, for 18 the minimum number of operations is 1 (9*2=18).
I can use any of the 4 arithmetic operators - +, -, / and *.
How can I programmatically find out the minimum number of operations required?
Thanks in advance for any help provided.
clarification: integers only, no decimals allowed mid-calculation. i.e. the following is not valid (from comments below): ((9/2) + 1) * 4 == 22
I must admit I didn't think about this thoroughly, but for my purpose it doesn't matter if decimal numbers appear mid-calculation. ((9/2) + 1) * 4 == 22 is valid. Sorry for the confusion.
For the special case where set Y = [1..9] and n > 0:
n <= 9 : 0 operations
n <=18 : 1 operation (+)
otherwise : Remove any divisor found in Y. If this is not enough, do a recursion on the remainder for all offsets -9 .. +9. Offset 0 can be skipped as it has already been tried.
Notice how division is not needed in this case. For other Y this does not hold.
This algorithm is exponential in log(n). The exact analysis is a job for somebody with more knowledge about algebra than I.
For more speed, add pruning to eliminate some of the search for larger numbers.
Sample code:
def findop(n, maxlen=9999):
    # Return a short postfix list of numbers and operations
    # Simple solution to small numbers
    if n<=9: return [n]
    if n<=18: return [9,n-9,'+']
    # Find direct multiply
    x = divlist(n)
    if len(x) > 1:
        mults = len(x)-1
        x[-1:] = findop(x[-1], maxlen-2*mults)
        x.extend(['*'] * mults)
        return x
    shortest = 0
    # list(...) keeps the concatenation valid where range is not a list
    for o in list(range(1,10)) + list(range(-1,-10,-1)):
        x = divlist(n-o)
        if len(x) == 1: continue
        mults = len(x)-1
        # We spent len(divlist) + mults + 2 fields for offset.
        # The last number is expanded by the recursion, so it doesn't count.
        recursion_maxlen = maxlen - len(x) - mults - 2 + 1
        if recursion_maxlen < 1: continue
        x[-1:] = findop(x[-1], recursion_maxlen)
        x.extend(['*'] * mults)
        if o > 0:
            x.extend([o, '+'])
        else:
            x.extend([-o, '-'])
        if shortest == 0 or len(x) < shortest:
            shortest = len(x)
            maxlen = shortest - 1
            solution = x[:]
    if shortest == 0:
        # Fake solution, it will be discarded
        return '#' * (maxlen+1)
    return solution

def divlist(n):
    l = []
    for d in range(9,1,-1):
        while n%d == 0:
            l.append(d)
            # integer division keeps n an int
            n = n//d
    if n>1: l.append(n)
    return l
The basic idea is to test all possibilities with k operations, for k starting from 0. Imagine you create a tree of height k that branches for every possible new operation with operand (4*9 branches per level). You need to traverse and evaluate the leaves of the tree for each k before moving to the next k.
I didn't test this pseudo-code:
for every k from 0 to infinity
    for every n from 1 to 9
        if compute(n,0,k):
            return k

boolean compute(n,j,k):
    if (j == k):
        return (n == target)
    else:
        for each operator in {+,-,*,/}:
            for every i from 1 to 9:
                if compute((n operator i),j+1,k):
                    return true
        return false
It doesn't take into account arithmetic operators precedence and braces, that would require some rework.
Really cool question :)
Notice that you can start from the end! From your example (9*3)+2 = 29 is equivalent to saying (29-2)/3=9. That way we can avoid the double loop in cyborg's answer. This suggests the following algorithm for set Y and result r:
nextleaves = {r}
nops = 0
while(true):
    nops = nops+1
    leaves = nextleaves
    nextleaves = {}
    for leaf in leaves:
        for y in Y:
            if (leaf+y) or (leaf-y) or (leaf*y) or (leaf/y) is in Y:
                return(nops)
            else:
                add (leaf+y) and (leaf-y) and (leaf*y) and (leaf/y) to nextleaves
This is the basic idea; performance can certainly be improved, for instance by avoiding "backtracks" such as r+a-a or r*a*b/a.
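A level-by-level search of this kind could be sketched in Python as below. Several things here are assumptions for the sake of a runnable example: the search runs forward from the digits instead of backward from the result, every step has the form "current value, operator, one digit" (so differently bracketed expressions are not explored), division is only applied when it is exact, and `limit` is an arbitrary cap that keeps the search finite. The name `min_operations` is made up.

```python
from collections import deque


def min_operations(target, digits=range(1, 10), limit=10**6):
    """Fewest 'value op digit' steps needed to reach target, or None if none is found."""
    start = set(digits)
    if target in start:
        return 0
    seen = set(start)
    queue = deque((v, 0) for v in start)
    while queue:
        value, ops = queue.popleft()
        candidates = []
        for d in digits:
            candidates += [value + d, value - d, value * d]
            if value % d == 0:  # integers only: divide when it is exact
                candidates.append(value // d)
        for nxt in candidates:
            if nxt == target:
                return ops + 1
            if nxt not in seen and abs(nxt) <= limit:
                seen.add(nxt)
                queue.append((nxt, ops + 1))
    return None


print(min_operations(29))
print(min_operations(18))
```

The `seen` set plays the role of avoiding "backtracks": a value that was already reached with fewer operations is never expanded again.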
I guess my idea is similar to the one of Peer Sommerlund:
For big numbers, you advance fast by multiplying with big digits.
Is Y=29 prime? If not, divide it by its largest divisor in 2 to 9.
Otherwise you can subtract a number to reach a divisible number. 27 is fine, since it is divisible by 9, so
(29-2)/9=3 =>
3*9+2 = 29
So maybe (I didn't think this through to the end): search for the next number below Y that is divisible by 9. If you don't reach a single digit, repeat.
The formula is the steps reversed.
(I'll try it for some numbers. :) )
I tried with 2551, which is
echo $((((3*9+4)*9+4)*9+4))
But I didn't test every intermediate result whether it is prime.
But
echo $((8*8*8*5-9))
is 2 operations less. Maybe I can investigate this later.

Generate Random(a, b) making calls to Random(0, 1)

There is a known function Random(0,1); it is a uniform random function, which means it returns 0 or 1, each with probability 50%. Implement Random(a, b) so that it only makes calls to Random(0,1).
What I thought of so far is: put the range a..b in a 0-based array, so I have the indices 0, 1, 2, ..., b-a.
Then call RANDOM(0,1) b-a times, sum the results to get the index, and return the element at that index.
However, since there is no answer in the book, I don't know if this way is correct or the best. How can I prove that the probability of returning each element is exactly the same, namely 1/(b-a+1)?
And what is the right/better way to do this?
If your RANDOM(0, 1) returns either 0 or 1, each with probability 0.5 then you can generate bits until you have enough to represent the number (b-a+1) in binary. This gives you a random number in a slightly too large range: you can test and repeat if it fails. Something like this (in Python).
import math

def rand_pow2(bit_count):
    """Return a random number with the given number of bits."""
    result = 0
    for i in range(bit_count):
        result = 2 * result + RANDOM(0, 1)
    return result

def random_range(a, b):
    """Return a random integer in the closed interval [a, b]."""
    bit_count = math.ceil(math.log2(b - a + 1))
    while True:
        r = rand_pow2(bit_count)
        if a + r <= b:
            return a + r
When you sum random numbers, the result is no longer evenly distributed: the sum of fair coin flips follows a binomial distribution, which looks like a Gaussian function. Look up "law of large numbers" or read any probability book or article. It is just like flipping a coin 100 times: that is highly unlikely to give 100 heads, and likely to give close to 50 heads and 50 tails.
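The exact distribution of such a sum is easy to tabulate. A small Python sketch (the function name `sum_of_flips_distribution` is made up here) that returns the probability of each possible total when n fair 0/1 values are added:

```python
from fractions import Fraction
from math import comb


def sum_of_flips_distribution(n):
    """Exact probability of each total 0..n when n fair 0/1 flips are added up."""
    # There are comb(n, k) ways to get exactly k ones out of 2**n outcomes.
    return [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]


# For a range with b - a = 4, the middle index is six times as likely as the ends.
print(sum_of_flips_distribution(4))
```

A uniform generator over the same five indices would give each of them probability 1/5, which no entry of this table matches.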
Your inclination to map the range to 0 .. b-a first is correct. However, you cannot do it as you stated. This question asks exactly how to do that, and the answer utilizes unique factorization. Write m = b-a+1 in base 2, keeping track of the largest needed exponent, say e. Then find the biggest k such that k*m is at most 2^e. Finally, generate e bits with RANDOM(0,1) and take them as the base 2 expansion of some number x; if x < k*m, return x modulo m, otherwise try again. The program looks something like this (simple case, when m is small enough for 2^e to fit in an int):
// Returns a uniformly distributed integer in [0, m-1]
int random_mod(int m) {
    // find the smallest exponent e such that 2^e >= m
    int e=0;
    while (m > (1 << e)) {
        ++e;
    }
    // find the largest k such that k*m <= 2^e
    int k=1;
    while (k*m <= (1 << e)) {
        ++k;
    }
    --k; // we went one too far
    while (1) {
        // generate a random number in base 2
        int x = 0;
        for (int i=0; i<e; ++i) {
            x = x*2 + RANDOM(0,1);
        }
        // if x isn't too large, return x modulo m
        if (x < m*k)
            return (x % m);
    }
}
Now you can simply add a to the result to get uniformly distributed numbers between a and b.
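Divide and conquer can help us generate a random number in the range [a,b] using random(0,1). Note that the result is uniform only when the range size b-a+1 is a power of two; otherwise the two halves differ in size and the numbers in the smaller half are returned more often. The idea is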
if a is equal to b, then random number is a
Find mid of the range [a,b]
Generate random(0,1)
If above is 0, return a random number in range [a,mid] using recursion
else return a random number in range [mid+1, b] using recursion
The working 'C' code is as follows.
int random(int a, int b)
{
    if(a == b)
        return a;
    int c = RANDOM(0,1); // Returns 0 or 1 with probability 0.5
    int mid = a + (b-a)/2;
    if(c == 0)
        return random(a, mid);
    else
        return random(mid + 1, b);
}
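To see which probabilities this recursion assigns to each value of a given range, they can be computed exactly instead of sampled. The following Python sketch mirrors the split used in the C code; the function name `halving_distribution` is an assumption for this example.

```python
from fractions import Fraction


def halving_distribution(a, b):
    """Exact probability of each value returned by the recursive halving of [a, b]."""
    if a == b:
        return {a: Fraction(1)}
    mid = a + (b - a) // 2
    dist = {}
    # Each half is entered with probability 1/2, whatever its size.
    for lo, hi in ((a, mid), (mid + 1, b)):
        for value, prob in halving_distribution(lo, hi).items():
            dist[value] = prob / 2
    return dist


print(halving_distribution(0, 3))
print(halving_distribution(0, 2))
```

For [0, 3] every value gets 1/4, while for [0, 2] the values 0 and 1 get 1/4 each and 2 gets 1/2, so the split is even only when both halves always have the same size.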
If you have an RNG that returns {0, 1} with equal probability, you can easily create an RNG that returns the numbers {0, ..., 2^n - 1} with equal probability.
To do this you just call your original RNG n times and read the results as a binary number like 0010110111. Each of the numbers from 0 to 2^n - 1 is equally likely.
Now it is easy to get an RNG from a to b when b - a + 1 = 2^n. You just use the previous RNG and add a to its result.
Now the last question is: what should you do if b - a + 1 is not a power of two?
The good news is that you have to do almost nothing: rely on the rejection sampling technique. It says that if you have an RNG over a big set and need to select an element from a subset of it, you can keep selecting elements from the bigger set and discarding them until you get one that lies in your subset.
So all you do is compute b - a and find the smallest n such that b - a + 1 <= 2^n. Then use rejection sampling until you pick a number no greater than b - a. Then you just add a.

Resources