Number of arrangements - algorithm

Suppose we have n elements, a1, a2, ..., an, arranged in a circle. That is, a2 is between a1 and a3, a3 is between a2 and a4, an is between a_{n-1} and a1, and so forth.
Each element can take the value of either 1 or 0. Two arrangements are different if there are corresponding ai's whose values differ. For instance, when n=3, (1, 0, 0) and (0, 1, 0) are different arrangements, even though they may be isomorphic under rotation or reflection.
Because there are n elements, each of which can take two values, the total number of arrangements is 2^n.
Here is the question:
How many arrangements are possible, such that no two adjacent elements both have the value 1? If it helps, only consider cases where n>3.
I ask here for several reasons:
This arose while I was solving a programming problem
It sounds like the problem may benefit from Boolean logic/bit arithmetic
Maybe there is no closed solution.

Let's first ask the question "how many 0-1 sequences of length n are there with no two consecutive 1s?" Let the answer be A(n). We have A(0)=1 (the empty sequence), A(1) = 2 ("0" and "1"), and A(2)=3 ("00", "01" and "10" but not "11").
To make it easier to write a recurrence, we'll compute A(n) as the sum of two numbers:
B(n), the number of such sequences that end with a 0, and
C(n), the number of such sequences that end with a 1.
Then B(n) = A(n-1) (take any such sequence of length n-1, and append a 0)
and C(n) = B(n-1) (because if you have a 1 at position n, you must have a 0 at n-1.)
This gives A(n) = B(n) + C(n) = A(n-1) + B(n-1) = A(n-1) + A(n-2).
By now it should be familiar :-)
A(n) is simply the Fibonacci number F_{n+2}, where the Fibonacci sequence is defined by F_0=0, F_1=1, and F_{n+2} = F_{n+1} + F_n for n ≥ 0.
Now for your question. We'll count the number of arrangements with a1=0 and a1=1 separately. For the former, a2 … an can be any sequence at all (with no consecutive 1s), so the number is A(n-1) = F_{n+1}. For the latter, we must have a2=0, and then a3…an is any sequence with no consecutive 1s that ends with a 0, i.e. B(n-2) = A(n-3) = F_{n-1}.
So the answer is F_{n+1} + F_{n-1}.
Actually, we can go even further than that answer. Note that if you call the answer G(n) = F_{n+1} + F_{n-1}, then
G(n+1) = F_{n+2} + F_n, and
G(n+2) = F_{n+3} + F_{n+1}, so even G(n) satisfies the same recurrence as the Fibonacci sequence! [Actually, any linear combination of Fibonacci-like sequences will satisfy the same recurrence, so it's not all that surprising.] So another way to compute the answers would be using:
G(2)=3
G(3)=4
G(n)=G(n-1)+G(n-2) for n≥4.
And now you can also use the closed form F_n = (α^n - β^n)/(α - β) (where α and β are (1±√5)/2, the roots of x^2 - x - 1 = 0), to get
G(n) = ((1+√5)/2)^n + ((1-√5)/2)^n.
[You can ignore the second term because it's very close to 0 for large n; in fact G(n) is the closest integer to ((1+√5)/2)^n for all n≥2.]
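As a sanity check on the recurrence and the closed form, here is a minimal Python sketch. The function names are made up for illustration, and the brute-force enumeration is only there to confirm the numbers for small n.

```python
from itertools import product


def circular_count(n):
    """G(n) from G(2)=3, G(3)=4, G(n)=G(n-1)+G(n-2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    prev, cur = 3, 4  # G(2), G(3)
    if n == 2:
        return prev
    for _ in range(n - 3):
        prev, cur = cur, prev + cur
    return cur


def brute_force_count(n):
    """Count 0/1 circles of length n with no two adjacent 1s by enumeration."""
    return sum(
        1
        for bits in product((0, 1), repeat=n)
        if not any(bits[i] and bits[(i + 1) % n] for i in range(n))
    )


PHI = (1 + 5 ** 0.5) / 2

# Recurrence, enumeration and "closest integer to phi^n" agree for small n.
for n in range(2, 13):
    assert circular_count(n) == brute_force_count(n) == round(PHI ** n)

print(circular_count(5))  # 11
```

The floating-point rounding is only trustworthy for moderate n; for large n the integer recurrence is the safer route.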

I decided to hack up a small script to try it out:
#!/usr/bin/python
import sys

# thx google: binary string of a positive integer ("" for 0)
bstr_pos = lambda n: n>0 and bstr_pos(n>>1)+str(n&1) or ""

def arrangements(n):
    count = 0
    # the all-ones value 2^n - 1 is never valid, so the range stops before it
    for v in range(0, pow(2,n)-1):
        bits = bstr_pos(v).rjust(n, '0')
        if not ( bits.find("11")!=-1 or ( bits[0]=='1' and bits[-1]=='1' ) ):
            count += 1
            print(bits)
    print("Total = " + str(count))

arrangements(int(sys.argv[1]))
Running this for 5, gave me a total of 11 possibilities with 00000,
00001,
00010,
00100,
00101,
01000,
01001,
01010,
10000,
10010,
10100.
P.S. - Excuse the not() in the above code.
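Since the question mentions bit arithmetic, the same brute-force check can be sketched without strings: rotate the n-bit value by one position and AND it with itself. This is only an illustrative variant (the function name is made up), and it is still exponential in n.

```python
def count_circular_bitwise(n):
    """Count n-bit values with no two circularly adjacent 1 bits (n >= 1)."""
    mask = (1 << n) - 1
    count = 0
    for v in range(1 << n):
        # Rotate v left by one position inside an n-bit word.
        rotated = ((v << 1) | (v >> (n - 1))) & mask
        # A common set bit means two neighbours on the circle are both 1.
        if (v & rotated) == 0:
            count += 1
    return count


print(count_circular_bitwise(5))  # 11
```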

Throwing my naive script into the mix. Plenty of opportunity for caching partial results, but it ran fast enough for small n that I didn't bother.
def arcCombinations(n, lastDigitMustBeZero):
    """Takes the length of the remaining arc of the circle, and computes
    the number of legal combinations.
    The last digit may be restricted to 0 (because the first digit is a 1)"""
    if n == 1:
        if lastDigitMustBeZero:
            return 1 # only legal answer is 0
        else:
            return 2 # could be 1 or 0.
    elif n == 2:
        if lastDigitMustBeZero:
            return 2 # could be 00 or 10
        else:
            return 3 # could be 10, 01 or 00
    else:
        # Could be a 1, in which case next item is a zero.
        return (
            arcCombinations(n-2, lastDigitMustBeZero) # If it starts 10
            + arcCombinations(n-1, lastDigitMustBeZero) # If it starts 0
        )

def circleCombinations(n):
    """Computes the number of legal combinations for a given circle size (n >= 3)."""
    # Handle case where it starts with 0 or with 1.
    total = (
        # First digit is a 1: the second digit is forced to 0, so only
        # n-2 digits remain, and the last of them must be 0.
        arcCombinations(n-2,True)
        +
        arcCombinations(n-1,False) # Number of combinations where first digit is a 0.
    )
    return total

print(circleCombinations(13))
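The caching mentioned above could look roughly like this, assuming functools.lru_cache is acceptable. This is a self-contained sketch with made-up names, not a drop-in patch for the script: the first function counts linear sequences, and the second combines the "starts with 0" and "starts with 1, 0" cases.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def linear_count(n, last_must_be_zero):
    """Count 0/1 sequences of length n with no two consecutive 1s."""
    if n == 0:
        return 1  # the empty sequence
    if n == 1:
        return 1 if last_must_be_zero else 2
    if n == 2:
        return 2 if last_must_be_zero else 3
    # Either the sequence starts with 0, or it starts with 1 followed by 0.
    return (linear_count(n - 1, last_must_be_zero)
            + linear_count(n - 2, last_must_be_zero))


def circle_count(n):
    """Circular arrangements of length n >= 2 with no two adjacent 1s."""
    starts_with_zero = linear_count(n - 1, False)
    # A leading 1 forces the second digit to 0 and the last digit to 0.
    starts_with_one = linear_count(n - 2, True)
    return starts_with_zero + starts_with_one


print(circle_count(5))   # 11
print(circle_count(13))  # 521
```

With the cache each (n, flag) pair is computed once, so the running time becomes linear in n instead of exponential.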

This problem is very similar to Zeckendorf representations. I can't find an obvious way to apply Zeckendorf's Theorem, due to the circularity constraint, but the Fibonacci numbers are obviously very prevalent in this problem.

Related

The algorithm to generate binary numbers from 1 to n in lexicographical order

Is there an efficient algorithm to do this? I have tried producing all the binary numbers, storing them in an array, and then sorting them; if we could generate the binary numbers directly in lexicographical order, the code would be much faster.
For example, n=7 produces 1,10,100,101,11,110,111
The key property here is, 0 will always come before 1, so you can use recursion to solve this. The algorithm would look like:
Start recursion from 1
If current number > n, ignore it
Else, add it to the output list. Call recursion(curr_number + "0") and recursion(curr_number + "1")
This is a simple algorithm, which can be easily implemented in most languages.
Edit: Python implementation:
def dfs(current_string, current_number, n):
    if current_number > n:
        return []
    strings = [current_string]
    strings.extend(dfs(current_string + "0", current_number << 1, n))
    strings.extend(dfs(current_string + "1", (current_number << 1)+1, n))
    return strings

print(dfs("1", 1, 7))
If you number a complete binary tree row by row, from 1 to 2^d-1, the enumeration of the nodes in lexicographical order is the preorder traversal. Now as the two children of a node carry the value of the parent followed by 0 or by 1, we have the recursive enumeration
n= ...
def Emit(m):
    print(bin(m))
    if 2 * m <= n:
        Emit(2 * m)
    if 2 * m + 1 <= n:
        Emit(2 * m + 1)

Emit(1)
(You can also obtain the binary representations by concatenating 0's or 1's as you go.)
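A sketch of that concatenation idea, written as a generator with an explicit stack so that deep recursion is not a concern. The name lex_binary is made up for this example, and the comparison against a plain sort is only there as a check.

```python
def lex_binary(n):
    """Yield the binary strings of 1..n in lexicographic order."""
    stack = [(1, "1")]
    while stack:
        value, text = stack.pop()
        if value > n:
            continue
        yield text
        # Push the "1" child first so the "0" child is visited first.
        stack.append((2 * value + 1, text + "1"))
        stack.append((2 * value, text + "0"))


print(list(lex_binary(7)))  # ['1', '10', '100', '101', '11', '110', '111']

# Cross-check against sorting the strings directly.
for n in range(1, 50):
    assert list(lex_binary(n)) == sorted(format(i, "b") for i in range(1, n + 1))
```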
There are a few rules you can follow to generate the next item in a lexicographical ordering of any set of strings:
The first symbol that changes must increase (otherwise you'll get an earlier symbol)
The first symbol that changes must be as far right as possible (otherwise there would be a smaller lexicographical change); and
The symbols after the first change must be made as small as possible (otherwise again there would be a smaller lexicographical change).
For ordering the binary strings, these rules are easy to apply. In each iteration:
1. If you can append a zero without exceeding n, then do so;
2. Otherwise, find the rightmost possible 0, change it to a 1, and remove the remainder. The "rightmost possible 0" in this case is the rightmost one that produces a result <= n. This is not necessarily the rightmost one if n is not 2^x-1.
This iteration is pretty easy to implement with bitwise operators, leading to this nice quick algorithm. To simplify step (2), we assume that n is 2^x-1 and just check our outputs:
def printLexTo(n):
    val=1
    while True:
        if val<=n:
            print("{0:b}".format(val))
        if 2*val <= n:
            val *= 2
        else:
            # get the smallest 0 bit
            bit = (val+1) & ~val
            # set it to 1 and remove the remainder
            val = (val+1)//bit
            if val==1:
                # there weren't any 0 bits in the string
                break
As is often the case, this iterative algorithm is a lot faster than recursive ones, but coming up with it requires a deeper understanding of the structure of the solution.

Finding all n digit binary numbers with r adjacent digits as 1 [closed]

Let me explain with an example. If n=4 and r=2, I want all 4-digit binary numbers in which two adjacent digits are 1, so the answer is 0011 0110 1011 1100 1101.
Q. I am unable to figure out a pattern or an algorithm.
Hint: The 11 can start in position 0, 1, or 2. On either side, the digit must be zero, so the only "free" digits are in the remaining position and can cycle through all possible values.
For example, if there are n=10 digits and you're looking for r=3 adjacent ones, the pattern is
x01110y
Where x and y can cycle through all possible prefixes and suffixes for the remaining five free digits. Note that when the run sits at either end, the leading or trailing zero gets dropped, leaving six free digits in x0111 and 1110y.
Here's an example using Python:
from itertools import product

def gen(n, r):
    'Generate all n-length sequences with r fixed adjacent ones'
    result = set()

    # run of ones at the very start: 1...10 + suffix
    fixed = tuple([1] * r + [0])
    for suffix in product([0,1], repeat=n-r-1):
        result.add(fixed + suffix)

    # run of ones in the interior: prefix + 01...10 + suffix
    fixed = tuple([0] + [1] * r + [0])
    rem = n - r - 2
    for leadsize in range(rem + 1):  # 0..rem free digits may precede the pattern
        for digits in product([0,1], repeat=rem):
            result.add(digits[:leadsize] + fixed + digits[leadsize:])

    # run of ones at the very end: prefix + 01...1
    fixed = tuple([0] + [1] * r)
    for prefix in product([0,1], repeat=n-r-1):
        result.add(prefix + fixed)

    return sorted(result)
I would start with simplifying the problem. Once you have a solution for the simplest case, generalize it and then try to optimize it.
First design an algorithm that will find out if a given number has 'r' adjacent 1s. Once you have it, the brute-force way is to go through all the numbers with 'n' digits, checking each with the algorithm you just developed.
Now you can look at optimizing it. For example, if you know whether 'r' is even or odd, you can reduce the set of numbers to look at. The bit-counting algorithm given by K&R runs in time proportional to the number of set bits, so you can rule out half of the cases more cheaply than with an actual bit-by-bit comparison. There might be a better way to reduce this as well.
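The brute-force starting point described above might be sketched like this. Because the question is ambiguous, this sketch assumes one particular reading: a string qualifies if it contains at least one maximal run of exactly r ones. The helper names are made up for illustration.

```python
from itertools import groupby, product


def has_run_of_exactly(bits, r):
    """True if some maximal run of 1s in bits has length exactly r."""
    return any(
        key == 1 and sum(1 for _ in group) == r
        for key, group in groupby(bits)
    )


def brute_force(n, r):
    """All n-digit binary strings containing a run of exactly r ones."""
    return [
        "".join(map(str, bits))
        for bits in product((0, 1), repeat=n)
        if has_run_of_exactly(bits, r)
    ]


print(brute_force(4, 2))  # ['0011', '0110', '1011', '1100', '1101']
```

Under this reading the n=4, r=2 result matches the five strings listed in the question. Other readings (for example "exactly one such run") would need a different predicate.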
Funny problem with very simple recursive solution. Delphi.
procedure GenerateNLengthWithROnesTogether(s: string;
  N, R, Len, OnesInRow: Integer; HasPatternAlready: Boolean);
begin
  if Len = N then
    Output(s)
  else
  begin
    HasPatternAlready := HasPatternAlready or (OnesInRow >= R);
    if HasPatternAlready or (N - Len > R) // there is a chance to make the pattern
    then
      GenerateNLengthWithROnesTogether('0' + s, N, R, Len + 1, 0, HasPatternAlready);
    if (not HasPatternAlready) or (OnesInRow < R - 1) // only one pattern allowed
    then
      GenerateNLengthWithROnesTogether('1' + s, N, R, Len + 1, OnesInRow + 1, HasPatternAlready);
  end;
end;

begin
  GenerateNLengthWithROnesTogether('', 5, 2, 0, 0, False);
end;
program output:
N=5,R=2
11000 01100 11010 00110
10110 11001 01101 00011
10011 01011
N=7, R=3
1110000 0111000 1110100 0011100
1011100 1110010 0111010 1110110
0001110 1001110 0101110 1101110
1110001 0111001 1110101 0011101
1011101 1110011 0111011 0000111
1000111 0100111 1100111 0010111
1010111 0110111
As I've stated in the comment above, I am still unclear about the full restrictions of the output set. However, the algorithm below can be refined to cover your final case.
Before I can describe the algorithm, there is an observation: let S be 1 repeated m times, and D be the set of all possible suffixes we can use to generate valid outputs. So, the bit string S0D0 (S followed by the 0 bit, followed by the bit string D followed by the 0 bit) is a valid output for the algorithm. Also, all strings ror(S0D0, k), 0<=k<=n-m are valid outputs (ror is the rotate right function, where bits that disappear on the right side come in from left). These will generate the bit strings S0D0 to 0D0S. In addition to these rotations, the solutions S0D1 and 1D0S are valid bit strings that can be generated by the pair (S, D).
So, the algorithm is simply enumerating all valid D bit strings, and generating the above set for each (S, D) pair. If you allow more than m 1s together in the D part, it is simple bit enumeration. If not, it is a recursive definition, where D is the set of outputs of the same algorithm with n'=n-(m+2) and m' is each of {m, m-1, ..., 1}.
Of course, this algorithm will generate some duplicates. The cases I can think of are when ror(S0D0,k) matches one of the patterns S0E0, S0E1 or 1E0S. For the first case, you can stop generating more outputs for larger k values. D=E generator will take care of those. You can also simply drop the other two cases, but you need to continue rotating.
I know there is an answer, but I wanted to see the algorithm at work, so I implemented a crude version. It turned out to have more edge cases than I realized. I haven't added duplication check for the two last yields of the family() function, which causes duplication for outputs like 11011, but the majority of them are eliminated.
def ror(str, n):
    return str[-n:]+str[:-n]

def family(s, d, r):
    root = s + '0' + d + '0'
    yield root # root is always a solution
    for i in range(1, len(d)+3):
        sol=ror(root, i)
        if sol[:r]==s and sol[r]=='0' and sol[-1]=='0':
            break
        yield sol
    if d[-r:]!=s: # Make sure output is valid
        yield s + '0' + d + '1'
    if d[:r]!=s: # Make sure output is valid (todo: duplicate check)
        yield '1' + d + '0' + s

def generate(n, r):
    s="1"*r
    if r==0: # no 1's allowed
        yield '0'*n
    elif n==r: # only one combination
        yield s
    elif n==r+1: # two cases. Cannot use family() for this
        yield s+'0'
        yield '0'+s
    else:
        # generate all sub-problem outputs
        for rr in range(r+1):
            if n-r-2>=rr:
                for d in generate(n-r-2, rr):
                    for sol in family(s, d, r):
                        yield sol

You use it either as [s for s in generate(6,2)], or in a loop as
for s in generate(6,3):
    print(s)

How to implement Random(a,b) with only Random(0,1)? [duplicate]

In the book Introduction to Algorithms, there is an exercise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The probability of the result of Random(a,b) should be pure uniformly distributed, as Random(0,1)
For the Random function, the results are integers between a and b, inclusive. For example, Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b
My solution is like this:
for i = 1 to b-a
    r = a + Random(0,1)
return r
the running time is T=b-a
Is this correct? Are the results of my solutions uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b - a //including b-a
    r += Random(0,1)
return r
If it is not correct, why does r += Random(0,1) make r not uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r=0
for i = 1 to p
    r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
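The four steps above can be sketched in Python as follows. The coin source random01 is a stand-in for Random(0,1) built on the standard random module, and the bit parameter is an addition of this sketch so the function can be driven deterministically. int.bit_length() gives the smallest p with 2^p > b-a.

```python
import random


def random01():
    """Stand-in for Random(0,1): returns 0 or 1 with equal probability."""
    return random.randint(0, 1)


def random_ab(a, b, bit=random01):
    """Uniform integer in [a, b] using only calls to bit()."""
    span = b - a
    p = span.bit_length()  # smallest p such that 2**p > span
    while True:
        r = 0
        for _ in range(p):
            r = 2 * r + bit()
        if r <= span:      # otherwise reject and draw p fresh bits
            return a + r


# Deterministic walk-through for Random(1,3): bits 1,1 give r=3 (rejected),
# then bits 0,1 give r=1, so the result is 2.
bits = iter([1, 1, 0, 1])
print(random_ab(1, 3, lambda: next(bits)))  # 2
```

Each attempt succeeds with probability greater than 1/2, so the expected number of attempts is below 2 and the expected number of coin flips is O(log(b-a)).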
No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? it would make more sense if it produced floating point values between 0 and 1. Then the solution would be an affine transformation, running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step, you have a range low-high. If Random(0,1) returns 0, the next range is low-(low+high)/2, else (low+high)/2-high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct. This sum'll have binomial distribution.
However, you can generate a pure random sequence of 0, 1 and treat it as a binary number.
repeat
    result = a
    steps = ceiling(log2(b - a))
    for i = 0 to steps
        result += (2 ^ i) * Random(0, 1)
until result <= b
KennyTM: my bad.
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a+1 elements (one per possible result).
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it comes up 0, set the element to 0.
Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)
This would have O(infinity) ... :)
On average, though, half the numbers would be eliminated, so it would have an average case running time of log_2 (b-a).
First of all I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probability you can prove that your solution is not uniformly distributed. The chance that the resulting value r is (a+b)/2 is greatest. For instance, if a is 0 and b is 7, the chance that you get a value of 4 is (combination 4 of 7) divided by 2 raised to the power 7. The reason for that is that no matter which 4 out of the 7 values are 1, the result will still be 4.
The running time you estimate is correct.
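To make the binomial shape concrete, here is a small sketch that computes the exact distribution of a plus the sum of b-a fair coin flips. The function name is made up, and math.comb requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb


def sum_of_flips_distribution(a, b):
    """Exact distribution of a + (sum of b-a fair coin flips)."""
    flips = b - a
    return {
        a + k: Fraction(comb(flips, k), 2 ** flips)
        for k in range(flips + 1)
    }


dist = sum_of_flips_distribution(0, 7)
print(dist[4])  # 35/128
print(dist[0])  # 1/128
```

A uniform Random(0,7) would give every value probability 1/8, so the end values are heavily under-represented.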
Your solution's pseudocode should look like:
r=a
for i = 1 to b-a
    r+=Random(0,1)
return r
As for uniform distribution, assuming that the random implementation this random number generator is based on is perfectly uniform, the odds of getting 0 or 1 are 50% each. Therefore getting the number you want is the result of that choice made over and over again.
So for a=1, b=5, there are 4 choices made.
The odds of getting 1 involves 4 decisions, all 0, the odds of that are 0.5^4 = 6.25%
The odds of getting 5 involves 4 decisions, all 1, the odds of that are 0.5^4 = 6.25%
As you can see from this, the distribution is not uniform -- the odds of any number should be 20%.
In the algorithm you created, the result is really not uniformly distributed.
The result "r" will always be either "a" or "a+1". It will never go beyond that.
It should look something like this:
r=a;
for i=1 to b-a
    r = r + Random(0,1)
return r;
By including "r" in your computation, you are including the "randomness" of all the previous "for" loop runs.

Determine whether a symbol is part of the ith combination nCr

UPDATE:
Combinatorics and unranking was eventually what I needed.
The links below helped a lot:
http://msdn.microsoft.com/en-us/library/aa289166(v=vs.71).aspx
http://www.codeproject.com/Articles/21335/Combinations-in-C-Part-2
The Problem
Given a list of N symbols say {0,1,2,3,4...}
And NCr combinations of these
eg. NC3 will generate:
0 1 2
0 1 3
0 1 4
...
...
1 2 3
1 2 4
etc...
For the ith combination (i = [1 .. NCr]) I want to determine Whether a symbol (s) is part of it.
Func(N, r, i, s) = True/False or 0/1
eg. Continuing from above
The 1st combination contains 0 1 2 but not 3
F(N,3,1,"0") = TRUE
F(N,3,1,"1") = TRUE
F(N,3,1,"2") = TRUE
F(N,3,1,"3") = FALSE
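For reference, the specification above can be pinned down with a deliberately naive baseline that walks itertools.combinations up to the ith entry. It assumes the symbols are the integers 0..N-1 and that i is 1-based, as in the examples; it takes time proportional to i, so it is only useful for checking faster approaches.

```python
from itertools import combinations, islice


def func(N, r, i, s):
    """True if symbol s is in the ith (1-based) r-combination of 0..N-1,
    with combinations listed in lexicographic order."""
    # islice skips the first i-1 combinations; next() raises StopIteration
    # if i is larger than the number of combinations.
    combo = next(islice(combinations(range(N), r), i - 1, None))
    return s in combo


print(func(5, 3, 1, 0))  # True
print(func(5, 3, 1, 3))  # False
```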
Current approaches and tidbits that might help or be related.
Relation to matrices
For r = 2 eg. 4C2 the combinations are the upper (or lower) half of a 2D matrix
1,2 1,3 1,4
----2,3 2,4
--------3,4
For r = 3 it's the corner of a 3D matrix or cube
for r = 4 it's the "corner" of a 4D matrix, and so on.
Another relation
Ideally the solution would be of a form something like the answer to this:
Calculate Combination based on position
For the nth combination in the list of combinations of length r (with repetition allowed), the ith symbol can be calculated
using integer division and remainder:
n/r^i % r = (0 for 0th symbol, 1 for 1st symbol....etc)
eg for the 6th comb of 3 symbols the 0th 1st and 2nd symbols are:
i = 0 => 6 / 3^0 % 3 = 0
i = 1 => 6 / 3^1 % 3 = 2
i = 2 => 6 / 3^2 % 3 = 0
The 6th comb would then be 0 2 0
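The positional formula for the with-repetition case is a one-liner in Python. The function name is made up for this sketch.

```python
def symbol_at(n, r, i):
    """ith symbol (0-based) of the nth combination with repetition,
    using r symbols: integer division followed by remainder."""
    return (n // r ** i) % r


print([symbol_at(6, 3, i) for i in range(3)])  # [0, 2, 0]
```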
I need something similar but with repetition not allowed.
Thank you for following this question this far :]
Kevin.
I believe your problem is that of unranking combinations or subsets.
I will give you an implementation in Mathematica, from the package Combinatorica, but the Google link above is probably a better place to start, unless you are familiar with the semantics.
UnrankKSubset::usage = "UnrankKSubset[m, k, l] gives the mth k-subset of set l, listed in lexicographic order."
UnrankKSubset[m_Integer, 1, s_List] := {s[[m + 1]]}
UnrankKSubset[0, k_Integer, s_List] := Take[s, k]
UnrankKSubset[m_Integer, k_Integer, s_List] :=
    Block[{i = 1, n = Length[s], x1, u, $RecursionLimit = Infinity},
        u = Binomial[n, k];
        While[Binomial[i, k] < u - m, i++];
        x1 = n - (i - 1);
        Prepend[UnrankKSubset[m - u + Binomial[i, k], k-1, Drop[s, x1]], s[[x1]]]
    ]
Usage is like:
UnrankKSubset[5, 3, {0, 1, 2, 3, 4}]
{0, 3, 4}
This yields the length-3 combination of the set {0, 1, 2, 3, 4} with rank 5 (counting from 0), i.e. the 6th one in lexicographic order.
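For readers without Mathematica, here is a rough Python sketch of lexicographic unranking. It is not a translation of the Combinatorica code, just the usual "skip whole blocks of subsets" idea; the name unrank_lex is made up and math.comb needs Python 3.8+.

```python
from itertools import combinations
from math import comb


def unrank_lex(m, k, items):
    """Return the k-subset of items with 0-based lexicographic rank m."""
    n = len(items)
    chosen = []
    start = 0
    while k > 0:
        for x in range(start, n):
            # Number of k-subsets whose next element is items[x].
            block = comb(n - x - 1, k - 1)
            if m < block:
                chosen.append(items[x])
                start = x + 1
                k -= 1
                break
            m -= block  # skip this whole block of subsets
        else:
            raise IndexError("rank out of range")
    return chosen


print(unrank_lex(5, 3, [0, 1, 2, 3, 4]))  # [0, 3, 4]

# Cross-check every rank against itertools.combinations.
expected = [list(c) for c in combinations(range(6), 3)]
assert [unrank_lex(m, 3, list(range(6))) for m in range(len(expected))] == expected
```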
There's a very efficient algorithm for this problem, which is also contained in the recently published Knuth, The Art of Computer Programming, Volume 4A (section 7.2.1.3).
Since you don't care about the order in which the combinations are generated, let's use the lexicographic order of the combinations where each combination is listed in descending order. Thus for r=3, the first 11 combinations of 3 symbols would be: 210, 310, 320, 321, 410, 420, 421, 430, 431, 432, 510. The advantage of this ordering is that the enumeration is independent of n; indeed it is an enumeration over all combinations of 3 symbols from {0, 1, 2, …}.
There is a standard method to directly generate the ith combination given i, so to test whether a symbol s is part of the ith combination, you can simply generate it and check.
Method
How many combinations of r symbols start with a particular symbol s? Well, the remaining r-1 positions must come from the s symbols 0, 1, 2, …, s-1, so it's (s choose r-1), where (s choose r-1) or C(s,r-1) is the binomial coefficient denoting the number of ways of choosing r-1 objects from s objects. As this is true for all s, the first symbol of the ith combination is the smallest s such that
Σ_{k=0}^{s} C(k, r-1) ≥ i.
Once you know the first symbol, the problem reduces to finding the (i - Σ_{k=0}^{s-1} C(k, r-1))-th combination of r-1 symbols, where we've subtracted those combinations that start with a symbol less than s.
Code
Python code (you can write C(n,r) more efficiently, but this is fast enough for us):
#!/usr/bin/env python
tC = {}  # memo table for binomial coefficients

def C(n,r):
    if (n,r) in tC: return tC[(n,r)]
    if r>n-r: r=n-r
    if r<0: return 0
    if r==0: return 1
    tC[(n,r)] = C(n-1,r) + C(n-1,r-1)
    return tC[(n,r)]

def combination(r, k):
    '''Finds the kth combination of r letters.'''
    if r==0: return []
    total = 0  # combinations skipped so far
    s = 0
    while True:
        if total + C(s,r-1) < k:
            total += C(s,r-1)
            s += 1
        else:
            return [s] + combination(r-1, k-total)

def Func(N, r, i, s): return s in combination(r, i)

for i in range(1, 20): print(combination(3, i))
print(combination(500, 10000000000000000000000000000000000000000000000000000000000000000))
Note how fast this is: it finds the 10000000000000000000000000000000000000000000000000000000000000000th combination of 500 letters (it starts with 542) in less than 0.5 seconds.
I have written a class to handle common functions for working with the binomial coefficient, which is the type of problem that your problem falls under. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.
This class can easily be applied to your problem. If you have the rank (or index) into the binomial coefficient table, then simply call the class method that returns the K-indexes in an array. Then, loop through that returned array to see if any of the K-index values match the value you have. Pretty straightforward...

Number base conversion as a stream operation

Is there a way, in constant working space, to do arbitrary-size and arbitrary-base conversions? That is, to convert a sequence of n numbers in the range [1,m] to a sequence of ceiling(n*log(m)/log(p)) numbers in the range [1,p] using a 1-to-1 mapping that (preferably but not necessarily) preserves lexicographical order and gives sequential results?
I'm particularly interested in solutions that are viable as a pipe function, i.e. that are able to handle larger datasets than can be stored in RAM.
I have found a number of solutions that require "working space" proportional to the size of the input but none yet that can get away with constant "working space".
Does dropping the sequential constraint make any difference? That is: allow lexicographically sequential inputs to result in non lexicographically sequential outputs:
F(1,2,6,4,3,7,8) -> (5,6,3,2,1,3,5,2,4,3)
F(1,2,6,4,3,7,9) -> (5,6,3,2,1,3,5,2,4,5)
some thoughts:
might this work?
streamBase_n -> convert(n, lcm(n,p)) -> convert(lcm(n,p), p) -> streamBase_p
(where lcm is least common multiple)
I don't think it's possible in the general case. If m is a power of p (or vice-versa), or if they're both powers of a common base, you can do it, since each group of log_m(p) digits is then independent. However, in the general case, suppose you're converting the number a_1 a_2 a_3 ... a_n. The equivalent number in base p is
sum(a_i * m^(i-1) for i in 1..n)
If we've processed the first i digits, then we have the ith partial sum. To compute the (i+1)th partial sum, we need to add a_{i+1} * m^i. In the general case, this number is going to have non-zero digits in most places, so we'll need to modify all of the digits we've processed so far. In other words, we'll have to process all of the input digits before we'll know what the final output digits will be.
In the special case where m and p are both powers of a common base, or equivalently if log_m(p) is a rational number, then m^i will only have a few non-zero digits in base p near the front, so we can safely output most of the digits we've computed so far.
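The easiest instance of that special case is m = p^k, where each input digit expands to exactly k output digits and nothing needs to be remembered between digits. A minimal generator sketch, assuming 0-based digits given most significant first (the question uses the range [1,m]; shifting by one is left out here) and a made-up function name:

```python
def expand_digits(digits, p, k):
    """Stream base-(p**k) digits into base-p digits, k outputs per input.

    Working space is O(k), independent of the length of the stream.
    """
    for d in digits:
        chunk = []
        for _ in range(k):
            d, rem = divmod(d, p)
            chunk.append(rem)
        # divmod produced the least significant digit first; reverse it.
        yield from reversed(chunk)


# Hex 192FE -> binary, one hex digit at a time.
hex_digits = [0x1, 0x9, 0x2, 0xF, 0xE]
print("".join(map(str, expand_digits(hex_digits, 2, 4))))
# 00011001001011111110
```

The general "both are powers of a common base" case needs a small carry-over buffer between digits, which this sketch does not cover.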
I think there is a way of doing radix conversion in a stream-oriented fashion in lexicographic order. However, what I've come up with isn't sufficient for actually doing it, and it has a couple of assumptions:
The length of the positional numbers are already known.
The numbers described are integers. I've not considered what happens with the maths and negative indices.
We have a sequence of values a of length p, where each value is in the range [0,m-1]. We want a sequence of values b of length q in the range [0,n-1]. We can work out the kth digit of our output sequence b from a as follows:
b_k = floor[ sum(a_i * m^i for i in 0 to p-1) / n^k ] mod n
Let's rearrange that sum into two parts, splitting it at an arbitrary point z
b_k = floor[ ( sum(a_i * m^i for i in z to p-1) + sum(a_i * m^i for i in 0 to z-1) ) / n^k ] mod n
Suppose that we don't yet know the values of a between [0,z-1] and can't compute the second sum term. We're left with having to deal with ranges. But that still gives us information about b_k.
The minimum value b_k can be is:
b_k >= floor[ sum(a_i * m^i for i in z to p-1) / n^k ] mod n
and the maximum value b_k can be is:
b_k <= floor[ ( sum(a_i * m^i for i in z to p-1) + m^z - 1 ) / n^k ] mod n
We should be able to do a process like this:
1. Initialise z to be p. We will count down from p as we receive each character of a.
2. Initialise k to the index of the most significant value in b. If my brain is still working, ceil[ log_n(m^p) ].
3. Read a value of a. Decrement z.
4. Compute the min and max value for b_k.
5. If the min and max are the same, output b_k, and decrement k. Goto 4. (It may be possible that we already have enough values for several consecutive values of b_k)
6. If z!=0 then we expect more values of a. Goto 3.
7. Hopefully, at this point we're done.
I've not considered how to efficiently compute the range values as yet, but I'm reasonably confident that computing the sum from the incoming characters of a can be done much more reasonably than storing all of a. Without doing the maths, though, I won't make any hard claims about it!
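The range test at the heart of that process can be sketched as below. Two caveats: the sketch compares the floor quotients before taking mod n, which is a slightly stricter test than comparing the two bounds after the mod, and it keeps the whole partial sum as a Python big integer, so it illustrates the decision rule rather than a constant-space implementation. The function name and argument order are made up.

```python
def digit_if_determined(high_digits, p, m, n, k):
    """Return output digit b_k if the digits seen so far already fix it.

    high_digits holds a_{p-1}, ..., a_z (most significant first) of a
    p-digit base-m number; the remaining z digits are still unknown.
    Returns None when b_k could still change.
    """
    z = p - len(high_digits)
    low = 0
    for d in high_digits:
        low = low * m + d
    low *= m ** z              # unknown digits all 0
    high = low + m ** z - 1    # unknown digits all m-1
    if low // n ** k == high // n ** k:
        return (low // n ** k) % n
    return None


# Base 16 -> base 10, two-digit input whose first digit is F (240..255):
print(digit_if_determined([15], 2, 16, 10, 2))  # 2
print(digit_if_determined([15], 2, 16, 10, 1))  # None (could be 4 or 5)
```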
Yes, it is possible
For every I character(s) you read in, you will write out O character(s)
based on Ceiling(Length * log(In) / log(Out)).
Allocate enough space
Set x to 1
Loop over digits from end to beginning # Horner's method
    Set a to x * digit
    Set t to O - 1
    Loop while a > 0 and t >= 0
        Set a to a + out digit at position t
        Set out digit at position t to a mod to base
        Set a to a / to base
        Set t to t - 1
    Set x to x * from base
Return converted digit(s)
Thus, for base 16 to 2 (which is easy), using "192FE" we read '1' and convert it, then repeat on '9', then '2' and so on giving us '0001', '1001', '0010', '1111', and '1110'.
Note that for bases that are not common powers, such as base 17 to base 2, this would mean reading 1 character and writing 5.

Resources