I've noticed recently that there are a great many algorithms out there based in part or in whole on clever uses of numbers in creative bases. For example:
Binomial heaps are based on binary numbers, and the more complex skew binomial heaps are based on skew binary numbers.
Some algorithms for generating lexicographically ordered permutations are based on the factoradic number system.
Tries can be thought of as trees that look at one digit of the string at a time, for an appropriate base.
Huffman encoding trees are designed to have each edge in the tree encode a zero or one in some binary representation.
Fibonacci coding is used in Fibonacci search and to invert certain types of logarithms.
My question is: what other algorithms are out there that use a clever number system as a key step of their intuition or proof? I'm thinking about putting together a talk on the subject, so the more examples I have to draw from, the better.
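To make the factoradic example above concrete, here is a small Python sketch (kth_permutation is a hypothetical helper of my own, not drawn from any particular source) that reads the k-th lexicographic permutation directly off the factorial-base digits of k:

import math

# The i-th factoradic digit of k ranges over 0..i; each digit picks
# which of the remaining items comes next in the permutation.
def kth_permutation(items, k):
    items = list(items)
    result = []
    for i in range(len(items) - 1, -1, -1):
        digit, k = divmod(k, math.factorial(i))  # next factoradic digit
        result.append(items.pop(digit))
    return result

print(["".join(kth_permutation("abc", k)) for k in range(6)])
# ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']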
Chris Okasaki has a very good chapter in his book Purely Functional Data Structures that discusses "Numerical Representations": essentially, take some representation of a number and convert it into a data structure. To give a flavor, here are the sections of that chapter:
Positional Number Systems
Binary Numbers (Binary Random-Access Lists, Zeroless Representations, Lazy Representations, Segmented Representations)
Skew Binary Numbers (Skew Binary Random Access Lists, Skew Binomial Heaps)
Trinary and Quaternary Numbers
Some of the best tricks, distilled:
Distinguish between dense and sparse representations of numbers (usually you see this in matrices or graphs, but it's applicable to numbers too!)
Redundant number systems (systems that have more than one representation of a number) are useful.
If you arrange the first digit to be non-zero or use a zeroless representation, retrieving the head of the data structure can be efficient.
Avoid cascading borrows (from taking the tail of the list) and carries (from consing onto the list) by segmenting the data structure; the sketch below shows the flavor of this on skew binary numbers.
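A hedged Python sketch of the constant-time skew binary increment (the representation behind skew binomial heaps), assuming the usual invariant that only the lowest non-zero digit may be 2:

# Skew binary: digit i has weight 2^(i+1) - 1; digits are 0, 1, or 2,
# and only the lowest non-zero digit may be 2. Incrementing never
# cascades more than one position.
def skew_increment(digits):
    """digits[0] is the least significant digit."""
    digits = list(digits)
    if not digits:
        return [1]
    for i, d in enumerate(digits):
        if d == 2:                       # 2 * w(i) + 1 == w(i + 1)
            digits[i] = 0
            if i + 1 == len(digits):
                digits.append(1)
            else:
                digits[i + 1] += 1
            return digits
        if d != 0:                       # lowest non-zero digit is a 1
            break
    digits[0] += 1
    return digits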
Here is also the reference list for that chapter:
Guibas, McCreight, Plass, and Roberts: A new representation for linear lists.
Myers: An applicative random-access stack.
Carlsson, Munro, and Poblete: An implicit binomial queue with constant insertion time.
Kaplan and Tarjan: Purely functional lists with catenation via recursive slow-down.
"Ternary numbers can be used to convey
self-similar structures like a
Sierpinski Triangle or a Cantor set
conveniently." source
"Quaternary numbers are used in the
representation of 2D Hilbert curves." source
"The quater-imaginary numeral system
was first proposed by Donald Knuth in
1955, in a submission to a high-school
science talent search. It is a
non-standard positional numeral system
which uses the imaginary number 2i as
its base. It is able to represent
every complex number using only the
digits 0, 1, 2, and 3." source
"Roman numerals are a biquinary system." source
"Senary may be considered useful in the
study of prime numbers since all
primes, when expressed in base-six,
other than 2 and 3 have 1 or 5 as the
final digit." source
"Sexagesimal (base 60) is a numeral
system with sixty as its base. It
originated with the ancient Sumerians
in the 3rd millennium BC, it was
passed down to the ancient
Babylonians, and it is still used — in
a modified form — for measuring time,
angles, and the geographic coordinates
that are angles." source
etc...
This list is a good starting point.
I read your question the other day, and today I was faced with a problem: How do I generate all partitionings of a set? The solution that occurred to me, and that I used (maybe due to reading your question), was this:
For a set with n elements, where I need p partitions, count through all n-digit numbers in base p.
Each number corresponds to a partitioning. Each digit corresponds to an element in the set, and the value of the digit tells you which partition to put the element in.
It's not amazing, but it's neat. It's complete, causes no redundancy, and uses arbitrary bases. The base you use depends on the specific partitioning problem.
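A minimal Python sketch of that counting idea, assuming the p blocks are labeled (renaming blocks counts as a different partitioning) and that empty blocks are allowed:

def partitionings(elements, p):
    n = len(elements)
    for number in range(p ** n):
        # Each base-p digit of `number` assigns one element to a block.
        blocks = [[] for _ in range(p)]
        for element in elements:
            number, digit = divmod(number, p)
            blocks[digit].append(element)
        yield blocks

for blocks in partitionings(["a", "b", "c"], 2):
    print(blocks)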
I recently came across a cool algorithm for generating subsets in lexicographical order based on the binary representations of the numbers between 0 and 2^n - 1. It uses the numbers' bits both to determine what elements should be chosen for the set and to locally reorder the generated sets to get them into lexicographical order. If you're curious, I have a writeup posted here.
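A minimal sketch of the bit-selection half of that idea (the local reordering step from the writeup is not reproduced here, so this yields subsets in plain counting order):

def subsets(items):
    n = len(items)
    for mask in range(2 ** n):
        # Bit i of `mask` decides whether items[i] is in this subset.
        yield [items[i] for i in range(n) if mask >> i & 1]

print(list(subsets("abc")))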
Also, many algorithms are based on scaling (such as the weakly-polynomial version of the Ford-Fulkerson max-flow algorithm), which use the binary representation of the numbers in the input problem to progressively refine a rough approximation into a complete solution.
Not exactly a clever base system but a clever use of the base system: Van der Corput sequences are low-discrepancy sequences formed by reversing the base-n representation of numbers. They're used to construct the 2-D Halton sequences.
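A quick Python sketch, assuming any base b >= 2: the k-th term mirrors the base-b digits of k across the radix point.

def van_der_corput(k, base=2):
    value, denom = 0.0, 1.0
    while k > 0:
        k, digit = divmod(k, base)   # peel off the lowest base-b digit
        denom *= base
        value += digit / denom       # ...and place it after the point
    return value

print([van_der_corput(k) for k in range(8)])
# [0.0, 0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]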
I vaguely remember something about double base systems for speeding up some matrix multiplication.
A double-base system is a redundant system that uses two bases for one number:

n = sum(i = 1 to l) of c_i * 2^{a_i} * 3^{b_i}, where each c_i is in {-1, 1}

Redundant means that one number can be specified in many ways.
You can look for the article "Hybrid Algorithm for the Computation of the Matrix Polynomial" by Vassil Dimitrov and Todor Cooklev.
Trying to give the best short overview I can.
They were trying to compute the matrix polynomial G(N,A) = I + A + ... + A^{N-1}.
Supposing N is composite, G(N,A) = G(J,A) * G(K, A^J) where N = J * K; if we apply this for J = 2, we get:
         / (I + A) * G(K, A^2),        if N = 2K
G(N,A) = |
         \ I + (A + A^2) * G(K, A^2),  if N = 2K + 1
also,
         / (I + A + A^2) * G(K, A^3),               if N = 3K
G(N,A) = | I + (A + A^2 + A^3) * G(K, A^3),         if N = 3K + 1
         \ I + A + A^2 * (I + A + A^2) * G(K, A^3), if N = 3K + 2
As it's "obvious" (jokingly) that some of these equations are fast in the first system and some better in the second - so it is a good idea to choose the best of those depending on N. But this would require fast modulo operation for both 2 and 3. Here's why the double base comes in - you can basically do the modulo operation fast for both of them giving you a combined system:
         / (I + A + A^2) * G(K, A^3),        if N = 0 or 3 mod 6
G(N,A) = | I + (A + A^2 + A^3) * G(K, A^3),  if N = 1 or 4 mod 6
         | (I + A) * G(3K + 1, A^2),         if N = 2 mod 6
         \ I + (A + A^2) * G(3K + 2, A^2),   if N = 5 mod 6
Look at the article for a better explanation, as I'm not an expert in this area.
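To show just the mod-6 dispatch, here is a scalar Python sketch of the combined system, with plain numbers standing in for matrices (my own paraphrase, not code from the article):

# g(N, a) = 1 + a + ... + a^(N-1), computed with the combined
# base-2/base-3 recursion, dispatching on N mod 6.
def g(N, a):
    if N == 0:
        return 0
    if N == 1:
        return 1
    r = N % 6
    if r in (0, 3):                                     # N = 3K
        return (1 + a + a**2) * g(N // 3, a**3)
    if r in (1, 4):                                     # N = 3K + 1
        return 1 + (a + a**2 + a**3) * g(N // 3, a**3)
    if r == 2:                                          # N = 6K + 2
        return (1 + a) * g(3 * (N // 6) + 1, a**2)
    return 1 + (a + a**2) * g(3 * (N // 6) + 2, a**2)   # N = 6K + 5

assert g(10, 2) == sum(2**i for i in range(10))  # 1023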
Radix sort can use various number bases:
http://en.wikipedia.org/wiki/Radix_sort
It's a pretty interesting variation on bucket sort.
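A minimal LSD radix sort sketch for non-negative integers, with the base left as a tunable parameter:

def radix_sort(nums, base=10):
    if not nums:
        return nums
    max_val = max(nums)
    place = 1
    while place <= max_val:
        # Distribute by the current base-`base` digit, then recollect;
        # stability of the buckets makes the passes compose correctly.
        buckets = [[] for _ in range(base)]
        for x in nums:
            buckets[(x // place) % base].append(x)
        nums = [x for bucket in buckets for x in bucket]
        place *= base
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))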
Here is a good post on using ternary numbers to solve the "counterfeit coin" problem (where you have to detect a single counterfeit coin in a bag of regular ones, using a balance as few times as possible).
String hashing (e.g., in the Rabin-Karp algorithm) often evaluates the string as a base-b number consisting of n digits (where n is the length of the string, and b is some chosen base that is large enough). For example, the string "ABCD" can be hashed as:
'A'*b^3+'B'*b^2+'C'*b^1+'D'*b^0
Substituting ASCII values for the characters and taking b to be 256, this becomes
65*256^3+66*256^2+67*256^1+68*256^0
Though, in most practical applications, the resulting value is taken modulo some reasonably sized number to keep the result sufficiently small.
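A small sketch of that polynomial hash with an arbitrarily chosen modulus (10**9 + 7 here, purely for illustration) applied at each step to keep values small:

def poly_hash(s, base=256, mod=10**9 + 7):
    # Horner's rule: h = (...((s[0]*b + s[1])*b + s[2])...) mod `mod`.
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h

print(poly_hash("ABCD"))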
Exponentiation by squaring is based on binary representation of the exponent.
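For instance, a minimal sketch where the exponent's binary digits drive the loop:

def power(base, exponent):
    result = 1
    while exponent > 0:
        if exponent & 1:          # current low bit of the exponent is 1
            result *= base
        base *= base              # square for the next binary digit
        exponent >>= 1
    return result

print(power(3, 13))  # 1594323 == 3**13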
In Hacker's Delight (a book every programmer should know, in my eyes) there is a complete chapter about unusual bases, like base -2 (yes, really, negative bases) or base -1+i (with i the imaginary unit, sqrt(-1)).
There is also a nice calculation of what the best base is (in terms of hardware design). For those who don't want to read it: the solution of the equation is e, so you could go with 2 or 3; 3 would be a little bit better (a factor of 1.056 times better than 2), but 2 is technically more practical.
Other things which come to mind are Gray code counters (when you count in this system, only 1 bit changes at a time; this property is often used in hardware design to reduce metastability issues) and the generalization of the already-mentioned Huffman encoding: arithmetic encoding.
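A hedged sketch of conversion into one of those unusual bases, base -2 (negabinary):

def to_negabinary(n):
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, rem = divmod(n, -2)
        if rem < 0:        # force the remainder into {0, 1}
            n += 1
            rem += 2
        digits.append(str(rem))
    return "".join(reversed(digits))

print(to_negabinary(6))   # "11010", since 16 - 8 - 2 = 6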
Cryptography makes extensive use of integer rings (modular arithmetic) and also finite fields, whose operations are intuitively based on the way polynomials with integer coefficients behave.
I really like this one for converting binary numbers into Gray codes: http://www.matrixlab-examples.com/gray-code.html
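The standard conversion is a one-liner: XOR the number with itself shifted right by one.

def binary_to_gray(n):
    return n ^ (n >> 1)

for n in range(8):
    print(f"{n:03b} -> {binary_to_gray(n):03b}")
# Consecutive outputs differ in exactly one bit.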
Great question. The list is long indeed.
Telling time is a simple instance of mixed bases (days | hours | minutes | seconds | am/pm).
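A small sketch of mixed-radix decomposition using that example (am/pm omitted for simplicity):

def to_mixed_radix(n, radices=(24, 60, 60)):
    """Convert a count of seconds into (days, hours, minutes, seconds)."""
    digits = []
    for r in reversed(radices):
        n, d = divmod(n, r)
        digits.append(d)
    digits.append(n)  # the most significant "digit" (days) is unbounded
    return tuple(reversed(digits))

print(to_mixed_radix(93784))  # (1, 2, 3, 4): 1 day, 2 h, 3 min, 4 s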
I've created a meta-base enumeration n-tuple framework if you're interested in hearing about it. It's some very sweet syntactic sugar for base numbering systems. It's not released yet. Email my username (at gmail).
One of my favourites using base 2 is arithmetic encoding. It's unusual because the heart of the algorithm uses representations of numbers between 0 and 1 in binary.
Maybe AKS is another case.
I am reading the book Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. In the second chapter, under "Analyzing Algorithms", it is mentioned that:
We also assume a limit on the size of each word of data. For example, when working with inputs of size n, we typically assume that integers are represented by c lg n bits for some constant c >= 1. We require c >= 1 so that each word can hold the value of n, enabling us to index the individual input elements, and we restrict c to be a constant so that the word size doesn't grow arbitrarily. (If the word size could grow arbitrarily, we could store huge amounts of data in one word and operate on it all in constant time - clearly an unrealistic scenario.)
My questions are: why do we assume that each integer is represented by c lg n bits, and how does c >= 1 allow us to index the individual input elements?
First, by lg they mean log base 2, so lg n is (roughly) the number of bits in n.
What they are saying is that if they have an algorithm that takes a list of numbers (I am being more specific in my example to help make it easier to understand) like 1, 2, 3, ..., n, then they assume that:
a "word" in memory is big enough to hold any of those numbers.
a "word" in memory is not big enough to hold all the numbers (in one single word, packed in somehow).
when calculating the number of "steps" in an algorithm, an operation on one "word" takes one step.
The reason they are doing this is to keep the analysis realistic (you can only store numbers up to some size in "native" types; after that you need to switch to arbitrary-precision libraries) without choosing a particular example (like 32-bit integers) that might be inappropriate in some cases, or become outdated.
You need at least lg n bits to represent an integer of size n, so that's a lower bound on the word size needed to store inputs of size n. Requiring c >= 1 respects that lower bound: if the constant multiplier were less than 1, a word wouldn't have enough bits to hold n.
This is a simplifying step in the RAM model. It allows you to treat each individual input value as though it were accessible in a single slot (or "word") of memory, instead of worrying about complications that might arise otherwise. (Loading, storing, and copying values of different word sizes would take differing amounts of time if we used a model that allowed varying word lengths.) This is what's meant by "enabling us to index the individual input elements." Each input element of the problem is assumed to be accessible at a single address, or index (meaning it fits in one word of memory), simplifying the model.
This question was asked very long ago and the explanations really helped me, but I feel like there could still be a little more clarification about where the lg n comes from. For me, talking through things really helps:
Let's choose a random number in base 10, like 27; we need 5 bits to store this. Why? Because 27 is 11011 in binary. Notice that 11011 has 5 digits, and each 'digit' is what we call a bit, hence 5 bits.
Think of each bit as a slot. In binary, each of those slots can hold a 0 or a 1. What's the largest number I can store with 5 bits? The largest number fills every slot: 11111.
11111 = 31 = 2^5 - 1, so with 5 bits we can store anything up to 31, and 31 is 2^5 - 1 (the largest 5-bit number).
Generally (and I will use very explicit names for clarity):
maxNumToStore = 2 ^ numBitsNeeded - 1
Since log is the mathematical inverse of exponentiation, we get:
numBitsNeeded = log2(numToStore + 1)
Since this is likely not to result in an integer, we use ceil to round our answer up. So, applying our example to find how many bits are needed to store the number 27:
log2(27 + 1) = 4.807354922057604, which rounds up to 5 bits (and for 31, log2(32) = 5 exactly).
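A tiny Python check of that formula:

import math

# numBitsNeeded = ceil(log2(numToStore + 1)), with a floor of 1 bit.
def bits_needed(n):
    return max(1, math.ceil(math.log2(n + 1)))

for n in (27, 31, 32):
    print(n, bin(n), bits_needed(n))   # 27 -> 5, 31 -> 5, 32 -> 6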
I am aware of the fact that the Sieve of Eratosthenes can be implemented so that it finds primes continuously without an upper bound (the segmented sieve).
My question is, could the Sieve of Atkin/Bernstein be implemented in the same way?
Related question: C#: How to make Sieve of Atkin incremental
However the related question has only 1 answer, which says "It's impossible for all sieves", which is obviously incorrect.
Atkin/Bernstein give a segmented version in Section 5 of their original paper. Presumably Bernstein's primegen program uses that method.
In fact, one can implement an unbounded Sieve of Atkin (SoA) without using segmentation at all, as I have done here in F#. Note that this is a pure functional version that doesn't even use arrays to combine the solutions of the quadratic equations and the squarefree filter, and thus is considerably slower than a more imperative approach.
Bernstein's optimizations using lookup tables for optimum 32-bit ranges would make the code extremely complex and not suitable for presentation here, but it would be quite easy to adapt my F# code so that the sequences start at a set lower limit and are used only over a range in order to implement a segmented version, and/or to apply the same techniques to a more imperative approach using arrays.
Note that even Bernstein's implementation of the SoA isn't really faster than the Sieve of Eratosthenes with all possible optimizations as per Kim Walisch's primesieve; it is only faster than an equivalently optimized version of the Sieve of Eratosthenes for the selected range of numbers as per his implementation.
EDIT_ADD: For those who do not want to wade through Bernstein's pseudo-code and C code, I am adding to this answer a pseudo-code method for using the SoA over a range from LOW to HIGH, where the delta from LOW to HIGH + 1 might be constrained to an even modulo 60 in order to use the modulo (and potential bit packing to only the entries on the 2,3,5 wheel) optimizations.
This is based on a possible implementation using the SoA quadratics (4*x^2 + y^2), (3*x^2 + y^2), and (3*x^2 - y^2), expressed as sequences of numbers with the x value for each sequence fixed, where x ranges from one up to SQRT((HIGH - 1) / 4) and SQRT((HIGH - 1) / 3) for the first two and, for the third, up to the positive root of 2*x^2 + 2*x - HIGH - 1 = 0, namely x = (SQRT(1 + 2 * (HIGH + 1)) - 1) / 2, with the sequences expressed in my F# code as per the top link. Optimizations to the sequences there use the fact that, when sieving for only odd composites, the "4x" sequences need only odd values of y, and the "3x" sequences need only odd values of y when x is even, and vice versa. Further optimizations reduce the number of solutions to the quadratic equations (= elements in the sequences) by observing that the modulo patterns over the above sequences repeat over very small ranges of x, and also repeat over ranges of y of only 30; this is used in the Bernstein code but is not (yet) implemented in my F# code.
I also do not include the well-known optimizations that could be applied to the prime "squarefree" culling, such as using wheel factorization and the calculations for the starting segment address, as I use in my implementations of a segmented SoE.
So for purposes of calculating the sequence starting segment addresses for the "4x", "3x+", and "3x-" (or with "3x+" and "3x-" combined as I do in the F# code), and having calculated the ranges of x for each as per the above, the pseudo-code is as follows:
Calculate the range LOW - FIRST_ELEMENT, where FIRST_ELEMENT is the element with the lowest applicable value of y for each given value of x, or y = x - 1 for the case of the "3x-" sequence.
For the job of calculating how many elements are in this range, this boils down to the question of how many of (y1)^2 + (y2)^2 + (y3)^2 + ... there are, where each y number is separated by two, to produce even or odd y's as required. As usual in square sequence analysis, we observe that differences between squares have a constantly increasing increment: delta(9 - 1) is 8, delta(25 - 9) is 16 for an increase of 8, delta(49 - 25) is 24 for a further increase of 8, etcetera, so that for n elements the last increment is 8 * n in this example. Expressing the sequence of elements using this, we get that it is one (or whatever one chooses as the first element) plus eight times a sequence like (1 + 2 + 3 + ... + n). Now standard reduction of linear sequences applies: this sum is (n + 1) * n / 2, or n^2/2 + n/2. We can then solve for how many elements n there are in the range by solving the quadratic equation n^2/2 + n/2 - range = 0, giving n = (SQRT(8*range + 1) - 1) / 2.
Now, if FIRST_ELEMENT + 4 * (n + 1) * n does not equal LOW as the starting address, add one to n and use FIRST_ELEMENT + 4 * (n + 2) * (n + 1) as the starting address. If one uses further optimizations to apply wheel factorization culling to the sequence pattern, lookup table arrays can be used to find the closest value of n that satisfies the conditions.
The modulo 12 or modulo 60 residue of the starting element can be calculated directly, or can be produced by use of lookup tables based on the repeating nature of the modulo sequences.
Each sequence is then used to toggle the composite states up to the HIGH limit. If the additional logic is added to the sequences to jump values between only the applicable elements per sequence, no further use of modulo conditions is necessary.
The above is done for every "4x" sequence followed by the "3x+" and "3x-" sequences (or combine "3x+" and "3x-" into just one set of "3x" sequences) up to the x limits as calculated earlier or as tested per loop.
And there you have it: given an appropriate method of dividing the sieve range into segments (best done as fixed sizes related to the CPU cache sizes for memory access efficiency), a method of segmenting the SoA just as used by Bernstein, but somewhat simpler in expression, since the modulo operations and bit packing are mentioned but not combined here.
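For reference, here is a compact, unsegmented Python sketch of the SoA in its textbook (mod-12) form; a segmented version along Bernstein's lines would restrict x and y so that each quadratic lands in [LOW, HIGH] per segment:

def atkin(limit):
    is_prime = [False] * (limit + 1)
    root = int(limit ** 0.5) + 1
    for x in range(1, root):
        for y in range(1, root):
            n = 4 * x * x + y * y
            if n <= limit and n % 12 in (1, 5):
                is_prime[n] = not is_prime[n]     # toggle per solution
            n = 3 * x * x + y * y
            if n <= limit and n % 12 == 7:
                is_prime[n] = not is_prime[n]
            n = 3 * x * x - y * y
            if x > y and n <= limit and n % 12 == 11:
                is_prime[n] = not is_prime[n]
    for n in range(5, root):                       # squarefree culling
        if is_prime[n]:
            for k in range(n * n, limit + 1, n * n):
                is_prime[k] = False
    return [2, 3] + [n for n in range(5, limit + 1) if is_prime[n]]

print(atkin(100))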
Two binary numbers can be represented in the usual "regular, redundant" representation (i.e. introduce another digit, say 2, to obtain a non-unique representation such that any two consecutive 2's have a zero in between), so that addition becomes carry-free. I have heard that the complexity is O(k), where k is the length of the shorter of the two numbers. But what is the algorithm itself? It doesn't seem to appear on the web anywhere. I know you can add 1 to such a representation in constant time so that the result maintains regularity. But I don't know how to generalize this.
I see this is an old post, and the poster does not have much recent activity here but thought I'd put forward the answer anyway.
In order to represent this circuit as a traditional equation, let's set forth some notation. Each 'bit' in RBR notation actually consists of two bits, so to refer to these right and left bits, I will use [0] and [1] respectively. To refer to a certain 'bit' position I will use braces {0},{1},{2},...{n}.
Addition of two or three single bits can result in a two-bit sum (the MSB is traditionally called the carry bit). These can also be referenced by [0] and [1], the latter being the carry bit. For example: (0+1+1)[0] = 0, (0+1+1)[1] = 1, (0+0+1)[0] = 1, (0+0+1)[1] = 0.
Now, without much further ado, the general algorithm for adding numbers z = x + y is given by:
z{n}[0] = ((x{n-1}[1] + x{n-1}[0] + y{n-1}[1])[0] + (y{n-1}[0]) + (x{n-2}[1] + x{n-2}[0] + y{n-2}[1])[1])[1]
z{n}[1] = ((x{n}[1] + x{n}[0] + y{n}[1])[0] + (y{n}[0]) + (x{n-1}[1] + x{n-1}[0] + y{n-1}[1])[1])[0]
You will note that there is some carrying going on here, but the algorithm achieves O(n) because the carrying is limited to two orders. Also note the special considerations for z{0} and z{1}, which are defined in the circuit diagram in the aforementioned link.
Out of pure interest, I'm curious how to generate pi sequentially, so that instead of the number being produced only after the process completes, the digits are displayed as the process itself runs. If this is possible, then the number could produce itself, and I could implement garbage collection on previously seen digits, thus creating an infinite series. The desired outcome is just a digit being generated every second that follows the series of pi.
Here's what I've found sifting through the internets:
This is the popular computer-friendly algorithm, a Machin-like formula:
# Taylor series for arccot: arccot(x) = 1/x - 1/(3x^3) + 1/(5x^5) - ...,
# computed in fixed point scaled by `unity`.
def arccot(x, unity)
  xpow = unity / x
  n = 1
  sign = 1
  sum = 0
  loop do
    term = xpow / n
    break if term == 0
    sum += sign * term
    xpow /= x * x
    n += 2
    sign = -sign
  end
  sum
end

# Machin's formula: pi = 4 * (4 * arccot(5) - arccot(239)),
# with a few extra "fudge" digits to absorb rounding error.
def calc_pi(digits = 10000)
  fudge = 10
  unity = 10**(digits + fudge)
  pi = 4 * (4 * arccot(5, unity) - arccot(239, unity))
  pi / (10**fudge)
end

digits = (ARGV[0] || 10000).to_i
p calc_pi(digits)
To expand on "Moron's" answer: What the Bailey-Borwein-Plouffe formula does for you is that it lets you compute binary (or equivalently hex) digits of pi without computing all of the digits before it. This formula was used to compute the quadrillionth bit of pi ten years ago. It's a 0. (I'm sure that you were on the edge of your seat to find out.)
This is not the same thing as a low-memory, dynamic algorithm to compute the bits or digits of pi, which I think is what you mean by "sequentially". I don't think that anyone knows how to do that in base 10 or in base 2, although the BBP algorithm can be viewed as a partial solution.
Well, some of the iterative formulas for pi are also sort of like a sequential algorithm, in the sense that there is an iteration that produces more digits with each round. However, it's also only a partial solution, because typically the number of digits doubles or triples with each step. So you'd wait at a given number of digits for a while, and then whoosh, a lot more digits come quickly.
In fact, I don't know of any low-memory, efficient algorithm to produce the digits of any standard irrational number. Even for e, you'd think that the standard infinite series is an efficient formula and that it's low-memory. But it only looks low-memory at the beginning, and actually there are also faster algorithms to compute many digits of e.
Perhaps you can work with hexadecimal? David Bailey, Peter Borwein and Simon Plouffe discovered a formula for the nth digit after the decimal, in the hexadecimal expansion of pi.
The formula is:

pi = SUM(k = 0 to infinity) of 1/16^k * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6))

(source: sciencenews.org)
You can read more about it here: http://www.andrews.edu/~calkins/physics/Miracle.pdf
The question of whether such a formula exists for base 10 is still open.
More info: http://www.sciencenews.org/sn_arc98/2_28_98/mathland.htm
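As a flavor, here is a hedged Python sketch that sums the BBP series in fixed point to recover pi's leading hexadecimal digits (a plain summation, not the digit-extraction trick that jumps straight to the nth digit):

def pi_hex(n_digits):
    scale = 16 ** (n_digits + 2)          # two guard hex digits
    total = 0
    for k in range(n_digits + 10):
        total += (4 * scale // (8 * k + 1)
                  - 2 * scale // (8 * k + 4)
                  - scale // (8 * k + 5)
                  - scale // (8 * k + 6)) // 16 ** k
    return hex(total // 256)              # drop the guard digits

print(pi_hex(10))  # 0x3243f6a8885, i.e. pi = 3.243f6a8885... in hex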
I've read about it on a message board: the Random class isn't really random. It is created in a predictable fashion using a mathematical formula.
Is it really true? If so, is Random not really random?
Because deterministic computers are really bad at generating "true" random numbers by themselves.
Also, a predictable/repeatable random sequence is often surprisingly useful, since it helps in testing.
It's really hard to create something that is absolutely random. See the Wikipedia articles on randomness and pseudo-randomness.
As others have already said, Random creates pseudo-random numbers, depending on some seed value. It may be helpful to know that the .NET class Random has two constructors:
Random(int Seed)
creates a random number generator with a given seed value, helpful if you want reproducible behaviour of your program. On the other hand,
Random()
creates a random number generator with date-time depending seed value, which means, almost every time you start your program again, it will produce a different sequence of (pseudo-)random numbers.
The sequence is predictable for each starting seed. For different seeds, different sequences of numbers are returned. If the seed used is itself random (such as DateTime.Now.Ticks), then the numbers returned are adequately 'random'.
Alternatively, you can use a cryptographic random number generator such as the RNGCryptoServiceProvider class.
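A quick illustration of the seed-determines-sequence point, using Python's random module as a stand-in (.NET's Random(int seed) behaves analogously):

import random

a = random.Random(42)    # same seed...
b = random.Random(42)
print([a.randint(0, 99) for _ in range(5)])   # some fixed sequence
print([b.randint(0, 99) for _ in range(5)])   # ...the identical sequence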
It isn't random; it's a random-like number generating algorithm, and it's based on a seed number. If you set that seed to something like the system time, the numbers are closer to random, but if you use these numbers for, let's say, an encryption algorithm, and the attacker knows WHEN you generated the random numbers and which algorithm you use, then it is more likely that your encryption will be broken.
The only way to generate true random numbers is to measure something natural; for example, voltage levels, or a microphone picking up sounds somewhere, or something like that.
It is true, but you can always seed the random number generator with some time dependent value, or if you're really prepared to push the boat out, look at www.random.org...
In the case of the Random class, though, I think it should be random enough for most requirements... I can't see a method to actually seed it, so I'm guessing it must seed automatically as built-in behaviour...
Correct. Class Random is not absolutely, totally random. The important question is: is it as statistically close to random as you need it to be? The output from class Random is statistically as nearly random as a reasonable deterministic program can be. The algorithm uses a 48-bit seed modified by a linear congruential formula. If the Random object is created using the parameterless constructor, the 48 low-order bits of the current time in milliseconds are used as the seed. If the Random object is created using the seed parameter (a long), the 48 low-order bits of the long are used as the seed.
If a Random instance is created with the same seed and the exact same sequence of calls is made on it, the exact same sequence of values will result from that instance. This is deliberate, to allow for predictable software testing and demonstrations. Ordinarily, Random is not used with a constant seed in operational use, since it is usually used to get unpredictable pseudo-random sequences. If two instances of Random with the parameterless constructors are created in the same clock millisecond, they will also produce the same sequences. It is important to note that, eventually, a Random instance will repeat its pattern. Therefore, a Random instance should not be used for enormously long sequences before creating a new instance.
There is no reason not to use the Random class except for high-security cryptographic applications or some special need where some aspect of true randomness is of paramount importance, which is uncommon. In those cases, you really need a hardware randomizer that uses radioactive decay or infinitesimal molecular-level Brownian-motion-induced randomness to generate a random result. Sun SPARC hardware platforms had such hardware installable. Other platforms can have them too, along with the hardware drivers that give access to the randomness they generate.
The algorithm used in class Random is the result of considerable research by some of the best minds in computer science and mathematics. Given the right parameters, it provides remarkable and outstanding results. Other, more recent algorithms may be better for some limited applications, but they also have performance or application-specific issues that make them less suitable for general-purpose use. The linear congruential algorithm still remains one of the most widely used general-purpose pseudo-random number generators.
The following quote is from Donald Knuth's book, The Art of Computer Programming, Volume 2, Seminumerical Algorithms, Section 3.2.1. The quote describes the linear congruential method and discusses its properties. If you don't know who Donald Knuth is or have never read any of his papers or books: he, amongst other things, showed that there can be no sort faster than Tony Hoare's Quicksort with the partition pivot strategies created by Robert Sedgewick. Robert Sedgewick, who suggested the best simple pivot selection strategies for Quicksort, did his doctoral thesis on Quicksort under Donald Knuth's supervision. Knuth's multi-volume work, The Art of Computer Programming, is one of the greatest expositions of the most important theoretical aspects of computing ever assembled, including sorting, searching, and randomizing algorithms. There is a lot of discussion in Chapter 3 about what randomness really is, statistically and philosophically, and about software that emulates true randomness to the point where it is statistically nearly indistinguishable from it for very large, but still finite, sequences. What follows is pretty heavy reading:
3.2.1. The Linear Congruential Method
By far the most popular random number generators in use today are special cases of the following scheme, introduced by D. H. Lehmer in 1949. [See Proc. 2nd Symp. on Large-Scale Digital Calculating Machinery (Cambridge, Mass.: Harvard University Press, 1951), 141-146.]
We choose four magic integers:
m, the modulus; 0 < m.
a, the multiplier; 0 <= a < m.
c, the increment; 0 <= c < m.
X[0], the starting value; 0 <= X[0] < m. (equation 1)
The desired sequence of random numbers (X[n]) is then obtained by setting

X[n+1] = (a * X[n] + c) mod m, n >= 0. (equation 2)

This is called a linear congruential sequence. Taking the remainder mod m is somewhat like determining where a ball will land in a spinning roulette wheel.
For example, the sequence obtained when m == 10 and X[0] == a == c == 7 is

7, 6, 9, 0, 7, 6, 9, 0, ... . (example 3)
As this example shows, the sequence is not always "random" for all choices of m, a, c, and X[0]; the principles of choosing the magic numbers appropriately will be investigated carefully in later parts of this chapter.

Example (3) illustrates the fact that the congruential sequences always get into a loop: There is ultimately a cycle of numbers that is repeated endlessly. This property is common to all sequences having the general form X[n+1] = f(X[n]), when f transforms a finite set into itself; see exercise 3.1-6. The repeating cycle is called the period; sequence (3) has a period of length 4. A useful sequence will of course have a relatively long period.
The special case c == 0 deserves explicit mention, since the number generation process is a little faster when c == 0 than it is when c != 0. We shall see later that the restriction c == 0 cuts down the length of the period of the sequence, but it is still possible to make the period reasonably long. Lehmer's original generation method had c == 0, although he mentioned c != 0 as a possibility; the fact that c != 0 can lead to longer periods is due to Thomson [Comp. J. 1 (1958), 83, 86] and, independently, to Rotenberg [JACM 7 (1960), 75-77]. The terms multiplicative congruential method and mixed congruential method are used by many authors to denote linear congruential sequences with c == 0 and c != 0, respectively.
The letters m, a, c, and X[0] will be used throughout this chapter in the sense described above. Furthermore, we will find it useful to define

b = a - 1, (equation 4)

in order to simplify many of our formulas.
We can immediately reject the case a == 1, for this would mean that X[n] = (X[0] + n * c) mod m, and the sequence would certainly not behave as a random sequence. The case a == 0 is even worse. Hence for practical purposes we may assume that

a >= 2, b >= 1. (equation 5)
Now we can prove a generalization of Eq. (2),

X[n+k] = (a^k * X[n] + ((a^k - 1) * c / b)) mod m, k >= 0, n >= 0, (equation 6)

which expresses the (n+k)th term directly in terms of the nth term. (The special case n == 0 in this equation is worthy of note.) It follows that the subsequence consisting of every kth term of (X[n]) is another linear congruential sequence, having the multiplier a^k mod m and the increment ((a^k - 1) * c / b) mod m.
An important corollary of (6) is that the general sequence defined by m, a, c, and X[0] can be expressed very simply in terms of the special case where c == 1 and X[0] == 0. Let

Y[0] = 0, Y[n+1] = (a * Y[n] + 1) mod m. (equation 7)

According to Eq. (6) we will have Y[k] = ((a^k - 1) / b) mod m, hence the general sequence defined in (2) satisfies

X[n] = (A * Y[n] + X[0]) mod m, where A = (X[0] * b + c) mod m. (equation 8)
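A tiny Python demonstration of the method from the quote, reproducing example (3) with m == 10 and X[0] == a == c == 7:

from itertools import islice

# X[n+1] = (a * X[n] + c) mod m, yielded starting from X[0].
def lcg(m, a, c, x0):
    x = x0
    while True:
        yield x
        x = (a * x + c) % m

print(list(islice(lcg(10, 7, 7, 7), 8)))  # [7, 6, 9, 0, 7, 6, 9, 0]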