Random number generation with next and previous support?

How can I write two functions for generating random numbers that support stepping both forward and backward?
I mean two functions, next_number() and previous_number(), where next_number() generates a new random number and previous_number() returns the previously generated random number.
For example:
int next_number()
{
// ...?
}
int previous_number()
{
// ...?
}
int num;
// Forward random number generating.
// ---> 54, 86, 32, 46, 17
num = next_number(); // num = 54
num = next_number(); // num = 86
num = next_number(); // num = 32
num = next_number(); // num = 46
num = next_number(); // num = 17
// Backward random number generating.
// <--- 17, 46, 32, 86, 54
num = previous_number(); // num = 46
num = previous_number(); // num = 32
num = previous_number(); // num = 86
num = previous_number(); // num = 54

You can trivially do this with a Pseudo-Random Function (PRF).
Such functions take a key and a value, and output a pseudo-random number based on them. You'd select a key from /dev/random that remains the same for the run of the program, and then feed the function an integer that you increment to go forward or decrement to go back.
Here's an example in pseudo-code:
initialize():
    Key = sufficiently many bytes from /dev/random
    N = 0
next_number():
    N = N + 1
    return my_prf(Key, N)
previous_number():
    N = N - 1
    return my_prf(Key, N)
Strong, Pseudo-Random Functions are found in most cryptography libraries. As rici points out, you can also use any encryption function (encryption functions are pseudo-random permutations, a subset of PRFs, and the period is so huge that the difference doesn't matter).
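A minimal Python sketch of the pseudo-code above, assuming HMAC-SHA256 stands in for my_prf and os.urandom stands in for reading /dev/random. The class name, the 32-bit output width and the 16-byte counter encoding are illustrative choices, not part of the answer.

```python
import hashlib
import hmac
import os


class ReversibleGenerator:
    """Counter-mode PRF: output i is PRF(key, i), so stepping back is just i - 1."""

    def __init__(self, key=None):
        # Assumption: 32 random bytes from the OS play the role of Key.
        self.key = key if key is not None else os.urandom(32)
        self.n = 0

    def _prf(self, n):
        # Encode the (possibly negative) counter as 16 signed big-endian bytes.
        msg = n.to_bytes(16, "big", signed=True)
        digest = hmac.new(self.key, msg, hashlib.sha256).digest()
        # Keep 32 bits of the digest as the returned number.
        return int.from_bytes(digest[:4], "big")

    def next_number(self):
        self.n += 1
        return self._prf(self.n)

    def previous_number(self):
        self.n -= 1
        return self._prf(self.n)


if __name__ == "__main__":
    gen = ReversibleGenerator()
    forward = [gen.next_number() for _ in range(5)]
    backward = [gen.previous_number() for _ in range(4)]
    print(forward)
    print(backward)  # the first four forward values, in reverse order
```

Only the key and the counter need to be stored, whatever the number of values generated.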

Some linear congruential generators (a common but not very good PRNG) are reversible.
They work by next = (a * previous + c) mod m. That's reversible if a has a modular multiplicative inverse mod m. That's often the case, because m is often a power of two and a is usually odd.
For example, for the "MSVC" parameters from the table in the first link:
m = 2^32
a = 214013
c = 2531011
The reverse is:
previous = (current - 2531011) * 0xb9b33155;
With types chosen to make it work modulo 2^32.
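As a rough check of the numbers above, here is a small Python sketch that steps this LCG forward and backward. The function names are made up for illustration, and the explicit "% M" replaces the fixed-width integer types mentioned in the answer. pow(A, -1, M) needs Python 3.8 or later.

```python
M = 2**32
A = 214013
C = 2531011

# Modular multiplicative inverse of A modulo M; it exists because A is odd.
A_INV = pow(A, -1, M)


def lcg_next(state):
    """One forward step: next = (a * previous + c) mod m."""
    return (A * state + C) % M


def lcg_prev(state):
    """One backward step: previous = (current - c) * a^-1 mod m."""
    return ((state - C) * A_INV) % M


if __name__ == "__main__":
    print(hex(A_INV))  # the constant used in the answer
    x = 12345
    y = lcg_next(x)
    print(y, lcg_prev(y))  # lcg_prev undoes lcg_next
```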

Suppose you have a linear congruential sequence S defined by
S[0] = seed
S[i] = (p * S[i-1] + k) % m
for some p, m, k such that gcd(p, m) == 1. Then you can find q such that (p * q) % m == 1 and compute:
S[i-1] = (q * (S[i] - k)) % m
In other words: if you pick suitable p and precompute q, you can traverse your sequence in either order in O(1) time.

A reasonably simple way of generating an indexable pseudo-random sequence -- that is, a sequence which looks random, but can be traversed in either direction -- is to choose some (reasonably good) encryption algorithm and a fixed encryption key, and then define:
sequence(i): encrypt(i, known_key)
You don't need to know the value of i, because you can decrypt it from the number:
next(r): encrypt(decrypt(r, known_key) + 1, known_key)
prev(r): encrypt(decrypt(r, known_key) - 1, known_key)
Consequently, i does not have to be a small integer; since the only arithmetic you need to do on it is addition and subtraction of a small integer, a bignum implementation is trivial. So if you wanted 128-bit pseudorandom numbers, you could set the first i to be a 128-bit random number extracted from /dev/random.
You have to keep the entire value of i in static storage, and the period of the pseudorandom numbers cannot be greater than the range of i. That will be true of any solution to this problem, though: since the next() and prev() operators are required to be functions, every value has a unique successor and predecessor, and thus can only appear once in the cycle of values. That's quite different from the Mersenne twister, for example, whose cycle is much larger than 2^32.
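To make the encrypt/decrypt idea concrete, here is a Python sketch. Since the standard library has no block cipher, it assumes a toy 32-bit Feistel network built from SHA-256 as the "encryption algorithm"; this stand-in is for illustration only and is not a vetted cipher. All names are hypothetical.

```python
import hashlib

ROUNDS = 4
MASK16 = 0xFFFF
BLOCK = 2**32


def _round(key, i, half):
    # Round function: 16 bits derived from the key, round index and one half.
    data = key + bytes([i]) + half.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")


def encrypt(x, key):
    """Toy Feistel permutation on 32-bit integers."""
    left, right = x >> 16, x & MASK16
    for i in range(ROUNDS):
        left, right = right, left ^ _round(key, i, right)
    return (left << 16) | right


def decrypt(y, key):
    """Inverse of encrypt: run the rounds in reverse order."""
    left, right = y >> 16, y & MASK16
    for i in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, i, left), left
    return (left << 16) | right


def next_value(r, key):
    # Recover the index from the number itself, step it, re-encrypt.
    return encrypt((decrypt(r, key) + 1) % BLOCK, key)


def prev_value(r, key):
    return encrypt((decrypt(r, key) - 1) % BLOCK, key)


if __name__ == "__main__":
    key = b"known_key"
    r = encrypt(0, key)
    r = next_value(r, key)
    print(r == encrypt(1, key))
```

No counter is stored here: the current output is the only state, which is the property the answer relies on.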

I think what you are asking for is a random number generator that is deterministic. This does not make sense, because if it is deterministic, it's not random. The only solution is to generate a list of random numbers and then step back and forth in this list.
PS! I know that essentially all software PRNGs are deterministic. You can of course use this to create the functionality you need, but don't fool yourself: it has nothing to do with randomness. If your software design requires a deterministic PRNG, then you could probably skip the PRNG part altogether.

Related

How to turn integers into Fibonacci coding efficiently?

The Fibonacci sequence is obtained by starting with 0 and 1 and then adding the last two numbers to get the next one.
All positive integers can be represented as a sum of a set of Fibonacci numbers without repetition. For example: 13 can be the sum of the sets {13}, {5,8} or {2,3,8}. But, as we have seen, some numbers have more than one set whose sum is the number. If we add the constraint that the sets cannot contain two consecutive Fibonacci numbers, then we have a unique representation for each number.
We will use a binary sequence (just zeros and ones) to do that. For example, 17 = 1 + 3 + 13. Then, 17 = 100101. See figure 2 for a detailed explanation.
I want to turn some integers into this representation, but the integers may be very big. How do I do this efficiently?
The problem itself is simple. You always pick the largest Fibonacci number less than or equal to the remainder. You can ignore the constraint about consecutive numbers (if you needed both, the next one is the sum of both, so you would have picked that one instead of the initial two).
So the remaining problem is how to quickly find the largest Fibonacci number that does not exceed some number X.
There's a known trick that starting with the matrix (call it M)
1 1
1 0
You can compute Fibonacci numbers by matrix multiplication (the xth Fibonacci number is an off-diagonal entry of M^x). More details here: https://www.nayuki.io/page/fast-fibonacci-algorithms . The end result is that you can compute the number you're looking for in O(logN) matrix multiplications.
You'll need large number computations (multiplications and additions) if they don't fit into existing types.
Also store the matrices corresponding to the powers of two that you compute the first time, since you'll need them again for later results.
Overall this should be O((logN)^2 * cost of large-number multiplications/additions).
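A short Python sketch of the matrix trick, relying on Python's built-in arbitrary-precision integers for the large-number arithmetic. The helper names are illustrative, and the caching of power-of-two matrices suggested above is left out to keep it short.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def mat_pow(m, x):
    """Raise a 2x2 matrix to the power x by repeated squaring."""
    result = ((1, 0), (0, 1))  # identity matrix
    while x:
        if x & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        x >>= 1
    return result


def fib(x):
    # M^x = [[F(x+1), F(x)], [F(x), F(x-1)]], so F(x) is an off-diagonal entry.
    return mat_pow(((1, 1), (1, 0)), x)[0][1]


if __name__ == "__main__":
    print([fib(i) for i in range(11)])
```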
First I want to tell you that I really liked this question. I didn't know that all positive integers can be represented as a sum of a set of Fibonacci numbers without repetition; I saw the proof by induction and it was awesome.
To answer your question, I think we have to figure out how the representation is built. The easy way is to find, for the given number, the largest Fibonacci number that does not exceed it.
For example, if we want to represent 40:
We have Fib(9)=34 and Fib(10)=55, so the first element in the representation is Fib(9).
Since 40 - Fib(9) = 6, and Fib(5)=5 and Fib(6)=8, the next element is Fib(5). The remainder is 1 = Fib(2), so we have 40 = Fib(9) + Fib(5) + Fib(2).
Allow me to write this in C#:
using System;
using System.Collections.Generic;
class Program
{
    static void Main(string[] args)
    {
        // Indices k of the Fibonacci numbers Fib(k) used in the representation.
        List<int> fibPresentation = new List<int>();
        int numberToPresent = Convert.ToInt32(Console.ReadLine());
        while (numberToPresent > 0)
        {
            // Find the first k with Fib(k) > numberToPresent; Fib(k-1) is the largest that fits.
            int k = 1;
            while (CalculateFib(k) <= numberToPresent)
            {
                k++;
            }
            numberToPresent = numberToPresent - CalculateFib(k-1);
            fibPresentation.Add(k-1);
        }
    }
    static int CalculateFib(int n)
    {
        if (n == 1)
            return 1;
        int a = 0;
        int b = 1;
        // In N steps compute Fibonacci sequence iteratively.
        for (int i = 0; i < n; i++)
        {
            int temp = a;
            a = b;
            b = temp + b;
        }
        return a;
    }
}
Your result will be in fibPresentation
This encoding is more accurately called the "Zeckendorf representation": see https://en.wikipedia.org/wiki/Fibonacci_coding
A greedy approach works (see https://en.wikipedia.org/wiki/Zeckendorf%27s_theorem) and here's some Python code that converts a number to this representation. It uses the first 100 Fibonacci numbers and works correctly for all inputs up to 927372692193078999175 (and incorrectly for any larger inputs).
fibs = [0, 1]
for _ in range(100):
    fibs.append(fibs[-2] + fibs[-1])
def zeck(n):
    i = len(fibs) - 1
    r = 0
    while n:
        if fibs[i] <= n:
            r |= 1 << (i - 2)
            n -= fibs[i]
        i -= 1
    return r
print(bin(zeck(17)))
The output is:
0b100101
As the greedy approach seems to work, it suffices to be able to invert the relation N = F_n.
By the Binet formula, F_n = [φ^n / √5], where the brackets denote the nearest integer. Then with n = floor(log_φ(√5 · N)) you are very close to the solution.
17 => n = floor(7.5599...) => F_7 = 13
4 => n = floor(4.5531...) => F_4 = 3
1 => n = floor(1.6722...) => F_1 = 1
(I do not exclude that some n values can be off by one.)
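The estimate above can be sketched in Python as follows. The logarithm is computed in floating point, so this sketch assumes N is small enough to convert to a float, and it adds an explicit correction loop for the possible off-by-one. The function names are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fib(n):
    """Iterative Fibonacci with F(0) = 0, F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def largest_fib_index(N):
    """Index n of the largest Fibonacci number F(n) <= N, for N >= 1."""
    if N < 1:
        raise ValueError("N must be a positive integer")
    # Binet-based estimate: n = floor(log_phi(sqrt(5) * N)).
    n = int(math.floor(math.log(math.sqrt(5) * N, PHI)))
    # The estimate may be slightly off, so adjust it in both directions.
    while fib(n + 1) <= N:
        n += 1
    while fib(n) > N:
        n -= 1
    return n


if __name__ == "__main__":
    for N in (17, 4, 1):
        n = largest_fib_index(N)
        print(N, n, fib(n))
```

Because F(1) = F(2) = 1, the correction loop reports index 2 rather than 1 for N = 1; the Fibonacci number itself is the same.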
I'm not sure if this is efficient enough for you, but you could simply use backtracking to find a (the) valid representation.
I would start the backtracking by taking the biggest possible Fibonacci number and only switch to smaller ones if the no-consecutive-numbers or the use-only-once constraint is violated.

Generating a non-repeating set from a random seed, and extract result by index

p.s. I have referred to this as Random, but this is a Seed Based Random Shuffle, where the Seed will be generated by a PRNG, but with the same Seed, the same "random" distribution will be observed.
I am currently trying to find a method to assist in doing 2 things:
1) Generate Non-Repeating Sequence
This will take 2 arguments: Seed; and N. It will generate a sequence, of size N, populated with numbers between 1 and N, with no repetitions.
I have found a few good methods to do this, but most of them become infeasible once the second requirement is added.
2) Extract an entry from the Sequence
This will take 3 arguments: Seed; N; and I. This is for determining what value would appear at position I in a Sequence that would be generated with Seed and N. However, in order to work with what I have in mind, it absolutely cannot use a generated sequence, and pick out an element.
I initially worked with pre-calculating the sequence, then querying it, but this only really works in test cases, as the number of Seeds, and the value of N that will be used would create a database into the Petabytes.
From what I can tell, having a method that implements requirement 1 by using requirement 2 would be the most ideal method.
i.e. a sequence is generated by:
function Generate_Sequence(int S, int N) {
    int[] sequence = new int[N];
    for (int i = 0; i < N; i++) {
        sequence[i] = Extract_From_Sequence(S, N, i);
    }
    return sequence;
}
For Example
GS = Generate Sequence
ES = Extract from Sequence
for:
S = 1
N = 5
I = 4
GS(S, N) = { 4, 2, 5, 1, 3 }
ES(S, N, I) = 1
let S = 2
GS(S, N) = { 3, 5, 2, 4, 1 }
ES(S, N, I) = 4
One way to do this is to apply a permutation to the bit positions of the number. Assume that N is a power of two (I will discuss the general case later!).
Use the seed S to generate a permutation \sigma over the set {1,2,...,log2(N)}. Then permute the bits of I according to \sigma to obtain I'. In other words, the bit of I' at position \sigma(x) is taken from the bit of I at position x.
One problem with this method is its linearity (it is closed under the XOR operation). To overcome this, you can find a number p with gcd(p,N)=1 (this can be done easily even for very large N) and generate a random number q < N using the seed S. The output of Extract_From_Sequence(S, N, I) would then be (p*I'+q) mod N.
Now the case where N is not an exact power of two. The problem arises when I' falls outside the range [1,N]. In that case, we return the most significant bits of I to their initial positions until the resulting value falls into the desired range. This is done by swapping the \sigma(log2(N)) bit of I' with the log2(N) bit, and so on.
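A Python sketch of the power-of-two case only, under a few assumptions that are not in the answer: the seed drives Python's random.Random, indices I run from 0 to N-1, bit positions are numbered from 0, and 1 is added at the end so that results fall in 1..N as the question asks. The general case for arbitrary N is not covered.

```python
import random


def extract_from_sequence(seed, n, i):
    """Value at position i of a seeded permutation of 1..n (n a power of two)."""
    k = n.bit_length() - 1
    if n < 1 or (1 << k) != n:
        raise ValueError("this sketch only handles n that is a power of two")
    if not 0 <= i < n:
        raise IndexError("i must satisfy 0 <= i < n")
    rng = random.Random(seed)
    # sigma: a seeded permutation of the k bit positions.
    sigma = list(range(k))
    rng.shuffle(sigma)
    # p must be coprime to n; for a power of two, any odd p works.
    p = rng.randrange(1, n, 2) if n > 1 else 1
    q = rng.randrange(n)
    # Move bit x of i to position sigma[x].
    permuted = 0
    for x in range(k):
        if (i >> x) & 1:
            permuted |= 1 << sigma[x]
    # Affine step to break the XOR-linearity, then shift into 1..n.
    return (p * permuted + q) % n + 1


def generate_sequence(seed, n):
    return [extract_from_sequence(seed, n, i) for i in range(n)]


if __name__ == "__main__":
    print(generate_sequence(1, 16))
    print(extract_from_sequence(1, 16, 4))
```

Each call rebuilds the permutation and the constants from the seed, so no sequence is ever stored.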

Is this a good Primality Checking Solution?

I have written this code to check if a number is prime (for numbers up to 10^9+7).
Is this a good method?
What will be the time complexity for this?
What I have done is build an unordered_set that stores the prime numbers up to sqrt(n).
When checking whether a number is prime, I first check if it is less than the max number in the table.
If it is less, it is looked up in the table, so the complexity should be O(1) in this case.
If it is more, the number is put through a divisibility test against the prime numbers in the set.
#include<iostream>
#include<set>
#include<math.h>
#include<unordered_set>
#define sqrt10e9 31623
using namespace std;
unordered_set<long long> primeSet = { 2, 3 }; //used for fast lookups
void genrate_prime_set(long range) //this generates prime numbers up to sqrt(10^9+7)
{
    bool flag;
    set<long long> tempPrimeSet = { 2, 3 }; //a temporary set is used for generation
    set<long long>::iterator j;
    for (int i = 3; i <= range; i = i + 2)
    {
        //cout << i << " ";
        flag = true;
        for (j = tempPrimeSet.begin(); *j * *j <= i; ++j)
        {
            if (i % (*j) == 0)
            {
                flag = false;
                break;
            }
        }
        if (flag)
        {
            primeSet.insert(i);
            tempPrimeSet.insert(i);
        }
    }
}
bool is_prime(long long i,unordered_set<long long> primeSet)
{
    bool flag = true;
    if(i <= sqrt10e9) //if number exist in the lookup table
        return primeSet.count(i);
    //if it doesn't iterate through the table
    for (unordered_set<long long>::iterator j = primeSet.begin(); j != primeSet.end(); ++j)
    {
        if (*j * *j <= i && i % (*j) == 0)
        {
            flag = false;
            break;
        }
    }
    return flag;
}
int main()
{
    //long long testCases, a, b, kiwiCount;
    bool primeFlag = true;
    //unordered_set<int> primeNum;
    genrate_prime_set(sqrt10e9);
    cout << primeSet.size()<<"\n";
    cout << is_prime(9999991,primeSet);
    return 0;
}
This doesn't strike me as a particularly efficient way to do the job at hand.
Although it probably won't make a big difference in the end, the efficient way to generate all the primes up to some specific limit is clearly to use a sieve--the sieve of Eratosthenes is simple and fast. There are a couple of modifications that can be faster, but for the small size you're dealing with, they're probably not worthwhile.
These normally produce their output in a more effective format than you're currently using as well. In particular, you typically just dedicate one bit to each possible prime (i.e., each odd number) and end up with it zeroed if the number is composite, and one if it's prime (you can, of course, reverse the sense if you prefer).
Since you only need one bit for each odd number from 3 to 31623, this requires only about 16 K bits, or about 2K bytes--a truly minuscule amount of memory by modern standards (especially: little enough to fit in L1 cache quite easily).
Since the bits are stored in order, it's also trivial to compute and test by the factors up to the square root of the number you're testing instead of testing against all the numbers in the table (including those greater than the square root of the number you're testing, which is obviously a waste of time). This also optimizes access to the memory in case some of it's not in the cache (i.e., you can access all the data in order, making life as easy as possible for the hardware prefetcher).
If you wanted to optimize further, I'd consider just using the sieve to find all primes up to 10^9+7, and look up inputs. Whether this is a win will depend (heavily) upon the number of queries you can expect to receive. A quick check shows that a simple implementation of the Sieve of Eratosthenes can find all primes up to 10^9 in about 17 seconds. After that, each query is (of course) essentially instantaneous (i.e., the cost of a single memory read). This does require around 120 megabytes of memory for the result of the sieve, which would once have been a major consideration, but (except on fairly limited systems) normally wouldn't be any more.
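A Python sketch of the smaller variant described above: sieve the primes up to 31623 once, then test a query by trial division against those primes in increasing order, stopping at the square root. For brevity it uses one byte per number rather than one bit per odd number; the names are illustrative. math.isqrt needs Python 3.8 or later.

```python
import math

LIMIT = 31623  # just above sqrt(10^9 + 7)


def sieve(limit):
    """Sieve of Eratosthenes: sorted list of all primes <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out every multiple of p starting at p*p.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p::p] = bytes(count)
    return [i for i, flag in enumerate(is_prime) if flag]


SMALL_PRIMES = sieve(LIMIT)


def is_prime(n):
    """Trial division by sieved primes; valid for n up to LIMIT squared."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if p * p > n:
            break  # primes are in order, so no larger one can divide n
        if n % p == 0:
            return False
    return True


if __name__ == "__main__":
    print(len(SMALL_PRIMES))
    print(is_prime(10**9 + 7))
```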
The very short answer: do research on the subject, starting with the term "Miller-Rabin"
The short answer is no:
Looking for factors of a number is a poor way to check for primality
Exhaustively searching through primes is a poor way to look for factors
Especially if you search through every prime, rather than just the ones less than or equal to the square root of the number
Doing a primality test on each number is a poor way to generate a list of primes
Also, you should take in primeSet by reference rather than copy, if it really needs to be a parameter.
Note: testing small primes to see if they divide a number is a useful first step of a primality test, but should generally only be used for the smallest primes before switching to a better method
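For reference, a compact Python sketch of the Miller-Rabin test mentioned at the top of this answer. It uses the fixed bases 2, 3, 5 and 7, which are known to give a correct answer for every n below 3,215,031,751 and so cover the 10^9+7 range in the question; for larger inputs more bases would be needed. The function name is illustrative.

```python
def is_prime_miller_rabin(n):
    """Miller-Rabin with bases 2, 3, 5, 7 (exact for n < 3,215,031,751)."""
    if n < 2:
        return False
    # Small-prime trial division first, as suggested in the note above.
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in (2, 3, 5, 7):
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            # a is a witness that n is composite.
            return False
    return True


if __name__ == "__main__":
    print(is_prime_miller_rabin(10**9 + 7))
```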
No, it's not a very good way to determine if a number is prime. Here is pseudocode for a simple primality test that is sufficient for numbers in your range; I'll leave it to you to translate to C++:
function isPrime(n)
    d := 2
    while d * d <= n
        if n % d == 0
            return False
        d := d + 1
    return True
This works by trying every potential divisor up to the square root of the input number n; if no divisor has been found, then the input number cannot be composite, meaning of the form n = p × q, because one of the two divisors p or q must be at most the square root of n while the other is at least the square root of n.
There are better ways to determine primality; for instance, after initially checking if the number is even (and hence prime only if n = 2), it is only necessary to test odd potential divisors, halving the amount of work necessary. If you have a list of primes up to the square root of n, you can use that list as trial divisors and make the process even faster. And there are other techniques for larger n.
But that should be enough to get you started. When you are ready for more, come back here and ask more questions.
I can only suggest a way to use a library function in Java to check the primality of a number. As for the other questions, I do not have any answers.
The method java.math.BigInteger.isProbablePrime(int certainty) returns true if this BigInteger is probably prime, false if it's definitely composite. If certainty is ≤ 0, true is returned. You could try using it by rewriting your code in Java.
Parameters
certainty - a measure of the uncertainty that the caller is willing to tolerate: if the call returns true the probability that this BigInteger is prime exceeds (1 - 1/2^certainty). The execution time of this method is proportional to the value of this parameter.
Return Value
This method returns true if this BigInteger is probably prime, false if it's definitely composite.
Example
The following example shows the usage of math.BigInteger.isProbablePrime() method
import java.math.*;
public class BigIntegerDemo {
    public static void main(String[] args) {
        // create 3 BigInteger objects
        BigInteger bi1, bi2, bi3;
        // create 3 Boolean objects
        Boolean b1, b2, b3;
        // assign values to bi1, bi2
        bi1 = new BigInteger("7");
        bi2 = new BigInteger("9");
        // perform isProbablePrime on bi1, bi2
        b1 = bi1.isProbablePrime(1);
        b2 = bi2.isProbablePrime(1);
        b3 = bi2.isProbablePrime(-1);
        String str1 = bi1+ " is prime with certainity 1 is " +b1;
        String str2 = bi2+ " is prime with certainity 1 is " +b2;
        String str3 = bi2+ " is prime with certainity -1 is " +b3;
        // print b1, b2, b3 values
        System.out.println( str1 );
        System.out.println( str2 );
        System.out.println( str3 );
    }
}
Output
7 is prime with certainity 1 is true
9 is prime with certainity 1 is false
9 is prime with certainity -1 is true

Keep uniform distribution after remapping to a new range

Since this is about remapping a uniform distribution to another with a different range, this is not a PHP question specifically although I am using PHP.
I have a cryptographically secure random number generator that gives me evenly distributed integers (uniform discrete distribution) between 0 and PHP_INT_MAX.
How do I remap these results to fit into a different range in an efficient manner?
Currently I am using $mappedRandomNumber = $randomNumber % ($range + 1) + $min where $range = $max - $min, but that obviously doesn't work, since the first (PHP_INT_MAX + 1) % ($range + 1) integers of the range have a higher chance of being picked, breaking the uniformity of the distribution.
Well, having zero knowledge of PHP definitely qualifies me as an expert, so:
mentally converting to a float in U[0,1)
f = r / PHP_INT_MAX
then doing
mapped = min + f*(max - min)
and going back to integers
mapped = min + (r * max - r * min)/PHP_INT_MAX
If the computation is done in 64-bit math and PHP_INT_MAX is 2^31 - 1, it should work.
This is what I ended up doing. PRNG 101 (if it does not fit, ignore and generate again). Not very sophisticated, but simple:
public function rand($min = 0, $max = null){
    // number of distinct values in the target range {$min .. $max}
    $span = $max - $min + 1;
    // pow(2,$numBits-1)-1 calculated as (pow(2,$numBits-2)-1) + pow(2,$numBits-2)
    // to avoid overflow when $numBits is the number of bits of PHP_INT_MAX
    $intMax = (pow(2,8*$this->intByteCount-2)-1) + pow(2,8*$this->intByteCount-2);
    // largest multiple of $span that fits in {0 .. $intMax}; intdiv() needs PHP 7+
    $maxSafe = intdiv($intMax, $span) * $span;
    // discards anything at or above $maxSafe, i.e. outside the last full
    // interval N * {0 .. max - min} that fits in {0 .. 2^(intBitCount-1)-1}
    do {
        $chars = $this->getRandomBytesString($this->intByteCount);
        $n = 0;
        for ($i=0;$i<$this->intByteCount;$i++) {$n|=(ord($chars[$i])<<(8*($this->intByteCount-$i-1)));}
        // clear the sign bit so $n is uniform on {0 .. $intMax}
        $n &= $intMax;
    } while ($n >= $maxSafe);
    return ($n % $span) + $min;
}
Any improvements are welcomed.
(Full code on https://github.com/elcodedocle/cryptosecureprng/blob/master/CryptoSecurePRNG.php)
Here is a sketch of how I would do it:
Consider a uniform random integer distribution over the range [A, B); that is what your random number generator provides.
Let L = B - A.
Let P be the highest power of 2 such that P <= L.
Let X be a sample from this range.
First calculate Y = X - A.
If Y >= P, discard it and start over with a new X until you get a Y that fits.
Now Y contains log2(P) uniformly random bits; treat it as exactly log2(P) bits, zero-padded on the left if necessary.
Now we have a uniform random bit generator that can be used to provide an arbitrary number of random bits as needed.
To generate a number in the target range, let [A_t, B_t) be the target range. Let L_t = B_t - A_t.
Let P_t be the smallest power of 2 such that P_t >= L_t.
Read log2(P_t) random bits and make an integer from them; let's call it X_t.
If X_t >= L_t, discard it and try again until you get a number that fits.
Your random number in the desired range will be X_t + A_t.
Implementation considerations: if your L_t and L are powers of 2, you never have to discard anything. If not, then even in the worst case you should get the right number in less than 2 trials on average.
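The second half of this sketch (drawing from the target range once a source of uniform random bits is available) could look like this in Python. It assumes secrets.randbits as the bit source in place of the PHP generator; the function and parameter names are illustrative.

```python
import secrets


def uniform_in_range(a_t, b_t, randbits=secrets.randbits):
    """Uniform integer in [a_t, b_t) by rejection sampling on random bits."""
    l_t = b_t - a_t
    if l_t <= 0:
        raise ValueError("the target range [a_t, b_t) must not be empty")
    if l_t == 1:
        return a_t  # only one possible value, no bits needed
    # log2(P_t), where P_t is the smallest power of two >= L_t.
    bits = (l_t - 1).bit_length()
    while True:
        x_t = randbits(bits)
        if x_t < l_t:  # otherwise discard and try again
            return a_t + x_t


if __name__ == "__main__":
    print([uniform_in_range(10, 15) for _ in range(10)])
```

Since P_t is less than twice L_t, each attempt succeeds with probability above one half, which matches the "less than 2 trials on average" remark.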

Finding if a random number has occured before or not

Let me be clear at the start that this is a contrived example and not a real-world problem.
Suppose I have to create a random number between 0 and 10. I do this 11 times, making sure that a previously drawn number is not drawn again: if I get a repeated number, I create another random number until I get one that has not been seen earlier. So essentially I get a sequence of unique numbers from 0 to 10 in a random order,
e.g. 3 1 2 0 5 9 4 8 10 6 7 and so on.
Now, to come up with logic to make sure that the random numbers are unique and not ones which we have drawn before, we could use many approaches:
Use a C++ std::bitset and set the bit whose index equals the value of each random number, then check it the next time a new random number is drawn.
Or
Use a std::map<int,int> to count the number of occurrences, or even a simple C array with some sentinel values stored in it to indicate whether that number has occurred or not.
If I have to avoid the methods above and use some mathematical/logical/bitwise operation to find whether a random number has been drawn before or not, is there a way?
You don't want to do it the way you suggest. Consider what happens when you have already selected 10 of the 11 items; your random number generator will cycle until it finds the missing number, which might be never, depending on your random number generator.
A better solution is to create a list of the numbers 0 to 10 in order, then shuffle the list into a random order. The normal algorithm for doing this is due to Knuth, Fisher and Yates: starting at the last element, swap each element with an element chosen at random from the locations at or before the current one in the array.
function shuffle(a, n)
    for i from n-1 to 1 step -1
        j = randint(i)
        swap(a[i], a[j])
We assume an array with indices 0 to n-1, and a randint function that returns a random j in the range 0 <= j <= i.
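The same shuffle as a Python sketch, with random.randint(0, i) playing the role of randint(i). The optional rng parameter is an addition made here so that a seeded generator can be passed in for reproducible runs.

```python
import random


def shuffle(a, rng=random):
    """In-place Fisher-Yates shuffle of the list a."""
    for i in range(len(a) - 1, 0, -1):
        j = rng.randint(0, i)  # 0 <= j <= i, both ends included
        a[i], a[j] = a[j], a[i]


if __name__ == "__main__":
    numbers = list(range(11))  # 0 to 10 in order
    shuffle(numbers)
    print(numbers)  # each number appears exactly once, in random order
```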
Use an array and add all possible values to it. Then pick one out of the array and remove it. Next time, pick again until the array is empty.
Yes, there is a mathematical way to do it, but it is a bit expensive.
Have an array primes[] where primes[i] is the i'th prime number, so it begins [2,3,5,7,11,...].
Also store a number mult, initially 1. Once you draw a number (call it i), check whether mult % primes[i] == 0. If it is, the number was drawn before. If it isn't, the number was not drawn: choose it and do mult = mult * primes[i].
However, it is expensive because it might require a lot of space for large ranges (the possible values of mult increase exponentially).
(This is a nice mathematical approach, because we are really working with a set of primes p_i; the array of primes is only an implementation of that abstract set.)
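A small Python sketch of the prime-product idea for the 0 to 10 case in the question. The class and function names are illustrative; Python integers grow as needed, so mult cannot overflow here, which would be a concern with fixed-width types.

```python
import random

# primes[i] is the prime that stands for the number i (0..10).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]


class PrimeProductSet:
    """Set of small integers encoded as a product of distinct primes."""

    def __init__(self):
        self.mult = 1

    def contains(self, i):
        return self.mult % PRIMES[i] == 0

    def add(self, i):
        if not self.contains(i):
            self.mult *= PRIMES[i]


def draw_all(rng):
    """Draw 0..10 in random order, redrawing whenever a number repeats."""
    seen = PrimeProductSet()
    order = []
    while len(order) < len(PRIMES):
        i = rng.randrange(len(PRIMES))
        if not seen.contains(i):
            seen.add(i)
            order.append(i)
    return order


if __name__ == "__main__":
    print(draw_all(random.Random()))
```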
A bit manipulation alternative for small values is using an int or long as a bitset.
With this approach, to check that a candidate i is not in the set you only need to check:
if ((set & (1 << i)) == 0) // not in the set
else // already in the set
To enter an element i into the set:
set = set | (1 << i)
A better approach would be to populate a list with all the numbers, shuffle it with a Fisher-Yates shuffle, and iterate over it to generate new random numbers.
If I have to avoid these methods above and use some
mathematical/logical/bitwise operation to find whether a random number
has been draw before or not, is there a way?
Subject to your contrived constraints yes, you can imitate a small bitset using bitwise operations:
You can choose different integer types on the right according to what size you need.
bitset code                 bitwise code
std::bitset<32> x;          unsigned long x = 0;
if (x[i]) { ... }           if (x & (1UL << i)) { ... }
// assuming v is 0 or 1
x[i] = v;                   x = (x & ~(1UL << i)) | ((unsigned long)v << i);
x[i] = true;                x |= (1UL << i);
x[i] = false;               x &= ~(1UL << i);
For a larger set (beyond the size in bits of unsigned long long), you will need an array of your chosen integer type. Divide the index by the width of each value to know what index to look up in the array, and use the modulus for the bit shifts. This is basically what bitset does.
I'm assuming that the various answers that tell you how best to shuffle 10 numbers are missing the point entirely: that your contrived constraints are there because you do not in fact want or need to know how best to shuffle 10 numbers :-)
Keep a variable to map the drawn numbers. The i'th bit of that variable will be 1 if the number was drawn before:
int mapNumbers = 0;
int generateRand() {
    // return -1 if all 11 numbers have already been generated
    if ((mapNumbers & ((1 << 11) - 1)) == ((1 << 11) - 1)) return -1;
    int x;
    do {
        x = newVal(); // newVal() is assumed to return a random number in 0..10
    } while (mapNumbers & (1 << x)); // retry while x was drawn before
    mapNumbers |= (1 << x);
    return x;
}
