Looking for a pseudo random number generation algorithm with specific properties - algorithm

I'm looking for a pseudo random number generator which has the following properties:
Non-repeating: The returned numbers must be unique until every number from 0 to n has been returned once; only then may the sequence start over and return each number once more, and so on.
Deterministic: Using the same seed twice must produce the same sequence.
Few allocations: It should not need to allocate a large block of memory and then shuffle its contents, the way sequence permutations would.
My goal is that I could initialize the random number generator with some seed value and then continuously call its function to generate the next number in the sequence, possibly passing it the previous one.

One possible method is a block cipher. Encrypt the numbers 0, 1, 2, ... with a given key; because encryption is a one-to-one mapping, the outputs are guaranteed unique and will only repeat once the counter has run through every possible block value. Each key generates a different permutation. You just need to keep track of the key and the last number you encrypted.
DES uses a 64-bit block and AES uses a 128-bit block. If those sizes don't suit, look at format-preserving encryption for an appropriately sized block.
One point to note, a non-repeating generator is not random. As more numbers are generated the pool of unused numbers shrinks, until the last number is fully determined. You need to consider if this is important in your application.
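As a rough illustration of the block-cipher idea for an arbitrary range, here is a minimal sketch that builds a small Feistel network and uses cycle-walking to stay inside [0, n). It is not DES, AES, or a standardized format-preserving mode, and it is not meant to be secure; the names `permute` and `unique_sequence`, the SHA-256 round function, and the choice of 4 rounds are all assumptions made for this example.

```python
import hashlib


def _round(value: int, key: int, round_no: int, half_bits: int) -> int:
    """Keyed round function: hash (key, round, value) down to half_bits bits."""
    data = f"{key}:{round_no}:{value}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") & ((1 << half_bits) - 1)


def permute(x: int, n: int, key: int, rounds: int = 4) -> int:
    """Map x in [0, n) to a unique value in [0, n) (a keyed permutation)."""
    if not 0 <= x < n:
        raise ValueError("x must be in [0, n)")
    # Smallest even-width block that can hold every value below n.
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1
    while True:
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            # Standard Feistel step: invertible whatever the round function is.
            left, right = right, left ^ _round(right, key, r, half_bits)
        x = (left << half_bits) | right
        if x < n:  # cycle-walking: re-encrypt until the result lands in [0, n)
            return x


def unique_sequence(n: int, seed: int):
    """Yield every number in [0, n) exactly once, in a seed-dependent order."""
    for counter in range(n):
        yield permute(counter, n, seed)


if __name__ == "__main__":
    print(list(unique_sequence(10, seed=1234)))
```

The only state to keep between calls is the seed and the counter, which matches the "key plus last number encrypted" bookkeeping described above.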

Related

How to select nth random integer from a range of integers without repetition or storage? [duplicate]

Let's say my system needs to provide a unique integer id regularly, between 1 and 10^20, from a function like --
function getNextRandomUniqueId(index:BigInt, min:BigInt, max:BigInt, seed:BigInt): BigInt { ? }
id = getNextRandomUniqueId(index=42, min=1, max=10^20, seed=0)
These ids need to be provided in random order as the index increases, not sequentially. Once an id has been provided, it cannot be provided again, as long as the index increases. My system cannot store a random list of all the numbers to be issued, or all the numbers issued, there's too many. I also don't want to rely on something like a random UUID, which is exceedingly unlikely to have a collision, but not guaranteed to.
How can this be done? To have a deterministic mathematical way to iterate randomly through a set of sequential integers without repetition and without storage?
EDIT: Fixed 1^20 to 10^20
This can be done, assuming you are allowed to store an encryption key and counter. Encryption is a one-to-one mapping so by encrypting all the numbers in a given range you will get back all those same numbers in a randomized order. Different keys will give a different order. Encrypt the numbers 0, 1, 2, 3, ... in order, using the key and keeping track of how far you have got.
Depending on the range of numbers, you may need to use some form of Format Preserving encryption to keep the outputs within the required range.
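To see why the block size matters here, a quick back-of-the-envelope calculation helps. The sketch below assumes a balanced Feistel-style construction with cycle-walking as the format-preserving scheme (an assumption; the answer does not name one), and `feistel_sizing` is a hypothetical helper name.

```python
def feistel_sizing(n: int) -> tuple[int, int, float]:
    """Return (bits needed for 0..n-1, even block width, expected cipher calls per id)."""
    bits = (n - 1).bit_length()
    block_bits = bits + (bits % 2)      # a balanced Feistel block needs an even width
    expected_calls = 2**block_bits / n  # average cycle-walking attempts per output
    return bits, block_bits, expected_calls


if __name__ == "__main__":
    print(feistel_sizing(10**20))
```

For a range of 10^20 values this reports 67 bits, a 68-bit block, and roughly 2.95 cipher calls per id on average. A 64-bit block is too small for this range, and a 128-bit block would almost always produce values outside it.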
You cannot guarantee that the same id does not also appear in the sequence produced by another seed.
Most languages seed the generator from the current time when you do not provide a seed yourself. You have set your seed to zero, so each time you restart your program you will get the same ids. This is most likely not your intent :-)
But even if you vary the seed, there is still a chance of hitting the same id:
1 in 100,000,000,000,000,000,000 (10^20) for any given pair of draws.
The reason you can get the same id is that the values are RANDOM.
I would go with a GUID, where the chance is about
1 in 340,280,000,000,000,000,000,000,000,000,000,000,000 (2^128).

Generate random looking numbers deterministically from a random-access lookup key in O(1) space and time

I want to output random looking numbers based on an input. If the same input is put in, the same output is given.
I don't want to pregenerate and store a bunch of random data, and I don't want it to take an O(n) amount of time to recover the nth index.
It does not need to be secure, cryptographically or otherwise, just enough to look random.
If you want a deterministic random-access function from an (index,length) pair to a random looking string of bytes you could use SHA3-N(index)[:length] where N is the first convenient number greater than length.
This would not behave identically to an actual array as reading indexes 1 (with length 10) and 5 (with length 10) would not have any overlap (which you'd expect from an array).
This is going to be slow and very inconvenient for N > 512 bits, so if you need longer strings you'll want to do multiple rounds. Something like SHA3-512(SHA3-512(index)[0:256])++SHA3-512(SHA3-512(index)[256:512]) to get something 1024 bits long.
Armed with the multiple rounds part you could use any hash function (e.g. SHA256, MD5) which might be more convenient.
I should note that this is definitely not secure and the output could easily be predicted by an adversary.
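A minimal sketch of the hash-based lookup in Python follows. Instead of chaining SHA3-512 calls as described above, it swaps in SHAKE-256, the extendable-output function from the SHA-3 family available in `hashlib`, which returns any requested number of bytes in one call. The function name `random_bytes_at` and the optional `seed` parameter are assumptions for this example.

```python
import hashlib


def random_bytes_at(index: int, length: int, seed: bytes = b"") -> bytes:
    """Return `length` pseudo-random bytes determined only by (seed, index)."""
    # Fixed-width encoding of the index; assumes 0 <= index < 2**128.
    key = seed + index.to_bytes(16, "big")
    return hashlib.shake_256(key).digest(length)


if __name__ == "__main__":
    print(random_bytes_at(1, 10).hex())
    print(random_bytes_at(5, 10).hex())
```

As with the SHA3-N construction, reads at index 1 and index 5 do not overlap the way array slices would. One difference with an extendable-output function is that a shorter read at a given index is a prefix of a longer read at the same index.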
Typically, a random number generator will generate the same sequence of pseudo-random numbers given the same seed. For example, Python code like this:
import random
random.seed(1)
for i in range(1, 10):
    print(random.randint(1, 100))
will print the same list no matter how many times you run it. Similarly, so will this:
random.seed(42)
for i in range(1, 10):
    print(random.randint(1, 100))
If you can turn a description of each section of your array into a seed (a hash function would do), you can seed the generator with that value and reliably serve requests of any size.
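One way this might look, as a sketch only: hash a key that describes the section, use part of the digest as the seed, and draw from a private `random.Random` instance so the global generator is left alone. The helper name `values_for` and the string key format are hypothetical.

```python
import hashlib
import random


def values_for(key: str, count: int, low: int = 1, high: int = 100) -> list[int]:
    """Return `count` pseudo-random ints that depend only on `key`."""
    digest = hashlib.sha256(key.encode()).digest()
    # Private generator seeded from the first 8 bytes of the hash of the key.
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.randint(low, high) for _ in range(count)]


if __name__ == "__main__":
    print(values_for("section-7", 5))
    print(values_for("section-7", 8))  # starts with the same 5 values
```

Note that the cost of a request is proportional to the number of values drawn for that key, not to the key's position in the overall data.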

Random number generation guaranteeing uniqueness over time

I created a counter that goes up from 0 to 9999 and then resets. I use the output of this counter as a value to make unique entries. However, the application needs to find its last created number each time it is restarted. Therefore I am looking for a method that avoids any sort of object storage and relies solely on random number generation.
Something like:
import java.util.Random;
int randomTimeBasedGenerator() {
    Random r = new Random(System.currentTimeMillis());
    int num = r.nextInt(10000); // 0..9999; nextInt() % 9999 could be negative
    return num;
}
But what guarantee do I have that this method generates unique numbers? And, if not, how long would it remain unique? Are there any study papers I can look into for this sort of scenario?
Random number generation would be an elegant solution for my situation, if I can at least guarantee it won't repeat within a couple of weeks or months. But random number generation would be useless in my case if no such guarantee exists.
You have no guarantee that the return values of a random number generator remain unique. A seeded generator gives you a reproducible sequence of numbers, not a sequence of distinct numbers. Random numbers drawn from a bounded range will always repeat, sooner or later.
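How soon "sooner or later" is can be estimated with the birthday problem. The sketch below assumes the draws are uniform and independent, which is an idealization of a real generator; `draws_until_likely_repeat` is a hypothetical helper name.

```python
def draws_until_likely_repeat(pool_size: int, threshold: float = 0.5) -> int:
    """Smallest number of uniform draws whose repeat probability reaches `threshold`."""
    p_all_unique = 1.0
    draws = 0
    while 1.0 - p_all_unique < threshold:
        # Probability that the next draw avoids the `draws` values already seen.
        p_all_unique *= 1.0 - draws / pool_size
        draws += 1
    return draws


if __name__ == "__main__":
    print(draws_until_likely_repeat(10_000))
```

For a pool of 10,000 values a repeat becomes more likely than not after roughly 120 draws, so a purely random scheme would not stay unique for weeks or months at any realistic rate of use.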
As suggested by @Thilo, UUIDs are, for practical purposes, unique numbers. But an even better approach in your case might be to set up a lightweight database (SQLite will do) and add a record to a table with incremental IDs. It is not possible to keep track of a process without storing values somewhere.
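The database suggestion can be sketched with the `sqlite3` module from the Python standard library. The table name `entries`, the file name, and the helper names are assumptions for this example, and the question's 0 to 9999 wrap-around is not modelled.

```python
import sqlite3


def open_counter(path: str) -> sqlite3.Connection:
    """Open (or create) the database that records issued ids."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries (id INTEGER PRIMARY KEY AUTOINCREMENT)"
    )
    return conn


def next_id(conn: sqlite3.Connection) -> int:
    """Insert a row and return its id; AUTOINCREMENT ids are never reused."""
    with conn:  # commits the insert on success
        cur = conn.execute("INSERT INTO entries DEFAULT VALUES")
    return cur.lastrowid


if __name__ == "__main__":
    conn = open_counter("ids.db")  # hypothetical file name
    print(next_id(conn))
    conn.close()
```

Because the last id lives in the database file, the sequence simply continues after a restart, with no need to search for the last created number.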

What algorithm can save one bit of storage for each arbitrary 32-bit number in a LUT?

A lookup table has a total of 4G entries. Each entry is an arbitrary 32-bit number, and no value ever repeats.
Is there an algorithm that can use the index of each entry together with its 32-bit value so that a bit at a fixed position in the stored value is always zero (so I can use that bit as a flag to record something), while still letting me retrieve the original 32-bit number by a reverse calculation?
Or, stepping back: can I at least make a fixed-position bit always zero for every two consecutive entries?
In other words, is there a universal code that saves 1 bit for each arbitrary 32-bit number, so that I can use that bit as a lock flag? Alternatively, is there a way to combine the index and the value of a lookup table entry in some calculation that saves 1 bit of storage per value?
It is not at all clear what you are asking. However I can perhaps find one thing in there that can be addressed, if I am reading it correctly, which is that you have a permutation of all of the integers in 0..2^32-1. Such a permutation can be represented in fewer bits than direct representation, which takes 32*2^32 bits. With a perfect representation of the permutations, each would be ceiling(log2((2^32)!)) bits, since there are (2^32)! possible permutations. That length turns out to be about 95.5% of the bits in the direct representation. So each permutation could be represented in about 30.6*2^32 bits, effectively taking off more than one bit per word.
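The 95.5% figure can be checked numerically. This sketch uses the log-gamma function, since ln(n!) = lgamma(n + 1), to avoid computing the factorial itself; the function name `permutation_bits` is made up for the example.

```python
import math


def permutation_bits(n: int) -> float:
    """Approximate log2(n!), the bits needed to identify one permutation of n items."""
    return math.lgamma(n + 1) / math.log(2)


if __name__ == "__main__":
    n = 2**32
    ideal = permutation_bits(n)
    direct = 32 * n
    print(f"bits per entry: {ideal / n:.2f}")
    print(f"fraction of direct representation: {ideal / direct:.3f}")
```

This only bounds the size of an ideal encoding of the whole table. It does not by itself give a scheme in which each entry can be decoded independently from its own 31 bits.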

how to generate longer random number from a short random number?

I have a short random number input, let's say int 0-999.
I don't know the distribution of the input. Now I want to generate a random number in range 0-99999 based on the input without changing the distribution shape.
I know there is a way to map the input to [0,1] by dividing it by 999 and then multiplying by 99999 to get the result. However, this method doesn't cover all the possible values: only 1000 distinct outputs can occur, so most numbers in 0-99999 will never get hit.
Assuming your input is some kind of source of randomness...
You can take two consecutive inputs and combine them:
input() + 1000*(input()%100)
Be careful though. This relies on the source having plenty of entropy, so that a given input number isn't always followed by the same subsequent input number. If your source is a PRNG designed to cycle between the numbers 0–999 in some fashion, this technique won't work.
With most production entropy sources (e.g., /dev/urandom), this should work fine. OTOH, with a production entropy source, you could fetch a random number between 0–99999 fairly directly.
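The two-input combination above might look like this in Python. `read_input` is a hypothetical stand-in for the real 0-999 source (here backed by `random.randrange` purely so the example runs), and `wide_value` is likewise an invented name.

```python
import random


def read_input() -> int:
    """Hypothetical stand-in for the 0-999 source; replace with the real one."""
    return random.randrange(1000)


def wide_value() -> int:
    """Combine two consecutive reads into one value in 0..99999."""
    low = read_input()         # contributes 0..999
    high = read_input() % 100  # contributes 0..99
    return low + 1000 * high


if __name__ == "__main__":
    print([wide_value() for _ in range(5)])
```

The low three digits follow the source's distribution directly, while the high two digits follow the distribution of the source modulo 100, so the overall shape is only preserved in a loose sense.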
You can try something like the following:
(input * 100) + random
where random is a random number between 0 and 99.
The problem is that input only specifies which block of 100 to use. For instance, 50 just says you will have a number between 5000 and 5099 (to keep a similar shape distribution). Which number between 5000 and 5099 to pick is up to you.
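A small sketch of this scaling approach, assuming the extra 0-99 value is drawn uniformly with Python's `random` module (one choice among many); the name `widen` is hypothetical.

```python
import random


def widen(value: int) -> int:
    """Map an input in 0..999 to 0..99999, staying inside its own block of 100."""
    if not 0 <= value <= 999:
        raise ValueError("value must be in 0..999")
    return value * 100 + random.randrange(100)


if __name__ == "__main__":
    print(widen(50))  # some number in 5000..5099
```

Dividing the result by 100 (integer division) always gives back the original input, which is why a histogram of the outputs with 100-wide bins has the same shape as the histogram of the inputs.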

Resources