True random number generator using VHDL

I've been asked to design a true random number generator in VHDL. After a lot of struggle I have only managed to design PRNGs, not a TRNG. Is it even possible to generate perfectly random numbers? Please advise; I'm really clueless!

There is NO such thing as a "true" random number generator. However, this is one of my favorite pseudo-random generators, and it would be fun to implement in VHDL:
http://en.wikipedia.org/wiki/Xorshift
Also, see this: http://en.wikipedia.org/wiki/Random_number_generation#.22True.22_random_numbers_vs._pseudorandom_numbers
The only thing I can think of to get you "better" randomness would be to do something like writing a file and then reading it back. The scheduler on the host PC might have enough entropy associated with it to cause some variance in the time these operations take, and you could use that time as a key to seed your algorithm.
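For reference, here is a minimal xorshift32 sketch in C (Marsaglia's 13/17/5 variant); the shift-and-XOR structure translates naturally to VHDL:

    #include <stdint.h>

    /* Minimal xorshift32 (Marsaglia's 13/17/5 parameters).
       The state must be seeded with a nonzero value. */
    static uint32_t xorshift32(uint32_t *state)
    {
        uint32_t x = *state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        *state = x;
        return x;
    }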

Since you are asking about VHDL, you want to design special-purpose hardware. Now if you operate hardware in a way which should never be done for digital logic, you might get some kind of "true" random behavior.
If, for example, you design a circuit with a D-type flip-flop that is clocked at the moment its data input changes level, the output becomes metastable, i.e. it is undefined (between 0 and 1) for some time before it settles to a stable 0 or 1 again. How long this takes depends, among other things, on electrical noise, and is therefore random. I could imagine that you could use such effects to build a random generator.

Contrary to the claims of most of the other answers, there are several TRNG designs for FPGAs, mostly based on ring oscillators or self-timed rings; see e.g.
B. Yang, "True Random Number Generators for FPGAs," PhD thesis, KU Leuven (promotors: N. Mentens and I. Verbauwhede), 2018.
and
VHDL TRNG designs

Thank you all for your replies. I'm thinking of using a register holding different values and reading it each time the repetition starts; the idea is to provide different seed values so I can get random values. Since I'm new to VHDL coding, I'm not sure whether this will work, but it's worth a try on my side. Any suggestions on this are welcome.

You're not going to get a true random number generator out of an FPGA / VHDL. The best you can hope for is a h/w PRNG that's readable from some register somewhere.
You might choose to implement one of the PRNG algorithms out there. You're then going to have to trust the algorithm designer and then trust whichever VHDL implementation you go with (your own or one you acquire from someone else). You might start by looking at:
http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators#Cryptographic_algorithms
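For flavor, here is a sketch in C of a 16-bit Galois LFSR, the classic hardware PRNG structure; it maps directly onto a shift register plus XOR taps in an FPGA (the tap mask below is a standard maximal-length choice):

    #include <stdint.h>

    /* 16-bit Galois LFSR with taps 16,14,13,11 (mask 0xB400);
       steps through all 65535 nonzero states. Seed must be nonzero. */
    static uint16_t lfsr_step(uint16_t lfsr)
    {
        uint16_t lsb = lfsr & 1u;
        lfsr >>= 1;
        if (lsb)
            lfsr ^= 0xB400u;
        return lfsr;
    }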

Related

Pipelining the Ziggurat Random Number Generator

I'm currently implementing a version of George Marsaglia's Ziggurat random number generator. Although it is supposedly one of the fastest ways to generate good-quality normally-distributed random numbers, it is full of loop control code (i.e. return statements in the middle of a loop, if-statements, branches, etc.) and it makes several calls to standard C functions like exp() and log(). Not to mention the infinite loop.
This makes for code that cannot be pipelined by the compiler. I feel like a basic approach, such as using the central limit theorem directly, might ultimately be faster since it can be pipelined easily. Unfortunately, it is not suitable for the tails of the Gaussian distribution and therefore it's not acceptable for my application.
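For reference, a minimal sketch of the CLT approach mentioned above: the sum of twelve uniforms on [0,1) has mean 6 and variance 1, so shifting by 6 gives an approximate N(0,1) sample. It is branch-free and pipelines trivially, but the tails are truncated at ±6:

    #include <stdlib.h>

    /* CLT-based normal approximation: sum of 12 independent
       uniforms minus 6 has mean 0 and variance 1. No branches,
       but outputs beyond +/-6 are impossible. */
    static double clt_normal(void)
    {
        double s = 0.0;
        for (int i = 0; i < 12; i++)
            s += rand() / (RAND_MAX + 1.0);
        return s - 6.0;
    }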
Does anybody here have any ideas on how the control code and function calls might be reduced? I am currently using Colin Green's implementation of the algorithm, which I ported to C. My underlying uniform generator is the Tiny Mersenne Twister (so please don't tell me to use the MT as I've seen other people do; I'm already there. This discussion is about normally-distributed RNGs, not uniform RNGs).
You might take a look at my C implementation here. The main function is only 20-something lines of code, so it should be easy to unroll the loop a bit. It also gives you the choice of using integer or float compares, whichever is faster on your machine. You can plug in any back-end RNG.

Deterministic random byte/string generator from a seed in C++

I want a function that, given a seed string and a number of bytes X, outputs a pseudorandom string of length X. But it should do this deterministically. This is for encryption/decryption purposes, so the deterministic part is important. I don't know where to begin looking for such a thing because I have hardly any knowledge of cryptography. Thanks!
The most important rule of coding cryptography is Never Roll Your Own. Use an existing, well tested and reviewed library. Even if you understand the algorithms and theory, there are just too many ways to mess things up.
Anyway, what you are looking for is called a Stream Cipher.
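As a hedged illustration (the question asks about C++, but libsodium's C API works from C++ as well), here is a sketch that generates a deterministic byte stream with the ChaCha20 stream cipher, assuming libsodium is installed. In real use the 32-byte key would be derived from the seed string with a proper KDF, which is elided here:

    #include <sodium.h>
    #include <stdio.h>

    int main(void)
    {
        if (sodium_init() < 0)
            return 1;

        /* In real use, derive the key from your seed string with a KDF. */
        unsigned char key[crypto_stream_chacha20_KEYBYTES] = {0};
        unsigned char nonce[crypto_stream_chacha20_NONCEBYTES] = {0};
        unsigned char out[64];

        /* Same key + nonce always yields the same bytes: deterministic. */
        crypto_stream_chacha20(out, sizeof out, nonce, key);

        for (size_t i = 0; i < sizeof out; i++)
            printf("%02x", out[i]);
        printf("\n");
        return 0;
    }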

Code generation by genetic algorithms

Evolutionary programming seems to be a great way to solve many optimization problems. The idea is very simple and the implementation poses no problems.
I was wondering if there is any way to evolutionarily create a program in ruby/python script (or any other language)?
The idea is simple:
Create a population of programs
Perform genetic operations (roulette-wheel selection or any other selection), creating new programs with inheritance from the best programs, etc.
Repeat step 2 until a program that satisfies our condition is found
But there are still a few problems:
How will chromosomes be represented? For example, should one cell of the chromosome be one line of code?
How will chromosomes be generated? If they are lines of code, how do we generate them so that they are syntactically correct, etc.?
Example of a program that could be generated:
Create a script that takes N numbers as input and returns their mean as output.
If there have been any attempts to create such algorithms, I'd be glad to see links/sources.
If you are sure you want to do this, you want genetic programming, rather than a genetic algorithm. GP allows you to evolve tree-structured programs. What you would do would be to give it a bunch of primitive operations (while($register), read($register), increment($register), decrement($register), divide($result $numerator $denominator), print, progn2 (this is GP speak for "execute two commands sequentially")).
You could produce something like this:
progn2(
    progn2(
        read($1)
        while($1
            progn2(
                while($1
                    progn2( #add the input to the total
                        increment($2)
                        decrement($1)
                    )
                )
                progn2( #increment number of values entered, read again
                    increment($3)
                    read($1)
                )
            )
        )
    )
    progn2( #calculate result
        divide($1 $2 $3)
        print($1)
    )
)
You would use, as your fitness function, how close it is to the real solution. And therein lies the catch: you have to calculate that traditionally anyway*. And then you need something that translates that into code in (your language of choice). Note that, as you've got a potential infinite loop in there, you'll have to cut off execution after a while (there's no way around the halting problem), and it probably won't work. Shucks. Note also that my provided code will attempt to divide by zero.
*There are ways around this, but generally not terribly far around it.
It can be done, but works very badly for most kinds of applications.
Genetic algorithms only work when the fitness function is continuous, i.e. you can determine which candidates in your current population are closer to the solution than others, because only then you'll get improvements from one generation to the next. I learned this the hard way when I had a genetic algorithm with one strongly-weighted non-continuous component in my fitness function. It dominated all others and because it was non-continuous, there was no gradual advancement towards greater fitness because candidates that were almost correct in that aspect were not considered more fit than ones that were completely incorrect.
Unfortunately, program correctness is utterly non-continuous. Is a program that stops with error X on line A better than one that stops with error Y on line B? Your program could be one character away from being correct, and still abort with an error, while one that returns a constant hardcoded result can at least pass one test.
And that's not even touching on the matter of the code itself being non-continuous under modifications...
Well, this is very possible, and @Jivlain correctly points out in his (nice) answer that Genetic Programming is what you are looking for (and not simple Genetic Algorithms).
Genetic Programming is a field that has not reached a broad audience yet, partially because of some of the complications @MichaelBorgwardt indicates in his answer. But those are mere complications; it is far from true that this is impossible to do. Research on the topic has been going on for more than 20 years.
John Koza is one of the leading researchers on this (have a look at his 1992 work), and he demonstrated as early as 1996 how genetic programming can in some cases outperform naive GAs on some classic computational problems (such as evolving programs for Cellular Automata synchronization).
Here's a good Genetic Programming tutorial from Koza and Poli dated 2003.
For a recent reference you might wanna have a look at A field guide to genetic programming (2008).
Since this question was asked, the field of genetic programming has advanced a bit, and there have been some additional attempts to evolve code in configurations other than the tree structures of traditional genetic programming. Here are just a few of them:
PushGP - designed with the goal of evolving modular functions like human coders use, programs in this system store all variables and code on different stacks (one for each variable type). Programs are written by pushing and popping commands and data off of the stacks.
FINCH - a system that evolves Java byte-code. This has been used to great effect to evolve game-playing agents.
Various algorithms have started evolving C++ code, often with a step in which compiler errors are corrected. This has had mixed, but not altogether unpromising results. Here's an example.
Avida - a system in which agents evolve programs (mostly boolean logic tasks) using a very simple assembly code. Based on the older (and less versatile) Tierra.
The language isn't an issue. Regardless of the language, you have to define some higher-level of mutation, otherwise it will take forever to learn.
For example, since any Ruby program can be defined in terms of a text string, you could just randomly generate text strings and optimize that. Better would be to generate only legal Ruby programs. However, that would also take forever.
If you were trying to build a sorting program and you had high level operations like "swap", "move", etc. then you would have a much higher chance of success.
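A toy sketch in C of that idea (everything here is hypothetical): candidates are fixed-length lists of swap operations over a small array, and fitness counts how many adjacent pairs end up in order:

    #include <stdlib.h>

    #define N 8          /* array length */
    #define PROG_LEN 16  /* swap instructions per candidate */

    typedef struct { int i, j; } Swap;

    /* Run a candidate "program" on a copy of the input and score
       the result; N-1 means fully sorted. Assumes all indices
       were generated in range [0, N). */
    static int fitness(const Swap prog[PROG_LEN], const int input[N])
    {
        int a[N];
        for (int k = 0; k < N; k++)
            a[k] = input[k];
        for (int k = 0; k < PROG_LEN; k++) {
            int t = a[prog[k].i];
            a[prog[k].i] = a[prog[k].j];
            a[prog[k].j] = t;
        }
        int score = 0;
        for (int k = 0; k + 1 < N; k++)
            if (a[k] <= a[k + 1])
                score++;
        return score;
    }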
In theory, a bunch of monkeys banging on a typewriter for an infinite amount of time will output all the works of Shakespeare. In practice, it isn't a practical way to write literature. Just because genetic algorithms can solve optimization problems doesn't mean that it's easy or even necessarily a good way to do it.
The biggest selling point of genetic algorithms, as you say, is that they are dirt simple. They don't have the best performance or mathematical background, but even if you have no idea how to solve your problem, as long as you can define it as an optimization problem you will be able to turn it into a GA.
Programs aren't really suited for GAs precisely because code isn't good chromosome material. I have seen someone do something similar with (simpler) machine code instead of Python (although it was more of an ecosystem simulation than a GA per se), and you might have better luck if you encode your programs using automata / LISP or something like that.
On the other hand, given how alluring GA's are and how basically everyone who looks at them asks this same question, I'm pretty sure there are already people who tried this somewhere - I just have no idea if any of them succeeded.
Good luck with that.
Sure, you could write a "mutation" program that reads a program and randomly adds, deletes, or changes some number of characters. Then you could compile the result and see if the output is better than the original program. (However we define and measure "better".) Of course 99.9% of the time the result would be compile errors: syntax errors, undefined variables, etc. And surely most of the rest would be wildly incorrect.
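A toy sketch of such a mutator in C (purely illustrative; the 50-character set matches the estimate used below):

    #include <stdlib.h>

    /* 26 letters + 10 digits + 14 special characters = 50. */
    static const char charset[] =
        "abcdefghijklmnopqrstuvwxyz0123456789+-*/=()<>.,; \n";

    /* Randomly overwrite n_changes characters of a source buffer,
       as described above. Most results won't even compile. */
    static void mutate(char *src, size_t len, int n_changes)
    {
        for (int k = 0; k < n_changes; k++) {
            size_t pos = (size_t)rand() % len;
            src[pos] = charset[rand() % (sizeof charset - 1)];
        }
    }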
Try some very simple problem. Say, start with a program that reads in two numbers, adds them together, and outputs the sum. Let's say that the goal is a program that reads in three numbers and calculates the sum. Just how long and complex such a program would be of course depends on the language. Let's say we have some very high level language that lets us read or write a number with just one line of code. Then the starting program is just 4 lines:
read x
read y
total=x+y
write total
The simplest program to meet the desired goal would be something like
read x
read y
read z
total=x+y+z
write total
So through a random mutation, we have to add "read z" and "+z", a total of 9 characters including the space and the new-line. Let's make it easy on our mutation program and say it always inserts exactly 9 random characters, that they're guaranteed to be in the right places, and that it chooses from a character set of just 26 letters plus 10 digits plus 14 special characters = 50 characters. What are the odds that it will pick the correct 9 characters? 1 in 50^9 = 1 in 2.0e15. (Okay, the program would work if instead of "read z" and "+z" it inserted "read w" and "+w", but then I'm making it easy by assuming it magically inserts exactly the right number of characters and always inserts them in the right places. So I think this estimate is still generous.)
1 in 2.0e15 is a pretty small probability. Even if the program runs a thousand times a second, and you can test the output that quickly, the chance is still just 1 in 2.0e12 per second, or 1 in 5.4e8 per hour, 1 in 2.3e7 per day. Keep it running for a year and the chance of success is still only 1 in 62,000.
Even a moderately competent programmer should be able to make such a change in, what, ten minutes?
Note that changes must come in at least "packets" that are correct. That is, if a mutation generates "reax z", that's only one character away from "read z", but it would still produce compile errors, and so would fail.
Likewise adding "read z" but changing the calculation to "total=x+y+w" is not going to work. Depending on the language, you'll either get errors for the undefined variable or at best it will have some default value, like zero, and give incorrect results.
You could, I suppose, theorize incremental solutions. Maybe one mutation adds the new read statement, then a future mutation updates the calculation. But without the calculation, the additional read is worthless. How will the program be evaluated to determine that the additional read is "a step in the right direction"? The only way I see to do that is to have an intelligent being read the code after each mutation and see if the change is making progress toward the desired goal. And if you have an intelligent designer who can do that, that must mean that he knows what the desired goal is and how to achieve it. At which point, it would be far more efficient to just make the desired change rather than waiting for it to happen randomly.
And this is an exceedingly trivial program in a very easy language. Most programs are, what, hundreds or thousands of lines, all of which must work together. The odds against any random process writing a working program are astronomical.
There might be ways to do something that resembles this in some very specialized application, where you are not really making random mutations, but rather making incremental modifications to the parameters of a solution. Like, we have a formula with some constants whose values we don't know. We know what the correct results are for some small set of inputs. So we make random changes to the constants, and if the result is closer to the right answer, change from there, if not, go back to the previous value. But even at that, I think it would rarely be productive to make random changes. It would likely be more helpful to try changing the constants according to a strict formula, like start with changing by 1000's, then 100's then 10's, etc.
I just want to give you a suggestion. I don't know how successful you'd be, but perhaps you could try to evolve a Core War bot with genetic programming. Your fitness function is easy: just let the bots compete in a game. You could start with well-known bots and perhaps a few random ones, then wait and see what happens.

Probabilistic logic vs. analog

There are research articles (e.g. Chakrapani & Palem) and devices (e.g. Lyric) that use a so-called probabilistic logic. I suppose the idea is that the outputs of such a device, given some inputs, will converge to some probability distribution. What is the difference between these devices and those using analog signals? That is, are these devices still considered digital, analog, mixed-signal?
This paper seems to describe some novel kind of (probabilistic) boolean logic, and it is not about implementation. I only skimmed through the paper, but it seems to be just another one of those theories. There is, by the way, a simple reason why probabilistic logics don't give you what classical logics give you, namely, they are not truth functional (i.e. the value of A & B does not depend solely on the value of A and the value of B: if A and B each hold with probability 0.5, then A & B has probability 0.25 when they are independent but 0 when B is the negation of A).
As for implementing such a thing on a chip: I think both are possible. If you do it digitally, then you're calculating probabilities, and you can just as well run some code on a CPU. I don't really know about analog implementations, but I guess any elementary analog component (transistor, opamp etc) can be seen as performing some kind of basic arithmetic operation on voltages and currents. Whether a circuit gives outputs that adhere to, or approximate, the Kolmogorov laws of probability, that's another question, but my guess is: it is somehow possible and maybe it has been done.

Safe mixing of entropy sources

Let us assume we're generating very large (e.g. 128- or 256-bit) numbers to serve as keys for a block cipher.
Let us further assume that we wear tinfoil hats (at least when outside).
Being so paranoid, we want to be sure of our available entropy, but we don't entirely trust any particular source. Maybe the government is rigging our coins. Maybe these dice are ever so subtly weighted. What if the hardware interrupts feeding into /dev/random are just a little too consistent? (Besides being paranoid, we're lazy enough that we don't want to generate it all by hand...)
So, let's mix them all up.
What are the secure method(s) for doing this? Presumably just concatenating a few bytes from each source isn't entirely secure -- if one of the sources is biased, it might, in theory, lend itself to such things as a related-key attack, for example.
Is running SHA-256 over the concatenated bytes sufficient?
(And yes, at some point soon I am going to pick up a copy of Cryptography Engineering. :))
Since you mention /dev/random -- on Linux at least, /dev/random is fed by an algorithm that does very much what you're describing. It takes several variously-trusted entropy sources and mixes them into an "entropy pool" using a polynomial function -- for each new byte of entropy that comes in, it's xor'd into the pool, and then the entire pool is stirred with the mixing function.

When it's desired to get some randomness out of the pool, the entire pool is hashed with SHA-1 to get the output, then the pool is mixed again (and actually there's some more hashing, folding, and mutilating going on to make sure that reversing the process is about as hard as reversing SHA-1).

At the same time, there's a bunch of accounting going on -- each time some entropy is added to the pool, an estimate of the number of bits of entropy it's worth is added to the account, and each time some bytes are extracted from the pool, that number is subtracted, and the random device will block (waiting on more external entropy) if the account would go below zero. Of course, if you use the "urandom" device, the blocking doesn't happen and the pool simply keeps getting hashed and mixed to produce more bytes, which turns it into a PRNG instead of an RNG.
Anyway... it's actually pretty interesting and pretty well commented -- you might want to study it. drivers/char/random.c in the linux-2.6 tree.
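A toy sketch of the pool idea in C (not the kernel's actual code; it stands in OpenSSL's SHA-256 for the extraction step and plain XOR for mixing):

    #include <openssl/sha.h>
    #include <stddef.h>

    #define POOL_SIZE 128

    static unsigned char pool[POOL_SIZE];
    static size_t pos;

    /* XOR new entropy into the pool at a rotating position. */
    static void pool_mix(const unsigned char *in, size_t len)
    {
        for (size_t i = 0; i < len; i++) {
            pool[pos] ^= in[i];
            pos = (pos + 1) % POOL_SIZE;
        }
    }

    /* Hash the whole pool to extract output, then fold the
       output back in so the pool state moves forward. */
    static void pool_extract(unsigned char out[SHA256_DIGEST_LENGTH])
    {
        SHA256(pool, POOL_SIZE, out);
        pool_mix(out, SHA256_DIGEST_LENGTH);
    }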
Using a hash function is a good approach - just make sure you underestimate the amount of entropy each source contributes, so that if you are right about one or more of them being less than totally random, you haven't weakened your key unduly.
This isn't dissimilar to the approach used in key stretching (though you have no need for multiple iterations here).
I've done this before, and my approach was just to XOR them, byte-by-byte, against each other.
Running them through some other algorithm, like SHA-256, is terribly inefficient, so it's not practical, and I think it would not be really useful and possibly even harmful.
If you do happen to be incredibly paranoid, and have a tiny bit of money, it might be fun to buy a "true" (depending on how convinced you are by Quantum Mechanics) Quantum Random Number Generator.
-- Edit:
FWIW, I think the method I describe above (or something similar) is effectively a One-Time Pad from the point of view of either source, assuming one of them is random, and therefore unattackable assuming they are independent (even if one of them is out to get you). I'm happy to be corrected on this if someone takes issue with it, and I encourage anyone not taking issue with it to question it anyway and find out for yourself.
If you have a source of randomness but you're not sure whether it is biased or not, then there are a lot of different algorithms. Depending on how much work you want to do, the amount of entropy wasted from the original source differs.
The easiest algorithm is the (improved) von Neumann algorithm. You can find the details in this PDF, on page 27:
http://security1.win.tue.nl/~bskoric/physsec/files/PhysSec_LectureNotes.pdf
I also recommend reading this document if you're interested in how to produce uniform randomness from a given source, how true random number generators work, etc.!
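A minimal sketch of the basic (unimproved) von Neumann extractor in C, with next_raw_bit() as a placeholder for your possibly-biased source: read bits in pairs, emit 0 for (0,1), 1 for (1,0), and discard (0,0) and (1,1).

    #include <stdlib.h>

    /* Stand-in biased source: returns 1 with probability ~0.75.
       Replace with your real (possibly biased) bit source. */
    static int next_raw_bit(void)
    {
        return (rand() % 4) ? 1 : 0;
    }

    /* For independent, identically biased bits, P(0 then 1) equals
       P(1 then 0), so keeping only unequal pairs gives unbiased
       output, at the cost of discarding most of the input. */
    static int von_neumann_bit(void)
    {
        for (;;) {
            int a = next_raw_bit();
            int b = next_raw_bit();
            if (a != b)
                return a;
        }
    }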
