Is there a way to generate random bytes for crypto in Go deterministically, from a high-entropy seed?
I found crypto/rand, which is safe for crypto but not deterministic.
I found math/rand, which can be initialized with a seed, but is not safe for crypto.
I found x/crypto/chacha20, and I was wondering if it would be safe to use XORKeyStream with a src value of 1s. The seed would be the key and nonce, which could be generated with crypto/rand.
Edit
As an example of what I'm after, cryptonite, which is the main Haskell crypto library, has a function drgNewSeed that you can use to make a random generator from a seed.
Yes, XORKeyStream would be fine for this, and is a good design for a CSPRNG. The entire point of a stream cipher is that it generates a stream of "effectively random" values given a seed (the key and the IV). Those stream values are then XORed with the plaintext. "Effectively random" in this context means that there is no "efficient algorithm" (one that runs in polynomial time) that can distinguish this sequence from a "truly random" sequence. And that's what you want.
There's no need to pull in ChaCha20, though. You can use a built-in cipher like AES. Any block cipher can be converted into a stream cipher using one of several modes, such as CTR, OFB, or CFB. The differences between these modes don't really matter for this problem.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/hex"
	"fmt"
)

func main() {
	// Define some seed, split across a "key" and an "iv"
	key, err := hex.DecodeString("6368616e676520746869732070617373")
	if err != nil {
		panic(err)
	}
	iv, err := hex.DecodeString("0123456789abcdef0123456789abcdef")
	if err != nil {
		panic(err)
	}
	// We can turn a block cipher into a stream cipher, and AES is handy
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	// Convert the block cipher into a stream cipher using a streaming mode like CTR
	// (OFB or CFB would work, too)
	stream := cipher.NewCTR(block, iv)
	for x := 0; x < 10; x++ {
		// Create a zeroed buffer of the size you want
		value := []byte{0}
		// XOR it with the keystream, which turns it into a pseudorandom value
		stream.XORKeyStream(value, value)
		fmt.Printf("%d\n", value)
	}
}
There are several other approaches you can use here. You can use a secure hash like SHA-256 to hash a counter (pick a random 128-bit number and keep incrementing it, hashing each value). Or you could hash the previous result (I have heard a little bit of controversy over whether it's possible for repeated hashes to impact the security of the hash. See https://crypto.stackexchange.com/questions/19392/any-weakness-when-performing-sha-256-repeatedly and https://crypto.stackexchange.com/questions/15481/will-repeated-rounds-of-sha-512-provide-random-numbers/15488 for more.)
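As a rough illustration of the hash-a-counter idea, here is a minimal sketch. It hashes seed || counter rather than incrementing a random 128-bit number, which is a small variation on what is described above; the function name hashCounterBytes and the 8-byte big-endian counter encoding are my own assumptions, not a standard construction.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// hashCounterBytes (hypothetical helper) returns n deterministic bytes by
// hashing seed || counter with SHA-256 for counter = 0, 1, 2, ... and
// concatenating the digests.
func hashCounterBytes(seed []byte, n int) []byte {
	out := make([]byte, 0, n+sha256.Size)
	var ctr [8]byte
	for i := uint64(0); len(out) < n; i++ {
		binary.BigEndian.PutUint64(ctr[:], i)
		h := sha256.New()
		h.Write(seed)
		h.Write(ctr[:])
		out = h.Sum(out) // appends the 32-byte digest to out
	}
	return out[:n]
}

func main() {
	// Demo seed only; in practice use high-entropy bytes, e.g. from crypto/rand.
	seed := []byte("example seed")
	fmt.Printf("%x\n", hashCounterBytes(seed, 40))
}
```

The same seed always yields the same bytes, and asking for fewer bytes yields a prefix of the longer output.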
You can also use a block cipher to do the same thing, by encrypting a counter or the previous output. That's pretty close to what the stream cipher modes are doing. You can just do it by hand, too.
If you want to dig into this, you might search for "csprng stream cipher" at crypto.stackexchange.com. That's a better place to ask for crypto advice, but IMO this is a programming-specific question, so it does belong here.
Random number generation starts with a limited amount of randomness harvested from the computer's unpredictable physical elements. This physical randomness is then cleaned of possible biases, for example by hashing, and the resulting smaller amount of true randomness is stretched into a long output by a pseudorandom number generator (PRNG).
PRNGs are deterministic, which means that if the initial value (the initial true randomness, i.e. the seed) is known, then all of the output is known. Always keep the seed secret!
All is not lost, though: an important design goal of a cryptographically secure PRNG is that no output should be predictable from any other output. This is a strong requirement; it implies that it must be infeasible to learn the internal state just by looking at the outputs.
Go's crypto/rand uses the underlying system functionalities to get the physical randomness.
On Linux and FreeBSD, Reader uses getrandom(2) if available, /dev/urandom otherwise. On OpenBSD, Reader uses getentropy(2). On other Unix-like systems, Reader reads from /dev/urandom. On Windows systems, Reader uses the CryptGenRandom API. On Wasm, Reader uses the Web Crypto API.
Then one can use a good deterministic random bit generator (DRBG) such as Hash_DRBG, HMAC_DRBG, or CTR_DRBG, as defined in NIST SP 800-90A.
You can use x/crypto/chacha20 to generate a long deterministic random sequence. Keep the key and nonce fixed and secret, and you will have a deterministic generator. It is very fast, and seekable, too.
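If you want something that plugs in wherever an io.Reader is expected, a keystream can be wrapped in a small adapter. The sketch below uses AES-CTR from the standard library instead of x/crypto/chacha20 purely to stay dependency-free; the names streamReader and newDeterministicReader are hypothetical, and this is not a NIST DRBG.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
	"io"
)

// streamReader adapts a cipher.Stream to io.Reader by encrypting zeros,
// so each Read returns the next bytes of the keystream.
type streamReader struct {
	s cipher.Stream
}

func (r streamReader) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	r.s.XORKeyStream(p, p)
	return len(p), nil
}

// newDeterministicReader (hypothetical helper) treats key and iv together
// as the seed. key must be 16, 24 or 32 bytes; iv must be 16 bytes.
func newDeterministicReader(key, iv []byte) (io.Reader, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	if len(iv) != block.BlockSize() {
		return nil, fmt.Errorf("iv must be %d bytes", block.BlockSize())
	}
	return streamReader{cipher.NewCTR(block, iv)}, nil
}

func main() {
	// All-zero demo seed; a real seed must be secret and high-entropy.
	key := make([]byte, 32)
	iv := make([]byte, 16)
	r, err := newDeterministicReader(key, iv)
	if err != nil {
		panic(err)
	}
	buf := make([]byte, 16)
	if _, err := io.ReadFull(r, buf); err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", buf)
}
```

Two readers built from the same key and iv return the same byte sequence, regardless of how the reads are chunked.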
I've been thinking about this as a thought experiment to try and understand some hashing concepts. Consider the requirement for a say 128 bit hash function (i.e., its output is exactly 128 bits in length).
A. You might look at something like MD5. So you input your data to be hashed, and out pops a 128 bit number.
B. Alternatively, you find a magical pseudo random number generator (PRNG). Some sort of Frankenstein version of the Twister. It seeds itself from all of your input data to be hashed, and has an internal state size >> 128 bits. You then generate 128 pseudo random bits as output.
It seems to me that both A and B effectively produce an output that is determined solely by the input data. Are these two approaches therefore equivalent?
Supplemental:
Some feedback has suggested that the two approaches in my scenario might not be equivalent in terms of security. If the pseudo random number generator were to be something like Java's SecureRandom (which uses SHA-1), seeded from the input data, then might A <=> B?
If you seed a PRNG with your input data and then extract 128 bits of random data from it, then you effectively leave the hashing to the PRNG seed function, and the size of the hash that it generates will be the size of the PRNG state buffer.
However, if the state of the PRNG is larger than the 128 bits you extract as a hash, then there's a risk that some of the input data used for the seed won't have any effect on the bits of the PRNG state that you extract. This makes it a really bad hash, so you don't want to do that.
PRNG seed functions are typically very weak hashes, because hashing is not their business. They're almost certainly insecure (which you did not ask about), and separate from that they're usually quite weak at avalanching. A strong hash typically tries to ensure that every bit of input has a fair chance of affecting every bit of output. Insecure hashes typically don't worry that they'll fail at this if the input data is too short, but a PRNG seed will often make no effort at all.
Cryptographic hash functions are designed to make it hard to create an input that generates a specific hash, and/or to make it hard to create two inputs that generate the same hash.
If something is designed as a random number generating algorithm, then this was not one of the requirements for the design. So if something is "just" a random number generator, there is no guarantee that it satisfies these important constraints on a cryptographic hashcode. So in that sense, they are not equivalent.
Of course there may be random number generating algorithms that were also designed as cryptographic hashing algorithms, and in that case (if the implementation did a good job at satisfying the requirements) they may be equivalent.
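To make the concern about seed functions concrete, here is a small sketch of approach B with a deliberately poor seed function. Both weakSeed and prngHash are hypothetical names invented for this illustration, and math/rand stands in for "some PRNG"; real seed functions are usually better than a byte sum, but nothing obliges them to be collision resistant.

```go
package main

import (
	"fmt"
	"math/rand"
)

// weakSeed is a hypothetical, deliberately poor seed function:
// it just sums the input bytes, so byte order is ignored.
func weakSeed(data []byte) int64 {
	var s int64
	for _, b := range data {
		s += int64(b)
	}
	return s
}

// prngHash is approach B from the question: seed a PRNG from the
// input, then read 128 bits of its output as the "hash".
func prngHash(data []byte) [16]byte {
	r := rand.New(rand.NewSource(weakSeed(data)))
	var out [16]byte
	for i := range out {
		out[i] = byte(r.Intn(256))
	}
	return out
}

func main() {
	// "ab" and "ba" produce the same seed, hence the same "hash".
	fmt.Printf("%x\n%x\n", prngHash([]byte("ab")), prngHash([]byte("ba")))
}
```

However good the generator is after seeding, the output can be no better as a hash than the step that turns the input into a seed.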
I am looking for a fast encryption/decryption algorithm to be used against spam.
I don't know enough about this field to try and make my own, and in any case, I understand that it would be a bad idea to use something new, so I need some suggestions.
I have looked around SO and tried google but most of the results were explaining how encryption/decryption is slow in order to be hard to break, which I understand, but there are cases when the data expires rapidly and the secret key(salt?) can change very fast, so a fast algorithm would be very useful.
Look at this article on block ciphers. Here is how you can make your own cipher:
Encryption:
Store your own private data, preferably randomly generated for each cipher.
Use your private data as a seed in a pseudorandom number generator. Produce a string of bits as long as the data you want to encode, a.k.a. the plaintext. This string of "random" bits is the key.
For each bit of the key, take the corresponding bit from the plaintext, which we will call a and b respectively. The XOR of the two yields the corresponding bit in the ciphertext.
Use the ciphertext as you wish.
Decryption:
Take the ciphertext and retrieve the private data for it.
Use the private data as a seed in the same pseudorandom number generator to produce the key from before.
Follow the steps above to get the plaintext instead of the ciphertext.
Example:
// ENCODE
plaintext (in bits) = 00100001111110
key (from pseudo-random number generator) = 10101110110101
ciphertext (XOR each bit) = 10001111001011
// DECODE
ciphertext = 10001111001011
key (from pseudo-random number generator) = 10101110110101
plaintext = 00100001111110
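The XOR step in the example above can be checked with a few lines of Go. This is only a sketch of the bitwise XOR itself; the helper name xorBits is made up, and it does not generate the key from a PRNG.

```go
package main

import (
	"fmt"
	"strconv"
)

// xorBits (hypothetical helper) XORs two equal-length bit strings of up
// to 64 bits and returns the result as a bit string of the same length.
func xorBits(a, b string) (string, error) {
	if len(a) != len(b) {
		return "", fmt.Errorf("length mismatch: %d vs %d", len(a), len(b))
	}
	x, err := strconv.ParseUint(a, 2, 64)
	if err != nil {
		return "", err
	}
	y, err := strconv.ParseUint(b, 2, 64)
	if err != nil {
		return "", err
	}
	// %0*b pads with leading zeros to the original width.
	return fmt.Sprintf("%0*b", len(a), x^y), nil
}

func main() {
	plaintext := "00100001111110"
	key := "10101110110101"

	ciphertext, err := xorBits(plaintext, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(ciphertext) // 10001111001011

	recovered, err := xorBits(ciphertext, key)
	if err != nil {
		panic(err)
	}
	fmt.Println(recovered) // 00100001111110
}
```

Applying the same key twice cancels out, which is why the one function serves for both encoding and decoding.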
Is there a simple algorithm to encrypt integers? That is, a function E(i,k) that accepts an n-bit integer and a key (of any type) and produces another, unrelated n-bit integer that, when fed into a second function D(E(i,k),k) (along with the key), produces the original integer?
Obviously there are some simple reversible operations you can perform, but they all seem to produce clearly related outputs (e.g. consecutive inputs lead to consecutive outputs). Also, of course, there are cryptographically strong standard algorithms, but they don't produce small enough outputs (e.g. 32-bit). I know any 32-bit cryptography can be brute-forced, but I'm not looking for something cryptographically strong, just something that looks random. Theoretically speaking it should be possible; after all, I could just create a dictionary by randomly pairing every integer. But I was hoping for something a little less memory-intensive.
Edit: Thanks for the answers. Simple XOR solutions will not work because similar inputs will produce similar outputs.
Would not this amount to a Block Cipher of block size = 32 bits ?
Not very popular, because it's easy to break. But theoretically feasible.
Here is one implementation in Perl :
http://metacpan.org/pod/Crypt::Skip32
UPDATE: See also Format preserving encryption
UPDATE 2: RC5 supports block sizes of 32, 64, or 128 bits
I wrote an article some time ago about how to generate a 'cryptographically secure permutation' from a block cipher, which sounds like what you want. It covers using folding to reduce the size of a block cipher, and a trick for dealing with non-power-of-2 ranges.
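For a feel of what a 32-bit block cipher can look like, here is a minimal sketch of a balanced Feistel network over two 16-bit halves. It is not Skip32, RC5, or the construction from the article; the round count, the HMAC-SHA256 round function, and the names roundF, encrypt32 and decrypt32 are all assumptions made for illustration, and a 32-bit block is brute-forceable no matter how it is built.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

const rounds = 8 // arbitrary choice for this sketch

// roundF derives a 16-bit value from the key, the round number and one half.
func roundF(key []byte, round int, half uint16) uint16 {
	mac := hmac.New(sha256.New, key)
	var in [3]byte
	in[0] = byte(round)
	binary.BigEndian.PutUint16(in[1:], half)
	mac.Write(in[:])
	return binary.BigEndian.Uint16(mac.Sum(nil)) // first two digest bytes
}

// encrypt32 maps a 32-bit integer to another 32-bit integer.
func encrypt32(key []byte, v uint32) uint32 {
	l, r := uint16(v>>16), uint16(v)
	for i := 0; i < rounds; i++ {
		l, r = r, l^roundF(key, i, r)
	}
	return uint32(l)<<16 | uint32(r)
}

// decrypt32 undoes encrypt32 by running the rounds in reverse.
func decrypt32(key []byte, v uint32) uint32 {
	l, r := uint16(v>>16), uint16(v)
	for i := rounds - 1; i >= 0; i-- {
		l, r = r^roundF(key, i, l), l
	}
	return uint32(l)<<16 | uint32(r)
}

func main() {
	key := []byte("example key")
	for v := uint32(0); v < 3; v++ {
		c := encrypt32(key, v)
		fmt.Printf("%d -> %08x -> %d\n", v, c, decrypt32(key, c))
	}
}
```

Because every round is invertible, the mapping is a permutation of the 32-bit integers, so distinct inputs never collide and no lookup table is needed.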
A simple one:
rand = new Random(k);
return (i xor rand.Next())
(the point of XORing with rand.Next() rather than with k is that otherwise, given i and E(i,k), you can get k by k = i xor E(i,k))
Ayden is an algorithm that I developed. It is compact, fast and looks very secure. It is currently available for 32 and 64 bit integers. It is in the public domain and you can get it from http://github.com/msotoodeh/integer-encoder.
You could take an n-bit hash of your key (assuming it's private) and XOR that hash with the original integer to encrypt, and with the encrypted integer to decrypt.
Probably not cryptographically solid, but depending on your requirements, may be sufficient.
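A quick sketch of the hash-and-XOR idea, which also shows the limitation raised in the question's edit: with a fixed mask, inputs that differ in one bit produce outputs that differ in exactly that bit. The names mask32 and xorCrypt, and the choice of truncated SHA-256, are assumptions for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// mask32 derives a 32-bit mask from the key by truncating its SHA-256 hash.
func mask32(key string) uint32 {
	sum := sha256.Sum256([]byte(key))
	return binary.BigEndian.Uint32(sum[:4])
}

// xorCrypt both encrypts and decrypts: XORing twice with the same mask
// returns the original value.
func xorCrypt(i uint32, key string) uint32 {
	return i ^ mask32(key)
}

func main() {
	key := "example key"
	c0 := xorCrypt(1000, key)
	c1 := xorCrypt(1001, key)

	fmt.Println(xorCrypt(c0, key)) // 1000
	// The mask cancels out, leaking the XOR of the two inputs.
	fmt.Println(c0 ^ c1) // 1
}
```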
If you just want it to look random and don't care about security, how about just swapping bits around? You could simply reverse the bit string, so the high bit becomes the low bit, the second highest becomes the second lowest, etc., or you could do some other random permutation (e.g. 1 to 4, 2 to 7, 3 to 1, etc.).
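Bit reversal is a one-liner in Go via math/bits. This small sketch (the wrapper name scramble is made up) shows that it is its own inverse, and also that the outputs stay clearly related to the inputs, so it is only cosmetic.

```go
package main

import (
	"fmt"
	"math/bits"
)

// scramble reverses the bit order of a 32-bit integer.
// Applying it twice gives back the original value.
func scramble(i uint32) uint32 {
	return bits.Reverse32(i)
}

func main() {
	for _, i := range []uint32{1, 2, 3} {
		e := scramble(i)
		fmt.Printf("%d -> %#x -> %d\n", i, e, scramble(e))
	}
}
```

Here 1 maps to 0x80000000, 2 to 0x40000000 and 3 to 0xc0000000, so anyone looking at a few consecutive values can spot the pattern.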
How about XORing it with a prime or two? Swapping bits around seems very random when trying to analyze it.
Try something along the lines of XORing it with a prime and itself after bit shifting.
How many integers do you want to encrypt? How much key data do you want to have to deal with?
If you have few items to encrypt, and you're willing to deal with key data that's just as long as the data you want to encrypt, then the one-time-pad is super simple (just an XOR operation) and mathematically unbreakable.
The drawback is that the problem of keeping the key secret is about as large as the problem of keeping your data secret.
It also has the flaw (that is run into time and again whenever someone decides to try to use it) that if you take any shortcuts - like using a non-random key or the common one of using a limited length key and recycling it - that it becomes about the weakest cipher in existence. Well, maybe ROT13 is weaker.
But in all seriousness, if you're encrypting an integer, what are you going to do with the key no matter which cipher you decide on? Keeping the key secret will be a problem about as big (or bigger) than keeping the integer secret. And if you're encrypting a bunch of integers, just use a standard, peer reviewed cipher like you'll find in many crypto libraries.
RC4 will produce as little output as you want, since it's a stream cipher.
XOR it with /dev/random
Is there a well-known (and well-regarded) algorithm that can encrypt/decrypt any arbitrary byte inside a file based on the password entered and the offset inside the file?
(Databyte, Offset, Password) => EncryptedByte
(EncryptedByte, Offset, Password) => DataByte
And is there some fundamental weakness in this approach, or is it still theoretically possible to make it strong enough?
Update:
More details: Any cryptographic algorithm has an input and an output. Many existing ones operate on large blocks of input. I want to operate on only one byte, but a system based on that alone can only remap bytes and is weak by default. However, if we take the position of this byte in the file, we can, for example, interpret the bits of the position value as the operation to perform at each step (0: xor, 1: shift) and create the encrypted byte that way. But that's too simple; I'm looking for something stronger.
Maybe it's not very efficient but how about this:
for encryption use:
encryptedDataByte = Encrypt(offset,key) ^ dataByte
for decryption use:
dataByte = Encrypt(offset,key) ^ encryptedDataByte
Where Encrypt(offset,key) might be e.g. 3DES or AES (with padding the offset, if needed, and throwing away all but one result bytes)
If you can live with a block size of 16 bytes, you can try the XTS mode described in the Wikipedia article about Disk encryption theory (the advantage being that some good cryptologists already looked at it).
If you really need byte-wise encryption, I doubt that there is an established solution. In the conference Crypto 2009 there was a talk about How to Encipher Messages on a Small Domain: Deterministic Encryption and the Thorp Shuffle. In your case the domain is a byte, and as this is a power of 2, a Thorp Shuffle corresponds to a maximally unbalanced Feistel network. Maybe one can build something using the position and the password as key, but I'd be surprised if a home-made solution will be secure.
You can use AES in Counter Mode where you divide your input into blocks of 16 bytes (128 bits) and then basically encrypt a counter on the block number to get a pseudo-random 16 bytes that you can XOR with the plaintext. It is critically important never to reuse the same counter start value (and/or initialization vector) with the same key, or you will open yourself to an easy attack: the keystream repeats, so an attacker can XOR two ciphertexts together to cancel it out and obtain the XOR of the two plaintexts.
You mention that you want to only operate on individual bytes, but this approach would give you that flexibility. Output Feedback Mode is another common one, but you have to be careful in its use.
You might consider using the EAX mode for better security. Also, make sure you're using something like PBKDF2 or scrypt to generate your encryption key from the password.
However, as with most cryptography related issues, it's much better to use a rigorously tested and evaluated library rather than rolling your own.
Basically what you need to do is generate some value X (probably 1 byte) based on the offset and password, and use this to encrypt/decrypt the byte at that offset. We'll call it
X = f(offset,password)
The problem is that an attacker that "knows something" about the file contents (e.g. the file is English text, or a JPEG) can come up with an estimate (or sometimes be certain) of what an X could be. So he has a "rough idea" about many X values, and for each of these he knows what the offset is. There is a lot of information available.
Now, it would be nice if all that information were of little use to the attacker. For most purposes, using a cryptographic hash function (like SHA-1) will give you a reasonable assurance of decent security.
But I must stress that if this is something critical, consult an expert.
One possibility is a One Time Pad, possibly using the password to seed some pseudo-random number generator. One time pads theoretically achieve perfect secrecy, but there are some caveats. It should do what you're looking for though.
I came across an article about car remote entry systems at http://auto.howstuffworks.com/remote-entry2.htm. In the third bullet, the author says:
Both the transmitter and the receiver use the same pseudo-random number generator. When the transmitter sends a 40-bit code, it uses the pseudo-random number generator to pick a new code, which it stores in memory. On the other end, when the receiver receives a valid code, it uses the same pseudo-random number generator to pick a new one. In this way, the transmitter and the receiver are synchronized. The receiver only opens the door if it receives the code it expects.
Is it possible to have two PRNG functions producing same random numbers at the same time?
In PRNG functions, the output of the function is dependent on a 'seed' value, such that the same output will be provided from successive calls given the same seed value. So, yes.
An example (using C#) would be something like:
// Provide the same seed value for both generators:
System.Random r1 = new System.Random(1);
System.Random r2 = new System.Random(1);
// Will output 'True'
Console.WriteLine(r1.Next() == r2.Next());
This is all of course dependent on the random number generator using some sort of deterministic formula to generate its values. If you use a so-called 'true random' number generator that uses properties of entropy or noise in its generation, then it would be very difficult to produce the same values given some input, unless you're able to duplicate the entropic state for both calls into the function - which, of course, would defeat the purpose of using such a generator...
In the case of remote keyless entry systems, they very likely use a PRNG function that is deterministic in order to take advantage of this feature. There are many ICs that provide this sort of functionality to produce random numbers for electronic circuits.
Edit: upon request, here is an example of a non-deterministic random number generator which doesn't rely upon a specified seed value: Quantum Random Number Generator. Of course, as freespace points out in the comments, this is not a pseudorandom number generator, since it generates truly random numbers.
Most PRNGs have an internal state in the form of a seed, which they use to generate their next values. The internal logic goes something like this:
nextNumber = function(seed);
seed = nextNumber;
So every time you generate a new number, the seed is updated. If you give two PRNGs that use the same algorithm the same seed, function(seed) is going to evaluate to the same number (given that they are deterministic, which most are).
Applied to your question directly: the transmitter picks a code, and uses it as a seed. The receiver, after receiving it, uses this to seed its generator. Now the two are aligned, and they will generate the same values.
As Erik and Claudiu have said, as long as you seed your PRNG with the same value you'll end up with the same output.
An example can be seen when using AES (or any other encryption algorithm) as the basis of your PRNG. As long as you keep using inputs that match on both devices (transmitter and receiver), the outputs will also match.