A fast encryption/decryption algorithm that will NOT be used for security, but to combat spam - performance

I am looking for a fast encryption/decryption algorithm to be used against spam.
I don't know enough about this field to try and make my own, and in any case, I understand that it would be a bad idea to use something new, so I need some suggestions.
I have looked around SO and tried Google, but most of the results explained how encryption/decryption is deliberately slow in order to be hard to break. I understand that, but there are cases where the data expires rapidly and the secret key (salt?) can change very quickly, so a fast algorithm would be very useful.

Look at this article on block ciphers. Here is how you can make your own cipher:
Encryption:
Store your own private data, preferably randomly generated for each message you encrypt.
Use your private data as the seed of a pseudorandom number generator and produce a string of bits as long as the data you want to encode, a.k.a. the plaintext. This string of "random" bits is the key.
XOR each bit of the key with the corresponding bit of the plaintext. The result is the corresponding bit of the ciphertext.
Use the ciphertext as you wish.
Decryption:
Take the ciphertext and retrieve the private data for it.
Use the private data as a seed in the same pseudorandom number generator to produce the key from before.
XOR the key with the ciphertext, exactly as in the encryption step. This time the result is the plaintext instead of the ciphertext.
Example:
// ENCODE
plaintext (in bits) = 00100001111110
key (from pseudo-random number generator) = 10101110110101
ciphertext (XOR each bit) = 10001111001011
// DECODE
ciphertext = 10001111001011
key (from pseudo-random number generator) = 10101110110101
plaintext = 00100001111110
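The steps above can be sketched in Go roughly as follows. This is only an illustration: the function name and the seed value are made up for the example, and math/rand stands in for "a pseudorandom number generator". It is not cryptographically secure, so treat this as obfuscation at best.

```go
package main

import (
	"fmt"
	"math/rand"
)

// xorWithSeed XORs data with a keystream drawn from a PRNG seeded with seed.
// Because XOR is its own inverse, the same call encrypts and decrypts.
// NOTE: math/rand is NOT cryptographically secure; it is only a stand-in
// for the pseudorandom number generator mentioned in the steps.
func xorWithSeed(seed int64, data []byte) []byte {
	prng := rand.New(rand.NewSource(seed))
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = b ^ byte(prng.Intn(256)) // one keystream byte per data byte
	}
	return out
}

func main() {
	const seed = 42 // hypothetical private data
	ciphertext := xorWithSeed(seed, []byte("hello"))
	plaintext := xorWithSeed(seed, ciphertext)
	fmt.Printf("%x -> %s\n", ciphertext, plaintext)
}
```

Calling the function twice with the same seed gets the original bytes back, which is the whole trick.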

Related

Deterministic pseudorandom bytes for crypto

Is there a way to generate random bytes for crypto in Go deterministically, from a high-entropy seed?
I found crypto/rand, which is safe for crypto but not deterministic.
I found math/rand, which can be initialized with a seed, but is not safe for crypto.
I found x/crypto/chacha20, and I was wondering if it would be safe to use XORKeyStream with a src value of 1s. The seed would be the key and nonce, which could be generated with crypto/rand.
Edit
As an example of what I'm after, cryptonite, which is the main Haskell crypto library, has a function drgNewSeed that you can use to make a random generator from a seed.
Yes, XORKeyStream would be fine for this, and is a good design for a CSPRNG. The entire point of a stream cipher is that it generates a stream of "effectively random" values given a seed (the key and the IV). Those stream values are then XORed with the plaintext. "Effectively random" in this context means that there is no "efficient algorithm" (one that runs in polynomial time) that can distinguish this sequence from a "truly random" sequence. And that's what you want.
There's no need to pull in ChaCha20, though. You can use a built-in cipher like AES. Any block cipher can be converted into a stream cipher using one of several modes, such as CTR, OFB, or CFB. The differences between these modes don't really matter for this problem.
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/hex"
	"fmt"
)

func main() {
	// Defining some seed, split across a "key" and an "iv"
	key, _ := hex.DecodeString("6368616e676520746869732070617373")
	iv, _ := hex.DecodeString("0123456789abcdef0123456789abcdef")

	// We can turn a block cipher into a stream cipher, and AES is handy
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}

	// Convert block cipher into a stream cipher using a streaming mode like CTR
	// OFB or CFB would work, too
	stream := cipher.NewCTR(block, iv)

	for x := 0; x < 10; x++ {
		// Create a fixed value of the size you want
		value := []byte{0}
		// Transform it to a random value
		stream.XORKeyStream(value, value)
		fmt.Printf("%d\n", value)
	}
}
There are several other approaches you can use here. You can use a secure hash like SHA-256 to hash a counter (pick a random 128-bit number and keep incrementing it, hashing each value). Or you could hash the previous result (I have heard a little bit of controversy over whether it's possible for repeated hashes to impact the security of the hash. See https://crypto.stackexchange.com/questions/19392/any-weakness-when-performing-sha-256-repeatedly and https://crypto.stackexchange.com/questions/15481/will-repeated-rounds-of-sha-512-provide-random-numbers/15488 for more.)
You can also use a block cipher to do the same thing, by encrypting a counter or the previous output. That's pretty close to what the stream cipher modes are doing. You can just do it by hand, too.
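As a rough sketch of the hash-a-counter idea: the version below hashes the seed followed by an incrementing 64-bit counter, which is a slight variation on incrementing a random 128-bit number. The function name counterStream is made up for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// counterStream returns n deterministic bytes by computing
// SHA-256(seed || counter) for counter = 0, 1, 2, ... and concatenating
// the digests.
func counterStream(seed []byte, n int) []byte {
	out := make([]byte, 0, n+sha256.Size)
	var ctr [8]byte
	for i := uint64(0); len(out) < n; i++ {
		binary.BigEndian.PutUint64(ctr[:], i)
		h := sha256.New()
		h.Write(seed)
		h.Write(ctr[:])
		out = h.Sum(out) // Sum appends the digest to out
	}
	return out[:n]
}

func main() {
	fmt.Printf("%x\n", counterStream([]byte("high-entropy seed goes here"), 16))
}
```

The same seed always yields the same bytes, and asking for more bytes only extends the sequence.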
If you want to dig into this, you might search for "csprng stream cipher" at crypto.stackexchange.com. That's a better place to ask for crypto advice, but IMO this is a programming-specific question, so it does belong here.
In random number generation, a limited amount of randomness is first harvested from the computer's unpredictable elements. This physical randomness is then cleaned of possible biases, for example by hashing, and the resulting smaller amount of true randomness is stretched into a much longer sequence by a pseudorandom number generator (PRNG).
PRNGs are deterministic, which means that if the initial value (the initial true randomness, i.e. the seed) is known, then all of the output is known. Always keep it secret!
All is not lost, though: an important design goal of a cryptographically secure PRNG is that no output should be predictable from any other output. This is a strong requirement, implying that it should be infeasible to learn the internal state just by looking at the outputs.
Go's crypto/rand uses the underlying system functionalities to get the physical randomness.
On Linux and FreeBSD, Reader uses getrandom(2) if available, /dev/urandom otherwise. On OpenBSD, Reader uses getentropy(2). On other Unix-like systems, Reader reads from /dev/urandom. On Windows systems, Reader uses the CryptGenRandom API. On Wasm, Reader uses the Web Crypto API.
Then one can use a good deterministic RBG such as Hash_DRBG, HMAC_DRBG, or CTR_DRBG, as defined in NIST SP 800-90A.
You can use x/crypto/chacha20 to generate a long deterministic random sequence. Keep the key and nonce fixed and secret, and you will have a deterministic RBG. It is very fast, and seekable, too.

bcrypt in Go with KDF for specific output key-length

It seems like the Go ecosystem just has a basic bcrypt implementation (golang.org/x/crypto/bcrypt) and it's left as an exercise for the developer to extract the key from the encoded output string to then further expand it to satisfy a particular key length if you're going to be using it as an encryption key rather than just storing it as a password in a DB somewhere. It confounds me that there don't seem to be any quick treatments of this concept online for Go or just in general.
At the risk of introducing a bug by doing it myself, I suspect that I'm gonna be forced to use scrypt, where, at least in Go, it does take an output-length parameter.
Am I missing something? Is there an implementation of bcrypt somewhere in Go that takes a key-length parameter and manages producing a key of acceptable length directly?
Bcrypt is not a key derivation algorithm; it is a password hashing algorithm.
PBKDF2 can take a password and output n desired bits
scrypt can take a password and output n desired bits
These are key-derivation functions. They take a password and generate n bits that you can then use as an encryption key.
BCrypt cannot do that. BCrypt is not a key-derivation function. It is a password hashing function. It always outputs the same number of bits.
Bonus: bcrypt always outputs exactly 24 bytes (192 bits), because the output from bcrypt is the result of encrypting OrpheanBeholderScryDoubt.
Note: It's not the result of hashing OrpheanBeholderScryDoubt - the bcrypt algorithm is actually encrypting OrpheanBeholderScryDoubt using the Blowfish cipher (and repeating the encryption 64 times).
The strength of bcrypt comes from the "expensive key setup".
Bonus: The strength of bcrypt comes from the fact that it is expensive. And "expensive" means memory. The more memory an algorithm requires, the stronger it is against bruteforce attacks.
SHA-2: can operate in 128 bytes of RAM
bcrypt: constantly touches 4 KB of RAM
scrypt: constantly touches 16 MB of RAM (in the default configuration in Android and LiteCoin)
Argon2: it is usually recommended that you configure it to touch 1 GB of RAM
Defending against bruteforce attacks means to defend against parallelization. An algorithm that requires 128 bytes can have 7 million parallel operations on a 1 GB video card.
Scrypt, requiring 16 MB of RAM, can only have 62 running in parallel.
Argon2, using 1 GB of RAM, can only have 1 running on a video card. And it runs faster on a CPU anyway.
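The arithmetic behind those figures is just a division. Here is a small sketch, assuming decimal units (1 GB = 1,000,000,000 bytes, 16 MB = 16,000,000 bytes), which seems to be what the numbers above use. The function name is made up for the example.

```go
package main

import "fmt"

// parallelInstances reports how many copies of an algorithm fit in memory
// at the same time, given how many bytes each copy needs.
func parallelInstances(memoryBytes, bytesPerInstance uint64) uint64 {
	return memoryBytes / bytesPerInstance
}

func main() {
	const gpuMemory = 1_000_000_000 // 1 GB video card, decimal units (assumption)
	fmt.Println("SHA-2 :", parallelInstances(gpuMemory, 128))
	fmt.Println("bcrypt:", parallelInstances(gpuMemory, 4_000))
	fmt.Println("scrypt:", parallelInstances(gpuMemory, 16_000_000))
	fmt.Println("Argon2:", parallelInstances(gpuMemory, 1_000_000_000))
}
```

Memory is only one limit on a real GPU (core count and memory bandwidth matter too), so treat these as upper bounds.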
Kludge bcrypt into a Key Derivation Function (KDF)
You could kludge bcrypt into being a key-derivation function. You can use the standard function PBKDF2 to do it for you.
Normally PBKDF2 is called as:
String password = "hunter2";
String salt = "sea salt 69 nice";
Byte[] key = PBKDF2(password, salt, 32, 10000); //32-bytes is 256 bits
But instead you can use the bcrypt string result as your salt:
String password = "hunter2";
String salt = bcrypt.HashPassword(password, 12);
Byte[] key = PBKDF2(password, salt, 32, 1); //32-bytes is 256-bits
And now you've generated a 256-bit key "using bcrypt". It's a neat hack.
In fact the hack is so neat, that it is literally what scrypt does:
String password = "hunter2";
String salt = ScryptExpensiveKeyHash(password, userSalt, ...);
Byte[] key = PBKDF2(password, salt, 32, 1); //32-bytes is 256-bits
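For Go, the same construction might look roughly like the sketch below. To keep it standard-library only, PBKDF2-HMAC-SHA256 is written out by hand and the bcrypt string is a placeholder. In real code you would get it from golang.org/x/crypto/bcrypt and store it, because bcrypt picks a fresh random salt on every call and the key can only be re-derived from the same stored string.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// pbkdf2SHA256 is a compact hand-written PBKDF2-HMAC-SHA256 (RFC 8018),
// included only so the sketch has no external dependencies.
func pbkdf2SHA256(password, salt []byte, iterations, keyLen int) []byte {
	var key []byte
	for block := uint32(1); len(key) < keyLen; block++ {
		var idx [4]byte
		binary.BigEndian.PutUint32(idx[:], block)
		mac := hmac.New(sha256.New, password)
		mac.Write(salt)
		mac.Write(idx[:])
		u := mac.Sum(nil)              // U_1
		t := append([]byte(nil), u...) // running XOR of U_1..U_c
		for i := 1; i < iterations; i++ {
			mac.Reset()
			mac.Write(u)
			u = mac.Sum(nil)
			for j := range t {
				t[j] ^= u[j]
			}
		}
		key = append(key, t...)
	}
	return key[:keyLen]
}

func main() {
	password := []byte("hunter2")
	// Placeholder: this would be the stored bcrypt output string.
	expensiveSalt := []byte("<stored bcrypt hash string>")
	key := pbkdf2SHA256(password, expensiveSalt, 1, 32) // 32 bytes = 256 bits
	fmt.Printf("%x\n", key)
}
```

A single PBKDF2 iteration is enough here because the expensive work already happened when the "salt" was computed.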
Conclusion
Bcrypt is not a key derivation function. That is the goal of functions like PBKDF2, scrypt (which uses PBKDF2), and argon2.
Using bcrypt when you're only allowed NIST-approved algorithms
There is another good reason to use this PBKDF2 construction with bcrypt.
Sometimes a "security expert", who has no idea what they're talking about, will insist that you use PBKDF2 for key derivation. (Yes, it does happen). And you'll try to tell them over and over that PBKDF2 is a horribly weak system for key derivation (the SHA2 that it is based on runs way too fast, and 10,000 or 100,000 iterations is nowhere near enough to protect you from brute-force attacks - that's what bcrypt, scrypt, and argon2 were invented for).
But this person won't let it go, and will demand the use of PBKDF2. With this construction you can still use bcrypt for security, and PBKDF2 for the ignoramus who demands it be in there.
You just happen to use a strong "salt".

Simple integer encryption

Is there a simple algorithm to encrypt integers? That is, a function E(i,k) that accepts an n-bit integer and a key (of any type) and produces another, unrelated n-bit integer that, when fed into a second function D(E(i,k),k) (along with the key), produces the original integer?
Obviously there are some simple reversible operations you can perform, but they all seem to produce clearly related outputs (e.g. consecutive inputs lead to consecutive outputs). Also, of course, there are cryptographically strong standard algorithms, but they don't produce small enough outputs (e.g. 32-bit). I know any 32-bit cryptography can be brute-forced, but I'm not looking for something cryptographically strong, just something that looks random. Theoretically speaking it should be possible; after all, I could just create a dictionary by randomly pairing every integer. But I was hoping for something a little less memory-intensive.
Edit: Thanks for the answers. Simple XOR solutions will not work because similar inputs will produce similar outputs.
Wouldn't this amount to a block cipher with a block size of 32 bits?
Not very popular, because it's easy to break. But theoretically feasible.
Here is one implementation in Perl :
http://metacpan.org/pod/Crypt::Skip32
UPDATE: See also Format preserving encryption
UPDATE 2: RC5 supports block sizes of 32, 64, or 128 bits
I wrote an article some time ago about how to generate a 'cryptographically secure permutation' from a block cipher, which sounds like what you want. It covers using folding to reduce the size of a block cipher, and a trick for dealing with non-power-of-2 ranges.
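One common way to get a keyed permutation on 32-bit integers is a small balanced Feistel network. The sketch below is not taken from the article. It is a swapped-in illustration, with SHA-256 as the round function and four rounds chosen arbitrarily, and it makes no claim to real cryptographic strength.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

const rounds = 4 // arbitrary choice for the sketch

// roundFunc mixes a 16-bit half with the key and the round number.
func roundFunc(key []byte, round int, half uint16) uint16 {
	buf := make([]byte, 0, len(key)+3)
	buf = append(buf, key...)
	buf = append(buf, byte(round), byte(half>>8), byte(half))
	sum := sha256.Sum256(buf)
	return binary.BigEndian.Uint16(sum[:2])
}

// encrypt32 maps a 32-bit integer to another 32-bit integer.
func encrypt32(key []byte, v uint32) uint32 {
	l, r := uint16(v>>16), uint16(v)
	for i := 0; i < rounds; i++ {
		l, r = r, l^roundFunc(key, i, r)
	}
	return uint32(l)<<16 | uint32(r)
}

// decrypt32 undoes encrypt32 by running the rounds in reverse order.
func decrypt32(key []byte, v uint32) uint32 {
	l, r := uint16(v>>16), uint16(v)
	for i := rounds - 1; i >= 0; i-- {
		l, r = r^roundFunc(key, i, l), l
	}
	return uint32(l)<<16 | uint32(r)
}

func main() {
	key := []byte("example key") // placeholder key
	for i := uint32(1); i <= 3; i++ {
		e := encrypt32(key, i)
		fmt.Printf("%d -> %d -> %d\n", i, e, decrypt32(key, e))
	}
}
```

Because a Feistel network is invertible whatever the round function is, every input maps to a distinct output, and consecutive inputs do not produce consecutive outputs.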
A simple one:
rand = new Random(k);
return (i xor rand.Next())
(the point of XORing with rand.Next() rather than k is that otherwise, given i and E(i,k), you can get k by k = i xor E(i,k))
Ayden is an algorithm that I developed. It is compact, fast and looks very secure. It is currently available for 32 and 64 bit integers. It is in the public domain and you can get it from http://github.com/msotoodeh/integer-encoder.
You could take an n-bit hash of your key (assuming it's private) and XOR that hash with the original integer to encrypt, and with the encrypted integer to decrypt.
Probably not cryptographically solid, but depending on your requirements, may be sufficient.
If you just want it to look random and don't care about security, how about just swapping bits around? You could simply reverse the bit string, so the high bit becomes the low bit, the second highest becomes the second lowest, etc., or you could do some other fixed permutation (e.g. 1 to 4, 2 to 7, 3 to 1, etc.).
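In Go the bit-reversal variant is a one-liner thanks to math/bits. A tiny sketch, with a function name made up for the example:

```go
package main

import (
	"fmt"
	"math/bits"
)

// scramble reverses the bit order of v. Applying it twice returns the
// original value, and it provides no secrecy at all.
func scramble(v uint32) uint32 {
	return bits.Reverse32(v)
}

func main() {
	for i := uint32(1); i <= 3; i++ {
		fmt.Printf("%d -> %d -> %d\n", i, scramble(i), scramble(scramble(i)))
	}
}
```

Note that there is no key involved, so anyone who guesses the trick can undo it.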
How about XORing it with a prime or two? Swapping bits around seems very random when trying to analyze it.
Try something along the lines of XORing it with a prime and itself after bit shifting.
How many integers do you want to encrypt? How much key data do you want to have to deal with?
If you have few items to encrypt, and you're willing to deal with key data that's just as long as the data you want to encrypt, then the one-time-pad is super simple (just an XOR operation) and mathematically unbreakable.
The drawback is that the problem of keeping the key secret is about as large as the problem of keeping your data secret.
It also has the flaw (that is run into time and again whenever someone decides to try to use it) that if you take any shortcuts - like using a non-random key or the common one of using a limited length key and recycling it - that it becomes about the weakest cipher in existence. Well, maybe ROT13 is weaker.
But in all seriousness, if you're encrypting an integer, what are you going to do with the key no matter which cipher you decide on? Keeping the key secret will be a problem about as big (or bigger) than keeping the integer secret. And if you're encrypting a bunch of integers, just use a standard, peer reviewed cipher like you'll find in many crypto libraries.
RC4 will produce as little output as you want, since it's a stream cipher.
XOR it with /dev/random

Encryption algorithm that output byte by byte based on password and offset

Is there a well-known (and well-regarded) algorithm that can encrypt/decrypt any arbitrary byte inside a file based on the password entered and the offset inside the file?
(Databyte, Offset, Password) => EncryptedByte
(EncryptedByte, Offset, Password) => DataByte
And is there some fundamental weakness in this approach, or is it still theoretically possible to make it strong enough?
Update:
More details: Any cryptographic algorithm has input and output. Many existing ones operate on large blocks of input. I want to operate on only one byte, but a system based on that alone can only remap bytes and is weak by default. However, if we take the position of the byte in the file, we can, for example, interpret the bits of that position value as operations to apply at each step (0: xor, 1: shift) and create the encrypted byte that way. But that's too simple; I'm looking for something stronger.
Maybe it's not very efficient but how about this:
for encryption use:
encryptedDataByte = Encrypt(offset,key) ^ dataByte
for decryption use:
dataByte = Encrypt(offset,key) ^ encryptedDataByte
Where Encrypt(offset,key) might be e.g. 3DES or AES (with padding the offset, if needed, and throwing away all but one result bytes)
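A minimal Go sketch of that idea, using AES from the standard library. The helper names are made up for the example, the key is a placeholder (in practice derive it from the password with a KDF), and the offset is zero-padded into a 16-byte block:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
	"fmt"
)

// newBlock builds an AES cipher and panics on a bad key length.
func newBlock(key []byte) cipher.Block {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	return block
}

// padByte encrypts the zero-padded offset and keeps only the first
// output byte, throwing the other 15 away.
func padByte(block cipher.Block, offset uint64) byte {
	var in, out [aes.BlockSize]byte
	binary.BigEndian.PutUint64(in[8:], offset)
	block.Encrypt(out[:], in[:])
	return out[0]
}

// cryptByte encrypts or decrypts the byte at offset; XOR makes both
// directions the same operation.
func cryptByte(block cipher.Block, offset uint64, b byte) byte {
	return b ^ padByte(block, offset)
}

func main() {
	block := newBlock([]byte("0123456789abcdef")) // placeholder 16-byte key
	enc := cryptByte(block, 1000, 'A')
	dec := cryptByte(block, 1000, enc)
	fmt.Printf("%#02x -> %c\n", enc, dec)
}
```

This costs one full AES block operation per byte, which is the inefficiency mentioned above.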
If you can live with block sizes of 16 bytes, you can try the XTS mode described in the Wikipedia article about disk encryption theory (the advantage being that some good cryptologists have already looked at it).
If you really need byte-wise encryption, I doubt that there is an established solution. In the conference Crypto 2009 there was a talk about How to Encipher Messages on a Small Domain: Deterministic Encryption and the Thorp Shuffle. In your case the domain is a byte, and as this is a power of 2, a Thorp Shuffle corresponds to a maximally unbalanced Feistel network. Maybe one can build something using the position and the password as key, but I'd be surprised if a home-made solution will be secure.
You can use AES in counter mode (CTR), where you divide your input into blocks of 16 bytes (128 bits) and then encrypt a counter derived from the block number to get 16 pseudo-random bytes that you XOR with the plaintext. It is critically important never to reuse the same counter start value (and/or initialization vector) with the same key, or you open yourself up to an easy attack: XORing two ciphertexts cancels the keystream and leaves the XOR of the two plaintexts, and any known plaintext then reveals the keystream itself.
You mention that you only want to operate on individual bytes, and this approach gives you that flexibility. Output Feedback Mode is another common one, but you have to be careful in its use.
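To show how CTR mode allows random access, here is a sketch that computes the keystream byte for an arbitrary offset directly and compares it with what Go's cipher.NewCTR produces when run from the start. The helper names are made up, and the key and all-zero IV are placeholders for the demo only.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// mustAES builds an AES cipher and panics on a bad key length.
func mustAES(key []byte) cipher.Block {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	return block
}

// counterBlock returns iv + n, treating iv as a 128-bit big-endian integer,
// which is how Go's cipher.NewCTR advances its counter.
func counterBlock(iv [aes.BlockSize]byte, n uint64) [aes.BlockSize]byte {
	ctr := iv
	for i := aes.BlockSize - 1; i >= 0 && n > 0; i-- {
		sum := uint64(ctr[i]) + (n & 0xff)
		ctr[i] = byte(sum)
		n = (n >> 8) + (sum >> 8) // carry into the next byte
	}
	return ctr
}

// xorByteAt encrypts or decrypts the byte at offset without touching
// any of the bytes before it.
func xorByteAt(block cipher.Block, iv [aes.BlockSize]byte, offset uint64, b byte) byte {
	ctr := counterBlock(iv, offset/aes.BlockSize)
	var ks [aes.BlockSize]byte
	block.Encrypt(ks[:], ctr[:])
	return b ^ ks[offset%aes.BlockSize]
}

// ctrWhole encrypts data from offset 0 with the standard library's CTR mode.
func ctrWhole(block cipher.Block, iv [aes.BlockSize]byte, data []byte) []byte {
	out := make([]byte, len(data))
	cipher.NewCTR(block, iv[:]).XORKeyStream(out, data)
	return out
}

func main() {
	block := mustAES([]byte("0123456789abcdef")) // placeholder key
	var iv [aes.BlockSize]byte                   // all-zero IV, demo only
	data := []byte("random access into a CTR stream")
	whole := ctrWhole(block, iv, data)
	i := uint64(20)
	fmt.Println(xorByteAt(block, iv, i, data[i]) == whole[i])
}
```

Keep in mind that rewriting a byte in place reuses the keystream byte for that offset, so the reuse warning above applies to in-place updates as well.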
You might consider using the EAX mode for better security. Also, make sure you're using something like PBKDF2 or scrypt to generate your encryption key from the password.
However, as with most cryptography related issues, it's much better to use a rigorously tested and evaluated library rather than rolling your own.
Basically what you need to do is generate some value X (probably 1 byte) based on the offset and password, and use this to encrypt/decrypt the byte at that offset. We'll call it
X = f(offset,password)
The problem is that an attacker that "knows something" about the file contents (e.g. the file is English text, or a JPEG) can come up with an estimate (or sometimes be certain) of what an X could be. So he has a "rough idea" about many X values, and for each of these he knows what the offset is. There is a lot of information available.
Now, it would be nice if all that information were of little use to the attacker. For most purposes, using a cryptographic hash function (like SHA-1) will give you a reasonable assurance of decent security.
But I must stress that if this is something critical, consult an expert.
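One possibility is a one-time pad, possibly using the password to seed some pseudo-random number generator. One-time pads theoretically achieve perfect secrecy, but there are some caveats: in particular, a pad stretched from a password by a PRNG is really a stream cipher and no longer carries that guarantee. It should do what you're looking for, though.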

two-way keyed encryption/hash algorithm

I am no way experienced in this type of thing so I am not even sure of the keywords (hence the title).
Basically I need a two way function
encrypt(w,x,y) = z
decrypt(z) = w, x, y
Where w = integer
x = string (username)
y = unix timestamp
and z = is an 8 digit number (possibly including letters, spec isn't there yet.)
I would like z to be not easily guessable and easily verifiable. Speed isn't a huge concern, security isn't either. Tracking one-to-one relationship is the main requirement.
Any resources or direction would be appreciated.
EDIT
Thanks for the answers, learning a lot. So to clarify, 8 characters is the only hard requirement, along with the ability to link W <-> Z. The username (X) and timestamp (Y) would be considered icing on the cake.
I would like to do this mathematically rather than doing database lookups, if possible.
If I had to finish this tonight, I could just find a fitting hash algorithm and use a lookup table. I am simply trying to expand my understanding of this type of thing and see if I could do it mathematically.
Encryption vs. Hashing
This is an encryption problem, since the original information needs to be recovered. The quality of a cryptographic hash is judged by how difficult it is to reverse the hash and recover the original information, so hashing is not applicable here.
To perform encryption, some key material is needed. There are many encryption algorithms, but they fall into two main groups: symmetric and asymmetric.
Application
The application here isn't clear. But if you are "encrypting" some information and sending it somewhere, then later getting it back and doing something with it, symmetric encryption is the way to go. For example, say you want to encode a user name, an IP address, and some identifier from your application in a parameter that you include in a link in some HTML. When the user clicks the link, that parameter is passed back to your application and you decode it to recover the original information. That's a great fit for symmetric encryption, because the sender and the recipient are the same party, and key exchange is a no-op.
Background
In symmetric encryption, the sender and recipient need to know the same key, but keep it secret from everyone else. As a simple example, two people could meet in person, and decide on a password. Later on, they could use that password to keep their email to each other private. However, anyone who overhears the password exchange will be able to spy on them; the exchange has to happen over a secure channel... but if you had a secure channel to begin with, you wouldn't need to exchange a new password.
In asymmetric encryption, each party creates a pair of keys. One is public, and can be freely distributed to anyone who wants to send a private message. The other is private. Only the message recipient knows that private key.
A big advantage to symmetric encryption is that it is fast. All well-designed protocols use a symmetric algorithm to encrypt large amounts of data. The downside is that it can be difficult to exchange keys securely—what if you can't "meet up" (virtually or physically) in a secure place to agree on a password?
Since public keys can be freely shared, two people can exchange a private message over an insecure channel without having previously agreed on a key. However, asymmetric encryption is much slower, so it's usually used to encrypt a symmetric key or perform "key agreement" for a symmetric cipher. SSL and most cryptographic protocols go through a handshake where asymmetric encryption is used to set up a symmetric key, which is used to protect the rest of the conversation.
You just need to encrypt a serialization of (w, x, y) with a secret key. Use the same secret key to decrypt it.
In any case, the size of z cannot be simply bounded like you did, since it depends on the size of the serialization (since it needs to be two way, there's a bound on the compression you can do, depending on the entropy).
And you are not looking for a hash function, since it would obviously lose some information and you wouldn't be able to reverse it.
EDIT: Since the size of z is a hard limit, you need to restrict the input to 8 bytes and choose an encryption technique that uses a block size of 64 bits (or less). Blowfish and Triple DES use 64-bit blocks, but remember that those algorithms didn't receive the same scrutiny as AES.
If you want something really simple and quite insecure, just XOR your input with a secret key.
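As a rough Go illustration of the 64-bit-block idea, the sketch below packs w and the timestamp y into a single 8-byte block and encrypts it with Triple DES from the standard library. The username x is left out because it does not fit; the function names and key are placeholders.

```go
package main

import (
	"crypto/des"
	"encoding/binary"
	"fmt"
)

// seal packs w and y into one 8-byte block and encrypts it with 3DES.
func seal(key []byte, w, y uint32) uint64 {
	block, err := des.NewTripleDESCipher(key) // needs a 24-byte key
	if err != nil {
		panic(err)
	}
	var buf [des.BlockSize]byte
	binary.BigEndian.PutUint32(buf[:4], w)
	binary.BigEndian.PutUint32(buf[4:], y)
	block.Encrypt(buf[:], buf[:])
	return binary.BigEndian.Uint64(buf[:])
}

// open decrypts z and unpacks w and y again.
func open(key []byte, z uint64) (w, y uint32) {
	block, err := des.NewTripleDESCipher(key)
	if err != nil {
		panic(err)
	}
	var buf [des.BlockSize]byte
	binary.BigEndian.PutUint64(buf[:], z)
	block.Decrypt(buf[:], buf[:])
	return binary.BigEndian.Uint32(buf[:4]), binary.BigEndian.Uint32(buf[4:])
}

func main() {
	key := []byte("0123456789abcdefghijklmn") // placeholder 24-byte key
	z := seal(key, 42, 1700000000)
	w, y := open(key, z)
	fmt.Printf("z=%016x w=%d y=%d\n", z, w, y)
}
```

Note that a 64-bit block prints as 16 hex digits, so this only meets an "8 characters" limit if each character can carry a full byte.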
You probably can't.
Let's say that w is 32 bits, x supports at least 8 case-insensitive ASCII chars, so at least 37 bits, and y is 32 bits (gets you to 2038, and 31 bits doesn't even get you to now).
So, that's a total of at least 101 bits of data. You're trying to store it in an 8 digit number. It's mathematically impossible to create an invertible function from a larger set to a smaller set, so you'd need to store more than 12.5 bits per "digit".
Of course if you go to more than 8 characters, or if your characters are 16 bit unicode, then you're at least in with a chance.
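The counting argument can be checked with a few lines of Go. The capacity of 8 symbols from an alphabet of size n is 8 * log2(n) bits. The function name is made up for the example, and the 32 + 37 + 32 figure is simply the estimate from above.

```go
package main

import (
	"fmt"
	"math"
)

// capacityBits returns how many bits fit in length symbols drawn from an
// alphabet of alphabetSize symbols: length * log2(alphabetSize).
func capacityBits(alphabetSize, length int) float64 {
	return float64(length) * math.Log2(float64(alphabetSize))
}

func main() {
	const needed = 32 + 37 + 32 // bits for w + x + y, per the estimate above
	fmt.Printf("needed: %d bits (%.3f per symbol over 8 symbols)\n", needed, float64(needed)/8)
	fmt.Printf("8 decimal digits:    %.1f bits\n", capacityBits(10, 8))
	fmt.Printf("8 base-36 symbols:   %.1f bits\n", capacityBits(36, 8))
	fmt.Printf("8 16-bit characters: %.1f bits\n", capacityBits(1<<16, 8))
}
```

Only the last alphabet has room for all 101 bits. Decimal digits and base-36 characters fall well short.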
Let's formalize your problem, to better study it.
Let k be a key from the set K of possible keys, and (w, x, y) a piece of information, from a set I, that we need to crypt. Let's define the set of "crypted-messages" as A^8, where A is the alphabet from which we draw the characters of our crypted message (A = {0, 1, ..., 9, a, b, ..., z, ... }, depending on your specs, as you said).
We define the two functions:
crypt: I * K --> A^8
decrypt: A^8 * K --> I
The problem here is that the size of the set A^8, of crypted-messages, might be smaller than the set of pieces of information (w, x, y). If this is so, it is simply impossible to achieve what you are looking for, unless we try something different...
Let's say that only YOU (or your server, or your application on your server) have to be able to calculate (w, x, y) from z. That is, you might send z to someone, and you don't care that they will not be able to decrypt it.
In this case, what you can do is use a database on your server. You will crypt the information using a well-known algorithm, then you generate a random number z. You define the table:
Id: char[8]
CryptedInformation: byte[]
You will then store z on the Id column, and the crypted information on the corresponding column.
When you need to decrypt the information, someone will give you z, the index of the crypted information, and then you can proceed to decryption.
However, if this works for you, you might not even need to crypt the information, you could have a table:
Id: char[8]
Integer: int
Username: char[]
Timestamp: DateTime
And use the same method, without crypting anything.
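A minimal sketch of the lookup-table variant in Go, with an in-memory map standing in for the database table and a base-36 alphabet assumed for z. The type and function names are made up for the example.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// alphabet is an assumed character set for z; the real spec may differ.
const alphabet = "0123456789abcdefghijklmnopqrstuvwxyz"

// record mirrors the columns of the second table above.
type record struct {
	W         int
	Username  string
	Timestamp int64
}

// newID returns 8 characters drawn uniformly at random from alphabet.
func newID() string {
	id := make([]byte, 8)
	limit := big.NewInt(int64(len(alphabet)))
	for i := range id {
		n, err := rand.Int(rand.Reader, limit)
		if err != nil {
			panic(err)
		}
		id[i] = alphabet[n.Int64()]
	}
	return string(id)
}

func main() {
	table := map[string]record{} // stands in for the database table
	z := newID()
	table[z] = record{W: 42, Username: "alice", Timestamp: 1700000000}
	fmt.Println(z, table[z])
}
```

A real implementation would also need to retry on the (unlikely) event that a freshly generated ID already exists in the table.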
This can be applied to an "e-mail verification system" on a subscription process, for example. The link you would send to the user by mail would contain z.
Hope this helps.
I can't tell if you are trying to set this up as a way to store passwords, but if you are, you should not use a two-way hash function.
If you really want to do what you described, you should just concatenate the string and the timestamp (fill in extra spaces with underscores or something). Take that resulting string, convert it to ASCII or UTF-8 or something, and find its value modulo the largest prime less than 10^8.
Encryption or no encryption, I do not think it is possible to pack that much information into an 8 digit number in such a way that you will ever be able to get it out again.
An integer is 4 bytes. Let's assume your username is limited to 8 characters, and that characters are bytes. Then the timestamp is at least another 4 bytes. That's 16 bytes right there. In hex, that will take 32 digits. Base36 or something will be less, but it's not going to be anywhere near 8.
Hashes are by definition one-way only: once hashed, it is very difficult to get the original value back again.
For two-way encryption I would look at TripleDES, which .NET has baked right in with TripleDESCryptoServiceProvider.
A fairly straight forward implementation article.
EDIT
It has been mentioned below that you can not cram a lot of information into a small encrypted value. However, for many (not all) situations this is exactly what Bit Masks exist to solve.

Resources