How secure is a digital signature?

A digital signature, if I understood it right, means sending the message in the clear along with a hash of the message, where the hash is encrypted using a private key.
The recipient of the message calculates the hash, decrypts the received hash using the public key, then compares the two hashes for a match.
How safe is this? I mean, you can obtain the hash of the message easily and you also have the encrypted hash. How easy is it to find the private key used to create the Encrypted_hash?
Example:
Message         Hash    Encrypted_hash
---------------------------------------
Hello world!    1234    abcd
Hi there        5678    xyzt
Bla bla         0987    gsdj
...
Given the Hash and the Encrypted_hash values, and enough of these messages, how easy/hard is it to find out the private key?

Because of the algorithms used to generate the keys (RSA is the typical one), the answer is essentially "impossible in any reasonable amount of time", assuming the key has a sufficient bit length. As long as the private key is not stolen or given away, you won't be able to recover it from just the public key and hashes that were signed (encrypted) with the private key.
As linked to in Henk Holterman's answer, the RSA algorithm is built on the fact that the computations needed to recover the private key - prime factorization being the central one - are hard problems that cannot be solved in any reasonable amount of time with currently known methods. In other words, the underlying problem (prime factorization) is in NP: a solution can be verified in polynomial time (which is what verifying with the public key amounts to), but no polynomial-time algorithm is known for finding it (i.e. cracking the private key).
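To make the hash-then-sign flow from the question concrete, here is a minimal sketch in Python, assuming the third-party cryptography package is installed; the message text and key size are arbitrary illustration choices:

    # Sketch of the sign/verify flow with RSA-PSS (assumes the 'cryptography' package).
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()
    message = b"Hello world!"
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)

    # Sender: hash the message and "encrypt" the hash with the private key.
    signature = private_key.sign(message, pss, hashes.SHA256())

    # Recipient: recompute the hash and check it against the signature using only
    # the public key; this raises InvalidSignature if either was tampered with.
    public_key.verify(signature, message, pss, hashes.SHA256())
    print("signature OK")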

Ciphers developed before electronic computers were often vulnerable to "known plain-text" attack, which is essentially what is described here: if an attacker had the cipher-text and the corresponding plain-text, he could discover the key. World War II-era codes were sometimes broken by guessing at plain-text words that had been encrypted, like the locations of battles, ranks, salutations, or weather conditions.
However, the RSA algorithm used most often for digital signatures is invulnerable even to a "chosen plain-text attack" when proper padding is used (such as OAEP for encryption, or PSS for signatures). Chosen plain-text means that the attacker can choose a message and trick the victim into encrypting it; it's usually even more dangerous than a known plain-text attack.
Anyway, a digital signature is safe by any standard. Any compromise would be due to an implementation flaw, not a weakness in the algorithm.

A digital signature says nothing about how the actual message is transferred. Could be clear text or encrypted.
And current asymmetric algorithms (public+private key) are very secure, how secure depends on the key-size.
In principle an attacker has enough information to crack it: the public key, the message and the signature. But it is part of the security argument for asymmetric encryption that doing so takes an impractical amount of CPU time: the method is computationally secure.

What you're talking about is known as a "known plaintext" attack. With any reasonably secure modern encryption algorithm known plaintext is of essentially no help in an attack. When you're designing an encryption algorithm, you assume that an attacker will have access to an arbitrary amount of known plaintext; if that assists the attacker, the algorithm is considered completely broken by current standards.
In fact, you normally take for granted that the attacker will not only have access to an arbitrary amount of known plaintext, but even an arbitrary amount of chosen plaintext (i.e., they can choose some text, somehow get you to encrypt it, and compare the result to the original). Again, any modern algorithm needs to be immune to this to be considered secure.

Given the Hash and the Encrypted_hash values, and enough of these messages, how easy/hard is it to find out the private key?
This is the scenario of a Known-plaintext attack: you are given many plaintext messages (the hash) and corresponding cipher texts (the encrypted hash) and you want to find out the encryption key.
Modern cryptographic algorithms are designed to withstand this kind of attack, like the RSA algorithm, which is one of the algorithms currently in use for digital signatures.
In other words, it is still extremely difficult to find out the private key. You'd either need an impossible amount of computing power, or you'd need to find a really fast algorithm for factorizing integers, but that would guarantee you lasting fame in the history of mathematics, and hence is even more difficult.
For a more detailed and thorough understanding of cryptography, have a look at the literature, like the Wikipedia pages or Bruce Schneier's Applied Cryptography.

For a perfectly designed hash it is impossible (or rather, there is no easier way than trying every possible key).

Related

Is there a good symmetric encryption algorithm that changes ciphertext if password or plaintext changes?

I want the encrypted data to always change whenever the original data or the password changes (either one of them, or both). In other words, once the encrypted data is fixed, both the original data and the password are determined, even though they are not known to those who don't have the password.
Is there any good symmetric encryption algorithm that fits my specific need?
I assume that you use the password to derive the key for the cipher.
Changing key
Every modern encryption algorithm produces different ciphertexts when a different key is used. That's just how encryption usually works. If it doesn't have this property then everything is broken. All the usual suspects like AES, Blowfish and 3DES have this property.
Changing plaintext
The other property is a little harder to do. This runs under the umbrella of semantic security.
Take for example any modern symmetric cipher in ECB mode. If only a single block of plaintext changes then only the same block changes in the ciphertext. So if you encrypt many similar plaintexts, an attacker who observes the ciphertexts can infer relationships between those. ECB mode is really bad.
Ok, now take a cipher in CBC mode. If you use the same IV over and over again, then an attacker may infer similar relationships as in ECB mode. If the 10th block of plaintext changes, then the previous 9 blocks will be the same in both ciphertexts. So, if you use a new random IV for every encryption, then there is nothing an attacker can deduce besides the length without breaking the underlying cipher.
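As a quick illustration of the ECB problem, here is a rough sketch assuming the third-party cryptography package (the key and plaintext are made up): repeated plaintext blocks show up as repeated ciphertext blocks under ECB, but not under CBC with a fresh random IV.

    # Repeated plaintext blocks leak through ECB but not through CBC.
    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(32)
    plaintext = b"sixteen byte blk" * 4          # four identical 16-byte blocks

    ecb = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    ecb_ct = ecb.update(plaintext) + ecb.finalize()
    # All four ECB ciphertext blocks are identical:
    print(len({ecb_ct[i:i+16] for i in range(0, len(ecb_ct), 16)}))   # -> 1

    iv = os.urandom(16)
    cbc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    cbc_ct = cbc.update(plaintext) + cbc.finalize()
    # Under CBC with a random IV every ciphertext block differs:
    print(len({cbc_ct[i:i+16] for i in range(0, len(cbc_ct), 16)}))   # -> 4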
In other words, once the encrypted data is certain, then both the original data and the password will be certain
The previous paragraph may not be completely what you wanted, because now if you encrypt the same plaintext with the same key twice, you get different results due to the random IV (the determinism you ask for is the weaker security property). Since you derive the key from a password, you may also derive the IV from the same password. If you use, for example, PBKDF2, you can set the number of output bits to the size of key + IV. You will need to use a static salt value.
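A rough sketch of that idea, assuming the third-party cryptography package (the password, salt and messages are placeholders): PBKDF2 with a static salt derives key + IV deterministically, so the same password and plaintext always give the same ciphertext, and changing either changes it.

    # Deterministic AES-CBC: key and IV both derived from the password via PBKDF2.
    import hashlib
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def encrypt(password: bytes, plaintext: bytes) -> bytes:
        # A static salt and fixed iteration count keep the derivation deterministic.
        material = hashlib.pbkdf2_hmac("sha256", password, b"static-salt", 100_000, dklen=48)
        key, iv = material[:32], material[32:]          # 32-byte AES key + 16-byte IV
        padder = padding.PKCS7(128).padder()
        padded = padder.update(plaintext) + padder.finalize()
        enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
        return enc.update(padded) + enc.finalize()

    # Same inputs -> same ciphertext; change either input -> different ciphertext.
    assert encrypt(b"hunter2", b"hello") == encrypt(b"hunter2", b"hello")
    assert encrypt(b"hunter3", b"hello") != encrypt(b"hunter2", b"hello")
    assert encrypt(b"hunter2", b"hellp") != encrypt(b"hunter2", b"hello")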
If you don't need that last property, then I suggest you use an authenticated mode like GCM or EAX. When you transmit ciphertext or give the attacker an encryption oracle then there are possible attack vectors when no integrity checks are used. An authenticated mode solves this for you without the need to use an encrypt-then-MAC scheme.
If all you care about is detecting when either the data or the password changes, create a file with the data and then append the password. Then use a cryptographic hash like SHA-2 on the file. If either the data or password changes, the hash will change.
Encryption algorithms are generally used to keep data private or to verify identities. Hashes are for detecting when two data objects are different.
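A minimal sketch of that hash-based change detection in Python (the example data and password are made up; hashlib ships with the standard library):

    # Detect whether either the data or the password changed by hashing both together.
    import hashlib

    def fingerprint(data: bytes, password: bytes) -> str:
        # A length prefix keeps (data, password) pairs unambiguous before hashing.
        blob = len(data).to_bytes(8, "big") + data + password
        return hashlib.sha256(blob).hexdigest()

    old = fingerprint(b"report v1", b"hunter2")
    new = fingerprint(b"report v2", b"hunter2")
    print(old != new)   # True: the data changed, so the hash changed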
The usual reason for encrypting data is that it will be placed in an insecure environment. If the user's encrypted data is in an insecure environment, an opponent can copy it and use software the opponent obtained or wrote, instead of your software, to try to decrypt it. This is similar to some of the controls put on PDFs in Adobe software, such as not being able to cut and paste from the document. Other brands of software may not enforce the no-cut-and-paste restriction.
If the opponent discerns the encryption algorithm, but uses the wrong password, the chances of getting the correct plain text are very small; for the popular modern algorithms, the chance is small enough to neglect compared to other everyday risks we must endure, such as life on earth being destroyed by a comet.

High-cost encryption but low-cost decryption

I want the user/attacker to encrypt the data and send it to the server. Now I want an algorithm that is the opposite of the standard ones (which are fast to use, hard to break): it should be very hard (expensive) to encrypt data such as passwords with a key sent by the server, to protect against random guessing attacks, but very easy to decrypt, so the server spends very little time validating a user while the attacker has to pay the full encryption cost for every new trial password with the key sent by the server.
Once again I am not talking about SSL.
It sounds as if you're looking for a proof-of-work scheme -- one of the applications for such a scheme is just what you describe: To force a client to do a certain amount of work and so prevent it from flooding a server with requests.
One silly idea that might work really well is to attach a "puzzle" to the encryption scheme. Make it so that in order to post encrypted data to the server, you have to solve some known NP-hard problem (for example, finding a satisfying assignment to a large Boolean formula) and send the answer along with it. The server can then easily verify the solution, but under the assumption that P ≠ NP, clients trying to post data have to do a superpolynomial amount of extra work, preventing them from flooding your server.
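A common concrete instance of this idea is a hashcash-style puzzle: the server sends a random challenge, the client must find a nonce whose hash together with the challenge starts with a number of zero bits, and the server verifies with a single hash. A minimal sketch in Python (the difficulty and challenge size are arbitrary choices):

    # Hashcash-style proof of work: expensive to produce, cheap to verify.
    import hashlib, os
    from itertools import count

    DIFFICULTY = 20   # required number of leading zero bits

    def solve(challenge: bytes) -> int:
        # Client: brute-force a nonce (roughly 2**DIFFICULTY hashes on average).
        for nonce in count():
            digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0:
                return nonce

    def verify(challenge: bytes, nonce: int) -> bool:
        # Server: a single hash is enough to check the work.
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0

    challenge = os.urandom(16)
    nonce = solve(challenge)          # slow for the client
    print(verify(challenge, nonce))   # fast for the server -> True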
Hope this helps!
You can use RSA signatures (e.g. PKCS#1).
Your server may accept answers only if they are signed with a certain RSA key whose private part you distributed earlier. The server uses the public part.
RSA has the property that verification is much quicker than signing when you select a small public exponent (typically called e, for instance e=3), by a factor of roughly 10x to 100x, depending on the key length and on whether your clients are smart enough to use CRT.
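A rough illustration of that asymmetry, assuming the third-party cryptography package (iteration count, key size and message are arbitrary); on typical hardware verification comes out far cheaper than signing:

    # Compare RSA signing (client) vs. verification (server) cost.
    import time
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    pub = key.public_key()
    msg = b"challenge-plus-response"

    t0 = time.perf_counter()
    for _ in range(100):
        sig = key.sign(msg, padding.PKCS1v15(), hashes.SHA256())
    t1 = time.perf_counter()
    for _ in range(100):
        pub.verify(sig, msg, padding.PKCS1v15(), hashes.SHA256())
    t2 = time.perf_counter()

    print(f"100 signatures:    {t1 - t0:.3f}s")
    print(f"100 verifications: {t2 - t1:.3f}s")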
RSA key generation is extremely slow, but it is probably sufficient to use one key for all clients, as long as you include a challenge in your protocol to foil replay attacks.
Finally, RSA signing does not give you confidentiality.

New cryptographic algorithms?

I was wondering about new trends in cryptography. Which algorithms are new? Which have been improved, and which have died because of the time that has passed?
For example, ECC (Elliptic Curve Cryptography) is quite a new approach, but definitely not the only one. Could you name some of them?
ECC actually originates from the 80's; it is not exactly new.
In the context of asymmetric encryption and digital signatures, there has been in the last few years much research on pairings. Pairings open up the next level. Conceptually: symmetric cryptography is for problems with one entity (all entities share a secret key, thus they are the "same" entity), asymmetric cryptography is for problems with two entities (the signer and the verifier), and pairings are a tool for protocols with three entities (for instance, electronic cash, where there are the bank, the merchant and the buyer). The only really practical pairings found so far use elliptic curves, but with a much higher dose of mathematics.
As for more classical asymmetric encryption and signatures, there has been some work on many other algorithms, such as HFE, which seems especially good with regards to signature sizes, or lattice-based cryptography. This is still quite new. It takes some time (say a dozen years or so) before a newly created algorithm becomes mature enough to be standardized.
Following work by Bellovin and Merritt in 1992, some Password Authenticated Key Exchange protocols have been described. These protocols are meant to allow for password-based mutual authentication immune to offline dictionary attacks (i.e. the protocols imply that an attacker, even if actively masquerading as one of the parties, cannot obtain enough information to test passwords at his leisure; each guess from the attacker must go through an interaction with one of the entities who knows the password). IEEE group P1363 is working on writing standards on that subject.
In the area of symmetric encryption, the AES has been a bit "final". A few stream ciphers have been designed afterwards (stream ciphers are supposed to provide better performance, at the cost of less generic applicability); some were analyzed by the eSTREAM project. There has been quite some work on encryption modes, which try to combine symmetric encryption and integrity checks in one efficient system (see for instance GCM and CWC).
Hash functions have been a hot subject lately. A bunch of old hash functions, including the famous MD5, were broken in 2004. There is an ongoing competition to determine the next American standard hash function, codenamed SHA-3.
An awful lot of work has been done on some implementation issues, in particular side-channel leaks (how secret data leaks through power consumption, timing, residual electro-magnetic emissions...) and how to block them.
The main problem of contemporary cryptography is not finding algorithms but whole concepts and approaches for different situations (but of course the algorithms are continually improved too).
We have today
Symmetric algorithms (AES)
Asymmetric algorithms (RSA, ECC)
Key exchange (Diffie-Hellman-Key-Exchange, Shamir's no key protocol)
Secret sharing (intersection of n-dimensional planes)
Cryptographic hash functions (SHA)
Some have proven insecure and were improved
DES, due to a much too small key space
MD5
and some are broken
Merkle–Hellman knapsack cryptosystem
Monoalphabetic substitution
Naive Vigenère
Which particular algorithm is chosen is often a question of available resources (elliptic curves need smaller keys than the RSA algorithm for comparable security) or just of standardization (as tanascius pointed out, there are competitions for such algorithms). Totally new trends usually start when a whole class of cryptosystems has been shown to be vulnerable to a specific attack (man-in-the-middle, side-channel) or when scientific progress is made (quantum cryptography).
Of course, there is also steganography, which doesn't attempt to conceal the content of a message but rather its existence, by hiding it in other documents.
Currently the NIST hash function competition is running, with the goal of finding a replacement for the older SHA-1 and SHA-2 functions. So this is about a cryptographic hash function.
You can have a look at the list of the accepted algorithms for round two, and you can get whitepapers to all of the algorithms taking part there.
I am not up-to-date, but I doubt that there are any completely new approaches for the algorithms.
EDIT: Well, one of the candidates was the Elliptic curve only hash, but it is listed under "Entrants with substantial weaknesses" ^^
People have covered most other things; I'll talk about design:
Block ciphers: Traditional block ciphers (e.g. DES) use a Feistel structure. There's a move to more general S-P networks (Rijndael, Serpent), which are easier to parallelize, and to cipher modes which support parallelization (CS, GCM) and efficient authenticated encryption (CS, GCM, IGE/BIGE).
Hashes: Traditional hashes use the obvious Merkle–Damgård construction. This has some undesirable properties:
Length extension is trivial (to some extent this is mitigated by proper finalization); a toy sketch of the attack follows this list.
Several attacks on the collision resistance (multicollisions, Joux 2004; "expandable message" attacks, Kelsey 2005; herding attacks, Kelsey 2006).
In the popular Davies–Meyer mode used in MD{4,5} and SHA-{0,1,2}, where h_i = h_{i-1} ⊕ E(m_i, h_{i-1}), every message block m_i has a fixed point h_{i-1} = D(m_i, 0). This makes the expandable-message attack much easier.
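For illustration of the length-extension point, here is a toy Merkle–Damgård hash in Python (the compression function is deliberately weak and all parameters are made up): knowing only H(m) and len(m), anyone can compute H(m || padding || suffix) without knowing m.

    # Toy Merkle-Damgard hash demonstrating the length-extension property.
    import struct

    BLOCK = 16  # toy block size in bytes

    def compress(state: int, block: bytes) -> int:
        # Deliberately weak toy compression function; only the chaining structure matters here.
        for b in block:
            state = ((state ^ b) * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
        return state

    def md_pad(msg_len: int) -> bytes:
        # Merkle-Damgard strengthening: 0x80, zero bytes, then the 8-byte message length.
        pad = b"\x80"
        while (msg_len + len(pad) + 8) % BLOCK:
            pad += b"\x00"
        return pad + struct.pack(">Q", msg_len)

    def md_hash(msg: bytes, state: int = 0xCBF29CE484222325, total_len: int = None) -> int:
        data = msg + md_pad(len(msg) if total_len is None else total_len)
        for i in range(0, len(data), BLOCK):
            state = compress(state, data[i:i + BLOCK])
        return state

    secret = b"key:hunter2"                 # unknown to the attacker
    digest = md_hash(secret)                # attacker only sees this and len(secret)

    suffix = b";admin=true"
    glue = md_pad(len(secret))              # padding the honest hash would have used
    forged = md_hash(suffix, state=digest,  # continue hashing from the known digest
                     total_len=len(secret) + len(glue) + len(suffix))

    assert forged == md_hash(secret + glue + suffix)   # matches the honest hash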
The SHA-3 contest notice (070911510–7512–01) also suggests randomized hashing and resistance against length extension (currently achieved with HMAC for MD5/SHA-1/SHA-2) and parallelization (a few hashes specify a tree hashing mode).
There's a general trend towards avoiding potential table lookups (e.g. Threefish, XTEA) to mitigate cache timing attacks.

What is currently the most secure one-way encryption algorithm?

As many will know, one-way encryption is a handy way to encrypt user passwords in databases. That way, even the administrator of the database cannot know a user's password, but will have to take a password guess, encrypt that with the same algorithm and then compare the result with the encrypted password in the database. This means that the process of figuring out the password requires massive amounts of guesses and a lot of processing power.
Seeing that computers just keep getting faster and that mathematicians are still developing these algorithms, I'm wondering which one is the most secure considering modern computing power and encryption techniques.
I've been using MD5 almost exclusively for years now, and I'm wondering if there's something more I should be doing. Should I be contemplating a different algorithm?
Another related question: How long should a field typically be for such an encrypted password? I must admit that I know virtually nothing about encryption, but I'm assuming that an MD5 hash (as an example) can be longer and would presumably take more processing power to crack. Or does the length of the field not matter at all, provided that the encrypted password fits in it in the first place?
Warning: Since this post was written in 2010, GPUs have been widely deployed to brute-force password hashes. Moderately-priced GPUs can run ten billion MD5s per second. This means that even a completely-random 8-character alphanumeric password (62 possible characters) can be brute-forced in 6 hours. SHA-1 is only slightly slower; it'd take one day. Your users' passwords are much weaker, and (even with salting) will fall at a rate of thousands of passwords per second. Hash functions are designed to be fast. You don't want this for passwords. Use scrypt, bcrypt, or PBKDF2.
MD5 was found to be weak back in 1996, and should not be used anymore for cryptographic purposes. SHA-1 is a commonly used replacement, but has similar problems. The SHA-2 family of hash functions are the current replacement of SHA-1. The members of SHA-2 are individually referred to as SHA-224, SHA-256, SHA-384, and SHA-512.
At the moment, several hash functions are competing to become SHA-3, the next standardised cryptographic hashing algorithm. A winner will be chosen in 2012. None of these should be used yet!
For password hashing, you may also consider using something like bcrypt. It is designed to be slow enough to make large-scale brute force attacks infeasible. You can tune the slowness yourself, so it can be made slower as computers become faster.
Warning: bcrypt is based on an older two-way encryption algorithm, Blowfish, for which better alternatives exist today. I do not think that the cryptographic hashing properties of bcrypt are completely understood. Someone correct me if I'm wrong; I have never found a reliable source that discusses bcrypt's properties (other than its slowness) from a cryptographic perspective.
It may be somewhat reassuring that the risk of collisions matters less for password hashing than it does for public-key cryptography or digital signatures. Using MD5 today is a terrible idea for SSL, but not equally disastrous for password hashing. But if you have the choice, simply pick a stronger one.
Using a good hash function is not enough to secure your passwords. You should hash the passwords together with salts that are long and cryptographically random. You should also help your users pick stronger passwords or pass phrases if possible. Longer is always better.
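For reference, a minimal sketch of salted, deliberately slow password hashing using only the Python standard library (the iteration count and salt length are illustrative choices; bcrypt would need a third-party package):

    # Salted, slow password hashing with PBKDF2 (standard library only).
    import hashlib, hmac, secrets

    ITERATIONS = 200_000   # tune upward as hardware gets faster

    def hash_password(password: str) -> tuple[bytes, bytes]:
        salt = secrets.token_bytes(16)                   # long, cryptographically random salt
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
        return salt, digest                              # store both in the database

    def check_password(password: str, salt: bytes, digest: bytes) -> bool:
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
        return hmac.compare_digest(candidate, digest)    # constant-time comparison

    salt, digest = hash_password("correct horse battery staple")
    print(check_password("correct horse battery staple", salt, digest))   # True
    print(check_password("tr0ub4dor&3", salt, digest))                    # False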
Great question! This page is a good read. In particular, the author claims that MD5 is not appropriate for hashing passwords:
The problem is that MD5 is fast. So are its modern competitors, like SHA1 and SHA256. Speed is a design goal of a modern secure hash, because hashes are a building block of almost every cryptosystem, and usually get demand-executed on a per-packet or per-message basis.
Speed is exactly what you don’t want in a password hash function.
The article then goes on to explain some alternatives, and recommends Bcrypt as the "correct choice" (his words, not mine).
Disclaimer: I have not tried Bcrypt at all. Consider this a friendly recommendation but not something I can back up with my own technical experience.
To increase password strength you should use a wider variety of symbols. With 8-10 characters a password already becomes pretty hard to crack. Making it longer makes it more secure still, especially if you mix numeric, alphabetic and other characters.
SHA-1 is another hashing (one-way encryption) algorithm; it is slower, but it has a longer digest (encoded message) of 160 bits, where MD5 only has 128 bits.
SHA-2 is even more secure, but it is used less.
Salting the password always adds an extra level of defense:

    // PHP: append a fixed salt to the password before hashing
    $salt = 'asfasdfasdf0a8sdflkjasdfapsdufp';
    $hashed = md5( $userPassword . $salt );
Seeing that computers just keep getting faster and that mathematicians are still developing these algorithms
RSA encryption is secure in that it relies on a really big number being hard to factor. Eventually, computers will get fast enough to factor the number in a reasonable amount of time. To stay ahead of the curve, you use a bigger number.
However, for most web sites, the purpose of hashing passwords is to make it inconvenient for someone with access to the database to read the password, not to provide security. For that purpose, MD5 is fine [1].
The implication here is that if a malicious user gains access to your entire database, they don't need the password. (The lock on the front door won't stop me from coming in the window.)
[1] Just because MD5 is "broken" doesn't mean you can just reverse it whenever you want.
Besides being a cryptographically secure one-way function, a good hash function for password protection should be hard to brute force - i.e. slow by design. scrypt is one of the best in that area. From the homepage:
We estimate that on modern (2009) hardware, if 5 seconds are spent computing a derived key, the cost of a hardware brute-force attack against scrypt is roughly 4000 times greater than the cost of a similar attack against bcrypt (to find the same password), and 20000 times greater than a similar attack against PBKDF2.
That said, of the commonly available hash functions, doing a few thousand iterations of anything from the SHA family is pretty reasonable protection for non-critical passwords.
Also, always add a salt to make it impossible to share effort for brute forcing many hashes at a time.
NIST is currently running a contest to select a new hashing algorithm, just as they did to select the AES encryption algorithm. So the answer to this question will likely be different in a couple of years.
You can look up the submissions and study them for yourself to see if there's one that you'd like to use.

Why do you need lots of randomness for effective encryption?

I've seen it mentioned in many places that randomness is important for generating keys for symmetric and asymmetric cryptography and when using the keys to encrypt messages.
Can someone provide an explanation of how security could be compromised if there isn't enough randomness?
Randomness means unguessable input. If the input is guessable, then the output can be easily calculated. That is bad.
For example, Debian had a long-standing bug in its SSL implementation that failed to gather enough randomness when creating a key. This resulted in the software generating one of only 32k possible keys. It is thus easily possible to decrypt anything encrypted with such a key by simply trying all 32k possibilities, which is very fast given today's processor speeds.
The important feature of most cryptographic operations is that they are easy to perform if you have the right information (e.g. a key) and infeasible to perform if you don't have that information.
For example, symmetric cryptography: if you have the key, encrypting and decrypting is easy. If you don't have the key (and don't know anything about its construction) then you must embark on something expensive like an exhaustive search of the key space, or a more-efficient cryptanalysis of the cipher which will nonetheless require some extremely large number of samples.
On the other hand, if you have any information on likely values of the key, your exhaustive search of the keyspace is much easier (or the number of samples you need for your cryptanalysis is much lower). For example, it is (currently) infeasible to perform 2^128 trial decryptions to discover what a 128-bit key actually is. If you know the key material came out of a time value that you know within a billion ticks, then your search just became 340282366920938463463374607431 times easier.
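A toy illustration of that last point (the keygen function and seed source are made up for the example): if the "random" key really comes from a low-entropy seed such as a timestamp, the attacker only has to search the seed space, not the key space.

    # A predictable seed shrinks a 128-bit key space to a handful of candidates.
    import random, time

    def weak_keygen(seed: int) -> bytes:
        # Hypothetical flawed generator: 128-bit key from a non-cryptographic PRNG seed.
        return random.Random(seed).getrandbits(128).to_bytes(16, "big")

    victim_seed = int(time.time())        # seeded with the current time, in whole seconds
    key = weak_keygen(victim_seed)

    # Attacker knows the key was generated "sometime in the last day":
    # that's only 86,400 seeds to try instead of 2**128 keys.
    now = int(time.time())
    recovered = any(weak_keygen(s) == key for s in range(now - 86_400, now + 1))
    print(recovered)   # True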
To decrypt a message, you need to know the right key.
The more possible keys there are to try, the harder it is to decrypt the message.
Taking an extreme example, let's say there's no randomness at all. When I generate a key to use in encrypting my messages, I'll always end up with the exact same key. No matter where or when I run the keygen program, it'll always give me the same key.
That means anyone who has access to the program I used to generate the key can trivially decrypt my messages. After all, they just have to ask it to generate a key too, and they get one identical to the one I used.
So we need some randomness to make it unpredictable which key you end up using. As David Schmitt mentions, Debian had a bug which made it generate only a small number of unique keys, which means that to decrypt a message encrypted by the default OpenSSL implementation on Debian, I just have to try this smaller number of possible keys. I can ignore the vast number of other valid keys, because Debian's SSL implementation will never generate those.
On the other hand, if there was enough randomness in the key generation, it's impossible to guess anything about the key. You have to try every possible bit pattern. (and for a 128-bit key, that's a lot of combinations.)
It has to do with some of the basic reasons for cryptography:
Make sure a message isn't altered in transit (Immutable)
Make sure a message isn't read in transit (Secure)
Make sure the message is from who it says it's from (Authentic)
Make sure the message isn't the same as one previously sent (No Replay)
etc
There's a few things you need to include, then, to make sure that the above is true. One of the important things is a random value.
For instance, if I encrypt "Too many secrets" with a key, it might come out with "dWua3hTOeVzO2d9w"
There are a few problems with this. First, an attacker might be able to break the encryption more easily since I'm using a very limited set of characters. Second, if I send the same message again, it's going to come out exactly the same. Lastly, an attacker could record it and send the message again, and the recipient wouldn't know that I didn't send it, even if the attacker never broke the encryption.
If I add some random garbage to the string each time I encrypt it, then not only does it make it harder to crack, but the encrypted message is different each time.
The other features of cryptography in the bullets above are fixed using means other than randomness (seed values, two way authentication, etc) but the randomness takes care of a few problems, and helps out on other problems.
A bad source of randomness limits the character set again, so it's easier to break, and if it's easy to guess, or otherwise limited, then the attacker has fewer paths to try when doing a brute force attack.
-Adam
A common pattern in cryptography is the following (sending text from alice to bob):
Take plaintext p
Generate random k
Encrypt p with k using symmetric encryption, producing crypttext c
Encrypt k with bob's public key, using asymmetric encryption, producing x
Send c+x to bob
Bob reverses the process, decrypting x using his private key to obtain k, then decrypting c with k
The reason for this pattern is that symmetric encryption is much faster than asymmetric encryption. Of course, it depends on a good random number generator to produce k, otherwise the bad guys can just guess it.
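A condensed sketch of that pattern, assuming the third-party cryptography package (names and messages are placeholders); AES-GCM plays the fast symmetric role and RSA-OAEP protects the small random key:

    # Hybrid encryption: random symmetric key for the data, asymmetric for the key itself.
    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # Bob's key pair; Alice only needs the public half.
    bob_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    bob_public = bob_private.public_key()

    # Alice: random k, symmetric encryption of p, asymmetric encryption of k.
    p = b"meet me at noon"
    k = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)                      # sent along with c+x; need not be secret
    c = AESGCM(k).encrypt(nonce, p, None)
    oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None)
    x = bob_public.encrypt(k, oaep)

    # Bob: recover k with his private key, then decrypt c with k.
    k2 = bob_private.decrypt(x, oaep)
    print(AESGCM(k2).decrypt(nonce, c, None))   # b"meet me at noon"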
Here's a "card game" analogy: Suppose we play several rounds of a game with the same deck of cards. The shuffling of the deck between rounds is the primary source of randomness. If we didn't shuffle properly, you could beat the game by predicting cards.
When you use a poor source of randomness to generate an encryption key, you significantly reduce the entropy (or uncertainty) of the key value. This could compromise the encryption because it makes a brute-force search over the key space much easier.
Work out this problem from Project Euler, and it will really drive home what "lots of randomness" will do for you. When I saw this question, that was the first thing that popped into my mind.
Using the method he talks about there, you can easily see what "more randomness" would gain you.
A pretty good paper that outlines why not being careful with randomness can lead to insecurity:
http://www.cs.berkeley.edu/~daw/papers/ddj-netscape.html
This describes how back in 1995 the Netscape browser's key SSL implementation was vulnerable to guessing the SSL keys because of a problem seeding the PRNG.
