Rainbow Tables: How to defend against them? - windows

I recently obtained the l0pht-CD for Windows and tried it out on my PC, and it works!
2600hertz.wordpress.com/2009/12/22/100-windows-xp-vista-7-password-recovery
I have also read
kestas.kuliukas.com/RainbowTables/
I'm designing a "Login-Simulator" that stores passwords in a similar manner. The current implementation will be vulnerable to the above attack. Could anyone please illustrate, in as simple terms as possible, how to strengthen it against such a rainbow table attack?
My goal: build the "Login-Simulator" to be as secure as possible. (Read: hacking competition ;-) )
Thank you.

Since a rainbow table is a series of precomputed hash chains for various passwords, it is easily foiled by adding a salt to the passwords. Because hash functions usually remove much of the local correspondence between input and output (that is, a small change in input produces large, seemingly unrelated changes in output), even a small salt will be immensely effective.
Best of all, the salt does not need to be secret for this to be effective; even a known salt forces the attacker to recompute the rainbow table for every possible password-salt combination, which defeats the purpose of precomputation.
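To make the idea concrete, here is a minimal sketch of per-user salting (a fast hash is used purely for illustration; a real login system should use a slow, tunable hash such as bcrypt, as the next answer explains, and the function and field names here are invented for the example):
<?php
function makeRecord(string $password): array
{
    $salt = bin2hex(random_bytes(16));            // unique, random salt per user
    $hash = hash('sha256', $salt . $password);    // store salt and hash together
    return ['salt' => $salt, 'hash' => $hash];
}

function checkPassword(string $attempt, array $record): bool
{
    $candidate = hash('sha256', $record['salt'] . $attempt);
    return hash_equals($record['hash'], $candidate);   // constant-time comparison
}
A rainbow table precomputed without knowledge of each user's salt is useless against records stored this way.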

You should use bcrypt, which has been designed by professional cryptographers to do exactly what you're looking for.
In general, you should never invent your own encryption / hashing schemes.
Cryptography is extremely complicated, and you should stick to what has been proven to work.
However, the basic answer to your question is to add a random per-user salt, and switch to a slower hash.
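A hedged sketch of that advice using PHP's built-in bcrypt wrapper; password_hash() generates a random per-user salt and a configurable work factor and embeds both in the string it returns. The variable names are placeholders:
<?php
// Registration: store the returned string; it already contains algorithm, cost and salt.
$storedHash = password_hash($newPassword, PASSWORD_BCRYPT, ['cost' => 12]);

// Login: password_verify() reads the salt and cost back out of the stored string.
if (password_verify($attempt, $storedHash)) {
    // password is correct
}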

Related

What is the output of a fingerprint scanner? Is there any deterministic identifying information?

I am planning on generating a set of public/private keys from a deterministic identifying piece of information from a person and was planning on using fingerprints.
My question, therefore, is: what is the output of a fingerprint scanner? Is there any deterministic output I could use, or is it always going to be a matter of "confidence level"? i.e. Do I always get a "number" which, if matched exactly to the database, will allow access, or do I rather get a number which, if "close enough" to the stored value on the database, allows access, based on a high degree of confidence, rather than an exact match?
I am quite sure the second option is the answer but just wanted to double-check. Is there any way to get some sort of deterministic output? My hope was to re-generate keys every time rather than actually storing fingerprint data. That way a wrong fingerprint would simply generate a new and useless key.
Any suggestions?
Thanks in advance.
I would advise against it for several reasons.
The fingerprints are not entirely deterministic. As suggested in ImSimplyAnna's answer, you might 'round' the results in order to have a better chance of obtaining a deterministic result. But that would significantly reduce the number of possible/plausible fingerprints, and thus not meet the search-space size requirement for a cryptographic algorithm. On top of that, I suspect the entropy of such a result would be rather low compared to the requirements of modern algorithms, which are always based on high-quality random numbers.
Fingerprints are not secret: we expose them to everyone all the time, they can be revealed to an attacker at any moment, and they can be captured in a picture using a simple camera. A key must be a secret, and the only place we know we can store secrets without exposing them is our brain (which is why we use passwords).
An important feature of cryptographic keys is the possibility of generating new ones if there is reason to believe the current ones might be compromised. This is not possible with fingerprints.
That is why I would advise against it. More generally, I discourage anyone (myself included) from writing his/her own cryptographic algorithm, because it is so easy to screw up. It might be the easiest thing to screw up out of all the things you could write, because attackers are so vicious!
The only good approach, if you're not a skilled specialist, is to use libraries that are used all around, because they've been written by experts on the matter, and they've been subject to many attacks and attempts to break them, so the ones still standing offer much better levels of protection than anything a non-specialist could write (or basically anything a single human could write).
You can also have a look at this question on the crypto Stack Exchange. They also discourage the OP from using anything other than a battle-hardened algorithm or protocol.
Edit:
"I am planning on generating a set of public/private keys from a deterministic identifying piece of information"
Actually, it did not strike me at first (it should have), but keys MUST NOT be generated from anything that is not random. NEVER.
You have to generate them randomly. If you don't, you already give the attacker more information than you should. Being a programmer does not make you a cryptographer. Your users' information is at stake; do not take any chances (and if you're not a cryptographer, you don't stand one).
A fingerprint scanner looks for features where the lines on the fingerprint either split or end. It then calculates the distances and angles between such features in an attempt to find a match.
Here's some more reading on the subject:
https://www.explainthatstuff.com/fingerprintscanners.html
in the section "How fingerprints are stored and compared".
The source is the best explanation I can find, but looking around some more it seems that all fingerprint scanners use some variety of that algorithm to generate data that can be matched.
Storing raw fingerprints would not only take up way more space on a database but also be a pretty significant security risk if that information was ever leaked, so it's not really done unless absolutely necessary.
Judging by that algorithm, I would assume that there is always some "confidence level". The angles and distances will never be 100% equal between scans, so there has to be some leeway to make sure a match is still found even if the finger is pressed against the scanner a bit harder or the finger is at a slightly different angle.
Based on this, I'd assume that generating a key pair based on a fingerprint would be possible, if you can figure out a way to make similar scans result in the same information. Simply rounding the angles and distances may work, but may introduce cases where two different people generate the same key pairs, or cases where different scans of the same fingerprint have a high chance of generating several different keys.
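A toy sketch of that kind of tolerance-based matching (the feature format and thresholds are invented, not a real biometric algorithm); it shows why a scan yields a confidence score rather than a value stable enough to feed a key derivation:
<?php
// Each feature is ['distance' => float, 'angle' => float] between two minutiae.
// Two scans of the same finger produce similar but never identical numbers.

function matchConfidence(array $stored, array $scan, float $tolerance = 5.0): float
{
    $hits = 0;
    foreach ($stored as $i => $feature) {
        if (!isset($scan[$i])) {
            continue;
        }
        $dDist  = abs($feature['distance'] - $scan[$i]['distance']);
        $dAngle = abs($feature['angle'] - $scan[$i]['angle']);
        if ($dDist <= $tolerance && $dAngle <= $tolerance) {
            $hits++;
        }
    }
    return $hits / max(1, count($stored));   // fraction of features that matched
}

// The scanner accepts the finger if matchConfidence() exceeds some threshold,
// e.g. 0.8 -- an exact, repeatable value never comes out of the process.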

What's the best hashing algorithm when using a smaller user database

When it comes to hashing passwords, the best algorithms are the slower ones, such as Argon2 and bcrypt. However, if you have a small user base, say around 10,000 users, are slower algorithms still the best solution?
There is absolutely no argument against using best practice with smaller databases. The work, whether you call bcrypt or SHA-*, is the same, maybe even smaller with bcrypt because the salt handling and storage is done for you.
By the way, this would not be the first database or codebase designed for a small audience that later grows or gets reused for a much larger project.

Password hashing algorithm that will keep password safe even from supercomputers?

I was researching how MD5 is known to have collisions, so it's not secure enough. I am looking for a hashing algorithm that even supercomputers will take time to break. Can you tell me which hashing algorithm will keep my passwords safe for the next 20 years of supercomputing advancement?
Use a key derivation function with a variable number of rounds, such as bcrypt.
The passwords you encrypt today, with a hashing difficulty that your own system can handle without slowing down, will always be vulnerable to the faster systems of 20 years in the future. But by increasing the number of rounds gradually over time you can increase the amount of work it takes to check a password in proportion with the increasing power of supercomputers. And you can apply more rounds to existing stored passwords without having to go back to the original password.
Will it hold up for another 20 years? Difficult to say: who knows what crazy quantum crypto and password-replacement schemes we might have by then? But it certainly worked for the last 10.
Note also that entities owning supercomputers and targeting particular accounts are easily going to have enough power to throw at it that you can never protect all of your passwords. The aim of password hashing is to mitigate the damage from a database leak, by limiting the speed at which normal attackers can recover passwords, so that as few accounts as possible have already been compromised by the time you've spotted the leak and issued a notice telling everyone to change their passwords. But there is no 100% solution.
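One common way to carry out that gradual increase in PHP (a sketch, and it does rely on the user logging in so the plaintext is briefly available): check the stored hash's cost at each login and re-hash with the current setting when it is out of date. Variable names are placeholders:
<?php
$currentOptions = ['cost' => 14];   // raise this value as hardware gets faster

if (password_verify($password, $storedHash)) {
    // Hash was made with an older, cheaper cost? Upgrade it while we have the password.
    if (password_needs_rehash($storedHash, PASSWORD_BCRYPT, $currentOptions)) {
        $storedHash = password_hash($password, PASSWORD_BCRYPT, $currentOptions);
        // ...persist the upgraded $storedHash to the database...
    }
}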
As someone else said, what you're asking is practically impossible to answer. Who knows what breakthroughs will be made in processing power over the next twenty years? Or mathematics?
In addition you aren't telling us many other important factors, including against which threat models you aim to protect. For example, are you trying to defend against an attacker getting a hold of a hashed password database and doing offline brute-forcing? An attacker with custom ASICs trying to crack one specific password? Etc.
With that said, there are things you can do to be as secure and future-proof as possible.
First of all, don't just use vanilla cryptographic hash algorithms; they aren't designed with your application in mind. Indeed they are designed for other applications with different requirements. For one thing, they are fast because speed is an important criterion for a hash function. And that works against you in this case.
Additionally some of the algorithms you mention, like MD5 or SHA1 have weaknesses (some theoretical, some practical) and should not be used.
Prefer something like bcrypt, an algorithm designed to resist brute-force attacks by being much slower than a general-purpose cryptographic hash, and whose cost can be "tuned" as necessary.
Alternatively, use something like PBKDF2, which is designed to run a password through a function of your choice a configurable number of times, along with a salt; this also makes brute forcing much more difficult.
Adjust the iteration count depending on your usage model, keeping in mind that the slower it is, the more security against brute-force you have.
In selecting a cryptographic hash function for PBKDF, prefer SHA-3 or, if you can't use that, prefer one of the long variants of SHA-2: SHA-384 or SHA-512. I'd steer clear of SHA-256 although I don't think there's an issue with it in this scenario.
In any case, use the largest and best salt you can; I'd suggest that you use a good cryptographically secure PRNG and never use a salt of less than 64 bits (note that I am talking about the length of the generated salt, not of the value returned).
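A sketch of the PBKDF2 and salt advice above using PHP's built-in hash_pbkdf2(); the iteration count and salt length are illustrative, not prescriptive:
<?php
$salt       = random_bytes(16);     // 128 bits from a cryptographically secure source
$iterations = 200000;               // tune to what your hardware can tolerate
$derived    = hash_pbkdf2('sha512', $password, $salt, $iterations, 64, true);

// Store $salt, $iterations and $derived together so the same derivation
// can be repeated when the user logs in.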
Will these recommendations help 20 years down the road? Who knows - I'd err on the side of caution and say "no". But if you need security for that long a timeframe, you should consider using something other than passwords.
Anyways, I hope this helps.
Here are two pedantic answers to this question:
If P = NP, there is provably no such hash function (and vice versa, incidentally). Since it has not been proven that P != NP at the time of this writing, we cannot make any strong guarantees of that nature.
That being said, I think it's safe to say that supercomputers developed within the next 20 years will take "time" to break your hash, regardless of what it is. Even if it is stored in plaintext, some time is required for I/O.
Thus, the answer to your question is both yes and no :)

What is currently the most secure one-way encryption algorithm?

As many will know, one-way encryption is a handy way to encrypt user passwords in databases. That way, even the administrator of the database cannot know a user's password, but will have to take a password guess, encrypt that with the same algorithm and then compare the result with the encrypted password in the database. This means that the process of figuring out the password requires massive amounts of guesses and a lot of processing power.
Seeing that computers just keep getting faster and that mathematicians are still developing these algorithms, I'm wondering which one is the most secure considering modern computing power and encryption techniques.
I've been using MD5 almost exclusively for years now, and I'm wondering if there's something more I should be doing. Should I be contemplating a different algorithm?
Another related question: How long should a field typically be for such an encrypted password? I must admit that I know virtually nothing about encryption, but I'm assuming that an MD5 hash (as an example) can be longer and would presumably take more processing power to crack. Or does the length of the field not matter at all, provided that the encrypted password fits in it in the first place?
Warning: Since this post was written in 2010, GPUs have been widely deployed to brute-force password hashes. Moderately priced GPUs can run ten billion MD5s per second. This means that even a completely random 8-character alphanumeric password (62 possible characters) can be brute forced in 6 hours. SHA-1 is only slightly slower; it'd take one day. Your users' passwords are much weaker, and (even with salting) will fall at a rate of thousands of passwords per second. Hash functions are designed to be fast. You don't want this for passwords. Use scrypt, bcrypt, or PBKDF2.
MD5 was found to be weak back in 1996, and should not be used anymore for cryptographic purposes. SHA-1 is a commonly used replacement, but has similar problems. The SHA-2 family of hash functions are the current replacement of SHA-1. The members of SHA-2 are individually referred to as SHA-224, SHA-256, SHA-384, and SHA-512.
At the moment, several hash functions are competing to become SHA-3, the next standardised cryptographic hashing algorithm. A winner will be chosen in 2012. None of these should be used yet!
For password hashing, you may also consider using something like bcrypt. It is designed to be slow enough to make large scale brute force attacks infeasible. You can tune the slowness yourself, so it can be made slower when computers are becoming faster.
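A rough sketch of that tuning in PHP: time a few bcrypt cost factors and pick the highest one your servers can tolerate. The 0.25-second target is an assumption, not a recommendation from this answer:
<?php
$target = 0.25;   // seconds per hash; pick what your login path can afford

for ($cost = 10; $cost <= 15; $cost++) {
    $start = microtime(true);
    password_hash('benchmark-password', PASSWORD_BCRYPT, ['cost' => $cost]);
    $elapsed = microtime(true) - $start;
    echo "cost=$cost took {$elapsed}s\n";
    if ($elapsed >= $target) {
        break;   // slow enough for now; revisit as computers get faster
    }
}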
Warning: bcrypt is based on an older two-way encryption algorithm, Blowfish, for which better alternatives exist today. I do not think that the cryptographic hashing properties of bcrypt are completely understood. Someone correct me if I'm wrong; I have never found a reliable source that discusses bcrypt's properties (other than its slowness) from a cryptographic perspective.
It may be somewhat reassuring that the risk of collisions matters less for password hashing than it does for public-key cryptography or digital signatures. Using MD5 today is a terrible idea for SSL, but not equally disastrous for password hashing. But if you have the choice, simply pick a stronger one.
Using a good hash function is not enough to secure your passwords. You should hash the passwords together with salts that are long and cryptographically random. You should also help your users pick stronger passwords or passphrases if possible. Longer is always better.
Great question! This page is a good read. In particular, the author claims that MD5 is not appropriate for hashing passwords:
The problem is that MD5 is fast. So are its modern competitors, like SHA1 and SHA256. Speed is a design goal of a modern secure hash, because hashes are a building block of almost every cryptosystem, and usually get demand-executed on a per-packet or per-message basis.
Speed is exactly what you don’t want in a password hash function.
The article then goes on to explain some alternatives, and recommends Bcrypt as the "correct choice" (his words, not mine).
Disclaimer: I have not tried Bcrypt at all. Consider this a friendly recommendation but not something I can back up with my own technical experience.
To increase password strength you should use a wider variety of symbols. If you have 8-10 characters in the password it becomes pretty hard to crack. Making it longer will also make it more secure, but only if you use a mix of numeric, alphabetic, and other characters.
SHA-1 is another hashing (one-way encryption) algorithm; it is slower, but it has a longer digest (encoded message): 160 bits, where MD5 only has 128 bits.
SHA-2 is even more secure, but it is used less.
Salting the password always adds an extra level of defense:
$salt = 'asfasdfasdf0a8sdflkjasdfapsdufp';
$hashed = md5( $userPassword . $salt );
Seeing that computers just keep getting faster and that mathematicians are still developing these algorithms
RSA encryption is secure in that it relies on a really big number being hard to factor. Eventually, computers will get fast enough to factor the number in a reasonable amount of time. To stay ahead of the curve, you use a bigger number.
However, for most web sites, the purpose of hashing passwords is to make it inconvenient for someone with access to the database to read the password, not to provide security. For that purpose, MD5 is fine.[1]
The implication here is that if a malicious user gains access to your entire database, they don't need the password. (The lock on the front door won't stop me from coming in the window.)
[1] Just because MD5 is "broken" doesn't mean you can just reverse it whenever you want.
Besides being a cryptographically secure one-way function, a good hash function for password protection should be hard to brute force - i.e. slow by design. scrypt is one of the best in that area. From the homepage:
We estimate that on modern (2009) hardware, if 5 seconds are spent computing a derived key, the cost of a hardware brute-force attack against scrypt is roughly 4000 times greater than the cost of a similar attack against bcrypt (to find the same password), and 20000 times greater than a similar attack against PBKDF2.
That said, among commonly available hash functions, doing a few thousand iterations of anything from the SHA family is pretty reasonable protection for non-critical passwords.
Also, always add a salt to make it impossible to share effort for brute forcing many hashes at a time.
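A literal sketch of "a few thousand iterations of anything from the SHA family" with a salt, shown only to illustrate the idea; a vetted construction such as scrypt, bcrypt, or PBKDF2 is still the better choice:
<?php
$salt = random_bytes(16);
$hash = hash('sha512', $salt . $password, true);
for ($i = 1; $i < 10000; $i++) {                       // a few thousand iterations
    $hash = hash('sha512', $hash . $salt . $password, true);
}
// Store $salt and bin2hex($hash) together; repeat the same loop to verify.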
NIST is currently running a contest to select a new hashing algorithm, just as they did to select the AES encryption algorithm. So the answer to this question will likely be different in a couple of years.
You can look up the submissions and study them for yourself to see if there's one that you'd like to use.

Why do you need lots of randomness for effective encryption?

I've seen it mentioned in many places that randomness is important for generating keys for symmetric and asymmetric cryptography and when using the keys to encrypt messages.
Can someone provide an explanation of how security could be compromised if there isn't enough randomness?
Randomness means unguessable input. If the input is guessable, then the output can be easily calculated. That is bad.
For example, Debian had a long-standing bug in its SSL implementation that failed to gather enough randomness when creating a key. This resulted in the software generating one of only 32k possible keys. It is thus easily possible to decrypt anything encrypted with such a key by simply trying all 32k possibilities, which is very fast given today's processor speeds.
The important feature of most cryptographic operations is that they are easy to perform if you have the right information (e.g. a key) and infeasible to perform if you don't have that information.
For example, symmetric cryptography: if you have the key, encrypting and decrypting is easy. If you don't have the key (and don't know anything about its construction) then you must embark on something expensive like an exhaustive search of the key space, or a more-efficient cryptanalysis of the cipher which will nonetheless require some extremely large number of samples.
On the other hand, if you have any information on likely values of the key, your exhaustive search of the keyspace is much easier (or the number of samples you need for your cryptanalysis is much lower). For example, it is (currently) infeasible to perform 2^128 trial decryptions to discover what a 128-bit key actually is. If you know the key material came out of a time value that you know within a billion ticks, then your search just became 340282366920938463463374607431 times easier.
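A quick check of that arithmetic, assuming PHP's bcmath extension for the big integers:
<?php
// 2^128 possible keys versus a keyspace narrowed to about 10^9 candidate seeds.
echo bcpow('2', '128'), "\n";                        // 340282366920938463463374607431768211456
echo bcdiv(bcpow('2', '128'), '1000000000'), "\n";   // 340282366920938463463374607431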
To decrypt a message, you need to know the right key.
The more possible keys an attacker has to try, the harder it is to decrypt the message.
Taking an extreme example, let's say there's no randomness at all. When I generate a key to use in encrypting my messages, I'll always end up with the exact same key. No matter where or when I run the keygen program, it'll always give me the same key.
That means anyone who has access to the program I used to generate the key can trivially decrypt my messages. After all, they just have to ask it to generate a key too, and they get one identical to the one I used.
So we need some randomness to make it unpredictable which key you end up using. As David Schmitt mentions, Debian had a bug which made it generate only a small number of unique keys, which means that to decrypt a message encrypted by the default OpenSSL implementation on Debian, I just have to try this smaller number of possible keys. I can ignore the vast number of other valid keys, because Debian's SSL implementation will never generate those.
On the other hand, if there is enough randomness in the key generation, it's impossible to guess anything about the key. You have to try every possible bit pattern (and for a 128-bit key, that's a lot of combinations).
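A sketch of the difference, with a deliberately broken "no randomness" generator for contrast (both functions are invented for the example):
<?php
function badKey(): string
{
    // Deterministic: every machine, every run, the exact same 256-bit value.
    return hash('sha256', 'fixed seed', true);
}

function goodKey(): string
{
    // Unpredictable: 2^256 equally likely values from a CSPRNG.
    return random_bytes(32);
}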
It has to do with some of the basic reasons for cryptography:
Make sure a message isn't altered in transit (Immutable)
Make sure a message isn't read in transit (Secure)
Make sure the message is from who it says it's from (Authentic)
Make sure the message isn't the same as one previously sent (No Replay)
etc
There's a few things you need to include, then, to make sure that the above is true. One of the important things is a random value.
For instance, if I encrypt "Too many secrets" with a key, it might come out with "dWua3hTOeVzO2d9w"
There are a few problems with this. An attacker might be able to break the encryption more easily since I'm using a very limited set of characters. If I send the same message again, it's going to come out exactly the same. Lastly, an attacker could record it and send the message again, and the recipient wouldn't know that I didn't send it, even if the attacker never broke the encryption.
If I add some random garbage to the string each time I encrypt it, then not only does it make it harder to crack, but the encrypted message is different each time.
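A sketch of that "random garbage" in practice, using a fresh random nonce with an authenticated cipher (assumes PHP's sodium extension); the same plaintext now encrypts to two different ciphertexts:
<?php
$key = sodium_crypto_secretbox_keygen();
$msg = 'Too many secrets';

$nonce1 = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);
$nonce2 = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);

$c1 = sodium_crypto_secretbox($msg, $nonce1, $key);
$c2 = sodium_crypto_secretbox($msg, $nonce2, $key);

var_dump($c1 === $c2);   // bool(false): identical messages, different ciphertexts

// The nonce is sent alongside the ciphertext; it does not need to be secret,
// only never reused with the same key.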
The other features of cryptography in the bullets above are addressed using means other than randomness (seed values, two-way authentication, etc.), but randomness takes care of a few of these problems and helps out with the others.
A bad source of randomness limits the character set again, so it's easier to break, and if it's easy to guess, or otherwise limited, then the attacker has fewer paths to try when doing a brute force attack.
-Adam
A common pattern in cryptography is the following (sending text from Alice to Bob):
Take plaintext p
Generate random k
Encrypt p with k using symmetric encryption, producing ciphertext c
Encrypt k with Bob's public key, using asymmetric encryption, producing x
Send c+x to Bob
Bob reverses the process, decrypting x using his private key to obtain k
The reason for this pattern is that symmetric encryption is much faster than asymmetric encryption. Of course, it depends on a good random number generator to produce k, otherwise the bad guys can just guess it.
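A hedged sketch of that hybrid pattern with PHP's OpenSSL bindings (key loading, error handling, and encoding are omitted; names are placeholders):
<?php
function hybridEncrypt(string $plaintext, $bobPublicKey): array
{
    $k   = random_bytes(32);                                    // step 2: random symmetric key
    $iv  = random_bytes(openssl_cipher_iv_length('aes-256-gcm'));
    $tag = '';
    $c = openssl_encrypt($plaintext, 'aes-256-gcm', $k,         // step 3: symmetric encryption
                         OPENSSL_RAW_DATA, $iv, $tag);
    openssl_public_encrypt($k, $x, $bobPublicKey,               // step 4: wrap k for Bob
                           OPENSSL_PKCS1_OAEP_PADDING);
    return ['c' => $c, 'iv' => $iv, 'tag' => $tag, 'x' => $x];  // step 5: send all of this
}

function hybridDecrypt(array $msg, $bobPrivateKey): string
{
    openssl_private_decrypt($msg['x'], $k, $bobPrivateKey,      // step 6: unwrap k
                            OPENSSL_PKCS1_OAEP_PADDING);
    return openssl_decrypt($msg['c'], 'aes-256-gcm', $k,
                           OPENSSL_RAW_DATA, $msg['iv'], $msg['tag']);
}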
Here's a "card game" analogy: Suppose we play several rounds of a game with the same deck of cards. The shuffling of the deck between rounds is the primary source of randomness. If we didn't shuffle properly, you could beat the game by predicting cards.
When you use a poor source of randomness to generate an encryption key, you significantly reduce the entropy (or uncertainty) of the key value. This could compromise the encryption because it makes a brute-force search over the key space much easier.
Work out this problem from Project Euler, and it will really drive home what "lots of randomness" will do for you. When I saw this question, that was the first thing that popped into my mind.
Using the method he talks about there, you can easily see what "more randomness" would gain you.
A pretty good paper that outlines why not being careful with randomness can lead to insecurity:
http://www.cs.berkeley.edu/~daw/papers/ddj-netscape.html
This describes how, back in 1995, the Netscape browser's SSL implementation was vulnerable to having its SSL keys guessed because of a problem seeding the PRNG.
