Why is BCrypt more secure than just storing a salt and an encrypted password in the database? - bcrypt

I am reading this article and it seems like BCrypt:
is slow to compute a hash from a password (a good thing)
doesn't store the salt in a separate database field but embeds it directly in the hashed password
uses a log_rounds parameter which says how many times to compute the internal hash function.
So the hash would look something like this:
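from bcrypt import hashpw, gensalt  # assumes the py-bcrypt package, whose gensalt() takes log_rounds
hashed = hashpw(plaintext_password, gensalt(log_rounds=13))
print(hashed)
# $2a$13$ZyprE5MRw2Q3WpNOGZWGbeG7ADUre1Q8QO.uUUtcbqloU0yvzavOm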
But if that's what's stored in the database, and the database gets hacked, aren't we still vulnerable? The BCrypt hash contains both the salt and the hashed password, so why is this better than just storing the salt and the hashed password in the database (the article calls that bad solution #4)?
Is the major difference the slowness of BCrypt's hashing mechanism which makes it hard and expensive to BCrypt a long list of common passwords?

You can't just hash a password: an unsalted hash is vulnerable to precomputed dictionary attacks. Therefore you salt the password before hashing it, which is what BCrypt does.
Password salts can be public, but they must be unique for each password. Their point is to prevent dictionary attacks on hashes (so an attacker can't look through a list of premade hashes that correspond to passwords).
Like PBKDF2, Bcrypt is an adaptive function: you can increase the iteration count later to keep the hash resistant to brute-force attacks as more computing power becomes available. In addition, Bcrypt is harder to accelerate on GPUs than PBKDF2. So yes, the deliberate slowness is the main thing that makes running a long list of common passwords against leaked hashes expensive.
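A minimal sketch of the two ideas above, using PBKDF2 from Python's standard library rather than bcrypt itself (bcrypt needs a third-party package). The function names, the `$`-separated record layout and the iteration count are illustrative assumptions, not bcrypt's actual format. It shows that the salt travels inside the stored string and that the work factor is stored with it, so it can be raised later:

```python
import base64
import hashlib
import hmac
import secrets


def hash_password(password: str, iterations: int = 200_000) -> str:
    """Return 'iterations$salt$digest' with a fresh random salt (hypothetical layout)."""
    salt = secrets.token_bytes(16)  # unique per password, not secret
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return "$".join([
        str(iterations),
        base64.b64encode(salt).decode(),
        base64.b64encode(digest).decode(),
    ])


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest using the salt and cost embedded in the stored record."""
    iterations, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, int(iterations))
    return hmac.compare_digest(candidate, expected)  # constant-time comparison


record = hash_password("correct horse")
print(verify_password("correct horse", record))  # True
print(verify_password("wrong guess", record))    # False
```

An attacker who steals `record` learns the salt and the cost, but still has to pay the full iteration count for every guess against every user.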

Related

BCrypt - is known prefix a vulnerability?

I'm generating API tokens for my web app. Let's say, 30 symbols in length.
To protect against database leaks, I'd like to store them in encrypted form. BCrypt, because that's the easiest with the framework I'm using.
Since every user might have many tokens, for their convenience I'd like to have the first few symbols of the token available, so that they can, for example see that token qwert... was generated last month, and last used today, but asdfg... has not yet been used.
If I store first 5 symbols of the token in plain text, next to the BCrypt-ed form of the full token, do I have:
Effectively a 25 symbol token, but still reasonably secure system;
or
Security theater, because knowing the plaintext prefix enables attacker to easily force the whole token open?
tl;dr: Don't BCrypt API tokens; Use SHA256.
No, knowing the prefix (or any substring) doesn't weaken the BCrypt hash, beyond the obvious fact that you have reduced the difficulty of a bruteforce attack by shrinking the keyspace.
But regardless, this is a misapplication of BCrypt. BCrypt is for hashing short, poor quality, user-generated passwords, which are tested one time on login, when you can afford the relatively expensive BCrypt comparison.
Long, strongly random tokens don't need the protection BCrypt provides, and it is an egregious waste of computing resources to pay the cost of a BCrypt comparison on every API request your server handles. Your entire API winds up with awful response times for no good reason.
Use SHA256 or SHA512 for hashing tokens.
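As a rough sketch of that advice, assuming tokens come from a cryptographically secure generator: the function names and the five-character display prefix are illustrative, taken from the question rather than from any framework.

```python
import hashlib
import hmac
import secrets


def issue_token() -> tuple:
    """Return (token, display_prefix, sha256_hex); only the last two are stored."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, about 43 URL-safe characters
    return token, token[:5], hashlib.sha256(token.encode()).hexdigest()


def token_matches(presented: str, stored_hex: str) -> bool:
    """One cheap SHA-256 per request, compared in constant time."""
    presented_hex = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(presented_hex, stored_hex)
```

The token itself is shown to the user once and never stored. With this much randomness left after the visible prefix, a fast hash is enough.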

Is there a cryptographic disadvantage to applying bcrypt to an already hashed password

Imagine a scenario where a client application is sending a password to a backend server so that the server can validate that the user entered the correct password when being compared to a stored variation of the password.
The transport mechanism is HTTPS, with the server providing HSTS & HPKP to the user agent and preferring strong cryptographic ciphers, scoring A+ on the SSL Labs test. Nonetheless, we may wish to avoid sending the original user-provided password to the server from the user agent. Instead, perhaps we'd send a hash after a number of rounds of SHA-256 on the client.
On the server-side, for the storage of passwords we are using bcrypt with a large number of rounds.
From a cryptographic point of view, is there any disadvantage to performing bcrypt on the already SHA-256 hashed value as opposed to directly on the plain text password? Does the fixed-length nature of the input text when using hashes somehow undermine the strengths of the algorithm?
EDIT: I'm not asking about performance such as the memory, CPU, storage requirements or wall clock time required to calculate, store, send or compare values. I'm purely interested in whether applying a hash prior to applying bcrypt could weaken the strength of bcrypt in the case of a disclosure of the full list of stored values.
For anyone interested in this, I followed advice and asked on security.stackexchange.com here

Get constant SALT from encrypted and clear text values

I have a constant SALT that is appended to every cookie before it is encrypted with sha512. If I know the clear text and the final encrypted values of more than one cookie, is it possible to use a tool like john the ripper to guess the salt value?
The answers I found on the internet talk about finding the password, but I'm interested in finding the salt.
Short Answer:
No you can't.
Reasons:
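First of all, SHA-512 is a hash function. You can't "decrypt" hash functions. If that were possible, SHA-512 would not be safe.
For comparison, in February 2017 Google announced the first collision for SHA-1 (not SHA-512, for which no collision is known). It took about 6,500 CPU-years plus 110 GPU-years of computation, 6,610 years in total.
Source: first SHA-1 hash collision
(A collision is not the same as recovering an input from its hash.) They used a large distributed system, so a normal program like John the Ripper wouldn't be able to do this.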
SHA512 is not encryption.
It would require a brute-force attack across the salt range; if the salt is just a few characters or bytes, the attack would easily succeed.
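To make the point about short salts concrete, here is a small sketch. It assumes the salt is a short lowercase/digit string appended to the cleartext before SHA-512, as the question describes. The cookie value, the salt "k9" and the function name are made up for illustration.

```python
import hashlib
import itertools
import string


def recover_salt(cleartext: str, digest_hex: str, max_len: int = 3):
    """Try every lowercase/digit salt up to max_len characters; return the match or None."""
    alphabet = string.ascii_lowercase + string.digits
    for length in range(1, max_len + 1):
        for candidate in itertools.product(alphabet, repeat=length):
            salt = "".join(candidate)
            if hashlib.sha512((cleartext + salt).encode()).hexdigest() == digest_hex:
                return salt
    return None


known_cookie = "user=42"
observed = hashlib.sha512((known_cookie + "k9").encode()).hexdigest()
print(recover_salt(known_cookie, observed))  # k9
```

The search space grows exponentially with salt length, so the same loop is hopeless against a long random salt.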
Depending on the usage, an HMAC may be a better choice than just appending a salt; there are attacks (depending on usage) on a concatenated salt.
If you use the same salt everywhere, it can be discovered by an attacker who gains access to the system. A better method is to use a random salt with an HMAC and prepend the salt to the hash value; then it does not need to be secret. This assumes you need to be able to recompute the same hash from the same data.
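One possible reading of that suggestion, as a sketch: the answer does not say what keys the HMAC, so this assumes a server-side secret (`SERVER_KEY`, a hypothetical name) and a 16-byte random salt stored in front of the tag.

```python
import hashlib
import hmac
import secrets

SALT_LEN = 16
SERVER_KEY = secrets.token_bytes(32)  # hypothetical secret, kept outside the database


def tag_value(data: bytes) -> bytes:
    """Return salt || HMAC-SHA512(key, salt || data)."""
    salt = secrets.token_bytes(SALT_LEN)
    mac = hmac.new(SERVER_KEY, salt + data, hashlib.sha512).digest()
    return salt + mac


def check_value(data: bytes, tagged: bytes) -> bool:
    """Split off the stored salt, recompute the HMAC, compare in constant time."""
    salt, mac = tagged[:SALT_LEN], tagged[SALT_LEN:]
    expected = hmac.new(SERVER_KEY, salt + data, hashlib.sha512).digest()
    return hmac.compare_digest(mac, expected)
```

Because the salt is stored next to the tag, the same data can be re-verified later, and two tags for the same data still differ.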
For passwords, where more security is needed, using a hash function with a salt does little to improve the security. Instead, iterate over an HMAC with a random salt for about 100 ms and save the salt with the hash. Use functions such as PBKDF2, Rfc2898DeriveBytes, password_hash, Bcrypt and similar functions. The point is to make the attacker spend a lot of time finding passwords by brute force.

Using another algorithm on top of an hashing one

For my Web class project I was told to make a website with login/logout functionality and one of the things my professors demanded was using hashing algorithms to encrypt the users password.
Is it smart to apply one or more different algorithms to convert my data (in this case a string) before running the hashing algorithm (e.g. MD5, SHA-1, etc.)?
No it's not.
Short answer: it won't increase security, and will probably only increase the risk of collisions.
Make sure you use an algorithm designed to hash passwords, like PBKDF2 or BCrypt. Hashing algorithms like MD5 and SHA-1 were designed to be fast, not to resist password guessing, and therefore should never be used to hash passwords.
Also, use a unique salt when hashing the password to defeat precomputed (rainbow table) attacks.

Is there a good symmetric encryption algorithm that changes ciphertext if password or plaintext changes?

I want that when the original data or the password changes (I mean, any one of them changes, or both of them change), the encrypted data will always change. In other words, once the encrypted data is certain, then both the original data and the password will be certain, although they are not known to those who don't have the password.
Is there any good symmetric encryption algorithm that fits my specific need?
I assume that you use the password to derive the key for the cipher.
Changing key
Every modern encryption algorithm produces different ciphertexts when a different key is used. That's just how encryption usually works. If it doesn't have this property then everything is broken. All the usual suspects like AES, Blowfish and 3DES have this property.
Changing plaintext
The other property is a little harder to achieve. It falls under the umbrella of semantic security.
Take for example any modern symmetric cipher in ECB mode. If only a single block of plaintext changes then only the same block changes in the ciphertext. So if you encrypt many similar plaintexts, an attacker who observes the ciphertexts can infer relationships between those. ECB mode is really bad.
Ok, now take a cipher in CBC mode. If you use the same IV over and over again, then an attacker may infer similar relationships as in ECB mode. If the 10th block of plaintext changes, then the previous 9 blocks will be the same in both ciphertexts. So, if you use a new random IV for every encryption, then there is nothing an attacker can deduce besides the length without breaking the underlying cipher.
In other words, once the encrypted data is certain, then both the original data and the password will be certain
The previous paragraph may not be exactly what you wanted, because now if you encrypt the same plaintext with the same key twice, you get different results due to the random IV. That randomness is a security property, so giving it up weakens the scheme. If you still want deterministic output: since you derive the key from a password, you may also derive the IV from the same password. If you use PBKDF2, for example, you can set the number of output bits to the size of key+IV. You will need to use a static salt value.
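A sketch of that derivation with Python's standard library, assuming a 256-bit key and a 128-bit IV (for example AES-256 in CBC mode). The salt value, iteration count and function name are placeholders.

```python
import hashlib

STATIC_SALT = b"example-static-salt"  # hypothetical fixed value
ITERATIONS = 200_000                   # hypothetical work factor


def derive_key_and_iv(password: str) -> tuple:
    """Derive 48 bytes with PBKDF2 and split them into a 32-byte key and a 16-byte IV."""
    # SHA-512 yields 64 bytes per PBKDF2 block, so 48 bytes come from a single block.
    material = hashlib.pbkdf2_hmac(
        "sha512", password.encode(), STATIC_SALT, ITERATIONS, dklen=48
    )
    return material[:32], material[32:]
```

The same password always gives the same key and IV, so the same plaintext always encrypts to the same ciphertext. That is the determinism asked for, at the cost of the semantic security a random IV provides.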
If you don't need that last property, then I suggest you use an authenticated mode like GCM or EAX. When you transmit ciphertext or give the attacker an encryption oracle then there are possible attack vectors when no integrity checks are used. An authenticated mode solves this for you without the need to use an encrypt-then-MAC scheme.
If all you care about is detecting when either the data or the password changes, create a file with the data and then append the password. Then use a cryptographic hash like SHA-2 on the file. If either the data or password changes, the hash will change.
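A minimal sketch of that idea, working on in-memory bytes instead of a file. The function name is illustrative.

```python
import hashlib


def fingerprint(data: bytes, password: str) -> str:
    """SHA-256 over the data with the password appended, as described above."""
    return hashlib.sha256(data + password.encode()).hexdigest()


base = fingerprint(b"report v1", "s3cret")
print(base == fingerprint(b"report v1", "s3cret"))  # True
print(base == fingerprint(b"report v2", "s3cret"))  # False
print(base == fingerprint(b"report v1", "s3cret!")) # False
```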
Encryption algorithms are generally used to keep data private or to verify identities. Hashes are for detecting when two data objects are different.
The usual reason for encrypting data is that it will be placed in an insecure environment. If the user's encrypted data is in an insecure environment, an opponent can copy it and use software the opponent obtained or wrote, instead of your software, to try to decrypt it. This is similar to some of the controls put on PDFs in Adobe software, such as not being able to cut and paste from the document. Other brands of software may not enforce the no-cut-and-paste restriction.
If the opponent discerns the encryption algorithm, but uses the wrong password, the chances of getting the correct plain text are very small; for the popular modern algorithms, the chance is small enough to neglect compared to other everyday risks we must endure, such as life on earth being destroyed by a comet.
