I'm asking about the format used after a password is hashed, when it is prepared for storage. The dollar-sign ($) notation seems to be widespread. Is it described in a standard somewhere (including the identifiers for algorithms)?
For example, when using Go with golang.org/x/crypto/bcrypt, it gives such an encoded string (playground):
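package main

import (
	"fmt"
	"golang.org/x/crypto/bcrypt"
)

func main() {
	h, err := bcrypt.GenerateFromPassword([]byte("foo"), bcrypt.DefaultCost)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", h)
	// Example output (the salt is random, so it differs on every run):
	// $2a$10$g1d5KuvDIrRoUyWL2BQs7uLOWCzlM.zqbRm8o364u20p20YNmJ.Ve
}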
However, other hashing packages like scrypt (example) and argon2 return just the raw hash. The argon2 shell command, by contrast, also prints an encoded string:
echo "foo" | argon2 saltsalt
Type: Argon2i
Iterations: 3
Memory: 4096 KiB
Parallelism: 1
Hash: d9e4f94546b9e5b0cfb2dbf9dad81d41371845d8b6a8c25ce7caf23e13f1ef72
Encoded: $argon2i$v=19$m=4096,t=3,p=1$c2FsdHNhbHQ$2eT5RUa55bDPstv52tgdQTcYRdi2qMJc58ryPhPx73I
0.005 seconds
Verification ok
I found a Go/argon2-specific blog post explaining this encoding, so far so good.
Variations I found
My trouble lies with the definition of the dollar separated string, the portability and variations I found.
glibc
The man 3 crypt page gives some pointers. There is a table of identifiers:
ID   Method
───────────────────────────────────────────────────────────────────────
1    MD5
2a   Blowfish (not in mainline glibc; added in some Linux distributions)
5    SHA-256 (since glibc 2.7)
6    SHA-512 (since glibc 2.7)
But this doesn't cover newer types, like argon2i or scrypt.
Then there are the example strings:
$id$salt$encrypted
$id$rounds=yyy$salt$encrypted
The latter is only supported since glibc 2.7.
bcrypt
While bcrypt uses the 2a (blowfish) identifier from Glibc, its encoding seems to be slightly different as seen from the above example:
$2a$10$g1d5KuvDIrRoUyWL2BQs7uLOWCzlM.zqbRm8o364u20p20YNmJ.Ve
$id$cost$<dot separated line of what exactly?>
argon2
Argon2 uses 5 fields and a full-name identifier like argon2i:
$argon2i$v=19$m=4096,t=3,p=1$c2FsdHNhbHQ$2eT5RUa55bDPstv52tgdQTcYRdi2qMJc58ryPhPx73I
$id$version$parameters$salt$encrypted
why?
I want to write a package that hashes and verifies passwords in an algorithm-agnostic way, allowing consumers to change parameters and algorithms without refactoring their code. During verification, the package should therefore be able to determine the algorithm that was used when the password was stored. If the stored version, parameters, or algorithm differ from the ones currently in use, the password is re-hashed and a new encoded string is returned.
As a bonus, I would like the package to have the ability to re-hash "legacy" passwords which might have been stored by older (not go) applications. For instance, md5. In order to do all this I would like to have a deeper understanding of the storage format itself.
what is the standard for password hash string encoding?
There is none.
Hey, that was an easy answer! Clicks "Post Your Answer".
Okay, while the above statement is unfortunately true, thankfully, there are some people who have already gone through the trouble of collecting a lot of information about all of the variations in use.
In particular, the authors of the Passlib library for Python (which does essentially the same thing you want to do) have written up a page about what they call the Modular Crypt Format, which they describe as "a standard that isn’t". Here are some choice quotes from that page [bold italic emphasis mine]:
However, there’s no official specification document describing this format. Nor is there a central registry of identifiers, or actual rules. The modular crypt format is more of an ad-hoc idea rather than a true standard.
[Modular Crypt Format – Overview]
Unfortunately, there is no specification document for this format. Instead, it exists in de facto form only
When MCF was first introduced, most schemes choose a single digit as their identifier (e.g. $1$ for md5_crypt). Because of this, some older systems only look at the first character when attempting to distinguish hashes.
Most modular crypt format hashes follow this convention, though some (like bcrypt) omit the $ separator between the configuration and the digest.
[T]here is no set standard about whether configuration strings should or should not include a trailing $ at the end
[Modular Crypt Format – Requirements]
Please note that the Modular Crypt Format is not a specification or a standard. It is a description of the various different formats that are used in the wild. There is an attempt at a specification by the organizers of the Password Hashing Competition (PHC), called the PHC String Format. However, the PHC is no formal standards organization with any kind of authority. It is just a loose group of cryptographers. While they recommend that every new password hashing function should use the PHC String Format, they can only mandate it for password hashing functions that are submitted to the Password Hashing Competition.
And either way, the PHC String Format only applies to new password hashing functions, not to existing ones.
While I strongly suggest that you should use the PHC String Format for any output you generate, you will still have to deal with inputs in all sorts of different formats, including some gems like these:
cta_pbkdf2_sha1 and dlitz_pbkdf2_sha1 both use the same identifier. While there are other internal differences, the two can be quickly distinguished by the fact that cta hashes always end in =, while dlitz hashes contain no = at all.
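To make the variations from the question concrete, here is a rough Ruby sketch of dispatching on the $id$ prefix. The names ParsedHash, BCRYPT_IDS and parse_hash_string are made up for illustration, and the field heuristics are assumptions that only cover the bcrypt, crypt(3)-style and PHC-style strings shown above, not every format found in the wild. It also shows what follows the bcrypt cost: 22 characters of salt and 31 characters of digest in bcrypt's own base64 alphabet, in which "." is an ordinary character rather than a separator.

```ruby
# Hypothetical helper names; a sketch, not a complete parser.
ParsedHash = Struct.new(:id, :version, :params, :salt, :digest)

BCRYPT_IDS = %w[2 2a 2b 2x 2y].freeze

def parse_hash_string(encoded)
  raise ArgumentError, "expected a string starting with $" unless encoded.start_with?("$")

  fields = encoded.split("$").drop(1) # drop the empty field before the first "$"
  id = fields[0]

  if BCRYPT_IDS.include?(id)
    # bcrypt: $id$cost$ followed by salt (22 chars) and digest (31 chars),
    # with no "$" between the two.
    blob = fields[2].to_s
    return ParsedHash.new(id, nil, { "cost" => fields[1].to_i }, blob[0, 22], blob[22..-1])
  end

  rest = fields.drop(1)
  # Optional PHC-style version field, e.g. "v=19".
  version = rest[0].to_s.start_with?("v=") ? rest.shift.sub("v=", "").to_i : nil
  # Optional parameter field, e.g. "m=4096,t=3,p=1" or "rounds=5000".
  params = {}
  if rest.length > 1 && rest[0].include?("=")
    rest.shift.split(",").each do |pair|
      name, value = pair.split("=", 2)
      params[name] = value.to_i
    end
  end
  ParsedHash.new(id, version, params, rest[0], rest[1])
end
```

A verifier built on something like this could compare id, version and params against the currently configured scheme and re-hash on a successful login when they differ.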
Here's the problem, a string has been passed through three separate encryptions in the following order: Original -> Base64 -> AES-256 -> Blowfish (Keyless) -> Final. Write a method that takes this triple encoded string mystery_string = "OXbVgH7UriGqmRZcqOXUOvJt8Q4JKn5MwD1XP8bg9yHwhssYAKfWE+AMpr25HruA" and fully unencrypts it to its original state.
I looked into different libraries/documentation for aes256 and blowfish but all of them required a key. The only one that did not require a key was Base64 (i.e. Base64.encode64('some string') ). Not really sure where to go from here.
Firstly, the only way to crack AES-256 and Blowfish without the key is by brute-force enumeration of every possible 32-byte combination that could be used as the key. In theory, this means it's not crackable in our lifetime. There may be some vulnerabilities you could exploit as you also have the plain text, but I doubt you would have that in a real-life situation.
Second, and most importantly, just going by that site, encode-decode.com (https://encode-decode.com/), you don't actually have enough information to decode the string even if you did know the password.
The various modes of operation for the AES-256 cipher require a 32-byte (or, for some modes, a 64-byte) key. The secret that you used (you may have just left it blank) needs to be converted into a 32-byte encryption key. This is done using a hashing algorithm, but we don't know which one is used. Hopefully, the site used a key derivation function, which provides several security benefits. However, key derivation functions require multiple parameters, and we would need to know what parameters to enter along with our secret to get the right encryption key.
Finally, we don't know if the secret is being concatenated with a salt before being hashed. Without knowing if a salt is used and what the salt is, we cannot determine the correct 32-byte key used to encrypt the plain text.
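To illustrate why the parameters matter, here is a small sketch using PBKDF2 from Ruby's OpenSSL bindings. The helper name derive_key, the choice of SHA-256 and the sample salts and iteration counts are assumptions for illustration only; nothing is known about what the site actually does.

```ruby
require "openssl"

# Derive a key of the given length from a secret with PBKDF2-HMAC-SHA256.
# The same secret yields a different key for every salt / iteration count.
def derive_key(secret, salt, iterations, length = 32)
  OpenSSL::PKCS5.pbkdf2_hmac(secret, salt, iterations, length, OpenSSL::Digest.new("SHA256"))
end

key_a = derive_key("secret", "salt-one", 10_000)
key_b = derive_key("secret", "salt-two", 10_000) # different salt
key_c = derive_key("secret", "salt-one", 20_000) # different iteration count

puts key_a.unpack("H*")[0]
puts key_a == key_b # false
puts key_a == key_c # false
```

Without the exact salt, iteration count and digest, knowing the secret alone is not enough to reproduce the 32-byte key.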
In summary, the answer to your question is: No, there is not a quick way to decrypt that string without knowing the key.
However, encryption is an awesome topic to learn.
I'd encourage you to look over the ruby docs for the OpenSSL library. They're actually quite good (besides the don'ts I mention below).
The PBKDF2 Password-based Encryption function is one of the key derivation functions I was referring to.
When encrypting with AES, you will most likely want to use AES-256-GCM which is authenticated encryption.
A couple of don'ts:
Don't use ciphers at random... understand their strengths and weaknesses
Don't use AES-128-ECB - explanation
Another good encryption library is rb-NaCl.
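As a starting point for experimenting, here is a minimal AES-256-GCM round trip with Ruby's OpenSSL module. The helper names encrypt_gcm and decrypt_gcm and the hash layout of the result are made up for this sketch; the key is assumed to be 32 random bytes (or the output of a key derivation function).

```ruby
require "openssl"

# Encrypt with AES-256-GCM. Returns the nonce, ciphertext and auth tag,
# all of which the receiver needs (none of them is secret).
def encrypt_gcm(key, plaintext, associated_data)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # fresh 12-byte nonce; never reuse one with the same key
  cipher.auth_data = associated_data
  ciphertext = cipher.update(plaintext) + cipher.final
  { iv: iv, ciphertext: ciphertext, tag: cipher.auth_tag }
end

# Decrypt and authenticate. Raises OpenSSL::Cipher::CipherError if the
# ciphertext, tag or associated data were modified.
def decrypt_gcm(key, box, associated_data)
  cipher = OpenSSL::Cipher.new("aes-256-gcm")
  cipher.decrypt
  cipher.key = key
  cipher.iv = box[:iv]
  cipher.auth_tag = box[:tag]
  cipher.auth_data = associated_data
  cipher.update(box[:ciphertext]) + cipher.final
end

key = OpenSSL::Random.random_bytes(32)
box = encrypt_gcm(key, "attack at dawn", "header")
puts decrypt_gcm(key, box, "header")
```

Because the mode is authenticated, flipping a single bit of the ciphertext makes decryption fail instead of silently returning garbage.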
The following question is more complex than it may first seem.
Assume that I've got an arbitrary JSON object, one that may contain any amount of data including other nested JSON objects. What I want is a cryptographic hash/digest of the JSON data, without regard to the actual JSON formatting itself (eg: ignoring newlines and spacing differences between the JSON tokens).
The last part is a requirement, as the JSON will be generated/read by a variety of (de)serializers on a number of different platforms. I know of at least one JSON library for Java that completely removes formatting when reading data during deserialization. As such it will break the hash.
The arbitrary data clause above also complicates things, as it prevents me from taking known fields in a given order and concatenating them prior to hashing (think roughly how Java's non-cryptographic hashCode() method works).
Lastly, hashing the entire JSON String as a chunk of bytes (prior to deserialization) is not desirable either, since there are fields in the JSON that should be ignored when computing the hash.
I'm not sure there is a good solution to this problem, but I welcome any approaches or thoughts =)
The problem is a common one when computing hashes for any data format where flexibility is allowed. To solve this, you need to canonicalize the representation.
For example, the OAuth1.0a protocol, which is used by Twitter and other services for authentication, requires a signature over the request message. To compute it, OAuth1.0a says you first percent-encode the request parameters, sort them by name, and join them as name=value pairs separated by &; that string is then combined with the HTTP method and the base URL. The signature is computed on the result of that canonicalization.
XML DSIG works the same way - you need to canonicalize the XML before signing it. There is a W3C standard (Canonical XML) covering this, because it's such a fundamental requirement for signing. Some people call it c14n.
I don't know of a canonicalization standard for JSON. It's worth researching.
If there isn't one, you can certainly establish a convention for your particular application usage. A reasonable start might be:
lexicographically sort the properties by name
double quotes used on all names
double quotes used on all string values
no space, or one-space, between names and the colon, and between the colon and the value
no spaces between values and the following comma
all other white space collapsed to either a single space or nothing - choose one
exclude any properties you don't want to sign (one example is, the property that holds the signature itself)
sign the result, with your chosen algorithm
You may also want to think about how to pass that signature in the JSON object - possibly establish a well-known property name, like "nichols-hmac" or something, that gets the base64 encoded version of the hash. This property would have to be explicitly excluded by the hashing algorithm. Then, any receiver of the JSON would be able to check the hash.
The canonicalized representation does not need to be the representation you pass around in the application. It only needs to be easily produced given an arbitrary JSON object.
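A convention like the one above could be sketched in Ruby as follows. The helper names canonicalize and canonical_digest are hypothetical, SHA-256 is just one possible choice, and the sketch is not a standard: keys are sorted bytewise, excluded property names are dropped at every nesting level, and number formatting is left to the JSON library.

```ruby
require "json"
require "openssl"

# Recursively rebuild the value with object keys sorted and excluded keys removed.
# Ruby hashes keep insertion order, so the sorted order survives serialization.
def canonicalize(value, excluded)
  case value
  when Hash
    value.reject { |key, _| excluded.include?(key.to_s) }
         .sort_by { |key, _| key.to_s }
         .map { |key, child| [key.to_s, canonicalize(child, excluded)] }
         .to_h
  when Array
    value.map { |child| canonicalize(child, excluded) }
  else
    value
  end
end

# Parse, canonicalize, re-serialize without whitespace, then hash.
def canonical_digest(json_text, excluded: [])
  canonical = JSON.generate(canonicalize(JSON.parse(json_text), excluded))
  OpenSSL::Digest::SHA256.hexdigest(canonical)
end

a = '{ "Name1": "Value1", "Name2": "Value2" }'
b = '{"Name2":"Value\u0032",   "Name1":"Value\u0031", "sig":"abc"}'
puts canonical_digest(a) == canonical_digest(b, excluded: ["sig"]) # true
```

Parsing and re-serializing also normalizes escapes such as \u0031, since both inputs decode to the same string before being written out again.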
Instead of inventing your own JSON normalization/canonicalization you may want to use bencode. Semantically it's the same as JSON (composition of numbers, strings, lists and dicts), but with the property of unambiguous encoding that is necessary for cryptographic hashing.
bencode is used as the torrent file format, so every BitTorrent client contains an implementation.
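For a sense of how small such an encoder is, here is a sketch in Ruby (the function name bencode is just chosen for this example). It covers the four bencode types: integers, byte strings, lists, and dictionaries with keys sorted as raw bytes. Note that floats, booleans and null have no bencode representation, so the sketch rejects them.

```ruby
# Minimal bencode encoder sketch; each value has exactly one encoding.
def bencode(value)
  case value
  when Integer then "i#{value}e"
  when String  then "#{value.bytesize}:#{value.b}" # length-prefixed raw bytes
  when Symbol  then bencode(value.to_s)
  when Array   then "l#{value.map { |item| bencode(item) }.join}e"
  when Hash
    # Dictionary keys are strings and must appear sorted by their raw bytes.
    pairs = value.map { |key, item| [key.to_s.b, item] }.sort_by { |key, _| key }
    "d#{pairs.map { |key, item| bencode(key) + bencode(item) }.join}e"
  else
    raise ArgumentError, "bencode cannot represent #{value.class}"
  end
end

puts bencode({ "b" => 1, "a" => ["x", 2] }) # d1:al1:xi2ee1:bi1ee
```

Since the output is byte-for-byte determined by the data, it can be fed straight into a hash function.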
This is the same issue that causes problems with S/MIME signatures and XML signatures. That is, there are multiple equivalent representations of the data to be signed.
For example in JSON:
{ "Name1": "Value1", "Name2": "Value2" }
vs.
{
"Name1": "Value\u0031",
"Name2": "Value\u0032"
}
Or depending on your application, this may even be equivalent:
{
"Name1": "Value\u0031",
"Name2": "Value\u0032",
"Optional": null
}
Canonicalization could solve that problem, but it's a problem you don't need at all.
The easy solution if you have control over the specification is to wrap the object in some sort of container to protect it from being transformed into an "equivalent" but different representation.
I.e. avoid the problem by not signing the "logical" object but signing a particular serialized representation of it instead.
For example, JSON Objects -> UTF-8 Text -> Bytes. Sign the bytes as bytes, then transmit them as bytes e.g. by base64 encoding. Since you are signing the bytes, differences like whitespace are part of what is signed.
Instead of trying to do this:
{
"JSONContent": { "Name1": "Value1", "Name2": "Value2" },
"Signature": "asdflkajsdrliuejadceaageaetge="
}
Just do this:
{
"Base64JSONContent": "eyAgIk5hbWUxIjogIlZhbHVlMSIsICJOYW1lMiI6ICJWYWx1ZTIiIH0s",
"Signature": "asdflkajsdrliuejadceaageaetge="
}
I.e. don't sign the JSON, sign the bytes of the encoded JSON.
Yes, it means the signature is no longer transparent.
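A rough sketch of that envelope in Ruby, assuming a shared secret key and HMAC-SHA256 as the "signature" (the helper names seal and open_sealed are invented for this example, and OpenSSL.secure_compare assumes Ruby 2.7 or newer). The MAC is computed over the exact Base64 text that is transmitted, so the receiver never has to re-serialize anything before verifying.

```ruby
require "base64"
require "json"
require "openssl"

# Serialize once, Base64-encode the bytes, and MAC exactly what is sent.
def seal(payload, key)
  encoded = Base64.strict_encode64(JSON.generate(payload))
  signature = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, encoded)
  JSON.generate("Base64JSONContent" => encoded, "Signature" => signature)
end

# Verify the MAC over the received Base64 text before decoding it.
def open_sealed(envelope_json, key)
  envelope = JSON.parse(envelope_json)
  encoded = envelope.fetch("Base64JSONContent")
  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, encoded)
  unless OpenSSL.secure_compare(expected, envelope.fetch("Signature"))
    raise ArgumentError, "signature mismatch"
  end
  JSON.parse(Base64.strict_decode64(encoded))
end

key = OpenSSL::Random.random_bytes(32)
envelope = seal({ "Name1" => "Value1", "Name2" => "Value2" }, key)
puts open_sealed(envelope, key)
```

Whitespace or key-order changes made to the outer envelope by an intermediate JSON library do not matter, because only the opaque Base64 string is covered by the MAC.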
JSON-LD can do normalization.
You will have to define your context.
RFC 7638: JSON Web Key (JWK) Thumbprint includes a type of canonicalization. Although RFC7638 expects a limited set of members, we would be able to apply the same calculation for any member.
https://www.rfc-editor.org/rfc/rfc7638#section-3
What would be ideal is if JavaScript itself defined a formal hashing process for JavaScript Objects.
Yet we do have RFC 8785, the JSON Canonicalization Scheme (JCS), which hopefully can be implemented in most JSON libraries and in particular added to the popular JavaScript JSON object. With this canonicalization done, it is just a matter of applying your preferred hashing algorithm.
If JCS is available in browsers and other tools and libraries, it becomes reasonable to expect most JSON on the wire to be in this common canonicalized form. Consistent application and verification of standards like this can go some way toward pushing back against trivial security threats from low-skilled actors.
I would do all fields in a given order (alphabetically, for example). Why does arbitrary data make a difference? You can just iterate over the properties (à la reflection).
Alternatively, I would look into converting the raw JSON string into some well-defined canonical form (remove all superfluous formatting) and hashing that.
We encountered a simple issue with hashing JSON-encoded payloads.
In our case we use the following methodology:
Convert the data into a JSON object.
Encode the JSON payload in base64.
Compute a message digest (HMAC) of the generated base64 payload.
Transmit the base64 payload.
Advantages of using this solution:
Base64 will produce the same output for a given payload.
Since the resulting signature will be derived directly from the base64-encoded payload and since base64-payload will be exchanged between the endpoints, we will be certain that the signature and payload will be maintained.
This solution solves problems that arise due to differences in the encoding of special characters.
Disadvantages
The encoding/decoding of the payload may add overhead.
Base64-encoded data is about 33% larger than the original payload (4 output characters for every 3 input bytes).
I need to communicate with an API that requires the request to be encoded in Blowfish and Base64.
In my custom library I start off with:
# encoding: utf-8
require "base64"
require 'crypt/blowfish'
I create an instance:
@blowfish_key = '1234567887654321'
@blowfish = Crypt::Blowfish.new(@blowfish_key)
And further down I create the encrypted string (or 'ticket' as the API calls it)
@string_to_encrypt = "#{@partnerid},#{user.id},#{exam_id},#{return_url},#{time_stamp}"
@enc = @blowfish.encrypt_string(@string_to_encrypt)
In the Rails console I can decrypt with @blowfish.decrypt_string(@enc) without any problems. But the API gives me gibberish:
Invalid ticket:
Decrypted String :)IŠkó}*Ogû…xÃË-ÖÐHé%q‹×ÎmªÇjEê !©†xRðá=Ͳ [À}=»ïN)'sïƒJJ=:›õ)¦$ô1X¢
Also, when I encrypt something simple in the console, like "Hello", and feed the encrypted string to an online Blowfish decoder, like http://webnet77.com/cgi-bin/helpers/blowfish.pl, I get the same gibberish mess back.
It's like the Ruby blowfish encryption is a format that is not used anywhere else.
Note:
In my actual application I send the encrypted string via a form field to the Webservice. The encrypted string is Base64 encoded and prefixed with an 'a'.
@enc = Base64.encode64(@enc)
@enc = 'a' + CGI::escape(@enc)
This is explained in their documentation:
Format of the ticket:
t=’a’ + URL_Encode (
Base64_Encode (
Blowfish_Encrypt ( ‘partnerid,user_id,test_id,URL_Encode(return_URL),ticket_timestamp’, ‘blowfish_key’)
)
)
Note above that the Blowfish_Encrypt function accepts two parameters-
1)the string to encrypt and 2)a hex key. Also note that the ticket has
been prefixed with a lower case ‘a’ (ASCII 97).
Example of what the HTML form will look like:
<form method="POST" action="http://www.expertrating.com/partner_site_name/">
<input type="hidden" name="t" value="adfinoidfhdfnsdfnoihoweirhqwdnd2394yuhealsnkxc234rwef45324fvsdf2" />
<input type="submit" name="submit" value="Proceed to take the Test" />
</form>
I am lost, where do I go wrong?
There's a couple of things:
1) As was said in the comments, printing encrypted strings almost always causes trouble, because they are very likely to contain non-ASCII characters that are either unprintable or in a representation that others do not understand. The best way to get a widely understood representation is to encode the result using Base64 or hex encoding, so the Base64 encoding you apply in your application is fine for that.
2) The Perl app you linked to, for example, uses hex encoding. As a consequence, it will only accept encrypted strings in hex encoding. That's why it wouldn't accept any of your inputs at all. You get hex encoding suitable for that application as follows:
hex = encrypted.unpack("H*")[0].upcase
3) But still no luck with the Perl app. One reason is this (taken from Crypt sources):
def encrypt_stream(plainStream, cryptStream)
initVector = generate_initialization_vector(block_size() / 4)
chain = encrypt_block(initVector)
cryptStream.write(chain)
What this means is that Crypt writes the IV as the first block of the encrypted message. This is totally fine, but most modern crypto libraries I know won't prepend the IV but rather assume it is exchanged as out-of-band information between the two communicating parties.
4) But even knowing this, you will still have no luck with the Perl app. The reason is that the Perl app uses ECB mode encryption where the Crypt gem uses CBC mode. CBC mode uses an IV, so even one-block messages won't match except if you are using an all-zero IV. But using an all-zero IV (or any other deterministic IV) is bad practice (and Crypt doesn't do it anyway). Doing so allows distinguishing the first block from random and opens you up to attacks like BEAST. Using ECB is bad practice as well except for totally rare edge cases. So let's forget about that Perl app and concentrate on the things at hand.
5) I'm naturally in favor of using Ruby OpenSSL, but in this case I think I'm not being subjective if I tell you it's better for overall security to use it instead of the Crypt gem. Just telling you so would be lame, so here's two reasons why:
Crypt generates its IVs using a predictable random generator (a combination of srand and rand), but that's not good enough. It has to be a cryptographically secure random generator.
Crypt seems to be no longer maintained and there have been some things going on in the meantime that either were unknown at the time or that were never in the scope of that project. For example, OpenSSL starts to deal with leakage-resilient cryptography to prevent side-channel attacks that target timing, cache misses etc. That has probably never been the intention of Crypt, but such attacks pose a real threat in real life.
6) If my preaching has convinced you to make the change, then I could continue and ask you whether it really has to be Blowfish. The algorithm itself is outstanding, no doubt, but there are better, even more secure options available now, such as AES for example. If it absolutely has to be Blowfish, it's supported by Ruby OpenSSL as well:
cipher = OpenSSL::Cipher.new('bf-cbc')
7) If your production key looks like the one in the example, then there's another weak spot. There's not enough entropy in such strings. What you should do is again use a cryptographically secure random generator to generate your key, which is quite easy in Ruby OpenSSL:
key = cipher.random_key
The nice thing is it will automatically choose an appropriate key length depending on the cipher algorithm to be used.
8) Finally, am I right in assuming that you use that encrypted result as some form of authentication token? As in you append it to the HTML being rendered, wait to receive it back in some POST request and compare the received token with the original in order to authenticate some action? If this is the case, then this would again be bad practice. This time even more so. You are not using authenticated encryption here, which means that your ciphertext is malleable. This implies that an attacker can relatively easily forge the contents of that token without actually knowing your encryption key. This leads to all sorts of attacks, even leading to total compromise involving key recovery.
You must either use authenticated encryption modes (GCM, CCM, EAX...) or use a message authentication code to detect if the ciphertexts had been tampered with. Even better, don't use encryption at all and generate your tickets by using a secure hash function. The key here is to compute the hash of a securely randomized value, otherwise it is again possible to predict the outcome. A timestamp, as used in your example, is not enough. It has to be a cryptographically secure nonce, probably generated by SecureRandom.
But then you still have to consider replay of such tokens, token hijacking, ... you see, it's not that easy.
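Points 3 and 4 can be seen in a small experiment with Ruby OpenSSL. This sketch deliberately uses AES-256 instead of Blowfish (Blowfish may be unavailable in newer OpenSSL builds), the helper names are made up, and it prepends the plain IV, which is a common convention but not exactly what the Crypt gem does. With a random IV, CBC produces a different ciphertext for the same one-block message on every call, while ECB always produces the same one.

```ruby
require "openssl"

# CBC with a fresh random IV, prepended to the ciphertext as the first block.
def encrypt_cbc_with_iv(key, plaintext)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # comes from a cryptographically secure generator
  iv + cipher.update(plaintext) + cipher.final
end

# The receiver splits the IV off again before decrypting.
def decrypt_cbc_with_iv(key, message)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.decrypt
  cipher.key = key
  cipher.iv = message.byteslice(0, cipher.iv_len)
  cipher.update(message.byteslice(cipher.iv_len..-1)) + cipher.final
end

# ECB has no IV, so equal plaintexts always give equal ciphertexts.
def encrypt_ecb(key, plaintext)
  cipher = OpenSSL::Cipher.new("aes-256-ecb")
  cipher.encrypt
  cipher.key = key
  cipher.update(plaintext) + cipher.final
end

key = OpenSSL::Random.random_bytes(32)
puts encrypt_cbc_with_iv(key, "Hello") == encrypt_cbc_with_iv(key, "Hello") # false
puts encrypt_ecb(key, "Hello") == encrypt_ecb(key, "Hello")                 # true
puts decrypt_cbc_with_iv(key, encrypt_cbc_with_iv(key, "Hello"))            # Hello
```

This is why a decoder that expects ECB, or that expects the IV to arrive separately, cannot make sense of such output. Neither mode provides authentication, so point 8 still applies.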
I want something that will use a secret key to generate the hash. Also the hash generated should be URL safe and form safe.
There is something already available in your standard Ruby installation: the OpenSSL module.
What you are talking about is probably an OpenSSL::HMAC that provides you with an HMAC, a form of a secret key-based hash.
The result will be a plain Ruby string that may contain characters that are not URL-safe, so you have to make it URL-safe yourself. You could use either URI.escape (removed in Ruby 3.0; CGI.escape is a replacement) or simply hex-encode the string using str.unpack('H*')[0] (see String#unpack).
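A minimal sketch of both encodings, assuming HMAC-SHA256 (the helper names hex_hmac and url_safe_hmac are invented here). Hex output and unpadded URL-safe Base64 only use characters that need no escaping in URLs or form fields.

```ruby
require "base64"
require "openssl"

# Hex-encoded HMAC: 64 characters from 0-9 and a-f.
def hex_hmac(key, message)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), key, message)
end

# URL-safe Base64 HMAC: shorter, uses only A-Z, a-z, 0-9, "-" and "_".
def url_safe_hmac(key, message)
  mac = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), key, message)
  Base64.urlsafe_encode64(mac, padding: false)
end

puts hex_hmac("my secret key", "user=42")
puts url_safe_hmac("my secret key", "user=42")
```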
I am trying to create a ticket for Remote Assistance. Part of that requires creating a PassStub parameter. According to the documentation:
http://msdn.microsoft.com/en-us/library/cc240115(PROT.10).aspx
PassStub: The encrypted novice computer's password string. When the Remote
Assistance Connection String is sent as a file over e-mail, to provide additional security, a
password is used.<16>
In note 16 they detail how to create a PassStub.
In Windows XP and Windows Server 2003, when a password is used, it is encrypted using
PROV_RSA_FULL predefined Cryptographic provider with MD5 hashing and CALG_RC4, the RC4
stream encryption algorithm.
A PassStub looks like this in the file:
PassStub="LK#6Lh*gCmNDpj"
If you want to generate one yourself run msra.exe in Vista or run the Remote Assistance tool in WinXP.
The documentation says this stub is the result of the function CryptEncrypt with the key derived from the password and encrypted with the session id (Those are also in the ticket file).
The problem is that CryptEncrypt produces a binary output way larger than the 15 byte PassStub. Also the PassStub isn't encoded in any way I've seen before.
Some interesting things about the PassStub encoding. After doing statistical analysis, the 3rd char is always one of: !#$&()+-=@^. The only symbols seen in every position are: *_ . Otherwise the valid characters are 0-9 a-z A-Z. There are a total of 75 valid characters and the PassStub is always 15 bytes long.
Running msra.exe with the same password always generates a different PassStub, indicating that it is not a direct hash but includes the rasessionid as they say.
Another idea I've had is that it is not the direct result of CryptEncrypt, but a result of the rasessionid in the MD5 hash. In MS-RA (http://msdn.microsoft.com/en-us/library/cc240013(PROT.10).aspx), the "PassStub Novice" is simply hex encoded and looks to be the right length. The problem is I have no idea how to go from any hash to the way the PassStub looks.
I am curious, have you already:
considered using ISAFEncrypt::EncryptString(bstrEncryptionkey, bstrInputString) as a higher-level alternative to doing all the dirty work directly with CryptEncrypt? (the tlb is in hlpsvc.exe)
looked inside c:\WINDOWS\pchealth\helpctr\Vendors\CN=Microsoft Corporation,L=Redmond,S=Washington,C=US\Remote Assistance\Escalation\Email\rcscreen9.htm (WinXP) to see what is going on when you pick the Save invitation as a file (Advanced) option and provide a password? (feel free to add alert() calls inside OnSave())