How does one read files/text encrypted in AES 256 CBC by Ruby in Clojure - ruby

We have some JSON files on a filesystem in our internal network that are written by Ruby and read by Clojure (and also Ruby). We'd like to encrypt them to increase their security. We've used AES 256 CBC inside our Ruby project for other things that need symmetric encryption, so we'd like to use that. This time, however, the ciphertext will need to be decrypted in a Clojure application. The output of encryption (using this as a guide: OpenSSL::Cipher), as represented in a Ruby string, looks like: "a\x96\xECLI\xBC%\xC4#{\xBD\x99%\xA1\x84\x84". Pasting that into a Clojure REPL results in a bunch of "Syntax error... Unsupported escape character: \x" messages. I tried making every "\" a "\\", but then using Clojure's library "Buddy-Core" with :aes256-cbc-hmac-sha512 as the encryption algorithm results in:
Execution error (AssertionError) at buddy.core.crypto/eval1558$fn (crypto.clj:478).
Assert failed: (keylength? key 64)
Even though the key/iv were a fine length when used to encrypt the string in Ruby.
To sum up:
To convert from a Ruby String to a Clojure String, is it correct to replace all backslashes with double backslashes?
Is :aes256-cbc-hmac-sha512 the correct algorithm for use with Clojure's Buddy-Core to decrypt AES 256 CBC?
Would I be better off doing this in Java inside of Clojure? (sub question: Please do advise on converting Ruby Strings to Java/Clojure Strings)

Whoops misread the question!
Try to avoid strings at all cost when working with raw binary data

The Ruby string is not a valid Java string (i.e. \x is not a valid escape character)
The allowed Java escape characters are here:
https://docs.oracle.com/javase/specs/jls/se13/html/jls-3.html#jls-EscapeSequence
I'm not sure if Ruby strings are unicode by default, so this is a bit tricky.
The safest bet is to turn your encryption result into BASE64 and use that.
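A minimal sketch of the Base64 suggestion above, on the Ruby side only. The helper names `encrypt_to_base64` and `decrypt_from_base64` are hypothetical, and the key and IV are generated at random here purely for illustration; in practice they would come from wherever the project already keeps them.

```ruby
require 'openssl'
require 'base64'

# Encrypt with AES-256-CBC and return plain-ASCII Base64 text, which can be
# written to a file or pasted into another runtime without escaping issues.
def encrypt_to_base64(plaintext, key, iv)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = key
  cipher.iv = iv
  Base64.strict_encode64(cipher.update(plaintext) + cipher.final)
end

# Reverse step: decode the Base64 text back to raw bytes, then decrypt.
def decrypt_from_base64(encoded, key, iv)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.update(Base64.strict_decode64(encoded)) + cipher.final
end

# Illustrative key/IV only (assumption): 32 bytes for AES-256, 16 for the IV.
key = OpenSSL::Random.random_bytes(32)
iv  = OpenSSL::Random.random_bytes(16)

encoded = encrypt_to_base64('{"hello":"world"}', key, iv)
puts encoded
puts decrypt_from_base64(encoded, key, iv)
```

The reader on the JVM side would then Base64-decode the text back into a byte array before decrypting, so the raw ciphertext never has to pass through a string literal.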

Related

Why is Ruby base64 encoded string different from all other base64 encoded strings?

For end-to-end encrypted communication between a client and a server, I am implementing an encryption/decryption algorithm.
However, they (encryption/decryption and base64 encoding/decoding) work fine only when it's in Ruby.
But the actual problem I see is with the Base64 encoding of Ruby.
For example, let's say I have this (32 bytes) AES key:
"\"1\xAF\xC7\xC0\xA6\xC9\xBA\xD6\x9F\xBA\xD2\xC9\xBE\x0F\x8E*\x88\x87(\x9B\xCBp\x15!/\x13\x8F\xCE\xFB\x15\x9B"
that I am using to encrypt data in AES algorithm.
I want to send this key to a client in Base64 encoded format. For that, I am doing (two ways, each produces different encoded output):
Key in double quotes
Base64.urlsafe_encode64("\"1\xAF\xC7\xC0\xA6\xC9\xBA\xD6\x9F\xBA\xD2\xC9\xBE\x0F\x8E*\x88\x87(\x9B\xCBp\x15!/\x13\x8F\xCE\xFB\x15\x9B")
# => "IjGvx8CmybrWn7rSyb4PjiqIhyiby3AVIS8Tj877FZs="
Key in single quotes
Base64.urlsafe_encode64('\"1\xAF\xC7\xC0\xA6\xC9\xBA\xD6\x9F\xBA\xD2\xC9\xBE\x0F\x8E*\x88\x87(\x9B\xCBp\x15!/\x13\x8F\xCE\xFB\x15\x9B')
# => "XCIxXHhBRlx4QzdceEMwXHhBNlx4QzlceEJBXHhENlx4OUZceEJBXHhEMlx4QzlceEJFXHgwRlx4OEUqXHg4OFx4ODcoXHg5Qlx4Q0JwXHgxNSEvXHgxM1x4OEZceENFXHhGQlx4MTVceDlC"
Output 1 is different from all other libraries: Java, Swift, and an online site, another site, which all produce the same output.
Output 2 is the same with other libraries with respect to the output encoding. But I have issues converting AES key and AES encrypted data to be used in single quotes, which is not possible, as I have encrypted data that already have those single quotes and other illegal characters, for which Ruby's Base64 encoding does not work correctly.
Any help would be appreciated.
The problem is that when you use single quote you will get a different result:
a = "\"1\xAF\xC7\xC0\xA6\xC9\xBA\xD6\x9F\xBA\xD2\xC9\xBE\x0F\x8E*\x88\x87(\x9B\xCBp\x15!/\x13\x8F\xCE\xFB\x15\x9B"
#=> "\"1\xAF\xC7\xC0\xA6ɺ֟\xBA\xD2ɾ\u000F\x8E*\x88\x87(\x9B\xCBp\u0015!/\u0013\x8F\xCE\xFB\u0015\x9B"
b = '\"1\xAF\xC7\xC0\xA6\xC9\xBA\xD6\x9F\xBA\xD2\xC9\xBE\x0F\x8E*\x88\x87(\x9B\xCBp\x15!/\x13\x8F\xCE\xFB\x15\x9B'
#=> "\\\"1\\xAF\\xC7\\xC0\\xA6\\xC9\\xBA\\xD6\\x9F\\xBA\\xD2\\xC9\\xBE\\x0F\\x8E*\\x88\\x87(\\x9B\\xCBp\\x15!/\\x13\\x8F\\xCE\\xFB\\x15\\x9B"
a == b
=> false
a.bytes.count
=> 32
b.bytes.count
=> 108
a.length
=> 29
b.length
=> 108
So single quotes do not interpret escape sequences such as \x: every backslash stays a literal character, which is why b has 108 bytes and its inspect output shows each backslash doubled.
However, as Jordan Running mentions in the comments, you should not roll your own encryption because it is not secure. Please read this answer to understand why.
It is a much better idea to use Ruby's openssl gem. For instructions on how to use it, please refer to the documentation here.
If you need single-quote behavior but cannot use single quotes, Ruby has the %q literal: %q followed by a delimiter (brackets, parens, or any other character) surrounding your string acts like single quotes, for example %q|it's your string| or %q{it's your string}.
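As a rough illustration of the point about quoting, the sketch below encodes the 32-byte key from the question as raw bytes and compares it with a short `%q` literal. The variable names are made up for the example; the key and the expected Base64 text are the ones quoted in the question.

```ruby
require 'base64'

# Double quotes interpret \x escapes, so this is the real 32-byte key.
# .b returns a binary (ASCII-8BIT) copy, which suits raw key material.
key = "\"1\xAF\xC7\xC0\xA6\xC9\xBA\xD6\x9F\xBA\xD2\xC9\xBE\x0F\x8E*\x88\x87(\x9B\xCBp\x15!/\x13\x8F\xCE\xFB\x15\x9B".b

encoded = Base64.urlsafe_encode64(key)
decoded = Base64.urlsafe_decode64(encoded)

# %q behaves like single quotes: the backslashes stay literal characters,
# so this is 8 bytes of text rather than 2 bytes of key material.
literal = %q{\xAF\xC7}

puts key.bytesize
puts encoded
puts decoded == key
puts literal.bytesize
```

If the other libraries agree with the single-quoted result, they are most likely being fed the escaped text of the key rather than its bytes.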

Can I alpha sort base32/64 encoded MD5 hashes?

I've got a massive file of hex encoded MD5 values that I'm using linux 'sort' utility to sort. The result is that the hashes come out in sequential order (which is what I need for the next stage of processing). E.g:
000001C35AE83CEFE245D255FFC4CE11
000003E4B110FE637E0B4172B386ACAC
000004AAD0EB3D896B654A960B0111FA
In the interest of speeding up the sort operation (and making the files smaller), I was considering encoding the data as base32 or base64.
The question is, would an alpha-sort of the base32/64 data get me the same result? My quick tests seem to indicate that it would work. For example, the above three hex strings correspond 1:1 to these base64 strings:
AAABw1roPO/iRdJV/8TOEQ==
AAAD5LEQ/mN+C0Fys4asrA==
AAAEqtDrPYlrZUqWCwER+g==
But I'm unsure as to the sort order when it comes to special characters used in Base64 like "/" and "+" and how those would be treated in the context of an alpha sort.
Note: I happen to be using the linux sort utility but the question still applies to other alpha-sorting tools. The tool used is not really part of the question.
I've since discovered that this isn't possible with the standard base32/64 implementations. There exists however a base32 variation called "base32hex" which preserves sort ordering, but there is no official "base64hex" equivalent.
Looks like that leaves creating a custom encoding like this.
EDIT:
This turned out to be very trivial to solve. Simply encode in base 64 then translate character to character with a custom table of characters that respects sort order.
Simply map from the standard Mime 64 characters:
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
To something like this:
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz|~"
Then sorting will work, provided the sort compares by byte value (for example, LC_ALL=C sort).
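A minimal sketch of that character-for-character translation in Ruby, using the two alphabets quoted above. The constant names and the helper `sortable_base64` are hypothetical, and it assumes all inputs are hex strings of the same length, so any "=" padding always sits in the same position.

```ruby
require 'base64'

STANDARD_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
SORTABLE_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz|~'

# Convert a hex string to raw bytes, Base64-encode it, then swap each
# character for the one at the same index in the ASCII-ordered alphabet.
def sortable_base64(hex)
  raw = [hex].pack('H*')
  Base64.strict_encode64(raw).tr(STANDARD_ALPHABET, SORTABLE_ALPHABET)
end

hashes = %w[
  000004AAD0EB3D896B654A960B0111FA
  000001C35AE83CEFE245D255FFC4CE11
  000003E4B110FE637E0B4172B386ACAC
]

# Sorting by the translated encoding should give the same order as
# sorting the uppercase hex strings directly.
puts hashes.sort_by { |h| sortable_base64(h) }
```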

Password Hashes Characters in Applications

I appreciate your time.
I'm trying to understand something. Why is it that hashes that I generate manually only seem to include alphanumeric characters 0-9, a-f, but all of the hashes used by our favorite applications seem to contain all of the letters [and capitalized ones at that]?
Example:
Manual hash using sha256:
# sha256sum <<< asdf
d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1 -
You never see any letters above f. And nothing is capitalized.
But if I create a SHA hash using htpasswd, it's got all the alphanumerics:
# htpasswd -snb test asdf
test:{SHA}PaVBVZkYqAjCQCu6UBL2xgsnZhw=
Same thing happens if you look at a password hash in a website CMS database for example. There must be some extra step I'm missing or the end format is different than the actual hash format. I thought it might be base64 encoded or something, but it did not seem to decode.
Can someone please explain what's happening behind the scenes here? My friend explained that piping "asdf" to sha256sum is showing the checksum, which is not the actual hash itself. Is that correct? If so, how can I see the actual hash?
Thank you so much in advance!
There are two things going on here.
First, your manual hash uses a different algorithm than htpasswd. The -s flag causes htpasswd to use SHA1, not SHA256. Use sha1sum instead of sha256sum.
Second, the encoding of the hashes is different. Your manual hash is hex encoded; the htpasswd hash is Base64 encoded. The htpasswd hash will decode, it just decodes to binary. If you try to print this binary it will look like =¥AU™¨Â#+ºPöÆ'f (depending on what character encoding you're using), and that may be why you believe it's not decoding.
If you convert the Base64 directly to hex (you can use an online tool like this one), you'll find that it matches the hash sha1sum generates.
My friend explained that piping "asdf" to sha256sum is showing the checksum, which is not the actual hash itself.
Your friend is incorrect. You're seeing the hex encoding of the hash. However, the way you feed in the input does affect the hash: the here-string (<<<) appends a newline character, so what you're actually hashing is asdf\n. Use this command instead:
echo -n "asdf" | sha1sum
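The Base64-to-hex conversion can also be checked without an online tool. The sketch below is one way to do it in Ruby, using the htpasswd value quoted in the question; the variable names are illustrative only.

```ruby
require 'base64'
require 'digest'

# The part after "{SHA}" in the htpasswd output from the question.
htpasswd_value = 'PaVBVZkYqAjCQCu6UBL2xgsnZhw='

# Base64 decodes to 20 raw bytes (the size of a SHA1 digest);
# unpack1('H*') renders those bytes as lowercase hex.
raw = Base64.strict_decode64(htpasswd_value)
hex_from_htpasswd = raw.unpack1('H*')

# SHA1 of the password itself, with no trailing newline.
hex_from_digest = Digest::SHA1.hexdigest('asdf')

puts hex_from_htpasswd
puts hex_from_digest
puts hex_from_htpasswd == hex_from_digest
```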
It is base64 encoded.
Base64 output often ends with one or two equals signs used as padding, so that is the first indicator. Although the htpasswd man page doesn't mention it, other Apache docs about "the password encryption formats generated and understood by Apache" do say that the SHA format understood by Apache is base64 encoded.

Create a SHA1 hash in Ruby via .net technique

OK hopefully the title didn't scare you away. I'm creating a sha1 hash using Ruby but it has to follow a formula that our other system uses to create the hash.
How can I do the following via Ruby? I'm creating hashes fine - but the format stuff is confusing me - curious if there's something equiv in the Ruby standard library.
System Security Cryptography (MSDN)
Here's the C# code that I'm trying to convert to Ruby. I'm making my hash fine, but not sure about the 'String.Format("{0,2:X2}' part.
using System;
using System.Security.Cryptography;
using System.Text;

// create our SHA1 provider
SHA1 sha = new SHA1CryptoServiceProvider();
String str = "Some value to hash";
String hashedValue = ""; // hashed value of the string
// hash the data -- use ASCII encoding
byte[] hashedDataBytes = sha.ComputeHash(Encoding.ASCII.GetBytes(str));
// loop through each byte in the byte array
foreach (byte b in hashedDataBytes)
{
    // convert each byte to two uppercase hex digits and append to the result
    hashedValue += String.Format("{0,2:X2}", b);
}
A SHA1 hash of a specific piece of data is always the same hash (effectively just a large number), the only variation should be how you need to format it, to e.g. send to the other system. Although particularly obscure systems might post-process the data, truncate it or only take alternate bytes etc.
At a very rough guess from reading the C# code, this ends up a standard looking 40 character hex string. So my initial thought in Ruby is:
require 'digest/sha1'
Digest::SHA1.hexdigest("Some value to hash").upcase
However, I don't know what forcing ASCII encoding in C# does when the input is, for example, a Latin-1 or UTF-8 string. Such strings would be useful example inputs, if only to see whether C# throws an exception; you would then know whether you need to worry about character encoding in your Ruby version. The data input to SHA1 is bytes, not characters, so which encoding to use and how to convert are parts of your problem.
My current understanding is that Encoding.ASCII.GetBytes will force anything over character number 127 to a '?', so you may need to emulate that in Ruby using a .gsub or similar, especially if you are actually expecting that to come in from the data source.
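A sketch of how that replacement might be emulated in Ruby, under the assumption stated above that non-ASCII characters become '?'. The helper names `dotnet_style_sha1` and `dotnet_style_sha1_bytewise` are hypothetical. One known gap: characters outside the Basic Multilingual Plane are two UTF-16 code units in .NET and would probably become two question marks there, while this version produces one.

```ruby
require 'digest/sha1'

# Replace anything that cannot be represented in US-ASCII with '?',
# then hash the resulting bytes and format as uppercase hex.
def dotnet_style_sha1(str)
  ascii = str.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
  Digest::SHA1.hexdigest(ascii).upcase
end

# Per-byte formatting, mirroring the C# loop: '%02X' is two uppercase
# hex digits per byte, like the X2 format specifier.
def dotnet_style_sha1_bytewise(str)
  ascii = str.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
  Digest::SHA1.digest(ascii).bytes.map { |b| format('%02X', b) }.join
end

puts dotnet_style_sha1('Some value to hash')
puts dotnet_style_sha1("caf\u00E9") == dotnet_style_sha1('caf?')
```

Comparing the output for a couple of known inputs against the other system is the only reliable way to confirm the encoding assumption.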

Ruby hexacode to unicode conversion

I crawled a website which contains Unicode, and the results look something like this in code:
a = "\\u2665 \\uc624 \\ube60! \\uc8fd \\uae30 \\uc804 \\uc5d0"
How do I convert it back in Ruby to the original Unicode text, encoded as UTF-8?
If you have ruby 1.9, you can try:
a.force_encoding('UTF-8')
Otherwise if you have < 1.9, I'd suggest reading this article on converting to UTF-8 in Ruby 1.8.
short answer: you should be able to 'puts a', and see the string printed out. for me, at least, I can print out that string in both 1.8.7 and 1.9.2
long answer:
First thing: it depends on if you're using ruby 1.8.7, or 1.9.2, since the way strings and encodings were handled changed.
in 1.8.7:
strings are just lists of bytes. when you print them out, if your OS can handle it, you can just 'puts a' and it should work correctly. if you do a[0], you'll get the first byte. if you want to get each character, things are pretty darn tricky.
in 1.9.2
strings are lists of bytes, with an encoding. If the webpage was sent with the correct encoding, your string should already be encoded correctly. if not, you'll have to set it (as per Mike Lewis's answer). if you do a[0], you'll get the first character (the heart). if you want each byte, you can do a.bytes.
If your OS, for whatever reason, is giving you those literal ASCII characters, my previous answer is obviously invalid; disregard it. :P
here's what you can do:
a.gsub(/\\u([0-9a-fA-F]{4})/) { [$1.to_i(16)].pack("U") }
this will scan for the ASCII string '\u' followed by four hexadecimal digits, and replace it with the corresponding Unicode character.
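The substitution approach above, applied to the string from the question, might look like the sketch below. The helper name `unescape_unicode` is hypothetical; it assumes each escape has exactly four hex digits and does not combine surrogate pairs.

```ruby
# Replace each literal backslash-u escape (six ASCII characters) with the
# UTF-8 character for that code point. pack('U') builds a UTF-8 string
# from an integer code point.
def unescape_unicode(text)
  text.gsub(/\\u([0-9a-fA-F]{4})/) { [Regexp.last_match(1).to_i(16)].pack('U') }
end

a = "\\u2665 \\uc624 \\ube60! \\uc8fd \\uae30 \\uc804 \\uc5d0"
converted = unescape_unicode(a)

puts converted
puts converted.encoding
```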
You can also specify the encoding when you open a new IO object: http://www.ruby-doc.org/core/classes/IO.html#M000889
Compared to Mike's solution, this may prevent troubles if you forget to force the encoding before exposing the string to the rest of your application, if there are multiple mechanisms for retrieving strings from your module or class. However, if you begin crawling SJIS or KOI-8 encoded websites, then Mike's solution will be easier to adapt for the character encoding name returned by the web server in its headers.
