OK hopefully the title didn't scare you away. I'm creating a sha1 hash using Ruby but it has to follow a formula that our other system uses to create the hash.
How can I do the following in Ruby? I'm creating hashes fine, but the formatting part is confusing me. I'm curious whether there's an equivalent in the Ruby standard library.
System Security Cryptography (MSDN)
Here's the C# code that I'm trying to convert to Ruby. I'm making my hash fine, but I'm not sure about the 'String.Format("{0,2:X2}", b)' part.
//create our SHA1 provider
SHA1 sha = new SHA1CryptoServiceProvider();
String str = "Some value to hash";
String hashedValue = ""; // hashed value of the string
//hash the data -- use ascii encoding
byte[] hashedDataBytes = sha.ComputeHash(Encoding.ASCII.GetBytes(str));
//loop through each byte in the byte array
foreach (byte b in hashedDataBytes)
{
//convert each byte -- append to result
hashedValue += String.Format("{0,2:X2}", b);
}
A SHA1 hash of a specific piece of data is always the same (effectively just a large number); the only variation should be in how you format it, e.g. to send it to the other system. That said, particularly obscure systems might post-process the data, truncate it, take only alternate bytes, and so on.
At a very rough guess from reading the C# code, this ends up a standard looking 40 character hex string. So my initial thought in Ruby is:
require 'digest/sha1'
Digest::SHA1.hexdigest("Some value to hash").upcase
. . . I don't know, however, what forcing ASCII encoding in C# does when the input starts out as, e.g., a Latin-1 or UTF-8 string. Such strings would be useful example inputs, if only to see whether C# throws an exception; you would then know whether you need to worry about character encoding in your Ruby version. The input to SHA1 is bytes, not characters, so which encoding to use and how to convert are part of your problem.
My current understanding is that Encoding.ASCII.GetBytes replaces any character above code point 127 with a '?', so you may need to emulate that in Ruby using a .gsub or similar, especially if you actually expect such characters to come in from the data source.
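To make the two points above concrete, here is a minimal Ruby sketch. It assumes the understanding stated above is right (non-ASCII characters become '?') and uses String#encode instead of .gsub to emulate it. The helper names ascii_bytes_like_dotnet and dotnet_style_sha1 are made up for illustration, and characters outside the Basic Multilingual Plane may behave differently on the .NET side, which this sketch does not cover.

```ruby
require 'digest/sha1'

# Hypothetical helper: replace every non-ASCII character with '?',
# which is what Encoding.ASCII.GetBytes is assumed to do here.
def ascii_bytes_like_dotnet(str)
  str.encode('US-ASCII', invalid: :replace, undef: :replace, replace: '?')
end

# Hypothetical helper: hash the ASCII bytes, then format each byte as two
# uppercase hex digits, mirroring String.Format("{0,2:X2}", b) in the loop.
def dotnet_style_sha1(str)
  digest = Digest::SHA1.digest(ascii_bytes_like_dotnet(str))
  digest.each_byte.map { |b| format('%02X', b) }.join
end

hashed_value = dotnet_style_sha1('Some value to hash')
```

For pure ASCII input this gives the same string as Digest::SHA1.hexdigest(str).upcase, so the per-byte loop is only there to show the correspondence with the C# code.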
We have some JSON files that live on a filesystem in our internal network that are written by Ruby and read by Clojure (and also Ruby). We'd like to encrypt them to increase their security. We've used AES 256 CBC inside our Ruby project for other things that need symmetric encryption so we'd like to use that. However, this time, the encryption will need to be decrypted in a Clojure application. The output of encryption (using this as a guide: OpenSSL::Cipher), as represented in a Ruby string, looks like: "a\x96\xECLI\xBC%\xC4#{\xBD\x99%\xA1\x84\x84" and putting that into a Clojure REPL results in a bunch of "Syntax error... Unsupported escape character: \x" I tried making every "\" a "\\" but then using Clojure's library "Buddy-Core" and :aes256-cbc-hmac-sha512 as the encryption algorithm results in:
Execution error (AssertionError) at buddy.core.crypto/eval1558$fn (crypto.clj:478).
Assert failed: (keylength? key 64)
Even though the key/iv were a fine length when used to encrypt the string in Ruby.
To sum up:
To convert from a Ruby String to a Clojure String, is it correct to replace all backslashes with double backslashes?
Is :aes256-cbc-hmac-sha512 the correct algorithm for use with Clojure's Buddy-Core to decrypt AES 256 CBC?
Would I be better off doing this in Java inside of Clojure? (sub question: Please do advise on converting Ruby Strings to Java/Clojure Strings)
Whoops misread the question!
Try to avoid strings at all cost when working with raw binary data
The Ruby string is not a valid Java string literal (\x is not a valid Java escape sequence).
The allowed Java escape sequences are listed here:
https://docs.oracle.com/javase/specs/jls/se13/html/jls-3.html#jls-EscapeSequence
I'm not sure if Ruby strings are unicode by default, so this is a bit tricky.
The safest bet is to turn your encryption result into BASE64 and use that.
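As a rough sketch of that suggestion on the Ruby side: encrypt, then write out the Base64 text instead of the raw bytes. The plaintext and variable names are placeholders, and it is an assumption that the reading side Base64-decodes to a byte array before decrypting with the same key and IV.

```ruby
require 'openssl'
require 'base64'

plaintext = '{"hello":"world"}' # placeholder JSON payload

cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
key = cipher.random_key # 32 raw bytes
iv  = cipher.random_iv  # 16 raw bytes
ciphertext = cipher.update(plaintext) + cipher.final

# strict_encode64 emits no newlines, so the result is one plain ASCII token
# that can be stored or pasted anywhere without escaping.
encoded = Base64.strict_encode64(ciphertext)

# Round trip in Ruby, to show that nothing is lost in the text form.
decipher = OpenSSL::Cipher.new('aes-256-cbc')
decipher.decrypt
decipher.key = key
decipher.iv  = iv
decrypted = decipher.update(Base64.strict_decode64(encoded)) + decipher.final
```

The key and IV are raw bytes too, so the same encoding applies if they ever need to travel as text.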
I have some binary data that I want to convert to something more easily readable and copy/pastable.
The binary data shows up like this
?Q?O?,???W%ʐ):?g????????
Which is pretty ugly. I can convert it to hex with:
value.unpack("H*").first
But since hexadecimal only has 16 characters, it isn't very compressed. I end up with a string that is hundreds of chars long.
I'd prefer a format that uses letters (capitalized and lowercase), numbers, and basic symbols, to make best use of the possible values. What can I use?
I'd also prefer something that comes built-in to Ruby, that doesn't require another library. Unfortunately I can't require another library unless it's really well known and popular, or ideally built-in to Ruby.
I tried the stuff from http://apidock.com/ruby/String/unpack and couldn't find anything.
A simple method uses Base64 encoding to encode the value. It's very similar to Hex encoding (which is Base16), but uses a longer dictionary.
Base64 strings, when properly prepared, contain only printable characters. This is a benefit for copy/paste and for sharing.
The secondary benefit is its 3:4 encoding ratio, which makes it reasonably efficient: every 3 bytes of input are encoded as 4 characters (75% efficient). Hex encoding has a less efficient 1:2 ratio: every 1 byte of input is encoded as 2 characters (50% efficient).
You can use the Ruby standard library Base64 implementation to encode and decode, like so:
require "base64"
encoded = Base64.encode64("Taste the thunder!") # <== "VGFzdGUgdGhlIHRodW5kZXIh\n"
decoded = Base64.decode64(encoded) # <== "Taste the thunder!"
Note that there is a (mostly) URL-safe version, as well, so that you can include an encoded value anywhere in a URL without requiring any additional URL encoding. This would allow you to pass information in a URL in an obscured way, and especially information that normally wouldn't be easily passed in that manner.
Try this to encode your data:
encoded_url_param = Base64.urlsafe_encode64("cake+pie=yummy!") # <== "Y2FrZStwaWU9eXVtbXkh"
decoded_url_param = Base64.urlsafe_decode64(encoded_url_param) # <== "cake+pie=yummy!"
Using Base64 in a URL is not "security", since anyone can decode it, but it does keep your data and intent from being readable at a glance. The only potential downside to using Base64 values in a URL is that the URL must remain case-sensitive, and some applications don't honor that requirement. See the Should URL be case sensitive SO question for more information.
Sounds to me like you want base 64. It is part of the standard library:
require 'base64'
Base64.encode64(some_data)
Or using pack,
[some_data].pack("m")
The resulting data is about 4/3 the size of the input.
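A quick way to check the size claims is to encode the same bytes both ways and compare lengths. This is only an illustrative sketch; the 30-byte input is arbitrary.

```ruby
require 'base64'
require 'securerandom'

data = SecureRandom.random_bytes(30) # arbitrary binary input

hex = data.unpack1('H*')            # 2 characters per byte
b64 = Base64.strict_encode64(data)  # 4 characters per 3 bytes, no newlines

hex_length = hex.length
b64_length = b64.length

# pack("m") is the same encoding as Base64.encode64 (with line breaks);
# pack("m0") is the newline-free variant, like strict_encode64.
same_as_encode64 = [data].pack('m') == Base64.encode64(data)
same_as_strict   = [data].pack('m0') == b64
```

For 30 bytes of input, the hex string is 60 characters long and the Base64 string is 40.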
Base36 string encoding is a reasonable alternative to both Base64 and Hex encoding, as well. In this encoding method, only 36 characters are used, typically the ASCII lowercase letters and the ASCII numbers.
There's not a Ruby API that specifically does this, however this SO answer Base36 Encode a String shows how to do this efficiently in Ruby:
Encoding to Base36:
encoded = data.unpack('H*')[0].to_i(16).to_s(36)
Decoding from Base36:
decoded = [encoded.to_i(36).to_s(16)].pack 'H*'
Base36 encoding works well in URLs, much like Base64, but it does not suffer from Base64's case-sensitivity issue.
Note that Base36 string encoding should not be confused with base 36 radix integer encoding, which simply converts an integer value to the corresponding base 36 representation. The integer technique uses String#to_i(36) and Integer#to_s(36) (Fixnum#to_s(36) on Ruby versions before 2.4) to accomplish its goals.
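One caveat with the one-liners above: because the bytes pass through an integer, leading zero bytes are lost, and an odd number of hex digits on the way back shifts the result. The sketch below works around that by prepending a sentinel hex digit before converting. The method names base36_encode and base36_decode are made up here, and the sentinel trick is just one possible workaround, not part of the linked answer.

```ruby
# Hypothetical helpers: Base36 round trip that preserves leading zero bytes.
def base36_encode(data)
  # The leading '1' keeps leading zero bytes from vanishing in the integer.
  ('1' + data.unpack1('H*')).to_i(16).to_s(36)
end

def base36_decode(encoded)
  hex = encoded.to_i(36).to_s(16)
  # Drop the sentinel digit, then turn the remaining hex back into bytes.
  [hex[1..-1]].pack('H*')
end

original = "\x00\x01\xFA".b
encoded  = base36_encode(original)
decoded  = base36_decode(encoded)

# For comparison, the plain one-liners applied to bytes with a leading zero.
naive_encoded = "\x00\x01".b.unpack('H*')[0].to_i(16).to_s(36)
naive_decoded = [naive_encoded.to_i(36).to_s(16)].pack('H*')
```

Here decoded equals original, while naive_decoded does not equal the two input bytes it started from.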
What is the best way to turn the string "FA" into /xFA/ ?
To be clear, I don't want to turn "FA" into 7065 or "FA".to_i(16).
In Java the equivalent would be this:
byte b = (byte) Integer.decode("0xFA").intValue();
So you're using / markers, but you aren't actually asking about regexps, right?
I think this does what you want:
['FA'].pack('H*')
# => "\xFA"
There is no actual byte type in ruby stdlib (I don't think? unless there's one I don't know about?), just Strings, that can be any number of bytes long (in this case, one). A single "byte" is typically represented as a 1-byte long String in ruby. #bytesize on a String will always return the length in bytes.
"\xFA".bytesize
# => 1
Your example happens not to be a valid UTF-8 character by itself. Depending on exactly what you're doing and how your environment is set up, your string might end up being tagged with a UTF-8 encoding by default. If you are dealing with binary data and want to make sure the string is tagged as such, you might want to call #force_encoding on it to be sure. It should NOT be necessary when using #pack; the results should be tagged as ASCII-8BIT already (which has a synonym of BINARY; it's basically the "null encoding" used in Ruby for binary data).
['FA'].pack('H*').encoding
# => #<Encoding:ASCII-8BIT>
But if you're dealing with string objects holding what's meant to be binary data, not necessarily valid character data in any encoding, it is useful to know that you may sometimes need to do str.force_encoding("ASCII-8BIT") (or force_encoding("BINARY"), same thing) to make sure your string isn't tagged as a particular text encoding. Otherwise Ruby may complain when you try certain operations on it if it includes invalid bytes for that encoding, or in other cases possibly do the wrong thing.
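A small sketch of what the encoding tag does and does not change; the variable names are illustrative only.

```ruby
# Bytes built with pack are already tagged as binary.
packed = ['FA'].pack('H*')
packed_is_binary = packed.encoding == Encoding::ASCII_8BIT

# BINARY is just another name for the same encoding object.
binary_is_alias = Encoding::BINARY == Encoding::ASCII_8BIT

# The same byte tagged as UTF-8 is not valid text.
mistagged = "\xFA".dup.force_encoding('UTF-8')
mistagged_valid = mistagged.valid_encoding?

# Retagging changes only the label, never the bytes themselves.
retagged = mistagged.dup.force_encoding('BINARY')
retagged_valid = retagged.valid_encoding?
same_bytes = retagged.bytes == packed.bytes
```

mistagged_valid is false and retagged_valid is true, even though both strings hold the single byte 0xFA.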
Actually for a regexp
Okay, you actually do want a regexp. So we have to take our string we created, and embed it in a regexp. Here's one way:
representation = "FA"
str = [representation].pack("H*")
# => "\xFA"
data = "\x01\xFA\xC2".force_encoding("BINARY")
regexp = Regexp.new(str)
data =~ regexp
# => 1 (matched on byte 1; the first byte of data is byte 0)
You see how I needed the force_encoding there on the data string, otherwise ruby would default to it being a UTF-8 string (depending on ruby version and environment setup), and complain that those bytes aren't valid UTF-8.
In some cases you might need to explicitly mark the regexp as binary too. The docs describe a no-encoding option for that (the n flag on a regexp literal, or Regexp::NOENCODING in the options passed to Regexp.new), but I've never done it.
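One more thing to watch for when building a regexp from arbitrary bytes: some byte values are regexp metacharacters (0x2E is ".", for example), so it is safer to escape the string first. A minimal sketch, where byte_regexp is a made-up helper name:

```ruby
# Hypothetical helper: build a regexp that matches the literal bytes
# described by a hex string.
def byte_regexp(hex)
  Regexp.new(Regexp.escape([hex].pack('H*')))
end

data = "\x01\xFA\xC2".b # .b returns a copy tagged as binary
position = data =~ byte_regexp('FA')

# Without Regexp.escape, the byte 0x2E would match any character.
dot_position = "a.b".b =~ byte_regexp('2E')
no_match     = "axb".b =~ byte_regexp('2E')
```

position and dot_position are both 1, and no_match is nil.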
I have an elementary problem that I can't seem to figure out. I'm trying to generate a random key in AES-256-CBC that can be used to encrypt/decrypt data.
Here is what I'm doing:
require 'openssl'
cipher = OpenSSL::Cipher::AES256.new(:CBC)
cipher.encrypt
puts cipher.random_key
>> "\xACOM:\xCF\xB3#o)<&y!\x16A\xA1\xB5m?\xF1 \xC9\x1F>\xDB[Uhz)\v0"
That gives me the string above, which looks nothing like keys I've used in the past. I am very new to encryption, as you may be able to tell, but I'm trying to understand whether I need to further prepare the string. I created a quick view in Rails so I could go to /generate and it would render a simple HTML page with a random key. It wouldn't even render the page and was complaining about invalid UTF-8. The only way I could get the page to display was to Base64 encode the key first.
I know I'm missing something stupid. Any ideas would be great.
EDIT:
This is what it looks like if I Base64 encode. Should I be stripping the = signs off or something?
AES-128-CBC
Random Key: 0xdq+IZdmYHHbLC9Uv8jgQ==
Random IV: vp08d/nFGE3R8HsmOzYzOA==
AES-256-CBC
Random Key: BW0wY5fUkcwszV5GIczI+D45eFOz/Ehvw5XdZIavVOQ=
Random IV: D0pXdwQAqu+XSOv8E/dqBw==
Thanks for the help!
To answer your question (quoted from Wikipedia):
The '==' sequence indicates that the last group contained only one
byte, and '=' indicates that it contained two bytes.
In theory, the padding character is not needed for decoding, since the
number of missing bytes can be calculated from the number of Base64
digits. In some implementations, the padding character is mandatory,
while for others it is not used.
For Ruby and your use case the answer is: it's no problem to strip the = padding. Base64.decode64 tolerates missing padding, although Base64.strict_decode64 does not.
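A short sketch of that behaviour with a freshly generated AES-256 key. Variable names are illustrative; the point is that the random key is raw bytes and the Base64 text is only a printable wrapper that has to be decoded again before the key is used.

```ruby
require 'openssl'
require 'base64'

cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
key = cipher.random_key # 32 raw bytes, not printable text

printable = Base64.strict_encode64(key) # 44 characters, one trailing '='
stripped  = printable.delete('=')       # 43 characters

# The lenient decoder accepts the stripped form.
lenient_ok = Base64.decode64(stripped) == key

# The strict decoder insists on the padding.
strict_rejects = begin
  Base64.strict_decode64(stripped)
  false
rescue ArgumentError
  true
end
```

So stripping is harmless as long as whatever decodes the value later is a lenient decoder, or the padding is restored first.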
In my web application one model uses identifier that was generated by some UUID tool. As I want that identifier to be part of the URL I am investigating methods to shorten that UUID string. As it is currently is in hexadecimal format I thought about converting it to ASCII somehow. As it should afterwards only contain normal characters and number ([\d\w]+) the normal hex to ASCII conversion doesn't seem to work (ugly characters).
Do you know of some nice algorithm or tool (Ruby) to do that?
A UUID is a 128-bit binary number, in the end. If you represent it as 16 unencoded bytes, there's no way to avoid "ugly characters". What you probably want to do is decode it from hex and then encode it using base64. Note that base64 encoding uses the characters + / = as well as A-Za-z0-9, so you'll want to do a little postprocessing. I suggest s/+/-/g; s/\//_/g; s/==$// (a base64-encoded UUID will always end with two equals signs).
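In Ruby, the standard library can do the substitution and the padding removal in one call. The sketch below assumes Ruby 2.3 or later (for the padding: keyword) and uses made-up helper names, shorten_uuid and expand_uuid; the sample UUID is just an example value.

```ruby
require 'base64'

# Hypothetical helper: 36-character UUID string -> 22-character URL-safe token.
def shorten_uuid(uuid)
  bytes = [uuid.delete('-')].pack('H*') # 16 raw bytes
  Base64.urlsafe_encode64(bytes, padding: false)
end

# Hypothetical helper: 22-character token -> canonical UUID string.
def expand_uuid(short)
  hex = Base64.urlsafe_decode64(short).unpack1('H*')
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join('-')
end

uuid  = '550e8400-e29b-41d4-a716-446655440000'
short = shorten_uuid(uuid)
back  = expand_uuid(short)
```

The token uses A-Z, a-z, 0-9, "-" and "_". If the identifier must strictly match [\d\w]+, the "-" would still need replacing with some other character.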