How do I get a UTF-8 string out of an MD5 digest? - ruby

I am trying to use an API that requires an MD5 hash to be sent in UTF-8 format.
Problem is, I can't find any way to actually make that happen.
require 'digest/md5'
api_sig = Digest::MD5.digest "api_key=blahblahblah"
puts api_sig
>> Decode error: not UTF-8
So I try force_encoding(Encoding::UTF_8). Same error. inspect, to_s, nothing gives me what I want.
How can I get a UTF-8 string representing an MD5 digest of another string?

Call Digest::MD5.hexdigest "api_key=blahblahblah"
The documentation of this is very poor, but you can find a lackluster explanation here: http://www.ruby-doc.org/stdlib-2.0/libdoc/digest/rdoc/Digest/Class.html#method-c-hexdigest
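A minimal sketch of the difference: digest returns the 16 raw bytes (binary, generally not valid UTF-8), while hexdigest returns those same bytes spelled out as a 32-character hex string, which is plain ASCII and therefore safe to send as UTF-8.

```ruby
require 'digest/md5'

raw = Digest::MD5.digest "api_key=blahblahblah"     # 16 raw bytes (binary)
hex = Digest::MD5.hexdigest "api_key=blahblahblah"  # same bytes, hex-encoded

raw.bytesize   #=> 16
hex.length     #=> 32
hex == raw.unpack1("H*")  #=> true
```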

Related

Ruby zlib deflate method is generating invalid characters

I hope you can help me with this.
I've been trying to implement some simple code to deflate a string using the zlib library in a Sinatra app, but it seems to be deflating it wrong?!
Here's my code so far:
require 'sinatra'
require 'zlib'

get '/v1/generate' do
  file_content = "teste"
  generate_diagram_from file_content
end

def generate_diagram_from file_content
  data_compressed = Zlib::Deflate.deflate(file_content)
end
And here's what I'm getting from the deflate method:
x�+I-.I�&
A weird string with strange characters and everything.
I'd like to know what I am doing wrong here.
Thank you guys in advance!
The only thing wrong is your expectation of a readable string.
Compression produces a sequence of bytes that can take any value (0..255). This is called binary data. It will always contain "weird" or "strange" characters if you try to display it as you have. In fact, it will almost certainly contain invalid UTF-8 sequences, which is why you are getting the white-on-black question marks. This is why you should never try to display or print such sequences directly.
If you want to look at them or, for example, put them in a question here, display them in hexadecimal.
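Using the question's own input, a quick sketch of that advice (unpack1("H*") hex-encodes any binary string):

```ruby
require 'zlib'

data = Zlib::Deflate.deflate("teste")
data.encoding            #=> #<Encoding:ASCII-8BIT> -- binary, not text

# Show the bytes in hexadecimal rather than printing them raw:
puts data.unpack1("H*")

# The round trip confirms the compressed data itself is fine:
Zlib::Inflate.inflate(data)  #=> "teste"
```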

Ruby: How to decode strings which are partially encoded or fully encoded?

I am getting encoded strings while parsing text files. I have no idea how to decode them to English or their original language.
"info#cloudag.com"
is the encoded string that needs to be decoded.
I want to decode it using Ruby.
Here is a link for your reference and I am expecting the same.
This looks like HTML encoding, not URL encoding.
require 'cgi'
CGI.unescapeHTML("info#cloudag.com")
#=> "info#cloudag.com"
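Note that the page rendering above has already displayed the entities, so the snippet reads as an identity. With an entity-encoded input still intact (an illustrative string, not the asker's actual data), CGI.unescapeHTML does the decoding:

```ruby
require 'cgi'

# Hypothetical example: numeric character references for "info@example.com"
CGI.unescapeHTML("&#105;&#110;&#102;&#111;&#64;example.com")
#=> "info@example.com"
```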

Password Hashes Characters in Applications

I appreciate your time.
I'm trying to understand something. Why is it that hashes that I generate manually only seem to include alphanumeric characters 0-9, a-f, but all of the hashes used by our favorite applications seem to contain all of the letters [and capitalized ones at that]?
Example:
Manual hash using sha256:
# sha256sum <<< asdf
d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1 -
You never see any letters above f. And nothing is capitalized.
But if I create a SHA hash using htpasswd, it's got all the alphanumerics:
# htpasswd -snb test asdf
test:{SHA}PaVBVZkYqAjCQCu6UBL2xgsnZhw=
Same thing happens if you look at a password hash in a website CMS database for example. There must be some extra step I'm missing or the end format is different than the actual hash format. I thought it might be base64 encoded or something, but it did not seem to decode.
Can someone please explain what's happening behind the scenes here? My friend explained that piping "asdf" to sha256sum is showing the checksum, which is not the actual hash itself. Is that correct? If so, how can I see the actual hash?
Thank you so much in advance!
There are two things going on here.
First, your manual hash is using a different algorithm than htpasswd. The -s flag causes htpasswd to use SHA1, not SHA256. Use sha1sum instead of sha256sum.
Second, the encoding of the hashes is different. Your manual hash is hex encoded; the htpasswd hash is Base64 encoded. The htpasswd hash will decode, it just decodes to binary. If you try to print that binary it will look something like =¥AU™¨Â#+ºPöÆ'f (depending on what character encoding you're using), and that may be why you believe it's not decoding.
If you convert the Base64 directly to Hex (you can use an online tool like this one), you'll find that sha1sum will generate the same hash.
My friend explained that piping "asdf" to sha256sum is showing the checksum, which is not the actual hash itself.
Your friend is incorrect: you're seeing the hex encoding of the actual hash. But the way you invoked sha256sum does affect the hash that's generated. The here-string (<<<) appends a newline, so what you actually hashed was asdf\n. Use this command instead:
echo -n "asdf" | sha1sum
It is Base64 encoded.
Base64 output typically ends in one or two equals signs (padding), so that is the first indicator. Although the htpasswd man page doesn't mention it, other Apache docs about "the password encryption formats generated and understood by Apache" do say that the SHA format understood by Apache is Base64 encoded.
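Both answers can be checked from Ruby: SHA-1 (not SHA-256) of "asdf" with no trailing newline, Base64-encoded, reproduces the htpasswd value from the question.

```ruby
require 'digest/sha1'
require 'base64'

hex = Digest::SHA1.hexdigest("asdf")   # what `echo -n "asdf" | sha1sum` prints
b64 = Base64.strict_encode64(Digest::SHA1.digest("asdf"))

hex  #=> "3da541559918a808c2402bba5012f6c60b27661c"
b64  #=> "PaVBVZkYqAjCQCu6UBL2xgsnZhw="
```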

Ruby OpenSSL AES generate random key

I have an elementary problem that I can't seem to figure out. I'm trying to generate a random key in AES-256-CBC that can be used to encrypt/decrypt data.
Here is what I'm doing:
require 'openssl'
cipher = OpenSSL::Cipher::AES256.new(:CBC)
cipher.encrypt
puts cipher.random_key
>> "\xACOM:\xCF\xB3#o)<&y!\x16A\xA1\xB5m?\xF1 \xC9\x1F>\xDB[Uhz)\v0"
That gives me the string above, which looks nothing like keys I've used in the past. I am very new to encryption, as you may be able to tell, but I'm trying to understand if I need to further prepare the string. I created a quick view in Rails so I could go to /generate and it would render a simple HTML page with a random key. It wouldn't even render the page and complained about invalid UTF-8; the only way I could get the page to display was to Base64-encode the key first.
I know I'm missing something stupid. Any ideas would be great.
EDIT:
This is what it looks like if I Base64-encode. Should I be stripping the = signs off or something?
AES-128-CBC
Random Key: 0xdq+IZdmYHHbLC9Uv8jgQ==
Random IV: vp08d/nFGE3R8HsmOzYzOA==
AES-256-CBC
Random Key: BW0wY5fUkcwszV5GIczI+D45eFOz/Ehvw5XdZIavVOQ=
Random IV: D0pXdwQAqu+XSOv8E/dqBw==
Thanks for the help!
To answer your question (quoted from Wikipedia):
The '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes.
In theory, the padding character is not needed for decoding, since the number of missing bytes can be calculated from the number of Base64 digits. In some implementations, the padding character is mandatory, while for others it is not used.
For Ruby and your use case the answer is: it's fine to strip the =.
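A minimal sketch of the whole flow: generate the raw key, Base64-encode it for display or storage, and decode it back to the identical bytes before use. (Note that random_key both generates a key and sets it on the cipher.)

```ruby
require 'openssl'
require 'base64'

cipher = OpenSSL::Cipher.new('AES-256-CBC')
cipher.encrypt
key = cipher.random_key   # 32 random bytes, binary
iv  = cipher.random_iv    # 16 random bytes, binary

# Encode for display/storage, decode back before using the key:
encoded = Base64.strict_encode64(key)
decoded = Base64.strict_decode64(encoded)
decoded == key  #=> true
```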

Unescaping characters in a string with Ruby

Given a string in the following format (the Posterous API returns posts in this format):
s="\\u003Cp\\u003E"
How can I convert it to the actual ascii characters such that s="<p>"?
On OSX, I successfully used Iconv.iconv('ascii', 'java', s) but once deployed to Heroku, I receive an Iconv::IllegalSequence exception. I'm guessing that the system Heroku deploys to doesn't support the java encoder.
I am using HTTParty to make a request to the Posterous API. If I use curl to make the same request then I do not get the double slashes.
From HTTParty github page:
Automatic parsing of JSON and XML into ruby hashes based on response content-type
The Posterous API returns JSON (no double slashes) and HTTParty's JSON parsing is inserting the double slash.
Here is a simple example of the way I am using HTTParty to make the request.
class Posterous
  include HTTParty
  base_uri "http://www.posterous.com/api/2"
  basic_auth "username", "password"
  format :json

  def get_posts
    response = Posterous.get("/users/me/sites/9876/posts&api_token=1234")
    # snip, see below...
  end
end
With the obvious information (username, password, site_id, api_token) replaced with valid values.
At the point of snip, response.body contains a Ruby string that is in JSON format and response.parsed_response contains a Ruby hash object which HTTParty created by parsing the JSON response from the Posterous API.
In both cases the unicode sequences such as \u003C have been changed to \\u003C.
I've found a solution to this problem. I ran across this gist. elskwid had the identical problem and ran the string through a JSON parser:
s = ::JSON.parse("[\"\\u003Cp\\u003E\"]").first
Now, s = "<p>". (A bare \u003Cp\u003E is not a valid JSON document on its own, so it is wrapped as a JSON string in an array before parsing.)
I ran into this exact problem the other day. There is a bug in the json parser that HTTParty uses (Crack gem) - basically it uses a case-sensitive regexp for the Unicode sequences, so because Posterous puts out A-F instead of a-f, Crack isn't unescaping them. I submitted a pull request to fix this.
In the meantime HTTParty nicely lets you specify alternate parsers so you can do ::JSON.parse bypassing Crack entirely like this:
class JsonParser < HTTParty::Parser
  def json
    ::JSON.parse(body)
  end
end

class Posterous
  include HTTParty
  parser ::JsonParser
  #....
end
You can also use pack:
"a\\u00e4\\u3042".gsub(/\\u(....)/){[$1.hex].pack("U")} # "aäあ"
Or to do the reverse:
"aäあ".gsub(/[^ -~\n]/){"\\u%04x"%$&.ord} # "a\\u00e4\\u3042"
The doubled-backslashes almost look like a regular string being viewed in a debugger.
The string "\u003Cp\u003E" really is "<p>"; \u003C is the Unicode escape for < and \u003E is >.
>> "\u003Cp\u003E" #=> "<p>"
If you are truly getting the string with doubled backslashes then you could try stripping one of the pair.
As a test, see how long the string is:
>> "\\u003Cp\\u003E".size #=> 13
>> "\u003Cp\u003E".size #=> 3
>> "<p>".size #=> 3
All the above was done using Ruby 1.9.2, which is Unicode aware. v1.8.7 wasn't. Here's what I get using 1.8.7's IRB for comparison:
>> "\u003Cp\u003E" #=> "u003Cpu003E"
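The two decoding approaches above agree; a quick sketch on Ruby 1.9+ (the array wrap is there because a bare \u003Cp\u003E is not a valid JSON document):

```ruby
require 'json'

escaped = "\\u003Cp\\u003E"   # the literal backslash-u string from the question

via_json = JSON.parse("[\"#{escaped}\"]").first
via_pack = escaped.gsub(/\\u(\h{4})/) { [$1.hex].pack("U") }

via_json  #=> "<p>"
via_pack  #=> "<p>"
```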
