I got the following ciphered string:
MDExMDExMTEwMTExMDAwMDAxMTAwMTAxMDExMDExMTAwMDEwMDAwMDAxMTEwMDExMDExMDAxMDEwMTExMDAxMTAx MTAwMDAxMDExMDExMDEwMTEwMDEwMQ==
Now I would like to decipher this string back into the original one. How can I do this? I don't know which algorithm was used to cipher the original string. The ciphered string is 121 characters long.
Artjom B. already noted in a comment that trailing equal signs may indicate a Base64 encoding (a quick Google search reveals this, too). Fortunately, Ruby has a Base64 library to decode it:
require 'base64'
string = 'MDExMDExMTEwMTExMDAwMDAxMTAwMTAxMDExMDExMTAwMDEwMDAwMDAxMTEwMDExMDExMDAxMDEwMTExMDAxMTAx MTAwMDAxMDExMDExMDEwMTEwMDEwMQ=='
decoded = Base64.decode64(string)
#=> "0110111101110000011001010110111000100000011100110110010101110011011000010110110101100101"
The new string consists of 0's and 1's, apparently another encoding, this time a binary one. It could be ASCII characters. Let's take a look at the first 8 "bits":
decoded[0, 8] #=> "01101111"
Convert it to a byte, i.e. an integer, via to_i (the argument 2 means base 2):
decoded[0, 8].to_i(2) #=> 111
And finally to a character via chr:
decoded[0, 8].to_i(2).chr #=> "o"
Nice, "o" is a valid ASCII character, what about the following characters?
decoded[8, 8].to_i(2).chr #=> "p"
decoded[16, 8].to_i(2).chr #=> "e"
decoded[24, 8].to_i(2).chr #=> "n"
That's "open", an English word. I think we have something here. You can probably work out the rest yourself. And beware of the thieves ;-)
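For reference, the chunk-by-chunk conversion above can be applied to the whole string in one go. This is only a sketch: the 8-bit groups are copied from the decoded string shown earlier, and the variable names (groups, plain, packed) are made up for illustration.

```ruby
# The eleven 8-bit groups of the decoded string, kept separate for readability.
groups = %w[
  01101111 01110000 01100101 01101110 00100000 01110011
  01100101 01110011 01100001 01101101 01100101
]
decoded = groups.join

# Option 1: cut the string into 8-character chunks and convert each chunk,
# exactly like decoded[0, 8].to_i(2).chr but for every position.
plain = decoded.scan(/.{8}/).map { |bits| bits.to_i(2).chr }.join

# Option 2: let Array#pack read the whole string as bits (most significant bit first).
packed = [decoded].pack('B*')

puts plain
puts plain == packed
```

Both variants assume the input length is a multiple of 8; a trailing partial group would need separate handling.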
This is difficult if you don't have any information about the algorithm, but the string looks like it is simply Base64 encoded. If you run it through a decoder you end up with
0110111101110000011001010110111000100000011100110110010101110011011000010110110101100101
I don't know whether that makes sense or not.
This question already has answers here:
Rotating letters in a string so that each letter is shifted to another letter by n places
I'm trying to make a basic cipher.
def caesar_crypto_encode(text, shift)
(text.nil? or text.strip.empty? ) ? "" : text.gsub(/[a-zA-Z]/){ |cstr|
((cstr.ord)+shift).chr }
end
but when the shift is too high I get these kinds of characters:
Test.assert_equals(caesar_crypto_encode("Hello world!", 127), "eBIIL TLOIA!")
Expected: "eBIIL TLOIA!", instead got: "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"
What is this format?
You get the escaped output because Ruby is running with UTF-8 encoding, and your conversion has produced bytes that do not form a valid character sequence under UTF-8.
ASCII characters A-Z are represented by decimal numbers (ordinals) 65-90, and a-z by 97-122. When you add 127 you push all of them into the 8-bit range (192-217 and 224-249), and a lone byte above 127 is not a valid UTF-8 character.
That's why Ruby inspect outputs the encoded strings in quoted form, which shows each character as its hexadecimal number "\xC7...".
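A small sketch of what happens to a single letter may make this concrete. It assumes Ruby's default setup (no default internal encoding); the variable names are only illustrative.

```ruby
# 'H' has ordinal 72; adding 127 gives 199, which is outside 7-bit ASCII.
shifted = ('H'.ord + 127).chr          # Integer#chr without an argument: one raw byte

# Tag that byte as UTF-8, which is what happens once it ends up in a UTF-8 string.
tagged = shifted.dup.force_encoding('UTF-8')

puts shifted.bytes.inspect             # the raw byte value
puts tagged.valid_encoding?            # is the lone byte valid UTF-8?
puts tagged.inspect                    # how inspect renders it

# For comparison: asking chr for a UTF-8 character yields a valid two-byte sequence.
proper = 199.chr(Encoding::UTF_8)
puts proper.bytes.inspect
```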
If you want to get some semblance of characters out of this, you could re-encode the gibberish into ISO8859-1, which supports 8-bit characters.
Here's what you get if you do that:
s = "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"
>> s.encoding
=> #<Encoding:UTF-8>
# Relabel the bytes as ISO8859-1 (force_encoding changes the encoding tag, not the bytes).
# Your terminal (and Ruby) is using UTF-8, so Ruby will refuse to print these yet.
>> s.force_encoding('iso8859-1')
=> "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"
# In order to be able to print ISO8859-1 on an UTF-8 terminal, you have to
# convert them back to UTF-8 by re-encoding. This way your terminal (and Ruby)
# can display the ISO8859-1 8-bit characters using UTF-8 encoding:
>> s.encode('UTF-8')
=> "Çäëëî öîñëã!"
# Another way is just to repack the bytes into UTF-8:
>> s.bytes.pack('U*')
=> "Çäëëî öîñëã!"
Of course, the proper way to do this is not to let the numbers overflow into 8-bit space under any circumstances. Your encryption algorithm has a bug, and you need to ensure that the output stays in the 7-bit ASCII range.
A better solution
Like @tadman suggested, you could use tr instead:
AZ_SEQUENCE = [*'A'..'Z', *'a'..'z']
"Hello world!".tr(AZ_SEQUENCE.join, AZ_SEQUENCE.rotate(127).join)
=> "eBIIL TLOIA!"
I'm still curious about that format though...
Those characters are what you get after taking the ordinal (ord) of each letter, adding 127 to it, and converting the result back with chr (i.e. ((cstr.ord)+shift).chr).
Why? Check Integer#chr, from the docs:
Returns a string containing the character represented by the int's
value according to encoding.
So, for example, take your first letter "H":
char_ord = "H".ord
#=> 72
new_char_ord = char_ord + 127
#=> 199
new_char_ord.chr
#=> "\xC7"
So, 199 corresponds to "\xC7". Keep changing all characters in "Hello world" and you will get "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3".
To avoid this, wrap the shifted value around so that the result is always the ord value of a letter (see the answers to the duplicate question linked above).
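One possible way to do that wrapping is sketched below. It assumes the 52-letter sequence A-Z followed by a-z, because that is the ordering under which the test in the question expects "eBIIL TLOIA!"; the method name caesar_encode_wrapped and the constant ALPHABET are hypothetical.

```ruby
# Assumed alphabet: upper case first, then lower case (52 letters in total).
ALPHABET = [*'A'..'Z', *'a'..'z'].freeze

# Hypothetical replacement for caesar_crypto_encode: shifts each letter
# within ALPHABET and wraps around with modulo instead of overflowing.
def caesar_encode_wrapped(text, shift)
  return '' if text.nil? || text.strip.empty?

  text.gsub(/[a-zA-Z]/) do |char|
    ALPHABET[(ALPHABET.index(char) + shift) % ALPHABET.size]
  end
end

puts caesar_encode_wrapped('Hello world!', 127)
```

Because of the modulo, negative shifts work too, so the same method can undo the encoding when called with the negated shift.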
I want to create a 32 bit string that I can use as encryption key. This string/key should be derived from a plain text string, e.g.:
'I am a string'
My approach would first be to hash it:
hashed_string = Digest::SHA1.hexdigest('I am a string') # => 'bd82fb0e81ee9f15f5929e0564093bc9f8015f1d'
And then to use just the first 32 characters:
hashed_string[0..31] # => 'bd82fb0e81ee9f15f5929e0564093bc9'
However, I feel there must be a better approach, and I'm not sure if I risk the possibility of 2 input strings yielding similar keys.
What would be a better approach? I have seen this post that touches on truncation, but can't find an answer that appeals to me there.
If you want a string with 32 bits out of your (weak) password :
require 'digest'
Digest::SHA1.digest('I am a string').unpack('B32').first
#=> "10111101100000101111101100001110"
The same amount of information can also be displayed with 8 hexadecimal digits :
Digest::SHA1.hexdigest('I am a string')[0,8]
#=> "bd82fb0e"
or 4 raw bytes (not all of them are printable ASCII):
Digest::SHA1.digest('I am a string')[0,4]
#=> "\xBD\x82\xFB\x0E"
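The three forms above are just different views of the same first 32 bits of the digest, which the following sketch checks. The last line is an assumption about the question rather than part of this answer: if a 32-byte key is what is actually needed, a SHA-256 digest happens to be exactly that long.

```ruby
require 'digest'

raw = Digest::SHA1.digest('I am a string')   # 20 raw bytes

bits  = raw.unpack('B32').first   # first 32 bits as "0"/"1" characters
hex   = raw.unpack('H8').first    # the same 32 bits as 8 hex digits
bytes = raw[0, 4]                 # the same 32 bits as 4 raw bytes

# All three views describe the same number.
puts bits.to_i(2) == hex.to_i(16)
puts bytes.unpack('N').first == hex.to_i(16)

# Assumption: a 32-byte key is wanted. SHA-256 yields 32 raw bytes.
key32 = Digest::SHA256.digest('I am a string')
puts key32.bytesize
```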
I am having a very difficult time with this:
# contained within:
"MA\u008EEIKIAI"
# should be
"MAŽEIKIAI"
# nature of string
$ p string3
"MA\u008EEIKIAI"
$ puts string3
MAEIKIAI
$ string3.inspect
"\"MA\\u008EEIKIAI\""
$ string3.bytes
#<Enumerator: "MA\u008EEIKIAI":bytes>
Any ideas on where to start?
Note: this is not a duplicate of my previous question.
\u008E means that the Unicode character with the codepoint 8e (in hex) appears at that point in the string. This character is the control character “SINGLE SHIFT TWO” (see the code chart (pdf)). The character Ž is at the codepoint U+017D. However, it is at position 8e in the Windows CP-1252 encoding. Somehow you’ve got your encodings mixed up.
The easiest way to “fix” this is probably just to open the file containing the string (or the database record or whatever) and edit it to be correct. The real solution will depend on where the string in question came from and how many bad strings you have.
Assuming the string is in UTF-8 encoding, \u008E will consist of the two bytes c2 and 8e. Note that the second byte, 8e, is the same as the encoding of Ž in CP-1252. One way to convert the string would be something like this:
string3.force_encoding('BINARY') # treat the string just as bytes for now
string3.gsub!(/\xC2/n, '') # remove the C2 byte
string3.force_encoding('CP1252') # give the string the correct encoding
string3.encode('UTF-8') # convert to the desired encoding
Note that this isn’t a general solution to fix all issues like this. Not all CP-1252 characters, when mangled and expressed in UTF-8 this way, will be amenable to conversion like this. Some will be two bytes c2 xx where xx is the correct byte (like in this case), others will be c3 yy where yy is a different byte.
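A variation that copes with both the c2 xx and the c3 yy case is to work on code points instead of bytes: each mangled character has a code point equal to the original CP-1252 byte. This is only a sketch under the assumption that every character in the string is at or below U+00FF; the method name repair_cp1252 is made up.

```ruby
# Hypothetical helper: treats every code point (assumed <= 0xFF) as the
# original CP-1252 byte, then transcodes those bytes to UTF-8.
def repair_cp1252(mangled)
  codepoints = mangled.codepoints
  if codepoints.any? { |cp| cp > 0xFF }
    raise ArgumentError, 'string contains a code point above U+00FF'
  end

  codepoints.pack('C*')                 # one byte per code point
            .force_encoding('Windows-1252')
            .encode('UTF-8')
end

puts repair_cp1252("MA\u008EEIKIAI")
```

Bytes that CP-1252 leaves undefined (such as 0x81) make encode raise Encoding::UndefinedConversionError, so input containing them still needs individual treatment.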
What about using Regexp & String#pack to convert the Unicode escape?
str = "MA\\u008EEIKIAI"
puts str #=> MA\u008EEIKIAI
str.gsub!(/\\u(.{4})/) do |match|
[$1.to_i(16)].pack('U')
end
puts str #=> MA EIKIAI
I am trying to convert a hex value to a binary value (each digit in the hex string should have an equivalent four-bit binary value). I was advised to use this:
num = "0ff" # (say for eg.)
bin = "%0#{num.size*4}b" % num.hex.to_i
This gives me the correct output 000011111111. I am confused with how this works, especially %0#{num.size*4}b. Could someone help me with this?
You can also do:
num = "0ff"
num.hex.to_s(2).rjust(num.size*4, '0')
You may have already figured this out, but num.size*4 is the number of digits the output is padded to with zeros, because one hexadecimal digit is represented by four (log_2 16 = 4) binary digits.
You'll find the answer in the documentation of Kernel#sprintf (as pointed out by the docs for String#%):
http://www.ruby-doc.org/core/classes/Kernel.html#M001433
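To see how the pieces fit together, it may help to build the format string in separate steps. The intermediate variable names (width, template) are only for illustration.

```ruby
num = "0ff"

width    = num.size * 4        # 3 hex digits * 4 bits each = 12
template = "%0#{width}b"       # string interpolation runs first: "%012b"

# "%012b" means: format as binary (b), at least 12 characters wide, padded with 0.
bin = template % num.hex       # String#hex already returns an Integer

puts template
puts bin
```

The extra to_i in the original line is harmless but unnecessary, since num.hex is already an Integer.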
This is the most straightforward solution I found to convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
This doesn't answer your original question, but I assume that many people who land here don't want to turn hexadecimal into literal "0s and 1s" output; they want to decode hexadecimal into a byte string (in the spirit of utilities such as hex2bin). Here is a good method for doing exactly that:
def hex_to_bin(hex)
# Prepend a '0' for padding if you don't have an even number of chars
hex = '0' << hex unless (hex.length % 2) == 0
hex.scan(/[A-Fa-f0-9]{2}/).inject('') { |encoded, byte| encoded << [byte].pack('H2') }
end
Getting back to hex again is much easier:
def bin_to_hex(bin)
bin.unpack('H*').first
end
Converting the string of hex digits back to binary is just as easy. Take the hex digits two at a time (since each byte can range from 00 to FF), convert the digits to a character, and join them back together.
def hex_to_bin(s)
  s.scan(/../).map { |x| x.hex.chr }.join
end
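As a cross-check, the same round trip can be sketched with pack and unpack1 alone. The method names hex_to_bytes and bytes_to_hex are hypothetical, and unpack1 assumes Ruby 2.4 or later.

```ruby
# Hypothetical helpers: hex digits -> raw bytes and back.
def hex_to_bytes(hex)
  hex = "0#{hex}" if hex.length.odd?   # pad to a whole number of bytes
  [hex].pack('H*')                     # two hex digits per byte, high nibble first
end

def bytes_to_hex(bytes)
  bytes.unpack1('H*')                  # always lower-case hex digits
end

raw = hex_to_bytes('DEADBEEF')
puts raw.bytes.inspect
puts bytes_to_hex(raw)
```

Note that the result of the round trip is lower case even if the input was upper case.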
I can't iterate over the entire range of unicode characters.
I searched everywhere...
I am building a fuzzer and want to embed into a url, all unicode characters (one at a time).
For example:
http://www.example.com?a=\uff1c
I know that there are some ready-made tools, but I need more flexibility.
If I could do something like the following: "\u" + "ff1c" it would be great.
This is the closest I got:
char = "\u0000"
...
#within iteration
char.succ!
...
but after the character "\u0039", which is the number 9, I will get "10" instead of ":"
You could use pack to convert numbers to UTF-8 characters, but I'm not sure if this solves your problem.
You can either create an array with the numeric values of all the characters and use pack to get a UTF-8 string, or you can just loop from 0 to whatever you need and use pack within the loop.
I've written a small example to explain myself. The code below prints out the hex value of each character followed by the character itself.
0.upto(100) do |i|
puts "%04x" % i + ": " + [i].pack("U*")
end
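Applied to the fuzzing use case from the question, the loop could look roughly like this. It is a sketch under a few assumptions: the helper name each_fuzz_url is made up, the parameter name a is taken from the example URL, the characters are percent-encoded with URI.encode_www_form_component, and the surrogate range U+D800..U+DFFF is skipped because those code points are not valid characters on their own.

```ruby
require 'uri'

SURROGATES = (0xD800..0xDFFF).freeze

# Hypothetical helper: yields one URL per code point, with the character
# percent-encoded as the value of the query parameter "a".
def each_fuzz_url(base, codepoints)
  return enum_for(:each_fuzz_url, base, codepoints) unless block_given?

  codepoints.each do |cp|
    next if SURROGATES.cover?(cp)

    char = [cp].pack('U')   # code point -> UTF-8 character
    yield "#{base}?a=#{URI.encode_www_form_component(char)}"
  end
end

# A small slice for demonstration; 0x0000..0x10FFFF would cover the full range.
each_fuzz_url('http://www.example.com', 0xFF1A..0xFF1C) { |url| puts url }
```

If the raw, unencoded character is wanted in the URL instead, the encoding call can simply be dropped.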
Here's some simpler code, albeit slightly obfuscated, that takes advantage of the fact that Ruby will convert an integer on the right-hand side of the << operator to a codepoint. In Ruby 1.8 this only works for integer values <= 255; from 1.9 on it also works for values greater than 255.
0.upto(100) do |i|
puts "" << i
end