Convert string with hex ASCII codes to characters

I have a string containing hex code values of ASCII characters, e.g. "666f6f626172". I want to convert it to the corresponding string ("foobar").
This is working but ugly:
"666f6f626172".scan(/../).map(&:hex).map(&:chr).join # => "foobar"
Is there a better (more concise) way? Could unpack be helpful somehow?

You can use Array#pack:
["666f6f626172"].pack('H*')
#=> "foobar"
H is the directive for a hex string (high nibble first).
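On the unpack part of the question: String#unpack with the same H* directive goes the other way, from a string back to its hex digits. A minimal sketch of the round trip follows; the helper names hex_to_str and str_to_hex are made up for illustration and are not part of Ruby.

```ruby
# Hypothetical helper names, not part of Ruby's API.

# Decode a string of hex digit pairs into the bytes they describe.
def hex_to_str(hex)
  [hex].pack("H*")
end

# Encode each byte of a string as two lowercase hex digits.
def str_to_hex(str)
  str.unpack("H*").first
end

puts hex_to_str("666f6f626172")  # foobar
puts str_to_hex("foobar")        # 666f6f626172
```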

Stefan has nailed it, but here's an alternative you may want to tuck away for another time and place:
"666f6f626172".gsub(/../) { |pair| pair.hex.chr } # => "foobar"

Related

Cannot convert ISO8859-1 to cyrillic in ruby

I have text "ÐÐ¾ÑÑÐ¸Ð½Ð°Ñ", and I want to convert it to cyrillic. 2cyr.com says that this is ISO8859-1 format. I tried
"ÐÐ¾ÑÑÐ¸Ð½Ð°Ñ".force_encoding("ISO8859-1").encode("UTF-8")
But it returned =>
"Ã\u0090Â\u0093Ã\u0090Â¾Ã\u0091Â\u0081Ã\u0091Â\u0082Ã\u0090Â¸Ã\u0090Â½Ã\u0090Â°Ã\u0091Â\u008F"
What should I do to make the final word be "Гостиная"?

It's the other way round. Your string is the result of:
str = "Гостиная".force_encoding('ISO8859-1').encode('UTF-8')
#=> "Ð\u0093Ð¾Ñ\u0081Ñ\u0082Ð¸Ð½Ð°Ñ\u008F"
puts str
#=> ÐÐ¾ÑÑÐ¸Ð½Ð°Ñ
To revert it, use:
str.encode('ISO8859-1').force_encoding('UTF-8')
#=> "Гостиная"
Of course, this only works if the malformed string is left intact (it contains several invisible / unprintable characters).
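To check the round trip without copying the damaged text by hand, the two steps can be wrapped in a pair of helpers. This is only a sketch: garble and repair are hypothetical names, and the source file is assumed to be UTF-8 (Ruby's default).

```ruby
# Hypothetical helper names for illustration.

# Reproduce the damage: treat UTF-8 bytes as ISO-8859-1, then transcode to UTF-8.
def garble(text)
  text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Undo it: transcode back to ISO-8859-1 bytes, then relabel those bytes as UTF-8.
def repair(garbled)
  garbled.encode("ISO-8859-1").force_encoding("UTF-8")
end

original = "Гостиная"
repaired = repair(garble(original))
puts repaired == original      # true
puts repaired.valid_encoding?  # true
```

If the damaged string has lost or gained characters along the way, repair may raise an encoding error or return a string for which valid_encoding? is false.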

Best you can do is switch the order of methods:
puts "ÐÐ¾ÑÑÐ¸Ð½Ð°Ñ".encode("CP1252")
#=> �о��ина�
Your string still contains broken chars, but that is likely to be inherent to your original string. Online tools like this one give the same result.

String#delete ignore special characters

String#delete interprets a-z as character range. However, I would like it to delete fa-zo.
"fojwfa-zowj".delete("fa-zo") #=> "-"
Desired result:
"fojwwj"

You could also use this little trick:
string = "fojwfa-zowj"
string[/fa-zo/] = ''
string
# => "fojwwj"
Note, however, that this modifies the string in place, like #sub! (only the first match is removed). That should be faster and use less memory, but it can introduce side effects if other code holds a reference to the same string.
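The side effect is easiest to see when two variables point at the same string. A small sketch, with hypothetical helper names, comparing a non-destructive removal with the in-place one:

```ruby
# Hypothetical helper names for illustration.

# Non-destructive: returns a new string and leaves the argument untouched.
def without_first(str, literal)
  str.sub(literal, "")
end

# Destructive: removes the first occurrence from the receiver itself.
# String#[]= raises IndexError when the literal does not occur.
def remove_first!(str, literal)
  str[literal] = ""
  str
end

shared = +"fojwfa-zowj"   # unary plus gives a mutable string
alias_ref = shared        # second reference to the same object

puts without_first(shared, "fa-zo")  # fojwwj
puts shared                          # fojwfa-zowj (unchanged)

remove_first!(alias_ref, "fa-zo")
puts shared                          # fojwwj (changed through the alias)
```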

"fojwfa-zowj".gsub("fa-zo","") # => "fojwwj"

"fojwfa-zowj".tap{ |s| s.slice! "fa-zo" } # just for the Heaven of it

Convert matched string of UTF-8 values to UTF-8 characters in Ruby

Trying to convert output from a rest_client GET to the characters that are represented with escape sequences.
Input: ..."sub_id":"\u0d9c\u8138\u8134\u3f30\u8139\u2b71"...
(which I put in 'all_subs')
Match: m = /sub_id\"\:\"([^\"]+)\"/.match(all_subs.to_str) [1]
Print: puts m.force_encoding("UTF-8").unpack('U*').pack('U*')
But it just comes out the same way I put it in, i.e. "\u0d9c\u8138\u8134\u3f30\u8139\u2b71"
However, if I convert a raw string of it:
puts "\u0d9c\u8138\u8134\u3f30\u8139\u2b71".unpack('U*').pack('U*')
The output is perfect as "ග脸脴㼰脹⭱"

What you're getting when you parse the input string is actually this:
m = "\\u0d9c\\u8138\\u8134\\u3f30\\u8139\\u2b71"
Which is not the same as:
"\u0d9c\u8138\u8134\u3f30\u8139\u2b71"
Therefore one option is to eval the string so that ruby applies the codepoints:
puts eval("\"#{m}\"")
=> ග脸脴㼰脹⭱
Note, however, that eval has security implications: it executes whatever code the string contains.
If the string always looks like your example, you could also do something like this, which is safe:
puts m.split("\\u")[1..-1].map { |c| c.to_i(16) }.pack("U*")
=> ග脸脴㼰脹⭱
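If the escapes can be mixed with ordinary text, a gsub over the literal \uXXXX pattern avoids both eval and the split. A minimal sketch, assuming every escape has exactly four hex digits; unescape_unicode is a hypothetical name:

```ruby
# Hypothetical helper name for illustration.

# Replace every literal \uXXXX escape (backslash, "u", four hex digits)
# with the character for that code point. Other text passes through.
def unescape_unicode(str)
  str.gsub(/\\u(\h{4})/) { [$1.hex].pack("U") }
end

escaped = '\u0d9c\u8138\u8134\u3f30\u8139\u2b71'  # single quotes keep the backslashes
puts unescape_unicode(escaped) == "\u0d9c\u8138\u8134\u3f30\u8139\u2b71"  # true
```

This sketch does not combine surrogate pairs. If the whole response is valid JSON, JSON.parse from the standard library decodes these escapes as part of parsing.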

How can I convert a string of codepoints to the string it represents?

I have a string (in Ruby) like this:
626c6168
(that is 'blah' without the quotes)
How do I convert it to 'blah'? Note that these are variable lengths, and also they aren't always letters and numbers. (They're being stored in a database, not being printed.)

Array#pack
['626c6168'].pack('H*')
# => "blah"

Using hex to convert each character:
"626c6168".scan(/../).map{ |c| c.hex.chr }.join
This gives blah.
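Since the values are not always letters and numbers, it may help to validate the input and to treat the result as raw bytes. A sketch under those assumptions; hex_to_bytes is a hypothetical name:

```ruby
# Hypothetical helper name for illustration.

# Decode hex digit pairs into a binary string, rejecting malformed input
# instead of letting pack quietly produce unexpected bytes.
def hex_to_bytes(hex)
  unless hex.match?(/\A(?:\h\h)*\z/)
    raise ArgumentError, "expected an even number of hex digits"
  end
  [hex].pack("H*")
end

blob = hex_to_bytes("626c6168")
puts blob                                # blah
puts blob.bytes.inspect                  # [98, 108, 97, 104]
puts blob.encoding == Encoding::BINARY   # true
```

The result is tagged as binary, so call force_encoding on it only if the bytes are known to be text in a particular encoding.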

Escape problem with hex

I need to print escaped characters to a binary file using Ruby. The main problem is that slashes need the whole byte to escape correctly, and I don't know/can't create the byte in such a way.
I am creating the hex value with, basically:
'\x' + char
Where char is some 'hex' value, such as 65. In hex, \x65 is the ASCII character 'e'.
Unfortunately, when I puts this sequence to the file, I end up with this:
\\x65
How do I create a hex string with the properly escaped value? I have tried a lot of things, involving single or double quotes, pack, unpack, multiple slashes, etc. I have tried so many different combinations that I feel as though I understand the problem less now than I did when I started.
How?

You may need to set binary mode on your file, and/or use putc.
File.open("foo.tmp", "w") do |f|
  f.set_encoding(Encoding::BINARY) # set_encoding requires Ruby 1.9 or later
  f.binmode # only makes a difference on Windows
  f.putc "65".hex # "65".hex is 101, so this writes the single byte 0x65 ("e")
end
Hopefully this can give you some ideas even if you have Ruby <1.9.

Okay, if you want to create a string whose first byte has the integer value 0x65, use Array#pack:
irb> [0x65].pack('U')
#=> "e"
irb> "e".ord # on Ruby 1.8, "e"[0] returns the same number
#=> 101
101 in decimal is 65 in hexadecimal, so this works.
If you want to create a literal string whose first byte is '\', second is 'x', third is '6', and fourth is '5', then just use interpolation:
irb> "\\x#{65}"
#=> "\\x65"
irb> "\\x65".split('')
#=> ["\\", "x", "6", "5"]

If you have the hex value and you want to create a string containing the character corresponding to that hex value, you can do:
irb(main):002:0> '65'.hex.chr
=> "e"
Another option is to use Array#pack; this can be used if you need to convert a list of numbers to a single string:
irb(main):003:0> ['65'.hex].pack("C")
=> "e"
irb(main):004:0> ['66', '6f', '6f'].map {|x| x.hex}.pack("C*")
=> "foo"
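The reason '\x' + char does not work is that escape sequences such as \x65 are interpreted only when Ruby parses a double-quoted literal, not when strings are joined at runtime. Building the bytes with pack and writing them in binary mode avoids escapes altogether. A minimal sketch; write_hex_bytes is a hypothetical name:

```ruby
# Hypothetical helper name for illustration.

# Turn hex codes such as "65" into raw bytes and write them to a file
# in binary mode. Returns the number of bytes written.
def write_hex_bytes(path, hex_codes)
  bytes = hex_codes.map(&:hex).pack("C*")
  File.binwrite(path, bytes)
end

# Usage (the file name is arbitrary):
#   write_hex_bytes("foo.tmp", %w[66 6f 6f])  # the file now holds the 3 bytes "foo"
```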
