Converting binary (or hex) into ASCII - ruby

I need to convert a large binary string (a sequence of bytes) into ASCII like this table. I can also start with a hex string.
I read this post: Converting binary data to string in ruby. I found a solution that converts to characters in the extended ASCII table. I could write conditionals for every case in order to convert, but there has to be an easier way. Can someone help?

The page you linked contains plain, unobfuscated JavaScript that performs the conversion:
function OnConvert()
{
  hex = document.calcform.hex.value;
  hex = hex.match(/[0-9A-Fa-f]{2}/g);
  len = hex.length;
  if( len==0 ) return;
  txt='';
  for(i=0; i<len; i++)
  {
    h = hex[i];
    code = parseInt(h,16);
    t = String.fromCharCode(code);
    txt += t;
  }
  document.calcform.txt.value = txt;
}
I did not fully understand your task: if you enter e.g. EEEFFA in the form there, you get îïú as output, which in my opinion is extended ASCII. In any case, there is a simple way to achieve the same functionality in Ruby.
▶ "EEEFFA".scan(/[0-9a-f]{2}/i).map { |cp| cp.to_i(16) }.inject('', &:concat)
#⇒ "îïú"
UPD: As I understand from the comments, you want to convert every group of eight zeros and ones to the corresponding ASCII character. Here you go (assuming you have a long string of zeros and ones):
▶ '010000010100001001000011'.
▷ scan(/[01]{8}/). # allow only zeros and ones, scan by 8
▷ map { |e| e.to_i 2 }. # parse each chunk as a base-2 integer
▷ inject '', &:concat # concatenate into one string
#⇒ 'ABC'
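For comparison, here is a minimal sketch of the same conversion using Array#pack, which may be convenient for large inputs. The variable names are only illustrative, and note that pack returns a binary (ASCII-8BIT) string, so you may need force_encoding if you want another encoding.

```ruby
# 'B*' reads a string of bits (most significant bit first) and emits raw bytes.
bits = '010000010100001001000011'
from_bits = [bits].pack('B*')

# 'H*' does the same for a string of hex digits (high nibble first).
hex = '414243'
from_hex = [hex].pack('H*')

puts from_bits   # ABC
puts from_hex    # ABC
```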

A slight variation on @mudasobwa's excellent solution, using an (apparently undocumented) feature of String#oct:
'010000010100001001000011'
.scan(/0[01]{7}/)
.map { |b| b.prepend('0b').oct.chr }
.join
And hex, for completeness:
'627e29397c5727611147503e36355a4f683737'
.scan(/[0-7]\h/)
.map { |x| x.prepend('0x').oct.chr }
.join
I've opened a bug report at ruby-lang if anybody is interested...

Related

Convert digits in string to ints and then back to string

Say I have a string: formula = "C3H12O4"
How can I convert the digit chars in the string to ints?
My end goal is to do something along the lines of:
formula * 4
Once the digit chars have been converted to ints and multiplied, the result should be converted back to a string, thus outputting:
"C12H48O16"
formula = "C3H12O4"
Code
p formula.gsub(/\d+/) { |x| x.to_i * 4 }
output
"C12H48O16"
If you had many conversions to do it might be worthwhile to include the following in a benchmark of different methods:
h = (0..9).each_with_object({}) { |n,h| h[n.to_s] = (4*n).to_s }
#=> {"0"=>"0", "1"=>"4", "2"=>"8", "3"=>"12", "4"=>"16",
# "5"=>"20", "6"=>"24", "7"=>"28", "8"=>"32", "9"=>"36"}
Then for each string of interest the following calculation would be performed:
"C3H12O4".gsub(/\d/, h)
#=> "C12H48O16"
"99Ra$32".gsub(/\d/, h)
#=> "3636Ra$128"
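This uses the form of String#gsub that employs a hash to make the substitutions. Note that each digit is replaced on its own, so the result equals multiplying the whole number by 4 only when no digit after the first produces a two-digit product (99 becomes 3636 above, not 396).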
A variant of this is the following.
"C3H12O4".gsub(/./) { |c| h.fetch(c, c) }
#=> "C12H48O16"
Here gsub matches every character, which it passes to the block to be held by the block variable c. Hash#fetch is then used to look up and return h[c], provided h has a key c. If h does not have a key c, fetch's second argument (c) is returned.
The use of the hash avoids the need to convert back and forth between integers and strings, except in the creation of the hash, of course, but that is done only once.
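To make the difference between per-digit and whole-number substitution concrete, here is a small sketch. The input "C13H5" is a made-up example, not a real formula from the question.

```ruby
# Per-digit lookup table, built the same way as in the answer above.
h = (0..9).each_with_object({}) { |n, acc| acc[n.to_s] = (4 * n).to_s }

# Each digit is replaced independently: "1" -> "4", "3" -> "12", "5" -> "20".
per_digit = "C13H5".gsub(/\d/, h)

# Each run of digits is parsed as one number and multiplied: 13 -> 52, 5 -> 20.
per_number = "C13H5".gsub(/\d+/) { |x| (x.to_i * 4).to_s }

puts per_digit    # C412H20
puts per_number   # C52H20
```

If the counts in your formulas can contain digits above 2 after the first position, the block form with /\d+/ is probably the safer choice.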

How to store a list of small numbers in Postgres

I have a long list of small numbers, all of them < 16 but there can be more than 10000 of them in a unique list.
I get the values as a comma separated list, like:
6,12,10,2,2,2,6,12,8,2,2,6,10,2,4,12,14,10,2, .... lots and lots of numbers
And finally I need to store the values in a database in the most efficient way in order to be read back and processed again ... as a string, comma separated values.
I was thinking of sort of storing them in a big TEXT field ... however I find that adding all the commas in there would be a waste of space.
I am wondering if there is any best practice for this scenario.
For more technical details:
for Database I have to use Postgres (and I am sort of beginner in this field), and the programming language is Ruby (also beginner :) )
For a fast and reasonably space-efficient solution, you could simply write a hexadecimal string:
string = '6,12,10,2,2,2,6,12,8,2,2,6,10,2,4,12,14,10,2'
p string.split(',').map { |v| v.to_i.to_s(16) }.join
# "6ca2226c8226a24cea2"
p '6ca2226c8226a24cea2'.each_char.map { |c| c.to_i(16) }.join(',')
# "6,12,10,2,2,2,6,12,8,2,2,6,10,2,4,12,14,10,2"
The advantage is that it is easily readable by any DB and any program.
It also works if there are leading 0s in the list: "0,0,6".
If you have an even number of elements, you could pack two hex digits into one byte to halve the string length.
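A minimal sketch of that packing step, assuming an even number of values (the last value of the example list is dropped here purely for illustration) and assuming the result would be stored in a binary column such as Postgres bytea rather than TEXT:

```ruby
hex = '6ca2226c8226a24cea'          # 18 hex digits, one per stored value
packed = [hex].pack('H*')           # raw bytes, two values per byte
restored = packed.unpack1('H*')     # back to the hex string

puts packed.bytesize                # 9
puts restored == hex                # true
```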
numbers = "6,12,10,2,2,2,6,12,8,2,2,6,10,2,4,12,14,10,2"
numbers.split(',')
.map { |n| n.to_i.to_s(2).rjust(4, '0') }
.join
.to_i(2)
.to_s(36)
#⇒ "57ymwcgbl1umt2a"
"57ymwcgbl1umt2a".to_i(36)
.to_s(2)
.tap { |e| e.prepend('0') until (e.length % 4).zero? }
.scan(/.{4}/)
.map { |e| e.to_i(2).to_s }
.join(',')
#⇒ "6,12,10,2,2,2,6,12,8,2,2,6,10,2,4,12,14,10,2"

How to convert bytes in number into a string of characters? (character representation of a number)

How do I easily convert a number, e.g. 0x616263, equivalently 6382179 in base 10, into a string by dividing the number up into sequential bytes? So the example above should convert into 'abc'.
I've experimented with Array#pack but can't figure out how to get it to convert more than one byte in the number, e.g. [0x616263].pack("C*") returns 'c'.
I've also tried 0x616263.to_s(256), but that throws an ArgumentError: invalid radix. I guess it needs some sort of encoding information?
(Note: Other datatypes in pack like N work with the example I've given above, but only because it fits within 4 bytes, so e.g. [0x616263646566].pack("N") gives cdef, not abcdef)
This question is vaguely similar to this one, but not really. Also, I sort of figured out how to get the hex representation string from a character string using "abcde".unpack("c*").map{|c| c.to_s(16)}.join(""), which gives '6162636465'. I basically want to go backwards.
I don't think this is an X-Y problem, but in case it is - I'm trying to convert a number I've decoded with RSA into a character string.
Thanks for any help. I'm not too experienced with Ruby. I'd also be interested in a Python solution (for fun), but I don't know if it's right to add tags for two separate programming languages to this question.
To convert a single number 0x00616263 into 3 characters, what you really need to do first is separate them into three numbers: 0x00000061, 0x00000062, and 0x00000063.
For the last number, the hex digits you want are already in the correct place. But for the other two, you have to do a bitshift using >> 16 and >> 8 respectively.
Afterwards, use a bitwise and to get rid of the other digits:
num1 = (0x616263 >> 16) & 0xFF
num2 = (0x616263 >> 8) & 0xFF
num3 = 0x616263 & 0xFF
For the characters, you could then do:
char1 = ((0x616263 >> 16) & 0xFF).chr
char2 = ((0x616263 >> 8) & 0xFF).chr
char3 = (0x616263 & 0xFF).chr
Of course, bitwise operations aren't very Ruby-esque. There are probably more Ruby-like answers that someone else might provide.
64 bit integers
If your number is smaller than 2**64 (8 bytes), you can:
convert the "big-endian unsigned long long" to 8 bytes
remove the leading zero bytes
Ruby
[0x616263].pack('Q>').sub(/\A\x00+/, '') # \A anchors the match so only leading zero bytes are removed
# "abc"
[0x616263646566].pack('Q>').sub(/\A\x00+/, '')
# "abcdef"
Python 2 & 3
In Python, pack returns bytes, not a string. You can use decode() to convert bytes to a string:
import struct
import re
print(re.sub('\x00', '', struct.pack(">Q", 0x616263646566).decode()))
# abcdef
print(re.sub('\x00', '', struct.pack(">Q", 0x616263).decode()))
# abc
Large numbers
With gsub
If your number doesn't fit in 8 bytes, you could use a modified version of your code. This is shorter and outputs the string correctly if the first byte is smaller than 0x10 (e.g. for "\t"), in which case the hex string has an odd number of digits:
def decode(int)
  if int < 2**64
    # \A anchors the match so only leading zero bytes are removed
    [int].pack('Q>').sub(/\A\x00+/, '')
  else
    nhex = int.to_s(16)
    nhex = '0' + nhex if nhex.size.odd?
    nhex.gsub(/../) { |hh| hh.to_i(16).chr }
  end
end
puts decode(0x616263) == 'abc'
# true
puts decode(0x616263646566) == 'abcdef'
# true
puts decode(0x0961) == "\ta"
# true
puts decode(0x546869732073656e74656e63652069732077617920746f6f206c6f6e6720666f7220616e20496e743634)
# This sentence is way too long for an Int64
By the way, here's the reverse method:
def encode(str)
  str.reverse.each_byte.with_index.map { |b, i| b * 256**i }.inject(:+)
end
You should still check whether your RSA code really outputs arbitrarily large numbers or just an array of integers.
With shifts
Here's another way to get the result. It's similar to @Nathan's answer, but it works for any integer size:
def decode(int)
  a = []
  while int > 0
    a << (int & 0xFF)
    int >>= 8
  end
  a.reverse.pack('C*')
end
According to fruity, it's twice as fast as the gsub solution.
I'm currently rolling with this:
n = 0x616263
nhex = n.to_s(16)
nhexarr = nhex.scan(/.{1,2}/)
nhexarr = nhexarr.map {|e| e.to_i(16)}
out = nhexarr.pack("C*")
But was hoping for a concise/built-in way to do this, so I'll leave this answer unaccepted for now.
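One more concise option, offered as a sketch rather than a definitive answer: Integer#digits (available since Ruby 2.4) accepts a base, so base 256 splits a non-negative integer into its bytes. The method name decode_with_digits is made up for this example.

```ruby
# Integer#digits(base) returns the digits least-significant first,
# so base 256 yields the bytes in reverse order.
def decode_with_digits(int)
  int.digits(256).reverse.pack('C*')
end

puts decode_with_digits(0x616263)        # abc
puts decode_with_digits(0x616263646566)  # abcdef
```

Like the other approaches it has no record of leading zero bytes, and digits raises an error for negative integers.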

Converting a hexadecimal number to binary in ruby

I am trying to convert a hex value to a binary value (each digit in the hex string should have an equivalent four-bit binary value). I was advised to use this:
num = "0ff" # (say for eg.)
bin = "%0#{num.size*4}b" % num.hex.to_i
This gives me the correct output 000011111111. I am confused with how this works, especially %0#{num.size*4}b. Could someone help me with this?
You can also do:
num = "0ff"
num.hex.to_s(2).rjust(num.size*4, '0')
As you may have already figured out, num.size*4 is the number of digits to pad the output to with zeros, because one hexadecimal digit is represented by four (log_2 16 = 4) binary digits.
You'll find the answer in the documentation of Kernel#sprintf (as pointed out by the docs for String#%):
http://www.ruby-doc.org/core/classes/Kernel.html#M001433
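To spell out how the format string is built, here is a short step-by-step sketch (the intermediate variable names width and template are only for illustration). The #{...} interpolation runs first and produces an ordinary format string; %, 0, 12 and b then mean "format the argument, pad with zeros, to a width of 12, as binary".

```ruby
num = "0ff"
width = num.size * 4              # 3 hex digits -> 12 bits
template = "%0#{width}b"          # interpolation yields "%012b"
bin = template % num.hex          # zero-padded binary, 12 characters wide

puts template   # %012b
puts bin        # 000011111111
```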
This is the most straightforward solution I found to convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
This doesn't answer your original question, but I assume many people arriving here don't want to turn hexadecimal into literal "0s and 1s" binary output; they want to decode hexadecimal into a byte string (in the spirit of utilities such as hex2bin). Here is a good method for doing exactly that:
def hex_to_bin(hex)
  # Prepend a '0' for padding if you don't have an even number of chars
  hex = '0' << hex unless (hex.length % 2) == 0
  hex.scan(/[A-Fa-f0-9]{2}/).inject('') { |encoded, byte| encoded << [byte].pack('H2') }
end
Getting back to hex again is much easier:
def bin_to_hex(bin)
  bin.unpack('H*').first
end
Converting the string of hex digits back to binary is just as easy. Take the hex digits two at a time (since each byte can range from 00 to FF), convert the digits to a character, and join them back together.
def hex_to_bin(s)
  s.scan(/../).map { |x| x.hex.chr }.join
end

How can I output leading zeros in Ruby?

I'm outputting a set of numbered files from a Ruby script. The numbers come from incrementing a counter, but to make them sort nicely in the directory, I'd like to use leading zeros in the filenames. In other words
file_001...
instead of
file_1
Is there a simple way to add leading zeros when converting a number to a string? (I know I can do "if less than 10.... if less than 100").
Use the % operator with a string:
irb(main):001:0> "%03d" % 5
=> "005"
The left-hand-side is a printf format string, and the right-hand side can be a list of values, so you could do something like:
irb(main):002:0> filename = "%s/%s.%04d.txt" % ["dirname", "filename", 23]
=> "dirname/filename.0023.txt"
Here's a printf format cheat sheet you might find useful in forming your format string. The printf format originally comes from the C function printf, but similar formatting functions are available in Perl, Ruby, Python, Java, PHP, etc.
If the maximum number of digits in the counter is known (e.g., n = 3 for counters 1..876), you can do
str = "file_" + i.to_s.rjust(n, "0")
Can't you just use string formatting of the value before you concat the filename?
"%03d" % number
Use String#next as the counter.
>> n = "000"
>> 3.times { puts "file_#{n.next!}" }
file_001
file_002
file_003
next is relatively 'clever', meaning you can even go for
>> n = "file_000"
>> 3.times { puts n.next! }
file_001
file_002
file_003
As stated by the other answers, "%03d" % number works pretty well, but it goes against the RuboCop Ruby style guide:
Favor the use of sprintf and its alias format over the fairly cryptic String#% method
We can obtain the same result in a more readable way using the following:
format('%03d', number)
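Applied to the original filename problem, a small sketch might look like this (the counter range 1..3 and the file_ prefix are just example values):

```ruby
# Build zero-padded filenames with format; the literal prefix can live in the format string.
names = (1..3).map { |i| format('file_%03d', i) }
puts names
# file_001
# file_002
# file_003

# The width is a minimum, not a limit: longer numbers are not truncated.
wide = format('%03d', 1234)
puts wide   # 1234
```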
filenames = '000'.upto('100').map { |index| "file_#{index}" }
Outputs
[file_000, file_001, file_002, file_003, ..., file_098, file_099, file_100]
