Ruby: What does unpack("C") actually do? - ruby

From the docs, unpack does:
Decodes str (which may contain binary data) according to the format
string, returning an array of each value extracted.
And the "C" format means 8-bit unsigned (unsigned char).
But what does this actually end up doing to the string I input? What does the result mean, and if I had to do it by hand, how would I go about doing that?

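It converts each successive byte to its integer value. For single-byte characters this is the same value that String#ord returns. That said,
string.unpack 'C*'
is an exact equivalent of
string.bytes
and matches string.each_char.map(&:ord) only when every character occupies a single byte.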

But what does this actually end up doing to the string I input
It doesn't do anything to the input. And the input is not really a string here. It's typed as a string, but it is really a buffer of binary data, such as you might receive over a network, and your goal is to extract that data into an array of integers. Example:
s = "\01\00\02\03"
arr = s.unpack("C*")
p(arr) # [1,0,2,3]
That "string" would be meaningless as a string of text, but it is quite viable as a data buffer. Unpacking it allows you to examine the data.
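To see what doing it by hand might look like, here is a minimal sketch. The sample string and the variable names are made up for illustration. It walks the bytes one at a time, compares the result with unpack("C*"), and shows where a character-based version diverges once a character takes more than one byte.

```ruby
# Illustrative sample: "hé" in UTF-8 ('h' is 1 byte, 'é' is 2 bytes).
text = "h\u00E9"

by_unpack = text.unpack("C*")

# "By hand": visit every byte and record it as an unsigned 8-bit integer.
by_hand = []
text.each_byte { |byte| by_hand << byte }

# Character-based version: one code point per character, not one per byte.
by_ord = text.each_char.map(&:ord)

p by_unpack  # => [104, 195, 169]
p by_hand    # => [104, 195, 169]
p by_ord     # => [104, 233]
```

For pure ASCII or binary data all three arrays are identical; they only differ for multibyte encodings.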

Related

string size limit input cin.get() and getline()

In this project the user can type in a text (maximum 140 characters).
To enforce this limit I first used getline():
string text;
getline(cin, text);
text = text.substr(1, 140);
but in this case the result of cout << text << endl; is an empty string.
so I used cin.get() like:
cin.get(text, 140);
this time I get this error: no matching function for call to ‘std::basic_istream::get(std::__cxx11::string&, int)’
Note that I have included <iostream>.
So the question is: why is this happening, and how can I fix it?
Your first approach is sound with one correction - you need to use
text = text.substr(0, 140);
instead of text = text.substr(1, 140);. Containers (including strings) in C/C++ are indexed from 0, so starting at position 1 drops the first character. That call will not necessarily crash the program, but it will not produce the desired output either.
According to this source, substr throws an out_of_range exception if the starting position is larger than the string length. For a one-character string, position 1 equals the string length, which is still valid: the call returns an empty string, and this is well-defined behavior. For an empty string, however, position 1 exceeds the length and the call throws. I recommend you test it yourself in the interactive coding section following the link above.
Your second approach tried to pass a string to a function that expects a C-style character array. Again, more can be found here. As the error says, the compiler couldn't find a matching function because the argument was a std::string and not a char array, and std::string does not convert implicitly to a writable char buffer. You could convert the string to a char array yourself, as for instance described in this post, but the first approach is much more in line with C++ practices.
Last note - currently you're only reading a single line of input, I assume you will want to change that.

The code always outputs "not"

The following code always outputs "not":
print "input a number please. "
TestNumber = gets
if TestNumber % 2 == 0
print "The number is even"
else
print "The number is not even"
end
What is going wrong with my code?
The gets() method returns an object of type String.
When you call %() on a String object, the return value is a new String object (usually it changes the text. You can read more about string formatting here).
Since there are no String objects that == 0, the if/else will always take the same path.
If you want to use the return value of gets() like a number, you will need to transform it into one first. The simplest approach is probably to use the to_i() method on String objects, which returns a new 'Integer' object. If you're doing something where the user input will not always be an integer (e.g. 3.14 or 1.5), you might need to use a different approach.
One last thing: in your example the result of gets() is saved into a constant called TestNumber. Constants are different from normal variables, and they will probably cause problems if you're not using them intentionally. Normal variables don't start with capital letters. (You can read more about ruby variables here). In Ruby, write your variable names like this: test_number.
I suspect your TestNumber variable is being interpreted as a string during the operation. Make sure it is converted to an integer first: even if you enter, say, 100, it may be interpreted as the string "100" and not the integer 100.
A similar issue can be found here: Ruby Modulo Division
You have to convert TestNumber from a string to an integer: gets returns a String, and that string also contains the trailing linefeed.
Use TestNumber = gets.to_i to convert to integer before testing.
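The difference between the two % operators can be checked without typing anything in. The following is a minimal sketch: parity_message is a hypothetical helper, and the literal "4\n" stands in for what gets would return.

```ruby
# Hypothetical helper: takes the raw line that gets would return.
def parity_message(raw_line)
  test_number = raw_line.to_i   # "4\n" becomes 4; the trailing newline is ignored
  if test_number % 2 == 0
    "The number is even"
  else
    "The number is not even"
  end
end

raw_line = "4\n"                # stand-in for the value returned by gets

# String#% treats the receiver as a format string, so the result is a String.
p(raw_line % 2)                 # => "4\n"
p((raw_line % 2) == 0)          # => false

puts parity_message(raw_line)   # prints "The number is even"
puts parity_message("7\n")      # prints "The number is not even"
```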

Visual Works smalltalk, how to convert Ascii values to characters

Using VisualWorks Smalltalk, I'm receiving a string like '31323334' from a network connection.
I need a string that reads '1234' so I need a way of extracting two characters at a time, converting them to what they represent in ascii, and then building a string of them...
Is there a way to do so?
EDIT(7/24): for some reason many of you are assuming I will only be working with numbers and could just truncate 3s or read every other char. This is not the case, examples of strings read could include any keys on the US standard keyboard (a-z, A-Z,0-9,punctuation/annotation such as {}*&^%$...)
Following along the lines of what Max started to suggest:
x := '31323334'.
in := ReadStream on: x.
out := WriteStream on: String new.
[ in atEnd ] whileFalse: [ out nextPut: (in next digitValue * 16 + (in next digitValue)) asCharacter ].
newX := out contents.
newX will have the result '1234'. Or, if you start with:
x := '454647'
You will get a result of 'EFG'.
Note that digitValue might only recognize upper case hex digits, so an asUppercase may be needed on the string before processing.
There is usually a #fold: or #reduce: method that will let you do that. In Pharo there's also a message #allPairsDo: and #groupsOf:atATimeCollect:. Using one of these methods you could do:
| collectionOfBytes |
collectionOfBytes := '9798'
groupsOf: 2
atATimeCollect: [ :group |
(group first digitValue * 10) + (group second digitValue) ].
collectionOfBytes asByteArray asString "--> 'ab'"
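The #digitValue message in Pharo simply returns the value of the digit for numerical characters. Note that the examples here read each pair as a decimal number (hence the multiplication by 10); for hex pairs such as '31323334', multiply by 16 instead.
If you're receiving the data on a stream you could replace #groupsOf:atATimeCollect: with a loop (result may be any collection that you then convert to a string like above):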
...
[ stream atEnd ] whileFalse: [
result add: (stream next digitValue * 10) + (stream next digitValue) ]
...
In Smalltalk/X, there is a method called "fromHexString:" which the ByteArray class understands. I am not sure, but I think that something similar exists in other Smalltalk dialects.
If present, you can solve this with:
(ByteArray fromHexString:'68656C6C6F31323334') asString
and the reverse would be:
'hello1234' asByteArray hexPrintString
Another possible solution is to read the string as a hex number,
fetch the digitBytes (which should give you a byte array) and then convert that to a string.
I.e.
(Integer readFrom:'68656C6C6F31323334' radix:16)
digitBytes asString
One problem with that is that I am not sure which byte order you will get the digitBytes in (LSB or MSB), or whether that is defined to be the same across architectures or converted at image loading time to use the native order. So it may be necessary to reverse the string at the end (to be portable, it may even be necessary to reverse it conditionally, depending on the endianness of the system).
I cannot test this on VisualWorks, but I assume it should work fine there, too.

How do I convert hex to binary (and vice versa) in Ruby, WHILE maintaining leading zeroes?

I have a data structure that I'd like to convert back and forth from hex to binary in Ruby. The simplest approach for a binary to hex is '0010'.to_i(2).to_s(16) - unfortunately this does not preserve leading zeroes (due to the to_i call), as one may need with data structures like cryptographic keys (which also vary with the number of leading zeroes).
Is there an easy built in way to do this?
I think you should have a firm idea of how many bits are in your cryptographic key. That should be stored in some constant or variable in your program, not inside individual strings representing the key:
KEY_BITS = 16
The most natural way to represent a key is as an integer, so if you receive a key in a hex format you can convert it like this (leading zeros in the string do not matter):
key = 'a0a0'.to_i(16)
If you receive a key in a (ASCII) binary format, you can convert it like this (leading zeros in the string do not matter):
key = '101011'.to_i(2)
If you need to output a key in hex with the right number of leading zeros:
key.to_s(16).rjust((KEY_BITS+3)/4, '0')
If you need to output a key in binary with the right number of leading zeros:
key.to_s(2).rjust(KEY_BITS, '0')
If you really do want to figure out how many bits might be in a key based on a (ASCII) binary or hex string, you can do:
key_bits = binary_str.length
key_bits = hex_str.length * 4
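Put together, the integer-based approach might look like the sketch below. The helper names key_to_hex and key_to_bin are hypothetical, and the 16-bit width is just the example value used above.

```ruby
KEY_BITS = 16  # assumed key width for this example

# Hex output: one hex digit per 4 bits, rounded up.
def key_to_hex(key, bits = KEY_BITS)
  key.to_s(16).rjust((bits + 3) / 4, "0")
end

# Binary output: exactly `bits` digits.
def key_to_bin(key, bits = KEY_BITS)
  key.to_s(2).rjust(bits, "0")
end

key = "00a0".to_i(16)   # the leading zeros are not part of the Integer (160)
puts key_to_hex(key)    # prints "00a0"
puts key_to_bin(key)    # prints "0000000010100000"
```

The padding is derived from the known key width, not from the input string, so the output length stays the same however many leading zeros the key happens to have.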
The truth is, leading zeros are not part of the integer value. I mean, it's a little detail related to representation of this value, not the value itself. So if you want to preserve properties of representation, it may be best not to get to underlying values at all.
Luckily, hex<->binary conversion has one neat property: each hexadecimal digit corresponds to exactly 4 binary digits. So assuming you only get binary numbers whose number of digits is divisible by 4, you can just construct two dictionaries for converting back and forth:
# Hexadecimal part is easy
hex = [*'0'..'9', *'A'..'F']
# Binary... not much longer, but a bit trickier
bin = (0..15).map { |i| '%04b' % i }
Note the use of String#% operator, that formats the given value interpreting the string as printf-style format string.
Okay, so these are lists of "digits", 16 each. Now for the dictionaries:
hex2bin = hex.zip(bin).to_h
bin2hex = bin.zip(hex).to_h
Converting hex to bin with these is straightforward:
"DEADBEEF".each_char.map { |d| hex2bin[d] }.join
Converting back is not that trivial. I assume we have a "good number" that can be split into groups of 4 binary digits each. I haven't found a cleaner way than using String#scan with a "match every 4 characters" regex:
"10111110".scan(/.{4}/).map { |d| bin2hex[d] }.join
The procedure is mostly similar.
Bonus task: implement the same conversion without my assumption of having only "good binary numbers", e.g. for "110101".
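The lookup tables above, together with one possible take on the bonus task, can be sketched as follows. The method names hex_to_bits and bits_to_hex are hypothetical, and left-padding the binary string with zeros up to a multiple of 4 is an assumption about what the "right" answer for uneven input should be.

```ruby
HEX_DIGITS = [*"0".."9", *"A".."F"]
BIN_GROUPS = (0..15).map { |i| "%04b" % i }

HEX2BIN = HEX_DIGITS.zip(BIN_GROUPS).to_h
BIN2HEX = BIN_GROUPS.zip(HEX_DIGITS).to_h

# Each hex digit maps to exactly 4 binary digits, so leading zeros survive.
def hex_to_bits(hex)
  hex.upcase.each_char.map { |digit| HEX2BIN.fetch(digit) }.join
end

# Assumption: input whose length is not a multiple of 4 is left-padded with zeros.
def bits_to_hex(bits)
  padded_length = (bits.length + 3) / 4 * 4
  padded = bits.rjust(padded_length, "0")
  padded.scan(/.{4}/).map { |group| BIN2HEX.fetch(group) }.join
end

puts hex_to_bits("0F")        # prints "00001111"
puts bits_to_hex("10111110")  # prints "BE"
puts bits_to_hex("110101")    # prints "35"
```

Using fetch instead of [] makes an invalid digit raise a KeyError rather than silently dropping a nil into the result.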
"I-should-have-read-the-docs" remark: there is Hash#invert that returns a hash with all key-value pairs inverted.
This is the most straightforward solution I found that preserves leading zeros. To convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
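A quick check of the leading-zero behaviour, as a minimal sketch with made-up sample values. It also shows one caveat worth knowing: with an odd number of hex digits, pack('H*') zero-fills the low nibble of the last byte, so the padding ends up at the end and not at the front.

```ruby
# Hex -> binary keeps the leading zero byte.
bits = ["00FF"].pack("H*").unpack1("B*")
puts bits                     # prints "0000000011111111"

# Binary -> hex keeps it as well (output is lowercase).
hex = [bits].pack("B*").unpack1("H*")
puts hex                      # prints "00ff"

# Odd number of hex digits: the missing low nibble is filled with zeros.
odd_bits = ["ABC"].pack("H*").unpack1("B*")
puts odd_bits                 # prints "1010101111000000"
```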

Converting a hexadecimal number to binary in ruby

I am trying to convert a hex value to a binary value (each digit in the hex string should have an equivalent four-bit binary value). I was advised to use this:
num = "0ff" # (say for eg.)
bin = "%0#{num.size*4}b" % num.hex.to_i
This gives me the correct output 000011111111. I am confused with how this works, especially %0#{num.size*4}b. Could someone help me with this?
You can also do:
num = "0ff"
num.hex.to_s(2).rjust(num.size*4, '0')
You may have already figured out, but, num.size*4 is the number of digits that you want to pad the output up to with 0 because one hexadecimal digit is represented by four (log_2 16 = 4) binary digits.
You'll find the answer in the documentation of Kernel#sprintf (as pointed out by the docs for String#%):
http://www.ruby-doc.org/core/classes/Kernel.html#M001433
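Taking the expression apart step by step may help. This is only an illustrative breakdown; the variable names width and format_string are made up here.

```ruby
num = "0ff"

width = num.size * 4             # 3 hex digits * 4 bits each = 12
format_string = "%0#{width}b"    # string interpolation runs first: "%012b"

# %b     -> format the argument as binary
# 12     -> minimum field width of 12 characters
# 0 flag -> pad with zeros instead of spaces
bin = format_string % num.hex    # num.hex is the Integer 255

puts format_string               # prints "%012b"
puts bin                         # prints "000011111111"
puts format("%012b", 255)        # same result through Kernel#format
```

Since String#hex already returns an Integer, the extra .to_i in the original expression has no effect.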
This is the most straightforward solution I found to convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
This doesn't answer your original question, but I assume a lot of people coming here want to decode hexadecimal into a byte string (in the spirit of utilities such as hex2bin) rather than turn it into literal "0s and 1s" binary output. Here is a good method for doing exactly that:
def hex_to_bin(hex)
  # Prepend a '0' for padding if you don't have an even number of chars
  # (use + so that neither the literal nor the caller's string is mutated)
  hex = '0' + hex unless (hex.length % 2) == 0
  # Pack each pair of hex digits into one byte and append it to the result
  hex.scan(/[A-Fa-f0-9]{2}/).inject(String.new) { |encoded, byte| encoded << [byte].pack('H2') }
end
Getting back to hex again is much easier:
def bin_to_hex(bin)
  bin.unpack('H*').first
end
Converting the string of hex digits back to binary is just as easy. Take the hex digits two at a time (since each byte can range from 00 to FF), convert the digits to a character, and join them back together.
def hex_to_bin(s)
  s.scan(/../).map { |x| x.hex.chr }.join
end
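For comparison, the same round trip can also be written with a single pack call. This is a minimal sketch; hex_to_bytes and bytes_to_hex are hypothetical names chosen to avoid clashing with the methods above, and padding odd-length input with a leading zero is an assumption carried over from the earlier answer.

```ruby
# Decode a string of hex digits into a byte string.
def hex_to_bytes(hex)
  hex = "0#{hex}" if hex.length.odd?   # assumed policy: pad at the front
  [hex].pack("H*")
end

# Encode a byte string as lowercase hex digits.
def bytes_to_hex(bytes)
  bytes.unpack1("H*")
end

puts hex_to_bytes("48656c6c6f")            # prints "Hello"
puts bytes_to_hex("Hello")                 # prints "48656c6c6f"
puts bytes_to_hex(hex_to_bytes("f00"))     # prints "0f00"
```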
