Why does 41 == 0041 evaluate to False in Python? - python-2.x

I am trying to retrieve images from a folder based on indices taken from a list. The image filenames follow the format frame%04d, whereas the list consists of positive integers. When I evaluate 41 == 0041, the result is False. What is the correct procedure?

In Python 3, an integer literal with leading zeros is not even allowed:
>>> foo = 0041
SyntaxError: invalid token
In Python 2, as DeepSpace pointed out, a leading zero marks an octal literal, so 0041 is 33 in decimal and 0081 is invalid because 8 is not an octal digit:
>>> 0041
33
>>> 0081
SyntaxError: invalid token
Therefore, if you are comparing strings, strip the leading zeros before comparing:
>>> "41" == "0041".lstrip("0")
True
Alternatively, int("0041") parses the string as decimal and returns 41.

Related

String contains NUL bytes

I'm trying to decode this file, which is in IBM437, into readable UTF-8. I think I've almost got it, but I'm getting an ArgumentError because the string contains NUL bytes. I know how to remove NUL bytes using
.gsub("\u0000", ''), but I can't figure out where to apply the gsub.
Here's the source:
def gather_info
  file = './lib/SETI_message.txt'
  File.read(file).each_line do |gather|
    packed = [gather].pack('b*')
    ec = Encoding::Converter.new(packed, 'utf-8')
    encoding_forced = packed.encode(ec)
    File.open('packed.txt', 'a+'){ |s| s.puts(encoding_forced.gsub("\u0000", '')) }
  end
end
gather_info
And here's the file
Can anyone tell me what I'm doing wrong here?
The following works for me:
file = File.read('SETI.txt')
packed = file.scan(/......../).map{|s| s.to_i(2)}.pack('U*')
File.write('packed.txt', packed)
Let's break file.scan(/......../).map{|s| s.to_i(2)}.pack('U*') down:
1. file.scan(/......../)
Here we break the huge string of 0s and 1s (the file) into an array of strings containing 8 characters each. It looks like this: ['00001111', '11110000', ...].
2. arr.map{|s| s.to_i(2)}
From step 1 we got an array of strings, each representing one character in binary notation. We can convert one of those strings (called s) with s.to_i(2), because the argument 2 tells to_i to use base 2. So '00000011'.to_i(2) returns 3.
We apply this to every string by using map.
So we now have an array that looks like [98, 82, 49, 39, ...].
3. arr.pack('U*')
From step 2 we have an array of integers, each representing a character. We can now use the pack method to turn the array of integers into a string. The directive U tells pack to treat each integer as a Unicode code point and encode it as UTF-8.
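The three steps can be tried on a short bit string. This is only a minimal sketch: the method name bits_to_text and the sample input are made up for illustration, and it assumes the input consists of complete 8-bit groups of 0s and 1s.

```ruby
# Hypothetical helper: decode a string of 0s and 1s, 8 bits per character.
def bits_to_text(bits)
  chunks = bits.scan(/[01]{8}/)                # step 1: split into 8-character chunks
  codes  = chunks.map { |chunk| chunk.to_i(2) } # step 2: parse each chunk as base 2
  codes.pack('U*')                             # step 3: encode the code points as UTF-8
end

puts bits_to_text('0100100001101001') # prints "Hi" (72 and 105)
```

The regex here is stricter than /......../ so that stray newlines are skipped rather than treated as bits; that is a design choice of this sketch, not part of the original answer.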

How to convert a number in a string to an integer

I have an output like "35%" from one command, and I stripped "%". Still, it's stored as a string. Is there a function to convert the string to integer?
You can simply call "35%".to_i, which returns 35.
For your exact problem:
puts 'true' if 35 == "35".to_i
output is:
true
Let's say your string is "35%". Read the string character by character. The first character is '3': subtract the character code of '0' (ASCII 48) to get the digit 3, and multiply it by 10 because it is in the tens place. The next character is '5': subtract '0' again to get 5, and multiply it by 1. Adding the two results gives the integer 35. In general, you subtract '0' from each character and multiply the digit by 10^(its position counted from the right), until you hit the terminator (% here).
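The manual procedure can be sketched in Ruby as follows. The method name parse_leading_int is hypothetical, and instead of computing 10^(position) for each digit, this version multiplies the running total by 10 before adding the next digit, which gives the same result without needing to know the number of digits in advance. It handles only non-negative decimal digits.

```ruby
# Hypothetical helper: parse the leading decimal digits of a string by hand.
def parse_leading_int(str)
  value = 0
  str.each_char do |ch|
    break unless ch.between?('0', '9')       # stop at the terminator, e.g. '%'
    value = value * 10 + (ch.ord - '0'.ord)  # shift left one decimal place, add digit
  end
  value
end

puts parse_leading_int("35%") # prints 35
```

In practice "35%".to_i does the same job; the sketch only illustrates what happens underneath.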

Why is "string".to_i = 0 but "9".to_i =9

Hello, I was wondering why this is the case and how to_i is defined.
Simple question: why does
"string".to_i
=> 0?
"9".to_i
=> 9
According to the documentation for to_i, "if there is not a valid number at the start of str, 0 is returned".
Invoking .to_i on a string will return a number (in base 10) by interpreting valid numbers at the beginning of the string.
"string".to_i returns 0 because .to_i couldn't interpret a valid number from the start of the string. "9".to_i returns 9 because the leading (or in this case, the only) character is "9" and it could be interpreted as a valid number.
This doesn't mean that invoking .to_i on a string that starts with a letter will always return 0, though. For example, "b6".to_i(16) returns 182, because the argument 16 tells to_i to read "b6" as a hexadecimal (base 16) number, and 0xb6 is 182 in decimal.
See the documentation here: http://www.ruby-doc.org/core-2.1.0/String.html#method-i-to_i
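A short sketch contrasting the lenient behaviour of to_i with strict parsing. The helper name strict_int is made up for illustration; it relies on Kernel#Integer, which raises ArgumentError when the whole string is not a valid number.

```ruby
# Lenient: to_i reads as many leading digits as it can and never raises.
puts "string".to_i   # prints 0
puts "9".to_i        # prints 9
puts "12abc".to_i    # prints 12
puts "b6".to_i(16)   # prints 182

# Hypothetical helper: strict parsing, returning nil instead of raising.
def strict_int(str)
  Integer(str)
rescue ArgumentError
  nil
end

p strict_int("9")       # prints 9
p strict_int("string")  # prints nil
```

Use the strict form when a silent 0 for bad input would hide an error.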

Ruby unfamiliar string usage with Integer.chr and "\001"

Recently I stumbled over this code snippet in Ruby:
@data = 3.chr * 5
which results in "\003\003\003\003\003"
Later in the code, for example,
flag = @data[2] & 2
is used.
I know that it has something to do with bitwise flags. It seems the values 1, 2 and 3 are used as state flags, but because Ruby 1.9, which is the version I am familiar with, changed the Integer.chr method, the code no longer works and I would really like to know what's going on.
Furthermore, what is the purpose of the "\00x" escape?
Thanks for your answers
To make the code work in Ruby 1.9, try changing that line to:
flag = @data[2].ord & 2
Prior to Ruby 1.9, str[n] would return an integer between 0 and 255, but in Ruby 1.9, with its new Unicode support, str[n] returns a character (a string of length 1). To get the integer instead of the character, you can call .ord on the character.
The & operator is just the standard bitwise AND operator common to C, Ruby, and many other languages.
Byte number three (0x03) is not a printable ASCII character, so when you have that byte in a string and call inspect, Ruby denotes that byte as \003. Just make sure you understand that "\003" is a single-byte string while '\003' is a four-byte string.
In Ruby, strings are really sequences of bytes. In Ruby 1.9, there is also encoding information, but they are still really just a sequence of bytes.
The "\00X" thing is an octal representation of the value.
So if we do:
irb(main):001:0> 15.chr
=> "\017"
irb(main):002:0> 16.chr
=> "\020"
Notice how we went from 17 right to 20? Octal.
"\003\003\003\003\003" is 5 bytes of the value 3 and you can then bitwise and them with other bytes, such as 2 or \002.
So 3 (0011 in binary) ANDed with 2 (0010) is 2 (0010).
The 1.9 issue occurs because 1.9 no longer treats strings as plain ASCII bytes the way 1.8 does. David Grayson hits that point well.
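A small sketch of the flag check discussed above, written for Ruby 1.9 or later. The helper name flag_at is hypothetical; it uses String#getbyte, which returns the byte as an integer and so behaves like str[index].ord for a string of single-byte characters.

```ruby
# Hypothetical helper: read the byte at index and apply a bit mask.
def flag_at(data, index, mask)
  data.getbyte(index) & mask
end

data = 3.chr * 5           # five bytes, each with the value 3 (binary 11)
puts flag_at(data, 2, 2)   # prints 2, bit 1 is set
puts flag_at(data, 2, 4)   # prints 0, bit 2 is not set
puts "\003".bytesize       # prints 1, the escape is interpreted
puts '\003'.bytesize       # prints 4, single quotes keep the backslash
```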
Note that Ruby 1.9 inspects unprintable characters using hexadecimal escapes:
3.chr # => "\x03"
Even more confusing is that sometimes the strings will appear in Unicode (UTF-8):
"\003" # => "\u0003" (utf-8)
3.chr.encoding # => #<Encoding:US-ASCII>
"\003".encoding # => #<Encoding:UTF-8>
"\003" == 3.chr # => true (this is strange because the encoding is different)
If you're trying to understand how these octal and hex strings relate to decimal numbers, you can convert them to binary:
"\003".unpack('B*') # like "\003".ord.to_s(2), but zero-padded to 8 bits and wrapped in an array
# => ["00000011"] # the 2 least significant bits are set
2.to_s(2) # convert to base 2
#=> "10"
The expression 3 & 2 is a bitwise-and of binary numbers 11b and 10b, which will yield 10b (because 1 & 1 is 1 for the most significant bit; 1 & 0 is 0 for least significant).
Other conversions:
'%x' % 97 # => '61' hex
0x61 # => 97 decimal from raw hex input
'%o' % 97 # => '141' octal
0141 # => 97 decimal from raw octal input
This is sort of a crash course but you should probably google for more in-depth info.

Converting a hexadecimal number to binary in ruby

I am trying to convert a hex value to a binary value (each digit in the hex string should have an equivalent four-bit binary value). I was advised to use this:
num = "0ff" # (say for eg.)
bin = "%0#{num.size*4}b" % num.hex.to_i
This gives me the correct output 000011111111. I am confused about how this works, especially %0#{num.size*4}b. Could someone help me with this?
You can also do:
num = "0ff"
num.hex.to_s(2).rjust(num.size*4, '0')
You may have already figured this out, but num.size*4 is the number of digits to which the output is zero-padded, because one hexadecimal digit is represented by four (log_2 16 = 4) binary digits.
You'll find the answer in the documentation of Kernel#sprintf (as pointed out by the docs for String#%):
http://www.ruby-doc.org/core/classes/Kernel.html#M001433
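To make the format string less mysterious, the expression can be split into its parts. This is a sketch with hypothetical names (hex_to_bits, width, template): string interpolation first builds an ordinary sprintf template, and String#% then applies it to the number.

```ruby
# Hypothetical helper: the one-liner from the question, step by step.
def hex_to_bits(num)
  width    = num.size * 4       # four binary digits per hex digit, 12 for "0ff"
  template = "%0#{width}b"      # interpolation yields "%012b"
  template % num.hex            # binary, zero-padded to the given width
end

puts hex_to_bits("0ff") # prints 000011111111
```

In "%012b", the 0 asks for zero padding, 12 is the minimum width, and b selects binary output.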
This is the most straightforward solution I found to convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
This doesn't answer your original question, but I would assume that a lot of people coming here are not looking to turn hexadecimal into actual "0s and 1s" binary output. Instead, they want to decode hexadecimal into a byte string representation (in the spirit of such utilities as hex2bin). Here is a good method for doing exactly that:
def hex_to_bin(hex)
  # Prepend a '0' for padding if you don't have an even number of chars
  # (builds a new string rather than mutating a literal)
  hex = "0#{hex}" if hex.length.odd?
  hex.scan(/[A-Fa-f0-9]{2}/).inject('') { |encoded, byte| encoded << [byte].pack('H2') }
end
Getting back to hex again is much easier:
def bin_to_hex(bin)
  bin.unpack('H*').first
end
Converting the string of hex digits back to binary is just as easy. Take the hex digits two at a time (since each byte can range from 00 to FF), convert the digits to a character, and join them back together.
def hex_to_bin(s)
  s.scan(/../).map { |x| x.hex.chr }.join
end
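A round trip between hex digits and raw bytes can also be written with a single pack/unpack pair. The names hex_to_bytes and bytes_to_hex are chosen here only to avoid clashing with the methods above; the sketch assumes the input contains nothing but hex digits.

```ruby
# Hypothetical helper: hex digits to a raw byte string.
def hex_to_bytes(hex)
  hex = "0#{hex}" if hex.length.odd?  # pad to a whole number of bytes
  [hex].pack('H*')
end

# Hypothetical helper: raw byte string back to lowercase hex digits.
def bytes_to_hex(bytes)
  bytes.unpack('H*').first
end

bytes = hex_to_bytes('deadbeef')
puts bytes.bytesize        # prints 4
puts bytes_to_hex(bytes)   # prints deadbeef
```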
