So I wanted to get my fundamental Ruby skills up (coming from a Python background) because I wanted to get a good handle on Rails.
I was doing a bunch of exercises I picked for myself, and this particular behaviour came up (it's not quite an error, but I'm raising an eyebrow). For what it's worth, I'm using Ruby 2.0.0.
class A
  def B(binaryNum)
    puts binaryNum
    binarray = binaryNum.to_s.chars.to_a
    indice = binarray.length
    puts "\n#{indice}"
  end
end
conv = A.new()
puts "#{conv.B(1111)}" # outputs 1111 as usual, with a length of 4
puts "#{conv.B(01111)}" # outputs 585, with a length of 3
It seems putting a zero in front of the integer representation of binary is causing all sorts of ruckus to occur. I initially thought it might be a silly error regarding maximum ints, but I reproduced the issue with much smaller numbers.
Ruby's numeric syntax is similar to C's, and the leading zero tells it to interpret the number as octal (base 8).
1111 base 8 = 585 base 10.
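The arithmetic can be checked by expanding the digits by their positional weights. This is only an illustrative sketch; the helper name `octal_value` is made up for this example and is not part of Ruby.

```ruby
# Hypothetical helper: expand a string of octal digits by positional weight.
def octal_value(digits)
  digits.chars.reverse.each_with_index.map do |ch, power|
    ch.to_i * (8**power)
  end.sum
end

puts octal_value("1111")  # 1*512 + 1*64 + 1*8 + 1 = 585
puts 01111 == 585         # true: the leading zero makes the literal octal
puts "1111".to_i(8)       # 585, using the built-in base argument
```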
Numeric literals with leading zeros in Ruby are treated as octal numbers.
According to the ruby-doc.org documentation for numeric literals:
You can use a special prefix to write numbers in decimal, hexadecimal,
octal or binary formats. For decimal numbers use a prefix of 0d, for
hexadecimal numbers use a prefix of 0x, for octal numbers use a prefix
of 0 or 0o, for binary numbers use a prefix of 0b. The alphabetic
component of the number is not case-sensitive.
Examples:
0d170
0D170
0xaa
0xAa
0xAA
0Xaa
0XAa
0XaA
0252
0o252
0O252
0b10101010
0B10101010
So in your case, since 1111 (base 8) = 585 (base 10), 01111.to_s will return "585".
Note that Fixnum#to_s takes an argument which lets you specify the base of the number system you are using. So for your program, you could do it like this:
class A
def B(binaryNum)
puts binaryNum
binarray = binaryNum.to_s(2).chars.to_a
indice = binarray.length
puts "\n#{indice}"
end
end
conv = A.new
puts "#{conv.B(0b1111)}" # Outputs 15, with a length of 4
puts "#{conv.B(01111)}" # Outputs 585, with a length of 10
puts "#{conv.B(1111)}" # Outputs 1111, with a length of 11
Even better, in Ruby 2.1+ Fixnum has an instance method called bit_length which seems to already do what you want:
0b1.bit_length
#=> 1
0b11.bit_length
#=> 2
0b111.bit_length
#=> 3
0x1FF.bit_length
#=> 9
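As a rough comparison, here is a sketch that puts `bit_length` next to the `to_s(2)` approach from the program above. The helper name `binary_digit_count` is made up for this example. For positive integers the two agree; they differ at zero, which may or may not matter for your exercise.

```ruby
# Hypothetical helper: count binary digits via the string representation.
def binary_digit_count(n)
  n.to_s(2).length
end

[1, 3, 7, 0x1FF, 585, 1111].each do |n|
  puts "#{n}: to_s(2).length=#{binary_digit_count(n)}, bit_length=#{n.bit_length}"
end

# At zero the two differ: "0" is one character, but 0.bit_length is 0.
puts binary_digit_count(0)
puts 0.bit_length
```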
I have a member number like "123456" that I want to encode into the shortest string I can for use in a url shortener (without a database).
The standard characters A-Z, a-z and 0-9 give me 62 characters to work with, easily 64 characters if I add two special characters such as _ and !.
How can I convert any number below 64 into a single character?
So something like:
encode(1) # -> a
encode(10) # -> j
encode(26) # -> z
encode(27) # -> A
encode(52) # -> Z
encode(123456) # -> eJA
So I could have any number .. and return a shorter encoded string.
Attempts with things like the built-in Base64 return a string that's longer than the input.
Base64.encode64("10") # -> "MTA=\n" ... I want the output to be 1 character not 6!!
How can I encode integers to be a shorter base 64 string?
Edit:
Oh, and it was implied but I totally forgot to say: how do I then decode back to the original input?
decode('a') # -> 1
decode('j') # -> 10
decode('z') # -> 26
decode('A') # -> 27
decode('Z') # -> 52
decode('eJA') # -> 123456
First of all, there's base-64 (a numeral system) and Base64 (an encoding for binary data). Ruby's built-in Base64 module converts data (strings) to and from Base64 encoding.
I assume that you on the other hand want to convert a number from base-10 to base-64 and then use a custom alphabet (A-Z, a-z, 0-9, _, !) to represent each digit.
Your input number 123456 is in base-10. You can convert it to base-64 via digits – which returns an array of digits:
number = 123456
digits = number.digits(64).reverse
#=> [30, 9, 0]
And then map each digit to its corresponding character:
chars = [*'A'..'Z', *'a'..'z', *'0'..'9', '_', '!']
digits.map { |i| chars[i] }.join
#=> "eJA"
This isn't base 64, but I've used base 36 a number of times in the past, using Ruby's built-in #to_s and #to_i routines:
2.7.2 :007 > 340.to_s(36)
=> "9g"
2.7.2 :008 > "9g".to_i(36)
=> 340
I'm not 100% sure which characters you would use for your base character set. "Base 36" is all alphas (26) and all numerics (10).
As others here have said, Ruby's Base64 encoding is not the same as converting an integer to a string using a base of 64. Ruby provides an elegant converter for this, but the maximum base is base-36 (see @jad's answer).
The code below brings everything together into two methods for encoding/decoding as base-64.
def encode(int)
chars = [*'A'..'Z', *'a'..'z', *'0'..'9', '_', '!']
digits = int.digits(64).reverse
digits.map { |i| chars[i] }.join
end
And to decode
def decode(str)
chars = [*'A'..'Z', *'a'..'z', *'0'..'9', '_', '!']
digits = str.chars.map { |char| chars.index(char) }.reverse
output = digits.each_with_index.map do |value, index|
value * (64 ** index)
end
output.sum
end
Give them a try:
puts output = encode(123456) #=> "eJA"
puts decode(output) #=> 123456
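As a variation, decoding can also be written as a running total (multiply by 64, add the next digit) instead of summing powers. This is a sketch under the same alphabet as above; the names `ALPHABET`, `encode64` and `decode64` are chosen for this example only, and raising on unknown characters is an added assumption rather than something the original methods do.

```ruby
# Same 64-character alphabet as in the methods above.
ALPHABET = [*'A'..'Z', *'a'..'z', *'0'..'9', '_', '!'].freeze

# Non-negative integer -> string of alphabet characters.
def encode64(int)
  int.digits(64).reverse.map { |d| ALPHABET[d] }.join
end

# String -> integer, accumulating left to right.
def decode64(str)
  str.each_char.inject(0) do |acc, char|
    value = ALPHABET.index(char)
    raise ArgumentError, "invalid character: #{char.inspect}" if value.nil?
    acc * 64 + value
  end
end

puts encode64(123456)  # eJA
puts decode64("eJA")   # 123456
```

Note that `Integer#digits` only accepts non-negative integers, so negative input would need separate handling.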
The compression is pretty good: an integer around 99 million (99,999,999) encodes down to 5 characters ("F9eD!" with the alphabet above).
The extra compression comes from using both upper- and lower-case characters, so base-64 is inherently case-sensitive. If you want this to be case-insensitive, the built-in base-36 method from @jad's answer is the way to go.
Credit to @stefan for help with this.
My ruby command is,
"980,323,344.00".to_i
Why does it return 980 instead of 980323344?
You can get the expected result like this:
"980,323,344.00".delete(',').to_i
The reason your call to to_i does not return what you expected is explained here. To quote, the method:
Returns the result of interpreting leading characters in str as an integer base base (between 2 and 36). Extraneous characters past the end of a valid number are ignored.
In your case the first extraneous character is the comma right after 980, which is why you see 980 being returned.
In Ruby, calling to_i on a string converts only the leading part that forms a valid integer and ignores everything after it.
number_string = '980,323,344.00'
number_string.delete(',').to_i
#=> 980323344
"123abc".to_i
#=> 123
If you want to add underscores to make longer numbers more readable, they can go where the conventional commas would be in written numbers.
"980_323_344.00".to_i
#=> 980323344
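For contrast, `Kernel#Integer` is strict where `String#to_i` is lenient: it raises `ArgumentError` instead of silently stopping at the first invalid character. The sketch below puts the two side by side; `strict_integer` is a made-up helper name for this example.

```ruby
# Hypothetical helper: strict conversion that returns nil instead of raising.
def strict_integer(str)
  Integer(str)
rescue ArgumentError
  nil
end

["980", "980,323,344.00", "980_323_344", "123abc", "abc"].each do |s|
  puts "#{s.inspect}: to_i=#{s.to_i}, strict=#{strict_integer(s).inspect}"
end
```

Which behaviour is preferable depends on whether bad input should be tolerated or reported.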
The documentation for to_i might be a bit misleading:
Returns the result of interpreting leading characters in str as an integer base base (between 2 and 36)
"interpreting" doesn't mean that it tries to parse various number formats (like Date.parse does for date formats). It means that it looks for what's a valid integer literal in Ruby (in the given base). For example:
1234 #=> 1234
'1234'.to_i #=> 1234
1_234 #=> 1234
'1_234'.to_i #=> 1234
0d1234 #=> 1234
'0d1234'.to_i #=> 1234
0x04D2 #=> 1234
'0x04D2'.to_i(16) #=> 1234
Your input as a whole however is not a valid integer literal: (Ruby doesn't like the ,)
980,323,344.00
# SyntaxError (syntax error, unexpected ',', expecting end-of-input)
# 980,323,344.00
# ^
But it starts with a valid integer literal. And that's where the second sentence comes into play:
Extraneous characters past the end of a valid number are ignored.
So the result is 980 – the leading characters which form a valid integer converted to an integer.
If your strings always have that format, you can just delete the offending commas and run the result through to_i which will ignore the trailing .00:
'980,323,344.00'.delete(',') #=> "980323344.00"
'980,323,344.00'.delete(',').to_i #=> 980323344
Otherwise you could use a regular expression to check its format before converting it:
input = '980,323,344.00'
number = case input
when /\A\d{1,3}(,\d{3})*\.00\z/
input.delete(',').to_i
when /other format/
# other conversion
end
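The format check can also be wrapped in a small method. This is a sketch only: the name `parse_grouped_integer` is hypothetical, and returning nil for unexpected formats is one possible choice (raising an error would be another).

```ruby
# Hypothetical helper: returns an Integer for strings like "980,323,344.00",
# or nil when the string does not have that exact format.
def parse_grouped_integer(input)
  return nil unless input.match?(/\A\d{1,3}(,\d{3})*\.00\z/)

  input.delete(',').to_i
end

puts parse_grouped_integer('980,323,344.00')          # 980323344
puts parse_grouped_integer('980,32,344.00').inspect   # nil (bad grouping)
```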
And if you are dealing with monetary values, you should consider using the money gem and its monetize addition for parsing formatted values:
amount = Monetize.parse('980,323,344.00')
#=> #<Money fractional:98032334400 currency:USD>
amount.format
#=> "$980.323.344,00"
Note that format requires i18n so the above example might require some setup.
Ruby integers are written using an optional leading sign, an optional base indicator (0 for octal, 0x for hex, or 0b for binary), followed by a string of digits in the appropriate base. Underscore characters are ignored in the digit string. The letters mentioned in the above description may be either upper or lower case and the underscore characters can only occur strictly within the digit string.
I need to create a regular expression to check for Ruby integers in a Java string, following the specification above.
I assume the substrings that may represent integers are separated by spaces or begin or end the string. If so, I suggest you split the string on whitespace and then use the method Kernel#Integer to determine if each element of the resulting array represents an integer.
def str_to_int(str)
str.split.each_with_object([]) do |s,a|
val = Integer(s) rescue nil
a << [s, val] unless val.nil?
end
end
str_to_int "22 -22 077 0xAB 0xA_B 0b101 -0b101 cat _3 4_"
#=> [["22", 22], ["-22", -22], ["077", 63], ["0xAB", 171],
# ["0xA_B", 171], ["0b101", 5], ["-0b101", -5]]
Integer raises an ArgumentError exception if the string cannot be converted to an integer. I've dealt with that with an in-line rescue that returns nil, but you may wish to write it so that only that exception is rescued. It may be prudent to remove punctuation from the string before executing the above method.
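A version that rescues only that one exception might look like the sketch below. It keeps the same method name and return shape as above; the explicit begin/rescue is just one way to write it.

```ruby
def str_to_int(str)
  str.split.each_with_object([]) do |s, a|
    begin
      a << [s, Integer(s)]
    rescue ArgumentError
      # s is not a valid Ruby integer literal, so skip it
    end
  end
end

p str_to_int("22 -22 077 0xAB 0xA_B 0b101 -0b101 cat _3 4_")
```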
This regex captures positive or negative numbers in denary, binary, octal and hexadecimal form, including any underscores:
# hexadecimal binary octal denary
-?0x[0-9a-fA-F][0-9a-fA-F_]*[0-9a-fA-F]|-?0x[0-9a-fA-F]|-?0b[01][01_]*[01]|-?0b[01]|-?0[0-7][0-7_]?[0-7]?|-?0[0-7]|-?[1-9][0-9_]*[0-9]|-?[0-9]
You should test the regex thoroughly to make sure it works as required but it does seem to work on a few relevant samples I tried (see this on Rubular where I've used () captures so you can see the matches more easily but it is essentially the same regex).
Here is an example of the regex in action using String#scan:
str = "-0x88339_43 wor0ds 8_8_ 0b1001 01words0x334 _9 0b1 0x4 0_ 0x_ 0b_1 0b00_1"
reg = /-?0x[0-9a-fA-F][0-9a-fA-F_]*[0-9a-fA-F]|-?0x[0-9a-fA-F]|-?0b[01][01_]*[01]|-?0b[01]|-?0[0-7][0-7_]?[0-7]?|-?0[0-7]|-?[1-9][0-9_]*[0-9]|-?[0-9]/
#regex matches
str.scan reg
#=>["-0x88339_43", "0", "8_8", "0b1001", "01", "0x334", "9", "0b1", "0x4", "0", "0", "0", "1", "0b00_1"]
Like @CarySwoveland, I'm assuming your string has spaces. Without spaces you will still get a result, but it may not be what you want; at least it's a start.
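If partial matches inside words (such as the "0" in "wor0ds") are not wanted, one option is to split on whitespace and require each token to match as a whole. The sketch below reuses the regex above unchanged and wraps it in \A...\z; the names `INTEGER_PATTERN`, `WHOLE_TOKEN` and `integer_tokens` are made up for this example.

```ruby
# The same alternation as above, kept as-is.
INTEGER_PATTERN = /-?0x[0-9a-fA-F][0-9a-fA-F_]*[0-9a-fA-F]|-?0x[0-9a-fA-F]|-?0b[01][01_]*[01]|-?0b[01]|-?0[0-7][0-7_]?[0-7]?|-?0[0-7]|-?[1-9][0-9_]*[0-9]|-?[0-9]/

# Anchor the whole alternation so a token must match from start to end.
WHOLE_TOKEN = /\A(?:#{INTEGER_PATTERN.source})\z/

# Hypothetical helper: keep only whitespace-separated tokens that match fully.
def integer_tokens(str)
  str.split.grep(WHOLE_TOKEN)
end

str = "-0x88339_43 wor0ds 8_8_ 0b1001 01words0x334 _9 0b1 0x4 0_ 0x_ 0b_1 0b00_1"
p integer_tokens(str)
#=> ["-0x88339_43", "0b1001", "0b1", "0x4", "0b00_1"]
```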
I am trying to convert a variable, which will always be a number, into binary, octal, and hex with Ruby.
The code I have at this point is:
def convert(number)
puts "#{number} in decimal is"
puts "#{number.to_s(2)} in binary"
puts "#{number.to_s(8)} in octal"
puts "#{number.to_s(16)} in hexadecimal"
end
and so far the output is:
2 in decimal is
10 in binary
2 in octal
2 in hexadecimal
The first two lines run fine, but after that it is ignoring the conversion command and just putting the variable in. Does anyone have any idea what it is I am missing?
You are missing the fact that 2 is... 2 in base 8, 16, or any base greater than 2. Try convert(42) for fun.
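A larger value makes the difference visible. Here is a small sketch that returns the three representations instead of printing them; `conversions` is a made-up helper name for this example.

```ruby
# Hypothetical helper: the same three conversions, returned as a hash.
def conversions(number)
  {
    binary: number.to_s(2),
    octal: number.to_s(8),
    hexadecimal: number.to_s(16)
  }
end

conversions(42).each do |base, text|
  puts "42 in #{base} is #{text}"
end
# 42 in binary is 101010
# 42 in octal is 52
# 42 in hexadecimal is 2a
```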
I want to understand a piece of code I found in Google:
i.to_s
In the above code i is an integer. As per my understanding i is being converted into a string. Is that true?
Better to say that this is an expression returning the string representation of the integer i. The integer itself doesn't change. #pedantic.
In irb
>> 54.to_s
=> "54"
>> 4598734598734597345937423647234.to_s
=> "4598734598734597345937423647234"
>> i = 7
=> 7
>> i.to_s
=> "7"
>> i
=> 7
As noted in the other answers, calling .to_s on an integer will return the string representation of that integer.
9.class #=> Fixnum (Integer in Ruby 2.4 and later)
9.to_s #=> "9"
9.to_s.class #=> String
But you can also pass an argument to .to_s to change it from the default Base = 10 to anything from Base 2 to Base 36. Here is the documentation: Fixnum to_s. So, for example, if you wanted to convert the number 1024 to its equivalent in binary (aka Base 2, which uses only "1" and "0" to represent any number), you could do:
1024.to_s(2) #=> "10000000000"
Converting to Base 36 can be useful when you want to generate random combinations of letters and numbers, since it counts using every number from 0 to 9 and then every letter from a to z. Base 36 explanation on Wikipedia. For example, the following code will give you a random string of letters and numbers of length 1 to 3 characters long (change the 3 to whatever maximum string length you want, which increases the possible combinations):
rand(36**3).to_s(36)
To better understand how the numbers are written in the different base systems, put this code into irb, changing the 36 in the parentheses to the base system you want to learn about. The resulting printout will count from 0 to 35 in whichever base system you chose:
36.times {|i| puts i.to_s(36)}
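Because small random values produce shorter strings, the random-string snippet above gives between 1 and 3 characters. If a fixed length is wanted, one possibility is to left-pad with "0", as in this sketch; `random_base36` is a made-up helper name, and `rand` is not suitable for security-sensitive tokens.

```ruby
# Hypothetical helper: random base-36 string of exactly `length` characters.
def random_base36(length)
  rand(36**length).to_s(36).rjust(length, "0")
end

puts random_base36(3)
puts random_base36(8)
```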
That is correct. to_s converts any object to a string, in this case (probably) an integer, since the variable is called i.