Converting a string to a number (exactly as it is represented in it)? - ruby

I have the following:
{:department=>{"Pet Supplies"=>{"Birds"=>"16,414", "Cats"=>"243,384",
"Dogs"=>"512,186", "Fish & Aquatic Pets"=>"47,018",
"Horses"=>"14,749", "Insects"=>"359", "Reptiles &
Amphibians"=>"5,794", "Small Animals"=>"19,797"}}}
Now if I use to_i I get say 16. If I do to_f I get something like 16.0 (and as you can see Ruby is considering the , as a . for some reason).
I want the number to be exactly as in the string but as a number instead: "Birds"=>16,414
How to accomplish that?

Just a notice:
If I do to_f I get something like 16.0 (and as you can see Ruby is considering the , as a . for some reason)
Ruby is not treating the , as a . at all. If it did, the resulting float would be 16.414 and not 16.0. Ruby just notices an extraneous character and ignores everything from it onward, i.e. ,414.
How to accomplish that?
Well if you want 16,414 to be transformed to 16414 there's nothing as easy as just removing the character:
str = '16,414'
str.delete(',').to_i
# => 16414
In some locales the , is used as the decimal separator. In that case, if you want to get 16.414, you can replace the , with a . and convert to Float:
str = '16,414'
str.gsub(/,/, '.').to_f
# => 16.414
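To apply this to the whole nested hash from the question, something like the following should work. It is only a sketch: it assumes the comma is a thousands separator, and parse_counts is a made-up helper name, not anything built in.

```ruby
# Hypothetical helper: recursively walk a nested hash and turn
# "16,414"-style strings into Integers (comma assumed to be a thousands separator).
def parse_counts(node)
  case node
  when Hash   then node.each_with_object({}) { |(key, value), out| out[key] = parse_counts(value) }
  when String then node.delete(',').to_i
  else node
  end
end

data = { department: { "Pet Supplies" => { "Birds" => "16,414", "Cats" => "243,384", "Insects" => "359" } } }
counts = parse_counts(data)

puts counts[:department]["Pet Supplies"]["Birds"]
```

The original hash is left untouched; parse_counts builds a new one with the same keys.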

Try something like below:
"16,414".gsub(",","_").to_i
# => 16414
or (as @Chris Heald suggested)
"19,797".delete(",").to_i
# => 19797

as you can see Ruby is considering the , as a . for some reason
Yes, it's all quite confusing:
class String
to_i(base=10) → integer
Returns the result of interpreting leading characters in str as an
integer base base (between 2 and 36). Extraneous characters past the
end of a valid number are ignored.
to_f → float
Returns the result of interpreting leading characters in str as a
floating point number. Extraneous characters past the end of a valid
number are ignored.
The ruby docs are public. They are not secret. In fact, you probably have the docs on your computer. Try this:
$ ri String#to_i
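A quick way to see the documented behaviour is to compare the lenient String#to_i and String#to_f with the strict Kernel#Integer, which raises instead of silently ignoring trailing characters. The variable names below are just for illustration.

```ruby
lenient_int   = "16,414".to_i   # parsing stops at the comma
lenient_float = "16,414".to_f   # same here, the comma is not a decimal point

# Kernel#Integer refuses input with extraneous characters.
strict_result =
  begin
    Integer("16,414")
  rescue ArgumentError
    :invalid
  end

p lenient_int, lenient_float, strict_result
```

If silent truncation is a problem for your data, Integer(str.delete(',')) may be a safer choice than to_i.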

Related

Regex for series of four digits each up to 100

I'm trying to write a regex to validate a string so that it accepts only a series of four comma-separated numbers, each from 0 to 100. Something like this would be valid:
20,30,40,50
and these invalid:
120,0,20,0
20,30,40,ss
invalid_string
Any thoughts?
They're used for CMYK colours. We just need to store them here, not use them.
Number Range and Subroutine
In Ruby 2+, for a compact regex, use this:
^([0-9]|[1-9][0-9]|100)(?:,\g<1>){3}$
Explanation
The ^ anchor asserts that we are at the beginning of the string
The parentheses around ([0-9]|[1-9][0-9]|100) match a number from 0 to 100 and define subroutine #1
(?:,\g<1>) matches one comma and the expression defined by subroutine # 1
The {3} quantifier repeats that three times
The $ anchor asserts that we are at the end of the string
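Here is a small check of that pattern against the strings from the question. One assumption on my side: the sketch uses \A and \z instead of ^ and $, because in Ruby ^ and $ match at line boundaries, which matters if the input can contain newlines. CMYK_PATTERN is just a name picked for this example.

```ruby
# Same pattern as above, anchored to the whole string with \A and \z.
CMYK_PATTERN = /\A([0-9]|[1-9][0-9]|100)(?:,\g<1>){3}\z/

samples = ['20,30,40,50', '0,100,5,99', '120,0,20,0', '20,30,40,ss', 'invalid_string']

# Map each sample to whether it matches.
results = samples.map { |s| [s, CMYK_PATTERN.match?(s)] }.to_h

results.each { |sample, ok| puts "#{sample.inspect}: #{ok}" }
```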
I'd save myself the headache of using a regex for a number-related problem. Also, the default validation message will look awkward, so it's better to write your own:
validate :that_string_has_only_4_numbers_upto_100
def that_string_has_only_4_numbers_upto_100
  # Require exactly four parts, each made only of digits and within 0..100.
  parts = str.split(',', -1)
  valid = parts.size == 4 && parts.all? { |n| n =~ /\A\d+\z/ && (0..100).cover?(n.to_i) }
  errors.add(:str, 'is not valid.') unless valid
end
Unless you are a regex Jedi guru like @zx81 :p.
Try this (note that \d{1,2} only allows one or two digits, so it accepts 0 to 99 and rejects 100):
^(?:\d{1,2},){3}\d{1,2}$

Deleting all special characters from a string - ruby

I was doing the challenges from pythonchallenge writing code in ruby, specifically this one. It contains a really long string in page source with special characters. I was trying to find a way to delete them/check for the alphabetical chars.
I tried using the scan method, but I think I might not be using it properly. I also tried delete! like this:
a = "PAGE SOURCE CODE PASTED HERE"
a.delete! "!", "#" #and so on with special chars, does not work(?)
a
How can I do that?
Thanks
You can do this
a.gsub!(/[^0-9A-Za-z]/, '')
try with gsub
a.gsub!(/[!#%&"]/,'')
try the regexp on rubular.com
if you want something more general you can have a string with valid chars and remove what's not in there:
a.gsub!(/[^abcdefghijklmnopqrstuvwxyz ]/,'')
When you give multiple arguments to String#delete, it's the intersection of those arguments that is deleted. a.delete! "!", "#" deletes the intersection of the sets ! and #, which is empty, so nothing is deleted and the method returns nil.
What you wanted to do is a.delete! "!#" with the characters to delete passed as a single string.
Since the challenge is asking to clean up the mess and find a message in it, I would go with a whitelist instead of deleting special characters. The delete method accepts ranges with - and negations with ^ (similar to a regex) so you can do something like this: a.delete! "^A-Za-z ".
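To make the difference concrete, here is a short sketch with a made-up sample string showing the three forms of String#delete discussed above.

```ruby
text = 'He!!o, W#rld'

# Two arguments: only characters present in BOTH sets are removed (none here).
intersection = text.delete('!', '#')

# One argument listing every character to remove.
blacklist = text.delete('!#')

# Negated set: remove everything that is NOT a letter or a space.
whitelist = text.delete('^A-Za-z ')

# The bang version returns nil when nothing was changed.
bang_result = text.dup.delete!('!', '#')

p intersection, blacklist, whitelist, bang_result
```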
You could also use regular expressions as shown by @arieljuod.
gsub is one of the most used Ruby methods in the wild.
specialname="Hello!#$#"
cleanedname = specialname.gsub(/[^a-zA-Z0-9\-]/,"")
I think a.gsub(/[^A-Za-z0-9 ]/, '') works better in this case. Otherwise, if you have a sentence, which typically should start with a capital letter, you will lose your capital letter. You would also lose any 1337 speak, or other possible crypts within the text.
Case in point:
phrase = "Joe can't tell between 'large' and large."
=> "Joe can't tell between 'large' and large."
phrase.gsub(/[^a-z ]/, '')
=> "oe cant tell between large and large"
phrase.gsub(/[^A-Za-z0-9 ]/, '')
=> "Joe cant tell between large and large"
phrase2 = "W3 a11 f10a7 d0wn h3r3!"
phrase2.gsub(/[^a-z ]/, '')
=> " a fa dwn hr"
phrase2.gsub(/[^A-Za-z0-9 ]/, '')
=> "W3 a11 f10a7 d0wn h3r3"
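If you don't want to change the original string (for example, when you only need to read out the answer to the challenge), iterate over the characters and print the ones you want: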
str.each_char do |letter|
if letter =~ /[a-z]/
p letter
end
end
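Since the question mentions scan: it can also collect the wanted characters without touching the original string. A minimal sketch, using a short made-up string in place of the real page source:

```ruby
# Stand-in for the pasted page source (single quotes so # is not interpolated).
source = '@#r%%u^^b&*(y!'

# scan returns every match as an array; join turns it back into a string.
message = source.scan(/[a-z]/).join

puts message
```

source itself is not modified, so you can run several different patterns over it.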
You will have to write your own string sanitizing function; this is easy to do with a regex and the gsub method.
Atomic sample:
your_text.gsub!(/[!#\[;\]^%*\(\);\-_\/&\\|$\{#\}<>:`~"]/,'')
API sample:
Route: post 'api/sanitize_text', to: 'api#sanitize_text'
Controller:
def sanitize_text
  return render_bad_request unless params[:text].present?
  # Use gsub rather than gsub!, which returns nil when nothing is replaced.
  sanitized_text = params[:text].gsub(/[!#\[;\]^%*\(\);\-_\/&\\|$\{#\}<>:`~"]/,'')
  render_response({ safe_text: sanitized_text })
end
Then you call it
POST /api/sanitize_text?text=abcdefghijklmnopqrstuvwxyz123456<>$!#%23^%26*[]:;{}()`,.~'"\|/

UTF-8 Encoding in Ruby using a variable

I am using Ruby 1.8.7 (and upgrading isn't an option). I would like to create a string of all UTF-8 code points from 0 to 127, written as "\uXXXX".
My problem is that this is being interpreted as (for example) 'u0008'. If I try to use '\\u0008', the string becomes "\\u0008", which IS NOT what I want.
I have tried many different ways, but it seems impossible to create a string that is exactly "\uXXXX", i.e. "\u000B". It always ends up as either "\\u000B" or "u000B".
Escaping the '\' isn't an option. I need to send a string to a server, such that the server will receive '\u000B' for example. It is so that other server can test its parsing of the \uXXXX syntax. This seems impossible to do in Ruby however.
Happy if someone can prove me wrong :)
Use Integer#chr to get the character. Here's a clean version:
value = ''
(1..127).each do |i|
  value << "U+#{i} = #{i.chr}, hex = \\x#{"%02x" % i}; "
end
The "%02x" % i is equivalent to sprintf("%02x", i). It returns the integer as a 2-digit hexadecimal number.
Escaped output (see comments):
value = ''
(1..127).each do |i|
  value << "U+#{i} = \\u#{"%04x" % i}, hex = \\x#{"%02x" % i}; "
end
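If all you need is the literal six-character sequences for code points 0 to 127, a sketch like this might be enough. The "\\u" in the source is a single backslash followed by u in the actual string; the doubled backslash only shows up in the literal and in inspect output. The variable names are my own.

```ruby
# Build the literal text \u0000 .. \u007F (backslash, 'u', four hex digits).
escapes = (0..127).map { |i| "\\u%04X" % i }
payload = escapes.join

puts escapes[0x0B]          # prints the 6 characters: \u000B
puts escapes[0x0B].length   # 6
```

Use puts (or send the bytes) to check the result; p and irb show the string via inspect, which is probably where the extra backslash in the question comes from.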

Converting a hexadecimal number to binary in ruby

I am trying to convert a hex value to a binary value (each digit in the hex string should have an equivalent four-bit binary value). I was advised to use this:
num = "0ff" # (say for eg.)
bin = "%0#{num.size*4}b" % num.hex.to_i
This gives me the correct output 000011111111. I am confused with how this works, especially %0#{num.size*4}b. Could someone help me with this?
You can also do:
num = "0ff"
num.hex.to_s(2).rjust(num.size*4, '0')
You may have already figured this out, but num.size*4 is the number of digits that you want to pad the output up to with 0, because one hexadecimal digit is represented by four (log_2 16 = 4) binary digits.
You'll find the answer in the documentation of Kernel#sprintf (as pointed out by the docs for String#%):
http://www.ruby-doc.org/core/classes/Kernel.html#M001433
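It may help to take the one-liner from the question apart. The sketch below builds the format string first and then applies it; template and width are names I introduced for the example.

```ruby
num = "0ff"

# Three hex digits need 3 * 4 = 12 binary digits.
width = num.size * 4

# Interpolation happens first, so the format string becomes "%012b":
# "b" = binary, "12" = minimum width, "0" = pad with zeros instead of spaces.
template = "%0#{width}b"

# String#% then formats the integer 255 using that template.
bin = template % num.hex

puts template
puts bin
```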
This is the most straightforward solution I found to convert from hexadecimal to binary:
['DEADBEEF'].pack('H*').unpack('B*').first # => "11011110101011011011111011101111"
And from binary to hexadecimal:
['11011110101011011011111011101111'].pack('B*').unpack1('H*') # => "deadbeef"
Here you can find more information:
Array#pack: https://ruby-doc.org/core-2.7.1/Array.html#method-i-pack
String#unpack1 (similar to unpack): https://ruby-doc.org/core-2.7.1/String.html#method-i-unpack1
This doesn't answer your original question, but I would assume that a lot of people coming here are not looking to turn hexadecimal into actual "0s and 1s" binary output, but to decode hexadecimal into a byte string (in the spirit of utilities such as hex2bin). As such, here is a good method for doing exactly that:
def hex_to_bin(hex)
# Prepend a '0' for padding if you don't have an even number of chars
hex = '0' << hex unless (hex.length % 2) == 0
hex.scan(/[A-Fa-f0-9]{2}/).inject('') { |encoded, byte| encoded << [byte].pack('H2') }
end
Getting back to hex again is much easier:
def bin_to_hex(bin)
bin.unpack('H*').first
end
Converting the string of hex digits back to binary is just as easy. Take the hex digits two at a time (since each byte can range from 00 to FF), convert the digits to a character, and join them back together.
def hex_to_bin(s)
  s.scan(/../).map { |x| x.hex.chr }.join
end
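As a sanity check, the pair-by-pair approach and a single pack('H*') call should produce the same bytes, and unpack('H*') should get you back to the hex string. A small sketch, assuming an even number of hex digits:

```ruby
hex = 'deadbeef'

# Pair-by-pair approach: two hex digits -> one byte.
bytes_via_scan = hex.scan(/../).map { |pair| pair.hex.chr }.join

# The same conversion in a single pack call.
bytes_via_pack = [hex].pack('H*')

# And back to hex again.
round_trip = bytes_via_pack.unpack('H*').first

p bytes_via_scan.bytes == bytes_via_pack.bytes
puts round_trip
```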

Ruby: Fuzzing through all unicode characters (UTF8/Encoding/String Manipulation)

I can't iterate over the entire range of unicode characters.
I searched everywhere...
I am building a fuzzer and want to embed into a url, all unicode characters (one at a time).
For example:
http://www.example.com?a=\uff1c
I know that there are some built tools but I need more flexibility.
If I could do something like the following: "\u" + "ff1c" it would be great.
This is the closest I got:
char = "\u0000"
...
#within iteration
char.succ!
...
but after the character "\u0039", which is the number 9, I will get "10" instead of ":"
You could use pack to convert numbers to UTF-8 characters, but I'm not sure if this solves your problem.
You can either create an array with the numeric values of all the characters and use pack to get a UTF-8 string, or you can just loop from 0 to whatever you need and use pack within the loop.
I've written a small example to explain myself. The code below prints out the hex value of each character followed by the character itself.
0.upto(100) do |i|
puts "%04x" % i + ": " + [i].pack("U*")
end
Here's some simpler code, albeit slightly obfuscated, that takes advantage of the fact that Ruby will convert an integer on the right-hand side of the << operator to a codepoint. In Ruby 1.8 this only works for integer values <= 255; from 1.9 on it also works for values greater than 255.
0.upto(100) do |i|
puts "" << i
end
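For the fuzzer itself, one option is to iterate over integer code points rather than calling succ! on a string, and to percent-encode each character before putting it in the URL. This is only a sketch: fuzz_url and each_fuzz_url are hypothetical helper names, the parameter name a is taken from the example URL in the question, and skipping the surrogate range is my assumption (those code points are not valid characters in UTF-8).

```ruby
require 'uri'

# Surrogate code points are skipped because they are not valid in UTF-8.
SURROGATES = (0xD800..0xDFFF)

# Hypothetical helper: build one URL with the given code point as parameter a.
def fuzz_url(base, codepoint)
  char = [codepoint].pack('U')
  "#{base}?a=#{URI.encode_www_form_component(char)}"
end

# Hypothetical helper: yield a URL for every code point in the range.
def each_fuzz_url(base, range)
  return enum_for(:each_fuzz_url, base, range) unless block_given?
  range.each do |codepoint|
    next if SURROGATES.cover?(codepoint)
    yield fuzz_url(base, codepoint)
  end
end

puts fuzz_url('http://www.example.com', 0xff1c)
each_fuzz_url('http://www.example.com', 0x39..0x3b) { |url| puts url }
```

Passing 0..0x10FFFF as the range would cover every code point except the skipped surrogates.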

Resources