Ruby test for "\0" null? - ruby

I have some odd characters showing up in strings that are breaking a script. From what I can tell by printing badstring to the console, they are "\0\0\0\0".
I'd like to test for this so I can ignore them...but how?
I thought that's what blank? and empty? were for:
> badstring = "\0"
=> "\u0000"
> badstring.blank?
NoMethodError: undefined method `blank?' for "\u0000":String
from (irb):97
from /Users/meltemi/.rvm/rubies/ruby-2.0.0-p195/bin/irb:16:in `<main>'
> badstring.empty?
=> false
> badstring.nil?
=> false
Edit: Trying to recreate this in irb but having trouble:
> test1 = "\0\0\0\0"
=> "\u0000\u0000\u0000\u0000"
> test2 = '\0\0\0\0'
=> "\\0\\0\\0\\0"
What I want is a "\0\0\0\0" string so I can find a way to test if mystring == "\0\0\0\0" or something of the sort.

First of all blank? is a Rails helper. Try this instead:
badstring =~ /\x00/
If this returns an integer (the index of the first match), the string includes "\0"; if it returns nil, it does not.
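If a plain true/false answer is more convenient than an index or nil, a minimal sketch along these lines may help. The sample value is made up for illustration; only standard String methods are used.

```ruby
# Hypothetical sample value standing in for the problematic string.
badstring = "abc\0\0def"

# include? gives a plain true/false instead of an index or nil.
has_null = badstring.include?("\0")

# count reports how many null characters are present.
null_count = badstring.count("\0")

# A string made up only of null characters is empty once they are deleted.
only_nulls = "\0\0\0\0".delete("\0").empty?

puts has_null    # true
puts null_count  # 2
puts only_nulls  # true
```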

You could just remove "\0" chars with
badstring.delete!("\0")
Full example
badstring = "\0"
badstring.delete!("\0")
badstring.empty?
#=> true
Use delete instead of delete! if you want to keep the original string around.

Seems like we need to verify the encoding and characters here. You can check the string's encoding type with "string".encoding. Then you can see which character codes are actually being used here with badstring.chars.map(&:ord). Then you can replace / remove the characters using character_code.chr(encoding).
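A rough sketch of that inspection, assuming a made-up sample string (the real data may use a different encoding or contain other control characters):

```ruby
# Hypothetical sample; substitute the string that is causing trouble.
badstring = "a\0b"

# Show which encoding Ruby thinks the string has.
puts badstring.encoding

# List the numeric code of every character; a null shows up as 0.
codes = badstring.chars.map(&:ord)
p codes   # [97, 0, 98]

# Build the offending character in the string's own encoding and remove it.
null_char = 0.chr(badstring.encoding)
cleaned = badstring.delete(null_char)
p cleaned # "ab"
```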

Related

Use ARGV[] argument vector to pass a regular expression in Ruby

I am trying to use gsub or sub on a regex passed through terminal to ARGV[].
Query in terminal: $ruby script.rb input.json "\[\{\"src\"\:\"
Input file first 2 lines:
[{
"src":"http://something.com",
"label":"FOO.jpg","name":"FOO",
"srcName":"FOO.jpg"
}]
[{
"src":"http://something123.com",
"label":"FOO123.jpg",
"name":"FOO123",
"srcName":"FOO123.jpg"
}]
script.rb:
dir = File.dirname(ARGV[0])
output = File.new(dir + "/output_" + Time.now.strftime("%H_%M_%S") + ".json", "w")
open(ARGV[0]).each do |x|
x = x.sub(ARGV[1], '')
output.puts(x) if !x.nil?
end
output.close
This is very basic stuff really, but I am not quite sure on how to do this. I tried:
Regexp.escape with this pattern: [{"src":".
Escaping the characters and not escaping.
Wrapping the pattern between quotes and not wrapping.
Meditate on this:
I wrote a little script containing:
puts ARGV[0].class
puts ARGV[1].class
and saved it to disk, then ran it using:
ruby ~/Desktop/tests/test.rb foo /abc/
which returned:
String
String
The documentation says:
The pattern is typically a Regexp; if given as a String, any regular expression metacharacters it contains will be interpreted literally, e.g. '\d' will match a backslash followed by ‘d’, instead of a digit.
That means that the argument, though it looks like a regular expression, is actually a string: ARGV can only return strings, because the command line can only contain strings.
When we pass a string into sub, Ruby recognizes it's not a regular expression, so it treats it as a literal string. Here's the difference in action:
'foo'.sub('/o/', '') # => "foo"
'foo'.sub(/o/, '') # => "fo"
The first can't find "/o/" in "foo", so nothing changes. The second can find /o/, though, and returns the result after removing the first "o" (sub replaces only the first match; gsub would replace both).
Another way of looking at it is:
'foo'.match('/o/') # => nil
'foo'.match(/o/) # => #<MatchData "o">
where match finds nothing for the string but can find a hit for /o/.
And all that leads to what's happening in your code. Because sub is being passed a string, it's trying to do a literal match for the regex, and won't be able to find it. You need to change the code to:
sub(Regexp.new(ARGV[1]), '')
but that's not all that has to change. Regexp.new(...) will convert what's passed in into a regular expression, but if you're passing in '/o/' the resulting regular expression will be:
Regexp.new('/o/') # => /\/o\//
which is probably not what you want:
'foo'.match(/\/o\//) # => nil
Instead you want:
Regexp.new('o') # => /o/
'foo'.match(/o/) # => #<MatchData "o">
So, besides changing your code, you'll need to make sure that what you pass in is a valid expression, minus any leading and trailing /.
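One way to put this together is a small helper that accepts the pattern with or without the surrounding slashes. This is only a sketch: pattern_from_arg is a hypothetical name, and it assumes a leading and trailing / are delimiters rather than part of the pattern.

```ruby
# Hypothetical helper: turn a command-line string into a Regexp,
# dropping one leading and one trailing "/" when both are present.
def pattern_from_arg(arg)
  if arg.length >= 2 && arg.start_with?("/") && arg.end_with?("/")
    Regexp.new(arg[1..-2])
  else
    Regexp.new(arg)
  end
end

p pattern_from_arg("/o/")                 # /o/
p pattern_from_arg("o")                   # /o/
p "foo".sub(pattern_from_arg("/o/"), "")  # "fo"
```

In the script from the question, the call would then look like x.sub(pattern_from_arg(ARGV[1]), '').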
Based on this answer in the thread Convert a string to regular expression ruby, you should use
x = x.sub(/#{ARGV[1]}/,'')
I tested it with this file (test.rb):
puts "You should not see any number [0123456789].".gsub(/#{ARGV[0]}/,'')
I called the file like so:
ruby test.rb "\d+"
# => You should not see any number [].

Get last character in string

I want to get the last character in a string MY WAY - 1) Get last index 2) Get character at last index, as a STRING. After that I will compare the string with another, but I won't include that part of code here. I tried the code below and I get a strange number instead. I am using ruby 1.8.7.
Why is this happening and how do I do it ?
line = "abc;"
last_index = line.length-1
puts "last index = #{last_index}"
last_char = line[last_index]
puts last_char
Output-
last index = 3
59
Ruby docs told me that array slicing works this way -
a = "hello there"
a[1] #=> "e"
But, in my code it does not.
UPDATE:
I keep getting constant up votes on this, hence the edit. Using [-1, 1] is correct, however a better looking solution would be using just [-1]. Check Oleg Pischicov's answer.
line[-1]
# => ";"
Original Answer
In ruby you can use [-1, 1] to get last char of a string. Here:
line = "abc;"
# => "abc;"
line[-1, 1]
# => ";"
teststr = "some text"
# => "some text"
teststr[-1, 1]
# => "t"
Explanation:
Strings can take a negative index, which counts backwards from the end
of the String, and a length of how many characters you want (one in
this example).
Using String#slice as in OP's example: (will work only on ruby 1.9 onwards as explained in Yu Hau's answer)
line.slice(line.length - 1)
# => ";"
teststr.slice(teststr.length - 1)
# => "t"
Let's go nuts!!!
teststr.split('').last
# => "t"
teststr.split(//)[-1]
# => "t"
teststr.chars.last
# => "t"
teststr.scan(/.$/)[0]
# => "t"
teststr[/.$/]
# => "t"
teststr[teststr.length-1]
# => "t"
Just use "-1" index:
a = "hello there"
a[-1] #=> "e"
It's the simplest solution.
If you are using Rails, then apply the method #last to your string, like this:
"abc".last
# => "c"
You can use a[-1, 1] to get the last character.
You get an unexpected result because the return value of String#[] changed. You are using Ruby 1.8.7 while referring to the documentation for Ruby 2.0.
Prior to Ruby 1.9, it returns an integer character code. Since Ruby 1.9, it returns the character itself.
String#[] in Ruby 1.8.7:
str[fixnum] => fixnum or nil
String#[] in Ruby 2.0:
str[index] → new_str or nil
In Ruby you can use something like this:
ending = str[-n..-1] || str
This returns the last n characters, or the whole string if it is shorter than n.
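As a quick illustration of why the || str fallback is there, here is a small sketch. last_chars is a hypothetical wrapper name, and n is assumed to be a positive integer.

```ruby
# Hypothetical wrapper around the one-liner; n is assumed to be positive.
def last_chars(str, n)
  # The range yields nil when n exceeds the string length,
  # so fall back to the whole string in that case.
  str[-n..-1] || str
end

p last_chars("hello there", 3)  # "ere"
p last_chars("hello there", 1)  # "e"
p last_chars("hi", 5)           # "hi"
```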
With Rails (ActiveSupport) loaded, I would call #last on the string, as if it were an array, mostly because it reads more clearly.
To get the last character.
"hello there".last() #=> "e"
To get the last 3 characters you can pass a number to #last.
"hello there".last(3) #=> "ere"
The slice method will do this for you.
For example:
"hello".slice(-1)
# => "o"
Thanks
Your code more or less works: the 'strange number' you are seeing is the ASCII code for ;. Every character has a corresponding ASCII code (https://www.asciitable.com/). To convert it back, use puts last_char.chr, which should output ;.
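The relationship between the character and its code can be seen directly on Ruby 1.9 or later, where String#ord is available. This is just an illustrative sketch using the string from the question.

```ruby
line = "abc;"

# [-1, 1] returns a one-character string on both old and new Rubies.
last_char = line[-1, 1]

# ord gives the character code that Ruby 1.8's line[index] used to return.
code = last_char.ord

# chr converts the code back into a one-character string.
back = code.chr

p last_char  # ";"
p code       # 59
p back       # ";"
```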

How to validate that a string is a proper hexadecimal value in Ruby?

I am writing a 6502 assembler in Ruby. I am looking for a way to validate hexadecimal operands in string form. I understand that the String object provides a "hex" method to return a number, but here's a problem I run into:
"0A".hex #=> 10 - a valid hexadecimal value
"0Z".hex #=> 0 - invalid, produces a zero
"asfd".hex #=> 10 - Why 10? I guess it reads 'a' first and stops at 's'?
You will get some odd results by typing in a bunch of gibberish. What I need is a way to first verify that the value is a legit hex string.
I was playing around with regular expressions, and realized I can do this:
true if "0A" =~ /[A-Fa-f0-9]/
#=> true
true if "0Z" =~ /[A-Fa-f0-9]/
#=> true <-- PROBLEM
I'm not sure how to address this issue. I need to be able to verify that letters are only A-F and that if it is just numbers that is ok too.
I'm hoping to avoid spaghetti code, riddled with "if" statements. I am hoping that someone could provide a "one-liner" or some form of elegant code.
Thanks!
!str[/\H/] returns true when the string contains no invalid hex characters (\H matches any character that is not a hex digit).
String#hex does not interpret the whole string as hex, it extracts from the beginning of the string up to as far as it can be interpreted as hex. With "0Z", the "0" is valid hex, so it interpreted that part. With "asfd", the "a" is valid hex, so it interpreted that part.
One method (it compares against the canonical form, so it rejects strings with leading zeros such as "0A"):
str.to_i(16).to_s(16) == str.downcase
Another:
str =~ /\A[a-f0-9]+\Z/i # or simply /\A\h+\Z/ (see hirolau's answer)
About your regex: you have to use anchors (\A for the beginning of the string and \Z for the end of the string) to say that you want the full string to match. Also, the + repeats the match for one or more characters.
Note that you could use ^ (begin of line) and $ (end of line), but this would allow strings like "something\n0A" to pass.
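The difference between the anchors is easy to check in a small sketch. hex_string? is a hypothetical helper name; it uses \z rather than \Z, because \Z still allows a single trailing newline.

```ruby
# Hypothetical helper: true only when the entire string is hex digits.
# \A pins the start, \z pins the absolute end, \h matches one hex digit.
def hex_string?(str)
  !!(str =~ /\A\h+\z/)
end

p hex_string?("0A")              # true
p hex_string?("0Z")              # false
p hex_string?("asfd")            # false
p hex_string?("something\n0A")   # false

# Line anchors accept the multi-line string, which is the pitfall noted above.
p !!("something\n0A" =~ /^\h+$/) # true

# \Z tolerates one trailing newline, \z does not.
p !!("0A\n" =~ /\A\h+\Z/)        # true
p hex_string?("0A\n")            # false
```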
This is an old question, but I just had the issue myself. I opted for this in my code:
str =~ /^\h+$/
It has the added benefit of returning nil if str is nil.
Since Ruby has hex literals built in, you can eval the string and rescue the SyntaxError:
eval "0xA" # => 10
eval "0xZ" # => SyntaxError
You can wrap this in a method like
def is_hex?(str)
  begin
    eval("0x#{str}")
    true
  rescue SyntaxError
    false
  end
end
is_hex?('0A') # => true
is_hex?('0Z') # => false
Of course, since this uses eval, make sure you only pass trusted values to the method.
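If the eval is a concern, one alternative worth considering (not part of the answer above) is Kernel#Integer with an explicit base, which raises instead of guessing. hex_via_integer? is a hypothetical name, and this is only a sketch: Integer is more lenient than a strict digit check, since with base 16 it also accepts an optional 0x prefix and underscores between digits.

```ruby
# Hypothetical helper: lets Integer() do the parsing and treats
# a raised error as "not hex". No code from the string is executed.
def hex_via_integer?(str)
  Integer(str, 16)
  true
rescue ArgumentError, TypeError
  false
end

p hex_via_integer?("0A")    # true
p hex_via_integer?("0Z")    # false
p hex_via_integer?("asfd")  # false
p hex_via_integer?("0x0A")  # true, the prefix is accepted with base 16
```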

Ruby: How to get the first character of a string

How can I get the first character in a string using Ruby?
Ultimately what I'm doing is taking someone's last name and just creating an initial out of it.
So if the string was "Smith" I just want "S".
You can use Ruby's open classes to make your code much more readable. For instance, this:
class String
  def initial
    self[0,1]
  end
end
will allow you to use the initial method on any string. So if you have the following variables:
last_name = "Smith"
first_name = "John"
Then you can get the initials very cleanly and readably:
puts first_name.initial # prints J
puts last_name.initial # prints S
The other method mentioned here doesn't work on Ruby 1.8 (not that you should be using 1.8 anymore anyway!--but when this answer was posted it was still quite common):
puts 'Smith'[0] # prints 83
Of course, if you're not doing it on a regular basis, then defining the method might be overkill, and you could just do it directly:
puts last_name[0,1]
If you use a recent version of Ruby (1.9.0 or later), the following should work:
'Smith'[0] # => 'S'
If you use either 1.9.0+ or 1.8.7, the following should work:
'Smith'.chars.first # => 'S'
If you use a version older than 1.8.7, this should work:
'Smith'.split(//).first # => 'S'
Note that 'Smith'[0,1] does not work on 1.8, it will not give you the first character, it will only give you the first byte.
"Smith"[0..0]
works in both ruby 1.8 and ruby 1.9.
For completeness' sake: since Ruby 1.9, String#chr returns the first character of a string. It's still available in 2.0 and 2.1.
"Smith".chr #=> "S"
http://ruby-doc.org/core-1.9.3/String.html#method-i-chr
In MRI 1.8.7 or greater:
'foobarbaz'.each_char.first
Try this:
>> a = "Smith"
>> a[0]
=> "S"
OR
>> "Smith".chr
#=> "S"
In Rails
name = 'Smith'
name.first
>> s = 'Smith'
=> "Smith"
>> s[0]
=> "S"
Another option that hasn't been mentioned yet:
> "Smith".slice(0)
#=> "S"
Because of an annoying design choice in Ruby before 1.9 — some_string[0] returns the character code of the first character — the most portable way to write this is some_string[0,1], which tells it to get a substring at index 0 that's 1 character long.
Try this:
def word(string, num)
  string[0..(num-1)]
end
If you're using Rails You can also use truncate
> 'Smith'.truncate(1, omission: '')
#=> "S"
or for additional formatting:
> 'Smith'.truncate(4)
#=> "S..."
> 'Smith'.truncate(2, omission: '.')
#=> "S."
While this is definitely overkill for the original question, for a pure ruby solution, here is how truncate is implemented in rails
# File activesupport/lib/active_support/core_ext/string/filters.rb, line 66
def truncate(truncate_at, options = {})
  return dup unless length > truncate_at
  omission = options[:omission] || "..."
  length_with_room_for_omission = truncate_at - omission.length
  stop = if options[:separator]
    rindex(options[:separator], length_with_room_for_omission) || length_with_room_for_omission
  else
    length_with_room_for_omission
  end
  "#{self[0, stop]}#{omission}"
end
Another way is to use chars on the string:
def abbrev_name
  first_name.chars.first.capitalize + '.' + ' ' + last_name
end
Any of these will work on Ruby 1.9 or later:
name = 'Smith'
puts name[0..0] # => S
puts name[0] # => S
puts name[0,1] # => S
puts name[0].chr # => S

Remove a character at an index position in Ruby

Basically what the question says. How can I delete a character at a given index position in a string? The String class doesn't seem to have any methods to do this.
If I have a string "HELLO" I want the output to be this
["ELLO", "HLLO", "HELO", "HELO", "HELL"]
I do that using
d = Array.new(c.length){|i| c.slice(0, i)+c.slice(i+1, c.length)}
I don't know if using slice! will work here, because it will modify the original string, right?
Won't String#slice! do it? From ruby-doc.org:
str.slice!(fixnum) => fixnum or nil [...]
Deletes the specified portion from str, and returns the portion deleted.
You can use String#slice!, which removes the character at the given index in place. (delete_at is an Array method; String does not get it from Enumerable.)
Example:
mystring = "hello"
mystring.slice!(1) # mystring is now "hllo"
# now do something with mystring
$ cat m.rb
class String
  def maulin! n
    slice! n
    self
  end
  def maulin n
    dup.maulin! n
  end
end
$ irb
>> require 'm'
=> true
>> s = 'hello'
=> "hello"
>> s.maulin(2)
=> "helo"
>> s
=> "hello"
>> s.maulin!(1)
=> "hllo"
>> s
=> "hllo"
To avoid needing to monkey patch String you can make use of tap:
"abc".tap {|s| s.slice!(2) }
=> "ab"
If you need to leave your original string unaltered, call dup first, e.g. "abc".dup.tap { |s| s.slice!(2) }.
I did something like this
c.slice(0, i)+c.slice(i+1, c.length)
Where c is the string and i is the index position I want to delete. Is there a better way?
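For comparison, here is a sketch of the same idea wrapped in a small non-mutating helper. without_char_at is a hypothetical name; it builds a new string from the parts before and after the index, so the original is left alone.

```ruby
# Hypothetical helper: returns a copy of str with the character at index removed.
def without_char_at(str, index)
  # Everything before the index, plus everything after it.
  # to_s guards against nil when index is past the end of the string.
  str[0, index] + str[(index + 1)..-1].to_s
end

word = "HELLO"
variants = (0...word.length).map { |i| without_char_at(word, i) }

p variants  # ["ELLO", "HLLO", "HELO", "HELO", "HELL"]
p word      # "HELLO"
```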
