How do I use a ruby regex to get a substring? - ruby

I want to get the numbers out of strings such as:
person_3
person_34
person_356
city_4
city_15
etc...
It seems to me that the following should work:
string[/[0-9]*/]
but this always spits out an empty string.

[0-9]* matches "0 or more" digits, so it succeeds at the very beginning of the string by matching zero characters, and the result is "". [0-9]+ requires "1 or more" digits, so the match moves ahead to the first run of digits and works as you expect:
irb(main):001:0> x = "test 92"
=> "test 92"
irb(main):003:0> x[/\d*/]
=> ""
irb(main):005:0> x.index(/\d*/)
=> 0
irb(main):004:0> x[/\d+/]
=> "92"
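Applied to the strings from the question, the fix can be sketched as follows. The helper name extract_number is hypothetical (it does not appear in the original answer), and converting the match to an Integer is an added assumption about what the asker wants.

```ruby
# Hypothetical helper: returns the first run of digits as an Integer,
# or nil when the string contains no digits at all.
def extract_number(str)
  digits = str[/\d+/]   # \d+ needs at least one digit, so it cannot match ""
  digits && digits.to_i
end

%w[person_3 person_34 person_356 city_4 city_15].each do |name|
  puts "#{name} -> #{extract_number(name)}"
end
```

Drop the to_i call if the digits should stay a String.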

Related

Ruby - regex that matches string to pattern and detects unwanted occurrences [duplicate]

How do I match a string against a regex such that it returns true only if the whole string matches (not just a substring)?
e.g.:
test( /ee/ , "street" ) #=> returns false
test( /ee/ , "ee" ) #=> returns true!
Thank you.
You can match the beginning of the string with \A and the end with \Z. In Ruby, ^ and $ match the beginning and end of each line, not only of the whole string:
>> "a\na" =~ /^a$/
=> 0
>> "a\na" =~ /\Aa\Z/
=> nil
>> "a\na" =~ /\Aa\na\Z/
=> 0
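One detail worth knowing: \Z also matches just before a single trailing newline, while lowercase \z matches only at the true end of the string. A minimal sketch of a whole-string test built on \A and \z follows; the helper name whole_match? is hypothetical and stands in for the test(...) call in the question.

```ruby
# Hypothetical helper: true only when the pattern covers the entire string.
# \A anchors at the start of the string and \z at the very end.
def whole_match?(pattern, str)
  anchored = Regexp.new("\\A(?:#{pattern.source})\\z", pattern.options)
  !anchored.match(str).nil?
end

puts whole_match?(/ee/, "street")  # false
puts whole_match?(/ee/, "ee")      # true
puts whole_match?(/ee/, "ee\n")    # false, \z does not skip a trailing newline
puts !("ee\n" =~ /\Aee\Z/).nil?    # true, \Z allows one trailing newline
```

Wrapping the source in (?:...) keeps alternations such as /a|b/ inside the anchors.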
This seems to work for me, although it does look ugly (there is probably a more attractive way to do it):
!(string =~ /^ee$/).nil?
Of course everything inside // above can be any regex you want.
Example:
>> string = "street"
=> "street"
>> !(string =~ /^ee$/).nil?
=> false
>> string = "ee"
=> "ee"
>> !(string =~ /^ee$/).nil?
=> true
Note: Tested in Rails console with ruby (1.8.7) and rails (3.1.1)
So, what you are asking is how to test whether the two strings are equal, right? Just use string equality! This passes every single one of the examples that both you and Tomas cited:
'ee' == 'street' # => false
'ee' == 'ee' # => true
"a\na" == 'a' # => false
"a\na" == "a\na" # => true

Basic Matching in Ruby

I am working through a book and it gives this example
x = "This is a test".match(/(\w+) (\w+)/)
The example is about the parentheses (capture groups) and being able to access each captured part separately.
When I put the expression above into my IRB I get:
#<MatchData "This is" 1:"This" 2:"is">
Why doesn't this also include "a" and "test"?
Would I have to include .match(/(\w+) (\w+) (\w+) (\w+)/) ?
The 'match' method does not apply the regex globally; it returns only the first match. Use the 'scan' method instead: it returns an array of all matches, with one sub-array of captures per match when the regex contains groups.
[~]$ irb
1.8.7-p371 :001 > x = "This is a test".match(/(\w+) (\w+)/)
=> #<MatchData "This is" 1:"This" 2:"is">
1.8.7-p371 :002 > x = "This is a test".scan(/(\w+) (\w+)/)
=> [["This", "is"], ["a", "test"]]
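As a small sketch of the same distinction: when the pattern has no capture groups, scan returns a flat array of matched strings rather than nested pairs. The method names first_pair and all_words are hypothetical, chosen only for this illustration.

```ruby
SENTENCE = "This is a test"

# match stops at the first hit; captures lists the text of its groups.
def first_pair(str)
  m = str.match(/(\w+) (\w+)/)
  m && m.captures
end

# Without groups, scan returns a flat array with one string per match.
def all_words(str)
  str.scan(/\w+/)
end

p first_pair(SENTENCE)  # ["This", "is"]
p all_words(SENTENCE)   # ["This", "is", "a", "test"]
```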

Extract the last word in sentence/string?

I have an array of strings, of different lengths and contents.
Now I'm looking for an easy way to extract the last word from each string, without knowing how long that word is or how long the string is.
Something like this (pseudocode):
array.each { |string| puts string.fetch(" ", last) }
This should work just fine:
"my random sentence".split.last # => "sentence"
To exclude punctuation, delete it:
"my random sentence..,.!?".split.last.delete('.!?,') #=> "sentence"
To get the "last words" as an array from an array of strings, use collect:
["random sentence...", "lorem ipsum!!!"].collect { |s| s.split.last.delete('.!?,') } # => ["sentence", "ipsum"]
array_of_strings = ["test 1", "test 2", "test 3"]
array_of_strings.map{|str| str.split.last} #=> ["1","2","3"]
["one two", "three four five"].collect { |s| s.split.last }
=> ["two", "five"]
"a string of words!".match(/(.*\s)*(.+)\Z/)[2] #=> 'words!'
This captures everything after the last whitespace, so the result includes the punctuation.
To extract that from an array of strings, use it with collect:
["a string of words", "Something to say?", "Try me!"].collect {|s| s.match(/(.*\s)*(.+)\Z/)[2] } #=> ["words", "say?", "me!"]
The problem with all of these solutions is that they only consider spaces as word separators. With a regex you can treat any non-word character as a word separator. Here is what I use:
str = 'Non-space characters, like foo=bar.'
str.split(/\W/).last
# => "bar"
This is the simplest way I can think of.
hostname> irb
irb(main):001:0> str = 'This is a string.'
=> "This is a string."
irb(main):002:0> words = str.split(/\s+/).last
=> "string."
irb(main):003:0>
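The approaches above can be combined into one sketch that drops trailing punctuation and handles non-space separators by scanning for runs of word characters. The helper name last_word is hypothetical, and treating "word" as \w+ is an assumption.

```ruby
# Hypothetical helper: last run of word characters, or nil if there is none.
def last_word(str)
  str.scan(/\w+/).last
end

sentences = ["my random sentence..,.!?", "Non-space characters, like foo=bar.", "Try me!"]
p sentences.map { |s| last_word(s) }  # ["sentence", "bar", "me"]
```

Because \w excludes apostrophes and hyphens, a word like "don't" would yield "t"; widen the character class if that matters for your data.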

How do I match a 10 character sub-string starting with 'H' in a longer string with Ruby?

I have the following string:
/Users/patelc75/Documents/code/haloror/dialup/H200000787_1313406125/H200000787_1313389058_1.xml
In Ruby, how do I extract the first 10-character substring that starts with the letter H and is followed by exactly 9 digits (digits only)? In the example above, the substring would be H200000787.
The String#[] method is what you need:
str = '/Users/patelc75/Documents/code/haloror/dialup/H200000787_1313406125/H200000787_1313389058_1.xml'
puts str[/H\d{9}/] #=> H200000787
irb(main):001:0> s = "/Users/patelc75/Documents/code/haloror/dialup/H200000787_1313406125/H200000787_1313389058_1.xml"
=> "/Users/patelc75/Documents/code/haloror/dialup/H200000787_1313406125/H200000787_1313389058_1.xml"
irb(main):002:0> s =~ /H\d{9}/
=> 46
irb(main):003:0> $&
=> "H200000787"
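Note that /H\d{9}/ also matches when more than nine digits follow the H. If "9 digits" is meant strictly, a negative lookahead can reject a tenth digit. This is a sketch under that stricter reading, which is an assumption; the helper name first_h_code is hypothetical.

```ruby
# Hypothetical helper: first "H" followed by exactly nine digits.
# The negative lookahead (?!\d) rejects a tenth digit; this stricter reading
# of "9 digits" is an assumption, not part of the original answers.
def first_h_code(path)
  path[/H\d{9}(?!\d)/]
end

path = "/Users/patelc75/Documents/code/haloror/dialup/H200000787_1313406125/H200000787_1313389058_1.xml"
puts first_h_code(path)            # H200000787
p first_h_code("H1234567890.xml")  # nil, ten digits follow the H
```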

Ruby regular expressions

I understand how to check for a pattern in a string with a regexp in Ruby. What I am confused about is how to save the matched text as a separate string.
I thought I could say something like:
if string =~ /regexp/
pattern = string.grep(/regexp/)
and then I could be on with my life. However, this isn't working as expected and is returning the entire original string. Any advice?
You're looking for string.match() in ruby.
irb(main):003:0> a
=> "hi"
irb(main):004:0> a=~/(hi)/
=> 0
irb(main):005:0> a.match(/hi/)
=> #<MatchData:0x5b6e8>
irb(main):006:0> a.match(/hi/)[0]
=> "hi"
irb(main):007:0> a.match(/h(i)/)[1]
=> "i"
irb(main):008:0>
To work with what you just matched in the if condition, you can also use $&, $1..$9, and $~, like this:
irb(main):009:0> if a =~ /h(i)/
irb(main):010:1> puts("%s %s %s %s"%[$&,$1,$~[0],$~[1]])
irb(main):011:1> end
hi i hi i
=> nil
irb(main):012:0>
You can also use the special variables $& and $1-$n, like so:
if "regex" =~ /reg(ex)/
puts $&
puts $1
end
Outputs:
regex
ex
$~ also contains the MatchData object. See also: http://www.regular-expressions.info/ruby.html.
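If the punctuation-style globals feel cryptic, Regexp.last_match returns the same MatchData object as $~. A minimal sketch, with the hypothetical helper name match_parts:

```ruby
# Hypothetical helper: collects the pieces of the most recent match.
def match_parts(str, pattern)
  return nil unless str =~ pattern
  m = Regexp.last_match   # the same MatchData object as $~
  { whole: m[0], group1: m[1], before: m.pre_match, after: m.post_match }
end

parts = match_parts("a regex here", /reg(ex)/)
puts parts[:whole]   # regex
puts parts[:group1]  # ex
p parts[:before]     # "a "
p parts[:after]      # " here"
```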
I prefer some shortcuts like:
email = "Khaled Al Habache <khellls#gmail.com>"
email[/<(.*?)>/, 1] # => "khellls#gmail.com"
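The same shortcut also accepts a group name instead of an index, which can read more clearly when a pattern has several groups. A small sketch; the helper name address_in_brackets and the group name addr are hypothetical.

```ruby
# Hypothetical helper: text between the first "<" and the next ">", or nil.
def address_in_brackets(str)
  str[/<(?<addr>.*?)>/, "addr"]   # the second argument may be a group name
end

puts address_in_brackets("Khaled Al Habache <khellls#gmail.com>")  # khellls#gmail.com
p address_in_brackets("no brackets here")                          # nil
```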
