ruby regex: how to get group value - ruby

Here is my regex:
s = /(?<head>http|https):\/\/(?<host>[^\/]+)/.match("http://www.myhost.com")
How do I get the head and host groups?

s['head'] => "http"
s['host'] => "www.myhost.com"
You could also use URI...
1.9.3p327 > require 'uri'
=> true
1.9.3p327 > u = URI.parse("http://www.myhost.com")
=> #<URI::HTTP:0x007f8bca2239b0 URL:http://www.myhost.com>
1.9.3p327 > u.scheme
=> "http"
1.9.3p327 > u.host
=> "www.myhost.com"
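Both approaches above can be sketched together as follows. The helper name scheme_and_host is hypothetical, and the pattern uses https? as a shorthand for the http|https alternation in the question; note that match returns nil when nothing matches, so it is worth guarding before indexing.

```ruby
require 'uri'

# Hypothetical helper: returns [scheme, host] from the named groups,
# or nil when the pattern does not match at all.
def scheme_and_host(url)
  m = /(?<head>https?):\/\/(?<host>[^\/]+)/.match(url)
  return nil unless m      # match returns nil on failure
  [m[:head], m[:host]]     # named groups accept symbols as well as strings
end

p scheme_and_host("http://www.myhost.com")  # => ["http", "www.myhost.com"]
p scheme_and_host("not a url")              # => nil

# The URI library does the same job without a hand-written pattern.
uri = URI.parse("https://www.myhost.com/path")
p [uri.scheme, uri.host]                    # => ["https", "www.myhost.com"]
```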

Use captures:
string = ...
one, two, three = string.match(/pattern/).captures
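As a concrete sketch of the captures idiom, assuming a made-up date string and pattern (neither comes from the question): captures returns only the groups, without the full match, so it destructures cleanly.

```ruby
# Hypothetical sample input; the pattern has three capture groups.
date = "2024-01-15"
m = date.match(/(\d+)-(\d+)-(\d+)/)

# Guard against a failed match: calling captures on nil would raise NoMethodError.
year, month, day = m.captures if m

p [year, month, day]  # => ["2024", "01", "15"]
```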

You should probably use the uri library for this purpose as suggested above, but whenever you match a string against a regex with =~, you can grab the captured values from the special variables $1, $2, and so on:
"foo bar baz" =~ /(bar)\s(baz)/
$1
=> 'bar'
$2
=> 'baz'
and so on...

Related

Ruby - regex that matches string to pattern and detects unwanted occurrences [duplicate]

How do I match a string against a regex such that it will return true only if the whole string matches (not a substring)?
eg:
test(/ee/, "street") #=> returns false
test(/ee/, "ee") #=> returns true!
Thank you.
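You can match the beginning of the string with \A and the end with \Z (\Z also allows one trailing newline; use \z if the match must end at the very last character). In Ruby, ^ and $ match at the beginning and end of every line, not only of the whole string: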
>> "a\na" =~ /^a$/
=> 0
>> "a\na" =~ /\Aa\Z/
=> nil
>> "a\na" =~ /\Aa\na\Z/
=> 0
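The anchoring idea can be wrapped in a small predicate, sketched here under a few assumptions: the name whole_match? is hypothetical, match? requires Ruby 2.4 or later, and \z is used instead of \Z so that a trailing newline does not count as a match.

```ruby
# Hypothetical helper: true only when the pattern matches the entire string.
# Interpolating a Regexp embeds it as a group that keeps its own flags.
def whole_match?(pattern, string)
  /\A#{pattern}\z/.match?(string)
end

p whole_match?(/ee/, "street")  # => false
p whole_match?(/ee/, "ee")      # => true
p whole_match?(/ee/, "ee\n")    # => false, because \z rejects the trailing newline
p whole_match?(/EE/i, "ee")     # => true, the /i flag survives interpolation
```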
This seems to work for me, although it does look ugly (there is probably a more attractive way to do it):
!(string =~ /^ee$/).nil?
Of course everything inside // above can be any regex you want.
Example:
>> string = "street"
=> "street"
>> !(string =~ /^ee$/).nil?
=> false
>> string = "ee"
=> "ee"
>> !(string =~ /^ee$/).nil?
=> true
Note: Tested in Rails console with ruby (1.8.7) and rails (3.1.1)
So, what you are asking is how to test whether the two strings are equal, right? Just use string equality! This passes every single one of the examples that both you and Tomas cited:
'ee' == 'street' # => false
'ee' == 'ee' # => true
"a\na" == 'a' # => false
"a\na" == "a\na" # => true

Why a dangerous method doesn't work with a character element of String in Ruby?

When I apply the upcase! method I get:
a="hello"
a.upcase!
a # Shows "HELLO"
But in this other case:
b="hello"
b[0].upcase!
b[0] # Shows h
b # Shows hello
I don't understand why upcase! applied to b[0] doesn't have any effect.
b[0] returns a new String every time. Check out the object id:
b = 'hello'
# => "hello"
b[0].object_id
# => 1640520
b[0].object_id
# => 25290780
b[0].object_id
# => 24940620
When you select an individual character in a string, you are not referencing that specific character; you are calling an accessor/mutator method that performs the evaluation:
2.0.0-p643 :001 > hello = "ruby"
=> "ruby"
2.0.0-p643 :002 > hello[0] = "R"
=> "R"
2.0.0-p643 :003 > hello
=> "Ruby"
When you call a dangerous method on the result, the accessor first returns a new string holding that character, and the method then modifies that new string in place. Because the new string has no connection to the original, the original is left unchanged.
2.0.0-p643 :004 > hello = "ruby"
=> "ruby"
2.0.0-p643 :005 > hello[0].upcase!
=> "R"
2.0.0-p643 :006 > hello
=> "ruby"
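To actually change the first character, write the result back through element assignment, or use a method that mutates the whole string. A minimal sketch, with dup added only as a precaution in case string literals are frozen in your environment:

```ruby
b = "hello".dup      # dup guards against frozen string literals
b[0].upcase!         # modifies a temporary one-character string only
p b                  # => "hello"

b[0] = b[0].upcase   # element assignment writes back into b itself
p b                  # => "Hello"

c = "hELLO".dup
c.capitalize!        # in place as well; note it also downcases the rest
p c                  # => "Hello"
```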

Regular expression for only 2 letters

I need to create regular expression for 2 and only 2 letters. I understood it has to be the following /[a-z]{2}/i, but it matches any string with 2 or more letters. Here is what I get:
my_reg_exp = /[a-z]{2}/i
my_reg_exp.match('aa') # => #<MatchData "aa">
my_reg_exp.match('AA') # => #<MatchData "AA">
my_reg_exp.match('a') # => nil
my_reg_exp.match('aaa') # => #<MatchData "aa">
Any suggestion?
You can add the anchors like this:
my_reg_exp = /^[a-z]{2}$/i
Test:
my_reg_exp.match('aaa')
#=> nil
my_reg_exp.match('aa')
#=> #<MatchData "aa">
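One caveat, sketched below with a made-up multi-line input: in Ruby, ^ and $ anchor to lines, so the pattern above still accepts a string whose first line happens to be two letters. Anchoring with \A and \z ties the match to the whole string. (match? assumes Ruby 2.4 or later.)

```ruby
line_anchored   = /^[a-z]{2}$/i     # anchors at line boundaries
string_anchored = /\A[a-z]{2}\z/i   # anchors at string boundaries

multi = "aa\nbbb"                   # hypothetical multi-line input

p line_anchored.match?(multi)       # => true, the first line is "aa"
p string_anchored.match?(multi)     # => false
p string_anchored.match?("aa")      # => true
p string_anchored.match?("aaa")     # => false
```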
Hao's solution isn't locale sensitive. If this is important for your use case:
/\A[[:alpha:]]{2}\z/
2.0.0-p451 :005 > 'aba' =~ /\A[[:alpha:]]{2}\Z/
=> nil
2.0.0-p451 :006 > 'ab' =~ /\A[[:alpha:]]{2}\Z/
=> 0
2.0.0-p451 :007 > 'xy' =~ /\A[[:alpha:]]{2}\Z/
=> 0
2.0.0-p451 :008 > 'zxy' =~ /\A[[:alpha:]]{2}\Z/
=> nil
Per usual, if you need further assistance, leave a comment.
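The difference between [a-z] and [[:alpha:]] shows up with non-ASCII letters. A small sketch, assuming UTF-8 strings (the sample word is made up and written with \u escapes so that the source file encoding does not matter):

```ruby
ascii_only = /\A[a-z]{2}\z/i
any_alpha  = /\A[[:alpha:]]{2}\z/

word = "\u00e9\u00e0"            # two accented letters, e-acute and a-grave

p ascii_only.match?(word)        # => false
p any_alpha.match?(word)         # => true
p any_alpha.match?("ab")         # => true
p any_alpha.match?("a1")         # => false, digits are not alphabetic
```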
You can use /\b[a-z]{2}\b/i to match a two-letter word. \b matches a word boundary.
This means you can scan a string to find all occurrences:
'Foo is a bar'.scan(/\b[a-z]{2}\b/i) #=> ["is"]
Or find the first match in a string using:
'a bc def'[/\b[a-z]{2}\b/i] # => "bc"

how to remove backslash from a string containing an array in ruby

I have a string like this
a="[\"6000208900\",\"600020890225\",\"600900231930\"]"
#expected result [6000208900,600020890225,600900231930]
I am trying to remove the backslash from the string.
a.gsub!(/^\"|\"?$/, '')
Inside a double-quoted string (""), any inner double quote must be escaped with \. The backslash belongs to the literal's syntax, not to the string's contents, so there is nothing to remove.
Use puts and you can see it is not there.
a = "[\"6000208902912790\"]"
puts a # => ["6000208902912790"]
Or use JSON
irb(main):001:0> require 'json'
=> true
irb(main):002:0> a = "[\"6000208902912790\"]"
=> "[\"6000208902912790\"]"
irb(main):003:0> b = JSON.parse a
=> ["6000208902912790"]
irb(main):004:0> b
=> ["6000208902912790"]
irb(main):005:0> b.to_s
=> "[\"6000208902912790\"]"
update (as per the last edit of OP)
irb(main):002:0> a = "[\"6000208900\",\"600020890225\",\"600900231930\"]"
=> "[\"6000208900\",\"600020890225\",\"600900231930\"]"
irb(main):006:0> a.scan(/\d+/).map(&:to_i)
=> [6000208900, 600020890225, 600900231930]
irb(main):007:0>
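Since the string is valid JSON, the two ideas above can also be combined: parse it, then convert each element. This is only a sketch; Integer() is chosen here because it raises on non-numeric input, whereas to_i would silently return 0.

```ruby
require 'json'

a = "[\"6000208900\",\"600020890225\",\"600900231930\"]"

# JSON.parse yields an array of strings; Integer(s, 10) converts each one strictly.
numbers = JSON.parse(a).map { |s| Integer(s, 10) }

p numbers  # => [6000208900, 600020890225, 600900231930]
```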
The code a.gsub!(/^\"|\"?$/, '') can't remove the double quote characters because they are not at the beginning and the end of the string. To get what you want try this:
a.gsub(/((?<=^\[)")|("(?=\]$))/, '')
try this:
=> a = "[\"6000208902912790\"]"
=> a.chars.select{ |x| x =~ %r|\d| }.join
=> "6000208902912790"
=> [a.chars.select { |x| x =~ %r|\d| }.join]
=> ["6000208902912790"] # <= array with string
=> [a.chars.select { |x| x =~ %r|\d| }.join].to_s
=> "[\"6000208902912790\"]" # <= come back :)
a="["6000208902912790"]" raises an `unexpected tINTEGER` syntax error,
so a="[\"6000208902912790\"]" is written with a \ before each inner double quote.
If you want the string without the double quotes, remove them.
Do this:
a.gsub!(/"/, '')

Ruby regular expressions

I understand how to check for a pattern in string with regexp in ruby. What I am confused about is how to save the pattern found in string as a separate string.
I thought I could say something like:
if string =~ /regexp/
pattern = string.grep(/regexp/)
and then I could be on with my life. However, this isn't working as expected and is returning the entire original string. Any advice?
You're looking for string.match() in ruby.
irb(main):003:0> a
=> "hi"
irb(main):004:0> a=~/(hi)/
=> 0
irb(main):005:0> a.match(/hi/)
=> #<MatchData:0x5b6e8>
irb(main):006:0> a.match(/hi/)[0]
=> "hi"
irb(main):007:0> a.match(/h(i)/)[1]
=> "i"
irb(main):008:0>
To work with what you just matched in the if condition, you can also use $&, $1..$9 and $~, like this:
irb(main):009:0> if a =~ /h(i)/
irb(main):010:1> puts("%s %s %s %s"%[$&,$1,$~[0],$~[1]])
irb(main):011:1> end
hi i hi i
=> nil
irb(main):012:0>
You can also use the special variables $& and $1-$n, like so:
if "regex" =~ /reg(ex)/
puts $&
puts $1
end
Outputs:
regex
ex
$~ also contains the MatchData object. See also: http://www.regular-expressions.info/ruby.html.
I prefer some shortcuts like:
email = "Khaled Al Habache <khellls#gmail.com>"
email[/<(.*?)>/, 1] # => "khellls#gmail.com"
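The same String#[] shortcut also accepts a group name and returns nil when nothing matches. A brief sketch, where the group name addr and the second input string are made up for illustration:

```ruby
email = "Khaled Al Habache <khellls#gmail.com>"

p email[/<(.*?)>/, 1]               # => "khellls#gmail.com"
p email[/<(?<addr>.*?)>/, "addr"]   # => "khellls#gmail.com", selected by group name
p "no brackets here"[/<(.*?)>/, 1]  # => nil, no match and no exception
```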
