Ruby: get value between parentheses (without the parentheses) - ruby

I have strings like "untitled", "untitled(1)" or "untitled(2)".
I want to get the last integer value between parentheses, if there is one. So far I have tried a lot of regexes; the ones that make sense to me (I am new to regex) look like this:
number= string[/\([0-9]+\)/]
number= string[/\(([0-9]+)\)/]
but it still returns the value with the parentheses.
If there is no matching pair of parentheses, getting an empty string (or nil) would be nice. In a case such as "untitled($)", getting the '$' character, nil, or an empty string would all be fine. For "untitled(3) animal(4)", I want to get 4.
I have looked at a lot of topics about how to do this, but it never seems to work. What am I missing here?

/(?<=\()\w+(?=\)$)/ matches one or more word characters (letter, number, underscore) within parentheses, right before the end of the line:
words = %w[
untitled
untitled(1)
untitled(2)
untitled(foo)
unti(tle)d
]
p words.map { |word| word[/(?<=\()\w+(?=\)$)/] }
# => [nil, "1", "2", "foo", nil]

When you use a Regexp as the argument to String#[], you can optionally pass the index of a capture group to extract just that group.
If a Regexp is supplied, the matching portion of the string is returned. If a capture follows the regular expression, which may be a capture group index or name, that component of the MatchData is returned instead.
string = "untitled(1)"
number = string[/\(([0-9]+)\)/, 1]
puts number
#=> 1
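If the parenthesized number is not guaranteed to sit at the very end of the string, one option is to collect every match with String#scan and keep the last one. This is only a sketch; the helper name last_parenthesized_number is made up for illustration.

```ruby
# Hypothetical helper: returns the last run of digits wrapped in
# parentheses as a String, or nil when there is none.
def last_parenthesized_number(string)
  # With one capture group, scan returns an array of one-element arrays,
  # so flatten gives a flat list of the captured digit strings.
  string.scan(/\((\d+)\)/).flatten.last
end

p last_parenthesized_number("untitled")              # => nil
p last_parenthesized_number("untitled(1)")           # => "1"
p last_parenthesized_number("untitled(3) animal(4)") # => "4"
p last_parenthesized_number("untitled($)")           # => nil
```

Call to_i on the result if an Integer is needed rather than a String.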

Related

Use regular expression to fetch 3 groups from string

This is my expected result: input one string and get three strings back.
I have no idea how to do this with a regex in Ruby.
This is my rough idea:
match(/(.*?)(_)(.*?)(\d+)/)
Input and expected output
# "R224_OO2003" => R224, OO, 2003
# "R2241_OOP2003" => R2241, OOP, 2003
If the example description I gave in my comment on the question is correct, you need a very straightforward regex:
r = /(.+)_(.+)(\d{4})/
Then:
"R224_OO2003".scan(r).flatten #=> ["R224", "OO", "2003"]
"R2241_OOP2003".scan(r).flatten #=> ["R2241", "OOP", "2003"]
Assuming that your three parts consist of (R and one or more digits), then an underbar, then (one or more non-whitespace characters), before finally (a 4-digit numeric date), then your regex could be something like this:
^(R\d+)_(\S+)(\d{4})$
In Ruby, ^ and $ anchor the start and end of a line (use \A and \z to anchor the whole string); for a single-line input like this the effect is the same. \d+ matches one or more digits, while \S+ matches one or more non-whitespace characters. \d{4} matches exactly four digits.
To recover data from the match, you can either use the predefined globals that line up with your groups, or you can use named captures.
To use the match globals, just use $1, $2, and $3. In general, you can figure out the number to use by counting the left parentheses up to the group you want.
To use named captures, include ?<name> right after the left paren of a particular group. For example:
x = "R2241_OOP2003"
match_data = /^(?<first>R\d+)_(?<second>\S+)(?<third>\d{4})$/.match(x)
puts match_data['first'], match_data['second'], match_data['third']
yields
R2241
OOP
2003
as expected.
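The numbered globals mentioned above can be sketched in the same way. The helper name split_code is hypothetical; the pattern is the one from this answer.

```ruby
# Hypothetical helper: splits a code such as "R224_OO2003" into three parts.
def split_code(code)
  # =~ returns the match position (or nil) and, on success, sets $1, $2, $3
  # to the text captured by the first, second and third groups.
  if code =~ /^(R\d+)_(\S+)(\d{4})$/
    [$1, $2, $3]
  end
end

p split_code("R224_OO2003")   # => ["R224", "OO", "2003"]
p split_code("R2241_OOP2003") # => ["R2241", "OOP", "2003"]
p split_code("no match here") # => nil
```

Named captures tend to stay readable when the pattern grows, while the globals are convenient for short one-off matches.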
As long as your pattern covers all possibilities, then you just need to use the match object to return the 3 strings:
my_match = "R224_OO2003".match(/(.*?)(_)(.*?)(\d+)/)
#=> #<MatchData "R224_OO2003" 1:"R224" 2:"_" 3:"OO" 4:"2003">
puts my_match[0] #=> "R224_OO2003"
puts my_match[1] #=> "R224"
puts my_match[2] #=> "_"
puts my_match[3] #=> "OO"
puts my_match[4] #=> "2003"
A MatchData object holds each capture group starting at index [1]. As you can see, index [0] returns the entire matched string. If you don't want to capture the "_", you can leave its parentheses out.
Also, I'm not sure you are getting what you want with the part:
(.*?)
this says: zero or more of any character, matched lazily (as few characters as possible).
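As a sketch of leaving the underscore uncaptured, MatchData#captures returns just the groups without the whole match. The helper name three_parts is made up for this example.

```ruby
# Hypothetical helper: returns the three captured parts as an array,
# or nil when the pattern does not match.
def three_parts(input)
  # The underscore is matched but not wrapped in parentheses,
  # so it does not appear among the captures.
  match = input.match(/(.*?)_(.*?)(\d+)/)
  match && match.captures
end

p three_parts("R224_OO2003")   # => ["R224", "OO", "2003"]
p three_parts("R2241_OOP2003") # => ["R2241", "OOP", "2003"]
p three_parts("nounderscore")  # => nil
```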

Take an array and a letter as arguments and return a new array with words that contain that letter

I can run a search, find the elements I want, and return the words containing that letter. But when I start adding arguments, it doesn't work. I tried select with include?, and it throws an error saying "private method". This is my code, which returns what I am expecting:
my_array = ["wants", "need", 3, "the", "wait", "only", "share", 2]
def finding_method(source)
  words_found = source.grep(/t/) # I just picked a random letter
  print words_found
end
puts finding_method(my_array)
# => ["wants", "the", "wait"]
I need to add the second argument, but it breaks:
def finding_method(source, x)
  words_found = source.grep(/x/)
  print words_found
end
puts finding_method(my_array, "t")
This doesn't work (it returns an empty array because no element contains an 'x'), so I don't know how to pass an argument. Maybe I'm using the wrong method to do what I'm after. I have to define 'x', but I'm not sure how to do that. Any help would be great.
Regular expressions support string interpolation just like strings.
/x/
looks for the character x.
/#{x}/
will first interpolate the value of the variable and produce /t/, which does what you want. Mostly.
Note that if you are trying to search for any text that might have any meaning in regular expression syntax (like . or *), you should escape it:
/#{Regexp.quote(x)}/
That's the correct answer for any situation where you are including literal strings in regular expression that you haven't built yourself specifically for the purpose of being a regular expression, i.e. 99% of cases where you're interpolating variables into regexps.
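Putting the interpolation and escaping together, the method from the question might look like the sketch below. The parameter name letter is an assumption, and the method returns the array instead of printing it.

```ruby
# Sketch of the two-argument version: returns the elements of source
# that contain the given text.
def finding_method(source, letter)
  # Regexp.escape (an alias of Regexp.quote) makes characters such as
  # "." or "*" match literally. Non-string elements never match a Regexp,
  # so grep silently skips the integers in the array.
  source.grep(/#{Regexp.escape(letter)}/)
end

my_array = ["wants", "need", 3, "the", "wait", "only", "share", 2]
p finding_method(my_array, "t")      # => ["wants", "the", "wait"]
p finding_method(["a.b", "ab"], ".") # => ["a.b"]
```

Without the escape, the second call would also return "ab", because an unescaped dot matches any character.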

String comparison by splitting in jruby

I am trying to get the numeric value out of a string which looks like "ABCDEFG [1]". Here I am trying to get 1. I tried to use the split method on this string but didn't succeed. Any solution is appreciated. Thanks!
Here you go:
def get_numeric_value(str)
  # Match a number surrounded by square brackets
  matchdata = str.match(/\[\d+\]/)
  if matchdata
    # Get the first matched substring, and remove the first and last characters
    # (which will be the square brackets)
    number = matchdata[0][1..-2]
    number.to_i
  else
    nil
  end
end
get_numeric_value "ABCDEFG [1]" #=> 1
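A shorter variant, offered only as a sketch, uses a capture group with String#[] so the brackets never need to be stripped by hand. The helper name bracketed_number is hypothetical.

```ruby
# Hypothetical helper: returns the integer inside the first pair of
# square brackets, or nil when there is no such number.
def bracketed_number(str)
  # The second argument 1 selects the first capture group (the digits only).
  digits = str[/\[(\d+)\]/, 1]
  digits && digits.to_i
end

p bracketed_number("ABCDEFG [1]")  # => 1
p bracketed_number("ABCDEFG [42]") # => 42
p bracketed_number("ABCDEFG")      # => nil
```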

ruby include, dynamic letters and symbols

Say I build an array like this:
:001 > holder = "This.File.Q99P84.Is.Awesome"
=> "This.File.Q99P84.Is.Awesome"
:002 > name = holder.split(".")
=> ["This", "File", "Q99P84", "Is", "Awesome"]
Now, I can do:
name[2].include? 'Q99P84'
Instead of putting in 'Q99P84', I want to put in something like 'symbol for Q followed by symbol for number, symbol for number, symbol for P, symbol for number, symbol for number', so the .include? call will be dynamic. That way any file name I load that contains Q##P## will return true.
I'm pretty sure this is possible; I just don't know exactly what to search for. If you know the answer, can you link me to the documentation?
What you're looking for is regular expression matching. The Ruby regexp object will help you. What you want is something like
/Q\d+P\d+/.match(name[2])
...which will return a truthy value if name[2] has a string which matches a character Q, one or more digits (0-9), a character P, then one or more digits. This is probably too flexible a match if the pattern you want has exactly two digits in those number spaces; for that you might try a more specific pattern:
/Q\d\dP\d\d/.match(name[2])
One way to do this is through regular expressions ('regex' for short). Two good sources of information are Regular-Expressions.info and Rubular, which is more Ruby-centric.
One way to use a regex on a string is the String#match method:
names[2].match(/Q\d\dP\d\d/) # \d stands for digit
This will return a MatchData object if there is a match and nil if there isn't.
"Q42P67".match(/Q\d\dP\d\d/) #=> #<MatchData "Q42P67">
"H33P55".match(/Q\d\dP\d\d/) #=> nil
This is helpful in an if condition, since a MatchData object is 'truthy' and nil is 'falsy'.
names[2] = "Q42P67"
if names[2].match(/Q\d\dP\d\d/)
  # Will execute code here
end
names[2] = "H33P55"
if names[2].match(/Q\d\dP\d\d/)
  # Will not execute code here
end
I hope that helps until you dig further into your study of Regular Expressions.
EDIT:
Note that the Q and P in /Q\d\dP\d\d/ are capital letters. If case is not important, you can append an 'i' for case-insensitivity. /Q\d\dP\d\d/i will match both "Q42P67" and "q42P67".
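To tie this back to the original file-name question, here is a minimal sketch that checks whether any dot-separated part has the Q##P## shape. The helper name tagged_file? is made up, and String#match? assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: true when any dot-separated part of the file name
# is exactly Q, two digits, P, two digits.
def tagged_file?(file_name)
  # \A and \z anchor the pattern to the whole part, so "XQ99P84Y" is rejected.
  # match? returns a plain true/false without building a MatchData object.
  file_name.split(".").any? { |part| part.match?(/\AQ\d\dP\d\d\z/) }
end

p tagged_file?("This.File.Q99P84.Is.Awesome") # => true
p tagged_file?("This.File.H33P55.Is.Awesome") # => false
p tagged_file?("This.File.Q9P84.Is.Awesome")  # => false
```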

understanding regexp with array notation

I've met this code snippet:
erb = "#coding:UTF-8 _erbout = ''; _erbout.concat ..." # string is cut
erb[/\A(#coding[:=].*\r?\n)/, 1]
I know how regular expressions work, but I am confused by the array notation. What does it mean to place a regexp in [], and what does the second argument 1 mean?
str[regexp] is actually a method of class String; you can find it here: http://www.ruby-doc.org/core/classes/String.html#M001128
The second argument 1 returns the text matching the first subpattern, #coding[:=].*\r?\n. Another example to make this clearer:
"ab123baab"[/(\d+)(ba+).*/, 0] # returns "123baab", the complete matched text; the 0 can be omitted
"ab123baab"[/(\d+)(ba+).*/, 1] # returns "123", since the first subpattern is (\d+)
"ab123baab"[/(\d+)(ba+).*/, 2] # returns "baa", since the second subpattern is (ba+)
The brackets are a method of String. See http://www.ruby-doc.org/core/classes/String.html:
If a Regexp is supplied, the matching portion of str is returned. If a numeric or name parameter follows the regular expression, that component of the MatchData is returned instead. If a String is given, that string is returned if it occurs in str. In both cases, nil is returned if there is no match.
The 1 means to return what's matched by the pattern inside the first pair of parentheses.
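The documentation quoted above also allows a group name in place of the index. A small sketch, with the group name magic and the helper name coding_comment both invented for illustration:

```ruby
# Hypothetical helper: returns the leading "#coding" line (including its
# newline) from a source string, or nil when the string does not start with one.
def coding_comment(source)
  # Same pattern as in the question, but the group is named, so the second
  # argument to String#[] is the group name instead of the index 1.
  source[/\A(?<magic>#coding[:=].*\r?\n)/, "magic"]
end

p coding_comment("#coding:UTF-8\n_erbout = ''") # => "#coding:UTF-8\n"
p coding_comment("_erbout = ''")                # => nil
```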
