Replace a string in Ruby by gsub - ruby

I have a string
path = "MT_Store_0 /47/47/47/opt/47/47/47/data/47/47/47/FCS/47/47/47/oOvt4wCtSuODh8r9RuQT3w"
I want to remove the part of the string starting from the first /47, using gsub.
path.gsub! '/47/', '/'
Expected output:
"MT_Store_0 "
Actual output:
"MT_Store_0 /47/opt/47/data/47/FCS/47/oOvt4wCtSuODh8r9RuQT3w"

path.gsub! /\/47.*/, ''
In the regex, \/47.* matches /47 and any characters following it.
Or, you can write the regex using %r to avoid escaping the forward slashes:
path.gsub! %r{/47.*}, ''
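To see why the literal replacement in the question falls short, here is a minimal sketch on a shortened copy of the path (the variable names literal and trimmed are only for illustration). gsub replaces non-overlapping matches, so in a run such as /47/47/47/ the trailing slash of one match is consumed and the /47 that follows it is left behind.

```ruby
path = "MT_Store_0 /47/47/47/opt/47/47/47/data"

# Literal pattern: every other "/47" survives because matches cannot overlap.
literal = path.gsub('/47/', '/')
#=> "MT_Store_0 /47/opt/47/data"

# Regex pattern: ".*" consumes everything after the first "/47".
# A single substitution is enough here, so sub works as well as gsub.
trimmed = path.sub(%r{/47.*}, '')
#=> "MT_Store_0 "

puts literal.inspect
puts trimmed.inspect
```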

If the output has to be MT_Store_0 (without the trailing space),
then gsub(/\/47.*/, '').strip is what you want.

Here are two solutions that employ neither String#gsub nor String#gsub!.
Use String#index
def extract(str)
ndx = str.index /\/47/
ndx ? str[0, ndx] : str
end
str = "MT_Store_0 /47/47/oOv"
str = extract str
#=> "MT_Store_0 "
extract "MT_Store_0 cat"
#=> "MT_Store_0 cat"
Use a capture group
R = /
(.+?) # match one or more of any character, lazily, in capture group 1
(?: # start a non-capture group
\/47 # match characters
| # or
\z # match end of string
) # end non-capture group
/x # extended mode for regex definition
def extract(str)
str[R, 1]
end
str = "MT_Store_0 /47/47/oOv"
str = extract str
#=> "MT_Store_0 "
extract "MT_Store_0 cat"
#=> "MT_Store_0 cat"
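As a rough side-by-side check, the sketch below runs both approaches on the same inputs. The method names extract_by_index and extract_by_capture and the constant MARKER_OR_END are hypothetical, chosen only so the two versions can coexist in one file.

```ruby
# Same pattern as the capture-group regex above, written with %r to avoid escaping "/".
MARKER_OR_END = %r{(.+?)(?:/47|\z)}

def extract_by_index(str)
  ndx = str.index('/47')
  ndx ? str[0, ndx] : str
end

def extract_by_capture(str)
  str[MARKER_OR_END, 1]
end

samples = ["MT_Store_0 /47/47/oOv", "MT_Store_0 cat"]
results = samples.map { |s| [extract_by_index(s), extract_by_capture(s)] }
results.each { |by_index, by_capture| puts "#{by_index.inspect} #{by_capture.inspect}" }
```

The two agree on these inputs. They differ when the string starts with /47: the index version returns an empty string, while the capture version needs at least one character for (.+?) and so returns the whole string.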

Related

How do I extract the part of a string whose individual words begin with letters?

I'm using Ruby 2.4. Let's say I have a string that has a number of spaces in it
str = "abc def 123ffg"
How do I capture all the consecutive words at the beginning of the string that begin with a letter? So for example, in the above, I would want to capture
"abc def"
And if I had a string like
"aa22 b cc 33d ff"
I would want to capture
"aa22 b cc"
but if my string were
"66dd eee ff"
I would want to return nothing because the first word of that string does not begin with a letter.
If you can spare the extra spaces between words, you could then split the string and iterate the resulting array with take_while, using a regex to get the desired output; something like this:
str = "abc def 123ffg"
str.split.take_while { |word| word[0] =~ /[[:alpha:]]/ }
#=> ["abc", "def"]
The output is an array, but if a string is needed, you could use join at the end:
str.split.take_while { |word| word[0] =~ /[[:alpha:]]/ }.join(" ")
#=> "abc def"
More examples:
"aa22 b cc 33d ff".split.take_while { |word| word[0] =~ /[[:alpha:]]/ }
#=> ["aa22", "b", "cc"]
"66dd eee ff".split.take_while { |word| word[0] =~ /[[:alpha:]]/ }
#=> []
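The same idea can be wrapped in a small method. This is only a sketch: leading_letter_words is a hypothetical name, and it uses match? (available from Ruby 2.4) instead of indexing the first character.

```ruby
# Returns the leading run of words that start with a letter, joined by single spaces.
def leading_letter_words(str)
  str.split.take_while { |word| word.match?(/\A[[:alpha:]]/) }.join(' ')
end

puts leading_letter_words("abc def 123ffg").inspect   #=> "abc def"
puts leading_letter_words("aa22 b cc 33d ff").inspect #=> "aa22 b cc"
puts leading_letter_words("66dd eee ff").inspect      #=> ""
```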
The Regular Expression
There's usually more than one way to match a pattern, although some are simpler than others. A relatively simple regular expression that works with your inputs and expected outputs is as follows:
/\A(?:\s*\p{L}\S*)+/
This matches one or more words, starting at the beginning of the string, where each word is:
zero or more whitespace characters
followed by a character in the Unicode category "letter"
followed by zero or more non-whitespace characters
The \A anchor pins the match to the start of the string; without it, the regex would skip a leading word such as 66dd and match from the first letter it finds. The outer non-capturing group with + is what allows the match to be repeated until a word starts with a non-letter.
The Proofs
regex = /\A(?:\s*\p{L}\S*)+/
regex.match 'aa22 b cc 33d ff' #=> #<MatchData "aa22 b cc">
regex.match 'abc def 123ffg' #=> #<MatchData "abc def">
regex.match '66dd eee ff' #=> nil
The sub method can replace everything that needs to be removed with an empty string ''.
In this case, a first sub call removes the whole text if it starts with a digit. A second sub call then removes everything from the first word that starts with a digit onward.
Answer:
str.sub(/^\d+.*/, '').sub(/\s+\d+.*/, '')
Outputs:
str = "abc def 123ffg"
# => "abc def"
str = "aa22 b cc 33d ff"
# => "aa22 b cc"
str = "66dd eee ff"
# => ""
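To make the two steps visible, here is a small sketch that splits the chained call apart. The method name strip_from_first_digit_word is hypothetical; the two patterns are the ones from the answer above.

```ruby
def strip_from_first_digit_word(str)
  # Step 1: if the text starts with a digit, drop the whole line.
  without_leading = str.sub(/^\d+.*/, '')
  # Step 2: drop everything from the first whitespace-then-digit onward.
  without_leading.sub(/\s+\d+.*/, '')
end

puts strip_from_first_digit_word("abc def 123ffg").inspect   #=> "abc def"
puts strip_from_first_digit_word("aa22 b cc 33d ff").inspect #=> "aa22 b cc"
puts strip_from_first_digit_word("66dd eee ff").inspect      #=> ""
```

Note that this treats only digits as disqualifying, so a word that starts with punctuation would still be kept.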

Regex for name in Ruby

I know this question has been asked a lot but I need a RegEx for a name validator.
The only requirements are: letters are okay; no numbers; no special characters other than at most 2 spaces, which cannot be at the beginning or end; the "-" and "`" are also allowed. Everything else would be invalid.
All the other answers seem to ask for a lot more and seem to get too complicated.
Currently I am using
/^([^\d\W]|[-])*$/
But this fails with the space
Sample data:
Pass:
Susan Johnson,
Stephanie Le'Sean,
John Pierre'-Frank
Fail:
Ricky2Good,
Jean,stewie,
Mike#dude,
Jim. McNeil
I've assumed that for a string to be valid, it may contain only uppercase and lowercase letters, apostrophes, dashes and at most two spaces, provided the spaces are not at the beginning or end of the string.
STR= "-a-z'"
r = /
\A # match beginning of string
(?: # begin non-capture group
[#{STR}]+ # match 1+ letters, "-" or "'"
| # or
[#{STR}]+\s[#{STR}]*\s?[#{STR}]+
# match 1+ letters, "-" or "'", space, 0+ letters, "-" or "'",
# optional space, 1+ letters, "-" or "'"
) # end non-capture group
\z # match end of string
/ix # case-indifferent and free-spacing regex definition modes
#=> /
\A # match beginning of string
(?: # begin non-capture group
[-a-z']+ # match 1+ letters, "-" or "'"
| # or
[-a-z']+\s[-a-z']*\s?[-a-z']+
# match 1+ letters, "-" or "'", space, 0+ letters, "-" or "'",
# optional space, 1+ letters, "-" or "'"
) # end non-capture group
\z # match end of string
/ix
If I did not use free-spacing mode to define the regex it would look like this:
r = /\A(?:[-a-z']+|[-a-z']+\s[-a-z']*\s?[-a-z']+)\z/i
"a B-' v" =~ r #=> 0
"aB-'v" =~ r #=> 0
"aB-'1v" =~ r #=> nil
"a B-'1 v" =~ r #=> nil
" a B-1v" =~ r #=> nil
If you wish to return true or false, rather than a truthy value 0 or a falsy value nil, you could write, for example:
("a B-' v" =~ r) ? true : false #=> true
or (the "trick")
!!("a B-' v" =~ r) #=> true
The latter works because it is the same as:
!(!("a B-' v" =~ r))
#=> !(!(0)) => !(false) => true
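Another option, sketched here under the assumption that Ruby 2.4 or later is available, is Regexp#match?, which returns true or false directly. The constant NAME_PATTERN and the method valid_name? are hypothetical names; the pattern is the one-line regex shown above.

```ruby
NAME_PATTERN = /\A(?:[-a-z']+|[-a-z']+\s[-a-z']*\s?[-a-z']+)\z/i

def valid_name?(str)
  NAME_PATTERN.match?(str)
end

passing = ["Susan Johnson", "Stephanie Le'Sean", "John Pierre'-Frank"]
failing = ["Ricky2Good", "Jean,stewie", "Mike#dude", "Jim. McNeil"]

passing.each { |name| puts "#{name}: #{valid_name?(name)}" }
failing.each { |name| puts "#{name}: #{valid_name?(name)}" }
```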
The question asks for a regex to validate names. Using a regex may be the best, but it's not the only way. If the question is really how to validate names--using a regex or otherwise--it should be stated in a way that doesn't stipulate a particular approach. Here's one way to validate without using a regex.
GOOD_CHARS = ('a'..'z').to_a.join << "'-"
#=> "abcdefghijklmnopqrstuvwxyz'-"
def validate(str)
return false if str.empty? || (str[0]==' ' || str[-1]==' ')
nbr_spaces = str.count(' ')
return false if nbr_spaces > 2
str.downcase.count(GOOD_CHARS) + nbr_spaces == str.size
end
validate "a B-' v" #=> true
validate "aB-'v" #=> true
validate "aB-`1v" #=> false
validate "a B-'1 v" #=> false
validate " a B-'1v" #=> false
The following regex should filter for letters, no special characters (other than one space, dashes, and backticks), and no numbers:
/^[a-zA-Z\-\`]++(?: [a-zA-Z\-\`]++)?$/
Hope it helps!
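One caveat when validating with ^ and $ in Ruby: they anchor to line boundaries, not to the start and end of the string, so multi-line input can slip through. The sketch below illustrates this with a simplified copy of the pattern above (plain + instead of the possessive ++, purely for brevity); the variable names are hypothetical.

```ruby
line_anchored   = /^[a-zA-Z\-`]+(?: [a-zA-Z\-`]+)?$/
string_anchored = /\A[a-zA-Z\-`]+(?: [a-zA-Z\-`]+)?\z/

input = "Susan\n1234"

puts line_anchored.match?(input)   #=> true, the first line alone satisfies ^...$
puts string_anchored.match?(input) #=> false, the whole string must match
puts string_anchored.match?("Susan Johnson") #=> true
```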

I have a regular expression that I need to figure out

"peter,nick,jake,jack"
I need to match something like this.
I cannot have any whitespace after a word. For example,
"peter,,", "peter," and "peter,,nick " are all incorrect.
It has to be just a word, such as "peter", or words separated by single commas ("peter,nick").
First confirm that the string has the required structure.
r = /
\A # match the beginning of the string
[[:alpha:]]+ # match > 0 letters
(?:,[[:alpha:]]+) # match a comma then > 0 letters in a non-capture group
* # match the preceding non-capture group >= 0 times
\z # match end of the string
/x # free-spacing regex definition mode
str = "peter,nick,jake,jack"
str =~ r #=> 0
Since it matches the regex, simply split on commas to return an array of the words.
str.split(',') #=> ["peter", "nick", "jake", "jack"]
By contrast:
"peter,nick,,jake,jack" =~ r #=> nil
"peter,nick,jake, jack" =~ r #=> nil
"peter,nick,jake,jack " =~ r #=> nil
"peter ispeter,nick" =~ r #=> nil
I assume the string must contain at least one letter.
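Validation and splitting can be combined in one helper. This is a minimal sketch; WORD_LIST and parse_word_list are hypothetical names, and returning nil for invalid input is an assumption about what the caller wants.

```ruby
# One-line form of the free-spacing regex above.
WORD_LIST = /\A[[:alpha:]]+(?:,[[:alpha:]]+)*\z/

# Returns the array of words, or nil when the string is not well formed.
def parse_word_list(str)
  str.match?(WORD_LIST) ? str.split(',') : nil
end

puts parse_word_list("peter,nick,jake,jack").inspect #=> ["peter", "nick", "jake", "jack"]
puts parse_word_list("peter").inspect                #=> ["peter"]
puts parse_word_list("peter,,nick ").inspect         #=> nil
```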

How to write a regex in a single line

I have this code:
str = 'printf("My name is %s and age is %0.2d", name, age);'
SPECIFIERS = 'diuXxofeEgsc'
format_specifiers = /((?:%(?:\*?([-+]?\d*\.?\d+)*(?:[#{SPECIFIERS}]))))/i
variables = /([.[^"]]*)\);$/
format = str.scan(format_specifiers)
var = str.scan(variables).first.first.split(/,/)
Is there any way a single regex can do that in a couple of lines?
My desired output is:
%s, name
%0.2d, age
I'm a big believer in keeping regular expressions as simple as possible; they can too quickly mushroom into unwieldy, unmaintainable messes. I'd start with something like this, then tweak as necessary:
str = 'printf("My name is %s and age is %0.2d", name, age);'
formats = str.scan(/%[a-z0-9.]+/) # => ["%s", "%0.2d"]
str[/,(.+)\);$/] # => ", name, age);"
vars = str[/,(.+)\);$/].scan(/[a-z]+/) # => ["name", "age"]
puts formats.zip(vars).map{ |a| a.join(', ')}
# >> %s, name
# >> %0.2d, age
Your question has two parts:
Q1: Is it possible to do this with a single regex?
Q2: Can this be done in one or two lines of code?
The answer to both questions is "yes".
format_specifiers = /
%[^\s\"]+ # match % followed by > 0 characters other than a
# whitespace or a double-quote
/x # free-spacing regex definition mode
variables = /
,\s* # match comma followed by >= 0 whitespaces
\K # forget matches so far
[a-zA-Z] # match a letter
\w* # match >= 0 word characters
/x
You can decide, after testing, if these two regexes do their jobs adequately. For testing, refer to Kernel#sprintf.
r = /
(?:#{format_specifiers}) # match format_specifiers in a non-capture group
| # or
(?:#{variables}) # match variables in a non-capture group
/x
#=> /
(?:(?x-mi:
%[^\s\"]+ # match % followed by > 0 characters other than a
# whitespace or a double-quote
)) # match format_specifiers in a non-capture group
| # or
(?:(?x-mi:
,\s* # match comma followed by >= 0 whitespaces
\K # forget matches so far
[a-zA-Z] # match a letter
\w* # match >= 0 word characters
)) # match variables in a non-capture group
/x
r can of course also be written:
/(?:(?x-mi:%[^\s\"]+))|(?:(?x-mi:,\s*\K[a-zA-Z]\w*))/
One advantage of constructing r from two regexes is that each of the latter can be tested separately.
str = 'printf("My name is %s and age is %0.2d", name, age);'
arr = str.scan(r)
#=> ["%s", "%0.2d", "name", "age"]
arr.each_slice(arr.size/2).to_a.transpose.map { |s| s.join(', ') }
#=> ["%s, name", "%0.2d, age"]
I have five lines of code. We could reduce this to two by simply substituting out r in str.scan(r). We could make it a single line by writing:
str.scan(r).tap { |a|
a.replace(a.each_slice(a.size/2).to_a.transpose.map { |s| s.join(', ') }) }
#=> ["%s, name", "%0.2d, age"]
with r substituted out.
The steps here are as follows:
a = str.scan(r)
#=> ["%s", "%0.2d", "name", "age"]
b = a.each_slice(a.size/2)
#=> a.each_slice(2)
#=> #<Enumerator: ["%s", "%0.2d", "name", "age"]:each_slice(2)>
c = b.to_a
#=> [["%s", "%0.2d"], ["name", "age"]]
d = c.transpose
#=> [["%s", "name"], ["%0.2d", "age"]]
e = d.map { |s| s.join(', ') }
#=> ["%s, name", "%0.2d, age"]
a.replace(e)
#=> ["%s, name", "%0.2d, age"]
The methods used (aside from Array#size) are String#scan, Enumerable#each_slice, Enumerable#to_a, Array#map, Array#transpose and Array#replace.
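As an alternative to slicing the array in half, the scan results can be separated by their leading % and then paired with zip. This is only a sketch: the combined regex is a compact hand-written version of the two patterns discussed above, and the variable names are hypothetical.

```ruby
str = 'printf("My name is %s and age is %0.2d", name, age);'

# Either a format specifier, or an identifier that follows a comma.
r = /%[^\s"]+|,\s*\K[a-zA-Z]\w*/

tokens = str.scan(r)
#=> ["%s", "%0.2d", "name", "age"]

# Separate specifiers from variable names, then pair them up in order.
formats, vars = tokens.partition { |t| t.start_with?('%') }
pairs = formats.zip(vars).map { |pair| pair.join(', ') }

puts pairs
# >> %s, name
# >> %0.2d, age
```

Partitioning on the % prefix avoids relying on the two halves being the same size, though it still assumes the specifiers and arguments appear in matching order.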

Delete the last instance of a letter in string

How do I delete only the last "l" from a string and not the others?
string = "Hello"
desired outcome:
string # => "Helo"
I did:
string.delete!("l")
string # => "Heo"
string[string.rindex('l')] = ''
You can use sub to replace a single occurrence and tweak the regexp to replace the last match.
string = "Homemade"
string.sub(/(.*)m/, '\1')
# => "Homeade"
In your case the regexp will be
string.sub(/(.*)l/, '\1')
str = 'hello'
r = /
.* # match any number of any character
\K # discard everything matched so far
l # match last 'l'
/x # extended mode
str.gsub(r,'')
#=> "helo"
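For a non-mutating version that also copes with a string that does not contain the letter at all, something like the following sketch could be used. delete_last is a hypothetical helper name; it relies only on String#rindex and slicing.

```ruby
# Returns a copy of str with the last occurrence of char removed.
# If char does not occur, an unchanged copy is returned.
def delete_last(str, char)
  idx = str.rindex(char)
  return str.dup if idx.nil?
  str[0...idx] + str[(idx + char.length)..-1]
end

puts delete_last("Hello", "l")    #=> Helo
puts delete_last("Homemade", "m") #=> Homeade
puts delete_last("abc", "z")      #=> abc
```

The nil check matters because string[string.rindex('l')] = '' raises a TypeError when rindex finds nothing.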
