Replacement with regular expression and capture - ruby

The method below is supposed to transform "snake_case" to "CamelCase".
def zebulansNightmare(string)
string.gsub(/_(.)/){$1.upcase}
end
With string "camel_case", I expect gsub(/_(.)/) to match c after the _. I understood that $1 is the first matched letter: the capital letter. But it works like it's substituting _ with the capital letter. Why has the _ disappeared?

You are right that $1 is the captured value. However, gsub matches the letter together with the _ before it, and the whole match gets replaced, not just the captured group. If you want to keep the _, you need to reinsert it in the result:
"camel_case".gsub(/_(.)/){"_#{$1.upcase}"}
BTW, if you only plan to match _ followed by a lowercase letter (so as not to waste time and resources trying to upcase non-letters), you can use the following regex:
/_(\p{Ll})/
Where \p{Ll} is any lowercase Unicode letter.
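To see why the underscore vanishes, it can help to record what gsub hands to the block on each match. The sketch below is only an illustration: the names `seen` and `camelize` are made up here and do not appear in the question.

```ruby
# Record the whole match ($~[0]) next to the first capture ($1).
seen = []
"camel_case".gsub(/_(.)/) do
  seen << [$~[0], $1] # whole match first, then the captured letter
  $1.upcase           # the block's value replaces the WHOLE match
end
# seen is now [["_c", "c"]]

# Hypothetical helper using the lowercase-letter pattern from above.
def camelize(snake)
  snake.gsub(/_(\p{Ll})/) { $1.upcase }
end

camelize("camel_case") #=> "camelCase"
camelize("case_of_3")  #=> "caseOf_3" (the digit is not a lowercase letter)
```

The whole match is "_c" while the capture is only "c", so returning `$1.upcase` from the block drops the underscore.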

def zebulans_nightmare(string)
string.gsub(/\B_[a-z0-9]/) { |s| s[1].upcase }
end
zebulans_nightmare("case_of_snakes")
#=> "caseOfSnakes"
zebulans_nightmare("case_of_3_snakes")
#=> "caseOf3Snakes"
zebulans_nightmare("_case_of_3_snakes")
#=> "_caseOf3Snakes"
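\B matches at a position that is not a word boundary. Because _ is itself a word character, \B_ only matches an underscore that is preceded by another word character, which is why the leading underscore in the last example is left alone.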

Related

splitting a string misses the word which is used to split it

I have a string
a="Tamilnadu is far away from Kashmir"
If I split this string using "Tamilnadu", I don't find "Tamilnadu" in the resulting array; I find an empty string there. If I split the string using "away", then "away" is not present in the result array either; it has an empty string in place of "away". What should I do to include the separator instead of getting an empty string?
Example
a="Tamilnadu is far away from Kashmir"
p a.split("Tamilnadu")
then Output is
["", " is far away from Kashmir"]
But I want
["Tamilnadu", " is far away from Kashmir"]

From docs:
If pattern is a Regexp, str is divided where the pattern matches. Whenever the pattern matches a zero-length string, str is split into individual characters. If pattern contains groups, the respective matches will be returned in the array as well.
So... to split by "Tamilnadu" and keep it in the list, make it a capture group:
"Tamilnadu is far away from Kashmir".split(/(Tamilnadu)/)
# => ["", "Tamilnadu", " is far away from Kashmir"]
or, if you want to split after "Tamilnadu", make a zero-width match after it using lookbehind:
"Tamilnadu is far away from Kashmir".split(/(?<=Tamilnadu)/)
# => ["Tamilnadu", " is far away from Kashmir"]
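If the separator is a plain string that occurs once, String#partition is another option worth considering. This is a minimal sketch, not part of the answer above; the variable names `text`, `parts`, `kept` and `around_away` are only for illustration.

```ruby
text = "Tamilnadu is far away from Kashmir"

# String#partition returns [before, separator, after] for the first occurrence.
parts = text.partition("Tamilnadu")
# parts is ["", "Tamilnadu", " is far away from Kashmir"]

# Dropping the empty pieces gives the array the question asks for.
kept = parts.reject(&:empty?)
# kept is ["Tamilnadu", " is far away from Kashmir"]

# The capture-group split can be cleaned up the same way.
around_away = text.split(/(away)/).reject(&:empty?)
# around_away is ["Tamilnadu is far ", "away", " from Kashmir"]
```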

If you don't know where "Tamilnadu" is in the string but you want to split the string before and after it, and not have any empty strings in the resulting array, you can use String#scan:
def split_it(str, substring)
str.scan(/\A.+(?= #{substring}\b)|\b#{substring}\b|(?<=\b#{substring} ).+/)
end
substring = "Tamilnadu"
split_it("Tamilnadu is far away from Kashmir", substring)
#=> ["Tamilnadu", "is far away from Kashmir"]
split_it("Far away is Tamilnadu from Kashmir", substring)
#=> ["Far away is", "Tamilnadu", "from Kashmir"]
split_it("Far away from Kashmir is Tamilnadu", substring)
#=> ["Far away from Kashmir is", "Tamilnadu"]
split_it("Far away is Daluth from Kashmir", substring)
#=> []
split_it("Far away is Tamilnaduland from Kashmir", substring)
#=> []
I've assumed that substring appears at most once in the string.
The regular expression can be written in free-spacing mode to make it self-documenting:
substring = "Tamilnadu"
/
\A.+ # match the beginning of the string followed by > 0 characters
(?=\ #{substring}\b) # match the value of substring preceded by a space and
# followed by a word break, in a positive lookahead
| # or
\b#{substring}\b # match the value of substring with a word break before and after
| # or
(?<=\b#{substring}\ ) # match the value of substring preceded by a word break
# and followed by a space, in a positive lookbehind
.+ # match > 0 characters
/x # free-spacing regex definition mode
#=>
/
\A.+ # ...
(?=\ Tamilnadu\b) # ...
| # ...
\bTamilnadu\b # ...
| # ...
(?<=\bTamilnadu\ ) # ...
.+ # ...
/x
Free-spacing mode removes all spaces before the regex is parsed, including spaces that may be intended to be part of the expression. It was for that reason that I escaped the two spaces. I could alternatively put each in a character class ([ ]) or use \s, [[:space:]] or \p{Space}, though they match whitespace, which is not quite the same.
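A quick way to convince yourself of the whitespace behaviour described above is to compare a bare space with an escaped one under /x. The patterns below are throwaway examples, not taken from the answer.

```ruby
# In /x mode a bare space in the pattern is ignored...
bare = /far away/x
# ...so it has to be escaped or put in a character class to match a real space.
escaped = /far\ away/x
classed = /far[ ]away/x

bare.match?("far away")    #=> false
bare.match?("faraway")     #=> true
escaped.match?("far away") #=> true
classed.match?("far away") #=> true
```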

How to remove strings that end with a particular character in Ruby

Based on "How to Delete Strings that Start with Certain Characters in Ruby", I know that the way to remove a string that starts with the character "#" is:
email = email.gsub( /(?:\s|^)#.*/ , "") #removes strings that start with "#"
I want to also remove strings that end in ".". Inspired by "Difference between \A \z and ^ $ in Ruby regular expressions" I came up with:
email = email.gsub( /(?:\s|$).*\./ , "")
Basically I swapped the caret for the dollar sign and reversed the order of the part after the closing parenthesis (making sure to escape the period). However, it is not doing the trick.
An example I'd like to match and remove is:
"a8&23q2aas."

You were so close.
email = email.gsub( /.*\.\s*$/ , "")
The difference lies in the relationship between the string and the regex tokens that describe the condition you want to trigger. Here, you are trying to find a period (\.) that is followed only by optional whitespace (\s*) and then the end of the line ($). I would read the regex above as "any characters of any length followed by a period, followed by any amount of whitespace, followed by the end of the line."
As commenters pointed out, though, there's a simpler way: String#end_with?.

I'd use:
words = %w[#a day in the life.]
# => ["#a", "day", "in", "the", "life."]
words.reject { |w| w.start_with?('#') || w.end_with?('.') }
# => ["day", "in", "the"]
Using a regex is overkill for this if you're only concerned with the starting or ending character, and the built-in methods are usually faster than a regular expression as well.
Regarding "I would really like to stick to using gsub":
gsub is the wrong way to remove an element from an array. It could be used to turn the string into an empty string, but that won't remove that element from the array.
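If the input is a single string rather than an array, the same reject-based idea can be applied by splitting on whitespace first. This is a sketch under the assumption that words are separated by whitespace and that collapsing runs of whitespace into single spaces is acceptable; `email` and `cleaned` are illustrative names.

```ruby
email = "#a day in the life."

# Split into words, drop the unwanted ones, and glue the rest back together.
cleaned = email.split
               .reject { |word| word.start_with?("#") || word.end_with?(".") }
               .join(" ")
# cleaned is "day in the"
```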

def replace_suffix(str,suffix)
str.end_with?(suffix) ? str[0, str.length - suffix.length] : str
end
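On Ruby 2.5 or later, the built-in String#delete_suffix does the same job as the helper above. Note that, like the helper, it strips the trailing characters from a string; it does not remove the whole word.

```ruby
with_dot    = "a8&23q2aas.".delete_suffix(".")
without_dot = "no_suffix".delete_suffix(".")
# with_dot is "a8&23q2aas"
# without_dot is "no_suffix" (unchanged, since it does not end with ".")
```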

How to remove a certain character after substring in Ruby

I have a string with exclamation marks. I want to remove the exclamation marks at the end of the word, not the ones before a word. Assume there is no exclamation mark by itself/ not accompanied by a word. By word I mean [a..z], can be uppercased.
For example:
exclamation("Hello world!!!")
#=> ("Hello world")
exclamation("!!Hello !world!")
#=> ("!!Hello !world")
I have read How do I remove substring after a certain character in a string using Ruby? ; these two are close, but different.
def exclamation(s)
s.slice(0..(s.index(/\w/)))
end
# exclamation("Hola!") returns "Hol"
I have also tried s.gsub(/\w!+/, ''). Although it retains the '!' before word, it removes both the last letter and exclamation mark. exclamation("!Hola!!") #=> "!Hol".
How can I remove only the exclamation marks at the end?

If you don't want to use a regex, which can sometimes be difficult to understand, use this:
def exclamation(sentence)
words = sentence.split
words_wo_exclams = words.map do |word|
word.split('').reverse.drop_while { |c| c == '!' }.reverse.join
end
words_wo_exclams.join(' ')
end

Although you haven't given a lot of test data, here's an example of something that might work:
def exclamation(string)
string.gsub(/(\w+)!+(?=\s|\z)/, '\1')
end
The \s|\z part means either whitespace or the end of the string, and (?=...) means to just peek ahead in the string without consuming it. The !+ matches one or more exclamation marks, so a word such as "world!!!" is handled too.
Note that this won't work when the exclamation marks are followed by something other than whitespace or the end of the string, as in "mad!, I say", but you could always add that as another potential end-of-word match.

"!!Hello !world!, world!! I say".gsub(r, '')
#=> "!!Hello !world, world! I say"
where
r = /
(?<=[[:alpha:]]) # match an uppercase or lowercase letter in a positive lookbehind
! # match an exclamation mark
/x # free-spacing regex definition mode
or
r = /
[[:alpha:]] # match an uppercase or lowercase letter
\K # discard match so far
! # match an exclamation mark
/x # free-spacing regex definition mode
If the above example should return "!!Hello !world, world I say", change ! to !+ in the regexes.
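Wrapped in the method from the question, the !+ variant of the lookbehind regex might look like the following sketch. It assumes, as the question does, that a "word" ends in a letter.

```ruby
def exclamation(text)
  # Remove every run of "!" that directly follows a letter.
  text.gsub(/(?<=[[:alpha:]])!+/, "")
end

exclamation("Hello world!!!")                 #=> "Hello world"
exclamation("!!Hello !world!")                #=> "!!Hello !world"
exclamation("!!Hello !world!, world!! I say") #=> "!!Hello !world, world I say"
```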

Regex to grab full firstname and first letter of last name

I have a list of users grabbed by the Etc Ruby library:
Thomas_J_Perkins
Jennifer_Scanner
Amanda_K_Loso
Aaron_Cole
Mark_L_Lamb
What I need to do is grab the full first name, skip the middle name (if given), and grab the first character of the last name. The output should look like this:
Thomas P
Jennifer S
Amanda L
Aaron C
Mark L
I'm not sure how to do this, I've tried grabbing all of the characters: /\w+/ but that will grab everything.

You don't always need regular expressions.
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems." (Jamie Zawinski)
You can do it with some simple Ruby code
string = "Mark_L_Lamb"
string.split('_').first + ' ' + string.split('_').last[0]
#=> "Mark L"

I think it's simpler without regex:
array = "Thomas_J_Perkins".split("_") # split at _
array.first + " " + array.last[0] # .first returns the first name, .last[0] returns the first char of the last name
#=> "Thomas P"
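Applied to the whole list from the question, the split approach might look like this. It is a sketch that assumes every entry contains at least one underscore; `users` and `short_names` are names chosen here for illustration.

```ruby
users = %w[Thomas_J_Perkins Jennifer_Scanner Amanda_K_Loso Aaron_Cole Mark_L_Lamb]

short_names = users.map do |user|
  # The splat soaks up any middle names, however many there are.
  first, *_middle, last = user.split("_")
  "#{first} #{last[0]}"
end
# short_names is ["Thomas P", "Jennifer S", "Amanda L", "Aaron C", "Mark L"]
```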

You can use
^([^\W_]+)(?:_[^\W_]+)*_([^\W_])[^\W_]*$
and replace with \1 \2 (the two captures separated by a space, as in the expected output).
The [^\W_] matches a letter or a digit. If you want to only match letters, replace [^\W_] with \p{L}:
^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$
The point is to match and capture the first chunk of letters up to the first _ (with (\p{L}+)), then match 0+ sequences of _ + letters followed by a final _ (with (?:_\p{L}+)*_), then match and capture the first letter of the last word (with (\p{L})), and then match the rest of the string (with \p{L}*).
NOTE: replace ^ with \A and $ with \z if you have independent strings (in Ruby, ^ matches the start of a line and $ matches the end of a line).
Ruby code:
s.sub(/^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$/, "\\1 \\2")
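The note about \A and \z can be checked with a short sketch. The helper name `short_name` is hypothetical, and the replacement uses a space between the captures to match the output format requested in the question.

```ruby
def short_name(user)
  # \A and \z anchor to the whole string rather than to a line.
  user.sub(/\A(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*\z/, '\1 \2')
end

short_name("Thomas_J_Perkins") #=> "Thomas P"
short_name("Aaron_Cole")       #=> "Aaron C"

# ^ and $ are satisfied by any line inside the string; \A and \z are not.
multi = "junk\nAaron_Cole"
multi.match?(/^\p{L}+_\p{L}+$/)   #=> true
multi.match?(/\A\p{L}+_\p{L}+\z/) #=> false
```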

I'm in the don't-use-a-regex-for-this camp.
str1 = "Alexander_Graham_Bell"
str2 = "Sylvester_Grisby"
"#{str1[0...str1.index('_')]} #{str1[str1.rindex('_')+1]}"
#=> "Alexander B"
"#{str2[0...str2.index('_')]} #{str2[str2.rindex('_')+1]}"
#=> "Sylvester G"
or
first, last = str1.split(/_.+_|_/)
#=> ["Alexander", "Bell"]
first+' '+last[0]
#=> "Alexander B"
first, last = str2.split(/_.+_|_/)
#=> ["Sylvester", "Grisby"]
first+' '+last[0]
#=> "Sylvester G"
but if you insist...
r = /
(.+?) # match any characters non-greedily in capture group 1
(?=_) # match an underscore in a positive lookahead
(?:.*) # match any characters greedily in a non-capture group
(?:_) # match an underscore in a non-capture group
(.) # match any character in capture group 2
/x # free-spacing regex definition mode
str1 =~ r
$1+' '+$2
#=> "Alexander B"
str2 =~ r
$1+' '+$2
#=> "Sylvester G"
You can of course write
r = /(.+?)(?=_)(?:.*)(?:_)(.)/

This is my attempt:
/([a-zA-Z]+)_([a-zA-Z]+_)?([a-zA-Z])/

Let's see if this works:
/^([^_]+)(?:_\w)?_(\w)/
And then you'll have to combine the first and second matches into the format you want. I don't know Ruby, so I can't help you there.

And another attempt using a replacement method:
result = subject.gsub(/^([^_]+)(?:_[^_])?_([^_])[^_]+$/, '\1 \2')
We match the entire string, with the relevant parts in capturing groups, and then return just the two captured groups separated by a space.

Using the split method is much better:
full_names.map do |full_name|
parts = full_name.split('_').values_at(0,-1)
parts.last.slice!(1..-1)
parts.join(' ')
end

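/^[A-Za-z]{5,15}\s[A-Za-z]$/i
This pattern describes the output format rather than the input, with the following criteria:
5-15 letters for the first name, then a whitespace character, and finally a single letter for the last name. Note that the lower bound of 5 would reject a four-letter first name such as Mark.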

regex scan only returning first value

I have two strings that should both return matches according to the regex, but only str1 returns the expected match. str1 is an exact match for the regex below (created by Avinash Raj). str2 contains str1 and more data. I expected str2 to return str1 plus the other values that matched, but it returns nothing. Can someone explain why?
str1="3,15,14,31,40,5,5,4,5,3,4,4,5,2,2,2,1,2,1,1,3,3,3,2,4,3,false,false,false,false,false,true,false,true,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,3,3,3,2,3"
str2="3,15,14,31,40,5,5,4,5,3,4,4,5,2,2,2,1,2,1,1,3,3,3,2,4,3,false,false,false,false,false,true,false,true,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,3,3,3,2,3,3,15,14,35,27,4,5,3,5,3,2,4,4,2,1,1,2,2,2,1,3,3,3,2,5,9,true,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,true,true,false,false,false,false,2,2,3,2,3,3,15,16,34,53,4,4,4,3,1,3,4,3,1,1,1,1,1,1,1,2,3,2,3,5,1,true,false,false,false,false,false,true,false,false,false,false,false,false,false,true,true,false,false,false,false,false,false,false,false,false,false,false,3,2,3,2,3,3,15,18,37,29,4,4,4,3,2,3,3,4,1,1,1,1,1,1,1,1,3,1,2,4,1,true,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,3,2,3,2,3,3,15,20,34,37,4,4,4,3,1,3,3,4,1,1,1,1,1,1,1,1,1,1,2,4,1,false,false,false,true,false,false,false,false,false,false,false,false,false,true,false,true,false,false,false,false,false,false,false,false,true,false,false,3,1,3,1,3,3,16,10,18,30,4,3,3,3,1,3,3,3,1,1,1,1,1,1,1,1,2,1,4,4,3,false,false,false,false,false,true,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,3,2,3,2,3,3,16,12,39,5,5,5,4,5,3,5,5,5,1,1,1,1,1,1,1,2,1,1,1,5,10,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,3,2,3,2,3,3,16,14,18,27,4,4,4,4,2,3,3,4,1,1,1,1,1,1,1,1,1,1,2,5,1,true,false,false,false,false,false,false,false,false,false,false,false,false,true,false,true,false,false,false,false,true,false,false,false,false,false,false,3,2,3,2,3,3,16,16,18,32,5,5,5,5,4,5,5,5,1,1,1,1,1,1,1,2,1,1,1,5,3,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true
,false,true,false,3,2,3,2,3,3,16,18,20,7,5,5,5,5,3,3,3,4,1,1,1,1,1,1,1,1,1,1,2,5,1,false,false,false,true,false,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,true,false,false,false,false,false,false,3,2,3,2,3,3,16,20,18,59,4,4,4,3,1,1,1,2,1,1,1,1,1,1,1,1,2,2,4,5,9,false,false,false,true,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,3,2,3,2,3,3,17,10,16,9,3,3,3,3,1,2,3,3,1,1,1,1,1,1,1,1,2,1,3,5,1,true,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,3,2,3,2,3,3,17,12,16,17,4,3,4,2,1,4,3,2,1,1,1,1,1,1,1,1,1,1,4,5,3,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,3,2,3,2,3,3,17,14,16,21,4,4,4,4,1,3,4,4,1,1,1,1,1,1,1,1,1,1,2,5,1,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,true,false,false,3,2,3,2,3,3,17,16,16,20,5,5,4,5,3,4,4,5,1,1,1,1,1,1,1,1,1,1,1,5,8,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,true,false,false,false,3,2,3,2,3,3,17,18,16,31,4,4,4,4,1,4,3,3,1,1,1,1,1,1,1,1,1,1,3,5,1,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,true,false,true,false,false,false,false,3,2,3,2,3,3,17,20,18,8,5,5,4,5,4,4,4,5,1,1,1,1,1,1,1,2,1,1,1,5,1,false,false,false,true,false,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,true,false,false,false,3,2,3,2,3,3,18,10,31,33,3,2,3,2,2,2,2,3,1,1,1,1,1,1,1,1,1,1,1,5,7,true,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,f
alse,false,false,false,false,false,false,3,2,3,2,3,3,18,12,36,11,4,4,4,5,3,4,3,3,1,1,2,1,2,1,2,2,1,1,1,5,1,false,false,false,true,false,false,true,false,false,false,false,false,false,true,false,true,false,false,false,false,false,false,false,false,true,false,false,3,2,3,2,3,3,18,14,49,6,3,3,2,2,1,2,2,2,2,1,1,1,2,1,2,3,3,4,4,5,9,true,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,3,2,3,2,3,3,18,16,32,53,3,4,4,3,3,3,3,3,1,1,1,1,1,1,2,2,1,1,3,5,7,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,true,false,false,false,false,3,2,3,2,3,3,18,18,37,59,5,4,4,4,4,4,4,4,1,1,1,1,1,1,1,2,1,1,2,5,7,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,true,false,false,false,false,3,2,3,2,3,3,19,10,5,25,4,4,4,2,2,4,3,3,1,1,1,1,1,1,1,1,2,2,2,5,1,true,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,2,1,3,2,3,3,19,13,0,5,5,5,4,5,3,3,5,5,1,1,1,1,1,1,1,1,1,1,3,5,7,false,false,true,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,false,false,true,false,false,false,false,false,false,3,2,3,2,3,3,19,14,5,23,4,4,4,4,3,4,3,3,1,1,1,1,1,1,1,1,1,2,2,5,9,false,false,true,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,false,3,2,3,2,3,3,19,16,7,19,5,4,4,4,3,4,3,3,1,1,1,1,1,1,1,2,2,2,3,5,9,false,false,true,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,3,2,3,2,3,3,19,18,6,30,4,4,4,4,3,4,4,4,1,1,1,1,1,1,1,1,1,1,1,5,8,false,false,true,true,false,false,false,false,false,false,false,false,false,false,false,false,false,
false,true,false,true,false,false,false,true,false,false,3,2,3,2,3,3,19,20,8,25,4,4,5,4,3,4,3,4,1,1,1,1,1,1,1,1,1,1,3,5,1,false,false,true,false,false,false,true,false,false,false,false,false,false,true,false,false,false,false,false,false,false,false,false,false,true,false,false,3,2,3,2,3,3,19,21,18,2,4,4,4,3,3,4,3,4,1,1,1,1,1,1,1,1,1,1,1,5,1,false,false,true,false,false,false,false,false,false,false,false,false,false,true,false,false,false,false,true,false,true,false,false,false,true,false,false,3,2,3,2,3,"
str1.scan(/^,?(?:[1-5]\d|[1-9])(?:,(?:[1-5]\d|[1-9])){4}(?:,[1-5]){21}(?:,(?:true|false)){27}(?:,[1-5]){5}$/).each{|x|
puts x
puts "---1---"
}
str2.scan(/^,?(?:[1-5]\d|[1-9])(?:,(?:[1-5]\d|[1-9])){4}(?:,[1-5]){21}(?:,(?:true|false)){27}(?:,[1-5]){5}$/).each{|x|
puts x
puts "---2---"
}

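Kind of by definition, you can't have more than one pattern match in a single-line string when your pattern specifically says "start of string, then [stuff], then end of string". Look at the regexp anchors ^ and $ (strictly speaking, in Ruby they anchor to the start and end of a line; with no newlines in the string, that is the same thing).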
A simpler example might make it clearer: ^a$ "start of string, then letter a, then end of string" will match in "a" once, but will match in "aaa" zero times, even though there are three letters a.
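The effect of the anchors on String#scan can be reproduced on a much smaller input. The record format below (a digit followed by true or false) is a made-up stand-in for the long rows in the question, not the asker's actual data.

```ruby
"a".scan(/^a$/)   #=> ["a"]
"aaa".scan(/^a$/) #=> []
"aaa".scan(/a/)   #=> ["a", "a", "a"]

records = "1,true,2,false,3,true"

both_anchors = records.scan(/^\d,(?:true|false)$/) # whole line must be one record
start_only   = records.scan(/^\d,(?:true|false)/)  # only a record at line start
no_anchors   = records.scan(/\d,(?:true|false)/)   # every record
# both_anchors is []
# start_only is ["1,true"]
# no_anchors is ["1,true", "2,false", "3,true"]
```

Dropping only the $ still leaves ^ in place, so scan can return at most one match per line; both anchors have to go before scan can return every record.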

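$ asserts the position at the end of a line.
Without it, the pattern no longer has to match up to the end of the line:
^,?(?:[1-5]\d|[1-9])(?:,(?:[1-5]\d|[1-9])){4}(?:,[1-5]){21}(?:,(?:true|false)){27}(?:,[1-5]){5}
Just remove the $ from the end. Note that the leading ^ still ties the match to the start of a line, so scan will return at most one match per line. See the demo: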
https://regex101.com/r/sJ9gM7/22

Because your regular expression starts with the ^ metacharacter and ends with the $ metacharacter, it expects the entire line (here, the entire string) to match.
