Ruby Regular Expression Excluding - ruby

@message_to = 'bob@google.com'
@cleaned = @message_to.match(/^(.*)+@/)
@cleaned is returning bob@, where I want it to return just bob. Am I writing the regex correctly in Ruby?
Thanks

There's not much need for a regular expression here:
>> @message_to = "bob@google.com"
=> "bob@google.com"
>> @message_to.split("@",2)
=> ["bob", "google.com"]
>> @message_to.split("@",2)[0] if @message_to["@"]
=> "bob"

You want this:
@cleaned = @message_to.match(/^(.*)+@/)[1]
match returns a MatchData object, and its string version is the entire match. The captured groups are available starting at index 1 when you treat the MatchData as an array.
I'd probably go with something more like this though:
@cleaned = @message_to.match(/^([^@]+)@/)[1]
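To make the difference between the whole match and the captured group concrete, here is a small sketch. It assumes an address of the form bob@google.com, and the local variable names are only for illustration.

```ruby
# A MatchData object keeps the whole match at index 0 and groups from index 1.
message_to = 'bob@google.com'
m = message_to.match(/^([^@]+)@/)

whole  = m[0]        # the entire matched text, same as m.to_s
local  = m[1]        # the first captured group
groups = m.captures  # all captured groups as an array

puts whole   # bob@
puts local   # bob
p groups     # ["bob"]
```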

There is a shorter solution:
@cleaned = @message_to[/[^@]+/]

An even shorter version than mu_is_too_short's would be:
@cleaned = @message_to[/^([^@]+)@/, 1]
The String#[] method can take a regular expression.
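One practical difference worth sketching: String#[] with a regex returns nil when nothing matches, whereas chaining [1] onto match raises because match itself returns nil. The sample strings below are assumptions chosen for illustration.

```ruby
pattern = /^([^@]+)@/

# With a capture index, String#[] returns just that group.
local_part = 'bob@google.com'[pattern, 1]

# No match: String#[] quietly returns nil.
missing = 'no-at-sign'[pattern, 1]

# match returns nil here, so calling [1] on it raises NoMethodError.
raised = begin
  'no-at-sign'.match(pattern)[1]
  false
rescue NoMethodError
  true
end

p local_part  # "bob"
p missing     # nil
p raised      # true
```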

The simplest RegEx I got to work in the IRB console is:
@message_to = 'bob@google.com'
@cleaned = @message_to.match(/(.+)@/)[1]
Also from this link you could try:
@cleaned = @message_to.match(/^(?<local_part>[\w\W]*?)@/)[:local_part]

The most obvious way to adjust your code is to use a positive lookahead assertion. Instead of saying "match bob@" you're now saying "match bob, when followed by an @":
@message_to = 'bob@google.com'
@cleaned = @message_to.match(/^(.*)+(?=@)/)
A further point about when to use and not to use regexes: yes, using a regex is a bit pointless in this case. But when you do use a regex, it's easier to add validation as well:
@cleaned = @message_to.match(/^([-a-zA-Z0-9!\#$%&'*+\/=?^_`{|}~]+\.)*[-a-zA-Z0-9!\#$%&'*+\/=?^_`{|}~]+(?=@)/)
(and yes, all of those are valid in email addresses)
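A brief sketch of what the lookahead changes, assuming the address is bob@google.com: the @ must be present but is not consumed, so it stays out of the match and remains at the start of the text after it.

```ruby
message_to = 'bob@google.com'

# (?=@) requires an @ right after the match without including it.
m = message_to.match(/^[^@]+(?=@)/)

whole = m[0]          # the match itself, with no trailing @
rest  = m.post_match  # everything after the match, starting at the @

puts whole  # bob
puts rest   # @google.com
```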

Related

How to split a string which contains multiple forward slashes

I have a string as given below,
./component/unit
and need to split it to get component/unit, which I will use as a key when inserting into a hash.
I tried .split(/.\//).last, but it returns only unit instead of component/unit.
I think this should help you:
string = './component/unit'
string.split('./')
#=> ["", "component/unit"]
string.split('./').last
#=> "component/unit"
Your regex was almost fine :
split(/\.\//)
You need to escape both . (any character) and / (regex delimiter).
As an alternative, you could just remove the first './' substring :
'./component/unit'.sub('./','')
#=> "component/unit"
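To see why the unescaped dot returned only unit, this sketch compares the two patterns side by side. The last line uses String#delete_prefix, which assumes Ruby 2.5 or later.

```ruby
path = './component/unit'

# Unescaped: "." matches any character, so "t/" is also a separator.
unescaped = path.split(/.\//)

# Escaped: only a literal "./" is a separator.
escaped = path.split(/\.\//)

# No regex at all: drop the leading "./" if it is there.
stripped = path.delete_prefix('./')

p unescaped  # ["", "componen", "unit"]
p escaped    # ["", "component/unit"]
p stripped   # "component/unit"
```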
All the other answers are fine, but I think you are not really dealing with a String here but with a URI or Pathname, so I would advise you to use these classes if you can. If so, please adjust the title, as it is not about do-it-yourself-regexes, but about proper use of the available libraries.
Link to the ruby doc:
https://docs.ruby-lang.org/en/2.1.0/URI.html
and
https://ruby-doc.org/stdlib-2.1.0/libdoc/pathname/rdoc/Pathname.html
An example with Pathname is:
require 'pathname'
pathname = Pathname.new('./component/unit')
puts pathname.cleanpath # prints component/unit
# pathname.cleanpath.to_s # => "component/unit"
Whether this is a good idea (and whether using URI would make sense too) also depends on what your real problem is, i.e. what you want to do with the extracted String. As stated, I somewhat doubt that you are really interested in Strings.
Using a positive lookbehind, you could use a regex:
reg = /(?<=\.\/)[\w+\/]+\w+\z/
Demo
str = './component'
str2 = './component/unit'
str3 = './component/unit/ruby'
str4 = './component/unit/ruby/regex'
[str, str2, str3, str4].each { |s| puts s[reg] }
#component
#component/unit
#component/unit/ruby
#component/unit/ruby/regex

Regex to match a specific parenthesis among multiple

Take the String:
"The only true (wisdom) is in knowing you know (nothing)"
I want to extract nothing.
What I know about it:
It will always be inside a parenthesis
The parenthesis will always be the last element before the line-end: $
I first attempted to match it with
/\(.*\)$/, but that obviously returned
(wisdom) is in knowing you know (nothing).
You want to use a negated character class, like [^...]:
s = 'The only true (wisdom) is in knowing you know (nothing)'
s.match(/\(([^)]+)\)$/).captures
In this case, nothing is in the first sub-group match, but the entire regex technically matches (nothing). To match exactly nothing as the entire match, use:
s = 'The only true (wisdom) is in knowing you know (nothing)'
s.match(/(?<=\()([^)]+)(?=\)$)/).captures
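The difference between the two patterns shows up in index 0 of the MatchData. A quick sketch, with variable names chosen only for illustration:

```ruby
s = 'The only true (wisdom) is in knowing you know (nothing)'

# Parentheses are consumed, so they are part of the whole match.
with_parens = s.match(/\(([^)]+)\)$/)

# Lookbehind and lookahead only assert the parentheses are there.
lookaround = s.match(/(?<=\()[^)]+(?=\)$)/)

puts with_parens[0]  # (nothing)
puts with_parens[1]  # nothing
puts lookaround[0]   # nothing
```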
I would do
s = 'The only true (wisdom) is in knowing you know (nothing)'
s.match(/\(([^)]+)\)$/).captures # => ["nothing"]
You could use scan to find all matches and then take the last one:
str = "The only true (wisdom) is in knowing you know (nothing)"
str.scan(/\((.+?)\)/).last.first
#=> "nothing"
You can use \z, which matches the end of the string. Try:
\([a-z]+\)\z
It's way simpler and will ignore everything except what you need.
Test it here:
http://rubular.com/
It's even trickier if there's any chance of nesting. In that case you need some recursion:
"...knowing you know ((almost) nothing)"[/\(((?:[^()]*|\(\g<1>\))*)\)$/, 1]
#=> "(almost) nothing"
Look ma, no regex!
s = 'The only true (wisdom) is in knowing you know (nothing)'
r = s.reverse
r[(r.index(')') + 1)...(r.index('('))].reverse
#=> "nothing"

extract with regex, one liner in ruby

I would like to extract the word after "=". For "GENEINFO=AGRN:" in a document, I can use the regex /GENEINFO=(.*?):/ to match it. However, the value I want returned is just "AGRN". Is there a one-liner that I can use for this task?
Try using a lookbehind and a lookahead:
/(?<=GENEINFO=).*?(?=:)/
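As a sketch of that pattern in a one-liner: because it has no capture groups, both String#[] and scan return plain strings. The input line below is made up for illustration and is not real data.

```ruby
pattern = /(?<=GENEINFO=).*?(?=:)/

# Hypothetical input with two entries on one line.
line = 'GENEINFO=AGRN:1;GENEINFO=XYZ:2'

first_value = line[pattern]      # first match only
all_values  = line.scan(pattern) # every match, as a flat array

p first_value  # "AGRN"
p all_values   # ["AGRN", "XYZ"]
```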
You could also use match:
'GENEINFO=AGRN:'.match(/GENEINFO=(.*?):/)[1]
#=> "AGRN"
Which could also be written using the String#[] method:
'GENEINFO=AGRN:'[/GENEINFO=(.*?):/, 1]
#=> "AGRN"
"GENEINFO=AGRN:"[/(?<==).*(?=:)/]
# => "AGRN"
You want a lookbehind and lookahead.
pattern = /(?<=GENEINFO=)(.*?)(?=:)/
value = "GENEINFO=AGRN:".scan(pattern)
# => [["AGRN"]]

Using regexes in Ruby

I have a regex that I'm trying to use in Ruby. Here is my regex, and it works in Java when I add the double escapes:
\(\*(.*?)\*\)
I know this is a simple question, but how would I write this as a ruby expression and set it equal to a variable? I appreciate any help.
try this:
myregex = /\(\*(.*?)\*\)/
To be clear, this is just to save the regex to a variable. To use it:
"(**)" =~ myregex
Regular expressions are a native type in Ruby (the actual class is Regexp). You can just write:
mypat = /\(\*(.*?)\*\)/
[Looks like anything between '(*' / '*)' pairs, yes?]
You can then do
m = mypat.match(str)
comment = m[1]
...or, more compactly
comment = mypat.match(str)[1]
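If a string may hold several such comments, scan is one way to collect them all. The sketch below also shows why the lazy .*? matters; the sample string and the greedy variant are assumptions added for comparison.

```ruby
mypat  = /\(\*(.*?)\*\)/
greedy = /\(\*(.*)\*\)/   # same pattern with a greedy quantifier

str = '(* a *) x (* b *)'

# scan returns one array per match when the pattern has groups.
comments = str.scan(mypat).flatten

# The greedy version runs to the last "*)" and swallows both comments.
swallowed = str.match(greedy)[1]

p comments   # [" a ", " b "]
p swallowed  # " a *) x (* b "
```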
Try this:
if /\(\*(.*?)\*\)/ === "(*hello*)"
  content = $1 # => "hello"
end
http://rubular.com/r/7eCuPX3ri0
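The =~ operator sets $1 in the same way and returns the index of the match, or nil when there is none. The sketch below also builds the regex from a string with Regexp.new, which is the one case where Java-style double backslashes are needed in Ruby. The sample strings are assumptions.

```ruby
myregex = /\(\*(.*?)\*\)/

# =~ returns the position where the match starts and fills in $1.
index    = 'x (* note *) y' =~ myregex
captured = $1

# No match: =~ returns nil.
no_index = 'plain text' =~ myregex

# Built from a string, each backslash has to be doubled.
from_string = Regexp.new("\\(\\*(.*?)\\*\\)")

p index     # 2
p captured  # " note "
p no_index  # nil
p from_string.source == myregex.source  # true
```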

How to replace the last occurrence of a substring in ruby?

I want to replace the last occurrence of a substring in Ruby. What's the easiest way?
For example, in abc123abc123, I want to replace the last abc with ABC. How do I do that?
How about
new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse
For instance:
irb(main):001:0> old_str = "abc123abc123"
=> "abc123abc123"
irb(main):002:0> pattern="abc"
=> "abc"
irb(main):003:0> replacement="ABC"
=> "ABC"
irb(main):004:0> new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse
=> "abc123ABC123"
"abc123abc123".gsub(/(.*(abc.*)*)(abc)(.*)/, '\1ABC\4')
#=> "abc123ABC123"
But probably there is a better way...
Edit:
...which Chris kindly provided in the comment below.
So, as * is a greedy operator, the following is enough:
"abc123abc123".gsub(/(.*)(abc)(.*)/, '\1ABC\3')
#=> "abc123ABC123"
Edit2:
There is also a solution which neatly illustrates parallel array assignment in Ruby:
*a, b = "abc123abc123".split('abc', -1)
a.join('abc')+'ABC'+b
#=> "abc123ABC123"
Since Ruby 2.0 we can use \K which removes any text matched before it from the returned match. Combine with a greedy operator and you get this:
'abc123abc123'.sub(/.*\Kabc/, 'ABC')
#=> "abc123ABC123"
This is about 1.4 times faster than using capturing groups as Hirurg103 suggested, but that speed comes at the cost of lowering readability by using a lesser-known pattern.
more info on \K: https://www.regular-expressions.info/keep.html
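A small sketch of what \K does to the reported match: the greedy .* still has to match, but only the text after \K counts as the match that sub replaces.

```ruby
s = 'abc123abc123'
m = s.match(/.*\Kabc/)

matched  = m[0]        # only the part after \K
start_at = m.begin(0)  # where that part starts in s
replaced = s.sub(/.*\Kabc/, 'ABC')

p matched   # "abc"
p start_at  # 6
p replaced  # "abc123ABC123"
```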
Here's another possible solution:
>> s = "abc123abc123"
=> "abc123abc123"
>> s[s.rindex('abc')...(s.rindex('abc') + 'abc'.length)] = "ABC"
=> "ABC"
>> s
=> "abc123ABC123"
When searching in huge streams of data, using reverse will definitely* lead to performance issues. I use string.rpartition*:
sub_or_pattern = "!"
replacement = "?"
string = "hello!hello!hello"
array_of_pieces = string.rpartition sub_or_pattern
( array_of_pieces[(array_of_pieces.find_index sub_or_pattern)] = replacement ) rescue nil
p array_of_pieces.join
# "hello!hello?hello"
The same code also works with a string that has no occurrences of sub_or_pattern:
string = "hello_hello_hello"
# ...
# "hello_hello_hello"
*rpartition uses rb_str_subseq() internally. I didn't check if that function returns a copy of the string, but I think it preserves the chunk of memory used by that part of the string. reverse uses rb_enc_cr_str_copy_for_substr(), which suggests that copies are done all the time -- although maybe in the future a smarter String class may be implemented (having a flag reversed set to true, and having all of its functions operating backwards when that is set), as of now, it is inefficient.
Moreover, regex patterns can't simply be reversed. The question only asks for replacing the last occurrence of a substring, so that's OK, but readers in need of something more robust won't benefit from the most voted answer (as of this writing).
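For reference, a sketch of what rpartition returns in each case, including with a Regexp argument, which avoids having to reverse a pattern. The variable names are illustrative.

```ruby
# Found: [text before, separator, text after], split at the LAST occurrence.
found = 'hello!hello!hello'.rpartition('!')

# Not found: two empty strings followed by the whole string.
not_found = 'hello_hello_hello'.rpartition('!')

# rpartition also accepts a Regexp.
by_regex = 'abc123abc123'.rpartition(/abc/)

head, sep, tail = found
result = sep.empty? ? tail : "#{head}?#{tail}"

p found      # ["hello!hello", "!", "hello"]
p not_found  # ["", "", "hello_hello_hello"]
p by_regex   # ["abc123", "abc", "123"]
p result     # "hello!hello?hello"
```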
You can achieve this with String#sub and greedy regexp .* like this:
'abc123abc123'.sub(/(.*)abc/, '\1ABC')
simple and efficient:
s = "abc123abc123abc"
p = "123"
s.slice!(s.rindex(p), p.size)
s == "abc123abcabc"
string = "abc123abc123"
pattern = /abc/
replacement = "ABC"
matches = string.scan(pattern).length
index = 0
string.gsub(pattern) do |match|
  index += 1
  index == matches ? replacement : match
end
#=> "abc123ABC123"
I've used this handy helper method quite a bit:
def gsub_last(str, source, target)
  return str unless str.include?(source)
  top, middle, bottom = str.rpartition(source)
  "#{top}#{target}#{bottom}"
end
If you want to make it more Rails-y, extend it on the String class itself:
class String
  def gsub_last(source, target)
    return self unless self.include?(source)
    top, middle, bottom = self.rpartition(source)
    "#{top}#{target}#{bottom}"
  end
end
Then you can just call it directly on any String instance, e.g. "fooBAR123BAR".gsub_last("BAR", "FOO") == "fooBAR123FOO"
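.gsub /abc(?=[^abc]*$)/, 'ABC'
Matches an "abc" and then asserts ((?=...) is a positive lookahead) that none of the characters a, b, or c appear between it and the end of the line. This is stricter than requiring that no further "abc" follows: a lone trailing a, b, or c is enough to prevent any match.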
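A sketch comparing the character-class lookahead with a negative lookahead on the substring itself. The `general` pattern is an alternative assumed here for comparison and is not part of the answer above. The character class [^abc] excludes the individual letters, so the two patterns differ when a stray a, b, or c follows the last abc.

```ruby
strict  = /abc(?=[^abc]*$)/  # no a, b, or c may follow
general = /abc(?!.*abc)/m    # no further "abc" may follow

same_result   = 'abc123abc123'.sub(strict, 'ABC')
strict_fails  = 'abc123abc123a'.sub(strict, 'ABC')   # trailing "a" blocks it
general_works = 'abc123abc123a'.sub(general, 'ABC')

p same_result    # "abc123ABC123"
p strict_fails   # "abc123abc123a"
p general_works  # "abc123ABC123a"
```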

Resources