Can we use the relational operator in gsub? - ruby

I need to replace each . character with . \n in the string below, with one constraint: the dots inside one particular pattern must be left alone.
"test was done and was negative. Urine dipstick: ph = 6\\n \\342\\200\\242 spec. Grav. = 1.015"
I need the following output:
"test was done and was negative. \n Urine dipstick: ph = 6\\n \\342\\200\\242 spec. Grav. = 1.015"
The part that must stay untouched is "spec. Grav. = 1.015".

str = "test was done and was negative. Urine dipstick: ph = 6\\n \\342\\200\\242 spec. Grav. = 1.015"
puts str.sub('. ', ".\n")
#=> test was done and was negative.
#=> Urine dipstick: ph = 6\n \342\200\242 spec. Grav. = 1.015
String#sub only substitutes the first match.

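str.gsub(/\.(?! (Grav|=)|\d)/, ".\n")
should do the job.
Brief explanation
\. matches a literal . character.
(?!...) denotes a negative look-ahead: the dot is not matched when the text right after it matches the pattern inside the brackets.
A space plus (Grav|=) means a dot followed by a space and then either Grav or = won't be matched.
\d likewise skips a dot followed by a digit, as in 1.015.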
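The expected output in the question keeps the space after the dot and places the newline after that space. Here is a minimal sketch of a variant that matches the dot together with its trailing space, assuming Grav and = are the only exceptions that matter (the result variable is illustrative):

```ruby
str = "test was done and was negative. Urine dipstick: ph = 6\\n \\342\\200\\242 spec. Grav. = 1.015"

# Match a dot plus the space after it, unless that space is followed by
# "Grav" or "=". The replacement re-emits the dot and space, then adds
# a newline and a space.
result = str.gsub(/\. (?!Grav|=)/, ". \n ")

puts result
```

Because the dot in 1.015 has no space after it, it is never a candidate for replacement.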

You want this?
str.gsub(/\.(?!\n)/, "\.\n")

Related

Need better regex solution in Ruby

I have following code:
date_time = Time.now.strftime('%Y%m%d%H%M%S')
name = "builder-#{date_time}" # builder-20150923125450
if some_condition
name.sub!("#{date_time}", "one-#{date_time}") # builder-one-20150923125450
end
Above code is working fine.
But I think it could be better as I feel like I am repeating #{date_time} twice here.
I have heard of regex capture and replace. Can we use it here? If yes, how?
To use capturing, wrap the subpattern you want to reuse in parentheses; you can then refer to it with a back-reference in the replacement string.
Here is an example:
date_time = Time.now.strftime('%Y%m%d%H%M%S')
name = "builder-#{date_time}"
puts name.sub(/^([^-]*-)/, "\\1one-")
The pattern ^([^-]*-) matches, from the beginning of the string (^), every character other than - plus the hyphen that follows, and captures that text; \\1 in the replacement string then refers to it.
Refer to Use Parentheses for Grouping and Capturing at Regular-Expressions.info for more details.
A simpler way is to use a ternary operator when initializing the name variable:
date_time = Time.now.strftime('%Y%m%d%H%M%S')
name = "builder-" + (some_condition ? "one-" : "") + date_time
Strategy one - precalculate the prefix:
date_time = Time.now.strftime('%Y%m%d%H%M%S')
prefix = some_condition ? 'builder-one-' : 'builder-'
name = "#{prefix}#{date_time}"
The string 'builder-' appears twice here. Obviously, you can DRY it up even more, but that's overkill IMHO.
Strategy two - use a lookahead:
date_time = Time.now.strftime('%Y%m%d%H%M%S')
name = "builder-#{date_time}"
name.sub!(/(?=#{date_time})/, "one-") if some_condition
Now date_time appears only twice. I wouldn't say it's a great improvement. I wouldn't say there is much of a problem to begin with.
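To see why the lookahead version inserts text rather than replacing it, here is a small sketch with a fixed timestamp so the result is reproducible. The Regexp.escape call and the renamed variable are additions for illustration, not part of the answer above:

```ruby
date_time = "20150923125450"   # fixed value instead of Time.now
name = "builder-#{date_time}"

# (?=...) matches the empty string just before the timestamp, so sub
# inserts "one-" at that position without consuming any characters.
# Regexp.escape is a precaution in case the interpolated value ever
# contains regex metacharacters; it changes nothing for plain digits.
renamed = name.sub(/(?=#{Regexp.escape(date_time)})/, "one-")

puts renamed
```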
"builder-" + ("one-" if some_condition).to_s + date_time
date_time = "right now"
some_condition = true
"builder-" + ("one-" if some_condition).to_s + date_time
#=> "builder-one-right now"
some_condition = false
"builder-" + ("one-" if some_condition).to_s + date_time
#=> "builder-right now"
Note that:
("one-" if false).to_s #=> nil.to_s => ""

regex replace [ with \[

I want to write a regex in Ruby that will add a backslash prior to any open square brackets.
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/,"\\[")
# desired out = "my.name\[0].hello.line\[2]"
I've tried multiple combinations of backslashes in the substitution string and can't get it to leave a single backslash.
You don't need a regular expression here.
str = "my.name[0].hello.line[2]"
puts str.gsub('[', '\[')
# my.name\[0].hello.line\[2]
I tried your code and it worked correctly:
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/,"\\[")
puts out #my.name\[0].hello.line\[2]
If you replace puts with p, you get the inspect version of the string:
p out #"my.name\\[0].hello.line\\[2]"
Note the surrounding " and the escaped \. Maybe this is the result you saw.
As Daniel already answered, you can also define the string with ' and then you don't need to escape the backslash.
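One way to confirm that the string really contains single backslashes is to count them instead of reading the printed form. A short sketch (the backslashes variable is illustrative):

```ruby
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/, "\\[")   # "\\[" is two characters: a backslash and a bracket

puts out   # raw characters, one backslash before each [
p out      # inspect form, where every backslash is shown doubled

# Count the backslash characters actually stored in the string.
backslashes = out.chars.count("\\")
puts backslashes
```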

What is the opposite of Regexp.escape?

What is the opposite of Regexp.escape ?
> Regexp.escape('A & B')
=> "A\\ &\\ B"
> # do something, to get the next result: (something like Regexp.unescape(A\\ &\\ B))
=> "A & B"
How can I get the original value?
replaces = Hash.new { |hash,key| key } # simple trick to return key if there is no value in hash
replaces['t'] = "\t"
replaces['n'] = "\n"
replaces['r'] = "\r"
replaces['f'] = "\f"
replaces['v'] = "\v"
rx = Regexp.escape('A & B')
str = rx.gsub(/\\(.)/){ replaces[$1] }
Also make sure to print the result with #puts in irb, because #inspect escapes characters by default.
Basically, escaping/quoting looks for metacharacters and prepends a \ character (which itself has to be escaped in a string literal in source code). But if it finds a control character from the list \t, \n, \r, \f, \v, it outputs a \ character followed by the letter that names that control character.
UPDATE:
My solution had problems with special characters (\n, \t and so on); I updated it after investigating the source code of the rb_reg_quote method.
UPDATE 2:
replaces is a hash that converts escaped characters to unescaped ones (that's why it is used in the block attached to gsub). It is indexed by the character without the escape character (the second character in the sequence) and looks up the unescaped value. The only defined values are control characters, but there is also a default_proc attached (the block passed to Hash.new), which returns the key if no value is found in the hash. So it works like this:
for "n" it returns "\n", and likewise for the other escaped control characters, because that is the value associated with the key
for "(" it returns "(", because there is no value associated with the "(" key, so the hash calls its #default_proc, which returns the key itself
The only characters escaped by Regexp.escape are metacharacters and control characters, so we don't have to worry about alphanumerics.
Take a look at http://ruby-doc.org/core-2.0.0/Hash.html#method-i-default_proc for documentation on #default_proc.
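The pieces above can be wrapped into a single helper and checked with a round trip. This is a sketch only: regexp_unescape and REPLACES are hypothetical names, since Ruby itself provides no Regexp.unescape.

```ruby
# Lookup table for the letters that Regexp.escape emits for control
# characters. The default block returns the key itself for anything else.
REPLACES = Hash.new { |_hash, key| key }
REPLACES.update("t" => "\t", "n" => "\n", "r" => "\r", "f" => "\f", "v" => "\v")

# Hypothetical helper: reverses Regexp.escape by dropping each escaping
# backslash and translating the character that follows it.
def regexp_unescape(escaped)
  escaped.gsub(/\\(.)/) { REPLACES[$1] }
end

original = "A & B (and * c [ e] + )\tnext\nline"
round_trip = regexp_unescape(Regexp.escape(original))

puts round_trip == original
```

The helper is only meant for strings produced by Regexp.escape; it makes no attempt to interpret arbitrary regex source.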
You can perhaps use something like this? (Note that eval runs any code interpolated into the string, so only use it on trusted input.)
def unescape(s)
eval %Q{"#{s}"}
end
puts unescape('A\\ &\\ B')
Credits to this question.
If you are okay with a regex solution, you can use this:
res = s.gsub(/\\(?!\\)|(\\)\\/, "\\1")
Try this
>> r = Regexp.escape("A & B (and * c [ e] + )")
# => "A\\ &\\ B\\ \\(and\\ \\*\\ c\\ \\[\\ e\\]\\ \\+\\ \\)"
>> r.gsub("\\(","(").gsub("\\)",")").gsub("\\[","[").gsub("\\]","]").gsub("\\{","{").gsub("\\}","}").gsub("\\.",".").gsub("\\?","?").gsub("\\+","+").gsub("\\*","*").gsub("\\ "," ")
# => "A & B (and * c [ e] + )"
Basically, (, ), [, ], {, }, ., ?, +, and * are metacharacters in regex, along with \, which is used as the escape character.
The chain of gsub() calls replaces each escaped pattern with the corresponding actual character.
I am sure there is a way to DRY this up.
Update: DRY version as suggested by user2503775
>> r.gsub("\\","")
Update:
The following are the special characters in regex:
[,],{,},(,),|,-,*,.,\\,?,+,^,$,<space>,#,\t,\f,\v,\n,\r
Using a regex replace with \\(?=([\\\*\+\?\|\{\[\(\)\^\$\.\#\ ]))\
should give you the unescaped string; you would only have to replace the \r\n sequences with their CRLF counterparts.
"There\ is\ a\ \?\ after\ the\ \(white\)\ car\.\ \r\n\ it\ should\ be\ http://car\.com\?\r\n"
is unescaped to:
"There is a ? after the (white) car. \r\n it should be http://car.com?\r\n"
and replacing the \r\n sequences gives you:
There is a ? after the (white) car.
it should be http://car.com?

What is the Ruby equivalent of preg_quote()?

In PHP you need to use preg_quote() to escape all the characters in a string that have a particular meaning in a regular expression, to allow (for example) preg_match() to search for those special characters.
What is the equivalent in Ruby of the following code?
// The content of this variable is obtained from user input, for example.
$search = "$var = 100";
if (preg_match('/' . preg_quote($search, '/') . ";/i")) {
// …
}
You want Regexp.escape.
str = "[...]"
re = /#{Regexp.escape(str)}/
"la[...]la[...]la".gsub(re,"") #=> "lalala"

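As a rough Ruby counterpart of the PHP snippet in the question, the escaped search text can be interpolated into a regex literal with the i flag. The subject string below is a made-up stand-in, since the question does not show what is being searched:

```ruby
search  = '$var = 100'                 # stands in for user input
subject = 'x = 1; $VAR = 100; y = 2'   # hypothetical text to search

# Regexp.escape neutralises the "$" and the spaces, so the input is
# matched literally; the trailing ";" and the i flag mirror the PHP code.
pattern = /#{Regexp.escape(search)};/i
found = pattern.match?(subject)

puts found
```
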
Way to partially match a Ruby string using Regexp

I'm working on two cases.
Assume I have these variables:
a = "hello"
b = "hello-SP"
c = "not_hello"
Any partial matches
I want to accept any string that contains the value of variable a, so b and c would match.
Patterned match
I want to match a string that contains a followed by '-', so b would match but c would not.
My problem is that I have always used the /expression/ syntax to define a Regexp, so how do I define a Regexp dynamically in Ruby?
You can use the same syntax to use variables in a regex, so:
reg1 = /#{a}/
would match on anything that contains the value of the a variable (at the time the expression is created!) and
reg2 = /#{a}-/
would do the same, plus a hyphen, so hello- in your example.
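Applied to the three strings from the question, the two patterns behave as follows. This is a small sketch; case1, case2 and frozen_source are illustrative names:

```ruby
a = "hello"
b = "hello-SP"
c = "not_hello"

reg1 = /#{a}/    # case 1: the value of a appears anywhere
reg2 = /#{a}-/   # case 2: the value of a followed directly by "-"

case1 = [b, c].map { |s| s.match?(reg1) }
case2 = [b, c].map { |s| s.match?(reg2) }

# The pattern is fixed when the literal is evaluated, so reassigning a
# afterwards does not change reg1.
a = "bye"
frozen_source = reg1.source

p case1, case2, frozen_source
```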
Edit: As Wayne Conrad points out, if a contains "any characters that would have special meaning in a regular expression," you need to escape them. Example:
a = ".com"
b = Regexp.new(Regexp.escape(a))
"blah.com" =~ b
Late to comment, but I wasn't able to find what I was looking for, and the answers above didn't help me. I hope this helps someone new to Ruby who just wants a quick fix.
Ruby code:
st = "BJ's Restaurant & Brewery"
# put the string you want to search into a variable
m = /BJ's/i.match(st) # /your regular expression/.match(string)
# m holds the match: #<MatchData "BJ's">
m.to_s
# this returns the matched text
=> "BJ's"

Resources