using gsub in ruby strings correctly - ruby

I have this expression:
channelName = rhash["Channel"].gsub("'", " ")
It works fine. However, I can only substitute one character with it. I want to add a few more characters to substitute, so I tried the following:
channelName = rhash["Channel"].gsub(/[':;] /, " ")
This did not work: no substitution was done on the strings, and there was no error message. I also tried this:
channelName = rhash["Channel"].gsub!("'", " ")
This led to a blank string, which is absolutely not what I wanted.
I would like to have a gsub method to substitute the following characters with a space in my string:
' ; :
My questions:
How can I structure my gsub method so that all instances of the above characters are replaced with a space?
What is happening with gsub! above, given that it's returning a blank?

Your second attempt was very close. The problem is that you left a space after the closing bracket, meaning it was only looking for one of those symbols followed by a space.
Try this:
channelName = rhash["Channel"].gsub(/[':;]/, " ")

This does not answer your question, but is a better way to do it.
channelName = rhash["Channel"].tr("':;", " ")
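On the second question (why gsub! gave a blank): gsub! modifies the string in place and returns nil when no substitution was made, so assigning its return value can leave you with nil. A minimal sketch, assuming made-up sample values in rhash (the real data from the question is not shown):

```ruby
# Hypothetical sample values; rhash stands in for the hash from the question.
rhash = { "Channel" => "Kids' TV: News".dup, "Plain" => "BBC One".dup }

# gsub returns a new string and leaves the original untouched.
cleaned = rhash["Channel"].gsub(/[':;]/, " ")

# gsub! modifies the receiver in place. It returns the receiver when at
# least one substitution happened, and nil when nothing matched.
changed   = rhash["Channel"].gsub!("'", " ")
unchanged = rhash["Plain"].gsub!("'", " ")

p cleaned    # "Kids  TV  News"
p changed    # "Kids  TV: News"
p unchanged  # nil
```

So if the channel name contains none of the characters, gsub! hands back nil, which is probably the "blank" seen above. Use gsub when you want to assign the result.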

Related

Regexp.escape adds weird escapes to a plain space

I stumbled over this problem using the following simplified example:
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for any String stored in searchstring, the gsub! would leave line empty afterwards. Indeed, this is the case for many strings, but not for this one:
searchstring = "D "
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
p line
It turns out that line is printed as "D " afterwards, i.e. no replacement was performed.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string, in the following way:
REPLACEMENTS.each do |from, to|
  line.chomp!
  line.gsub!(Regexp.escape(from)) { to }
end
I'm using Regexp.escape just as a safety measure in case the string being replaced contains some regex metacharacter.
I'm using the Cygwin port of MRI Ruby 2.6.4.
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for any String stored in searchstring, the gsub! would leave line empty afterwards.
Your understanding is incorrect. The guarantee in the docs is
For any string, Regexp.new(Regexp.escape(str))=~str will be true.
This does hold for your example
Regexp.new(Regexp.escape("D "))=~"D " # => 0
Therefore, this is what your code should look like:
line.gsub!(Regexp.new(Regexp.escape(searchstring))) { '' }
As for why this is the case, there used to be a bug where Regexp.escape would incorrectly handle space characters:
# in Ruby 1.8.4
Regexp.escape("D ") # => "D\\s"
My guess is they tried to keep the fix as simple as possible by replacing 's' with ' '. Technically this does add an unnecessary escape character but, again, that does not break the intended use of the method.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
This looks to be a bug. In my opinion, whitespace is not a Regexp meta character, there is no need to escape it.
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string […]
If you want to do literal string replacement, then don't use a Regexp. Just use a literal string:
line.gsub!(from, to)
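To tie the answers together, here is a small sketch of the three variants. It relies on the fact that a String pattern passed to gsub! is matched literally, so the backslash added by Regexp.escape would have to be present in line. The variable names other than searchstring and line are made up for illustration:

```ruby
searchstring = "D "
escaped = Regexp.escape(searchstring)  # the three characters D, backslash, space

# String pattern: matched literally, and line contains no backslash.
line = searchstring.dup
string_result = line.gsub!(escaped) { "" }
p string_result   # nil, nothing was replaced

# Regexp pattern: the escaped space matches a plain space.
line = searchstring.dup
line.gsub!(Regexp.new(escaped)) { "" }
regexp_line = line
p regexp_line     # ""

# Literal replacement: a plain String pattern needs no escaping,
# even when it contains regex metacharacters such as "." or "(".
literal_line = "a.b (D ) axb"
literal_line.gsub!("a.b") { "Z" }
p literal_line    # "Z (D ) axb"
```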

Extract values after pattern in Ruby string

I have a string like this:
"<root><some ProdCode=\"40\" ProducerName=\"demo1\" ProdCode=\"40\" Need_Confirmation=\"1\"/><some ProdCode=\"40\" ProducerName=\"demo1\" ProdCode=\"40\" Need_Confirmation=\"1\"/></root>"
I'm trying to pull out the content that sits between the quotes in each =\"content\" and put it in an array, like ["40","demo1","40","1",40......]
You can use scan to select elements by regexp pattern, then remove the surrounding quote characters.
string.scan(/"[^"]+"/).map { |element| element.delete('\\"') }
Explanation of pattern:
/ – regexp starts
" – first char should be "
[^"]+ – next should be any char except ". + sign says that number of such chars should be at least 1.
" – next should be again "
/ – regexp ends
So string.scan(/"[^"]+"/) would return:
["\"40\"", "\"demo1\"", "\"40\"", "\"1\"", "\"40\"", "\"demo1\"", "\"40\"", "\"1\""]
Then we can just delete the surrounding " characters using the delete method.
A convenient tool for building regexps is http://rubular.com/
When your string is this simple you can use scan + regular expression like this:
result = html.scan(/ProdCode="\d+?"/)
If it is more complex, you can use an HTML parser like Nokogiri or Oga.
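As an alternative sketch (not taken from the answers above), a capture group lets scan return only the text inside the quotes, so no second clean-up step is needed. The input is shortened to one element for brevity:

```ruby
# Shortened version of the string from the question.
string = "<root><some ProdCode=\"40\" ProducerName=\"demo1\" ProdCode=\"40\" Need_Confirmation=\"1\"/></root>"

# With one capture group, scan returns an array of one-element arrays,
# so flatten turns it into a flat list of attribute values.
values = string.scan(/="([^"]*)"/).flatten

p values  # ["40", "demo1", "40", "1"]
```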

start_with not working for backslash in ruby

I have the following string -
abcdefgh;
lmnopqrst;
On doing a string = string.split(";"), I get -
["abcdefgh", "\nlmnopqrst"]
Now when I do -
string[1].start_with?("\\")
The function returns false. Whereas if I do
string[0].start_with?("a")
The function returns true.
I am new to Ruby and just can't understand this behavior. Can anyone tell me what I am doing wrong?
I don't know for sure, but string[1][0] (the first character of the string) returns "\n", so maybe use this:
string[1].start_with?("\n")
This is because "\n" does not actually start with a backslash. It is the line feed character, which is a single character; it is only displayed with the escape character \ in front of it.
So:
string[1].start_with?("\n")
Will return true.
You already tried to search with string[1].start_with?("\\") so you seem to realize you need to escape the backslash character by using \\.
If your input string looked like this:
\abcdefgh;
lmnopqrst;
Then after .split(';') your resulting array would look like this:
["\\abcdefgh", "\nlmnopqrst"]
Now string[0].start_with?("\\") would return true because the first string actually starts with a single backslash, which was presented with the escape character in the console.
You can try:
'\nhello world'.start_with?("\\") # => true
"\nhello world".start_with?("\\") # => false
because '\n' is two characters (\ and n), but "\n" is one character (the newline character).
The first character there is not "\" - it's "\n" in the first example, and "\\" in the second. "\n" and "\\" are effectively single characters in this context, even though they look like two characters.
"\n" != "\\", and so start_with? responds false.
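The difference between the two quoting styles can be checked directly. This is only a small illustration; the variable names are made up:

```ruby
double = "\nhello"  # escape is interpreted: the first character is a line feed
single = '\nhello'  # no interpretation: a backslash followed by the letter n

p double.length             # 6
p single.length             # 7
p double[0].ord             # 10, the code of the line feed character
p double.start_with?("\n")  # true
p double.start_with?("\\")  # false
p single.start_with?("\\")  # true
```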

Regular expression to match special characters within double quotes

My input string is :
"& is here "& is here also, & has again occured""
Using the gsub method in Ruby, is there a way to substitute the character '&' with '$' wherever it occurs within double quotes? If gsub doesn't solve this problem, is there any other approach that can be used?
Since the first argument of gsub can be a regex, and whatever it matches is replaced by the second argument, finding the right regex to identify these occurrences might also solve the problem.
Expected output is as shown :
& is here "$ is here also , $ has again occured"
str = %q{& is here "& is here also , & has again occured"}
str.gsub!(/".*?"/) do |substr|
  substr.gsub(/&/, '$')
end
puts str
# => & is here "$ is here also , $ has again occured"
EDIT: Just noticed that stribizhev proposed this way before I wrote it.

regex replace [ with \[

I want to write a regex in Ruby that will add a backslash prior to any open square brackets.
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/,"\\[")
# desired out = "my.name\[0].hello.line\[2]"
I've tried multiple combinations of backslashes in the substitution string and can't get it to leave a single backslash.
You don't need a regular expression here.
str = "my.name[0].hello.line[2]"
puts str.gsub('[', '\[')
# my.name\[0].hello.line\[2]
I tried your code and it worked correctly:
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/,"\\[")
puts out #my.name\[0].hello.line\[2]
If you replace puts with p, you get the inspect version of the string:
p out #"my.name\\[0].hello.line\\[2]"
Note the surrounding " and the escaped \. Maybe this is the result you saw.
As Daniel already answered: you can also define the string with ' and then you don't need to escape the backslash.
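One more way to sidestep the backslash counting, offered as a sketch rather than as what the answers above used: with the block form of gsub, the block's return value is inserted as-is, without any backslash processing of the replacement.

```ruby
str = "my.name[0].hello.line[2]"

# "\\[" is a two-character string: one backslash and one bracket.
out = str.gsub("[") { "\\[" }

puts out                   # my.name\[0].hello.line\[2]
p out                      # "my.name\\[0].hello.line\\[2]"
p out.length - str.length  # 2, one backslash added per bracket
```

puts shows the actual characters, while p shows the inspect form, where every real backslash is displayed doubled.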
