start_with not working for backslash in ruby - ruby

I have the following string -
abcdefgh;
lmnopqrst;
On doing a string = string.split(";"), I get -
["abcdefgh", "\nlmnopqrst"]
Now when I do -
string[1].start_with?("\\")
The function returns false. Whereas if I do
string[0].start_with?("a")
The function returns true.
I am new to Ruby and just can't understand this behavior. Can anyone tell me what I am doing wrong?

I don't know, but string[1][0] (the first character of the string) returns "\n", so maybe use this:
string[1].start_with?("\n")

This is because "\n" does not actually start with a backslash. It is the line feed character, which is a single character; it is only displayed with the escape character \ in front of it.
So:
string[1].start_with?("\n")
Will return true.
You already tried to search with string[1].start_with?("\\") so you seem to realize you need to escape the backslash character by using \\.
If your input string looked like this:
\abcdefgh;
lmnopqrst;
Then after .split(';') your resulting array would look like this:
["\\abcdefgh", "\nlmnopqrst"]
Now string[0].start_with?("\\") would return true because the first string actually starts with a single backslash, which was presented with the escape character in the console.
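To see both cases side by side, here is a minimal sketch. The variable names are made up for illustration, and the two inputs are assumed to match the strings shown above.

```ruby
# Hypothetical inputs mirroring the two cases discussed above.
plain     = "abcdefgh;\nlmnopqrst;"
backslash = "\\abcdefgh;\nlmnopqrst;"   # "\\" in source code is ONE backslash character

plain_parts     = plain.split(";")
backslash_parts = backslash.split(";")

p plain_parts       # ["abcdefgh", "\nlmnopqrst"]
p backslash_parts   # ["\\abcdefgh", "\nlmnopqrst"]

puts plain_parts[1].start_with?("\n")      # true  (starts with a line feed)
puts plain_parts[1].start_with?("\\")      # false (no backslash in the string)
puts backslash_parts[0].start_with?("\\")  # true  (a real backslash)
```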

You can try:
'\nhello world'.start_with?("\\") # => true
"\nhello world".start_with?("\\") # => false
This is because '\n' is two characters (\ and n), but "\n" is one character (the newline character).
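Counting characters and bytes makes the difference visible. This is only a small sketch; the variable names are illustrative.

```ruby
newline   = "\n"   # double quotes: one line feed character
backslash = "\\"   # one backslash character
literal   = '\n'   # single quotes: a backslash followed by the letter n

puts newline.length    # 1
puts backslash.length  # 1
puts literal.length    # 2

p newline.bytes        # [10]
p backslash.bytes      # [92]
p literal.bytes        # [92, 110]
```

The inspect-style output of p shows "\n" and "\\" as two symbols each, which is why they look like two characters in the console.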

The first character there is not "\" - it's "\n" in the first example, and "\\" in the second. "\n" and "\\" are effectively single characters in this context, even though they look like two characters.
"\n" != "\\", and so start_with? responds false.

Related

Regexp.escape adds weird escapes to a plain space

I stumbled over this problem using the following simplified example:
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for every String stored in searchstring, the gsub! would leave line empty afterwards. Indeed, this is the case for many strings, but not for this one:
searchstring = "D "
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
p line
It turns out that line is printed as "D " afterwards, i.e. no replacement was performed.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string, in the following way:
REPLACEMENTS.each do |from, to|
  line.chomp!
  line.gsub!(Regexp.escape(from)) { to }
end
I'm using Regexp.escape just as a safety measure in case the string being replaced contains some regex metacharacter.
I'm using the Cygwin port of MRI Ruby 2.6.4.
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was, that for every String stored in searchstring, the gsub! would cause that line is afterwards empty.
Your understanding is incorrect. The guarantee in the docs is
For any string, Regexp.new(Regexp.escape(str))=~str will be true.
This does hold for your example
Regexp.new(Regexp.escape("D "))=~"D " # => 0
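String#gsub treats a String pattern as literal text rather than compiling it into a regexp, so the escaped string has to be wrapped in Regexp.new. Therefore this is what your code should look like: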
line.gsub!(Regexp.new(Regexp.escape(searchstring))) { '' }
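A quick way to compare the two pattern types is sketched below, assuming the same "D " search string as in the question. The other variable names are illustrative.

```ruby
searchstring = "D "
escaped = Regexp.escape(searchstring)   # "D\\ " : D, backslash, space

# String pattern: gsub looks for the literal text D-backslash-space.
as_string = searchstring.gsub(escaped) { "" }

# Regexp pattern: backslash-space is read as an escaped space.
as_regexp = searchstring.gsub(Regexp.new(escaped)) { "" }

p as_string  # "D "  (nothing matched)
p as_regexp  # ""    (the whole string matched)
```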
As for why this is the case, there used to be a bug where Regexp.escape would incorrectly handle space characters:
# in Ruby 1.8.4
Regexp.escape("D ") # => "D\\s"
My guess is they tried to keep the fix as simple as possible by replacing 's' with ' '. Technically this does add an unnecessary escape character but, again, that does not break the intended use of the method.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
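This looks like a bug to me. In my opinion, whitespace is not a Regexp metacharacter, so there is no need to escape it (although it does become significant in extended /x mode, where unescaped whitespace in the pattern is ignored).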
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string […]
If you want to do literal string replacement, then don't use a Regexp. Just use a literal string:
line.gsub!(from, to)
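Here is a rough sketch of the replacement loop with plain String patterns. The replacement table and the sample line are hypothetical. It keeps the block form from the question, because a String passed as the second argument has sequences such as \1 interpreted as backreferences, while a block's return value is inserted as-is.

```ruby
# Hypothetical replacement table, for illustration only.
replacements = {
  "D "  => "E ",
  "a.b" => "a\\1b"   # "." and "\1" would be special in a regexp / replacement string
}

line = "D a.b axb\n".chomp
replacements.each do |from, to|
  # A String pattern is matched literally, so no Regexp.escape is needed.
  line = line.gsub(from) { to }
end

p line   # "E a\\1b axb"
```

Note that "axb" is left alone, because the "." in "a.b" is not treated as a wildcard.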

Why ruby controller would escape the parameters itself?

I am writing a Ruby application for a back-end service. There is a controller which accepts requests from the front end.
Here is the case: there is a GET request with a parameter containing the character "\n".
def register
begin
request = {
id: params[:key]
}
.........
end
end
The "key" parameter is passed from AngularJS as "----BEGIN----- \n abcd \n ----END---- \n", but in the Ruby controller the parameter actually became "----BEGIN----- \\n abcd \\n ----END---- \\n".
Does anyone have a good solution for this?
Yes, this is because of the ruby way to read the escape character. You can read the explanation right here: Escaping characters in Ruby
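I ran into this issue once, and I just used gsub to change the \\n to \n. (Prefer gsub over gsub! here: gsub! returns nil when nothing was replaced, which would leave id set to nil.) What you should do is:
def register
  begin
    request = {
      id: params[:key].gsub("\\n", "\n")
    }
    .........
  end
end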
Remember, you have to use double quotation " instead of single quotation '. From the link I gave:
The difference between single and double quoted strings in Ruby is the way the string definitions represent escape sequences.
In double quoted strings, you can write escape sequences and Ruby will output their translated meaning. A \n becomes a newline.
In single quoted strings however, escape sequences are escaped and return their literal definition. A \n remains a \n.
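The quoting difference and the conversion can be checked in isolation. This is a minimal sketch with a made-up sample value; it says nothing about how the parameter ended up with a literal backslash in the first place.

```ruby
single = 'a \n b'   # single quotes: backslash + n, like the value seen in the controller
double = "a \n b"   # double quotes: a real line feed

puts single.length  # 6
puts double.length  # 5

# "\\n" is the two-character pattern backslash + n; "\n" is the line feed to insert.
converted = single.gsub("\\n", "\n")
puts converted == double   # true
```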

What is the opposite of Regexp.escape?

What is the opposite of Regexp.escape ?
> Regexp.escape('A & B')
=> "A\\ &\\ B"
> # do something, to get the next result: (something like Regexp.unescape(A\\ &\\ B))
=> "A & B"
How can I get the original value?
replaces = Hash.new { |hash,key| key } # simple trick to return key if there is no value in hash
replaces['t'] = "\t"
replaces['n'] = "\n"
replaces['r'] = "\r"
replaces['f'] = "\f"
replaces['v'] = "\v"
rx = Regexp.escape('A & B')
str = rx.gsub(/\\(.)/){ replaces[$1] }
Also make sure to #puts output in irb, because #inspect escapes characters by default.
Basically, escaping/quoting looks for metacharacters and prepends a \ character (which itself has to be escaped when written in source code). But for the control characters \t, \n, \r, \f and \v, quoting outputs a \ followed by the corresponding letter (t, n, r, f or v) instead of the control character itself.
UPDATE:
My solution had problems with special characters (\n, \t and so on); I updated it after investigating the source code of the rb_reg_quote method.
UPDATE 2:
replaces is a hash that converts escaped characters to unescaped ones (that's why it is used in the block attached to gsub). It is indexed by the character without the escape character (the second character in the sequence) and looks up the unescaped value. The only defined values are control characters, but there is also a default_proc attached (the block passed to Hash.new), which returns the key if no value is found in the hash. So it works like this:
for "n" it returns "\n", the same for all other escaped control characters, because it is value associated with key
for "(" it returns "(", because there is no value associated with "(" key, hash calls #default_proc, which returns key itself
The only characters escaped by Regexp.escape are meta characters and control characters, so we don't have to worry about alphanumerics.
Take a look at http://ruby-doc.org/core-2.0.0/Hash.html#method-i-default_proc for documentation on #default_proc
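The same idea can be wrapped in a small helper and checked with a round trip. This is only a sketch: regexp_unescape is a hypothetical name (Ruby has no built-in Regexp.unescape), and it uses Hash#fetch with a default instead of a default_proc.

```ruby
# Hypothetical helper; not part of Ruby's standard library.
def regexp_unescape(escaped)
  # Control characters that Regexp.escape writes as backslash + letter.
  controls = { "t" => "\t", "n" => "\n", "r" => "\r", "f" => "\f", "v" => "\v" }
  # Any other escaped character maps back to itself.
  escaped.gsub(/\\(.)/) { controls.fetch($1, $1) }
end

original   = "A & B (x*y)\t[z]\n"
round_trip = regexp_unescape(Regexp.escape(original))

puts round_trip == original   # true
```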
You can perhaps use something like this?
def unescape(s)
eval %Q{"#{s}"}
end
puts unescape('A\\ &\\ B')
Credits to this question.
If you are okay with a regex solution, you can use this:
res = s.gsub(/\\(?!\\)|(\\)\\/, "\\1")
Try this
>> r = Regexp.escape("A & B (and * c [ e] + )")
# => "A\\ &\\ B\\ \\(and\\ \\*\\ c\\ \\[\\ e\\]\\ \\+\\ \\)"
>> r.gsub("\\(","(").gsub("\\)",")").gsub("\\[","[").gsub("\\]","]").gsub("\\{","{").gsub("\\}","}").gsub("\\.",".").gsub("\\?","?").gsub("\\+","+").gsub("\\*","*").gsub("\\ "," ")
# => "A & B (and * c [ e] + )"
Basically, these (, ), [, ], {, }, ., ?, +, * are the meta characters in regex. And also \ which is used as an escape character.
The chain of gsub() calls replace the escaped patterns with corresponding actual value.
I am sure there is a way to DRY this up.
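Update: DRY version as suggested by user2503775 (note that this strips every backslash, so it only works when the original string contains no backslashes or control characters)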
>> r.gsub("\\","")
Update:
The following are the special characters in regex:
[,],{,},(,),|,-,*,.,\\,?,+,^,$,<space>,#,\t,\f,\v,\n,\r
A regex replace using \\(?=([\\\*\+\?\|\{\[\(\)\^\$\.\#\ ]))\
should give you the unescaped string; you would then only have to replace the \r\n sequences with their CRLF counterparts.
"There\ is\ a\ \?\ after\ the\ \(white\)\ car\.\ \r\n\ it\ should\ be\ http://car\.com\?\r\n"
is unescaped to :
"There is a ? after the (white) car. \r\n it should be http://car.com?\r\n"
and removing the \r\n gives you :
There is a ? after the (white) car.
it should be http://car.com?

using gsub in ruby strings correctly

I have this expression:
channelName = rhash["Channel"].gsub("'", " ")
It works fine. However, I can only substitute one character with it. I want to add a few more characters to substitute, so I tried the following:
channelName = rhash["Channel"].gsub(/[':;] /, " ")
This did not work: no substitution was done on the strings, and there was no error message. I also tried this:
channelName = rhash["Channel"].gsub!("'", " ")
This led to a string that was blank, which is absolutely not what I desired.
I would like to have a gsub method to substitute the following characters with a space in my string:
' ; :
My questions:
How can I structure my gsub method so that all instances of the above characters are replaced with a space?
What is happening with gsub! above, as it's returning a blank?
Your second attempt was very close. The problem is that you left a space after the closing bracket, meaning it was only looking for one of those symbols followed by a space.
Try this:
channelName = rhash["Channel"].gsub(/[':;]/, " ")
This does not answer your question, but is a better way to do it.
channelName = rhash["Channel"].tr("':;", " ")
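Both suggestions can be compared on a sample value. A minimal sketch follows; the channel name is made up and stands in for rhash["Channel"].

```ruby
channel = "Bob's Show: Live; Now"   # hypothetical channel name

# Character class: any one of ' : ; is replaced by a space.
with_gsub = channel.gsub(/[':;]/, " ")

# tr maps every listed character to the (padded) replacement character.
with_tr = channel.tr("':;", " ")

p with_gsub                 # "Bob s Show  Live  Now"
puts with_gsub == with_tr   # true
```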

gsub! On an argument doesn't work

I am making a function that turns the first argument into a PHP var (useless, I know) and sets it equal to the second argument. I'm trying to gsub! it to get rid of all the characters that can't be used in a PHP var. Here is what I have:
dvar = "$" + name.gsub!(/.?\/!#\#{}$%^&*()`~/, "") { |match| puts match }
I have the puts match there to make sure some of the characters were removed. name is a variable passed into a method in which this is its purpose. I am getting this error:
TypeError: can't convert nil into String
cVar at ./Web.rb:31
(root) at C:\Users\Andrew\Documents\NetBeansProjects\Web\lib\main.rb:13
Web.rb is the file this line is in, and main.rb is the file calling this method. How can I fix this?
EDIT: If I remove the ! in gsub!, it goes through, but the characters aren't removed.
Short answer
Use dvar = "$" + name.tr(".?\/!#\#{}$%^&*()`~", '')
Long answer
The problem you are facing is that the gsub! call is returning nil. You can't concatenate (+) a String with a nil.
That's happening because you have a malformed Regexp. You aren't escaping the special regex symbols, like $, * and ., just for a start. Also, the way it is now, gsub will only match if your string contains all those symbols in sequence. You should use the pipe (|) operator to make an OR-like operation.
gsub! will also return nil if no substitutions happened.
See the documentation for gsub and gsub! here: http://ruby-doc.org/core/classes/String.html#M001186
I think you should replace gsub! with gsub. Do you really need name to change?
Example:
name = "m$var.name$$"
dvar = "$" + name.gsub!(/\$|\.|\*/, "") # $ or . or *
# dvar now contains $mvarname and name is mvarname
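The nil return value is easy to reproduce when nothing matches. A small sketch, assuming an input that contains none of the characters being stripped (the variable names are illustrative):

```ruby
name = "plainname"   # hypothetical input with nothing to strip

bang_result = name.gsub!(/[$.*]/, "")   # nothing replaced, so this is nil
p bang_result                           # nil

error_class = nil
begin
  "$" + name.gsub!(/[$.*]/, "")         # String#+ cannot take nil
rescue TypeError => e
  error_class = e.class
end
p error_class                           # TypeError

safe = "$" + name.gsub(/[$.*]/, "")     # gsub always returns a String
puts safe                               # $plainname
```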
Your line, corrected:
dvar = "$" + name.gsub(/\.|\?|\/|\!|\#|\\|\#|\{|\}|\$|\%|\^|\&|\*|\(|\)|\`|\~/, "")
# some things shouldn't (or aren't needed to) be escaped, I don't remember them all right now
As J-_-L pointed out, you could also use a character class ([]), which makes it a little clearer, I guess. Well, it's hard to mentally parse anyway.
dvar = "$" + name.gsub(/[\.\?\/\!\#\\\#\{\}\$\%\^\&\*\(\)\`\~]/, "")
But because what you are doing is simple character replacement, the best method is tr (again reminded by J-_-L!):
dvar = "$" + name.tr(".?\/!#\#{}$%^&*()`~", '')
Way easier to read and make modifications.
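For plain character removal, String#delete is another option. The sketch below swaps it in for tr; php_var is a hypothetical helper name, and the character list is copied from the question.

```ruby
# Hypothetical helper: strips the characters listed in the question.
def php_var(name)
  # Single quotes keep "#{}" from being treated as string interpolation.
  "$" + name.delete('.?/!#{}$%^&*()`~')
end

puts php_var('m$var.name$$')   # $mvarname
puts php_var('plain')          # $plain
```

Unlike gsub!, delete returns a new String even when nothing was removed, so the concatenation cannot hit nil.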
You cannot pass both a second parameter and a block to gsub (the block is ignored).
The regex is wrong; you forgot the square brackets: /[.?\/!#\#{}$%^&*()`~]/
Because your regex is wrong, it didn't match anything, and because gsub! returns nil if nothing was replaced, you get this strange nil error.
By the way, you should use gsub, not gsub!, in this case, because you are using the return value (and not name itself), and then the error would not have happened.
I don't see what the block is for.
Just do:
name = 'hello.?\/!##$%^&*()`~hello'
dvar = "$" + name.gsub(/\.|\?|\\|\/|\!|\#|\#|\{|\}|\$|\%|\^|\&|\*|\(|\)|\`|\~/, "")
puts dvar # => "$hellohello"
or use [] to denote OR
dvar = "$" + name.gsub(/[\.\?\\\/\!\#\\\#\{\}\$\%\^\&\*\(\)\`\~]/, "")
You have to escape the special characters and then OR them, so that they are removed individually and not only when they are all found together.
Also, there is really no need to use gsub! to modify the string in place; use the non-mutating gsub, since you assign the result to a new variable.
gsub! returns nil when nothing is replaced, and String#+ cannot take nil, which gives you the TypeError mentioned.
It seems the 'name' object may be nil: you may be calling gsub! on nil, which usually complains with a NoMethodError such as "private method gsub! called for nil:NilClass". Since I don't know the version of Ruby you are using, I am not sure the error would be the same, but it's a good place to start looking.
