I have a long String and want to delete the part that comes after a given word. I'm looking for the gsub! command that does that, and I would appreciate it if you could provide it.
For reference:
I know that the command to delete the part of the String (the String is called contents) that comes before the word "body" is:
contents.gsub!(/.*?(?=body)/im, "")
Thanks.
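This code:
"this has a word in it".gsub!(/(word).*/, '\1')
will change the string to "this has a word".
The "word" in parentheses is the first capture group of the regex, and '\1' in the replacement string refers to that group. (Passing $1 as the argument does not work reliably here, because it is evaluated before gsub! runs and so still holds the result of an earlier match, or nil.)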
See the Ruby docs for gsub
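As a minimal sketch of the same idea for an arbitrary word: truncate_after is a hypothetical helper name (not part of Ruby), and the /m flag is an assumption that the long String may contain newlines, so that .* also consumes the following lines.

```ruby
# Hypothetical helper: keep everything up to and including the first
# occurrence of `word`, and drop whatever follows it.
def truncate_after(text, word)
  # Regexp.escape makes `word` safe to interpolate into the pattern;
  # /m lets `.` match newlines so the rest of a multi-line string goes too.
  text.sub(/(#{Regexp.escape(word)}).*/m, '\1')
end

short = truncate_after("this has a word in it", "word")
multi = truncate_after("Stuff before </body> stuff\nafter", "</body>")

puts short
puts multi
```

If the word does not occur at all, sub finds no match and the text comes back unchanged.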
Going by your regex, I'm assuming you mean everything after the closing </body> tag, which requires the / in body to be escaped:
contents = "Stuff before </body> stuff after"
contents.gsub(/(?<=\/body>).+/m, "")
=> "Stuff before </body>"
I stumbled over this problem using the following simplified example:
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for every String stored in searchstring, the gsub! would leave line empty afterwards. Indeed, this is the case for many strings, but not for this one:
searchstring = "D "
line = searchstring.dup
line.gsub!(Regexp.escape(searchstring)) { '' }
p line
It turns out that line is printed as "D " afterwards, i.e. no replacement was performed.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string, in the following way:
REPLACEMENTS.each do |from, to|
  line.chomp!
  line.gsub!(Regexp.escape(from)) { to }
end
I'm using Regexp.escape just as a safety measure in case the string being replaced contains a regex metacharacter.
I'm using the Cygwin port of MRI Ruby 2.6.4.
line.gsub!(Regexp.escape(searchstring)) { '' }
My understanding was that, for every String stored in searchstring, the gsub! would leave line empty afterwards.
Your understanding is incorrect. The guarantee in the docs is
For any string, Regexp.new(Regexp.escape(str))=~str will be true.
This does hold for your example
Regexp.new(Regexp.escape("D "))=~"D " # => 0
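However, gsub! treats a String pattern as literal text rather than as a regular expression, so the escaped string is searched for verbatim and never found. Therefore, this is what your code should look like: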
line.gsub!(Regexp.new(Regexp.escape(searchstring))) { '' }
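To make the difference concrete, here is a small sketch that runs both variants on the "D " example from the question. The variable names literal_attempt and regexp_attempt are only illustrative.

```ruby
searchstring = "D "
escaped = Regexp.escape(searchstring)   # three characters: D, backslash, space

# String pattern: gsub! looks for the escaped text verbatim and finds nothing.
literal_attempt = searchstring.dup
literal_attempt.gsub!(escaped) { '' }

# Regexp pattern: the escaped text is compiled, so it matches "D ".
regexp_attempt = searchstring.dup
regexp_attempt.gsub!(Regexp.new(escaped)) { '' }

p literal_attempt   # => "D "
p regexp_attempt    # => ""
```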
As for why this is the case, there used to be a bug where Regexp.escape would incorrectly handle space characters:
# in Ruby 1.8.4
Regexp.escape("D ") # => "D\\s"
My guess is they tried to keep the fix as simple as possible by replacing 's' with ' '. Technically this does add an unnecessary escape character but, again, that does not break the intended use of the method.
This happens to any searchstring containing a space. Indeed, if I do a
p(Regexp.escape(searchstring))
for my example, I see "D\\ " being printed, while I would expect to get "D " instead. Is this a bug in the Ruby core library, or did I misuse the escape function?
This looks to be a bug. In my opinion, whitespace is not a Regexp metacharacter, so there is no need to escape it.
Some background: In my concrete application, where this simplified example is derived from, I just want to do a literal string replacement inside a long string […]
If you want to do literal string replacement, then don't use a Regexp. Just use a literal string:
line.gsub!(from, to)
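A minimal sketch of the literal approach applied to a whole table of replacements. The replacements hash and the sample line are made-up data, and the block form is used here (an assumption about what is wanted) so that backslashes in the replacement text would not be interpreted either.

```ruby
# Hypothetical replacement table; both keys would need escaping as regexes.
replacements = { "a.b" => "X", "D " => "Y" }

line = "a.b|D |axb\n"
line.chomp!

replacements.each do |from, to|
  # A String pattern is matched literally, so "." and " " need no escaping.
  line.gsub!(from) { to }
end

puts line   # => X|Y|axb
```

Note that "axb" is left alone, which shows that the dot in "a.b" was not treated as a wildcard.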
I have a string like the one below and am trying to remove its last character. Can someone please help with this?
What if I have a lengthy string and only want to remove its last character?
Example: "city": "Winston Salem","state": "NC","zip": "27127","country": " "}}
and I want to remove only the last '}'.
Use a String method like replace:
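String newString = oldString.replace("}}", "}");
This works in that case and in similar ones. Note that replace substitutes every occurrence of "}}", not only the one at the end, and that the String API methods are available only if the value is a String (or has been cast to one).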
I'm trying to export email from Outlook (2010) into a CSV file, but there are emails with a comma in the subject line.
Is there a way to deal with this? I can't find an option to change the delimiter to something else.
Thanks
You can use the VBA Replace function.
E.g., this takes myString and replaces the comma with a dash:
Dim myString As String
myString = "Test subject, with comma"
myString = Replace(myString, ",", "-")
myString becomes "Test subject- with comma"
If the value contains a comma, enclose the value in double quotes. If the value itself contains a double quote, replace it with two double quotes. E.g., a value like
Weird, encoded "subject"
becomes
"Weird, encoded ""subject"""
I need to parse a string in Ruby which contains variables made of ids and names, like this: {2,Shahar}.
The string is like this:
text = "Hello {1,Micheal}, my name is {2,Shahar}, nice to meet you!"
When I try to parse it, the regexp skips the first } and I get something like this:
text.gsub(/\{(.*),(.*)\}/, "\\2(\\1)")
=> "Hello Shahar(1,Micheal}, my name is {2), nice to meet you!"
while the required result should be:
=> "Hello Micheal(1), my name is Shahar(2), nice to meet you!"
I would be thankful to anyone who can help.
Thanks
Shahar
The greedy .* matches too much. It means "any string, maximum possible length". So the first (.*) matches 1,Micheal}, my name is {2, then the comma matches the comma, and the second (.*) matches Shahar (and the final \} matches the closing brace).
It's better to be more specific. For example, you could restrict the match to allow only characters except braces to ensure that a match will never extend beyond the scope of a {...} section:
text.gsub(/\{([^{}]*),([^{}]*)\}/, "\\2(\\1)")
Or you could do this:
text.gsub(/\{([^,]*),([^}]*)\}/, "\\2(\\1)")
where the first part may be any string that doesn't contain a comma, the second part may be any string that doesn't contain a }.
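The following sketch puts the greedy pattern from the question next to the negated character class version, so the difference is visible on the same input. The lazy variant with .*? is an additional alternative not mentioned above; it is included only for comparison.

```ruby
text = "Hello {1,Micheal}, my name is {2,Shahar}, nice to meet you!"

# Greedy: the first (.*) runs on past the first closing brace.
greedy = text.gsub(/\{(.*),(.*)\}/, '\2(\1)')

# Negated classes: neither group can cross a brace, so each {...} is
# handled on its own.
bounded = text.gsub(/\{([^{}]*),([^{}]*)\}/, '\2(\1)')

# Lazy quantifiers (alternative): each group takes as little as possible.
lazy = text.gsub(/\{(.*?),(.*?)\}/, '\2(\1)')

puts greedy    # Hello Shahar(1,Micheal}, my name is {2), nice to meet you!
puts bounded   # Hello Micheal(1), my name is Shahar(2), nice to meet you!
puts lazy      # Hello Micheal(1), my name is Shahar(2), nice to meet you!
```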
I'm trying to do a simple string sub in Ruby.
The second argument to sub() is a long piece of minified JavaScript which has regular expressions contained in it. Back references in the regex in this string seem to be affecting the result of sub, because the replaced string (i.e., the first argument) is appearing in the output string.
Example:
input = "string <!--tooreplace--> is here"
output = input.sub("<!--tooreplace-->", "\&")
I want the output to be:
"string \& is here"
Not:
"string & is here"
or if escaping the regex
"string <!--tooreplace--> is here"
Basically, I want some way of doing a string sub that has no regex consequences at all - just a simple string replace.
To avoid having to figure out how to escape the replacement string, use Regexp.escape. It's handy when replacements are complicated, or dealing with it is an unnecessary pain. A little helper on String is nice too.
input.sub("<!--tooreplace-->", Regexp.escape('\&'))
You can also use block notation to make it simpler (as opposed to Regexp.escape):
=> puts input.sub("<!--tooreplace-->") {'\&'}
string \& is here
Use single quotes and escape the backslash:
output = input.sub("<!--tooreplace-->", '\\\&') #=> "string \\& is here"
Well, since '\\&' (that is, \ followed by &) is interpreted in a replacement string as a special backreference (the whole match), it stands to reason that you need to escape the backslash. In fact, this works:
>> puts 'abc'.sub 'b', '\\\\&'
a\&c
Note that \\\\& represents the literal string \\&.
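As a side-by-side sketch of the approaches in these answers, assuming the replacement text arrives in a variable (here called replacement, a made-up name) rather than as a literal you can edit by hand:

```ruby
input = "string <!--tooreplace--> is here"
replacement = '\&'   # two characters: backslash, ampersand

# Plain string argument: \& is expanded to the matched text.
naive = input.sub("<!--tooreplace-->", replacement)

# Block form: the block's return value is inserted as is.
block = input.sub("<!--tooreplace-->") { replacement }

# Doubling every backslash first also protects the replacement string.
escaped = input.sub("<!--tooreplace-->", replacement.gsub('\\') { '\\\\' })

puts naive     # string <!--tooreplace--> is here
puts block     # string \& is here
puts escaped   # string \& is here
```

The block form is usually the least error-prone of the three, since it needs no escaping at all.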