How can I remove the string "\n" from within a Ruby string? - ruby

I have this string:
"some text\nandsomemore"
I need to remove the "\n" from it. I've tried
"some text\nandsomemore".gsub('\n','')
but it doesn't work. How do I do it? Thanks for reading.

You need to use "\n" not '\n' in your gsub. The different quote marks behave differently.
Double quotes " allow character expansion and expression interpolation, i.e. they let you use escaped control characters like \n to represent their true value (in this case, a newline) and allow the use of #{expression}, so you can weave variables and pretty much any Ruby expression you like into the text.
While on the other hand, single quotes ' treat the string literally, so there's no expansion, replacement, interpolation or what have you.
In this particular case, it's better to use either the .delete or .tr String method to delete the newlines.
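A minimal sketch of the difference, with the three removal approaches side by side. The helper method names are made up for illustration and are not part of Ruby:

```ruby
# Hypothetical helper names, used only for illustration.
def strip_newlines_gsub(text)
  text.gsub("\n", "")   # double quotes: "\n" is one newline character
end

def strip_newlines_delete(text)
  text.delete("\n")     # removes every newline, no replacement argument needed
end

def strip_newlines_tr(text)
  text.tr("\n", "")     # an empty second argument makes tr delete the characters
end

sample = "some text\nandsomemore"

puts '\n'.length                       # 2: a backslash followed by the letter n
puts "\n".length                       # 1: a single newline character
puts sample.gsub('\n', '') == sample   # true: nothing matched, so nothing changed
puts strip_newlines_gsub(sample)       # some textandsomemore
puts strip_newlines_delete(sample)     # some textandsomemore
puts strip_newlines_tr(sample)         # some textandsomemore
```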

If you want or don't mind having all the leading and trailing whitespace from your string removed you can use the strip method.
" hello ".strip #=> "hello"
"\tgoodbye\r\n".strip #=> "goodbye"
edit The original title for this question was different. My answer is for the original question.

When you want to remove a string, rather than replace it you can use String#delete (or its mutator equivalent String#delete!), e.g.:
x = "foo\nfoo"
x.delete!("\n")
x now equals "foofoo"
In this specific case String#delete is more readable than gsub since you are not actually replacing the string with anything.
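A small sketch of how the two variants behave, assuming you want to see the difference between the copying and the mutating form. The helper name is hypothetical:

```ruby
# Hypothetical helper, for illustration only.
def without_newlines(text)
  text.delete("\n")            # non-mutating: returns a new string
end

original = "foo\nfoo".dup      # dup so the example also works with frozen string literals
copy = without_newlines(original)
puts copy                      # foofoo
puts original.include?("\n")   # true: the original is untouched

original.delete!("\n")         # mutating: changes the receiver in place
puts original                  # foofoo

p "foofoo".dup.delete!("\n")   # nil: delete! returns nil when nothing was removed
```

Because delete! returns nil when there was nothing to delete, it is safer not to chain further calls onto its return value.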

You don't need a regex for this. Use tr:
"some text\nandsomemore".tr("\n","")

Use the chomp or strip methods from Ruby (both only affect the ends of the string):
"abcd\n".chomp #=> "abcd"
"abcd\n".strip #=> "abcd"

Related

delete matched characters using regex in ruby

I need to write a regex for the following text:
"How can you restate your point (something like: \"<font>First</font>\") as a clear topic?"
that keeps whatever is between the
\" \"
characters (in this case <font>First</font>).
I came up with this:
/"How can you restate your point \(something like: |\) as a clear topic\?"/
but how do I get ruby to remove the unwanted surrounding text and only return <font>First</font>?
Use a lookbehind and a lookahead, and make the greedy quantifier lazy:
str[/(?<=\").+?(?=\")/] #=> "<font>First</font>"
If you have strings just like that, you can .split and get the first:
> str.split(/"/)[1]
=> "<font>First</font>"
You certainly can use a regular expression, but you don't need to:
str = "How can you restate (like: \"<font>First</font>\") as a clear topic?"
str[str.index('"')+1...str.rindex('"')]
#=> "<font>First</font>"
or, for those like me who never use three dots:
str[str.index('"')+1..str.rindex('"')-1]
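The three approaches can be compared in one sketch. The helper names are hypothetical, and the nil guard in the last one is an addition that the answers above do not include:

```ruby
# Hypothetical helper names, for illustration only.
def between_quotes_regex(text)
  text[/(?<=").+?(?=")/]        # lazy match between two double quotes
end

def between_quotes_split(text)
  text.split('"')[1]            # the second piece lies between the first two quotes
end

def between_quotes_index(text)
  first = text.index('"')
  last  = text.rindex('"')
  return nil if first.nil? || first == last   # fewer than two quotes
  text[(first + 1)...last]      # three dots: the closing quote is excluded
end

str = "How can you restate your point (something like: \"<font>First</font>\") as a clear topic?"
puts between_quotes_regex(str)   # <font>First</font>
puts between_quotes_split(str)   # <font>First</font>
puts between_quotes_index(str)   # <font>First</font>
```

The variants only agree when the string contains exactly one quoted section. With several, the regex and split versions return the first one, while the index/rindex version spans from the first quote to the last.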

Replace symbols in a string to tabs

How do I parse a string and change all the "a" letters to a tab symbol?
Is there a way to use a gsub for that?
something like
'blah'.gsub('a', '\t')
Use a double-quoted (") string, at least for the replacement that holds the tab:
'blah'.gsub('a', "\t")
#=> "bl\th"
Have a look at Ruby Programming/Strings for a very concise yet comprehensive overview of the differences between single and double quoted strings.
You can also use String#tr:
'matador'.tr('a', "\t")
#=> "m\tt\tdor"
You could write ?\t in place of "\t".
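A quick sketch of why the quoting matters here. The helper name is made up for illustration:

```ruby
# Hypothetical helper, for illustration only.
def a_to_tab(text)
  text.gsub("a", "\t")         # "\t" in double quotes is a real tab character
end

puts '\t'.length               # 2: a backslash followed by the letter t
puts "\t".length               # 1: a single tab character
p a_to_tab("blah")             # "bl\th"
p "matador".tr("a", "\t")      # "m\tt\tdor"
puts ?\t == "\t"               # true: ?\t is a one-character string literal
```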

How to replace single quotes with escaped single quotes in ruby

I'm trying to replace single quotes (') with escaped single quotes (\') in a string in ruby 1.9.3 and 1.8.7.
The exact problem string is "Are you sure you want to delete '%#'". This string should become "Are you sure you want to delete \'%#\'"
Using .gsub!(/\'/,"\\'") leads to the following string "Are you sure you want to delete %#'%#".
Any ideas on what's going on?
String#gsub in the form gsub(exp,replacement) has odd quirks affecting the replacement string which sometimes require lots of escaping slashes. Ruby users are frequently directed to use the block form instead:
str.gsub(/'/){ "\\'" }
If you want to do away with escaping altogether, consider using an alternate string literal form:
str.gsub(/'/){ %q(\') }
Once you get used to seeing these types of literals, using them to avoid escape sequences can make your code much more readable.
\' in a substitution replacement string means "The portion of the original string after the match". So str.gsub!(/\'/,"\\'") replaces the ' character with everything after it - which is what you've noticed.
You need to further escape the backslash in the replacement. .gsub(/'/,"\\\\'") works in my irb console:
irb(main):059:0> puts a.gsub(/'/,"\\\\'")
Are you sure you want to delete \'%#\'
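The behaviour described in these answers can be reproduced with a short sketch. The helper names are hypothetical, and a shorter sample string is used to keep the output easy to follow:

```ruby
# Hypothetical helper names, for illustration only.
def escape_quotes_block(text)
  text.gsub("'") { "\\'" }     # the block result is inserted verbatim
end

def escape_quotes_string(text)
  text.gsub("'", "\\\\'")      # Ruby turns this into \\', which gsub reduces to \'
end

msg = "delete 'x'"
puts escape_quotes_block(msg)    # delete \'x\'
puts escape_quotes_string(msg)   # delete \'x\'

# With only two backslashes, gsub receives \' and treats it as
# "the text after the match", so each quote is replaced by what follows it.
puts msg.gsub("'", "\\'")        # delete x'x
```

The block form is usually the easier one to read, since the replacement is not scanned for backslash sequences at all.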
You need to escape the backslash. What about this?
"Are you sure you want to delete '%#'".gsub(/(?=')/, "\\")
# => "Are you sure you want to delete \\'%#\\'"
The above should be what you want. The result you expect cannot appear in that form: when a string is inspected (as irb does), a single literal backslash is always displayed as \\.

In Ruby, what's the easiest way to "chomp" at the start of a string instead of the end?

In Ruby, I sometimes need to remove the newline character at the beginning of a string. Currently I do the following, but I want to know the best way to do this. Thanks.
s = "\naaaa\nbbbb"
s.sub!(/^\n?/, "")
lstrip seems to be what you want (assuming trailing white space should be kept):
>> s = "\naaaa\nbbbb" #=> "\naaaa\nbbbb"
>> s.lstrip #=> "aaaa\nbbbb"
From the docs:
Returns a copy of str with leading whitespace removed. See also
String#rstrip and String#strip.
http://ruby-doc.org/core-1.9.3/String.html#method-i-lstrip
strip will remove all leading and trailing whitespace:
s = "\naaaa\nbbbb"
s.strip!
Little hack to chomp leading whitespace:
str = "\nmy string"
chomped_str = str.reverse.chomp.reverse
To be perfectly accurate, chomp can not only delete a line break from the end of a string, but can also delete an arbitrary trailing string passed as an argument.
If the latter functionality is sought, one can use:
"\naaaa\nbbbb".delete_prefix("\n") #=> "aaaa\nbbbb"
As opposed to strip this works for arbitrary characters exactly like chomp.
So, just for a bit of clarification, there are three ways that you can go about this: sub, reverse.chomp.reverse and lstrip.
I'd recommend against sub because it's a bit less readable, and because it needs a regular expression for something that's fairly simple. Note also that sub returns a new string, while sub! modifies the receiver in place.
So then you're down to reverse.chomp.reverse and lstrip. Most likely, you want lstrip because it's a bit faster, but keep in mind that the strip operations are not the same as the chomp operations. strip will remove all leading newlines and whitespace:
"\n aaa\nbbb".reverse.chomp.reverse # => " aaa\nbbb"
"\n aaa\nbbb".lstrip # => "aaa\nbbb"
If you want to make sure you only remove one character and that it's definitely a newline, use the reverse.chomp.reverse solution. If you consider all leading newlines and whitespace garbage, go with lstrip.
The one case I can think of for using regular expressions would be if you have an unknown number of \rs and \ns at the beginning and want to trim them all but avoid touching any whitespace. You could use a loop and more String methods for trimming, but it would just be uglier. The performance implications don't really matter that much.
s.sub(/^[\n\r]*/, '')
This removes leading newlines (carriage returns and line feeds, as in chomp), not any whitespace.
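To make the differences concrete, here is a sketch that runs the main options on the same input. The helper names are hypothetical, delete_prefix needs Ruby 2.5 or later, and the regex variant uses \A (start of string) and + instead of ^ and *, which is a small substitution on my part:

```ruby
# Hypothetical helper names, for illustration only.
def drop_one_leading_newline(text)
  text.delete_prefix("\n")       # removes at most one exact "\n" prefix
end

def drop_leading_line_breaks(text)
  text.sub(/\A[\r\n]+/, "")      # \A anchors at the very start of the string
end

s = "\n\n aaa\nbbb"
p drop_one_leading_newline(s)    # "\n aaa\nbbb"
p drop_leading_line_breaks(s)    # " aaa\nbbb"
p s.lstrip                       # "aaa\nbbb"
p s.reverse.chomp.reverse        # "\n aaa\nbbb"
```

Only lstrip also eats the leading space; the other three leave ordinary whitespace alone.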
Not sure if it's the best way but you could try:
s.reverse.chomp.reverse
if you want to leave the trailing newline (if it exists).
This should work for you: s.strip.
A way to do this for whitespace or non-whitespace characters is like this:
s = "\naaaa\nbbbb"
s.slice!("\n") # returns "\n" but s also has the first newline removed.
puts s # shows s has the first newline removed
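One caveat worth checking before relying on slice!: it removes the first occurrence of the argument wherever it sits, not only at the start. A small sketch, with a hypothetical helper that works on a copy:

```ruby
# Hypothetical helper, for illustration only.
def remove_first_newline(text)
  copy = text.dup
  copy.slice!("\n")    # removes the first "\n" found; returns nil if there is none
  copy
end

p remove_first_newline("\naaaa\nbbbb")     # "aaaa\nbbbb"
p remove_first_newline("aaaa\nbbbb")       # "aaaabbbb": the newline was not leading
p "aaaa\nbbbb".delete_prefix("\n")         # "aaaa\nbbbb": only a real prefix is removed
```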

Backslash + captured group within Ruby regular expression

How do I escape a backslash before a captured group?
Example:
"foo+bar".gsub(/(\+)/, '\\\1')
What I expect (and want):
foo\+bar
what I unfortunately get:
foo\\1bar
How do I escape here correctly?
As others have said, you need to escape everything in that string twice. So in your case the solution is to use '\\\\\1' or '\\\\\\1'. But since you asked why, I'll try to explain that part.
The reason is that the replacement sequence is parsed twice: once by Ruby and once by the underlying regular expression engine, for which \1 is its own escape sequence. (It's probably easier to understand with double-quoted strings, since single quotes introduce an ambiguity where '\\1' and '\1' are equivalent but '\' and '\\' are not.)
So for example, a simple replacement here with a captured group and a double quoted string would be:
"foo+bar".gsub(/(\+)/, "\\1") #=> "foo+bar"
This passes the string \1 to the regexp engine, which it understands as a reference to a capture group. In Ruby string literals, "\1" means something else entirely (ASCII character 1).
What we actually want in this case is for the regexp engine to receive \\\1. It also understands \ as an escape character, so \\1 is not sufficient and will simply evaluate to the literal output \1. So, we need \\\1 in the regexp engine, but to get to that point we need to also make it past Ruby's string literal parser.
To do that, we take our desired regexp input and double every backslash again to get through Ruby's string literal parser. \\\1 therefore requires "\\\\\\1". In the case of single quotes one slash can be omitted as \1 is not a valid escape sequence in single quotes and is treated literally.
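Counting the backslashes is easier with the string lengths in front of you. The sketch below follows the explanation above; the helper name is hypothetical, and the block form at the end is an alternative that avoids the second round of parsing:

```ruby
# Hypothetical helper, for illustration only.
def escape_plus(text)
  # Ruby's parser turns the six backslashes into three; the regexp engine
  # then reads \\ as a literal backslash and \1 as capture group 1.
  text.gsub(/(\+)/, "\\\\\\1")
end

puts "\\\\\\1".length                        # 4: three backslashes and the digit 1
puts '\\\\\1' == "\\\\\\1"                   # true: in single quotes \1 stays backslash + 1
puts escape_plus("foo+bar")                  # foo\+bar
puts "foo+bar".gsub(/(\+)/, "\\\\1")         # foo\1bar: the engine sees a literal backslash, then 1
puts "foo+bar".gsub(/(\+)/) { "\\#{$1}" }    # foo\+bar: block results are inserted verbatim
```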
Addendum
One of the reasons this problem is usually hidden is thanks to the use of /.+/ style regexp quotes, which Ruby treats in a special way to avoid the need to double escape everything. (Of course, this doesn't apply to gsub replacement strings.) But you can still see it in action if you use a string literal instead of a regexp literal in Regexp.new:
Regexp.new("\.").match("a") #=> #<MatchData "a">
Regexp.new("\\.").match("a") #=> nil
As you can see, we had to double-escape the . for it to be understood as a literal . by the regexp engine, since "." and "\." both evaluate to . in double-quoted strings, but we need the engine itself to receive \..
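The same point as a runnable sketch. The helper name is hypothetical; Regexp.escape is a standard method that is not mentioned in the answer, shown here only as one way to avoid counting backslashes by hand:

```ruby
# Hypothetical helper, for illustration only.
def literal_dot_regexp
  Regexp.new("\\.")    # the engine receives backslash + dot, i.e. a literal dot
end

puts "\\.".length                      # 2: a backslash and a dot
puts "\." == "."                       # true: the single backslash is dropped by Ruby
p literal_dot_regexp.match?("a")       # false
p literal_dot_regexp.match?("a.b")     # true
puts Regexp.escape(".") == "\\."       # true
```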
This happens because the string is escaped twice. You should use 5 backslashes in this case.
"foo+bar".gsub(/([+])/, '\\\\\1')
Adding \ two more times escapes this properly.
irb(main):011:0> puts "foo+bar".gsub(/(\+)/, '\\\\\1')
foo\+bar
=> nil
