How to strip leading and trailing quote from string, in Ruby - ruby

I want to strip leading and trailing quotes, in Ruby, from a string. The quote character will occur 0 or 1 time. For example, all of the following should be converted to foo,bar:
"foo,bar"
"foo,bar
foo,bar"
foo,bar

You could also use the chomp function, but it unfortunately only works in the end of the string, assuming there was a reverse chomp, you could:
'"foo,bar"'.rchomp('"').chomp('"')
Implementing rchomp is straightforward:
class String
def rchomp(sep = $/)
self.start_with?(sep) ? self[sep.size..-1] : self
end
end
Note that you could also do it inline, with the slightly less efficient version:
'"foo,bar"'.chomp('"').reverse.chomp('"').reverse
EDIT: Since Ruby 2.5, rchomp(x) is available under the name delete_prefix, and chomp(x) is available as delete_suffix, meaning that you can use
'"foo,bar"'.delete_prefix('"').delete_suffix('"')

I can use gsub to search for the leading or trailing quote and replace it with an empty string:
s = "\"foo,bar\""
s.gsub!(/^\"|\"?$/, '')
As suggested by comments below, a better solution is:
s.gsub!(/\A"|"\Z/, '')

As usual everyone grabs regex from the toolbox first. :-)
As an alternate I'll recommend looking into .tr('"', '') (AKA "translate") which, in this use, is really stripping the quotes.

Another approach would be
remove_quotations('"foo,bar"')
def remove_quotations(str)
if str.start_with?('"')
str = str.slice(1..-1)
end
if str.end_with?('"')
str = str.slice(0..-2)
end
end
It is without RegExps and start_with?/end_with? are nicely readable.

It frustrates me that strip only works on whitespace. I need to strip all kinds of characters! Here's a String extension that will fix that:
class String
def trim sep=/\s/
sep_source = sep.is_a?(Regexp) ? sep.source : Regexp.escape(sep)
pattern = Regexp.new("\\A(#{sep_source})*(.*?)(#{sep_source})*\\z")
self[pattern, 2]
end
end
Output
'"foo,bar"'.trim '"' # => "foo,bar"
'"foo,bar'.trim '"' # => "foo,bar"
'foo,bar"'.trim '"' # => "foo,bar"
'foo,bar'.trim '"' # => "foo,bar"
' foo,bar'.trim # => "foo,bar"
'afoo,bare'.trim /[aeiou]/ # => "foo,bar"

Assuming that quotes can only appear at the beginning or end, you could just remove all quotes, without any custom method:
'"foo,bar"'.delete('"')

I wanted the same but for slashes in url path, which can be /test/test/test/ (so that it has the stripping characters in the middle) and eventually came up with something like this to avoid regexps:
'/test/test/test/'.split('/').reject(|i| i.empty?).join('/')
Which in this case translates obviously to:
'"foo,bar"'.split('"').select{|i| i != ""}.join('"')
or
'"foo,bar"'.split('"').reject{|i| i.empty?}.join('"')

Regexs can be pretty heavy and lead to some funky errors. If you are not dealing with massive strings and the data is pretty uniform you can use a simpler approach.
If you know the strings have starting and leading quotes you can splice the entire string:
string = "'This has quotes!'"
trimmed = string[1..-2]
puts trimmed # "This has quotes!"
This can also be turned into a simple function:
# In this case, 34 is \" and 39 is ', you can add other codes etc.
def trim_chars(string, char_codes=[34, 39])
if char_codes.include?(string[0]) && char_codes.include?(string[-1])
string[1..-2]
else
string
end
end

You can strip non-optional quotes with scan:
'"foo"bar"'.scan(/"(.*)"/)[0][0]
# => "foo\"bar"

Related

How to remove strings that end with a particular character in Ruby

Based on "How to Delete Strings that Start with Certain Characters in Ruby", I know that the way to remove a string that starts with the character "#" is:
email = email.gsub( /(?:\s|^)#.*/ , "") #removes strings that start with "#"
I want to also remove strings that end in ".". Inspired by "Difference between \A \z and ^ $ in Ruby regular expressions" I came up with:
email = email.gsub( /(?:\s|$).*\./ , "")
Basically I used gsub to remove the dollar sign for the carrot and reversed the order of the part after the closing parentheses (making sure to escape the period). However, it is not doing the trick.
An example I'd like to match and remove is:
"a8&23q2aas."
You were so close.
email = email.gsub( /.*\.\s*$/ , "")
The difference lies in the fact that you didn't consider the relationship between string of reference and the regex tokens that describe the condition you wish to trigger. Here, you are trying to find a period (\.) which is followed only by whitespace (\s) or the end of the line ($). I would read the regex above as "Any characters of any length followed by a period, followed by any amount of whitespace, followed by the end of the line."
As commenters pointed out, though, there's a simpler way: String#end_with?.
I'd use:
words = %w[#a day in the life.]
# => ["#a", "day", "in", "the", "life."]
words.reject { |w| w.start_with?('#') || w.end_with?('.') }
# => ["day", "in", "the"]
Using a regex is overkill for this if you're only concerned with the starting or ending character, and, in fact, regular expressions will slow your code in comparison with using the built-in methods.
I would really like to stick to using gsub....
gsub is the wrong way to remove an element from an array. It could be used to turn the string into an empty string, but that won't remove that element from the array.
def replace_suffix(str,suffix)
str.end_with?(suffix)? str[0, str.length - suffix.length] : str
end

How do I remove backslashes from a JSON string?

I have a JSON string that looks as below
'{\"test\":{\"test1\":{\"test1\":[{\"test2\":\"1\",\"test3\": \"foo\",\"test4\":\"bar\",\"test5\":\"test7\"}]}}}'
I need to change it to the one below using Ruby or Rails:
'{"test":{"test1":{"test1":[{"test2":"1","test3": "foo","test4":"bar","test5":"bar2"}]}}}'
I need to know how to remove those slashes.
To avoid generating JSON with backslashes in console, use puts:
> hash = {test: 'test'}
=> {:test=>"test"}
> hash.to_json
=> "{\"test\":\"test\"}"
> puts hash.to_json
{"test":"test"}
You could also use JSON.pretty_generate, with puts of course.
> puts JSON.pretty_generate(hash)
{
"test": "test"
}
Use Ruby's String#delete! method. For example:
str = '{\"test\":{\"test1\":{\"test1\":[{\"test2\":\"1\",\"test3\": \"foo\",\"test4\":\"bar\",\"test5\":\"test7\"}]}}}'
str.delete! '\\'
puts str
#=> {"test":{"test1":{"test1":[{"test2":"1","test3": "foo","test4":"bar","test5":"test7"}]}}}
Replace all backslashes with empty string using gsub:
json.gsub('\\', '')
Note that the default output in a REPL uses inspect, which will double-quote the string and still include backslashes to escape the double-quotes. Use puts to see the string’s exact contents:
{"test":{"test1":{"test1":[{"test2":"1","test3": "foo","test4":"bar","test5":"test7"}]}}}
Further, note that this will remove all backslashes, and makes no regard for their context. You may wish to only remove backslashes preceding a double-quote, in which case you can do:
json.gsub('\"', '')
I needed to print a pretty JSON array to a file. I created an array of JSON objects then needed to print to a file for the DBA to manipulate.
This is how i did it.
puts(((dirtyData.gsub(/\\"/, "\"")).gsub(/\["/, "\[")).gsub(/"\]/, "\]"))
It's a triple nested gsub that removes the \" first then removes the [" lastly removes the "]
I hope this helps
I tried the earlier approaches and they did not seem to fix my issue. My json still had backslashes. However, I found a fix to solve this issue.
myuglycode.gsub!(/\"/, '\'')
Use JSON.parse()
JSON.parse("{\"test\":{\"test1\":{\"test1\":[{\"test2\":\"1\",\"test3\": \"foo\",\"test4\":\"bar\",\"test5\":\"test7\"}]}}}")
> {"test"=>{"test1"=>{"test1"=>[{"test2"=>"1", "test3"=>"foo", "test4"=>"bar", "test5"=>"test7"}]}}}
Note that the "json-string" must me double-quoted for this to work, otherwise \" would be read as is instead of being "translated" to " which return a JSON::ParserError Exception.

String#delete ignore special characters

String#delete interprets a-z as character range. However, I would like it to delete fa-zo.
"fojwfa-zowj".delete("fa-zo") #=> "-"
Desired result:
"fojwwj"
You could also use this little trick:
string = "fojwfa-zowj"
string[/fa-zo/] = ''
string
# => "fojwwj"
Notice however, that this modifies the string in place like #gsub!, which should be faster and should use less memory, but which could introduce side-effects if not considered well.
"fojwfa-zowj".gsub("fa-zo","") # => "fojwwj"
"fojwfa-zowj".tap{ |s| s.slice! "fa-zo" } # just for the Heaven of it

Replace single quote with backslash single quote

I have a very large string that needs to escape all the single quotes in it, so I can feed it to JavaScript without upsetting it.
I have no control over the external string, so I can't change the source data.
Example:
Cote d'Ivoir -> Cote d\'Ivoir
(the actual string is very long and contains many single quotes)
I'm trying to this by using gsub on the string, but can't get this to work:
a = "Cote d'Ivoir"
a.gsub("'", "\\\'")
but this gives me:
=> "Cote dIvoirIvoir"
I also tried:
a.gsub("'", 92.chr + 39.chr)
but got the same result; I know it's something to do with regular expressions, but I never get those.
The %q delimiters come in handy here:
# %q(a string) is equivalent to a single-quoted string
puts "Cote d'Ivoir".gsub("'", %q(\\\')) #=> Cote d\'Ivoir
The problem is that \' in a gsub replacement means "part of the string after the match".
You're probably best to use either the block syntax:
a = "Cote d'Ivoir"
a.gsub(/'/) {|s| "\\'"}
# => "Cote d\\'Ivoir"
or the Hash syntax:
a.gsub(/'/, {"'" => "\\'"})
There's also the hacky workaround:
a.gsub(/'/, '\#').gsub(/#/, "'")
# prepare a text file containing [ abcd\'efg ]
require "pathname"
backslashed_text = Pathname("/path/to/the/text/file.txt").readlines.first.strip
# puts backslashed_text => abcd\'efg
unslashed_text = "abcd'efg"
unslashed_text.gsub("'", Regexp.escape(%q|\'|)) == backslashed_text # true
# puts unslashed_text.gsub("'", Regexp.escape(%q|\'|)) => abcd\'efg

Ruby remove everything except some characters?

How can I remove from a string all characters except white spaces, numbers, and some others?
Something like this:
oneLine.gsub(/[^ULDR0-9\<\>\s]/i,'')
I need only: 0-9 l d u r < > <space>
Also, is there a good document about the use of regex in Ruby, like a list of special characters with examples?
The regex you have is already working correctly. However, you do need to assign the result back to the string you're operating on. Otherwise, you're not changing the string (.gsub() does not modify the string in-place).
You can improve the regex a bit by adding a '+' quantifier (so consecutive characters can be replaced in one go). Also, you don't need to escape angle brackets:
oneLine = oneLine.gsub(/[^ULDR0-9<>\s]+/i, '')
A good resource with special consideration of Ruby regexes is the Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan. A good online tutorial by the same author is here.
Good old String#delete does this without a regular expression. The ^ means 'NOT'.
str = "12eldabc8urp pp"
p str.delete('^0-9ldur<> ') #=> "12ld8ur "
Just for completeness: you don't need a regular expression for this particular task, this can be done using simple string manipulation:
irb(main):005:0> "asdasd123".tr('^ULDRuldr0-9<>\t\r\n ', '')
=> "dd123"
There's also the tr! method if you want to replace the old value:
irb(main):009:0> oneLine = 'UasdL asd 123'
irb(main):010:0> oneLine.tr!('^ULDRuldr0-9<>\t\r\n ', '')
irb(main):011:0> oneLine
=> "UdL d 123"
This should be a bit faster as well (but performance shouldn't be a big concern in Ruby :)

Resources