How to print an escape character in Ruby? - ruby

I have a string containing an escape character:
word = "x\nz"
and I would like to print it as x\nz.
However, puts word gives me:
x
z
How do I get puts word to output x\nz instead of creating a new line?

Use String#inspect
puts word.inspect #=> "x\nz"
Or just p
p word #=> "x\nz"

I have a string containing an escape character:
No, you don't. You have a string containing a newline.
How do I get puts word to output x\nz instead of creating a new line?
The easiest way would be to just create the string in the format you want in the first place:
word = 'x\nz'
# or
word = "x\\nz"
If that isn't possible, you can translate the string the way you want:
word = word.gsub("\n", '\n')
# or
word.gsub!("\n", '\n')
You may be tempted to do something like
puts word.inspect
# or
p word
Don't do that! #inspect is not guaranteed to have any particular format. Its only requirement is that it return a human-readable string representation suitable for debugging. Never rely on the exact content of #inspect; the only thing you can rely on is that it is human-readable.
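If the string can contain control characters other than the newline, the gsub approach can be extended with a lookup table. This is only a minimal sketch: the ESCAPES constant and the escape_controls helper are hypothetical names (not from the answer), and it covers just newline, tab, and carriage return.

```ruby
# Hypothetical lookup table: control character => visible escape sequence.
ESCAPES = { "\n" => '\n', "\t" => '\t', "\r" => '\r' }.freeze

# Hypothetical helper: returns a copy of the string in which each listed
# control character is replaced by its two-character escape sequence.
def escape_controls(str)
  str.gsub(/[\n\t\r]/, ESCAPES)
end

word = "x\nz"
puts escape_controls(word)   # prints x\nz on a single line
```

Unlike #inspect, the output format here is fully under your control, because you decide which characters are translated and how.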

Related

Cannot convert ISO8859-1 to cyrillic in ruby

I have text "ÐоÑÑинаÑ", and I want to convert it to cyrillic. 2cyr.com says that this is ISO8859-1 format. I tried
"ÐоÑÑинаÑ".force_encoding("ISO8859-1").encode("UTF-8")
But it returned =>
"Ã\u0090Â\u0093Ã\u0090¾Ã\u0091Â\u0081Ã\u0091Â\u0082Ã\u0090¸Ã\u0090½Ã\u0090°Ã\u0091Â\u008F"
What should I do to make the final word be "Гостиная"
It's the other way round. Your string is the result of:
str = "Гостиная".force_encoding('ISO8859-1').encode('UTF-8')
#=> "Ð\u0093оÑ\u0081Ñ\u0082инаÑ\u008F"
puts str
#=> ÐоÑÑинаÑ
To revert it, use:
str.encode('ISO8859-1').force_encoding('UTF-8')
#=> "Гостиная"
Of course, this only works if the malformed string is left intact (it contains several invisible / unprintable characters).
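The round trip can be checked end to end. A small sketch, assuming the source file is UTF-8 and the garbled string has not lost any of its invisible characters; the variable names are illustrative only:

```ruby
# Assumes this source file is UTF-8 (the default since Ruby 2.0).
original = "Гостиная"

# Simulate the damage: relabel the UTF-8 bytes as ISO-8859-1,
# then transcode those "Latin-1 characters" to UTF-8.
garbled = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Revert: transcode back to ISO-8859-1 to recover the original bytes,
# then relabel those bytes as UTF-8.
restored = garbled.encode("ISO-8859-1").force_encoding("UTF-8")

puts restored == original   # prints true
```

force_encoding only changes the label on the bytes, while encode actually rewrites the bytes, which is why the order of the two calls matters.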
Best you can do is switch the order of methods:
puts "ÐоÑÑинаÑ".encode("CP1252")
#=> �о��ина�
Your string still contains broken chars, but that is likely to be inherent to your original string. Online tools like this one give the same result.

How to insert a newline character to an array of characters

I want to insert a newline character into an array of characters which initially is a string. Let's say I have a variable myvar = "Blizzard". A string is formed from an array of characters. How can I insert a newline character inside it? In hope of making an output like this:
"B
lizzard"
I tried this:
myvar[1] = "\n"
but it's not working, and the output is like this:
"B\nlizzard"
My goal is to make the output like this:
B
l
i
z
z
a
r
d
without using puts. I have to do it by inserting newline characters into the array. Can someone point out where my mistake is, and if possible help me with this?
To add \n you can use this:
myvar = "Blizzard"
myvar.chars.map { |c| c + "\n" }.join.strip
Or better, @Uri's solution:
myvar.chars.join "\n"
But you can also print the letters one per line with the following code:
myvar.chars.each { |c| puts c }
or:
myvar.each_char { |c| puts c } # for ruby >= 2.0
by Darek Nędza
'Blizzard'.chars.join("\n")
# => "B\nl\ni\nz\nz\na\nr\nd"
If all you want is to print the characters each in a new row you can do the following:
puts 'Blizzard'.chars
Output:
B
l
i
z
z
a
r
d
You have done myvar[1] = "\n" correctly. Your problem is not how you did it, but what you are expecting.
You seem to be confusing the inspection of a string with the puts output of the string. Inspection is what irb displays as the return value; it is a meta-representation of what you have. As long as it is a string, it will be delimited by double quotes, and all the special characters will be escaped with a backslash \. If you have a newline character, it will be represented as "\n". On the other hand, when you pass the string to puts, you get the output according to what the special characters represent.
What you showed as the result you want (the one on multiple lines) is what puts produces. You will never get that from inspecting the string.
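The difference between the two views can be seen side by side. A short sketch with illustrative variable names; it uses String#insert because element assignment (myvar[1] = "\n") replaces the character at index 1 rather than inserting before it:

```ruby
myvar = "Blizzard".dup
myvar.insert(1, "\n")        # insert keeps the "l"; myvar[1] = "\n" would replace it

inspected = myvar.inspect    # the form irb echoes: quotes and a visible \n
puts inspected               # prints "B\nlizzard" on one line
puts myvar                   # prints B and lizzard on two separate lines
```

The string itself is the same in both cases; only the way it is displayed differs.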

Regex string with grouping?

I see in the documentation I'm able to do:
/\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67" #=> 0
puts dollars #=> prints 3
I was wondering if this would be possible:
string = "\$(\?<dlr>\d+)\.(\?<cts>\d+)"
/#{Regexp.escape(string)}/ =~ "$3.67"
I get:
`<main>': undefined local variable or method `dlr' for main:Object (NameError)
There are a few mistakes in your approach. First of all, let's look at your string:
string = "\$(\?<dlr>\d+)\.(\?<cts>\d+)"
You escape the dollar sign with "\$", but that is the same as just writing "$", consider:
"\$" == "$"
#=> true
To actually end up with the string "backslash followed by dollar" you would need to write "\\$". The same thing applies to the decimal character classes, you would have to write "\\d" to end up with the correct string.
The question marks on the other hand are actually part of the regex syntax, so you do not want to escape these at all. I recommend using single quotes for your original string, because that makes the input much easier:
string = '\$(?<dlr>\d+)\.(?<cts>\d+)'
#=> "\\$(?<dlr>\\d+)\\.(?<cts>\\d+)"
The next issue is with Regexp.escape. Take a look at what regular expression it produces with the above string:
string = '\$(?<dlr>\d+)\.(?<cts>\d+)'
Regexp.escape(string)
#=> "\\\\\\$\\(\\?<dlr>\\\\d\\+\\)\\\\\\.\\(\\?<cts>\\\\d\\+\\)"
That's one level too much escaping. Regexp.escape can be used when you want to match the literal characters that are contained in the string. For example, the escaped regex above will match the source string itself:
/#{Regexp.escape(string)}/ =~ string
#=> 0 # matches at offset 0
Instead, you can use Regexp.new to treat the source as an actual regular expression.
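The difference between the two constructions can be checked directly. A minimal sketch with illustrative variable names:

```ruby
source = '\$(?<dlr>\d+)\.(?<cts>\d+)'

# Escaped: every metacharacter is neutralized, so this only matches the
# pattern text itself.
literal_text = Regexp.new(Regexp.escape(source))

# Unescaped: the string is compiled as a real regular expression.
pattern = Regexp.new(source)

matches_own_source = !(literal_text =~ source).nil?
matches_price      = !(literal_text =~ "$3.67").nil?
pattern_matches    = !(pattern =~ "$3.67").nil?

p matches_own_source   # true
p matches_price        # false
p pattern_matches      # true
```

Regexp.escape is the right tool when the string comes from user input and should be matched verbatim; it is the wrong tool when the string is already a pattern.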
The last issue is how you access the match result. You are getting a NameError because named captures are only assigned to local variables when the regex is a literal, without interpolation, on the left-hand side of =~. With an interpolated or dynamically built regex, no local variables called dlr and cts are created. You have two options to access the match data:
Use Regexp#match, which returns a MatchData object
Use regexp =~ string and then access the last match data with the global variable $~
I prefer the former, because it is easier to read. The full code would then look like this:
string = '\$(?<dlr>\d+)\.(?<cts>\d+)'
regexp = Regexp.new(string)
result = regexp.match("$3.67")
#=> #<MatchData "$3.67" dlr:"3" cts:"67">
result[:dlr]
#=> "3"
result[:cts]
#=> "67"
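To make the three access styles concrete, here is a short sketch that puts them next to each other. The variable names (from_literal, from_global, as_hash) are illustrative only:

```ruby
# A regex literal on the left-hand side of =~ assigns named captures
# to local variables of the same name.
if /\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67"
  from_literal = [dollars, cents]
end

# A regex built at runtime creates no locals; read the MatchData instead.
pattern = Regexp.new('\$(?<dlr>\d+)\.(?<cts>\d+)')
pattern =~ "$3.67"
from_global = [$~[:dlr], $~[:cts]]

# Regexp#match returns the MatchData directly; names and captures can be
# zipped into a hash.
match = pattern.match("$3.67")
as_hash = Hash[match.names.zip(match.captures)]

p from_literal   # ["3", "67"]
p from_global    # ["3", "67"]
p as_hash        # {"dlr"=>"3", "cts"=>"67"}
```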

Ruby: How to append to each line of a string based on a given regex?

I want to append </tag> to each line where it's missing:
text = '<tag>line 1</tag>
<tag>line2 # no closing tag, append
<tag>line3 # no closing tag, append
line4</tag> # no opening tag, but has a closing tag, so ignore
<tag>line5</tag>'
I tried to create a regular expression to match this but I know its wrong:
text.gsub! /.*?(<\/tag>)Z/, '</tag>'
How can I create a regular expression to conditionally append each line?
Here you go:
text.gsub!(%r{(?<!</tag>)$}, "</tag>")
Explanation:
$ means end of line and \z means end of string. \Z means something similar, with complications.
(?<!...) is a negative lookbehind: the match succeeds only at positions that are not preceded by </tag>.
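A quick check of the lookbehind on a cleaned-up version of the sample text (without the explanatory comments, and with no trailing newline). It uses the non-destructive gsub so the input stays intact; the variable names are illustrative:

```ruby
text = "<tag>line 1</tag>\n<tag>line2\n<tag>line3\nline4</tag>\n<tag>line5</tag>"

# $ matches at the end of every line; the lookbehind rejects positions
# that are already preceded by </tag>.
fixed = text.gsub(%r{(?<!</tag>)$}, "</tag>")

puts fixed
```

Only the second and third lines gain a closing tag; the other three already end with </tag> and are left alone.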
Given the example provided, I'd just do something like this:
text.split(/<\/?tag>/).
  reject {|t| t.strip.length == 0 }.
  map {|t| "<tag>%s</tag>" % t.strip }.
  join("\n")
You're basically treating either <tag> or </tag> as a record delimiter, so you can just split on them, reject any blank records, then construct a new combined string from the extracted values. This works nicely when you can't count on newlines being record delimiters and will generally be tolerant of missing tags.
If you're insistent on a pure regex solution, though, and your data format will always match the given format (one record per line), you can use a negative lookbehind:
text.strip.gsub(/(?<!<\/tag>)(\n|$)/, "</tag>\\1")
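One that could work is:
/(<tag>[^\n ]+[^>])[\s]*(\n)/
This matches each line that starts with <tag> and has no ">" before its newline, capturing the line content in the first group.
Replace the match with the captured content followed by "</tag>\n", i.e.
text.gsub!( /(<tag>[^\n ]+[^>])[\s]*(\n)/ , "\\1</tag>\n")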
For more polishing, try http://rubular.com/
text = '<tag>line 1</tag>
<tag>line2
<tag>line3
line4</tag>
<tag>line5</tag>'
result = ""
text.each_line do |line|
  line.rstrip!
  line << "</tag>" if not line.end_with?("</tag>")
  result << line << "\n"
end
puts result
--output:--
<tag>line 1</tag>
<tag>line2</tag>
<tag>line3</tag>
line4</tag>
<tag>line5</tag>

Ruby: String no longer mixes in Enumerable in 1.9

So how can I still be able to write beautiful code such as:
'im a string meing!'.pop
Note: str.chop isn't sufficient answer
It is not clear what an enumerable string actually enumerates. Is a string a sequence of ...
lines,
characters,
codepoints or
bytes?
The answer is: all of those, any of those, either of those or neither of those, depending on the context. Therefore, you have to tell Ruby which of those you actually want.
There are several methods in the String class which return enumerators for any of the above. If you want the pre-1.9 behavior, your code sample would be
'im a string meing!'.bytes.to_a.pop
This looks kind of ugly, but there is a reason for it: a string is a sequence. You are treating it as a stack. A stack is not a sequence, in fact it pretty much is the opposite of a sequence.
That's not beautiful :)
Also #pop is not part of Enumerable, it's part of Array.
The reason String is not enumerable is that there is no 'natural' unit to enumerate: should it be on a character basis or a line basis? Because of this, String does not have an #each method.
String instead provides the #each_char and #each_byte and #each_line methods for iteration in the way that you choose.
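The different "views" of one string can be compared directly. A small sketch, assuming a UTF-8 source file and Ruby 2.0 or later (where chars, bytes, lines, and codepoints return arrays); the sample text is made up for illustration:

```ruby
text = "hé\nyo"

chars      = text.chars        # one element per character
bytes      = text.bytes        # "é" takes two bytes in UTF-8
lines      = text.lines        # split after each newline
codepoints = text.codepoints   # Unicode code points as integers

# The each_* variants return an Enumerator when called without a block,
# so every Enumerable method becomes available on the chosen unit.
last_char = text.each_char.to_a.last

p chars        # ["h", "é", "\n", "y", "o"]
p bytes.length # 6
p lines        # ["hé\n", "yo"]
p codepoints   # [104, 233, 10, 121, 111]
p last_char    # "o"
```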
Since you don't like str[str.length], how about
'im a string meing!'[-1] # returns last character as a character value
or
'im a string meing!'[-1,1] # returns last character as a string
or, if you need it modified in place as well, while keeping it an easy one-liner:
class String
  def pop
    last = self[-1,1]
    self.chop!
    last
  end
end
#!/usr/bin/ruby1.8
s = "I'm a string meing!"
s, last_char = s.rpartition(/./)
p [s, last_char] # => ["I'm a string meing", "!"]
String#rpartition is new for 1.9 but it's been back-ported to 1.8.7. It searches a string for a regular expression, starting at the end and working backwards. It returns the part of the string before the match, the match, and the part of the string after the match (which we discard here).
String#slice! and String#insert are going to get you much closer to what you want without converting your strings to arrays.
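The three-element return value of rpartition is easier to see when all parts are named. A brief sketch with illustrative variable names; the second call uses a plain string separator instead of a regex:

```ruby
# With /./ the last character is the match, so nothing follows it.
before, match, after = "I'm a string meing!".rpartition(/./)

# With a string separator, the split happens at its last occurrence.
head, dot, tail = "a.b.c".rpartition(".")

p [before, match, after]   # ["I'm a string meing", "!", ""]
p [head, dot, tail]        # ["a.b", ".", "c"]
```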
For example, to simulate Array#pop you can do:
text = '¡Exclamation!'
mark = text.slice! -1
mark == '!' #=> true
text #=> "¡Exclamation"
Likewise, for Array#shift:
text = "¡Exclamation!"
inverted_mark = text.slice! 0
inverted_mark == '¡' #=> true
text #=> "Exclamation!"
Naturally, to do an Array#push you just use one of the concatenation methods:
text = 'Hello'
text << '!' #=> "Hello!"
text.concat '!' #=> "Hello!!"
To simulate Array#unshift you use String#insert instead; it's a lot like the inverse of slice, really:
text = 'World!'
text.insert 0, 'Hello, ' #=> "Hello, World!"
You can also grab chunks from the middle of a string in multiple ways with slice.
First you can pass a start position and length:
text = 'Something!'
thing = text.slice 4, 5
And you can also pass a Range object to grab absolute positions:
text = 'This is only a test.'
only = text.slice (8..11)
In Ruby 1.9 using String#slice like this is identical to String#[], but if you use the bang method String#slice! it will actually remove the substring you specify.
text = 'This is only a test.'
only = text.slice! (8..12)
text == 'This is a test.' #=> true
Here's a slightly more complex example where we reimplement a simple version of String#gsub! to do a search and replace:
text = 'This is only a test.'
search = 'only'
replace = 'not'
index = text =~ /#{search}/
text.slice! index, search.length
text.insert index, replace
text == 'This is not a test.' #=> true
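Note that interpolating search into a regex means any metacharacters in it (such as $ or .) are treated as pattern syntax. A hedged variant of the same slice!/insert idea, wrapped in a hypothetical helper named replace_first! that uses String#index with a plain string and replaces only the first occurrence:

```ruby
# Hypothetical helper: replaces the first occurrence of search in text,
# in place. Returns the modified string, or nil if search is not found.
def replace_first!(text, search, replacement)
  index = text.index(search)          # plain substring search, no regex
  return nil if index.nil?
  text.slice!(index, search.length)   # remove the old substring
  text.insert(index, replacement)     # put the new one in its place
end

text = "This is only a test.".dup
replace_first!(text, "only", "not")
puts text   # prints This is not a test.
```

Because the search term is never compiled as a pattern, a string such as "$5" is matched literally.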
Of course, 99.999% of the time you're going to want to use the aforementioned String#gsub!, which does the same thing here:
text = 'This is only a test.'
text.gsub! 'only', 'not'
text == 'This is not a test.' #=> true