How to properly join LF character in an array? - ruby

Basic question.
Instead of adding '\n' between the elements:
>> puts "#{[1, 2, 3].join('\n')}"
1\n2\n3
I need to actually add the line feed character, so the output I expect when printing it would be:
1
2
3
What's the best way to do that in Ruby?

You need to use double quotes.
puts "#{[1, 2, 3].join("\n")}"
Note that you don't have to escape the double quotes because they're within the {} of a substitution, and thus will not be treated as delimiters for the outer string.
However, you don't even need the #{} wrapper if that's all you're doing - the following will work fine:
puts [1,2,3].join("\n")

Escaped characters can only be used in double quoted strings:
puts "#{[1, 2, 3].join("\n")}"
But since all you output is this one statement, I wouldn't quote the join statement:
puts [1, 2, 3].join("\n")
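If you want to see the difference for yourself, comparing the lengths of the two separators makes it obvious. This is only a minimal sketch; join_lines is a made-up helper name, not anything built in.

```ruby
# Hypothetical helper: joins the items with a real line feed.
def join_lines(items)
  items.join("\n") # double quotes: "\n" is one LF character
end

puts '\n'.length # single quotes: backslash plus the letter n, so 2
puts "\n".length # double quotes: a single line feed, so 1

puts join_lines([1, 2, 3])
```

The last line prints 1, 2 and 3 on separate lines, which is what the question asked for.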

Note that join only adds line feeds between the lines. It will not add a line-feed to the end of the last line. If you need every line to end with a line-feed, then:
#!/usr/bin/ruby1.8
lines = %w(one two three)
s = lines.collect do |line|
  line + "\n"
end.join
p s # => "one\ntwo\nthree\n"
print s # => one
# => two
# => three
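On a more recent Ruby the same idea can be written a bit more compactly. A sketch only; with_line_endings is an illustrative name, and interpolation is used so that non-string elements are converted with to_s.

```ruby
# Hypothetical helper: ends every element with a line feed, including the last.
def with_line_endings(lines)
  lines.map { |line| "#{line}\n" }.join
end

lines = %w(one two three)
p lines.join("\n")         # => "one\ntwo\nthree"
p with_line_endings(lines) # => "one\ntwo\nthree\n"
```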

Ruby doesn't interpret escape sequences in single-quoted strings.
You want to use double-quotes:
puts "#{[1, 2, 3].join("\n")}"
NB: My syntax might be bad - I'm not a Ruby programmer.

Related

What's different about this ruby regex?

I was trying to substitute either a comma or a percent sign, and it continually failed, so I opened up IRB and tried some things out. Can anyone explain to me why the first regex (IRB line 13) doesn't work but the flipped version does (IRB line 15)? I've looked it up and down and I don't see any typos, so it must be something to do with the rule but I can't see what.
b.gsub(/[%]*|[,]*/,"")
# => "245,324"
b.gsub(/[,]*/,"")
# => "245324"
b.gsub(/[,]*|[%]*/,"")
# => "245324"
b
# => "245,324"
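Because Ruby happily finds [%]* zero times at every position in your string and does the substitution. Since that first alternative always succeeds with an empty match, the second alternative [,]* is never tried. Check out this result: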
b = '232,000'
puts b.gsub(/[%]*/,"-")
--output:--
-2-3-2-,-0-0-0-
If you put all the characters that you want to erase into the same character class, then you will get the result you want:
b = "%245,324,000%"
puts b.gsub(/[%,]*/, '')
--output:--
245324000
Even then, there are a lot of needless substitutions going on:
b = "%245,324,000%"
puts b.gsub(/[%,]*/, '-')
--output:--
--2-4-5--3-2-4--0-0-0--
It's the zero or more that gets you into trouble because ruby can find lots of places where there are 0 percent signs or 0 commas. You actually don't want to do substitutions where ruby finds zero of your characters, instead you want to do substitutions where at least one of your characters occurs:
b = '%232,000,000%'
puts b.gsub(/%+|,+/,"")
--output:--
232000000
Or, equivalently:
puts b.gsub(/[%,]+/, '')
Also, note that regexes are like double quoted strings, so you can interpolate into them--it's as if the delimiters // are double quotes:
one_or_more_percents = '%+'
one_or_more_commas = ',+'
b = '%232,000,000%'
puts b.gsub(/#{one_or_more_percents}|#{one_or_more_commas}/,"")
--output:--
232000000
But when your regexes consist of single characters, just use a character class: [%,]+
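To make the "zero times" behaviour visible, you can look at what each pattern actually matches with scan. This is a rough sketch; matches and strip_separators are names invented for the example.

```ruby
# Hypothetical helper: lists every match the pattern finds in the string.
def matches(str, pattern)
  str.scan(pattern)
end

# Hypothetical helper: removes runs of percent signs and commas.
def strip_separators(str)
  str.gsub(/[%,]+/, "")
end

b = "245,324"
p matches(b, /[%]*|[,]*/) # only empty strings: the first alternative always wins
p matches(b, /[,]*|[%]*/) # contains "," because [,]* gets tried first
p matches(b, /[%,]+/)     # => [","]

puts strip_separators("%245,324,000%") # 245324000
```

With + instead of *, an empty match is impossible, so the order of the alternatives stops mattering.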

Ruby backslash to continue string on a new line?

I'm reviewing a line of Ruby code in a pull request. I'm not sure if this is a bug or a feature that I haven't seen before:
puts "A string of Ruby that"\
"continues on the next line"
Is the backslash a valid character to concatenate these strings? Or is this a bug?
That is valid code.
The backslash is a line continuation. Your code has two quoted runs of text; the runs appear like two strings, but are really just one string because Ruby concatenates whitespace-separated runs.
Example of three quoted runs of text that are really just one string:
"a" "b" "c"
=> "abc"
Example of three quoted runs of text that are really just one string, using \ line continuations:
"a" \
"b" \
"c"
=> "abc"
Example of three strings, using + line continuations and also concatenations:
"a" +
"b" +
"c"
=> "abc"
Other line continuation details: "Ruby interprets semicolons and newline characters as the ending of a statement. However, if Ruby encounters operators, such as +, -, or backslash at the end of a line, they indicate the continuation of a statement." - Ruby Quick Guide
The backslash character does not concatenate any strings. It prevents the line-break from meaning that those two lines are different statements. Think of the backslash as the opposite of the semicolon. The semicolon lets two statements occupy one line; the backslash lets one statement occupy two lines.
What you are not realizing is that a string literal can be written as multiple successive literals. This is legal Ruby:
s = "A string of Ruby that" "continues on the same line"
puts s
Since that is legal, it is legal to put a line break between the two string literals - but then you need the backslash, the line-continuation character, to tell Ruby that these are in fact the same statement, spread over two lines.
s = "A string of Ruby that" \
"continues on the same line"
puts s
If you omit the backslash, it is still legal, but doesn't give the result you might be hoping for; the string literal on the second line is simply thrown away.
This is not a case of concatenated strings. It is one single string. "foo" "bar" is a syntactic construct that allows you to break up a string in your code, but it is identical to "foobar". In contrast, "foo" + "bar" is the true concatenation, invoking the concatenation method + on object "foo".
You can verify this by dumping the YARV instructions. Compare:
RubyVM::InstructionSequence.compile('"foo" + "bar"').to_a
# .... [:putstring, "foo"], [:putstring, "bar"] ....
RubyVM::InstructionSequence.compile('"foo" "bar"').to_a
# .... [:putstring, "foobar"] ....
The backslash in front of the newline will cancel the newline, so it does not terminate the statement; without it, it would not be one string, but two strings in separate lines.
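Here is a small sketch of the two cases side by side, with and without the trailing backslash. The method names are made up for the illustration, and the strings are the ones from the question with a space added so the result reads naturally.

```ruby
# With the backslash the two literals form one string.
def with_backslash
  "A string of Ruby that " \
    "continues on the next line"
end

# Without it these are two separate statements; the first literal is
# evaluated and discarded, and only the second one is returned.
def without_backslash
  "A string of Ruby that "
  "continues on the next line"
end

puts with_backslash
puts without_backslash
```

Running with warnings enabled (ruby -w) should flag the discarded literal in the second method.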

How to insert a newline character to an array of characters

I want to insert a newline character into an array of characters which initially is a string. Let's say I have a variable myvar = "Blizzard". A string is formed from an array of characters. How can I insert a newline character inside it? In hope of making an output like this:
"B
lizzard"
I tried this:
myvar[1] = "\n"
but it's not working, and the output is like this:
"B\nlizzard"
My goal is to make the output like this:
B
l
i
z
z
a
r
d
without using puts. I have to do it by inserting newline characters into the array. Can someone point out where my mistake is, and if possible help me with this?
To add \n you can use this:
myvar = "Blizzard"
myvar.chars.map { |c| c + "\n" }.join.strip
Or better, @Uri's solution:
myvar.chars.join "\n"
But you can also print one letter per line with the following code:
myvar.chars.each { |c| puts c }
or:
myvar.each_char { |c| puts c } # for ruby >= 2.0
by Darek Nędza
'Blizzard'.chars.join("\n")
# => "B\nl\ni\nz\nz\na\nr\nd"
If all you want is to print the characters each in a new row you can do the following:
puts 'Blizzard'.chars
Output:
B
l
i
z
z
a
r
d
You have done myvar[1] = "\n" correctly. Your problem is not how you did it, but what you are expecting.
You seem to be confusing the inspection of a string and the puts output of the string. Inspection is what is displayed as the return value as in irb, and it is a meta-representation of what you have. And as long as it is a string, it will be delimited by double quotes, and all the special characters will be escaped with a backslash \. If you have a new line character, that would be represented as "\n". On the other hand, when you pass the string to puts, you will get the output according to what the special characters represent.
What you showed as the output you want (the one spread over multiple lines) is what puts produces. You will never get that from the inspection of the string.
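A short sketch of the inspect versus puts difference. One side note that is easy to miss: str[1] = "\n" overwrites the character at index 1, whereas insert keeps it. The helper names below are invented for the example.

```ruby
# Hypothetical helper: overwrites the second character with a line feed.
def overwrite_second_char(str)
  copy = str.dup
  copy[1] = "\n"
  copy
end

# Hypothetical helper: inserts a line feed before the second character.
def insert_before_second_char(str)
  str.dup.insert(1, "\n")
end

p overwrite_second_char("Blizzard")     # => "B\nizzard"
p insert_before_second_char("Blizzard") # => "B\nlizzard"

# p (and irb) show the inspected form; puts writes the actual characters.
puts insert_before_second_char("Blizzard")
```

The final puts prints B on one line and lizzard on the next.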

When to use %w?

The following two statements will generate the same result:
arr = %w(abc def ghi jkl)
and
arr = ["abc", "def", "ghi", "jkl"]
In which cases should %w be used?
In the case above, I want an array ["abc", "def", "ghi", "jkl"]. Which is the ideal way: the former (with %w) or the latter?
When to use %w[...] vs. a regular array? I'm sure you can think up reasons simply by looking at the two, and then typing them in, and thinking about what you just did.
Use %w[...] when you have a list of single words you want to turn into an array. I use it when I have parameters I want to loop over, or commands I know I'll want to add to in the future, because %w[...] makes it easy to add new elements to the array. There's less visual noise in the definition of the array.
Use a regular array of strings when you have elements that have embedded white-space that would trick %w. Use it for arrays that have to contain elements that are not strings. Enclosing the elements inside " and ' with intervening commas causes visual-noise, but it also makes it possible to create arrays with any object type.
So, you pick when to use one or the other when it makes the most sense to you. It's called "programmer's choice".
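To illustrate the whitespace point, here is a minimal sketch; the constant names and the city values are made up for the example.

```ruby
# %w splits on whitespace, so "New York" becomes two elements.
SPLIT_ON_SPACES = %w(New York Boston)

# A regular array literal keeps the embedded space.
WITH_SPACES = ["New York", "Boston"]

# Only a regular array literal can hold non-string elements.
MIXED = [1, :two, "three"]

p SPLIT_ON_SPACES # => ["New", "York", "Boston"]
p WITH_SPACES     # => ["New York", "Boston"]
p MIXED           # => [1, :two, "three"]
```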
As you correctly noted, they generate the same result. So, when deciding, choose one that produces simpler code. In this case, it's the %w operator. In the case of your previous question, it's the array literal.
Using %w allows you to avoid using quotes around strings.
Moreover, there are more shortcuts like these:
%W - array of words with double-quote rules (interpolation and escape sequences)
%r - regular expression
%q - single-quoted string
%Q - double-quoted string
%x - shell command
More information is available in "What does %w(array) mean?"
This is the way I remember it:
%Q/%q is for strings
%Q is for double-quoted strings (useful for when you have multiple quote characters in a string).
Instead of doing this:
"I said \"Hello World\""
You can do:
%Q{I said "Hello World"}
%q is for single-quoted strings (remember that single-quoted strings do not support string interpolation or escape sequences such as \n. When I say they do not "support" them, I mean that single-quoted strings will not process the escape sequence as a special character; in other words, the escape sequence will just be part of the string literal)
Instead of doing this:
'I said \'Hello World\''
You can do:
%q{I said 'Hello World'}
But note that if you have an escape sequence in string, that will not be processed and instead treated as a literal backslash and n character:
result = %q{I said Hello World\n}
=> "I said Hello World\\n"
puts result
I said Hello World\n
Notice the literal \n was not treated as a line break, but it is with %Q:
result = %Q{I said Hello World\n}
=> "I said Hello World\n"
puts result
I said Hello World
%W/%w is for array elements
%W is used for double-quoted array elements. This means that it will support string interpolation and escape sequences:
Instead of doing this:
orange = "orange"
result = ["apple", "#{orange}", "grapes"]
=> ["apple", "orange", "grapes"]
you can do this:
result = %W(apple #{orange} grapes\n)
=> ["apple", "orange", "grapes\n"]
puts result
apple
orange
grapes
Notice the escape sequence \n caused a newline break after grapes. That would not happen with %w. %w is used for single-quoted array elements. And of course single quoted strings do not support interpolation and escape sequences.
Instead of doing this:
result = ['a', 'b', 'c']
you can do:
result = %w{a b c}
But look what happens when we try this:
result = %w{a b c\n}
=> ["a", "b", "c\\n"]
puts result
a
b
c\n
Remember not to confuse these constructs with %x (an alternative to the ` backtick, which is used to run Unix commands), %r (an alternative to the // regular expression syntax, useful when you have a lot of / characters in your regular expressions and do not want to escape them), and finally %s (which is used for symbols).
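As a compact side-by-side of %w and %W on the same input, here is a small sketch; the method names and the fruit parameter are just for illustration.

```ruby
# %w behaves like single quotes: no interpolation, no escape processing.
def literal_words
  %w(apple #{fruit} grapes\n)
end

# %W behaves like double quotes: interpolation and escapes are processed.
def interpolated_words(fruit)
  %W(apple #{fruit} grapes\n)
end

p literal_words                # => ["apple", "\#{fruit}", "grapes\\n"]
p interpolated_words("orange") # => ["apple", "orange", "grapes\n"]
```

Note that literal_words never looks up a variable called fruit; the characters #{fruit} simply end up in the string.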

How to split a string by colons NOT in quotes

I have a CSV-file delimited by colons, but it contains text-fields wrapped in quotes, which themselves contain several colons.
I would like a simple solution for getting the data fields, but in Ruby, for example, the split method splits on every colon.
Is there a regex which matches all colons, except those wrapped in quotes?
Given:
str = 'foo:bar:"jim:jam":jar'
You can do this:
a = str.scan( /([^":]+)|"([^"]+)"/ ).flatten.compact
p a
#=> ["foo", "bar", "jim:jam", "jar"]
Or you can do this:
a = []
str.scan( /([^":]+)|"([^"]+)"/ ){ a << ($1 || $2) }
p a
#=> ["foo", "bar", "jim:jam", "jar"]
That regex says to find either:
One or more characters that are not a-quote-or-a-colon, or
A quote, followed by one or more characters that are not a quote, followed by a quote.
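Wrapped up as a method, the scan approach might look like the sketch below (split_fields is a made-up name). One caveat worth knowing: because both alternatives require at least one character, empty fields are silently dropped.

```ruby
# Hypothetical helper: returns the colon-separated fields, treating
# double-quoted sections as single fields.
def split_fields(str)
  # Group 1: a run of characters that are neither a quote nor a colon.
  # Group 2: the contents of a double-quoted section.
  str.scan(/([^":]+)|"([^"]+)"/).flatten.compact
end

p split_fields('foo:bar:"jim:jam":jar') # => ["foo", "bar", "jim:jam", "jar"]
p split_fields("a::b")                  # => ["a", "b"] (the empty field is lost)
```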
Just use http://ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
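For what it's worth, a minimal sketch of the CSV route, assuming the file follows ordinary CSV quoting rules and only the delimiter differs. parse_fields is an invented wrapper name; col_sep is the CSV option that sets the field separator.

```ruby
require "csv"

# Hypothetical wrapper: parses one colon-delimited line with the CSV library.
def parse_fields(line)
  CSV.parse_line(line, col_sep: ":")
end

p parse_fields('foo:bar:"jim:jam":jar') # => ["foo", "bar", "jim:jam", "jar"]
```

Unlike a hand-rolled regex, this keeps empty fields (as nil) and handles escaped quotes inside quoted fields.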
you can split on double quotes instead of colons
>> str = 'foo:bar:"jim:jam":jar'
=> "foo:bar:\"jim:jam\":jar"
>> str.split("\"").each_with_index do |x,y|
?> puts y%2==0 ? x.split(":") : x
>> end
foo
bar
jim:jam
jar
First attempt was so bad, revised the entire thing. This is my regex solution:
GETS LAST delimiter field ':' = :last
Trims: /(?:^\s*:|:|^)\s*(".*?"|.*?)(?=\s*(?:\:|$))/
No-trim: /(?:(?<!^):|^)(\s*".*?"\s*|.*?)(?=\:|$)/
------------------
GETS FIRST AND LAST delimiter fields ':' = first:last
Trims: /(?:^|:)\s*(".*?"|(?<!^).*?|)(?=\s*(?:\:|$))/
No trim: /(?:^|:)(\s*".*?"\s*|\s*(?<!^).*?|)(?=\:|$)/
And yes, it's not as easy as one thinks.
