String concatenation vs. interpolation in Ruby

I am just starting to learn Ruby (first time programming), and have a basic syntactical question with regards to variables, and various ways of writing code.
Chris Pine's "Learn to Program" taught me to write a basic program like this...
num_cars_again = 2
puts 'I own ' + num_cars_again.to_s + ' cars.'
This is fine, but then I stumbled across the tutorial on ruby.learncodethehardway.com, and was taught to write the same exact program like this...
num_cars = 2
puts "I own #{num_cars} cars."
They both output the same thing, but obviously option 2 is a much shorter way to do it.
Is there any particular reason why I should use one format over the other?

Whenever TIMTOWTDI (there is more than one way to do it), you should look for the pros and cons. Using "string interpolation" (the second) instead of "string concatenation" (the first):
Pros:
Is less typing
Automatically calls to_s for you
More idiomatic within the Ruby community
Faster to accomplish during runtime
Cons:
Automatically calls to_s for you (maybe you thought you had a string, and the to_s representation is not what you wanted, and hides the fact that it wasn't a string)
Requires you to use " to delimit your string instead of ' (perhaps you have a habit of using ', or you previously typed a string using that and only later needed to use string interpolation)
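To make the to_s and quoting points concrete, here is a minimal sketch using the question's num_cars variable. The other variable names are only illustrative.

```ruby
num_cars = 2

# Interpolation calls to_s on the embedded value automatically.
interpolated = "I own #{num_cars} cars."

# Concatenation needs an explicit conversion.
concatenated = 'I own ' + num_cars.to_s + ' cars.'

# Without to_s, String#+ raises TypeError because an Integer is not a String.
error_class = begin
  'I own ' + num_cars + ' cars.'
rescue TypeError => e
  e.class
end

# Single quotes do not interpolate: the #{...} stays as literal text.
literal = 'I own #{num_cars} cars.'

puts interpolated
puts concatenated
puts error_class
puts literal
```

The last line is the second "con" in practice: switching a string from ' to " is required before interpolation does anything.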

Both interpolation and concatenation have their own strengths and weaknesses. The benchmark below shows where each one performs better.
require 'benchmark'
iterations = 1_00_000
firstname = 'soundarapandian'
middlename = 'rathinasamy'
lastname = 'arumugam'
puts 'With dynamic new strings'
puts '===================================================='
5.times do
  Benchmark.bm(10) do |benchmark|
    benchmark.report('concatination') do
      iterations.times do
        'Mr. ' + firstname + middlename + lastname + ' aka soundar'
      end
    end
    benchmark.report('interpolaton') do
      iterations.times do
        "Mr. #{firstname} #{middlename} #{lastname} aka soundar"
      end
    end
  end
  puts '--------------------------------------------------'
end
puts 'With predefined strings'
puts '===================================================='
5.times do
  Benchmark.bm(10) do |benchmark|
    benchmark.report('concatination') do
      iterations.times do
        firstname + middlename + lastname
      end
    end
    benchmark.report('interpolaton') do
      iterations.times do
        "#{firstname} #{middlename} #{lastname}"
      end
    end
  end
  puts '--------------------------------------------------'
end
And below is the Benchmark result
Without predefined strings
====================================================
user system total real
concatination 0.170000 0.000000 0.170000 ( 0.165821)
interpolaton 0.130000 0.010000 0.140000 ( 0.133665)
--------------------------------------------------
user system total real
concatination 0.180000 0.000000 0.180000 ( 0.180410)
interpolaton 0.120000 0.000000 0.120000 ( 0.125051)
--------------------------------------------------
user system total real
concatination 0.140000 0.000000 0.140000 ( 0.134256)
interpolaton 0.110000 0.000000 0.110000 ( 0.111427)
--------------------------------------------------
user system total real
concatination 0.130000 0.000000 0.130000 ( 0.132047)
interpolaton 0.120000 0.000000 0.120000 ( 0.120443)
--------------------------------------------------
user system total real
concatination 0.170000 0.000000 0.170000 ( 0.170394)
interpolaton 0.150000 0.000000 0.150000 ( 0.149601)
--------------------------------------------------
With predefined strings
====================================================
user system total real
concatination 0.070000 0.000000 0.070000 ( 0.067735)
interpolaton 0.100000 0.000000 0.100000 ( 0.099335)
--------------------------------------------------
user system total real
concatination 0.060000 0.000000 0.060000 ( 0.061955)
interpolaton 0.130000 0.000000 0.130000 ( 0.127011)
--------------------------------------------------
user system total real
concatination 0.090000 0.000000 0.090000 ( 0.092136)
interpolaton 0.110000 0.000000 0.110000 ( 0.110224)
--------------------------------------------------
user system total real
concatination 0.080000 0.000000 0.080000 ( 0.077587)
interpolaton 0.110000 0.000000 0.110000 ( 0.112975)
--------------------------------------------------
user system total real
concatination 0.090000 0.000000 0.090000 ( 0.088154)
interpolaton 0.140000 0.000000 0.140000 ( 0.135349)
--------------------------------------------------
Conclusion
If the strings are already defined and you are sure they will never be nil, use concatenation; otherwise use interpolation. Pick the one that performs better for your case rather than the one that is merely easier to write.
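The nil caveat can be sketched as follows. The name parts here are hypothetical placeholders, not the values from the benchmark.

```ruby
# Hypothetical data: a middle name that was never set.
middlename = nil

# Interpolation calls to_s, and nil.to_s is the empty string.
interpolated = "John #{middlename} Smith"

# String#+ needs a String argument, so nil raises TypeError.
concat_error = begin
  'John ' + middlename + ' Smith'
rescue TypeError => e
  e.class
end

puts interpolated.inspect
puts concat_error
```

So interpolation silently produces a string with a gap in it, while concatenation fails loudly; which behaviour is preferable depends on whether nil is expected.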

@user1181898 - IMHO, it's because it's easier to see what's happening. To @Phrogz's point, string interpolation automatically calls the to_s for you. As a beginner, you need to see what's happening "under the hood" so that you learn the concept as opposed to just learning by rote.
Think of it like learning mathematics. You learn the "long" way in order to understand the concepts so that you can take shortcuts once you actually know what you are doing. I speak from experience b/c I'm not that advanced in Ruby yet, but I've made enough mistakes to advise people on what not to do. Hope this helps.

If you are using a string as a buffer, I found appending in place with String#<< (which behaves like String#concat) to be faster than interpolation.
require 'benchmark/ips'
puts "Ruby #{RUBY_VERSION} at #{Time.now}"
puts
firstname = 'soundarapandian'
middlename = 'rathinasamy'
lastname = 'arumugam'
Benchmark.ips do |x|
  x.report("String\#<<") do |i|
    buffer = String.new
    while (i -= 1) > 0
      buffer << 'Mr. ' << firstname << middlename << lastname << ' aka soundar'
    end
  end
  x.report("String interpolate") do |i|
    buffer = String.new
    while (i -= 1) > 0
      buffer << "Mr. #{firstname} #{middlename} #{lastname} aka soundar"
    end
  end
  x.compare!
end
Results:
Ruby 2.3.1 at 2016-11-15 15:03:57 +1300
Warming up --------------------------------------
String#<< 230.615k i/100ms
String interpolate 234.274k i/100ms
Calculating -------------------------------------
String#<< 2.345M (± 7.2%) i/s - 11.761M in 5.041164s
String interpolate 1.242M (± 5.4%) i/s - 6.325M in 5.108324s
Comparison:
String#<<: 2344530.4 i/s
String interpolate: 1241784.9 i/s - 1.89x slower
At a guess, I'd say that interpolation generates a temporary string which is why it's slower.
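That guess can be checked without a benchmark. The sketch below only looks at object identity: String#<< returns the buffer itself, whereas an interpolated literal is a separate String that then has to be copied into the buffer. Variable names other than firstname are illustrative.

```ruby
firstname = 'soundarapandian'

# String#<< appends in place and returns the receiver, so no new String is built.
buffer = String.new
appended = buffer << 'Mr. ' << firstname
same_object = appended.equal?(buffer)

# Interpolation builds a new String first; << then copies it into the buffer.
piece = "Mr. #{firstname}"
buffer2 = String.new
buffer2 << piece
distinct_object = !piece.equal?(buffer2)

puts same_object
puts distinct_object
puts buffer == buffer2
```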

Here is a fuller benchmark that also compares Kernel#format and String#+, which together cover all the ways of building a dynamic string in Ruby that I know of 🤔
require 'benchmark/ips'
firstname = 'soundarapandian'
middlename = 'rathinasamy'
lastname = 'arumugam'
FORMAT_STR = 'Mr. %<firstname>s %<middlename>s %<lastname>s aka soundar'
Benchmark.ips do |x|
  # The blocks take no argument: a block declared with |i| must loop i times
  # itself, otherwise benchmark-ips overstates its rate.
  x.report("String\#<<") do
    str = String.new
    str << 'Mr. ' << firstname << ' ' << middlename << ' ' << lastname << ' aka soundar'
  end
  x.report "String\#+" do
    'Mr. ' + firstname + ' ' + middlename + ' ' + lastname + ' aka soundar'
  end
  x.report "format" do
    format(FORMAT_STR, firstname: firstname, middlename: middlename, lastname: lastname)
  end
  x.report("String interpolate") do
    "Mr. #{firstname} #{middlename} #{lastname} aka soundar"
  end
  x.compare!
end
And results for ruby 2.6.5. The String#<< and String interpolate rows were recorded with blocks declared as do |i| that ran their body only once per call, which is why they show billions of i/s; only the String#+ and format rows are trustworthy.
Warming up --------------------------------------
String#<< 94.597k i/100ms
String#+ 75.512k i/100ms
format 73.269k i/100ms
String interpolate 164.005k i/100ms
Calculating -------------------------------------
String#<< 91.385B (±16.9%) i/s - 315.981B
String#+ 905.389k (± 4.2%) i/s - 4.531M in 5.013725s
format 865.746k (± 4.5%) i/s - 4.323M in 5.004103s
String interpolate 161.694B (±11.3%) i/s - 503.542B
Comparison:
String interpolate: 161693621120.0 i/s
String#<<: 91385051886.2 i/s - 1.77x slower
String#+: 905388.7 i/s - 178590.27x slower
format: 865745.8 i/s - 186768.00x slower

Related

Why map {}.compact is faster than each_with_object([])?

I did some benchmarks:
require 'benchmark'
words = File.open('/usr/share/dict/words', 'r') do |file|
  file.each_line.take(1_000_000).map(&:chomp)
end
Benchmark.bmbm(20) do |x|
  GC.start
  x.report(:map) do
    words.map do |word|
      word.size if word.size > 5
    end.compact
  end
  GC.start
  x.report(:each_with_object) do
    words.each_with_object([]) do |word, long_sizes|
      long_sizes << word.size if word.size > 5
    end
  end
end
Output (ruby 2.3.0):
Rehearsal --------------------------------------------------------
map 0.020000 0.000000 0.020000 ( 0.016906)
each_with_object 0.020000 0.000000 0.020000 ( 0.024695)
----------------------------------------------- total: 0.040000sec
user system total real
map 0.010000 0.000000 0.010000 ( 0.015004)
each_with_object 0.020000 0.000000 0.020000 ( 0.024183)
I cannot understand this, because I thought each_with_object should be faster: it needs only one loop and one new array, instead of the two loops and two new arrays needed when we combine map and compact.
Any ideas?
Array#<< needs to reallocate memory if the original memory space doesn't have enough room to hold the new item. See the implementation, especially this line
VALUE target_ary = ary_ensure_room_for_push(ary, 1);
Array#map, on the other hand, doesn't have to reallocate memory along the way because it already knows the size of the result array. See the implementation, especially
collect = rb_ary_new2(RARRAY_LEN(ary));
which allocates the same size of memory as the original array.
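For reference, here is a small sketch of the two shapes being compared, on a made-up word list instead of /usr/share/dict/words. It shows that both produce the same result and what the intermediate array from map looks like; it says nothing about timing.

```ruby
# Illustrative data, not the dictionary file from the question.
words = %w[apple fig banana kiwi cherries]

# map allocates an array of the same length as words (nil for short words),
# then compact copies the non-nil entries into a second array.
intermediate = words.map { |word| word.size if word.size > 5 }
via_map = intermediate.compact

# each_with_object starts from an empty array that grows on each push.
via_each = words.each_with_object([]) do |word, long_sizes|
  long_sizes << word.size if word.size > 5
end

p intermediate
p via_map
p via_each
```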

Random string generation with same pattern

I want to generate a random string with the pattern:
number-number-letter-SPACE-letter-number-number
for example "81b t15", "12a x13". How can I generate something like this? I tried generating each char and joining them into one string, but it does not look efficient.
Nums = (0..9).to_a
Ltrs = ("A".."Z").to_a + ("a".."z").to_a
def rand_num; Nums.sample end
def rand_ltr; Ltrs.sample end
"#{rand_num}#{rand_num}#{rand_ltr} #{rand_ltr}#{rand_num}#{rand_num}"
# => "71P v33"
Have you looked at the randexp gem?
It works like this:
> /\d\d\w \w\d\d/.gen
=> "64M c82"
Ok here's another entry for the competition :D
module RandomString
  LETTERS = (("A".."Z").to_a + ("a".."z").to_a)
  LETTERS_SIZE = LETTERS.size
  SPACE = " "
  # number-number-letter-SPACE-letter-number-number
  FORMAT = [:number, :number, :letter, :space, :letter, :number, :number]
  class << self
    def generate
      chars.join
    end
    def generate2
      "#{number}#{number}#{letter} #{letter}#{number}#{number}"
    end
    private
    def chars
      # send is used because the helper methods below are private
      FORMAT.collect{|char_class| send char_class}
    end
    def letter
      LETTERS[rand(LETTERS_SIZE)]
    end
    def number
      rand 10
    end
    def space
      SPACE
    end
  end
end
And you use it like:
50.times { puts RandomString.generate }
Out of curiosity, I made a benchmark of all the solutions presented here. Here are the results:
JRuby:
user system total real
kimmmo 1.490000 0.000000 1.490000 ( 0.990000)
kimmmo2 0.600000 0.010000 0.610000 ( 0.479000)
sawa 0.960000 0.040000 1.000000 ( 0.533000)
hp4k 2.050000 0.230000 2.280000 ( 1.234000)
brian 17.700000 0.170000 17.870000 ( 14.867000)
MRI 2.0
user system total real
kimmmo 0.900000 0.000000 0.900000 ( 0.908601)
kimmmo2 0.410000 0.000000 0.410000 ( 0.406443)
sawa 0.570000 0.000000 0.570000 ( 0.568935)
hp4k 4.940000 0.000000 4.940000 ( 4.945404)
brian 25.860000 0.010000 25.870000 ( 25.870011)
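You can do it this way:
(0..9).to_a.sample(2).join + ('a'..'z').to_a.sample + " " + ('a'..'z').to_a.sample + (0..9).to_a.sample(2).join
Note that sample(2) picks without replacement, so the two digits in each pair are always different, and this version only produces lowercase letters.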
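Whichever approach you pick, it is worth checking the output against the required pattern. A minimal sketch, assuming letters may be upper or lower case as in the earlier answers; the helper names random_digit, random_letter, and random_code are made up for this example.

```ruby
LETTERS = (('A'..'Z').to_a + ('a'..'z').to_a).freeze

# Hypothetical helpers, named only for this example.
def random_digit
  rand(10)
end

def random_letter
  LETTERS.sample
end

# number-number-letter-SPACE-letter-number-number
def random_code
  "#{random_digit}#{random_digit}#{random_letter} #{random_letter}#{random_digit}#{random_digit}"
end

PATTERN = /\A\d\d[A-Za-z] [A-Za-z]\d\d\z/

samples = Array.new(1_000) { random_code }
puts samples.first
puts samples.all? { |code| code.match?(PATTERN) }
```

Because each digit is drawn independently with rand(10), repeated digits such as "11a b22" are possible here.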

Sorting an array of strings in Ruby

I have learned two array sorting methods in Ruby:
array = ["one", "two", "three"]
array.sort.reverse!
or:
array = ["one", "two", "three"]
array.sort { |x,y| y<=>x }
And I am not able to differentiate between the two. Which method is better and how exactly are they different in execution?
Both lines do the same thing (create a new array, which is reverse sorted). The main argument is about readability and performance. array.sort.reverse! is more readable than array.sort{|x,y| y<=>x} - I think we can agree here.
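A quick check of that equivalence on the question's array (the result variable names are illustrative):

```ruby
array = ["one", "two", "three"]

# sort returns a new array, so reverse! mutates that copy, not array itself.
via_reverse = array.sort.reverse!

# The block swaps the operands of <=>, which yields descending order directly.
via_block = array.sort { |x, y| y <=> x }

p via_reverse
p via_block
p array
```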
For the performance part, I created a quick benchmark script, which gives the following on my system (ruby 1.9.3p392 [x86_64-linux]):
user system total real
array.sort.reverse 1.330000 0.000000 1.330000 ( 1.334667)
array.sort.reverse! 1.200000 0.000000 1.200000 ( 1.198232)
array.sort!.reverse! 1.200000 0.000000 1.200000 ( 1.199296)
array.sort{|x,y| y<=>x} 5.220000 0.000000 5.220000 ( 5.239487)
Run times are pretty constant for multiple executions of the benchmark script.
array.sort.reverse (with or without !) is way faster than array.sort{|x,y| y<=>x}. Thus, I recommend that.
Here is the script for reference:
#!/usr/bin/env ruby
require 'benchmark'
Benchmark.bm do|b|
  master = (1..1_000_000).map(&:to_s).shuffle
  a = master.dup
  b.report("array.sort.reverse ") do
    a.sort.reverse
  end
  a = master.dup
  b.report("array.sort.reverse! ") do
    a.sort.reverse!
  end
  a = master.dup
  b.report("array.sort!.reverse! ") do
    a.sort!.reverse!
  end
  a = master.dup
  b.report("array.sort{|x,y| y<=>x} ") do
    a.sort{|x,y| y<=>x}
  end
end
There really is no difference here. Both methods return a new array.
For the purposes of this example, simpler is better. I would recommend array.sort.reverse because it is much more readable than the alternative. Passing blocks to methods like sort should be saved for arrays of more complex data structures and user-defined classes.
Edit: While destructive methods (anything ending in a !) are good for performance games, it was pointed out that they aren't required to return an updated array, or anything at all for that matter. Array#sort! and Array#reverse! always return the receiver, so array.sort.reverse! is safe, but methods such as uniq!, compact!, and flatten! return nil when they make no changes. If you want to be defensive when using a destructive method on a newly generated array, call it on a separate line instead of chaining it in a one-liner.
Example:
array = array.sort
array.reverse!
should be preferred to
array = array.sort.reverse!
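To see which in-place methods are risky to chain, here is a small sketch. In current Ruby, Array#reverse! returns the array itself, while Array#uniq! and Array#compact! return nil when they change nothing; the variable names are illustrative.

```ruby
array = %w[one two three]

# reverse! returns the (mutated) receiver, so chaining it after sort is safe.
reversed = array.sort.reverse!

# uniq! and compact! return nil when there was nothing to remove.
uniq_result = array.dup.uniq!
compact_result = array.dup.compact!

# Chaining onto such a nil result fails with NoMethodError.
chained_error = begin
  array.dup.uniq!.reverse!
rescue NoMethodError => e
  e.class
end

p reversed
p uniq_result
p compact_result
p chained_error
```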
Reverse! is Faster
There's often no substitute for benchmarking. While it probably makes no difference in shorter scripts, the #reverse! method is significantly faster than sorting using the "spaceship" operator. For example, on MRI Ruby 2.0, and given the following benchmark code:
require 'benchmark'
array = ["one", "two", "three"]
loops = 1_000_000
Benchmark.bmbm do |bm|
  bm.report('reverse!') { loops.times {array.sort.reverse!} }
  bm.report('spaceship') { loops.times {array.sort {|x,y| y<=>x} }}
end
the system reports that #reverse! is almost twice as fast as using the combined comparison operator.
user system total real
reverse! 0.340000 0.000000 0.340000 ( 0.344198)
spaceship 0.590000 0.010000 0.600000 ( 0.595747)
My advice: use whichever is more semantically meaningful in a given context, unless you're running in a tight loop.
With comparison as simple as your example, there is not much difference, but as the formula for comparison gets complicated, it is better to avoid using <=> with a block because the block you pass will be evaluated for each element of the array, causing redundancy. Consider this:
array.sort{|x, y| some_expensive_method(x) <=> some_expensive_method(y)}
In this case, some_expensive_method is evaluated twice for every comparison that sort performs, so each element is processed many times over.
In your particular case, use of a block with <=> can be avoided with reverse.
array.sort_by{|x| some_expensive_method(x)}.reverse
sort_by computes the key only once per element; this technique is known as the Schwartzian transform.
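The difference in the number of key computations can be counted directly. In this sketch the lambda named expensive is a hypothetical stand-in for some_expensive_method, and the word list is made up.

```ruby
calls = 0

# Hypothetical stand-in for some_expensive_method; it counts its invocations.
expensive = lambda do |word|
  calls += 1
  word.length
end

words = %w[pear fig banana kiwi apple cherry plum]

# sort with a block: the key is recomputed for both operands of every comparison.
calls = 0
by_block = words.sort { |x, y| expensive.call(x) <=> expensive.call(y) }
block_calls = calls

# sort_by: the key is computed once per element and then reused.
calls = 0
by_key = words.sort_by { |x| expensive.call(x) }
key_calls = calls

puts "sort block calls: #{block_calls}"
puts "sort_by calls:    #{key_calls}"
```

The exact count for the block version depends on the sort implementation, but it is always two calls per comparison, whereas sort_by makes one call per element.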
In playing with tessi's benchmarks on my machine, I've gotten some interesting results. I'm running ruby 2.0.0p195 [x86_64-darwin12.3.0], i.e., latest release of Ruby 2 on an OS X system. I used bmbm rather than bm from the Benchmark module. My timings are:
Rehearsal -------------------------------------------------------------
array.sort.reverse: 1.010000 0.000000 1.010000 ( 1.020397)
array.sort.reverse!: 0.810000 0.000000 0.810000 ( 0.808368)
array.sort!.reverse!: 0.800000 0.010000 0.810000 ( 0.809666)
array.sort{|x,y| y<=>x}: 0.300000 0.000000 0.300000 ( 0.291002)
array.sort!{|x,y| y<=>x}: 0.100000 0.000000 0.100000 ( 0.105345)
---------------------------------------------------- total: 3.030000sec
user system total real
array.sort.reverse: 0.210000 0.000000 0.210000 ( 0.208378)
array.sort.reverse!: 0.030000 0.000000 0.030000 ( 0.027746)
array.sort!.reverse!: 0.020000 0.000000 0.020000 ( 0.020082)
array.sort{|x,y| y<=>x}: 0.110000 0.000000 0.110000 ( 0.107065)
array.sort!{|x,y| y<=>x}: 0.110000 0.000000 0.110000 ( 0.105359)
First, note that in the Rehearsal phase that sort! using a comparison block comes in as the clear winner. Matz must have tuned the heck out of it in Ruby 2!
The other thing that I found exceedingly weird was how much improvement array.sort.reverse! and array.sort!.reverse! exhibited in the production pass. It was so extreme it made me wonder whether I had somehow screwed up and passed these already sorted data, so I added explicit checks for sorted or reverse-sorted data prior to performing each benchmark.
My variant of tessi's script follows:
#!/usr/bin/env ruby
require 'benchmark'
class Array
  def sorted?
    (1...length).each {|i| return false if self[i] < self[i-1] }
    true
  end
  def reversed?
    (1...length).each {|i| return false if self[i] > self[i-1] }
    true
  end
end
master = (1..1_000_000).map(&:to_s).shuffle
Benchmark.bmbm(25) do|b|
  # Caution: every report block closes over the same variable a, and bmbm runs
  # the blocks only after all of them have been registered. Each report therefore
  # sees the last master.dup, which sort! mutates in place during the rehearsal,
  # and the checks below run at registration time, before any sorting happens.
  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort.reverse:") { a.sort.reverse }
  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort.reverse!:") { a.sort.reverse! }
  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort!.reverse!:") { a.sort!.reverse! }
  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort{|x,y| y<=>x}:") { a.sort{|x,y| y<=>x} }
  a = master.dup
  puts "uh-oh!" if a.sorted?
  puts "oh-uh!" if a.reversed?
  b.report("array.sort!{|x,y| y<=>x}:") { a.sort!{|x,y| y<=>x} }
end

Count total characters in an Array of Strings in Ruby?

How would I count the total number of characters in an array of strings in Ruby? Assume I have the following:
array = ['peter' , 'romeo' , 'bananas', 'pijamas']
I'm trying:
array.each do |counting|
puts counting.count "array[]"
end
but, I'm not getting the desired result. It appears I am counting something other than the characters.
I searched for the count property but I haven't had any luck or found a good source of info. Basically, I'd like to get an output of the total number of characters inside the array.
Wing's Answer will work, but just for fun here are a few alternatives
['peter' , 'romeo' , 'bananas', 'pijamas'].inject(0) {|c, w| c + w.length }
or
['peter' , 'romeo' , 'bananas', 'pijamas'].join.length
The real issue is that string.count is not the method you're looking for. (Docs)
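To illustrate what went wrong in the question: String#count treats its argument as a set of characters and counts how many characters of the string belong to that set, so "array[]" means "count the characters a, r, y, [ and ]". A short sketch on the question's array (Array#sum assumes Ruby 2.4 or later):

```ruby
array = ['peter', 'romeo', 'bananas', 'pijamas']

# Counts only the characters a, r, y, [ and ] in each string.
set_counts = array.map { |s| s.count('array[]') }

# The total number of characters is the sum of the string lengths.
total = array.sum(&:length)

p set_counts
p total
```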
Or...
a.map(&:size).reduce(:+) # from Andrew: reduce(0, :+)
Another alternative:
['peter' , 'romeo' , 'bananas', 'pijamas'].join('').size
An interesting result :)
>> require 'benchmark'
>> array = []
>> 1_000_000.times { array << 'foo' }
>> Benchmark.bmbm do |x|
>> x.report('mapreduce') { array.map(&:size).reduce(:+) }
>> x.report('mapsum') { array.map(&:size).sum }
>> x.report('inject') { array.inject(0) { |c, w| c += w.length } }
>> x.report('joinsize') { array.join('').size }
>> x.report('joinsize2') { array.join.size }
>> end
Rehearsal ---------------------------------------------
mapreduce 0.220000 0.000000 0.220000 ( 0.222946)
mapsum 0.210000 0.000000 0.210000 ( 0.210070)
inject 0.150000 0.000000 0.150000 ( 0.158709)
joinsize 0.120000 0.000000 0.120000 ( 0.116889)
joinsize2 0.070000 0.000000 0.070000 ( 0.071718)
------------------------------------ total: 0.770000sec
user system total real
mapreduce 0.220000 0.000000 0.220000 ( 0.228385)
mapsum 0.210000 0.000000 0.210000 ( 0.207359)
inject 0.160000 0.000000 0.160000 ( 0.156711)
joinsize 0.120000 0.000000 0.120000 ( 0.116652)
joinsize2 0.080000 0.000000 0.080000 ( 0.069612)
so it looks like array.join.size has the lowest runtime
a = ['peter' , 'romeo' , 'bananas', 'pijamas']
count = 0
a.each {|s| count += s.length}
puts count

#inject and slowness

I've often heard Ruby's inject method criticized as being "slow." As I rather like the function, and see equivalents in other languages, I'm curious if it's merely Ruby's implementation of the method that's slow, or if it is inherently a slow way to do things (e.g. should be avoided for non-small collections)?
inject is like fold, and can be very efficient in other languages, fold_left specifically, since it's tail-recursive.
It's mostly an implementation issue, but this gives you a good idea of the comparison:
$ ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
$ ruby exp/each_v_inject.rb
Rehearsal -----------------------------------------------------
loop 0.000000 0.000000 0.000000 ( 0.000178)
fixnums each 0.790000 0.280000 1.070000 ( 1.078589)
fixnums each add 1.010000 0.290000 1.300000 ( 1.297733)
Enumerable#inject 1.900000 0.430000 2.330000 ( 2.330083)
-------------------------------------------- total: 4.700000sec
user system total real
loop 0.000000 0.000000 0.000000 ( 0.000178)
fixnums each 0.760000 0.300000 1.060000 ( 1.079252)
fixnums each add 1.030000 0.280000 1.310000 ( 1.305888)
Enumerable#inject 1.850000 0.490000 2.340000 ( 2.340341)
exp/each_v_inject.rb
require 'benchmark'
total = (ENV['TOTAL'] || 1_000).to_i
fixnums = Array.new(total) {|x| x}
Benchmark.bmbm do |x|
  # Baseline: the cost of the outer loop alone
  x.report("loop") do
    total.times { }
  end
  # Inner block parameters are named n so they do not shadow the reporter x
  x.report("fixnums each") do
    total.times do |i|
      fixnums.each {|n| n}
    end
  end
  x.report("fixnums each add") do
    total.times do |i|
      v = 0
      fixnums.each {|n| v += n}
    end
  end
  x.report("Enumerable#inject") do
    total.times do |i|
      fixnums.inject(0) {|a,n| a + n }
    end
  end
end
So yes it is slow, but as improvements occur in the implementation it should become a non-issue. There is nothing inherent about WHAT it is doing that requires it to be slower.
each_with_object may be faster than inject, if you're mutating an existing object rather than creating a new object in each block.
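As a rough illustration of that last point, here are the two shapes side by side on a made-up word list. The inject version builds a fresh Hash at every step, while each_with_object keeps mutating the same one; this sketch shows the structure only and makes no timing claim.

```ruby
# Illustrative data only.
words = %w[apple fig banana fig apple apple]

# inject with a non-mutating block: merge returns a new Hash on every iteration.
counts_inject = words.inject({}) do |acc, word|
  acc.merge(word => acc.fetch(word, 0) + 1)
end

# each_with_object: the same Hash is passed to every iteration and updated in place.
counts_each = words.each_with_object(Hash.new(0)) do |word, acc|
  acc[word] += 1
end

p counts_inject
p counts_each
```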
