Should I memoize Hash#dig method? - ruby

My hypothesis is that memoizing Hash#dig is a waste of memory, since Hash#dig should return its value in constant time. (Hash#dig is a recursive C function.)
So I ran this small experiment with benchmark-ips (edited to fix a mistake):
# frozen_string_literal: true
require 'benchmark/ips'
class DigMemoTest
  def initialize
    @payload = {
      record: {
        id: '1',
        status: 'pending'
      },
      event: {
        at: Time.now
      }
    }
  end

  def run
    Benchmark.ips do |x|
      x.report('Hash#dig_with_memo') { dig_with_memoize }
      x.report('Hash#dig_no_memo') { dig_without_memoize }
      x.compare!
    end
  end

  def dig_without_memoize
    @payload.dig(:record, :status)
  end

  def dig_with_memoize
    return @status if defined?(@status)
    @status = @payload.dig(:record, :status)
  end
end
And I get this result:
irb(main):001:0> dig_test = DigMemoTest.new
irb(main):002:0> dig_test.run
Warming up --------------------------------------
Hash#dig_with_memo 323.224k i/100ms
Hash#dig_no_memo 303.926k i/100ms
Calculating -------------------------------------
Hash#dig_with_memo 12.993M (± 4.4%) i/s - 64.968M in 5.010944s
Hash#dig_no_memo 9.630M (± 3.3%) i/s - 48.324M in 5.024331s
Comparison:
Hash#dig_with_memo: 12992823.6 i/s
Hash#dig_no_memo: 9630387.4 i/s - 1.35x slower
Why does memoization improve performance in this case? Does it mean we should memoize Hash#dig?
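One detail of the memoized variant is worth spelling out: it checks defined?(@status) instead of using ||=. The following is a minimal sketch of the difference, assuming a payload whose looked-up value can be nil. The class and method names are hypothetical and are not part of the benchmark above.

```ruby
# Hypothetical class for illustration; counts how often dig actually runs.
class DigMemoSketch
  attr_reader :dig_calls

  def initialize(payload)
    @payload = payload
    @dig_calls = 0
  end

  # ||= re-runs the lookup whenever the cached value is nil or false.
  def status_with_or_equals
    @status_a ||= lookup
  end

  # defined? caches any value, including nil, after the first lookup.
  def status_with_defined
    return @status_b if defined?(@status_b)
    @status_b = lookup
  end

  private

  def lookup
    @dig_calls += 1
    @payload.dig(:record, :status)
  end
end

sketch = DigMemoSketch.new({ record: { id: '1' } }) # no :status key
3.times { sketch.status_with_or_equals }
or_equals_calls = sketch.dig_calls                  # lookup ran on every call
3.times { sketch.status_with_defined }
defined_calls = sketch.dig_calls - or_equals_calls  # lookup ran only once
```

Either way, the memoized path replaces a method call into C with an instance-variable read, so the cached value is only as fresh as the first lookup.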

Related

Implementing &:method with custom methods

I've already read this question but I'm having trouble implementing the concepts there.
I'm doing an exercise from exercism.io that has provided tests. The aim of the exercise is to implement an accumulate method that returns the squares of the numbers passed to it. We need to do this without using map/inject.
That was no problem but one of the tests is as follows:
def test_accumulate_upcases
result = %w(hello world).accumulate(&:upcase)
assert_equal %w(HELLO WORLD), result
end
I have the following class
class Array
def accumulate
squares = []
self.each { |x| squares << x**2 unless x.is_a? String }
squares
end
def upcase
upcase = []
self.each { |word| word.upcase }
upcase
end
end
But I don't fully understand the concept being tested. How do I get accumulate to call methods that are passed to it as arguments?
It seems you are expected to extend the Array class with a new method, accumulate, which accumulates the results of invoking a given proc on each element of the array.
One implementation can be like this:
class Array
def accumulate(&block)
self.collect { |i| block.yield(i) }
end
end
p result = %w(hello world).accumulate(&:upcase) # Prints ["HELLO", "WORLD"]
p result = %w(hello world).accumulate { |b| b.upcase } # Prints ["HELLO", "WORLD"]
Please note that %w(HELLO WORLD) is the same as the array ["HELLO", "WORLD"].
There is a very good explanation of the & operator in this article; please read the section on "The Unary &".
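As a rough illustration of what the unary & does with a symbol: Ruby calls Symbol#to_proc on it and passes the resulting Proc as the method's block. The sketch below uses a hypothetical standalone method, accumulate_sketch, rather than reopening Array.

```ruby
# Hypothetical helper for illustration; takes the array explicitly
# instead of monkey-patching Array.
def accumulate_sketch(items)
  result = []
  items.each { |item| result << yield(item) }
  result
end

upcase_proc = :upcase.to_proc # behaves roughly like proc { |s| s.upcase }
upcase_proc.call('hello')     # => "HELLO"

# These three calls hand equivalent blocks to the method:
accumulate_sketch(%w[hello world], &:upcase)
accumulate_sketch(%w[hello world], &upcase_proc)
accumulate_sketch(%w[hello world]) { |s| s.upcase }
```

Because the symbol arrives as a block, the method body only needs yield; it never sees the symbol itself.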
Alternatively, accumulate can accept the name of the method as a Symbol argument. (Note that with &:upcase the symbol arrives as a block via Symbol#to_proc, so the block branch handles the test above.)
class Array
  def accumulate(meth = nil)
    result = []
    case
    when !meth.nil?
      each { |e| result << e.public_send(meth) }
    when block_given?
      each { |e| result << yield(e) }
    else
      fail 'This method expects either block or method name'
    end
    result
  end
end

When to use method { } vs method(&block)?

When we want to pass blocks to a method, when do we want to do:
block = Proc.new { puts 'test blocks & procs' }
def method(&block)
yield
end
VS
def method
yield
end
method { puts 'test blocks & procs' }
Is there any particular circumstance we would want to use one or the other?
Using Procs lets you not repeat yourself. Compare this:
arr1 = [1,2,3]
arr2 = [4,5,6]
Using blocks, you repeat the block twice:
arr1.map { |n| n * 2 }
arr2.map { |n| n * 2 }
When using Procs, you can reuse the object:
multiply_2 = Proc.new do |n|
n * 2
end
arr1.map(&multiply_2)
arr2.map(&multiply_2)
1) A block is not an object, and therefore a block cannot be captured in a variable, and a block cannot be passed explicitly to a method. However, you can convert a block to a Proc instance using the & operator, and a Proc instance is an object which can be assigned to a variable and passed to a method.
2) Proc.new() doesn't return a block--it returns a Proc instance. So naming your variable block is misleading.
3) yield only calls a block, which is the thing specified after a method call:
do_stuff(10) {puts 'hello'} #<-- block
do_stuff(10) do |x| #<--'do' marks the start of a block
puts x + 2
end #<--end of block
block = Proc.new {puts 'hello'}
^
|
+--- #not a block
yield does not call a Proc instance that happens to be passed as an argument to a method:
def do_stuff(arg)
yield
end
p = Proc.new { puts 'test blocks & procs' }
do_stuff(p)
--output:--
1.rb:2:in `do_stuff': no block given (yield) (LocalJumpError)
from 1.rb:6:in `<main>'
Compare to:
def do_stuff(arg)
puts arg
yield
end
do_stuff(10) {puts "I'm a block"}
--output:--
10
I'm a block
4) You can convert a block to a Proc instance and capture it in a variable using &, like this:
def do_stuff(arg, &p)
puts arg
p.call
end
do_stuff(10) {puts "I'm a block"}
--output:--
10
I'm a block
Liar! You are not a block anymore! Typically, the variable name used to capture a block is written as &block:
def do_stuff(arg, &block)
puts arg
block.call
end
But that's technically incorrect; the block variable will contain a Proc instance, as you can see here:
def do_stuff(arg, &block)
puts arg
puts block.class
end
do_stuff(10) {puts "I'm a block"}
--output:--
10
Proc
5) You can also use the & operator to convert a Proc instance to a block, as Nobita's answer demonstrated:
def do_stuff(x, y)
yield(x, y)
end
p = Proc.new {|x, y| puts x+y}
do_stuff(10, 20, &p)
--output:--
30
In that example, the method call do_stuff(10, 20, &p) is equivalent to writing:
do_stuff(10, 20) {|x, y| puts x+y}
6) When do you want to use a block and yield v. &block and call?
One use case for capturing a block in a variable is so that you can pass it to another method:
def do_stuff(arg, &a_proc)
result = arg * 2
do_other_stuff(result, a_proc)
end
def do_other_stuff(x, p)
1.upto(x) do |i|
p[i] #Proc#[] is a synonym for Proc#call
end
end
do_stuff(2) {|x| puts x}
--output:--
1
2
3
4
I suggest you operate by these two rules:
1. When you write a method that expects a block, ALWAYS execute the block with yield.
2. If #1 doesn't work for you, consider capturing the block and using call (or []).
Block seems to be a bit faster based on the following benchmark:
require 'benchmark/ips'
prc = Proc.new { '' }
def _proc(&block)
yield
end
def _block
yield
end
Benchmark.ips do |x|
x.report('Block') { _block { '' } }
x.report('Proc') { _proc(&prc) }
end
Benchmark results on an i7-4510U CPU @ 2.00GHz
Block 149.700k i/100ms
Proc 144.151k i/100ms
Block 4.786M (± 1.6%) i/s - 23.952M
Proc 4.269M (± 2.3%) i/s - 21.334M

Ruby attr_accessor vs. getter/setter benchmark: why is accessor faster?

I just tested attr_accessor against equivalent getter/setter-methods:
class A
  # we define two R/W attributes with accessors
  attr_accessor :acc, :bcc
  # we define two attributes with getter/setter methods
  def dirA=(d); @dirA = d; end
  def dirA; @dirA; end
  def dirB=(d); @dirB = d; end
  def dirB; @dirB; end
end
varA = A.new
startT = 0
dirT = 0
accT = 0
# now we do 100 times the same benchmarking
# where we do the same assignment operation
# 50000 times
100.times do
startT = Time.now.to_f
50000.times do |i|
varA.dirA = i
varA.dirB = varA.dirA
end
dirT += (Time.now.to_f - startT)
startT = Time.now.to_f
50000.times do |i|
varA.acc = i
varA.bcc = varA.acc
end
accT += (Time.now.to_f - startT)
end
puts "direct: %10.4fs" % (dirT/100)
puts "accessor: %10.4fs" % (accT/100)
Program output is:
direct: 0.2640s
accessor: 0.1927s
So attr_accessor is significantly faster. Could someone please explain why this is so?
Without deeply understanding the differences, I can at least say that attr_accessor (and attr_reader and attr_writer) are implemented in C, as you can see by toggling the source on that page. Your methods will be implemented in Ruby, and Ruby methods have more call overhead than native C functions.
Here's an article explaining why Ruby method dispatch tends to be slow.

How to create simple timer method in Ruby?

How to create a simple timer method in Ruby that takes an arbitrary function call?
For example:
time rspec or time get_primes(54784637)
It should still return the result of the function that is passed in (get_primes).
The following doesn't quite work:
def time block
t = Time.now
result = eval(block)
puts "\nCompleted in #{(Time.now - t).format} seconds"
result
end
Ruby 1.9.3
It's more Rubyesque to use a block:
def time
t = Time.now
result = yield
puts "\nCompleted in #{Time.now - t} seconds"
result
end
time { rspec }
This functionality is already available in Ruby's standard library, in the Benchmark module:
require 'benchmark'
time = Benchmark.realtime { 10_000.times { print "a" } }
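Benchmark.realtime returns the elapsed seconds rather than the block's value, so a wrapper is still needed if the result has to be passed through. A minimal sketch, assuming a hypothetical helper named timed:

```ruby
require 'benchmark'

# Hypothetical helper name; prints the elapsed time and returns
# whatever the block returned.
def timed
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  puts "\nCompleted in #{elapsed.round(4)} seconds"
  result
end

total = timed { (1..1_000).sum }
```

Here total holds the block's return value, while the timing goes to standard output.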
Try it like this:
def time(&block)
t = Time.now
result = block.call
puts "\nCompleted in #{(Time.now - t)} seconds"
result
end

Ruby: Proc#call vs yield

What are the behavioural differences between the following two implementations in Ruby of the thrice method?
module WithYield
def self.thrice
3.times { yield } # yield to the implicit block argument
end
end
module WithProcCall
def self.thrice(&block) # & converts implicit block to an explicit, named Proc
3.times { block.call } # invoke Proc#call
end
end
WithYield::thrice { puts "Hello world" }
WithProcCall::thrice { puts "Hello world" }
By "behavioural differences" I include error handling, performance, tool support, etc.
I think the first one is actually syntactic sugar for the other. In other words, there is no behavioural difference.
What the second form does allow, though, is to "save" the block in a variable. The block can then be called at some later point in time, as a callback.
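A minimal sketch of that callback use, assuming a hypothetical ButtonSketch class: the block is captured as a Proc when the handler is registered and invoked only later.

```ruby
# Hypothetical class for illustration: the block is captured as a Proc
# at registration time and called later.
class ButtonSketch
  def on_click(&handler)
    @handler = handler
    self
  end

  # Calls the stored handler, if one was registered.
  def click
    @handler&.call
  end
end

log = []
button = ButtonSketch.new.on_click { log << 'clicked' }
button.click
button.click
```

This is something yield cannot do, because yield can only run the block while the method that received it is still executing.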
Ok. This time I went and did a quick benchmark:
require 'benchmark'
class A
def test
10.times do
yield
end
end
end
class B
def test(&block)
10.times do
block.call
end
end
end
Benchmark.bm do |b|
b.report do
a = A.new
10000.times do
a.test{ 1 + 1 }
end
end
b.report do
a = B.new
10000.times do
a.test{ 1 + 1 }
end
end
b.report do
a = A.new
100000.times do
a.test{ 1 + 1 }
end
end
b.report do
a = B.new
100000.times do
a.test{ 1 + 1 }
end
end
end
The results are interesting:
user system total real
0.090000 0.040000 0.130000 ( 0.141529)
0.180000 0.060000 0.240000 ( 0.234289)
0.950000 0.370000 1.320000 ( 1.359902)
1.810000 0.570000 2.380000 ( 2.430991)
This shows that using block.call is almost 2x slower than using yield.
The other answers are pretty thorough and Closures in Ruby extensively covers the functional differences. I was curious about which method would perform best for methods that optionally accept a block, so I wrote some benchmarks (going off this Paul Mucur post). I compared three methods:
&block in method signature
Using &Proc.new
Wrapping yield in another block
Here is the code:
require "benchmark"
def always_yield
yield
end
def sometimes_block(flag, &block)
if flag && block
always_yield &block
end
end
def sometimes_proc_new(flag)
if flag && block_given?
always_yield &Proc.new
end
end
def sometimes_yield(flag)
if flag && block_given?
always_yield { yield }
end
end
a = b = c = 0
n = 1_000_000
Benchmark.bmbm do |x|
x.report("no &block") do
n.times do
sometimes_block(false) { "won't get used" }
end
end
x.report("no Proc.new") do
n.times do
sometimes_proc_new(false) { "won't get used" }
end
end
x.report("no yield") do
n.times do
sometimes_yield(false) { "won't get used" }
end
end
x.report("&block") do
n.times do
sometimes_block(true) { a += 1 }
end
end
x.report("Proc.new") do
n.times do
sometimes_proc_new(true) { b += 1 }
end
end
x.report("yield") do
n.times do
sometimes_yield(true) { c += 1 }
end
end
end
Performance was similar between Ruby 2.0.0p247 and 1.9.3p392. Here are the results for 1.9.3:
user system total real
no &block 0.580000 0.030000 0.610000 ( 0.609523)
no Proc.new 0.080000 0.000000 0.080000 ( 0.076817)
no yield 0.070000 0.000000 0.070000 ( 0.077191)
&block 0.660000 0.030000 0.690000 ( 0.689446)
Proc.new 0.820000 0.030000 0.850000 ( 0.849887)
yield 0.250000 0.000000 0.250000 ( 0.249116)
Adding an explicit &block param when it's not always used really does slow down the method. If the block is optional, do not add it to the method signature. And, for passing blocks around, wrapping yield in another block is fastest.
That said, these are the results for a million iterations, so don't worry about it too much. If one method makes your code clearer at the expense of a millionth of a second, use it anyway.
They give different error messages if you forget to pass a block:
> WithYield::thrice
LocalJumpError: no block given
from (irb):3:in `thrice'
from (irb):3:in `times'
from (irb):3:in `thrice'
> WithProcCall::thrice
NoMethodError: undefined method `call' for nil:NilClass
from (irb):9:in `thrice'
from (irb):9:in `times'
from (irb):9:in `thrice'
But they behave the same if you try to pass a "normal" (non-block) argument:
> WithYield::thrice(42)
ArgumentError: wrong number of arguments (1 for 0)
from (irb):19:in `thrice'
> WithProcCall::thrice(42)
ArgumentError: wrong number of arguments (1 for 0)
from (irb):20:in `thrice'
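The difference in error classes can be checked directly. The sketch below uses hypothetical top-level helpers that mirror the two thrice variants, plus a guarded version that fails with an explicit message:

```ruby
# Hypothetical helpers mirroring the two thrice variants above.
def thrice_with_yield
  3.times { yield }
end

def thrice_with_call(&block)
  3.times { block.call }
end

# Returns the class of the exception raised by the given block, or nil.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

error_class { thrice_with_yield } # LocalJumpError
error_class { thrice_with_call }  # NoMethodError

# A block_given? guard gives callers one clear error up front.
def thrice_guarded
  raise ArgumentError, 'block required' unless block_given?
  3.times { yield }
end
```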
I found that the results are different depending on whether you force Ruby to construct the block or not (e.g. a pre-existing proc).
require 'benchmark/ips'
puts "Ruby #{RUBY_VERSION} at #{Time.now}"
puts
firstname = 'soundarapandian'
middlename = 'rathinasamy'
lastname = 'arumugam'
def do_call(&block)
block.call
end
def do_yield(&block)
yield
end
def do_yield_without_block
yield
end
existing_block = proc{}
Benchmark.ips do |x|
x.report("block.call") do |i|
buffer = String.new
while (i -= 1) > 0
do_call(&existing_block)
end
end
x.report("yield with block") do |i|
buffer = String.new
while (i -= 1) > 0
do_yield(&existing_block)
end
end
x.report("yield") do |i|
buffer = String.new
while (i -= 1) > 0
do_yield_without_block(&existing_block)
end
end
x.compare!
end
Gives the results:
Ruby 2.3.1 at 2016-11-15 23:55:38 +1300
Warming up --------------------------------------
block.call 266.502k i/100ms
yield with block 269.487k i/100ms
yield 262.597k i/100ms
Calculating -------------------------------------
block.call 8.271M (± 5.4%) i/s - 41.308M in 5.009898s
yield with block 11.754M (± 4.8%) i/s - 58.748M in 5.011017s
yield 16.206M (± 5.6%) i/s - 80.880M in 5.008679s
Comparison:
yield: 16206091.2 i/s
yield with block: 11753521.0 i/s - 1.38x slower
block.call: 8271283.9 i/s - 1.96x slower
If you change do_call(&existing_block) to do_call{} you'll find it's about 5x slower in both cases. I think the reason for this should be obvious (because Ruby is forced to construct a Proc for each invocation).
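One way to observe that allocation, sketched with a hypothetical capture helper: a Proc passed with & arrives as the same object, while a literal block is wrapped in a new Proc on every call.

```ruby
# Hypothetical helper: returns the Proc that & captured.
def capture(&block)
  block
end

existing = proc {}

# Passing an existing Proc with & hands over that same object.
capture(&existing).equal?(existing) # => true

# A literal block has to be wrapped in a fresh Proc on each call.
first, second = Array.new(2) { capture {} }
first.equal?(second) # => false
```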
BTW, just to update this to current day using:
ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]
On Intel i7 (1.5 years oldish).
user system total real
0.010000 0.000000 0.010000 ( 0.015555)
0.030000 0.000000 0.030000 ( 0.024416)
0.120000 0.000000 0.120000 ( 0.121450)
0.240000 0.000000 0.240000 ( 0.239760)
Still 2x slower. Interesting.

Resources