Is there a better way to slice a word in Ruby?
My solution:
str="TAMILNADU"
pairs=[]
numPairs= str.length-1
i=0
while i<numPairs do
pairs[i] = str.slice(i,2)
i+=1
end
Result
["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
Yes there is:
irb(main):001:0> "TAMILNADU".chars.each_cons(2).map(&:join)
=> ["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
See the documentation of Enumerable#each_cons.
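Since each_cons takes the window size as its argument, the same idea gives overlapping substrings of any length. A minimal sketch; the helper name ngrams is made up for illustration and is not part of Ruby:

```ruby
# Hypothetical helper: overlapping substrings of length n.
def ngrams(str, n)
  str.chars.each_cons(n).map(&:join)
end

pairs   = ngrams("TAMILNADU", 2)
triples = ngrams("TAMILNADU", 3)
p pairs    # ["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
p triples  # ["TAM", "AMI", "MIL", "ILN", "LNA", "NAD", "ADU"]
```

A string shorter than n simply yields an empty array.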
And another one using a positive lookahead:
str = 'TAMILNADU'
str.scan(/(.)(?=(.))/).map(&:join)
#=> ["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
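To see why this works: the lookahead (?=(.)) captures the next character without consuming it, so scan advances one character per match and returns two captures each time. A short sketch of the intermediate result:

```ruby
str = 'TAMILNADU'

# Each match yields [current_char, next_char]; the lookahead does not
# consume next_char, so it becomes the current_char of the next match.
captures = str.scan(/(.)(?=(.))/)
p captures.first(3)  # [["T", "A"], ["A", "M"], ["M", "I"]]

pairs = captures.map(&:join)
p pairs.length       # 8
```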
Another version:
str = "TAMILNADU"
(0...str.size-1).map{|x| str[x,2]}
# => ["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
Thanks (@Doorknob, @hirolau, @Stefan, @Cary Swoveland) for all of your responses. Find the benchmark report below.
Updated (Tue - Mar- 2015)
require 'benchmark/ips'
Benchmark.ips do |x|
str = "TAMILNADU"
x.report("1") {
pairs=[]
numPairs= str.length()-1
i=0
while i<numPairs do
pairs[i] = str.slice(i,2)
i+=1
end
}
x.report("2") {
str.chars.each_cons(2).map(&:join)
}
x.report("3") {
(0...str.size-1).map{|x| str[x,2]}
}
x.report("4"){
str.scan(/(.)(?=(.))/).map(&:join)
}
x.report("5"){
str.gsub(/(.)/,'\1\1')[1...-1].scan(/../)
}
x.report("6"){
str.gsub(/./) { |c| c*2 }[1...-1].scan(/../)
}
x.compare!
end
Calculating -------------------------------------
1 37.355k i/100ms
2 9.703k i/100ms
3 26.961k i/100ms
4 7.950k i/100ms
5 6.302k i/100ms
6 7.804k i/100ms
-------------------------------------------------
1 508.411k (±11.1%) i/s - 2.503M
2 107.568k (± 5.1%) i/s - 543.368k
3 332.923k (±10.2%) i/s - 1.672M
4 88.410k (± 4.5%) i/s - 445.200k
5 67.694k (± 4.8%) i/s - 340.308k
6 85.000k (± 5.8%) i/s - 429.220k
Comparison:
1: 508410.6 i/s
3: 332923.4 i/s - 1.53x slower
2: 107567.7 i/s - 4.73x slower
4: 88409.9 i/s - 5.75x slower
6: 85000.2 i/s - 5.98x slower
5: 67694.2 i/s - 7.51x slower
Another way using a regex:
str.gsub(/(.)/,'\1\1')[1...-1].scan(/../)
#=> ["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
or a variant:
str.gsub(/./) { |c| c*2 }[1...-1].scan(/../)
And another:
a = []
e = str.each_char
loop do
a << (e.next << e.peek)
end
a
Edit: I added the last method after the benchmarking was done, so I ran the benchmark myself. It was dead last. (I like it anyway.)
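The loop in that last method ends without an explicit condition because e.peek raises StopIteration at the final character and Kernel#loop rescues that exception. A self-contained sketch of the same idea; it uses + instead of << so the string returned by e.next is not mutated, which is a small deviation from the code above:

```ruby
str = "TAMILNADU"
a = []
e = str.each_char  # an Enumerator over the characters

loop do
  # next consumes one character; peek looks at the following one
  # without consuming it. At the last character peek raises
  # StopIteration, which Kernel#loop rescues to end the loop.
  a << (e.next + e.peek)
end

p a  # ["TA", "AM", "MI", "IL", "LN", "NA", "AD", "DU"]
```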
Suppose I have some code like:
def set_reminder(cond_one: false, cond_two: false)
if cond_two
if cond_one
outcome_a
else
outcome_b
end
else
if cond_one
outcome_c
else
outcome_d
end
end
end
How can I more elegantly write a function like this, which has 4 potential results (one for each combination of possible cond_one and cond_two values)?
I'm not satisfied with this version, using an if/else statement with another if/else in both branches. In the actual code, the outcomes are already complex expressions, so writing something like return outcome_a if cond_one && cond_two (for all 4 outcomes) would be unwieldy.
Ruby has a very powerful case expression that can be used for this sort of thing. Consider
def set_reminder(cond_one: false, cond_two: false)
case [cond_one, cond_two]
when [true, true] then outcome_a
when [true, false] then outcome_c
when [false, true] then outcome_b
when [false, false] then outcome_d
end
end
As pointed out in the comments, though, consider having your arguments convey more than just "pair of Booleans". See Boolean blindness for a good discussion on this.
You could go with something like below, flattening your nested ifs into a series of guard statements.
def set_reminder(cond_one: false, cond_two: false)
return outcome_a if cond_two && cond_one
return outcome_b if cond_two
return outcome_c if cond_one
outcome_d
end
This is neater and allows for further refactoring.
I can't recommend enough Sandi Metz et al's book on refactoring, 99 Bottles of OOP. There's a Ruby version. The entire book walks you through this kind of refactoring. From multiple ifs to extracted classes.
You could set it up in your constructor as a hash:
@choice = {[true, true] => :a, [true, false] => :c,
[false, true] => :b, [false, false] => :d}
and then your method would be:
def set_reminder(cond_one: false, cond_two: false)
@choice[[cond_one, cond_two]]
end
An alternative which avoids indexing by compound objects is to create a hash of hashes:
@h_of_h = {true => {true => :a, false => :c},
false => {true => :b, false => :d}}
def set_reminder_h2(cond_one: false, cond_two: false)
@h_of_h[cond_one][cond_two]
end
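A self-contained sketch of the lookup-table idea with the table set up in a constructor. The class name Reminder is hypothetical, and the symbols stand in for the real outcomes. Using fetch instead of [] is a choice made here so that a non-Boolean argument raises KeyError rather than silently returning nil:

```ruby
# Hypothetical class; the symbols stand in for the real outcomes.
class Reminder
  def initialize
    # Keys are [cond_one, cond_two] pairs.
    @choice = {
      [true, true]  => :a, [true, false]  => :c,
      [false, true] => :b, [false, false] => :d
    }
  end

  def set_reminder(cond_one: false, cond_two: false)
    @choice.fetch([cond_one, cond_two])
  end
end

r = Reminder.new
p r.set_reminder(cond_one: true, cond_two: false)  # :c
p r.set_reminder                                   # :d
```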
I did the following benchmark to compare the various proposed approaches:
require 'benchmark/ips'
combos = [[true, true], [true, false], [false, true], [false, false]]
def set_reminder_case(cond_one: false, cond_two: false)
case [cond_one, cond_two]
when [true, true] then :a
when [true, false] then :c
when [false, true] then :b
when [false, false] then :d
end
end
def set_reminder_guard(cond_one: false, cond_two: false)
return :a if cond_two && cond_one
return :b if cond_two
return :c if cond_one
:d
end
@choice = {[true, true] => :a, [true, false] => :c,
[false, true] => :b, [false, false] => :d}
def set_reminder_hash(cond_one: false, cond_two: false)
@choice[[cond_one, cond_two]]
end
@h_of_h = {true => {true => :a, false => :c},
false => {true => :b, false => :d}}
def set_reminder_h2(cond_one: false, cond_two: false)
@h_of_h[cond_one][cond_two]
end
N = 1_000
SEED_VALUE = 123_456_987
# The choices of true/false combos are randomized, but since the
# seed is reset they are identical for all the functions being tested.
Benchmark.ips do |b|
srand(SEED_VALUE)
b.report('case') do
N.times do
v1, v2 = combos[rand(4)]
set_reminder_case(cond_one: v1, cond_two: v2)
end
end
srand(SEED_VALUE)
b.report('hash') do
N.times do
v1, v2 = combos[rand(4)]
set_reminder_hash(cond_one: v1, cond_two: v2)
end
end
srand(SEED_VALUE)
b.report('guard') do
N.times do
v1, v2 = combos[rand(4)]
set_reminder_guard(cond_one: v1, cond_two: v2)
end
end
srand(SEED_VALUE)
b.report('hsh_of_hsh') do
N.times do
v1, v2 = combos[rand(4)]
set_reminder_h2(cond_one: v1, cond_two: v2)
end
end
b.compare!
end
The results generated using Ruby 3.2.0 on an M1 MacBook Pro are:
% ruby case_v_hash.rb
Warming up --------------------------------------
case 197.000 i/100ms
hash 239.000 i/100ms
guard 596.000 i/100ms
hsh_of_hsh 562.000 i/100ms
Calculating -------------------------------------
case 1.977k (± 1.2%) i/s - 10.047k in 5.083713s
hash 2.408k (± 0.5%) i/s - 12.189k in 5.062504s
guard 5.952k (± 0.6%) i/s - 29.800k in 5.006765s
hsh_of_hsh 5.637k (± 1.3%) i/s - 28.662k in 5.085419s
Comparison:
guard: 5952.2 i/s
hsh_of_hsh: 5637.2 i/s - 1.06x (± 0.00) slower
hash: 2407.8 i/s - 2.47x (± 0.00) slower
case: 1976.6 i/s - 3.01x (± 0.00) slower
With --yjit:
% ruby --yjit case_v_hash.rb
Warming up --------------------------------------
case 243.000 i/100ms
hash 290.000 i/100ms
guard 1.075k i/100ms
hsh_of_hsh 952.000 i/100ms
Calculating -------------------------------------
case 2.419k (± 0.7%) i/s - 12.150k in 5.022058s
hash 2.921k (± 0.8%) i/s - 14.790k in 5.062949s
guard 10.715k (± 1.4%) i/s - 53.750k in 5.017634s
hsh_of_hsh 9.430k (± 0.7%) i/s - 47.600k in 5.048054s
Comparison:
guard: 10714.6 i/s
hsh_of_hsh: 9429.9 i/s - 1.14x (± 0.00) slower
hash: 2921.4 i/s - 3.67x (± 0.00) slower
case: 2419.4 i/s - 4.43x (± 0.00) slower
It sure looks to me like user3574603's guard statement approach is the winner, with the hash of hashes a reasonably close second. Both dominate the hash lookup and case statement approaches.
I would say that you cannot improve on what you have now. Your method has the following characteristics:
it is easy to comprehend by anyone reading your code (including yourself when you return to it in future);
testing is straightforward;
the variables cond_two and cond_one are each examined just once; and
it easily accommodates future changes in logic.
The number of lines of code in your method could be reduced but I advise against doing that if it would adversely affect readability or ease of testing.
The following are more or less equivalent ways of writing your method. I would say that choosing among these is purely a stylistic choice.
Use ternaries for the inner conditionals
def set_reminder(cond_one: false, cond_two: false)
if cond_two
cond_one ? outcome_a : outcome_b
else
cond_one ? outcome_c : outcome_d
end
end
Use case statements
def set_reminder(cond_one: false, cond_two: false)
  case cond_two
  when true
    case cond_one
    when true then outcome_a
    else outcome_b
    end
  else
    case cond_one
    when true then outcome_c
    else outcome_d
    end
  end
end
Replace inner conditionals with method calls
def set_reminder(cond_one: false, cond_two: false)
if cond_two
set_reminder_cond_two_true(cond_one)
else
set_reminder_cond_two_false(cond_one)
end
end
def set_reminder_cond_two_true(cond_one)
cond_one ? outcome_a : outcome_b
end
def set_reminder_cond_two_false(cond_one)
cond_one ? outcome_c : outcome_d
end
I have a dict similar to:
{"services"=>["foo", "bar", "test"], "foo_limit_max"=>2, "test_limit_min"=>6}
I want to extract from the list of services all of those services that have a key of the form servicename_limit_*.
I tried dict['services'].select { |x| dict.key?(/#{x}_limit_*/) }, but it returns nothing.
Expected output would be ['foo', 'test']
I would go about it like this
dict['services'] & dict.keys.flat_map {|k| k.scan(/\w+(?=_limit_)/)}
Here we use Array#& (intersection) between the dict['services'] and the keys that match the given pattern.
dict.keys.flat_map {|k| k.scan(/\w+(?=_limit_)/)}
#=> ["foo", "test"]
Another option, suggested by @SergioTulentsev (and far more performant), is
dict['services'].select {|s| dict.any? {|k,_| k.start_with?(s)}}
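Note that start_with?(s) matches on the bare service name. If keys that merely begin with a service name can occur, a stricter prefix may be preferable. A sketch, where the "barista" key is added purely for illustration and is not in the original data:

```ruby
dict = {
  "services" => ["foo", "bar", "test"],
  "foo_limit_max" => 2,
  "test_limit_min" => 6,
  "barista" => 1  # illustrative key, not in the original data
}

# Matches any key that begins with the service name.
loose = dict['services'].select { |s| dict.any? { |k, _| k.start_with?(s) } }

# Requires the full "<service>_limit_" prefix.
strict = dict['services'].select do |s|
  dict.any? { |k, _| k.start_with?("#{s}_limit_") }
end

p loose   # ["foo", "bar", "test"]
p strict  # ["foo", "test"]
```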
Benchmarks provided by @SergioTulentsev
# >> Warming up --------------------------------------
# >> sergio 54.608k i/100ms
# >> aleksei 6.257k i/100ms
# >> cary 40.951k i/100ms
# >> engineersmnky 16.093k i/100ms
# >> Calculating -------------------------------------
# >> sergio 663.692k (± 3.5%) i/s - 3.331M in 5.025472s
# >> aleksei 64.006k (± 2.9%) i/s - 325.364k in 5.087717s
# >> cary 472.838k (± 3.1%) i/s - 2.375M in 5.028133s
# >> engineersmnky 167.907k (± 2.0%) i/s - 852.929k in 5.081762s
# >>
# >> Comparison:
# >> sergio: 663691.8 i/s
# >> cary: 472838.2 i/s - 1.40x slower
# >> engineersmnky: 167906.6 i/s - 3.95x slower
# >> aleksei: 64005.5 i/s - 10.37x slower
# >>
input['services'].reject do |x|
input.keys.grep(/\A#{x}_limit_/).empty?
end
#⇒ ["foo", "test"]
h = {"services"=>["foo", "bar", "test"], "foo_limit_max"=>2, "test_limit_min"=>6}
h["services"] & h.keys.map { |k| k[/.+?(?=_limit_)/] }
#=> ["foo", "test"]
Note:
h.keys.map { |k| k[/.+?(?=_limit_)/] }
#=> [nil, "foo", "test"]
See Array#&, which includes the sentence, "The order is preserved from the original array.".
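A small sketch of that ordering guarantee: even when the limit keys appear in a different order in the hash, the result follows the order of h["services"], and the nil produced by the "services" key is dropped by the intersection. The reordered hash is made up for illustration:

```ruby
# Same data as above, but with the keys in a different order.
h = {"test_limit_min"=>6, "services"=>["foo", "bar", "test"], "foo_limit_max"=>2}

prefixes = h.keys.map { |k| k[/.+?(?=_limit_)/] }
p prefixes  # ["test", nil, "foo"]

result = h["services"] & prefixes
p result    # ["foo", "test"]
```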
Of the following two options, which is faster?
something {|i| i.foo }
something(&:foo)
I read somewhere that something(&:foo) is transformed into something {|i| i.send(:foo) }. Is it true?
If that is true, the question becomes - which is faster? i.foo or i.send(:foo)?
In any case, which one should be used for better performance?
In Ruby it is quite easy to write such a (micro) benchmark:
require 'benchmark/ips'
VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Benchmark.ips do |x|
x.compare!
x.report('symbol') do
VALUES.each(&:odd?)
end
x.report('method') do
VALUES.each do |value|
value.odd?
end
end
x.report('send') do
VALUES.each do |value|
value.send(:odd?)
end
end
end
(latest version is https://github.com/pascalbetz/benchmarks/blob/master/symbol_vs_proc.rb)
Ruby 1.9.3
symbol: 1349424.7 i/s
method: 1049894.7 i/s - 1.29x slower
send: 929240.4 i/s - 1.45x slower
Ruby 2.2.2
symbol: 1358391.5 i/s
method: 1146686.4 i/s - 1.18x slower
send: 820392.7 i/s - 1.66x slower
JRuby 9.0.5.0
symbol: 2959040.8 i/s
method: 2016911.1 i/s - 1.47x slower
send: 1305816.9 i/s - 2.27x slower
This differs from what Sawa explained in his answer.
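For reference, the & in something(&:foo) calls Symbol#to_proc on the symbol and passes the resulting proc as the block. A minimal sketch showing that the two spellings give the same result here; it says nothing about their relative speed:

```ruby
# Symbol#to_proc builds a proc that calls the named method on its argument.
odd = :odd?.to_proc
p odd.call(3)  # true

values = [1, 2, 3, 4]
with_symbol = values.map(&:odd?)
with_block  = values.map { |v| v.odd? }
p with_symbol == with_block  # true
```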
Suppose I have an array of hashes
a = [
{'id'=>'1','imageUrl'=>'abc','name'=>'x','age'=>'20'},
{'id'=>'2','imageUrl'=>'efg','name'=>'y','age'=>'30'},
{'id'=>'3','imageUrl'=>'hij','name'=>'z','age'=>'40'}
]
What is the fastest way to remove the keys 'name' and 'age' and their corresponding values from all hashes in the array?
Basically how can I remove multiple key/value pairs?
Try the following code:
a = [
{'id'=>'1','imageUrl'=>'abc','name'=>'x'},
{'id'=>'2','imageUrl'=>'efg','name'=>'y'},
{'id'=>'3','imageUrl'=>'hij','name'=>'z'}
]
a.each { |h| h.delete("name") }
p a # => [{"id"=>"1", "imageUrl"=>"abc"}, {"id"=>"2", "imageUrl"=>"efg"}, {"id"=>"3", "imageUrl"=>"hij"}]
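The question asks about several keys, while the snippet above removes only one. A sketch extending the same delete approach to a list of keys; the variable name keys_to_remove is introduced here for illustration:

```ruby
a = [
  {'id'=>'1','imageUrl'=>'abc','name'=>'x','age'=>'20'},
  {'id'=>'2','imageUrl'=>'efg','name'=>'y','age'=>'30'},
  {'id'=>'3','imageUrl'=>'hij','name'=>'z','age'=>'40'}
]

keys_to_remove = %w[name age]

# Mutates each hash in place; delete is a no-op for missing keys.
a.each { |h| keys_to_remove.each { |k| h.delete(k) } }

p a.map(&:keys)  # [["id", "imageUrl"], ["id", "imageUrl"], ["id", "imageUrl"]]
```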
Nothing like benchmarking:
Collected from the above answers, using benchmark-ips:
require 'benchmark/ips'
def a
[
{'id'=>'1','imageUrl'=>'abc','name'=>'x'},
{'id'=>'2','imageUrl'=>'efg','name'=>'y'},
{'id'=>'3','imageUrl'=>'hij','name'=>'z'}
]
end
Benchmark.ips do |x|
x.report("map w/ except!") do |times|
a.map {|o| o.except!('name') }
end
x.report("each w/ except!") do |times|
a.each {|o| o.except!('name') }
end
x.report("map w/ except") do |times|
a.map {|o| o.except('name') }
end
x.report("each w/ except") do |times|
a.each {|o| o.except('name') }
end
x.report("map w/ delete") do |times|
a.map { |h| h.delete("name") }
end
x.report("each w/ delete") do |times|
a.each { |h| h.delete("name") }
end
x.compare!
end
I got the following:
Calculating -------------------------------------
map w/ except! 8.438k i/100ms
each w/ except! 8.439k i/100ms
map w/ except 5.242k i/100ms
each w/ except 5.469k i/100ms
map w/ delete 9.840k i/100ms
each w/ delete 9.810k i/100ms
-------------------------------------------------
map w/ except! 1.311B (±25.3%) i/s - 2.994B
each w/ except! 1.360B (±25.2%) i/s - 3.048B
map w/ except 423.818M (±25.8%) i/s - 1.238B
each w/ except 458.859M (±25.7%) i/s - 1.315B
map w/ delete 1.955B (±24.0%) i/s - 3.982B
each w/ delete 2.025B (±23.5%) i/s - 4.033B
Comparison:
each w/ delete: 2024710811.4 i/s
map w/ delete: 1955349074.3 i/s - 1.04x slower
each w/ except!: 1360241861.3 i/s - 1.49x slower
map w/ except!: 1311373772.5 i/s - 1.54x slower
each w/ except: 458859254.7 i/s - 4.41x slower
map w/ except: 423818242.2 i/s - 4.78x slower
Using a.each { |h| h.delete("name") } is the fastest (as pointed out in the comments).
For example, use ActiveSupport's except! method (or except, which is also in core Ruby since 3.0):
a.map {|o| o.except!('name') }
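If the original hashes should stay untouched, the non-mutating except accepts several keys at once. A sketch assuming Ruby 3.0 or later, where Hash#except is part of core (on older versions it comes from ActiveSupport):

```ruby
a = [
  {'id'=>'1','imageUrl'=>'abc','name'=>'x','age'=>'20'},
  {'id'=>'2','imageUrl'=>'efg','name'=>'y','age'=>'30'}
]

# Hash#except returns a new hash without the given keys (Ruby 3.0+).
trimmed = a.map { |h| h.except('name', 'age') }

p trimmed.first.keys  # ["id", "imageUrl"]
p a.first.keys        # ["id", "imageUrl", "name", "age"]
```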
Iterate the Array and delete it from each hash:
a = [
{'id'=>'1','imageUrl'=>'abc','name'=>'x'},
{'id'=>'2','imageUrl'=>'efg','name'=>'y'},
{'id'=>'3','imageUrl'=>'hij','name'=>'z'}
]
=> [{"id"=>"1", "imageUrl"=>"abc", "name"=>"x"}, {"id"=>"2", "imageUrl"=>"efg", "name"=>"y"}, {"id"=>"3", "imageUrl"=>"hij", "name"=>"z"}]
a.each do |h|
h.delete("name")
end
=> [{"id"=>"1", "imageUrl"=>"abc"}, {"id"=>"2", "imageUrl"=>"efg"}, {"id"=>"3", "imageUrl"=>"hij"}]
I want to change every value in a hash so as to add '%' before and after the value so
{ :a=>'a' , :b=>'b' }
must be changed to
{ :a=>'%a%' , :b=>'%b%' }
What's the best way to do this?
In Ruby 2.1 and higher you can do
{ a: 'a', b: 'b' }.map { |k, str| [k, "%#{str}%"] }.to_h
If you want the actual strings themselves to mutate in place (possibly and desirably affecting other references to the same string objects):
# Two ways to achieve the same result (any Ruby version)
my_hash.each{ |_,str| str.gsub! /^|$/, '%' }
my_hash.each{ |_,str| str.replace "%#{str}%" }
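A small sketch of what "affecting other references" means in practice. The variable name shared is introduced for illustration; dup is used so the strings are not frozen:

```ruby
shared  = 'a'.dup                    # unfrozen string, also referenced below
my_hash = { a: shared, b: 'b'.dup }

# Mutates the string objects themselves, not the hash entries.
my_hash.each { |_, str| str.replace "%#{str}%" }

p shared                        # "%a%"
p my_hash[:a].equal?(shared)    # true (still the same object)
```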
If you want the hash to change in place, but you don't want to affect the strings (you want it to get new strings):
# Two ways to achieve the same result (any Ruby version)
my_hash.each{ |key,str| my_hash[key] = "%#{str}%" }
my_hash.inject(my_hash){ |h,(k,str)| h[k]="%#{str}%"; h }
If you want a new hash:
# Ruby 1.8.6+
new_hash = Hash[*my_hash.map{|k,str| [k,"%#{str}%"] }.flatten]
# Ruby 1.8.7+
new_hash = Hash[my_hash.map{|k,str| [k,"%#{str}%"] } ]
Ruby 2.4 introduced the method Hash#transform_values!, which you could use.
{ :a=>'a' , :b=>'b' }.transform_values! { |v| "%#{v}%" }
# => {:a=>"%a%", :b=>"%b%"}
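The non-bang transform_values, also added in Ruby 2.4, returns a new hash and leaves the receiver alone. A short sketch:

```ruby
original = { a: 'a', b: 'b' }

# Returns a new hash; original keeps its values.
wrapped = original.transform_values { |v| "%#{v}%" }

p wrapped[:a]   # "%a%"
p original[:a]  # "a"
```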
The best way to modify a Hash's values in place is
hash.update(hash){ |_,v| "%#{v}%" }
Less code and clear intent. Also faster because no new objects are allocated beyond the values that must be changed.
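This works because update (an alias of merge!) calls its block only for keys present in both hashes, and when a hash is merged with itself every key qualifies. A sketch that spells out all three block parameters:

```ruby
hash = { a: 'a', b: 'b' }

# The block receives key, old value, new value; here old == new.
hash.update(hash) { |_key, old, _new| "%#{old}%" }
p hash[:a]  # "%a%"

# With no overlapping keys the block is never called.
merged = { a: 1 }.merge({ b: 2 }) { |_k, _old, _new| :never_called }
p merged.values  # [1, 2]
```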
A slightly more readable one: map it to an array of single-element hashes and reduce that with merge:
the_hash.map{ |key,value| {key => "%#{value}%"} }.reduce(:merge)
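One caveat with this pattern: on an empty hash, reduce(:merge) has nothing to combine and returns nil. Passing an initial value avoids that. A sketch:

```ruby
the_hash = {}

# No elements and no initial value: reduce returns nil.
result = the_hash.map { |key, value| { key => "%#{value}%" } }.reduce(:merge)
p result  # nil

# With an initial empty hash the result is always a Hash.
safe = the_hash.map { |key, value| { key => "%#{value}%" } }.reduce({}, :merge)
p safe    # {}
```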
There is a new 'Rails way' method for this task :)
http://api.rubyonrails.org/classes/Hash.html#method-i-transform_values
One method that doesn't introduce side-effects to the original:
h = {:a => 'a', :b => 'b'}
h2 = Hash[h.map {|k,v| [k, '%' + v + '%']}]
Hash#map may also be an interesting read as it explains why the Hash.map doesn't return a Hash (which is why the resultant Array of [key,value] pairs is converted into a new Hash) and provides alternative approaches to the same general pattern.
Happy coding.
[Disclaimer: I am not sure if Hash.map semantics change in Ruby 2.x]
my_hash.each do |key, value|
my_hash[key] = "%#{value}%"
end
Hash.merge! is the cleanest solution
o = { a: 'a', b: 'b' }
o.merge!(o) { |key, value| "%#{ value }%" }
puts o.inspect
> { :a => "%a%", :b => "%b%" }
After testing it with RSpec like this:
describe Hash do
describe :map_values do
it 'should map the values' do
expect({:a => 2, :b => 3}.map_values { |x| x ** 2 }).to eq({:a => 4, :b => 9})
end
end
end
You could implement Hash#map_values as follows:
class Hash
def map_values
Hash[map { |k, v| [k, yield(v)] }]
end
end
The function then can be used like this:
{:a=>'a' , :b=>'b'}.map_values { |v| "%#{v}%" }
# {:a=>"%a%", :b=>"%b%"}
If you are curious which in-place variant is the fastest, here it is:
Calculating -------------------------------------
inplace transform_values! 1.265k (± 0.7%) i/s - 6.426k in 5.080305s
inplace update 1.300k (± 2.7%) i/s - 6.579k in 5.065925s
inplace map reduce 281.367 (± 1.1%) i/s - 1.431k in 5.086477s
inplace merge! 1.305k (± 0.4%) i/s - 6.630k in 5.080751s
inplace each 1.073k (± 0.7%) i/s - 5.457k in 5.084044s
inplace inject 697.178 (± 0.9%) i/s - 3.519k in 5.047857s