Which Ruby statement is more efficient? - ruby

I have a hash table:
hash = Hash.new(0)
hash[:key] = hash[:key] + 1 # Line 1
hash[:key] += 1 # Line 2
Line 1 and Line 2 do the same thing. Looks like line 1 needs to query hash by key two times while line 2 only once. Is that true? Or they are actually same?

I created a ruby script to benchmark it
require 'benchmark'
def my_case1()
#hash[:key] = #hash[:key] + 1
end
def my_case2()
#hash[:key] += 1
end
n = 10000000
Benchmark.bm do |test|
test.report("case 1") {
#hash = Hash.new(1)
#hash[:key] = 0
n.times do; my_case1(); end
}
test.report("case 2") {
#hash = Hash.new(1)
#hash[:key] = 0
n.times do; my_case2(); end
}
end
Here is the result
user system total real
case 1 3.620000 0.080000 3.700000 ( 4.253319)
case 2 3.560000 0.080000 3.640000 ( 4.178699)
It looks hash[:key] += 1 is slightly better.

#sza beat me to it :)
Here is my example irb session:
> require 'benchmark'
=> true
> n = 10000000
=> 10000000
> Benchmark.bm do |x|
> hash = Hash.new(0)
> x.report("Case 1:") { n.times do; hash[:key] = hash[:key] + 1; end }
> hash = Hash.new(0)
> x.report("Case 2:") { n.times do; hash[:key] += 1; end }
> end
user system total real
Case 1: 1.070000 0.000000 1.070000 ( 1.071366)
Case 2: 1.040000 0.000000 1.040000 ( 1.043644)

The Ruby Language Specification spells out the algorithm for evaluating abbreviated indexing assignment expressions quite clearly. It is something like this:
primary_expression[indexing_argument_list] ω= expression
# ω can be any operator, in this example, it is +
is (roughly) evaluated like
o = primary_expression
*l = indexing_argument_list
v = o.[](*l)
w = expression
l << (v ω w)
o.[]=(*l)
In particular, you can see that both the getter and the setter are called exactly once.
You can also see that by looking at the informal desugaring:
hash[:key] += 1
# is syntactic sugar for
hash[:key] = hash[:key] + 1
# which is syntactic sugar for
hash.[]=(:key, hash.[](:key).+(1))
Again, you see that both the setter and the getter are called exactly once.

second one is the customary way of doing it. It is more efficient.

Related

How can I improve the performance of this small Ruby function?

I am currently doing a Ruby challenge and get the error Terminated due to timeout
for some testcases where the string input is very long (10.000+ characters).
How can I improve my code?
Ruby challenge description
You are given a string containing characters A and B only. Your task is to change it into a string such that there are no matching adjacent characters. To do this, you are allowed to delete zero or more characters in the string.
Your task is to find the minimum number of required deletions.
For example, given the string s = AABAAB, remove A an at positions 0 and 3 to make s = ABAB in 2 deletions.
My function
def alternatingCharacters(s)
counter = 0
s.chars.each_with_index { |char, idx| counter += 1 if s.chars[idx + 1] == char }
return counter
end
Thank you!
This could be faster returning the count:
str.size - str.chars.chunk_while{ |a, b| a == b }.to_a.size
The second part uses String#chars method in conjunction with Enumerable#chunk_while.
This way the second part groups in subarrays:
'aababbabbaab'.chars.chunk_while{ |a, b| a == b}.to_a
#=> [["a", "a"], ["b"], ["a"], ["b", "b"], ["a"], ["b", "b"], ["a", "a"], ["b"]]
Trivial if you can use squeeze:
str.length - str.squeeze.length
Otherwise, you could try a regular expression that matches those A (or B) that are preceded by another A (or B):
str.enum_for(:scan, /(?<=A)A|(?<=B)B/).count
Using enum_for avoids the creation of the intermediate array.
The main issue with:
s.chars.each_with_index { |char, idx| counter += 1 if s.chars[idx + 1] == char }
Is the fact that you don't save chars into a variable. s.chars will rip apart the string into an array of characters. The first s.chars call outside the loop is fine. However there is no reason to do this for each character in s. This means if you have a string of 10.000 characters, you'll instantiate 10.001 arrays of size 10.000.
Re-using the characters array will give you a huge performance boost:
require 'benchmark'
s = ''
options = %w[A B]
10_000.times { s << options.sample }
Benchmark.bm do |x|
x.report do
counter = 0
s.chars.each_with_index { |char, idx| counter += 1 if s.chars[idx + 1] == char }
# create a character array for each iteration ^
end
x.report do
counter = 0
chars = s.chars # <- only create a character array once
chars.each_with_index { |char, idx| counter += 1 if chars[idx + 1] == char }
end
end
user system total real
8.279767 0.000001 8.279768 ( 8.279655)
0.002188 0.000003 0.002191 ( 0.002191)
You could also make use of enumerator methods like each_cons and count to simplify the code, this doesn't increase performance cost a lot, but makes the code a lot more readable.
Benchmark.bm do |x|
x.report do
counter = 0
chars = s.chars
chars.each_with_index { |char, idx| counter += 1 if chars[idx + 1] == char }
end
x.report do
s.each_char.each_cons(2).count { |a, b| a == b }
# ^ using each_char instead of chars to avoid
# instantiating a character array
end
end
user system total real
0.002923 0.000000 0.002923 ( 0.002920)
0.003995 0.000000 0.003995 ( 0.003994)

Optimising code for matching two strings modulo scrambling

I am trying to write a function scramble(str1, str2) that returns true if a portion of str1 characters can be rearranged to match str2, otherwise returns false. Only lower case letters (a-z) will be used. No punctuation or digits will be included. For example:
str1 = 'rkqodlw'; str2 = 'world' should return true.
str1 = 'cedewaraaossoqqyt'; str2 = 'codewars' should return true.
str1 = 'katas'; str2 = 'steak' should return false.
This is my code:
def scramble(s1, s2)
#sorts strings into arrays
first = s1.split("").sort
second = s2.split("").sort
correctLetters = 0
for i in 0...first.length
#check for occurrences of first letter
occurrencesFirst = first.count(s1[i])
for j in 0...second.length
#scan through second string
occurrencesSecond = second.count(s2[j])
#if letter to be tested is correct and occurrences of first less than occurrences of second
#meaning word cannot be formed
if (s2[j] == s1[i]) && occurrencesFirst < occurrencesSecond
return false
elsif s2[j] == s1[i]
correctLetters += 1
elsif first.count(s1[s2[j]]) == 0
return false
end
end
end
if correctLetters == 0
return false
end
return true
end
I need help optimising this code. Please give me suggestions.
Here is one efficient and Ruby-like way of doing that.
Code
def scramble(str1, str2)
h1 = char_counts(str1)
h2 = char_counts(str2)
h2.all? { |ch, nbr| nbr <= h1[ch] }
end
def char_counts(str)
str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end
Examples
scramble('abecacdeba', 'abceae')
#=> true
scramble('abecacdeba', 'abweae')
#=> false
Explanation
The three steps are as follows.
str1 = 'abecacdeba'
str2 = 'abceae'
h1 = char_counts(str1)
#=> {"a"=>3, "b"=>2, "e"=>2, "c"=>2, "d"=>1}
h2 = char_counts(str2)
#=> {"a"=>2, "b"=>1, "c"=>1, "e"=>2}
h2.all? { |ch, nbr| nbr <= h1[ch] }
#=> true
The last statement is equivalent to
2 <= 3 && 1 <= 2 && 1 <= 2 && 2 <=2
The method char_counts constructs what is sometimes called a "counting hash". To understand how char_counts works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(0), then if h does not have a key k, h[k] returns the default value, here 0 (and the hash is not changed).
Suppose, for different data,
h1 = { "a"=>2 }
h2 = { "a"=>1, "b"=>2 }
Then we would find that 1 <= 2 #=> true but 2 <= 0 #=> false, so the method would return false. The second comparison is 2 <= h1["b"]. As h1 does not have a key "b", h1["b"] returns the default value, 0.
The method char_counts is effectively a short way of writing the method expressed as follows.
def char_counts(str)
h = {}
str.each_char do |ch|
h[ch] = 0 unless h.key?(ch) # instead of Hash.new(0)
h[ch] = h[c] + 1 # instead of h[c][ += 1
end
h # no need for this if use `each_with_object`
end
See Enumerable#each_with_object, String#each_char (preferable to String.chars, as the latter produces an unneeded temporary array whereas the former returns an enumerator) and Hash#key? (or Hash#has_key?, Hash#include? or Hash#member?).
An Alternative
def scramble(str1, str2)
str2.chars.difference(str1.chars).empty?
end
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
I have found the method Array#difference to be so useful I proposed it be added to the Ruby Core (here). The response has been, er, underwhelming.
One way:
def scramble(s1,s2)
s2.chars.uniq.all? { |c| s1.count(c) >= s2.count(c) }
end
Another way:
def scramble(s1,s2)
pool = s1.chars.group_by(&:itself)
s2.chars.all? { |c| pool[c]&.pop }
end
Yet another:
def scramble(s1,s2)
('a'..'z').all? { |c| s1.count(c) >= s2.count(c) }
end
Since this appears to be from codewars, I submitted my first two there. Both got accepted and the first one was a bit faster. Then I was shown solutions of others and saw someone using ('a'..'z') and it's fast, so I include that here.
The codewars "performance tests" aren't shown explicitly but they're all up to about 45000 letters long. So I benchmarked these solutions as well as Cary's (yours was too slow to be included) on shuffles of the alphabet repeated to be about that long (and doing it 100 times):
user system total real
Stefan 1 0.812000 0.000000 0.812000 ( 0.811765)
Stefan 2 2.141000 0.000000 2.141000 ( 2.127585)
Other 0.125000 0.000000 0.125000 ( 0.122248)
Cary 1 2.562000 0.000000 2.562000 ( 2.575366)
Cary 2 3.094000 0.000000 3.094000 ( 3.106834)
Moral of the story? String#count is fast here. Like, ridiculously fast. Almost unbelievably fast (I actually had to run extra tests to believe it). It counts through about 1.9 billion letters per second (100 times 26 letters times 2 strings of ~45000 letters, all in 0.12 seconds). Note that the difference to my own first solution is just that I do s2.chars.uniq, and that increases the time from 0.12 seconds to 0.81 seconds. Meaning this double pass through one string takes about six times as long as the 52 passes for counting. The counting is about 150 times faster. I did expect it to be very fast, because it presumably just searches a byte in an array of bytes using C code (edit: looks like it does), but this speed still surprised me.
Code:
require 'benchmark'
def scramble_stefan1(s1,s2)
s2.chars.uniq.all? { |c| s1.count(c) >= s2.count(c) }
end
def scramble_stefan2(s1,s2)
pool = s1.chars.group_by(&:itself)
s2.chars.all? { |c| pool[c]&.pop }
end
def scramble_other(s1,s2)
('a'..'z').all? { |c| s1.count(c) >= s2.count(c) }
end
def scramble_cary1(str1, str2)
h1 = char_counts(str1)
h2 = char_counts(str2)
h2.all? { |ch, nbr| nbr <= h1[ch] }
end
def char_counts(str)
str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end
def scramble_cary2(str1, str2)
str2.chars.difference(str1.chars).empty?
end
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
Benchmark.bmbm do |x|
n = 100
s1 = (('a'..'z').to_a * (45000 / 26)).shuffle.join
s2 = s1.chars.shuffle.join
x.report('Stefan 1') { n.times { scramble_stefan1(s1, s2) } }
x.report('Stefan 2') { n.times { scramble_stefan2(s1, s2) } }
x.report('Other') { n.times { scramble_other(s1, s2) } }
x.report('Cary 1') { n.times { scramble_cary1(s1, s2) } }
x.report('Cary 2') { n.times { scramble_cary2(s1, s2) } }
end

Ruby - get bit range from variable

I have a variable and want to take a range of bits from that variable. I want the CLEANEST way to do this.
If x = 19767 and I want bit3 - bit8 (starting from the right):
100110100110111 is 19767 in binary.
I want the part in parenthesis 100110(100110)111 so the answer is 38.
What is the simplest/cleanest/most-elegant way to implement the following function with Ruby?
bit_range(orig_num, first_bit, last_bit)
PS. Bonus points for answers that are computationally less intensive.
19767.to_s(2)[-9..-4].to_i(2)
or
19767 >> 3 & 0x3f
Update:
Soup-to-nuts (why do people say that, anyway?) ...
class Fixnum
def bit_range low, high
len = high - low + 1
self >> low & ~(-1 >> len << len)
end
end
p 19767.bit_range(3, 8)
orig_num.to_s(2)[(-last_bit-1)..(-first_bit-1)].to_i(2)
Here is how this could be done using pure number operations:
class Fixnum
def slice(range_or_start, length = nil)
if length
start = range_or_start
else
range = range_or_start
start = range.begin
length = range.count
end
mask = 2 ** length - 1
self >> start & mask
end
end
def p n
puts "0b#{n.to_s(2)}"; n
end
p 0b100110100110111.slice(3..8) # 0b100110
p 0b100110100110111.slice(3, 6) # 0b100110
Just to show the speeds of the suggested answers:
require 'benchmark'
ORIG_NUMBER = 19767
def f(x,i,j)
b = x.to_s(2)
n = b.size
b[(n-j-1)...(n-i)].to_i(2)
end
class Fixnum
def bit_range low, high
len = high - low + 1
self >> low & ~(-1 >> len << len)
end
def slice(range_or_start, length = nil)
if length
start = range_or_start
else
range = range_or_start
start = range.begin
length = range.count
end
mask = 2 ** length - 1
self >> start & mask
end
end
def p n
puts "0b#{n.to_s(2)}"; n
end
n = 1_000_000
puts "Using #{ n } loops in Ruby #{ RUBY_VERSION }."
Benchmark.bm(21) do |b|
b.report('texasbruce') { n.times { ORIG_NUMBER.to_s(2)[(-8 - 1)..(-3 - 1)].to_i(2) } }
b.report('DigitalRoss string') { n.times { ORIG_NUMBER.to_s(2)[-9..-4].to_i(2) } }
b.report('DigitalRoss binary') { n.times { ORIG_NUMBER >> 3 & 0x3f } }
b.report('DigitalRoss bit_range') { n.times { 19767.bit_range(3, 8) } }
b.report('Philip') { n.times { f(ORIG_NUMBER, 3, 8) } }
b.report('Semyon Perepelitsa') { n.times { ORIG_NUMBER.slice(3..8) } }
end
And the output:
Using 1000000 loops in Ruby 1.9.3.
user system total real
texasbruce 1.240000 0.010000 1.250000 ( 1.243709)
DigitalRoss string 1.000000 0.000000 1.000000 ( 1.006843)
DigitalRoss binary 0.260000 0.000000 0.260000 ( 0.262319)
DigitalRoss bit_range 0.840000 0.000000 0.840000 ( 0.858603)
Philip 1.520000 0.000000 1.520000 ( 1.543751)
Semyon Perepelitsa 1.150000 0.010000 1.160000 ( 1.155422)
That's on my old MacBook Pro. Your mileage might vary.
Makes sense to define a function for that:
def f(x,i,j)
b = x.to_s(2)
n = b.size
b[(n-j-1)...(n-i)].to_i(2)
end
puts f(19767, 3, 8) # => 38
Expanding on the idea from DigitalRoss - instead of taking two arguments, you can pass a range:
class Fixnum
def bit_range range
len = range.last - range.first + 1
self >> range.first & ~(-1 >> len << len)
end
end
19767.bit_range 3..8

more ruby way of doing project euler #2

I'm trying to learn Ruby, and am going through some of the Project Euler problems. I solved problem number two as such:
def fib(n)
return n if n < 2
vals = [0, 1]
n.times do
vals.push(vals[-1]+vals[-2])
end
return vals.last
end
i = 1
s = 0
while((v = fib(i)) < 4_000_000)
s+=v if v%2==0
i+=1
end
puts s
While that works, it seems not very ruby-ish—I couldn't come up with any good purely Ruby answer like I could with the first one ( puts (0..999).inject{ |sum, n| n%3==0||n%5==0 ? sum : sum+n }).
For a nice solution, why don't you create a Fibonacci number generator, like Prime or the Triangular example I gave here.
From this, you can use the nice Enumerable methods to handle the problem. You might want to wonder if there is any pattern to the even Fibonacci numbers too.
Edit your question to post your solution...
Note: there are more efficient ways than enumerating them, but they require more math, won't be as clear as this and would only shine if the 4 million was much higher.
As demas' has posted a solution, here's a cleaned up version:
class Fibo
class << self
include Enumerable
def each
return to_enum unless block_given?
a = 0; b = 1
loop do
a, b = b, a + b
yield a
end
end
end
end
puts Fibo.take_while { |i| i < 4000000 }.
select(&:even?).
inject(:+)
My version based on Marc-André Lafortune's answer:
class Some
#a = 1
#b = 2
class << self
include Enumerable
def each
1.upto(Float::INFINITY) do |i|
#a, #b = #b, #a + #b
yield #b
end
end
end
end
puts Some.take_while { |i| i < 4000000 }.select { |n| n%2 ==0 }
.inject(0) { |sum, item| sum + item } + 2
def fib
first, second, sum = 1,2,0
while second < 4000000
sum += second if second.even?
first, second = second, first + second
end
puts sum
end
You don't need return vals.last. You can just do vals.last, because Ruby will return the last expression (I think that's the correct term) by default.
fibs = [0,1]
begin
fibs.push(fibs[-1]+fibs[-2])
end while not fibs[-1]+fibs[-2]>4000000
puts fibs.inject{ |sum, n| n%2==0 ? sum+n : sum }
Here's what I got. I really don't see a need to wrap this in a class. You could in a larger program surely, but in a single small script I find that to just create additional instructions for the interpreter. You could select even, instead of rejecting odd but its pretty much the same thing.
fib = Enumerator.new do |y|
a = b = 1
loop do
y << a
a, b = b, a + b
end
end
puts fib.take_while{|i| i < 4000000}
.reject{|x| x.odd?}
.inject(:+)
That's my approach. I know it can be less lines of code, but maybe you can take something from it.
class Fib
def first
#p0 = 0
#p1 = 1
1
end
def next
r =
if #p1 == 1
2
else
#p0 + #p1
end
#p0 = #p1
#p1 = r
r
end
end
c = Fib.new
f = c.first
r = 0
while (f=c.next) < 4_000_000
r += f if f%2==0
end
puts r
I am new to Ruby, but here is the answer I came up with.
x=1
y=2
array = [1,2]
dar = []
begin
z = x + y
if z % 2 == 0
a = z
dar << a
end
x = y
y = z
array << z
end while z < 4000000
dar.inject {:+}
puts "#{dar.sum}"
def fib_nums(num)
array = [1, 2]
sum = 0
until array[-2] > num
array.push(array[-1] + array[-2])
end
array.each{|x| sum += x if x.even?}
sum
end

Find most common string in an array

I have this array, for example (the size is variable):
x = ["1.111", "1.122", "1.250", "1.111"]
and I need to find the most commom value ("1.111" in this case).
Is there an easy way to do that?
Tks in advance!
EDIT #1: Thank you all for the answers!
EDIT #2: I've changed my accepted answer based on Z.E.D.'s information. Thank you all again!
Ruby < 2.2
#!/usr/bin/ruby1.8
def most_common_value(a)
a.group_by do |e|
e
end.values.max_by(&:size).first
end
x = ["1.111", "1.122", "1.250", "1.111"]
p most_common_value(x) # => "1.111"
Note: Enumberable.max_by is new with Ruby 1.9, but it has been backported to 1.8.7
Ruby >= 2.2
Ruby 2.2 introduces the Object#itself method, with which we can make the code more concise:
def most_common_value(a)
a.group_by(&:itself).values.max_by(&:size).first
end
As a monkey patch
Or as Enumerable#mode:
Enumerable.class_eval do
def mode
group_by do |e|
e
end.values.max_by(&:size).first
end
end
["1.111", "1.122", "1.250", "1.111"].mode
# => "1.111"
One pass through the hash to accumulate the counts. Use .max() to find the hash entry with the largest value.
#!/usr/bin/ruby
a = Hash.new(0)
["1.111", "1.122", "1.250", "1.111"].each { |num|
a[num] += 1
}
a.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]
or, roll it all into one line:
ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] } # => ["1.111", 2]
If you only want the item back add .first():
ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first # => "1.111"
The first sample I used is how it would be done in Perl usually. The second is more Ruby-ish. Both work with older versions of Ruby. I wanted to compare them, plus see how Wayne's solution would speed things up so I tested with benchmark:
#!/usr/bin/env ruby
require 'benchmark'
ary = ["1.111", "1.122", "1.250", "1.111"] * 1000
def most_common_value(a)
a.group_by { |e| e }.values.max_by { |values| values.size }.first
end
n = 1000
Benchmark.bm(20) do |x|
x.report("Hash.new(0)") do
n.times do
a = Hash.new(0)
ary.each { |num| a[num] += 1 }
a.max{ |a,b| a[1] <=> b[1] }.first
end
end
x.report("inject:") do
n.times do
ary.inject(Hash.new(0)){ |h,i| h[i] += 1; h }.max{ |a,b| a[1] <=> b[1] }.first
end
end
x.report("most_common_value():") do
n.times do
most_common_value(ary)
end
end
end
Here's the results:
user system total real
Hash.new(0) 2.150000 0.000000 2.150000 ( 2.164180)
inject: 2.440000 0.010000 2.450000 ( 2.451466)
most_common_value(): 1.080000 0.000000 1.080000 ( 1.089784)
You could sort the array and then loop over it once. In the loop just keep track of the current item and the number of times it is seen. Once the list ends or the item changes, set max_count == count if count > max_count. And of course keep track of which item has the max_count.
You could create a hashmap that stores the array items as keys with their values being the number of times that element appears in the array.
Pseudo Code:
["1.111", "1.122", "1.250", "1.111"].each { |num|
count=your_hash_map.get(num)
if(item==nil)
hashmap.put(num,1)
else
hashmap.put(num,count+1)
}
As already mentioned, sorting might be faster.
Using the default value feature of hashes:
>> x = ["1.111", "1.122", "1.250", "1.111"]
>> h = Hash.new(0)
>> x.each{|i| h[i] += 1 }
>> h.max{|a,b| a[1] <=> b[1] }
["1.111", 2]
It will return most popular value in array
x.group_by{|a| a }.sort_by{|a,b| b.size<=>a.size}.first[0]
IE:
x = ["1.111", "1.122", "1.250", "1.111"]
# Most popular
x.group_by{|a| a }.sort_by{|a,b| b.size<=>a.size}.first[0]
#=> "1.111
# How many times
x.group_by{|a| a }.sort_by{|a,b| b.size<=>a.size}.first[1].size
#=> 2

Resources