I am trying to write a function scramble(str1, str2) that returns true if a portion of str1 characters can be rearranged to match str2, otherwise returns false. Only lower case letters (a-z) will be used. No punctuation or digits will be included. For example:
str1 = 'rkqodlw'; str2 = 'world' should return true.
str1 = 'cedewaraaossoqqyt'; str2 = 'codewars' should return true.
str1 = 'katas'; str2 = 'steak' should return false.
This is my code:
def scramble(s1, s2)
  #sorts strings into arrays
  first = s1.split("").sort
  second = s2.split("").sort
  correctLetters = 0
  for i in 0...first.length
    #check for occurrences of first letter
    occurrencesFirst = first.count(s1[i])
    for j in 0...second.length
      #scan through second string
      occurrencesSecond = second.count(s2[j])
      #if letter to be tested is correct and occurrences of first less than occurrences of second
      #meaning word cannot be formed
      if (s2[j] == s1[i]) && occurrencesFirst < occurrencesSecond
        return false
      elsif s2[j] == s1[i]
        correctLetters += 1
      elsif first.count(s1[s2[j]]) == 0
        return false
      end
    end
  end
  if correctLetters == 0
    return false
  end
  return true
end
I need help optimising this code. Please give me suggestions.
Here is one efficient and Ruby-like way of doing that.
Code
def scramble(str1, str2)
  h1 = char_counts(str1)
  h2 = char_counts(str2)
  h2.all? { |ch, nbr| nbr <= h1[ch] }
end

def char_counts(str)
  str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end
Examples
scramble('abecacdeba', 'abceae')
#=> true
scramble('abecacdeba', 'abweae')
#=> false
Explanation
The three steps are as follows.
str1 = 'abecacdeba'
str2 = 'abceae'
h1 = char_counts(str1)
#=> {"a"=>3, "b"=>2, "e"=>2, "c"=>2, "d"=>1}
h2 = char_counts(str2)
#=> {"a"=>2, "b"=>1, "c"=>1, "e"=>2}
h2.all? { |ch, nbr| nbr <= h1[ch] }
#=> true
The last statement is equivalent to
2 <= 3 && 1 <= 2 && 1 <= 2 && 2 <= 2
The method char_counts constructs what is sometimes called a "counting hash". To understand how char_counts works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(0), then if h does not have a key k, h[k] returns the default value, here 0 (and the hash is not changed).
Suppose, for different data,
h1 = { "a"=>2 }
h2 = { "a"=>1, "b"=>2 }
Then we would find that 1 <= 2 #=> true but 2 <= 0 #=> false, so the method would return false. The second comparison is 2 <= h1["b"]. As h1 does not have a key "b", h1["b"] returns the default value, 0.
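The default-value behaviour described above can be checked directly. This is a minimal sketch that reuses the answer's char_counts; the input string "aa" is just an illustrative value.

```ruby
# Counting hash: keys that were never stored read as the default value 0.
def char_counts(str)
  str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end

h1 = char_counts("aa")
h1            #=> {"a"=>2}
h1["b"]       #=> 0     (default value, "b" was never counted)
h1.key?("b")  #=> false (the lookup did not add a key)
```

Because the lookup leaves the hash unchanged, comparing against h1[ch] is safe even for characters that never occur in str1.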
The method char_counts is effectively a short way of writing the method expressed as follows.
def char_counts(str)
  h = {}
  str.each_char do |ch|
    h[ch] = 0 unless h.key?(ch) # instead of Hash.new(0)
    h[ch] = h[ch] + 1           # instead of h[ch] += 1
  end
  h # not needed when using `each_with_object`
end
See Enumerable#each_with_object, String#each_char (preferable to String#chars, as the latter produces an unneeded temporary array whereas the former returns an enumerator) and Hash#key? (or Hash#has_key?, Hash#include? or Hash#member?).
An Alternative
def scramble(str1, str2)
  str2.chars.difference(str1.chars).empty?
end

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end
I have found the method Array#difference to be so useful I proposed it be added to the Ruby Core (here). The response has been, er, underwhelming.
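To see how this multiset-style difference behaves compared with the built-in Array#-, here is a minimal sketch. The standalone method name multiset_difference is hypothetical; it is used so the example does not replace Ruby's own Array#difference, which has existed since Ruby 2.6 with set-like semantics.

```ruby
# Hypothetical standalone version of the answer's Array#difference.
# Removes one occurrence from `a` for each occurrence found in `b`.
def multiset_difference(a, b)
  counts = b.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
  a.reject { |e| counts[e] > 0 && (counts[e] -= 1) }
end

multiset_difference([1, 1, 2, 3], [1, 3])  #=> [1, 2]
[1, 1, 2, 3] - [1, 3]                      #=> [2]  (Array#- drops every 1)

# For scramble: whatever is left over is what str1 could not supply.
multiset_difference('steak'.chars, 'katas'.chars)  #=> ["e"]
```

An empty result means every character of the second string was matched by a distinct character of the first.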
One way:
def scramble(s1,s2)
s2.chars.uniq.all? { |c| s1.count(c) >= s2.count(c) }
end
Another way:
def scramble(s1,s2)
pool = s1.chars.group_by(&:itself)
s2.chars.all? { |c| pool[c]&.pop }
end
Yet another:
def scramble(s1,s2)
('a'..'z').all? { |c| s1.count(c) >= s2.count(c) }
end
Since this appears to be from codewars, I submitted my first two there. Both got accepted and the first one was a bit faster. Then I was shown solutions of others and saw someone using ('a'..'z') and it's fast, so I include that here.
The codewars "performance tests" aren't shown explicitly but they're all up to about 45000 letters long. So I benchmarked these solutions as well as Cary's (yours was too slow to be included) on shuffles of the alphabet repeated to be about that long (and doing it 100 times):
user system total real
Stefan 1 0.812000 0.000000 0.812000 ( 0.811765)
Stefan 2 2.141000 0.000000 2.141000 ( 2.127585)
Other 0.125000 0.000000 0.125000 ( 0.122248)
Cary 1 2.562000 0.000000 2.562000 ( 2.575366)
Cary 2 3.094000 0.000000 3.094000 ( 3.106834)
Moral of the story? String#count is fast here. Like, ridiculously fast. Almost unbelievably fast (I actually had to run extra tests to believe it). It counts through about 1.9 billion letters per second (100 times 26 letters times 2 strings of ~45000 letters, all in 0.12 seconds). Note that the difference to my own first solution is just that I do s2.chars.uniq, and that increases the time from 0.12 seconds to 0.81 seconds. Meaning this double pass through one string takes about six times as long as the 52 passes for counting. The counting is about 150 times faster. I did expect it to be very fast, because it presumably just searches a byte in an array of bytes using C code (edit: looks like it does), but this speed still surprised me.
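The throughput figure quoted above can be reproduced with a few lines of arithmetic. This sketch uses the rounded numbers from the text (45000 letters, 0.12 seconds); the helper name letters_per_second is made up for illustration.

```ruby
# Letters scanned per second by the ('a'..'z') solution in the benchmark.
def letters_per_second(runs, letters, strings, length, seconds)
  runs * letters * strings * length / seconds
end

# 100 runs, 26 String#count calls per string, 2 strings of ~45000 letters,
# all in about 0.12 seconds: roughly 1.95 billion letters per second.
letters_per_second(100, 26, 2, 45_000, 0.12)
```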
Code:
require 'benchmark'
def scramble_stefan1(s1,s2)
s2.chars.uniq.all? { |c| s1.count(c) >= s2.count(c) }
end
def scramble_stefan2(s1,s2)
pool = s1.chars.group_by(&:itself)
s2.chars.all? { |c| pool[c]&.pop }
end
def scramble_other(s1,s2)
('a'..'z').all? { |c| s1.count(c) >= s2.count(c) }
end
def scramble_cary1(str1, str2)
h1 = char_counts(str1)
h2 = char_counts(str2)
h2.all? { |ch, nbr| nbr <= h1[ch] }
end
def char_counts(str)
str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end
def scramble_cary2(str1, str2)
str2.chars.difference(str1.chars).empty?
end
class Array
def difference(other)
h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
reject { |e| h[e] > 0 && h[e] -= 1 }
end
end
Benchmark.bmbm do |x|
n = 100
s1 = (('a'..'z').to_a * (45000 / 26)).shuffle.join
s2 = s1.chars.shuffle.join
x.report('Stefan 1') { n.times { scramble_stefan1(s1, s2) } }
x.report('Stefan 2') { n.times { scramble_stefan2(s1, s2) } }
x.report('Other') { n.times { scramble_other(s1, s2) } }
x.report('Cary 1') { n.times { scramble_cary1(s1, s2) } }
x.report('Cary 2') { n.times { scramble_cary2(s1, s2) } }
end
Here is the logic:
y = 'var to check for'
some_var = some_loop.each do |x|
x if x == y
break if x
end
Is there a better way to write this?
Something like
x && break if x == y
Thank you in advance!
The correct answer is to use include?, which takes the value to search for as an argument (not a block), e.g.:
found = (array_expression).include?(search_value)
It's possible to also use each and break out on the first matched value, but the C implementation of include? is faster than a ruby script with each.
Here is a test program, comparing the performance of invoking include? on a very large array vs. invoking each on the same array with the same argument.
#!/usr/bin/env ruby
#
require 'benchmark'

def f_include a, b
  b if a.include?(b)
end

def f_each_break a, b
  a.each {|x| return b if x == b }
  nil
end

# gen large array of random numbers
a = (1..100000).map{|x| rand 1000000}
# now select 1000 random numbers in the set
nums = (1..1000).map{|x| a[rand a.size]}
# now, check the time for f_include vs. f_each_break
r1 = r2 = nil
Benchmark.bm do |bm|
  bm.report('incl') { r1 = nums.map {|n| f_include a,n} }
  bm.report('each') { r2 = nums.map {|n| f_each_break a,n} }
end
if r1.size != r2.size || r1 != r2
  puts "results differ"
  puts "r1.size = #{r1.size}"
  puts "r2.size = #{r2.size}"
  # indices at which the two result arrays disagree
  exceptions = (0...r1.size).select {|x| r1[x] != r2[x]}
  puts "There were #{exceptions.size} exceptions"
else
  puts "results ok"
end
exit
Here is the output from the test:
$ ./test-find.rb
user system total real
incl 5.150000 0.090000 5.240000 ( 7.410580)
each 7.400000 0.140000 7.540000 ( 9.269962)
results ok
Why not:
some_var = (some_loop.include?(y) ? y : nil)
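For the "value or nil" pattern discussed in this thread, Enumerable#find is another option, since it returns the first matching element or nil. A minimal sketch follows; the method names first_match and value_if_present are hypothetical.

```ruby
# Returns the first element equal to target, or nil if there is none.
def first_match(collection, target)
  collection.find { |x| x == target }
end

# Same result via include?; the argument is parenthesised so that the
# ternary applies to the result of include?, not to its argument.
def value_if_present(collection, target)
  collection.include?(target) ? target : nil
end

first_match([3, 5, 8], 5)       #=> 5
first_match([3, 5, 8], 7)       #=> nil
value_if_present([3, 5, 8], 5)  #=> 5
value_if_present([3, 5, 8], 7)  #=> nil
```

find is the more general of the two because its block can test any condition, not just equality.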
Given two numbers, say (14, 18), the problem is to find the sum of all the numbers in this range, 14, 15, 16, 17, 18 recursively. Now, I have done this using loops but I have trouble doing this recursively.
Here is my recursive solution:
def sum_cumulative_recursive(a,b)
  total = 0
  #base case is a == b, the stopping condition
  if a - b == 0
    puts "sum is: "
    return total + a
  end
  if b - a == 0
    puts "sum is: "
    return total + b
  end
  #case 1: a > b, start from b, and increment recursively
  if a > b
    until b > a
      puts "case 1"
      total = b + sum_cumulative_recursive(a, b+1)
      return total
    end
  end
  #case 2: a < b, start from a, and increment recursively
  if a < b
    until a > b
      puts "case 2"
      total = a + sum_cumulative_recursive(a+1, b)
      return total
    end
  end
end
Here are some sample test cases:
puts first.sum_cumulative_recursive(4, 2)
puts first.sum_cumulative_recursive(14, 18)
puts first.sum_cumulative_recursive(-2,-2)
My solution works for cases where a > b, and a < b, but it doesn't work for a == b.
How can I fix this code so that it works?
Thank you for your time.
def sum_cumulative_recursive(a,b)
  return a if a == b
  a, b = [a,b].sort
  a + sum_cumulative_recursive(a + 1, b)
end
EDIT
Here is the most efficient solution I could see from some informal benchmarks:
def sum_cumulative_recursive(a,b)
  return a if a == b
  a, b = b, a if a > b
  a + sum_cumulative_recursive(a + 1, b)
end
Using:
Benchmark.measure { sum_cumulative_recursive(14,139) }
Benchmark for my initial response: 0.005733
Benchmark for @Ajedi32's response: 0.000371
Benchmark for my new response: 0.000115
I was also surprised to see that in some cases, the recursive solution approaches or exceeds the efficiency of the more natural inject solution:
Benchmark.measure { 10.times { (1000..5000).inject(:+) } }
# => 0.010000 0.000000 0.010000 ( 0.027827)
Benchmark.measure { 10.times { sum_cumulative_recursive(1000,5000) } }
# => 0.010000 0.010000 0.020000 ( 0.019441)
Though you run into stack level too deep errors if you take it too far...
I'd do it like this:
def sum_cumulative_recursive(a, b)
  a, b = a.to_i, b.to_i # Only works with ints
  return sum_cumulative_recursive(b, a) if a > b
  return a if a == b
  return a + sum_cumulative_recursive(a+1, b)
end
Here's one way of doing it. I assume this is just an exercise, as the sum of the elements of a range r is of course just (r.first+r.last)*(r.last-r.first+1)/2.
def sum_range(range)
  return nil if range.last < range.first
  case range.size
  when 1 then range.first
  when 2 then range.first + range.last
  else
    range.first + range.last + sum_range(range.first+1..range.last-1)
  end
end
sum_range(14..18) #=> 80
sum_range(14..14) #=> 14
sum_range(14..140) #=> 9779
sum_range(14..139) #=> 9639
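The closed-form expression mentioned in this answer (first plus last, times the number of elements, halved) can be written out and compared with the recursive results above. A minimal sketch; the method name sum_closed_form is hypothetical.

```ruby
# Arithmetic-series sum of all integers between a and b inclusive.
# Arguments may be given in either order.
def sum_closed_form(a, b)
  a, b = b, a if a > b
  (a + b) * (b - a + 1) / 2   # the product is always even, so / is exact
end

sum_closed_form(14, 18)   #=> 80
sum_closed_form(14, 14)   #=> 14
sum_closed_form(14, 140)  #=> 9779
sum_closed_form(4, 2)     #=> 9
```

This runs in constant time and uses no recursion, so it cannot hit a "stack level too deep" error.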
Another solution would be to have a front-end invocation that fixes out-of-order arguments, then a private recursive back-end which does the actual work. I find this is useful to avoid repeated checks of arguments once you've established they're clean.
def sum_cumulative_recursive(a, b)
  a, b = b, a if b < a
  _worker_bee_(a, b)
end

private

def _worker_bee_(a, b)
  a < b ? (a + _worker_bee_(a+1,b-1) + b) : a == b ? a : 0
end
This variant would cut the stack requirement in half by summing from both ends.
If you don't like that approach and/or you really want to trim the stack size:
def sum_cumulative_recursive(a, b)
  if a < b
    mid = (a + b) / 2
    sum_cumulative_recursive(a, mid) + sum_cumulative_recursive(mid+1, b)
  elsif a == b
    a
  else
    sum_cumulative_recursive(b, a)
  end
end
This should keep the stack size to O(log |b-a|).
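To make the stack-depth claim concrete, here is a sketch of the same divide-and-conquer recursion that also reports the deepest level it reached. The method sum_with_depth and its depth parameter are hypothetical additions for illustration only.

```ruby
# Returns [sum, deepest recursion level reached] for the integers a..b.
def sum_with_depth(a, b, depth = 1)
  return sum_with_depth(b, a, depth) if a > b   # fix argument order
  return [a, depth] if a == b                   # single element
  mid = (a + b) / 2
  left_sum,  left_depth  = sum_with_depth(a, mid, depth + 1)
  right_sum, right_depth = sum_with_depth(mid + 1, b, depth + 1)
  [left_sum + right_sum, [left_depth, right_depth].max]
end

sum, depth = sum_with_depth(1, 100_000)
sum    #=> 5000050000
depth  # stays below 20, versus 100000 nested calls for the linear recursion
```

Each call halves the range, so the depth grows with the logarithm of the range size even though the total number of calls is still proportional to it.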
I am a beginner in Ruby. Can anyone help me to write code for this, please?
Given an Array, return the elements that are present exactly once in the array.
For example, it should pass the following test cases:
returns [1,4,5], given [1,2,2,3,3,4,5]
returns [1,3], given [1,2,2,3,4,4]
Put the items in an array, e.g. a = [1,2,2,3,4,4], then run a few filters to get the items you want.
a.group_by { |x| x }.reject { |k,v| v.count > 1 }.keys
#=> [1,3]
Updated With Stefan's keys suggestion.
a = [1,2,2,3,3,4,5]
p a.select{|i| a.count(i) == 1}
# >> [1, 4, 5]
a = [1,2,2,3,4,4]
p a.select{|i| a.count(i) == 1}
# >> [1, 3]
Benchmarks
require 'benchmark'
a = [1,2,2,3,3,4,5]
n = 1000000
Benchmark.bm(15) do |x|
x.report('priti') { n.times { a.select{|i| a.count(i) == 1} } }
x.report('Jason') { n.times { a.group_by { |x| x }.reject { |k,v| v.count > 1 }.keys } }
x.report('rogerdpack2') { n.times {
bad = {}
good = {}
a.each{|v|
if bad.key? v
# do nothing
else
if good.key? v
bad[v] = true
good.delete(v)
else
good[v] = true;
end
end
}
good.keys
}
}
end
with this result
priti 3.152000 0.000000 3.152000 ( 3.247000)
Jason 4.633000 0.000000 4.633000 ( 4.845000)
rogerdpack2 3.853000 0.000000 3.853000 ( 3.886000)
and with a larger array:
require 'benchmark'
a = [1,2,2,3,3,4,5]*5 + [33,34]
n = 1000000
Benchmark.bm(15) do |x|
x.report('priti') { n.times { a.select{|i| a.count(i) == 1} } }
x.report('Jason') { n.times { a.group_by { |x| x }.reject { |k,v| v.count > 1 }.keys } }
x.report('rogerdpack2') { n.times {
bad = {}
good = {}
a.each{|v|
if bad.key? v
# do nothing
else
if good.key? v
bad[v] = true
good.delete(v)
else
good[v] = true;
end
end
}
good.keys
}
}
x.report('priti2') { n.times { a.uniq.select{|i| a.count(i) == 1} }}
end
you get result:
user system total real
priti 60.435000 0.000000 60.435000 ( 60.769151)
Jason 10.827000 0.016000 10.843000 ( 10.978195)
rogerdpack2 9.141000 0.000000 9.141000 ( 9.213843)
priti2 15.897000 0.000000 15.897000 ( 16.007201)
Here's another option:
a = [1,2,2,3,3,4,5]
b = {}
a.each{|v|
  b[v] ||= 0
  b[v] += 1
}
b.select{|k, v| v == 1}.keys
and here's a potentially faster one (though more complex) that is hard coded to look for items "just listed once":
a = [1,2,2,3,3,4,5]
bad = {}
good = {}
a.each{|v|
  if bad.key? v
    # do nothing
  else
    if good.key? v
      bad[v] = true
      good.delete(v)
    else
      good[v] = true
    end
  end
}
good.keys
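The counting approach used in these answers can be written more compactly on newer Rubies. The following sketch assumes Ruby 2.7 or later, where Enumerable#tally is available; the method name singles is hypothetical.

```ruby
# Elements that occur exactly once, in order of first appearance.
# Assumes Ruby >= 2.7 for Enumerable#tally.
def singles(array)
  array.tally.select { |_, count| count == 1 }.keys
end

singles([1, 2, 2, 3, 3, 4, 5])  #=> [1, 4, 5]
singles([1, 2, 2, 3, 4, 4])     #=> [1, 3]
```

Like the hash-based answers, this makes a single counting pass over the array instead of calling count once per element.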
Given an array of [X,Y] pairs:
a=[[1,2],[2,2],[3,2],[4,2],[5,2],[6,2]]
What is the most efficient way to sum all the Y digits for 2<=X<4?
I'd work with this:
a.select{ |x,y| (2...4) === x }.inject(0){ |m, (x,y)| m + y }
=> 4
I don't really like using ... though, because it confuses people by how it works. Here are some equivalent ways of testing:
a.select{ |x,y| (2..3) === x }.inject(0){ |m, (x,y)| m + y }
a.select{ |x,y| (2 <= x) && (x < 4) }.inject(0){ |m, (x,y)| m + y }
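Since the three-dot form is called confusing above, a short sketch of how the two range literals behave may help. The helper names in_half_open? and in_closed? are hypothetical. Note that the two ranges only agree for integer X; that the data is integer-valued is an assumption taken from the example array.

```ruby
# 2...4 excludes its end point; 2..3 includes it.
def in_half_open?(x)
  (2...4) === x
end

def in_closed?(x)
  (2..3) === x
end

(2...4).to_a        #=> [2, 3]
(2..3).to_a         #=> [2, 3]
in_half_open?(4)    #=> false
in_half_open?(3.5)  #=> true
in_closed?(3.5)     #=> false  (the forms differ for non-integers)
```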
Here's some benchmark code:
require 'benchmark'
a = [ [1,2], [2,2], [3,2], [4,2], [5,2], [6,2] ]
n = 1_000_000
Benchmark.bm(12) do |b|
b.report('The Tin Man') { n.times { a.select{ |x,y| (2...4) === x }.inject(0){ |m, (x,y)| m + y } } }
b.report('The Tin Man2') { n.times { a.select{ |x,y| (2 <= x) && (x < 4) }.inject(0){ |m, (x,y)| m + y } } }
b.report('Mik_Die') { n.times { a.select{ |i| (2...4).include? i[0] }.map(&:last).reduce(:+) } }
b.report('Justin Ko') { n.times { a.inject(0){ |sum, coord| (coord[0] >= 2 and coord[0] < 4) ? sum + coord[1] : sum } } }
b.report('Justin Ko2') { n.times { a.inject(0){ |sum, (x,y)| (x >= 2 and x < 4) ? sum + y : sum } } }
b.report('Leo Correa') { n.times { sum = 0; a.each { |x, y| sum += y if x >= 2 and x < 4 } } }
b.report('tokland') { n.times { a.map { |x, y| y if x >= 2 && x < 4 }.compact.inject(0, :+) } }
end
And its output:
user system total real
The Tin Man 4.020000 0.000000 4.020000 ( 4.020154)
The Tin Man2 2.420000 0.000000 2.420000 ( 2.424424)
Mik_Die 3.830000 0.000000 3.830000 ( 3.836531)
Justin Ko 2.070000 0.000000 2.070000 ( 2.072446)
Justin Ko2 2.000000 0.000000 2.000000 ( 2.035079)
Leo Correa 1.260000 0.000000 1.260000 ( 1.259672)
tokland 2.650000 0.010000 2.660000 ( 2.645466)
The lesson learned here is inject is costly.
I would use inject:
a = [[1,2],[2,2],[3,2],[4,2],[5,2],[6,2]]
sum = a.inject(0){ |sum, (x,y)| (x >= 2 and x < 4) ? sum + y : sum }
puts sum
#=> 4
The rdoc describes the inject method well:
inject(initial) {| memo, obj | block } → obj
Combines all elements of enum by applying a binary operation,
specified by a block or a symbol that names a method or operator.
If you specify a block, then for each element in enum the block is
passed an accumulator value (memo) and the element. If you specify a
symbol instead, then each element in the collection will be passed to
the named method of memo. In either case, the result becomes the new
value for memo. At the end of the iteration, the final value of memo
is the return value for the method.
If you do not explicitly specify an initial value for memo, then the
first element of collection is used as the initial value of memo.
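The forms described in the rdoc excerpt can be tried side by side. A small sketch; the helper names seeded_sum and unseeded_sum are hypothetical.

```ruby
# With an explicit initial value, memo starts at 0.
def seeded_sum(values)
  values.inject(0) { |memo, n| memo + n }
end

# Without one, the first element is the initial memo, so an empty
# collection has nothing to start from and the result is nil.
def unseeded_sum(values)
  values.inject(:+)
end

seeded_sum([1, 2, 3])    #=> 6
unseeded_sum([1, 2, 3])  #=> 6
seeded_sum([])           #=> 0
unseeded_sum([])         #=> nil
```

Passing the initial value is therefore the safer choice whenever the collection may be empty.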
Update - Benchmark Array vs Unpacking:
@tokland had suggested unpacking the pairs, which definitely improves readability. The following benchmark was run to see if it was faster than using the array (i.e. my original solution).
require 'benchmark'
a = [ [1,2], [2,2], [3,2], [4,2], [5,2], [6,2] ]
n = 2_000_000
Benchmark.bm(12) do |b|
b.report('array'){n.times{a.inject(0){ |sum, coord| (coord[0] >= 2 and coord[0] < 4) ? sum + coord[1] : sum }}}
b.report('unpacked'){n.times{a.inject(0){ |sum, (x,y)| (x >= 2 and x < 4) ? sum + y : sum }}}
end
Which gave the results
user system total real
array 3.916000 0.000000 3.916000 ( 3.925393)
unpacked 3.619000 0.000000 3.619000 ( 3.616361)
So, in at least this case, unpacking the pairs is better.
I like the inject answer that @JustinKo gave but here's another solution that might be easier to understand if you are new to Ruby.
a=[[1,2],[2,2],[3,2],[4,2],[5,2],[6,2]]
sum = 0
a.each { |x, y| sum += y if x >= 2 and x < 4 }
puts sum
#=> 4
It's clearer in Ruby to use chains of simpler methods. So:
a=[[1,2],[2,2],[3,2],[4,2],[5,2],[6,2]]
a.select{ |i| (2...4).include? i[0] }.map(&:last).reduce(:+)
# => 4
Conceptually what you'd like to use is a list comprehension. Alas, Ruby has no built-in syntax for LCs, but a compact+map does the job just fine:
a.map { |x, y| y if x >= 2 && x < 4 }.compact.inject(0, :+)
#=> 4
If you are writing a medium/large script you'll probably have (and should have) an extensions module. Add the required methods so you can write declarative and concise code:
a.map_select { |x, y| y if x >= 2 && x < 4 }.sum
Or even:
a.sum { |x, y| y if x >= 2 && x < 4 }
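On newer Rubies much of this is available without an extensions module. The sketch below assumes Ruby 2.7 or later, where Enumerable#filter_map plays the role of the map_select shown above, and uses the built-in Array#sum (Ruby 2.4+). With the built-in sum, the block has to return a number for every element, so 0 is returned for pairs outside the range. The method names are hypothetical.

```ruby
# filter_map keeps only the truthy block results (Ruby >= 2.7).
def sum_y_filter_map(pairs)
  pairs.filter_map { |x, y| y if x >= 2 && x < 4 }.sum
end

# Built-in sum with a block: contribute 0 for pairs that do not qualify.
def sum_y_block(pairs)
  pairs.sum { |x, y| (x >= 2 && x < 4) ? y : 0 }
end

a = [[1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [6, 2]]
sum_y_filter_map(a)  #=> 4
sum_y_block(a)       #=> 4
```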