How can I improve the performance of this small Ruby function? - ruby

I am currently doing a Ruby challenge and get the error "Terminated due to timeout"
for some test cases where the input string is very long (10,000+ characters).
How can I improve my code?
Ruby challenge description
You are given a string containing characters A and B only. Your task is to change it into a string such that there are no matching adjacent characters. To do this, you are allowed to delete zero or more characters in the string.
Your task is to find the minimum number of required deletions.
For example, given the string s = AABAAB, remove an A at positions 0 and 3 to make s = ABAB in 2 deletions.
My function
def alternatingCharacters(s)
  counter = 0
  s.chars.each_with_index { |char, idx| counter += 1 if s.chars[idx + 1] == char }
  return counter
end
Thank you!

This could be a faster way to return the count:
str.size - str.chars.chunk_while{ |a, b| a == b }.to_a.size
The second part uses String#chars together with Enumerable#chunk_while.
This groups runs of equal adjacent characters into subarrays:
'aababbabbaab'.chars.chunk_while{ |a, b| a == b}.to_a
#=> [["a", "a"], ["b"], ["a"], ["b", "b"], ["a"], ["b", "b"], ["a", "a"], ["b"]]

Trivial if you can use squeeze:
str.length - str.squeeze.length
Otherwise, you could try a regular expression that matches those A (or B) that are preceded by another A (or B):
str.enum_for(:scan, /(?<=A)A|(?<=B)B/).count
Using enum_for avoids the creation of the intermediate array.
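Wrapped into a function, the two suggestions above might look like the following sketch. The snake_case names are assumptions chosen for illustration; the challenge itself expects alternatingCharacters.

```ruby
# Minimal sketch: every character that repeats its left neighbour has to be
# deleted, so the answer is the length lost when runs are collapsed.
def alternating_characters(s)
  s.length - s.squeeze.length
end

# Same count via the lookbehind regexp; enum_for avoids the array that a
# plain scan call would build.
def alternating_characters_scan(s)
  s.enum_for(:scan, /(?<=A)A|(?<=B)B/).count
end

puts alternating_characters('AABAAB')      # 2
puts alternating_characters_scan('AABAAB') # 2
```

Both versions make a single pass over the string, which is what the timeout on long inputs calls for.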

The main issue with:
s.chars.each_with_index { |char, idx| counter += 1 if s.chars[idx + 1] == char }
is that you don't save the characters in a variable. s.chars rips the string apart into a new array of characters. The first s.chars call outside the block is fine. However, there is no reason to do this again for each character in s. It means that for a string of 10,000 characters, you'll instantiate 10,001 arrays of size 10,000.
Re-using the characters array will give you a huge performance boost:
require 'benchmark'

s = ''
options = %w[A B]
10_000.times { s << options.sample }

Benchmark.bm do |x|
  x.report do
    counter = 0
    s.chars.each_with_index { |char, idx| counter += 1 if s.chars[idx + 1] == char }
    # create a character array for each iteration           ^
  end
  x.report do
    counter = 0
    chars = s.chars # <- only create a character array once
    chars.each_with_index { |char, idx| counter += 1 if chars[idx + 1] == char }
  end
end
       user     system      total        real
   8.279767   0.000001   8.279768 (  8.279655)
   0.002188   0.000003   0.002191 (  0.002191)
You could also use enumerator methods like each_cons and count to simplify the code. This costs only slightly more time, but makes the code a lot more readable.
Benchmark.bm do |x|
  x.report do
    counter = 0
    chars = s.chars
    chars.each_with_index { |char, idx| counter += 1 if chars[idx + 1] == char }
  end
  x.report do
    s.each_char.each_cons(2).count { |a, b| a == b }
    #   ^ using each_char instead of chars to avoid
    #     instantiating a character array
  end
end
       user     system      total        real
   0.002923   0.000000   0.002923 (  0.002920)
   0.003995   0.000000   0.003995 (  0.003994)
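As a rough sketch, the each_cons version can be dropped into a function of its own. The name deletions_needed is hypothetical; it is not part of the challenge.

```ruby
# Counts adjacent pairs of equal characters. Each such pair forces exactly
# one deletion, so the pair count is the minimum number of deletions.
def deletions_needed(s)
  s.each_char.each_cons(2).count { |a, b| a == b }
end

puts deletions_needed('AABAAB') # 2
puts deletions_needed('AAABBB') # 4
```

Strings shorter than two characters yield no pairs, so the function returns 0 for them without a special case.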

Related

Codewars: "Return or rotate": why isn't my attempted solution working?

These were the instructions given on Codewars (https://www.codewars.com/kata/56b5afb4ed1f6d5fb0000991/train/ruby):
The input is a string str of digits. Cut the string into chunks (a chunk here is a substring of the initial string) of size sz (ignore the last chunk if its size is less than sz).
If a chunk represents an integer such as the sum of the cubes of its digits is divisible by 2, reverse that chunk; otherwise rotate it to the left by one position. Put together these modified chunks and return the result as a string.
If
sz is <= 0 or if str is empty return ""
sz is greater (>) than the length of str it is impossible to take a chunk of size sz hence return "".
Examples:
revrot("123456987654", 6) --> "234561876549"
revrot("123456987653", 6) --> "234561356789"
revrot("66443875", 4) --> "44668753"
revrot("66443875", 8) --> "64438756"
revrot("664438769", 8) --> "67834466"
revrot("123456779", 8) --> "23456771"
revrot("", 8) --> ""
revrot("123456779", 0) --> ""
revrot("563000655734469485", 4) --> "0365065073456944"
This was my code (in Ruby):
def revrot(str, sz)
  # your code
  if sz > str.length || str.empty? || sz <= 0
    ""
  else
    arr = []
    while str.length >= sz
      arr << str.slice!(0,sz)
    end
    arr.map! do |chunk|
      if chunk.to_i.digits.reduce(0) {|s, n| s + n**3} % 2 == 0
        chunk.reverse
      else
        chunk.chars.rotate.join
      end
    end
    arr.join
  end
end
It passed 13/14 tests, and the error I got back was as follows:
STDERR/runner/frameworks/ruby/cw-2.rb:38:in `expect': Expected: "", instead got: "095131824330999134303813797692546166281332005837243199648332767146500044" (Test::Error)
from /runner/frameworks/ruby/cw-2.rb:115:in `assert_equals'
from main.rb:26:in `testing'
from main.rb:84:in `random_tests'
from main.rb:89:in `<main>'
I'm not sure what I did wrong, I have been trying to find what it could be for over an hour. Could you help me?
I will let someone else identify the problem with your code. I merely wish to show how a solution can be sped up. (I will not include code to deal with edge cases, such as the string being empty.)
You can make use of two observations:
the cube of an integer is odd if and only if the integer is odd; and
the sum of a collection of integers is odd if and only if the number of odd integers is odd.
We therefore can write
def sum_of_cube_odd?(str)
  str.each_char.count { |c| c.to_i.odd? }.odd?
end
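To see that the shortcut agrees with the literal definition, here is a small cross-check. The helper sum_of_cubes_odd_direct? is a hypothetical name introduced only for this comparison; it computes the sum of cubes the slow way.

```ruby
# Parity shortcut: only the number of odd digits matters.
def sum_of_cube_odd?(str)
  str.each_char.count { |c| c.to_i.odd? }.odd?
end

# Literal definition, for comparison only (hypothetical helper).
def sum_of_cubes_odd_direct?(str)
  str.each_char.sum { |c| c.to_i**3 }.odd?
end

# Compare both on every 4-digit chunk, including those with leading zeros.
mismatches = (0..9999).count do |n|
  chunk = n.to_s.rjust(4, '0')
  sum_of_cube_odd?(chunk) != sum_of_cubes_odd_direct?(chunk)
end
puts mismatches # 0
```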
Consider groups of 4 digits in the last example, "563000655734469485".
sum_of_cube_odd? "5630" #=> false (so reverse -> "0365")
sum_of_cube_odd? "0065" #=> true (so rotate -> "0650")
sum_of_cube_odd? "5734" #=> true (so rotate -> "7345")
sum_of_cube_odd? "4694" #=> true (so rotate -> "6944")
so we are to return "0365065073456944".
Let's create another helper.
def rotate_chars_left(str)
  str[1..-1] << str[0]
end
rotate_chars_left "0065" #=> "0650"
rotate_chars_left "5734" #=> "7345"
rotate_chars_left "4694" #=> "6944"
We can now write the main method.
def revrot(str, sz)
  str.gsub(/.{,#{sz}}/) do |s|
    if s.size < sz
      ''
    elsif sum_of_cube_odd?(s)
      rotate_chars_left(s)
    else
      s.reverse
    end
  end
end
revrot("123456987654", 6) #=> "234561876549"
revrot("123456987653", 6) #=> "234561356789"
revrot("66443875", 4) #=> "44668753"
revrot("66443875", 8) #=> "64438756"
revrot("664438769", 8) #=> "67834466"
revrot("123456779", 8) #=> "23456771"
revrot("563000655734469485", 4) #=> "0365065073456944"
It might be slightly faster to write
require 'set'
ODD_DIGITS = ['1', '3', '5', '7', '9'].to_set
#=> #<Set: {"1", "3", "5", "7", "9"}>
def sum_of_cube_odd?(str)
  str.each_char.count { |c| ODD_DIGITS.include?(c) }.odd?
end
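The thread does not say why the random test failed. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that str.slice! in the question's code mutates the caller's string, so a test harness that reuses the same string afterwards sees a shortened input. The sketch below covers the edge cases from the kata text and leaves its argument untouched.

```ruby
# Sketch of a non-mutating revrot, assuming the kata rules quoted above.
def revrot(str, sz)
  return '' if sz <= 0 || str.empty? || sz > str.length

  # scan returns only complete, non-overlapping chunks of sz characters
  # and does not modify str.
  str.scan(/.{#{sz}}/).map do |chunk|
    if chunk.each_char.count { |c| c.to_i.odd? }.even?
      chunk.reverse               # sum of cubes is even
    else
      chunk[1..-1] + chunk[0]     # rotate left by one position
    end
  end.join
end

input = '123456779'.dup
puts revrot(input, 8) # 23456771
puts input            # 123456779 (unchanged)
```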

Optimising code for matching two strings modulo scrambling

I am trying to write a function scramble(str1, str2) that returns true if a portion of str1 characters can be rearranged to match str2, otherwise returns false. Only lower case letters (a-z) will be used. No punctuation or digits will be included. For example:
str1 = 'rkqodlw'; str2 = 'world' should return true.
str1 = 'cedewaraaossoqqyt'; str2 = 'codewars' should return true.
str1 = 'katas'; str2 = 'steak' should return false.
This is my code:
def scramble(s1, s2)
  #sorts strings into arrays
  first = s1.split("").sort
  second = s2.split("").sort
  correctLetters = 0
  for i in 0...first.length
    #check for occurrences of first letter
    occurrencesFirst = first.count(s1[i])
    for j in 0...second.length
      #scan through second string
      occurrencesSecond = second.count(s2[j])
      #if letter to be tested is correct and occurrences of first less than occurrences of second
      #meaning word cannot be formed
      if (s2[j] == s1[i]) && occurrencesFirst < occurrencesSecond
        return false
      elsif s2[j] == s1[i]
        correctLetters += 1
      elsif first.count(s1[s2[j]]) == 0
        return false
      end
    end
  end
  if correctLetters == 0
    return false
  end
  return true
end
I need help optimising this code. Please give me suggestions.
Here is one efficient and Ruby-like way of doing that.
Code
def scramble(str1, str2)
  h1 = char_counts(str1)
  h2 = char_counts(str2)
  h2.all? { |ch, nbr| nbr <= h1[ch] }
end
def char_counts(str)
  str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end
Examples
scramble('abecacdeba', 'abceae')
#=> true
scramble('abecacdeba', 'abweae')
#=> false
Explanation
The three steps are as follows.
str1 = 'abecacdeba'
str2 = 'abceae'
h1 = char_counts(str1)
#=> {"a"=>3, "b"=>2, "e"=>2, "c"=>2, "d"=>1}
h2 = char_counts(str2)
#=> {"a"=>2, "b"=>1, "c"=>1, "e"=>2}
h2.all? { |ch, nbr| nbr <= h1[ch] }
#=> true
The last statement is equivalent to
2 <= 3 && 1 <= 2 && 1 <= 2 && 2 <= 2
The method char_counts constructs what is sometimes called a "counting hash". To understand how char_counts works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(0), then if h does not have a key k, h[k] returns the default value, here 0 (and the hash is not changed).
Suppose, for different data,
h1 = { "a"=>2 }
h2 = { "a"=>1, "b"=>2 }
Then we would find that 1 <= 2 #=> true but 2 <= 0 #=> false, so the method would return false. The second comparison is 2 <= h1["b"]. As h1 does not have a key "b", h1["b"] returns the default value, 0.
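The default-value behaviour described above can be checked directly. This is a small illustrative sketch; the variable name counts is arbitrary.

```ruby
counts = Hash.new(0)                        # missing keys read as 0
'abca'.each_char { |ch| counts[ch] += 1 }   # counts[ch] += 1 stores the key

puts counts == { 'a' => 2, 'b' => 1, 'c' => 1 } # true
puts counts['z']                                # 0
puts counts.key?('z')                           # false
```

Reading counts['z'] returns the default without adding 'z' to the hash, which is why the comparison against h1[ch] is safe for characters that str1 does not contain.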
The method char_counts is effectively a short way of writing the method expressed as follows.
def char_counts(str)
  h = {}
  str.each_char do |ch|
    h[ch] = 0 unless h.key?(ch) # instead of Hash.new(0)
    h[ch] = h[ch] + 1           # instead of h[ch] += 1
  end
  h # no need for this if use `each_with_object`
end
See Enumerable#each_with_object, String#each_char (preferable to String#chars, as the latter produces an unneeded temporary array whereas the former returns an enumerator) and Hash#key? (or Hash#has_key?, Hash#include? or Hash#member?).
An Alternative
def scramble(str1, str2)
  str2.chars.difference(str1.chars).empty?
end
class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end
I have found the method Array#difference to be so useful I proposed it be added to the Ruby Core (here). The response has been, er, underwhelming.
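Since Ruby 2.6 there is a built-in Array#difference that behaves like Array#- (it removes every occurrence), so reopening Array as above would override it. A sketch of the same idea as a standalone function follows; the name multiset_difference is hypothetical and chosen only to avoid that clash.

```ruby
# Removes from a one element for each matching element in b, so repeated
# elements are cancelled one at a time instead of all at once.
def multiset_difference(a, b)
  remaining = b.each_with_object(Hash.new(0)) { |e, h| h[e] += 1 }
  a.reject { |e| remaining[e] > 0 && (remaining[e] -= 1) }
end

def scramble(str1, str2)
  multiset_difference(str2.chars, str1.chars).empty?
end

p multiset_difference([1, 1, 2, 3], [1, 3]) # [1, 2]
p [1, 1, 2, 3] - [1, 3]                     # [2]
puts scramble('rkqodlw', 'world')           # true
puts scramble('katas', 'steak')             # false
```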
One way:
def scramble(s1,s2)
  s2.chars.uniq.all? { |c| s1.count(c) >= s2.count(c) }
end
Another way:
def scramble(s1,s2)
  pool = s1.chars.group_by(&:itself)
  s2.chars.all? { |c| pool[c]&.pop }
end
Yet another:
def scramble(s1,s2)
  ('a'..'z').all? { |c| s1.count(c) >= s2.count(c) }
end
Since this appears to be from codewars, I submitted my first two there. Both got accepted and the first one was a bit faster. Then I was shown solutions of others and saw someone using ('a'..'z') and it's fast, so I include that here.
The codewars "performance tests" aren't shown explicitly but they're all up to about 45000 letters long. So I benchmarked these solutions as well as Cary's (yours was too slow to be included) on shuffles of the alphabet repeated to be about that long (and doing it 100 times):
               user     system      total        real
Stefan 1   0.812000   0.000000   0.812000 (  0.811765)
Stefan 2   2.141000   0.000000   2.141000 (  2.127585)
Other      0.125000   0.000000   0.125000 (  0.122248)
Cary 1     2.562000   0.000000   2.562000 (  2.575366)
Cary 2     3.094000   0.000000   3.094000 (  3.106834)
Moral of the story? String#count is fast here. Like, ridiculously fast. Almost unbelievably fast (I actually had to run extra tests to believe it). It counts through about 1.9 billion letters per second (100 times 26 letters times 2 strings of ~45000 letters, all in 0.12 seconds). Note that the difference to my own first solution is just that I do s2.chars.uniq, and that increases the time from 0.12 seconds to 0.81 seconds. Meaning this double pass through one string takes about six times as long as the 52 passes for counting. The counting is about 150 times faster. I did expect it to be very fast, because it presumably just searches a byte in an array of bytes using C code (edit: looks like it does), but this speed still surprised me.
Code:
require 'benchmark'

def scramble_stefan1(s1,s2)
  s2.chars.uniq.all? { |c| s1.count(c) >= s2.count(c) }
end

def scramble_stefan2(s1,s2)
  pool = s1.chars.group_by(&:itself)
  s2.chars.all? { |c| pool[c]&.pop }
end

def scramble_other(s1,s2)
  ('a'..'z').all? { |c| s1.count(c) >= s2.count(c) }
end

def scramble_cary1(str1, str2)
  h1 = char_counts(str1)
  h2 = char_counts(str2)
  h2.all? { |ch, nbr| nbr <= h1[ch] }
end

def char_counts(str)
  str.each_char.with_object(Hash.new(0)) { |ch, h| h[ch] += 1 }
end

def scramble_cary2(str1, str2)
  str2.chars.difference(str1.chars).empty?
end

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

Benchmark.bmbm do |x|
  n = 100
  s1 = (('a'..'z').to_a * (45000 / 26)).shuffle.join
  s2 = s1.chars.shuffle.join
  x.report('Stefan 1') { n.times { scramble_stefan1(s1, s2) } }
  x.report('Stefan 2') { n.times { scramble_stefan2(s1, s2) } }
  x.report('Other') { n.times { scramble_other(s1, s2) } }
  x.report('Cary 1') { n.times { scramble_cary1(s1, s2) } }
  x.report('Cary 2') { n.times { scramble_cary2(s1, s2) } }
end

How do you count the amount of duplicate characters in a string in Ruby

I have a string
s = "chineedne"
I am trying to create a function that can count the number of duplicate characters in my string or any string.
I tried:
s.each_char.map { |c| c.find.count { |c| s.count(c) > 1 }}
#=> NoMethodError: undefined method `find' for "c":String
Possible solution:
string = "chineedne"
string.chars.uniq.count { |char| string.count(char) > 1 }
#=> 2
or without uniq method to count total amount of duplicated characters:
string = "chineedne"
string.chars.count { |char| string.count(char) > 1 }
#=> 5
To get away from N**2 complexity, you can also use group_by to build a hash that maps each character to an array of all its occurrences in the string, and then use this hash to get whatever data you want:
duplicates = string.chars.group_by { |char| char }.select { |key, value| value.size > 1 }
# or, for Ruby version >= 2.2.1 - string.chars.group_by(&:itself).select { |key, value| value.size > 1 }
then:
> duplicates.keys.size # .keys => ['n', 'e']
#=> 2
and
> duplicates.values.flatten.size # .values.flatten => ["n", "n", "e", "e", "e"]
#=> 5
You can simply count your chars:
chars_frequency = str.each_char
.with_object(Hash.new(0)) {|c, m| m[c]+=1}
=> {"c"=>1, "h"=>1, "i"=>1, "n"=>2, "e"=>3, "d"=>1}
Then just count:
chars_frequency.count { |k, v| v > 1 }
=> 2
Or (if you want to count total amount):
chars_frequency.inject(0) {|r, (k, v)| v > 1 ? r + v : r }
=> 5
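On Ruby 2.7 or later, Enumerable#tally builds the same frequency hash in one call. The sketch below assumes that Ruby version; the variable names are illustrative.

```ruby
s = 'chineedne'
freq = s.each_char.tally  # character => number of occurrences

# Number of distinct characters that occur more than once.
distinct_duplicates = freq.count { |_char, n| n > 1 }

# Total number of characters that belong to a repeated letter.
total_duplicates = freq.each_value.select { |n| n > 1 }.sum

puts distinct_duplicates # 2
puts total_duplicates    # 5
```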
I don't know too much about Ruby, but I think something like this should work:
yourstring = "chineedne"
count = 0
"abcdefghijklmnopqrstuvwxyz".split("").each do |c|
  if yourstring.scan(c).count > 1
    count = count + 1
  end
end
The count variable holds the number of duplicate characters.
https://stackoverflow.com/a/12428037/6371926
string = "chineedne"
# iterate over each char, creating an array
# downcase, in case the comparison should ignore case
arr = string.downcase.chars
# pulling out the uniq char and counting how many times
# they appear in the array more than once
arr.uniq.count {|n| arr.count(n) > 1}

Both tests are returning false even though in my mind the code executes perfectly

# Write a method that takes in a string. Your method should return the
# most common letter in the array, and a count of how many times it
# appears.
#
# Difficulty: medium.
def most_common_letter(string)
  letter = 0
  letter_count = 0
  idx1 = 0
  mostfreq_letter = 0
  largest_letter_count = 0
  while idx1 < string.length
    letter = string[idx1]
    idx2 = 0
    while idx2 < string.length
      if letter == string[idx2]
        letter_count += 1
      end
      idx2 += 1
    end
    if letter_count > largest_letter_count
      largest_letter_count = letter_count
      mostfreq_letter = letter
    end
    idx1 += 1
  end
  return [mostfreq_letter, largest_letter_count]
end
# These are tests to check that your code is working. After writing
# your solution, they should all print true.
puts(
'most_common_letter("abca") == ["a", 2]: ' +
(most_common_letter('abca') == ['a', 2]).to_s
)
puts(
'most_common_letter("abbab") == ["b", 3]: ' +
(most_common_letter('abbab') == ['b', 3]).to_s
)
In my mind, the program should set a letter and then cycle through the string looking for letters that are the same, adding to the letter count for each match. It then checks whether that is the largest letter count so far and, if it is, stores those values for the eventual return value, which should be correct once the while loop ends. However, I keep getting false false. Where am I going wrong?
Your code does not return [false, false] to me; but it does return incorrect results. The hint by samgak should lead you to the bug.
However, for a bit shorter and more Rubyish alternative:
def most_common_letter(string)
  Hash.new(0).tap { |h|
    string.each_char { |c| h[c] += 1 }
  }.max_by { |k, v| v }
end
Create a new Hash that has a default value of 0 for each entry; iterate over characters and count the frequency for each of them in the hash; then find which hash entry is the largest. When a hash is iterated, it produces pairs, just like what you want for your function output, so that's nice, too.
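The hint mentioned above is not reproduced in this thread. One reading of the question's loop, offered as an assumption, is that letter_count is set to 0 only once, so counts from earlier letters carry over. A sketch of the while-loop version with the counter reset per letter follows; the name most_common_letter_loop is hypothetical.

```ruby
def most_common_letter_loop(string)
  most_frequent = nil
  largest_count = 0
  idx1 = 0
  while idx1 < string.length
    letter = string[idx1]
    letter_count = 0 # reset for every candidate letter
    idx2 = 0
    while idx2 < string.length
      letter_count += 1 if letter == string[idx2]
      idx2 += 1
    end
    if letter_count > largest_count
      largest_count = letter_count
      most_frequent = letter
    end
    idx1 += 1
  end
  [most_frequent, largest_count]
end

puts most_common_letter_loop('abca') == ['a', 2]  # true
puts most_common_letter_loop('abbab') == ['b', 3] # true
```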

How do I use a hash to modify the values of an Array?

I am building a base converter. Here is my code so far:
def num_to_s(num, base)
remainders = [num]
while base <= num
num /= base #divide initial value of num
remainders << num #shovel results into array to map over for remainders
end
return remainders.map{|i| result = i % base}.reverse.to_s #map for remainders and shovel to new array
puts num_to_s(40002, 16)
end
Now it's time to account for bases over 10 where letters replace numbers. The instructions (of the exercise) suggest using a hash. Here is my hash:
conversion = {10 => 'A', 11 => 'B', 12 => 'C', 13 => 'D', 14 => 'E', 15 => 'F',}
The problem is now, how do I incorporate it so that it modifies the array? I have tried:
return remainders.map{|i| result = i % base}.map{|i| [i, i]}.flatten.merge(conversion).reverse.to_s
In an attempt to convert the 'remainders' array into a hash and merge them so the values in 'conversion' override the ones in 'remainders', but I get an 'odd list for Hash' error. After some research it seems to be due to the version of Ruby (1.8.7) I am running, and was unable to update. I also tried converting the array into a hash outside of the return:
Hashes = Hash[remainders.each {|i, i| [i, i]}].merge(conversion)
and I get a 'dynamic constant assignment' error. I have tried a bunch of different ways to do this... Can a hash even be used to modify an array? I was also thinking maybe I could accomplish this by using a conditional statement within an enumerator (each? map?) but haven't been able to make that work. CAN one put a conditional inside an enumerator?
Yes, you could use a hash:
def digit_hash(base)
  digit = {}
  (0...[10,base].min).each { |i| digit.update({ i=>i.to_s }) }
  if base > 10
    s = ('A'.ord-1).chr
    (10...base).each { |i| digit.update({ i=>s=s.next }) }
  end
  digit
end
digit_hash(40)
#=> { 0=>"0", 1=>"1", 2=>"2", 3=>"3", 4=>"4",
# 5=>"5", 6=>"6", 7=>"7", 8=>"8", 9=>"9",
# 10=>"A", 11=>"B", 12=>"C", ..., 34=>"Y", 35=>"Z",
# 36=>"AA", 37=>"AB", 38=>"AC", 39=>"AD" }
There is a problem in displaying digits after 'Z'. Suppose, for example, the base were 65. Then one would not know if "ABC" was 10-11-12, 37-12 or 10-64. That's a detail we needn't worry about.
For variety, I've done the base conversion from high to low, as one might do with paper and pencil for base 10:
def num_to_s(num, base)
  digit = digit_hash(base)
  str = ''
  fac = base**((0..Float::INFINITY).find { |i| base**i > num } - 1)
  until fac.zero?
    d = num/fac
    str << digit[d]
    num -= d*fac
    fac /= base
  end
  str
end
Let's try it:
num_to_s(134562,10) #=> "134562"
num_to_s(134562, 2) #=> "100000110110100010"
num_to_s(134562, 8) #=> "406642"
num_to_s(134562,16) #=> "20DA2"
num_to_s(134562,36) #=> "2VTU"
Let's check the last one:
digit_inv = digit_hash(36).invert
digit_inv["2"] #=> 2
digit_inv["V"] #=> 31
digit_inv["T"] #=> 29
digit_inv["U"] #=> 30
So
36*36*36*digit_inv["2"] + 36*36*digit_inv["V"] +
36*digit_inv["T"] + digit_inv["U"]
#=> 36*36*36*2 + 36*36*31 + 36*29 + 30
#=> 134562
The expression:
(0..Float::INFINITY).find { |i| base**i > num }
computes the smallest integer i such that base**i > num. Suppose, for example,
base = 10
num = 12345
then i is found to equal 5 (10**5 = 100_000). We then raise base to this number less one to get the initial factor:
fac = base**(5-1) #=> 10000
Then the first (base-10) digit is
d = num/fac #=> 1
the remainder is
num -= d*fac #=> 12345 - 1*10000 => 2345
and the factor for the next digit is:
fac /= base #=> 10000/10 => 1000
I made a couple of changes from my initial answer to make it 1.8.7-friendly (I removed Enumerator#with_object and Integer#times), but I haven't tested with 1.8.7, as I don't have that version installed. Let me know if there are any problems.
Apart from the question: you can use Fixnum#to_s(base) (Integer#to_s(base) in Ruby 2.4 and later) to convert a number to a string in another base.
255.to_s(16) # 'ff'
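A short sketch of the built-in conversion in both directions, using the number from the earlier examples. Integer#to_s(base) accepts bases 2 through 36 and emits lowercase letters, so upcase is applied here to match the digit hash above.

```ruby
puts 255.to_s(16)            # ff
puts 134562.to_s(16).upcase  # 20DA2
puts 134562.to_s(36).upcase  # 2VTU
puts '2VTU'.to_i(36)         # 134562
```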
I would do a
def get_symbol_in_base(blah)
  if blah < 10
    return blah
  else
    return (blah - 10 + 65).chr
  end
end
and after that do something like:
remainders << get_symbol_in_base(num)
return remainders.reverse.to_s
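To tie the hash back to the original question: a block passed to map can contain any expression, including a conditional or a hash lookup with a fallback. The sketch below is one possible arrangement, not the exercise's reference solution; the constant name CONVERSION and the use of Hash#fetch with a default are choices made for this illustration.

```ruby
CONVERSION = { 10 => 'A', 11 => 'B', 12 => 'C',
               13 => 'D', 14 => 'E', 15 => 'F' }.freeze

def num_to_s(num, base)
  return '0' if num.zero?

  digits = []
  while num > 0
    digits << num % base  # lowest digit first
    num /= base
  end
  # Look each digit up in the hash; digits below 10 fall back to to_s.
  digits.reverse.map { |d| CONVERSION.fetch(d, d.to_s) }.join
end

puts num_to_s(40002, 16) # 9C42
puts num_to_s(255, 2)    # 11111111
```

Joining the mapped digits yields a plain string, whereas calling to_s on the array would give the array's inspect form on modern Ruby.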

Resources