Why is one approach for backtracking faster than the other? - ruby

So I was working on LeetCode's Word Break II and came up with two backtracking implementations that are similar but differ in how they memoize. One passes the tests, while the other fails with Time Limit Exceeded. Can someone please explain why approach 1 is faster than approach 2?
For context, basically the problem gives me a string and a dictionary. If there are words in the string that are also in the dictionary, separate out the words with a space and place the resulting string (of words) into a resulting array. Dictionary words can be used more than once!
So for example:
s = "pineapplepenapple"
wordDict = ["apple", "pen", "applepen", "pine", "pineapple"]
Output:
[
"pine apple pen apple",
"pineapple pen apple",
"pine applepen apple"
]
Explanation: Note that you are allowed to reuse a dictionary word.
Approach 1 (works!):
require 'set'
def word_break(s, word_dict)
word_dict = Set.new(word_dict)
bt(s, word_dict, 0, {})
end
def bt(s, word_dict, i, mem)
return mem[i] if mem[i]
return [''] if i == s.length
res = []
j = i
while j < s.length
word = s[i..j]
if word_dict.include?(word)
word_breaks = bt(s, word_dict, j + 1, mem)
word_breaks.each do |words|
new_combined_words = word
new_combined_words += ' ' + words if words.length > 0
res << new_combined_words
end
end
j += 1
end
# Memoizing here makes it fast!
mem[i] = res
end
Approach 2 (Not fast enough):
require 'set'
def word_break(s, word_dict)
word_dict = Set.new(word_dict)
bt(s, word_dict, 0, {})
end
def bt(s, word_dict, i, mem)
return mem[i] if mem[i]
return [''] if i== s.length
res = []
j = i
while j < s.length
word = s[i..j]
if word_dict.include?(word)
word_breaks = bt(s, word_dict, j + 1, mem)
word_breaks.each do |words|
new_combined_words = word
new_combined_words += ' ' + words if words.length > 0
# Memoizing here but it's too slow
(mem[i] ||= []) << new_combined_words
res << new_combined_words
end
end
j += 1
end
res
end
In approach 1, I memoize in the end with mem[i] = res while in approach 2, I memoize on the fly as I am generating new combination of words.
Any assistance would be greatly appreciated, thank you!

When the result for index i happens to be an empty array, approach 2 never sets mem[i]. Empty results are not memoized, so that index is recomputed every time it is reached.
What Cary suggested in his comment should fix this. The code is still going to be a bit slower than approach 1, but it will probably pass the tests on leetcode.
UPD: even with that suggested edit, when bt(s, word_dict, j + 1, mem) returns an empty array, we would never memoize mem[i], which makes the code asymptotically exponential. To fix this, try the following.
mem[i] ||= []
if word_dict.include?(word)
word_breaks = bt(s, word_dict, j + 1, mem)
word_breaks.each do |words|
new_combined_words = word
new_combined_words += ' ' + words if words.length > 0
mem[i] << new_combined_words
res << new_combined_words
end
end
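To see why the missing empty results matter, a rough sketch like the following counts how often the recursion is entered on an input that has no valid segmentation. The helper count_calls and its memoize_empty flag are hypothetical names introduced only for this illustration; the two branches are meant to mirror approach 1 (store res at the end) and approach 2 (append to mem[i] on the fly).

```ruby
require 'set'

# Hypothetical helper (not from the original post): returns how many times
# the recursive step is entered for the given string and dictionary.
def count_calls(s, words, memoize_empty:)
  dict = Set.new(words)
  mem = {}
  calls = 0
  bt = lambda do |i|
    calls += 1
    return mem[i] if mem[i]        # an empty array is truthy in Ruby
    return [''] if i == s.length
    res = []
    (i...s.length).each do |j|
      word = s[i..j]
      next unless dict.include?(word)
      bt.call(j + 1).each do |rest|
        combined = rest.empty? ? word : "#{word} #{rest}"
        # Approach 2 style: mem[i] only exists once a result was produced.
        (mem[i] ||= []) << combined unless memoize_empty
        res << combined
      end
    end
    # Approach 1 style: store the result even when it is empty.
    mem[i] = res if memoize_empty
    res
  end
  bt.call(0)
  calls
end

s = 'a' * 12 + 'b'   # the trailing "b" makes every segmentation fail
fast = count_calls(s, %w[a aa], memoize_empty: true)
slow = count_calls(s, %w[a aa], memoize_empty: false)
puts "approach 1 style: #{fast} calls, approach 2 style: #{slow} calls"
```

On inputs like this one every sub-result is empty, so the on-the-fly variant never gets a cache hit and the gap widens quickly as more "a" characters are added.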

This is not an answer, but an extended comment intended to shed light on the problem. I benchmarked the two methods and was unable to reproduce the result that #1 is faster than #2. word_break1 (which calls bt1) is #1; word_break2 (which calls bt2) is #2.
s = "nowisthetimetohavesomefun"
word_dict = %w|now is the time to have some fun no wist he so mefun ist he ti me|
require 'benchmark'
Benchmark.bm do |x|
x.report("#1") { word_break1(s, word_dict) }
x.report("#2") { word_break2(s, word_dict) }
end
The following results were obtained from several runs (an asterisk marks the faster method in each run).
        user     system      total        real
#1  0.000342   0.000083   0.000425 (  0.000312)
#2  0.000264   0.000066   0.000330 (  0.000242)*
#1  0.000315   0.000075   0.000390 (  0.000288)
#2  0.000230   0.000066   0.000296 (  0.000208)*
#1  0.000292   0.000079   0.000371 (  0.000268)
#2  0.000255   0.000065   0.000320 (  0.000253)*
#1  0.000292   0.000090   0.000382 (  0.000261)*
#2  0.000337   0.000121   0.000458 (  0.000349)
#1  0.000301   0.000063   0.000364 (  0.000291)*
#2  0.000413   0.000134   0.000547 (  0.000385)
#1  0.000306   0.000079   0.000385 (  0.000280)
#2  0.000281   0.000082   0.000363 (  0.000255)*
Both methods return the following array.
["no wist he ti me to have so me fun", "no wist he ti me to have so mefun",
"no wist he ti me to have some fun", "no wist he time to have so me fun",
"no wist he time to have so mefun", "no wist he time to have some fun",
"now is the ti me to have so me fun", "now is the ti me to have so mefun",
"now is the ti me to have some fun", "now is the time to have so me fun",
"now is the time to have so mefun", "now is the time to have some fun",
"now ist he ti me to have so me fun", "now ist he ti me to have so mefun",
"now ist he ti me to have some fun", "now ist he time to have so me fun",
"now ist he time to have so mefun", "now ist he time to have some fun"]

Related

How to break a range down into smaller non-overlapping ranges

What is the most beautiful way to break a larger range into smaller non overlapping ranges?
range = 1..375
Desired Output:
1..100
101..200
201..300
301..375
You can use #each_slice in combination with #map:
(1..375).each_slice(100).map { |a,*,b| (a..b) }
#=> [1..100, 101..200, 201..300, 301..375]
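One thing to watch with the destructuring in that one-liner: when the last slice holds a single element (for example with 1..301), b is nil, so the final range does not come out as 301..301. A minimal sketch that avoids both the temporary arrays and that edge case, assuming an inclusive Integer range and using a hypothetical helper name sub_ranges, might look like this:

```ruby
# Hypothetical helper: splits an inclusive Integer range into consecutive
# chunks of at most `size` elements.
def sub_ranges(range, size)
  range.step(size).map do |start|
    # Clamp the upper bound so the last chunk never runs past the range.
    start..[start + size - 1, range.last].min
  end
end

p sub_ranges(1..375, 100)
p sub_ranges(1..301, 100)
```

The second call ends with a one-element chunk, which is the case the slice-based version does not handle.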
The following may not be the most elegant solution but it is designed to be relatively efficient, by avoiding the creation of temporary arrays.
def divide_range(range, sz)
  start = range.begin
  (range.size/sz).times.with_object([]) do |_,arr|
    arr << (start..start+sz-1)
    start += sz
  # <= (not <) so that a final one-element chunk such as 301..301 is kept
  end.tap { |arr| (arr << (start..range.end)) if start <= range.end }
end
divide_range(1..375, 100)
#=> [1..100, 101..200, 201..300, 301..375]
divide_range(1..400, 100)
#=> [1..100, 101..200, 201..300, 301..400]
divide_range(50..420, 50)
#=> [50..99, 100..149, 150..199, 200..249, 250..299, 300..349,
# 350..399, 400..420]
n = 1_000_000_000_000
divide_range(1..n, n/2)
#=> [1..500000000000, 500000000001..1000000000000]
Currently, I'm using the step method, but I don't like having to check the top of the range and do calculations to avoid overlapping:
For example:
range = 1..375
interval = 100
range.step(interval).each do |start|
stop = [range.last, start + (interval - 1)].min
puts "#{start}..#{stop}"
end
I've taken this code and extended Range as well:
class Range
  def in_sub_ranges(interval)
    step(interval).each do |start|
      # Inside the class the receiver is the range itself, so call last directly.
      stop = [last, start + (interval - 1)].min
      yield(start..stop)
    end
  end
end
This allows me to do
range.in_sub_ranges(100) { |sub| puts sub }

Euler 23 in Ruby

All right. I think I have the right idea to find the solution to Euler #23 (The one about finding the sum of all numbers that can't be expressed as the sum of two abundant numbers).
However, it is clear that one of my methods is too damn brutal.
How do you un-brute force this and make it work?
sum_of_two_abunds?(num, array) is the problematic method. I've tried pre-excluding certain numbers and it's still taking forever and I'm not even sure that it's giving the right answer.
def divsum(number)
divsum = 1
(2..Math.sqrt(number)).each {|i| divsum += i + number/i if number % i == 0}
divsum -= Math.sqrt(number) if Math.sqrt(number).integer?
divsum
end
def is_abundant?(num)
return true if divsum(num) > num
return false
end
def get_abundants(uptonum)
abundants = (12..uptonum).select {|int| is_abundant?(int)}
end
def sum_of_two_abunds?(num, array)
#abundant, and can be made from adding two abundant numbers.
array.each do |abun1|
array.each do |abun2|
current = abun1+abun2
break if current > num
return true if current == num
end
end
return false
end
def non_abundant_sum
ceiling = 28123
sum = (1..23).inject(:+) + (24..ceiling).select{|i| i < 945 && i % 2 != 0}.inject(:+)
numeri = (24..ceiling).to_a
numeri.delete_if {|i| i < 945 && i % 2 != 0}
numeri.delete_if {|i| i % 100 == 0}
abundants = get_abundants(ceiling)
numeri.each {|numerus| sum += numerus if sum_of_two_abunds?(numerus, abundants) == false}
return sum
end
start_time = Time.now
puts non_abundant_sum
#Not enough numbers getting excluded from the total.
duration = Time.now - start_time
puts "Took #{duration} s "
Solution 1
A simple way to make it a lot faster is to speed up your sum_of_two_abunds? method:
def sum_of_two_abunds?(num, array)
array.each do |abun1|
array.each do |abun2|
current = abun1+abun2
break if current > num
return true if current == num
end
end
return false
end
Instead of that inner loop, just ask the array whether it contains num - abun1:
def sum_of_two_abunds?(num, array)
array.each do |abun1|
return true if array.include?(num - abun1)
end
false
end
That's already faster than your nested Ruby loops, since it's simpler and the search inside include? runs in C. Also, now that that idea is clear, you can take advantage of the fact that the array is sorted and search for num - abun1 with binary search:
def sum_of_two_abunds?(num, array)
array.each do |abun1|
return true if array.bsearch { |x| num - abun1 <=> x }
end
false
end
And making that Rubyish:
def sum_of_two_abunds?(num, array)
array.any? do |abun1|
array.bsearch { |x| num - abun1 <=> x }
end
end
Now you can get rid of your own special case optimizations and fix your incorrect divsum (which for example claims that divsum(4) is 5 ... you should really compare against a naive implementation that doesn't try any square root optimizations).
And then it should finish in well under a minute (about 11 seconds on my PC).
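For the divsum fix, one possible sketch is shown below. It keeps the square-root idea but adds the paired divisor only when it differs from i, and it is checked against a naive version as suggested. The name naive_divsum is introduced here for illustration, and Integer.sqrt assumes Ruby 2.5 or later.

```ruby
# Sum of the proper divisors of n (assumes n is a positive Integer).
def divsum(n)
  return 0 if n == 1
  sum = 1
  (2..Integer.sqrt(n)).each do |i|
    next unless (n % i).zero?
    sum += i
    pair = n / i
    sum += pair unless pair == i   # count the square root only once
  end
  sum
end

# Deliberately naive reference implementation to compare against.
def naive_divsum(n)
  (1...n).select { |d| (n % d).zero? }.sum
end

mismatches = (1..500).reject { |n| divsum(n) == naive_divsum(n) }
puts "mismatches: #{mismatches.inspect}"
```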
Solution 2
Or you could instead ditch sum_of_two_abunds? entirely and just create all sums of two abundants and nullify their contribution to the sum:
def non_abundant_sum
ceiling = 28123
abundants = get_abundants(ceiling)
numeri = (0..ceiling).to_a
abundants.each { |a| abundants.each { |b| numeri[a + b] = 0 } }
numeri.compact.sum
end
That runs on my PC in about 3 seconds.
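A self-contained variant of the same idea is sketched below. It is not the code from the answer above: it swaps in a sieve to compute the divisor sums, marks reachable sums in a boolean array, and stops the inner loop once a + b exceeds the ceiling. The ceiling parameter is an addition so the function can be tried on small inputs.

```ruby
def non_abundant_sum(ceiling = 28_123)
  # Sieve: add each d to all of its multiples 2d, 3d, ... so that
  # divsum[n] ends up as the sum of the proper divisors of n.
  divsum = Array.new(ceiling + 1, 0)
  (1..ceiling / 2).each do |d|
    (2 * d).step(ceiling, d) { |m| divsum[m] += d }
  end
  abundants = (1..ceiling).select { |n| divsum[n] > n }

  # Mark every number that is a sum of two abundant numbers.
  expressible = Array.new(ceiling + 1, false)
  abundants.each_with_index do |a, i|
    abundants[i..-1].each do |b|
      break if a + b > ceiling     # abundants is sorted, so we can stop early
      expressible[a + b] = true
    end
  end

  (1..ceiling).reject { |n| expressible[n] }.sum
end

puts non_abundant_sum(30)
```

With a ceiling of 30 the only expressible numbers are 24 (12 + 12) and 30 (12 + 18), which makes small cases easy to verify by hand.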

How to write nested if/then statements in Ruby

I'm supposed to
define a method, three_digit_format(n), that accepts an integer, n, as an argument. Assume that n < 1000. Your method should return a string version of n, but with leading zeros such that the string is always 3 characters long.
I have been tinkering with versions of the below code, but I always get errors. Can anyone advise?
def three_digit_format(n)
stringed = n.to_s
stringed.size
if stringed.size > 2
return stringed
end
elsif stringed > 1
return "0" + stringed
end
else
return "00" + stringed
end
end
puts three_digit_format(9)
rjust
You could just use rjust:
n.to_s.rjust(3, '0')
If integer is greater than the length of str, returns a new String of
length integer with str right justified and padded with padstr;
otherwise, returns str.
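As a quick sketch of how that behaves when wrapped in the method from the exercise (the inputs are just sample values):

```ruby
def three_digit_format(n)
  # Pad on the left with '0' until the string is 3 characters long.
  n.to_s.rjust(3, '0')
end

puts three_digit_format(9)
puts three_digit_format(42)
puts three_digit_format(123)
```

Strings that are already three characters or longer come back unchanged, as the documentation quoted above describes.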
Your code
Problem
If you let your text editor indent your code, you'll notice that something is wrong:
def three_digit_format(n)
stringed = n.to_s
stringed.size
if stringed.size > 2
return stringed
end
elsif stringed > 1 # <- elsif shouldn't be here
return "0" + stringed
end
else
return "00" + stringed
end
end
puts three_digit_format(9)
Solution
if, elsif and else belong to the same expression: there should be only one end, at the end of the whole expression, not one per branch. Also, stringed > 1 compares a String with an Integer; it should be stringed.size > 1.
def three_digit_format(n)
stringed = n.to_s
if stringed.size > 2
return stringed
elsif stringed.size > 1
return "0" + stringed
else
return "00" + stringed
end
end
puts three_digit_format(9)
# 009
This function, as some have pointed out, is entirely pointless since there are several built-in ways of doing this. Here's the most concise:
def three_digit_format(n)
'%03d' % n
end
Exercises that force you to re-invent tools just drive me up the wall. That's not what programming is about. Learning to be an effective programmer means knowing when you have a tool at hand that can do the job, when you need to use several tools in conjunction, or when you have no choice but to make your own tool. Too many programmers jump immediately to writing their own tools and overlook more elegant solutions.
If you're committed to that sort of approach due to academic constraints, why not this?
def three_digit_format(n)
v = n.to_s
while (v.length < 3)
v = '0' + v
end
v
end
Or something like this?
def three_digit_format(n)
(n + 1000).to_s[1,3]
end
In that case values 0-999 are rendered as "1000"-"1999", and you just keep the last three characters.
Since these exercises are often absurd, why not take this to the limit of absurdity?
def three_digit_format(n)
loop do
v = Array.new(3) { (rand(10) + '0'.ord).chr }.join('')
return v if (v.to_i == n)
end
end
If you're teaching things about if statements and how to append elsif clauses, it makes sense to present those in a meaningful context, not something contrived like this. For example:
if (customer.exists? and !customer.on_fire?)
puts('Welcome back!')
elsif (!customer.exists?)
puts('You look new here, welcome!')
else
puts('I smell burning.')
end
There are so many situations where a chain of if statements is unavoidable; it's how business logic ends up being implemented. Using them in inappropriate situations is how code ends up ugly and Rubocop or Code Climate gives you a failing grade.
As others have pointed out, rjust and applying a format '%03d' % n are built in ways to do it.
But if you have to stick to what you've learned so far, I wonder if you've been introduced to the case statement?
def three_digit_format(n)
case n
when 0..9
return "00#{n}"
when 10..99
return "0#{n}"
when 100..999
return "#{n}"
end
end
I think it's cleaner than successive if statements.
Here's my spin on it:
def three_digit_format(n)
str = n.to_s
str_len = str.length
retval = if str_len > 2
str
elsif str_len > 1
'0' + str
else
'00' + str
end
retval
end
three_digit_format(1) # => "001"
three_digit_format(12) # => "012"
three_digit_format(123) # => "123"
Which can be reduced to:
def three_digit_format(n)
str = n.to_s
str_len = str.length
if str_len > 2
str
elsif str_len > 1
'0' + str
else
'00' + str
end
end
The way it should be done is by taking advantage of String formats:
'%03d' % 1 # => "001"
'%03d' % 12 # => "012"
'%03d' % 123 # => "123"

What's wrong with my code?

def encrypt(string)
alphabet = ("a".."b").to_a
result = ""
idx = 0
while idx < string.length
character = string[idx]
if character == " "
result += " "
else
n = alphabet.index(character)
n_plus = (n + 1) % alphabet.length
result += alphabet[n_plus]
end
idx += 1
end
return result
end
puts encrypt("abc")
puts encrypt("xyz")
I'm trying to get "abc" to print out "bcd" and "xyz" to print "yza". I want to advance each letter forward by 1. Can someone point me in the right direction?
All I had to do was change your alphabet array to go from a to z, not a to b, and it works fine.
def encrypt(string)
alphabet = ("a".."z").to_a
result = ""
idx = 0
while idx < string.length
character = string[idx]
if character == " "
result += " "
else
n = alphabet.index(character)
n_plus = (n + 1) % alphabet.length
result += alphabet[n_plus]
end
idx += 1
end
return result
end
puts encrypt("abc")
puts encrypt("xyz")
Another way to solve the issue, that I think is simpler, personally, is to use String#tr:
ALPHA = ('a'..'z').to_a.join #=> "abcdefghijklmnopqrstuvwxyz"
BMQIB = ('a'..'z').to_a.rotate(1).join #=> "bcdefghijklmnopqrstuvwxyza"
def encrypt(str)
str.tr(ALPHA,BMQIB)
end
def decrypt(str)
str.tr(BMQIB,ALPHA)
end
encrypt('pizza') #=> "qjaab"
decrypt('qjaab') #=> "pizza"
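String#tr also accepts character ranges, so the same mapping can be written without building the alphabet strings first. This is only a sketch and assumes lowercase ASCII input; any other character, such as a space, is left as it is.

```ruby
def encrypt(str)
  # 'a-z' maps position by position onto 'b-za': a->b, ..., y->z, z->a
  str.tr('a-z', 'b-za')
end

def decrypt(str)
  # The reverse mapping: b->a, ..., z->y, a->z
  str.tr('b-za', 'a-z')
end

puts encrypt('abc')
puts encrypt('xyz')
puts decrypt(encrypt('pizza party'))
```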
Alternatively, if you don't want to use memory storing the alphabet, you could work with character codes and shift the letters arithmetically (this assumes lowercase ASCII letters and spaces only):
def encrypt(string)
  result = ""
  idx = 0
  while idx < string.length
    char = string[idx]
    # Leave spaces alone; shift letters by one, wrapping "z" back to "a".
    result += char == " " ? char : ((char.ord - 'a'.ord + 1) % 26 + 'a'.ord).chr
    idx += 1
  end
  result
end
Another thing that may seem strange about Ruby is that you do not need an explicit return at the end of a method body; the value of the last expression is returned by default. Leaving out the return is considered good style among Ruby folks.
Your question has been answered, so here are a couple of more Ruby-like ways of doing that.
Use String#gsub with a hash
CODE_MAP = ('a'..'z').each_with_object({}) { |c,h| h[c] = c < 'z' ? c.next : 'a' }
#=> {"a"=>"b", "b"=>"c",..., "y"=>"z", "z"=>"a"}
DECODE_MAP = CODE_MAP.invert
#=> {"b"=>"a", "c"=>"b",..., "z"=>"y", "a"=>"z"}
def encrypt(word)
word.gsub(/./, CODE_MAP)
end
def decrypt(word)
word.gsub(/./, DECODE_MAP)
end
encrypt('pizza')
#=> "qjaab"
decrypt('qjaab')
#=> "pizza"
Use String#gsub with Array#rotate
LETTERS = ('a'..'z').to_a
#=> ["a", "b", ..., "z"]
def encrypt(word)
word.gsub(/./) { |c| LETTERS.rotate[LETTERS.index(c)] }
end
def decrypt(word)
word.gsub(/./) { |c| LETTERS.rotate(-1)[LETTERS.index(c)] }
end
encrypt('pizza')
#=> "qjaab"
decrypt('qjaab')
#=> "pizza"

Stack level too deep error in ruby's recursive call

I am trying to implement the quick sort algorithm using ruby. See what I did:
class Array
def quick_sort #line 14
less=[];greater=[]
if self.length<=1
self[0]
else
i=1
while i<self.length
if self[i]<=self[0]
less << self[i]
else
greater << self[i]
end
i=i+1
end
end
less.quick_sort + self[0] + greater.quick_sort #line 29
end
end
[1,3,2,5,4].quick_sort #line 32
This generated the error:
bubble_sort.rb:29:in `quick_sort': stack level too deep (SystemStackError)
from bubble_sort.rb:29:in `quick_sort'
from bubble_sort.rb:32
Why is this happening?
I think the problem in your example was you needed an explicit return.
if self.length<=1
self[0]
should have been
return [] if self == []
and
less.quick_sort + self[0] + greater.quick_sort #line 29
should have been
less.quick_sort + [self[0]] + greater.quick_sort #line 29
Here is a working example
class Array
def quick_sort
return [] if self == []
pivotal = self.shift;
less, greater = [], []
self.each do |x|
if x <= pivotal
less << x
else
greater << x
end
end
return less.quick_sort + [pivotal] + greater.quick_sort
end
end
[1,3,2,5,4].quick_sort # => [1, 2, 3, 4, 5]
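Note that the working example above calls shift on self, so it removes the pivot from the array being sorted. A non-mutating sketch, written here as a standalone method rather than a monkey patch on Array (a choice made for this illustration), could use destructuring and partition:

```ruby
def quick_sort(items)
  # Arrays of length 0 or 1 are already sorted; return a copy.
  return items.dup if items.length <= 1

  pivot, *rest = items                       # leaves items untouched
  less, greater = rest.partition { |x| x <= pivot }
  quick_sort(less) + [pivot] + quick_sort(greater)
end

original = [1, 3, 2, 5, 4]
p quick_sort(original)
p original
```

Because the pivot is wrapped in an array before concatenation and the base case returns early, the recursion always works on strictly smaller arrays and terminates.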
less.quick_sort + self[0] + greater.quick_sort
This line is outside of the if statement, so it gets executed whether self.length<=1 is true or not. Consequently the method recurses infinitely, which causes the stack to overflow.
It should also be pointed out that self[0] does not return an array (unless self is an array of arrays), so it does not make sense to use Array#+ on it. Nor does it make sense as a return value for your quick_sort method.
In that part you should not handle the "=" case. Only < and > should be handled. Therefore your algorithm never stops and causes an infinite recursion.
if self[i]<=self[0]
less << self[i]
else
greater << self[i]
end
