I want to iterate a given array, for example:
["goat", "action", "tear", "impromptu", "tired", "europe"]
I want to look at all possible pairs.
The desired output is a new array containing every pair of words that, combined, contains all five vowels. Each pair should be joined into a single element of the output array:
["action europe", "tear impromptu"]
I tried the following code, but got an error message:
No implicit conversion of nil into string.
def all_vowel_pairs(words)
  pairs = []
  (0..words.length).each do |i| # iterate through words
    (0..words.length).each do |j| # for every word, iterate through words again
      pot_pair = words[i].to_s + words[j] # build string from pair
      if check_for_vowels(pot_pair) # throw string to helper-method.
        pairs << words[i] + " " + words[j] # if gets back true, concatenate and push to output array "pairs"
      end
    end
  end
  pairs
end
# helper-method to check for if a string has all vowels in it
def check_for_vowels(string)
  vowels = "aeiou"
  founds = []
  string.each_char do |char|
    if vowels.include?(char) && !founds.include?(char)
      founds << char
    end
  end
  if founds.length == 5
    return true
  end
  false
end
The following code is intended to provide an efficient way to construct the desired array when the number of words is large. Note that, unlike the other answers, it does not make use of the method Array#combination.
The first part of the section Explanation (below) provides an overview of the approach taken by the algorithm. The details are then filled in.
Code
require 'set'
VOWELS = ["a", "e", "i", "o", "u"]
VOWELS_SET = VOWELS.to_set
def all_vowel_pairs(words)
  h = words.each_with_object({}) {|w,h| (h[(w.chars & VOWELS).to_set] ||= []) << w}
  h.each_with_object([]) do |(k,v),a|
    vowels_needed = VOWELS_SET-k
    h.each do |kk,vv|
      next unless kk.superset?(vowels_needed)
      v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
    end
  end
end
Example
words = ["goat", "action", "tear", "impromptu", "tired", "europe", "hear"]
all_vowel_pairs(words)
#=> ["action europe", "hear impromptu", "impromptu tear"]
Explanation
For the given example the steps are as follows.
VOWELS_SET = VOWELS.to_set
#=> #<Set: {"a", "e", "i", "o", "u"}>
h = words.each_with_object({}) {|w,h| (h[(w.chars & VOWELS).to_set] ||= []) << w}
#=> {#<Set: {"o", "a"}>=>["goat"],
# #<Set: {"a", "i", "o"}>=>["action"],
# #<Set: {"e", "a"}>=>["tear", "hear"],
# #<Set: {"i", "o", "u"}>=>["impromptu"],
# #<Set: {"i", "e"}>=>["tired"],
# #<Set: {"e", "u", "o"}>=>["europe"]}
It is seen that the keys of h are subsets of the five vowels. Each value is an array of the elements of words that contain exactly the vowels given by the key and no others. The values therefore collectively form a partition of words. When the number of words is large one would expect h to have 31 keys (2**5 - 1), or 32 if some word contains no vowels at all.
We now loop through the key-value pairs of h. For each, with key k and value v, the set of missing vowels (vowels_needed) is determined, then we loop through those key-value pairs [kk, vv] of h for which kk is a superset of vowels_needed. All combinations of elements of v and vv are then added to the array being returned (after an adjustment to avoid double-counting each pair of words).
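To make the matching step concrete, here is a small standalone sketch of the set arithmetic for a single word. The helper vowels_in and the constant VOWEL_SET are names made up for this illustration; they are not part of the method above.

```ruby
require 'set'

VOWEL_SET = Set.new(%w[a e i o u])

# Hypothetical helper: the set of vowels that occur in a word.
def vowels_in(word)
  word.chars.to_set & VOWEL_SET
end

k = vowels_in("action")         # vowels of the word being matched
vowels_needed = VOWEL_SET - k   # vowels a partner word must supply

%w[goat tear europe].each do |candidate|
  kk = vowels_in(candidate)
  puts "#{candidate}: #{kk.superset?(vowels_needed)}"
end
# goat: false
# tear: false
# europe: true
```

Only "europe" supplies both "e" and "u", so it is the only candidate that can complete "action".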
Continuing,
enum = h.each_with_object([])
#=> #<Enumerator: {#<Set: {"o", "a"}>=>["goat"],
# #<Set: {"a", "i", "o"}>=>["action"],
# ...
# #<Set: {"e", "u", "o"}>=>["europe"]}:
# each_with_object([])>
The first value is generated by enum and passed to the block, and the block variables are assigned values:
(k,v), a = enum.next
#=> [[#<Set: {"o", "a"}>, ["goat"]], []]
See Enumerator#next.
The individual variables are assigned values by array decomposition:
k #=> #<Set: {"o", "a"}>
v #=> ["goat"]
a #=> []
The block calculations are now performed.
vowels_needed = VOWELS_SET-k
#=> #<Set: {"e", "i", "u"}>
h.each do |kk,vv|
next unless kk.superset?(vowels_needed)
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
end
The word "goat" (v) has vowels "o" and "a", so it can only be matched with words that contain vowels "e", "i" and "u" (and possibly "o" and/or "a"). The expression
next unless kk.superset?(vowels_needed)
skips those keys of h (kk) that are not supersets of vowels_needed. See Set#superset?.
None of the words in words contain "e", "i" and "u" so the array a is unchanged.
The next element is now generated by enum, passed to the block and the block variables are assigned values:
(k,v), a = enum.next
#=> [[#<Set: {"a", "i", "o"}>, ["action"]], []]
k #=> #<Set: {"a", "i", "o"}>
v #=> ["action"]
a #=> []
The block calculation begins:
vowels_needed = VOWELS_SET-k
#=> #<Set: {"e", "u"}>
We see that h has only one key-value pair for which the key is a superset of vowels_needed:
kk = %w|e u o|.to_set
#=> #<Set: {"e", "u", "o"}>
vv = ["europe"]
We therefore execute:
v.each {|w1| vv.each {|w2| a << "%s %s" % [w1, w2] if w1 < w2}}
which adds one element to a:
a #=> ["action europe"]
The clause if w1 < w2 is to ensure that later in the calculations "europe action" is not added to a.
If v (words containing 'a', 'i' and 'o') and vv (words containing 'e', 'u' and 'o') had instead been:
v #=> ["action", "notification"]
vv #=> ["europe", "route"]
we would have added "action europe", "action route" and "notification route" to a. ("europe notification" would be added later, when k #=> #<Set: {"e", "u", "o"}>.)
Benchmark
I benchmarked my method against the others suggested, using @theTinMan's Fruity benchmark code. The only differences were in the array of words to be tested and the addition of my method to the benchmark, which I named cary. For the array of words to be considered I selected 600 words at random from a file of English words on my computer:
words = IO.readlines('/usr/share/dict/words', chomp: true).sample(600)
words.first 10
#=> ["posadaship", "explosively", "expensilation", "conservatively", "plaiting",
# "unpillared", "intertwinement", "nonsolidified", "uraemic", "underspend"]
This array was found to contain 46,436 pairs of words containing all five vowels.
The results were as shown below.
compare {
_viktor { viktor(words) }
_ttm1 { ttm1(words) }
_ttm2 { ttm2(words) }
_ttm3 { ttm3(words) }
_cary { cary(words) }
}
Running each test once. Test will take about 44 seconds.
_cary is faster than _ttm3 by 5x ± 0.1
_ttm3 is faster than _viktor by 50.0% ± 1.0%
_viktor is faster than _ttm2 by 30.000000000000004% ± 1.0%
_ttm2 is faster than _ttm1 by 2.4x ± 0.1
I then compared cary with ttm3 for 1,000 randomly selected words. This array was found to contain 125,068 pairs of words containing all five vowels. That result was as follows:
Running each test once. Test will take about 19 seconds.
_cary is faster than _ttm3 by 3x ± 1.0
To get a feel for the variability of the benchmark I ran this last comparison twice more, each with a new random selection of 1,000 words. That gave me the following results:
Running each test once. Test will take about 17 seconds.
_cary is faster than _ttm3 by 5x ± 1.0
Running each test once. Test will take about 18 seconds.
_cary is faster than _ttm3 by 4x ± 1.0
It is seen that there is considerable variation among the samples.
You said pairs, so I assume you mean combinations of two elements. I generated every two-element combination of the array with the #combination method, then #select-ed only those pairs that contain all vowels once they're joined. Finally, I joined each remaining pair with a space:
["goat", "action", "tear", "impromptu", "tired", "europe"]
.combination(2)
.select { |c| c.join('') =~ /\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b/ }
.map{ |w| w.join(' ') }
#=> ["action europe", "tear impromptu"]
The regex is from "What is the regex to match the words containing all the vowels?".
Starting similarly to Viktor's, I'd use a simple test to see what vowels exist in the words and compare to whether they match "aeiou" after stripping duplicates and sorting them:
def ttm1(ary)
  ary.combination(2).select { |a|
    a.join.scan(/[aeiou]/).uniq.sort.join == 'aeiou'
  }.map { |a| a.join(' ') }
end
ttm1(words) # => ["action europe", "tear impromptu"]
Breaking it down so you can see what's happening.
["goat", "action", "tear", "impromptu", "tired", "europe"] # => ["goat", "action", "tear", "impromptu", "tired", "europe"]
.combination(2)
.select { |a| a # => ["goat", "action"], ["goat", "tear"], ["goat", "impromptu"], ["goat", "tired"], ["goat", "europe"], ["action", "tear"], ["action", "impromptu"], ["action", "tired"], ["action", "europe"], ["tear", "impromptu"], ["tear", "tired"], ["tear", "europe"], ["impromptu", "tired"], ["impromptu", "europe"], ["tired", "europe"]
.join # => "goataction", "goattear", "goatimpromptu", "goattired", "goateurope", "actiontear", "actionimpromptu", "actiontired", "actioneurope", "tearimpromptu", "teartired", "teareurope", "impromptutired", "impromptueurope", "tiredeurope"
.scan(/[aeiou]/) # => ["o", "a", "a", "i", "o"], ["o", "a", "e", "a"], ["o", "a", "i", "o", "u"], ["o", "a", "i", "e"], ["o", "a", "e", "u", "o", "e"], ["a", "i", "o", "e", "a"], ["a", "i", "o", "i", "o", "u"], ["a", "i", "o", "i", "e"], ["a", "i", "o", "e", "u", "o", "e"], ["e", "a", "i", "o", "u"], ["e", "a", "i", "e"], ["e", "a", "e", "u", "o", "e"], ["i", "o", "u", "i", "e"], ["i", "o", "u", "e", "u", "o", "e"], ["i", "e", "e", "u", "o", "e"]
.uniq # => ["o", "a", "i"], ["o", "a", "e"], ["o", "a", "i", "u"], ["o", "a", "i", "e"], ["o", "a", "e", "u"], ["a", "i", "o", "e"], ["a", "i", "o", "u"], ["a", "i", "o", "e"], ["a", "i", "o", "e", "u"], ["e", "a", "i", "o", "u"], ["e", "a", "i"], ["e", "a", "u", "o"], ["i", "o", "u", "e"], ["i", "o", "u", "e"], ["i", "e", "u", "o"]
.sort # => ["a", "i", "o"], ["a", "e", "o"], ["a", "i", "o", "u"], ["a", "e", "i", "o"], ["a", "e", "o", "u"], ["a", "e", "i", "o"], ["a", "i", "o", "u"], ["a", "e", "i", "o"], ["a", "e", "i", "o", "u"], ["a", "e", "i", "o", "u"], ["a", "e", "i"], ["a", "e", "o", "u"], ["e", "i", "o", "u"], ["e", "i", "o", "u"], ["e", "i", "o", "u"]
.join == 'aeiou' # => false, false, false, false, false, false, false, false, true, true, false, false, false, false, false
} # => [["action", "europe"], ["tear", "impromptu"]]
Looking at the code, it was jumping through hoops to find whether all the vowels exist. Every time it checked, it had to step through many methods before determining whether all the vowels were found. In other words, it couldn't short-circuit and fail early; it always ran to the very end, which isn't good.
This code will:
def ttm2(ary)
  ary.combination(2).select { |a|
    str = a.join
    str[/a/] && str[/e/] && str[/i/] && str[/o/] && str[/u/]
  }.map { |a| a.join(' ') }
end
ttm2(words) # => ["action europe", "tear impromptu"]
But I don't like using the regular expression engine this way, as it's slower than doing a direct lookup, which led to:
def ttm3(ary)
  ary.combination(2).select { |a|
    str = a.join
    str['a'] && str['e'] && str['i'] && str['o'] && str['u']
  }.map { |a| a.join(' ') }
end
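The same short-circuiting check can also be written with Array#all?, which stops at the first missing vowel. This is only a sketch: the method name ttm3_all and the constant VOWEL_LIST are made up here, and this variant was not part of the benchmark, so no claim is made about its speed.

```ruby
VOWEL_LIST = %w[a e i o u].freeze

# Hypothetical variant of ttm3: all? returns false as soon as one vowel is missing.
def ttm3_all(ary)
  ary.combination(2).select { |a|
    str = a.join
    VOWEL_LIST.all? { |v| str.include?(v) }
  }.map { |a| a.join(' ') }
end

words = ["goat", "action", "tear", "impromptu", "tired", "europe"]
p ttm3_all(words) # => ["action europe", "tear impromptu"]
```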
Here's the benchmark:
require 'fruity'
words = ["goat", "action", "tear", "impromptu", "tired", "europe"]
def viktor(ary)
ary.combination(2)
.select { |c| c.join('') =~ /\b(?=\w*?a)(?=\w*?e)(?=\w*?i)(?=\w*?o)(?=\w*?u)[a-zA-Z]+\b/ }
.map{ |w| w.join(' ') }
end
viktor(words) # => ["action europe", "tear impromptu"]
def ttm1(ary)
ary.combination(2).select { |a|
a.join.scan(/[aeiou]/).uniq.sort.join == 'aeiou'
}.map { |a| a.join(' ') }
end
ttm1(words) # => ["action europe", "tear impromptu"]
def ttm2(ary)
ary.combination(2).select { |a|
str = a.join
str[/a/] && str[/e/] && str[/i/] && str[/o/] && str[/u/]
}.map { |a| a.join(' ') }
end
ttm2(words) # => ["action europe", "tear impromptu"]
def ttm3(ary)
ary.combination(2).select { |a|
str = a.join
str['a'] && str['e'] && str['i'] && str['o'] && str['u']
}.map { |a| a.join(' ') }
end
ttm3(words) # => ["action europe", "tear impromptu"]
compare {
_viktor { viktor(words) }
_ttm1 { ttm1(words) }
_ttm2 { ttm2(words) }
_ttm3 { ttm3(words) }
}
With the results:
# >> Running each test 256 times. Test will take about 1 second.
# >> _ttm3 is similar to _viktor
# >> _viktor is similar to _ttm2
# >> _ttm2 is faster than _ttm1 by 2x ± 0.1
Now, because this looks so much like a homework assignment, it's important to understand that schools are aware of Stack Overflow, and they look for students asking for help, so you probably don't want to reuse this code, especially not verbatim.
Your code contains two errors, one of which is causing the error message.
(0..words.length) loops from 0 to 6. However, words[6] does not exist (arrays are zero-based, so the last index is 5), so you get nil. Replacing it with (0..words.length-1) in both loops should take care of that.
You will get every correct result twice, once as "action europe" and once as "europe action". This is caused by looping too much, visiting every combination twice. Change the second loop from (0..words.length-1) to (i+1..words.length-1), which also avoids pairing a word with itself.
This cumbersome bookkeeping of indexes is boring and leads to mistakes very often. This is why Ruby programmers often prefer more hassle-free methods (like combination as in other answers), avoiding indexes altogether.
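For reference, here is a sketch of the index-based version with both fixes applied. It uses exclusive ranges (...) instead of subtracting 1, and it starts the inner loop one past i on the assumption that a word should never be paired with itself. The helper all_vowels? is a hypothetical stand-in for the question's check_for_vowels.

```ruby
# Hypothetical helper: true when str contains every vowel.
def all_vowels?(str)
  %w[a e i o u].all? { |v| str.include?(v) }
end

def all_vowel_pairs(words)
  pairs = []
  (0...words.length).each do |i|         # exclusive range: stops at the last valid index
    (i + 1...words.length).each do |j|   # only words after i, so each pair is seen once
      pairs << "#{words[i]} #{words[j]}" if all_vowels?(words[i] + words[j])
    end
  end
  pairs
end

p all_vowel_pairs(["goat", "action", "tear", "impromptu", "tired", "europe"])
# => ["action europe", "tear impromptu"]
```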
I have an array with each letter
alphabet = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
input ="cg?"
Then I create sub-keys:
sub_keys = (2..letters.length).flat_map do |n|
letters.combination(n).map(&:join).map{|n| n = n.length.to_s + n.chars.sort.join }
end.uniq
Which gives me
sub_keys = ["2?c", "2?g", "2cg", "3?cg"]
Now I want to transform my sub_keys to get the expected output, which is
keys = ["2ac", "2bc", "2cc", "2c", "2ag", "2bg", "2cg", "2dg", "2eg", "2fg", "2gg", "2g", "3acg", "3bcg", "3ccg", "3cdg", "3ceg", "3cfg", "3cgg", "3cg"]
As you can see, some keys (2c, 2g, 3cg) stand out from the rest. In my code 2c acts as 2cd, 2ce, 2cf, etc. This is the output I want, and my code for it is here:
keys_with_blanks = sub_keys.select{ |k| k.include? "?" }.map{ |k| k.gsub("?","") }
keys = keys_with_blanks.map do |k|
current_letters = choose_letters(all_letters, k)
k = current_letters.map{ |l| l.gsub(l,l+k) }
end.flatten.map{ |k| k.chars.sort.join }.uniq
keys += keys_with_blanks
To generate fewer keys as the input grows, I decided to choose only the letters I need from the alphabet in each iteration instead of all of them, hence the line current_letters = choose_letters(all_letters, k).
choose_letters:
def choose_letters(letters, key)
  letters = letters.select{ |l| l <= key.chars.last }
end
For example, if I have a key key = "2c", then choose_letters will return ["a","b","c"].
This way I only make the keys I need. Up to about 13 characters of input it's quite fast. The problem starts when the input is 14 or 15 characters long. Then it takes 3 or 4 seconds to make all the keys.
THIS IS THE SLOW PART
keys = keys_with_blanks.map do |k|
current_letters = choose_letters(all_letters, k)
k = current_letters.map{ |l| l.gsub(l,l+k) }
end.flatten.map{ |k| k.chars.sort.join }.uniq
It takes about 3-4 seconds for input = "abcdefghijklmn?"
QUESTION
How can the code be refactored to generate keys faster? My latest improvement was to implement the choose_letters method, so that keys like "2cd", "2ce", "2cf" and so on are not generated, because they work exactly like "2c". Now it generates only the keys I need, so it's a little faster, but can it be faster still?
I am writing a Caesar cipher. I thought the unless statement would work, but it doesn't, with or without then. I then replaced unless with if and put ; in place of then, and now I get: undefined method `>' for nil:NilClass.
def caesar_cipher(input, key)
input.each do |x|
numbers = x.ord + key.to_i unless (numbers > 122) then numbers = x.ord + key - 26
letters = numbers.chr
print letters
end
end
puts "Write the words you want to be ciphered: "
input = gets.chomp.split(//)
puts "Write the key (1 - 26): "
key = gets.chomp
caesar_cipher(input,key)
Here are a couple of Ruby-like ways to write that:
#1
def caesar_cipher(input, key)
  letters = ('a'..'z').to_a
  input.each_char.map { |c| letters.include?(c) ?
    letters[(letters.index(c)+key) % 26] : c }.join
end
caesar_cipher("this is your brown dog", 2)
#=> "vjku ku aqwt dtqyp fqi"
#2
def caesar_cipher(input, key)
  letters = ('a'..'z').to_a
  h = letters.zip(letters.rotate(key)).to_h
  h.default_proc = ->(_,k) { k }
  input.gsub(/./,h)
end
caesar_cipher("this is your brown dog", 2)
#=> "vjku ku aqwt dtqyp fqi"
The hash h constructed in #2 equals:
h = letters.zip(letters.rotate(key)).to_h
#=> {"a"=>"c", "b"=>"d", "c"=>"e", "d"=>"f", "e"=>"g", "f"=>"h",
# ...
# "u"=>"w", "v"=>"x", "w"=>"y", "x"=>"z", "y"=>"a", "z"=>"b"}
h.default_proc = ->(_,k) { k } causes
h[c] #=> c
if c is not a lowercase letter (e.g., a space, capital letter, number, punctuation, etc.)
If you write a branch with condition (if or unless) at the end of a line, after an initial statement, there are two things that apply and affect you:
The condition is assessed before the statement on its left. In your case that means numbers has not been assigned yet so it is nil.
The branch decision is whether or not to run the initial statement, you do not branch to the statement after the then.
You can solve this simply by converting your condition to an if and moving it to a separate line:
def caesar_cipher(input, key)
  input.each do |x|
    numbers = x.ord + key.to_i
    if (numbers > 122)
      # key arrives as a String from gets, so convert it here as well
      numbers = x.ord + key.to_i - 26
    end
    letters = numbers.chr
    print letters
  end
end
There are arguably better ways of coding this cipher in Ruby, but this should solve your immediate problem.
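The evaluation order of a trailing condition can be seen in a small sketch. The helper shift_code is a made-up name for illustration; it applies the same wrap-around rule, but as a modifier on a variable that has already been assigned.

```ruby
# Hypothetical helper: shift a character code and wrap around past "z" (122).
def shift_code(code, key)
  shifted = code + key
  shifted -= 26 if shifted > 122   # the condition is checked first, then the statement runs
  shifted
end

puts shift_code("y".ord, 3).chr   # => b
puts shift_code("a".ord, 3).chr   # => d

# A local variable that first appears on the left of a modifier already exists
# (as nil) while the condition is evaluated, which reproduces the error:
begin
  numbers = 1 unless numbers > 122
rescue NoMethodError => e
  puts e.class   # => NoMethodError
end
```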
There is a more elegant way to loop over repeating sequences in Ruby. Meet Enumerable#cycle.
('a'..'z').cycle.take(50)
# => ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
# "k", "l", "m", "n", "o", "p", "q", "r", "s", "t",
# "u", "v", "w", "x", "y", "z", "a", "b", "c", "d",
# "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
# "o", "p", "q", "r", "s", "t", "u", "v", "w", "x"]
Therefore, translating a single letter given a key can be written as:
('a'..'z').cycle.take(letter.ord + key.to_i - 'a'.ord.pred).last
And the entire method can look prettier:
def caesar_cipher(phrase, key)
  phrase.each_char.map do |letter|
    ('a'..'z').cycle.take(letter.ord + key.to_i - 'a'.ord.pred).last
  end.join
end
puts caesar_cipher('abcxyz', 3) # => defabc
Note that this is slower than the alternative, but it also has the benefit that it's easier to read and the key can be any number.
Let's say you have a string
initial_message = "My dear cousin bill!"
I put this string of N characters into an array of hashes, where each letter is the key and the value is its position in the alphabet (a = 0, b = 1, c = 2, etc.).
hsh_letter_values = Hash[('a'..'z').zip (0..25).to_a] #Map letters to numbers in a hash
clean_message = initial_message.tr('^a-zA-Z0-9','').downcase #remove non-letters
char_map = clean_message.each_char.map { |i| { i => hsh_letter_values[i] } } #map each letter of message to corresponding number
Then I split the char_map into slices of 16.
char_split_map = char_map.each_slice(16).to_a
I want to split each 16 character slice into slices of 4, while keeping the hashes in the same order.
The outcome should look like:
[[[{"m"=>12}, {"y"=>24}, {"d"=>3}, {"e"=>4}], [{"a"=>0}, {"r"=>17}, {"c"=>2}, {"o"=>14}], [{"u"=>20}, {"s"=>18}, {"i"=>8}, {"n"=>13}], [{"b"=>1}, {"i"=>8}, {"l"=>11}, {"l"=>11}]]]
I am planning on adding the values of each letter from each column to get four sums (C1,C2,C3,C4)
So for the first column it would be 12+0+20+1.
This is what I have so far http://repl.it/2cd/1.
Can anyone tell me what I'm doing wrong, or suggest a better way to handle this situation?
One way, starting with the message:
msg = "My dear cousin bill!"
arr = msg.downcase.gsub(/[^a-z]/,'').chars.each_slice(4).to_a
#=> [["m", "y", "d", "e"],
# ["a", "r", "c", "o"],
# ["u", "s", "i", "n"],
# ["b", "i", "l", "l"]]
4.times.map { |i| arr.reduce(0) { |t,a| t + (a[i]||?a).ord-?a.ord } }
#=> [33, 67, 24, 42]
msg = "My dearest cousin bill!"
arr = msg.downcase.gsub(/[^a-z]/,'').chars.each_slice(4).to_a
#=> [["m", "y", "d", "e"],
# ["a", "r", "e", "s"],
# ["t", "c", "o", "u"],
# ["s", "i", "n", "b"],
# ["i", "l", "l"]]
4.times.map { |i| arr.reduce(0) { |t,a| t + (a[i]||?a).ord-?a.ord } }
#=> [57, 62, 45, 43]
I would probably go with a slightly different approach:
initial_message = "My dear cousin bill!"
chars = initial_message.tr('^a-zA-Z0-9','').downcase.chars
char_map = ->(char) { char.ord - 97 }
results = chars.each_slice(4).each_slice(4).map do |array|
  array.transpose.map {|column| column.reduce(0) {|res, letter| res + char_map[letter]} }
end
results.inspect # => "[[33, 67, 24, 42]]"
This does not produce the intermediate step you described in your question, but it is probably a better way to achieve your final result.
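One caveat: Array#transpose raises an IndexError when the rows differ in length, which happens whenever the number of letters is not a multiple of 4. A possible workaround, sketched below, pads short rows with zeros before transposing. The helper column_sums and the zero-padding rule are assumptions made for this example, not something the question specifies.

```ruby
# Hypothetical helper: column sums for one block of up to 16 letters,
# tolerating a short final row.
def column_sums(letters, width = 4)
  rows = letters.each_slice(width).map do |row|
    values = row.map { |ch| ch.ord - 'a'.ord }
    values + [0] * (width - values.length)   # pad short rows with zeros
  end
  rows.transpose.map(&:sum)
end

msg = "My dearest cousin bill!"
chars = msg.downcase.gsub(/[^a-z]/, '').chars
p chars.each_slice(16).map { |block| column_sums(block) }
# => [[49, 51, 34, 43], [8, 11, 11, 0]]
```

Adding the two blocks column by column gives [57, 62, 45, 43], which matches the result shown for the same message in the other answer.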