Recursively group strings based on length - ruby

I have
strings = ["aaa", "bb", "ccc", "ddd", "e", "ff", "rrrrrrrr", "tttttttt", "a"]
I want to group the strings in the array so that each element is no longer than 5 and not shorter than 9. The strings have to maintain their order in the array.
EDIT: Sorry for confusion, yes - at least 5 and at most 9.
the outcome I am looking for is:
result = ["aaabbccc", "dddeff", "rrrrrrrr", "tttttttta"]

Since your question was specified in a rather confusing way, this is the best I could come up with.
strings.inject(['']) { |a, s| a.last.size + s.size <= 9 ? a.last << s : a << s ; a }
#=> ["aaabbccc", "dddeff", "rrrrrrrr", "tttttttta"]
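The inject one-liner above appends to some of the original string objects (for example "ddd" becomes "dddeff" inside strings). A minimal sketch of the same greedy grouping that leaves the input untouched might look like the following; group_by_length is a hypothetical helper name, and like the one-liner it only enforces the upper bound of 9, not the lower bound of 5.

```ruby
# Hypothetical helper: greedily pack consecutive strings into groups of at
# most `max` characters, preserving order. The minimum length is NOT enforced.
def group_by_length(strings, max = 9)
  strings.each_with_object([]) do |s, groups|
    if groups.any? && groups.last.size + s.size <= max
      groups.last << s   # extend the current group (a copy, see below)
    else
      groups << s.dup    # start a new group with a copy so the input is not mutated
    end
  end
end

strings = ["aaa", "bb", "ccc", "ddd", "e", "ff", "rrrrrrrr", "tttttttt", "a"]
result = group_by_length(strings)
#=> ["aaabbccc", "dddeff", "rrrrrrrr", "tttttttta"]
```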

Related

Ruby permutations

Simply put, I want to have an input of letters, and output all possible combinations for a set length range.
for example:
length range 1 - 2
input a, b, c
...
output a, b, c, aa, ab, ac, bb, ba, bc, cc, ca, cb
I am trying to make an anagram/spell check solver so that I can 'automate' the NYT's Spelling Bee game. So, I want to input the letters given into my program, get an array of all possible combinations for specific lengths (they have a min word length of 4) and then check that array against an array of all English words. What I have so far is:
letters = ["m","o","r"]
words = []
# Puts all the words into an array
File.open('en_words.txt') do |word|
word.each_line.each do |line|
words << line.strip
end
end
class String
def permutation(&block)
arr = split(//)
arr.permutation { |i| yield i.join }
end
end
letters.join.permutation do |i|
p "#{i}" if words.include?(i)
end
=>"mor"
=>"rom"
My issue with the above code is that it stops at the number of letters I have given it. For example, it will not repeat to return "room" or "moor". So, what I am trying to do is get a more complete list of combinations, and then check those against my word list.
Thank you for your help.
How about going the other way? Checking every word to make sure it only uses the allowed letters?
I tried this with the 3000 most common words and it worked plenty fast.
words = [..]
letters = [ "m", "o", "r" ]
words.each do |word|
all_letters_valid = true
word.chars.each do |char|
unless letters.include?(char)
all_letters_valid = false
break
end
end
if all_letters_valid
puts word
end
end
If letters can repeat there isn't a finite number of permutations so that approach doesn't make sense.
Assumption: English ascii characters only
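The same filtering idea can be written more compactly with select and all?. This is only a sketch: the word list below is a made-up sample standing in for the real dictionary, and the minimum length of 4 is taken from the question.

```ruby
require 'set'

allowed = Set.new(%w[m o r])
# Hypothetical sample; in practice this would be the contents of en_words.txt.
words = %w[moor room roam mom or more]

# Keep words of at least 4 letters whose every character is an allowed letter.
matches = words.select do |word|
  word.size >= 4 && word.each_char.all? { |c| allowed.include?(c) }
end
#=> ["moor", "room"]
```

Using a Set keeps each membership check cheap, which matters more for the word list than for the handful of letters.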
If the goal is not to reimplement combinations for educational purposes:
In the Ruby standard library, the Array class has a combination method.
Here is an example:
letters = ["m","o","r"]
letters.combination(2).to_a
# => [["m", "o"], ["m", "r"], ["o", "r"]]
There is also a permutation method:
letters.permutation(3).to_a
# => [["m", "o", "r"], ["m", "r", "o"], ["o", "m", "r"], ["o", "r", "m"], ["r", "m", "o"], ["r", "o", "m"]]
If the goal is to reimplement these methods, you can use them for validation, for example by comparing the number of elements your method produces with the number produced by the standard library method.
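Neither combination nor permutation reuses a letter, which is what the question asks for. Array#repeated_permutation does allow repeats, so a sketch for the length range 1 to 2 from the question could be:

```ruby
letters = %w[a b c]

# For each length in the range, generate every ordering with repetition allowed.
combos = (1..2).flat_map do |n|
  letters.repeated_permutation(n).map(&:join)
end

combos.size #=> 12  (3 one-letter strings plus 3 * 3 two-letter strings)
```

The count grows as letters.size ** n for each length n, so for longer words filtering the dictionary is likely to be the cheaper direction.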

Count array with condition

So I have a array of characters and I'd like to display all permutations of a given size meeting a certain condition. For instance, if my array contains 'L', 'E' and 'A' and I choose to display all permutations of size 3 that ends with 'L'. There are two possibilities, ["A", "E", "L"] and ["E", "A", "L"].
My problem is: how can I count the number of possibilities and print all the possibilities within the same each? Here's what I have so far:
count = 0
combination_array.select do |item|
count += 1 if item.last == 'L'
puts "#{item} " if item.last == 'L'
end
It works fine, but I have to write the condition twice, and I can't print the count before displaying all the possibilities. I've created a method
def count_occurrences(arr)
counter = 0
arr.each do |item|
counter += 1 if item.last == 'L'
end
counter
end
but I still have to repeat my condition (item.last == 'L'). It doesn't seem very efficient to me.
You could use each_cons (docs) to iterate through each set of 3 items, and count (docs) in block form to have Ruby count for you without constructing a new array:
matches = [["E", "A", "L"], ["A", "E", "L"]]
match_count = data.each_cons(3).count do |set|
  if matches.include?(set)
    puts set.to_s
    true # the block's value; `return` would exit the enclosing method instead of counting
  end
end
If you really dislike the conditional block, you could technically simplify to a one-liner:
stuff_from_above.count do |set|
matches.include?(set) && !(puts set.to_s)
end
This takes advantage of the fact that puts always evaluates to nil.
And if you're feeling extra lazy, you can also write ["A", "E", "L"] as %w[A E L] or "AEL".chars.
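If the count has to be printed before the list, one option is to filter once and reuse the result, so the condition appears in a single place. A minimal sketch, assuming the permutations come from Array#permutation as in the question:

```ruby
letters = %w[L E A]

# State the condition once; everything below reuses the filtered array.
ending_in_l = letters.permutation(3).select { |perm| perm.last == 'L' }

puts "#{ending_in_l.size} possibilities"
ending_in_l.each { |perm| puts perm.inspect }
```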
If you specifically want to display and count permutations that end in "L", and the array arr is known to contain exactly one "L", the most efficient method is to simply generate permutations of the array with "L" removed and then tack "L" onto each permutation:
arr = ['B', 'L', 'E', 'A']
str_at_end = 'L'
ar = arr - [str_at_end]
#=> ["B", "E", "A"]
ar.permutation(2).reduce(0) do |count,a|
p a + [str_at_end]
count += 1
end
#=> 6
displaying:
["B", "E", "L"]
["B", "A", "L"]
["E", "B", "L"]
["E", "A", "L"]
["A", "B", "L"]
["A", "E", "L"]
If you want to do something else as well you need to state specifically what that is.
Note that the number of permutations of the elements of an array of size n is simply n! (n factorial), so if you only need the number of permutations with L at the end you could compute that as factorial(arr.size-1), where factorial is a simple method you would need to write.
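A sketch of such a factorial method, checked against the number of full-length permutations of the array with "L" removed:

```ruby
# n! as a product over 1..n; the initial value 1 makes factorial(0) == 1.
def factorial(n)
  (1..n).reduce(1, :*)
end

arr = ['B', 'L', 'E', 'A']
count = factorial(arr.size - 1)          #=> 6
check = (arr - ['L']).permutation.count  #=> 6
```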

Ruby: How to create a letter counting function

I wouldn't have asked for help without first spending a few hours trying to figure out my error, but I'm at a wall. Here is my attempt, but I'm getting false when I try to pass the argument through and I'm not sure why. I know there are other ways to solve this problem that are a little shorter, but I'm more interested in trying to get my code to work. Any help is much appreciated.
Write a method that takes in a string. Your method should return the most common letter in the string, and a count of how many times it appears.
def most_common_letter(string)
idx1 = 0
idx2 = 0
counter1 = 0
counter2 = 0
while idx1 < string.length
idx2 = 0
while idx2 < string.length
if string[idx1] == string[idx2]
counter1 += 1
end
idx2 += 1
end
if counter1 > counter2
counter2 = counter1
counter1 = 0
var = [string[idx1], counter2]
end
idx1 += 1
end
return var
end
puts("most_common_letter(\"abca\") == [\"a\", 2]: #{most_common_letter("abca") == ["a", 2]}")
puts("most_common_letter(\"abbab\") == [\"b\", 3]: #{most_common_letter("abbab") == ["b", 3]}")
I didn't rewrite your code because I think it's important to point out what is wrong with the existing code that you wrote (especially since you're familiar with it). That said, there are much more 'ruby-like' ways to go about this.
The issue
counter1 is only being reset if you've found a 'new highest'. You need to reset it regardless of whether or not a new highest number has been found:
def most_common_letter(string)
  idx1 = 0
  idx2 = 0
  counter1 = 0
  counter2 = 0
  while idx1 < string.length
    idx2 = 0
    while idx2 < string.length
      if string[idx1] == string[idx2]
        counter1 += 1
      end
      idx2 += 1
    end
    if counter1 > counter2
      counter2 = counter1
      # counter1 = 0 THIS IS THE ISSUE
      var = [string[idx1], counter2]
    end
    counter1 = 0 # this is what needs to be reset each time
    idx1 += 1
  end
  return var
end
Here's what the output is:
stackoverflow master % ruby letter-count.rb
most_common_letter("abca") == ["a", 2]: true
most_common_letter("abbab") == ["b", 3]: true
I think you're aware there are way better ways to do this, but frankly the best way to debug this is with a piece of paper: "OK, counter1 is now 1, idx2 is back to zero", etc. That will help you keep track.
Another bit of advice: counter1 and counter2 are not very good variable names. I didn't realize what you were using them for initially, and that should never be the case. They should be named something like current_count and highest_known_count.
Your question has been answered and @theTinMan has suggested a more Ruby-like way of doing what you want to do. There are many other ways of doing this and you might find it useful to consider a couple more.
Let's use the string:
string = "Three blind mice. Oh! See how they run."
First, you need to answer a couple of questions:
do you want the frequency of letters or characters?
do you want the frequency of lowercase and uppercase letters combined?
I assume you want the frequency of letters only, independent of case.
#1 Count each unique letter
We can deal with the case issue by converting all the letters to lower or upper case, using the method String#upcase or String#downcase:
s1 = string.downcase
#=> "three blind mice. oh! see how they run."
Next we need to get rid of all the characters that are not letters. For that, we can use String#delete [1]:
s2 = s1.delete('^a-z')
#=> "threeblindmiceohseehowtheyrun"
Now we are ready to convert the string s2 to an array of individual characters [2]:
arr = s2.chars
#=> ["t", "h", "r", "e", "e", "b", "l", "i", "n", "d",
# "m", "i", "c", "e", "o", "h", "s", "e", "e", "h",
# "o", "w", "t", "h", "e", "y", "r", "u", "n"]
We can combine these first three steps as follows:
arr = string.downcase.gsub(/[^a-z]/, '').chars
First obtain all the distinct letters present, using Array#uniq.
arr1 = arr.uniq
#=> ["t", "h", "r", "e", "b", "l", "i", "n",
# "d", "m", "c", "o", "s", "w", "y", "u"]
Now convert each of these characters to a two-element array consisting of the letter and its count in arr. Whenever you need to convert elements of a collection to something else, think Enumerable#map (a.k.a. collect). The counting is done with Array#count. We have:
arr2 = arr1.map { |c| [c, arr.count(c)] }
#=> [["t", 2], ["h", 4], ["r", 2], ["e", 6], ["b", 1], ["l", 1],
# ["i", 2], ["n", 2], ["d", 1], ["m", 1], ["c", 1], ["o", 2],
# ["s", 1], ["w", 1], ["y", 1], ["u", 1]]
Lastly, we use Enumerable#max_by to extract the element of arr2 with the largest count [3]:
arr2.max_by(&:last)
#=> ["e", 6]
We can combine the calculation of arr1 and arr2:
arr.uniq.map { |c| [c, arr.count(c)] }.max_by(&:last)
and further replace the receiver arr with the expression obtained earlier (the block still refers to arr, so arr must already be defined):
string.downcase.gsub(/[^a-z]/, '').chars.uniq.map { |c|
[c, arr.count(c)] }.max_by(&:last)
#=> ["e", 6]
String#chars returns a temporary array, upon which the method Array#uniq is invoked. An alternative, which avoids the creation of the temporary array, is to use String#each_char in place of String#chars; it returns an enumerator, upon which Enumerable#uniq is invoked.
The use of Array#count is quite an inefficient way to do the counting because a full pass through arr is made for each unique letter. The methods below are much more efficient.
#2 Use a hash
With this approach we wish to create a hash whose keys are the distinct elements of arr and each value is the count of the associated key. Begin by using the class method Hash::new to create a hash whose values have a default value of zero:
h = Hash.new(0)
#=> {}
We now do the following:
string.each_char { |c| h[c.downcase] += 1 if c =~ /[a-z]/i }
h #=> {"t"=>2, "h"=>4, "r"=>2, "e"=>6, "b"=>1, "l"=>1, "i"=>2, "n"=>2,
# "d"=>1, "m"=>1, "c"=>1, "o"=>2, "s"=>1, "w"=>1, "y"=>1, "u"=>1}
Recall h[c] += 1 is shorthand for:
h[c] = h[c] + 1
If the hash does not already have a key c when the above expression is evaluated, h[c] on the right side is replaced by the default value of zero.
Since the Enumerable module is included in the class Hash we can invoke max_by on h just as we did on the array:
h.max_by(&:last)
#=> ["e", 6]
There is just one more step. Using Enumerator#with_object, the enumerator counterpart of Enumerable#each_with_object, we can shorten this as follows:
string.each_char.with_object(Hash.new(0)) do |c,h|
h[c.downcase] += 1 if c =~ /[a-z]/i
end.max_by(&:last)
#=> ["e", 6]
The argument of with_object is an object we provide (the empty hash with default zero). This is represented by the additional block variable h. The expression
string.each_char.with_object(Hash.new(0)) do |c,h|
h[c.downcase] += 1 if c =~ /[a-z]/i
end
returns h, to which max_by(&:last) is sent.
#3 Use group_by
I will give a slightly modified version of the Tin Man's answer and show how it works with the value of string I have used. It uses the method Enumerable#group_by:
letters = string.downcase.delete('^a-z').each_char.group_by { |c| c }
#=> {"t"=>["t", "t"], "h"=>["h", "h", "h", "h"], "r"=>["r", "r"],
# "e"=>["e", "e", "e", "e", "e", "e"], "b"=>["b"], "l"=>["l"],
# "i"=>["i", "i"], "n"=>["n", "n"], "d"=>["d"], "m"=>["m"],
# "c"=>["c"], "o"=>["o", "o"], "s"=>["s"], "w"=>["w"],
# "y"=>["y"], "u"=>["u"]}
used_most = letters.max_by { |k,v| v.size }
#=> ["e", ["e", "e", "e", "e", "e", "e"]]
used_most[1] = used_most[1].size
used_most
#=> ["e", 6]
In Ruby 2.4 and later you could simplify as follows:
string.downcase.delete('^a-z').each_char.group_by(&:itself).
transform_values(&:size).max_by(&:last)
#=> ["e", 6]
See Enumerable#max_by, Object#itself and Hash#transform_values.
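Assuming Ruby 2.7 or later, Enumerable#tally builds the letter-to-count hash directly, so the group_by and transform_values steps can be collapsed. A short sketch with the same string:

```ruby
string = "Three blind mice. Oh! See how they run."

# tally returns a hash of element => number of occurrences.
most_common = string.downcase.delete('^a-z').each_char.tally.max_by(&:last)
#=> ["e", 6]
```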
1. Alternatively, use String#gsub: s1.gsub(/[^a-z]/, '').
2. s2.split('') could also be used.
3. More or less equivalent to arr2.max_by { |c, count| count }.
It's a problem you'll find asked all over Stack Overflow, a quick search should have returned a number of hits.
Here's how I'd do it:
foo = 'abacab'
letters = foo.chars.group_by{ |c| c }
used_most = letters.sort_by{ |k, v| [v.size, k] }.last
used_most # => ["a", ["a", "a", "a"]]
puts '"%s" was used %d times' % [used_most.first, used_most.last.size]
# >> "a" was used 3 times
Of course, now that this is here, and it's easily found, you can't use it because any teacher worth listening to will also know how to search Stack Overflow and will find this answer.

How to create two separate arrays from one input?

DESCRIPTION:
The purpose of my code is to take in input of a sequence of R's and C's and to simply store each number that comes after the character in its proper array.
For example, the input format is as follows: R1C4R2C5
Column Array: [ 4, 5 ] Row Array: [1,2]
My problem is I am getting the output like this:
[" ", 1]
[" ", 4]
[" ", 2]
[" ", 5]
How do I get all the Row integers following R in one array, and all the Column integers following C in another separate array? I do not want to create multiple arrays, just two.
Help!
CODE:
puts 'Please input: '
input = gets.chomp
word2 = input.scan(/.{1,2}/)
col = []
row = []
word2.each {|a| col.push(a.split(/C/)) if a.include? 'C' }
word2.each {|a| row.push(a.split(/R/)) if a.include? 'R' }
col.each do |num|
puts num.inspect
end
row.each do |num|
puts num.inspect
end
x = "R1C4R2C5"
col = []
row = []
x.chars.each_slice(2) { |u| u[0] == "R" ? row << u[1] : col << u[1] }
p col
p row
The main problem with your code is that you replicate operations for rows and columns. You want to write "DRY" code, which stands for "don't repeat yourself".
Starting with your code as the model, you can DRY it out by writing a method like this to extract the information you want from the input string, and invoke it once for rows and once for columns:
def doit(s, c)
...
end
Here s is the input string and c is the string "R" or "C". Within the method you want to extract substrings that begin with the value of c and are followed by digits. Your decision to use String#scan was a good one, but you need a different regex:
def doit(s, c)
s.scan(/#{c}\d+/)
end
I'll explain the regex, but let's first try the method. Suppose the string is:
s = "R1C4R2C5"
Then
rows = doit(s, "R") #=> ["R1", "R2"]
cols = doit(s, "C") #=> ["C4", "C5"]
This is not quite what you want, but easily fixed. First, though, the regex. The regex first looks for a character #{c}. #{c} transforms the value of the variable c to a literal character, which in this case will be "R" or "C". \d+ means the character #{c} must be followed by one or more digits 0-9, as many as are present before the next non-digit (here a "R" or "C") or the end of the string.
Now let's fix the method:
def doit(s, c)
a = s.scan(/#{c}\d+/)
b = a.map {|str| str[1..-1]}
b.map(&:to_i)
end
rows = doit(s, "R") #=> [1, 2]
cols = doit(s, "C") #=> [4, 5]
Success! As before, a => ["R1", "R2"] if c => "R" and a => ["C4", "C5"] if c => "C". a.map {|str| str[1..-1]} maps each element of a into a string comprised of all characters but the first (e.g., "R12"[1..-1] => "12"), so we have b => ["1", "2"] or b => ["4", "5"]. We then apply map once again to convert those strings to their Integer equivalents (Fixnum in Ruby versions before 2.4). The expression b.map(&:to_i) is shorthand for
b.map {|str| str.to_i}
The last computed quantity is returned by the method, so if it is what you want, as it is here, there is no need for a return statement at the end.
This can be simplified, however, in a couple of ways. Firstly, we can combine the last two statements by dropping the last one and changing the one above to:
a.map {|str| str[1..-1].to_i}
which also gets rid of the local variable b. The second improvement is to "chain" the two remaining statements, which also rids us of the other temporary variable:
def doit(s, c)
s.scan(/#{c}\d+/).map { |str| str[1..-1].to_i }
end
This is typical Ruby code.
Notice that by doing it this way, there is no requirement for row and column references in the string to alternate, and the numeric values can have arbitrary numbers of digits.
Here's another way to do the same thing, that some may see as being more Ruby-like:
s.scan(/[RC]\d+/).each_with_object([[],[]]) {|n,(r,c)|
(n[0]=='R' ? r : c) << n[1..-1].to_i}
Here's what's happening. Suppose:
s = "R1C4R2C5R32R4C7R18C6C12"
Then
a = s.scan(/[RC]\d+/)
#=> ["R1", "C4", "R2", "C5", "R32", "R4", "C7", "R18", "C6", "C12"]
scan uses the regex /[RC]\d+/ to extract substrings that begin with 'R' or 'C' followed by one or more digits up to the next letter or end of the string.
b = a.each_with_object([[],[]]) {|n,(r,c)|(n[0]=='R' ? r : c) << n[1..-1].to_i}
#=> [[1, 2, 32, 4, 18], [4, 5, 7, 6, 12]]
The row values are given by [1, 2, 32, 4, 18]; the column values by [4, 5, 7, 6, 12].
Enumerable#each_with_object (v1.9+) is passed an array comprised of two empty arrays, [[],[]], which it hands to the block along with each element and returns at the end. The first subarray will contain the row values, the second, the column values. These two subarrays are represented by the block variables r and c, respectively.
The first element of a is "R1". This is represented in the block by the variable n. Since
"R1"[0] #=> "R"
"R1"[1..-1] #=> "1"
we execute
r << "1".to_i #=> [1]
so now
[r,c] #=> [[1],[]]
The next element of a is "C4", so we will execute:
c << "4".to_i #=> [4]
so now
[r,c] #=> [[1],[4]]
and so on.
rows, cols = "R1C4R2C5".scan(/R(\d+)C(\d+)/).flatten.partition.with_index {|_, index| index.even? }
> rows
=> ["1", "2"]
> cols
=> ["4", "5"]
Or
rows = "R1C4R2C5".scan(/R(\d+)/).flatten
=> ["1", "2"]
cols = "R1C4R2C5".scan(/C(\d+)/).flatten
=> ["4", "5"]
And to fix your code use:
word2.each {|a| col.push(a.delete('C')) if a.include? 'C' }
word2.each {|a| row.push(a.delete('R')) if a.include? 'R' }
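The variants above return strings, or assume single-digit values that strictly alternate. A sketch that yields integers and drops both assumptions could split the scanned tokens with Enumerable#partition (the variable names here are illustrative only):

```ruby
input = "R1C4R2C5"

# Each token is a letter followed by one or more digits, e.g. "R1" or "C12".
tokens = input.scan(/[RC]\d+/)

# partition returns two arrays: tokens for which the block is true, then the rest.
row_tokens, col_tokens = tokens.partition { |t| t.start_with?('R') }

rows = row_tokens.map { |t| t[1..-1].to_i } #=> [1, 2]
cols = col_tokens.map { |t| t[1..-1].to_i } #=> [4, 5]
```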

What is the Big-O complexity for this "Telephone Words" Algorithm?

This isn't homework, just an interview question I found on the web that looks interesting.
So I took a look at this first: Telephone Words problem -- but it seems to be poorly worded/created some controversy. My question is pretty much the same, except my question is more about the time complexity behind it.
You want to list all the possible words when given a 10-digit phone number as your input. So here is what I have done:
def main(telephone_string)
# 1 and 0 are wrapped in arrays so that .each works for every digit
hsh = {1 => ["1"], 2 => ["a","b","c"], 3 => ["d","e","f"], 4 => ["g","h","i"],
5 => ["j","k","l"], 6 => ["m","n","o"], 7 => ["p","q","r","s"],
8 => ["t","u","v"], 9 => ["w","x","y","z"], 0 => ["0"] }
telephone_array = telephone_string.split("-")
three_number_string = telephone_array[1]
four_number_string = telephone_array[2]
string = ""
result_array = []
hsh[three_number_string[0].to_i].each do |letter|
hsh[three_number_string[1].to_i].each do |second_letter|
string = letter + second_letter
hsh[three_number_string[2].to_i].each do |third_letter|
new_string = string + third_letter
result_array << new_string
end
end
end
second_string = ""
second_result = []
hsh[four_number_string[0].to_i].each do |letter|
hsh[four_number_string[1].to_i].each do |second_letter|
second_string = letter + second_letter
hsh[four_number_string[2].to_i].each do |third_letter|
new_string = second_string + third_letter
hsh[four_number_string[3].to_i].each do |fourth_letter|
last_string = new_string + fourth_letter
second_result << last_string
end
end
end
end
puts result_array.inspect
puts second_result.inspect
end
First off, this is what I hacked together in a few minutes time, no refactoring has been done. So I apologize for the messy code, I just started learning Ruby 6 weeks ago, so please bear with me!
So finally to my question: I was wondering what the time complexity of this method would be. My guess is that it would be O(n^4) because the second loop (for the four letter words) is nested four times. I'm not really positive though. So I would like to know whether that is correct, and if there is a better way to do this problem.
This is actually a constant time algorithm, so O(1) (or to be more explicit, O(4^3 + 4^4))
The reason this is a constant time algorithm is that for each digit in the telephone number, you're iterating through a fixed number (at most 4) of possible letters, that's known beforehand (which is why you can put hsh statically into your method).
One possible optimization would be to stop searching when you know there are no words with the current prefix. For example, if the 3-digit number is "234", you can ignore all strings that start with "bd" (there are some bd- words, like "bdellid", but none that are 3-letters, at least in my /usr/share/dict/words).
From the original phrasing, I would assume the problem is requesting all of the possibilities as output, rather than the number of possibilities.
Unfortunately, if you need to return every combination, there is no way to lower the complexity below that determined by the specified keys.
If it were simply the number, it could be in constant time. However, to print them all out, the end result depends highly on assumptions:
1) Assuming that all of the words you are checking for are composed solely of letters, you only need to check against the eight keys from 2 to 9. If this is incorrect, just sub out 8 in the function below.
2) Assuming the layout of all keys is exactly as set up here (no octothorpes or asterisks), with the contents of the empty arrays taking up no space in the final word.
{
1 => [],
2 => ["a", "b", "c"],
3 => ["d", "e", "f"],
4 => ["g", "h", "i"],
5 => ["j", "k", "l"],
6 => ["m", "n", "o"],
7 => ["p", "q", "r", "s"],
8 => ["t", "u", "v"],
9 => ["w", "x", "y", "z"],
0 => []
}
At each stage, you would simply check the number of possibilities for the next step, and append each possible choice to the end of a string. If you were to do so, the minimum time would be (essentially) constant (0, if the number consisted of all ones and zeros). However, the function would be O(4^n), where n reaches a maximum of 10. The largest possible number of combinations would be 4^10, if the number contained a 7 or a 9 in every position.
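To make the bound concrete, the number of combinations is the product of the letter counts of the digits. A small sketch, where combination_count is a hypothetical helper and digits without letters (1, 0, dashes) are assumed to contribute a factor of 1:

```ruby
# Hypothetical helper: number of letter combinations for a phone number string.
def combination_count(number_string)
  letters_per_digit = { "2" => 3, "3" => 3, "4" => 3, "5" => 3,
                        "6" => 3, "7" => 4, "8" => 3, "9" => 4 }
  # Characters that carry no letters are treated as a factor of 1.
  number_string.chars.map { |d| letters_per_digit.fetch(d, 1) }.reduce(1, :*)
end

combination_count("234")        #=> 27       (3 * 3 * 3)
combination_count("7979797979") #=> 1048576  (4 ** 10)
```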
As for your code, I would recommend a single loop, with a few basic nested loops. Here is the code, in Ruby, although I haven't run it, so there may be syntax errors.
def get_words(number_string)
  hsh = {"2" => ["a", "b", "c"],
         "3" => ["d", "e", "f"],
         "4" => ["g", "h", "i"],
         "5" => ["j", "k", "l"],
         "6" => ["m", "n", "o"],
         "7" => ["p", "q", "r", "s"],
         "8" => ["t", "u", "v"],
         "9" => ["w", "x", "y", "z"]}
  # Keep only the digits that map to letters; "1", "0" and "-" are skipped.
  number_array = number_string.split("").select { |x| hsh.key?(x) }
  return [] if number_array.empty?
  # Start with the letters of the first digit...
  array = hsh[number_array[0]]
  # ...then extend every combination so far by each letter of the next digit.
  number_array[1..-1].each do |digit|
    new_array = []
    array.each do |combo|
      hsh[digit].each do |letter|
        new_array << combo + letter
      end
    end
    array = new_array
  end
  array
end

Resources