Ruby Anagrams puzzle - ruby

I am a beginner in Ruby. Please, I need directions on how to get the program to return a list containing "inlets".
Question: Given a word and a list of possible anagrams, select the correct sublist.
Given "listen" and a list of candidates like "enlists" "google" "inlets" "banana" the program should return a list containing "inlets".
This is what I have been able to do:
puts 'Enter word'
word_input = gets.chomp
puts 'Enter anagram_list'
potential_anagrams = gets.chomp
potential_anagrams.each do |anagram|
end
This is how the program should behave, assuming my word_input was "hello", but I do not know how to get this working.
is_anagram? word: 'hello', anagrams: ['helo', 'elloh', 'heelo', 'llohe']
# => 'correct anagrams are: elloh, llohe'
Would really appreciate ideas.

As hinted in comments, a string can be easily converted to an array of its characters.
irb(main):005:0> "hello".chars
=> ["h", "e", "l", "l", "o"]
irb(main):006:0> "lleho".chars
=> ["l", "l", "e", "h", "o"]
Arrays can be easily sorted.
irb(main):007:0> ["h", "e", "l", "l", "o"].sort
=> ["e", "h", "l", "l", "o"]
irb(main):008:0> ["l", "l", "e", "h", "o"].sort
=> ["e", "h", "l", "l", "o"]
And arrays can be compared.
irb(main):009:0> ["e", "h", "l", "l", "o"] == ["e", "h", "l", "l", "o"]
=> true
Put all of this together and you should be able to determine if one word is an anagram of another. You can then pair this with #select to find the anagrams in an array. Something like:
def is_anagram?(word, words)
words.select do |w|
...
end
end
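Filling in the block, a minimal sketch might look like the following. The method name select_anagrams, the case-insensitive comparison, and the choice not to count a word as an anagram of itself are assumptions made for illustration; they are not part of the question.

```ruby
# Returns the candidates whose sorted letters match those of `word`.
# Assumption: comparison ignores case, and the word itself is excluded.
def select_anagrams(word, candidates)
  target = word.downcase
  key = target.chars.sort
  candidates.select do |candidate|
    normalized = candidate.downcase
    normalized != target && normalized.chars.sort == key
  end
end

p select_anagrams("listen", ["enlists", "google", "inlets", "banana"])
# => ["inlets"]
p select_anagrams("hello", ["helo", "elloh", "heelo", "llohe"])
# => ["elloh", "llohe"]
```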

I took the tack of creating a class in pursuit of re-usability. This is overkill for a one-off usage, but allows you to build up a set of known words and then poll it as many times as you like for anagrams of multiple candidate words.
This solution is built on hashes and sets, using the sorted characters of a word as the index to a set of words sharing the same letters. Hashing is O(1), Sets are O(1), and if we view words as having a bounded length the calculation of the key is also O(1), yielding an overall complexity of a constant time per word.
I've commented the code, but if anything is unclear feel free to ask.
require 'set'
class AnagramChecker
  def initialize
    # A hash whose default value is an empty set object
    @known_words = Hash.new { |h, k| h[k] = Set.new }
  end
  # Create a key value from a string by breaking it into individual
  # characters, sorting, and rejoining, so all strings which are
  # anagrams of each other will have the same key.
  def key(word)
    word.chars.sort.join
  end
  # Add individual words to the class by generating their key value
  # and adding the word to the set. Using a set guarantees no
  # duplicates of the words, since set contents are unique.
  def add_word(word)
    word_key = key(word)
    @known_words[word_key] << word
    # return the word's key to avoid duplicate work in find_anagrams
    word_key
  end
  def find_anagrams(word)
    word_key = add_word(word) # add the target word to the known_words
    @known_words[word_key].to_a # return all anagrammatic words as an array
  end
  def inspect
    p @known_words
  end
end
Producing a library of known words looks like this:
ac = AnagramChecker.new
["enlists", "google", "inlets", "banana"].each { |word| ac.add_word word }
ac.inspect # {"eilnsst"=>#<Set: {"enlists"}>, "eggloo"=>#<Set: {"google"}>, "eilnst"=>#<Set: {"inlets"}>, "aaabnn"=>#<Set: {"banana"}>}
Using it looks like:
p ac.find_anagrams("listen") # ["inlets", "listen"]
p ac.find_anagrams("google") # ["google"]
If you don't want the target word to be included in the output, adjust find_anagrams accordingly.
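As one possible way to make that adjustment, the sketch below looks a word up without storing it and removes the word itself from the result. It is a standalone, simplified variant of the same idea; the class name AnagramIndex and its constructor argument are hypothetical, chosen only for this illustration.

```ruby
require 'set'

# Hypothetical standalone variant: the looked-up word is neither added
# to the index nor included in the result.
class AnagramIndex
  def initialize(words = [])
    @known_words = Hash.new { |h, k| h[k] = Set.new }
    words.each { |word| add_word(word) }
  end

  def add_word(word)
    @known_words[key(word)] << word
  end

  def find_anagrams(word)
    # fetch with an explicit default bypasses the hash's default block,
    # so an unknown key does not create an empty set in the index.
    @known_words.fetch(key(word), Set.new).to_a.reject { |w| w == word }
  end

  private

  def key(word)
    word.chars.sort.join
  end
end

index = AnagramIndex.new(["enlists", "google", "inlets", "banana"])
p index.find_anagrams("listen") # => ["inlets"]
p index.find_anagrams("google") # => []
```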

Here's how I would do it.
Method
def anagrams(list, word)
ltr_freq = word.each_char.tally
list.select { |w| w.size == word.size && w.each_char.tally == ltr_freq }
end
Example
list = ['helo', 'elloh', 'heelo', 'llohe']
anagrams(list, 'hello')
#=> ["elloh", "llohe"]
Computational complexity
For practical purposes, the computational complexity of computing w.each_char.tally for any word w of length n can be regarded as O(n). That's because hash key lookups are nearly constant-time. It follows that the computational complexity of determining whether a word of length n is an anagram of another word of the same length can be regarded as O(n).
This compares with methods that sort the letters of a word, which have a computational complexity of O(n*log(n)), n being the word length.
Explanation
See Enumerable#tally. Note that w.each_char.tally is not computed when w.size == word.size is false.
Now let's add some puts statements to see what is happening.
def anagrams(list, word)
ltr_freq = word.each_char.tally
puts "ltr_freq = #{ltr_freq}"
list.select do |w|
puts "\nw = '#{w}'"
if w.size != word.size
puts "words differ in length"
false
else
puts "w.each_char.tally = #{w.each_char.tally}"
if w.each_char.tally == ltr_freq
puts "character frequencies are the same"
true
else
puts "character frequencies differ"
false
end
end
end
end
anagrams(list, 'hello')
ltr_freq = {"h"=>1, "e"=>1, "l"=>2, "o"=>1}
w = 'helo'
words differ in length
w = 'elloh'
w.each_char.tally = {"e"=>1, "l"=>2, "o"=>1, "h"=>1}
character frequencies are the same
w = 'heelo'
w.each_char.tally = {"h"=>1, "e"=>2, "l"=>1, "o"=>1}
character frequencies differ
w = 'llohe'
w.each_char.tally = {"l"=>2, "o"=>1, "h"=>1, "e"=>1}
character frequencies are the same
#=>["elloh", "llohe"]
Possible improvement
A potential weakness of the expression
w.each_char.tally == ltr_freq
is that the frequencies of all unique letters in the word w must be determined before a conclusion is reached, even if, for example, the first letter of w does not appear in the word word. We can remedy that as follows.
def anagrams(list, word)
ltr_freq = word.each_char.tally
list.select do |w|
next false unless w.size == word.size
ltr_freqs_match?(w, ltr_freq.dup)
end
end
def ltr_freqs_match?(w, ltr_freq)
w.each_char do |c|
return false unless ltr_freq.key?(c)
ltr_freq[c] -= 1
ltr_freq.delete(c) if ltr_freq[c].zero?
end
true
end
One would have to test this variant against the original version of anagrams above to determine which tends to be faster. This variant has the advantage that it terminates (short-circuits) the comparison as soon as it is found that the cumulative count of a given character in the word w is greater than the total count of the same letter in word. At the same time, tally is written in C, so it may still be faster.
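One way to run such a comparison is with the standard library's Benchmark module. The sketch below is only an illustration: the method names anagrams_tally and anagrams_short_circuit, the test word, and the list sizes are all made up for the example, and the timings will vary by machine and Ruby version, so no results are shown.

```ruby
require 'benchmark'

# Original version: compare complete letter tallies.
def anagrams_tally(list, word)
  ltr_freq = word.each_char.tally
  list.select { |w| w.size == word.size && w.each_char.tally == ltr_freq }
end

# Short-circuiting helper, as in the variant above.
def ltr_freqs_match?(w, ltr_freq)
  w.each_char do |c|
    return false unless ltr_freq.key?(c)
    ltr_freq[c] -= 1
    ltr_freq.delete(c) if ltr_freq[c].zero?
  end
  true
end

def anagrams_short_circuit(list, word)
  ltr_freq = word.each_char.tally
  list.select do |w|
    next false unless w.size == word.size
    ltr_freqs_match?(w, ltr_freq.dup)
  end
end

# Hypothetical test data: shuffles of one word plus same-length non-anagrams.
word = "conversationalist"
list = Array.new(2_000) { word.chars.shuffle.join } +
       Array.new(2_000) { "z" * word.size }

Benchmark.bm(15) do |x|
  x.report("tally")         { 10.times { anagrams_tally(list, word) } }
  x.report("short-circuit") { 10.times { anagrams_short_circuit(list, word) } }
end
```

The mix of anagrams and non-anagrams in the list matters here: short-circuiting only helps on words that fail early, so it is worth trying data that resembles the real input.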

Related

Ruby permutations

Simply put, I want to have an input of letters, and output all possible combinations for a set length range.
for example:
length range 1 - 2
input a, b, c
...
output a, b, c, aa, ab, ac, bb, ba, bc, cc, ca, cb
I am trying to make an anagram/spell check solver so that I can 'automate' the NYT's Spelling Bee game. So, I want to input the letters given into my program, get an array of all possible combinations for specific lengths (they have a min word length of 4) and then check that array against an array of all English words. What I have so far is:
letters = ["m","o","r"]
words = []
# Puts all the words into an array
File.open('en_words.txt') do |word|
word.each_line.each do |line|
words << line.strip
end
end
class String
def permutation(&block)
arr = split(//)
arr.permutation { |i| yield i.join }
end
end
letters.join.permutation do |i|
p "#{i}" if words.include?(i)
end
=>"mor"
=>"rom"
My issue with the above code is that it stops at the number of letters I have given it. For example, it will not repeat letters to return "room" or "moor". So, what I am trying to do is get a more complete list of combinations, and then check those against my word list.
Thank you for your help.
How about going the other way? Checking every word to make sure it only uses the allowed letters?
I tried this with the 3000 most common words and it worked plenty fast.
words = [..]
letters = [ "m", "o", "r" ]
words.each do |word|
all_letters_valid = true
word.chars.each do |char|
unless letters.include?(char)
all_letters_valid = false
break
end
end
if all_letters_valid
puts word
end
end
If letters can repeat there isn't a finite number of permutations so that approach doesn't make sense.
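The same filter can be written more compactly with all?. The sketch below also applies the four-letter minimum mentioned in the question; the inline word list is a made-up stand-in for the contents of en_words.txt.

```ruby
letters = ["m", "o", "r"]
# Hypothetical sample; in practice this would be read from the word file.
words = ["moor", "room", "mom", "motor", "roomy", "more"]

# Keep words that are long enough and use only the allowed letters.
matches = words.select do |word|
  word.length >= 4 && word.each_char.all? { |char| letters.include?(char) }
end

p matches # => ["moor", "room"]
```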
Assumption: English ascii characters only
If the goal is not to reimplement combinations for educational purposes:
In the Ruby standard library, the Array class has a combination method.
Here is an example:
letters = ["m","o","r"]
letters.combination(2).to_a
# => [["m", "o"], ["m", "r"], ["o", "r"]]
You also have a permutation method:
letters.permutation(3).to_a
# => [["m", "o", "r"], ["m", "r", "o"], ["o", "m", "r"], ["o", "r", "m"], ["r", "m", "o"], ["r", "o", "m"]]
If the goal is to reimplement these methods, maybe you can use them as validation, for example by comparing the number of elements returned by your method with the number returned by the standard library method.
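Since the expected output in the question includes repeated letters such as aa, Array#repeated_permutation may be closer to what was asked. A minimal sketch, assuming the length range 1 to 2 and the letters a, b, c from the question's example:

```ruby
letters = ["a", "b", "c"]

# For each length in the range, generate every ordered selection
# of letters, allowing a letter to be used more than once.
combos = (1..2).flat_map do |len|
  letters.repeated_permutation(len).map(&:join)
end

p combos
# => ["a", "b", "c", "aa", "ab", "ac", "ba", "bb", "bc", "ca", "cb", "cc"]
```

Note that the number of results grows as letters.size ** len for each length, so this gets large quickly for seven letters and longer words.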

Print elements of array of arrays of different size in same line in Ruby

Maybe someone could help me with this. I have an array of arrays. The internal arrays have different sizes (from 2 to 4 elements).
letters = [["A", "B"],["C", "D", "F", "G"],["H", "I", "J" ]]
I'm trying to print each array on its own line, with element[0] and element[1] joined as the first column; element[0], element[1], and element[2] joined as the second column; and element[0], element[1], and element[3] joined as the third column. Elements 2 and 3 do not always exist.
The output I'm trying to get is like this:
AB
CD CDF CDG
HI HIJ
I'm doing in this way but I'm getting this error.
letters.map{|x| puts x[0]+x[1] + "," + x[0]+x[1]+x[2] + "," + x[0]+x[1]+x[3]}
TypeError: no implicit conversion of nil into String
from (irb):1915:in "+"
from (irb):1915:in "block in irb_binding"
from (irb):1915:in "map"
from (irb):1915
from /usr/bin/irb:11:in "<main>"
letters.each do |a,b,*rest|
puts rest.each_with_object([a+b]) { |s,arr| arr << arr.first + s }.join(' ')
end
prints
AB
CD CDF CDG
HI HIJ
The steps are as follows.
Suppose
letters = [["C", "D", "F", "G"],["H", "I", "J" ]]
Then
enum0 = letters.each
#=> #<Enumerator: [["C", "D", "F", "G"], ["H", "I", "J"]]:each>
The first element of this enumerator is generated and passed to the block, and the three block variables are assigned values.
a, b, *rest = enum0.next
#=> ["C", "D", "F", "G"]
a
#=> "C"
b
#=> "D"
rest
#=> ["F", "G"]
Next, we obtain
enum1 = rest.each_with_object([a+b])
#=> rest.each_with_object(["CD"])
#=> #<Enumerator: ["F", "G"]:each_with_object(["CD"])>
The first element of this enumerator is generated and passed to the block, and the block variables are assigned values.
s, arr = enum1.next
#=> ["F", ["CD"]]
s
#=> "F"
arr
#=> ["CD"]
The block calculation is now performed.
arr << arr.first + s
#=> arr << "CD" + "F"
#=> ["CD", "CDF"]
The second and last element of enum1 is generated and passed to the block, and block variables are assigned values and the block is computed.
s, arr = enum1.next
#=> ["G", ["CD", "CDF"]]
arr << arr.first + s
#=> ["CD", "CDF", "CDG"]
When an attempt is made to generate another element from enum1, we obtain
enum1.next
#StopIteration: iteration reached an end
Ruby handles the exception by breaking out of the block and returning arr. The elements of arr are then joined:
arr.join(' ')
#=> "CD CDF CDG"
and printed.
The second and last element of enum0 is now generated, passed to the block, and the three block variables are assigned values.
a, b, *rest = enum0.next
#=> ["H", "I", "J"]
a
#=> "H"
b
#=> "I"
rest
#=> ["J"]
The remaining calculations are similar.
Some readers may be unfamiliar with the method Enumerable#each_with_object, which is widely used. Read the doc, but note that here it yields the same result as the code written as follows.
letters.each do |a,b,*rest|
arr = [a+b]
rest.each { |s| arr << arr.first + s }
puts arr.join(' ')
end
By using each_with_object we avoid the need for the statement arr = [a+b] and the statement puts arr.join(' '). The functions of those two statements are of course there in the line using each_with_object, but most Ruby users prefer the flow when chaining each_with_object to join(' '). One other difference is that the value of arr is confined to each_with_object's block, which is good programming practice.
Looks like you want to join the first two letters, then take the cartesian product with the remaining.
letters.each do |arr|
first = arr.take(2).join
rest = arr.drop(2)
puts [first, [first].product(rest).map(&:join)].join(" ")
end
This provides the exact output you specified.
Just out of curiosity, Enumerable#map-only solution.
letters = [["A", "B"],["C", "D", "F", "G"],["H", "I", "J" ]]
letters.map do |f, s, *rest|
rest.unshift(nil).map { |l| [f, s, l].join }.join(' ')
end.each(&method(:puts))
#⇒ AB
# CD CDF CDG
# HI HIJ

How to count the number of different letter between a string in Ruby? [duplicate]

I was trying to find the difference in letters between two strings.
For example, if I put the word ATTGCC and GTTGAC, the difference would be 2 since A and G and C and G are not the same characters.
class DNA
  def initialize(nucleotide)
    @nucleotide = nucleotide
  end
  def length
    @nucleotide.length
  end
  def hamming_distance(other)
    self.nucleotide.chars.zip(other.nucleotide) { |a,b| a == b }.count
  end
  protected
  attr_reader :nucleotide
end
dna1 = DNA.new("ATTGCC")
dna2 = DNA.new("GTTGAC")
puts dna1.hamming_distance(dna2)
The method hamming_distance doesn't really work as it gives a wrong argument type String (must respond to :each) (TypeError)
Assuming the strings are of the same length, you can split them, zip them, and find how many pairs match:
string1 = "RATTY"
string2 = "CATTI"
string1.chars.zip(string2.chars).select { |a,b| a == b }.count
The .chars produces an array of the characters in the string ("RATTY" => ["R", "A", "T", "T", "Y"])
The .zip call merges the two arrays together into an array of pairs, ["R", "A", "T"].zip(["C", "A", "T"]) => [ ["R", "C"], ["A", "A"], ["T", "T"]]
The select filters out the pairs where the values aren't equal
The count returns the number of pairs matched by select
You can find the number of non-matching pairs by negating the selection
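For the Hamming distance asked about in the question, the negated version can be sketched as follows. The standalone method and the length check are assumptions added for illustration; count with a block replaces the select followed by count.

```ruby
# Counts the positions at which two equal-length strings differ.
# Assumption: unequal lengths are treated as an error.
def hamming_distance(a, b)
  raise ArgumentError, "strings must have the same length" unless a.length == b.length
  a.chars.zip(b.chars).count { |x, y| x != y }
end

p hamming_distance("ATTGCC", "GTTGAC") # => 2
p hamming_distance("RATTY", "CATTI")   # => 2
```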

why doesn't this logic work in the initialize method?

def initialize(letters)
  @letters = letters
  @face = letters.sample # letters is an array of all letters from A to Z
  if @face == "Q"
    @face = "Qu"
  end
  @visited = false
  @coord = []
end
When I p my array of dice later, I see that the @face is still "Q"
..., [#<Dice:0x007f907b032948 @letters=["H", "I", "M", "N", "Q", "U"], @face="Q", @visited=false, @coord=[]>, ...
What's going on ?
The sample method selects a random element from the array. I think you'll find that if @letters were to equal ['Q'], your code will work.

What is the Big-O complexity for this "Telephone Words" Algorithm?

This isn't homework, just an interview question I found on the web that looks interesting.
So I took a look at this first: Telephone Words problem -- but it seems to be poorly worded/created some controversy. My question is pretty much the same, except my question is more about the time complexity behind it.
You want to list all the possible words when given a 10-digit phone number as your input. So here is what I have done:
def main(telephone_string)
hsh = {1 => "1", 2 => ["a","b","c"], 3 => ["d","e","f"], 4 => ["g","h","i"],
5 => ["j","k","l"], 6 => ["m","n","o"], 7 => ["p","q","r","s"],
8 => ["t","u","v"], 9 => ["w","x","y","z"], 0 => "0" }
telephone_array = telephone_string.split("-")
three_number_string = telephone_array[1]
four_number_string = telephone_array[2]
string = ""
result_array = []
hsh[three_number_string[0].to_i].each do |letter|
hsh[three_number_string[1].to_i].each do |second_letter|
string = letter + second_letter
hsh[three_number_string[2].to_i].each do |third_letter|
new_string = string + third_letter
result_array << new_string
end
end
end
second_string = ""
second_result = []
hsh[four_number_string[0].to_i].each do |letter|
hsh[four_number_string[1].to_i].each do |second_letter|
second_string = letter + second_letter
hsh[four_number_string[2].to_i].each do |third_letter|
new_string = second_string + third_letter
hsh[four_number_string[3].to_i].each do |fourth_letter|
last_string = new_string + fourth_letter
second_result << last_string
end
end
end
end
puts result_array.inspect
puts second_result.inspect
end
First off, this is what I hacked together in a few minutes time, no refactoring has been done. So I apologize for the messy code, I just started learning Ruby 6 weeks ago, so please bear with me!
So finally to my question: I was wondering what the time complexity of this method would be. My guess is that it would be O(n^4) because the second loop (for the four letter words) is nested four times. I'm not really positive though. So I would like to know whether that is correct, and if there is a better way to do this problem.
This is actually a constant time algorithm, so O(1) (or to be more explicit, O(4^3 + 4^4))
The reason this is a constant time algorithm is that for each digit in the telephone number, you're iterating through a fixed number (at most 4) of possible letters, that's known beforehand (which is why you can put hsh statically into your method).
One possible optimization would be to stop searching when you know there are no words with the current prefix. For example, if the 3-digit number is "234", you can ignore all strings that start with "bd" (there are some bd- words, like "bdellid", but none that are 3-letters, at least in my /usr/share/dict/words).
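A rough sketch of that pruning idea follows. It assumes the input is a plain string of digits with no dashes, and it uses a tiny made-up word set in place of a real dictionary file; the names KEYS, WORDS, PREFIXES, and dictionary_words are hypothetical.

```ruby
require 'set'

KEYS = {
  "2" => %w[a b c], "3" => %w[d e f], "4" => %w[g h i], "5" => %w[j k l],
  "6" => %w[m n o], "7" => %w[p q r s], "8" => %w[t u v], "9" => %w[w x y z]
}.freeze

# Hypothetical tiny dictionary standing in for a real word list.
WORDS = Set.new(%w[act bat cat cab dog]).freeze

# Every leading substring of every word, so a prefix check is a set lookup.
PREFIXES = WORDS.each_with_object(Set.new) do |word, set|
  (1..word.length).each { |n| set << word[0, n] }
end.freeze

# Builds candidate strings one letter at a time and abandons a branch
# as soon as no dictionary word starts with the current prefix.
def dictionary_words(digits, prefix = "")
  if prefix.length == digits.length
    return WORDS.include?(prefix) ? [prefix] : []
  end

  KEYS.fetch(digits[prefix.length], []).flat_map do |letter|
    candidate = prefix + letter
    next [] unless PREFIXES.include?(candidate) # prune this branch
    dictionary_words(digits, candidate)
  end
end

p dictionary_words("228") # => ["act", "bat", "cat"]
p dictionary_words("234") # => []
```

With a real dictionary, most branches die after two or three letters, so far fewer than the full set of letter strings is ever built.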
From the original phrasing, I would assume that it is requesting all of the possibilities, instead of the number of possibilities as output.
Unfortunately, if you need to return every combination, there is no way to lower the complexity below that determined by the specified keys.
If it were simply the number, it could be in constant time. However, to print them all out, the end result depends highly on assumptions:
1) Assuming that all of the words you are checking for are composed solely of letters, you only need to check against the eight keys from 2 to 9. If this is incorrect, just sub out 8 in the function below.
2) Assuming the layout of all keys is exactly as set up here (no octothorpes or asterisks), with the contents of the empty arrays taking up no space in the final word.
{
1 => [],
2 => ["a", "b", "c"],
3 => ["d", "e", "f"],
4 => ["g", "h", "i"],
5 => ["j", "k", "l"],
6 => ["m", "n", "o"],
7 => ["p", "q", "r", "s"],
8 => ["t", "u", "v"],
9 => ["w", "x", "y", "z"],
0 => []
}
At each stage, you would simply check the number of possibilities for the next step, and append each possible choice to the end of a string. If you were to do so, the minimum time would be (essentially) constant time (0, if the number consisted of all ones and zeros). However, the function would be O(4^n), where n reaches a maximum at 10. The largest possible number of combinations would be 4^10, if they hit 7 or 9 each time.
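To make the arithmetic concrete, the number of letter strings is the product of the letter counts of the digits. A small sketch, assuming the key layout above; KEY_SIZES and combination_count are names invented for this example:

```ruby
# Letters per key, as in the layout above; 1 and 0 contribute no letters.
KEY_SIZES = { "2" => 3, "3" => 3, "4" => 3, "5" => 3,
              "6" => 3, "7" => 4, "8" => 3, "9" => 4 }.freeze

# Number of letter strings for a digit string: the product of the
# letter counts of its digits. Characters without letters count as 1.
def combination_count(digits)
  digits.each_char.reduce(1) { |total, d| total * KEY_SIZES.fetch(d, 1) }
end

p combination_count("7" * 10)     # => 1048576 (4^10, the worst case)
p combination_count("234")        # => 27
p combination_count("1100110011") # => 1 (only the empty string)
```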
As for your code, I would recommend a single loop, with a few basic nested loops. Here is the code, in Ruby, although I haven't run it, so there may be syntax errors.
def get_words(number_string)
  hsh = {"2" => ["a", "b", "c"],
         "3" => ["d", "e", "f"],
         "4" => ["g", "h", "i"],
         "5" => ["j", "k", "l"],
         "6" => ["m", "n", "o"],
         "7" => ["p", "q", "r", "s"],
         "8" => ["t", "u", "v"],
         "9" => ["w", "x", "y", "z"]}
  # Keep only the digits that map to letters (2 through 9).
  number_array = number_string.split("").select { |x| hsh.key?(x) }
  return [] if number_array.empty?
  # Start with the letters for the first digit.
  array = hsh[number_array[0]]
  # Extend every partial combination with each letter of the next digit.
  number_array[1..-1].each do |digit|
    new_array = Array.new()
    array.each do |combo|
      hsh[digit].each do |new|
        new_array = new_array + [combo + new]
      end
    end
    array = new_array
  end
  array
end
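As an alternative to the hand-written loops, Array#product can build the same combinations. This is only a sketch of a different technique, not the approach above; KEYPAD and words_for are names made up for the example, and characters without letters (1, 0, dashes) are simply skipped.

```ruby
KEYPAD = {
  "2" => %w[a b c], "3" => %w[d e f], "4" => %w[g h i], "5" => %w[j k l],
  "6" => %w[m n o], "7" => %w[p q r s], "8" => %w[t u v], "9" => %w[w x y z]
}.freeze

# Returns every letter string for the digits that map to letters.
def words_for(number_string)
  # Look up each character; compact drops those with no letters.
  letter_sets = number_string.each_char.map { |d| KEYPAD[d] }.compact
  return [] if letter_sets.empty?

  first, *rest = letter_sets
  # product yields one array of letters per combination.
  first.product(*rest).map(&:join)
end

p words_for("23")
# => ["ad", "ae", "af", "bd", "be", "bf", "cd", "ce", "cf"]
p words_for("234").size # => 27
```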

Resources