How to fix incorrect character counting code - ruby

I have a question about mysterious 'e' characters appearing in my counts hash.
My initial approach was clunky and inelegant:
def letter_count(str)
counts = {}
words = str.split(" ")
words.each do |word|
letters = word.split("")
letters.each do |letter|
if counts.include?(letter)
counts[letter] += 1
else
counts[letter] = 1
end
end
end
counts
end
This approach worked, but I wanted to make it a little more readable, so I abbreviated it to:
def letter_count(str)
counts = Hash.new(0)
str.split("").each{|letter| counts[letter] += 1 unless letter == ""}
counts
end
This is where I encountered the issue, and fixed it by using:
str.split("").each{|letter| counts[letter] += 1 unless letter == " "} # added a space.
I don't understand why empty spaces were being represented by the letter 'e' or being counted at all.

I don't understand why empty spaces were being represented by the letter 'e' or being counted at all.
I can't duplicate the problem:
def letter_count(str)
counts = Hash.new(0)
str.split("").each{|letter| counts[letter] += 1 unless letter == ""}
counts
end
letter_count('a cat') # => {"a"=>2, " "=>1, "c"=>1, "t"=>1}
"Empty spaces"? There's no such thing. A space is considered blank, but it is not empty:
' '.empty? # => false
Loading the ActiveSupport extension:
require 'active_support/core_ext/object/blank'
' '.blank? # => true
Spaces are valid characters, which is why they're being counted. You have to exclude them explicitly if you don't want them counted.
For reference, here's how I'd do it:
def letter_count(str)
str.chars.each_with_object(Hash.new(0)) { |l, h| h[l] += 1 }
end
letter_count('a cat') # => {"a"=>2, " "=>1, "c"=>1, "t"=>1}
A messier way would be:
def letter_count(str)
str.chars.group_by { |c| c }.map { |char, chars| [char, chars.count] }.to_h
end
Breaking that down:
def letter_count(str)
str.chars # => ["a", " ", "c", "a", "t"]
.group_by { |c| c } # => {"a"=>["a", "a"], " "=>[" "], "c"=>["c"], "t"=>["t"]}
.map { |char, chars| [char, chars.count] } # => [["a", 2], [" ", 1], ["c", 1], ["t", 1]]
.to_h # => {"a"=>2, " "=>1, "c"=>1, "t"=>1}
end
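If spaces should be left out of the count, as the question intended, the same idea can be extended to skip them. The following is a minimal sketch; the name letter_count_without_spaces is made up for illustration, and it assumes that all whitespace (not just a plain space) should be ignored:

```ruby
# Hypothetical helper (not from the original post): counts every
# character in str except whitespace.
def letter_count_without_spaces(str)
  str.each_char.with_object(Hash.new(0)) do |char, counts|
    # Skip spaces, tabs and newlines; count everything else.
    counts[char] += 1 unless char.match?(/\s/)
  end
end

letter_count_without_spaces('a cat') # => {"a"=>2, "c"=>1, "t"=>1}
```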

Ruby already has String#each_char which you could use.
def char_count(string)
counts = Hash.new(0)
string.each_char { |char|
counts[char] += 1
}
return counts
end
puts char_count("Basset hounds got long ears").inspect
# {"B"=>1, "a"=>2, "s"=>4, "e"=>2, "t"=>2, " "=>4, "h"=>1,
# "o"=>3, "u"=>1, "n"=>2, "d"=>1, "g"=>2, "l"=>1, "r"=>1}
As for why you're getting the wrong characters, are you sure you're passing in the string you think you are?
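One way to check that is to inspect the string before counting it. A small sketch, assuming the input is held in a variable named str (a made-up name here); p prints the quoted form, so stray or invisible characters become visible:

```ruby
str = "a cat\n"      # example input with a trailing newline, as gets would return

p str                # prints "a cat\n", revealing the newline
p str.chars          # prints ["a", " ", "c", "a", "t", "\n"]
p str.chomp.chars    # prints ["a", " ", "c", "a", "t"]
```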

Related

How to count a string elements' occurrence in another string in ruby?

How can I check how many times a phrase occurs in a string?
For example, let's say the phrase is donut
str1 = "I love donuts!"
#=> returns 1 because "donuts" is found once.
str2 = "Squirrels do love nuts"
#=> also returns 1 because of 'do' and 'nuts' make up donut
str3 = "donuts do stun me"
#=> returns 2 because 'donuts' and 'do stun' has all elements to make 'donuts'
I checked this SO that suggests using include, but it only works if donuts is spelled in order.
I came up with this, but it doesn't stop after all the letters of "donuts" have been spelled, i.e. "I love donuts" #=> ["o", "d", "o", "n", "u", "t", "s"]
def word(arr)
acceptable_word = "donuts".chars
arr.chars.select { |name| acceptable_word.include? name.downcase }
end
How can I check how many occurrences of donuts are there in a given string? No edge cases. Input will always be String, no nil. If it contains elements of donut only it should not count as 1 occurrence; it needs to contain donuts, doesn't have to be in order.
Code
def count_em(str, target)
target.chars.uniq.map { |c| str.count(c)/target.count(c) }.min
end
Examples
count_em "I love donuts!", "donuts" #=> 1
count_em "Squirrels do love nuts", "donuts" #=> 1
count_em "donuts do stun me", "donuts" #=> 2
count_em "donuts and nuts sound too delicious", "donuts" #=> 3
count_em "cats have nine lives", "donuts" #=> 0
count_em "feeding force scout", "coffee" #=> 1
count_em "feeding or scout", "coffee" #=> 0
str = ("free mocha".chars*4).shuffle.join
# => "hhrefemcfeaheomeccrmcre eef oa ofrmoaha "
count_em str, "free mocha"
#=> 4
Explanation
For
str = "feeding force scout"
target = "coffee"
a = target.chars
#=> ["c", "o", "f", "f", "e", "e"]
b = a.uniq
#=> ["c", "o", "f", "e"]
c = b.map { |c| str.count(c)/target.count(c) }
#=> [2, 2, 1, 1]
c.min
#=> 1
In calculating c, consider the first element of b passed to the block and assigned to the block variable c.
c = "c"
Then the block calculation is
d = str.count(c)
#=> 2
e = target.count(c)
#=> 1
d/e
#=> 2
This indicates that str contains enough "c"'s to match "coffee" twice.
The remaining calculations to obtain c are similar.
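The same calculation can also be written with two tallies, so that str is scanned once rather than once per distinct character of target. This is only a sketch; char_tally and count_em_tally are names invented for this example, and a non-empty target is assumed:

```ruby
# Hypothetical helper: builds a character => count hash.
def char_tally(str)
  str.each_char.with_object(Hash.new(0)) { |char, counts| counts[char] += 1 }
end

# Tally-based variant of the calculation explained above.
def count_em_tally(str, target)
  have = char_tally(str)     # characters available in str
  need = char_tally(target)  # characters required per occurrence of target
  need.map { |char, required| have[char] / required }.min
end

count_em_tally "feeding force scout", "coffee" #=> 1
count_em_tally "donuts do stun me", "donuts"   #=> 2
```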
Addendum
If the characters of str that match the characters of target must appear in the same order as in target, the following regex could be used.
target = "coffee"
r = /#{ target.chars.join(".*?") }/i
#=> /c.*?o.*?f.*?f.*?e.*?e/i
matches = "xcorr fzefe yecaof tfe erg eeffoc".scan(r)
#=> ["corr fzefe ye", "caof tfe e"]
matches.size
#=> 2
"feeding force scout".scan(r).size
#=> 0
The question marks in the regex are needed to make the searches non-greedy.
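To see the effect, the lazy pattern can be compared with a greedy one built the same way. This is only an illustrative sketch; the variable names escaped, lazy and greedy are made up, and Regexp.escape is added as a precaution in case target contains regex metacharacters:

```ruby
target = "coffee"
str    = "xcorr fzefe yecaof tfe erg eeffoc"

escaped = target.chars.map { |c| Regexp.escape(c) }
lazy    = /#{escaped.join(".*?")}/i  # each gap matches as little as possible
greedy  = /#{escaped.join(".*")}/i   # each gap matches as much as possible

str.scan(lazy).size   #=> 2
str.scan(greedy).size #=> 1, the single match swallows both lazy matches
```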
The solution is more or less simple (map(&:dup) is used to avoid mutating the inputs):
pattern = 'donuts'
[str1, str2, str3].map(&:dup).map do |s|
loop.with_index do |_, i|
break i unless pattern.chars.all? { |c| s.sub!(c, '') }
end
end
#⇒ [1, 1, 2]
Here's an approach with two variants, one where the letters must appear in order, and one where order is irrelevant. In both cases the frequency of each letter is respected, so "coffee" needs two 'f' and two 'e' letters to match; "free mocha" is insufficient because it lacks a second 'f'.
def sorted_string(string)
string.split('').sort.join
end
def phrase_regexp_sequence(phrase)
Regexp.new(
phrase.downcase.split('').join('.*')
)
end
def phrase_regexp_unordered(phrase)
Regexp.new(
phrase.downcase.gsub(/\W/, '').split('').sort.chunk_while(&:==).map do |bit|
"#{bit[0]}{#{bit.length}}"
end.join('.*')
)
end
def contains_unordered(phrase, string)
!!phrase_regexp_unordered(phrase).match(sorted_string(string.downcase))
end
def contains_sequence(phrase, string)
!!phrase_regexp_sequence(phrase).match(string.downcase)
end
strings = [
"I love donuts!",
"Squirrels do love nuts",
"donuts do stun me",
"no stunned matches",
]
phrase = 'donut'
strings.each do |string|
puts '%-30s %s %s' % [
string,
contains_unordered(phrase, string),
contains_sequence(phrase, string)
]
end
# => I love donuts! true true
# => Squirrels do love nuts true true
# => donuts do stun me true true
# => no stunned matches true false
Simple solution:
criteria = "donuts"
str1 = "I love donuts!"
str2 = "Squirrels do love nuts"
str3 = "donuts do stun me"
def strings_construction(criteria, string)
unique_criteria_array = criteria.split("").uniq
my_hash = {}
# For each distinct character in the criteria, count how many copies of it the string can supply
unique_criteria_array.each do |char|
my_hash[char] = string.count(char) / criteria.count(char)
end
my_hash.values.min
end
puts strings_construction(criteria, str1) #=> 1
puts strings_construction(criteria, str2) #=> 1
puts strings_construction(criteria, str3) #=> 2

Keep characters and whitespace in ruby method

Building out a Rot method to solve encryption. I have something that works, but it strips out whitespace and any other non-letter characters. I was going to use bytes instead of chars and turn them back into a string afterwards, but I can't seem to get that working. How would you keep those characters in place, starting from this code:
code
def rot(x, string, encrypt=true)
alphabet = Array("A".."Z") + Array("a".."z")
results = []
if encrypt == true
key = Hash[alphabet.zip(alphabet.rotate(x))]
string.chars.each do |i|
if ('a'..'z').include? i
results << key.fetch(i).downcase
elsif ('A'..'Z').include? i
results << key.fetch(i).upcase
end
end
return results.join
else
key_false = Hash[alphabet.zip(alphabet.rotate(26 - x))]
string.chars.each do |i|
if ('a'..'z').include? i
results << key_false.fetch(i).downcase
elsif ('A'..'Z').include? i
results << key_false.fetch(i).upcase
end
end
return results.join
end
end
puts rot(10, "Hello, World")
=> RovvyGybvn
puts rot(10, "Rovvy, Gybvn", false)
=> HelloWorld
Thanks for your help in advance!
Just add to both if blocks an else condition like this:
if ('a'..'z').include? i
# ...
elsif ('A'..'Z').include? i
# ...
else
results << i
end
This appends every character outside A-Z and a-z to the output unchanged.
I've noticed some issues with your code:
Broken replacement hash
This is the biggest problem - your replacement hash is broken. I'm using a smaller alphabet for demonstration purposes, but this applies to 26 characters as well:
uppercase = Array("A".."C")
lowercase = Array("a".."c")
alphabet = uppercase + lowercase
#=> ["A", "B", "C", "a", "b", "c"]
You build the replacement hash via:
x = 1
key = Hash[alphabet.zip(alphabet.rotate(x))]
#=> {"A"=>"B", "B"=>"C", "C"=>"a", "a"=>"b", "b"=>"c", "c"=>"A"}
"C"=>"a" and "c"=>"A" are referring to the wrong character case. This happens because you rotate the entire alphabet at once:
alphabet #=> ["A", "B", "C", "a", "b", "c"]
alphabet.rotate(x) #=> ["B", "C", "a", "b", "c", "A"]
Instead, you have to rotate the uppercase and lowercase letters separately:
uppercase #=> ["A", "B", "C"]
uppercase.rotate(x) #=> ["B", "C", "A"]
lowercase #=> ["a", "b", "c"]
lowercase.rotate(x) #=> ["b", "c", "a"]
and concatenate the rotated parts afterwards. Either:
key = Hash[uppercase.zip(uppercase.rotate(x)) + lowercase.zip(lowercase.rotate(x))]
#=> {"A"=>"B", "B"=>"C", "C"=>"A", "a"=>"b", "b"=>"c", "c"=>"a"}
or:
key = Hash[(uppercase + lowercase).zip(uppercase.rotate(x) + lowercase.rotate(x))]
#=> {"A"=>"B", "B"=>"C", "C"=>"A", "a"=>"b", "b"=>"c", "c"=>"a"}
Replacing the characters
Back to a full alphabet:
uppercase = Array("A".."Z")
lowercase = Array("a".."z")
x = 10
key = Hash[uppercase.zip(uppercase.rotate(x)) + lowercase.zip(lowercase.rotate(x))]
Having a working replacement hash makes replacing the characters almost trivial:
string = "Hello, World!"
result = ""
string.each_char { |char| result << key.fetch(char, char) }
result
#=> "Rovvy, Gybvn!"
I've changed result from an array to a string. It also has a << method and you don't have to join it afterwards.
Hash#fetch works almost like Hash#[], but you can pass a default value that is returned if the key is not found in the hash:
key.fetch("H", "H") #=> "R" (replacement value)
key.fetch("!", "!") #=> "!" (default value)
Handling encryption / decryption
You're duplicating a lot of code to handle the decryption part. But there's a much easier way - just reverse the direction:
rot(10, "Hello") #=> "Rovvy"
rot(10, "Rovvy", false) #=> "Hello"
rot(-10, "Rovvy") #=> "Hello"
So within your code, you can write:
x = -x unless encrypt
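This works because Array#rotate accepts negative counts, which rotate in the opposite direction. A quick sketch with a five-letter array (the variable name letters is only for illustration):

```ruby
letters = Array("A".."E")

letters.rotate(2)             #=> ["C", "D", "E", "A", "B"]
letters.rotate(-2)            #=> ["D", "E", "A", "B", "C"]
letters.rotate(2).rotate(-2)  #=> ["A", "B", "C", "D", "E"]
```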
Putting it all together
def rot(x, string, encrypt = true)
uppercase = Array("A".."Z")
lowercase = Array("a".."z")
x = -x unless encrypt
key = Hash[uppercase.zip(uppercase.rotate(x)) + lowercase.zip(lowercase.rotate(x))]
result = ""
string.each_char { |char| result << key.fetch(char, char) }
result
end
rot(10, "Hello, World!") #=> "Rovvy, Gybvn!"
rot(10, "Rovvy, Gybvn!", false) #=> "Hello, World!"
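As an alternative to the lookup hash, String#tr can do the substitution in a single call. This is a different technique from the one above, shown only as a sketch; rot_tr is a made-up name, and it assumes the same sign convention (a negative x decrypts):

```ruby
# Hypothetical variant using String#tr instead of a replacement hash.
def rot_tr(x, string)
  uppercase = Array("A".."Z")
  lowercase = Array("a".."z")
  from = (uppercase + lowercase).join
  to   = (uppercase.rotate(x) + lowercase.rotate(x)).join
  # tr maps each character in `from` to the character at the same
  # position in `to`; all other characters are left untouched.
  string.tr(from, to)
end

rot_tr(10, "Hello, World!")   #=> "Rovvy, Gybvn!"
rot_tr(-10, "Rovvy, Gybvn!")  #=> "Hello, World!"
```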

Finding the most occurring character/letter in a string

Trying to get the most occurring letter in a string.
So far:
puts "give me a string"
words = gets.chomp.split
counts = Hash.new(0)
words.each do |word|
counts[word] += 1
end
Does not run further than asking for a string. What am I doing wrong?
If you're running this in irb, then the computer may think that the ruby code you're typing in is the text to analyse:
irb(main):001:0> puts "give me a string"
give me a string
=> nil
irb(main):002:0> words = gets.chomp.split
counts = Hash.new(0)
words.each do |word|
counts[word] += 1
end=> ["counts", "=", "Hash.new(0)"]
irb(main):003:0> words.each do |word|
irb(main):004:1* counts[word] += 1
irb(main):005:1> end
NameError: undefined local variable or method `counts' for main:Object
from (irb):4:in `block in irb_binding'
from (irb):3:in `each'
from (irb):3
from /Users/agrimm/.rbenv/versions/2.2.1/bin/irb:11:in `<main>'
irb(main):006:0>
If you wrap it in a block of some sort, you won't get that confusion:
begin
puts "give me a string"
words = gets.chomp.split
counts = Hash.new(0)
words.each do |word|
counts[word] += 1
end
counts
end
gives
irb(main):001:0> begin
irb(main):002:1* puts "give me a string"
irb(main):003:1> words = gets.chomp.split
irb(main):004:1> counts = Hash.new(0)
irb(main):005:1> words.each do |word|
irb(main):006:2* counts[word] += 1
irb(main):007:2> end
irb(main):008:1> counts
irb(main):009:1> end
give me a string
foo bar
=> {"foo"=>1, "bar"=>1}
Then you can work on the fact that split by itself isn't what you want. :)
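To illustrate the difference: split with no arguments breaks the input on whitespace and returns words, whereas chars returns the individual characters. A short sketch with a made-up input:

```ruby
input = "foo bar"

input.split  #=> ["foo", "bar"]
input.chars  #=> ["f", "o", "o", " ", "b", "a", "r"]
```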
This should work:
puts "give me a string"
result = gets.chomp.split(//).reduce(Hash.new(0)) { |h, v| h.store(v, h[v] + 1); h }.max_by{|k,v| v}
puts result.to_s
Output:
#Alan ➜ test rvm:(ruby-2.2#europa) ruby test.rb
give me a string
aa bbb cccc ddddd
["d", 5]
Or in irb:
:008 > 'This is some random string'.split(//).reduce(Hash.new(0)) { |h, v| h.store(v, h[v] + 1); h }.max_by{|k,v| v}
=> ["s", 4]
Rather than getting a count word by word, you can process the whole string immediately.
str = gets.chomp
hash = Hash.new(0)
str.each_char do |c|
hash[c] += 1 unless c == " " #used to filter the space
end
After getting the number of letters, you can then find the letter with highest count with
max = hash.values.max
Then match it to the key in the hash and you're done :)
puts hash.select{ |key| hash[key] == max }
Or to simplify the above methods
hash.max_by{ |key,value| value }
The compact form of this is:
hash = Hash.new(0)
gets.chomp.each_char { |c| hash[c] += 1 unless c == " " }
puts hash.max_by{ |key,value| value }
This returns the highest occurring character within a given string:
puts "give me a string"
characters = gets.chomp.split("").reject { |c| c == " " }
counts = Hash.new(0)
characters.each { |character| counts[character] += 1 }
print counts.max_by { |k, v| v }
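On Ruby 2.7 or later, Enumerable#tally can replace the manual counting hash. A minimal sketch under that version assumption; most_common_char is a name invented for this example, and when several characters tie, max_by returns just one of them:

```ruby
# Hypothetical helper; requires Ruby 2.7+ for Enumerable#tally.
def most_common_char(str)
  counts = str.delete(" ").each_char.tally  # e.g. {"a"=>2, "b"=>3}
  counts.max_by { |_char, count| count }    # [char, count], or nil if str has no letters
end

most_common_char("aa bbb cccc ddddd") #=> ["d", 5]
```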

Counting frequency of symbols

So I have the following code which counts the frequency of each letter in a string (or in this specific instance from a file):
def letter_frequency(file)
letters = 'a' .. 'z'
File.read(file) .
split(//) .
group_by {|letter| letter.downcase} .
select {|key, val| letters.include? key} .
collect {|key, val| [key, val.length]}
end
letter_frequency(ARGV[0]).sort_by {|key, val| -val}.each {|pair| p pair}
This works great, but I would like to know if there is some way in Ruby to do something similar that catches all the different possible symbols, i.e. spaces, commas, periods, and everything in between. To put it more simply, is there something similar to 'a' .. 'z' that holds all the symbols? Hope that makes sense.
You don't need a range when you want to count every possible character, because the set of all characters is the entire domain. A range is only needed when you specifically want a subset of that domain.
This is probably a faster implementation that counts all characters in the file:
def char_frequency(file_name)
ret_val = Hash.new(0)
File.open(file_name) {|file| file.each_char {|char| ret_val[char] += 1 } }
ret_val
end
p char_frequency("1003v-mm") #=> {"\r"=>56, "\n"=>56, " "=>2516, "\xC9"=>2, ...
For reference I used this test file.
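If only a subset of characters is wanted, such as punctuation and whitespace rather than letters, a regular-expression character class can stand in for the range. A hedged sketch: symbol_frequency is a made-up name, and exactly which characters [[:punct:]] matches can vary between Ruby versions:

```ruby
# Hypothetical helper: counts only punctuation and whitespace characters.
def symbol_frequency(text)
  text.each_char.with_object(Hash.new(0)) do |char, counts|
    counts[char] += 1 if char.match?(/[[:punct:][:space:]]/)
  end
end

symbol_frequency("Hi, there. Hi!") #=> {","=>1, " "=>2, "."=>1, "!"=>1}
```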
It may not use much Ruby magic with Ranges but a simple way is to build a character counter that iterates over each character in a string and counts the totals:
class CharacterCounter
def initialize(text)
@characters = text.split("")
end
def character_frequency
character_counter = {}
@characters.each do |char|
character_counter[char] ||= 0
character_counter[char] += 1
end
character_counter
end
def unique_characters
character_frequency.keys
end
def frequency_of(character)
character_frequency[character] || 0
end
end
counter = CharacterCounter.new("this is a test")
counter.character_frequency # => {"t"=>3, "h"=>1, "i"=>2, "s"=>3, " "=>3, "a"=>1, "e"=>1}
counter.unique_characters # => ["t", "h", "i", "s", " ", "a", "e"]
counter.frequency_of 't' # => 3
counter.frequency_of 'z' # => 0

Why is my all? function not working? What's wrong with my syntax?

I originally wrote a method to take a word and find out if its vowels were in alphabetical order. I did it by using the code below:
def ordered_vowel_word?(word)
vowels = ["a", "e", "i", "o", "u"]
letters_arr = word.split("")
vowels_arr = letters_arr.select { |l| vowels.include?(l) }
(0...(vowels_arr.length - 1)).all? do |i|
vowels_arr[i] <= vowels_arr[i + 1]
end
end
However, I decided to try to change it by using an all? method. I tried to do so with the following code:
def ordered_vowel_word?(word)
vowels = ["a","e", "i", "o", "u"]
splitted_word = word.split("")
vowels_in_word = []
vowels_in_word = splitted_word.select {|word| vowels.include?(word)}
vowels_in_word.all? {|x| vowels_in_word[x]<= vowels_in_word[x+1]}
end
ordered_vowel_word?("word")
Does anyone have any idea why it isn't working? I would have expected this to work.
Also, if anyone has a better solution please feel free to post. Thanks!
Examples are:
it "does not return a word that is not in order" do
ordered_vowel_words("complicated").should == ""
end
it "handle double vowels" do
ordered_vowel_words("afoot").should == "afoot"
end
it "handles a word with a single vowel" do
ordered_vowel_words("ham").should == "ham"
end
it "handles a word with a single letter" do
ordered_vowel_words("o").should == "o"
end
it "ignores the letter y" do
ordered_vowel_words("tamely").should == "tamely"
end
Here is how I would do it:
#!/usr/bin/ruby
def ordered?(word)
vowels = %w(a e i o u)
check = word.each_char.select { |x| vowels.include?(x) }
# Another option thanks to @Michael Papile
# check = word.scan(/[aeiou]/)
puts check.sort == check
end
ordered?("afoot")
ordered?("outaorder")
Output is:
true
false
In your original example, all? passes each array element (a String) to the block, but the block uses it as an array index, which must be an Integer.
def ordered_vowel_word?(word)
vowels = ["a","e", "i", "o", "u"]
splitted_word = word.split("")
vowels_in_word = []
vowels_in_word = splitted_word.select {|word| vowels.include?(word)}
p vowels_in_word #=> ["o"]
vowels_in_word.all? {|x| vowels_in_word[x]<= vowels_in_word[x+1]}
end
p ordered_vowel_word?("word")
#=> `[]': no implicit conversion of String into Integer (TypeError)
vowels_in_word contains only "o", so inside vowels_in_word.all? {|x| vowels_in_word[x]<= vowels_in_word[x+1]} the expression vowels_in_word[x] means vowels_in_word["o"], which in turn raises a TypeError because an Array index cannot be a String.
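To compare neighbouring vowels without indices, each_cons(2) yields each adjacent pair directly. A minimal sketch of the method rewritten that way, assuming lowercase input:

```ruby
def ordered_vowel_word?(word)
  vowels_in_word = word.scan(/[aeiou]/)
  # each_cons(2) yields ["a", "o"], then ["o", "o"] for "afoot";
  # with fewer than two vowels it yields nothing and all? returns true.
  vowels_in_word.each_cons(2).all? { |left, right| left <= right }
end

ordered_vowel_word?("afoot")       #=> true
ordered_vowel_word?("complicated") #=> false
ordered_vowel_word?("word")        #=> true
```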
