Ruby Hash explanation - ruby

I do not understand this particular step in Codecademy.
puts "Hello text please"
text = gets.chomp
words = text.split(' ')
frequencies = Hash.new(0)
words.each { |x| frequencies[x] += 1 }
The idea is to filter the input to return a hash with each word and the number of times that word appears. I'm having trouble understanding why this line works:
words.each { |x| frequencies[x] += 1 }
Doesn't hash work by a {key, value} method?

The syntax for setting hash value is:
hash_name[key] = value
And the value is referenced as hash_name[key]. So:
frequencies = Hash.new(0)
This creates a new hash with a default value of 0: reading an unknown key is allowed and returns 0, without adding the key to the hash. Without the 0 argument there would be no default value, so reading the hash with an unknown key would yield nil. But with the default return value of 0, the following:
words.each { |x| frequencies[x] += 1 }
takes advantage of the default by going through all of the words and using them as keys, even though they don't initially exist, incrementing the value of frequencies[x] for the hash key x (the current word). If the key hasn't been set yet, the count starts at 0, which is what you want when counting things. This is because += really means frequencies[x] = frequencies[x] + 1, and the initial value returned for frequencies[x] when the value hasn't been set yet is 0.
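As a minimal sketch of the behaviour described above (the sample sentence is made up for illustration), the following shows that reading a missing key returns the default without storing the key, and that the first increment computes 0 + 1:

```ruby
frequencies = Hash.new(0)

# Reading a missing key returns the default but does not store the key.
missing_read = frequencies["hello"]       # 0
stored       = frequencies.key?("hello")  # false

# frequencies[x] += 1 expands to frequencies[x] = frequencies[x] + 1,
# so the first increment for a word computes 0 + 1 and stores the result.
words = "the cat and the hat".split(' ')  # hypothetical input
words.each { |x| frequencies[x] += 1 }

frequencies # => {"the"=>2, "cat"=>1, "and"=>1, "hat"=>1}
```

The key only appears in the hash once the assignment half of += runs.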

I'm not sure exactly where your problem lies, but hopefully this will help.
Doesn't hash work by a {key, value} method?
Yes it does. In the line
words.each { |x| frequencies[x] += 1 }
the hash is called frequencies and the key is x. The value for that key is returned by the expression frequencies[x].
It's just like an array, but using strings as indices instead of integers. data[2] is the value stored at the element of array data identified by 2, while frequencies[x] is the value stored at the element of hash frequencies indicated by x.
+= has its usual meaning as a Ruby abbreviation, so that var += 1 is identical to var = var + 1.
So frequencies[x] += 1 is frequencies[x] = frequencies[x] + 1: it adds one to the current value of the hash element identified by x.
The last piece in the puzzle is the way frequencies has been created. Ordinarily, accessing a hash element that hasn't been assigned returns nil. Using += would usually raise an undefined method '+' for nil:NilClass error because there is no method NilClass#+. But using Hash.new(0) creates a hash with a default value of zero, so that non-existent elements of this hash evaluate as 0 instead of nil, and now everything works fine when you try to increment an element for the first time.
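To see the difference in practice, here is a small sketch (the variable names plain and counting are only for illustration) that tries the same increment on a hash without a default and on one created with Hash.new(0):

```ruby
plain    = {}           # no default: missing keys read as nil
counting = Hash.new(0)  # missing keys read as 0

error_class = nil
begin
  plain["word"] += 1    # evaluates nil + 1, which raises
rescue NoMethodError => e
  error_class = e.class
end

counting["word"] += 1   # evaluates 0 + 1 and stores 1
```

Because the error is raised while computing nil + 1, the assignment never happens and plain stays empty.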

Related

Bug in my Ruby counter

It is only counting once for each word. I want it to tell me how many times each word appears.
dictionary = ["to","do","to","do","to","do"]
string = "just do it to"
def machine(word,list)
initialize = Hash.new
swerve = word.downcase.split(" ")
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
end
end
initialize[i]=counter
end
return initialize
end
machine(string,dictionary)
I assume that, for each word in string, you wish to determine the number of instances of that word in dictionary. If so, the first step is to create a counting hash.
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
#=> {"to"=>3, "do"=>3}
(I will explain this code later.)
Now split string on whitespace and create a hash whose keys are the words in string and whose values are the numbers of times that the value of word appears in dictionary.
string.split.each_with_object({}) { |word,h| h[word] = dict_hash.fetch(word, 0) }
#=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
This of course assumes that each word in string is unique. If not, depending on the desired behavior, one possibility would be to use another counting hash.
string = "to just do it to"
string.split.each_with_object(Hash.new(0)) { |word,h|
h[word] += dict_hash.fetch(word, 0) }
#=> {"to"=>6, "just"=>0, "do"=>3, "it"=>0}
Now let me explain some of the constructs above.
I created two hashes with the form of the class method Hash::new that takes a parameter equal to the desired default value, which here is zero. What that means is that if
h = Hash.new(0)
and h does not have a key equal to the value word, then h[word] will return h's default value (and the hash h will not be changed). After creating the first hash that way, I wrote h[word] += 1. Ruby expands that to
h[word] = h[word] + 1
before she does any further processing. The first word in dictionary that is passed to the block is "to" (which is assigned to the block variable word). Since the hash h is initially empty (has no keys), h[word] on the right side of the above equality returns the default value of zero, giving us
h["to"] = h["to"] + 1
#=> = 0 + 1 => 1
Later, when word again equals "to" the default value is not used because h now has a key "to".
h["to"] = h["to"] + 1
#=> = 1 + 1 => 2
I used the well-worn method Enumerable#each_with_object. To a newbie this might seem complex. It isn't. The line
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
is effectively [1] the same as the following.
h = Hash.new(0)
dictionary.each { |word| h[word] += 1 }
dict_hash = h
In other words, the method allows one to write a single line that creates, constructs and returns the hash, rather than three lines that do the same.
Notice that I used the method Hash#fetch for retrieving values from the hash:
dict_hash.fetch(word, 0)
fetch's second argument (here 0) is returned if dict_hash does not have a key equal to the value of word. By contrast, hash[word] returns the hash's default value in that case, which is nil unless a default was supplied (dict_hash was created with Hash.new(0), so dict_hash[word] would also return 0 here).
[1] The reason for "effectively" is that when using each_with_object, the variable h's scope is confined to the block, which is generally a good programming practice. Don't worry if you haven't learned about "scope" yet.
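The difference between Hash#fetch and Hash#[] can be sketched as follows. The hash no_default is a hypothetical one built without a default value, included only for comparison:

```ruby
dictionary = ["to", "do", "to", "do", "to", "do"]
dict_hash  = dictionary.each_with_object(Hash.new(0)) { |word, h| h[word] += 1 }

# Hypothetical hash with the same contents but no default value.
no_default = { "to" => 3, "do" => 3 }

from_fetch   = no_default.fetch("just", 0)  # 0   (fallback argument is used)
from_lookup  = no_default["just"]           # nil (no default was set)
from_default = dict_hash["just"]            # 0   (default given to Hash.new)
```

Using fetch with an explicit fallback makes the intent visible at the call site and works the same regardless of how the hash was created.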
You can actually do this using Array#count rather easily:
def machine(word,list)
word.downcase.split(' ').collect do |w|
# for every word in `word`, count how many appearances in `list`
[w, list.count { |l| l.include?(w) }]
end.to_h
end
machine("just do it to", ["to","do","to","do","to","do"]) # => {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
I think this is what you're looking for, but it seems like you're approaching this backwards.
Convert your string "string" into an array, remove duplicate values and iterate through each element, counting the number of matches in your array "dictionary". The enumerable method :count is useful here.
A good data structure to output here would be a hash, where we store the unique words in our string "string" as keys and the number of occurrences of these words in array "dictionary" as the values. Hashes allow one to store more information about the data in a collection than an array or string, so this fits here.
dictionary = [ "to","do","to","do","to","do" ]
string = "just do it to"
def group_by_matches( match_str, list_of_words )
## trim leading and trailing whitespace and split string into array of words, remove duplicates.
to_match = match_str.strip.split.uniq
groupings = {}
## for each element in array of words, count the number of times it appears *exactly* in the list of words array.
## store that in the groupings hash
to_match.each do | word |
groupings[ word ] = list_of_words.count( word )
end
groupings
end
group_by_matches( string, dictionary ) #=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
On a side note, you should consider using more descriptive variable and method names to help yourself and others follow what's going on.
This also seems like you have it backwards. Typically, you'd want to use the array to count the number of occurrences in the string. This seems to more closely fit a real-world application where you'd examine a sentence/string of data for matches from a list of predefined words.
Arrays are also useful because they're flexible collections of data, easily iterated through and mutated with enumerable methods. To work with the words in our string, as you can see, it's easiest to immediately convert it to an array of words.
There are many alternatives. If you wanted to shorten the method, you could replace the more verbose each loop with an each_with_object call, or with a map call, which returns a new object rather than the original object as each does. When using map followed by to_h, be careful: to_h works on an array of two-element arrays such as [["key1", "val1"], ["key2", "val2"]] but not on a flat array.
## each_with_object
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
each_with_object( {} ) { | word, groupings | groupings[ word ] = list_of_words.count( word ) }
end
## map
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
map { | word | [ word, list_of_words.count( word ) ] }.to_h
end
Gauge your method preferences depending on performance, readability, and reliability.
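One difference between the approaches in this thread is worth checking before choosing: count with a block that calls include? does substring matching, while count(word) matches exactly. A small sketch, using a made-up list that contains "today":

```ruby
list = ["today", "to", "do"]  # hypothetical list, chosen to show the difference

# String#include? is a substring test, so "today" also matches "to".
substring_matches = list.count { |l| l.include?("to") }  # 2

# Array#count with an argument compares with ==, so only "to" matches.
exact_matches = list.count("to")                          # 1
```

For the dictionary in the question the two give the same result, because no dictionary word is a substring of another.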
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
needs to be changed to
swerve.each do |i|
counter = 0
list.each do |j|
if i.include? j
counter += 1
Your code reports, for each word in the dictionary, how many words of the string it contains.
If you want to report how many times each word of the string appears in the dictionary, you can switch the list.each and swerve.each loops. Then it will return the hash {"just"=>0, "do"=>3, "it"=>0, "to"=>3}.
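Put together, the method with the two loops switched might look like this. It is a sketch of the change described above; the local name counts replaces initialize from the question purely for readability:

```ruby
def machine(word, list)
  counts = {}
  swerve = word.downcase.split(" ")
  swerve.each do |i|          # outer loop: words of the string
    counter = 0
    list.each do |j|          # inner loop: words of the dictionary
      counter += 1 if i.include? j
    end
    counts[i] = counter
  end
  counts
end

result = machine("just do it to", ["to", "do", "to", "do", "to", "do"])
# result => {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
```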

adding to the value in a hash table

I am trying to increment the value in a hash by one. My logic seems right, but for some reason my value in the hash is not incrementing by one.
puts item_sold
temp = sales_hash.values[item_sold] + 1
sales_hash.values[item_sold] = temp
puts sales_hash.values[item_sold]
sales_hash is a hash where the key is a number between 1000-2000 and the value for each key starts at 0. item_sold is a random number between 1 and 15. There are 15 items in the hash. When temp prints out it is a value of one. However, when I print out the value of sales_hash.values[item_sold] it prints 0. Why is sales_hash.values[item_sold] not incrementing?
Hash#values returns a new array of all the hash's values, so assigning into that array never changes the hash. You want to add to one value in the hash itself, which you'd do like this:
sales_hash
=> {0=>0, 1=>0, 2=>0}
sales_hash[0] += 1
=> 1
sales_hash
=> {0=>1, 1=>0, 2=>0}
You access the value of a hash by using the hash[key] syntax.
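Here is a small sketch of why the original code had no effect. It uses a made-up three-item sales_hash and assumes item_sold is meant as a position in the hash rather than a key; if it is actually a key, sales_hash[item_sold] += 1 is all that is needed:

```ruby
sales_hash = { 1000 => 0, 1001 => 0, 1002 => 0 }  # hypothetical, smaller than the real one
item_sold  = 1                                    # assumed to be a zero-based position

# values builds a fresh array on every call; changing it leaves the hash alone.
copy = sales_hash.values
copy[item_sold] += 1
unchanged = sales_hash[1001]        # still 0

# Look up the key at that position, then update the hash itself.
key = sales_hash.keys[item_sold]    # 1001
sales_hash[key] += 1
```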

my_hash.keys == [], yet my_hash[key] gives a value?

I'm trying to demonstrate a situation where it's necessary to pass a block to Hash.new in order to set up default values for a given key when creating a hash of hashes.
To show what can go wrong, I've created the following code, which passes a single value as an argument to Hash.new. I expected all outer hash keys to wind up holding a reference to the same inner hash, causing the counts for the "piles" to get mixed together. And indeed, that does seem to have happened. But part_counts.each doesn't seem to find any keys/values to iterate over, and part_counts.keys returns an empty array. Only part_counts[0] and part_counts[1] successfully retrieve a value for me.
piles = [
[:gear, :spring, :gear],
[:axle, :gear, :spring],
]
# I do realize this should be:
# Hash.new {|h, k| h[k] = Hash.new(0)}
part_counts = Hash.new(Hash.new(0))
piles.each_with_index do |pile, pile_index|
pile.each do |part|
part_counts[pile_index][part] += 1
end
end
p part_counts # => {}
p part_counts.keys # => []
# The next line prints no output
part_counts.each { |key, value| p key, value }
p part_counts[0] # => {:gear=>3, :spring=>2, :axle=>1}
For context, here is the corrected code that I intend to show after the "broken" code. The parts for each pile within part_counts are separated, as they should be. each and keys work as expected, as well.
# ...same pile initialization code as above...
part_counts = Hash.new {|h, k| h[k] = Hash.new(0)}
# ...same part counting code as above...
p part_counts # => {0=>{:gear=>2, :spring=>1}, 1=>{:axle=>1, :gear=>1, :spring=>1}}
p part_counts.keys # => [0, 1]
# The next line of code prints:
# 0
# {:gear=>2, :spring=>1}
# 1
# {:axle=>1, :gear=>1, :spring=>1}
part_counts.each { |key, value| p key, value }
p part_counts[0] # => {:gear=>2, :spring=>1}
But why don't each and keys work (at all) in the first sample?
We'll start by decomposing this a little bit:
part_counts = Hash.new(Hash.new(0))
That's the same as saying:
default_hash = { }
default_hash.default = 0
part_counts = { }
part_counts.default = default_hash
Later on, you're saying things like this:
part_counts[pile_index][part] += 1
That's the same as saying:
h = part_counts[pile_index]
h[part] += 1
You're not using the (correct) block form of the default value for your Hash, so accessing the default value doesn't auto-vivify the key. That means that part_counts[pile_index] doesn't create a pile_index key in part_counts; it just gives you part_counts.default, and you're really saying:
h = part_counts.default
h[part] += 1
You're not doing anything else to add keys to part_counts so it has no keys and:
part_counts.keys == [ ]
So why does part_counts[0] give us {:gear=>3, :spring=>2, :axle=>1}? part_counts doesn't have any keys and in particular doesn't have a 0 key so:
part_counts[0]
is the same as
part_counts.default
Up above, where you're accessing part_counts[pile_index], you're really just getting a reference to the default. The Hash won't clone it; you get the very same default object that the Hash will hand out next time. That means that:
part_counts[pile_index][part] += 1
is another way of saying:
part_counts.default[part] += 1
so you're actually just changing part_counts's default value in-place. Then when you evaluate part_counts[0], you're accessing this modified default value, and there's the {:gear=>3, :spring=>2, :axle=>1} that you accidentally built in your loop.
The value given to Hash.new is used as the default value, but this value is not inserted into the hash, so part_counts remains empty. You can get the default value by using part_counts[...], but this has no effect on the hash; it doesn't really contain the key.
When you call part_counts[pile_index][part] += 1, then part_counts[pile_index] returns the default value, and it's this value that is modified with the assignment, not part_counts.
You have something like:
outer = Hash.new({})
outer[1][2] = 3
p outer, outer[1]
which can also be written like:
inner = {}
outer = Hash.new(inner)
inner2 = outer[1] # inner2 refers to the same object as inner, outer is not modified
inner2[2] = 3 # same as inner[2] = 3
p outer, inner
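The two forms can be compared side by side. This is a minimal sketch with made-up keys; the names shared and separate are only for illustration:

```ruby
# Value form: every missing key reads the same default object, and nothing is stored.
shared = Hash.new(Hash.new(0))
shared[0][:gear] += 1
shared[1][:axle] += 1

shared_keys = shared.keys                  # []
same_object = shared[0].equal?(shared[1])  # true: both are shared.default
shared_default = shared.default            # {:gear=>1, :axle=>1}

# Block form: the block runs on a missing key and stores a new inner hash.
separate = Hash.new { |h, k| h[k] = Hash.new(0) }
separate[0][:gear] += 1
separate[1][:axle] += 1
# separate => {0=>{:gear=>1}, 1=>{:axle=>1}}
```

The assignment h[k] = ... inside the block is what creates the key; a block that only returned Hash.new(0) would produce a fresh hash on each read and store nothing.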

Initializing hash with default value and incrementing by 1

I need a hash whose keys should have a default value of 0 (basically I'm making a counter). The keys are not known in advance, so I cannot initialize them at the beginning. Also, with every occurrence of a key, its value should increase by 1.
I have come up with this:
hash = {}
hash[key] ? hash[key]+=1 : hash[key]=0
This looks OK and short, but I don't like repeating hash[key] so many times in one line of code. Is there a better way to write this?
I think all you need is to give the hash a default value of 0
hash = Hash.new(0)
then for every occurrence of the key, you don't need to check its value, just increment it directly:
hash[key]+=1
Reference: Hash.new.
Look at Hash#default:
=> h = { }
=> h.default = 0
=> h["a"]
#> 0
=> h["z"]
#> 0
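One thing to watch when comparing the two approaches: the ternary from the question stores 0 on a key's first occurrence, so its counts come out one lower than with a default of 0. A sketch with a made-up key sequence (the ternary is parenthesized here for clarity):

```ruby
keys = %w[a b a]  # hypothetical sequence of keys

# Ternary from the question: first sighting stores 0, later ones increment.
manual = {}
keys.each { |key| manual[key] ? (manual[key] += 1) : (manual[key] = 0) }
# manual => {"a"=>1, "b"=>0}

# Default of 0: first sighting computes 0 + 1.
counter = Hash.new(0)
keys.each { |key| counter[key] += 1 }
# counter => {"a"=>2, "b"=>1}
```

If the values are meant to be occurrence counts, the Hash.new(0) version gives them directly.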

Simple Ruby Input Scraper

I'm completely new to Ruby and wanted to ask for some help with this Ruby script.
It's supposed to take in a string and find out which character occurs the most frequently. It does this using a hash: it stores all the characters in a hash and then iterates through it to find the one with the greatest value. As of right now it doesn't seem to be working properly and I'm not sure why. It reads the characters in properly as far as I can tell with print statements. Any help is appreciated.
Thanks!
puts "Enter the string you want to search "
input = gets.chomp
charHash = Hash.new
input.split("").each do |i|
if charHash.has_key?(i)
puts "incrementing"
charHash[i]+=1
else
puts"storing"
charHash.store(i, 1)
end
end
goc = ""
max = 0
charHash.each { |key,value| goc = key if value > max }
puts "The character #{goc} occurs the most frequently"
There are two major issues with your code:
As commented by Holger Just, you have to use += 1 instead of ++
charHash.store(:i, 1) stores the symbol :i, you want to store i
Fixing these results in working code (I'm using snake_case here):
char_hash = Hash.new
input.split("").each do |i|
if char_hash.has_key?(i)
char_hash[i] += 1
else
char_hash.store(i, 1)
end
end
You can omit the condition by using 0 as your default hash value and you can replace split("").each with each_char:
char_hash = Hash.new(0)
input.each_char do |i|
char_hash[i] += 1
end
Finally, you can pass the hash into the loop using Enumerator#with_object:
char_hash = input.each_char.with_object(Hash.new(0)) { |i, h| h[i] += 1 }
I might be missing something but it seems that instead of
charHash.each { |key,value| goc = key if value > max }
you need something like
charHash.each do |key,value|
if value > max then
max = value
goc = key
end
end
Notice the max = value statement. In your current implementation (i.e. without updating the max variable), every character that appears in the text at least once satisfies the condition and you end up getting the last one.
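As an alternative to tracking max by hand, Enumerable#max_by can pick the entry with the largest count. This is a sketch with a fixed sample string in place of gets, so it is a different technique from the loop above rather than a fix to it:

```ruby
input = "hello world"  # hypothetical input, standing in for gets.chomp

char_hash = input.each_char.with_object(Hash.new(0)) { |c, h| h[c] += 1 }

# max_by yields [key, value] pairs and returns the pair with the largest count.
goc, max = char_hash.max_by { |_char, count| count }

puts "The character #{goc} occurs the most frequently (#{max} times)"
```

Note that max_by returns nil for an empty hash, so empty input would need its own check.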
