my_hash.keys == [], yet my_hash[key] gives a value? - ruby

I'm trying to demonstrate a situation where it's necessary to pass a block to Hash.new in order to set up default values for a given key when creating a hash of hashes.
To show what can go wrong, I've created the following code, which passes a single value as an argument to Hash.new. I expected all outer hash keys to wind up holding a reference to the same inner hash, causing the counts for the "piles" to get mixed together. And indeed, that does seem to have happened. But part_counts.each doesn't seem to find any keys/values to iterate over, and part_counts.keys returns an empty array. Only part_counts[0] and part_counts[1] successfully retrieve a value for me.
piles = [
[:gear, :spring, :gear],
[:axle, :gear, :spring],
]
# I do realize this should be:
# Hash.new {|h, k| h[k] = Hash.new(0)}
part_counts = Hash.new(Hash.new(0))
piles.each_with_index do |pile, pile_index|
pile.each do |part|
part_counts[pile_index][part] += 1
end
end
p part_counts # => {}
p part_counts.keys # => []
# The next line prints no output
part_counts.each { |key, value| p key, value }
p part_counts[0] # => {:gear=>3, :spring=>2, :axle=>1}
For context, here is the corrected code that I intend to show after the "broken" code. The parts for each pile within part_counts are separated, as they should be. each and keys work as expected, as well.
# ...same pile initialization code as above...
part_counts = Hash.new {|h, k| h[k] = Hash.new(0)}
# ...same part counting code as above...
p part_counts # => {0=>{:gear=>2, :spring=>1}, 1=>{:axle=>1, :gear=>1, :spring=>1}}
p part_counts.keys # => [0, 1]
# The next line of code prints:
# 0
# {:gear=>2, :spring=>1}
# 1
# {:axle=>1, :gear=>1, :spring=>1}
part_counts.each { |key, value| p key, value }
p part_counts[0] # => {:gear=>2, :spring=>1}
But why don't each and keys work (at all) in the first sample?

We'll start by decomposing this a little bit:
part_counts = Hash.new(Hash.new(0))
That's the same as saying:
default_hash = { }
default_hash.default = 0
part_counts = { }
part_counts.default = default_hash
Later on, you're saying things like this:
part_counts[pile_index][part] += 1
That's the same as saying:
h = part_counts[pile_index]
h[part] += 1
You're not using the (correct) block form of the default value for your Hash so accessing the default value doesn't auto-vivify the key. That means that part_counts[pile_index] doesn't create a pile_index key in part_counts, it just gives you part_counts.default and you're really saying:
h = part_counts.default
h[part] += 1
You're not doing anything else to add keys to part_counts so it has no keys and:
part_counts.keys == [ ]
So why does part_counts[0] give us {:gear=>3, :spring=>2, :axle=>1}? part_counts doesn't have any keys and in particular doesn't have a 0 key so:
part_counts[0]
is the same as
part_counts.default
Up above, where you access part_counts[pile_index], you're really just getting a reference to the default. The Hash won't clone it; you get the very same default object that the Hash will hand out next time. That means that:
part_counts[pile_index][part] += 1
is another way of saying:
part_counts.default[part] += 1
so you're actually just changing part_counts's default value in place. Then when you call part_counts[0], you're accessing this modified default value, and there's the {:gear=>3, :spring=>2, :axle=>1} that you accidentally built in your loop.
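A minimal sketch of the difference, using shortened made-up counts rather than the question's piles (the names shared and separate are illustrative only):

```ruby
# One default object shared by every missing key.
shared = Hash.new(Hash.new(0))
shared[0][:gear] += 1
shared[1][:axle] += 1

same_object    = shared[0].equal?(shared[1]) # true: both lookups return the default itself
shared_keys    = shared.keys                 # empty: nothing was ever stored in shared
shared_default = shared.default              # holds the counts from both "piles"

# Block form: the block stores a new inner hash under each missing key.
separate = Hash.new { |hash, key| hash[key] = Hash.new(0) }
separate[0][:gear] += 1
separate[1][:axle] += 1
separate_keys = separate.keys                # [0, 1]
```

The block form is the only one of the two that ever assigns into the outer hash, which is why keys and each have something to report there.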

The value given to Hash.new is used as the default value, but this value is not inserted into the hash. So part_counts remains empty. You can get the default value by using part_counts[...], but this has no effect on the hash; it doesn't really contain the key.
When you call part_counts[pile_index][part] += 1, then part_counts[pile_index] returns the default value, and it's this value that is modified with the assignment, not part_counts.
You have something like:
outer = Hash.new({})
outer[1][2] = 3
p outer, outer[1]
which can also be written like:
inner = {}
outer = Hash.new(inner)
inner2 = outer[1] # inner2 refers to the same object as inner, outer is not modified
inner2[2] = 3 # same as inner[2] = 3
p outer, inner
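One way to check whether a key is really stored, rather than answered by the default, is to ask with key? or fetch. This is only a sketch built on the outer/inner example above; the :missing symbol is an arbitrary placeholder:

```ruby
inner = {}
outer = Hash.new(inner)
outer[1][2] = 3                     # modifies inner, not outer

has_key   = outer.key?(1)           # false: the key was never stored
size      = outer.size              # 0
by_index  = outer[1]                # the default object, now holding 2 => 3
by_fetch  = outer.fetch(1, :missing) # :missing, because fetch ignores the hash's default
```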

Related

Bug in my Ruby counter

It is only counting once for each word. I want it to tell me how many times each word appears.
dictionary = ["to","do","to","do","to","do"]
string = "just do it to"
def machine(word,list)
initialize = Hash.new
swerve = word.downcase.split(" ")
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
end
end
initialize[i]=counter
end
return initialize
end
machine(string,dictionary)
I assume that, for each word in string, you wish to determine the number of instances of that word in dictionary. If so, the first step is to create a counting hash.
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
#=> {"to"=>3, "do"=>3}
(I will explain this code later.)
Now split string on whitespace and create a hash whose keys are the words in string and whose values are the numbers of times that the value of word appears in dictionary.
string.split.each_with_object({}) { |word,h| h[word] = dict_hash.fetch(word, 0) }
#=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
This of course assumes that each word in string is unique. If not, depending on the desired behavior, one possibility would be to use another counting hash.
string = "to just do it to"
string.split.each_with_object(Hash.new(0)) { |word,h|
h[word] += dict_hash.fetch(word, 0) }
#=> {"to"=>6, "just"=>0, "do"=>3, "it"=>0}
Now let me explain some of the constructs above.
I created two hashes with the form of the class method Hash::new that takes a parameter equal to the desired default value, which here is zero. What that means is that if
h = Hash.new(0)
and h does not have a key equal to the value word, then h[word] will return h's default value (and the hash h will not be changed). After creating the first hash that way, I wrote h[word] += 1. Ruby expands that to
h[word] = h[word] + 1
before she does any further processing. The first word in dictionary that is passed to the block is "to" (which is assigned to the block variable word). Since the hash h is initially empty (has no keys), h[word] on the right side of the above assignment returns the default value of zero, giving us
h["to"] = h["to"] + 1
#=> = 0 + 1 => 1
Later, when word again equals "to" the default value is not used because h now has a key "to".
h["to"] = h["to"] + 1
#=> = 1 + 1 => 2
I used the well-worn method Enumerable#each_with_object. To a newbie this might seem complex. It isn't. The line
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
is effectively[1] the same as the following.
h = Hash.new(0)
dictionary.each { |word| h[word] += 1 }
dict_hash = h
In other words, the method allows one to write a single line that creates, constructs and returns the hash, rather than three lines that do the same.
Notice that I used the method Hash#fetch for retrieving values from the hash:
dict_hash.fetch(word, 0)
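fetch's second argument (here 0) is returned if dict_hash does not have a key equal to the value of word. By contrast, dict_hash[word] returns the hash's default value in that case: nil for a hash created without a default, or 0 for one created with Hash.new(0), as dict_hash was. Using fetch makes the fallback explicit and does not depend on how the hash was built.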
[1] The reason for "effectively" is that when using each_with_object, the variable h's scope is confined to the block, which is generally a good programming practice. Don't worry if you haven't learned about "scope" yet.
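To make the difference between [] and fetch concrete, here is a small sketch with two hypothetical hashes (counting and plain are names made up for this illustration):

```ruby
counting = Hash.new(0)   # has a default value of 0
counting["to"] += 1
plain = { "to" => 1 }    # no default value

counting_lookup = counting["missing"]        # 0, the hash's default
plain_lookup    = plain["missing"]           # nil, no default was set
plain_fetch     = plain.fetch("missing", 0)  # 0, fetch's own fallback
still_absent    = counting.key?("missing")   # false: reading a default stores nothing
```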
You can actually do this using Array#count rather easily:
def machine(word,list)
word.downcase.split(' ').collect do |w|
# for every word in `word`, count how many appearances in `list`
[w, list.count { |l| l.include?(w) }]
end.to_h
end
machine("just do it to", ["to","do","to","do","to","do"]) # => {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
I think this is what you're looking for, but it seems like you're approaching this backwards.
Convert your string "string" into an array, remove duplicate values and iterate through each element, counting the number of matches in your array "dictionary". The enumerable method :count is useful here.
A good data structure to output here would be a hash, where we store the unique words in our string "string" as keys and the number of occurrences of these words in array "dictionary" as the values. Hashes allow one to store more information about the data in a collection than an array or string, so this fits here.
dictionary = [ "to","do","to","do","to","do" ]
string = "just do it to"
def group_by_matches( match_str, list_of_words )
## trim leading and trailing whitespace and split string into array of words, remove duplicates.
to_match = match_str.strip.split.uniq
groupings = {}
## for each element in array of words, count the amount of times it appears *exactly* in the list of words array.
## store that in the groupings hash
to_match.each do | word |
groupings[ word ] = list_of_words.count( word )
end
groupings
end
group_by_matches( string, dictionary ) #=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
On a side note, you should consider using more descriptive variable and method names to help yourself and others follow what's going on.
This also seems like you have it backwards. Typically, you'd want to use the array to count the number of occurrences in the string. This seems to more closely fit a real-world application where you'd examine a sentence/string of data for matches from a list of predefined words.
Arrays are also useful because they're flexible collections of data, easily iterated through and mutated with enumerable methods. To work with the words in our string, as you can see, it's easiest to immediately convert it to an array of words.
There are many alternatives. If you wanted to shorten the method, you could replace the more verbose each loop with an each_with_object call or a map call which will return a new object rather than the original object like each. In the case of using map.to_h, be careful as to_h will work on a two-dimensional array [["key1", "val1"], ["key2", "val2"]] but not on a single dimensional array.
## each_with_object
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
each_with_object( {} ) { | word, groupings | groupings[ word ] = list_of_words.count( word ) }
end
## map
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
map { | word | [ word, list_of_words.count( word ) ] }.to_h
end
Gauge your method preferences depending on performance, readability, and reliability.
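One thing worth checking when choosing between the versions above is whether you want exact or substring matches. A short sketch with a made-up list shows how count with an argument differs from count with an include? block:

```ruby
list = ["to", "today", "do", "to"]

# With an argument, count compares whole elements using ==.
exact_matches = list.count("to")                               # 2

# With a block using String#include?, any element containing "to" matches.
substring_matches = list.count { |item| item.include?("to") }  # 3
```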
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
needs to be changed to
swerve.each do |i|
counter = 0
list.each do |j|
if i.include? j
counter += 1
As written, the outer loop walks the dictionary, so the hash is keyed by dictionary words: repeated words such as "to" overwrite the same key, and each counter only records how many words of the string match that one dictionary entry.
If you want to count how many times each word of the string appears in the dictionary, swap the list.each and swerve.each loops as shown. The method then returns the hash {"just"=>0, "do"=>3, "it"=>0, "to"=>3}.
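As an alternative not used in the answers above, and assuming Ruby 2.7 or later, Enumerable#tally can build the counting hash in one call. A possible sketch (the variable names are illustrative):

```ruby
dictionary = ["to", "do", "to", "do", "to", "do"]
string = "just do it to"

# tally counts how often each element occurs (Ruby 2.7+).
counts = dictionary.tally

# Look up each unique word of the string, falling back to 0 when it is absent.
result = string.downcase.split.uniq.to_h { |word| [word, counts.fetch(word, 0)] }
```

Here counts equals {"to"=>3, "do"=>3} and result equals {"just"=>0, "do"=>3, "it"=>0, "to"=>3}. Note that this counts exact word matches, not substrings.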

How do I increment a value for an uninitialized key in a hash?

If I try to increment the value for a key that does not yet exist in a hash like so
h = Hash.new
h[:ferrets] += 1
I get the following error:
NoMethodError: undefined method `+' for nil:NilClass
This makes sense to me, and I know this must be an incredibly easy question, but I'm having trouble finding it on SO. How do I add and increment such keys if I don't even know in advance what keys I will have?
you can set default value of hash in constructor
h = Hash.new(0)
h[:ferrets] += 1
p h[:ferrets]
Note that setting a default value has some pitfalls, so you must use it with care.
h = Hash.new([]) # does not work as expected: every missing key shares one array (after `h[:a].push(3)`, `h[:b]` would be `[3]`)
h = Hash.new { [] } # also does not work as expected: the new array is never stored (after `h[:a].push(3)`, `h[:a]` would be `[]`, not `[3]`)
h = Hash.new { |hash, key| hash[key] = [] } # use this one instead: it stores a fresh array under each key
Therefore using ||= might be simpler in some situations
h = Hash.new
h[:ferrets] ||= 0
h[:ferrets] += 1
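The array pitfalls can be seen side by side in a short sketch (shared, unsaved and stored are names chosen for this illustration):

```ruby
# One array object shared by every missing key.
shared = Hash.new([])
shared[:a].push(3)
shared_b    = shared[:b]    # [3]: the same array that :a pushed into
shared_keys = shared.keys   # empty: nothing was stored

# The block builds a new array on each miss but never stores it.
unsaved = Hash.new { [] }
unsaved[:a].push(3)
unsaved_a = unsaved[:a]     # a fresh empty array, the 3 is lost

# The block stores a new array under the missing key.
stored = Hash.new { |hash, key| hash[key] = [] }
stored[:a].push(3)
stored_a = stored[:a]       # [3]
```

For an immutable default such as 0, sharing one object is harmless, which is why Hash.new(0) works for counters.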
One way to fix this is to give your hash a default:
h = Hash.new
h.default = 0
h[:ferrets] += 1
puts h.inspect
#{:ferrets=>1}
The default default for a hash is nil, and nil doesn't understand how to add 1 to itself.
h = Hash.new{0}
h = Hash.new(0) # also works (thanks @Phrogz)
These are other ways to set the default while declaring the hash.

Ruby Hash explanation

I do not understand this particular step in CodeAcademy.
text = puts "Hello text please"
text = gets.chomp
words = text.split(' ')
frequencies = Hash.new(0)
words.each { |x| frequencies[x] += 1 }
The idea is to filter the input to return a hash with each word and the amount of times the word appears. Having trouble understanding why this works.
words.each { |x| frequencies[x] += 1 }
Doesn't hash work by a {key, value} method?
The syntax for setting hash value is:
hash_name[key] = value
And the value is referenced as hash_name[key]. So:
frequencies = Hash.new(0)
This creates a new hash with a default value of 0: reading an unknown key is allowed and returns 0 (the key itself is not added). Without the 0 argument there would be no default value, so reading the hash with an unknown key would yield nil. But with the default return value of 0, the following:
words.each { |x| frequencies[x] += 1 }
Takes advantage of the default by going through all of the words and using them as keys, even though they don't initially exist, incrementing the value frequencies[x] for the key x (the current word). If the value hasn't been set yet, it starts at 0, which is what you want when counting things. This is because += really means frequencies[x] = frequencies[x] + 1, and the initial value returned for frequencies[x] when it hasn't been set yet is 0.
I'm not sure exactly where your problem lies, but hopefully this will help.
Doesn't hash work by a {key, value} method?
Yes it does. In the line
words.each { |x| frequencies[x] += 1 }
the hash is called frequencies and the key is x. The value for that key is returned by the expression frequencies[x].
It's just like an array, but using strings as indices instead of integers. data[2] is the value stored at the element of array data identified by 2, while frequencies[x] is the value stored at the element of hash frequencies indicated by x.
+= has its usual meaning as a Ruby abbreviation, so that var += 1 is identical to var = var + 1.
So frequencies[x] += 1 is frequencies[x] = frequencies[x] + 1: it adds one to the current value of the hash element identified by x.
The last piece in the puzzle is the way frequencies has been created. Ordinarily, accessing a hash element that hasn't been assigned returns nil. Using += would usually raise an undefined method '+' for nil:NilClass error because there is no method NilClass#+. But using Hash.new(0) creates a hash with a default value of zero, so that non-existent elements of this hash evaluate as 0 instead of nil, and now everything works fine when you try to increment an element for the first time.
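The whole explanation can be tried out in a few lines. This sketch uses a fixed sentence in place of gets.chomp (an assumption, so that it runs without input) and writes the += out in its expanded form:

```ruby
words = "the cat and the hat".split(' ')

# With a default of 0, the first read of an unknown word returns 0.
frequencies = Hash.new(0)
words.each { |x| frequencies[x] = frequencies[x] + 1 } # same as frequencies[x] += 1

# Without a default, the first read returns nil and nil has no + method.
plain = {}
error_class =
  begin
    plain["the"] += 1
    nil
  rescue NoMethodError => e
    e.class
  end
```

After this runs, frequencies holds "the" => 2 and 1 for each of the other words, while error_class is NoMethodError.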

hashes ruby merge

My txt file contains a few lines, and I want to add each line to a hash with the first two words as the key and the third word as the value. The following code raises no errors, but the logic may be wrong: the last line is supposed to print all the keys of the hash, but nothing is printed. Please help.
def word_count(string)
count = string.count(' ')
return count
end
h = Hash.new
f = File.open('sheet.txt','r')
f.each_line do |line|
count = word_count(line)
if count == 3
a = line.split
h.merge(a[0]+a[1] => a[2])
end
end
puts h.keys
Hash#merge doesn't modify the hash you call it on, it returns the merged Hash:
merge(other_hash) → new_hash
Returns a new hash containing the contents of other_hash and the contents of hsh. [...]
Note the Returns a new hash... part. When you say this:
h.merge(a[0]+a[1] => a[2])
You're merging the new values you built into a copy of h and then throwing away the merged hash; the end result is that h never gets anything added to it and ends up being empty after all your work.
You want to use merge! to modify the Hash:
h.merge!(a[0]+a[1] => a[2])
or keep using merge but save the return value:
h = h.merge(a[0]+a[1] => a[2])
or, since you're only adding a single value, just assign it:
h[a[0] + a[1]] = a[2]
If you want to add the first three words of each line to the hash, regardless of how many words there are, then you can drop the if count == 3 line. Or you can change it to if count > 2 if you want to make sure that there are at least three words.
Also, mu is correct. You'll want h.merge!
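A short sketch of both points. The sample lines are made up and stand in for the contents of sheet.txt, and the three-word check uses the size of the split array, which is one possible way to do it rather than the original space count:

```ruby
# merge returns a new hash; merge! and []= change the receiver.
h = {}
returned = h.merge("ab" => "c")
size_after_merge = h.size          # 0: h itself is untouched
h.merge!("ab" => "c")
h["de"] = "f"

# Hypothetical lines in place of File.open('sheet.txt').each_line.
lines = ["alpha beta gamma\n", "too short\n", "one two three\n"]
pairs = {}
lines.each do |line|
  words = line.split
  next unless words.size == 3      # keep only lines with exactly three words
  pairs[words[0] + words[1]] = words[2]
end
```

Afterwards pairs.keys is ["alphabeta", "onetwo"].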

How do I dynamically decide which hash to add a value to?

I have a class that has hashes in various stages of "completion". This is to optimize so that I don't have to keep recreating hashes with root data that I already know. For example this is a counter called @root that would serve as a starting point.
{3=>4, 4=>1, 10=>3, 12=>5, 17=>1}
and it took key+key+key+key+key number of iterations to create @root. But now I have all combinations of [x,y] left to be added to the counter and individually evaluated. So I could do it like:
a = (1..52).to_a
a.combination(2) { |x, y|
evaluate(x, y)
}
But instead I would like to do this:
a.each{|x|
evaluate(x, "foo")
a.each {|y| evaluate(y, "bar")}
}
Where I have a method like this to keep track of the hash at each state:
def evaluate index, hsh
case hsh
when "root"
@root.key?(index) ? @root[index] += 1 : @root[index] = 1
when "foo"
@foo = @root.clone
@foo.key?(index) ? @foo[index] += 1 : @foo[index] = 1
when "bar"
@bar = @foo.clone
@bar.key?(index) ? @bar[index] += 1 : @bar[index] = 1
end
end
But there is a lot of repetition in this method. Is there a way that I could do this dynamically without using eval?
Instead of using hsh as a string descriptor, you can pass the hash object directly as a parameter to your evaluate method. E.g. instead of evaluate(x, "foo") you write
@foo = @root.clone
evaluate(x, @foo)
Also note that the @root.clone in your code overwrites the field several times inside the loop.
Additionally if you use a default initializer for your hash you save quite some logic in your code. E.g. the code lines
h = Hash.new{0}
...
h[index] += 1
will return zero for index if none was set, so the increment works on the first access. Thus you do not have to take care of the special case inside your evaluate method.
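Putting both suggestions together, a rough sketch might look like this. It uses local variables instead of the question's instance variables and made-up indexes, so treat it as an outline rather than a drop-in replacement:

```ruby
# evaluate receives the counter hash itself instead of a string naming it.
def evaluate(index, counter)
  counter[index] += 1   # safe on first access because the default is 0
  counter
end

root = Hash.new(0)
[3, 3, 4].each { |index| evaluate(index, root) }

# clone copies the stored counts and the default value, so foo starts from root's state.
foo = root.clone
evaluate(10, foo)

root_untouched = !root.key?(10)   # true: only the clone was changed
```

Cloning once before the inner work starts, rather than inside evaluate, also avoids overwriting the derived hash on every call.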
