Creating key/value hash out of array in ruby

I came across an old piece of code that builds a hash data type out of plain arrays in Ruby. I get most of it, except one part:
As I understand it, every element of @store is a bucket holding the key/value pairs whose keys map to that index. So @store[3] would hold the pairs for key=3, key=53, ... and in general any key where key % size == 3 (in this case size = 50).
However, when I set hash[3] = 7, hash[53] = 9, etc., every element of the @store array is populated with the key/value pairs, not just the element at index 3. It seems like the line @store[store_key] << [key, value] in the method []=(key, value) is adding [key, value] to every element of @store, and not just the one at index store_key. Any ideas?
class SimpleHash
  attr_accessor :size, :store
  def initialize(size)
    @size = size
    @store = Array.new(size, [])
  end
  def []=(key, value)
    store_key = key % @size
    index = find_key(key, @store[store_key])
    if index
      @store[store_key][index][1] = value
    else
      p "***********************************"
      p @store
      @store[store_key] << [key, value]
      p "after"
      p store_key
      p @store
    end
  end
end
hash = SimpleHash.new(50)
p hash
hash[3] = 5
p hash
hash[3] = 7
hash[53] = 9
hash[103] = 11
hash[104] = 11

You can simply do this:
@store = Array.new(size) { [] }
The block runs once per element, so every element is a separate array.

Although your question's a bit unclear, I can guess what the problem is.
@store = Array.new(size, [])
That creates an array of the right size, but every element is THE SAME OBJECT.
Change the array-within-an-array at any position, and the change shows up at every position.
Try instead:
@store = Array.new
size.times { @store << [] }
Every sub-array will be a separate object that way.
EDIT
@nafaa boutefer's answer is better. The block gets evaluated once for each element of the array, so each sub-array is a different object.
@store = Array.new(size) { [] }
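A minimal sketch of the difference between the two forms of Array.new, using a size of 3 and the illustrative variable names shared and separate (neither is from the original code):

```ruby
# Shared default: every slot references the same inner array.
shared = Array.new(3, [])
shared[0] << :x
p shared                               # => [[:x], [:x], [:x]]
p shared.map(&:object_id).uniq.size    # => 1

# Block form: the block runs once per slot, so each slot gets its own array.
separate = Array.new(3) { [] }
separate[0] << :x
p separate                             # => [[:x], [], []]
p separate.map(&:object_id).uniq.size  # => 3
```

The object_id check is just a quick way to confirm how many distinct inner arrays exist.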

Related

Ruby iterate through hash and compare value pairs

My Ruby assignment is to iterate through a hash and return the key associated with the lowest value, without using any of the following methods:
#keys #values #min #sort #min_by
I don't understand how to iterate through the hash and store each pair as it comes through, compare it to the last pair that came through, and return the lowest key. This is my code to show you my thought process, but it of course does not work. Any thoughts on how to do this? Thanks!
def key_for_min_value(name_hash)
index = 0
lowest_hash = {}
name_hash.collect do |key, value|
if value[index] < value[index + 1]
lowest = value
index = index + 1
key_for_min_value[value]
return lowest
end
end
end
Track min_value and key_for_min_value. Iterate through the hash, and any time the current value is lower than min_value, update both of these vars. At the end of the loop, return key_for_min_value.
I didn't include sample code because, hey, this is homework. :) Good luck!
One way to do it is to transform the hash into an array:
def key_for_min_value(name_hash)
  # Convert hash to array
  name_a = name_hash.to_a
  # Default key value
  d_value = 1000
  d_key = 0
  # Iterate new array
  name_a.each do |i|
    # If current value is lower than default, change value and key
    if i[1] < d_value
      d_value = i[1]
      d_key = i[0]
    end
  end
  return d_key
end
You might need to change d_value to something higher, or find something more creative :)
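One possible way to avoid picking a fixed starting value such as 1000 is to start with nil and treat the first pair seen as the current minimum. This is only a sketch; the names min_key and min_value are illustrative, and it assumes the values are comparable with <. It uses only each, so it stays within the assignment's restrictions.

```ruby
def key_for_min_value(name_hash)
  min_key = nil
  min_value = nil
  name_hash.each do |key, value|
    # The first pair always becomes the minimum; later pairs must be lower.
    if min_value.nil? || value < min_value
      min_value = value
      min_key = key
    end
  end
  min_key # nil when the hash is empty
end

p key_for_min_value({"a" => 5000, "b" => 2000, "c" => 7000})  # => "b"
p key_for_min_value({})                                       # => nil
```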
We can use Enumerable#reduce method to compare entries and pick the smallest value. Each hash entry gets passed in as an array with 2 elements in reduce method, hence, I am using Array#first and Array#last methods to access key and values.
h = {"a" => 1, "b" => 2, "c" => 0}
p h.reduce{ |f, s| f.last > s.last ? s : f }.first
#=> "c"

Comparing values of one hash to many hashes to get inverse document frequency in ruby

I'm trying to find the inverse document frequency for a categorization algorithm and am having trouble getting it the way that my code is structured (with nested hashes), and generally comparing one hash to many hashes.
My training code looks like this so far:
def train!
  @data = {}
  @all_books.each do |category, books|
    @data[category] = {
      words: 0,
      books: 0,
      freq: Hash.new(0)
    }
    books.each do |filename, tokens|
      @data[category][:words] += tokens.count
      @data[category][:books] += 1
      tokens.each do |token|
        @data[category][:freq][token] += 1
      end
    end
    @data[category][:freq].map { |k, v| v = (v / @data[category][:freq].values.max) }
  end
end
Basically, I have a hash with 4 categories (subject to change), and for each have word count, book count, and a frequency hash which shows term frequency for the category. How do I get the frequency of individual words from one category compared against the frequency of the words shown in all categories? I know how to do the comparison for one set of hash keys against another, but am not sure how to loop through a nested hash to get the frequency of terms against all other terms, if that makes sense.
Edit to include predicted outcome -
I'd like to return a hash of nested hashes (one for each category) that shows the word as the key, and the number of other categories in which it appears as the value, e.g. {:category1 => {:word => 3, :other => 2, :third => 1}, :category2 => {:another => 1, ...}}. Alternatively, an array of category names as the value, instead of the number of categories, would also work.
I've tried creating a new hash as follows, but it's turning up empty:
def train!
  @data = {}
  @all_words = Hash.new([]) # new hash for all words, default value is empty array
  @all_books.each do |category, books|
    @data[category] = {
      words: 0,
      books: 0,
      freq: Hash.new(0)
    }
    books.each do |filename, tokens|
      @data[category][:words] += tokens.count
      @data[category][:books] += 1
      tokens.each do |token|
        @data[category][:freq][token] += 1
        @all_words[token] << category # should insert category name if the word appears, right?
      end
    end
    @data[category][:freq].map { |k, v| v = (v / @data[category][:freq].values.max) }
  end
end
If someone can help me figure out why the @all_words hash is empty when the code is run, I may be able to get the rest.
I haven't gone through it all, but you certainly have an error:
@all_words[token] << category # should insert category name if the word appears, right?
Nope. For a missing key, @all_words[token] returns the hash's default array but does not create a new slot for that key, like you're assuming. The << appends to that one shared default array, so the statement never adds anything to the @all_words hash itself.
Try these 2 changes and see if it helps:
@all_words = {} # ditch the default value
...
(@all_words[token] ||= []) << category # lazy-init the array, and append
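A small sketch of the pitfall, with made-up keys and the illustrative names shared and lazy. It also shows the block form of Hash.new, which is another common way to get a fresh array per key, in addition to the ||= approach above:

```ruby
# Default object form: one array is handed back for every missing key.
shared = Hash.new([])
shared["cat"] << :fiction
shared["dog"] << :science
p shared.empty?      # => true (no key was ever stored)
p shared["unseen"]   # => [:fiction, :science] (the shared default array)

# Block form: the block stores a new array under each missing key.
lazy = Hash.new { |hash, key| hash[key] = [] }
lazy["cat"] << :fiction
lazy["dog"] << :science
p lazy.keys          # => ["cat", "dog"]
p lazy["cat"]        # => [:fiction]
```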

Ruby grocery list program

I am currently learning Ruby and I'm trying to write a simple Ruby grocery_list method. Here are the instructions:
We want to write a program to help keep track of a grocery list. It takes a grocery item (like "eggs") as an argument, and returns the grocery list (that is, the item names with the quantities of each item). If you pass the same argument twice, it should increment the quantity.
def grocery_list(item)
array = []
quantity = 1
array.each {|x| quantity += x }
array << "#{quantity}" + " #{item}"
end
puts grocery_list("eggs", "eggs")
so I'm trying to figure out here how to return "2 eggs" by passing eggs twice
To help you count the different items you can use a Hash. A Hash is similar to an Array, but with Strings instead of Integers as an index:
a = Array.new
a[0] = "this"
a[1] = "that"
h = Hash.new
h["sonja"] = "asecret"
h["brad"] = "beer"
In this example the Hash might be used for storing passwords for users. But for your
example you need a hash for counting. Calling grocery_list("eggs", "beer", "milk", "eggs")
should lead to the following commands being executed:
h = Hash.new(0) # empty hash {} created, 0 will be default value
h["eggs"] += 1 # h is now {"eggs"=>1}
h["beer"] += 1 # {"eggs"=>1, "beer"=>1}
h["milk"] += 1 # {"eggs"=>1, "beer"=>1, "milk"=>1}
h["eggs"] += 1 # {"eggs"=>2, "beer"=>1, "milk"=>1}
You can work through all the keys and values of a Hash with the each-loop:
h.each{|key, value| .... }
and build up the string we need as a result, adding
the number of items if needed, and the name of the item.
Inside the loop we always add a comma and a blank at the end.
This is not needed for the last element, so after the
loop is done we are left with
"2 eggs, beer, milk, "
To get rid of the last comma and blank we can use chop!, which "chops off"
one character at the end of a string:
output.chop!.chop!
One more thing is needed to get the complete implementation of your grocery_list:
you specified that the function should be called like so:
puts grocery_list("eggs", "beer", "milk","eggs")
So the grocery_list function does not know how many arguments it's getting. We can handle
this by specifying one argument with a star in front, then this argument will
be an array containing all the arguments:
def grocery_list(*items)
# items is an array
end
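As a quick illustration of the star argument, here is a tiny method (show_items is a made-up name) that just returns what it received:

```ruby
def show_items(*items)
  items # always an Array, however many arguments were passed
end

p show_items("eggs", "beer", "eggs")  # => ["eggs", "beer", "eggs"]
p show_items                          # => []
```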
So here it is: I did your homework for you and implemented grocery_list.
I hope you actually go to the trouble of understanding the implementation,
and don't just copy-and-paste it.
def grocery_list(*items)
  hash = Hash.new(0)
  items.each {|x| hash[x] += 1}
  output = ""
  hash.each do |item,number|
    if number > 1 then
      output += "#{number} "
    end
    output += "#{item}, "
  end
  output.chop!.chop!
  return output
end
puts grocery_list("eggs", "beer", "milk","eggs")
# output: 2 eggs, beer, milk
def grocery_list(*item)
  item.group_by{|i| i}
end
p grocery_list("eggs", "eggs","meat")
#=> {"eggs"=>["eggs", "eggs"], "meat"=>["meat"]}
def grocery_list(*item)
  item.group_by{|i| i}.flat_map{|k,v| [k,v.length]}
end
p grocery_list("eggs", "eggs","meat")
#=>["eggs", 2, "meat", 1]
def grocery_list(*item)
  Hash[*item.group_by{|i| i}.flat_map{|k,v| [k,v.length]}]
end
grocery_list("eggs", "eggs","meat")
#=> {"eggs"=>2, "meat"=>1}
grocery_list("eggs", "eggs","meat","apple","apple","apple")
#=> {"eggs"=>2, "meat"=>1, "apple"=>3}
or as @Lee said:
def grocery_list(*item)
  item.each_with_object(Hash.new(0)) {|a, h| h[a] += 1 }
end
grocery_list("eggs", "eggs","meat","apple","apple","apple")
#=> {"eggs"=>2, "meat"=>1, "apple"=>3}
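If you are on Ruby 2.7 or newer, Enumerable#tally does the same counting in one call. A sketch, assuming that Ruby version is available (counts is an illustrative name):

```ruby
def grocery_list(*items)
  items.tally # counts occurrences of each element; requires Ruby 2.7+
end

counts = grocery_list("eggs", "eggs", "meat")
p counts["eggs"]  # => 2
p counts["meat"]  # => 1
```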
Use a Hash Instead of an Array
When you want an easy way to count things, you can use a hash key to hold the name of the thing you want to count, and the value of that key is the quantity. For example:
#!/usr/bin/env ruby
class GroceryList
  attr_reader :list
  def initialize
    # Specify hash with default quantity of zero.
    @list = Hash.new(0)
  end
  # Increment the quantity of each item in the @list, using the name of the item
  # as a hash key.
  def add_to_list(*items)
    items.each { |item| @list[item] += 1 }
    @list
  end
end
if $0 == __FILE__
  groceries = GroceryList.new
  groceries.add_to_list('eggs', 'eggs')
  puts 'Grocery list correctly contains 2 eggs.' if groceries.list['eggs'] == 2
end
Here's a more verbose, but perhaps more readable, solution to your challenge.
def grocery_list(*items) # Notice the asterisk in front of items. It means "put all the arguments into an array called items".
  my_grocery_hash = {} # Creates an empty hash.
  items.each do |item| # Loops over the argument array and passes each argument into the loop as item.
    if my_grocery_hash[item].nil? # Returns true if the item is not a present key in the hash...
      my_grocery_hash[item] = 1 # Adds the key and sets the value to 1.
    else
      my_grocery_hash[item] = my_grocery_hash[item] + 1 # Increments the value by one.
    end
  end
  my_grocery_hash # Returns a hash object with the grocery name as the key and the number of occurrences as the value.
end
This will create an empty hash (called a dictionary or map in other languages) where each grocery is added as a key with the value set to one. If the same grocery appears multiple times as an argument to your method, the value is incremented.
If you want to create a text string and return that instead of the hash object, you can do this after the iteration:
grocery_list_string = "" # Creates an empty string.
my_grocery_hash.each do |key, value| # Loops over the hash object and passes two local variables into the loop with the current entry. Key being the name of the grocery and value being the amount.
  grocery_list_string << "#{value} units of #{key}\n" # Appends to grocery_list_string. Uses string interpolation, so #{value} becomes 3 and #{key} becomes eggs. The remaining \n is a newline character.
end
return grocery_list_string # Explicitly declares the return value. You can omit return.
Updated answer to comment:
If you use the first method without adding the hash iteration you will get a hash object back which can be used to look up the amount like this.
my_hash_with_grocery_count = grocery_list("Lemonade", "Milk", "Eggs", "Lemonade", "Lemonade")
my_hash_with_grocery_count["Milk"]
--> 1
my_hash_with_grocery_count["Lemonade"]
--> 3
Enumerable#each_with_object can be useful for things like this:
def list_to_hash(*items)
items.each_with_object(Hash.new(0)) { |item, list| list[item] += 1 }
end
def hash_to_grocery_list_string(hash)
hash.each_with_object([]) do |(item, number), result|
result << (number > 1 ? "#{number} #{item}" : item)
end.join(', ')
end
def grocery_list(*items)
hash_to_grocery_list_string(list_to_hash(*items))
end
p grocery_list('eggs', 'eggs', 'bread', 'milk', 'eggs')
# => "3 eggs, bread, milk"
It iterates an array or hash to enable building another object in a convenient way. The list_to_hash method uses it to build a hash from the items array (the splat operator converts the method arguments to an array); the hash is created so that each value is initialized to 0. The hash_to_grocery_list_string method uses it to build an array of strings that is joined to a comma-separated string.

Ruby 'tap' method - inside assignment

Recently I discovered that tap can be used in order to "drily" assign values to new variables; for example, for creating and filling an array, like this:
array = [].tap { |ary| ary << 5 if something }
This code will push 5 into array if something is truthy; otherwise, array will remain empty.
But I don't understand why after executing this code:
array = [].tap { |ary| ary += [5] if something }
array remains empty. Can anyone help me?
In the first case array and ary point to the same object. You then mutate that object using the << method. The object that both array and ary point to is now changed.
In the second case array and ary again both point to the same array. You now reassign the ary variable, so that ary now points to a new array. Reassigning ary however has no effect on array. In Ruby, reassigning a variable never affects other variables, even if they pointed to the same object before the reassignment.
In other words array is still empty for the same reason that x won't be 42 in the following example:
x = 23
y = x
y = 42 # Changes y, but not x
Edit: To append one array to another in-place you can use the concat method, which should also be faster than using +=.
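A short sketch of the concat suggestion next to the += version, with something set to true for the sake of the example (reassigned is an illustrative name):

```ruby
something = true

# concat mutates the receiver, so the array that tap returns is changed.
array = [].tap { |ary| ary.concat([5]) if something }
p array       # => [5]

# += rebinds the block variable to a new array; the receiver is untouched.
reassigned = [].tap { |ary| ary += [5] if something }
p reassigned  # => []
```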
I want to expand on this a bit:
array = [].tap { |ary| ary << 5 if something }
What this does (assuming something is true-ish):
creates [], an empty array; this is the object that array will end up referring to.
array.object_id = 2152428060
passes [] to the block as ary. ary and array are pointing to the same array object.
array.object_id = 2152428060
ary.object_id = 2152428060
ary << 5: << is a mutating method, meaning it modifies the receiving object. It is similar to the convention of ending a method name with !, meaning "modify this in place!", as in .map vs .map! (though the bang does not hold any intrinsic meaning on its own in a method name). 5 is appended to the object, so ary and array are both [5].
array.object_id = 2152428060
ary.object_id = 2152428060
We end with array being equal to [5]
In the second example:
array = [].tap{ |ary| ary += [5] if something }
same
same
ary += [5]: += is short for ary = ary + [5], so it first evaluates ary + [5] and then assigns (=) the result to ary, in that order. It gives the appearance of modifying an object in place, but it does not: + creates an entirely new array.
array.object_id = 2152428060
ary.object_id = 2152322420
So we end with array as the original object, an empty array with object_id = 2152428060, and ary, a one-element array containing 5, with object_id = 2152322420. Nothing happens to ary after this. It plays no part in the assignment to array: tap ignores the block's result and returns its receiver, the original empty array, and that is what array gets.
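The object identities can be checked directly with equal? instead of reading object_id values. In this sketch, original, seen_in_block and result are illustrative names:

```ruby
original = []
seen_in_block = nil

result = original.tap do |ary|
  ary += [5]           # builds a new array and rebinds the block variable
  seen_in_block = ary  # remember which object ary refers to now
end

p result.equal?(original)         # => true  (tap returned its receiver)
p seen_in_block.equal?(original)  # => false (a different object)
p seen_in_block                   # => [5]
p result                          # => []
```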

difficulty modifying two dimensional ruby array

Excuse the newbie question. I'm trying to create a two dimensional array in ruby, and initialise all its values to 1. My code is creating the two dimensional array just fine, but fails to modify any of its values.
Can anyone explain what I'm doing wrong?
def mda(width,height)
  #make a two dimensional array
  a = Array.new(width)
  a.map! { Array.new(height) }
  #init all its values to 1
  a.each do |row|
    row.each do |column|
      column = 1
    end
  end
  return a
end
In the line row.each do |column|, the variable column is a block-local variable that receives each value in row. Assigning to it only rebinds that local variable; it does not change the array. You must do:
def mda(width,height)
  a = Array.new(width)
  a.map! { Array.new(height) }
  a.each do |row|
    row.map!{1}
  end
  return a
end
Or better:
def mda(width,height)
  a = Array.new(width)
  a.map! { Array.new(height) }
  a.map do |row|
    row.map!{1}
  end
end
Or better:
def mda(width,height)
  a = Array.new(width){ Array.new(height) }
  a.map do |row|
    row.map!{1}
  end
end
Or better:
def mda(width,height)
  Array.new(width) { Array.new(height){1} }
end
each passes into the block parameter the value of each element, not the element itself, so column = 1 doesn't actually modify the array.
You can do this in one step, though - see the API docs for details on the various forms of Array#new. Try a = Array.new(width) {|i| Array.new(height) {|j| 1 } }
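A small sketch of the point about each, using a single row (the variable names are illustrative):

```ruby
row = [nil, nil, nil]

# Assigning to the block parameter only rebinds the local name.
row.each { |cell| cell = 1 }
p row  # => [nil, nil, nil]

# map! replaces every element with the block's result.
row.map! { 1 }
p row  # => [1, 1, 1]
```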
You can create it like this:
a=Array.new(width) { Array.new(height,1) }
column in your nested each loop is a copy of the value at that place in the array, not a pointer/reference to it, so when you change its value you're only changing the value of the copy (which ceases to exist outside the block).
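If you just want a two-dimensional array populated with 1s and never intend to modify it, something as simple as this will work (note that every row is the same array object, so changing one row changes them all):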
def mda(width,height)
[ [1] * width ] * height
end
Pretty simple.
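One caveat with the multiplication form: the outer * repeats a reference to a single inner array, so the rows are not independent. A sketch with a 3-wide, 2-high grid (grid and safe are illustrative names):

```ruby
# Every row is the same object, so one write shows up in all rows.
grid = [[1] * 3] * 2
grid[0][0] = 9
p grid  # => [[9, 1, 1], [9, 1, 1]]

# The block form builds a separate row each time.
safe = Array.new(2) { Array.new(3, 1) }
safe[0][0] = 9
p safe  # => [[9, 1, 1], [1, 1, 1]]
```

Array.new(3, 1) is fine for the inner rows because integers are immutable; the sharing only matters for mutable objects such as arrays.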
By the way, if you want to know how to modify the elements of a two-dimensional array as you're iterating over it, here's one way (starting from line 6 in your code):
#init all its values to 1
a.length.times do |i|
a[i].length.times do |j|
a[i][j] = 1
end
end
