calculations based on matching keys' values in Ruby hashes - ruby

I'm making an artist recommendation program that matches a hash of the artists a user has seen live (and how many times) against a hash of the artists a given artist has shared a bill with (and how many times). The match score is calculated from these numbers whenever the user has seen some artist x times and the given artist has played with that artist at least once, like this:
user = {"artist7" => 3, "artist8" => 1}
artist1 = {"artist6" => 7, "artist7" => 7}
match = 0
user.each do |k, v|
if artist1[k]
match += (1 - ((user[k] - artist1[k])/(user[k] + artist1[k])).abs)
end
end
I have tried this out in irb and the value of match does not change.

All your inputs are integers, so Ruby uses integer division. For the one matching key, (3 - 7)/(3 + 7) evaluates to -1 (integer division rounds toward negative infinity), its absolute value is 1, and 1 - 1 is zero. Add a to_f to your equation to use float division instead, e.g.:
match += (1 - ((user[k] - artist1[k]).to_f/(user[k] + artist1[k])).abs)
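To make the difference concrete, here is a minimal sketch using the hashes from the question. The helper name match_score is made up for illustration; only the formula comes from the question.

```ruby
user    = { "artist7" => 3, "artist8" => 1 }
artist1 = { "artist6" => 7, "artist7" => 7 }

# Integer division floors the quotient, so the fraction is lost.
integer_quotient = (3 - 7) / (3 + 7)        # integer result
float_quotient   = (3 - 7).to_f / (3 + 7)   # float result

# match_score is a hypothetical helper, not part of the original code.
def match_score(user, artist)
  user.sum(0.0) do |name, seen|
    shared = artist[name]
    next 0.0 unless shared # skip artists the two hashes do not share
    1 - ((seen - shared).to_f / (seen + shared)).abs
  end
end

score = match_score(user, artist1)
puts integer_quotient
puts float_quotient
puts score
```

Only "artist7" appears in both hashes, so the score comes from that single pair.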

Related

Optimize print output where i use check on zero. Ruby

Currently, I'm having print like this
print ((stamp_amount[0], 'first mark') unless stamp_amount[0].zero?), (', ' if !stamp_amount[0].zero? && !stamp_amount[1].zero?),
((stamp_amount[1], 'second mark') unless stamp_amount[1].zero?)
stamp_amount is an array with 2 integer values
Let's say in the current situation stamp_amount[0] = 10 and stamp_amount[1] = 3
Output preview:
10 first mark, 3 second mark
So if stamp_amount[0] = 0, the 10 first mark, part won't be shown. Likewise, if stamp_amount[1] = 0, the , 3 second mark part won't be shown.
This seems a bit clumsy to me. Could you please suggest a more correct or less painful way to print this? :)
Cheers!
Your code is trying to join a sequence of up to two elements with a separator. The joining is a solved problem, see Array#join.
The problem can be then reduced to "how can I produce the correct sequence, given my stamp_amount input". Now this can be done in a thousand ways. Here's one:
def my_print(stamp_amount)
ary = [
!stamp_amount[0].zero? && stamp_amount[0],
!stamp_amount[1].zero? && stamp_amount[1],
].select{|elem| elem }
ary.join(', ')
end
my_print([10, 3]) # => "10, 3"
my_print([0, 3]) # => "3"
my_print([10, 0]) # => "10"
my_print([0, 0]) # => ""
Here's another
ary = []
ary << stamp_amount[0] unless stamp_amount[0].zero?
ary << stamp_amount[1] unless stamp_amount[1].zero?
ary.join(', ')
Here's yet another. This version can handle stamp_amount of any length.
ary = stamp_amount.reject(&:zero?)
ary.join(', ')
I'd go with the third, but the second one may be the easiest to understand for a beginner.
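The versions above join only the numbers. If the labels from the question should be kept, one possible sketch pairs each amount with its label before rejecting the zeros. The method name stamp_summary and the LABELS constant are assumptions made for this example.

```ruby
# Labels taken from the question; the constant name is hypothetical.
LABELS = ['first mark', 'second mark'].freeze

# Pair each amount with its label, drop the zero amounts,
# then format and join what is left.
def stamp_summary(stamp_amount, labels = LABELS)
  stamp_amount.zip(labels)
              .reject { |amount, _label| amount.zero? }
              .map { |amount, label| "#{amount} #{label}" }
              .join(', ')
end

puts stamp_summary([10, 3])
puts stamp_summary([0, 3])
```

Because the pairing happens before the zeros are removed, each amount keeps its own label even when an earlier amount is zero.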
Use select as an alternative to reject (shown in part 3 of the answer by Sergio Tulentsev). It is just as readable, and depending on the context and on future changes to the code, you may prefer one over the other.
puts stamp_amount.select{ |a| !a.zero? }.join(", ")
A few examples of inputs and outputs are:
stamp_amount    output
--------------------------------------------------------------------------
10, 3           10, 3
10, 0           10
0, 3            3
0, 0            (prints an empty line, because the selected array is empty)
You're calculating zero? on index points more often than is needed, but the first thing I would look at refactoring here is the readability of the code. It might be nicer to calculate the message to print outside of the print method and explain what is happening with variable names.
# rubocop is going to complain about variable assignment like this
first_amount, second_amount = *stamp_amount
We can actually use the reason rubocop prefers the .zero? over == 0 or .empty? method to guide our development. zero? is in essence just empty? but it communicates the meaning of what you are attempting to do in a better manner. I would use this reasoning when assigning strings to variables that explain what they are doing.
some_name_that_explains_what_this_is_0 = "#{first_amount} piecu centu marka"
some_name_that_explains_what_this_is_1 = "#{second_amount} tris centu marka"
Your current code is confusing because it could print a string like "10 tris centu marka", which does not make sense and is probably not what you are after, considering this evaluates to 'second mark'; that would pose an issue if the first value is zero. We could also reject zero integers before we start converting them to strings.
array = [1, 0].reject(&:zero?)
Now we can take the array and do something like:
string = []
array.each_with_index { |e, i| string << "#{e} #{Ordinalize.new(i).ordinalize} mark" }
message = string.join(', ')
print(message)
# ord class
class Ordinalize
def initialize(value)
@value = value
end
def ordinalize
mapping[@value]
end
def mapping
# accounting for zero index
['first', 'second']
end
end
where we are calculating the ordinalization and letting our new class handle the sentence structure for us.
Outputs:
[1, 0] => "1 first mark"
[0, 1] => "1 first mark"
[1, 2] => "1 first mark, 2 second mark"

Bug in my Ruby counter

It is only counting once for each word. I want it to tell me how many times each word appears.
dictionary = ["to","do","to","do","to","do"]
string = "just do it to"
def machine(word,list)
initialize = Hash.new
swerve = word.downcase.split(" ")
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
end
end
initialize[i]=counter
end
return initialize
end
machine(string,dictionary)
I assume that, for each word in string, you wish to determine the number of instances of that word in dictionary. If so, the first step is to create a counting hash.
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
#=> {"to"=>3, "do"=>3}
(I will explain this code later.)
Now split string on whitespace and create a hash whose keys are the words in string and whose values are the numbers of times that the value of word appears in dictionary.
string.split.each_with_object({}) { |word,h| h[word] = dict_hash.fetch(word, 0) }
#=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
This of course assumes that each word in string is unique. If not, depending on the desired behavior, one possibility would be to use another counting hash.
string = "to just do it to"
string.split.each_with_object(Hash.new(0)) { |word,h|
h[word] += dict_hash.fetch(word, 0) }
#=> {"to"=>6, "just"=>0, "do"=>3, "it"=>0}
Now let me explain some of the constructs above.
I created two hashes with the form of the class method Hash::new that takes a parameter equal to the desired default value, which here is zero. What that means is that if
h = Hash.new(0)
and h does not have a key equal to the value word, then h[word] will return h's default value (and the hash h will not be changed). After creating the first hash that way, I wrote h[word] += 1. Ruby expands that to
h[word] = h[word] + 1
before she does any further processing. The first word in dictionary that is passed to the block is "to" (which is assigned to the block variable word). Since the hash h is initially empty (has no keys), h[word] on the right side of the above equality returns the default value of zero, giving us
h["to"] = h["to"] + 1
#=> = 0 + 1 => 1
Later, when word again equals "to" the default value is not used because h now has a key "to".
h["to"] = h["to"] + 1
#=> = 1 + 1 => 2
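A small sketch of the behaviour just described: reading a missing key returns the default without adding the key, while += stores a value. The variable keys_after_read exists only for this illustration.

```ruby
h = Hash.new(0)

value_for_missing_key = h["to"] # returns the default; h is not changed
keys_after_read = h.keys.dup    # still empty at this point

h["to"] += 1 # h["to"] = h["to"] + 1, the default is used on the right side
h["to"] += 1 # now the stored value is used instead of the default

p value_for_missing_key
p keys_after_read
p h
```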
I used the well-worn method Enumerable#each_with_object. To a newbie this might seem complex. It isn't. The line
dict_hash = dictionary.each_with_object(Hash.new(0)) { |word,h| h[word] += 1 }
is effectively [1] the same as the following.
h = Hash.new(0)
dictionary.each { |word| h[word] += 1 }
dict_hash = h
In other words, the method allows one to write a single line that creates, constructs and returns the hash, rather than three lines that do the same.
Notice that I used the method Hash#fetch for retrieving values from the hash:
dict_hash.fetch(word, 0)
fetch's second argument (here 0) is returned if dict_hash does not have a key equal to the value of word. By contrast, dict_hash[word] returns nil in that case.
[1] The reason for "effectively" is that when using each_with_object, the variable h's scope is confined to the block, which is generally a good programming practice. Don't worry if you haven't learned about "scope" yet.
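On Ruby 2.7 or later, the counting hash can also be built with Enumerable#tally. This is a sketch of an alternative, not what the answer above uses; the variable names dict_counts and result are chosen here for illustration.

```ruby
dictionary = ["to", "do", "to", "do", "to", "do"]
string = "just do it to"

# tally counts how often each element occurs (Ruby 2.7+).
dict_counts = dictionary.tally

# to_h with a block (Ruby 2.6+) builds the hash from [key, value] pairs.
result = string.split.to_h { |word| [word, dict_counts.fetch(word, 0)] }

p dict_counts
p result
```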
You can actually do this using Array#count rather easily:
def machine(word,list)
word.downcase.split(' ').collect do |w|
# for every word in `word`, count how many appearances in `list`
[w, list.count { |l| l.include?(w) }]
end.to_h
end
machine("just do it to", ["to","do","to","do","to","do"]) # => {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
I think this is what you're looking for, but it seems like you're approaching this backwards
Convert your string "string" into an array, remove duplicate values and iterate through each element, counting the number of matches in your array "dictionary". The enumerable method :count is useful here.
A good data structure to output here would be a hash, where we store the unique words in our string "string" as keys and the number of occurrences of these words in array "dictionary" as the values. Hashes allow one to store more information about the data in a collection than an array or string, so this fits here.
dictionary = [ "to","do","to","do","to","do" ]
string = "just do it to"
def group_by_matches( match_str, list_of_words )
## trim leading and trailing whitespace and split string into array of words, remove duplicates.
to_match = match_str.strip.split.uniq
groupings = {}
## for each element in array of words, count the amount of times it appears *exactly* in the list of words array.
## store that in the groupings hash
to_match.each do | word |
groupings[ word ] = list_of_words.count( word )
end
groupings
end
group_by_matches( string, dictionary ) #=> {"just"=>0, "do"=>3, "it"=>0, "to"=>3}
On a side note, you should consider using more descriptive variable and method names to help yourself and others follow what's going on.
This also seems like you have it backwards. Typically, you'd want to use the array to count the number of occurrences in the string. This seems to more closely fit a real-world application where you'd examine a sentence/string of data for matches from a list of predefined words.
Arrays are also useful because they're flexible collections of data, easily iterated through and mutated with enumerable methods. To work with the words in our string, as you can see, it's easiest to immediately convert it to an array of words.
There are many alternatives. If you wanted to shorten the method, you could replace the more verbose each loop with an each_with_object call or a map call which will return a new object rather than the original object like each. In the case of using map.to_h, be careful as to_h will work on a two-dimensional array [["key1", "val1"], ["key2", "val2"]] but not on a single dimensional array.
## each_with_object
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
each_with_object( {} ) { | word, groupings | groupings[ word ] = list_of_words.count( word ) }
end
## map
def group_by_matches( match_str, list_of_words )
to_match = match_str.strip.split.uniq
to_match.
map { | word | [ word, list_of_words.count( word ) ] }.to_h
end
Gauge your method preferences depending on performance, readability, and reliability.
list.each do |i|
counter = 0
swerve.each do |j|
if i.include? j
counter += 1
needs to be changed to
swerve.each do |i|
counter = 0
list.each do |j|
if i.include? j
counter += 1
Your code counts, for each word in the dictionary, how many words of the string it contains, so every dictionary word ends up with a count of 1.
If you want to know how many times each word of the string appears in the dictionary, switch the list.each and swerve.each loops. The method will then return the hash {"just"=>0, "do"=>3, "it"=>0, "to"=>3}.
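Put together, the method with the two loops swapped might look like the sketch below. The local variable is renamed from initialize to counts here purely for readability; that rename is not part of the suggested fix.

```ruby
dictionary = ["to", "do", "to", "do", "to", "do"]
string = "just do it to"

def machine(word, list)
  counts = Hash.new
  swerve = word.downcase.split(" ")
  swerve.each do |i|     # outer loop: each word of the string
    counter = 0
    list.each do |j|     # inner loop: each dictionary entry
      counter += 1 if i.include? j
    end
    counts[i] = counter  # one entry per word of the string
  end
  counts
end

p machine(string, dictionary)
```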

Calculating totals from two different hashes

I have two hashes:
For example, one contains a list of dishes and their prices
dishes = {"Chicken"=>12.5, "Pizza"=>10, "Pasta"=>8.99}
The other is a basket hash i.e. I've selected one pasta and two pizzas:
basket = {"Pasta"=>1, "Pizza"=>2}
Now I am trying to calculate the total cost of the basket but can't seem to get my references right.
Have tried
basket.inject { |item, q| dishes[item] * q }
But keep getting the following error
NoMethodError: undefined method `*' for nil:NilClass
basket.inject { |item, q| dishes[item] * q }
Let's look at the documentation for Enumerable#inject to see what is going on. inject "folds" the collection into a single object, by taking a "starting object" and then repeatedly applying the binary operation to the starting object and the first element, then to the result of that and the second element, then to the result of that and the third element, and so forth.
So, the block receives two arguments: the current value of the accumulator and the current element, and the block returns the new value of the accumulator for the next invocation of the block. If you don't supply a starting value for the accumulator, then the first element of the collection is used.
So, during the first iteration here, since you didn't supply a starting value for the accumulator, the value is going to be the first element; and iteration is going to start from the second element. This means that during the first iteration, item is going to be ['Pasta', 1] and q is going to be ['Pizza', 2]. Let's just run through the example in our heads:
dishes[item] * q # item is ['Pasta', 1]
dishes[['Pasta', 1]] * q # q is ['Pizza', 2]
dishes[['Pasta', 1]] * ['Pizza', 2] # there is no key ['Pasta', 1] in dishes
nil * ['Pizza', 2] # nil doesn't have * method
Ergo, you get a NoMethodError.
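To see this for yourself, the following sketch records what the block receives when no starting value is given. The array block_calls exists only for this illustration.

```ruby
basket = { "Pasta" => 1, "Pizza" => 2 }

block_calls = []
basket.inject do |accumulator, element|
  block_calls << [accumulator, element] # record both block arguments
  accumulator                           # keep the accumulator unchanged
end

p block_calls
```

The block runs only once for a two-element hash, and both arguments are [key, value] pairs rather than a dish name and a quantity.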
Now, I believe, what you actually wanted to do was something like this:
basket.inject(0.0) {|sum, (item, q)| sum + dishes[item] * q }
#             ↑↑↑    ↑↑↑             ↑↑↑↑↑
You don't want to accumulate orders, you want to accumulate numbers, so you need to supply a number as the starting value; if you don't, the starting value will be the first element, which is an order, not a number
You were mixing up the meaning of the block parameters
You weren't actually summing anything
Now, while inject is capable of summing (in fact, inject is capable of anything, it is a general iteration operation, i.e. anything you could do with a loop, you can also do with inject), it is usually better to use more specialized operations if they exist. In this case, a more specialized operation for summing does exist, and it is called Enumerable#sum:
basket.sum {|item, q| dishes[item] * q }
But there is a deeper underlying problem with your code: Ruby is an object-oriented language. It is not an array-of-hash-of-strings-and-floats-oriented language. You should build objects that represent your domain abstractions:
class Dish < Struct.new(:name, :price)
def to_s; "#{name}: $#{price}" end
def *(num) num * price end
def coerce(other) [other, price] end
end
require 'bigdecimal'
require 'bigdecimal/util'
dishes = {
chicken: Dish.new('Chicken', '12.5'.to_d),
pizza: Dish.new('Pizza', '10'.to_d),
pasta: Dish.new('Pasta', '8.99'.to_d)
}
class Order < Struct.new(:dish, :quantity)
def to_s; "#{quantity} * #{dish}" end
def total; quantity * dish end
end
class Basket
def initialize(*orders)
self.orders = orders
end
def <<(order)
orders << order
end
def to_s; orders.join("\n") end
def total; orders.sum(&:total) end
private
attr_accessor :orders
end
basket = Basket.new(
Order.new(dishes[:pasta], 1),
Order.new(dishes[:pizza], 2)
)
basket.total
#=> 0.2899e2
Now, of course, for such a simple example, this is overkill. But I hope that you can see that despite this being more code, it is also much much simpler. There is no complex navigation of complex nested structures, because a) there are no complex nested structures and b) all the objects know how to take care of themselves, there is never a need to "take apart" an object to examine its parts and run complex calculations on them, because the objects themselves know their own parts and how to run calculations on them.
Note: personally, I do not think that allowing arithmetic operations on Dishes is a good idea. It is more of a "neat hack" that I wanted to show off in this code snippet.
With Ruby 2.4, you could use Hash(Enumerable)#sum with a block :
basket = {"Pasta"=>1, "Pizza"=>2}
prices = {"Chicken"=>12.5, "Pizza"=>10, "Pasta"=>8.99}
basket.sum{ |dish, quantity| quantity * prices[dish] }
# 28.99
Data structure
dishes
dishes (what I called prices to avoid writing dishes[dish]) is the correct data structure :
Hash lookup is fast
If you want to update the price of a dish, you only have to do it in one place
It's basically a mini database.
basket
basket is also fine as a Hash, but only if you don't order any dish more than once. If you want to order 2 pizzas, 1 pasta and then 3 pizzas again :
{"Pizza"=>2, "Pasta" => 1, "Pizza" =>3}
=> {"Pizza"=>3, "Pasta"=>1}
you'll lose the first order.
In that case, you might want to use an array of pairs (a 2-element array with dish and quantity) :
basket = [["Pizza", 2], ["Pasta", 1], ["Pizza", 3]]
With this structure, you could use the exact same syntax to get the total as with a Hash :
basket.sum{ |dish, quantity| quantity * prices[dish] }
Try this one
basket.inject(0) do |acc, item|
dish, q = item
acc + (dishes[dish] * q)
end
=> 28.990000000000002
one line
basket.inject(0) { |acc, item| acc + (dishes[item.first] * item.last) }
Your variables for the block are wrong. The block receives the accumulator and an item (which is a two-element [key, value] array).
2.2.0 :011 > basket.inject(0){ |sum, (item, q)| sum + dishes[item].to_f * q }
=> 28.990000000000002
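The original error came from multiplying nil. As a defensive variation, here is a sketch that uses Hash#fetch, so a dish missing from the price list raises a KeyError naming the missing key instead of a NoMethodError on nil. The method name basket_total is an assumption for this example.

```ruby
dishes = { "Chicken" => 12.5, "Pizza" => 10, "Pasta" => 8.99 }
basket = { "Pasta" => 1, "Pizza" => 2 }

# basket_total is a hypothetical helper name.
# fetch raises KeyError when a dish has no price, instead of returning nil.
def basket_total(basket, dishes)
  basket.sum { |item, quantity| dishes.fetch(item) * quantity }
end

total = basket_total(basket, dishes)
puts total
```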

converting hours from a time stamp into a hash listing each hour and its frequency in Ruby

Ok, so I'm pretty new at this, I hope I explain this correctly. I'm using Ruby, and I have a program which takes a CSV file and performs some various functions on it. What I'm concerned with here is the TIME portion. I took a column of data which was a string, and used this method to convert it to DateTime and give me just the hour part:
def hour_reg(regdate)
regdate.to_s
time_stamp = DateTime.strptime("#{regdate}", "%m/%d/%y %H:%M").hour
time_stamp
end
That part works fine. So now I'm trying to take the hour that I just got and build a hash that lists each hour of the day (1 through 24) and how many times it comes up. For example, if hour 1 (1 AM) came up 3 separate times, the hash would contain {1 => 3}. Here's the code that iterates through the column of times, indicated by ":regdate":
contents.each do |row|
id = row[0]
name = row[:first_name]
zipcode = clean_zipcode(row[:zipcode])
reg_time = hour_reg(row[:regdate])
end
Basically I want the frequency of each hour. Can anyone help with this? I'm having a great deal of trouble.
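You will need to create a Hash with one key per hour, each initialized to 0. Note that DateTime#hour returns 0 through 23, so the keys should be 0-23 rather than 1-24.
h = { 0 => 0, 1 => 0, ...}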
Then do this to increment the hash. I'm assuming the hour_reg method returns an integer.
h[hour_reg(row[:regdate])] += 1
Also you can simplify your hour_reg method to:
def hour_reg(regdate)
DateTime.strptime("#{regdate}", "%m/%d/%y %H:%M").hour
end
Updating my answer to reflect the discussion in comments:
require 'csv'
require 'date'
# get contents from the CSV file
contents = CSV.open 'event_attendees.csv', headers: true, header_converters: :symbol
# create Hash h with keys 0-23 (the values DateTime#hour can return), initialized to 0
h = {}
(0..23).each {|x| h[x] = 0}
contents.each do |row|
reg_time = hour_reg(row[:regdate]).to_i
h[reg_time] += 1
end
The hour frequency is stored in the "h" hash.
You can simplify the above "contents" block to a single line if you want:
contents.each {|row| h[hour_reg(row[:regdate]).to_i] += 1}
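If only the hours that actually occur are of interest, a hash with a default value of 0 avoids pre-filling the keys. The sketch below uses a few made-up timestamps in place of the CSV column; the sample data and the name hour_counts are assumptions for illustration.

```ruby
require 'date'

def hour_reg(regdate)
  DateTime.strptime(regdate.to_s, "%m/%d/%y %H:%M").hour
end

# Made-up sample values standing in for row[:regdate] from the CSV file.
regdates = ["11/12/08 10:47", "11/12/08 13:23", "11/12/08 13:30", "11/25/08 00:05"]

# Missing keys read as 0, so += 1 works the first time an hour is seen.
hour_counts = Hash.new(0)
regdates.each { |regdate| hour_counts[hour_reg(regdate)] += 1 }

p hour_counts
```

Midnight is reported as hour 0, which is why the counts are keyed by 0 through 23.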

selecting hash values by value property in ruby

In Ruby, I have a hash of objects. Each object has a type and a value. I am trying to design an efficient function that can get the average of the values of all of objects of a certain type within the hash.
Here is an example of how this is currently implemented:
#the hash is composed of a number of objects of class Robot (example name)
class Robot
attr_accessor :type, :value
def initialize(type, value)
@type = type
@value = value
end
end
#this is the hash that includes the Robot objects
hsh = { 56 => Robot.new(:x, 5), 21 => Robot.new(:x, 25), 45 => Robot.new(:x, 35), 31 => Robot.new(:y, 15), 0 => Robot.new(:y, 5) }
#this is the part where I find the average
total = 0
count = 0
hsh.each_value { |r|
if r.type == :x #is there a better way to get only objects of type :x ?
total += r.value
count += 1
end
}
average = total / count
So my question is:
is there a better way to do this that does not involve looping through the entire hash?
Note that I can't use the key values because there will be multiple objects with the same type in the same hash (and the key values are being used to signify something else already).
If there is an easy way to do this with arrays, that would also work (since I can convert the hash to an array easily).
Thanks!
EDIT: fixed error in my code.
hsh.values.select {|v| v.type == :x}.map(&:value).reduce(:+) / hsh.size
I am trying to design an efficient function that can get the average of the values of all of objects of a certain type within the hash
Unless I misunderstood what you are trying to say, this is not what the code you posted does. The average value of the :x robots is 21 (there's 3 :x robots, with values 5, 25 and 35; 5 + 25 + 35 == 65 and 65 divided by 3 robots is 21), but your code (and mine, since I modeled mine after yours) prints 13.
is there a better way to get only objects of type :x ?
Yes. To select elements, use the select method.
is there a better way to do this that does not involve looping through the entire hash?
No. If you want to find all objects with a given property, you have to look at all objects to see whether or not they have that property.
Do you have actual hard statistical evidence that this method is causing your performance bottlenecks?
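For reference, a sketch that averages only the robots of one type, dividing by the number of selected robots and using float division. The method name average_value is hypothetical, and Robot is written as a Struct here only to keep the example short.

```ruby
# Struct stand-in for the Robot class from the question.
Robot = Struct.new(:type, :value)

hsh = { 56 => Robot.new(:x, 5), 21 => Robot.new(:x, 25), 45 => Robot.new(:x, 35),
        31 => Robot.new(:y, 15), 0 => Robot.new(:y, 5) }

# average_value is a hypothetical helper name.
# Returns nil when no robot has the requested type.
def average_value(robots, type)
  values = robots.select { |robot| robot.type == type }.map(&:value)
  return nil if values.empty?
  values.sum.to_f / values.size # divide by the selected count, as a float
end

puts average_value(hsh.values, :x)
puts average_value(hsh.values, :y)
```

This still looks at every robot once, in line with the point above that a full scan cannot be avoided.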
You can also directly map the keys like this
hsh.values.map {|k| k[:x]}.reduce(:+) / hsh.size
