How do you check for matching keys in a ruby hash? - ruby

I'm learning coding, and one of the assignments is to return keys is return the names of people who like the same TV show.
I have managed to get it working and to pass TDD, but I'm wondering if I've taken the 'long way around' and that maybe there is a simpler solution?
Here is the setup and test:
class TestFriends < MiniTest::Test
def setup
#person1 = {
name: "Rick",
age: 12,
monies: 1,
friends: ["Jay","Keith","Dave", "Val"],
favourites: {
tv_show: "Friends",
things_to_eat: ["charcuterie"]
}
}
#person2 = {
name: "Jay",
age: 15,
monies: 2,
friends: ["Keith"],
favourites: {
tv_show: "Friends",
things_to_eat: ["soup","bread"]
}
}
#person3 = {
name: "Val",
age: 18,
monies: 20,
friends: ["Rick", "Jay"],
favourites: {
tv_show: "Pokemon",
things_to_eat: ["ratatouille", "stew"]
}
}
#people = [#person1, #person2, #person3]
end
def test_shared_tv_shows
expected = ["Rick", "Jay"]
actual = tv_show(#people)
assert_equal(expected, actual)
end
end
And here is the solution that I found:
def tv_show(people_list)
tv_friends = {}
for person in people_list
if tv_friends.key?(person[:favourites][:tv_show]) == false
tv_friends[person[:favourites][:tv_show]] = [person[:name]]
else
tv_friends[person[:favourites][:tv_show]] << person[:name]
end
end
for array in tv_friends.values()
if array.length() > 1
return array
end
end
end
It passes, but is there a better way of doing this?

I think you could replace those for loops with the Array#each. But in your case, as you're creating a hash with the values in people_list, then you could use the Enumerable#each_with_object assigning a new Hash as its object argument, this way you have your own person hash from the people_list and also a new "empty" hash to start filling as you need.
To check if your inner hash has a key with the value person[:favourites][:tv_show] you can check for its value just as a boolean one, the comparison with false can be skipped, the value will be evaluated as false or true by your if statement.
You can create the variables tv_show and name to reduce a little bit the code, and then over your tv_friends hash to select among its values the one that has a length greater than 1. As this will give you an array inside an array you can get from this the first element with first (or [0]).
def tv_show(people_list)
tv_friends = people_list.each_with_object(Hash.new({})) do |person, hash|
tv_show = person[:favourites][:tv_show]
name = person[:name]
hash.key?(tv_show) ? hash[tv_show] << name : hash[tv_show] = [name]
end
tv_friends.values.select { |value| value.length > 1 }.first
end
Also you can omit parentheses when the method call doesn't have arguments.

Related

Convert object with array values into array of object

I do have this kind of params
params = { "people" =>
{
"fname" => ['john', 'megan'],
"lname" => ['doe', 'fox']
}
}
Wherein I loop through using this code
result = []
params["people"].each do |key, values|
values.each_with_index do |value, i|
result[i] = {}
result[i][key.to_sym] = value
end
end
The problem on my code is that it always gets the last key and value.
[
{ lname: 'doe' },
{ lname: 'fox' }
]
i want to convert it into
[
{fname: 'john', lname: 'doe'},
{fname: 'megan', lname: 'fox'}
]
so that i can loop through of them and save to database.
Your question has been answered but I'd like to mention an alternative calculation that does not employ indices:
keys, values = params["people"].to_a.transpose
#=> [["fname", "lname"], [["john", "megan"], ["doe", "fox"]]]
keys = keys.map(&:to_sym)
#=> [:fname, :lname]
values.transpose.map { |val| keys.zip(val).to_h }
#=> [{:fname=>"john", :lname=>"doe"},
# {:fname=>"megan", :lname=>"fox"}]
result[i] = {}
The problem is that you're doing this each loop iteration, which resets the value and deletes any existing keys you already put there. Instead, only set the value to {} if it doesn't already exist.
result[i] ||= {}
In your inner loop, you're resetting the i-th element to an empty hash:
result[i] = {}
So you only end up with the data from the last key-value-pair, i.e. lname.
Instead you can use this to only set it to an empty hash if it doesn't already exist:
result[i] ||= {}
So the first loop through, it gets set to {}, but after that, it just gets set to itself.
Alternatively, you can also use
result[i] = {} if !result[i]
which may or may not be more performant. I don't know.

How to solve the "retrieve_values" problem

I'm working on this problem:
Write a method retrieve_values that takes in two hashes and a key. The method should return an array containing the values from the two hashes that correspond with the given key.
def retrieve_values(hash1, hash2, key)
end
dog1 = {"name"=>"Fido", "color"=>"brown"}
dog2 = {"name"=>"Spot", "color"=> "white"}
print retrieve_values(dog1, dog2, "name") #=> ["Fido", "Spot"]
puts
print retrieve_values(dog1, dog2, "color") #=> ["brown", "white"]
puts
I came up with a working solution:
def retrieve_values(hash1, hash2, key)
arr = []
hash1.each { |key| } && hash2.each { |key| }
if key == "name"
arr << hash1["name"] && arr << hash2["name"]
elsif key == "color"
arr << hash1["color"] && arr << hash2["color"]
end
return arr
end
I then looked at the 'official' solution:
def retrieve_values(hash1, hash2, key)
val1 = hash1[key]
val2 = hash2[key]
return [val1, val2]
end
What is wrong with my code? Or is it an acceptable "different" approach?
Line with hash1.each { |key| } && hash2.each { |key| } just does nothing it is not needed even in your solution.
This part a bit difficult to read arr << hash1["name"] && arr << hash2["name"]. It mutates the array two times in one line, this kind of style could lead to bugs.
Also, your code sticks only to two keys name and color:
dog1 = {"name"=>"Fido", "color"=>"brown", "age" => 1}
dog2 = {"name"=>"Spot", "color"=> "white", "age" => 2}
> retrieve_values(dog1, dog2, "age")
=> []
The official solution will return [1, 2].
You don't need here to explicitly use return keyword, any block of code returns the last evaluated expression. But it is a matter of style guide.
It is possible to simplify even the official solution:
def retrieve_values(hash1, hash2, key)
[hash1[key], hash2[key]]
end

How to find count matching characters at the same indes and at an unmatching index

I have built a version of mastermind that checks a user's input and provides feedback based on how close the user's guess was to the winning sequence. If you're not familiar with the game, you get feedback indicating how many of your characters were guessed correctly at the same index and how many characters guessed are in the sequence, but at the wrong index. If there are duplicates in the guess, then you would not count the extra values unless they correspond to the same number of duplicates in the secret code.
Example: If the sequence is ["G","G","G","Y"] and the user guesses ["G", "Y","G","G"] then you'd want to return 2 for items at the same index and 2 for items at different indexes that are included in the secret sequence.
Another example: If the sequence is ["X","R","Y","T"] and the user guesses ["T","T","Y","Y"] then you'd return 1 for items at the same index 1 for the character guessed that is in the sequence but at the wrong index.
Anyway, to me this is not a simple problem to solve. Here's the code I used to get it to work, but it's not elegant. There must be a better way. I was hoping someone can tell me what I'm missing here?? New to Ruby...
def index_checker(input_array, sequence_array)
count = 0
leftover_input = []
leftover_sequence = []
input.each_with_index do |char, idx|
if char == sequence[idx]
count += 1
else
leftover_input << char
leftover_sequence << sequence[idx]
end
end
diff_index_checker(leftover_input, leftover_sequence, count)
end
def diff_index_checker(input, sequence, count)
count2 = 0
already_counted = []
input.each do |char|
if sequence.include?(char) && !already_counted.include?(char)
count2 += 1
already_counted << char
end
end
[count, count2]
end
Here's a clean Ruby solution, written in idiomatic Ruby object-oriented style:
class Mastermind
def initialize(input_array, sequence_array)
#input_array = input_array
#sequence_array = sequence_array
end
def matches
[index_matches, other_matches]
end
def results
[index_matches.size, other_matches.size]
end
private
attr_reader :input_array, :sequence_array
def index_matches
input_array.select.with_index { |e, i| e == sequence_array[i] }
end
def other_matches
non_exact_input & non_exact_sequence
end
def non_exact_input
array_difference(input_array, index_matches)
end
def non_exact_sequence
array_difference(sequence_array, index_matches)
end
# This method is based on https://stackoverflow.com/a/3852809/5961578
def array_difference(array_1, array_2)
counts = array_2.inject(Hash.new(0)) { |h, v| h[v] += 1; h }
array_1.reject { |e| counts[e] -= 1 unless counts[e].zero? }
end
end
You would use this class as follows:
>> input_array = ["G","G","G","Y"]
>> sequence_array = ["G", "Y","G","G"]
>> guess = Mastermind.new(input_array, sequence_array)
>> guess.results
#> [2, 2]
>> guess.matches
#> [["G", "G"], ["G", "Y"]]
Here's how it works. First everything goes into a class called Mastermind. We create a constructor for the class (which in Ruby is a method called initialize) and we have it accept two arguments: input array (the user guess), and sequence array (the answer).
We set each of these arguments to an instance variable, which is indicated by its beginning with #. Then we use attr_reader to create getter methods for #input_array and #sequence_array, which allows us to get the values by calling input_array and sequence_array from any instance method within the class.
We then define two public methods: matches (which returns an array of exact matches and an array of other matches (the ones that match but at the wrong index), and results (which returns a count of each of these two arrays).
Now, within the private portion of our class, we can define the guts of the logic. Each method has a specific job, and each is named to (hopefully) help a reader understand what it is doing.
index_matches returns a subset of the input_array whose elements match the sequence_array exactly.
other_matches returns a subset of the input_array whose elements do not match the sequence_array exactly, but do match at the wrong index.
other_matches relies on non_exact_input and non_exact_sequence, each of which is computed using the array_difference method, which I copied from another SO answer. (There is no convenient Ruby method that allows us to subtract one array from another without deleting duplicates).
Code
def matches(hidden, guess)
indices_wo_match = hidden.each_index.reject { |i| hidden[i] == guess[i] }
hidden_counts = counting_hash(hidden.values_at *indices_wo_match)
guess_counts = counting_hash(guess.values_at *indices_wo_match)
[hidden.size - indices_wo_match.size, guess_counts.reduce(0) { |tot, (k, cnt)|
tot + [hidden_counts[k], cnt].min }]
end
def counting_hash(arr)
arr.each_with_object(Hash.new(0)) { |s, h| h[s] += 1 }
end
Examples
matches ["G","G","G","Y"], ["G", "Y","G","G"]
#=> [2, 2]
matches ["X","R","Y","T"] , ["T","T","Y","Y"]
#=> [1, 1]
Explanation
The steps are as follows.
hidden = ["G","G","G","Y"]
guess = ["G", "Y","G","G"]
Save the indices i for which hidden[i] != guess[i].
indices_wo_match = hidden.each_index.reject { |i| hidden[i] == guess[i] }
#=> [1, 3]
Note that the number of indices for which the values are equal is as follows.
hidden.size - indices_wo_match.size
#=> 2
Now compute the numbers of remaining elements of guess that pair with one of the remaining values of hidden by having the same value. Begin by counting the numbers of instances of each unique element of hidden and then do the same for guess.
hidden_counts = counting_hash(hidden.values_at *indices_wo_match)
#=> {"G"=>1, "Y"=>1}
guess_counts = counting_hash(guess.values_at *indices_wo_match)
#=> {"Y"=>1, "G"=>1}
To understand how counting_hash works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(3), then if h does not have a key k, h[k] returns the default value, here 3 (the hash is not changed).
Now compute the numbers of matches of elements of guess that were not equal to the value of hidden at the same index and which pair with an element of hidden that have the same value.
val_matches = guess_counts.reduce(0) do |tot, (k, cnt)|
tot + [hidden_counts[k], cnt].min
end
#=> 2
Lastly, return the values of interest.
[hidden.size - indices_wo_match.size, val_matches]
#=> [2, 2]
In the code presented above I have substituted out the variable val_matches.
With Ruby 2.4+ one can use Enumerable#sum to replace
guess_counts.reduce(0) { |tot, (k, cnt)| tot + [hidden_counts[k], cnt].min }
with
guess_counts.sum { |k, cnt| [hidden_counts[k], cnt].min }
def judge(secret, guess)
full = secret.zip(guess).count { |s, g| s == g }
semi = secret.uniq.sum { |s| [secret.count(s), guess.count(s)].min } - full
[full, semi]
end
Demo:
> judge(["G","G","G","Y"], ["G","Y","G","G"])
=> [2, 2]
> judge(["X","R","Y","T"], ["T","T","Y","Y"])
=> [1, 1]
A shorter alternative, though I find it less clear:
full = secret.zip(guess).count(&:uniq!)
I prefer my other answer for its simplicity, but this one would be faster if someone wanted to use this for arrays larger than Mastermind's.
def judge(secret, guess)
full = secret.zip(guess).count { |s, g| s == g }
pool = secret.group_by(&:itself)
[full, guess.count { |g| pool[g]&.pop } - full]
end
Demo:
> judge(["G","G","G","Y"], ["G","Y","G","G"])
=> [2, 2]
> judge(["X","R","Y","T"], ["T","T","Y","Y"])
=> [1, 1]

Hash Enumerable methods: Inconsistent behavior when passing only one parameter

Ruby's enumerable methods for Hash expect 2 parameters, one for the key and one for the value:
hash.each { |key, value| ... }
However, I notice that the behavior is inconsistent among the enumerable methods when you only pass one parameter:
student_ages = {
"Jack" => 10,
"Jill" => 12,
}
student_ages.each { |single_param| puts "param: #{single_param}" }
student_ages.map { |single_param| puts "param: #{single_param}" }
student_ages.select { |single_param| puts "param: #{single_param}" }
student_ages.reject { |single_param| puts "param: #{single_param}" }
# results:
each...
param: ["Jack", 10]
param: ["Jill", 12]
map...
param: ["Jack", 10]
param: ["Jill", 12]
select...
param: Jack
param: Jill
reject...
param: Jack
param: Jill
As you can see, for each and map, the single parameter gets assigned to a [key, value] array, but for select and reject, the parameter is only the key.
Is there a particular reason for this behavior? The docs don't seem to mention this at all; all of the examples given just assume that you are passing in two parameters.
Just checked Rubinius behavior and it is indeed consistent with CRuby. So looking at the Ruby implementation - it is indeed because #select yields two values:
yield(item.key, item.value)
while #each yields an array with two values:
yield [item.key, item.value]
Yielding two values to a block that expects one takes the first argument and ignores the second one:
def foo
yield :bar, :baz
end
foo { |x| p x } # => :bar
Yielding an array will either get completely assigned if the block has one parameter or get unpacked and assigned to each individual value (as if you passed them one by one) if there are two or more parameters.
def foo
yield [:bar, :baz]
end
foo { |x| p x } # => [:bar, :baz]
As for why they made that descision - there probably isn't any good reason behind it, it just wasn't expected people to call them with one argument.
My guess is that internally map is just each with collect. Interesting they don't work quite the same way.
As to each...
The source code is below. It checks how many arguments you've passed into the block. If more than one it calls each_pair_i_fast, otherwise just each_pair_i.
static VALUE
rb_hash_each_pair(VALUE hash)
{
RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
if (rb_block_arity() > 1)
rb_hash_foreach(hash, each_pair_i_fast, 0);
else
rb_hash_foreach(hash, each_pair_i, 0);
return hash;
}
each_pair_i_fast returns two distinct values:
each_pair_i_fast(VALUE key, VALUE value)
{
rb_yield_values(2, key, value);
return ST_CONTINUE;
}
each_pair_i does not:
each_pair_i(VALUE key, VALUE value)
{
rb_yield(rb_assoc_new(key, value));
return ST_CONTINUE;
}
rb_assoc_new returns a two element array (at least I'm assuming that is what rb_ary_new3 does
rb_assoc_new(VALUE car, VALUE cdr)
{
return rb_ary_new3(2, car, cdr);
}
select looks like this:
rb_hash_select(VALUE hash)
{
VALUE result;
RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
result = rb_hash_new();
if (!RHASH_EMPTY_P(hash)) {
rb_hash_foreach(hash, select_i, result);
}
return result;
}
and select_i looks like this:
select_i(VALUE key, VALUE value, VALUE result)
{
if (RTEST(rb_yield_values(2, key, value))) {
rb_hash_aset(result, key, value);
}
return ST_CONTINUE;
}
And I'm going to assume that rb_hash_aset returns two distinct arguments similar to each_pair_i.
Most important notice that select/etc doesn't check the argument arity at all.
Sources:
https://github.com/ruby/ruby/blob/d5c5d5c778a0e8d61ab07669132dc18fb1a2e874/hash.c
https://github.com/ruby/ruby/blob/9f44b77a18d4d6099174c6044261eb1611a147ea/array.c

Comparing values of one hash to many hashes to get inverse document frequency in ruby

I'm trying to find the inverse document frequency for a categorization algorithm and am having trouble getting it the way that my code is structured (with nested hashes), and generally comparing one hash to many hashes.
My training code looks like this so far:
def train!
#data = {}
#all_books.each do |category, books|
#data[category] = {
words: 0,
books: 0,
freq: Hash.new(0)
}
books.each do |filename, tokens|
#data[category][:words] += tokens.count
#data[category][:books] += 1
tokens.each do |token|
#data[category][:freq][token] += 1
end
end
#data[category][:freq].map { |k, v| v = (v / #data[category][:freq].values.max) }
end
end
Basically, I have a hash with 4 categories (subject to change), and for each have word count, book count, and a frequency hash which shows term frequency for the category. How do I get the frequency of individual words from one category compared against the frequency of the words shown in all categories? I know how to do the comparison for one set of hash keys against another, but am not sure how to loop through a nested hash to get the frequency of terms against all other terms, if that makes sense.
Edit to include predicted outcome -
I'd like to return a hash of nested hashes (one for each category) that shows the word as the key, and the number of other categories in which it appears as the value. i.e. {:category1 = {:word => 3, :other => 2, :third => 1}, :category2 => {:another => 1, ...}} Alternately an array of category names as the value, instead of the number of categories, would also work.
I've tried creating a new hash as follows, but it's turning up empty:
def train!
#data = {}
#all_words = Hash.new([]) #new hash for all words, default value is empty array
#all_books.each do |category, books|
#data[category] = {
words: 0,
books: 0,
freq: Hash.new(0)
}
books.each do |filename, tokens|
#data[category][:words] += tokens.count
#data[category][:books] += 1
tokens.each do |token|
#data[category][:freq][token] += 1
#all_words[token] << category #should insert category name if the word appears, right?
end
end
#data[category][:freq].map { |k, v| v = (v / #data[category][:freq].values.max) }
end
end
If someone can help me figure out why the #all_words hash is empty when the code is run, I may be able to get the rest.
I haven't gone through it all, but you certainly have an error:
#all_words[token] << category #should insert category name if the word appears, right?
Nope. #all_words[token] will return empty array, but not create a new slot with an empty array, like you're assuming. So that statement doesn't modify the #all_words hash at all.
Try these 2 changes and see if it helps:
#all_words = {} # ditch the default value
...
(#all_words[token] ||= []) << category # lazy-init the array, and append

Resources