.each_with_object ruby explanation? - ruby

Below, we are given an array called win_lose. We are supposed to create a hash that looks like the hash below. My first inclination was to use .count, but after some experimenting, .each_with_object worked best.
Can someone break down what the .each_with_object method is doing, and how the answer works? I arrived at the answer by reading the docs, but I still need an explanation of the method itself ...
Thank you!
win_lose = ["win", "lose", "win", "lose", "win", "win"]
Create a hash based on win_lose array that looks like this:
win_loss_count = {
"win" => 4,
"loss" => 2
}
This is what I originally tried without success:
win_loss_count = Hash[win_lose.map.with_index { |outcome, times| outcome = times.count }]
Answer:
win_loss_count = win_lose.each_with_object(Hash.new(0)) { |word,counts| counts[word] += 1 }

each_with_object is very literally what it says. It's like each, but with an extra object every time.
So for this:
win_lose.each_with_object(Hash.new(0)) { |word,counts| counts[word] += 1 }
You're calling each, with the object created via Hash.new(0) passed in as well, every time. word is the word you'd get in a normal each, and counts is the "object" referred to by "with_object" (so, the hash).
Important to this shortcut is the Hash.new(0). It means create a new empty hash with 0 as the value for all keys that did not previously exist, which lets you do a counts[word] += 1 even if it wasn't in there before.
At the end, each_with_object returns the "object", so counts is returned, having been modified for every word.
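As a minimal sketch of the behaviour described above (the names seed, missing, same_object and tallied are mine, not from the question), this shows the default value at work and that each_with_object hands back the same hash it was given. Enumerable#tally only exists from Ruby 2.7 on, so the sketch guards that call:

```ruby
win_lose = ["win", "lose", "win", "lose", "win", "win"]

# Hash.new(0) returns 0 for missing keys without storing them.
seed = Hash.new(0)
missing = seed["win"]          # 0, and seed is still empty at this point

# memo is the seed hash on every iteration.
counts = win_lose.each_with_object(seed) { |word, memo| memo[word] += 1 }

# The return value is the very same object that was passed in.
same_object = counts.equal?(seed)

# On Ruby 2.7+, Enumerable#tally does the same counting in one call.
tallied = win_lose.respond_to?(:tally) ? win_lose.tally : counts
```

Note that the resulting key is "lose", because that is the string in the array, not "loss" as in the target hash shown in the question.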

Nick has it exactly right, and in fact there are other methods that pass an object into a block to help you build whatever structure you need as output. One of the most common in Ruby is Enumerable#inject. Your same answer can be rewritten like this:
win_lose.inject(Hash.new(0)) { |hash, val| hash[val] += 1; hash }
Which performs the same operation:
[14] pry(main)> win_lose
=> ["win", "lose", "win", "lose", "win", "win"]
[15] pry(main)> win_lose.inject(Hash.new(0)) { |hash, val| hash[val] += 1; hash }
=> {"win"=>4, "lose"=>2}
We're doing exactly the same thing: we send a hash whose values default to zero into the block and build up our new hash with each iteration.
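To make the difference between the two methods concrete, here is a small side-by-side sketch. The variable names are mine, and the third variant (building a new hash with merge on every step) is an extra illustration, not something from the answers above:

```ruby
win_lose = ["win", "lose", "win", "lose", "win", "win"]

# each_with_object: the block takes (element, memo) and its return value is ignored.
with_object = win_lose.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }

# inject: the block takes (memo, element) and its return value becomes the next memo,
# hence the trailing `counts`.
with_inject = win_lose.inject(Hash.new(0)) { |counts, word| counts[word] += 1; counts }

# Non-mutating inject: return a new hash on every step instead of modifying the memo.
with_merge = win_lose.inject({}) { |counts, word| counts.merge(word => counts.fetch(word, 0) + 1) }
```

All three end up with the same counts; each_with_object is usually the least error-prone when the memo is a mutable object, because there is nothing to remember to return.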

Related

Ruby array initialize on read

I would like to be able to do this:
my_array = Array.new
my_array[12] += 1
In other words, somehow upon trying to access entry 12, finding it uninitialized, it is initialized to zero so I can add one to it. Array.new has a default: parameter, but that comes into play when you initialize the array with a known number of slots. Other than writing my own class, is there a ruby-ish way of doing this?
No need to create a new class :
my_hash = Hash.new(0)
my_hash[12] += 1
p my_hash
#=> {12=>1}
For many cases, hashes and arrays can be used interchangeably.
An array with an arbitrary number of elements and a default value sounds like a hash to me ;)
Just to make it clear : Hash and Array aren't equivalent. There will be cases where using a hash instead of an array will be completely wrong.
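A short sketch of where the two structures differ may help here. The variable names are made up for illustration:

```ruby
# Array.new(size, default) only fills the slots that exist at creation time.
fixed = Array.new(3, 0)
fixed_slot = fixed[1]        # 0
beyond_end = fixed[12]       # nil, the default does not apply past the end

# Assigning past the end pads the gap with nil.
padded = []
padded[3] = 1                # padded is now [nil, nil, nil, 1]

# A hash with a default of 0 stores only the keys that were written.
sparse = Hash.new(0)
sparse[12] += 1              # sparse is now {12 => 1}

# The trade-off: a hash has no implied positions or padded length.
sparse_size = sparse.size    # 1
padded_size = padded.size    # 4
```

So the hash is a good fit for sparse counters, but anything that relies on contiguous indices, slicing, or array length behaves differently.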
Something like:
a[12] = (a[12] ||= 0) + 1
Making use of nil.to_i == 0
my_array = Array.new
my_array[12] = my_array[12].to_i + 1
Note, that unlike other solutions here so far, this one works for any arbitrary initial value.
my_array = Array.new.extend(Module.new {
  def [](idx)
    super || 0
  end
})
my_array[12] += 1
#⇒ 1
This is not possible with the stock Array::new method.
https://docs.ruby-lang.org/en/2.0.0/Array.html#method-c-new
You would need to monkey patch either the Array class or NilClass, and neither is recommended.
If you have a specific use case, I would create a small subclass of Array:
class MyArray < Array
  def [](i)
    super(i) ? super(i) : self[i] = 0
  end
end
arr = MyArray.new
arr[12] += 1 # => 1

Why can't I reassign a variable in a Ruby code block?

Why doesn't calling these two .map methods bring about equivalent results? The first one works as expected, whereas the second has no effect.
array = ["batman","boobytrap"]
puts array.map { |x| x.reverse! }
=> namtab
=> partyboob
puts array.map { |x| x = x.reverse }
=> batman
=> boobytrap
Print array after running puts array.map { |x| x.reverse! } and you will see that the array has changed.
Read the documentation for the reverse! method.
The problem is that in your first map, the ! has modified the values in the original array, so it now contains reversed strings.
irb:001:0> array = ["batman","boobytrap"]
=> ["batman", "boobytrap"]
irb:002:0> puts array.map { |x| x.reverse! }
namtab
partyboob
=> nil
irb:003:0> array
=> ["namtab", "partyboob"]
So the second call is doing what you expect it to, but the input data is not what you think it is.
If you try the second case standalone without doing the first one you will see that it works as you expect it to.
You have to change your view of what a variable is here. The variable is not the actual value, but only a reference to that value.
array = ["batman"]
# We freeze this string to prevent Ruby from creating a
# new object for a string with the same value.
# This isn't really necessary; it is just to "make sure" the memory
# address stays the same.
string = "boobytrap".freeze
array.each do |val|
  puts "before: ",
       "val.object_id: #{val.object_id}",
       "string.object_id: #{string.object_id}",
       "array[0].object_id: #{array[0].object_id}",
       ""
  val = string
  puts "after: ",
       "val.object_id: #{val.object_id}",
       "string.object_id: #{string.object_id}",
       "array[0].object_id: #{array[0].object_id}"
end
# before:
# val.object_id: 70244773377800,
# string.object_id: 70244761504360,
# array[0].object_id: 70244773377800
#
# after:
# val.object_id: 70244761504360,
# string.object_id: 70244761504360,
# array[0].object_id: 70244773377800
Obviously the values will differ if you run this code on your machine, but the point is, the memory address for val changes, while array[0] (which is where val comes from) stays the same after we assigned string to val. So basically what we do with the reassignment is, we tell ruby the value for val is no longer found in 70244773377800, but in 70244761504360. The array still refers to its first value with 70244773377800 though!
The #reverse! method call you use in your example on x on the other hand changes the value of whatever is found at 70244773377800 in memory, which is why it works as you expected.
TL;DR:
Your first example changes the value in memory, while the second example assigns a new memory address to a local variable.
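The same point as a compact sketch, using the strings from the question. The snapshot variable names are mine; .dup is only there so the example also runs when string literals are frozen:

```ruby
# .dup keeps the sketch valid even when string literals are frozen.
array = ["batman".dup, "boobytrap".dup]
ids_before = array.map(&:object_id)

# Rebinding: x points at a new string, the array element is untouched.
array.each { |x| x = x.reverse }
after_rebinding = array.map(&:dup)      # ["batman", "boobytrap"]

# Mutation: the string object the array holds is changed in place.
array.each { |x| x.reverse! }
after_mutation = array.map(&:dup)       # ["namtab", "partyboob"]

# The array still holds the same objects; only their contents changed.
same_objects = ids_before == array.map(&:object_id)
```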
When you do array.map { |x| x.reverse! }, it changed the array values.
array = ["batman","boobytrap"]
puts array.map { |x| x.reverse! }
=> namtab
=> partyboob
array
=> ["namtab", "partyboob"]
If you perform the second operation on the same array, it produces the results you described in the question. However, it does not change the values in the original array.
array = ["batman","boobytrap"]
puts array.map { |x| x.reverse! }
=> namtab
=> partyboob
array
=> ["namtab", "partyboob"]
puts array.map { |x| x = x.reverse }
=> batman
=> boobytrap
array
=> ["namtab", "partyboob"]
To change the values in the original array, use map! in the second operation.
array = ["batman","boobytrap"]
puts array.map { |x| x.reverse! }
=> namtab
=> partyboob
array
=> ["namtab", "partyboob"]
puts array.map! { |x| x.reverse }
=> batman
=> boobytrap
array
=> ["batman", "boobytrap"]
There are several problems with your approach.
The first problem is that you don't properly isolate your test cases: in your first test case you reverse the strings in the array. In your second test case you reverse them again. What happens if you reverse something twice? That's right: nothing! So, the reason why you think it isn't working is actually precisely the fact that it is working! And if it weren't working (i.e. not reversing the strings), then it would print the previously-reversed strings, and you would think it does work.
So, lesson #1: Always isolate your test cases!
Problem #2 is that the second piece of code doesn't do what you (probably) think it does. x is a local variable (because it starts with a lowercase letter). Local variables are local to the scope they are defined in, in this case the block. So, x only exists inside the block. You assign to it, but you never do anything with it after assigning to it. So, the assignment is actually irrelevant.
What is relevant, is the return value of the block. Now, in Ruby, assignments evaluate to the value that is being assigned, so considering that the assignment to x is superfluous, we can just get rid of it, and your code is actually exactly equivalent to this:
array.map { |x| x.reverse }
Your third problem is that the first piece also doesn't do what you (probably) think it does. Array#map returns a new array and leaves the original array untouched, but String#reverse! mutates the strings! In other words: its primary mode of operation is a side-effect. In addition to the side-effect, it also returns the reversed string, which is another thing that confused you. It could just as well return nil instead to indicate that it performs a side-effect, in which case you would instead see the following:
array = %w[batman boobytrap]
array.map(&:reverse!)
#=> [nil, nil]
array
#=> ['namtab', 'partyboob']
As you can see, if String#reverse! did return nil, what you would observe is the following:
array.map returns a new array whose elements are the return values of the block, which is just nil
array now still contains the same String objects as before, but they have been mutated
Now, since String#reverse! actually does return the reversed String what you actually observe is this:
array = %w[batman boobytrap]
array.map(&:reverse!)
#=> ['namtab', 'partyboob']
array
#=> ['namtab', 'partyboob']
array.map(&:reverse)
#=> ['batman', 'boobytrap']
array
#=> ['namtab', 'partyboob']
Which brings me to lesson #2: side-effects and shared mutable state are evil!
You should avoid both side-effects and shared mutable state (ideally, mutable state in general, but mutable state that is shared between different pieces is especially evil) as far as possible.
Why are they evil? Well, just look at how much they confused you even in this extremely simple, tiny example. Can you imagine the same thing happening in a much, much larger application?
The problem with side-effects is that they "happen on the side". They are not arguments, they are not return values. You cannot print them out, inspect them, store them in a variable, place assertions on them in a unit test, and so on.
The problem with shared mutable state (mutation is just a form of side-effect, by the way) is that it enables "spooky action at a distance": you have a piece of data in one place of your code, but this piece of data is shared with a different place of your code. Now, if one place mutates the data, the other place will seemingly magically have its data change out from under it. In your example here, the shared state was the strings inside the array, and mutating them in one line of code made them also change in another line of code.
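One way to guard against this kind of accidental mutation, sketched here as a suggestion rather than something the answer above prescribes, is to freeze the shared data so that mutating calls fail loudly. FrozenError is available from Ruby 2.5 on; the variable names are illustrative:

```ruby
# Freeze both the strings and the array so accidental mutation fails loudly.
names = ["batman", "boobytrap"].map { |s| s.dup.freeze }.freeze

# Non-mutating call: returns new strings and leaves names alone.
reversed = names.map(&:reverse)

# Mutating call: raises instead of silently changing shared data.
mutation_blocked =
  begin
    names.first.reverse!
    false
  rescue FrozenError
    true
  end
```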

In Ruby, how can I recursively populate a Mongo database using nested arrays as input?

I have been using Ruby for a while, but this is my first time doing anything with a database. I've been playing around with MongoDB for a bit and, at this point, I've begun to try and populate a simple database.
Here is my problem. I have a text file containing data in a particular format. When I read that file in, the data is stored in nested arrays like so:
dataFile = ["sectionName", ["key1", "value1"], ["key2", "value2", ["key3", ["value3A", "value3B"]]]
The format will always be that the first value of the array is a string and each subsequent value is an array. Each array is formatted in as a key/value pair. However, the value can be a string, an array of two strings, or a series of arrays that have their own key/value array pairs. I don't know any details about the data file before I read it in, just that it conforms to these rules.
Now, here is my problem. I want to read this into a Mongo database, preserving this basic structure. So, for instance, if I were to do this by hand, it would look like this:
newDB = mongo_client.db("newDB")
newCollection = newDB["dataFile1"]
doc = {"section_name" => "sectionName", "key1" => "value1", "key2" => "value2", "key3" => ["value3A", "value3B"]}
ID = newCollection.insert(doc)
I know there has to be an easy way to do this. So far, I've been trying various recursive functions to parse the data out, turn it into mongo commands and try to populate my database. But it just feels clunky, like there is a better way. Any insight into this problem would be appreciated.
The value that you gave for the variable dataFile isn't a valid array, because it is missing a closing square bracket.
If we make the definition of dataFile a valid line of Ruby code, the following code yields the hash that you described. It uses map.with_index to visit each element of the array and transforms this array into a new array of key/value hashes. This transformed array of hashes is flattened and converted into a single hash using the inject method.
dataFile = ["sectionName", ["key1", "value1"], ["key2", "value2", ["key3", ["value3A", "value3B"]]]]
puts dataFile.map.with_index { |e, ix|
  case ix
  when 0
    { "section_name" => e }
  else
    list = []
    list.push({ e[0] => e[1] })
    if e.length > 2
      list.push(
        e[2..e.length-1].map { |p|
          { p[0] => p[1] }
        }
      )
    end
    list
  end
}.flatten.inject({}) { |accum, e|
  key = e.keys.first
  accum[key] = e[key]
  accum
}.inspect
The output looks like:
{"section_name"=>"sectionName", "key1"=>"value1", "key2"=>"value2", "key3"=>["value3A", "value3B"]}
For input that looked like this:
["sectionName", ["key1", "value1"], ["key2", "value2", ["key3", ["value3A", "value3B"]], ["key4", ["value4A", "value4B"]]], ["key5", ["value5A", "value5B"]]]
We would see:
{"section_name"=>"sectionName", "key1"=>"value1", "key2"=>"value2", "key3"=>["value3A", "value3B"], "key4"=>["value4A", "value4B"], "key5"=>["value5A", "value5B"]}
Note the "key3" and "key4" entries, which are what I take "a series of arrays" to mean. If the structure can contain arrays of arrays of unknown depth, then we would need a different implementation, perhaps one that uses an array to keep track of the position as the program walks through the arbitrarily nested arrays.
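For the unknown-depth case, a recursive walk is one option. This is only a sketch under an assumption the question does not fully pin down: every pair has the shape [key, value, *more_pairs], and nested pairs are flattened into the same document, as in the output above. The helper names pairs_to_hash and section_to_doc are hypothetical:

```ruby
# Hypothetical helper: fold [key, value, *nested_pairs] arrays into one flat hash.
def pairs_to_hash(pairs, out = {})
  pairs.each do |key, value, *nested|
    out[key] = value
    pairs_to_hash(nested, out)   # any trailing elements are treated as further pairs
  end
  out
end

# Hypothetical helper: the first element is the section name, the rest are pairs.
def section_to_doc(data)
  name, *pairs = data
  pairs_to_hash(pairs, { "section_name" => name })
end

data_file = ["sectionName", ["key1", "value1"],
             ["key2", "value2", ["key3", ["value3A", "value3B"]]]]
doc = section_to_doc(data_file)
```

The resulting doc is a plain Hash, so it could be handed to the collection's insert call from the question. If a value can itself be a list of key/value pairs, the input is ambiguous with a two-string array value and would need an extra rule to tell the two apart.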
In the following test, please find two solutions.
The first converts to a nested Hash which is what I think that you want without flattening the input data.
The second stores the key-value pairs exactly as given from the input.
I've chosen to fix the missing closing square bracket in a way that preserves the key/value pairs.
The major message here is that while the top-level data structure for MongoDB is a document mapped to a Ruby Hash
that by definition has key-value structure, the values can be any shape including nested arrays or hashes.
So I hope that test examples cover the range, showing that you can match storage in MongoDB to fit your needs.
test.rb
require 'mongo'
require 'test/unit'
require 'pp'
class MyTest < Test::Unit::TestCase
  def setup
    @coll = Mongo::MongoClient.new['test']['test']
    @coll.remove
    @dataFile = ["sectionName", ["key1", "value1"], ["key2", "value2"], ["key3", ["value3A", "value3B"]]]
    @key, *@value = @dataFile
  end
  test "nested array data as hash value" do
    input_doc = {@key => Hash[*@value.flatten(1)]}
    @coll.insert(input_doc)
    fetched_doc = @coll.find.first
    assert_equal(input_doc[@key], fetched_doc[@key])
    puts "#{name} fetched hash value doc:"
    pp fetched_doc
  end
  test "nested array data as array value" do
    input_doc = {@key => @value}
    @coll.insert(input_doc)
    fetched_doc = @coll.find.first
    assert_equal(input_doc[@key], fetched_doc[@key])
    puts "#{name} fetched array doc:"
    pp fetched_doc
  end
end
ruby test.rb
$ ruby test.rb
Loaded suite test
Started
test: nested array data as array value(MyTest) fetched array doc:
{"_id"=>BSON::ObjectId('5357d4ac7f11ba0678000001'),
"sectionName"=>
[["key1", "value1"], ["key2", "value2"], ["key3", ["value3A", "value3B"]]]}
.test: nested array data as hash value(MyTest) fetched hash value doc:
{"_id"=>BSON::ObjectId('5357d4ac7f11ba0678000002'),
"sectionName"=>
{"key1"=>"value1", "key2"=>"value2", "key3"=>["value3A", "value3B"]}}
.
Finished in 0.009493 seconds.
2 tests, 2 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
210.68 tests/s, 210.68 assertions/s
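The two document shapes from the tests can be built and compared without a database connection. In this sketch the variable names are mine, and Array#to_h (Ruby 2.1+) is swapped in as an alternative to the Hash[*value.flatten(1)] idiom used above:

```ruby
data_file = ["sectionName", ["key1", "value1"], ["key2", "value2"], ["key3", ["value3A", "value3B"]]]
key, *value = data_file

# Pairs stored exactly as given: the document value is an array of arrays.
as_array_doc = { key => value }

# Pairs converted to a nested hash.
as_hash_doc = { key => value.to_h }

# to_h gives the same result as the splat-and-flatten idiom from the test.
same_as_test = as_hash_doc[key] == Hash[*value.flatten(1)]
```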

Ruby regex selecting multiple words at the same time

I have a hash that I am using regex on to select what key/value pairs I want. Here is the method I have written:
def extract_gender_race_totals(gender, race)
  totals = @data.select {|k,v| k.to_s.match(/(#{gender})(#{race})/)}
  temp = 0
  totals.each {|key, value| temp += value}
  temp
end
the hash looks like this:
@data = {
:number_of_african_male_senior_managers=>2,
:number_of_coloured_male_senior_managers=>0,
:number_of_indian_male_senior_managers=>0,
:number_of_white_male_senior_managers=>0,
:number_of_african_female_senior_managers=>0,
:number_of_coloured_female_senior_managers=>0,
:number_of_indian_female_senior_managers=>0,
:number_of_white_female_senior_managers=>0,
:number_of_african_male_middle_managers=>2,
:number_of_coloured_male_middle_managers=>0,
:number_of_indian_male_middle_managers=>0,
:number_of_white_male_middle_managers=>0,
:number_of_african_female_middle_managers=>0,
:number_of_coloured_female_middle_managers=>0,
:number_of_indian_female_middle_managers=>0,
:number_of_white_female_middle_managers=>0,
:number_of_african_male_junior_managers=>0,
:number_of_coloured_male_junior_managers=>0,
:number_of_indian_male_junior_managers=>0,
:number_of_white_male_junior_managers=>0,
:number_of_african_female_junior_managers=>0,
:number_of_coloured_female_junior_managers=>0,
:number_of_indian_female_junior_managers=>0,
:number_of_white_female_junior_managers=>0
}
but it's re-populated with data after a SQL Query.
I would like to make it so that the key must contain both the race and the gender in order for it to return something. Otherwise it must return 0. Is this right or is the regex syntax off?
It's returning 0 for all, which it shouldn't.
So the example would be
%td.total_cell= @ee_demographics_presenter.extract_gender_race_totals("male","african")
This should return 4, since there are 4 African male managers.
Try something like this.
def extract_gender_race_totals(gender, race)
  @data.select{|k, v| k.to_s.match(/#{race}_#{gender}/)}.values.reduce(:+)
end
extract_gender_race_totals("male", "african")
# => 4
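One caveat with reduce(:+): on an empty selection it returns nil, while the question asks for 0 when nothing matches. A possible variant, sketched here with a trimmed-down hash passed in as an argument (the method name gender_race_total is hypothetical), uses Array#sum, which is available from Ruby 2.4 on and returns 0 for an empty array:

```ruby
# Trimmed-down sample of the hash from the question.
data = {
  number_of_african_male_senior_managers: 2,
  number_of_african_male_middle_managers: 2,
  number_of_white_female_junior_managers: 0
}

# Hypothetical standalone version that takes the hash as an argument.
def gender_race_total(data, gender, race)
  data.select { |k, _v| k.to_s.match?(/#{race}_#{gender}/) }.values.sum
end

african_male = gender_race_total(data, "male", "african")
no_match     = gender_race_total(data, "male", "indian")
```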
gmalete's answer gives an elegant solution, but here is just an explanation of why your regexp isn't quite right. If you corrected the regexp I think your approach would work, it just isn't as idiomatic Ruby.
/(#{gender})(#{race})/ won't match number_of_african_male_senior_managers for 2 reasons:
1) the race comes before the gender in the hash key and 2) there is an underscore in the hash key that needs to be in the regexp. e.g.
/(#{race})_(#{gender})/
would work, but the parentheses aren't needed so this can be simplified to
/#{race}_#{gender}/
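A quick check of both patterns against one of the keys shows the difference. The Regexp.escape line at the end is my own addition, only relevant if the arguments could ever contain regex metacharacters:

```ruby
key = :number_of_african_male_senior_managers.to_s
gender = "male"
race = "african"

original = /(#{gender})(#{race})/   # looks for "maleafrican"
fixed    = /#{race}_#{gender}/      # looks for "african_male"

original_matches = original.match?(key)   # false
fixed_matches    = fixed.match?(key)      # true

# "male" alone also occurs inside "female"; the "african_" prefix rules that out here.
female_key = "number_of_african_female_senior_managers"
matches_female = fixed.match?(female_key) # false

# Assumption: escape the arguments if they might contain regex metacharacters.
escaped = /#{Regexp.escape(race)}_#{Regexp.escape(gender)}/
```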
Rather than having specific methods to query pieces of your keys (i.e. "gender_race"), you could make a general method to query any attribute in any order:
def extract_totals(*keywords)
  keywords.inject(@data) { |memo, keyword| memo.select { |k, v| k.to_s =~ /_#{keyword}(?:_|\b)/ } }.values.reduce(:+)
end
Usage:
extract_totals("senior")
extract_totals("male", "african")
extract_totals("managers") # maybe you'll have _employees later...
# etc.
Not exactly what you asked for, but maybe it will help.

Symbol to integer error in inject

This should be pretty easy for the Ruby wizards here. I'm having a problem with inject. Put simply:
a = Resource.all
a.inject({ :wood => 0 }) { |res, el| res[:wood] + el.cost(1)[:wood] }
TypeError: can't convert Symbol into Integer
a is a collection, and I would like to create a sum of all the wood resources of this collection. The el.cost(1)[:wood] works fine and gets an integer (the resource's value), so this part is correct. It seems that I have a problem with initializing my new hash with the :wood symbol and setting that value in each iteration, but I can't really find the problem.
Any ideas?
inject works like this:
Take the initialization value and pass it into the lambda together with the first element in the list. Use the result of the lambda as the new accumulator.
Pass that new accumulator into the lambda together with the next element in the list. Use the result of the lambda as the new accumulator.
And so on...
So what you have to do in the lambda is:
Take the hash in res.
Modify it.
Return the hash.
You fail to do 2 and 3, that's why that code doesn't work. Try the following:
a.inject({ :wood => 0 }) { |res, el| res[:wood] += el.cost(1)[:wood]; res }
This is however a bit redundant. You can easily accumulate the integers first and then create a hash:
{ :wood => a.map { |el| el.cost(1)[:wood] }.reduce(0, :+) }
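Here is a self-contained sketch of the three variants. Since Resource and cost are not shown in the question, a plain array of hashes stands in for them; that stand-in is my assumption:

```ruby
# Hypothetical stand-in for Resource.all and el.cost(1): plain hashes of costs.
costs = [{ wood: 3 }, { wood: 5 }, { wood: 2 }]

# Broken: the block returns an Integer, which becomes `res` on the next pass,
# so the second iteration evaluates 3[:wood] and raises.
error =
  begin
    costs.inject({ wood: 0 }) { |res, el| res[:wood] + el[:wood] }
    nil
  rescue TypeError => e
    e
  end

# Fixed: mutate the hash, then return it so it stays the accumulator.
fixed = costs.inject({ wood: 0 }) { |res, el| res[:wood] += el[:wood]; res }

# Simpler: sum the integers first, then wrap the total in a hash.
summed = { wood: costs.map { |el| el[:wood] }.reduce(0, :+) }
```

The TypeError only shows up from the second element onward, which is why the broken version appears to work on a collection with a single element.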

Resources