Ruby splat operator changing value inside loop

I want to define a method which can take an arbitrary number of positional arguments and keyword arguments (hashes), like so
def foo(*b, **c)
  2.times.map.with_index { |i|
    new_hash, new_array = {}, b
    c.map { |key, value| new_hash[key] = value[i] unless value[i].nil? }
    new_array << new_hash if new_hash.length > 0
    send(:bar, new_array)
  }
end
def bar(*b)
  p b
end
If I've understood the splat and double splat operators correctly (which I doubt), then this should send the array b to the bar method, and only adding the new_hash from foo if it contains something. However, something weird happens - I'll try and illustrate with some snippets below
# invoking #foo
foo(a, key: 'value')
# first iteration of loop in #foo
# i is 0
# b is []
# c is { :key => ['value1'] }
# send(:bar, new_array) => send(:bar, [{:key => 'value1'}])
# bar yields: [{:key => 'value1'}]
Now, however, something happens
# second iteration of loop in #foo
# i is 1
# b is [:key => 'value1'] <---- why?
# c is { :key => ['value1'] }
Why has the value of b changed inside the loop of foo?
Edit: updated the code to reflect that a new array is created for each iteration.

new_hash, new_array = {}, b
This doesn't create a copy of b. Now new_array and b point to the same object. Modifying one in-place will modify the other.
new_array << new_hash
That modifies new_array (and thus b) in place, so the new element remains on the next iteration. Use something like +, which creates a copy:
send(:bar, *(b + (new_hash.empty? ? [] : [new_hash])))
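The aliasing described above can be reproduced in isolation. This is a minimal sketch; the names `shared`, `alias_ref`, and `copy` are made up for illustration and do not appear in the question.

```ruby
# Hypothetical names, for illustration only.
shared = [:a]

alias_ref = shared        # plain assignment: both names refer to one array
alias_ref << :b           # in-place mutation, visible through both names
# shared is now [:a, :b]

copy = shared + [:c]      # Array#+ builds a new array and leaves shared alone
# shared is still [:a, :b]; copy is [:a, :b, :c]

same_object = alias_ref.equal?(shared)   # true, identical object
copied      = !copy.equal?(shared)       # true, a different object
```

`shared.dup` would also give an independent array if the in-place `<<` style is preferred.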

Related

Put every Hash Element inside of an Array Ruby

Let's say I have a Hash like this:
my_hash = {"a"=>{"a1"=>"b1"}, "b"=>"b", "c"=>{"c1"=>{"c2"=>"c3"}}}
And I want to wrap every value inside the hash that is itself a hash in an Array.
For example, I want the finished Hash to look like this:
{"a"=>[{"a1"=>"b1"}], "b"=>"b", "c"=>[{"c1"=>[{"c2"=>"c3"}]}]}
Here is what I've tried so far, but I need it to work recursively and I'm not quite sure how to make that work:
my_hash.each do |k,v|
  if v.class == Hash
    my_hash[k] = [] << v
  end
end
=> {"a"=>[{"a1"=>"b1"}], "b"=>"b", "c"=>[{"c1"=>{"c2"=>"c3"}}]}
You need to wrap your code into a method and call it recursively.
my_hash = {"a"=>{"a1"=>"b1"}, "b"=>"b", "c"=>{"c1"=>{"c2"=>"c3"}}}
def process(hash)
  hash.each do |k,v|
    if v.class == Hash
      hash[k] = [] << process(v)
    end
  end
end
p process(my_hash)
#=> {"a"=>[{"a1"=>"b1"}], "b"=>"b", "c"=>[{"c1"=>[{"c2"=>"c3"}]}]}
A recursive proc is another way to do it:
h = {"a"=>{"a1"=>"b1"}, "b"=>"b", "c"=>{"c1"=>{"c2"=>"c3"}}}
h.map(&(p = proc{|k,v| {k => v.is_a?(Hash) ? [p[*v]] : v}}))
.reduce({}, &:merge)
# => {"a"=>[{"a1"=>"b1"}], "b"=>"b", "c"=>[{"c1"=>[{"c2"=>"c3"}]}]}
It can be done with a single reduce, but that way things get even more obfuscated.
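If mutating the input is not wanted, a non-destructive variant is possible. The sketch below assumes Ruby 2.4 or later (for `Hash#transform_values`); the method name `wrap_nested_hashes` is hypothetical.

```ruby
# Hypothetical helper name; assumes Ruby 2.4+ for Hash#transform_values.
def wrap_nested_hashes(hash)
  hash.transform_values do |value|
    # Recurse into nested hashes and wrap the converted result in an array;
    # leave every other value untouched.
    value.is_a?(Hash) ? [wrap_nested_hashes(value)] : value
  end
end

my_hash = {"a"=>{"a1"=>"b1"}, "b"=>"b", "c"=>{"c1"=>{"c2"=>"c3"}}}
wrapped = wrap_nested_hashes(my_hash)
# wrapped == {"a"=>[{"a1"=>"b1"}], "b"=>"b", "c"=>[{"c1"=>[{"c2"=>"c3"}]}]}
# my_hash itself is left unchanged
```

Unlike the proc-based version, this also handles nested hashes with more than one key, since each value is visited individually.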

Initialize a variable within a loop

I define a variable holding an empty object before pushing elements into it in an each loop (or other types of loop) like so:
foo = []
collection.each do |item|
  foo << item
end
foo
or like this:
foo = []
count = 0
collection.each do |item|
  count += 1
  raise ArgumentError if count > 10
  foo << item
end
foo
However, foo or count appears too often and clutters the code. Is there a method to shorten this chunk of code? I want to believe that the first foo can be placed inside the loop to run once.
You can use the inject method:
foo = collection.inject([]) {|sum, item| sum << item }
Using {...} for a single-line block is just a Ruby style convention. Either form works for both single-line and multi-line blocks, but {...} is preferred for single-line blocks and do...end for multi-line ones.
foo = collection.inject([]) do |sum, item|
  sum << item
end # This is ok, but `{...}` looks better.
Multi-line:
foo = collection.inject([]) do |sum, item|
  # line 1
  # line 2
  # and more
end
For more on Ruby style, see The Ruby Style Guide.
An alternative that I use in scenarios like this is each_with_object.
collection = ['string', 1, []]
foo =
  collection.each_with_object([]) do |item,array|
    array << item
  end
#=> ['string', 1, []]
Likewise, if you need an index, you can chain each_with_index with with_object like so, but it becomes slightly more complicated:
collection = ['string', 1, []]
foo =
  collection.each_with_index.with_object([]) do |item_and_index,array|
    item, index = item_and_index
    # index is zero-based, so index >= 10 matches count > 10 in the question
    raise ArgumentError if index >= 10
    array << item
  end
#=> ['string', 1, []]
On each iteration, item_and_index is a two-element array holding the item from collection at index 0 and that item's position at index 1.
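The intermediate `item_and_index` variable can be avoided by destructuring the pair directly in the block parameters. A minimal sketch, reusing the same sample `collection`; the error message text is made up:

```ruby
collection = ['string', 1, []]

foo = collection.each_with_index.with_object([]) do |(item, index), array|
  # index is zero-based, so index >= 10 is first reached on the 11th item
  raise ArgumentError, 'too many items' if index >= 10
  array << item
end
# foo == ['string', 1, []]
```

The parentheses in `|(item, index), array|` unpack the `[item, index]` pair that `each_with_index` yields, while `array` receives the memo object passed to `with_object`.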

Difference between Ruby's .push and << [duplicate]

Here's an example with push:
@connections = Hash.new []
@connections[1] = @connections[1].push(2)
puts @connections # => {1=>[2]}
Here's an example with <<
@connections = Hash.new []
@connections[1] << 2
puts @connections # => {}
For some reason the output (@connections) is different, but why? I'm guessing it has something to do with the Ruby object model?
Perhaps a new array object [] is being created each time, but not saved? But why?
The difference in your code isn't about << vs. push, it's about the fact that you re-assign in one case and don't in the other. The following two pieces of code are equivalent:
@connections = Hash.new []
@connections[1] = @connections[1].push(2)
puts @connections # => {1=>[2]}
@connections = Hash.new []
@connections[1] = (@connections[1] << 2)
puts @connections # => {1=>[2]}
As are these two:
@connections = Hash.new []
@connections[1].push(2)
puts @connections # => {}
@connections = Hash.new []
@connections[1] << 2
puts @connections # => {}
The reason that re-assignment makes a difference here is that accessing a default value does not automatically add an entry for it to the hash. That is, if you have h = Hash.new(0) and then you do p h[0], you'll print 0, but the value of h will still be {} (not {0 => 0}) because the 0 is not added to the hash. If you do h[0] += 1, this will call the []= method on the hash and actually add an entry for 0 to it, so h becomes {0 => 1}.
So when you do @connections[1] << 2 in your code, you get the default array and perform << on it, but you don't store anything in @connections, so it stays {}. When you do @connections[1] = @connections[1].push(2) or @connections[1] = (@connections[1] << 2), you're calling []=, so the entry gets added to the hash.
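The read-versus-write distinction can be checked directly. A small sketch; the variable names are illustrative only:

```ruby
h = Hash.new(0)

read_value = h[:missing]   # returns the default, 0
after_read = h.dup         # snapshot: still empty, the read stored nothing

h[:missing] += 1           # shorthand for h[:missing] = h[:missing] + 1, which calls []=
after_write = h.dup        # snapshot: now holds one entry, :missing => 1
```

Only the second step goes through `[]=`, which is why only that step creates an entry.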
However you should note that the hash will return a reference to the same array each time, so even if you do add the entry to the hash, it will likely still not behave as you expect once you add more than one entry (since all entries refer to the same array):
@connections = Hash.new []
@connections[1] = @connections[1].push(2)
@connections[2] = @connections[2].push(42)
puts @connections # => {1 => [2, 42], 2 => [2, 42]}
What you really want is a hash that returns a reference to a new array each time that a new key is accessed and that automatically adds an entry for the new array when that happens. To do that you can use the block form of Hash.new like this:
@connections = Hash.new do |h, k|
  h[k] = []
end
@connections[1].push(2)
@connections[2].push(42)
puts @connections # => {1 => [2], 2 => [42]}
Note that when you write
h = Hash.new { |this_hash, non_existent_key| this_hash[non_existent_key] = [] }
...Ruby will execute the block whenever you try to look up a key that doesn't exist, and then return the block's return value. A block is like a def in that all variables inside it (including the parameter variables) are created anew every time the block is called. In addition, note that [] is an Array literal, and each time it is evaluated, it creates a new array.
A block returns the result of the last statement that was executed in the block, which is the assignment statement:
this_hash[non_existent_key] = []
And an assignment statement returns the right hand side, which will be a reference to the same Array that was assigned to the key in the hash, so any changes to the returned Array will change the Array in the hash.
On the other hand, when you write:
Hash.new([])
The [] literal creates a new, empty Array, and that Array becomes the argument for Hash.new(). There is no block for Ruby to call every time you look up a nonexistent key, so Ruby just returns that one Array as the value for ALL nonexistent keys, and, very importantly, nothing is done to the hash.
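The two forms can be compared side by side using object identity. A minimal sketch; `shared` and `fresh` are hypothetical names:

```ruby
shared = Hash.new([])                              # one array is the default for every missing key
fresh  = Hash.new { |hash, key| hash[key] = [] }   # block runs per missing key and stores a new array

same_default    = shared[:a].equal?(shared[:b])    # true: both lookups return the very same object
shared_is_empty = shared.empty?                    # true: the lookups stored nothing

distinct_arrays = !fresh[:a].equal?(fresh[:b])     # true: each key received its own array
fresh_keys      = fresh.keys                       # [:a, :b]: the lookups created entries
```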

How to make Ruby var= return value assigned, not value passed in?

There's a nice idiom for adding to lists stored in a hash table:
(hash[key] ||= []) << new_value
Now, suppose I write a derivative hash class, like the ones found in Hashie, which does a deep-convert of any hash I store in it. Then what I store will not be the same object I passed to the = operator; Hash may be converted to Mash or Clash, and arrays may be copied.
Here's the problem. Ruby apparently returns, from the var= method, the value passed in, not the value that's stored. It doesn't matter what the var= method returns. The code below demonstrates this:
class C
  attr_reader :foo
  def foo=(value)
    # store a copy when given an Array, so the stored object differs from the one passed in
    @foo = value.is_a?(Array) ? value.clone : value
  end
end
c=C.new
puts "assignment: #{(c.foo ||= []) << 5}"
puts "c.foo is #{c.foo}"
puts "assignment: #{(c.foo ||= []) << 6}"
puts "c.foo is #{c.foo}"
output is
assignment: [5]
c.foo is []
assignment: [6]
c.foo is [6]
When I posted this as a bug to Hashie, Danielle Sucher explained what was happening and pointed out that "foo.send :bar=, 1" returns the value returned by the bar= method. (Hat tip for the research!) So I guess I could do:
c=C.new
puts "clunky assignment: #{(c.foo || c.send(:foo=, [])) << 5}"
puts "c.foo is #{c.foo}"
puts "assignment: #{(c.foo || c.send(:foo=, [])) << 6}"
puts "c.foo is #{c.foo}"
which prints
clunky assignment: [5]
c.foo is [5]
assignment: [5, 6]
c.foo is [5, 6]
Is there any more elegant way to do this?
Assignments evaluate to the value that is being assigned. Period.
In some other languages, assignments are statements, so they don't evaluate to anything. Those are really the only two sensible choices. Either don't evaluate to anything, or evaluate to the value being assigned. Everything else would be too surprising.
Since Ruby doesn't have statements, there is really only one choice.
The only "workaround" for this is: don't use assignment.
c.foo ||= []
c.foo << 5
Using two lines of code isn't the end of the world, and it's easier on the eyes.
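The rule that an assignment evaluates to its right-hand side, whatever the setter returns, can be seen with a small sketch. The class `Box` and its `items` attribute are hypothetical and stand in for the copying container from the question.

```ruby
# Hypothetical class, for illustration only.
class Box
  attr_reader :items

  def items=(value)
    @items = value.dup   # store a copy, like the deep-converting hash in the question
    :ignored             # this return value is discarded by assignment syntax
  end
end

box    = Box.new
passed = []

assignment_result = (box.items = passed)        # evaluates to `passed`, not :ignored
send_result       = box.send(:items=, passed)   # evaluates to :ignored, the method's own return value

# The two-step form operates on the stored object:
box.items ||= []
box.items << 5
# box.items is [5]; passed is still []
```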
The prettiest way to do this is to use a default value for the hash:
# h = Hash.new { [] }
h = Hash.new { |h,k| h[k] = [] }
But beware that you can't use Hash.new([]) and then << because of the way Ruby stores variables:
h = Hash.new([])
h[:a] # => []
h[:b] # => []
h[:a] << 10
h[:b] # => [10] O.o
This happens because Ruby variables hold references to objects. Since we created only one array instance and set it as the default value, it is shared between all hash cells (unless a cell is overwritten, e.g. by h[:a] += [10]).
It is solved by using the constructor with a block (doc), Hash.new { |h,k| h[k] = [] }. With this, the block is called each time a missing key is accessed, and each value is a different array.
EDIT: Fixed error that @Uri Agassi is writing about.
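The difference between a block that only returns an array and one that also stores it can be sketched as follows; `not_stored` and `stored` are illustrative names.

```ruby
not_stored = Hash.new { [] }
not_stored[:a] << 10      # appends to a temporary array that is immediately dropped
# not_stored is still empty

stored = Hash.new { |hash, key| hash[key] = [] }
stored[:a] << 10          # the block stored the array, so the element is kept
# stored[:a] is [10] and stored[:b] is a separate empty array
```

This is why the commented-out `Hash.new { [] }` form does not work with `<<`: each lookup returns a new array, but nothing ever puts it into the hash.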

Enumerator#each Restarts Sequence

I'm surprised that Enumerator#each doesn't start off at the current position in the sequence.
o = Object.new
def o.each
  yield 1
  yield 2
  yield 3
end
e = o.to_enum
puts e.next
puts e.next
e.each{|x| puts x}
# I expect to see 1,2,3 but I see 1,2,1,2,3
# apparently Enumerator's each (inherited from Enumerable) restarts the sequence!
Am I doin' it wrong? Is there a way to maybe construct another Enumerator (from e) that will have the expected each behavior?
You're not doing it wrong, that's just not the semantics defined for Enumerator#each. You could make a derivative enumerator that only iterates from current position to end:
class Enumerator
  def enum_the_rest
    Enumerator.new { |y| loop { y << self.next } }
  end
end
o = Object.new
def o.each
  yield 1
  yield 2
  yield 3
end
e = o.to_enum
=> #<Enumerator: ...>
e.next
=> 1
e2 = e.enum_the_rest
=> #<Enumerator: ...>
e2.each { |x| puts x }
2
3
And, BTW, each doesn't restart the sequence, it just always runs over the entire span. Your enumerator still knows where it is in relation to the next next call.
e3 = o.to_enum
e3.next
=> 1
e3.next
=> 2
e3.map(&:to_s)
=> ["1", "2", "3"]
e3.next
=> 3
Enumerator#next and Enumerator#each work on the object differently. Per the documentation for #each (emphasis mine):
Iterates over the block according to how this Enumerable was constructed. If no block is given, returns self.
So #each always behaves based on the original setup, not on the current internal state. If you quickly peek at the source you'll see that rb_obj_dup is called to set up a new enumerator.
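The independence of external iteration (`next`) and internal iteration (`each` and everything built on it) can be checked with a plain array enumerator. A minimal sketch with illustrative variable names:

```ruby
e = [1, 2, 3].each                # an Enumerator over the array

first  = e.next                   # 1
second = e.next                   # 2

all_via_each = e.map { |x| x }    # [1, 2, 3]: internal iteration covers the whole sequence
third = e.next                    # 3: the external position was not disturbed
```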

Resources