dup gives different results when hash is one vs. two dimensions - ruby

dup makes a shallow copy, so when I do this:
h = {one: {a:'a', b: 'b'}}
h_copy = h.dup
h_copy[:one][:b] = 'new b'
now h and h_copy are the same: {:one=>{:a=>"a", :b=>"new b"}}
Yes, that is what I expect.
But when h is a one-dimensional hash:
h = {a:'a', b: 'b'}
h_copy = h.dup
h_copy[:b] = 'new b'
h still is: {a:'a', b: 'b'}
h_copy is {a:'a', b: 'new b'}
Why?

You can think of your two-dimensional hash as a container that contains another hash container. So you have 2 containers.
When you call dup on h, it returns a copy of the outermost container only; the inner containers are not copied. That is what a shallow copy does. After dup you have 3 containers: h_copy is the new third container, and its :one key points to the same inner container that h's :one key points to. Changing that inner container through either hash is therefore visible through both. In the one-dimensional case, h_copy[:b] = 'new b' changes only the new outer container, so h is unaffected.
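One way to see the shared inner container is to compare object identity with equal?. The following is only a minimal sketch; the helper name same_object_under? is made up for illustration.

```ruby
# Hypothetical helper: true when both hashes store the very same object under key.
def same_object_under?(key, left, right)
  left[key].equal?(right[key])
end

h = { one: { a: 'a', b: 'b' } }
h_copy = h.dup

h.equal?(h_copy)                    #=> false (dup created a new outer hash)
same_object_under?(:one, h, h_copy) #=> true  (the inner hash is shared)

# Assigning a new value to the key in the copy ends the sharing for that key only.
h_copy[:one] = { a: 'a', b: 'new b' }
same_object_under?(:one, h, h_copy) #=> false
h[:one][:b]                         #=> "b"
```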

As you said, dup is shallow copy.
It appears you want both h_copy and h to refer to the same object.
Then simply do h_copy = h (i.e. no dup).
h = {a:'a', b: 'b'}
h_copy = h
h_copy[:b] = 'new b'
h #=> {a:'a', b: 'new b'}

So after 1 hour of brainstorming, I have come to the conclusion that in a multidimensional hash, dup copies only the outer hash: h and h_copy have different object_ids, but h[:one] and h_copy[:one] have the same object_id because both refer to the same inner hash. In a single-dimensional hash the values also have identical object_ids initially, but assigning to a key in the copy stores a new object (with a new object_id) under that key in the copy only; the original hash is untouched.
Look at the following code
h = { :a => "a", :b => "b" } # => {:a=>"a", :b=>"b"}
h_clone = h.dup #=> {:a=>"a", :b=>"b"}
h.object_id #=> 73436330
h_clone.object_id #=> 73295920
h[:a].object_id #=> 73436400
h_clone[:a].object_id #=> 73436400
h[:b].object_id #=> 73436380
h_clone[:b].object_id #=> 73436380
h_clone[:b] = "New B" #=> "New B"
h_clone[:b].object_id #=> 74385280
h.object_id #=> 73436330
h_clone.object_id #=> 73295920
h[:a].object_id #=> 73436400
h_clone[:a].object_id #=> 73436400
Look at the following code for the multidimensional hash
h = { :one => { :a => "a", :b => "b" } } #=> {:one=>{:a=>"a", :b=>"b"}}
h_copy = h.dup #=> {:one=>{:a=>"a", :b=>"b"}}
h_copy.object_id #=> 80410620
h.object_id #=> 80552610
h[:one].object_id #=> 80552620
h_copy[:one].object_id #=> 80552620
h[:one][:a].object_id #=> 80552740
h_copy[:one][:a].object_id #=> 80552740
h[:one][:b].object_id #=> 80552700
h_copy[:one][:b].object_id #=> 80552700
h_copy[:one][:b] = "New B" #=> "New B"
h_copy #=> {:one=>{:a=>"a", :b=>"New B"}}
h #=> {:one=>{:a=>"a", :b=>"New B"}}
h.object_id #=> 80552610
h_copy.object_id #=> 80410620
h[:one].object_id #=> 80552620
h_copy[:one].object_id #=> 80552620
h[:one][:b].object_id #=> 81558770
h_copy[:one][:b].object_id #=> 81558770
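If the goal is a copy whose nested hashes are independent as well, a recursive copy is needed. A minimal sketch, assuming Ruby 2.4 or later (where dup on symbols, integers and nil returns the object itself); deep_copy is a hypothetical helper name, not a core method:

```ruby
# deep_copy is a hypothetical helper, not part of core Ruby.
# It copies hashes and arrays recursively and dups everything else.
def deep_copy(obj)
  case obj
  when Hash
    obj.each_with_object({}) { |(k, v), copy| copy[k] = deep_copy(v) }
  when Array
    obj.map { |element| deep_copy(element) }
  else
    obj.dup
  end
end

h = { :one => { :a => "a", :b => "b" } }
h_deep = deep_copy(h)

h_deep[:one].equal?(h[:one]) #=> false
h_deep[:one][:b] = "New B"
h      #=> {:one=>{:a=>"a", :b=>"b"}}
h_deep #=> {:one=>{:a=>"a", :b=>"New B"}}
```

Marshal.load(Marshal.dump(h)) is another common way to get a deep copy, as long as everything inside the hash can be marshalled.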

Related

Looking to convert information from a file into a hash Ruby

Hello, I have been researching this for some time for a project I am working on and I am at a loss. I want to read information from a file and convert it to a hash, using some of its components as keys. The file contains: 1,Foo,20,Smith,40,John,55
I am looking for output like this: {1 =>[Foo,20], 2 =>[Smith,40], 3 => [John,55]}
Here is what I got.
h = {}
people_file = File.open("people.txt") # I am only looking to read here.
until people_file.eof?
i = people_file.gets.chomp.split(",")
end
people_file.close
FName = 'test'
str = "1,Foo,20,Smith, 40,John,55"
File.write(FName, str)
#=> 26
base, *arr = File.read(FName).
split(/\s*,\s*/)
enum = (base.to_i).step
arr.each_slice(2).
with_object({}) {|pair,h| h[enum.next]=pair}
#=> {1=>["Foo", "20"], 2=>["Smith", "40"],
# 3=>["John", "55"]}
The steps are as follows.
s = File.read(FName)
#=> "1,Foo,20,Smith, 40,John,55"
base, *arr = s.split(/\s*,\s*/)
#=> ["1", "Foo", "20", "Smith", "40", "John", "55"]
base
#=> "1"
arr
#=> ["Foo", "20", "Smith", "40", "John", "55"]
a = base.to_i
#=> 1
I assume the keys are to be sequential integers beginning with a #=> 1.
enum = a.step
#=> (1.step)
enum.next
#=> 1
enum.next
#=> 2
enum.next
#=> 3
Continuing,
enum = a.step
b = arr.each_slice(2)
#=> #<Enumerator: ["Foo", "20", "Smith", "40", "John", "55"]:each_slice(2)>
Note I needed to redefine enum (or execute enum.rewind) to reinitialize it. We can see the elements that will be generated by this enumerator by converting it to an array.
b.to_a
#=> [["Foo", "20"], ["Smith", "40"], ["John", "55"]]
Continuing,
c = b.with_object({})
#=> #<Enumerator: #<Enumerator: ["Foo", "20", "Smith", "40", "John", "55"]
# :each_slice(2)>:with_object({})>
c.to_a
#=> [[["Foo", "20"], {}], [["Smith", "40"], {}], [["John", "55"], {}]]
The now-empty hashes will be constructed as calculations progress.
c.each {|pair,h| h[enum.next]=pair}
#=> {1=>["Foo", "20"], 2=>["Smith", "40"], 3=>["John", "55"]}
To see how the last step is performed, each initially directs the enumerator c to generate the first value, which it passes to the block. The block variables are assigned to that value, and the block calculation is performed.
enum = a.step
b = arr.each_slice(2)
c = b.with_object({})
pair, h = c.next
#=> [["Foo", "20"], {}]
pair
#=> ["Foo", "20"]
h #=> {}
h[enum.next]=pair
#=> ["Foo", "20"]
Now,
h#=> {1=>["Foo", "20"]}
The calculations are similar for the remaining two elements generated by the enumerator c.
See IO::write, IO::read, Numeric#step, Enumerable#each_slice, Enumerator#with_object, Enumerator#next and Enumerator#rewind. write and read respond to File because File is a subclass of IO (File.superclass #=> IO). split's argument, the regular expression /\s*,\s*/, causes the string to be split on commas together with any spaces that surround the commas. Converting [["Foo", "20"], {}] to pair and h is a product of Array Decomposition.
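For comparison, the same result can probably be reached without a separate step enumerator by numbering the slices with Enumerator#with_index. This is only a sketch under the same assumption as above (the first field is the starting key and the rest are pairs); pairs_by_index is a made-up name, and it takes the line as a string rather than reading a file:

```ruby
# Hypothetical helper: the first field is the starting key,
# the remaining fields are grouped in pairs and numbered from it.
def pairs_by_index(line)
  first_key, *fields = line.strip.split(/\s*,\s*/)
  fields.each_slice(2)
        .with_index(first_key.to_i)
        .map { |pair, index| [index, pair] }
        .to_h
end

pairs_by_index("1,Foo,20,Smith, 40,John,55")
#=> {1=>["Foo", "20"], 2=>["Smith", "40"], 3=>["John", "55"]}
```

With a file, the argument would be File.read("people.txt") or a single line from it.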

Multiple sub-hashes out of one hash

I have a hash:
hash = {"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3", "a_2_a" => "3",
"a_2_b" => "4", "a_2_c" => "4"}
What's the best way to get the following sub-hashes:
[{"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3"},
{"a_2_a" => "3", "a_2_b" => "4", "a_2_c" => "4"}]
I want them grouped by the key, based on the regexp /^a_(\d+)/. I'll have 50+ key/value pairs in the original hash, so something dynamic would work best, if anyone has any suggestions.
If you're only concerned about the middle component you can use group_by to get you most of the way there:
hash.group_by do |k,v|
k.split('_')[1]
end.values.map do |list|
Hash[list]
end
# => [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"}, {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
The final step is extracting the grouped lists and combining those back into the required hashes.
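On Ruby 2.1 or later, where Array#to_h is available, the same idea can be written a little more compactly, grouping on the capture from the regexp given in the question. A sketch only; split_by_prefix_number is a hypothetical name:

```ruby
# Hypothetical helper: groups entries by the digits captured by /^a_(\d+)/
# and turns each group of [key, value] pairs back into a hash.
def split_by_prefix_number(hash)
  hash.group_by { |key, _value| key[/^a_(\d+)/, 1] }
      .values
      .map(&:to_h)
end

hash = {"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3", "a_2_a" => "3",
        "a_2_b" => "4", "a_2_c" => "4"}
split_by_prefix_number(hash)
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
#    {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
```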
Code
def partition_hash(hash)
hash.each_with_object({}) do |(k,v), h|
key = k[/(?<=_).+(?=_)/]
h[key] = (h[key] || {}).merge(k=>v)
end.values
end
Example
hash = {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}
partition_hash(hash)
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
Explanation
The steps are as follows.
enum = hash.each_with_object({})
#=> #<Enumerator: {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3",
# "a_2_b"=>"4", "a_2_c"=>"4"}:each_with_object({})>
The first element of this enumerator is generated and passed to the block, and the block variables are computed using parallel assignment.
(k,v), h = enum.next
#=> [["a_1_a", "1"], {}]
k #=> "a_1_a"
v #=> "1"
h #=> {}
and the block calculation is performed.
key = k[/(?<=_).+(?=_)/]
#=> "1"
h[key] = (h[key] || {}).merge(k=>v)
#=> h["1"] = (h["1"] || {}).merge("a_1_a"=>"1")
#=> h["1"] = (nil || {}).merge("a_1_a"=>"1")
#=> h["1"] = {}.merge("a_1_a"=>"1")
#=> h["1"] = {"a_1_a"=>"1"}
so now
h #=> {"1"=>{"a_1_a"=>"1"}}
The next value of enum is now generated and passed to the block, and the following calculations are performed.
(k,v), h = enum.next
#=> [["a_1_b", "2"], {"1"=>{"a_1_a"=>"1"}}]
k #=> "a_1_b"
v #=> "2"
h #=> {"1"=>{"a_1_a"=>"1"}}
key = k[/(?<=_).+(?=_)/]
#=> "1"
h[key] = (h[key] || {}).merge(k=>v)
#=> h["1"] = (h["1"] || {}).merge("a_1_b"=>"2")
#=> h["1"] = ({"a_1_a"=>"1"} || {}).merge("a_1_b"=>"2")
#=> h["1"] = {"a_1_a"=>"1"}.merge("a_1_b"=>"2")
#=> h["1"] = {"a_1_a"=>"1", "a_1_b"=>"2"}
After the remaining four elements of enum have been passed to the block, the following hash is returned.
h #=> {"1"=>{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# "2"=>{"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}}
The final step is simply to extract the values.
h.values
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
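The merge in the block builds a new inner hash on every iteration. If that matters, the inner hash can be created once with ||= and then filled in place. A sketch of that variant, using the same regular expression as above; the method name is made up:

```ruby
# Hypothetical variant of partition_hash that fills each group in place
# instead of merging into a new hash on every iteration.
def partition_hash_in_place(hash)
  hash.each_with_object({}) do |(k, v), groups|
    key = k[/(?<=_).+(?=_)/]
    (groups[key] ||= {})[k] = v
  end.values
end

hash = {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}
partition_hash_in_place(hash)
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
#    {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
```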

Iterate through array of hashes and merge [closed]

I have an array of hashes like so:
[{:a=>"a", :b=>"b", :c=>"c", :d=>"d"},
{:a=>"a", :b=>nil, :c=>"notc", :d=>"d"}]
I want to iterate over the array and merge the hashes where a specific key is the same, such as :a. If the value for a key is the same in the next hash, ignore it, but if the value is different, create an array. The result will look something like this:
{:a=>"a", :b=>"b", :c=>["c","notc"], :d=>"d"}
I think I have to do a for loop through the array of hashes and then use the merge! method, but I am not sure where to start.
I would also use Hash#merge! (aka update), like this (letting a denote the name of your array of hashes):
a.each_with_object({}) do |g,h|
h.update(g) do |_,o,n|
case o
when Array
o.include?(n) ? o : o + [n]
else
o.eql?(n) ? o : [o,n]
end
end
end
#=> {:a=>"a", :b=>["b", nil], :c=>["c", "notc"], :d=>"d"}
When o is an array, if you don't want to merge nil values, change the following line to:
(o.include?(n) || n.nil?) ? o : o + [n]
The steps:
a = [{:a=>"a", :b=>"b", :c=>"c", :d=>"d"},
{:a=>"a", :b=>nil, :c=>"notc", :d=>"d"},
{:a=>"aa", :b=>"b", :c=>"cc", :d=>"d"},
]
enum = a.each_with_object({})
#=> #<Enumerator: [{:a=>"a", :b=>"b", :c=>"c", :d=>"d"},
# {:a=>"a", :b=>nil, :c=>"notc", :d=>"d"},
# {:a=>"aa", :b=>"b", :c=>"cc", :d=>"d"}]
# :each_with_object({})>
We can see the values of the enumerator (which will be passed into the block) by converting it to an array:
enum.to_a
#=> [[{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}, {}],
# [{:a=>"a", :b=>nil, :c=>"notc", :d=>"d"}, {}],
# [{:a=>"aa", :b=>"b", :c=>"cc", :d=>"d"}, {}]]
Pass in the first value and assign it to the block variables:
g,h = enum.next
#=> [{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}, {}]
g #=> {:a=>"a", :b=>"b", :c=>"c", :d=>"d"}
h #=> {}
update's block is used for determining the values of keys that are present in both hashes being merged. As h is presently empty ({}), there are no common keys, so the block is not used for this merge:
h.update(g)
#=> {:a=>"a", :b=>"b", :c=>"c", :d=>"d"}
The new value of h is returned.
Now pass the second value of enum into the block:
g,h = enum.next
#=> [{:a=>"a", :b=>nil, :c=>"notc", :d=>"d"},
# {:a=>"a", :b=>"b", :c=>"c", :d=>"d"}]
g #=> {:a=>"a", :b=>nil, :c=>"notc", :d=>"d"}
h #=> {:a=>"a", :b=>"b", :c=>"c", :d=>"d"}
and execute:
h.update(g)
When :a=>"a" from g is to be merged, update sees that h contains the same key, :a. It therefore defers to the block to determine the value for :a in the merged hash. It passes the following values to the block:
k = :a
o = "a"
n = "a"
where k is the key, o (for "old") is the value of k in h and n (for "new") is the value of k in g. (We're not using k in the block, so I've named the block variable _ to signify that.) In the case statement, o is not an array, so:
o.eql?(n) ? o : [o,n]
#=> "a".eql?("a") ? "a" : ["a","a"]
#=> "a"
is returned to update to be the value for :a. That is, the value is not changed.
When the key is :b:
k = :b
o = "b"
n = nil
Again, o is not an array, so again we execute:
o.eql?(n) ? o : [o,n]
#=> ["b", nil]
but this time an array is returned. The remaining calculations for merging the second element of enum proceed similarly. After the merge:
h #=> {:a=>"a", :b=>["b", nil], :c=>["c", "notc"], :d=>"d"}
When :c=>"cc" in the third element of enum is merged, the following values are passed to update's block:
_ = :c
o = ["c", "notc"]
n = "cc"
Since o is an array, we execute the following line of the case statement:
o.include?(n) ? o : o + [n]
#=> ["c", "notc", "cc"]
and the value of :c is assigned that value. The remaining calculations are performed similarly.
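To confirm that the block is consulted only for keys present in both hashes, the keys it receives can be recorded. A small sketch using the non-destructive Hash#merge; the method name and the sample data are made up for illustration:

```ruby
# Hypothetical helper: merges two hashes and records which keys
# were passed to the conflict-resolution block.
def merge_recording_conflicts(left, right)
  conflicts = []
  merged = left.merge(right) do |key, old, new|
    conflicts << key
    old.eql?(new) ? old : [old, new]
  end
  [merged, conflicts]
end

merged, conflicts = merge_recording_conflicts({ a: 1, b: 2 }, { b: 3, c: 4 })
merged    #=> {:a=>1, :b=>[2, 3], :c=>4}
conflicts #=> [:b]
```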
I would do something like this:
array = [{ :a=>"a", :b=>"b", :c=>"c", :d=>"d" },
{ :a=>"a", :b=>nil, :c=>"notc", :d=>"d" }]
result = array.reduce({}) do |memo, hash|
memo.merge(hash) do |_, left, right|
combined = Array(left).push(right).uniq.compact
combined.size > 1 ? combined : combined.first
end
end
puts result
#=> { :a=>"a", :b=>"b", :c=>["c", "notc"], :d=>"d" }
Array(left) ensures that the value from the first hash is an array. push(right) adds the value from the other hash to that array. uniq.compact removes duplicates and nil values.
The combined.size > 1 ? combined : combined.first line returns just the element if the array holds only one element.
Assuming hs = [{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}, {:a=>"a", :b=>nil, :c=>"notc", :d=>"d"}], you can do that with a (admittedly, terribly dense) one-liner:
Hash[hs.map { |h| h.map { |k,v| [k, v] } }.sort_by(&:first).reduce { |left,right| left.zip(right) }.map { |a| [a.first.first, a.map(&:last).compact.uniq] }]
To unpack what's going on here:
First we use map to convert the array of hashes into an array of arrays of pairs, and then use sort_by to sort the array so that all of the keys are 'lined up'.
[{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}, {:a=>"a", :b=>nil, :c=>"notc", :d=>"d"}] becomes
[[[:a, "a"], [:b, "b"], [:c, "c"], [:d, "d"]],
[[:a, "a"], [:b, nil], [:c, "notc"], [:d, "d"]]]
That's this part:
hs.map { |h| h.map { |k,v| [k, v] } }.sort_by(&:first).
Then we use reduce to zip all the arrays together. zip takes two arrays, such as [1,2,3] and [:a,:b,:c], and outputs an array like [[1,:a],[2,:b],[3,:c]]:
reduce { |left,right| left.zip(right) }.
At this point we've grouped all the data together by key, and need to de-duplicate all the copies of the key, and we can remove the nils and uniq the values:
map { |a| [a.first.first, a.map(&:last).compact.uniq] }]
Here's a sample from a pry session:
[31] pry(main)> Hash[hs.map { |h| h.map { |k,v| [k, v] } }.sort_by(&:first).reduce { |left,right| left.zip(right) }.map { |a| [a.first.first, a.map(&:last).compact.uniq] }]
=> {:a=>["a"], :b=>["b"], :c=>["c", "notc"], :d=>["d"]}
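Because zip pairs elements by position, the one-liner relies on every hash listing its keys in the same order. A longer sketch that should give the same always-an-array result without that assumption is shown below; collect_values is a hypothetical name, and unlike the one-liner it leaves out keys whose values are all nil:

```ruby
# Hypothetical helper: collects the distinct non-nil values seen for each key.
def collect_values(hashes)
  hashes.each_with_object(Hash.new { |h, k| h[k] = [] }) do |hash, collected|
    hash.each do |key, value|
      next if value.nil? || collected[key].include?(value)
      collected[key] << value
    end
  end
end

hs = [{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}, {:a=>"a", :b=>nil, :c=>"notc", :d=>"d"}]
collect_values(hs)
#=> {:a=>["a"], :b=>["b"], :c=>["c", "notc"], :d=>["d"]}
```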

Move elements of an array to a different array in Ruby

Simple Ruby question. Let's say I have an array of 10 strings and I want to move the elements at array[3] and array[5] into a totally new array. The new array would then have only the two elements I moved from the first array, and the first array would then have only 8 elements, since two of them have been moved out.
Use Array#slice! to remove the elements from the first array, and append them to the second array with Array#<<:
arr1 = ['Foo', 'Bar', 'Baz', 'Qux']
arr2 = []
arr2 << arr1.slice!(1)
arr2 << arr1.slice!(2)
puts arr1.inspect
puts arr2.inspect
Output:
["Foo", "Baz"]
["Bar", "Qux"]
Depending on your exact situation, you may find other methods on array to be even more useful, such as Enumerable#partition:
arr = ['Foo', 'Bar', 'Baz', 'Qux']
starts_with_b, does_not_start_with_b = arr.partition{|word| word[0] == 'B'}
puts starts_with_b.inspect
puts does_not_start_with_b.inspect
Output:
["Bar", "Baz"]
["Foo", "Qux"]
a = (0..9).map { |i| "el##{i}" }
x = [3, 5].sort_by { |i| -i }.map { |i| a.delete_at(i) }
puts x.inspect
# => ["el#5", "el#3"]
puts a.inspect
# => ["el#0", "el#1", "el#2", "el#4", "el#6", "el#7", "el#8", "el#9"]
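As noted in the comments, the sort_by { |i| -i } step is a bit of magic: deleting from the highest index down keeps the remaining indices valid while elements are removed. The reordering of the result can be avoided by first getting all the desired elements using a.values_at(*indices), then deleting them as above.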
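The values_at variant might look like the following sketch. pull_at is a hypothetical helper name; it modifies the array it is given and returns the pulled elements in the order the indices were listed:

```ruby
# Hypothetical helper: returns the elements at the given indices and
# removes them from the array (highest index first, so nothing shifts early).
def pull_at(array, indices)
  pulled = array.values_at(*indices)
  indices.sort.reverse_each { |i| array.delete_at(i) }
  pulled
end

a = (0..9).map { |i| "el##{i}" }
x = pull_at(a, [3, 5])
x #=> ["el#3", "el#5"]
a #=> ["el#0", "el#1", "el#2", "el#4", "el#6", "el#7", "el#8", "el#9"]
```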
Code:
arr = ["null","one","two","three","four","five","six","seven","eight","nine"]
p "Array: #{arr}"
third_el = arr.delete_at(3)
fifth_el = arr.delete_at(4)
first_arr = arr
p "First array: #{first_arr}"
concat_el = third_el + "," + fifth_el
second_arr = concat_el.split(",")
p "Second array: #{second_arr}"
Output:
c:\temp>C:\case.rb
"Array: [\"null\", \"one\", \"two\", \"three\", \"four\", \"five\", \"six\", \"seven\", \"eight\", \"nine\"]"
"First array: [\"null\", \"one\", \"two\", \"four\", \"six\", \"seven\", \"eight\", \"nine\"]"
"Second array: [\"three\", \"five\"]"
Why not start deleting from the highest index?
arr = ['Foo', 'Bar', 'Baz', 'Qux']
index_array = [2, 1]
new_ary = index_array.map { |index| arr.delete_at(index) }
new_ary # => ["Baz", "Bar"]
arr # => ["Foo", "Qux"]
Here's one way:
vals = arr.values_at *pulls
arr = arr.values_at *([*(0...arr.size)] - pulls)
Try it.
arr = %w[Now is the time for all Rubyists to code]
pulls = [3,5]
vals = arr.values_at *pulls
#=> ["time", "all"]
arr = arr.values_at *([*(0...arr.size)] - pulls)
#=> ["Now", "is", "the", "for", "Rubyists", "to", "code"]
arr = %w[Now is the time for all Rubyists to code]
pulls = [5,3]
vals = arr.values_at *pulls
#=> ["all", "time"]
arr = arr.values_at *([*(0...arr.size)] - pulls)
#=> ["Now", "is", "the", "for", "Rubyists", "to", "code"]
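A non-mutating alternative is to partition the elements by index in one pass. This is a sketch only, with a made-up method name; note that the pulled values come back in array order rather than in the order of pulls:

```ruby
# Hypothetical helper: returns [pulled, kept] without modifying the input array.
def split_by_index(array, indices)
  pulled, kept = array.each_with_index.partition { |_element, i| indices.include?(i) }
  [pulled.map(&:first), kept.map(&:first)]
end

arr = %w[Now is the time for all Rubyists to code]
vals, rest = split_by_index(arr, [5, 3])
vals #=> ["time", "all"]
rest #=> ["Now", "is", "the", "for", "Rubyists", "to", "code"]
```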

How do I create a hash from this array?

I have an array that looks like this:
["value1=3", "value2=4", "value3=5"]
I'd like to end up with a hash like:
H['value1'] = 3
H['value2'] = 4
H['value3'] = 5
There's some parsing involved and I was hoping to get pointed in the right direction.
ary = ["value1=3", "value2=4", "value3=5"]
H = Hash[ary.map {|s| s.split('=') }]
However, this leaves all the values as strings (e.g. '5') rather than integers. If you are sure they are all integers:
H = Hash[ary.map {|s| key, value = s.split('='); [key, value.to_i] }]
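If you are not sure every value is an integer, to_i will quietly turn something like "abc" into 0. A stricter sketch uses Kernel#Integer, which raises ArgumentError on non-numeric strings; parse_pairs is a hypothetical name, and splitting with a limit of 2 is an assumption that only the first = separates key from value:

```ruby
# Hypothetical helper: builds a hash from "key=value" strings and
# raises ArgumentError if a value is not a valid integer.
def parse_pairs(entries)
  entries.map do |entry|
    key, value = entry.split('=', 2)
    [key, Integer(value)]
  end.to_h
end

parse_pairs(["value1=3", "value2=4", "value3=5"])
#=> {"value1"=>3, "value2"=>4, "value3"=>5}
```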
I'd do as @BroiSatse suggests, but here's another way that uses a regex:
ary = ["value1=3", "value2=4", "value3=5"]
ary.join.scan(/([a-z]+\d+)=(\d+)/).map { |k,v| [k,v.to_i] }.to_h
#=> {"value1"=>3, "value2"=>4, "value3"=>5}
Here's what's happening:
str = ary.join
#=> "value1=3value2=4value3=5"
a = str.scan(/([a-z]+\d+)=(\d+)/)
#=> [["value1", "3"], ["value2", "4"], ["value3", "5"]]
b = a.map { |k,v| [k,v.to_i] }
#=> [["value1", 3], ["value2", 4], ["value3", 5]]
b.to_h
#=> {"value1"=>3, "value2"=>4, "value3"=>5}
For Ruby versions < 2.0, the last line must be replaced with
Hash[b]
#=> {"value1"=>3, "value2"=>4, "value3"=>5}
