Convert Ruby array of elements to Hash of counts with indices - ruby

Given a two dimensional array in Ruby:
[ [1, 1, 1],
[1, 1],
[1, 1, 1, 1],
[1, 1]
]
I'd like to create a Hash, where the keys are the counts of each internal array, and the values are arrays of indices of the original array whose internal array sizes have the particular count. The resulting Hash would be:
{ 2 => [1, 3], 3 => [0], 4 => [2] }
How do I concisely express this functionally in Ruby? I am attempting something akin to Hash.new([]).tap { |h| array.each_with_index { |a, i| h[a.length] << i } }, but the resulting Hash is empty.

There are two problems with your code. The first is that when h is empty and you write, say, h[2] << 1, since h does not have a key 2, h[2] returns the default, so this expression becomes [] << 1 #=> [1], but [1] is not attached to the hash, so no key and value are added.
You need to write h[2] = h[2] << 1 (see note 1 below). If you do that, your code returns h #=> {3=>[0, 1, 2, 3], 2=>[0, 1, 2, 3], 4=>[0, 1, 2, 3]}. Unfortunately, that's still incorrect, which takes us to the second problem with your code: you did not define the newly-created hash's default value correctly.
First note that
h[3].object_id
#=> 70113420279440
h[2].object_id
#=> 70113420279440
h[4].object_id
#=> 70113420279440
Aha, all three values are the same object! new's argument [] is returned by h[k] when h does not have a key k. The problem is that the same array is returned for every key k added to the hash, so you would be appending an index to an empty array for the first new key, then appending a second index to that same array for the next new key, and so on. See below for how the hash needs to be defined.
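The difference between the two forms of Hash.new can be checked directly. The following is only a small sketch; the method names shared_default_demo and block_default_demo are made up for illustration and do not appear in the question.

```ruby
# Hash.new([]) hands back the SAME array for every missing key
# and never stores that array in the hash.
def shared_default_demo
  h = Hash.new([])
  h[2] << 1      # mutates the shared default; no key is added
  h[3] << 0      # mutates the same array again
  [h, h[99]]     # the hash is still empty; the default has grown
end

# Hash.new with a block that assigns stores a fresh array per key.
def block_default_demo
  h = Hash.new { |hash, key| hash[key] = [] }
  h[2] << 1
  h[3] << 0
  h
end

shared_default_demo                        #=> [{}, [1, 0]]
block_default_demo == {2=>[1], 3=>[0]}     #=> true
```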
With these two changes your code works fine, but I would suggest writing it as follows.
arr = [ [1, 1, 1], [1, 1], [1, 1, 1, 1], [1, 1] ]
arr.each_with_index.with_object(Hash.new {|h,k| h[k]=[]}) { |(a,i),h|
h[a.size] << i }
#=> {3=>[0], 2=>[1, 3], 4=>[2]}
which uses the form of Hash::new that takes a block to calculate the hash's default value (i.e., the value returned by h[k] when a hash h does not have a key k),
or
arr.each_with_index.with_object({}) { |(a,i),h| (h[a.size] ||= []) << i }
#=> {3=>[0], 2=>[1, 3], 4=>[2]}
both of which are effectively the following:
h = {}
arr.each_with_index do |a,i|
sz = a.size
h[sz] = [] unless h.key?(sz)
h[a.size] << i
end
h #=> {3=>[0], 2=>[1, 3], 4=>[2]}
Another way is to use Enumerable#group_by, grouping on array size, after picking up the index for each inner array.
h = arr.each_with_index.group_by { |a,i| a.size }
#=> {3=>[[[1, 1, 1], 0]],
# 2=>[[[1, 1], 1], [[1, 1], 3]],
# 4=>[[[1, 1, 1, 1], 2]]}
h.each_key { |k| h[k] = h[k].map(&:last) }
#=> {3=>[0], 2=>[1, 3], 4=>[2]}
1. The expression h[2] = h[2] << 1 uses the methods Hash#[]= and Hash#[], which is why h[2] on the left of = does not return the default value. This expression can alternatively be written h[2] <<= 1.
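The group_by approach above can also be collapsed into a single expression. This is a sketch under the assumption that Ruby 2.4 or later is available (for Hash#transform_values); the method names are hypothetical. The second variant groups the indices directly, so no pairs need to be unpacked afterwards.

```ruby
# Group [inner_array, index] pairs by inner-array size, then keep only the indices.
# Hash#transform_values requires Ruby 2.4 or later.
def indices_by_size(arr)
  arr.each_with_index
     .group_by { |inner, _index| inner.size }
     .transform_values { |pairs| pairs.map(&:last) }
end

# Group the indices themselves by the size of the inner array they point at.
def indices_by_size_short(arr)
  arr.each_index.group_by { |i| arr[i].size }
end

arr = [[1, 1, 1], [1, 1], [1, 1, 1, 1], [1, 1]]
indices_by_size(arr) == {3=>[0], 2=>[1, 3], 4=>[2]}        #=> true
indices_by_size_short(arr) == indices_by_size(arr)         #=> true
```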

arry = [ [1, 1, 1],
[1, 1],
[1, 1, 1, 1],
[1, 1]
]
h = {}
arry.each_with_index do |el,i|
c = el.count
h.has_key?(c) ? h[c] << i : h[c] = [i]
end
p h
This will give you
{3=>[0], 2=>[1, 3], 4=>[2]}

Related

How to merge hash of hashes and set default value if value doesn't exist

I need to merge values of hash a into out with sort keys in a.
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
out = [
{"X": [4, 1]},
{"Y": [5, 0]},
{"Z": [0, 5]},
]
I would do something like this:
a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
sorted_keys = a.values.flat_map(&:keys).uniq.sort
#=> [11, 12]
a.map { |k, v| { k => v.values_at(*sorted_keys).map(&:to_i) } }
#=> [ { "X" => [4, 1] }, { "Y" => [5, 0] }, { "Z" => [0, 5] }]
Code
def modify_values(g)
sorted_keys = g.reduce([]) {|arr,(_,v)| arr | v.keys}.sort
g.each_with_object({}) {|(k,v),h| h[k] = Hash.new(0).merge(v).values_at(*sorted_keys)}
end
Example
g = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
modify_values(g)
#=> {"X"=>[4, 1], "Y"=>[5, 0], "Z"=>[0, 5]}
Explanation
The steps are as follows (the example hash is referred to here as a). First obtain an array of the unique keys from a's values (see Enumerable#reduce and Array#|), then sort that array.
b = a.reduce([]) {|arr,(_,v)| arr | v.keys}
#=> [12, 11]
sorted_keys = b.sort
#=> [11, 12]
The first key-value pair of a, together with an empty hash, is passed to each_with_object's block. The block variables are computed using parallel assignment:
(k,v),h = [["X", {12=>1, 11=>4}], {}]
k #=> "X"
v #=> {12=>1, 11=>4}
h #=> {}
The block calculation is then performed. First an empty hash with a default value 0 is created:
f = Hash.new(0)
#=> {}
The hash v is then merged into f. The result is a hash with the same key-value pairs as v but with a default value of 0. The significance of the default value is that if f does not have a key k, f[k] returns the default value. See Hash::new.
g = f.merge(v)
#=> {12=>1, 11=>4}
g.default
#=> 0 (yup)
Then extract the values corresponding to sorted_keys:
h[k] = g.values_at(*sorted_keys)
#=> {12=>1, 11=>4}.values_at(11, 12)
#=> [4, 1]
When a's next key-value pair is passed to the block, the calculations are as follows.
(k,v),h = [["Y", {11=>5}], {"X"=>[4, 1]}] # Note `h` has been updated
k #=> "Y"
v #=> {11=>5}
h #=> {"X"=>[4, 1]}
f = Hash.new(0)
#=> {}
g = f.merge(v)
#=> {11=>5}
h[k] = g.values_at(*sorted_keys)
#=> {11=>5}.values_at(11, 12)
#=> [5, 0] (Note g[12] equals g's default value)
and now
h #=> {"X"=>[4, 1], "Y"=>[5, 0]}
The calculation for the third key-value pair of a is similar.
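For comparison, the same result in the array-of-hashes shape the question asked for can be sketched with Hash#fetch, which takes the fallback value as its second argument. The method name values_by_sorted_keys is an assumption made for this example.

```ruby
# For each outer key, collect the values at every inner key seen anywhere,
# substituting 0 where an inner hash lacks that key.
def values_by_sorted_keys(hash)
  sorted_keys = hash.values.flat_map(&:keys).uniq.sort
  hash.map do |name, inner|
    { name => sorted_keys.map { |key| inner.fetch(key, 0) } }
  end
end

a = {"X"=>{12=>1, 11=>4}, "Y"=>{11=>5}, "Z"=>{12=>5}}
values_by_sorted_keys(a) == [{"X"=>[4, 1]}, {"Y"=>[5, 0]}, {"Z"=>[0, 5]}]  #=> true
```

Using fetch(key, 0) avoids building a temporary hash with a default value for every inner hash.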

Finding the mode of a Ruby Array (simplified)

I'm trying to find the mode of an Array. Mode = the element(s) that appear with the most frequency.
I know there are lots of tricks with #enumerable, but I'm not there yet in my learning. The exercise I'm doing assumes I can solve this problem without understanding enumerable.
I've written out my game plan, but I'm stuck on the 2nd part. I'm not sure if it's possible to compare a hash key against an array, and if found, increment the value.
def mode(array)
# Push array elements to hash. Hash should overwrite dup keys.
myhash = {}
array.each do |x|
myhash[x] = 0
end
# compare Hash keys to Array. When found, push +=1 to hash's value.
if myhash[k] == array[x]
myhash[k] += 1
end
# Sort hash by value
# Grab the highest hash value
# Return key(s) per the highest hash value
# rejoice!
end
test = [1, 2, 3, 3, 3, 4, 5, 6, 6, 6]
mode(test) # => 3, 6 (because they each appear 3 times)
You can create a hash with a default initial value:
myhash = Hash.new(0)
Then increment specific occurrences:
myhash["foo"] += 1
myhash["bar"] += 7
myhash["bar"] += 3
p myhash # {"foo"=>1, "bar"=>10}
With that understanding, if you replace your initial hash declaration and then do the incrementing in your array.each iterator, you're practically done.
myhash.sort_by{|key,value| value}[-1]
gives the last [key, count] pair after sorting by count, and its key should be your mode. Note that there may be multiple modes, so you can iterate backwards through the sorted pairs while the count stays the same to find them all.
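Putting those pieces together, one possible version of the asker's mode method looks like the sketch below. It deliberately sticks to each and a counting hash; instead of sorting, it finds the highest count with a plain loop, which is a substitution made here for simplicity.

```ruby
def mode(array)
  # Count occurrences; missing keys read as 0 thanks to the default value.
  counts = Hash.new(0)
  array.each { |x| counts[x] += 1 }

  # Find the highest count.
  highest = 0
  counts.each { |_element, count| highest = count if count > highest }

  # Collect every element that reaches the highest count.
  modes = []
  counts.each { |element, count| modes << element if count == highest }
  modes
end

mode([1, 2, 3, 3, 3, 4, 5, 6, 6, 6])  #=> [3, 6]
```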
There are many, many ways you could do this. Here are a few.
#1
array = [3,1,4,5,4,3]
a = array.uniq #=> [3, 1, 4, 5]
.map {|e| [e, array.count(e)]}
#=> [[3, 2], [1, 1], [4, 2], [5, 1]]
.sort_by {|_,cnt| -cnt} #=> [[3, 2], [4, 2], [1, 1], [5, 1]]
a.take_while {|_,cnt| cnt == a.first.last}
#=> [[3, 2], [4, 2]]
.map(&:first) #=> [3, 4]
#2
array.sort #=> [1, 3, 3, 4, 4, 5]
.chunk {|e| e}
#<Enumerator: #<Enumerator::Generator:0x000001021820b0>:each>
.map { |e,a| [e, a.size] } #=> [[1, 1], [3, 2], [4, 2], [5, 1]]
.sort_by { |_,cnt| -cnt } #=> [[4, 2], [3, 2], [1, 1], [5, 1]]
.chunk(&:last)
#<Enumerator: #<Enumerator::Generator:0x00000103037e70>:each>
.first #=> [2, [[4, 2], [3, 2]]]
.last #=> [[4, 2], [3, 2]]
.map(&:first) #=> [4, 3]
#3
h = array.each_with_object({}) { |e,h|
h[e] = (h[e] || 0) + 1 } #=> {3=>2, 1=>1, 4=>2, 5=>1}
max_cnt = h.values.max #=> 2
h.select { |_,cnt| cnt == max_cnt }.keys
#=> [3, 4]
#4
a = array.group_by { |e| e } #=> {3=>[3, 3], 1=>[1], 4=>[4, 4], 5=>[5]}
.map {|e,ees| [e,ees.size]}
#=> [[3, 2], [1, 1], [4, 2], [5, 1]]
max = a.max_by(&:last) #=> [3, 2]
.last #=> 2
a.select {|_,cnt| cnt == max}.map(&:first)
#=> [3, 4]
In your approach, you have first initialized a hash containing keys taken from the unique values of the array, with the associated values all set to zero. For example, the array [1,2,2,3] would create the hash {1=>0, 2=>0, 3=>0}.
After this, you plan to count the instances of each of the values in the array by incrementing the value for the associated key in the hash by one for each instance. So, after finding the number 1 in the array, the hash would look like so: {1=>1, 2=>0, 3=>0}. You clearly need to do this for each value in the array, so given your approach and current level of understanding, I would suggest looping through the array again:
array.each do |x|
myhash[x] += 1
end
As you can see, we don't need to check that myhash[k] == array[x] since we have already created a key:value pair for each number in the array.
However, while this approach will work, it's not very efficient: we're having to loop through the array twice. The first time to initialize all the key:value pairs to some default (zero, in this case), and the second to count the frequencies of each number.
Since the default value for each key will be zero, we can remove the need to initialize the defaults by using a different hash constructor. myhash = {} will return nil if we access a key that doesn't exist, but myhash = Hash.new(0) will return 0 if we access a non-existent key (note that you could provide any other value or variable, if required).
By providing a default value of zero, we can get rid of the first loop entirely. When the second loop finds a key that doesn't exist, it will use the default provided and automatically initialize it.
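A quick way to convince yourself that the two approaches agree is to run both on the same input. This is only an illustrative sketch; count_two_loops and count_one_loop are names invented for the comparison.

```ruby
# Two passes: initialize every key to 0, then count.
def count_two_loops(array)
  counts = {}
  array.each { |x| counts[x] = 0 }
  array.each { |x| counts[x] += 1 }
  counts
end

# One pass: counts[x] += 1 is counts[x] = counts[x] + 1,
# and the read returns the default 0 for a key that is not there yet.
def count_one_loop(array)
  counts = Hash.new(0)
  array.each { |x| counts[x] += 1 }
  counts
end

sample = [1, 2, 2, 3]
count_one_loop(sample) == {1=>1, 2=>2, 3=>1}         #=> true
count_two_loops(sample) == count_one_loop(sample)    #=> true
```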
def mode(array)
array.group_by{ |e| e }.group_by{ |k, v| v.size }.max.pop.map{ |e| e.shift }
end
Using the simple_stats gem:
test = [1, 2, 3, 3, 3, 4, 5, 6, 6, 6]
test.modes #=> [3, 6]
If it is an unsorted array, we can first sort it (Array#sort! sorts in ascending order, in place), although the counting below does not depend on the order:
array.sort!
Then create a hash with a default value of 0, using each element of the array as a key and its number of occurrences as the value:
hash = Hash.new(0)
array.each {|i| hash[i] += 1 }
The mode is then the key of the first pair when the hash is sorted in descending order of value (number of occurrences). Note that this returns only one mode, even if several elements are tied:
mode = hash.sort_by{|key, value| -value}.first[0]

Ruby transpose table, array to string, regex

In Ruby I would have this array of arrays:
[[1,1,1,0,0],[1,1,1,0,0],[0,0,0,1,1]]
which translates into this matrix or table (no headers):
11100
11100
00011
What I want to do is to take every element of each array in the array to transpose the array, for instance, in the above table/array I would have this output as an array of arrays:
[[1,1,0],[1,1,0],[1,1,0],[0,0,1],[0,0,1]]
or this table
110
110
110
001
001
Finally, once the above is accomplished, I would like to convert every array in the array to a string which would exclude any values that are not consecutive 1s, for instance, if I convert the array [1,0,1,1,1,0,1] to a string where the non consecutive 1s are excluded I should get something like this: 111. Note that the first, second, sixth and seventh element are excluded because they are not consecutive 1s.
For the first part, all you need is Array#transpose.
array.transpose
#=> [[1,1,0],[1,1,0],[1,1,0],[0,0,1],[0,0,1]]
then you can do the following
.map {|arr| arr.join.scan(/11+/)}
to extract the runs of consecutive ones. The join converts each subarray to a string, then scan returns every run of two or more consecutive 1s.
Altogether:
array.transpose.map {|arr| arr.join.scan(/11+/)}
#=> [["11"], ["11"], ["11"], [], []]
If you want to remove the empty arrays, @Doorknob notes that you can append a reject:
array.transpose.map {|arr| arr.join.scan(/11+/)}.reject(&:empty?)
#=> [["11"], ["11"], ["11"]]
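The question asked for one string per row rather than an array of matches. A minimal sketch of that last step follows, assuming that when a row contains several runs they should simply be concatenated (the question does not say). The method names are hypothetical.

```ruby
# Keep only runs of two or more consecutive 1s and glue them into one string.
def consecutive_ones(row)
  row.join.scan(/11+/).join
end

# Transpose the matrix, then reduce every resulting row to its runs of 1s.
def transposed_runs(matrix)
  matrix.transpose.map { |row| consecutive_ones(row) }
end

consecutive_ones([1, 0, 1, 1, 1, 0, 1])
#=> "111"
transposed_runs([[1,1,1,0,0],[1,1,1,0,0],[0,0,0,1,1]])
#=> ["11", "11", "11", "", ""]
```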
You could also use Enumerable#chunk:
Code
array.transpose
.map { |a|
a.chunk { |e| e }
.select { |f,a| f == 1 && a.size > 1 }
.map { |_,a| a.join } }
Example
array = [[1,1,1,0,0],[1,1,0,1,0],[0,0,1,1,1],[1,1,0,1,1],[1,0,1,1,1]]
#=> [["11", "11"], ["11"], [], ["1111"], ["111"]]
One could eliminate the empty arrays, if desired.
Explanation
For array above,
a = array.transpose
#=> [[1, 1, 0, 1, 1],
# [1, 1, 0, 1, 0],
# [1, 0, 1, 0, 1],
# [0, 1, 1, 1, 1],
# [0, 0, 1, 1, 1]]
a.map iterates over the elements (rows) of a. Consider the first element:
b = a.first
#=> [1, 1, 0, 1, 1]
c = b.chunk { |e| e }
#=> #<Enumerator: #<Enumerator::Generator:0x000001020495e0>:each>
To view the contents of this enumerator, add .to_a
b.chunk { |e| e }.to_a
#=> [[1, [1, 1]], [0, [0]], [1, [1, 1]]]
d = c.select { |f,a| f == 1 && a.size > 1 }
#=> [[1, [1, 1]], [1, [1, 1]]]
d.map { |_,a| a.join }
#=> ["11", "11"]

Returning all maximum or minimum values that can be multiple

Enumerable#max_by and Enumerable#min_by return one of the relevant elements (presumably the first one) when there are multiple max/min elements in the receiver. For example, the following:
[1, 2, 3, 5].max_by{|e| e % 3}
returns only 2 (or only 5).
Instead, I want to return all max/min elements and in an array. In the example above, it would be [2, 5] (or [5, 2]). What is the best way to get this?
arr = [1, 2, 3, 5]
arr.group_by{|a| a % 3} # => {1=>[1], 2=>[2, 5], 0=>[3]}
arr.group_by{|a| a % 3}.max.last # => [2, 5]
arr=[1, 2, 3, 5, 7, 8]
mods=arr.map{|e| e%3}
find max
max=mods.max
indices = []
mods.each.with_index{|m, i| indices << i if m.eql?(max)}
arr.select.with_index{|a,i| indices.include?(i)}
find min
min = mods.min
indices = []
mods.each.with_index{|m, i| indices << i if m.eql?(min)}
arr.select.with_index{|a,i| indices.include?(i)}
Sorry for the clumsy code; I will try to shorten it.
The answer by @Sergio Tulentsev is the best and most efficient one; I found things to learn there. +1
This is the hash equivalent of @Sergio's use of group_by.
arr = [1, 2, 3, 5]
arr.each_with_object(Hash.new { |h,k| h[k] = [] }) { |e,h| h[e%3] << e }.max.last
#=> [2, 5]
The steps:
h = arr.each_with_object(Hash.new { |h,k| h[k] = [] }) { |e,h| h[e%3] << e }
#=> {1=>[1], 2=>[2, 5], 0=>[3]}
a = h.max
#=> [2, [2, 5]]
a.last
#=> [2, 5]
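The group_by idea generalizes to a pair of small helpers that mirror max_by and min_by but return every tied element. This is a sketch; all_max_by and all_min_by are names chosen for this example and are not part of Ruby's core library.

```ruby
# Return every element whose block value is the maximum.
def all_max_by(enum)
  groups = enum.group_by { |e| yield e }
  return [] if groups.empty?
  groups.max_by { |key, _members| key }.last
end

# Return every element whose block value is the minimum.
def all_min_by(enum)
  groups = enum.group_by { |e| yield e }
  return [] if groups.empty?
  groups.min_by { |key, _members| key }.last
end

all_max_by([1, 2, 3, 5]) { |e| e % 3 }  #=> [2, 5]
all_min_by([1, 2, 3, 5]) { |e| e % 3 }  #=> [3]
```

Comparing only the keys with max_by and min_by avoids comparing the grouped arrays themselves, which a plain max on the hash would do if two pairs ever needed a tie-break.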

How to remember swapped array elements in another array?

This is driving me crazy! I've been trying to write a Ruby method to find all permutations, to solve Project Euler's problem 24. When I swap the elements of an array, they are swapped properly. But when I try to STORE this swapped array in a DIFFERENT array, this new array only remembers the latest copy of my swapped array! It won't remember the older version.
When I print out a during the loop, it shows all permutations properly. But when I print out perms (which I use to store all different permutations of a), it only shows 1 version of a repeated several times. How do I fix this?
a = [0, 1, 2, 3]
perms = []
p "a = #{a}" # output: "a = [0, 1, 2, 3]"
perms << a # add a to perms array
p "perms = #{perms}" # output: "perms = [[0, 1, 2, 3]]"
a[0], a[1] = a[1], a[0] # swap 1st 2 elements of a
p "a = #{a}" # output: "a = [1, 0, 2, 3]"
perms << a # add a to perms array
p "perms = #{perms}" # "perms = [[1, 0, 2, 3], [1, 0, 2, 3]]"
a[1], a[2] = a[2], a[1] # swap 2nd 2 elements of a
p "a = #{a}" # "a = [1, 2, 0, 3]"
perms << a # add a to perms array
p "perms = #{perms}" # "perms = [[1, 2, 0, 3], [1, 2, 0, 3], [1, 2, 0, 3]]"
Thanks to Sawa below, both "dup" and "clone" methods solved my problem! Why doesn't my original way work? When would I use "dup" vs. "clone"? Please give me some code examples.
a[0], a[1] = a[1], a[0] # swap 1st 2 elements of a
p "a = #{a}" # output: "a = [1, 0, 2, 3]"
b = a.dup # or a.clone
perms << b
p "perms = #{perms}" # "perms = [[0, 1, 2, 3], [1, 0, 2, 3]]" *** it remembers!
a[1], a[2] = a[2], a[1] # swap 2nd 2 elements of a
p "a = #{a}" # "a = [1, 2, 0, 3]"
b = a.dup # or a.clone
perms << b
p "perms = #{perms}" # "perms = [[0, 1, 2, 3], [1, 0, 2, 3], [1, 2, 0, 3]]"
Variables in Ruby (with some exceptions, such as variables bound to integers) contain references to objects, not values. Here's an example from running "irb":
1.9.3p374 :021 > str1="hi"
=> "hi"
1.9.3p374 :022 > str2=str1
=> "hi"
1.9.3p374 :023 > str1.replace("world")
=> "world"
1.9.3p374 :024 > str2
=> "world"
You'll notice that once I replace the value for str1, str2's "value" changes as well. That's because it contains a reference to the same object that str1 refers to. I know one difference between dup and clone has to do with the "freeze" method. If I had called str1.freeze, then it would prevent the object str1 references from being modified, e.g.,
1.9.3p374 :055 > str1.freeze
=> "hi"
1.9.3p374 :056 > str1[0]="b"
RuntimeError: can't modify frozen String
from (irb):56:in `[]='
from (irb):56
from /.rvm/rubies/ruby-1.9.3-p374/bin/irb:13:in `<main>'
"Dup"-ing a frozen object doesn't create a frozen object whereas cloning does.
EDIT: just a slight update....When assigning an object on the right to a variable on the left (e.g., str = Object.new), the variable receives an object reference. When assigning one variable to another, the left-hand side variable receives a copy of the reference that the variable on the right contains. In either case, you are still storing object references in the left-hand side variable.
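The frozen-state difference between dup and clone can be observed directly, along with a second difference: clone also copies an object's singleton methods, while dup does not. The sketch below uses a hypothetical helper, copy_states, just to bundle the checks.

```ruby
def copy_states
  frozen = String.new("hi").freeze

  with_singleton = Object.new
  def with_singleton.greet
    "hello"
  end

  {
    dup_frozen:            frozen.dup.frozen?,                       # dup drops the frozen state
    clone_frozen:          frozen.clone.frozen?,                     # clone keeps it
    dup_keeps_singleton:   with_singleton.dup.respond_to?(:greet),   # dup drops singleton methods
    clone_keeps_singleton: with_singleton.clone.respond_to?(:greet)  # clone keeps them
  }
end

copy_states == {
  dup_frozen: false, clone_frozen: true,
  dup_keeps_singleton: false, clone_keeps_singleton: true
}  #=> true
```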
Your original didn't work because you kept modifying the same array instance a.
Take a dup of the original array each time before you modify it into a different array. Or, create a new instance of Array by not relying on a destructive method.
a = original_array
b = a.dup
... # do some modifications to `b`
perms << b
c = a.dup
... # do some modifications to `c`
perms << c
...
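To make the aliasing concrete, here is a small sketch that runs the first swap from the question both ways. The method names are invented for the comparison. Note that dup makes a shallow copy, which is sufficient here because the elements are integers.

```ruby
# Stores the same Array object twice, so both entries change together.
def collect_without_dup
  a = [0, 1, 2, 3]
  perms = []
  perms << a
  a[0], a[1] = a[1], a[0]
  perms << a
  perms
end

# Stores an independent copy each time, so earlier entries are preserved.
def collect_with_dup
  a = [0, 1, 2, 3]
  perms = []
  perms << a.dup
  a[0], a[1] = a[1], a[0]
  perms << a.dup
  perms
end

collect_without_dup  #=> [[1, 0, 2, 3], [1, 0, 2, 3]]
collect_with_dup     #=> [[0, 1, 2, 3], [1, 0, 2, 3]]
collect_without_dup.map(&:object_id).uniq.size  #=> 1
```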
If you don't like reinventing the wheel, you can use the facets gem.
gem install facets
https://github.com/rubyworks/facets/blob/d96ec0d700d1d7180ccbb5452e0a926386ec0b32/lib/backport/facets/array/permutation.rb
require 'facets'
[1, 2, 3].permutation.to_a
#=> [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
