Consolidate nested arrays and erase the subarrays that have been consolidated? - ruby

I'm trying to take a bunch of number-word pairs and group the words according to common numbers. I can match the numbers, merge the subarrays that share the number, and erase the first of those subarrays. But when I try to delete the second, I get this error:
"in `block in <main>': undefined method `[]' for nil:NilClass (NoMethodError)"
The guilty line -- ary.delete_at(i+1) -- has been commented out. Secondary problem: MRBTree is not taking the nested arrays as input...
ary = [[2.28, "cat"], [2.28, "bat"], [2.327, "bear"], [2.68, "ant"], [2.68, "anu"]]
i = 0
for i in 0 ... ary.size - 1
if ary[i][0] == ary[i+1][0]
b = (ary[i]+ary[i+1]).uniq
ary.delete_at(i)
# ary.delete_at(i+1)
c = [b.first], b.pop(b.length - 1)
h = Hash[*c]
ary.push(*h)
# mrbtree = MultiRBTree[c]
end
end
puts ary.inspect
output:
# => [
# => [2.28, "bat"],
# => [2.327, "bear"],
# => [2.68, "anu"],
# => [
# => [2.28], ["cat", "bat"]
# => ],
# => [
# => [2.68], ["ant", "anu"]
# => ]
# => ]
Any help appreciated!

Your attempt fails because you modify the array (which changes ary.size) inside the loop. The loop's end condition is not adjusted automatically: the range 0 ... ary.size - 1 is evaluated once, before the first iteration, so later iterations index past the end of the shortened array and get nil.
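A minimal sketch of the effect, using a made-up four-element array rather than the data from the question (the names ary and visited are only for illustration): the range is built from the original size, so the loop keeps counting after the array has shrunk.

```ruby
# Hypothetical data, chosen only to make the effect visible.
ary = [10, 20, 30, 40]
visited = []

# 0...ary.size is evaluated once, before the first iteration.
for i in 0...ary.size
  visited << ary[i]
  ary.delete_at(i) if i == 0 # shrink the array while looping
end

p visited # 20 is skipped and the last read runs past the end: [10, 30, 40, nil]
```

Calling [0] on that trailing nil is what raises the NoMethodError from the question.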
If your array is not too big, this will do:
p Hash[ary.group_by(&:first).map { | k, v | [k, v.map(&:last)] }]
# => {2.28=>["cat", "bat"], 2.327=>["bear"], 2.68=>["ant", "anu"]}
It works this way:
ary.group_by(&:first) # group the 2-elem arrays by the number, creating a hash
# like {2.28=>[[2.28, "cat"], [2.28, "bat"]], ...}
.map { | k, v | ... } # change the array of key-value pairs to
[k, v.map(&:last)] # pairs where the value-array contains just the strings
Hash[ ... ] # make the whole thing a hash again
Creating an intermediate array and converting it back to a hash adds some overhead. If this turns out to be an issue, something like this might be better:
h = Hash.new { | a, k | a[k] = [] } # a hash with [] as default value
p ary.inject(h) { | a, (k, v) | a[k] << v; a }
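The same single-pass idea can also be sketched with each_with_object, which does not need the accumulator to be returned from the block. This is only an alternative spelling under the same assumptions; the names grouped, number, word and acc are illustrative.

```ruby
ary = [[2.28, "cat"], [2.28, "bat"], [2.327, "bear"], [2.68, "ant"], [2.68, "anu"]]

# The default block stores a fresh array the first time a number is seen.
grouped = ary.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |(number, word), acc|
  acc[number] << word
end

p grouped
```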

It looks like after
ary.delete_at(i)
the array is one element shorter, so the element that used to be at i+1 now sits at index i. Deleting at i again is therefore better than deleting at i+1:
# ary.delete_at(i+1)
ary.delete_at(i)

Alternate version for converting to hash:
ary = [[2.28, "cat"], [2.28, "bat"], [2.327, "bear"], [2.68, "ant"], [2.68, "anu"]]
hsh = {}
ary.each {|pair| hsh[pair[0]].nil? ? hsh[pair[0]] = [pair[1]] : hsh[pair[0]] << pair[1]}
puts hsh.inspect # => {2.28 => ["cat", "bat"], 2.327 => ["bear"], 2.68 => ["ant", "anu"]}


Questions on implementing hashes in ruby

I'm new to Ruby and am solving a problem that involves hashes and keys. The problem asks me to implement a method, #pet_types, that accepts a hash as an argument. The hash uses people's names as keys, and the values are arrays of pet types that the person owns. My question is about using Hash#each to iterate over the hash and then over each pet inside the value arrays. Is there any difference between solving the problem with hash.each and with hash.sort.each?
I spent several hours coming up with different solutions and still can't figure out how the two approaches below differ.
I include my code in repl.it: https://repl.it/H0xp/6 or you can see below:
# Pet Types
# ------------------------------------------------------------------------------
# Implement a method, #pet_types, that accepts a hash as an argument. The hash uses people's
# names as keys, and the values are arrays of pet types that the person owns.
# Example input:
# {
# "yi" => ["dog", "cat"],
# "cai" => ["dog", "cat", "mouse"],
# "venus" => ["mouse", "pterodactyl", "chinchilla", "cat"]
# }
def pet_types(owners_hash)
results = Hash.new {|h, k| h[k] = [ ] }
owners_hash.sort.each { |k, v| v.each { |pet| results[pet] << k } }
results
end
puts "-------Pet Types-------"
owners_1 = {
"yi" => ["cat"]
}
output_1 = {
"cat" => ["yi"]
}
owners_2 = {
"yi" => ["cat", "dog"]
}
output_2 = {
"cat" => ["yi"],
"dog" => ["yi"]
}
owners_3 = {
"yi" => ["dog", "cat"],
"cai" => ["dog", "cat", "mouse"],
"venus" => ["mouse", "pterodactyl", "chinchilla", "cat"]
}
output_3 = {
"dog" => ["cai", "yi"],
"cat" => ["cai", "venus", "yi"],
"mouse" => ["cai", "venus"],
"pterodactyl" => ["venus"],
"chinchilla" => ["venus"]
}
# method 2
# The 2nd and 3rd method should return a hash that uses the pet types as keys and the values should
# be a list of the people that own that pet type. The names in the output hash should
# be sorted alphabetically
# switched_hash = Hash.new()
# owners_hash.each do |owner, pets_array|
# pets_array.each do |pet|
# select_owners = owners_hash.select { |owner, pets_array| owners_hash[owner].include?(pet) }
# switched_hash[pet] = select_owners.keys.sort
# end
# end
# method 3
#switched_hash
# pets = Hash.new {|h, k| h[k] = [ ] } # NOT the same as pets = Hash.new( Array.new ), which shares one array between all keys
# owners = owners_hash.keys.sort
# owners.each do |owner|
# owners_hash[owner].each do |pet|
# pets[pet] << owner
# end
# end
# pets
# Example output:
# output_3 = {
# "dog" => ["cai", "yi"],
# "cat" => ["cai", "venus", "yi"], ---> (sorted alphabetically!)
# "mouse" => ["cai", "venus"],
# "pterodactyl" => ["venus"],
# "chinchilla" => ["venus"]
# }
I first solved this problem with a hash data structure. Then I tried to rewrite it using pets_hash, and my final code is the following:
def pet_types(owners_hash)
pets_hash = Hash.new { |hash, key| hash[key] = [] } # the block receives the hash and the missing key
owners_hash.each do |owner, pets|
pets.each do |pet|
pets_hash[pet] += [owner]
end
end
pets_hash.values.each(&:sort!)
pets_hash
end
puts "-------Pet Types-------"
owners_1 = {
"yi" => ["cat"]
}
output_1 = {
"cat" => ["yi"]
}
owners_2 = {
"yi" => ["cat", "dog"]
}
output_2 = {
"cat" => ["yi"],
"dog" => ["yi"]
}
owners_3 = {
"yi" => ["dog", "cat"],
"cai" => ["dog", "cat", "mouse"],
"venus" => ["mouse", "pterodactyl", "chinchilla", "cat"]
}
output_3 = {
"dog" => ["cai", "yi"],
"cat" => ["cai", "venus", "yi"],
"mouse" => ["cai", "venus"],
"pterodactyl" => ["venus"],
"chinchilla" => ["venus"]
}
puts pet_types(owners_1) == output_1
puts pet_types(owners_2) == output_2
puts pet_types(owners_3) == output_3
Hash#sort has the same effect (at least for my basic test) as Hash#to_a followed by Array#sort.
hash = {b: 2, a: 1}
hash.to_a.sort # => [[:a, 1], [:b, 2]]
hash.sort # => the same
Now let's look at #each, both on Hash and Array.
When you provide two arguments to the block, that can handle both cases. For the hash, the first argument will be the key and the second will be the value. For the nested array, the values essentially get splatted out to the args:
[[:a, 1, 2], [:b, 3, 4]].each { |x, y, z| puts "#{x}-#{y}-#{z}" }
# => a-1-2
# => b-3-4
So basically, you should think of Hash#sort as a shortcut for Hash#to_a followed by Array#sort, and recognize that #each works the same on a hash as on a hash converted to an array (a nested array). The block you write is therefore identical for both approaches; the only difference is the iteration order. If you need to iterate in key order, use sort.
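A small sketch of that difference, using a cut-down two-owner hash (hypothetical data, not the test cases from the question): the block is the same, but the order in which owners are appended differs.

```ruby
# Hypothetical input: "yi" is inserted before "cai", so insertion order is not alphabetical.
owners = { "yi" => ["dog"], "cai" => ["dog"] }

by_each = Hash.new { |hash, key| hash[key] = [] }
owners.each { |owner, pets| pets.each { |pet| by_each[pet] << owner } }

by_sort = Hash.new { |hash, key| hash[key] = [] }
owners.sort.each { |owner, pets| pets.each { |pet| by_sort[pet] << owner } }

p by_each # owners in insertion order: "yi" before "cai"
p by_sort # owners in alphabetical order: "cai" before "yi"
```

Iterating with plain each and sorting each value array afterwards gives the same result as sort.each here.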

Initializing a hash with an empty array keyed to an array of strings - Ruby

I have:
people=["Bob","Fred","Sam"]
holidays = Hash.new
people.each do |person|
a=Array.new
holidays[person]=a
end
gifts = Hash.new
people.each do |person|
a=Array.new
gifts[person]=a
end
This feels clunky. I can't seem to figure out a more streamlined way with an initialization block or some such thing. Is there an idiomatic approach here?
Ideally, I'd like to keep an array like:
lists = ["holidays", "gifts", ...]
... and iterate through it to initialize each element in the lists array.
require 'date' # needed for Date.today below
people = %w|Bob Fred Sam|
data = %w|holidays gifts|
result = data.zip(data.map { people.zip(people.map { [] }).to_h }).to_h
result['holidays']['Bob'] << Date.today
#⇒ {
# "holidays" => {
# "Bob" => [
# [0] #<Date: 2016-11-04 ((2457697j,0s,0n),+0s,2299161j)>
# ],
# "Fred" => [],
# "Sam" => []
# },
# "gifts" => {
# "Bob" => [],
# "Fred" => [],
# "Sam" => []
# }
# }
More sophisticated example would be:
result = data.map do |d|
[d, Hash.new { |h, k| h[k] = [] if people.include?(k) }]
end.to_h
The latter produces “lazy initialized nested hashes”: it uses Hash.new with a block, so the empty array for a person is created only on first access, and only if that person is in people.
Play with it to see how it works.
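As a starting point for experimenting, here is a minimal sketch contrasting the block form with the default-object form Hash.new([]). The variable names are made up for the illustration.

```ruby
# Default object: every missing key returns the same array, and nothing is stored.
shared = Hash.new([])
shared["Bob"] << "Rome"
shared_fred = shared["Fred"] # ["Rome"], because it is the one shared array
shared_keys = shared.keys    # [], the hash itself is still empty

# Default block: each missing key gets its own array, stored on first access.
lazy = Hash.new { |hash, key| hash[key] = [] }
lazy["Bob"] << "Rome"
lazy_fred = lazy["Fred"]     # [], a separate array
lazy_keys = lazy.keys        # ["Bob", "Fred"]

p shared_fred, shared_keys, lazy_fred, lazy_keys
```

Note that merely reading lazy["Fred"] stores the key, which is why the example above guards the assignment with people.include?(k).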
A common way of doing that would be to use Enumerable#each_with_object.
holidays = people.each_with_object({}) { |p,h| h[p] = [] }
#=> {"Bob"=>[], "Fred"=>[], "Sam"=>[]}
gifts is the same.
If you only want a number of such hashes, then the following should suffice:
count_of_hashes = 4 # lists.count; 4 is chosen randomly by throwing a fair die
people = ["Bob", "Fred", "Sam"]
lists = count_of_hashes.times.map do
people.map {|person| [person, []]}.to_h
end
This code also ensures the arrays and the hashes all occupy their own memory, as can be verified by the following code:
holidays, gifts, *rest = lists
holidays["Bob"] << "Rome"
And checking the values of all the other hashes:
lists
# => [
#   {"Bob"=>["Rome"], "Fred"=>[], "Sam"=>[]},
#   {"Bob"=>[], "Fred"=>[], "Sam"=>[]},
#   {"Bob"=>[], "Fred"=>[], "Sam"=>[]},
#   {"Bob"=>[], "Fred"=>[], "Sam"=>[]}
# ]
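For contrast, a short sketch of what happens when the same hash object is reused for every slot. The names template, aliased and separate are illustrative, and Array.new stands in here for the times.map call above.

```ruby
people = ["Bob", "Fred", "Sam"]

# One hash object referenced twice: a change through one slot shows up in the other.
template = people.map { |person| [person, []] }.to_h
aliased = Array.new(2, template)
aliased[0]["Bob"] << "Rome"

# The block runs once per slot, so each slot gets its own hash and its own arrays.
separate = Array.new(2) { people.map { |person| [person, []] }.to_h }
separate[0]["Bob"] << "Rome"

p aliased[1]["Bob"]  # ["Rome"]
p separate[1]["Bob"] # []
```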

compare array of hashes and print expected & actual results

I have two arrays of hashes:
actual = [{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
{"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}]
expected = [{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
{"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}]
I need to compare these two arrays and find the hashes for which column_data_type differs.
To compare them we can directly use:
diff = actual - expected
This will print the output as:
{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"}
What I want instead is a result that shows both the actual and the expected data type for each mismatched column_name from the two arrays of hashes, something like:
{"column_name"=>"NONINTERESTEXPENSE", "expected_column_data_type"=>"NUMBER", "actual_column_data_type" => "VARCHAR"}
{"column_name"=>"TRANSACTIONDATE", "expected_column_data_type"=>"NUMBER","actual_column_data_type" => "TIMESTAMP" }
This will work irrespective of the order of the hashes in your arrays.
diff = []
expected.each do |elem|
column_name = elem['column_name']
column_type = elem['column_data_type']
match = actual.detect { |elem2| elem2['column_name'] == column_name }
if column_type != match['column_data_type']
diff << { 'column_name' => column_name,
'expected_column_data_type' => column_type,
'actual_column_data_type' => match['column_data_type'] }
end
end
p diff
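The loop above calls detect once per expected row and assumes every column exists in actual. A hedged variation is sketched below: it builds a lookup hash first and reports a missing column with a nil actual type. The shortened input (UPDATEDATE removed from actual) and the name actual_types are assumptions made for the illustration.

```ruby
# Assumed input: same shape as in the question, but UPDATEDATE is missing from actual.
actual = [
  { "column_name" => "NONINTERESTINCOME",  "column_data_type" => "NUMBER" },
  { "column_name" => "NONINTERESTEXPENSE", "column_data_type" => "VARCHAR" },
  { "column_name" => "TRANSACTIONDATE",    "column_data_type" => "TIMESTAMP" }
]
expected = [
  { "column_name" => "NONINTERESTINCOME",  "column_data_type" => "NUMBER" },
  { "column_name" => "NONINTERESTEXPENSE", "column_data_type" => "NUMBER" },
  { "column_name" => "TRANSACTIONDATE",    "column_data_type" => "NUMBER" },
  { "column_name" => "UPDATEDATE",         "column_data_type" => "TIMESTAMP" }
]

# Index actual by column name so each lookup is a single hash access.
actual_types = actual.map { |row| [row["column_name"], row["column_data_type"]] }.to_h

diff = expected.each_with_object([]) do |row, result|
  name = row["column_name"]
  actual_type = actual_types[name] # nil when the column is missing from actual
  next if actual_type == row["column_data_type"]

  result << { "column_name" => name,
              "expected_column_data_type" => row["column_data_type"],
              "actual_column_data_type" => actual_type }
end

p diff
```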
[actual, expected].map { |a| a.map(&:dup).map(&:values) }
.map(&Hash.method(:[]))
.reduce do |actual, expected|
actual.merge(expected) do |k, o, n|
o == n ? nil : {name: k, actual: o, expected: n}
end
end.values.compact
#⇒ [
# [0] {
# :name => "NONINTERESTEXPENSE",
# :actual => "VARCHAR",
# :expected => "NUMBER"
# },
# [1] {
# :name => "TRANSACTIONDATE",
# :actual => "TIMESTAMP",
# :expected => "NUMBER"
# }
# ]
The method above is easily extended to merge N arrays (use reduce.with_index and merge with the key "value_from_#{idx}").
(expected - actual).
concat(actual - expected).
group_by { |column| column['column_name'] }.
map do |name, (expected, actual)|
{
'column_name' => name,
'expected_column_data_type' => expected['column_data_type'],
'actual_column_data_type' => actual['column_data_type'],
}
end
What about this?
def select(hashes_array, column_name)
hashes_array.select { |h| h["column_name"] == column_name }.first
end
diff = (expected - actual).map do |h|
{
"column_name" => h["column_name"],
"expected_column_data_type" => select(expected, h["column_name"])["column_data_type"],
"actual_column_data_type" => select(actual, h["column_name"])["column_data_type"],
}
end
PS: surely this code can be improved to look more elegant
Code
def convert(actual, expected)
hashify(actual-expected, "actual_data_type").
merge(hashify(expected-actual, "expected_data_type")) { |_,a,e| a.merge(e) }.values
end
def hashify(arr, key)
arr.each_with_object({}) { |g,h| h[g["column_name"]] =
{ "column_name"=>g["column_name"], key=>g["column_data_type"] } }
end
Example
actual = [
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"},
{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
{"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
]
expected = [
{"column_name"=>"NONINTERESTINCOME", "column_data_type"=>"NUMBER"},
{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"},
{"column_name"=>"UPDATEDATE", "column_data_type"=>"TIMESTAMP"}
]
convert(actual, expected)
#=> [{"column_name"=>"TRANSACTIONDATE",
# "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
# {"column_name"=>"NONINTERESTEXPENSE",
# "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}]
Explanation
For the example above the steps are as follows.
First hashify actual and expected.
f = actual-expected
#=> [{"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"TIMESTAMP"},
# {"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"VARCHAR"}]
g = hashify(f, "actual_data_type")
#=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
# "actual_data_type"=>"TIMESTAMP"},
# "NONINTERESTEXPENSE"=>{ "column_name"=>"NONINTERESTEXPENSE",
# "actual_data_type"=>"VARCHAR"}}
h = expected-actual
#=> [{"column_name"=>"NONINTERESTEXPENSE", "column_data_type"=>"NUMBER"},
# {"column_name"=>"TRANSACTIONDATE", "column_data_type"=>"NUMBER"}]
i = hashify(h, "expected_data_type")
#=> {"NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
# "expected_data_type"=>"NUMBER"},
# "TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
# "expected_data_type"=>"NUMBER"}}
Next merge g and i using the form of Hash#merge that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for the definitions of the three block variables (the first of which, the common key, I've represented by an underscore to signify that it is not used in the block calculation).
j = g.merge(i) { |_,a,e| a.merge(e) }
#=> {"TRANSACTIONDATE"=>{"column_name"=>"TRANSACTIONDATE",
# "actual_data_type"=>"TIMESTAMP", "expected_data_type"=>"NUMBER"},
# "NONINTERESTEXPENSE"=>{"column_name"=>"NONINTERESTEXPENSE",
# "actual_data_type"=>"VARCHAR", "expected_data_type"=>"NUMBER"}}
Lastly, drop the keys.
k = j.values
#=> [{"column_name"=>"TRANSACTIONDATE", "actual_data_type"=>"TIMESTAMP",
# "expected_data_type"=>"NUMBER"},
# {"column_name"=>"NONINTERESTEXPENSE", "actual_data_type"=>"VARCHAR",
# "expected_data_type"=>"NUMBER"}]
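The block form of Hash#merge used in the explanation can be seen in isolation in the sketch below. The two small hashes are made up for the illustration; the point is that the block is called only for keys present in both hashes.

```ruby
# Hypothetical inputs: "x" is in both hashes, "y" and "z" are in only one.
left  = { "x" => { "actual" => 1 },   "y" => { "actual" => 2 } }
right = { "x" => { "expected" => 3 }, "z" => { "expected" => 4 } }

# The block receives the shared key and both values; its result becomes the merged value.
merged = left.merge(right) { |_key, a, e| a.merge(e) }

p merged
```

Keys found in only one of the hashes ("y" and "z" here) are copied over unchanged.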

Am I able to use each_cons(2).any? with hash values?

I have an array of integers and want to check whether any value is less than the previous value. I'm using:
array = [1,2,3,1,5,7]
con = array.each_cons(2).any? { |x,y| y < x }
p con
This returns true, as expected, since 1 is less than 3.
How would I go about checking if a hash value is less than the previous hash value?
hash = {"0"=>"1", "1"=>"2","2"=>"3","4"=>"1","5"=>"5","6"=>"7"}
I'm still learning Ruby so help would be greatly appreciated.
If you want to find out whether all elements meet the criteria, starting with an array:
array = [1,2,3,1,5,7]
con = array.each_cons(2).all? { |x,y| x < y }
con # => false
Changing the array so the elements are all less than the next:
array = [1,2,3,4,5,7]
con = array.each_cons(2).all? { |x,y| x < y }
con # => true
A lot of the methods behave similarly for arrays and hashes, so the basic code is the same; only how the elements are passed into the block changes. I reduced the hash to the bare minimum to demonstrate the code:
hash = {"3"=>"3","4"=>"1","5"=>"5"}
con = hash.each_cons(2).all? { |(_, x_value), (_, y_value) | x_value < y_value }
con # => false
Changing the hash to be incrementing:
hash = {"3"=>"3","4"=>"4","5"=>"5"}
con = hash.each_cons(2).all? { |(_, x_value), (_, y_value) | x_value < y_value }
con # => true
Using any? would work the same way. If you want to know whether any are >=:
hash = {"3"=>"3","4"=>"1","5"=>"5"}
con = hash.each_cons(2).any? { |(_, x_value), (_, y_value) | y_value >= x_value }
con # => true
Or:
hash = {"3"=>"3","4"=>"4","5"=>"5"}
con = hash.each_cons(2).any? { |(_, x_value), (_, y_value) | x_value >= y_value }
con # => false
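One caveat with the hashes above: the values are strings, so < and >= compare them character by character. That happens to agree with numeric order for single digits, but not in general. A small sketch with made-up values shows the difference once a value has two digits.

```ruby
# Hypothetical hash whose values are numeric strings of different lengths.
hash = { "a" => "9", "b" => "10" }

# String comparison: "10" sorts before "9", so this reports a decrease.
as_strings = hash.each_cons(2).any? { |(_, x_value), (_, y_value)| y_value < x_value }

# Integer comparison: 10 is not less than 9.
as_integers = hash.each_cons(2).any? { |(_, x_value), (_, y_value)| y_value.to_i < x_value.to_i }

p as_strings  # true
p as_integers # false
```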
I'm creating the hash by
stripped = Hash[x.scan(/(\w+): (\w+)/).map { |(first, second)| [first.to_i, second.to_i] }]
I'm then removing empty arrays by
new = stripped.delete_if { |elem| elem.flatten.empty? }
This isn't a good way to use scan. Consider these:
'1: 23'.scan(/\d+/) # => ["1", "23"]
'1: 23'.scan(/(\d+)/) # => [["1"], ["23"]]
'1: 23'.scan(/(\d+): (\d+)/) # => [["1", "23"]]
In the first, scan returns an array of values. In the second, it returns an array of arrays, where each sub-array is a single element. In the third it returns an array of arrays, where each sub-array contains both elements scanned. You are using the third form, which unnecessarily complicates everything done after that.
Don't complicate the pattern passed to scan, and, instead, rely on its ability to return multiple matching elements as it looks through the string and to return an array of those:
'1: 23'.scan(/\d+/) # => ["1", "23"]
Build on top of that:
'1: 23'.scan(/\d+/).map(&:to_i) # => [1, 23]
Hash[*'1: 23'.scan(/\d+/).map(&:to_i)] # => {1=>23}
Notice the leading * inside Hash[]. That "splat" tells Ruby to burst or explode the array into its components. Here's what happens if it's not there:
Hash['1: 23'.scan(/\d+/).map(&:to_i)] # => {} # !> this causes ArgumentError in the next release
And, finally, if you don't need the hash elements to be integers, which contradicts the hash you gave in your question, just remove .map(&:to_i) from the examples above.
First, isolate the values from the hash:
values = hash.map { |key, value| value.to_i } #=> [1, 2, 3, 1, 5, 7]
or:
values = hash.values.map(&:to_i) #=> to_i is a shortcut for:
values = hash.values.map { |value| value.to_i }
and then do the same thing you did for your array example.
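Put together for the hash from the question, that could look like the following sketch (the variable name decreasing is illustrative):

```ruby
hash = { "0" => "1", "1" => "2", "2" => "3", "4" => "1", "5" => "5", "6" => "7" }

# Convert the string values to integers, keeping the hash's insertion order.
values = hash.values.map(&:to_i)

# True as soon as one value is smaller than the one before it (here 1 after 3).
decreasing = values.each_cons(2).any? { |x, y| y < x }

p values     # [1, 2, 3, 1, 5, 7]
p decreasing # true
```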

How to change all the keys of a hash by a new set of given keys

How do I change all the keys of a hash by a new set of given keys?
Is there a way to do that elegantly?
Assuming you have a Hash which maps old keys to new keys, you could do something like
hsh.transform_keys(&key_map.method(:[]))
Ruby 2.5 has Hash#transform_keys! method. Example using a map of keys
h = {a: 1, b: 2, c: 3}
key_map = {a: 'A', b: 'B', c: 'C'}
h.transform_keys! {|k| key_map[k]}
# => {"A"=>1, "B"=>2, "C"=>3}
You can also use the Symbol#to_proc shortcut with transform_keys, e.g.:
h.transform_keys!(&:upcase)
# => {"A"=>1, "B"=>2, "C"=>3}
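If the map does not cover every key, key_map[k] returns nil for the uncovered ones. A possible guard, sketched here with a deliberately incomplete map, is Hash#fetch with the old key as the fallback. This assumes Ruby 2.5 or later for transform_keys.

```ruby
h = { a: 1, b: 2, c: 3 }
key_map = { a: 'A', b: 'B' } # deliberately has no entry for :c

# Plain lookup: the unmapped key collapses to nil.
lossy = h.transform_keys { |k| key_map[k] }

# fetch with a default: unmapped keys keep their old name.
safe = h.transform_keys { |k| key_map.fetch(k, k) }

p lossy # "A" and "B" are renamed, :c has become nil
p safe  # "A" and "B" are renamed, :c is kept
```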
i assume you want to change the hash keys without changing the values:
hash = {
"nr"=>"123",
"name"=>"Herrmann Hofreiter",
"pferd"=>"010 000 777",
"land"=>"high land"
}
header = ["aa", "bb", "cc", "dd"]
new_hash = header.zip(hash.values).to_h
Result:
{
"aa"=>"123",
"bb"=>"Herrmann Hofreiter",
"cc"=>"010 000 777",
"dd"=>"high land"
}
Another way to do it is:
hash = {
'foo' => 1,
'bar' => 2
}
new_keys = {
'foo' => 'foozle',
'bar' => 'barzle'
}
new_keys.values.zip(hash.values_at(*new_keys.keys)).to_h
# => {"foozle"=>1, "barzle"=>2}
Breaking it down:
new_keys
.values # => ["foozle", "barzle"]
.zip(
hash.values_at(*new_keys.keys) # => [1, 2]
) # => [["foozle", 1], ["barzle", 2]]
.to_h
# => {"foozle"=>1, "barzle"=>2}
It's benchmark time...
While I like the simplicity of Jörn's answer, I wasn't sure it was as fast as it should be; then I saw selvamani's comment:
require 'fruity'
HASH = {
'foo' => 1,
'bar' => 2
}
NEW_KEYS = {
'foo' => 'foozle',
'bar' => 'barzle'
}
compare do
mittag { HASH.dup.map {|k, v| [NEW_KEYS[k], v] }.to_h }
ttm { h = HASH.dup; NEW_KEYS.values.zip(h.values_at(*NEW_KEYS.keys)).to_h }
selvamani { h = HASH.dup; h.keys.each { |key| h[NEW_KEYS[key]] = h.delete(key)}; h }
end
# >> Running each test 2048 times. Test will take about 1 second.
# >> selvamani is faster than ttm by 39.99999999999999% ± 10.0%
# >> ttm is faster than mittag by 10.000000000000009% ± 10.0%
These run very close together speed-wise, so any will do, but 39% pays off over time, so consider that. A couple of answers were not included because they have potential flaws that could return bad results.
The exact solution would depend on the format that you have the new keys in (or if you can derive the new key from the old key.)
Assuming you have a hash h whose keys you want to modify and a hash new_keys that maps the current keys to the new keys you could do:
h.keys.each do |key|
h[new_keys[key]] = h[key] # add entry for new key
h.delete(key) # remove old key
end
If you also worry about performance, this is faster:
hsh.keys.each { |k| hsh[ key_map[k] ] = hsh.delete(k) if key_map[k] }
You don't create a new Hash and you rename only the necessary keys. That gives you better performance.
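A small run of that one-liner, with a made-up hash that has one key ('baz') missing from the map, shows both properties: the same Hash object is reused, and unmapped keys are left alone.

```ruby
# Hypothetical data: 'baz' has no entry in key_map.
hsh = { 'foo' => 1, 'bar' => 2, 'baz' => 3 }
key_map = { 'foo' => 'foozle', 'bar' => 'barzle' }
id_before = hsh.object_id

# Iterate over a snapshot of the keys so the hash can be changed safely inside the block.
hsh.keys.each { |k| hsh[key_map[k]] = hsh.delete(k) if key_map[k] }

p hsh # 'foo' and 'bar' are renamed, 'baz' is untouched
```

Renamed entries move to the end of the hash, since they are deleted and re-inserted.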
You can find more details in "How to elegantly rename all keys in a hash in Ruby?"
h = { 'foo'=>1, 'bar'=>2 }
key_map = { 'foo'=>'foozle', 'bar'=>'barzle' }
h.each_with_object({}) { |(k,v),g| g[key_map[k]]=v }
#=> {"foozle"=>1, "barzle"=>2}
or
h.reduce({}) { |g,(k,v)| g.merge(key_map[k]=>v) }
#=> {"foozle"=>1, "barzle"=>2}
