Lazy enumerator for nested array of hashes - ruby

Suppose I have an Array like this
data = [
{
key: val,
important_key_1: { # call this the big hash
key: val,
important_key_2: [
{ # call this the small hash
key: val,
},
{
key: val,
},
]
},
},
{
key: val,
important_key_1: {
key: val,
important_key_2: [
{
key: val,
},
{
key: val,
},
]
},
},
]
I want to create a lazy enumerator that returns the next small hash on each #next call and, once the first big hash is exhausted, moves on to the next big hash and does the same.
The easy way to return all the internal hashes that I want would be something like this:
data.map do |big_hash|
  big_hash[:important_key_1][:important_key_2]
end.flatten
Is there some way to do this, or do I need to implement my own logic?

This returns a lazy enumerator which iterates over all the small hashes :
def lazy_nested_hashes(data)
enum = Enumerator.new do |yielder|
data.each do |internal_data|
internal_data[:important_key_1][:important_key_2].each do |small_hash|
yielder << small_hash
end
end
end
enum.lazy
end
With your input data and a val definition :
@i = 0
def val
  @i += 1
end
It outputs :
puts lazy_nested_hashes(data).to_a.inspect
#=> [{:key=>3}, {:key=>4}, {:key=>7}, {:key=>8}]
puts lazy_nested_hashes(data).map { |x| x[:key] }.find { |k| k > 3 }
#=> 4
For the second example, the second big hash isn't considered at all (thanks to enum.lazy)
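As an alternative sketch (not the answer's method), the same laziness can be had from `Array#lazy` combined with `flat_map`. The literal values and the `visited` array below are only there to make it visible which big hashes were actually touched:

```ruby
# Sample data with literal values instead of the val helper (illustrative only).
data = [
  { key: 1, important_key_1: { key: 2, important_key_2: [{ key: 3 }, { key: 4 }] } },
  { key: 5, important_key_1: { key: 6, important_key_2: [{ key: 7 }, { key: 8 }] } }
]

# Records which big hashes the block has been run for.
visited = []

# Lazy#flat_map flattens each returned array one element at a time.
small_hashes = data.lazy.flat_map do |big_hash|
  visited << big_hash[:key]
  big_hash[:important_key_1][:important_key_2]
end

# find stops as soon as a match is produced, so the second big hash is never read.
first_match = small_hashes.find { |small_hash| small_hash[:key] > 3 }
```

Since `Enumerator::Lazy` is an `Enumerator`, calling `next` on `small_hashes` should work as well.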

Related

Convert object with array values into array of object

I do have this kind of params
params = { "people" =>
{
"fname" => ['john', 'megan'],
"lname" => ['doe', 'fox']
}
}
Wherein I loop through using this code
result = []
params["people"].each do |key, values|
values.each_with_index do |value, i|
result[i] = {}
result[i][key.to_sym] = value
end
end
The problem with my code is that it only keeps the last key and value:
[
{ lname: 'doe' },
{ lname: 'fox' }
]
I want to convert it into
[
{fname: 'john', lname: 'doe'},
{fname: 'megan', lname: 'fox'}
]
so that I can loop through them and save them to the database.
Your question has been answered but I'd like to mention an alternative calculation that does not employ indices:
keys, values = params["people"].to_a.transpose
#=> [["fname", "lname"], [["john", "megan"], ["doe", "fox"]]]
keys = keys.map(&:to_sym)
#=> [:fname, :lname]
values.transpose.map { |val| keys.zip(val).to_h }
#=> [{:fname=>"john", :lname=>"doe"},
# {:fname=>"megan", :lname=>"fox"}]
result[i] = {}
The problem is that you're doing this each loop iteration, which resets the value and deletes any existing keys you already put there. Instead, only set the value to {} if it doesn't already exist.
result[i] ||= {}
In your inner loop, you're resetting the i-th element to an empty hash:
result[i] = {}
So you only end up with the data from the last key-value-pair, i.e. lname.
Instead you can use this to only set it to an empty hash if it doesn't already exist:
result[i] ||= {}
So the first loop through, it gets set to {}, but after that, it just gets set to itself.
Alternatively, you can also use
result[i] = {} if !result[i]
which may or may not be more performant. I don't know.
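For reference, here is a minimal sketch of the original loop with the `||=` fix applied, using the same params as in the question:

```ruby
params = {
  "people" => {
    "fname" => ["john", "megan"],
    "lname" => ["doe", "fox"]
  }
}

result = []
params["people"].each do |key, values|
  values.each_with_index do |value, i|
    # Create the hash for row i only once, so earlier keys survive.
    result[i] ||= {}
    result[i][key.to_sym] = value
  end
end
```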

Find and replace specific hash and its values within array

What is the most efficient method to find a specific hash within an array and replace its values in place, so that the array is changed as well?
I've got this code so far, but in a real-world application with loads of data, this becomes the slowest part of application, which probably leaks memory, as unbounded memory grows constantly when I perform this operation on each websocket message.
array =
[
{ id: 1,
parameters: {
omg: "lol"
},
options: {
lol: "omg"
}
},
{ id: 2,
parameters: {
omg: "double lol"
},
options: {
lol: "double omg"
}
}
]
selection = array.select { |a| a[:id] == 1 }[0]
selection[:parameters][:omg] = "triple omg"
p array
# => [{:id=>1, :parameters=>{:omg=>"triple omg"}, :options=>{:lol=>"omg"}}, {:id=>2, :parameters=>{:omg=>"double lol"}, :options=>{:lol=>"double omg"}}]
This will do what you're after, looping through the records only once:
array.each { |hash| hash[:parameters][:omg] = "triple omg" if hash[:id] == 1 }
You could always expand the block to handle other conditions:
array.each do |hash|
hash[:parameters][:omg] = "triple omg" if hash[:id] == 1
hash[:parameters][:omg] = "quadruple omg" if hash[:id] == 2
# etc
end
And it will still iterate over the elements just once.
You might also be better off restructuring your data into a single hash. Generally speaking, looking up a key in a hash will be faster than searching an array, particularly if you've got a unique identifier as here. Something like:
{
1 => {
parameters: {
omg: "lol"
},
options: {
lol: "omg"
}
},
2 => {
parameters: {
omg: "double lol"
},
options: {
lol: "double omg"
}
}
}
This way, you could just call the following to achieve what you're after:
hash[1][:parameters][:omg] = "triple omg"
Hope that helps - let me know how you get on with it or if you have any questions.
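A small sketch of two related options, assuming the ids are unique: `find` stops at the first match instead of building an intermediate array like `select` does, and an index hash (the name `by_id` is just illustrative) can be built once and reused for every message. Both mutate the hashes that live in the original array:

```ruby
array = [
  { id: 1, parameters: { omg: "lol" }, options: { lol: "omg" } },
  { id: 2, parameters: { omg: "double lol" }, options: { lol: "double omg" } }
]

# find returns the first matching element (or nil) and stops scanning there.
selection = array.find { |hash| hash[:id] == 1 }
selection[:parameters][:omg] = "triple omg" if selection

# Build the index once; its values are the very same hash objects as in array.
by_id = array.each_with_object({}) { |hash, index| index[hash[:id]] = hash }
by_id[2][:parameters][:omg] = "quadruple omg"
```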

Convert Hash to OpenStruct recursively

Given I have this hash:
h = { a: 'a', b: 'b', c: { d: 'd', e: 'e'} }
And I convert to OpenStruct:
o = OpenStruct.new(h)
=> #<OpenStruct a="a", b="b", c={:d=>"d", :e=>"e"}>
o.a
=> "a"
o.b
=> "b"
o.c
=> {:d=>"d", :e=>"e"}
o.c.d
NoMethodError: undefined method `d' for {:d=>"d", :e=>"e"}:Hash
I want all the nested keys to be methods as well. So I can access d as such:
o.c.d
=> "d"
How can I achieve this?
You can monkey-patch the Hash class:
require 'json'
require 'ostruct'
class Hash
  def to_o
    JSON.parse(to_json, object_class: OpenStruct)
  end
end
then you can say
h = { a: 'a', b: 'b', c: { d: 'd', e: 'e'} }
o = h.to_o
o.c.d # => 'd'
See Convert a complex nested hash to an object.
I came up with this solution:
h = { a: 'a', b: 'b', c: { d: 'd', e: 'e'} }
json = h.to_json
=> "{\"a\":\"a\",\"b\":\"b\",\"c\":{\"d\":\"d\",\"e\":\"e\"}}"
object = JSON.parse(json, object_class:OpenStruct)
object.c.d
=> "d"
So for this to work, I had to do an extra step: convert it to json.
Personally I use the recursive-open-struct gem - it's then as simple as RecursiveOpenStruct.new(<nested_hash>)
But for the sake of recursion practice, I'll show you a fresh solution:
require 'ostruct'
def to_recursive_ostruct(hash)
result = hash.each_with_object({}) do |(key, val), memo|
memo[key] = val.is_a?(Hash) ? to_recursive_ostruct(val) : val
end
OpenStruct.new(result)
end
puts to_recursive_ostruct(a: { b: 1}).a.b
# => 1
edit
Weihang Jian showed a slight improvement to this here https://stackoverflow.com/a/69311716/2981429
def to_recursive_ostruct(hash)
hash.each_with_object(OpenStruct.new) do |(key, val), memo|
memo[key] = val.is_a?(Hash) ? to_recursive_ostruct(val) : val
end
end
Also see https://stackoverflow.com/a/63264908/2981429 which shows how to handle arrays
note
The reason this is better than the JSON-based solutions is that you can lose some data when you convert to JSON. For example, if you convert a Time object to JSON and then parse it, it will be a string. There are many other examples of this:
class Foo; end
JSON.parse({obj: Foo.new}.to_json)["obj"]
# => "#<Foo:0x00007fc8720198b0>"
yeah ... not super useful. You've completely lost your reference to the actual instance.
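To make the Time example concrete, here is a small comparison sketch. It reuses the `to_recursive_ostruct` version shown above; the `source` hash is made up for illustration:

```ruby
require 'json'
require 'ostruct'

def to_recursive_ostruct(hash)
  hash.each_with_object(OpenStruct.new) do |(key, val), memo|
    memo[key] = val.is_a?(Hash) ? to_recursive_ostruct(val) : val
  end
end

# Illustrative input with a non-JSON value nested one level down.
source = { meta: { created_at: Time.at(0) } }

# The JSON round trip turns the Time into its string representation.
via_json = JSON.parse(source.to_json, object_class: OpenStruct)

# The recursive conversion keeps the original object untouched.
via_recursion = to_recursive_ostruct(source)

json_class = via_json.meta.created_at.class
recursive_class = via_recursion.meta.created_at.class
```

Here `json_class` should come out as String while `recursive_class` stays Time.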
Here's a recursive solution that avoids converting the hash to json:
def to_o(obj)
if obj.is_a?(Hash)
return OpenStruct.new(obj.map{ |key, val| [ key, to_o(val) ] }.to_h)
elsif obj.is_a?(Array)
return obj.map{ |o| to_o(o) }
else # Assumed to be a primitive value
return obj
end
end
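A quick usage sketch for the array case, repeating the method so the snippet runs on its own. The `order` data is invented for the example:

```ruby
require 'ostruct'

def to_o(obj)
  if obj.is_a?(Hash)
    OpenStruct.new(obj.map { |key, val| [key, to_o(val)] }.to_h)
  elsif obj.is_a?(Array)
    obj.map { |o| to_o(o) }
  else # Assumed to be a primitive value
    obj
  end
end

# Hashes nested inside arrays are converted too.
order = to_o({ number: 7, items: [{ sku: 'a1', qty: 2 }, { sku: 'b2', qty: 1 }] })

skus = order.items.map(&:sku)
first_qty = order.items.first.qty
```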
My solution is cleaner and faster than @max-pleaner's.
I don't actually know why it is faster, but it doesn't instantiate extra Hash objects:
def dot_access(hash)
hash.each_with_object(OpenStruct.new) do |(key, value), struct|
struct[key] = value.is_a?(Hash) ? dot_access(value) : value
end
end
Here is the benchmark for your reference:
require 'ostruct'
def dot_access(hash)
hash.each_with_object(OpenStruct.new) do |(key, value), struct|
struct[key] = value.is_a?(Hash) ? dot_access(value) : value
end
end
def to_recursive_ostruct(hash)
result = hash.each_with_object({}) do |(key, val), memo|
memo[key] = val.is_a?(Hash) ? to_recursive_ostruct(val) : val
end
OpenStruct.new(result)
end
require 'benchmark/ips'
Benchmark.ips do |x|
hash = { a: 1, b: 2, c: { d: 3 } }
x.report('dot_access') { dot_access(hash) }
x.report('to_recursive_ostruct') { to_recursive_ostruct(hash) }
end
Warming up --------------------------------------
dot_access 4.843k i/100ms
to_recursive_ostruct 5.218k i/100ms
Calculating -------------------------------------
dot_access 51.976k (± 5.0%) i/s - 261.522k in 5.044482s
to_recursive_ostruct 50.122k (± 4.6%) i/s - 250.464k in 5.008116s
My solution, based on max pleaner's answer and similar to Xavi's answer:
require 'ostruct'
def initialize_open_struct_deeply(value)
case value
when Hash
OpenStruct.new(value.transform_values { |hash_value| send __method__, hash_value })
when Array
value.map { |element| send __method__, element }
else
value
end
end
Here is one way to override the initializer so you can do OpenStruct.new({ a: "b", c: { d: "e", f: ["g", "h", "i"] }}).
Further, this class is included when you require 'json', so be sure to do this patch after the require.
class OpenStruct
def initialize(hash = nil)
@table = {}
if hash
hash.each_pair do |k, v|
self[k] = v.is_a?(Hash) ? OpenStruct.new(v) : v
end
end
end
def keys
@table.keys.map { |k| k.to_s }
end
end
Basing a conversion on OpenStruct works fine until it doesn't. For instance, none of the other answers here properly handle these simple hashes:
people = { person1: { display: { first: 'John' } } }
creds = { oauth: { trust: true }, basic: { trust: false } }
The method below works with those hashes, modifying the input hash rather than returning a new object.
def add_indifferent_access!(hash)
hash.each_pair do |k, v|
hash.instance_variable_set("@#{k}", v.tap { |v| send(__method__, v) if v.is_a?(Hash) } )
hash.define_singleton_method(k, proc { hash.instance_variable_get("@#{k}") } )
end
end
then
add_indifferent_access!(people)
people.person1.display.first # => 'John'
Or if your context calls for a more inline call structure:
creds.yield_self(&method(:add_indifferent_access!)).oauth.trust # => true
Alternatively, you could mix it in:
module HashExtension
def very_indifferent_access!
each_pair do |k, v|
instance_variable_set("@#{k}", v.tap { |v| v.extend(HashExtension) && v.send(__method__) if v.is_a?(Hash) } )
define_singleton_method(k, proc { self.instance_variable_get("@#{k}") } )
end
end
end
and apply to individual hashes:
favs = { song1: { title: 'John and Marsha', author: 'Stan Freberg' } }
favs.extend(HashExtension).very_indifferent_access!
favs.song1.title
Here is a variation for monkey-patching Hash, should you opt to do so:
class Hash
def with_very_indifferent_access!
each_pair do |k, v|
instance_variable_set("@#{k}", v.tap { |v| v.send(__method__) if v.is_a?(Hash) } )
define_singleton_method(k, proc { instance_variable_get("@#{k}") } )
end
end
end
# Note the omission of "v.extend(HashExtension)" vs. the mix-in variation.
Comments to other answers expressed a desire to retain class types. This solution accommodates that.
people = { person1: { created_at: Time.now } }
people.with_very_indifferent_access!
people.person1.created_at.class # => Time
Whatever solution you choose, I recommend testing with this hash:
people = { person1: { display: { first: 'John' } }, person2: { display: { last: 'Jingleheimer' } } }
If you are ok with monkey-patching the Hash class, you can do:
require 'ostruct'
module Structurizable
def each_pair(&block)
each do |k, v|
v = OpenStruct.new(v) if v.is_a? Hash
yield k, v
end
end
end
Hash.prepend Structurizable
people = { person1: { display: { first: 'John' } }, person2: { display: { last: 'Jingleheimer' } } }
puts OpenStruct.new(people).person1.display.first
Ideally, instead of prepending this, we should be able to use a Refinement, but for some reason I can't understand, it didn't work for the each_pair method (also, unfortunately, Refinements are still pretty limited).

Ruby aggregation with objects

Let's say I have something like this:
class FruitCount
attr_accessor :name, :count
def initialize(name, count)
@name = name
@count = count
end
end
obj1 = FruitCount.new('Apple', 32)
obj2 = FruitCount.new('Orange', 5)
obj3 = FruitCount.new('Orange', 3)
obj4 = FruitCount.new('Kiwi', 15)
obj5 = FruitCount.new('Kiwi', 1)
fruit_counts = [obj1, obj2, obj3, obj4, obj5]
Now what I need is a function build_fruit_summary that, given a fruit_counts array, returns the following summary:
fruits_summary = {
fruits: [
{
name: 'Apple',
count: 32
},
{
name: 'Orange',
count: 8
},
{
name: 'Kiwi',
count: 16
}
],
total: {
name: 'AllFruits',
count: 56
}
}
I just cannot figure out the best way to do the aggregations.
Edit:
In my example I have more than one count.
class FruitCount
attr_accessor :name, :count1, :count2
def initialize(name, count1, count2)
@name = name
@count1 = count1
@count2 = count2
end
end
Ruby's Enumerable is your friend, particularly each_with_object which is a form of reduce.
You first need the fruits value:
fruits = fruit_counts.each_with_object([]) do |fruit, list|
aggregate = list.detect { |f| f[:name] == fruit.name }
if aggregate.nil?
aggregate = { name: fruit.name, count: 0, count2: 0 } # both counters must start at 0 for += to work
list << aggregate
end
aggregate[:count] += fruit.count
aggregate[:count2] += fruit.count2
end
UPDATE: added multiple counts within the single fruity loop.
The above turns each fruit object into a hash that keeps a running count per fruit name, collects those hashes in the initially empty list array, and assigns the resulting array to the fruits variable.
Now, get the total value:
total = { name: 'AllFruits', count: fruit_counts.map { |f| f.count + f.count2 }.reduce(:+) }
UPDATE: total taking into account multiple count attributes within a single loop.
The above maps the fruit_counts array to the sum of each object's count attributes, resulting in an array of integers. Then reduce(:+) adds those integers together.
Now put it all together into the summary:
fruits_summary = { fruits: fruits, total: total }
You can formalize this in an OOP style by introducing a FruitCollection object that uses the Enumerable module:
class FruitCollection
include Enumerable
def initialize(fruits)
@fruits = fruits
end
def summary
{ fruits: fruit_counts, total: total }
end
def each(&block)
@fruits.each(&block)
end
def fruit_counts
each_with_object([]) do |fruit, list|
aggregate = list.detect { |f| f[:name] == fruit.name }
if aggregate.nil?
aggregate = { name: fruit.name, count: 0, count2: 0 } # both counters must start at 0 for += to work
list << aggregate
end
aggregate[:count] += fruit.count
aggregate[:count2] += fruit.count2
end
end
def total
{ name: 'AllFruits', count: map { |f| f.count + f.count2 }.reduce(:+) }
end
end
Now pass your fruit_count array into that object:
fruit_collection = FruitCollection.new fruit_counts
fruits_summary = fruit_collection.summary
The above works because we define the each method, which Enumerable uses under the hood for every one of its methods. This means we can call each_with_object, reduce, and map (among others listed in the Enumerable docs) and they will iterate over the fruits, since that is what our each method does.
Here's an article on Enumerable.
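A stripped-down sketch of that mechanism, using a hypothetical `Basket` class that only defines `each` and gets everything else from Enumerable:

```ruby
class Basket
  include Enumerable

  def initialize(items)
    @items = items
  end

  # The only method Enumerable needs from us.
  def each(&block)
    @items.each(&block)
  end
end

basket = Basket.new([3, 1, 2])

doubled = basket.map { |n| n * 2 }   # provided by Enumerable
sum     = basket.reduce(:+)          # provided by Enumerable
sorted  = basket.sort                # provided by Enumerable
```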
UPDATE: your multiple counts can be easily added by adding a total attribute to your fruit object:
class FruitCount
attr_accessor :name, :count1, :count2
def initialize(name, count1, count2)
@name = name
@count1 = count1
@count2 = count2
end
def total
@count1 + @count2
end
end
Then just use fruit.total whenever you need to aggregate the totals:
fruit_counts.map(&:total).reduce(:+)
fruits_summary = {
fruits: fruit_counts
.group_by { |f| f.name }
.map do |fruit_name, objects|
{
name: fruit_name,
count: objects.map(&:count).reduce(:+)
}
end,
total: {
name: 'AllFruits',
count: fruit_counts.map(&:count).reduce(:+)
}
}
Not very efficient way, though :)
UPD: fixed keys in fruits collection
Or slightly better version:
fruits_summary = {
fruits: fruit_counts
.reduce({}) { |acc, fruit| acc[fruit.name] = acc.fetch(fruit.name, 0) + fruit.count; acc }
.map { |name, count| {name: name, count: count} },
total: {
name: 'AllFruits',
count: fruit_counts.map(&:count).reduce(:+)
}
}
counts = fruit_counts.each_with_object(Hash.new(0)) {|obj, h| h[obj.name] += obj.count}
#=> {"Apple"=>32, "Orange"=>8, "Kiwi"=>16}
fruits_summary =
{ fruits: counts.map { |name, count| { name: name, count: count } },
total: { name: 'AllFruits', count: counts.values.reduce(:+) }
}
#=> {:fruits=>[
# {:name=>"Apple", :count=>32},
# {:name=>"Orange", :count=> 8},
# {:name=>"Kiwi", :count=>16}],
# :total=>
# {:name=>"AllFruits", :count=>56}
# }
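For the edited question with two counters, the same `each_with_object` idea can be extended. This is only a sketch: the `FruitCount` Struct stands in for the class from the question, and the numbers are made up:

```ruby
# Stand-in for the question's class with two counters (illustrative data).
FruitCount = Struct.new(:name, :count1, :count2)

fruit_counts = [
  FruitCount.new('Apple', 32, 1),
  FruitCount.new('Orange', 5, 2),
  FruitCount.new('Orange', 3, 4)
]

# The default block gives every new fruit name its own zeroed counters.
zeroed = Hash.new { |hash, name| hash[name] = { count1: 0, count2: 0 } }

per_fruit = fruit_counts.each_with_object(zeroed) do |fruit, acc|
  acc[fruit.name][:count1] += fruit.count1
  acc[fruit.name][:count2] += fruit.count2
end

fruits_summary = {
  fruits: per_fruit.map { |name, counts| { name: name }.merge(counts) },
  total: {
    name: 'AllFruits',
    count1: fruit_counts.sum(&:count1),
    count2: fruit_counts.sum(&:count2)
  }
}
```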

Ruby, Hash, Use Part Of A Composite Key

Is there a way to do this? This is called a composite key, right?
a = { ["key1", "key2"] => "stuff" }
a[["key1",*]]
There is no wildcard interpretation built into Hash. You can implement your own with something like:
class MyHash < Hash
def select_composite(key)
mkeys = matching_keys(key)
select { |k, _| mkeys.include?(k) }
end
private
def matching_keys(key)
keys.select { |hkey| matching_key?(hkey, key) }
end
def matching_key?(hkey, key)
elements = Array(key)
Array(hkey).each_with_index.select { |helement, i|
helement == elements[i] || elements[i] == '*'
}.count == elements.count
end
end
a = MyHash[{ %w(key1 key2) => "stuff" }]
a.select_composite(%w(key1 *))
No, there isn't. You would need to do a linear search over all keys in the hash for the ones that match your specific conditions.
What you're really after is a nested hash:
a = {
key1: {
key2: "stuff"
}
}
a[:key1][:key2] # "stuff"
a[:key1] # { key2: "stuff" }
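Both ideas can be sketched without a custom class, assuming two-element array keys. The extra entries in `a` are invented for the example:

```ruby
a = {
  ["key1", "key2"]  => "stuff",
  ["key1", "key3"]  => "more stuff",
  ["other", "key2"] => "unrelated"
}

# Linear search: keep every entry whose composite key starts with "key1".
matches = a.select { |key, _value| key.first == "key1" }

# One-off conversion to a nested hash, so later lookups need no scan.
nested = a.each_with_object({}) do |((outer, inner), value), acc|
  (acc[outer] ||= {})[inner] = value
end

partial = nested["key1"]
```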

Resources