Iterate through hashes to find values predefined in an array - ruby

I have an array with hashes:
test = [
{"type"=>1337, "age"=>12, "name"=>"Eric Johnson"},
{"type"=>1338, "age"=>18, "name"=>"John Doe"},
{"type"=>1339, "age"=>22, "name"=>"Carl Adley"},
{"type"=>1340, "age"=>25, "name"=>"Anna Brent"}
]
I am interested in getting all the hashes where the name key equals a value that can be found in an array:
get_hash_by_name = ["John Doe","Anna Brent"]
Which would end up in the following:
# test_sorted = would be:
# {"type"=>1338, "age"=>18, "name"=>"John Doe"}
# {"type"=>1340, "age"=>25, "name"=>"Anna Brent"}
I probably have to iterate with test.each somehow, but I'm still trying to get a grasp of Ruby. Happy for all help!

Here's something to meditate on:
Iterating over an array to find something is slow, even if it's a sorted array, because a plain scan may have to visit every element. Computer languages have various structures we can use to improve the speed of lookups, and in Ruby, Hash is usually a good starting point. Where an Array is like reading from a sequential file, a Hash is like reading from a random-access file: we can jump right to the record we need.
Starting with your test array-of-hashes:
test = [
{'type'=>1337, 'age'=>12, 'name'=>'Eric Johnson'},
{'type'=>1338, 'age'=>18, 'name'=>'John Doe'},
{'type'=>1339, 'age'=>22, 'name'=>'Carl Adley'},
{'type'=>1340, 'age'=>25, 'name'=>'Anna Brent'},
{'type'=>1341, 'age'=>13, 'name'=>'Eric Johnson'},
]
Notice that I added an additional "Eric Johnson" record. I'll get to that later.
I'd create a hash that mapped the array of hashes to a regular hash where the key of each pair is a unique value. The 'type' key/value pair appears to fit that need well:
test_by_types = test.map { |h| [h['type'], h] }.to_h
# => {1337=>{"type"=>1337, "age"=>12, "name"=>"Eric Johnson"},
# 1338=>{"type"=>1338, "age"=>18, "name"=>"John Doe"},
# 1339=>{"type"=>1339, "age"=>22, "name"=>"Carl Adley"},
# 1340=>{"type"=>1340, "age"=>25, "name"=>"Anna Brent"},
# 1341=>{"type"=>1341, "age"=>13, "name"=>"Eric Johnson"}}
Now test_by_types is a hash using the type value to point to the original hash.
If I create a similar hash based on names, where each name, unique or not, points to the type values, I can do fast lookups:
test_by_names = test.each_with_object(
  Hash.new { |h, k| h[k] = [] }
) { |e, h|
  h[e['name']] << e['type']
}.to_h
# => {"Eric Johnson"=>[1337, 1341],
# "John Doe"=>[1338],
# "Carl Adley"=>[1339],
# "Anna Brent"=>[1340]}
Notice that "Eric Johnson" points to two records.
Now, here's how we look up things:
get_hash_by_name = ['John Doe', 'Anna Brent']
test_by_names.values_at(*get_hash_by_name).flatten
# => [1338, 1340]
In one quick lookup Ruby returned the matching types by looking up the names.
We can take that output and grab the original hashes:
test_by_types.values_at(*test_by_names.values_at(*get_hash_by_name).flatten)
# => [{"type"=>1338, "age"=>18, "name"=>"John Doe"},
# {"type"=>1340, "age"=>25, "name"=>"Anna Brent"}]
Because this is running against hashes, it's fast. The hashes can be BIG and it'll still run very fast.
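As a rough sketch of the same idea in a more compact form, the index can also point straight at the records instead of going through the type values. This is a variation on the approach above, not a replacement for it; the names people, wanted, scanned, by_name and indexed are made up for the illustration.

```ruby
people = [
  { 'type' => 1337, 'age' => 12, 'name' => 'Eric Johnson' },
  { 'type' => 1338, 'age' => 18, 'name' => 'John Doe' },
  { 'type' => 1339, 'age' => 22, 'name' => 'Carl Adley' },
  { 'type' => 1340, 'age' => 25, 'name' => 'Anna Brent' },
  { 'type' => 1341, 'age' => 13, 'name' => 'Eric Johnson' }
]
wanted = ['John Doe', 'Anna Brent']

# Linear scan: every query walks the whole array.
scanned = people.select { |h| wanted.include?(h['name']) }

# Index built once: each name maps to all records carrying that name.
by_name = people.group_by { |h| h['name'] }

# fetch with a default returns an empty list for unknown names.
indexed = wanted.flat_map { |name| by_name.fetch(name, []) }
```

Both scanned and indexed hold the same two records here; the difference only shows up as the data grows, because the index is built once and each later lookup is a single hash access.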
Back to "Eric Johnson"...
When dealing with the names of people, name collisions are likely, which is why test_by_names allows multiple type values. With one lookup, all the matching records can be retrieved:
test_by_names.values_at('Eric Johnson').flatten
# => [1337, 1341]
test_by_types.values_at(*test_by_names.values_at('Eric Johnson').flatten)
# => [{"type"=>1337, "age"=>12, "name"=>"Eric Johnson"},
# {"type"=>1341, "age"=>13, "name"=>"Eric Johnson"}]
This will be a lot to chew on if you're new to Ruby, but the Ruby documentation covers it all, so dig through the Hash, Array and Enumerable class documentation.
Also, *, AKA "splat", explodes the array elements from the enclosing array into separate parameters suitable for passing into a method. I can't remember where that's documented.
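A small sketch of what the splat changes, using a throwaway hash (index and keys are illustrative names, not part of the code above):

```ruby
index = { 'a' => 1, 'b' => 2, 'c' => 3 }
keys  = ['a', 'c']

# Without the splat, the whole array is passed as one argument,
# so Ruby looks for a single key equal to ['a', 'c'].
without_splat = index.values_at(keys)

# With the splat, each element becomes its own argument,
# which is the same as writing index.values_at('a', 'c').
with_splat = index.values_at(*keys)
```

Here without_splat is [nil], because no key equals the array itself, while with_splat is [1, 3].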
If you're familiar with database design this will look very familiar, because it's similar to how we do database lookups.
The point of all of this is that it's really important to consider how you're going to store your data when you first ingest it into your program. Do it wrong and you'll jump through major hoops trying to do useful things with it. Do it right and the code and data will flow through very easily, and you'll be able to massage/extract/combine the data easily.
Said differently, Arrays are containers useful for holding things you want to access sequentially, such as jobs you want to print, sites you need to access in order, or files you want to delete in a specific order, but they're lousy when you want to look up and work with a record randomly.
Knowing which container is appropriate is important, and for this particular task, it appears that an array of hashes isn't appropriate, since there's no fast way of accessing specific ones.
And that's why I made my comment above asking what you were trying to accomplish in the first place. See "What is the XY problem?" and "XyProblem" for more about that particular question.

You can use select and include? so
test.select {|object| get_hash_by_name.include? object['name'] }
…should do the job.
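If the list of names grows long, one possible refinement is to turn it into a Set first, so that each membership check is a hash lookup rather than a scan of the name list. A minimal sketch, assuming the data from the question; wanted and matches are illustrative names:

```ruby
require 'set'

test = [
  { 'type' => 1337, 'age' => 12, 'name' => 'Eric Johnson' },
  { 'type' => 1338, 'age' => 18, 'name' => 'John Doe' },
  { 'type' => 1339, 'age' => 22, 'name' => 'Carl Adley' },
  { 'type' => 1340, 'age' => 25, 'name' => 'Anna Brent' }
]
get_hash_by_name = ['John Doe', 'Anna Brent']

# Array#include? walks the name list for every record;
# Set#include? answers the same question via hashing.
wanted  = get_hash_by_name.to_set
matches = test.select { |person| wanted.include?(person['name']) }
```

For a handful of names the plain array version is just as good; the Set only pays off when the name list is large.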

Related

How to return array of hashes with modified values

I've been successfully converting an array of objects into an array of hashes. But I also want to modify the objects slightly as well, before getting the combined hash.
This is what I do to convert array of objects into a combined hash:
prev_vars.map(&:to_h).reduce({}, :merge)
{ "b"=>#<Money fractional:400 currency:GBP> }
But what I want instead, which requires an additional call to to_i, is:
{ "b"=> 4 }
I got this working using this line, but I am looking for a more elegant solution:
prev_vars.map(&:to_h).reduce({}) { |combined, v| combined.merge({v.keys[0] => v.values[0].to_i}) }
How large is prev_vars? map(&:to_h) could require a fair amount of memory overhead, because it instantiates an entirely new array. Instead, I'd recommend switching the order: first #reduce, then #to_h:
prev_vars.reduce({}) do |combined, var|
  combined.merge! var.to_h.transform_values!(&:to_i)
end
Note the use of #merge! rather than #merge so that a new hash is not created for combined for each iteration of the loop.
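To try this out without the Money gem, here is a self-contained sketch. FakeMoney and Var are hypothetical stand-ins invented for the example: FakeMoney only models a to_i that converts pence to whole pounds, and Var is any object whose to_h returns a one-pair hash. The real classes may behave differently.

```ruby
# Hypothetical stand-in for Money; only #to_i is modelled.
FakeMoney = Struct.new(:fractional, :currency) do
  def to_i
    fractional / 100
  end
end

# Hypothetical stand-in for the objects in prev_vars.
Var = Struct.new(:name, :value) do
  def to_h
    { name => value }
  end
end

prev_vars = [
  Var.new('a', FakeMoney.new(250, 'GBP')),
  Var.new('b', FakeMoney.new(400, 'GBP'))
]

# each_with_object fills one accumulator hash, so no intermediate
# array or per-iteration hash copy is built.
combined = prev_vars.each_with_object({}) do |var, acc|
  var.to_h.each { |key, money| acc[key] = money.to_i }
end
```

With these stand-ins, combined ends up as { "a" => 2, "b" => 4 }. Using each_with_object instead of reduce is a matter of taste; it avoids having to return the accumulator from the block.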

Ruby - Merge two hashes with no like keys based on matching value

I would like to find an efficient way to merge two hashes together and the resulting hash must contain all original data and a new key/value pair based on criteria below. There are no keys in common between the two hashes, however the key in one hash matches the value of a key in the adjacent hash.
Also note that the second hash is actually an array of hashes.
I am working with a relatively large data set, so looking for an efficient solution but hoping to keep the code readable at the same time since it will likely end up in production.
Here is the structure of my data:
# Hash
hsh1 = { "devicename1"=>"active", "devicename2"=>"passive", "devicename3"=>"passive" }
# Array of Hashes
hsh2 = [ { "host" => "devicename3", "secure" => true },
{ "host" => "devicename2", "secure" => true },
{ "host" => "devicename1", "secure" => false } ]
Here is what I need to accomplish:
I need to merge the data from hsh1 into hsh2, keeping all of the original key/value pairs in hsh2 and adding a new key called activation_status using the data in hsh1.
The resulting hsh2 would be as follows:
hsh2 = [{ "host"=>"devicename3", "secure"=>true, "activation_status"=>"passive" },
{ "host"=>"devicename2", "secure"=>true, "activation_status"=>"passive" },
{ "host"=>"devicename1", "secure"=>false, "activation_status"=>"active" }]
This may already be answered on StackOverflow but I looked for some time and couldn't find a match. My apologies in advance if this is a duplicate.
I suggest something along the lines of:
hsh3 = hsh2.map do |nestling|
  host = nestling["host"]
  status = hsh1[host]
  nestling["activation_status"] = status
  nestling
end
Which of course you can shrink down a bit. This version uses fewer variables and edits hsh2 in place:
hsh2.each do |nestling|
  nestling["activation_status"] = hsh1[nestling["host"]]
end
This will do it:
hsh2.map { |h| h.merge 'activation_status' => hsh1[h['host']] }
However, I think it will make a copy of the data instead of just walking the array of hashes and adding the appropriate key=>value pair. I don't think it would have a huge impact on performance unless your data set is large enough to consume a significant portion of the memory allocated to your app.
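The difference between the copying and the in-place variants can be checked directly. A minimal sketch with a trimmed version of the question's data; merged and untouched are illustrative names:

```ruby
hsh1 = { 'devicename1' => 'active', 'devicename2' => 'passive' }
hsh2 = [
  { 'host' => 'devicename2', 'secure' => true },
  { 'host' => 'devicename1', 'secure' => false }
]

# Hash#merge returns a new hash, so the hashes inside hsh2 stay as they were.
merged    = hsh2.map { |h| h.merge('activation_status' => hsh1[h['host']]) }
untouched = hsh2.none? { |h| h.key?('activation_status') }

# Assigning inside each mutates the hashes already stored in hsh2.
hsh2.each { |h| h['activation_status'] = hsh1[h['host']] }
```

After the map, untouched is true; after the each, hsh2 has the same contents as merged, just without the extra copies.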

Sort a hash Ruby

I have a hash that looks like this
h1 = {"4c09a0da6071a593f051de32"=>["4c09a0da6071a593f051de32", "Cafe Bistro", 37.78458803130115, -122.40743637084961, 215.0], "4abbb03ef964a520668420e3"=>["4abbb03ef964a520668420e3", "The Plant Cafe Organic", 37.7977805076241, -122.3957633972168, 83.0] }
I would like to sort it by the final value in each hash e.g. 83.0, 215.0
I have tried
h1 = h1.sort_by{|k,v| v[4]}
but it outputs an array, not a hash. I would like to keep the hash the same, just reordered. How do I do this?
It's not a great idea to count on ordering in a Hash. Ruby didn't order hashes at all in 1.8. The data structure in its canonical form is not ordered.
It's better style to use an Array when ordering is important and a Hash or something else when key lookup is needed.
There is a grey area when writing tests. In that case, it may be reasonable to depend on Hash ordering since you are testing a specific Ruby program in certain conditions and you have, after all, a test that can fail should the implementation assumptions ever change.
You need to convert the array back to a hash:
h1 = Hash[h1.sort_by { |_,v| v[-1] }]
Note that this only works since Ruby 1.9. Before that, hashes were not an ordered data structure.
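On Ruby 2.1 and later, Array#to_h does the same conversion and reads a little more naturally. A short sketch using the data from the question; sorted is an illustrative name:

```ruby
h1 = {
  "4c09a0da6071a593f051de32" => ["4c09a0da6071a593f051de32", "Cafe Bistro", 37.78458803130115, -122.40743637084961, 215.0],
  "4abbb03ef964a520668420e3" => ["4abbb03ef964a520668420e3", "The Plant Cafe Organic", 37.7977805076241, -122.3957633972168, 83.0]
}

# sort_by yields [key, value] pairs and returns an array of pairs;
# to_h turns the pairs back into a hash in the sorted order.
sorted = h1.sort_by { |_id, venue| venue.last }.to_h
```

The result still relies on hashes preserving insertion order, so the caveats about depending on hash ordering apply here as well.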

Populating array (by 'name') in array of arrays

Let's say I have an array of arrays, of which I don't know the names, just that they are arrays, and how many of them there are.
bigArray=[smallArrayA[], smallArrayB[]]
Now I can fetch the array(s) by index position, like:
smallA = bigArray[0]
smallA << 'input'
But what I'd like to know is the names of the arrays stored in the 'big' one..
bigArray.inspect
..just gives me:
[['input'],[]]
My problem is that the names of the smaller ones are going to be created dynamically, and I need to know their names to modify the right one later on.
Sounds like you need a hash:
bigHash = { :a => smallArrayA, :b => smallArrayB }
Now you can refer to each element of the hash by name:
bigHash[:a]
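Since the names are created dynamically, one option is a hash with a default block, so that an array appears the first time its name is used. A minimal sketch; arrays_by_name and the keys :small_a and :small_b are made up for the example:

```ruby
# Any missing key gets a fresh empty array on first access.
arrays_by_name = Hash.new { |hash, name| hash[name] = [] }

arrays_by_name[:small_a] << 'input'
arrays_by_name[:small_b]   # merely touching the key creates an empty array

# The names are available at any time as the hash keys.
names = arrays_by_name.keys
```

Here names is [:small_a, :small_b], and arrays_by_name[:small_a] is ["input"].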

How do you modify array mapping data structure resultant from Ruby map?

I believe that I may be missing something here, so please bear with me as I explain two scenarios in hopes to reconcile my misunderstanding:
My end goal is to create a dataset that's acceptable by Highcharts via lazy_high_charts, however in this quest, I'm finding that it is rather particular about the format of data that it receives.
A) I have found that when data is formatted like this going into it, it draws the points just fine:
[0.0000001240,0.0000000267,0.0000000722, ..., 0.0000000512]
I'm able to generate an array like this simply with:
array = Array.new
data.each do |row|
  array.push row[:datapoint1].to_f
end
B) Yet, if I attempt to use the map function, I end up with a result like this, and Highcharts fails to render the data:
[[6.67e-09],[4.39e-09],[2.1e-09],[2.52e-09], ..., [3.79e-09]]
From code like:
array = data.map{|row| [(row.datapoint1.to_f)] }
Is there a way to coax the map function to produce results in B that are more akin to the scenario A resultant data structure?
This gets more involved as I also have to add datetime into this, however that's another topic and I just want to understand this first and what can be done to perhaps further control where I'm going.
Ultimately, EVEN SCENARIO B SHOULD WORK according to the data in the example here: http://www.highcharts.com/demo/spline-irregular-time (press the "View options" button at bottom)
Heck, I'll send you a sucker in the mail if you can fill me in on that part! ;)
You can fix arrays like this
[[6.67e-09],[4.39e-09],[2.1e-09],[2.52e-09], ..., [3.79e-09]]
that have nested arrays inside them by using the flatten method on the array.
But you should be able to avoid generating nested arrays in the first place. Just remove the square brackets from your map line:
array = data.map{|row| row.datapoint1.to_f }
Code
a = [[6.67e-09],[4.39e-09],[2.1e-09],[2.52e-09], [3.79e-09]]
b = a.flatten.map{|el| "%.10f" % el }
puts b.inspect
Output
["0.0000000067", "0.0000000044", "0.0000000021", "0.0000000025", "0.0000000038"]
Unless I, too, am missing something, your problem is that you're returning a single-element array from your block (thereby creating an array of arrays) instead of just the value. This should do you:
array = data.map {|row| row.datapoint1.to_f }
# => [ 6.67e-09, 4.39e-09, 2.1e-09, 2.52e-09, ..., 3.79e-09 ]
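The two shapes can be compared side by side. In this sketch Row is a hypothetical stand-in for whatever objects data actually holds, and the string values are an assumption made so that to_f has something to convert:

```ruby
# Hypothetical row type; the real rows only need a datapoint1 reader.
Row  = Struct.new(:datapoint1)
data = [Row.new('6.67e-09'), Row.new('4.39e-09')]

# Brackets inside the block wrap every value in its own array.
nested = data.map { |row| [row.datapoint1.to_f] }

# Returning the bare value gives a flat array of floats.
flat = data.map { |row| row.datapoint1.to_f }
```

Calling nested.flatten produces the same array as flat, which is why both fixes suggested above lead to the same result.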
