Show the structure of a large nested hash - ruby

I have a large hash which I'm trying to inspect, but because there are so many values it's hard to visually see what's going on.
For example say this is the hash:
{
days: {
monday: [1,2,3,4], # There are thousands of values here
tuesday: [1,2,3,4]
},
movies: {
action: ['Avengers', 'My Little Pony'],
adventure: ['Dorra The Explorer'],
comedy: ['Star Wars']
},
data_quality: 0.9,
verified: true
}
Now there is something going wrong and I need to examine what's going on here. It could be that I'm missing a movie category, a day of the week, or something in another field.
Because the Arrays are thousands of items long I can't just look at them to see what's missing.
Ideally I would like something like this:
{
days: {
monday: Array,
tuesday: Array
},
movies: {
action: Array,
adventure: Array,
comedy: Array
},
data_quality: Float,
verified: TrueClass
}
This would make the data a lot easier to analyse.
This is the method I'm currently using:
def hash_keys(hash)
unless hash.is_a?(Hash)
return hash.class
end
keys_hash = {}
hash.each do |key, value|
keys_hash[key] = hash_keys(value)
end
keys_hash
end
It's a recursive method which will run itself if the value is a hash, and return the values class otherwise.
The result for the sample input matches the expected output, but there is room for improvement. For example, if all values in an Array have the same type, it could show that type (e.g. 'Array of ints'), and if the array contains similar hashes, it could show what those hashes look like.

I think #transform_values is a perfect fit here:
def deep_values_transform(hash)
hash.transform_values do |value|
if value.is_a?(Hash)
deep_values_transform(value)
else
value.class
end
end
end
> hash = {
days: {
monday: [1,2,3,4],
tuesday: [1,2,3,4]
},
movies: {
action: ['Avengers', 'My Little Pony'],
adventure: ['Dorra The Explorer'],
comedy: ['Star Wars']
},
data_quality: 0.9,
verified: true
}
> deep_values_transform hash
=> {:days=>{:monday=>Array, :tuesday=>Array}, :movies=>{:action=>Array, :adventure=>Array, :comedy=>Array}, :data_quality=>Float, :verified=>TrueClass}
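The question also asks what an Array contains. A minimal sketch of one way to extend the idea, assuming it is acceptable to collapse each array to the distinct descriptions of its elements (the name describe_structure is made up for illustration and requires Ruby 2.4+ for transform_values):

```ruby
# Hypothetical helper: summarises a nested structure by class.
def describe_structure(value)
  case value
  when Hash
    # Recurse into nested hashes, keeping the keys
    value.transform_values { |v| describe_structure(v) }
  when Array
    # Describe every element, then keep only the distinct descriptions
    value.map { |v| describe_structure(v) }.uniq
  else
    value.class
  end
end

sample = {
  days: { monday: [1, 2, 3, 4], tuesday: [1, 2, 3, 4] },
  records: [{ id: 1, name: 'a' }, { id: 2, name: 'b' }],
  data_quality: 0.9,
  verified: true
}

structure = describe_structure(sample)
```

With this sketch an array of integers is reported as [Integer], and an array of similar hashes is reported as a one-element array holding their common shape. An array with mixed contents shows every distinct description, which makes odd elements stand out.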

Related

Convert object with array values into array of object

I have params like this:
params = { "people" =>
{
"fname" => ['john', 'megan'],
"lname" => ['doe', 'fox']
}
}
which I loop through using this code:
result = []
params["people"].each do |key, values|
values.each_with_index do |value, i|
result[i] = {}
result[i][key.to_sym] = value
end
end
The problem with my code is that it only keeps the last key and value:
[
{ lname: 'doe' },
{ lname: 'fox' }
]
I want to convert it into
[
{fname: 'john', lname: 'doe'},
{fname: 'megan', lname: 'fox'}
]
so that I can loop through them and save them to the database.
Your question has been answered but I'd like to mention an alternative calculation that does not employ indices:
keys, values = params["people"].to_a.transpose
#=> [["fname", "lname"], [["john", "megan"], ["doe", "fox"]]]
keys = keys.map(&:to_sym)
#=> [:fname, :lname]
values.transpose.map { |val| keys.zip(val).to_h }
#=> [{:fname=>"john", :lname=>"doe"},
# {:fname=>"megan", :lname=>"fox"}]
result[i] = {}
The problem is that you're doing this each loop iteration, which resets the value and deletes any existing keys you already put there. Instead, only set the value to {} if it doesn't already exist.
result[i] ||= {}
In your inner loop, you're resetting the i-th element to an empty hash:
result[i] = {}
So you only end up with the data from the last key-value-pair, i.e. lname.
Instead you can use this to only set it to an empty hash if it doesn't already exist:
result[i] ||= {}
So the first loop through, it gets set to {}, but after that, it just gets set to itself.
Alternatively, you can also use
result[i] = {} if !result[i]
which may or may not be more performant. I don't know.
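For reference, here is the question's loop with only that one change applied, as a small self-contained sketch using the sample params from the question:

```ruby
params = {
  "people" => {
    "fname" => ['john', 'megan'],
    "lname" => ['doe', 'fox']
  }
}

result = []
params["people"].each do |key, values|
  values.each_with_index do |value, i|
    # Create the hash only on the first visit to index i,
    # so keys added in earlier passes are preserved
    result[i] ||= {}
    result[i][key.to_sym] = value
  end
end
```

After the loop, result holds one hash per person with both :fname and :lname set.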

iterating over to make hashes within an array

I know how to iterate and build an array within a hash:
travel=["Round Trip Ticket Price:", "Price of Accommodation:", "Number of checked bags:"]
(1..3).each_with_object({}) do |trip, travels|
puts "Please input the following for trip # #{trip}"
travels["trip #{trip}"]= travel.map { |q| print q; gets.chomp.to_f }
end
#=> {"trip 1"=>[100.0, 50.0, 1.0], "trip 2"=>[200.0, 100.0, 2.0], "trip 3"=>[300.0, 150.0, 3.0]}
BUT instead I want to iterate over to make three individual hashes within one array.
I want it to look something like this
travels=[{trip_transportation: 100.0, trip_accommodation:50.0, trip_bags:50}
{trip_transportation:200.0, trip_accommodation:100.0, trip_2_bags:100}
{trip_3_transportation:300.0, trip_accommodation:150.0, trip_3_bags:150}]
I am really confused. Basically, the only thing I want to know is how to make three separate hashes while using a loop.
I want every hash to represent a trip.
Is that even possible?
travel=[{ prompt: "Round Trip Ticket Price: ",
key: :trip_transportation, type: :float },
{ prompt: "Price of Accommodation : ",
key: :trip_accommodation, type: :float },
{ prompt: "Number of checked bags : ",
key: :trip_bags, type: :int }]
nbr_trips = 3
Suppose that as the following code is run the user were to input the values given in the question's example.
(1..nbr_trips).map do |trip|
puts "Please input the following for trip #{trip}"
travel.map do |h|
print h[:prompt]
s = gets
[h[:key], h[:type] == :float ? s.to_f : s.to_i]
end.to_h
end
#=> [{:trip_transportation=>100.0, :trip_accommodation=>50.0, :trip_bags=>1},
# {:trip_transportation=>200.0, :trip_accommodation=>100.0, :trip_bags=>2},
# {:trip_transportation=>300.0, :trip_accommodation=>150.0, :trip_bags=>3}]
I see no reason for keys to have different names for different trips (e.g., :trip_2_bags and :trip_3_bags, rather than simply :trip_bags for all trips).
Using a Hash for setting up, similar to Cary Swoveland's answer and similar to my answer here: https://stackoverflow.com/a/58485997/5239030
travel = { trip_transportation: { question: 'Round Trip Ticket Price:', convert: 'to_f' },
trip_accommodation: { question: 'Price of Accommodation:', convert: 'to_f' },
trip_bags: { question: 'Number of checked bags:', convert: 'to_i' } }
n = 2
res = (1..n).map do # |n| # uncomment if (*)
travel.map.with_object({}) do |(k, v), h|
puts v[:question]
# k = k.to_s.split('_').insert(1, n).join('_').to_sym # uncomment if (*)
h[k] = gets.send(v[:convert])
end
end
res
#=> [{:trip_transportation=>10.0, :trip_accommodation=>11.0, :trip_bags=>1}, {:trip_transportation=>20.0, :trip_accommodation=>22.0, :trip_bags=>2}]
(*) Uncomment if you want the result to appear like:
#=> [{:trip_1_transportation=>10.0, :trip_1_accommodation=>11.0, :trip_1_bags=>1}, {:trip_2_transportation=>20.0, :trip_2_accommodation=>22.0, :trip_2_bags=>2}]
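Both answers read from the keyboard, which makes them awkward to try out. The following is a rough sketch of the same idea with the input source passed in, so it can be run against canned answers. The FIELDS table, the collect_trips name, and the input:/output: keywords are all assumptions made for this illustration, not part of either answer.

```ruby
require 'stringio'

# Hypothetical field table: key => [prompt, conversion method]
FIELDS = {
  trip_transportation: ['Round Trip Ticket Price: ', :to_f],
  trip_accommodation:  ['Price of Accommodation: ', :to_f],
  trip_bags:           ['Number of checked bags: ', :to_i]
}.freeze

# Builds one hash per trip, reading one answer per field from `input`.
def collect_trips(count, input: $stdin, output: $stdout)
  (1..count).map do |trip|
    output.puts "Please input the following for trip #{trip}"
    FIELDS.map do |key, (prompt, convert)|
      output.print prompt
      # to_s guards against nil when the input runs out
      [key, input.gets.to_s.public_send(convert)]
    end.to_h
  end
end

# Canned answers stand in for the user typing at the keyboard
canned = StringIO.new("100\n50\n1\n200\n100\n2\n")
trips = collect_trips(2, input: canned, output: StringIO.new)
```

Calling collect_trips(3) with no keywords would prompt on the terminal as in the answers above. Each element of the returned array is a separate hash for one trip.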

Find and replace specific hash and its values within array

What is the most efficient way to find a specific hash within an array and replace its values in place, so that the array is changed as well?
I've got this code so far, but in a real-world application with loads of data it becomes the slowest part of the application. It probably also leaks memory, as memory usage grows constantly when I perform this operation on each websocket message.
array =
[
{ id: 1,
parameters: {
omg: "lol"
},
options: {
lol: "omg"
}
},
{ id: 2,
parameters: {
omg: "double lol"
},
options: {
lol: "double omg"
}
}
]
selection = array.select { |a| a[:id] == 1 }[0]
selection[:parameters][:omg] = "triple omg"
p array
# => [{:id=>1, :parameters=>{:omg=>"triple omg"}, :options=>{:lol=>"omg"}}, {:id=>2, :parameters=>{:omg=>"double lol"}, :options=>{:lol=>"double omg"}}]
This will do what you're after looping through the records only once:
array.each { |hash| hash[:parameters][:omg] = "triple omg" if hash[:id] == 1 }
You could always expand the block to handle other conditions:
array.each do |hash|
hash[:parameters][:omg] = "triple omg" if hash[:id] == 1
hash[:parameters][:omg] = "quadruple omg" if hash[:id] == 2
# etc
end
It will still iterate over the elements just once.
You might also be better off restructuring your data into a single hash. Generally speaking, looking up a key in a hash will be faster than searching an array, particularly if you've got a unique identifier as here. Something like:
{
1 => {
parameters: {
omg: "lol"
},
options: {
lol: "omg"
}
},
2 => {
parameters: {
omg: "double lol"
},
options: {
lol: "double omg"
}
}
}
This way, you could just call the following to achieve what you're after:
hash[1][:parameters][:omg] = "triple omg"
Hope that helps - let me know how you get on with it or if you have any questions.
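If the data has to stay an array, two variations on the above may be worth a try. This is only a sketch with the sample data from the question: Enumerable#find stops at the first match instead of building an intermediate array the way select does, and a lookup hash built once (called by_id here, a name chosen for illustration) avoids scanning on every message.

```ruby
array = [
  { id: 1, parameters: { omg: "lol" }, options: { lol: "omg" } },
  { id: 2, parameters: { omg: "double lol" }, options: { lol: "double omg" } }
]

# find returns the first matching element (or nil) and stops searching
selection = array.find { |h| h[:id] == 2 }
selection[:parameters][:omg] = "quadruple omg" if selection

# Build the index once; its values are the very same hash objects
# that live in the array, so updates through it change the array too
by_id = array.map { |h| [h[:id], h] }.to_h
by_id[1][:parameters][:omg] = "triple omg"
```

The index has to be kept in sync if elements are added to or removed from the array, so whether it pays off depends on how often the array changes compared to how often it is searched.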

How do you check for matching keys in a ruby hash?

I'm learning coding, and one of the assignments is to return the names of people who like the same TV show.
I have managed to get it working and to pass TDD, but I'm wondering if I've taken the 'long way around' and that maybe there is a simpler solution?
Here is the setup and test:
class TestFriends < MiniTest::Test
def setup
@person1 = {
name: "Rick",
age: 12,
monies: 1,
friends: ["Jay","Keith","Dave", "Val"],
favourites: {
tv_show: "Friends",
things_to_eat: ["charcuterie"]
}
}
@person2 = {
name: "Jay",
age: 15,
monies: 2,
friends: ["Keith"],
favourites: {
tv_show: "Friends",
things_to_eat: ["soup","bread"]
}
}
@person3 = {
name: "Val",
age: 18,
monies: 20,
friends: ["Rick", "Jay"],
favourites: {
tv_show: "Pokemon",
things_to_eat: ["ratatouille", "stew"]
}
}
@people = [@person1, @person2, @person3]
end
def test_shared_tv_shows
expected = ["Rick", "Jay"]
actual = tv_show(@people)
assert_equal(expected, actual)
end
end
And here is the solution that I found:
def tv_show(people_list)
tv_friends = {}
for person in people_list
if tv_friends.key?(person[:favourites][:tv_show]) == false
tv_friends[person[:favourites][:tv_show]] = [person[:name]]
else
tv_friends[person[:favourites][:tv_show]] << person[:name]
end
end
for array in tv_friends.values()
if array.length() > 1
return array
end
end
end
It passes, but is there a better way of doing this?
I think you could replace those for loops with Array#each. But since you're building a hash from the values in people_list, you could use Enumerable#each_with_object and pass a new Hash as its object argument. That way the block receives each person hash from people_list together with a new, empty hash to fill as you need.
To check whether the hash has the key person[:favourites][:tv_show], you can use the result of key? directly as a boolean. The comparison with false can be skipped, because the if statement already evaluates the value as true or false.
You can introduce the variables tv_show and name to shorten the code a little, and then select from the values of the tv_friends hash the one with a length greater than 1. As this gives you an array inside an array, you can get the first element with first (or [0]).
def tv_show(people_list)
tv_friends = people_list.each_with_object({}) do |person, hash|
tv_show = person[:favourites][:tv_show]
name = person[:name]
hash.key?(tv_show) ? hash[tv_show] << name : hash[tv_show] = [name]
end
tv_friends.values.select { |value| value.length > 1 }.first
end
Also you can omit parentheses when the method call doesn't have arguments.
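As a further alternative, Enumerable#group_by can do the bucketing in one step. This is a sketch rather than a drop-in replacement: the method name shared_tv_show_fans is made up here, and returning an empty array when nobody shares a show is an assumption, since the assignment does not say what should happen in that case.

```ruby
# Hypothetical alternative using group_by.
def shared_tv_show_fans(people_list)
  # { "Friends" => [person, person], "Pokemon" => [person] }
  groups = people_list.group_by { |person| person[:favourites][:tv_show] }
  # First show that more than one person likes, if any
  fans = groups.values.find { |group| group.length > 1 }
  fans ? fans.map { |person| person[:name] } : []
end

people = [
  { name: "Rick", favourites: { tv_show: "Friends" } },
  { name: "Jay",  favourites: { tv_show: "Friends" } },
  { name: "Val",  favourites: { tv_show: "Pokemon" } }
]

shared = shared_tv_show_fans(people)
```

Like the original, this only reports the first show shared by more than one person. Replacing find with select would collect every such group.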

Hash Enumerable methods: Inconsistent behavior when passing only one parameter

Ruby's enumerable methods for Hash expect 2 parameters, one for the key and one for the value:
hash.each { |key, value| ... }
However, I notice that the behavior is inconsistent among the enumerable methods when you only pass one parameter:
student_ages = {
"Jack" => 10,
"Jill" => 12,
}
student_ages.each { |single_param| puts "param: #{single_param}" }
student_ages.map { |single_param| puts "param: #{single_param}" }
student_ages.select { |single_param| puts "param: #{single_param}" }
student_ages.reject { |single_param| puts "param: #{single_param}" }
# results:
each...
param: ["Jack", 10]
param: ["Jill", 12]
map...
param: ["Jack", 10]
param: ["Jill", 12]
select...
param: Jack
param: Jill
reject...
param: Jack
param: Jill
As you can see, for each and map, the single parameter gets assigned to a [key, value] array, but for select and reject, the parameter is only the key.
Is there a particular reason for this behavior? The docs don't seem to mention this at all; all of the examples given just assume that you are passing in two parameters.
Just checked Rubinius behavior and it is indeed consistent with CRuby. So looking at the Ruby implementation - it is indeed because #select yields two values:
yield(item.key, item.value)
while #each yields an array with two values:
yield [item.key, item.value]
Yielding two values to a block that expects one takes the first argument and ignores the second one:
def foo
yield :bar, :baz
end
foo { |x| p x } # => :bar
Yielding an array assigns the whole array if the block has one parameter, or unpacks it and assigns each element to a parameter (as if you passed them one by one) if the block has two or more parameters.
def foo
yield [:bar, :baz]
end
foo { |x| p x } # => [:bar, :baz]
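The four combinations can be compared side by side. The method names below are invented for this sketch. It uses plain yield rather than Hash itself, so it only illustrates the block-parameter rules described above:

```ruby
def yield_two_values
  yield :bar, :baz
end

def yield_one_array
  yield [:bar, :baz]
end

# Two values, one parameter: the second value is dropped
one_from_values = yield_two_values { |x| x }
# Two values, two parameters: one value each
two_from_values = yield_two_values { |a, b| [a, b] }
# One array, one parameter: the block sees the whole array
one_from_array = yield_one_array { |x| x }
# One array, two parameters: the array is unpacked
two_from_array = yield_one_array { |a, b| [a, b] }
```

Only the first case loses information, which corresponds to the select and reject output in the question where just the key is printed.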
As for why they made that decision, there probably isn't any good reason behind it. People just weren't expected to call these methods with a one-parameter block.
My guess is that internally map is just each with collect. Interesting they don't work quite the same way.
As to each...
The source code is below. It checks the arity of the block, i.e. how many parameters the block declares. If there is more than one it calls each_pair_i_fast, otherwise just each_pair_i.
static VALUE
rb_hash_each_pair(VALUE hash)
{
RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
if (rb_block_arity() > 1)
rb_hash_foreach(hash, each_pair_i_fast, 0);
else
rb_hash_foreach(hash, each_pair_i, 0);
return hash;
}
each_pair_i_fast yields two distinct values:
each_pair_i_fast(VALUE key, VALUE value)
{
rb_yield_values(2, key, value);
return ST_CONTINUE;
}
each_pair_i does not:
each_pair_i(VALUE key, VALUE value)
{
rb_yield(rb_assoc_new(key, value));
return ST_CONTINUE;
}
rb_assoc_new returns a two-element array (at least I'm assuming that is what rb_ary_new3 does):
rb_assoc_new(VALUE car, VALUE cdr)
{
return rb_ary_new3(2, car, cdr);
}
select looks like this:
rb_hash_select(VALUE hash)
{
VALUE result;
RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
result = rb_hash_new();
if (!RHASH_EMPTY_P(hash)) {
rb_hash_foreach(hash, select_i, result);
}
return result;
}
and select_i looks like this:
select_i(VALUE key, VALUE value, VALUE result)
{
if (RTEST(rb_yield_values(2, key, value))) {
rb_hash_aset(result, key, value);
}
return ST_CONTINUE;
}
Note that rb_yield_values(2, key, value) yields two distinct values, just as each_pair_i_fast does, while rb_hash_aset only stores the pair in the result hash.
Most importantly, notice that select etc. don't check the block arity at all.
Sources:
https://github.com/ruby/ruby/blob/d5c5d5c778a0e8d61ab07669132dc18fb1a2e874/hash.c
https://github.com/ruby/ruby/blob/9f44b77a18d4d6099174c6044261eb1611a147ea/array.c
