Hash of arrays to filepath-like array [duplicate] - ruby

Here is a structure of hash of arrays:
[
{
"key1" => [
"value1",
{"key2" => ["value2"]},
{"key3" => [
"value3",
{
"key4" => "value4"
}
]
}
]
},
{
"anotherKey1" => [],
}
]
I want the output for that structure to look like file paths:
/key1/value1
/key1/key2/value2
/key3/value3
/key3/key4/value4
How can I do that without reinventing the wheel? Simple recursion could help, but are there any ready-made modules?

I do not think you would be reinventing any wheels by doing this. You want to traverse a nested structure of arrays and hashes and react completely differently to each element depending on whether it is an Array or a Hash. No library function is going to do exactly that for you, because you would need to vary more than one thing with blocks to get the flexibility you might want.
In short: write your recursive function to do this.
(Btw: The top level of your data structure is an array of hashes, not a hash of arrays …)

I decided to write my own wheel (thanks to Patru, upvoted).
And I have this function:
def flat_hash_of_arrays(hash, string = "", delimiter = "/", result = [])
  hash.each do |key, value|
    # String#+ builds a new string, so the caller's prefix is never mutated
    new_string = string + delimiter + key
    # only array values are walked; any other value is ignored
    if value.is_a?(Array)
      value.each do |element_of_array|
        # a string is a leaf: emit the finished path, no recursion needed
        if element_of_array.is_a?(String)
          result << new_string + delimiter + element_of_array
        end
        # a hash needs recursion, with the current path as the prefix
        if element_of_array.is_a?(Hash)
          flat_hash_of_arrays(element_of_array, new_string, delimiter, result)
        end
      end
    end
  end
  # return the accumulator; hash.each on its own would return the hash
  result
end
and test it:
flatten_hash = {
"key1" => [
"value1",
{"key2" => ["value2"]},
{"key3" => [
"value3",
{
"key4" => "value4"
}
]
},
"value4",
{
"key4" => ["value5"],
}
]
}
result = []
flat_hash_of_arrays(flatten_hash,"","/",result)
puts result
output is:
/key1/value1
/key1/key2/value2
/key1/key3/value3
/key1/value4
/key1/key4/value5
fine!
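The function above skips values that are not arrays, so a pair such as "key4" => "value4" produces no path. As a rough sketch only, here is one way the recursion could treat every non-Hash, non-Array value as a leaf and return a fresh array instead of filling an accumulator. The name flatten_paths is made up for this illustration and is not part of the answer above.

```ruby
# Hypothetical helper, not the answer's function: walks hashes and arrays
# and returns one path string per leaf value.
def flatten_paths(node, prefix = "", delimiter = "/")
  case node
  when Hash
    # each key extends the current path
    node.flat_map { |key, value| flatten_paths(value, "#{prefix}#{delimiter}#{key}", delimiter) }
  when Array
    # arrays add no path segment of their own; an empty array yields no paths
    node.flat_map { |element| flatten_paths(element, prefix, delimiter) }
  else
    # anything else is a leaf
    ["#{prefix}#{delimiter}#{node}"]
  end
end

structure = [
  { "key1" => ["value1", { "key2" => ["value2"] }, { "key3" => ["value3", { "key4" => "value4" }] }] },
  { "anotherKey1" => [] }
]
puts flatten_paths(structure)
```

With the structure from the question this prints the three /key1/... paths the answer's function would find, plus /key1/key3/key4/value4 for the string-valued key.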

Related

Difficulty when constructing a nested data structure

While trying to create a JSON message for an API, I found myself struggling to do something that I thought would be simple. I needed to create a message like the following:
{ "list": [ { "foo": 1, "bar": 2 } ] }
However, my first attempt did not work:
say to-json { foo => [ { a => 1, b => 2 } ] };
# {"foo":[{"a":1},{"b":2}]}
Trying to simplify things further confused me more:
say { foo => [ { a => 1 } ] };
# {foo => [a => 1]}
# Note that this is not JSON, but I expected to see curly braces
Then I tried to use some temporary variables, and that worked:
my @list = { a => 1 };
say to-json { foo => @list };
# {"foo":[{"a":1}]}
my %hash = ( a => 1 );
say to-json { foo => [ %hash ] };
# {"foo":[{"a":1}]}
What's going on here?
And is there a way I can achieve my desired output without an extra temporary variable?
You've discovered the single argument rule. Numerous constructs in Raku will iterate the argument they are provided with. This includes the [...] array composer. This is why when we say:
say [1..10];
We get an array that contains 10 elements, not 1. However, it also means that:
say [[1,2]];
Iterates the [1,2], and thus results in [1,2] - as if the inner array were not there. A Hash iterates to its pairs, thus:
{ foo => [ { a => 1, b => 2 } ] }
Actually produces:
{ foo => [ a => 1, b => 2 ] }
That is, the array has the pairs. The JSON serializer then serializes each pair as a one-element object.
The solution is to produce a single-element iterable. The infix , operator is what produces lists, so we can use that:
say to-json { foo => [ { a => 1, b => 2 }, ] };
# note the , here ^
Then the single argument to be iterated is a 1-element list with a hash, and you get the result you want.
Easy way to remember it: always use trailing commas when specifying the values of a list, array or hash, even with a single element list, unless you actually are specifying the single iterable from which to populate it.

Manipulate hash in Ruby

I have a hash that looks like
{
"lt"=>"456",
"c"=>"123",
"system"=>{"pl"=>"valid-player-name", "plv"=>"player_version_1"},
"usage"=>{"trace"=>"1", "cq"=>"versionid", "stream"=>"od",
"uid"=>"9", "pst"=>[["0", "1", "10"]], "dur"=>"0", "vt"=>"2"}
}
How can I go about turning it into a hash that looks like
{
"lt"=>"456",
"c"=>"123",
"pl"=>"valid-player-name",
"plv"=>"player_version_1",
"trace"=>"1",
"cq"=>"versionid",
"stream"=>"od",
"uid"=>"9",
"pst"=>[["0", "1", "10"]], "dur"=>"0", "vt"=>"2"
}
I basically want to get rid of the keys system and usage and keep what's nested inside them
"Low-tech" version :)
h = { ... }
h.merge!(h.delete('system'))
h.merge!(h.delete('usage'))
Assuming no rails:
hash.reject { |key, _| %w(system usage).include? key }.merge(hash['system']).merge(hash['usage'])
With active support:
hash.except('system', 'usage').merge(hash['system']).merge(hash['usage'])
A more generic version.
Merge any key that contains a hash:
h = { ... }
hnew = h.inject(h.dup) { |h2, (k, v)|
h2.merge!(h2.delete(k)) if v.is_a?(Hash)
h2
}
Assuming that your data has the same structure each time, I might opt for something simple and easy to understand like this:
def manipulate_hash(h)
  {
    "lt" => h["lt"],
    "c" => h["c"],
    "pl" => h["system"]["pl"],
    "plv" => h["system"]["plv"],
    "trace" => h["usage"]["trace"],
    "cq" => h["usage"]["cq"],
    "stream" => h["usage"]["stream"],
    # uid, pst, dur and vt are nested under "usage" in the input hash
    "uid" => h["usage"]["uid"],
    "pst" => h["usage"]["pst"],
    "dur" => h["usage"]["dur"],
    "vt" => h["usage"]["vt"]
  }
end
I chose to make the hash using one big hash literal expression that spans multiple lines. If you don't like that, you could build it up on multiple lines like this:
def manipulate_hash(h)
  r = {}
  r["lt"] = h["lt"]
  r["c"] = h["c"]
  ...
  r
end
You might consider using fetch instead of the [] square brackets. That way, you'll get an exception (a KeyError) if the expected key is missing from the hash, instead of a silent nil. For example, replace h["lt"] with h.fetch("lt").
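To make the difference between [] and fetch concrete, here is a small illustration with a made-up one-key hash. The key "missing" and the fallback "n/a" are only examples.

```ruby
h = { "lt" => "456" }

# [] returns nil for an absent key, so a typo can go unnoticed
p h["missing"]

# fetch raises KeyError for an absent key
begin
  h.fetch("missing")
rescue KeyError => e
  puts "KeyError: #{e.message}"
end

# fetch also accepts a fallback value for keys that may legitimately be absent
p h.fetch("missing", "n/a")
```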
If you plan to have an arbitrarily large list of keys to merge, this approach scales easily (note that it modifies myhash in place):
["system", "usage"].each_with_object(myhash) do |key|
myhash.merge!(myhash.delete(key))
end
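If the original hash has to stay untouched, the same idea can be written without merge! and delete. This is only a sketch; lift_nested is a hypothetical name, and it assumes every listed key, when present, maps to a hash.

```ruby
# Hypothetical helper: returns a new hash in which the contents of the
# given keys are lifted to the top level. The input is not modified.
def lift_nested(hash, keys)
  # everything except the keys being lifted
  rest = hash.reject { |key, _| keys.include?(key) }
  # merge each nested hash in turn; a missing key contributes nothing
  keys.reduce(rest) { |merged, key| merged.merge(hash.fetch(key, {})) }
end

input = {
  "lt" => "456",
  "c" => "123",
  "system" => { "pl" => "valid-player-name", "plv" => "player_version_1" },
  "usage" => { "trace" => "1", "uid" => "9" }
}
p lift_nested(input, ["system", "usage"])
```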

Sorting a hash of hashes in ruby

How can I sort this hash of hashes by "clients"? I tried using sort_by, but this transforms it into an array of hashes. I am using JSON.parse to create this object from a json file. Thanks!
{
"default_attributes": {
"clients": {
"ABC": {
"db_name": "databaseabc"
},
"HIJ": {
"db_name": "databasehij"
},
"DEF": {
"db_name": "databasedef"
}
}
}
}
Why do you want to sort a hash? There's no advantage to it. Instead, get the keys, sort those, then use the keys to retrieve the data in the order you want.
For instance:
hash = {'z' => 26, 'a' => 1}
sorted_keys = hash.keys.sort # => ["a", "z"]
hash.values_at(*sorted_keys) # => [1, 26]
Using your example hash:
hash = {
"default_attributes": {
"clients": {
"ABC": {
"db_name": "databaseabc"
},
"HIJ": {
"db_name": "databasehij"
},
"DEF": {
"db_name": "databasedef"
}
}
}
}
clients = hash[:default_attributes][:clients]
sorted_keys = clients.keys.sort # => [:ABC, :DEF, :HIJ]
clients.values_at(*sorted_keys)
# => [{:db_name=>"databaseabc"},
# {:db_name=>"databasedef"},
# {:db_name=>"databasehij"}]
Or:
sorted_keys.each do |k|
puts clients[k][:db_name]
end
# >> databaseabc
# >> databasedef
# >> databasehij
Note: From looking at your "hash", it really looks like a JSON string missing the original surrounding { and }. If it is, this question becomes somewhat of an "XY problem". The first question should be "how do I convert a JSON string back to a Ruby object?":
require 'json'
hash = '{
"default_attributes": {
"clients": {
"ABC": {
"db_name": "databaseabc"
},
"HIJ": {
"db_name": "databasehij"
},
"DEF": {
"db_name": "databasedef"
}
}
}
}'
foo = JSON[hash]
# => {"default_attributes"=>
# {"clients"=>
# {"ABC"=>{"db_name"=>"databaseabc"},
# "HIJ"=>{"db_name"=>"databasehij"},
# "DEF"=>{"db_name"=>"databasedef"}}}}
At that point foo would contain a regular hash, and the inconsistent symbol definitions like "default_attributes": and "clients": would make sense because they ARE JSON hash keys, and the resulting parsed object would be a standard Ruby hash definition. And, you'll have to adjust the code above to access the individual nested hash keys.
If you are using Ruby <1.9, hashes are order-undefined. Sorting them makes no sense.
Ruby 1.9+ has ordered hashes; you would use sort_by, then convert the resulting array of [key, value] pairs back into a hash. Ruby 2.1+ provides Array#to_h for this.
data["default_attributes"]["clients"] = data["default_attributes"]["clients"].sort_by(&:first).to_h
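For a parsed JSON document with string keys, the sort_by plus to_h step might look like the sketch below. The function name with_sorted_clients is invented for this example, and the JSON string is a compact copy of the data in the question.

```ruby
require 'json'

# Hypothetical helper: returns a copy of the parsed document whose
# "clients" hash is ordered by client name. The input is left unchanged.
def with_sorted_clients(data)
  attributes = data.fetch("default_attributes")
  # sort_by yields [name, details] pairs; to_h turns the sorted pairs back into a hash
  sorted = attributes.fetch("clients").sort_by { |name, _details| name }.to_h
  data.merge("default_attributes" => attributes.merge("clients" => sorted))
end

json = '{"default_attributes":{"clients":{"ABC":{"db_name":"databaseabc"},' \
       '"HIJ":{"db_name":"databasehij"},"DEF":{"db_name":"databasedef"}}}}'
data = JSON.parse(json)
p with_sorted_clients(data)["default_attributes"]["clients"].keys
```

Because merge builds new hashes, data keeps its original insertion order.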
hash = {
default_attributes: {
clients: {
ABC: {
"db_name": "databaseabc"
},
HIJ: {
"db_name": "databasehij"
},
DEF: {
"db_name": "databasedef"
}
}
}
}
If you do not wish to mutate hash, it's easiest to first make a deep copy:
h = Marshal.load(Marshal.dump(hash))
and then sort the relevant part of h:
h[:default_attributes][:clients] =
h[:default_attributes][:clients].sort.to_h
h
#=> {:default_attributes=>
# {:clients=>
# {:ABC=>{:db_name=>"databaseabc"},
# :DEF=>{:db_name=>"databasedef"},
# :HIJ=>{:db_name=>"databasehij"}}}}
Confirm hash was not mutated:
hash
#=> {:default_attributes=>
# {:clients=>
# {:ABC=>{:db_name=>"databaseabc"},
# :HIJ=>{:db_name=>"databasehij"},
# :DEF=>{:db_name=>"databasedef"}}}}
One of our interns came up with a pretty slick gem to perform deep sorts on hashes/arrays:
def deep_sort_by(&block)
Hash[self.map do |key, value|
[if key.respond_to? :deep_sort_by
key.deep_sort_by(&block)
else
key
end,
if value.respond_to? :deep_sort_by
value.deep_sort_by(&block)
else
value
end]
end.sort_by(&block)]
end
You can inject it into all hashes and then just call it like this:
myMap.deep_sort_by { |obj| obj }
The code would be similar for an array. We published "deepsort" as a gem for others to use. See "Deeply Sort Nested Ruby Arrays And Hashes" for additional details.
Disclaimer: I work for this company.
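For readers who only need nested hashes ordered by key and would rather not patch Hash, a standalone recursive function is one option. The sketch below is not the gem's implementation; deep_sort is a hypothetical name, keys are compared by their string form (an assumption made so that symbol and string keys can be mixed), and array elements are recursed into but left in their original order.

```ruby
# Hypothetical helper, unrelated to the "deepsort" gem: returns a copy of
# obj in which every hash, at any depth, is ordered by key.
def deep_sort(obj)
  case obj
  when Hash
    obj.sort_by { |key, _| key.to_s }
       .map { |key, value| [key, deep_sort(value)] }
       .to_h
  when Array
    # keep element order, but sort any hashes inside the array
    obj.map { |element| deep_sort(element) }
  else
    obj
  end
end

p deep_sort({ b: { d: 1, c: 2 }, a: [{ z: 1, y: 2 }] })
```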

Apply an array containing keys iteratively to a hash

I have the following array
["key", "key_deeper", "key_even_deeper"]
and a hash:
{ "key" => { "key_deeper" => { "key_even_deeper" => "BINGO!" } } }
What is the shortest or most expressive way to apply the array on the hash to receive "BINGO!"?
That is for the base case, but there is also a special case where the value to a key is not only String => Hash, but also String => [Integer, Hash].
For instance
["key1", "key2"]
on Hash
{"key1" => [5, {"key2" => "BINGO!" }] }
should return again "BINGO!", but an array containing only ["key1"] would simply return 5.
Probably the easiest way is to use inject:
array.inject(hash) do |h, i|
h.fetch(i){ {} }
end
# => "BINGO!"
The fetch is used to prevent a NoMethodError in case one of your lookup values is not present in the hash. However, in that case, it will return an empty hash. You may want to do the standard lookup instead, i.e.
array.inject(hash) {|h,i| h[i] }
Edit:
Here's an even shorter way to do this (I don't know if I would say it's 'more expressive', but it is shorter):
array.inject(hash, :[])
You can change the original answer a little bit for your second version of question:
array.inject(hash){ |h,i| h[i].is_a?(Array) ? h[i].last : h[i] }
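For the base case, Ruby 2.3+ also offers Hash#dig, so hash.dig(*array) returns "BINGO!" directly. For the special case, the one-liner above always takes the last element of an array, so a path that ends on a [Integer, Hash] pair yields the Hash rather than the Integer the question asks for. The sketch below is one possible reading of the requirement; lookup is a hypothetical name, and it assumes such an array holds exactly one hash.

```ruby
# Hypothetical helper: follows keys through nested hashes, stepping into
# the Hash member of an [Integer, Hash] pair when more keys remain.
def lookup(hash, keys)
  result = keys.reduce(hash) do |node, key|
    # when the current node is an array, continue with the hash it contains
    node = node.find { |element| element.is_a?(Hash) } if node.is_a?(Array)
    node.fetch(key) # raises KeyError if the key is absent
  end
  # a path that ends on a pair returns its first element, e.g. the Integer
  result.is_a?(Array) ? result.first : result
end

nested = { "key" => { "key_deeper" => { "key_even_deeper" => "BINGO!" } } }
mixed  = { "key1" => [5, { "key2" => "BINGO!" }] }

p nested.dig("key", "key_deeper", "key_even_deeper")
p lookup(mixed, ["key1", "key2"])
p lookup(mixed, ["key1"])
```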

Ruby - Array of Hashes, Trying to Select Multiple Keys and Group By Key Value

I have a set of data that is an array of hashes, with each hash representing one record of data:
data = [
{
:id => "12345",
:bucket_1_rank => "2",
:bucket_1_count => "12",
:bucket_2_rank => "7",
:bucket_2_count => "25"
},
{
:id => "45678",
:bucket_1_rank => "2",
:bucket_1_count => "15",
:bucket_2_rank => "9",
:bucket_2_count => "68"
},
{
:id => "78901",
:bucket_1_rank => "5",
:bucket_1_count => "36"
}
]
The ranks values are always between 1 and 10.
What I am trying to do is select each of the possible values for the rank fields (the :bucket_1_rank and :bucket_2_rank fields) as keys in my final resultset, and the values for each key will be an array of all the values in its associated :bucket_count field. So, for the data above, the final resulting structure I have in mind is something like:
bucket 1:
{"2" => ["12", "15"], "5" => ["36"]}
bucket 2:
{"7" => ["25"], "9" => ["68"]}
I can do this working under the assumption that the field names stay the same, or through hard coding the field/key names, or just using group_by for the fields I need, but my problem is that I work with a different data set each month where the rank fields are named slightly differently depending on the project specs, and I want to identify the names for the count and rank fields dynamically as opposed to hard coding the field names.
I wrote two quick helpers, get_ranks and get_counts, that use regex to return an array of field names that are either rank or count fields, since these fields will always have the literal string "_rank" or "_count" in their names:
ranks = get_ranks
counts = get_counts
results = Hash.new{|h,k| h[k] = []}
data.each do |i|
ranks.each do |r|
unless i[r].nil?
counts.each do |c|
results[i[r]] << i[c]
end
end
end
end
p results
This seems to be close, but feels awkward, and it seems to me there has to be a better way to iterate through this data set. Since I haven't worked on this project using Ruby, I'd use this as an opportunity to improve my understanding of iterating through arrays of hashes, populating a hash with arrays as values, etc. Any resources/suggestions would be much appreciated.
You could shorten it to:
result = Hash.new{|h,k| h[k] = Hash.new{|h2,k2| h2[k2] = []}}
data.each do |hsh|
hsh.each do |key, value|
result[$1][value] << hsh["#{$1}_count".to_sym] if key =~ /(.*)_rank$/
end
end
puts result
#=> {"bucket_1"=>{"2"=>["12", "15"], "5"=>["36"]}, "bucket_2"=>{"7"=>["25"], "9"=>["68"]}}
Though this is assuming that :bucket_2_item_count is actually supposed to be :bucket_2_count.
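The same grouping can be written without the $1 global, which some readers find easier to follow. This is only a sketch under the same assumption as the answer, namely that every "<name>_rank" field has a matching "<name>_count" field with symbol keys. group_counts_by_rank is a hypothetical name.

```ruby
# Hypothetical helper: groups count values by bucket name and rank value.
def group_counts_by_rank(data)
  # nested default blocks create the inner hash and array on first access
  result = Hash.new { |h, bucket| h[bucket] = Hash.new { |h2, rank| h2[rank] = [] } }
  data.each do |record|
    record.each do |key, rank|
      match = key.to_s.match(/\A(.*)_rank\z/)
      next unless match
      bucket = match[1]
      # look up the sibling count field, e.g. :bucket_1_count
      result[bucket][rank] << record[:"#{bucket}_count"]
    end
  end
  result
end

data = [
  { id: "12345", bucket_1_rank: "2", bucket_1_count: "12", bucket_2_rank: "7", bucket_2_count: "25" },
  { id: "45678", bucket_1_rank: "2", bucket_1_count: "15", bucket_2_rank: "9", bucket_2_count: "68" },
  { id: "78901", bucket_1_rank: "5", bucket_1_count: "36" }
]
p group_counts_by_rank(data)
```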
