Creating a map from an array of arrays in Ruby - ruby

I'm trying to write a function in Ruby which can take in a in an array of arrays, and convert it into a hash. This will be used to simulate sentences and create arbitrary word sequences. The resulting map will be helpful in generating all combination of sentences available with the current sentence rules.
How do I go about achieving this? I'm a bit lost as to where to start.

Try this:
arr = [ ["<start>", "The <object> <verb> tonight."],
["<object>", "waves", "big yellow flowers", "slugs"],
["<verb>", "sigh <adverb>", "portend like <object>", "die <adverb>"],
["<adverb>", "warily", "grumpily"] ]
arr.map { |ar| [ar.shift, ar.map { |str| str.split }] }.to_h
#=>
#{ "<start>" => [["The", "<object>", "<verb>", "tonight."]],
# "<object>" => [["waves"], ["big", "yellow", "flowers"], ["slugs"]],
# "<verb>" => [["sigh", "<adverb>"], ["portend", "like", "<object>"], ["die", "<adverb>"]],
# "<adverb>" => [["warily"], ["grumpily"]] }
ar.shift takes the first element of each subarray. The block used with ar.map splits (on whitespace) the remaining elements into arrays. Finally to_h converts the resulting array into a hash.

Related

Convert a matrix into a hash data strucutre

i have a matrix like this
[
["name", "company1", "company2", "company3"],
["hr_admin", "Tom", "Joane", "Kris"],
["manager", "Philip", "Daemon", "Kristy"]
]
How can I convert into this data structure?
{
"company1" => {
"hr_admin"=> "Tom",
"manager" => "Philip"
},
"Company2" => {
"hr_admin"=> "Joane",
"manager" => "Daemon"
},
"company3" => {
"hr_admin"=> "Kris",
"manager" => "Kristy"
}
}
I have tried approach like taking out the first row of matrix (header) and zipping the rest o
f the matrix to change their position. It worked to some extent but it doesnt looks very good. So I am turning up here for help.
matrix[0][1...matrix[0].length].each_with_index.map do |x,i|
values = matrix[1..matrix.length].map do |x|
[x[0], x[i+1]]
end.to_h
[x, values]
end.to_h
matrix[0].length and matrix.length could be omittable depending on ruby version.
First you take all elements of first row but first.
then you map them with index to e.g. [["hr_admin", "Tom"],["manager", "Phil"]] using the index
then you call to_h on every element and on whole array.
arr = [
["name", "company1", "company2", "company3"],
["hr_admin", "Tom", "Joane", "Kris"],
["manager", "Philip", "Daemon", "Kristy"]
]
Each key-value pair of the hash to be constructed is formed from the "columns" of arr. It therefore is convenient to compute the transpose of arr:
(_, *positions), *by_company = arr.transpose
#=> [["name", "hr_admin", "manager"],
# ["company1", "Tom", "Philip"],
# ["company2", "Joane", "Daemon"],
# ["company3", "Kris", "Kristy"]]
I made use of Ruby's array decomposition (a.k.a array destructuring) feature (see this blog for elabortion) to assign different parts of the inverse of arr to variables. Those values are as follows1.
_ #=> "name"
positions
#=> ["hr_admin", "manager"]
by_company
#=> [["company1", "Tom", "Philip"],
# ["company2", "Joane", "Daemon"],
# ["company3", "Kris", "Kristy"]]
It is now a simple matter to form the desired hash. Once again I will use array decomposition to advantage.
by_company.each_with_object({}) do |(company_name, *employees),h|
h[company_name] = positions.zip(employees).to_h
end
#=> {"company1"=>{"hr_admin"=>"Tom", "manager"=>"Philip"},
# "company2"=>{"hr_admin"=>"Joane", "manager"=>"Daemon"},
# "company3"=>{"hr_admin"=>"Kris", "manager"=>"Kristy"}}
When, for example,
company_name, *employees = ["company1", "Tom", "Philip"]
company_name
#=> "company1"
employees
#=> ["Tom", "Philip"]
so
h[company_name] = positions.zip(employees).to_h
h["company1"] = ["hr_admin", "manager"].zip(["Tom", "Philip"]).to_h
= [["hr_admin", "Tom"], ["manager", "Philip"]].to_h
= {"hr_admin"=>"Tom", "manager"=>"Philip"}
Note that these calculations do not depend on the numbers of rows or columns of arr.
1. As is common practice, I used the special variable _ to signal to the reader that its value is not used in subsequent calculations.

Merge duplicate values in json using ruby

I have the following item.json file
{
"items": [
{
"brand": "LEGO",
"stock": 55,
"full-price": "22.99",
},
{
"brand": "Nano Blocks",
"stock": 12,
"full-price": "49.99",
},
{
"brand": "LEGO",
"stock": 5,
"full-price": "199.99",
}
]
}
There are two items named LEGO and I want to get output for the total number of stock for the individual brand.
In ruby file item.rb i have code like:
require 'json'
path = File.join(File.dirname(__FILE__), '../data/products.json')
file = File.read(path)
products_hash = JSON.parse(file)
products_hash["items"].each do |brand|
puts "Stock no: #{brand["stock"]}"
end
I got output for stock no individually for each brand wherein I need the stock to be summed for two brand name "LEGO" displayed as one.
Anyone has solution for this?
json = File.open(path,'r:utf-8',&:read) # in case the JSON uses UTF-8
items = JSON.parse(json)['items']
stock_by_brand = items
.group_by{ |h| h['brand'] }
.map do |brand,array|
[ brand,
array
.map{ |item| item['stock'] }
.inject(:+) ]
end
.to_h
#=> {"LEGO"=>60, "Nano Blocks"=>12}
It works like this:
Enumerable#group_by takes the array of items and creates a hash mapping the brand name to an array of all item hashes with that brand
Enumerable#map turns each brand/array pair in that hash into an array of the brand (unchanged) followed by:
Enumerable#map on the array of items picks out just the "stock" counts, and then
Enumerable#inject sums them all together
Array#to_h then turns that array of two-value arrays into a hash, mapping the brand to the sum of stock values.
If you want simpler code that's less functional and possibly easier to understand:
stock_by_brand = {} # an empty hash
items.each do |item|
stock_by_brand[ item['brand'] ] ||= 0 # initialize to zero if unset
stock_by_brand[ item['brand'] ] += item['stock']
end
p stock_by_brand #=> {"LEGO"=>60, "Nano Blocks"=>12}
To see what your JSON string looks like, let's create it from your hash, which I've denoted h:
require 'json'
j = JSON.generate(h)
#=> "{\"items\":[{\"brand\":\"LEGO\",\"stock\":55,\"full-price\":\"22.99\"},{\"brand\":\"Nano Blocks\",\"stock\":12,\"full-price\":\"49.99\"},{\"brand\":\"LEGO\",\"stock\":5,\"full-price\":\"199.99\"}]}"
After reading that from a file, into the variable j, we can now parse it to obtain the value of "items":
arr = JSON.parse(j)["items"]
#=> [{"brand"=>"LEGO", "stock"=>55, "full-price"=>"22.99"},
# {"brand"=>"Nano Blocks", "stock"=>12, "full-price"=>"49.99"},
# {"brand"=>"LEGO", "stock"=>5, "full-price"=>"199.99"}]
One way to obtain the desired tallies is to use a counting hash:
arr.each_with_object(Hash.new(0)) {|g,h| h.update(g["brand"]=>h[g["brand"]]+g["stock"])}
#=> {"LEGO"=>60, "Nano Blocks"=>12}
Hash.new(0) creates an empty hash (represented by the block variable h) with with a default value of zero1. That means that h[k] returns zero if the hash does not have a key k.
For the first element of arr (represented by the block variable g) we have:
g["brand"] #=> "LEGO"
g["stock"] #=> 55
Within the block, therefore, the calculation is:
g["brand"] => h[g["brand"]]+g["stock"]
#=> "LEGO" => h["LEGO"] + 55
Initially h has no keys, so h["LEGO"] returns the default value of zero, resulting in { "LEGO"=>55 } being merged into the hash h. As h now has a key "LEGO", h["LEGO"], will not return the default value in subsequent calculations.
Another approach is to use the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged:
arr.each_with_object({}) {|g,h| h.update(g["brand"]=>g["stock"]) {|_,o,n| o+n}}
#=> {"LEGO"=>60, "Nano Blocks"=>12}
1 k=>v is shorthand for { k=>v } when it appears as a method's argument.

Creating a hash where the keys are strings, values are numbers

I have an array:
["Melanie", "149", "Joe", "2", "16", "216", "Sarah"]
I want to create a hash:
{"Melanie"=>[149], "Joe"=>[2, 16, 216] "Sarah"=>nil}
How would I accomplish this when the keys and values are in the same array?
All values would be integers (although they are in string form in the array.) All keys start and end with a letter.
Your expected hash is invalid. Therefore, it is impossible to get what you wrote that you want.
From your issue, it looks reasonable to expect the values to be array. In that case, you can do it like this:
["Melanie", "149", "Joe", "2", "16", "216", "Sarah"]
.slice_before(/[a-z]/i).map{|k, *v| [k, v.map(&:to_i)]}.to_h
# => {"Melanie"=>[149], "Joe"=>[2, 16, 216], "Sarah"=>[]}
With little modification, you can let the value be a number instead of an array when the array length is one, but that is not a good design; it would introduce exceptions.
Try this
def numeric?(x)
x.chars.all? { |y| ('0'..'9').include?(y) }
end
array = ["Melanie", "149", "Joe", "2", "16", "216", "Sarah"]
keys = array.select { |x| not numeric?(x) }
map = {}
keys.each do |k|
from = array.index(k) + 1
to = array.index( keys[keys.index(k) + 1] )
map[k] = to ? array[from...to] : array[from..from]
end
p map
Output:
{"Melanie"=>["149"], "Joe"=>["2", "16", "216"], "Sarah"=>[]}
[Finished in 0.1s]
Here's another way:
arr = ["Melanie", "149", "Joe", "2", "16", "216", "Sarah"]
class String
def integer?
!!(self =~ /^-?\d+$/)
end
end
Hash[*arr.each_with_object([]) { |s,a| s.integer? ? a[-1] << s.to_i : a<<s<<[] }].
tap { |h| h.each_key { |k| h[k] = nil if h[k].empty? } }
#=> {"Melanie"=>[149], "Joe"=>[2, 16, 216], "Sarah"=>nil}
There are three components to your question, and I will try to answer them separately.
Regarding storing a multi-valued mapping, while there are specialized solutions available, the most common recommendation is just to store a hash whose values are arrays. That is, for your use case, your primary data structure is a hash whose keys are strings and whose values are arrays of integers. Depending on your desired behavior for duplicates etc., etc, you may wish to substitute a different data structure for the value structure, possibly a set.
Regarding identifying strings containing numbers and strings not containing numbers, well, that depends on exactly what your non-number-containing strings could instead contain, but a good starting point would be to perform a regular expression match for digits. You didn't specify whether your allowable numeric strings represented integers, floating points, etc. The particular answer to that may affect your overall strategy. Unfortunately, input parsing and validation is a complex and messy topic in the general case.
Regarding the actual conversion process, I would recommend the following strategy. Iterate through your input array. Check each string for whether it is numeric or non-numeric. If it is non-numeric, store that as the current key in a local. Also, in your hash, create a mapping from that key to a new empty array. If, instead, the string is numeric, convert it into a number, and add it to the array under the appropriate key.
I don't know if there's a pretty way to do it. I'd do something like this:
def numeric?(string)
# `!!` converts parsed number to `true`
!!Kernel.Float(string)
rescue TypeError, ArgumentError
false
end
def my_method(input_array)
# associate values with proper key and stores result in output
curr_key = nil
output = {}
input_array.each do |e|
if !numeric?(e)
output[e] = []
curr_key = e
else
# use Float if values may be floating-point
output[curr_key] << Integer(e, 10)
end
end
output.each do |k, v|
output[k] = v.empty? ? nil : v
end
output
end
Source for numeric method.

Can someone explain why this ruby code for finding anagrams works?

Here's the line:
words.group_by { |word| word.downcase.chars.sort }.values
words is an array of words that are then sorted into groups if they share the same letters as another word in the original array. Can someone go down the line and explain how this works?
Okay, let's go through the methods one by one:
group_by: is a method that takes an Enumerable (in this case an Array) and a block. It passes each value in the array to the block, and then looks at the block's results. It then returns a Hash, where the keys are the results of the block, and the values are arrays of the original input that had the same result.
downcase: takes a string and makes it all lowercase
chars: converts a string into an array of characters
sort: sorts a collection
values: returns just the values of a Hash.
With those definitions, let's return to your code:
words.group_by { |word| word.downcase.chars.sort }.values
"Take each word in the array of words, make it lowercase, then sort its
letters into alphabetical order. The words that contain exactly the same
letters will have the same sorted result, so we can group them into a
Hash of Arrays, with the sorted letters as the key and the Array of
matching words as the value. We don't actually care what the actual letters
are, just the words that share them, so we take only the Arrays (the values)."
To break it down with an example:
words = %w[throw worth threw Live evil]
# => [ 'throw', 'worth', 'threw', 'Live', 'evil' ]
words_hash = words.group_by {|word| word.downcase.chars.sort}
# => {
# 'eilv' => [ 'evil', 'Live' ],
# 'ehrtw' => [ 'threw'],
# 'hortw' => [ 'throw', 'worth']
# }
words_hash.values
# => [['evil', 'Live'], ['threw'], ['throw', 'worth']]

Convert array-of-hashes to a hash-of-hashes, indexed by an attribute of the hashes

I've got an array of hashes representing objects as a response to an API call. I need to pull data from some of the hashes, and one particular key serves as an id for the hash object. I would like to convert the array into a hash with the keys as the ids, and the values as the original hash with that id.
Here's what I'm talking about:
api_response = [
{ :id => 1, :foo => 'bar' },
{ :id => 2, :foo => 'another bar' },
# ..
]
ideal_response = {
1 => { :id => 1, :foo => 'bar' },
2 => { :id => 2, :foo => 'another bar' },
# ..
}
There are two ways I could think of doing this.
Map the data to the ideal_response (below)
Use api_response.find { |x| x[:id] == i } for each record I need to access.
A method I'm unaware of, possibly involving a way of using map to build a hash, natively.
My method of mapping:
keys = data.map { |x| x[:id] }
mapped = Hash[*keys.zip(data).flatten]
I can't help but feel like there is a more performant, tidier way of doing this. Option 2 is very performant when there are a very minimal number of records that need to be accessed. Mapping excels here, but it starts to break down when there are a lot of records in the response. Thankfully, I don't expect there to be more than 50-100 records, so mapping is sufficient.
Is there a smarter, tidier, or more performant way of doing this in Ruby?
Ruby <= 2.0
> Hash[api_response.map { |r| [r[:id], r] }]
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
However, Hash::[] is pretty ugly and breaks the usual left-to-right OOP flow. That's why Facets proposed Enumerable#mash:
> require 'facets'
> api_response.mash { |r| [r[:id], r] }
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
This basic abstraction (convert enumerables to hashes) was asked to be included in Ruby long ago, alas, without luck.
Note that your use case is covered by Active Support: Enumerable#index_by
Ruby >= 2.1
[UPDATE] Still no love for Enumerable#mash, but now we have Array#to_h. It creates an intermediate array, but it's better than nothing:
> object = api_response.map { |r| [r[:id], r] }.to_h
Something like:
ideal_response = api_response.group_by{|i| i[:id]}
#=> {1=>[{:id=>1, :foo=>"bar"}], 2=>[{:id=>2, :foo=>"another bar"}]}
It uses Enumerable's group_by, which works on collections, returning matches for whatever key value you want. Because it expects to find multiple occurrences of matching key-value hits it appends them to arrays, so you end up with a hash of arrays of hashes. You could peel back the internal arrays if you wanted but could run a risk of overwriting content if two of your hash IDs collided. group_by avoids that with the inner array.
Accessing a particular element is easy:
ideal_response[1][0] #=> {:id=>1, :foo=>"bar"}
ideal_response[1][0][:foo] #=> "bar"
The way you show at the end of the question is another valid way of doing it. Both are reasonably fast and elegant.
For this I'd probably just go:
ideal_response = api_response.each_with_object(Hash.new) { |o, h| h[o[:id]] = o }
Not super pretty with the multiple brackets in the block but it does the trick with just a single iteration of the api_response.

Resources