I have a matrix like this:
[
["name", "company1", "company2", "company3"],
["hr_admin", "Tom", "Joane", "Kris"],
["manager", "Philip", "Daemon", "Kristy"]
]
How can I convert it into this data structure?
{
"company1" => {
"hr_admin"=> "Tom",
"manager" => "Philip"
},
"company2" => {
"hr_admin"=> "Joane",
"manager" => "Daemon"
},
"company3" => {
"hr_admin"=> "Kris",
"manager" => "Kristy"
}
}
I have tried taking out the first row of the matrix (the header) and zipping the rest of the matrix to change their positions. It worked to some extent, but it doesn't look very good, so I am turning here for help.
matrix[0][1...matrix[0].length].each_with_index.map do |company, i|
  values = matrix[1..matrix.length].map do |row|
    [row[0], row[i+1]]
  end.to_h
  [company, values]
end.to_h
The explicit matrix[0].length and matrix.length bounds can be omitted on Ruby 2.6 and later, which supports endless ranges such as matrix[1..].
First, you take all elements of the first row except the first one.
Then you map each of them, together with its index, to an array of pairs such as [["hr_admin", "Tom"], ["manager", "Philip"]].
Finally, you call to_h on each of those arrays of pairs and on the whole array.
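The same steps can be sketched a little more compactly. This is only an illustration, assuming Ruby 2.6 or later (for Array#to_h with a block); the names header and rows are introduced here and are not part of the answer above.

```ruby
matrix = [
  ["name", "company1", "company2", "company3"],
  ["hr_admin", "Tom", "Joane", "Kris"],
  ["manager", "Philip", "Daemon", "Kristy"]
]

# Split off the header row; `rows` holds the remaining rows.
header, *rows = matrix

# Pair each header cell with its column index, skip the "name" column,
# then build one inner hash per company from that column.
result = header.each_with_index.drop(1).to_h do |company, i|
  [company, rows.to_h { |row| [row[0], row[i]] }]
end

p result
```

Keeping the original column index avoids the i+1 offset used in the longer version.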
arr = [
["name", "company1", "company2", "company3"],
["hr_admin", "Tom", "Joane", "Kris"],
["manager", "Philip", "Daemon", "Kristy"]
]
Each key-value pair of the hash to be constructed is formed from the "columns" of arr. It therefore is convenient to compute the transpose of arr:
(_, *positions), *by_company = arr.transpose
#=> [["name", "hr_admin", "manager"],
# ["company1", "Tom", "Philip"],
# ["company2", "Joane", "Daemon"],
# ["company3", "Kris", "Kristy"]]
I made use of Ruby's array decomposition (a.k.a. array destructuring) feature (see this blog for elaboration) to assign different parts of the transpose of arr to variables. Those values are as follows [1].
_ #=> "name"
positions
#=> ["hr_admin", "manager"]
by_company
#=> [["company1", "Tom", "Philip"],
# ["company2", "Joane", "Daemon"],
# ["company3", "Kris", "Kristy"]]
It is now a simple matter to form the desired hash. Once again I will use array decomposition to advantage.
by_company.each_with_object({}) do |(company_name, *employees),h|
h[company_name] = positions.zip(employees).to_h
end
#=> {"company1"=>{"hr_admin"=>"Tom", "manager"=>"Philip"},
# "company2"=>{"hr_admin"=>"Joane", "manager"=>"Daemon"},
# "company3"=>{"hr_admin"=>"Kris", "manager"=>"Kristy"}}
When, for example,
company_name, *employees = ["company1", "Tom", "Philip"]
company_name
#=> "company1"
employees
#=> ["Tom", "Philip"]
so
h[company_name] = positions.zip(employees).to_h
h["company1"] = ["hr_admin", "manager"].zip(["Tom", "Philip"]).to_h
= [["hr_admin", "Tom"], ["manager", "Philip"]].to_h
= {"hr_admin"=>"Tom", "manager"=>"Philip"}
Note that these calculations do not depend on the numbers of rows or columns of arr.
1. As is common practice, I used the special variable _ to signal to the reader that its value is not used in subsequent calculations.
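To illustrate the claim that the calculation does not depend on the dimensions of arr, here is a rough sketch that wraps the steps in a method and runs it on a matrix of a different shape. The method name matrix_to_nested_hash and the extra "intern" row are made up for this example. Note that Array#transpose raises an IndexError if the rows differ in length.

```ruby
# Hypothetical helper wrapping the transpose-and-zip steps shown above.
def matrix_to_nested_hash(arr)
  (_, *positions), *by_company = arr.transpose
  by_company.each_with_object({}) do |(company_name, *employees), h|
    h[company_name] = positions.zip(employees).to_h
  end
end

# A matrix with one more row and one fewer column (made-up data).
wider = [
  ["name",     "company1", "company2"],
  ["hr_admin", "Tom",      "Joane"],
  ["manager",  "Philip",   "Daemon"],
  ["intern",   "Ann",      "Bo"]
]

nested = matrix_to_nested_hash(wider)
p nested
```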
For example, I have
array = [ {name: 'robert', nationality: 'asian', age: 10},
{name: 'robert', nationality: 'asian', age: 5},
{name: 'sira', nationality: 'african', age: 15} ]
I want to get the result as
array = [ {name: 'robert', nationality: 'asian', age: 15},
{name: 'sira', nationality: 'african', age: 15} ]
since there are two Roberts with the same nationality.
Any help would be much appreciated.
I have tried array.uniq! { |e| e[:name] && e[:nationality] }, but I want to add the ages from the two hashes together (10 + 5).
P.S.: The array can contain any number of hashes.
I would start with something like this:
array = [
{ name: 'robert', nationality: 'asian', age: 10 },
{ name: 'robert', nationality: 'asian', age: 5 },
{ name: 'sira', nationality: 'african', age: 15 }
]
array.group_by { |e| e.values_at(:name, :nationality) }
.map { |_, vs| vs.first.merge(age: vs.sum { |v| v[:age] }) }
#=> [
# {
# :name => "robert",
# :nationality => "asian",
# :age => 15
# }, {
# :name => "sira",
# :nationality => "african",
# :age => 15
# }
# ]
Let's take a look at what you want to accomplish and go from there. You have a list of some objects, and you want to merge certain objects together if they have the same nationality and name. So we have a key by which we will merge. Let's put that in programming terms.
key = proc { |x| [x[:name], x[:nationality]] }
We've defined a procedure which takes a hash and returns its "key" value. If this procedure returns the same value (according to eql?) for two hashes, then those two hashes need to be merged together. Now, what do we mean by "merge"? You want to add the ages together, so let's write a merge function.
merge = proc { |x, y| x.dup.tap { |x1| x1[:age] += y[:age] } }
If we have two values x and y such that key[x] and key[y] are the same, we want to merge them by making a copy of x and adding y's age to it. That's exactly what this procedure does. Now that we have our building blocks, we can write the algorithm.
We want to produce an array at the end, after merging using the key procedure we've written. Fortunately, Ruby has a handy function called each_with_object which will do something very nice for us. The method each_with_object will execute its block for each element of the array, passing in a predetermined value as the other argument. This will come in handy here.
result = array.each_with_object({}) do |x, hsh|
# ...
end.values
Since we're using keys and values to do the merge, the most efficient way to do this is going to be with a hash. Hence, we pass in an empty hash as the extra object, which we'll modify to accumulate the merge results. At the end, we don't care about the keys anymore, so we write .values to get just the objects themselves. Now for the final pieces.
if hsh.include? key[x]
hsh[ key[x] ] = merge.call hsh[ key[x] ], x
else
hsh[ key[x] ] = x
end
Let's break this down. If the hash already includes key[x], which is the key for the object x that we're looking at, then we want to merge x with the value that is currently at key[x]. This is where we add the ages together. This approach only works if the merge operation forms what mathematicians call a semigroup, which is a fancy way of saying that the operation is associative. You don't need to worry too much about that; addition is a very good example of an associative operation, so it works here.
Anyway, if the key doesn't exist in the hash, we want to put the current value in the hash at the key position. The resulting hash from merging is returned, and then we can get the values out of it to get the result you wanted.
key = proc { |x| [x[:name], x[:nationality]] }
merge = proc { |x, y| x.dup.tap { |x1| x1[:age] += y[:age] } }
result = array.each_with_object({}) do |x, hsh|
  if hsh.include? key[x]
    hsh[ key[x] ] = merge.call hsh[ key[x] ], x
  else
    hsh[ key[x] ] = x
  end
end.values
Now, my complexity theory is a bit rusty, but if Ruby implements its hash type efficiently (which I'm fairly certain it does), then this merge algorithm is O(n), which means it will take a linear amount of time to finish, given the problem size as input.
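The key and merge building blocks can also be passed into a small reusable method. The following is a sketch only: merge_by is a hypothetical name, not a built-in Ruby method, and the lambdas are one possible way to express the same key and merge rules without mutating the input hashes.

```ruby
# Hypothetical generic helper: items whose key.call(item) values are equal
# are folded together with merge.call(accumulated, item).
def merge_by(items, key:, merge:)
  items.each_with_object({}) do |item, acc|
    k = key.call(item)
    acc[k] = acc.key?(k) ? merge.call(acc[k], item) : item
  end.values
end

array = [
  { name: 'robert', nationality: 'asian',   age: 10 },
  { name: 'robert', nationality: 'asian',   age: 5 },
  { name: 'sira',   nationality: 'african', age: 15 }
]

key   = ->(x) { x.values_at(:name, :nationality) }
# Hash#merge returns a new hash, so the originals are left untouched.
merge = ->(x, y) { x.merge(age: x[:age] + y[:age]) }

result = merge_by(array, key: key, merge: merge)
p result
```

Each element is visited once and each hash lookup is expected constant time, which is where the linear running time comes from.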
array.each_with_object(Hash.new(0)) { |g,h| h[[g[:name], g[:nationality]]] += g[:age] }.
  map { |(name, nationality),age| { name:name, nationality:nationality, age:age } }
#=> [{ :name=>"robert", :nationality=>"asian", :age=>15 },
#    { :name=>"sira", :nationality=>"african", :age=>15 }]
The two steps are as follows.
a = array.each_with_object(Hash.new(0)) { |g,h| h[[g[:name], g[:nationality]]] += g[:age] }
#=> { ["robert", "asian"]=>15, ["sira", "african"]=>15 }
This uses the class method Hash::new to create a hash (represented by the block variable h) with a default value of zero. Once this hash has been obtained, it is a simple matter to construct the desired array of hashes:
a.map { |(name, nationality),age| { name:name, nationality:nationality, age:age } }
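For readers unfamiliar with default values, this small sketch shows how a hash created with Hash.new(0) behaves for a missing key. The variable names totals, before and stored are chosen here for illustration.

```ruby
totals = Hash.new(0)
key = ["robert", "asian"]

# Reading a missing key returns the default but does not store the key.
before = totals[key]
stored = totals.key?(key)

# `totals[key] += 10` expands to `totals[key] = totals[key] + 10`,
# so the first addition starts from the default value of zero.
totals[key] += 10
totals[key] += 5

p before, stored, totals
```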
I have the following item.json file
{
  "items": [
    {
      "brand": "LEGO",
      "stock": 55,
      "full-price": "22.99"
    },
    {
      "brand": "Nano Blocks",
      "stock": 12,
      "full-price": "49.99"
    },
    {
      "brand": "LEGO",
      "stock": 5,
      "full-price": "199.99"
    }
  ]
}
There are two items named LEGO, and I want to output the total stock for each individual brand.
In the Ruby file item.rb I have code like this:
require 'json'
path = File.join(File.dirname(__FILE__), '../data/products.json')
file = File.read(path)
products_hash = JSON.parse(file)
products_hash["items"].each do |brand|
puts "Stock no: #{brand["stock"]}"
end
This prints the stock number for each item individually, whereas I need the stock for the two "LEGO" items to be summed and displayed as one.
Does anyone have a solution for this?
json = File.open(path,'r:utf-8',&:read) # in case the JSON uses UTF-8
items = JSON.parse(json)['items']
stock_by_brand = items
.group_by{ |h| h['brand'] }
.map do |brand,array|
[ brand,
array
.map{ |item| item['stock'] }
.inject(:+) ]
end
.to_h
#=> {"LEGO"=>60, "Nano Blocks"=>12}
It works like this:
Enumerable#group_by takes the array of items and creates a hash mapping the brand name to an array of all item hashes with that brand
Enumerable#map turns each brand/array pair in that hash into an array of the brand (unchanged) followed by:
Enumerable#map on the array of items picks out just the "stock" counts, and then
Enumerable#inject sums them all together
Array#to_h then turns that array of two-value arrays into a hash, mapping the brand to the sum of stock values.
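On Ruby 2.4 or later, the map/inject/to_h steps described above can be collapsed with Hash#transform_values and Enumerable#sum. This is a sketch under that version assumption; the inline JSON string stands in for the file contents and omits the price field for brevity.

```ruby
require 'json'

# Inline stand-in for the contents of the JSON file.
json = '{"items":[{"brand":"LEGO","stock":55},' \
       '{"brand":"Nano Blocks","stock":12},' \
       '{"brand":"LEGO","stock":5}]}'
items = JSON.parse(json)['items']

# Group the item hashes by brand, then replace each group
# with the sum of its "stock" values.
stock_by_brand = items
  .group_by { |item| item['brand'] }
  .transform_values { |group| group.sum { |item| item['stock'] } }

p stock_by_brand
```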
If you want simpler code that's less functional and possibly easier to understand:
stock_by_brand = {} # an empty hash
items.each do |item|
stock_by_brand[ item['brand'] ] ||= 0 # initialize to zero if unset
stock_by_brand[ item['brand'] ] += item['stock']
end
p stock_by_brand #=> {"LEGO"=>60, "Nano Blocks"=>12}
To see what your JSON string looks like, let's create it from your hash, which I've denoted h:
require 'json'
j = JSON.generate(h)
#=> "{\"items\":[{\"brand\":\"LEGO\",\"stock\":55,\"full-price\":\"22.99\"},{\"brand\":\"Nano Blocks\",\"stock\":12,\"full-price\":\"49.99\"},{\"brand\":\"LEGO\",\"stock\":5,\"full-price\":\"199.99\"}]}"
After reading that from a file, into the variable j, we can now parse it to obtain the value of "items":
arr = JSON.parse(j)["items"]
#=> [{"brand"=>"LEGO", "stock"=>55, "full-price"=>"22.99"},
# {"brand"=>"Nano Blocks", "stock"=>12, "full-price"=>"49.99"},
# {"brand"=>"LEGO", "stock"=>5, "full-price"=>"199.99"}]
One way to obtain the desired tallies is to use a counting hash:
arr.each_with_object(Hash.new(0)) {|g,h| h.update(g["brand"]=>h[g["brand"]]+g["stock"])}
#=> {"LEGO"=>60, "Nano Blocks"=>12}
Hash.new(0) creates an empty hash (represented by the block variable h) with a default value of zero [1]. That means that h[k] returns zero if the hash does not have a key k.
For the first element of arr (represented by the block variable g) we have:
g["brand"] #=> "LEGO"
g["stock"] #=> 55
Within the block, therefore, the calculation is:
g["brand"] => h[g["brand"]]+g["stock"]
#=> "LEGO" => h["LEGO"] + 55
Initially h has no keys, so h["LEGO"] returns the default value of zero, resulting in { "LEGO"=>55 } being merged into the hash h. As h now has a key "LEGO", h["LEGO"] will not return the default value in subsequent calculations.
Another approach is to use the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged:
arr.each_with_object({}) {|g,h| h.update(g["brand"]=>g["stock"]) {|_,o,n| o+n}}
#=> {"LEGO"=>60, "Nano Blocks"=>12}
1. k=>v is shorthand for { k=>v } when it appears as a method's argument.
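The block form of Hash#update is only consulted when a key is present in both hashes. A short sketch with two separate calls makes that visible; the block parameter names old and incoming are chosen here for illustration.

```ruby
h = { "LEGO" => 55 }

# "Nano Blocks" is a new key, so the block is not called
# and the value is stored as given.
h.update("Nano Blocks" => 12) { |_key, old, incoming| old + incoming }

# "LEGO" is already present, so the block decides the stored value.
h.update("LEGO" => 5) { |_key, old, incoming| old + incoming }

p h
```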
I have an object with many arrays of hashes, one of which I want to sort by a value in the 'date' key.
#array['info'][0] = {"name"=>"personA", "date"=>"23/09/1980"}
#array['info'][1] = {"name"=>"personB", "date"=>"01/04/1970"}
#array['info'][2] = {"name"=>"personC", "date"=>"03/04/1975"}
I have tried various methods using Date.parse and collect, but am unable to find a good solution.
Edit:
To be clear I want to sort the original array in place
#array['info'].sort_by { |i| Date.parse i['date'] }.collect
How might one solve this elegantly, the 'Ruby-ist' way? Thanks.
Another way, which doesn't require converting the date strings to date objects, is the following.
Code
def sort_by_date(arr)
arr.sort_by { |h| h["date"].split('/').reverse }
end
If arr is to be sorted in place, use Array#sort_by! rather than Enumerable#sort_by.
Example
arr = [{ "name"=>"personA", "date"=>"23/09/1980" },
{ "name"=>"personB", "date"=>"01/04/1970" },
{ "name"=>"personC", "date"=>"03/04/1975" }]
sort_by_date(arr)
#=> [{ "name"=>"personB", "date"=>"01/04/1970" },
# { "name"=>"personC", "date"=>"03/04/1975" },
# { "name"=>"personA", "date"=>"23/09/1980" }]
Explanation
For arr in the example, sort_by passes the first element of arr into its block and assigns it to the block variable:
h = { "name"=>"personA", "date"=>"23/09/1980" }
then computes:
a = h["date"].split('/')
#=> ["23", "09", "1980"]
and then:
b = a.reverse
#=> ["1980", "09", "23"]
Similarly, we obtain b equal to:
["1970", "04", "01"]
and
["1975", "04", "03"]
for each of the other two elements of arr.
If you look at the docs for Array#<=> you will see that these three arrays are ordered as follows:
["1970", "04", "01"] < ["1975", "04", "03"] < ["1980", "09", "23"]
There is no need to convert the string elements to integers.
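The element-wise comparison can be checked directly. The sketch below also shows a caveat that appears to apply: comparing the fields as strings relies on them being zero-padded, as they are in the example data. The unpadded dates are made up to illustrate this, and converting the fields with to_i is one possible workaround.

```ruby
a = "01/04/1970".split('/').reverse   # ["1970", "04", "01"]
b = "03/04/1975".split('/').reverse   # ["1975", "04", "03"]

# Array#<=> compares element by element; "1970" sorts before "1975".
cmp = a <=> b

# Made-up unpadded dates: as strings, "9" sorts after "10".
unpadded   = ["9/4/1975", "10/4/1975"]
as_strings = unpadded.sort_by { |s| s.split('/').reverse }
as_ints    = unpadded.sort_by { |s| s.split('/').reverse.map(&:to_i) }

p cmp, as_strings, as_ints
```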
Looks fine overall, although you can drop the collect call since it's not needed and use sort_by! to modify the array in place (instead of reassigning):
#array['info'].sort_by! { |x| Date.parse x['date'] }
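A self-contained sketch of the in-place sort follows. It assumes the date library is loaded with require 'date', uses a local hash named records as a hypothetical stand-in for the object in the question, and swaps in Date.strptime with an explicit day/month/year format instead of Date.parse so the expected layout is spelled out.

```ruby
require 'date'

# Hypothetical stand-in for the object holding the arrays of hashes.
records = {
  'info' => [
    { "name" => "personA", "date" => "23/09/1980" },
    { "name" => "personB", "date" => "01/04/1970" },
    { "name" => "personC", "date" => "03/04/1975" }
  ]
}

info_before = records['info']

# sort_by! reorders the existing array instead of returning a new one.
records['info'].sort_by! { |h| Date.strptime(h['date'], '%d/%m/%Y') }

names = records['info'].map { |h| h['name'] }
p names
```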
I have a 2D array like this:
ary = [
["Source", "attribute1", "attribute2"],
["db", "usage", "value"],
["import", "usage", "value"],
["webservice", "usage", "value"]
]
I want to pull the following out into a hash:
{1 => "db", 2 => "import", 3 => "webservice"} # keys are the indexes of the outer 2D array
I know how to get this by looping through the 2D array, but since I'm learning Ruby I thought I could do it with something like this:
ary.each_with_index.map {|element, index| {index => element[0]}}.reduce(:merge)
This gives me :
{0=> "Source", 1 => "db", 2 => "import", 3 => "webservice"}
How do I get rid of the 0 element in my output hash?
I'd write:
Hash[ary.drop(1).map.with_index(1) { |xs, idx| [idx, xs.first] }]
#=> {1=>"db", 2=>"import", 3=>"webservice"}
ary.drop(1) drops the first element and returns the rest.
You could build the hash directly, without the merge reduction, using each_with_object:
ary.drop(1)
.each_with_object({})
.with_index(1) { |((source,_,_),memo),i| memo[i] = source }
Or map to tuples and pass them to the Hash[] constructor:
Hash[ ary.drop(1).map.with_index(1) { |(s,_,_),i| [i, s] } ]
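Another possible variation, sketched here assuming Ruby 2.6 or later for Array#to_h with a block, attaches the indexes before dropping the header row, so no index offset is needed.

```ruby
ary = [
  ["Source", "attribute1", "attribute2"],
  ["db", "usage", "value"],
  ["import", "usage", "value"],
  ["webservice", "usage", "value"]
]

# each_with_index pairs every row with its original index;
# drop(1) then discards the [header_row, 0] pair.
result = ary.each_with_index.drop(1).to_h { |row, i| [i, row.first] }

p result
```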
I have a set of data that is an array of hashes, with each hash representing one record of data:
data = [
{
:id => "12345",
:bucket_1_rank => "2",
:bucket_1_count => "12",
:bucket_2_rank => "7",
:bucket_2_count => "25"
},
{
:id => "45678",
:bucket_1_rank => "2",
:bucket_1_count => "15",
:bucket_2_rank => "9",
:bucket_2_count => "68"
},
{
:id => "78901",
:bucket_1_rank => "5",
:bucket_1_count => "36"
}
]
The rank values are always between 1 and 10.
What I am trying to do is select each of the possible values for the rank fields (the :bucket_1_rank and :bucket_2_rank fields) as keys in my final resultset, and the values for each key will be an array of all the values in its associated :bucket_count field. So, for the data above, the final resulting structure I have in mind is something like:
bucket 1:
{"2" => ["12", "15"], "5" => ["36"]}
bucket 2:
{"7" => ["25"], "9" => ["68"]}
I can do this working under the assumption that the field names stay the same, or through hard coding the field/key names, or just using group_by for the fields I need, but my problem is that I work with a different data set each month where the rank fields are named slightly differently depending on the project specs, and I want to identify the names for the count and rank fields dynamically as opposed to hard coding the field names.
I wrote two quick helpers, get_ranks and get_counts, that use a regex to return an array of field names that are either rank or count fields, since these fields will always have the literal string "_rank" or "_count" in their names:
ranks = get_ranks
counts = get_counts
results = Hash.new{|h,k| h[k] = []}
data.each do |i|
  ranks.each do |r|
    unless i[r].nil?
      counts.each do |c|
        results[i[r]] << i[c]
      end
    end
  end
end
p results
This seems to be close, but feels awkward, and it seems to me there has to be a better way to iterate through this data set. Since I haven't worked on this project using Ruby before, I'd like to use this as an opportunity to improve my understanding of iterating through arrays of hashes, populating a hash with arrays as values, etc. Any resources or suggestions would be much appreciated.
You could shorten it to:
result = Hash.new{|h,k| h[k] = Hash.new{|h2,k2| h2[k2] = []}}
data.each do |hsh|
  hsh.each do |key, value|
    result[$1][value] << hsh["#{$1}_count".to_sym] if key =~ /(.*)_rank$/
  end
end
puts result
#=> {"bucket_1"=>{"2"=>["12", "15"], "5"=>["36"]}, "bucket_2"=>{"7"=>["25"], "9"=>["68"]}}
Though this is assuming that :bucket_2_item_count is actually supposed to be :bucket_2_count.
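The same grouping can be written without relying on the $1 global. This is a sketch under the same naming assumption, namely that every "<prefix>_rank" field has a matching "<prefix>_count" field; the variable names record and prefix are introduced here for illustration.

```ruby
data = [
  { id: "12345", bucket_1_rank: "2", bucket_1_count: "12",
    bucket_2_rank: "7", bucket_2_count: "25" },
  { id: "45678", bucket_1_rank: "2", bucket_1_count: "15",
    bucket_2_rank: "9", bucket_2_count: "68" },
  { id: "78901", bucket_1_rank: "5", bucket_1_count: "36" }
]

# Outer hash: prefix => inner hash; inner hash: rank => array of counts.
result = Hash.new { |h, k| h[k] = Hash.new { |h2, k2| h2[k2] = [] } }

data.each do |record|
  record.each do |key, rank|
    # String#[] with a regex and a group number returns the captured
    # prefix, or nil when the key is not a rank field.
    prefix = key.to_s[/\A(.+)_rank\z/, 1]
    next unless prefix

    result[prefix][rank] << record[:"#{prefix}_count"]
  end
end

p result
```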