I have just started working with Ruby and Sinatra.
I am having trouble creating a new array from an existing one.
I have this array, which is grouped by date; each element contains all the entries for that day.
{
"2015-05-15": [{
"minutes": 25,
"key1": [{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
}],
"key6": "2015-05-15"
}, {
"minutes": 25,
"key1": [{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
}],
"key6": "2015-05-15"
}],
"2015-05-25": [{
"minutes": 25,
"key1": [{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
}],
"key6": "2015-05-25"
}],
"2015-06-10": [{
"minutes": 25,
"key1": [{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
}],
"key6": "2015-06-10"
}, {
"minutes": 25,
"key1": [{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
}],
"key6": "2015-06-10"
}]
}
I want to merge this array so that, for each date, the sub-element arrays are combined per key. For example, the following array is what I am looking for. Here all the sub-arrays for the date 2015-05-15 are merged into a single element.
{
"2015-05-15": [{
"minutes": 50,
"key1": [{
"some key1": "14",
"some key": "subject here"
},
{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
},{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
},
{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
},
{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
},
{
"some key": "0.6"
}],
"key6": "2015-05-15"
}],
"2015-05-25": [{
"minutes": 50,
"key1": [{
"some key1": "14",
"some key": "subject here"
},
{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
},{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
},
{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
},
{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
},
{
"some key": "0.6"
}],
"key6": "2015-05-25"
}],
"2015-06-10": [{
"minutes": 50,
"key1": [{
"some key1": "14",
"some key": "subject here"
},
{
"some key1": "14",
"some key": "subject here"
}],
"key2": [{
"some key": "0/0"
},{
"some key": "0/0"
}],
"key3": [{
"some key": "5/5"
},
{
"some key": "5/5"
}],
"key4": [{
"some key": 0.48
},
{
"some key": 0.48
}],
"key5": [{
"some key": "0.6"
},
{
"some key": "0.6"
}],
"key6": "2015-06-10"
}]
}
I tried to arrange them using a custom method I created.
def self.iterateArray(array)
  result = Hash.new #{ |h, k| h[k] = [] }
  count = 0
  result[count] = Array.new(7)
  array.each { |key, data|
    result[count]["date"] = key
    data.each { |k, d|
      k.each { |key_, data1|
        if key_ == 'key1'
          result[count][key_] += data1
        end
        if key_ == 'key2'
          result[count][key_] << data1
        end
        if key_ == 'key3'
          result[count][key_] << data1
        end
        if key_ == 'key4'
          result[count][key_] << data1
        end
        if key_ == 'key5'
          result[count][key_] << data1
        end
        if key_ == 'key6'
          result[count][key_] << data1
        end
      }
    }
    count += 1
  }
  puts "result: #{result}"
  result
end
But every time I run this method I get errors like "no implicit conversion of String into Integer" at
result[count]["date"] = key
Earlier the error was raised at result[count], but then I initialized it with
result[count] = Hash.new { |h, k| h[k] = [] }
and the error moved on to the next element.
Can anyone tell me what I am doing wrong, or how to deal with such problems and how custom arrays are created? Any help is appreciated.
Guys, I apologize for adding two "key2" elements there. They were actually meant to be key1 through key6. I have updated the sample arrays.
Problems in your question
The key "key2" appears twice in each hash (in both the input and the output), so the later one overrides the value of the earlier one. From engineersmnky's example: {key2: 1, key2: 2} #=> {key2: 2}
You have not explained the complete logic for converting your input into the output.
It is unclear whether the resulting hash should be built according to the type of each value (Integer, Array, etc.) or per key, i.e. always summing the values for the key "minutes" without checking their type.
Solution
So, though I am not sure about the logic behind your expected output, I was able to convert your original array into it:
Input:
input = {
"2015-05-15" => [{
"minutes" => 25,
"key1" => [{
"some key1" => "14",
"some key" => "subject here"
}],
"key2" => [{
"some key" => "5/5"
}],
"key3" => [{
"some key" => 0.48
}],
"key4" => [{
"some key" => "0.6"
}],
"key5" => "2015-05-15"
}, {
"minutes" => 25,
"key1" => [{
"some key1" => "14",
"some key" => "subject here"
}],
"key2" => [{
"some key" => "5/5"
}],
"key3" => [{
"some key" => 0.48
}],
"key4" => [{
"some key" => "0.6"
}],
"key5" => "2015-05-15"
}],
"2015-05-25" => [{
"minutes" => 25,
"key1" => [{
"some key1" => "14",
"some key" => "subject here"
}],
"key2" => [{
"some key" => "5/5"
}],
"key3" => [{
"some key" => 0.48
}],
"key4" => [{
"some key" => "0.6"
}],
"key5" => "2015-05-25"
}],
"2015-06-10" => [{
"minutes" => 25,
"key1" => [{
"some key1" => "14",
"some key" => "subject here"
}],
"key2" => [{
"some key" => "5/5"
}],
"key3" => [{
"some key" => 0.48
}],
"key4" => [{
"some key" => "0.6"
}],
"key5" => "2015-06-10"
}, {
"minutes" => 25,
"key1" => [{
"some key1" => "14",
"some key" => "subject here"
}],
"key2" => [{
"some key" => "5/5"
}],
"key3" => [{
"some key" => 0.48
}],
"key4" => [{
"some key" => "0.6"
}],
"key5" => "2015-06-10"
}]
}
Procedure:
output = {}
input.each do |date, ary|
  keys = ary.map(&:keys).flatten.uniq # Returns all the keys in all the hash elements for the given date
  hash = {}
  keys.each do |key|
    if ary.all? { |e| e[key].is_a?(Integer) }
      hash[key] = ary.inject(0) { |sum, h| sum + h[key].to_i } # Assuming you want to sum up values for keys with integer values
    elsif ary.all? { |e| e[key].is_a?(Array) }
      hash[key] = ary.inject([]) { |sum, h| sum + h[key] } # Assuming you need the `+` operation on the array values too
    elsif ary.all? { |e| e[key].is_a?(String) }
      hash[key] = ary[0][key] # Assuming string values are the same in all the elements of the array
    else
      raise "Invalid type for value of key: #{key}"
    end
  end
  output[date] = [hash] # I don't know why we are putting it in an array, as it will always contain exactly one element. You could do: `output[date] = hash`
end
Output:
output
=> {"2015-05-15"=>[{
"minutes"=>50,
"key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
{"some key1"=>"14", "some key"=>"subject here"}],
"key2"=>[{"some key"=>"5/5"}, {"some key"=>"5/5"}],
"key3"=>[{"some key"=>0.48}, {"some key"=>0.48}],
"key4"=>[{"some key"=>"0.6"}, {"some key"=>"0.6"}],
"key5"=>"2015-05-15"}],
"2015-05-25"=>[{
"minutes"=>25,
"key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
"key2"=>[{"some key"=>"5/5"}],
"key3"=>[{"some key"=>0.48}],
"key4"=>[{"some key"=>"0.6"}],
"key5"=>"2015-05-25"
}],
"2015-06-10"=>[{
"minutes"=>50,
"key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
{"some key1"=>"14", "some key"=>"subject here"}],
"key2"=>[{"some key"=>"5/5"}, {"some key"=>"5/5"}],
"key3"=>[{"some key"=>0.48}, {"some key"=>0.48}],
"key4"=>[{"some key"=>"0.6"}, {"some key"=>"0.6"}],
"key5"=>"2015-06-10"
}]
}
I have achieved my answer using this method I created. Here I have removed the "date" field as it was getting repeated.
def self.iterateArray(array)
  response = []
  array.each { |key, data|
    result = Hash.new { |h, k| h[k] = [] }
    result["date"] = key
    result["minutes"] = 0
    result["key1"] = []
    result["key2"] = []
    result["key3"] = []
    result["key4"] = []
    result["key5"] = []
    data.each { |k, d|
      k.each { |key1, data1|
        if key1 == "minutes"
          result[key1] += data1
        elsif key1 != "date"
          result[key1] << data1[0]
        end
      }
    }
    response << result
  }
  response
end
But I still don't like having to declare all the keys in the hash before adding values to them. Is there a way to avoid this?
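One possible way to avoid the pre-declarations, offered as a sketch rather than a definitive answer: the hash is already created with a default block that returns an empty array, so only "minutes" needs an explicit starting value. The method name merge_by_date and the sample data below are assumptions made for illustration, and skipping non-array values other than "minutes" is likewise an assumption.

```ruby
# Hypothetical rewrite of iterateArray; merge_by_date is an illustrative name.
def merge_by_date(grouped)
  grouped.map do |date, entries|
    # Missing keys start out as empty arrays, so key1..key5 need no setup.
    result = Hash.new { |h, k| h[k] = [] }
    result["date"] = date
    result["minutes"] = 0
    entries.each do |entry|
      entry.each do |key, value|
        if key == "minutes"
          result[key] += value
        elsif value.is_a?(Array)
          result[key].concat(value)
        end
        # Non-array values other than "minutes" are skipped (assumption).
      end
    end
    result
  end
end

sample = {
  "2015-05-15" => [
    { "minutes" => 25, "key1" => [{ "some key" => "a" }] },
    { "minutes" => 25, "key1" => [{ "some key" => "b" }] }
  ]
}
merged = merge_by_date(sample)
```

Here merged is an array with one hash per date; any key that holds an array in the input is picked up without being listed in advance.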
Code
def convert(h)
  h.each_with_object(Hash.new { |e,k| e[k]={} }) do |(date, v),f|
    v.each { |g| f[date].update(g) { |_,ov,nv| ov.is_a?(Array) ? ov+nv : ov } }
  end
end
Example
Due to the large size and formatting of the example, it is difficult to see the forest for the trees. The first thing I did was pare down the example to its essential structure. I reduced the number of date-string keys from three to two and reduced the number of keys in each hash. I also used indentation to clarify the structure. By doing so I obtained the following.
h = {
"2015-05-15"=>[
{ "minutes"=>25,
"key1"=>[{ "some key1"=>"14", "some key"=>"subject here" }],
"key2"=>[{ "some key"=>"5/5"}],
"key5"=>"2015-05-15"
},
{ "minutes"=>25,
"key1"=>[{ "some key1"=>"14", "some key"=>"subject here" }],
"key2"=>[{ "some key"=>"5/5" }],
"key4"=>[{ "some key"=>"0.6" }],
"key5"=>"2015-05-15"
}
],
"2015-06-10"=>[
{ "minutes"=>25,
"key1"=>[{ "some key1"=>"14", "some key"=>"subject here" }],
"key2"=>[{ "some key"=>"5/5" }],
"key5"=>"2015-06-10"
},
{ "minutes"=>25,
"key1"=>[{ "some key1"=>"14", "some key"=>"subject here" }],
"key2"=>[{ "some key"=>"5/5"}]
}
]
}
convert(h)
#=> { "2015-05-15"=>{
# "minutes"=>25,
# "key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
# {"some key1"=>"14", "some key"=>"subject here"}],
# "key2"=>[{"some key"=>"5/5"},
# {"some key"=>"5/5"}],
# "key5"=>"2015-05-15",
# "key4"=>[{"some key"=>"0.6"}]},
# "2015-06-10"=>{
# "minutes"=>25,
# "key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
# {"some key1"=>"14", "some key"=>"subject here"}],
# "key2"=>[{"some key"=>"5/5"},
# {"some key"=>"5/5"}],
# "key5"=>"2015-06-10"
# }
# }
This result differs slightly from what was requested. The value of each date string key (e.g., "2015-05-15") was to be an array containing a single element (a hash). There is no purpose to having arrays that always contain a single element, so I've simplified by making the values of the date strings hashes.
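The output above also leaves "minutes" at 25, because the merge block keeps the old value for anything that is not an array. If the values should instead be summed, as in the output the question asks for, the block could add integers as well as concatenate arrays. The following is a sketch under that assumption; convert_with_sum is a hypothetical name and the pared-down data is for illustration only.

```ruby
# Hypothetical variant of convert that also sums Integer values.
def convert_with_sum(h)
  h.each_with_object(Hash.new { |e, k| e[k] = {} }) do |(date, v), f|
    v.each do |g|
      f[date].update(g) do |_, ov, nv|
        case ov
        when Array   then ov + nv   # concatenate arrays
        when Integer then ov + nv   # sum integers such as "minutes"
        else ov                     # otherwise keep the first value seen
        end
      end
    end
  end
end

data = {
  "2015-05-15" => [
    { "minutes" => 25, "key2" => [{ "some key" => "5/5" }], "key5" => "2015-05-15" },
    { "minutes" => 25, "key2" => [{ "some key" => "5/5" }], "key5" => "2015-05-15" }
  ]
}
summed = convert_with_sum(data)
```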
Explanation
Hash.new { |h,k| h[k]={} } causes h[k] to be set equal to an empty hash when h does not have a key k. See the form of Hash::new that employs a block.
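As a small illustration of the block form (the variable names are chosen only for this example):

```ruby
# Without a block, reading a missing key returns nil and adds nothing.
plain = {}
missing = plain["2015-05-15"]

# With a block, the first access to a missing key stores a new empty hash,
# so nested assignment works without any setup.
nested = Hash.new { |hash, key| hash[key] = {} }
nested["2015-05-15"]["minutes"] = 25
```

After this runs, missing is nil and plain is still empty, while nested contains the key "2015-05-15" with the inner hash.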
I've used the form of Hash#update (a.k.a. merge!) that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for the meaning of the three block variables: k (the common key), ov ("old value") and nv ("new value").
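A minimal illustration of that block form, using made-up hashes (old_hash and new_hash are names chosen for this example):

```ruby
old_hash = { "minutes" => 25, "key1" => [1] }
new_hash = { "minutes" => 30, "key1" => [2], "key4" => [3] }

# The block runs only for keys present in both hashes ("minutes" and "key1").
# "key4" exists only in new_hash, so it is copied over without calling the block.
old_hash.update(new_hash) do |_key, ov, nv|
  ov.is_a?(Array) ? ov + nv : ov
end
# old_hash is now {"minutes"=>25, "key1"=>[1, 2], "key4"=>[3]}
```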
The code is more-or-less equivalent to the following.
f = {}
h.each do |date, v|
  v.each do |g|
    f[date] = {} unless f.key?(date)
    f[date].update(g) { |_,ov,nv| ov.is_a?(Array) ? ov+nv : ov }
  end
end
f
If this is still not clear try running this code with some puts statements added.
f = {}
h.each do |date, v|
  puts "date=#{date}"
  puts "v=#{v}"
  puts "f=#{f}"
  v.each do |g|
    puts "  g=#{g}"
    if f.key?(date)
      puts "  f has key date=#{date} so f[#{date}] is not set to {}"
    else
      puts "  f does not have key date=#{date} so f[#{date}] is set to {}"
    end
    f[date] = {} unless f.key?(date)
    f[date].update(g) { |_,ov,nv| ov.is_a?(Array) ? ov+nv : ov }
    puts "  f[#{date}]= #{f[date]}"
  end
end
f
This prints the following.
date=2015-05-15
v=[{"minutes"=>25, "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
    "key2"=>[{"some key"=>"5/5"}], "key5"=>"2015-05-15"},
   {"minutes"=>25, "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
    "key2"=>[{"some key"=>"5/5"}], "key4"=>[{"some key"=>"0.6"}],
    "key5"=>"2015-05-15"}]
f={}
  g={"minutes"=>25, "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}], "key5"=>"2015-05-15"}
  f does not have key date=2015-05-15 so f[2015-05-15] is set to {}
  f[2015-05-15]= {"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}], "key5"=>"2015-05-15"}
  g={"minutes"=>25, "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}],
     "key4"=>[{"some key"=>"0.6"}],
     "key5"=>"2015-05-15"}
  f has key date=2015-05-15 so f[2015-05-15] is not set to {}
  f[2015-05-15]= {"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
              {"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"},
              {"some key"=>"5/5"}],
     "key5"=>"2015-05-15",
     "key4"=>[{"some key"=>"0.6"}]}
date=2015-06-10
v=[{"minutes"=>25,
    "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
    "key2"=>[{"some key"=>"5/5"}],
    "key5"=>"2015-06-10"},
   {"minutes"=>25,
    "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
    "key2"=>[{"some key"=>"5/5"}]}]
f={"2015-05-15"=>{"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
              {"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"},
              {"some key"=>"5/5"}],
     "key5"=>"2015-05-15",
     "key4"=>[{"some key"=>"0.6"}]}}
  g={"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}],
     "key5"=>"2015-06-10"}
  f does not have key date=2015-06-10 so f[2015-06-10] is set to {}
  f[2015-06-10]= {"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}], "key5"=>"2015-06-10"}
  g={"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}]}
  f has key date=2015-06-10 so f[2015-06-10] is not set to {}
  f[2015-06-10]= {"minutes"=>25,
     "key1"=>[{"some key1"=>"14", "some key"=>"subject here"},
              {"some key1"=>"14", "some key"=>"subject here"}],
     "key2"=>[{"some key"=>"5/5"}, {"some key"=>"5/5"}],
     "key5"=>"2015-06-10"}
I am using Apache Drill to store large JSON files, which I then query through the Drill API as follows:
{
"queryType": "SQL",
"query": "select * from db.table.`/path/to/JSON.json` w "
}
This correctly returns the data. However, some of the JSON files have an empty array.
For example, the following is the JSON stored in the database
{
"key1": ["array", "of", "data"],
"key2": ["array", "of", "data"],
"key3": ["array", "of", "data"],
"key4": ["array", "of", "data"],
"key5": ["array", "of", "data"],
"key6": ["array", "of", "data"],
"key7": [],
}
When I retrieve this data, it returns as the following
{
"columns": [
"key1",
"key2",
"key3",
"key4",
"key5",
"key6",
],
"rows": [
{}
]
}
key7 is missing. How do I get the response to show this key even though it may be empty for some of the stored JSON files?
Drill is schema-less, so if a column has no data in any row, Drill ignores that column. If you know the column is required, you may need to use a "case" or "if" statement to supply a default value, or create a view.
I have a json file of the following format:
[
{
"organization": "ABC",
"type": "School",
"contact": "Joe Schmo",
"contact_title": "Principal",
"mailing_address": "123 Main Street, Anytown, MA",
"phone": "214-555-5430",
"fax": "214-555-5444"
},
{
"organization": "XYZ",
"type": "School",
"contact": "John Doe",
"contact_title": "Asst Principal",
"mailing_address": "123 Main Street, Anycity, TX",
"phone": "512-555-5430",
"fax": "512-555-5444"
},
.
.
.
.
]
I want to duplicate the line starting with "organization" and then add it back to the file twice after replacing "organization" with "company" and "long name". I want to keep the original line too.
The output I want is:
[
{
"organization": "ABC",
"company": "ABC",
"long name": "ABC",
"type": "School",
"contact": "Joe Schmo",
"contact_title": "Principal",
"mailing_address": "123 Main Street, Anytown, MA",
"phone": "214-555-5430",
"fax": "214-555-5444"
},
{
"organization": "XYZ",
"company": "XYZ",
"long name": "XYZ",
"type": "School",
"contact": "John Doe",
"contact_title": "Asst Principal",
"mailing_address": "123 Main Street, Anycity, TX",
"phone": "512-555-5430",
"fax": "512-555-5444"
},
.
.
.
.
]
awk or sed solutions preferred.
Here is one way:
sed '/organization/p;s/organization/company/p;s/company/long name/' file
Here is another:
awk '$1~/organization/{print $0;sub(/organization/,"company");print $0;sub(/company/,"long name")}1' file