I have a hash object that could be arbitrarily large - keys are always strings, but values could be strings, arrays, or other hashes. I want to recursively walk through the object and, if the value of any key (or any element of an array) is a string, strip line endings and leading and trailing whitespace (\r\n, \t, etc.).
Do I need to write this algorithm myself, or is there some faster, more Ruby-esque way to do it?
You will need to write it yourself. One way to do it is:
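def strip_hash_values!(hash)
  hash.each do |k, v|
    case v
    when String
      v.strip!
    when Array
      # Strip string elements and recurse into hash elements;
      # anything else (numbers, nil, nested arrays) is left alone.
      v.each do |av|
        case av
        when String then av.strip!
        when Hash   then strip_hash_values!(av)
        end
      end
    when Hash
      strip_hash_values!(v)
    end
  end
end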
This method modifies the original hash:
hash = {:a => [" a ", " b ", " c "], :b => {:x => "xyz "}, :c => "abc "}
strip_hash_values!(hash)
puts hash
# prints {:a=>["a", "b", "c"], :b=>{:x=>"xyz"}, :c=>"abc"}
This is one way to do it.
Code
def strip_strings(o)
  case o
  when Hash
    o.each do |k,v|
      o[k] = case v
             when String
               v.strip
             else
               strip_strings(v)
             end
    end
  else # Array
    o.map do |e|
      case e
      when String
        e.strip
      else
        strip_strings(e)
      end
    end
  end
end
Example
h = { a: [b: { c: " cat ", d: [" dog ", {e: " pig " }] }], f: " pig " }
#=> {:a=>[{:b=>{:c=>" cat ", :d=>[" dog ", {:e=>" pig "}]}}], :f=>" pig "}
strip_strings(h)
#=> {:a=>[{:b=>{:c=>"cat", :d=>["dog", {:e=>"pig"}]}}], :f=>"pig"}
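Both answers above change the hash that is passed in. If the original should stay untouched, a non-mutating variant is possible. The following is only a minimal sketch and is not taken from either answer: the name deep_strip is hypothetical, and it assumes that leaves which are not strings (numbers, nil, symbols) should pass through unchanged.

```ruby
# Hypothetical helper: returns a stripped copy and leaves the input untouched.
def deep_strip(obj)
  case obj
  when String then obj.strip
  when Array  then obj.map { |e| deep_strip(e) }
  when Hash   then obj.each_with_object({}) { |(k, v), h| h[k] = deep_strip(v) }
  else obj # assumption: non-string leaves are returned as-is
  end
end

h = { "a" => [" x ", { "b" => "\ty\r\n" }], "c" => 1 }
p deep_strip(h)
p h # the original still contains the unstripped strings
```

Because String#strip returns a new string, nothing in h is modified; the trade-off is that a full copy of the structure is built.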
Related
I want to take a hash with nested hashes and arrays and flatten it out into a single hash with unique values. I keep trying to approach this from different angles, but then I make it way more complex than it needs to be and get myself lost in what's happening.
Example Source Hash:
{
"Name" => "Kim Kones",
"License Number" => "54321",
"Details" => {
"Name" => "Kones, Kim",
"Licenses" => [
{
"License Type" => "PT",
"License Number" => "54321"
},
{
"License Type" => "Temp",
"License Number" => "T123"
},
{
"License Type" => "AP",
"License Number" => "A666",
"Expiration Date" => "12/31/2020"
}
]
}
}
Example Desired Hash:
{
"Name" => "Kim Kones",
"License Number" => "54321",
"Details_Name" => "Kones, Kim",
"Details_Licenses_1_License Type" => "PT",
"Details_Licenses_1_License Number" => "54321",
"Details_Licenses_2_License Type" => "Temp",
"Details_Licenses_2_License Number" => "T123",
"Details_Licenses_3_License Type" => "AP",
"Details_Licenses_3_License Number" => "A666",
"Details_Licenses_3_Expiration Date" => "12/31/2020"
}
For what it's worth, here's my most recent attempt before giving up.
def flattify(hashy)
temp = {}
hashy.each do |key, val|
if val.is_a? String
temp["#{key}"] = val
elsif val.is_a? Hash
temp.merge(rename val, key, "")
elsif val.is_a? Array
temp["#{key}"] = enumerate val, key
else
end
print "=> #{temp}\n"
end
return temp
end
def rename (hashy, str, n)
temp = {}
hashy.each do |key, val|
if val.is_a? String
temp["#{key}#{n}"] = val
elsif val.is_a? Hash
val.each do |k, v|
temp["#{key}_#{k}#{n}"] = v
end
elsif val.is_a? Array
temp["#{key}"] = enumerate val, key
else
end
end
return flattify temp
end
def enumerate (ary, str)
temp = {}
i = 1
ary.each do |x|
temp["#{str}#{i}"] = x
i += 1
end
return flattify temp
end
Interesting question!
Theory
Here's a recursive method to parse your data.
It keeps track of which keys and indices it has found.
It appends them in a tmp array.
Once a leaf object has been found, it gets written in a hash as value, with a joined tmp as key.
This small hash then gets recursively merged back to the main hash.
Code
def recursive_parsing(object, tmp = [])
  case object
  when Array
    object.each.with_index(1).with_object({}) do |(element, i), result|
      result.merge! recursive_parsing(element, tmp + [i])
    end
  when Hash
    object.each_with_object({}) do |(key, value), result|
      result.merge! recursive_parsing(value, tmp + [key])
    end
  else
    { tmp.join('_') => object }
  end
end
As an example:
require 'pp'
pp recursive_parsing(data)
# {"Name"=>"Kim Kones",
# "License Number"=>"54321",
# "Details_Name"=>"Kones, Kim",
# "Details_Licenses_1_License Type"=>"PT",
# "Details_Licenses_1_License Number"=>"54321",
# "Details_Licenses_2_License Type"=>"Temp",
# "Details_Licenses_2_License Number"=>"T123",
# "Details_Licenses_3_License Type"=>"AP",
# "Details_Licenses_3_License Number"=>"A666",
# "Details_Licenses_3_Expiration Date"=>"12/31/2020"}
Debugging
Here's a modified version with old-school debugging. It might help you understand what's going on:
def recursive_parsing(object, tmp = [], indent="")
puts "#{indent}Parsing #{object.inspect}, with tmp=#{tmp.inspect}"
result = case object
when Array
puts "#{indent} It's an array! Let's parse every element:"
object.each_with_object({}).with_index(1) do |(element, result), i|
result.merge! recursive_parsing(element, tmp + [i], indent + " ")
end
when Hash
puts "#{indent} It's a hash! Let's parse every key,value pair:"
object.each_with_object({}) do |(key, value), result|
result.merge! recursive_parsing(value, tmp + [key], indent + " ")
end
else
puts "#{indent} It's a leaf! Let's return a hash"
{ tmp.join('_') => object }
end
puts "#{indent} Returning #{result.inspect}\n"
result
end
When called with recursive_parsing([{a: 'foo', b: 'bar'}, {c: 'baz'}]), it displays:
Parsing [{:a=>"foo", :b=>"bar"}, {:c=>"baz"}], with tmp=[]
It's an array! Let's parse every element:
Parsing {:a=>"foo", :b=>"bar"}, with tmp=[1]
It's a hash! Let's parse every key,value pair:
Parsing "foo", with tmp=[1, :a]
It's a leaf! Let's return a hash
Returning {"1_a"=>"foo"}
Parsing "bar", with tmp=[1, :b]
It's a leaf! Let's return a hash
Returning {"1_b"=>"bar"}
Returning {"1_a"=>"foo", "1_b"=>"bar"}
Parsing {:c=>"baz"}, with tmp=[2]
It's a hash! Let's parse every key,value pair:
Parsing "baz", with tmp=[2, :c]
It's a leaf! Let's return a hash
Returning {"2_c"=>"baz"}
Returning {"2_c"=>"baz"}
Returning {"1_a"=>"foo", "1_b"=>"bar", "2_c"=>"baz"}
Unlike the others, I have no love for each_with_object :-). But I do like passing a single result hash around so I don't have to merge and remerge hashes over and over again.
def flattify(value, result = {}, path = [])
  case value
  when Array
    value.each.with_index(1) do |v, i|
      flattify(v, result, path + [i])
    end
  when Hash
    value.each do |k, v|
      flattify(v, result, path + [k])
    end
  else
    result[path.join("_")] = value
  end
  result
end
(Some details adopted from Eric, see comments)
A non-recursive approach, using BFS with an array as a queue. I keep the key-value pairs where the value isn't an array/hash, and push array/hash contents onto the queue (with combined keys). Arrays are turned into hashes (["a", "b"] ↦ {1=>"a", 2=>"b"}), as that felt neat.
def flattify(hash)
(q = hash.to_a).select { |key, value|
value = (1..value.size).zip(value).to_h if value.is_a? Array
!value.is_a?(Hash) || !value.each { |k, v| q << ["#{key}_#{k}", v] }
}.to_h
end
One thing I like about it is the nice combination of keys as "#{key}_#{k}". In my other solution, I could've also used a string path = '' and extended that with path + "_" + k, but that would've caused a leading underscore that I'd have to avoid or trim with extra code.
I am trying to write a method that takes in a string and a hash and "encodes" the string based on hash keys and values.
def encode(str,encoding)
end
str = "12#3"
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
I am expecting the output to be "one two three"; any char in the string that is not a key in the hash is replaced with an empty string.
Right now my code looks like the following:
def encode(str, encoding)
output = ""
str.each_char do |ch|
if encoding.has_key?(ch)
output += encoding[ch]
else
output += ""
end
end
return output
end
Any help is appreciated
You can use the form of String#gsub that takes a hash for substitutions, and a simple regex:
str = "12#3"
encoding = {"1"=>"one", "2"=>"two", "3"=>"three"}
First create a new hash that adds a space to each value in encoding:
adj_encoding = encoding.each_with_object({}) { |(k,v),h| h[k] = "#{v} " }
#=> {"1"=>"one ", "2"=>"two ", "3"=>"three "}
Now perform the substitutions and strip off the extra space if one of the keys of encoding is the last character of str:
str.gsub(/./, adj_encoding).rstrip
#=> "one two three"
Another example:
"1ab 2xx4cat".gsub(/./, adj_encoding).rstrip
#=> "one two"
Ruby determines whether each character of str (the /./ part) equals a key of adj_encoding. If it does, she substitutes the key's value for the character; else she substitutes an empty string ('') for the character.
You can build a regular expression that matches your keys via Regexp.union:
re = Regexp.union(encoding.keys)
#=> /1|2|3/
scan the string for occurrences of keys using that regular expression:
keys = str.scan(re)
#=> ["1", "2", "3"]
fetch the corresponding values using values_at:
values = encoding.values_at(*keys)
#=> ["one", "two", "three"]
and join the array with a single space:
values.join(' ')
#=> "one two three"
As a "one-liner":
encoding.values_at(*str.scan(Regexp.union(encoding.keys))).join(' ')
#=> "one two three"
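The scan/values_at steps above can be wrapped in the encode(str, encoding) method the question asks for. This is a minimal sketch under the question's assumptions (string keys, string values); the local name pattern is only illustrative.

```ruby
# Sketch: encode a string by looking up every key occurrence in the hash.
def encode(str, encoding)
  pattern = Regexp.union(encoding.keys)  # e.g. /1|2|3/
  keys    = str.scan(pattern)            # characters that are not keys are skipped
  encoding.values_at(*keys).join(" ")
end

puts encode("12#3", { "1" => "one", "2" => "two", "3" => "three" })
# prints: one two three
```

Joining with a space only between values means no trailing space has to be stripped afterwards.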
Try:
def encode(str, encoding)
output = ""
str.each_char do |ch|
if encoding.has_key?(ch)
output += encoding[ch] + " "
else
output += ""
end
end
return output.split.join(' ')
end
str = "12#3"
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
p encode(str, encoding) #=> "one two three"
If you are expecting "one two three", you just need to add a space to your concatenation line and, before returning, call .lstrip to remove the leading space.
Hint: You don't need the "else" branch that concatenates an empty string. If "#" doesn't match a key in the encoding hash, it will simply be ignored.
Like this:
#str = "12#3"
#encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
def encode(str, encoding)
output = ""
str.each_char do |ch|
if encoding.has_key?(ch)
output += " " + encoding[ch]
end
end
return output.lstrip
end
# Output: "one two three"
I would do:
encoding = {"1" => "one", "2"=> "two", "3"=> "three"}
str = "12#3"
str.chars.map{|x|encoding.fetch(x,nil)}.compact.join(' ')
Or two lines like this:
in_encoding_hash = -> x { encoding.has_key? x }
str.chars.grep(in_encoding_hash){|x|encoding[x]}.join(' ')
Say I have a hash like so:
top_billed = { ghostbusters: 'Bill Murray', star_wars: 'Harrison Ford' }
What would be the best way to format it in a nice, human-readable way?
For example, calling a method on the hash would display it as a capitalized list, minus underscores:
Ghostbusters: Bill Murray
Star Wars: Harrison Ford
I guess iterating over the hash and using gsub to remove underscores, then capitalizing, might work, but I was wondering whether there was anything more elegant.
Thanks
Manually:
top_billed.each do |k, v|
puts k.to_s.gsub("_", " ") + ": " + v
end
If you are using Rails or ActiveSupport, you can also use the "humanize" method (on a String); note that it capitalizes only the first word, so "titleize" is closer to the output asked for.
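The manual loop replaces underscores but does not capitalize each word, which the desired output requires. Without ActiveSupport, one possible sketch uses split, capitalize and join; the helper name humanize_key is hypothetical and not part of any library.

```ruby
# Hypothetical helper: turns :star_wars into "Star Wars" using only core Ruby.
def humanize_key(key)
  key.to_s.split("_").map(&:capitalize).join(" ")
end

top_billed = { ghostbusters: "Bill Murray", star_wars: "Harrison Ford" }
lines = top_billed.map { |k, v| "#{humanize_key(k)}: #{v}" }
puts lines
# prints:
# Ghostbusters: Bill Murray
# Star Wars: Harrison Ford
```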
Here is a recursive solution:
top_billed = { ghostbusters: { my_name: 'Bill Murray', my_age: 29 }, star_wars: { my_name: 'Harrison Ford' }, something_else: 'Its Name'}
def print_well(key, value, indent)
  key = key.to_s.split('_').map(&:capitalize).join(' ')
  if Hash === value
    # Indent the parent key as well, so deeper levels line up under it.
    puts "#{' ' * indent}#{key}:"
    value.each do |k, v|
      print_well k, v, indent + 1
    end
  else
    puts "#{' ' * indent}#{key}: #{value}"
  end
end
def print_hash(hash, indent = 0)
  hash.each do |key, value|
    print_well key, value, indent
  end
end
print_hash top_billed
As per the question, just wondering how to do this without the use of the Ruby stdlib 'JSON' module (and thus the JSON.pretty_generate method).
So I have an array of hashes that looks like:
[{"h1"=>"a", "h2"=>"b", "h3"=>"c"}, {"h1"=>"d", "h2"=>"e", "h3"=>"f"}]
and I'd like to convert it so that it looks like the following:
[
{
"h1": "a",
"h2": "b",
"h3": "c",
},
{
"h1": "d",
"h2": "e",
"h3": "f",
}
]
I can get the hash-rockets replaced with colon-spaces using a simple gsub (array_of_hashes.to_s.gsub!(/=>/, ": ")), but I'm not sure how to generate output that looks like the above example. I had originally thought of doing this using a here-doc approach, but I'm not sure this is the best way, plus I haven't managed to get it working yet either. I'm new to Ruby, so apologies if this is obvious! :-)
def to_json_pretty
json_pretty = <<-EOM
[
{
"#{array_of_hashes.each { |hash| puts hash } }"
},
]
EOM
json_pretty
end
In general, working with JSON well without using a library is going to take more than just a few lines of code. That being said, the best way of JSON-ifying things is generally to do it recursively, for example:
def pretty_json(obj)
case obj
when Array
contents = obj.map {|x| pretty_json(x).gsub(/^/, " ") }.join(",\n")
"[\n#{contents}\n]"
when Hash
contents = obj.map {|k, v| "#{pretty_json(k.to_s)}: #{pretty_json(v)}".gsub(/^/, " ") }.join(",\n")
"{\n#{contents}\n}"
else
obj.inspect
end
end
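One limitation of falling back to inspect is that nil is printed as nil rather than JSON's null. The variant below is a sketch of the same recursive idea with that case handled; the name pretty_json_sketch, the indent parameter and the empty-collection shortcuts are assumptions, not part of the answer. String#inspect is still used for strings, and its escaping differs from JSON's in some cases, so this is not a full serializer.

```ruby
# Sketch: recursive pretty-printer; each nesting level is indented by `indent`.
def pretty_json_sketch(obj, indent = "  ")
  case obj
  when Array
    return "[]" if obj.empty?
    body = obj.map { |x| pretty_json_sketch(x, indent).gsub(/^/, indent) }.join(",\n")
    "[\n#{body}\n]"
  when Hash
    return "{}" if obj.empty?
    body = obj.map { |k, v|
      "#{k.to_s.inspect}: #{pretty_json_sketch(v, indent)}".gsub(/^/, indent)
    }.join(",\n")
    "{\n#{body}\n}"
  when nil
    "null" # assumption: nil should be emitted as JSON null
  else
    obj.inspect # strings get quotes, numbers and booleans print as-is
  end
end

puts pretty_json_sketch([{ "h1" => "a", "h2" => nil }])
```

The gsub(/^/, indent) call prefixes every line of the already-formatted child, which is what makes the nesting accumulate.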
This should work well if your input is exactly in the format you presented and not nested:
a = [{"h1"=>"a", "h2"=>"b", "h3"=>"c"}, {"h1"=>"d", "h2"=>"e", "h3"=>"f"}]
hstart = 0
astart = 0
a.each do |b|
  puts "[" if astart == 0
  astart += 1
  b.each do |key, value|
    puts " {" if hstart == 0
    hstart += 1
    puts " " + key.to_s + ' : ' + value
    # Close the brace after the last key of each hash
    # (assumes every hash has the same number of keys).
    if hstart % b.size == 0
      if hstart == a.collect(&:size).reduce(:+)
        puts " }"
      else
        puts " },\n {"
      end
    end
  end
  puts "]" if astart == a.size
end
Output:
[
 {
 h1 : a
 h2 : b
 h3 : c
 },
 {
 h1 : d
 h2 : e
 h3 : f
 }
]
You can take a look at my NeatJSON gem for how I did it. Specifically, look at neatjson.rb, which uses a recursive solution (via a proc).
My code has a lot of variation based on what formatting options you supply, so it obviously does not have to be as complex as this. But the general pattern is to test the type of object supplied to your method/proc, serialize it if it's simple, or (if it's an Array or Hash) re-call the method/proc for each value inside.
Here's a far-simplified version (no indentation, no line wrapping, hard-coded spacing):
def simple_json(object)
js = ->(o) do
case o
when String then o.inspect
when Symbol then o.to_s.inspect
when Numeric then o.to_s
when TrueClass,FalseClass then o.to_s
when NilClass then "null"
when Array then "[ #{o.map{ |v| js[v] }.join ', '} ]"
when Hash then "{ #{o.map{ |k,v| [js[k],js[v]].join ":"}.join ', '} }"
else
raise "I don't know how to deal with #{o.inspect}"
end
end
js[object]
end
puts simple_json({a:1,b:[2,3,4],c:3})
#=> { "a":1, "b":[ 2, 3, 4 ], "c":3 }
So I have the following code which counts the frequency of each letter in a string (or in this specific instance from a file):
def letter_frequency(file)
letters = 'a' .. 'z'
File.read(file) .
split(//) .
group_by {|letter| letter.downcase} .
select {|key, val| letters.include? key} .
collect {|key, val| [key, val.length]}
end
letter_frequency(ARGV[0]).sort_by {|key, val| -val}.each {|pair| p pair}
This works great, but I would like to know whether there is some way in Ruby to do something similar that catches all the different possible symbols, i.e. spaces, commas, periods, and everything in between. To put it more simply, is there something similar to 'a' .. 'z' that holds all the symbols? Hope that makes sense.
You don't need a range when you're counting every possible character, because the set of all characters is the entire domain. A range is only useful when you specifically want a subset of that domain.
This is probably a faster implementation that counts all characters in the file:
def char_frequency(file_name)
ret_val = Hash.new(0)
File.open(file_name) {|file| file.each_char {|char| ret_val[char] += 1 } }
ret_val
end
p char_frequency("1003v-mm") #=> {"\r"=>56, "\n"=>56, " "=>2516, "\xC9"=>2, ...
For reference I used this test file.
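On Ruby 2.7 or later, the same count can be written with Enumerable#tally. This is a minimal sketch that works on a string rather than a file; the method name char_frequency_of is hypothetical.

```ruby
# Sketch: count every character of a string with Enumerable#tally (Ruby 2.7+).
def char_frequency_of(text)
  text.each_char.tally
end

counts = char_frequency_of("a b, a.")
# Most frequent characters first, as in the question's sort_by call.
counts.sort_by { |_char, n| -n }.each { |pair| p pair }
```

Spaces and punctuation are counted like any other character, so no range of allowed symbols is needed.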
It may not use much Ruby magic with Ranges but a simple way is to build a character counter that iterates over each character in a string and counts the totals:
class CharacterCounter
  def initialize(text)
    @characters = text.split("")
  end
  def character_frequency
    character_counter = {}
    @characters.each do |char|
      character_counter[char] ||= 0
      character_counter[char] += 1
    end
    character_counter
  end
  def unique_characters
    character_frequency.map {|key, value| key}
  end
  def frequency_of(character)
    character_frequency[character] || 0
  end
end
counter = CharacterCounter.new("this is a test")
counter.character_frequency # => {"t"=>3, "h"=>1, "i"=>2, "s"=>3, " "=>3, "a"=>1, "e"=>1}
counter.unique_characters # => ["t", "h", "i", "s", " ", "a", "e"]
counter.frequency_of 't' # => 3
counter.frequency_of 'z' # => 0