Generating configuration hash with reduce in the Jekyll source code? - ruby

I've been looking through the Jekyll source code, and stumbled upon this method:
  # Public: Generate a Jekyll configuration Hash by merging the default
  # options with anything in _config.yml, and adding the given options on top.
  #
  # override - A Hash of config directives that override any options in both
  #            the defaults and the config file. See Jekyll::DEFAULTS for a
  #            list of option names and their defaults.
  #
  # Returns the final configuration Hash.
  def self.configuration(override)
    # Convert any symbol keys to strings and remove the old key/values
    override = override.reduce({}) { |hsh,(k,v)| hsh.merge(k.to_s => v) }
    # _config.yml may override default source location, but until
    # then, we need to know where to look for _config.yml
    source = override['source'] || Jekyll::DEFAULTS['source']
    # Get configuration from <source>/_config.yml or <source>/<config_file>
    config_file = override.delete('config')
    config_file = File.join(source, "_config.yml") if config_file.to_s.empty?
    begin
      config = YAML.safe_load_file(config_file)
      raise "Configuration file: (INVALID) #{config_file}" if !config.is_a?(Hash)
      $stdout.puts "Configuration file: #{config_file}"
    rescue SystemCallError
      # Errno:ENOENT = file not found
      $stderr.puts "Configuration file: none"
      config = {}
    rescue => err
      $stderr.puts " " +
                   "WARNING: Error reading configuration. " +
                   "Using defaults (and options)."
      $stderr.puts "#{err}"
      config = {}
    end
    # Merge DEFAULTS < _config.yml < override
    Jekyll::DEFAULTS.deep_merge(config).deep_merge(override)
  end
end
I can't figure out what it does despite the comments. reduce({}) especially bothers me - what does it do?
Also, the method that is called just before configuration is:
options = normalize_options(options.__hash__)
What does __hash__ do?

Let's look at the code in question:
override.reduce({}) { |hsh,(k,v)| hsh.merge(k.to_s => v) }
Now let's look at the docs for Enumerable#reduce:
Combines all elements of enum by applying a binary operation, specified by a block or a symbol that names a method or operator.
If you specify a block, then for each element in enum the block is passed an accumulator value (memo) and the element. If you specify a symbol instead, then each element in the collection will be passed to the named method of memo. In either case, the result becomes the new value for memo. At the end of the iteration, the final value of memo is the return value for the method.
So, override is going to be your typical Ruby options hash, like:
{
  debug: 'true',
  awesomeness: 'maximum'
}
So what happens when you use that reduce on override?
It will combine all the elements of the enum (key => value pairs of the override hash) using the binary function merge. Merge takes a hash and merges it into the receiver. So what's happening here?
hsh starts out as {} and the first key/value pair is merged: {}.merge(:debug.to_s => "true").
hsh is now {"debug" => "true"}.
The next key/value pair is merged into that: {"debug" => "true"}.merge(:awesomeness.to_s => "maximum").
hsh is now {"debug" => "true", "awesomeness" => "maximum"}
There are no more elements, so this value of hsh is returned.
This matches up with the code comment, which says "Convert any symbol keys to strings and remove the old key/values", although technically the old values are not removed. Rather, a new hash is constructed and the old hash with the old values is discarded by replacing the variable with the new value, to eventually be collected – along with the intermediate objects created by the merges in the reduce – by the garbage collector. As an aside, this means that merge! would be slightly more efficient than merge in this case as it would not create those intermediate objects.
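To make the walkthrough above concrete, here is a small sketch of the same key conversion written three ways. The method names are made up for illustration and are not part of Jekyll; the third variant assumes Ruby 2.5 or later, where Hash#transform_keys is available.

```ruby
# Illustrative helpers only; none of these names exist in Jekyll.

# Same shape as the Jekyll line: every step builds a brand-new hash.
def stringify_with_merge(options)
  options.reduce({}) { |hsh, (k, v)| hsh.merge(k.to_s => v) }
end

# merge! mutates the accumulator and returns it, so no intermediate
# hashes are created and reduce still receives its memo back.
def stringify_with_merge_bang(options)
  options.reduce({}) { |hsh, (k, v)| hsh.merge!(k.to_s => v) }
end

# Ruby 2.5+ has a built-in for this kind of key conversion.
def stringify_with_transform_keys(options)
  options.transform_keys(&:to_s)
end

override = { debug: 'true', awesomeness: 'maximum' }
puts stringify_with_merge(override) == stringify_with_merge_bang(override)
puts stringify_with_merge(override) == stringify_with_transform_keys(override)
```

All three return a new hash with string keys and leave the original options hash untouched.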
__foo__ is a Ruby idiom for a quasi-private and/or 'core' method that you want to make sure isn't redefined, e.g., __send__, which exists because things like Socket want to use send for their own purposes. In Ruby, hash is the hash value of an object (computed using a hash function, used when the object is used as a hash key), so __hash__ probably points to an instance variable of the options object that stores its data as a hash. Here's a class from a gem that does just that. You'd have to look at the docs for whatever type of object options is to be sure though. (You'd have to look at the code to be really sure. ;)
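As a rough illustration of why the double-underscore names exist, here is a sketch with a hypothetical Messenger class that, like Socket, defines its own send. It only demonstrates the __send__ idiom; it says nothing about how options.__hash__ is actually implemented.

```ruby
# Hypothetical class: it reuses the name `send` for its own purpose.
class Messenger
  def send(message)
    "sending #{message}"
  end

  def greet(name)
    "hello #{name}"
  end
end

def dispatch_demo
  messenger = Messenger.new
  [
    messenger.send('ping'),              # calls the class's own send
    messenger.__send__(:greet, 'world')  # still performs dynamic dispatch
  ]
end

puts dispatch_demo.inspect
```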

reduce is often used to build an array or hash, in a way that is similar to using map or collect, by iteratively adding each element to that container, usually after some manipulation to the element.
I use each_with_object instead as it's more intuitive for that sort of operation:
[:foo, :bar].each_with_object({}) do |e, h|
  h[e.to_s] = e
end
Notice that each_with_object doesn't need to have the "remembered" value returned from the block like reduce or inject wants. reduce and inject are great for other types of summing magic that each_with_object doesn't do though, so leave those in your toolbox too.
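A short side-by-side sketch of that difference, using illustrative method names: with reduce, the block's return value becomes the next memo, so the hash has to be returned explicitly, while each_with_object keeps passing the same object along.

```ruby
def keys_with_reduce(symbols)
  symbols.reduce({}) do |h, e|
    h[e.to_s] = e
    h # without this line the assigned symbol would become the next memo
  end
end

def keys_with_each_with_object(symbols)
  # Note the swapped block parameters: element first, memo second.
  symbols.each_with_object({}) { |e, h| h[e.to_s] = e }
end

# The kind of job reduce is a natural fit for: folding to a single value.
def total(numbers)
  numbers.reduce(0) { |sum, n| sum + n }
end

puts keys_with_reduce([:foo, :bar]) == keys_with_each_with_object([:foo, :bar])
puts total([1, 2, 3])
```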

Related

Unexpected FrozenError when appending elements via <<

I have a hash containing names and categories:
hash = {
  'Dog' => 'Fauna',
  'Rose' => 'Flora',
  'Cat' => 'Fauna'
}
and I want to reorganize it so that the names are grouped by their corresponding category:
{
  'Fauna' => ['Dog', 'Cat'],
  'Flora' => ['Rose']
}
I am adding each name via <<:
new_hash = Hash.new
hash.each do |name, category|
  if new_hash.key?(category)
    new_hash[category] << name
  else
    new_hash[category] = name
  end
end
But I am being told that this operation is being performed on a frozen element:
`<<' : Can’t modify frozen string (FrozenError)
I suppose this is because each yields frozen objects. How can I restructure this code so the '.each' doesn't provide frozen variables?
I needed to add the first name to an array and then that array to the hash.
new_hash = Hash.new
hash.each do |name, category|
  if new_hash.key?(category)
    new_hash[category] << name
  else
    new_hash[category] = [name] # <- must be an array
  end
end
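For comparison, the same grouping can be written without the if/else branch. This is only a sketch, and the method name is made up for illustration.

```ruby
# Groups names under their category; creates the array on first use.
def group_names_by_category(pairs)
  pairs.each_with_object({}) do |(name, category), grouped|
    (grouped[category] ||= []) << name
  end
end

animals_and_plants = {
  'Dog' => 'Fauna',
  'Rose' => 'Flora',
  'Cat' => 'Fauna'
}
puts group_names_by_category(animals_and_plants).inspect
```

The `||=` creates the array the first time a category is seen, so `<<` always appends to an array and never to one of the frozen name strings.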
How can I restructure this code so the '.each' doesn't provide frozen variables?
Short answer: you can't.
Hash#each doesn't "provide frozen variables".
First off, there is no such thing as a "frozen variable". Variables aren't frozen. Objects are. The distinction between variables and objects is fundamental, not just in Ruby but in any programming language (and in fact pretty much everywhere else, too). If I have a sticker with the name "Seamus" on it, then this sticker is not you. It is simply a label that refers to you.
Secondly, Hash#each doesn't provide "variables". In fact, it doesn't provide anything that is not in the hash already. It simply yields the objects that are already in the hash.
Note that, in order to avoid confusion and bugs, strings are automatically frozen when used as keys. So, you can't modify string keys. You can either make sure they are correct from the beginning, or you can create a new hash with new string keys. (You can also add the new keys to the existing hash and delete the old keys, but that is a lot of complexity for little gain.)
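The key-freezing behaviour can be checked directly. The sketch below uses made-up method names and assumes Ruby 2.5 or later, where FrozenError exists.

```ruby
# Shows that Hash#[]= stores a frozen copy of an unfrozen String key.
def key_freezing_demo
  original = String.new('Dog') # explicitly unfrozen string
  table = {}
  table[original] = 'Fauna'
  stored_key = table.keys.first
  {
    original_frozen: original.frozen?,
    stored_key_frozen: stored_key.frozen?,
    same_object: stored_key.equal?(original)
  }
end

# Appending to a key yielded by the hash raises, as in the question.
def append_to_key
  table = { 'Dog' => 'Fauna' }
  table.keys.first << 's'
rescue FrozenError => e
  e.class
end

puts key_freezing_demo.inspect
puts append_to_key
```

The string you passed in stays mutable; only the copy held by the hash is frozen.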

My Inner Hash is replacing the existing value when I assign a new value

I am storing the result in hash like this
I have assigned the result like this
Result['UserCreation']={"Test1"=>"Rajagopalan"}
So it created the hash like this
{"UserCreation"=>{"Test1"=>"Rajagopalan"}}
Now, I don't know how to assign another result for Test2. When I tend to assign result like this
Result['UserCreation']={"Test2"=>"Kali"}
It replaces the existing result, which is the correct behavior for an assignment, but I want the result hash to look like the one below when I assign the result of Test2:
{"UserCreation"=>{"Test1"=>"Rajagopalan","Test2"=>"Kali"}}
How can I achieve this?
Let us assume in this order I receive parameters
'UserCreation',{"Test1"=>"Rajagopalan"},
'UserCreation',{"Test2"=>"Kali"}
'contactcreate',{"Test2"=>"Kali"}
Result
{"UserCreation"=>{"Test1"=>"Rajagopalan","Test2"=>"Kali"},'contactcreate'=>{"Test2"=>"Kali"}}
All these values are the parameter to the functions.
You should use Hash#merge! method:
Result['UserCreation'].merge!({"Test2"=>"Kali"})
Here's a brief explanation:
When you use the assignment (Result['UserCreation']={"Test2"=>"Kali"}) you completely replace the value for the particular hash key. If you want to add (merge) something inside the existing hash you should use merge! method.
Notice that you can use Hash#merge! method because you know that the value of Result['UserCreation'] is a hash itself.
Also notice that there's a merge method without the bang (!). The difference is that the bang version will mutate (change) your object. Consider this:
hash = {}
hash.merge({'one' => 1})
# hash variable will hold its initial value
# because `merge` method will not change it.
p hash # => {}
hash.merge!('one' => 1)
# This time we use bang-version, so hash variable
# will be updated.
p hash # => {"one"=>1}
One more thing about Ruby: notice how in the bang version we omit the curly braces. This is possible when the last argument you are passing to the method is a Hash.
Also, by convention, Ruby uses snake_case for variable and method names, i.e.
result = {}
result['user_creation'] = {'test_1' => 'Rajagopalan'}
result['user_creation'].merge!('test_2' => 'Kali')
Of course, there's room to play. For example, you can set the initial value like this:
result = {'user_creation' => {}}
result['user_creation'].merge!('test_1' => 'Rajagopalan')
result['user_creation'].merge!('test_2' => 'Kali')
or even update several pairs:
result = {'user_creation' => {}}
result['user_creation'].merge!(
  'test_1' => 'Rajagopalan',
  'test_2' => 'Kali'
)
UPDATE
For your case if you receive these parameters:
'UserCreation',{"Test1"=>"Rajagopalan"},
'UserCreation',{"Test2"=>"Kali"}
'contactcreate',{"Test2"=>"Kali"}
Suppose the first parameter is named kind and the last one is named value:
# kind = 'UserCreation' and value = {"Test1"=>"Rajagopalan"}.
result = {}
# Here we check `result[kind]` if there's no key, a new hash will
# be assigned, otherwise the existing value will be used.
result[kind] ||= {}
result[kind].merge!(value)
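Wrapped into methods, the same idea could look like the sketch below. The names record and collect_results are hypothetical; the three calls are the ones listed in the question.

```ruby
# Adds `value` under `kind`, creating the inner hash on first use.
def record(result, kind, value)
  result[kind] ||= {}
  result[kind].merge!(value)
  result
end

# Applies record to a list of [kind, value] pairs in order.
def collect_results(calls)
  calls.each_with_object({}) do |(kind, value), result|
    record(result, kind, value)
  end
end

calls = [
  ['UserCreation', { 'Test1' => 'Rajagopalan' }],
  ['UserCreation', { 'Test2' => 'Kali' }],
  ['contactcreate', { 'Test2' => 'Kali' }]
]
puts collect_results(calls).inspect
```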
Maybe you want to use Hash#store:
result = {}
result['UserCreation'] = {"Test1"=>"Rajagopalan"}
result['UserCreation'].store("Test2", "Kali")
result #=> {"UserCreation"=>{"Test1"=>"Rajagopalan", "Test2"=>"Kali"}}

how can I extract out this ruby code as a method that converts a hash to variables?

I have the following code spread out across a bunch of methods:
json_element is passed as an argument to the method.
issues:
The entries in the hash change, meaning one time it could have key but the next time it could have search.
Sometimes the value is nil, so it blows up.
The gem I used to create it adds ['$'] to elements that have a value, but it errors out if you call json_element['COLLECTION']['$'].nil?
json = json_element['JSON']['$'] unless json_element['JSON'].nil?
predicate = json_element['PREDICATE']['$'] unless json_element['PREDICATE'].nil?
key = json_element['KEY']['$'] unless json_element['KEY'].nil?
options = json_element['OPTIONS']['$'] unless json_element['OPTIONS'].nil?
cache_key = json_element['CACHE-KEY']['$'] unless json_element['CACHE-KEY'].nil?
question: how can I extract this whole bit as a method which allows for flexible keys and doesn't error out when a value is nil
I'm not sure I understand this right. If the values in the hash are nil, it shouldn't error out. The variables would just be assigned nil. #nil? also shouldn't error out.
Just refactoring your code into a method:
def process(json_element)
  return if json_element.nil?
  hash = {} # store the variables in a hash
  %w{json predicate key options cache-key}.each do |i|
    hash[i.upcase] = json_element[i.upcase]['$'] unless json_element[i.upcase].nil?
  end
  hash # return the collected values; `each` alone would return the key list
end
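If the key list needs to be flexible, one option is to pass it in and let Hash#dig handle the missing elements. This is a sketch under two assumptions: Ruby 2.3 or later (for dig), and that every element that is present is itself a hash with a '$' entry. The method name extract_values is made up.

```ruby
DEFAULT_KEYS = %w[JSON PREDICATE KEY OPTIONS CACHE-KEY].freeze

# Returns a hash of key => value-under-'$', with nil for absent elements.
def extract_values(json_element, keys = DEFAULT_KEYS)
  return {} if json_element.nil?

  keys.each_with_object({}) do |key, values|
    # dig returns nil as soon as json_element[key] is missing or nil,
    # so there is no NoMethodError from calling ['$'] on nil.
    values[key] = json_element.dig(key, '$')
  end
end

element = { 'JSON' => { '$' => '{}' }, 'KEY' => nil }
puts extract_values(element, %w[JSON KEY SEARCH]).inspect
```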

How to merge hash results from an API in Ruby?

I am retrieving subscriber information from our email provider's API. They return a hash of 25 subscribers per page. I would like to merge the results from each page into one hash (@members_subs_all). However, while my method returns the @members_subs correctly, @members_subs_all does not contain any data.
Is merging the right approach? If so, what am I missing to make this work correctly? If not, what should I do differently?
def retrieve_members_info
  @members_subs_all = {}
  pages =* (0..1000)
  pages.each do |i|
    @members_subs = # API returns hash of 25 subscribers per page
    @members_subs_all.merge(@members_subs)
    break if @members_subs["data"].empty?
  end
  puts "members_subs_all #{@members_subs_all["data"]}"
end
You want merge! rather than merge. merge returns the value of the merged hashes, but doesn't change the hash it's called on. merge! returns the value of the merged hashes and updates the hash it's called on to that value.
Compare the docs on each:
http://www.ruby-doc.org/core-2.0.0/Hash.html#method-i-merge
http://www.ruby-doc.org/core-2.0.0/Hash.html#method-i-merge-21
Note also that this is a pattern for naming in Ruby and Ruby libraries:
method returns the value of some change to the caller, but doesn't change the caller's value permanently.
method! returns the value of the change and updates the caller with that value.
(See also method? which are predicates that return a truthy value based on whether or not method is true of the caller.)
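Here is a sketch of the paging loop with merge!. Everything about the API is an assumption: fetch_page and PAGES are hypothetical stand-ins, and each page is assumed to be a hash whose "data" value is an array. One thing to watch for: if every page uses the same "data" key, a plain merge! keeps only the last page's value, so the sketch passes a block that concatenates the arrays instead.

```ruby
# Hypothetical stand-in for the email provider's paged responses.
PAGES = [
  { 'data' => [{ 'email' => 'a@example.com' }] },
  { 'data' => [{ 'email' => 'b@example.com' }] },
  { 'data' => [] }
].freeze

# Hypothetical API call: returns an empty page past the last one.
def fetch_page(index)
  PAGES.fetch(index) { { 'data' => [] } }
end

def retrieve_members_info
  all = { 'data' => [] }
  (0..1000).each do |i|
    page = fetch_page(i)
    break if page['data'].empty?

    # merge! updates `all` in place; the block resolves the duplicate
    # "data" key by joining the two arrays instead of overwriting.
    all.merge!(page) { |_key, old, new| old + new }
  end
  all
end

puts retrieve_members_info['data'].length
```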
def merge_hashs(a,b)
  a.merge(b) {|key, a_v, b_v| merge_hashs(a_v, b_v) }
end
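The recursive merge above calls merge on both conflicting values, so it only works when they are hashes. A slightly more defensive sketch follows; deep_merge_hashes is a made-up name, and letting the right-hand value win for non-hash conflicts is just one possible policy.

```ruby
# Recursively merges nested hashes; for any other conflicting values
# the value from `b` wins (an assumption, pick the policy you need).
def deep_merge_hashes(a, b)
  a.merge(b) do |_key, a_v, b_v|
    if a_v.is_a?(Hash) && b_v.is_a?(Hash)
      deep_merge_hashes(a_v, b_v)
    else
      b_v
    end
  end
end

left  = { 'a' => { 'x' => 1 }, 'b' => 1 }
right = { 'a' => { 'y' => 2 }, 'b' => 2 }
puts deep_merge_hashes(left, right).inspect
```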

undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)

I'm trying to return a list of values based on user defined arguments, from hashes defined in the local environment.
def my_method *args
  #initialize accumulator
  accumulator = Hash.new(0)
  #define hashes in local environment
  foo=Hash["key1"=>["var1","var2"],"key2"=>["var3","var4","var5"]]
  bar=Hash["key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"]]
  baz=Hash["key6"=>["var13","var14","var15","var16"]]
  #iterate over args and build accumulator
  args.each do |x|
    if foo.has_key?(x)
      accumulator=foo.assoc(x)
    elsif bar.has_key?(x)
      accumulator=bar.assoc(x)
    elsif baz.has_key?(x)
      accumulator=baz.assoc(x)
    else
      puts "invalid input"
    end
  end
  #convert accumulator to list, and return value
  return accumulator = accumulator.to_a {|k,v| [k].product(v).flatten}
end
The user is to call the method with arguments that are keywords, and the function to return a list of values associated with each keyword received.
For instance
> my_method(key5,key6,key1)
=> ["var10","var11","var12","var13","var14","var15","var16","var1","var2"]
The output can be in any order. I received the following error when I tried to run the code:
undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)
Please would you point me how to troubleshoot this? In Terminal assoc performs exactly how I expect it to:
> foo.assoc("key1")
=> ["var1","var2"]
I'm guessing you're coming to Ruby from some other language, as there is a lot of unnecessary cruft in this method. Furthermore, it won't return what you expect for a variety of reasons.
`accumulator = Hash.new(0)`
This is unnecessary, as (1), you're expecting an array to be returned, and (2), you don't need to pre-initialize variables in ruby.
The Hash[...] syntax is unconventional in this context, and is typically used to convert some other enumerable (usually an array) into a hash, as in Hash[1,2,3,4] #=> { 1 => 2, 3 => 4}. When you're defining a hash, you can just use the curly brackets { ... }.
For every iteration of args, you're assigning accumulator to the result of the hash lookup instead of accumulating values (which, based on your example output, is what you need to do). Instead, you should be looking at various array concatenation methods like push, +=, <<, etc.
As it looks like you don't need the keys in the result, assoc is probably overkill. You would be better served with fetch or simple bracket lookup (hash[key]).
Finally, while you can call any method in Ruby with a block, as you've done with to_a, unless the method specifically yields a value to the block, Ruby will ignore it, so [k].product(v).flatten isn't actually doing anything.
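A few of the points above can be checked quickly. In this sketch (lookup_styles is an illustrative name) assoc returns the key together with its value, bracket lookup and fetch return only the value, and the block handed to to_a is never called.

```ruby
def lookup_styles
  foo = { 'key1' => %w[var1 var2] }
  {
    assoc: foo.assoc('key1'),                     # [key, value] pair
    brackets: foo['key1'],                        # just the value
    fetch_with_default: foo.fetch('missing', []), # default for absent keys
    to_a_with_block: foo.to_a { |k, v| [k].product(v).flatten } # block ignored
  }
end

lookup_styles.each { |name, value| puts "#{name}: #{value.inspect}" }
```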
I don't mean to be too critical - Ruby's syntax is extremely flexible but also relatively compact compared to other languages, which means it's easy to take it too far and end up with hard to understand and hard to maintain methods.
There is another side effect of how your method is constructed wherein the accumulator will only collect the values from the first hash that has a particular key, even if more than one hash has that key. Since I don't know if that's intentional or not, I'll preserve this functionality.
Here is a version of your method that returns what you expect:
def my_method(*args)
  foo = { "key1"=>["var1","var2"],"key2"=>["var3","var4","var5"] }
  bar = { "key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"] }
  baz = { "key6"=>["var13","var14","var15","var16"] }
  merged = [foo, bar, baz].reverse.inject({}, :merge)
  args.inject([]) do |array, key|
    array += Array(merged[key])
  end
end
In general, I wouldn't define a method with built-in data, but I'm going to leave it in to be closer to your original method. Hash#merge combines two hashes and overwrites any duplicate keys in the original hash with those in the argument hash. The Array() call coerces an array even when the key is not present, so you don't need to explicitly handle that error.
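The two pieces doing the work in that method can be tried in isolation. In this sketch the helper names first_hash_wins and values_for are made up: reversing before merging means the earliest hash overrides later ones, and Array(nil) turns a missing key into an empty array.

```ruby
# Merge so that, for duplicate keys, the earliest hash in the list wins.
def first_hash_wins(*hashes)
  hashes.reverse.inject({}, :merge)
end

# Collect the values for each key; unknown keys contribute nothing
# because Array(nil) is an empty array.
def values_for(merged, *keys)
  keys.inject([]) { |array, key| array + Array(merged[key]) }
end

merged = first_hash_wins({ 'k' => [1] }, { 'k' => [2], 'j' => [3] })
puts merged.inspect
puts values_for(merged, 'j', 'missing', 'k').inspect
```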
I would encourage you to look up the inject method - it's quite versatile and is useful in many situations. inject uses its own accumulator variable (optionally defined as an argument) which is yielded to the block as the first block parameter.
