I have created a wildcard matcher for RR that matches JSON strings by parsing them into hashes. This is because JSON (de)serialization doesn't preserve order; if we have:
{ 'foo': 42, 'bar': 123 }
... then after (de)serialization, we might find that our update method is called with:
{ 'bar': 123, 'foo': 42 }
The wildcard matcher looks like this:
class RR::WildcardMatchers::MatchesJsonString
  attr_reader :expected_json_hash
  def initialize(expected_json_string)
    @expected_json_hash = JSON.parse(expected_json_string)
  end
  def ==(other)
    other.respond_to?(:expected_json_hash) && other.expected_json_hash == self.expected_json_hash
  end
  def wildcard_matches?(actual_json_string)
    actual_json_hash = JSON.parse(actual_json_string)
    @expected_json_hash == actual_json_hash
  end
end
module RR::Adapters::RRMethods
  def matches_json(expected_json_string)
    RR::WildcardMatchers::MatchesJsonString.new(expected_json_string)
  end
end
... and we're using it like:
describe 'saving manifests' do
  before do
    @manifests = [
      { :sections => [], 'title' => 'manifest1' },
      { :sections => [], 'title' => 'manifest2' }
    ]
    mock(manifest).create_or_update!(matches_json(@manifests[0].to_json)) { raise 'uh oh' }
    mock(manifest).create_or_update!(matches_json(@manifests[1].to_json))
    parser = ContentPack::ContentPackParser.new({
      'manifests' => @manifests
    })
    @errors = parser.save
  end
  it 'updates manifests' do
    manifest.should have_received.create_or_update!(anything).twice
  end
end
This is in accordance with the RR documentation. However, instead of mock() expecting an argument that matches the JSON, it expects the argument to be a MatchesJsonString object:
1) ContentPack::ContentPackParser saving manifests updates manifests
   Failure/Error: mock(Manifest).create_or_update!(matches_json(@manifests[0].to_json)) { raise 'uh oh' }
   RR::Errors::TimesCalledError:
     create_or_update!(#<RR::WildcardMatchers::MatchesJsonString:0x13540def0 @expected_json_hash={"title"=>"manifest1", "sections"=>[]}>)
     Called 0 times.
     Expected 1 times.
   # ./spec/models/content_pack/content_pack_parser_spec.rb:196
The answer is that there's a typo in the documentation to which I linked. This (my emphasis):
#wildcard_matches?(other)
wildcard_matches? is the method that actually checks the argument against the expectation. It should return true if other is considered to match, false otherwise. In the case of DivisibleBy, wildcard_matches? reads:
... should actually read:
#wildcard_match?(other)
...
One of my colleagues suggested that we compare our code with one of the matchers defined in the rr gem, and then the difference stood out.
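For reference, here is a minimal standalone sketch of the matcher with the method renamed to wildcard_match?, as described above. It is deliberately kept outside the RR namespace so that it runs without the gem; the top-level class name and the rescue clause are my own assumptions, not part of RR.

```ruby
require 'json'

# Standalone sketch: same comparison logic as the matcher above, but defined
# outside RR::WildcardMatchers so it can be run without the rr gem.
class MatchesJsonString
  attr_reader :expected_json_hash

  def initialize(expected_json_string)
    @expected_json_hash = JSON.parse(expected_json_string)
  end

  # Note the method name: wildcard_match?, not wildcard_matches?
  def wildcard_match?(actual_json_string)
    JSON.parse(actual_json_string) == expected_json_hash
  rescue JSON::ParserError, TypeError
    # Assumption: anything that is not parseable JSON simply does not match.
    false
  end

  def ==(other)
    other.respond_to?(:expected_json_hash) &&
      other.expected_json_hash == expected_json_hash
  end
end

matcher = MatchesJsonString.new('{"foo": 42, "bar": 123}')
puts matcher.wildcard_match?('{"bar": 123, "foo": 42}') # key order differs
puts matcher.wildcard_match?('{"foo": 42}')             # a key is missing
```

The first call prints true because Hash equality ignores key order; the second prints false.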
I'd like to calculate the difference for various values inside 2 hashes with the same structure, as concisely as possible. Here's a simplified example of the data I'd like to compare:
hash1 = {"x" => { "y" => 20 } }
hash2 = {"x" => { "y" => 12 } }
I have a very simple method to get the value I want to compare. In reality, the hash can be nested a lot deeper than these examples, so this is mostly to keep the code readable:
def get_y(data)
  data["x"]["y"]
end
I want to create a method that will calculate the difference between the 2 values, and can take a method like get_y as an argument, allowing me to re-use the code for any value in the hash. I'd like to be able to call something like this, and I'm not sure how to write the method get_delta:
get_delta(hash1, hash2, get_y) # => 8
The "Ruby way" would be to pass a block:
def get_delta_by(obj1, obj2)
  yield(obj1) - yield(obj2)
end
hash1 = {"x" => { "y" => 20 } }
hash2 = {"x" => { "y" => 12 } }
get_delta_by(hash1, hash2) { |h| h["x"]["y"] }
#=> 8
A method could be passed (indirectly) via:
def get_y(data)
  data["x"]["y"]
end
get_delta_by(hash1, hash2, &method(:get_y))
#=> 8
Building on Stefan's response, if you want a more flexible get method you can actually return a lambda from the function and pass arguments for what you want to get. This will let you do error handling nicely:
Starting with the basics from above...
def get_delta_by(obj1, obj2)
  yield(obj1) - yield(obj2)
end
hash1 = {"x" => { "y" => 20 } }
hash2 = {"x" => { "y" => 12 } }
get_delta_by(hash1, hash2) { |h| h["x"]["y"] }
Then we can define a get_something function which takes a list of arguments for the path of the element to get:
def get_something(*args)
  lambda do |data|
    args.each do |arg|
      begin
        data = data.fetch(arg)
      rescue KeyError
        raise RuntimeError, "KeyError for #{arg} on path #{args.join(',')}"
      end
    end
    return data
  end
end
Finally we call the function using the ampersand to pass the lambda as a block:
lambda_getter = get_something("x","y")
get_delta_by(hash1, hash2, &lambda_getter)
That last bit can be a one-liner, but I wrote it as two lines for clarity here.
In Ruby 2.3 and later, you can use the Hash#dig method, if it meets your needs.
hash1.dig("x", "y") - hash2.dig("x", "y")
#=> 8
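If you want something closer to the get_delta call from the question, dig combines naturally with a splat. This is only a sketch: get_delta_at is a hypothetical name, and it assumes both hashes contain the full path (dig returns nil for a missing key, and the subtraction would then raise NoMethodError).

```ruby
# Hypothetical helper: the key path is passed as extra arguments and
# resolved in both hashes with Hash#dig (Ruby 2.3+).
def get_delta_at(obj1, obj2, *path)
  obj1.dig(*path) - obj2.dig(*path)
end

hash1 = { "x" => { "y" => 20 } }
hash2 = { "x" => { "y" => 12 } }

puts get_delta_at(hash1, hash2, "x", "y")
```

With the two hashes above this prints 8.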
I have a simple function which takes a JSON document and 'does something' with it. The main part works well, BUT the function returns not only what I want but also the result of the .each loop!
The code:
module Puppet::Parser::Functions
  newfunction(:mlh, :type => :rvalue) do |args|
    lvm_default_hash = args[0]
    lvm_additional_hash = args[1]
    if lvm_additional_hash.keys.length == 1
      if lvm_additional_hash.keys.include? 'logical_volumes'
        # do stuff - we have only 'logical_volumes'
        lvm_default_hash.keys.each do |key|
          pv_array = Hash['physical_volumes' => lvm_default_hash[key]['physical_volumes']]
          lv_hash = lvm_default_hash[key]['logical_volumes']
          new_lv_hash = lvm_additional_hash['logical_volumes']
          merged_lv_hash = Hash['logical_volumes' => lv_hash.merge(new_lv_hash)]
          # this is what I want to return to init.pp
          puts Hash[key => pv_array.merge(merged_lv_hash)]
        end
      end
    end
  end
end
Variables in the init.pp are:
$default_volume_groups = {
  'sys' => {
    'physical_volumes' => [
      '/dev/sda2',
    ],
    'logical_volumes' => {
      'root' => {'size' => '4G'},
      'swap' => {'size' => '256M'},
      'var' => {'size' => '8G'},
      'docker' => {'size' => '16G'},
    },
  },
}
and the second argument from a hieradata:
modified_volume_groups:
  logical_volumes:
    cloud_log:
      size: '16G'
In the init.pp I have something like this to test it:
notice(mlh($default_volume_groups, $modified_volume_groups))
which gives me a result:
syslogical_volumesvarsize8Gdockersize16Gcloud_logsize16Gswapsize256Mrootsize4Gphysical_volumes/dev/sda2
Notice: Scope(Class[Ops_lvm]): sys
The "long" part before the Notice is the correct output from the puts, but the Notice: Scope(): sys line is what I do not want!
I know that this is the result of the each loop over $default_volume_groups:
lvm_default_hash.keys.each do |key|
  # some stuff
end
How can I suppress this unwanted result? It breaks my Puppet logic, because my init.pp sees this sys and not what I want.
Does anyone know how to handle this problem?
Thank you!
I found out how to handle this problem, but maybe someone could explain to me why it works this way :)
This does not work (short version):
module Puppet::Parser::Functions
  newfunction(:mlh, :type => :rvalue) do |args|
    lvm_default_hash = args[0]
    lvm_additional_hash = args[1]
    if lvm_additional_hash.keys.length == 1
      if lvm_additional_hash.keys.include? 'logical_volumes'
        lvm_default_hash.keys.each do |key|
          pv_array = Hash['physical_volumes' => lvm_default_hash[key]['physical_volumes']]
          lv_hash = lvm_default_hash[key]['logical_volumes']
          new_lv_hash = lvm_additional_hash['logical_volumes']
          merged_lv_hash = Hash['logical_volumes' => lv_hash.merge(new_lv_hash)]
          puts Hash[key => pv_array.merge(merged_lv_hash)]
        end
      end
    end
  end
end
but this works:
module Puppet::Parser::Functions
  newfunction(:mlh, :type => :rvalue) do |args|
    lvm_default_hash = args[0]
    lvm_additional_hash = args[1]
    # empty Hash
    hash_to_return = {}
    if lvm_additional_hash.keys.length == 1
      if lvm_additional_hash.keys.include? 'logical_volumes'
        lvm_default_hash.keys.each do |key|
          pv_array = Hash['physical_volumes' => lvm_default_hash[key]['physical_volumes']]
          lv_hash = lvm_default_hash[key]['logical_volumes']
          new_lv_hash = lvm_additional_hash['logical_volumes']
          merged_lv_hash = Hash['logical_volumes' => lv_hash.merge(new_lv_hash)]
          # assigned value in the 'each' loop we want to return to puppet
          hash_to_return = Hash[key => pv_array.merge(merged_lv_hash)]
        end
        # returned Hash - instead of previous 'puts'
        return hash_to_return
      end
    end
  end
end
Now I have what I need!
Notice: Scope(Class[Ops_lvm]): sysphysical_volumes/de
You've got it -- the first one doesn't work because in Ruby, the return value of a block or function is the last evaluated statement. In the case of the one that didn't work, the last evaluated statement was the .each. As it turns out, each evaluates to the enumerable that it was looping through.
A simple example:
def foo
  [1, 2, 3].each do |n|
    puts n
  end
end
If I were to run this, the return value of the function would be the array:
> foo
1
2
3
=> [1, 2, 3]
So what you have works, because the last thing evaluated is return hash_to_return. You could even just go hash_to_return and it'd work.
If you wanted to get rid of the return and clean that up a little bit (and if you're using Ruby 1.9 or above), you could replace your each line with:
lvm_default_hash.keys.each_with_object({}) do |key, hash_to_return|
This is because each_with_object evaluates to the "object" (in this case the empty hash passed into the method, and referred to as hash_to_return in the block params). If you do this you can remove the return as well as the initialization hash_to_return = {}.
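One caveat with the each_with_object variant: the block has to mutate the accumulator (hash_to_return[key] = ...). A plain assignment such as hash_to_return = Hash[...] would only rebind the block-local variable, and the empty hash would come back. Here is a minimal plain-Ruby sketch of the idea; the Puppet wrapper is left out and merge_logical_volumes is a hypothetical name.

```ruby
# Plain-Ruby stand-in for the body of the custom function (no Puppet needed).
def merge_logical_volumes(lvm_default_hash, lvm_additional_hash)
  new_lv_hash = lvm_additional_hash['logical_volumes']

  # each_with_object evaluates to the accumulator, so no explicit return is needed.
  lvm_default_hash.keys.each_with_object({}) do |key, hash_to_return|
    # Mutate the accumulator instead of reassigning the block variable.
    hash_to_return[key] = {
      'physical_volumes' => lvm_default_hash[key]['physical_volumes'],
      'logical_volumes'  => lvm_default_hash[key]['logical_volumes'].merge(new_lv_hash)
    }
  end
end

defaults = {
  'sys' => {
    'physical_volumes' => ['/dev/sda2'],
    'logical_volumes'  => { 'root' => { 'size' => '4G' } }
  }
}
extra = { 'logical_volumes' => { 'cloud_log' => { 'size' => '16G' } } }

p merge_logical_volumes(defaults, extra)
```

Because every key is stored in the accumulator, this version also keeps all volume groups when the default hash has more than one, instead of only the last one.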
Hope this helps!
Your custom function has the rvalue type, which means it needs to return a value. If you don't write return <something> explicitly, the last evaluated statement is implicitly your return value.
In the first example, the one that does not work correctly, the last statement is the each loop itself, so the function returns whatever each returns rather than the hash printed inside the block by:
puts Hash[key => pv_array.merge(merged_lv_hash)]
Your second example works simply because you set the value of hash_to_return inside the each block and then return it outside of the block. I'm not sure this is the behavior you want, though, since only the hash assigned in the last iteration of the loop will be returned from the function.
I have some Ruby code that gets JSON from Jenkins containing an array of n items. The item I want has a key called "lastBuiltRevision".
I know I can loop through the array like so
actions.each do |action|
  if action["lastBuiltRevision"]
    lastSuccessfulRev = action["lastBuiltRevision"]["SHA1"]
    break
  end
end
but that feels very clunky and devoid of the magic that I usually feel when working with ruby.
I have only been tinkering with it for roughly a week now, and I feel that I may be missing something to make this easier/faster.
Is there such a thing? or is manual iteration all I can do?
I am kind of hoping for something like
lastSuccessfulRev = action.match("lastBuiltRevision/SHA1")
Using Enumerable#find:
actions = [
  {'dummy' => true },
  {'dummy' => true },
  {'dummy' => true },
  {'lastBuiltRevision' => { "SHA1" => "123abc" }},
  {'dummy' => true },
]
actions.find { |h|
  h.has_key? 'lastBuiltRevision'
}['lastBuiltRevision']['SHA1']
# => "123abc"
UPDATE
The code above will raise NoMethodError if there is no matching item. Use the following code if you don't want to get an exception.
rev = actions.find { |h| h.has_key? 'lastBuiltRevision' }
rev = rev['lastBuiltRevision']['SHA1'] if rev
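On Ruby 2.3 or later, the same nil check can be folded into a single expression with the safe navigation operator and Hash#dig. This is just a sketch of that alternative, using a shortened version of the sample data above.

```ruby
actions = [
  { 'dummy' => true },
  { 'lastBuiltRevision' => { 'SHA1' => '123abc' } }
]

# &. skips the dig call when find returns nil, so no exception is raised.
rev = actions.find { |h| h.key?('lastBuiltRevision') }&.dig('lastBuiltRevision', 'SHA1')
puts rev

# With no matching item the whole expression evaluates to nil.
none = [{ 'dummy' => true }].find { |h| h.key?('lastBuiltRevision') }&.dig('lastBuiltRevision', 'SHA1')
p none
```

The first lookup prints 123abc and the second prints nil.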
Here's another way to do it, making use of the form of Enumerable#find that takes a parameter ifnone which is called, and its return value returned, if find's block never evaluates true.
I assume the method is to return nil if either key is not found.
Code
def look_for(actions, k1, k2)
  actions.find(->{{k1=>{}}}) { |e| e.key?(k1) }[k1][k2]
end
Examples
actions = [{ 'dog'=>'woof' }, { 'lastBuiltRevision'=>{ 'SHA1'=> 2 } }]
look_for(actions, 'lastBuiltRevision', 'SHA1') #=> 2
look_for(actions, 'cat', 'SHA1') #=> nil
look_for(actions, 'lastBuiltRevision', 'cat') #=> nil
Explanation
I've made find's ifnone parameter the lambda:
l = ->{{k1=>{}}}
so that:
k1 = "cats"
h = l.call #=> {"cats"=>{}}
h[k1]['SHA1'] #=> {}['SHA1']
#=> nil
Try:
actions.map { |a| a["lastBuiltRevision"] }.compact.map { |lb| lb["SHA1"] }.first
I am attempting to parse a simple indentation sensitive syntax using the Parslet library within Ruby.
The following is an example of the syntax I am attempting to parse:
level0child0
level0child1
 level1child0
 level1child1
  level2child0
 level1child2
The resulting tree would look like so:
[
  {
    :identifier => "level0child0",
    :children => []
  },
  {
    :identifier => "level0child1",
    :children => [
      {
        :identifier => "level1child0",
        :children => []
      },
      {
        :identifier => "level1child1",
        :children => [
          {
            :identifier => "level2child0",
            :children => []
          }
        ]
      },
      {
        :identifier => "level1child2",
        :children => []
      },
    ]
  }
]
The parser that I have now can parse nesting level 0 and 1 nodes, but cannot parse past that:
require 'parslet'
class IndentationSensitiveParser < Parslet::Parser
  rule(:indent) { str(' ') }
  rule(:newline) { str("\n") }
  rule(:identifier) { match['A-Za-z0-9'].repeat.as(:identifier) }
  rule(:node) { identifier >> newline >> (indent >> identifier >> newline.maybe).repeat.as(:children) }
  rule(:document) { node.repeat }
  root :document
end
require 'ap'
require 'pp'
begin
  input = DATA.read
  puts '', '----- input ----------------------------------------------------------------------', ''
  ap input
  tree = IndentationSensitiveParser.new.parse(input)
  puts '', '----- tree -----------------------------------------------------------------------', ''
  ap tree
rescue IndentationSensitiveParser::ParseFailed => failure
  puts '', '----- error ----------------------------------------------------------------------', ''
  puts failure.cause.ascii_tree
end
__END__
user
name
age
recipe
name
foo
bar
It's clear that I need a dynamic counter, so that an identifier on nesting level 3 is only matched after 3 indentation tokens.
How can I implement an indentation sensitive syntax parser using Parslet in this way? Is it possible?
There are a few approaches.
1. Parse the document by recognising each line as a collection of indents and an identifier, then apply a transformation afterwards to reconstruct the hierarchy based on the number of indents.
2. Use captures to store the current indent and expect the next node to include that indent plus more to match as a child (I didn't dig into this approach much as the next one occurred to me).
3. Rules are just methods. So you can define 'node' as a method, which means you can pass parameters! (as follows)
This lets you define node(depth) in terms of node(depth+1). The problem with this approach, however, is that the node method doesn't match a string, it generates a parser. So a recursive call will never finish.
This is why dynamic exists. It returns a parser that isn't resolved until the point it tries to match it, allowing you to now recurse without problems.
See the following code:
require 'parslet'
class IndentationSensitiveParser < Parslet::Parser
  def indent(depth)
    str(' '*depth)
  end
  rule(:newline) { str("\n") }
  rule(:identifier) { match['A-Za-z0-9'].repeat(1).as(:identifier) }
  def node(depth)
    indent(depth) >>
    identifier >>
    newline.maybe >>
    (dynamic{|s,c| node(depth+1).repeat(0)}).as(:children)
  end
  rule(:document) { node(0).repeat }
  root :document
end
This is my favoured solution.
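For comparison, the first approach listed above (read the lines flat, then rebuild the hierarchy from the amount of indentation) can be sketched in plain Ruby without Parslet. This is only an illustration of the post-processing step: build_tree is a hypothetical name, indentation is assumed to consist of spaces only, and a dedent to a level that was never opened is not reported as an error.

```ruby
# Rebuild a tree of {identifier:, children:} hashes from indented lines.
def build_tree(text)
  roots = []
  stack = [] # [indent_width, node] pairs for the currently open ancestors

  text.each_line do |line|
    next if line.strip.empty?

    indent = line[/\A */].length
    node = { identifier: line.strip, children: [] }

    # Close every open node that is not an ancestor of this line.
    stack.pop while stack.any? && stack.last[0] >= indent

    siblings = stack.empty? ? roots : stack.last[1][:children]
    siblings << node
    stack << [indent, node]
  end

  roots
end

input = <<~TEXT
  level0child0
  level0child1
   level1child0
   level1child1
    level2child0
   level1child2
TEXT

tree = build_tree(input)
puts tree.map { |n| n[:identifier] }.inspect
```

Because only relative indentation is compared, the sketch does not care how many spaces make up one level. The trade-off against the dynamic approach is that the nesting is no longer checked by the grammar itself.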
I don't like the idea of weaving knowledge of the indentation process through the whole grammar. I would rather just have INDENT and DEDENT tokens produced that other rules could use, similarly to just matching "{" and "}" characters. So the following is my solution. It is a class IndentParser that any parser can extend to get nl, indent, and dedent tokens generated.
require 'parslet'
# Atoms returned from a dynamic that aren't meant to match anything.
class AlwaysMatch < Parslet::Atoms::Base
  def try(source, context, consume_all)
    succ("")
  end
end
class NeverMatch < Parslet::Atoms::Base
  attr_accessor :msg
  def initialize(msg = "ignore")
    self.msg = msg
  end
  def try(source, context, consume_all)
    context.err(self, source, msg)
  end
end
class ErrorMatch < Parslet::Atoms::Base
  attr_accessor :msg
  def initialize(msg)
    self.msg = msg
  end
  def try(source, context, consume_all)
    context.err(self, source, msg)
  end
end
class IndentParser < Parslet::Parser
  ##
  # Indentation handling: when matching a newline we check the following indentation. If
  # that indicates an indent token or dedent tokens (1+) then we stick these in instance
  # variables and the high-priority indent/dedent rules will match as long as these
  # remain. The nl rule consumes the indentation itself.
  rule(:indent) { dynamic {|s,c|
    if @indent.nil?
      NeverMatch.new("Not an indent")
    else
      @indent = nil
      AlwaysMatch.new
    end
  }}
  rule(:dedent) { dynamic {|s,c|
    if @dedents.nil? or @dedents.length == 0
      NeverMatch.new("Not a dedent")
    else
      @dedents.pop
      AlwaysMatch.new
    end
  }}
  def checkIndentation(source, ctx)
    # See if next line starts with indentation. If so, consume it and then process
    # whether it is an indent or some number of dedents.
    indent = ""
    while source.matches?(Regexp.new("[ \t]"))
      indent += source.consume(1).to_s # returns a Slice
    end
    if @indentStack.nil?
      @indentStack = [""]
    end
    currentInd = @indentStack[-1]
    return AlwaysMatch.new if currentInd == indent # no change, just match nl
    if indent.start_with?(currentInd)
      # Getting deeper
      @indentStack << indent
      @indent = indent # tells the indent rule to match one
      return AlwaysMatch.new
    else
      # Either some number of de-dents or an error
      # Find first match starting from back
      count = 0
      @indentStack.reverse.each do |level|
        break if indent == level # found it
        if level.start_with?(indent)
          # New indent is prefix, so we de-dented this level.
          count += 1
          next
        end
        # Not a match, not a valid prefix. So an error!
        return ErrorMatch.new("Mismatched indentation level")
      end
      @dedents = [] if @dedents.nil?
      count.times { @dedents << @indentStack.pop }
      return AlwaysMatch.new
    end
  end
  rule(:nl) { anynl >> dynamic {|source, ctx| checkIndentation(source,ctx) }}
  rule(:unixnl) { str("\n") }
  rule(:macnl) { str("\r") }
  rule(:winnl) { str("\r\n") }
  rule(:anynl) { unixnl | macnl | winnl }
end
I'm sure a lot can be improved, but this is what I've come up with so far.
Example usage:
class MyParser < IndentParser
  rule(:colon) { str(':') >> space? }
  rule(:space) { match('[ \t]').repeat(1) }
  rule(:space?) { space.maybe }
  rule(:number) { match['0-9'].repeat(1).as(:num) >> space? }
  rule(:identifier) { match['a-zA-Z'] >> match["a-zA-Z0-9"].repeat(0) }
  rule(:block) { colon >> nl >> indent >> stmt.repeat.as(:stmts) >> dedent }
  rule(:stmt) { identifier.as(:id) >> nl | number.as(:num) >> nl | testblock }
  rule(:testblock) { identifier.as(:name) >> block }
  rule(:prgm) { testblock >> nl.repeat }
  root :prgm
end
I have the following class called Tree that builds a simple tree
class Tree
  attr_accessor :children, :node_name
  def initialize(name, children=[])
    @children = children
    @node_name = name
  end
  def visit_all(&block)
    visit &block
    children.each {|c| c.visit_all &block}
  end
  def visit(&block)
    block.call self
  end
end
ruby_tree = Tree.new("grandpa",
  [Tree.new("dad", [Tree.new("child1"), Tree.new("child2")]),
   Tree.new("uncle", [Tree.new("child3"), Tree.new("child4")])])
puts "Visiting a node"
ruby_tree.visit {|node| puts node.node_name}
puts
puts "visiting entire tree"
ruby_tree.visit_all {|node| puts node.node_name}
Now what I am trying to do is to be able to create a tree as nested hashes instead. For example, for this one this would be:
{'grandpa'=>{'dad'=>{'child 1'=>{},'child 2'=>{}}, 'uncle'=>{'child 3'=>{}, 'child 4'=>{}}}}
Any ideas that could help?
It was melting my brain so I wrote a spec for it:
# encoding: UTF-8
require 'rspec' # testing/behaviour description framework
require_relative "../tree.rb" # pull in the actual code
# Everything in the `describe` block is rspec "tests"
describe :to_h do
  # contexts are useful for describing what happens under certain conditions, in the first case, when there is only the top of the tree passed to to_h
  context "One level deep" do
    # a let is a way of declaring a variable in rspec (that keeps it useful)
    let(:ruby_tree) { Tree.new "grandpa" }
    let(:expected) { {"grandpa" => {} } }
    subject { ruby_tree.to_h } # this is the behaviour you're testing
    it { should == expected } # it should equal what's in expected above
  end
  # The next two contexts are just testing deeper trees. I thought that each depth up to 3 should be tested, as past 3 levels it would be the same as 3.
  context "Two levels deep" do
    let(:ruby_tree) {
      Tree.new( "grandpa",
        [Tree.new("dad"), Tree.new("uncle") ]
      )
    }
    let(:expected) do
      {"grandpa" => {
          "dad" => {}, "uncle" => {}
        }
      }
    end
    subject { ruby_tree.to_h }
    it { should == expected }
  end
  context "grandchildren" do
    let(:ruby_tree){
      ruby_tree = Tree.new("grandpa",
        [Tree.new("dad", [Tree.new("child1"), Tree.new("child2")]),
         Tree.new("uncle", [Tree.new("child3"), Tree.new("child4")])])
    }
    let(:expected) {
      {'grandpa'=>{'dad'=>{'child1'=>{},'child2'=>{}}, 'uncle'=>{'child3'=>{}, 'child4'=>{}}}}
    }
    subject { ruby_tree.to_h }
    it { should == expected }
  end
end
class Tree
  def to_h
    hash = {} # create a hash
    # `reduce` is a synonym for `inject`, see the other answer for a link to the docs,
    # but it's a type of fold
    # http://en.wikipedia.org/wiki/Fold_(higher-order_function),
    # which will take a list of several objects and
    # fold them into one (or fewer, but generally one) through application of a function.
    # It reduces the list through injecting a function, hence the synonyms.
    # Here, the current node's list of children is folded into one hash by
    # applying Hash#merge to each child (once the child has been made
    # into a one key hash, possibly with children too), and then assigned as
    # the current node's hash value, with the node_name as the key.
    hash[@node_name] = children.reduce({}){|mem,c| mem.merge c.to_h}
    hash # return the hash
  end
end
I'm certain this could be done better, but it works at least.
Btw, the hash you provided has some extra spaces in it that I don't think should be there? e.g. "child 1" when it should be "child1", unless you really want that added in?
class Tree
  def to_hash
    { @node_name => @children.inject({}) { |acum, child| acum.merge(child.to_hash) } }
  end
end
p ruby_tree.to_hash
See the documentation for Enumerable#inject.
Break it into simpler subproblems and use recursion:
def make_node(name,subhash)
  Tree.new(name,subhash.keys.collect{|k|make_node(k,subhash[k])})
end
def make_root(hash)
  make_node(hash.keys[0],hash[hash.keys[0]])
end
Then to prove it works:
tree_like_this = make_root({'grandpa' => { 'dad' => {'child 1' => {}, 'child 2' => {} },
                                           'uncle' => {'child 3' => {}, 'child 4' => {} } } })
puts 'tree like this'
tree_like_this.visit_all{|n|puts n.node_name}
This was an exercise from Seven Languages In Seven Weeks. The original exercise said to put it all in initialize.
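If you do want everything in initialize, as the exercise asks, one possible sketch is below. It assumes the constructor always receives a hash with exactly one key (the node name) whose value is the hash of children; that signature is my reading of the exercise, not something taken from the book.

```ruby
class Tree
  attr_accessor :children, :node_name

  # Accepts a one-key hash such as {'grandpa' => {'dad' => {}}}.
  def initialize(hash)
    @node_name, subtree = hash.first
    # Each child entry becomes its own one-key hash and is built recursively.
    @children = subtree.map { |name, grandchildren| Tree.new({ name => grandchildren }) }
  end

  def visit_all(&block)
    visit(&block)
    children.each { |c| c.visit_all(&block) }
  end

  def visit(&block)
    block.call self
  end
end

ruby_tree = Tree.new({ 'grandpa' => { 'dad'   => { 'child1' => {}, 'child2' => {} },
                                      'uncle' => { 'child3' => {}, 'child4' => {} } } })
ruby_tree.visit_all { |node| puts node.node_name }
```

The nodes are visited parent first, in the order the keys appear in the hash.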