Ruby Exceptions In A Loop, how to tackle skipped elements? - ruby

I have a loop that runs a chain of gsub calls, each taking a Regexp as its argument, over 'Macro' text templates which I feed into the loop. Some templates contain all of the desired Regexp matches, others only one or two. As a result, the next element in the loop is being skipped. How do I retry or ensure that the loop continues, skipping only the .match that is unfulfilled?
Macro Objects:
[{"url"=>"macros/360081752739.json",
"id"=>360081752739,
"actions"=>[{"field"=>"comment_value_html", "value"=>" DE89 3704 0044 0532 0130 00 DE89370400440532013000 test#gmail.com "}],
"restriction"=>nil},
{"url"=>"macros/360081755559.json",
"id"=>360081755559,
"actions"=>[{"field"=>"comment_value_html", "value"=>"test#gmail.com "}],
"restriction"=>nil}]
Loop:
PARTS = [
  { re: /\bDE(?:[0-9a-zA-Z]\s?){20}\b/, name: 'IBAN' },
  { re: /\b[\w\.-]+#[\w\.-]+\.\w{2,4}\b/, name: 'E-Mail' },
  { re: /\b(0|0049\s?|\+49\s?|\(\+49\)\s?){1}([1-9]{2,4})([ \-\/]?[0-9]{1,10})+\b/, name: 'Phone Number' }
].freeze
PARTS.each do |x|
  list = macros.select { |m| m['actions'].any? { |w| x[:re].match?(w['value']) } }.map { |m| m['id'] }
  puts "Macro ID with #{x[:name]} #{list}"
  list.each do |i|
    #id = i
    data_macro = macros.select { |m| m['actions'].any? { |w| x[:re].match?(w['value']) } }.to_s.gsub(/\b[\w\.-]+#[\w\.-]+\.\w{2,4}\b/, '{{email}}').gsub(/\bDE(?:[0-9a-zA-Z]\s?){20}\b/, '{{IBAN}}').gsub(/\b(0|0049\s?|\+49\s?|\(\+49\)\s?){1}([1-9]{2,4})([ \-\/]?[0-9]{1,10})+\b/, '{{phone_number}}')
    hash = eval(data_macro)
    data_hash = hash.to_json
    #final_data = JSON.parse(data_hash)
  end
end
Expected Output:
[{"url"=>"macros/360081752739.json",
"id"=>360081752739,
"actions"=>[{"field"=>"comment_value_html", "value"=>" {{IBAN}} {{IBAN}} {{email}} "}],
"restriction"=>nil},
{"url"=>"macros/360081755559.json",
"id"=>360081755559,
"actions"=>[{"field"=>"comment_value_html", "value"=>"{{email}} "}],
"restriction"=>nil}]
Actual Output:
[{"url"=>"macros/360081752739.json",
"id"=>360081752739,
"actions"=>[{"field"=>"comment_value_html", "value"=>" {{IBAN}} {{IBAN}} {{email}} "}],
"restriction"=>nil},
{"url"=>"macros/360081755559.json",
"id"=>360081755559,
"actions"=>[{"field"=>"comment_value_html", "value"=>"test#gmail.com "}],
"restriction"=>nil}]

Related

Serialize an array of hashes

I have an array of hashes:
records = [
{
ID: 'BOATY',
Name: 'McBoatface, Boaty'
},
{
ID: 'TRAINY',
Name: 'McTrainface, Trainy'
}
]
I'm trying to combine them into an array of strings:
["ID,BOATY","Name,McBoatface, Boaty","ID,TRAINY","Name,McTrainface, Trainy"]
This doesn't seem to do anything:
irb> records.collect{|r| r.each{|k,v| "\"#{k},#{v}\"" }}
#=> [{:ID=>"BOATY", :Name=>"McBoatface, Boaty"}, {:ID=>"TRAINY", :Name=>"McTrainface, Trainy"}]
** edit **
Formatting (i.e. ["Key0,Value0","Key1,Value1",...]) is required to match a vendor's interface.
** /edit **
What am I missing?
records.flat_map(&:to_a).map { |a| a.join(',') }
#=> ["ID,BOATY", "Name,McBoatface, Boaty", "ID,TRAINY", "Name,McTrainface, Trainy"]
records = [
{
ID: 'BOATY',
Name: 'McBoatface, Boaty'
},
{
ID: 'TRAINY',
Name: 'McTrainface, Trainy'
}
]
# straightforward code
result = []
records.each do |hash|
  hash.each do |key, value|
    result << key.to_s
    result << value
  end
end
puts result.inspect
# a rubyish way (probably less efficient, I haven't benchmarked it)
puts records.map(&:to_a).flatten.map(&:to_s).inspect
Hope it helps.
li = []
records.each do |rec|
  rec.each do |k, v|
    li << "#{k},#{v}"
  end
end
print li
["ID,BOATY", "Name,McBoatface, Boaty", "ID,TRAINY", "Name,McTrainface, Trainy"]
You sure you wanna do it this way?
Check out Marshal. Or JSON.
You could even do it this stupid way using Hash#inspect and eval:
serialized_hashes = records.map(&:inspect) # ["{ID: 'Boaty'...", ...]
unserialized = serialized_hashes.map { |s| eval(s) }
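For comparison, here is a minimal sketch of the Marshal and JSON round trips suggested above, using the records array from the question. The variable names are only illustrative, and symbolize_names is used so the JSON result compares equal to the original symbol-keyed hashes.

```ruby
require 'json'

records = [
  { ID: 'BOATY', Name: 'McBoatface, Boaty' },
  { ID: 'TRAINY', Name: 'McTrainface, Trainy' }
]

# JSON: portable text. Symbol keys come back as strings unless
# symbolize_names: true is passed to JSON.parse.
json_text = JSON.generate(records)
from_json = JSON.parse(json_text, symbolize_names: true)

# Marshal: Ruby-specific binary format. Only load data you trust.
binary = Marshal.dump(records)
from_marshal = Marshal.load(binary)

puts from_json == records
puts from_marshal == records
```

Unlike the eval approach, neither of these executes the serialized text as Ruby code.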

Ruby aggregation with objects

Let's say I have something like this:
class FruitCount
  attr_accessor :name, :count
  def initialize(name, count)
    @name = name
    @count = count
  end
end
obj1 = FruitCount.new('Apple', 32)
obj2 = FruitCount.new('Orange', 5)
obj3 = FruitCount.new('Orange', 3)
obj4 = FruitCount.new('Kiwi', 15)
obj5 = FruitCount.new('Kiwi', 1)
fruit_counts = [obj1, obj2, obj3, obj4, obj5]
Now what I need is a function build_fruit_summary which, given a fruit_counts array, returns the following summary:
fruits_summary = {
fruits: [
{
name: 'Apple',
count: 32
},
{
name: 'Orange',
count: 8
},
{
name: 'Kiwi',
count: 16
}
],
total: {
name: 'AllFruits',
count: 56
}
}
I just cannot figure out the best way to do the aggregations.
Edit:
In my example I have more than one count.
class FruitCount
  attr_accessor :name, :count1, :count2
  def initialize(name, count1, count2)
    @name = name
    @count1 = count1
    @count2 = count2
  end
end
Ruby's Enumerable is your friend, particularly each_with_object which is a form of reduce.
You first need the fruits value:
fruits = fruit_counts.each_with_object([]) do |fruit, list|
  aggregate = list.detect { |f| f[:name] == fruit.name }
  if aggregate.nil?
    # Start every counter at zero so += works on the first occurrence.
    aggregate = { name: fruit.name, count1: 0, count2: 0 }
    list << aggregate
  end
  aggregate[:count1] += fruit.count1
  aggregate[:count2] += fruit.count2
end
UPDATE: added multiple counts within the single fruity loop.
The above builds one hash per distinct fruit name, accumulating that fruit's counts as it goes, collects those hashes in the initially empty list array, and assigns the resulting array to the fruits variable.
Now, get the total value:
total = { name: 'AllFruits', count: fruit_counts.map { |f| f.count1 + f.count2 }.reduce(:+) }
UPDATE: total taking into account multiple count attributes within a single loop.
The above maps the fruit_counts array, adding up each object's count attributes, resulting in an array of integers. Then reduce sums the array's integers.
Now put it all together into the summary:
fruits_summary = { fruits: fruits, total: total }
You can formalize this in an OOP style by introducing a FruitCollection object that uses the Enumerable module:
class FruitCollection
  include Enumerable
  def initialize(fruits)
    @fruits = fruits
  end
  def summary
    { fruits: fruit_counts, total: total }
  end
  def each(&block)
    @fruits.each(&block)
  end
  def fruit_counts
    each_with_object([]) do |fruit, list|
      aggregate = list.detect { |f| f[:name] == fruit.name }
      if aggregate.nil?
        # Start every counter at zero so += works on the first occurrence.
        aggregate = { name: fruit.name, count1: 0, count2: 0 }
        list << aggregate
      end
      aggregate[:count1] += fruit.count1
      aggregate[:count2] += fruit.count2
    end
  end
  def total
    { name: 'AllFruits', count: map { |f| f.count1 + f.count2 }.reduce(:+) }
  end
end
Now pass your fruit_count array into that object:
fruit_collection = FruitCollection.new fruit_counts
fruits_summary = fruit_collection.summary
The above works because we define the each method, which Enumerable uses under the hood for every enumerable method. This means we can call each_with_object, reduce, and map (among others listed in the Enumerable docs) and they will iterate over the fruits, since we told them how in the each method above.
Here's an article on Enumerable.
UPDATE: your multiple counts can be easily added by adding a total attribute to your fruit object:
class FruitCount
  attr_accessor :name, :count1, :count2
  def initialize(name, count1, count2)
    @name = name
    @count1 = count1
    @count2 = count2
  end
  def total
    @count1 + @count2
  end
end
Then just use fruit.total whenever you need to aggregate the totals:
fruit_counts.map(&:total).reduce(:+)
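As a rough end-to-end sketch of the total idea, the snippet below combines it with a Hash.new(0) accumulator to build the whole summary. The sample numbers are made up so that the per-fruit sums match the question's expected output, and the per_fruit variable is just an illustrative name.

```ruby
# Two-count version of the question's class, with a total helper.
class FruitCount
  attr_reader :name, :count1, :count2

  def initialize(name, count1, count2)
    @name = name
    @count1 = count1
    @count2 = count2
  end

  def total
    @count1 + @count2
  end
end

# Made-up sample data (not from the question).
fruit_counts = [
  FruitCount.new('Apple', 30, 2),
  FruitCount.new('Orange', 4, 1),
  FruitCount.new('Orange', 2, 1),
  FruitCount.new('Kiwi', 10, 6)
]

# Hash.new(0) makes missing names start at zero, so += needs no nil check.
per_fruit = fruit_counts.each_with_object(Hash.new(0)) do |fruit, sums|
  sums[fruit.name] += fruit.total
end

fruits_summary = {
  fruits: per_fruit.map { |name, count| { name: name, count: count } },
  total: { name: 'AllFruits', count: fruit_counts.map(&:total).reduce(0, :+) }
}

p fruits_summary
```

Passing 0 as the initial value to reduce keeps the total at 0 rather than nil when the array is empty.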
fruits_summary = {
  fruits: fruit_counts
    .group_by { |f| f.name }
    .map do |fruit_name, objects|
      {
        name: fruit_name,
        count: objects.map(&:count).reduce(:+)
      }
    end,
  total: {
    name: 'AllFruits',
    count: fruit_counts.map(&:count).reduce(:+)
  }
}
Not a very efficient way, though :)
UPD: fixed keys in fruits collection
Or slightly better version:
fruits_summary = {
  fruits: fruit_counts
    .reduce({}) { |acc, fruit| acc[fruit.name] = acc.fetch(fruit.name, 0) + fruit.count; acc }
    .map { |name, count| {name: name, count: count} },
  total: {
    name: 'AllFruits',
    count: fruit_counts.map(&:count).reduce(:+)
  }
}
counts = fruit_counts.each_with_object(Hash.new(0)) {|obj, h| h[obj.name] += obj.count}
#=> {"Apple"=>32, "Orange"=>8, "Kiwi"=>16}
fruits_summary =
{ fruits: counts.map { |name, count| { name: name, count: count } },
total: { name: 'AllFruits', count: counts.values.reduce(:+) }
}
#=> {:fruits=>[
# {:name=>"Apple", :count=>32},
# {:name=>"Orange", :count=> 8},
# {:name=>"Kiwi", :count=>16}],
# :total=>
# {:name=>"AllFruits", :count=>56}
# }

Efficiently building a file system tree structure with nested hashes

I have a list of the diff stats per file for a commit (using diff --numstat in Git) that I need to parse into a tree structure as a hash so I can use it as JSON. The raw data is in a format like this:
1 1 app/assets/javascripts/foo.js.coffee
2 1 app/assets/javascripts/bar.js
16 25 app/assets/javascripts/baz.js.coffee
11 0 app/controllers/foo_controller.rb
3 2 db/schema.rb
41 1 lib/foobar.rb
I need to parse this into a nested hash format something like the following:
{ name: "app", children: [
  { name: "assets", children: [
    { name: "javascripts", children: [
      { name: "foo.js.coffee", add: 1, del: 1 },
      { name: "bar.js", add: 2, del: 1 },
      { name: "baz.js.coffee", add: 16, del: 25 }
    ], add: 19, del: 27 },
    ...
  ] }
] }
Where every level of the tree is represented by its name, children as a hash and the total number of additions and deletions for that tree.
Is there an efficient way to construct a hash like this in Ruby?
Full source here: https://gist.github.com/dimitko/5541709. You can download it and directly run it without any trouble (just make sure to have the awesome_print gem; it shows you the object hierarchy in much more human-readable format).
I enriched your test input a little, to make sure the algorithm doesn't make stupid mistakes.
Given this input:
input = <<TEXT
2 1 app/assets/javascripts/bar.js
16 25 app/assets/javascripts/baz.js.coffee
1 1 app/assets/javascripts/foo.js.coffee
4 9 app/controllers/bar_controller.rb
3 2 app/controllers/baz_controller.rb
11 0 app/controllers/foo_controller.rb
3 2 db/schema.rb
41 1 lib/foobar.rb
12 7 lib/tasks/cache.rake
5 13 lib/tasks/import.rake
TEXT
And this expected result:
[{:name=>"app", :add=>37, :del=>38, :children=>[{:name=>"assets", :add=>19, :del=>27, :children=>[{:name=>"javascripts", :add=>19, :del=>27, :children=>[{:name=>"bar.js", :add=>2, :del=>1}, {:name=>"baz.js.coffee", :add=>16, :del=>25}, {:name=>"foo.js.coffee", :add=>1, :del=>1}]}]}, {:name=>"controllers", :add=>18, :del=>11, :children=>[{:name=>"bar_controller.rb", :add=>4, :del=>9}, {:name=>"baz_controller.rb", :add=>3, :del=>2}, {:name=>"foo_controller.rb", :add=>11, :del=>0}]}]}, {:add=>3, :del=>2, :name=>"db", :children=>[{:name=>"schema.rb", :add=>3, :del=>2}]}, {:add=>58, :del=>21, :name=>"lib", :children=>[{:name=>"foobar.rb", :add=>41, :del=>1}, {:name=>"tasks", :add=>17, :del=>20, :children=>[{:name=>"cache.rake", :add=>12, :del=>7}, {:name=>"import.rake", :add=>5, :del=>13}]}]}]
And this code:
def git_diffnum_parse_paths(list, depth, out)
  to = 1
  base = list.first[:name][depth]
  while list[to] and list[to][:name][depth] == base do
    to += 1
  end
  if list.first[:name][depth+1]
    out << {name: base, add: 0, del: 0, children: []}
    # Common directory found for the first N records; recurse deeper.
    git_diffnum_parse_paths(list[0..to-1], depth + 1, out.last[:children])
    add = del = 0
    out.last[:children].each do |x| add += x[:add].to_i; del += x[:del].to_i; end
    out.last[:add] = add
    out.last[:del] = del
  else
    # It's a file, we can't go any deeper.
    out << {name: list.first[:name].last, add: list.first[:add].to_i, del: list.first[:del].to_i}
  end
  if list[to]
    # Recurse in to try find common directories for the deeper records.
    git_diffnum_parse_paths(list[to..-1], depth, out)
  end
  nil
end
def to_git_diffnum_tree(txt)
  items = []
  txt.split("\n").each do |line|
    m = line.match(/(\d+)\s+(\d+)\s+(.+)/).to_a[1..3]
    items << {add: m[0], del: m[1], name: m[2]}
  end
  items.sort! { |a,b|
    a[:name] <=> b[:name]
  }
  items.each do |item|
    item[:name] = item[:name].split("/")
  end
  out = []
  git_diffnum_parse_paths(items, 0, out)
  out
end
And this code, which is using it:
require 'awesome_print'
out = to_git_diffnum_tree(input)
puts; ap out; puts
puts; puts "Expected result:"; puts expected.inspect
puts; puts "Actual result: "; puts out.inspect
puts; puts "Are expected and actual results identical: #{expected == out}"
It seems to produce what you want.
Notes:
I am sorting the array of parsed entries by directory/file names. This is done to avoid walking the entire list to search for a common directory; instead, the algorithm can scan the list up until the first non-match.
I am far from thinking that's the optimal solution, but it's what I came up with in a free hour.
I have left some [un-]commented puts statements in the gist, in case you want a rough glimpse of how the algorithm works.
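To illustrate the first note, here is a small sketch of why sorting helps: once the paths are sorted, entries sharing a leading directory sit next to each other, so a single left-to-right pass can cut the list into runs. It uses Enumerable#chunk_while (Ruby 2.3+) rather than the hand-written while loop above, and the sample paths are made up.

```ruby
# Made-up sample paths, deliberately out of order.
paths = [
  'lib/foobar.rb',
  'app/controllers/foo_controller.rb',
  'db/schema.rb',
  'app/assets/javascripts/bar.js'
]

sorted = paths.sort.map { |path| path.split('/') }

# Neighbours with the same first component belong to the same run;
# the run ends at the first non-match, with no need to rescan the list.
runs = sorted.chunk_while { |a, b| a.first == b.first }.to_a

runs.each do |run|
  puts "#{run.first.first}: #{run.length} file(s)"
end
```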
In case you want to give it a more solid test, try something like this:
git diff --numstat `git rev-list --max-parents=0 HEAD | head -n 1` HEAD
That'd give you number of additions and deletions since the initial commit (provided your Git version is >=1.7.4.2), which is a far bigger input where you can give the algorithm a lot more rigorous testing.
Hope I helped.
Define "efficient".
If your problem is "performance", your solution isn't ruby.
Unless you're literally running this script on the Linux source code, I wouldn't be worrying about performance, just clarity of intent.
I took inspiration from @dimitko's solution and minimized the code used.
https://gist.github.com/x1024/3d0f9ad61fcb4b189be3
def git_group(lines, root = 'root')
  if lines.count == 1 && lines[0][:name].empty?
    return {
      name: root,
      add: lines.map { |l| l[:add] }.reduce(0, :+),
      del: lines.map { |l| l[:del] }.reduce(0, :+),
    }
  end
  lines = lines.group_by { |line| line[:name].shift }
               .map { |key, value| git_group(value, key) }
  return {
    name: root,
    add: lines.map { |l| l[:add] }.reduce(0, :+),
    del: lines.map { |l| l[:del] }.reduce(0, :+),
    children: lines
  }
end
def to_git_diffnum_tree(txt)
  data = txt.split("\n")
            .map { |line| line.split() }
            .map { |line| {add: line[0].to_i, del: line[1].to_i, name: line[2].split('/')} }
            .sort_by { |item| item[:name] }
  git_group(data)[:children]
end
And if you are willing to compromise with your data format (i.e. return the same data but in a different structure), you can do this with even less code:
https://gist.github.com/x1024/5ecfdfe886e31f8b5ab9
def git_group(lines)
  dirs = lines.select { |line| line[:name].count > 1 }
  files = (lines - dirs).map! { |file| [file.delete(:name).shift, file] }
  dirs_processed = dirs.group_by { |dir| dir[:name].shift }
                       .map { |key, value| [key, git_group(value)] }
  data = dirs_processed.concat(files)
  return {
    add: data.map { |k,l| l[:add] }.reduce(0, :+),
    del: data.map { |k,l| l[:del] }.reduce(0, :+),
    children: Hash[data]
  }
end
def to_git_diffnum_tree(txt)
  data = txt.split("\n")
            .map { |line| line.split() }
            .map { |line| {add: line[0].to_i, del: line[1].to_i, name: line[2].split('/')} }
            .sort_by { |item| item[:name] }
  git_group(data)[:children]
end
Remember kids, writing C++ in Ruby is bad.

Scanning an array of strings for match in Ruby

I have an array:
a = ["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
and then I want to return true if one of the elements in that array matches the string:
s = "example.com"
I tried with include? and any?.
a.include? s
a.any?{|w| s=~ /#{w}/}
I don't know how to use it here. Any suggestions?
You can use any? like:
[
"http://design.example.com",
"http://www.domcx.com",
"http://subr.com"
].any?{ |s| s['example.com'] }
Substituting your variable names:
a = [
"http://design.example.com",
"http://www.domcx.com",
"http://subr.com"
]
s = "example.com"
a.any?{ |i| i[s] }
You can do it several other ways too, but the advantage of using any? is that it stops as soon as it gets one hit, so it can be much faster if that hit occurs early in the list.
How about the below:
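A quick way to see the short-circuit is to count how many elements the block actually visits. The checked counter below is purely illustrative and not part of the original answer.

```ruby
a = ["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
s = "example.com"

checked = 0 # illustrative counter, only here to observe the short-circuit
found = a.any? do |url|
  checked += 1
  url.include?(s)
end

puts found   # true
puts checked # 1, because the first element already matches
```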
a=["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
s= "sus"
p a.any? { |w| w.include? s } #=> false
a=["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
s= "design.example"
p a.any? { |w| w.include? s } #=>true
a=["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
s= "desingn.example"
p a.any? { |w| w.include? s } #=>false
a=["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
s= "example"
p a.any? { |w| w.include? s } #=>true
a=["http://design.example.com", "http://www.domcx.com", "http://subr.com"]
s= "example.com"
p a.any? { |w| w.include? s } #=>true

How to divide and sort an array in defined order

Input
cycle = 4
order = []
order[0] = [
/foobar/, /vim/
]
order[1] = [ /simple/,/word/, /.*/ ]
record = [ 'vim', 'foobar', 'foo', 'word', 'bar', 'something', 'something1', 'something2', 'something3', 'something4']
Requirement
I want to make a list named report. The original source is record, which is a one-dimensional array. All elements of record will be split into different groups and sorted. The groups and their order are defined in order.
This is pseudo code:
order.each do |group|
  group.each do |pattern|
    record.each do |r|
      if r =~ pattern
        #report[# of group][# of row][ # of element (max is 4th)] = r
      end
    end
  end
end
Please note:
Each [row] holds at most 4 elements, as defined in cycle.
[# of row]: once a row holds 4 elements, # of row increases by 1.
Every element(string) in report is unique.
Expected output:
require 'ap'
ap report
[
    [0] [
        [0] [
            [0] "foobar",
            [1] "vim"
        ]
    ],
    [1] [
        [0] [
            [0] "word",
            [1] "foo",
            [2] "bar",
            [3] "something"
        ],
        [1] [
            [0] "something1",
            [1] "something2",
            [2] "something3",
            [3] "something4"
        ]
    ]
]
This should do it (though it's not very pretty):
report = []
record.uniq!
order.each_with_index do |group, gi|
  group.each do |pattern|
    record.select { |r| r =~ pattern }.each do |match|
      report[gi] ||= [[]]
      report[gi] << [] if report[gi].last.length == cycle
      report[gi].last << match
    end
    record.delete_if { |r| r =~ pattern }
  end
end
puts report.inspect
#=> [[["foobar", "vim"]], [["word", "foo", "bar", "something"], ["something1", "something2", "something3", "something4"]]]
Note that record is mutated, so if you need it to remain the same you should dup it.
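If mutation is a concern, one option is to wrap the loop in a method that only ever touches a copy. The sketch below is a variation rather than the code above verbatim: build_report is a hypothetical name, and it uses partition on a working copy instead of select plus delete_if.

```ruby
# Hypothetical wrapper; the caller's record array is never modified.
def build_report(cycle, order, record)
  remaining = record.uniq # uniq (no bang) returns a new array
  report = []
  order.each_with_index do |group, gi|
    group.each do |pattern|
      # Split into matches and leftovers in one pass.
      matches, remaining = remaining.partition { |r| r =~ pattern }
      matches.each do |match|
        report[gi] ||= [[]]
        report[gi] << [] if report[gi].last.length == cycle
        report[gi].last << match
      end
    end
  end
  report
end

cycle = 4
order = [[/foobar/, /vim/], [/simple/, /word/, /.*/]]
record = ['vim', 'foobar', 'foo', 'word', 'bar', 'something',
          'something1', 'something2', 'something3', 'something4']

report = build_report(cycle, order, record)
p report
p record.length # still 10
```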
Here's another approach. I'm still not entirely happy with this -- couldn't figure out how to boil down the last two steps into one. Also it ended up having more lines than Andrew Marshall's answer. Boo.
Spec attached.
require 'spec_helper'
def report(cycle, order, record)
  record.uniq!
  order.each_with_index.map do |pattern_list, index|
    pattern_list.map do |pattern|
      record.each_with_index.inject([]) do |memo, (item, item_index)|
        memo.tap do
          if pattern =~ item
            memo << item
            record[item_index] = nil
          end
        end
      end
    end.flatten
  end.map do |items|
    items.each_with_index.group_by do |item, index|
      index.div(cycle)
    end.map do |ordering, item_with_index|
      item_with_index.map(&:first)
    end
  end
end
describe 'report' do
  let(:cycle) { 4 }
  let(:order) { [
    [/foobar/, /vim/],
    [/simple/,/word/, /.*/]
  ] }
  let(:record) {
    [ 'vim', 'foobar', 'foo', 'word', 'bar', 'something', 'something1', 'something2', 'something3', 'something4']
  }
  it "just works" do
    report(cycle, order, record.dup).should == [
      [["foobar","vim"]],
      [["word","foo","bar","something"],["something1","something2","something3","something4"]]
    ]
  end
end
Simple answer: you can use each_with_index, which works similarly to each but gives you the index of the loop as the second parameter.
I can't give you a full example sadly, as I didn't fully understand your use case. However with the documentation you should be able to proceed.
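As a small, hedged illustration of that idea (not a full solution to the question), the index can be turned into a row number by integer division with cycle. The items list here is made-up sample data.

```ruby
cycle = 4
items = %w[word foo bar something something1 something2]

rows = []
items.each_with_index do |item, index|
  row = index / cycle # integer division picks the row
  (rows[row] ||= []) << item
end

p rows

# each_slice produces the same grouping without tracking the index by hand.
p items.each_slice(cycle).to_a
```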

Resources