Most efficient way to extract arguments from String in Ruby - ruby

I would like to extract some information from a string in Ruby by only reading the String once (O(n) time complexity).
Here is an example:
The string looks like this: -location here -time 7:30pm -activity biking
I have a Ruby object I want to populate with this info. All the keywords are known, and they are all optional.
class ActivityInfo
attr_reader :location, :time, :activity
def initialize(str)
@location, @time, @activity = DEFAULT_LOCATION, DEFAULT_TIME, DEFAULT_ACTIVITY
# Here is how I was planning on implementing this
current_string = ""
next_parameter = nil # A reference to keep track of which parameter the current string is referring to
words = str.split
while !words.empty?
word = words.shift
case word
when "-location"
if !next_parameter.nil?
next_parameter.parameter = current_string # Set the parameter value to the current_string
current_string = ""
else
next_parameter = @location
when "-time"
if !next_parameter.nil?
next_parameter.parameter = current_string
current_string = ""
else
next_parameter = @time
when "-activity"
if !next_parameter.nil?
next_parameter.parameter = current_string
current_string = ""
else
next_parameter = @activity
else
if !current_string.empty?
current_string += " "
end
current_string += word
end
end
end
end
So basically I just don't know how to make a variable be the reference of another variable or method, so that I can then set it to a specific value. Or maybe there is just another more efficient way to achieve this?
Thanks!

The string looks suspiciously like a command-line, and there are some good Ruby modules to parse those, such as optparse.
Assuming it's not, here's a quick way to parse the commands in your sample into a hash:
cmd = '-location here -time 7:30pm -activity biking'
Hash[*cmd.scan(/-(\w+) (\S+)/).flatten]
Which results in:
{
"location" => "here",
"time" => "7:30pm",
"activity" => "biking"
}
Expanding it a bit farther:
class ActivityInfo
  def initialize(h)
    @location = h['location']
    @time     = h['time']
    @activity = h['activity']
  end
end
act = ActivityInfo.new(Hash[*cmd.scan(/-(\w+) (\S+)/).flatten])
Which sets act to an instance of ActivityInfo looking like:
#<ActivityInfo:0x101142df8
@activity = "biking",
@location = "here",
@time = "7:30pm"
>
--
The OP asked how to deal with situations where the commands are not flagged with - or are multiple words. These are equivalent, but I prefer the first stylistically:
irb(main):003:0> cmd.scan(/-((?:location|time|activity)) \s+ (\S+)/x)
[
[0] [
[0] "location",
[1] "here"
],
[1] [
[0] "time",
[1] "7:30pm"
],
[2] [
[0] "activity",
[1] "biking"
]
]
irb(main):004:0> cmd.scan(/-(location|time|activity) \s+ (\S+)/x)
[
[0] [
[0] "location",
[1] "here"
],
[1] [
[0] "time",
[1] "7:30pm"
],
[2] [
[0] "activity",
[1] "biking"
]
]
If the commands are multiple words, such as "at location":
irb(main):009:0> cmd = '-at location here -time 7:30pm -activity biking'
"-at location here -time 7:30pm -activity biking"
irb(main):010:0>
irb(main):011:0* cmd.scan(/-((?:at \s location|time|activity)) \s+ (\S+)/x)
[
[0] [
[0] "at location",
[1] "here"
],
[1] [
[0] "time",
[1] "7:30pm"
],
[2] [
[0] "activity",
[1] "biking"
]
]
If you need even more flexibility, look at Ruby's strscan library and its StringScanner class. You can use it to tear apart a string and find the commands and their parameters.
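As a rough sketch of the strscan approach, assuming the flags are exactly the three from the question and that a value runs until the next known flag (so multi-word values are allowed), a StringScanner loop might look like this. The `parse_flags` helper and the `FLAG` and `VALUE` constants are names made up for illustration.

```ruby
require 'strscan'

# Hypothetical constants: the three flags from the question, hard-coded.
FLAG  = /-(location|time|activity)\b/
# Lazily match everything up to the next known flag or the end of the string.
VALUE = /.*?(?=\s-(?:location|time|activity)\b|\z)/m

# Hypothetical helper: walks the string and returns a hash of flag => value.
def parse_flags(str)
  scanner = StringScanner.new(str)
  result = {}
  until scanner.eos?
    scanner.skip(/\s+/)
    break if scanner.eos?
    if scanner.scan(FLAG)
      key = scanner[1]
      result[key] = scanner.scan(VALUE).to_s.strip
    else
      # Skip a stray word that is not attached to any flag.
      scanner.skip(/\S+/)
    end
  end
  result
end

p parse_flags('-location my house -time 7:30pm -activity mountain biking')
```

Unlike the `\S+` patterns above, this keeps "my house" and "mountain biking" together as single values.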

Convert String to Options Hash
If you just want easy access to your flags and their values, you can split your string into a hash where each flag is a key. For example:
options = Hash[ str.scan /-(\w+)\s+(\S+)/ ]
=> {"location"=>"here", "time"=>"7:30pm", "activity"=>"biking"}
You can then reference values directly (e.g. options['location']) or iterate through your hash in key/value pairs. For example:
options.each_pair { |k, v| puts "%s %s" % [k, v] }
A Dash of Metaprogramming
Okay, this is serious over-engineering, but I spent a little extra time on this question because I found it interesting. I'm not claiming the following is useful; I'm just saying it was fun for me to do.
If you want to parse your option flags and dynamically create a set of attribute readers and set some instance variables without having to define each flag or variable separately, you can do this with a dash of metaprogramming.
# Set attribute readers and instance variables dynamically
# using instance_eval on the class.
class ActivityInfo
  def initialize(str)
    options = Hash[ str.scan(/-(\w+)\s+(\S+)/) ]
    options.each_pair do |k, v|
      self.class.instance_eval { attr_reader k.to_sym }
      instance_variable_set("@#{k}", v)
    end
  end
end
ActivityInfo.new '-location here -time 7:30pm -activity biking'
=> #<ActivityInfo:0x00000001b49398
@activity="biking",
@location="here",
@time="7:30pm">
Honestly, I think setting your variables explicitly from an options hash such as:
@activity = options['activity']
will convey your intent more clearly (and be more readable), but it's always good to have alternatives. Your mileage may vary.
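To tie this back to the original question, where every flag is optional and has a default, here is a minimal sketch of the explicit version. The `DEFAULTS` values are placeholders invented for the example, since the question does not say what the real defaults are.

```ruby
class ActivityInfo
  # Placeholder defaults; the question does not specify the real ones.
  DEFAULTS = { 'location' => 'home', 'time' => 'noon', 'activity' => 'resting' }.freeze

  attr_reader :location, :time, :activity

  def initialize(str)
    # Parsed flags override the defaults; flags not read below are ignored.
    options = DEFAULTS.merge(Hash[str.scan(/-(\w+)\s+(\S+)/)])
    @location = options['location']
    @time     = options['time']
    @activity = options['activity']
  end
end

info = ActivityInfo.new('-time 7:30pm -activity biking')
puts info.location  # falls back to the placeholder default
puts info.time      # taken from the string
```

Merging into a defaults hash keeps the "all keywords are optional" requirement in one place instead of spreading nil checks through the initializer.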

Why reinvent the wheel when Thor can do the heavy lifting for you?
require 'thor'
class ActivityInfo < Thor
desc "record", "record details of your activity"
method_option :location, :type => :string, :aliases => "-l", :required => true
method_option :time, :type => :datetime, :aliases => "-t", :required => true
method_option :activity, :type => :string, :aliases => "-a", :required => true
def record
location = options[:location]
time = options[:time]
activity = options[:activity]
# record details of the activity
end
end
The options will be parsed for you based on the data type you specified. You can invoke it programmatically:
task = ActivityInfo.new([], {location: 'NYC', time: Time.now, activity: 'Chilling out'})
task.record
Or from command line: thor activity_info:record -l NYC -t "2012-06-23 02:30:00" -a "Chilling out"

Related

Initializing a hash with an empty array keyed to an array of strings - Ruby

I have:
people=["Bob","Fred","Sam"]
holidays = Hash.new
people.each do |person|
a=Array.new
holidays[person]=a
end
gifts = Hash.new
people.each do |person|
a=Array.new
gifts[person]=a
end
Feels clunky. I can't seem to figure out a more streamlined way with an initialization block or some such thing. Is there an idiomatic approach here?
Ideally, I'd like to keep an array like:
lists["holidays","gifts",...]
... and iterate through it to initialize each element in the lists array.
require 'date'
people = %w|Bob Fred Sam|
data = %w|holidays gifts|
result = data.zip(data.map { people.zip(people.map { [] }).to_h }).to_h
result['holidays']['Bob'] << Date.today
#⇒ {
# "holidays" => {
# "Bob" => [
# [0] #<Date: 2016-11-04 ((2457697j,0s,0n),+0s,2299161j)>
# ],
# "Fred" => [],
# "Sam" => []
# },
# "gifts" => {
# "Bob" => [],
# "Fred" => [],
# "Sam" => []
# }
# }
A more sophisticated example would be:
result = data.map do |d|
[d, Hash.new { |h, k| h[k] = [] if people.include?(k) }]
end.to_h
The latter produces lazily initialized nested hashes: it uses the block form of Hash.new, so an inner array is created only when a known person's key is first accessed.
Play with it to see how it works.
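For a feel of how the block form behaves, this short sketch (using the same `people` and `data` names as above) shows that an inner array only comes into existence the first time a known name is looked up, and that unknown names return nil without being stored.

```ruby
people = %w[Bob Fred Sam]
data   = %w[holidays gifts]

result = data.map do |d|
  [d, Hash.new { |h, k| h[k] = [] if people.include?(k) }]
end.to_h

p result['holidays'].keys        # => []  (nothing stored yet)
result['holidays']['Bob'] << 'Rome'
p result['holidays'].keys        # => ["Bob"]  (created on first access)
p result['holidays']['Nobody']   # => nil  (unknown name, nothing stored)
p result['gifts'].keys           # => []  (the other hash is untouched)
```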
A common way of doing that would be to use Enumerable#each_with_object.
holidays = people.each_with_object({}) { |p,h| h[p] = [] }
#=> {"Bob"=>[], "Fred"=>[], "Sam"=>[]}
gifts is the same.
If you only want a number of such hashes, then the following should suffice:
count_of_hashes = 4 # lists.count; 4 is chosen randomly by throwing a fair die
people = ["Bob", "Fred", "Sam"]
lists = count_of_hashes.times.map do
people.map {|person| [person, []]}.to_h
end
This code also ensures that the arrays and the hashes all occupy their own memory, as can be verified by the following code:
holidays, gifts, *rest = lists
holidays["Bob"] << "Rome"
And checking the values of all the other hashes:
lists
=> [
{"Bob"=>["Rome"], "Fred"=>[], "Sam"=>[]},
{"Bob"=>[], "Fred"=>[], "Sam"=>[]},
{"Bob"=>[], "Fred"=>[], "Sam"=>[]},
{"Bob"=>[], "Fred"=>[], "Sam"=>[]}
]
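The independence matters because some shorter-looking constructions share a single object. As an illustration (not something from the answer above), `Array.new(n, obj)` reuses one object for every slot, while the block form builds a fresh one each time:

```ruby
people = %w[Bob Fred Sam]

# Pitfall: every slot refers to the same hash object.
shared = Array.new(2, people.map { |person| [person, []] }.to_h)
shared[0]['Bob'] << 'Rome'
p shared[1]['Bob']    # => ["Rome"]

# The block form runs once per slot, so each hash and array is distinct.
separate = Array.new(2) { people.map { |person| [person, []] }.to_h }
separate[0]['Bob'] << 'Rome'
p separate[1]['Bob']  # => []
```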

Iterating over an array to create a nested hash

I am trying to create a nested hash from an array that has several elements saved to it. I've tried experimenting with each_with_object, each_with_index, each and map.
class Person
attr_reader :name, :city, :state, :zip, :hobby
def initialize(name, hobby, city, state, zip)
@name = name
@hobby = hobby
@city = city
@state = state
@zip = zip
end
end
steve = Person.new("Steve", "basketball","Dallas", "Texas", 75444)
chris = Person.new("Chris", "piano","Phoenix", "Arizona", 75218)
larry = Person.new("Larry", "hunting","Austin", "Texas", 78735)
adam = Person.new("Adam", "swimming","Waco", "Texas", 76715)
people = [steve, chris, larry, adam]
people_array = people.map do |person|
person = person.name, person.hobby, person.city, person.state, person.zip
end
Now I just need to turn it into a hash. One issue I am having is, when I'm experimenting with other methods, I can turn it into a hash, but the array is still inside the hash. The expected output is just a nested hash with no arrays inside of it.
# Expected output ... create the following hash from the people array:
#
# people_hash = {
# "Steve" => {
# "hobby" => "golf",
# "address" => {
# "city" => "Dallas",
# "state" => "Texas",
# "zip" => 75444
# }
# # etc, etc
Any hints on making sure the hash is a nested hash with no arrays?
This works:
person_hash = Hash[people_array.map do |user|
  [user[0], Hash['hobby', user[1], 'address', Hash['city', user[2], 'state', user[3], 'zip', user[4]]]]
end]
Basically, just use Ruby's Hash[] method to convert each of the sub-arrays into a hash.
Why not just pass people?
people.each_with_object({}) do |instance, h|
h[instance.name] = { "hobby" => instance.hobby,
"address" => { "city" => instance.city,
"state" => instance.state,
"zip" => instance.zip } }
end
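On Ruby 2.6 or later, the same result can be written with `to_h` taking a block. This is a sketch using a cut-down `Person` defined with `Struct` purely so the example runs on its own; it is not the class from the question.

```ruby
# Stand-in for the question's Person class, for illustration only.
Person = Struct.new(:name, :hobby, :city, :state, :zip)

people = [
  Person.new('Steve', 'basketball', 'Dallas', 'Texas', 75444),
  Person.new('Chris', 'piano', 'Phoenix', 'Arizona', 75218)
]

# The block returns a [key, value] pair for each person.
people_hash = people.to_h do |person|
  [person.name,
   { 'hobby'   => person.hobby,
     'address' => { 'city' => person.city, 'state' => person.state, 'zip' => person.zip } }]
end

puts people_hash['Steve']['address']['city']  # Dallas
```

Because the values are built as hash literals, no arrays end up inside the result.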

Ruby: calculate average() while excluding nil values from data

I'm very new to Ruby and I'm having some difficulties with a seemingly simple problem.
Code is here...
https://github.com/sensu/sensu-community-plugins/blob/master/plugins/graphite/check-stats.rb
...but I've included a full copy of the current source at the end, because it may change as new versions are submitted to Github.
It's a Sensu plugin. It collects data from Graphite via an HTTP request and stores the reply in body, which is then parsed with JSON.parse() into data.
For each metric in data, it collects datapoints, and performs an average on the datapoints. If average is higher than certain thresholds (options -w or -c), it throws a warning or a critical.
Sometimes the Graphite store is a bit behind times. The most recent data point may be missing from some metrics. When that happens, the data point is nil.
The problem is, nil is counted as zero when computing average(datapoints). This artificially lowers the average, sometimes to the effect that the plugin doesn't trigger when it should.
What's the best way to eliminate the nil values from the calculation of average?
Ideally, the elimination of the nils should happen in such a way that, if all data points are nil, then it should trigger the datapoints.empty condition. Basically, kill all the nils before they reach "unless datapoints.empty?" because if all are nil then we don't actually have any data points.
Or somehow metric.collect{} should skip the nil values.
I've tried to use .compact but that didn't seem to make a difference (probably I've used it wrong).
This is the current version of the code:
#!/usr/bin/env ruby
#
# Checks metrics in graphite, averaged over a period of time.
#
# The fired sensu event will only be critical if a stat is
# above the critical threshold. Otherwise, the event will be warning,
# if a stat is above the warning threshold.
#
# Multiple stats will be checked if * are used
# in the "target" query.
#
# Author: Alan Smith (alan@asmith.me)
# Date: 08/28/2014
#
require 'rubygems' if RUBY_VERSION < '1.9.0'
require 'json'
require 'net/http'
require 'sensu-plugin/check/cli'
class CheckGraphiteStat < Sensu::Plugin::Check::CLI
option :host,
:short => "-h HOST",
:long => "--host HOST",
:description => "graphite hostname",
:proc => proc {|p| p.to_s },
:default => "graphite"
option :period,
:short => "-p PERIOD",
:long => "--period PERIOD",
:description => "The period back in time to extract from Graphite. Use -24hours, -2days, -15mins, etc, same format as in Graphite",
:proc => proc {|p| p.to_s },
:required => true
option :target,
:short => "-t TARGET",
:long => "--target TARGET",
:description => "The graphite metric name. Can include * to query multiple metrics",
:proc => proc {|p| p.to_s },
:required => true
option :warn,
:short => "-w WARN",
:long => "--warn WARN",
:description => "Warning level",
:proc => proc {|p| p.to_f },
:required => false
option :crit,
:short => "-c Crit",
:long => "--crit CRIT",
:description => "Critical level",
:proc => proc {|p| p.to_f },
:required => false
def average(a)
total = 0
a.to_a.each {|i| total += i.to_f}
total / a.length
end
def danger(metric)
datapoints = metric['datapoints'].collect {|p| p[0].to_f}
unless datapoints.empty?
avg = average(datapoints)
if !config[:crit].nil? && avg > config[:crit]
return [2, "#{metric['target']} is #{avg}"]
elsif !config[:warn].nil? && avg > config[:warn]
return [1, "#{metric['target']} is #{avg}"]
end
end
[0, nil]
end
def run
body =
begin
uri = URI("http://#{config[:host]}/render?format=json&target=#{config[:target]}&from=#{config[:period]}")
res = Net::HTTP.get_response(uri)
res.body
rescue Exception => e
warning "Failed to query graphite: #{e.inspect}"
end
status = 0
message = ''
data =
begin
JSON.parse(body)
rescue
[]
end
unknown "No data from graphite" if data.empty?
data.each do |metric|
s, msg = danger(metric)
message += "#{msg} " unless s == 0
status = s unless s < status
end
if status == 2
critical message
elsif status == 1
warning message
end
ok
end
end
Well, if you want to eliminate nils before doing collect, you can do
metric['datapoints'].reject { |p| p[0].nil? }.collect {|p| p[0].to_f}
instead of
metric['datapoints'].collect {|p| p[0].to_f}
BTW, your average can also be rewritten as
def average(a)
a.reduce(0,:+)/a.size
end
You can use Array#compact which does exactly that:
["a", nil, "b", nil, "c", nil].compact
#=> [ "a", "b", "c" ]
http://ruby-doc.org/core-2.1.3/Array.html#method-i-compact
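A likely reason `compact` appeared to do nothing is where it is applied. Assuming each datapoint is a `[value, timestamp]` pair (which is what `p[0]` in the plugin suggests), the nil sits inside the pair, and `nil.to_f` is `0.0`, so the values have to be extracted first and compacted before any `to_f`. A minimal sketch with made-up sample data:

```ruby
# Made-up sample in the shape the plugin appears to expect: [value, timestamp].
metric = { 'datapoints' => [[10.0, 1409200000], [20.0, 1409200060], [nil, 1409200120]] }

def average(values)
  values.reduce(0.0, :+) / values.size
end

# Pull out the values, drop the nils, then convert what is left.
datapoints = metric['datapoints'].map { |p| p[0] }.compact.map(&:to_f)

puts average(datapoints) unless datapoints.empty?  # 15.0

# If every value is nil the array ends up empty, so the empty check fires.
all_nil = [[nil, 1], [nil, 2]].map { |p| p[0] }.compact
puts all_nil.empty?  # true
```

With this ordering the existing `unless datapoints.empty?` guard in `danger` also covers the case where every point is missing.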

Reading and writing Sinatra params using symbols, e.g. params[:id]

My form receives data via POST. When I do puts params I can see:
{"id" => "123", "id2" => "456"}
now the commands:
puts params['id'] # => 123
puts params[:id] # => 123
params['id'] = '999'
puts params # => {"id" => "999", "id2" => "456"}
but when I do:
params[:id] = '888'
puts params
I get
{"id" => "999", "id2" => "456", :id => "888"}
In IRB it works fine:
params
# => {"id2"=>"2", "id"=>"1"}
params[:id]
# => nil
params['id']
# => "1"
Why can I read the value using :id, but not set the value using that?
Hashes in Ruby allow arbitrary objects to be used as keys. As strings (e.g. "id") and symbols (e.g. :id) are separate types of objects, a hash may have as a key both a string and symbol with the same visual contents without conflict:
irb(main):001:0> { :a=>1, "a"=>2 }
#=> {:a=>1, "a"=>2}
This is distinctly different from JavaScript, where the keys for objects are always strings.
Because web parameters (whether via GET or POST) are always strings, Sinatra has a 'convenience' that allows you to ask for a parameter using a symbol and it will convert it to a string before looking for the associated value. It does this by using a custom default_proc that calls to_s when looking for a value that does not exist.
Here's the current implementation:
def indifferent_hash
Hash.new {|hash,key| hash[key.to_s] if Symbol === key }
end
However, it does not provide a custom implementation for the []=(key, val) method, and thus you can set a symbol instead of the string.
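The read/write asymmetry can be reproduced outside Sinatra with a plain Hash built the same way. This sketch reuses the default-block form shown above:

```ruby
# Same construction as the indifferent_hash method quoted above.
params = Hash.new { |hash, key| hash[key.to_s] if Symbol === key }
params['id'] = '123'

p params[:id]        # => "123"  (read falls through to the "id" key)
params[:id] = '888'  # write stores a separate Symbol key
p params.keys        # => ["id", :id]
p params['id']       # => "123"  (the String key is unchanged)
```

Writing with the String key, `params['id'] = '888'`, is the way to update the value that later reads will see under either key.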

Refactor ruby on rails model

Given the following code,
How would you refactor this so that the method search_word has access to issueid?
I would say that changing the function search_word so it accepts 3 arguments or making issueid an instance variable (@issueid) could be considered as an example of bad practices, but honestly I cannot find any other solution. If there's no solution aside from this, would you mind explaining the reason why there's no other solution?
Please bear in mind that it is a Ruby on Rails model.
def search_type_of_relation_in_text(issueid, type_of_causality)
relation_ocurrences = Array.new
keywords_list = {
:C => ['cause', 'causes'],
:I => ['prevent', 'inhibitors'],
:P => ['type','supersets'],
:E => ['effect', 'effects'],
:R => ['reduce', 'inhibited'],
:S => ['example', 'subsets']
}[type_of_causality.to_sym]
for keyword in keywords_list
relation_ocurrences + search_word(keyword, relation_type)
end
return relation_ocurrences
end
def search_word(keyword, relation_type)
relation_ocurrences = Array.new
@buffer.search('//p[text()*= "'+keyword+'"]/a').each { |relation|
relation_suggestion_url = 'http://en.wikipedia.org'+relation.attributes['href']
relation_suggestion_title = URI.unescape(relation.attributes['href'].gsub("_" , " ").gsub(/[\w\W]*\/wiki\//, ""))
if not @current_suggested[relation_type].include?(relation_suggestion_url)
if @accepted[relation_type].include?(relation_suggestion_url)
relation_ocurrences << {:title => relation_suggestion_title, :wiki_url => relation_suggestion_url, :causality => type_of_causality, :status => "A", :issue_id => issueid}
else
relation_ocurrences << {:title => relation_suggestion_title, :wiki_url => relation_suggestion_url, :causality => type_of_causality, :status => "N", :issue_id => issueid}
end
end
}
end
If you need additional context, pass it through as an additional argument. That's how it's supposed to work.
Setting @-type instance variables to pass context is bad form, as you've identified.
There are a number of Ruby conventions you seem to be unaware of:
Instead of Array.new just use [ ], and instead of Hash.new use { }.
Use a case statement or a constant instead of defining a Hash and then retrieving only one of the elements, discarding the remainder.
Avoid using return unless strictly necessary, as the last operation is always returned by default.
Use array.each do |item| instead of for item in array
Use do ... end instead of { ... } for multi-line blocks, where the curly brace version is generally reserved for one-liners. Avoids confusion with hash declarations.
Try and avoid duplicating large chunks of code when the differences are minor. For instance, declare a temporary variable, conditionally manipulate it, then store it instead of defining multiple independent variables.
With that in mind, here's a reworking of it:
KEYWORDS = {
:C => ['cause', 'causes'],
:I => ['prevent', 'inhibitors'],
:P => ['type','supersets'],
:E => ['effect', 'effects'],
:R => ['reduce', 'inhibited'],
:S => ['example', 'subsets']
}
def search_type_of_relation_in_text(issue_id, type_of_causality)
  # flat_map keeps the result a single flat list of occurrences
  KEYWORDS[type_of_causality.to_sym].flat_map do |keyword|
    search_word(keyword, type_of_causality, issue_id)
  end
end
def search_word(keyword, relation_type, issue_id)
  relation_occurrences = [ ]
  @buffer.search(%Q{//p[text()*= "#{keyword}"]/a}).each do |relation|
    relation_suggestion_url = "http://en.wikipedia.org#{relation.attributes['href']}"
    relation_suggestion_title = URI.unescape(relation.attributes['href'].gsub("_" , " ").gsub(/[\w\W]*\/wiki\//, ""))
    if (!@current_suggested[relation_type].include?(relation_suggestion_url))
      occurrence = {
        :title => relation_suggestion_title,
        :wiki_url => relation_suggestion_url,
        :causality => relation_type,
        :issue_id => issue_id
      }
      occurrence[:status] =
        if (@accepted[relation_type].include?(relation_suggestion_url))
          'A'
        else
          'N'
        end
      relation_occurrences << occurrence
    end
  end
  relation_occurrences
end
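One detail worth checking in a rewrite like this is the shape of the combined result. The original loop was meant to build one flat list, whereas `collect` with a block that returns arrays produces an array of arrays; `flat_map` gives the flat list. A sketch with a stubbed `search_word` (the stub and its return values are invented for the example):

```ruby
KEYWORDS = { :C => ['cause', 'causes'] }

# Stub standing in for the real search; returns one fake hit per keyword.
def search_word(keyword)
  [{ :title => "hit for #{keyword}" }]
end

nested = KEYWORDS[:C].collect  { |keyword| search_word(keyword) }
flat   = KEYWORDS[:C].flat_map { |keyword| search_word(keyword) }

p nested.first.class   # => Array  (each element is itself an array)
p flat.first.class     # => Hash   (occurrences sit directly in the list)
p flat.first[:title]   # => "hit for cause"
```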
