How to implement complex message validation/handling flow - ruby

I'm developing a web service (in Ruby) which needs to do a number of
different things for each message it receives.
Before my web service can process a message it must do different things:
sanitizing (e.g. remove HTML/JS)
check format (e.g. valid email provided?)
check IP in blacklist
invoke 3rd party web service
plus 10-30 other things
I'm thinking about implementing a filter/composite filter architecture
where each step/phase is a filter. For instance, I could have these filters
Sanitize input filter
Email filter
Country code filter
Blacklist filter
Each filter should be able to reject a message, so I'm considering
having a filter raise/throw an exception to do so.
This should give a lot of flexibility and hopefully a codebase that is
easy to understand.
How would you do this? And what are the pros and cons of the above design?

I would reserve exceptions for cases where the filter itself actually breaks down (e.g. the blacklist is not available) and indicate the valid/invalid state either by true/false return values or, as you also suggested, by throwing a tag.
If you don't want to stop at the first failure but want to execute all filters anyway, you should choose the boolean return type and AND the results together (success &= next_filter(msg)).
If I understood your situation correctly, the filter can both modify the message or check some other source for validity (e.g blacklist).
So I would do it like this:
module MessageFilters
  EmailValidator = ->(msg) do
    # reject the message unless its text contains an "@"
    throw :failure unless msg.text =~ /@/
  end
  HTMLSanitizer = ->(msg) do
    # this filter only modifies message, doesn't throw anything
    # msg.text.remove_all_html!
  end
end
class Message
  attr_accessor :filters, :text
  def initialize(text = '')
    @filters = []
    @text = text
  end
  def execute_filters!
    begin
      catch(:failure) do
        filters.each{|f| f.call self}
        true # if all filters pass, this is returned, else nil
      end
    rescue => e
      # Handle filter errors
    end
  end
end
message = Message.new('joe@example.com')
message.filters << MessageFilters::EmailValidator
message.filters << MessageFilters::HTMLSanitizer
success = message.execute_filters! # returns either true or nil
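For the case where every filter should run even after one has failed, the boolean variant mentioned above (success &= next_filter(msg)) could be sketched roughly as follows. The filter names, the FILTERS table and the Struct-based SimpleMessage are assumptions made purely for illustration.

```ruby
# Hypothetical message type; only the text attribute matters here.
SimpleMessage = Struct.new(:text)

# Each filter returns true (valid) or false (invalid) instead of throwing.
FILTERS = {
  has_at_sign: ->(msg) { msg.text.include?('@') },
  not_empty:   ->(msg) { !msg.text.strip.empty? },
  short:       ->(msg) { msg.text.length <= 20 }
}.freeze

# Runs every filter, never stopping early, and reports which ones failed.
def run_all_filters(msg)
  success = true
  failed = []
  FILTERS.each do |name, filter|
    passed = filter.call(msg) # the filter always runs, whatever came before
    failed << name unless passed
    success &= passed         # accumulate the overall verdict
  end
  [success, failed]
end

p run_all_filters(SimpleMessage.new('joe@example.com'))
p run_all_filters(SimpleMessage.new('this text is far too long'))
```

Returning the list of failed filter names alongside the boolean is a design choice of this sketch; it makes it possible to tell the sender everything that was wrong with a message in one response.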

Related

Creating a Ruby API

I have been tasked with creating a Ruby API that retrieves YouTube URLs. However, I am not sure of the proper way to create an 'API'... I wrote the code below as a Sinatra server that serves up JSON, but what exactly is the definition of an API, and would this qualify as one? If this is not an API, how can I make it an API? Thanks in advance.
require 'open-uri'
require 'json'
require 'sinatra'
# get user input
puts "Please enter a search (separate words by commas):"
search_input = gets.chomp
puts
puts "Performing search on YOUTUBE ... go to '/videos' API endpoint to see the results and use the output"
puts
# define query parameters
api_key = 'my_key_here'
search_url = 'https://www.googleapis.com/youtube/v3/search'
params = {
part: 'snippet',
q: search_input,
type: 'video',
videoCaption: 'closedCaption',
key: api_key
}
# use search_url and query parameters to construct a url, then open and parse the result
uri = URI.parse(search_url)
uri.query = URI.encode_www_form(params)
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
result = JSON.parse(URI.open(uri).read)
# class to define attributes of each video and format into eventual json
class Video
  attr_accessor :title, :description, :url
  def initialize
    @title = nil
    @description = nil
    @url = nil
  end
  def to_hash
    {
      'title' => @title,
      'description' => @description,
      'url' => @url
    }
  end
  def to_json
    self.to_hash.to_json
  end
end
# create an array with top 3 search results
results_array = []
result["items"].take(3).each do |video|
  @video = Video.new
  @video.title = video["snippet"]["title"]
  @video.description = video["snippet"]["description"]
  @video.url = video["snippet"]["thumbnails"]["default"]["url"]
  results_array << @video.to_json.gsub!(/\"/, '\'')
end
# define the API endpoint
get '/videos' do
results_array.to_json
end
An "API" (Application Programming Interface) is, simply, something that another program can reliably use to get a job done, without having to busy its little head about exactly how the job is done.
Perhaps the simplest thing to do now, if possible, is to go back to the person who "tasked" you with this task, and to ask him/her, "well, what do you have in mind?" The best API that you can design, in this case, will be the one that is most convenient for the people (who are writing the programs which ...) who will actually have to use it. "Don't guess. Ask!"
A very common strategy for an API, in a language like Ruby, is to define a class which represents "this application's connection to this service." Anyone who wants to use the API does so by calling some function which will return a new instance of this class. Thereafter, the program uses this object to issue and handle requests.
The requests, also, are objects. To issue a request, you first ask the API-connection object to give you a new request-object. You then fill-out the request with whatever particulars, then tell the request object to "go!" At some point in the future, and by some appropriate means (such as a callback ...) the request-object informs you that it succeeded or that it failed.
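As a rough illustration of the connection-object / request-object pattern just described, here is a minimal sketch. All names (ServiceConnection, Request, the transport callable) are hypothetical and not taken from any real library; the "service" is faked with a lambda so the example runs on its own.

```ruby
# Hypothetical request object: filled out by the client, then told to "go!".
class Request
  attr_accessor :params

  def initialize(connection)
    @connection = connection
    @params = {}
    @on_success = ->(_result) {}
    @on_failure = ->(_error) {}
  end

  def on_success(&block)
    @on_success = block
    self
  end

  def on_failure(&block)
    @on_failure = block
    self
  end

  # Hands the request to the connection and reports back via the callbacks.
  def go!
    @on_success.call(@connection.perform(params))
  rescue StandardError => e
    @on_failure.call(e)
  end
end

# Hypothetical connection object: "this application's connection to the service".
class ServiceConnection
  # transport is any callable that takes a params hash and returns a result.
  def initialize(transport)
    @transport = transport
  end

  def new_request
    Request.new(self)
  end

  def perform(params)
    @transport.call(params)
  end
end

# Usage, with a fake transport standing in for the real service.
connection = ServiceConnection.new(->(params) { "results for #{params[:q]}" })
request = connection.new_request
request.params[:q] = 'ruby'
request.on_success { |result| puts result }
       .on_failure { |error| warn error.message }
request.go!
```

The client only ever touches new_request, params, the two callbacks and go!; how the connection actually talks to the service stays hidden behind perform.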
"A whole lot of voodoo-magic might have taken place," between the request object and the connection object which spawned it, but the client does not have to care. And that, most of all, is the objective of any API. "It Just Works.™"
I think they want you to create a third-party library. Imagine you are schizophrenic for a while.
Joe wants to build a Sinatra application to list some YouTube videos, but he is lazy and he does not want to do the dirty work, he just wants to drop something in, give it some credentials, ask for urls and use them, finito.
Joe asks Bob to implement it for him and gives him his requirements: "Bob, I need a YouTube library. I need it to do this:"
# Please note that I don't know how YouTube API works, just guessing.
client = YouTube.new(api_key: 'hola')
video_urls = client.videos # => ['https://...', 'https://...', ...]
And Bob says "OK." and spends a day in his interactive console.
So first, you should figure out how you are going to use your not-yet-existing lib, if you can – sometimes you just don't know yet.
Next, build that library based on the requirements, then drop it in your Sinatra app and you're done. Does that help?
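To make the idea concrete, here is one way such a library could be sketched. The class is hypothetical: the endpoint and query parameters are the ones from the question, while the fetcher argument, the videos(query, limit:) signature and the assumption that each search item carries its ID under "id" => "videoId" are mine. The fetcher exists so the class can be exercised without network access.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical library class wrapping the search call from the question.
class YouTube
  SEARCH_URL = 'https://www.googleapis.com/youtube/v3/search'

  # fetcher: any callable that takes a URI and returns the response body.
  def initialize(api_key:, fetcher: ->(uri) { Net::HTTP.get(uri) })
    @api_key = api_key
    @fetcher = fetcher
  end

  # Returns an array of watch URLs for the given search terms.
  def videos(query, limit: 3)
    uri = URI.parse(SEARCH_URL)
    uri.query = URI.encode_www_form(part: 'snippet', q: query, type: 'video', key: @api_key)
    items = JSON.parse(@fetcher.call(uri)).fetch('items', [])
    # Assumption: each item exposes its video ID as item['id']['videoId'].
    items.take(limit).map { |item| "https://www.youtube.com/watch?v=#{item['id']['videoId']}" }
  end
end

# Usage with a canned response instead of a real HTTP call:
fake = ->(_uri) { { 'items' => [{ 'id' => { 'videoId' => 'abc123' } }] }.to_json }
client = YouTube.new(api_key: 'hola', fetcher: fake)
p client.videos('ruby')
```

A Sinatra route would then only need to call client.videos and render the result, which keeps the web layer and the YouTube-specific code separate.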

how do you mock dependent methods using rspec

I'm trying to write a custom parser for my cucumber results. In doing so, I want to write rspec tests around it. What I currently have is as follows:
describe 'determine_test_results' do
  it 'returns a scenario name as the key of the scenario results, with the scenario_line attached' do
    pcr = ParseCucumberJsonReport.new
    expected_results = {"I can login successfully"=>{"status"=>"passed", "scenario_line"=>4}}
    cucumber_results = JSON.parse(IO.read('example_json_reports/json_passing.json'))
    pcr.determine_test_results(cucumber_results[0]).should == expected_results
  end
end
The problem is, determine_test_results has a sub method called determine_step_results, which means this is really an integration test between the 2 methods and not a unit test for determine_test_results.
How would I mock out the "response" from determine_step_results?
Assume determine_step_results returns {"status"=>"passed", "scenario_line"=>4}
what I have tried:
pcr.stub(:determine_step_results).and_return({"status"=>"passed", "scenario_line"=>6})
and
allow(pcr).to receive(:determine_step_results).and_return({"status"=>"passed", "scenario_line"=>6})
You could use stubs for what you're trying to accomplish. The RSpec Mocks 2.3 documentation would be good reading for this particular case. I have added some code below as a suggestion.
describe 'determine_test_results' do
  it 'returns a scenario name as the key of the scenario results, with the scenario_line attached' do
    pcr = ParseCucumberJsonReport.new
    expected_results = {"I can login successfully"=>{"status"=>"passed", "scenario_line"=>4}}
    # every call to determine_step_results on this pcr object returns the canned step result
    pcr.stub!(:determine_step_results).and_return({"status"=>"passed", "scenario_line"=>4})
    cucumber_results = JSON.parse(IO.read('example_json_reports/json_passing.json'))
    pcr.determine_test_results(cucumber_results[0]).should == expected_results
  end
end
If all that determine_test_results does is call determine_step_results, you should not really test it, since it is trivial...
If you do decide to test it, all you need to test is that it calls the delegate function, and returns whatever is passed to it:
describe ParseCucumberJsonReport do
  describe '#determine_test_results' do
    it 'calls determine_step_results' do
      result = double(:result)
      input = double(:input)
      expect(subject).to receive(:determine_step_results).with(input).and_return(result)
      subject.determine_test_results(input).should == result
    end
  end
end
If it is doing anything more (like adding the result to a larger hash) you can describe it too:
describe ParseCucumberJsonReport do
  describe '#determine_test_results' do
    it 'calls determine_step_results' do
      result = double(:result)
      input = double(:input)
      expect(subject).to receive(:determine_step_results).with(input).and_return(result)
      expect(subject.larger_hash).to receive(:merge).with(result)
      subject.determine_test_results(input).should == result
    end
  end
end
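To see what stubbing the dependent method achieves in isolation, here is a plain-Ruby sketch that does not use RSpec at all. The class body is only a guess at the shape of the parser (the real implementation is not shown in the question), and define_singleton_method is used as a stand-in for the framework's stubbing, not as a replacement for it.

```ruby
# Hypothetical stand-in for the real parser; the actual implementation is
# not shown in the question.
class ParseCucumberJsonReport
  def determine_test_results(scenario)
    { scenario['name'] => determine_step_results(scenario['steps']) }
  end

  def determine_step_results(_steps)
    raise NotImplementedError, 'real step parsing would go here'
  end
end

pcr = ParseCucumberJsonReport.new

# Replace the dependent method on this one instance only, which is roughly
# what a stub does for the duration of a single example.
canned = { 'status' => 'passed', 'scenario_line' => 4 }
pcr.define_singleton_method(:determine_step_results) { |_steps| canned }

scenario = { 'name' => 'I can login successfully', 'steps' => [] }
p pcr.determine_test_results(scenario)
```

Because the dependent method now returns a fixed value, the outer method's own logic (here, keying the result by scenario name) is the only thing being exercised.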

Ruby dynamic object creation

Okay, so I am trying to create an object for each email domain found in a text file. So far I have the matching system working, and I have now run into a problem creating the objects on the fly. Here is what I have so far.
# domain = emails domain name (e.g. 'example.com')
# Agency = class for domain
if (domain + "Object").nil? == false
domain = Agency.new(domain + "Object")
@agencyList << domain
domain.addEmail(match)
puts "false"
elsif (domain + "Object").nil? == true
domain.addEmail(match)
puts "true"
end
end
end
So basically I want to check if the email domain already has an object created for it. If it doesn't, create an object using the domain name and send the match to the object's addEmail method. If it does, send the match to the existing object's addEmail method. I don't want to use hashes because I want the matches in separate arrays.
I have tried many things and I think I am in over my head. This is my first ruby script. Any help would be greatly appreciated.
I think you just want to check whether the object is in your agency list. Something like:
agency = @agencyList.find {|a| a.domain == domain }
if agency.nil?
  # no object for this domain yet: create one and remember it
  agency = Agency.new(domain)
  @agencyList << agency
  puts "false"
else
  puts "true"
end
agency.addEmail(match)
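A self-contained version of this find-or-create idea might look like the sketch below. The Agency class here is hypothetical (the question does not show it), and the method is named add_email in Ruby's usual snake_case rather than addEmail. Each agency keeps its matches in its own array, as the question asks.

```ruby
# Hypothetical Agency class: one object per domain, each with its own array.
class Agency
  attr_reader :domain, :emails

  def initialize(domain)
    @domain = domain
    @emails = []
  end

  def add_email(email)
    @emails << email
  end
end

# Returns the existing agency for the domain, or creates and registers one.
def find_or_create_agency(agency_list, domain)
  agency = agency_list.find { |a| a.domain == domain }
  return agency if agency

  agency = Agency.new(domain)
  agency_list << agency
  agency
end

agency_list = []
%w[joe@example.com ann@example.org bob@example.com].each do |email|
  domain = email.split('@').last
  find_or_create_agency(agency_list, domain).add_email(email)
end

agency_list.each { |a| puts "#{a.domain}: #{a.emails.join(', ')}" }
```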

Using rspec to test a method on an object that's a property of an object

If I have a method like this:
require 'tweetstream'
# client is an instance of TweetStream::Client
# twitter_ids is an array of up to 1000 integers
def add_first_users_to_stream(client, twitter_ids)
  # Add the first 100 ids to sitestream.
  client.sitestream(twitter_ids.slice!(0,100))
  # Add any extra IDs individually.
  twitter_ids.each do |id|
    client.control.add_user(id)
  end
  return client
end
I want to use rspec to test that:
client.sitestream is called, with the first 100 Twitter IDs.
client.control.add_user() is called with the remaining IDs.
The second point is trickiest for me -- I can't work out how to stub (or whatever) a method on an object that is itself a property of an object.
(I'm using Tweetstream here, although I expect the answer could be more general. If it helps, client.control would be an instance of TweetStream::SiteStreamClient.)
(I'm also not sure a method like my example is best practice, accepting and returning the client object like that, but I've been trying to break my methods down so that they're more testable.)
This is actually a pretty straightforward situation for RSpec. The following will work, as an example:
describe "add_first_users_to_stream" do
  it "should add ids to client" do
    bulk_add_limit = 100
    twitter_ids = (0..bulk_add_limit+rand(50)).collect { rand(4000) }
    extras = twitter_ids[bulk_add_limit..-1]
    client = double('client')
    expect(client).to receive(:sitestream).with(twitter_ids[0...bulk_add_limit])
    client_control = double('client_control')
    expect(client).to receive(:control).exactly(extras.length).times.and_return(client_control)
    expect(client_control).to receive(:add_user).exactly(extras.length).times.and_return {extras.shift}
    add_first_users_to_stream(client, twitter_ids)
  end
end

Ruby design strategy on using "retry" logic

I am dealing with an API that requires me to do the following (the API can not be changed):
1. Log into the service to use it (login).
2. Use the API method, passing some token info from login and some API-method-specific parameters (each such method invocation is wrapped in its own "api_user" method - see below).
In step 2 though, the api can result in certain exceptions in which case I have to retry the login method and invoke the same api again with new token. Each of the api methods may have additional parameters (apart from the token parameter). Conceptually, if I have already logged in, I have the token now which can be used for some time.
def api_user
  begin
    api_method1 token, x,y,z
  rescue RetryException => e
    # the token has expired: log in again and retry with the new token
    new_token = login
    api_method1 new_token, x,y,z
  end
end
How do I do this elegantly?
Option 1:
For each api_user method - do the above individually
Option 2:
Use ruby's metaprogramming. I have tried to show this below.
class Y
  def self.api_invoker(token,y,z)
    if token == 'old'
      raise "Old token - renew it"
    end
    puts "Token = #{token}"
  end
  def self.call_method(m, *params)
    method = Y.method(m)
    begin
      method.call(*params)
    rescue Exception => e
      if e.message.include? "Old token"
        puts "params before = #{params}"
        params[0] = "new"
        puts "params after = #{params}"
        method.call(*params)
      end
    end
  end
end
If you invoke the above method as follows, the retrial is triggered and the "new" token is passed to the second invocation of the method.
Y.call_method("api_invoker", "old",2,3)
I don't like the design because:
1. It seems a bit complicated, though I prefer it to option 1 since it removes the duplicate retry logic from all API invoker methods.
2. Since Ruby does not give access to parameters by name, I have to enforce the convention of making the token the first parameter in all "api invoker" methods, so that I can replace it with the new token on a retry.
If ruby had a way to access parameters using their names, the above would have been an acceptable design.
Can you suggest a better way?
Thanks!
PS: I could pass all parameters in a hash for each api_invoker and then use the token parameter name to access it regardless of where it is positioned (similar to what is described at http://deepfall.blogspot.com/2008/08/named-parameters-in-ruby.html), but that seems even uglier to me.
Went with option 2 to avoid repeating the try catch block.
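Another way to keep the retry logic in one place, without relying on the token's position in the argument list, is to pass the API call as a block that receives the current token. The sketch below is only an illustration under that assumption: RetryException comes from the question, while ApiSession, with_token and the login callable are hypothetical names.

```ruby
# RetryException comes from the question; everything else is hypothetical.
class RetryException < StandardError; end

class ApiSession
  def initialize(login)
    @login = login # any callable that returns a fresh token
    @token = nil
  end

  # Yields the current token; on RetryException, logs in again and retries
  # the block once with the new token. A second failure is re-raised.
  def with_token
    @token ||= @login.call
    attempts = 0
    begin
      yield @token
    rescue RetryException
      attempts += 1
      raise if attempts > 1

      @token = @login.call
      retry
    end
  end
end

# Usage: a fake login that hands out "old" first, then "new", and a fake
# API method that rejects the old token.
tokens = %w[old new].each
session = ApiSession.new(-> { tokens.next })

api_method1 = lambda do |token, x, y|
  raise RetryException, 'token expired' if token == 'old'

  "called with #{token}, #{x}, #{y}"
end

puts session.with_token { |token| api_method1.call(token, 2, 3) }
```

Because the block decides where the token goes, each API method can take its parameters in any order, and the session object can reuse the token across calls until the service rejects it.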

Resources