How do I mock AWS SDK (v2) with rspec? - ruby

I have a class which reads/processes messages from an SQS queue using the aws-sdk-rails gem (which is a wrapper on aws-sdk-ruby v2). How do I mock the AWS calls so I can test my code without hitting the external services?
communicator.rb:
class Communicator
  def consume_messages
    sqs_client = Aws::SQS::Client.new
    # consume messages until the queue is empty
    loop do
      r = sqs_client.receive_message({
        queue_url: "https://sqs.region.amazonaws.com/xxxxxxxxxxxx/foo",
        visibility_timeout: 1,
        max_number_of_messages: 1
      })
      break if r.messages.empty?
      # process r.messages.first.body
      sqs_client.delete_message({
        queue_url: "https://sqs.region.amazonaws.com/xxxxxxxxxxxx/foo",
        receipt_handle: r.messages.first.receipt_handle
      })
    end
  end
end

The AWS SDK already provides stubbing: a client created with stub_responses: true makes no network calls and returns stubbed data instead. See the official documentation at http://docs.aws.amazon.com/sdkforruby/api/Aws/ClientStubs.html for more information.

I had a hard time finding examples mocking AWS resources. I spent a few days figuring it out and wanted to share my results on Stack Overflow for posterity. I used rspec-mocks (doubles & verifying doubles). Here's an example with the communicator.rb example in the question.
communicator_spec.rb:
RSpec.describe Communicator do
  describe "#consume_messages" do
    it "can use rspec doubles & verifying doubles to mock AWS SDK calls" do
      sqs_client = instance_double(Aws::SQS::Client)
      allow(Aws::SQS::Client).to receive(:new).and_return(sqs_client)
      # lightweight stand-ins for the SDK response objects
      sqs_response = Struct.new(:messages)
      sqs_message = Struct.new(:body, :receipt_handle)
      response = sqs_response.new([sqs_message.new(File.read('data/expected_body.json'), "receipt_handle")])
      empty_response = sqs_response.new([])
      # the first call returns one message, the second returns none, which ends the loop
      allow(sqs_client).to receive(:receive_message).
        and_return(response, empty_response)
      allow(sqs_client).to receive(:delete_message).and_return(nil)
      Communicator.new.consume_messages
      expect(sqs_client).to have_received(:delete_message).once
    end
  end
end
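If rspec-mocks is not available, the same idea can be sketched with a hand-rolled fake. This is only an illustration under assumptions: InjectableCommunicator, FakeSqsClient, and the client: keyword are hypothetical names that appear neither in the question's code nor in the AWS SDK, and the fake mimics only the two calls the consume loop makes.

```ruby
# Hypothetical stand-ins for the SDK response objects (not AWS classes).
FakeResponse = Struct.new(:messages)
FakeMessage  = Struct.new(:body, :receipt_handle)

# Hypothetical fake that answers only the two calls the consumer makes.
class FakeSqsClient
  attr_reader :deleted_handles

  def initialize(bodies)
    @queue = bodies.each_with_index.map { |body, i| FakeMessage.new(body, "handle-#{i}") }
    @deleted_handles = []
  end

  # Returns at most one message, mirroring max_number_of_messages: 1.
  def receive_message(_params)
    FakeResponse.new(@queue.first(1))
  end

  # Records the receipt handle and removes the matching message.
  def delete_message(params)
    @deleted_handles << params[:receipt_handle]
    @queue.reject! { |m| m.receipt_handle == params[:receipt_handle] }
    nil
  end
end

# Variant of the question's class that accepts its client (an assumption).
class InjectableCommunicator
  QUEUE_URL = "https://sqs.region.amazonaws.com/xxxxxxxxxxxx/foo"

  attr_reader :processed

  def initialize(client:)
    @client = client
    @processed = []
  end

  def consume_messages
    loop do
      r = @client.receive_message(queue_url: QUEUE_URL, visibility_timeout: 1, max_number_of_messages: 1)
      break if r.messages.empty?

      message = r.messages.first
      @processed << message.body
      @client.delete_message(queue_url: QUEUE_URL, receipt_handle: message.receipt_handle)
    end
  end
end

fake = FakeSqsClient.new(["first", "second"])
communicator = InjectableCommunicator.new(client: fake)
communicator.consume_messages
puts communicator.processed.inspect
puts fake.deleted_handles.inspect
```

The trade-off is that the class under test has to accept its client from outside, whereas the rspec-mocks version leaves Communicator untouched.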

Related

Save Google Cloud Speech API operation(job) object to retrieve results later

I'm struggling to use the Google Cloud Speech Api with the ruby client (v0.22.2).
I can execute long running jobs and can get results if I use
job.wait_until_done!
but this locks up a server for what can be a long period of time.
According to the API docs, all I really need is the operation name(id).
Is there any way of creating a job object from the operation name and retrieving it that way?
I can't seem to create a functional new job object from the id in @grpc_op.
What I want to do is something like:
speech = Google::Cloud::Speech.new(auth_credentials)
job = speech.recognize_job file, options
saved_job = job.to_json # or some element of that object that lets me retrieve it later
Later, I want to do something like:
job_object = Google::Cloud::Speech::Job.new(saved_job)
job_object.reload!
job_object.done?
job_object.results
Really hoping that makes sense to somebody.
Struggling quite a bit with google's ruby clients on the basis that everything seems to be translated into objects which are much more complex than the ones required to use the API.
Is there some trick that I'm missing here?
You can monkey-patch this functionality to the version you are using, but I would advise upgrading to google-cloud-speech 0.24.0 or later. With those more current versions you can use Operation#id and Project#operation to accomplish this.
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
encoding: :linear16,
language: "en-US",
sample_rate: 16000
op = audio.process
# get the operation's id
id = op.id #=> "1234567890"
# construct a new operation object from the id
op2 = speech.operation id
# verify the jobs are the same
op.id == op2.id #=> true
op2.done? #=> false
op2.wait_until_done!
op2.done? #=> true
results = op2.results
Update: Since you can't upgrade, you can monkey-patch this functionality into the older version using the workaround described in GoogleCloudPlatform/google-cloud-ruby#1214:
require "google/cloud/speech"
require "ostruct"
# Add monkey-patches
module Google
  module Cloud
    module Speech
      class Job
        def id
          @grpc.name
        end
      end
      class Project
        def job id
          Job.from_grpc(OpenStruct.new(name: id), speech.service).refresh!
        end
      end
    end
  end
end
# Use the new monkey-patched methods
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
encoding: :linear16,
language: "en-US",
sample_rate: 16000
job = audio.recognize_job
# get the job's id
id = job.id #=> "1234567890"
# construct a new job object from the id
job2 = speech.job id
# verify the jobs are the same
job.id == job2.id #=> true
job2.done? #=> false
job2.wait_until_done!
job2.done? #=> true
results = job2.results
Ok. Have a very ugly way of solving the issue.
Get the id of the Operation from the job object
operation_id = job.grpc.grpc_op.name
Get an access token to manually use the RestAPI
json_key_io = StringIO.new(ENV["GOOGLE_CLOUD_SPEECH_JSON_KEY"])
authorisation = Google::Auth::ServiceAccountCredentials.make_creds(
  json_key_io: json_key_io,
  scope: "https://www.googleapis.com/auth/cloud-platform"
)
token = authorisation.fetch_access_token!
Make an API call to retrieve the operation details.
Once the results are in, the response includes "done" => true along with the results. If "done" => true isn't there, you'll have to poll again later until it is.
HTTParty.get(
  "https://speech.googleapis.com/v1/operations/#{operation_id}",
  headers: {"Authorization" => "Bearer #{token['access_token']}"}
)
There must be a better way of doing that. Seems such an obvious use case for the speech API.
Anyone from google in the house who can explain a much simpler/cleaner way of doing it?
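The "poll again later" step in this workaround could be wrapped in a small helper. This is a sketch only: poll_operation, its parameters, and the fetch lambda are hypothetical, and the lambda stands in for the HTTParty call so the example runs without network access. The only thing taken from the text above is the check for a "done" => true key in the response.

```ruby
# Hypothetical helper: calls `fetch` until the parsed response contains
# "done" => true, or gives up after `max_attempts` tries.
def poll_operation(fetch, interval: 0, max_attempts: 10)
  max_attempts.times do |attempt|
    response = fetch.call
    return response if response["done"] == true

    # Wait between attempts, but not after the last one.
    sleep(interval) if attempt < max_attempts - 1
  end
  nil # still running after max_attempts; the caller can try again later
end

# Stand-in for the HTTP request: reports completion on the third call.
calls = 0
fake_fetch = lambda do
  calls += 1
  if calls < 3
    { "name" => "1234567890" }
  else
    { "name" => "1234567890", "done" => true, "response" => {} }
  end
end

result = poll_operation(fake_fetch)
puts calls
puts result["done"]
```

The interval is zero here only to keep the demonstration fast. On a server, a single check per run of a scheduled background job is probably a better fit than a loop that sleeps, since the goal was to avoid tying up a process.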

Is it reasonable to use resque(ruby) to manage external long-running commands (and log tasks)

I frequently have to run bash heavy-job.sh <data-num> (which takes 0.5 to 2 days) on my computer to process data located at ~/a/data/num. The script calls a few sub-processes sequentially and writes a log to ~/a/result/num.log. I have done this manually until now.
I wanted to visualize the processed tasks and their status (success or failure, etc.) as an HTML table. I wrote a simple Sinatra app that renders a table showing:
the list of ~/a/data/num to be processed
whether ~/a/result/num.log exists (process not launched/processing/done)
its status (whether the log file contains the word "error")
I found that it would be convenient if I could launch bash heavy-job.sh <data-num> from the Sinatra app, log each task (with info such as time and date) and its arguments (heavy-job.sh takes some optional arguments), and show them as an HTML table.
So I need something that manages jobs and logs them to files (or a database).
First I wrote the code below as a test (only a test, not yet integrated with my system), but later I found that Resque is what I wanted. I am a beginner and not sure whether my decision is reasonable.
My questions are:
Is it reasonable to use Resque to manage external long-running commands (and log tasks),
or should I use another tool (not necessarily a Ruby tool)?
(Extra:) Should the task manager and the Sinatra app run separately (and communicate with each other over REST or something similar) or not?
The jobs are not critical, since I can retry tasks manually later if they fail.
I am not good at English and my question may be misleading. I appreciate any help :) .
class TaskSpawn
  def initialize()
    @pids = []
  end
  def spawn(command, options = {})
    #opt = {:pgroup => true}
    @pids << Kernel.spawn(command, options)
  end
  def pids()
    return @pids.clone
  end
  # returns [pid, status] for the first finished child, or nil if none has finished
  def waitany_nohang()
    delete_idx = nil
    ret = nil
    @pids.each_with_index do |p, idx|
      pid, status = Process.waitpid2(p, Process::WNOHANG)
      unless pid.nil?
        delete_idx = idx
        ret = [pid, status]
        break
      end
    end
    if delete_idx
      @pids.delete_at(delete_idx)
      return ret
    else
      # no task finished
      return nil
    end
  end
  def waitall()
    # Process.waitall blocks until every child process has exited
    ret = Process.waitall
    raise "internal error" if ret.size != pids.size
    return ret
  end
end
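Whichever tool ends up managing the queue, the unit of work described in the question (run the script, capture its output in a log file, record success or failure) can be sketched in plain Ruby. Everything here is an assumption for illustration: run_logged and the fields of the returned record are hypothetical, and a short Ruby one-liner stands in for heavy-job.sh so the snippet finishes immediately.

```ruby
require "rbconfig"
require "tmpdir"

# Hypothetical helper: runs `command` (an array of program and arguments)
# with stdout and stderr appended to `log_path`, waits for it to exit, and
# returns a small record that a status table could be built from.
def run_logged(command, log_path)
  started = Time.now
  pid = Process.spawn(*command, out: [log_path, "a"], err: [:child, :out])
  _, status = Process.wait2(pid)
  {
    pid: pid,
    command: command,
    started_at: started,
    finished_at: Time.now,
    success: status.success?,
    log: log_path
  }
end

Dir.mktmpdir do |dir|
  log = File.join(dir, "1.log")
  # Stand-in for: bash heavy-job.sh 1
  record = run_logged([RbConfig.ruby, "-e", "puts 'processing data 1'"], log)
  puts record[:success]
  puts File.read(log)
end
```

Because the helper blocks until the command exits, it belongs in a background worker rather than in the Sinatra request cycle.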

rails2 ruby geocoder issue when finding the distance

I'm using the Ruby geocoder gem in my Rails 2 app. When a user tries to sign up, I take their IP address and fetch its latitude and longitude using geocoder. Next, I take the address the user entered and fetch its latitude and longitude. Finally, I check the distance between the two. If the distance is greater than 100, I add an error. This happens in a before_create callback (as per the client's requirement). The issue is that geocoder sends continuous requests and then breaks. Below is my code.
I have hard-coded a dummy address and IP for testing purposes.
def validate_address_and_ip
  p "validate_address_and_ip"
  dup_addr = "Level 8, Umiya Business Bay Tower 1,Cessna Business Park, Maratahalli ORR,Sarjapur Ring Road, Kadubeesanahalli,Bangalore"
  ip_based_lat_lng = Geocoder.search("183.82.98.134").map{ |obj| [obj.latitude, obj.longitude] }.flatten #ip_address
  p "ip_based_lat_lng=", ip_based_lat_lng
  addr_results = Geocoder.search(dup_addr) #address.full_address
  addr_based_result = addr_results.first.geometry["location"]
  lat,lang = addr_based_result["lat"], addr_based_result["lng"]
  dist_in_miles = Geocoder::Calculations.distance_between(ip_based_lat_lng, [lat, lang]).round
  puts "***************#{dist_in_miles}*******************************"
  if 300 > 100
    p "*****************in if condition**********************"
    self.errors.add_to_base('zip code and IP mismatch.')
    return false
    puts "***************#{self.errors}*******************************"
  end
end
Error: Numerical argument out of domain - sqrt
In the log I am getting:
"validate_address_and_ip"
"ip_based_lat_lng="
[17.3753, 78.4744]
******311***************
"*******in if condition***********"
"validate_address_and_ip"
Geocoding API not responding fast enough (use Geocoder.configure(:timeout => ...) to set limit).
"ip_based_lat_lng="
[]
Why is that method getting called multiple times? Can anyone tell me what the issue is?

How to test deferred action - EventMachine

I have a Sinatra app that runs inside of EventMachine. Currently, I am taking a post request of JSON data, deferring storage, and returning a 200 OK status code. The deferred task simply pushes the data to a queue and increments a stats counter. The code is similar to:
class App < Sinatra::Base
  ...
  post '/' do
    json = request.body.read
    operation = lambda do
      push_to_queue(json)
      incr_incoming_stats
    end
    # the callback is passed the operation's return value, so it must accept one argument
    callback = lambda { |_result| }
    EM.defer(operation, callback)
  end
  ...
end
My question is: how do I test this functionality? If I use Rack::Test::Methods, I have to put in something like sleep 1 to make sure the deferred task has completed before checking the queue and stats, so my test may look like:
it 'should push data to queue with valid request' do
  post('/', @json)
  sleep 1
  @redis.llen("#{@opts[:redis_prefix]}-queue").should > 0
end
Any help is appreciated!
The solution was pretty simple and once I realized it, I felt kind of silly. I created a test-helper that contained the following:
module EM
  def self.defer(op, callback)
    callback.call(op.call)
  end
end
Then just include this in your test files. This way the defer method runs the operation and the callback on the same thread.
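To see why this helper removes the need for sleep, here is a self-contained sketch. It defines a stand-in EM module instead of loading the eventmachine gem (an assumption made so the snippet runs anywhere), and queue and stats are hypothetical substitutes for the Redis queue and the stats counter.

```ruby
# Stand-in for EventMachine, defined here only so the sketch runs without the gem.
module EM
  # Test-only replacement: run the operation and then the callback
  # immediately on the calling thread instead of on a worker thread.
  def self.defer(op, callback = nil)
    result = op.call
    callback.call(result) if callback
  end
end

queue = []              # hypothetical substitute for the Redis list
stats = { incoming: 0 } # hypothetical substitute for the stats counter

operation = lambda do
  queue << '{"id": 1}'
  stats[:incoming] += 1
end
callback = lambda { |_result| }

EM.defer(operation, callback)

# No sleep needed: the side effects are visible as soon as defer returns.
puts queue.length
puts stats[:incoming]
```

Note that the callback lambda takes one parameter. Ruby lambdas enforce arity, so a zero-argument lambda would raise ArgumentError when called with the operation's result.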

Ruby: Dynamically defining classes based on user input

I'm creating a library in Ruby that allows the user to access an external API. That API can be accessed via either a SOAP or a REST API. I would like to support both.
I've started by defining the necessary objects in different modules. For example:
soap_connection = Library::Soap::Connection.new(username, password)
response = soap_connection.create Library::Soap::LibraryObject.new(type, data, etc)
puts response.class # Library::Soap::Response
rest_connection = Library::Rest::Connection.new(username, password)
response = rest_connection.create Library::Rest::LibraryObject.new(type, data, etc)
puts response.class # Library::Rest::Response
What I would like to do is allow the user to specify that they only wish to use one of the APIs, perhaps something like this:
Library::Modes.set_mode(Library::Modes::Rest)
rest_connection = Library::Connection.new(username, password)
response = rest_connection.create Library::LibraryObject.new(type, data, etc)
puts response.class # Library::Response
However, I have not yet discovered a way to dynamically set, for example, Library::Connection based on the input to Library::Modes.set_mode. What would be the best way to implement this functionality?
Murphy's law prevails: I found an answer right after posting the question to Stack Overflow.
This code seems to have worked for me:
module Library
  class Modes
    Rest = 1
    Soap = 2
    def self.set_mode(mode)
      case mode
      when Rest
        Library.const_set "Connection", Class.new(Library::Rest::Connection)
        Library.const_set "LibraryObject", Class.new(Library::Rest::LibraryObject)
      when Soap
        Library.const_set "Connection", Class.new(Library::Soap::Connection)
        Library.const_set "LibraryObject", Class.new(Library::Soap::LibraryObject)
      else
        # raise, not throw: throw is for non-local exits paired with catch
        raise ArgumentError, "#{mode.to_s} is not a valid Library::Mode"
      end
    end
  end
end
A quick test:
Library::Modes.set_mode(Library::Modes::Rest)
puts Library::Connection.superclass == Library::Rest::Connection # true
c = Library::Connection.new(username, password)
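A self-contained way to try this out, with placeholder classes standing in for the real SOAP and REST implementations. The empty Connection and LibraryObject classes and the remove_const guard are additions for illustration, not part of the answer above. The guard is there because calling const_set on a constant that is already defined produces an "already initialized constant" warning when the mode is switched.

```ruby
module Library
  # Placeholder implementations; the real ones would talk to the API.
  module Rest
    class Connection; end
    class LibraryObject; end
  end

  module Soap
    class Connection; end
    class LibraryObject; end
  end

  class Modes
    Rest = 1
    Soap = 2

    def self.set_mode(mode)
      namespace =
        case mode
        when Rest then Library::Rest
        when Soap then Library::Soap
        else raise ArgumentError, "#{mode.inspect} is not a valid Library::Modes value"
        end

      %i[Connection LibraryObject].each do |name|
        # Drop the class left by a previous call to avoid a redefinition warning.
        Library.send(:remove_const, name) if Library.const_defined?(name, false)
        Library.const_set(name, Class.new(namespace.const_get(name)))
      end
    end
  end
end

Library::Modes.set_mode(Library::Modes::Rest)
puts Library::Connection.superclass == Library::Rest::Connection

Library::Modes.set_mode(Library::Modes::Soap)
puts Library::Connection.superclass == Library::Soap::Connection
```

Objects created before a mode switch keep their old class, so in practice the mode is best set once at startup.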
