require_relative 'config/environment'
HTTP_ERRORS = [
RestClient::Exception
]
module API
class Client
def initialize
#client = RawClient.new
end
def search(params = {})
call { #client.search(params) }
end
def call
raise 'No block specified' unless block_given?
loop do # Keep retrying on error
begin
return yield
rescue *HTTP_ERRORS => e
puts "#{e.response&.request.url}"
sleep 5
end
end
end
end
class RawClient
BASE_URL = 'https://www.google.com'
def search(params = {})
go "search/#{params.delete(:query)}", params
end
private
def go(path, params = {})
RestClient.get(BASE_URL + '/' + path, params: params)
end
end
end
API::Client.new.search(query: 'tulips', per_page: 10)
Will output
https://www.google.com/search/tulips?per_page=10 # First time
https://www.google.com/search/?per_page=10 # On retry
I thought I was being clever here: have a flexible and unified way to pass parameters (ie. search(query: 'tulips', per_page: 10)) and let the client implementation figure out what goes into the url itself (ie. query) and what should be passed as GET parameters (ie. per_page).
But the query param is lost from the params after the first retry, because the hash is passed by reference and delete makes a permanent change to it. The second time yield is called, it apparently preserves the context and params won't have the deleted query anymore in it.
What would be an elegant way to solve this? Doing call { #client.search(params.dup) } seems a bit excessive.
Related
I'm trying to pass some context (binding? To a block, since I'm wrapping a block into another. No idea how to do this.
Here is the code that demonstrates this. The problem happens at the wrap - when I don't wrap the proc gets the context as it should.
require 'sinatra'
class MyWebApp < Sinatra::Base
##help = {}
def processing_policy(policytag)
## do special stuff here that might end in halt()
end
def self.api_endpoint(http_method, uri, policytag, helptext)
##helptext[uri] = { policy: policytag, help: helptext }
if policytag.nil?
## It's an open endpoint. Create as-is. This part works
send(http_method, uri, &block)
else
## This is an endpoint with policy processing
send(http_method, uri) do |*args|
processing_policy(uri,policytag,request)
# I probably need to do some kind of binding passthru for passed block
# How do I do that???
block.call(*args) # Block doesn't get context things like request etc
end
end
end
api_endpoint(:post, '/open_endpoint', nil, 'Some open endpoint') do
"Anyone can view this - you posted #{request.body.read}"
end
api_endpoint(:post, '/close_endpoint', 'mypolicytag', 'Closed endpoint') do
"This is closed = #{request.body.read}"
# Doesn't work - block.call doesn't know about request since
# it doesn't have context
end
api_endpoint(:get, '/help', nil, "Help") do
"Help:\n\n" +
##help.map do |uri, data|
" #{uri} - policytag: #{data[:policy]} - #{data[:help]}\n"
end.join()
end
end
run MyWebApp
Any insights?
OK so I found the answer.
Instead of block.call(*args) I can use
instance_exec(*args, &block) and it works.
I have an around action_action called set_current_user
def set_current_user
CurrentUser.set(current_user) do
yield
end
end
In the CurrentUser singleton
def set(user)
self.user = user
yield
ensure
self.user = nil
end
I cannot figure out how to stub out the yield and the not have the ensure part of the method called
Ideally I would like to do something like
it 'sets the user' do
subject.set(user)
expect(subject.user).to eql user
end
Two errors I am getting
No block is given
When I do pass a block self.user = nil gets called
Thanks in advance
A few things to point out that might help:
ensure is reserved for block of codes that you want to run no matter what happens, hence the reason why your self.user will always be nil. I think what you want is to assign user to nil if there's an exception. In this case, you should be using rescue instead.
def set(user)
self.user = user
yield
rescue => e
self.user = nil
end
As for the unit test, what you want is to be testing only the .set method in the CurrentUser class. Assuming you have everything hooked up correctly in your around filter, here's a sample that might work for you:
describe CurrentUser do
describe '.set' do
let(:current_user) { create(:user) }
subject do
CurrentUser.set(current_user) {}
end
it 'sets the user' do
subject
expect(CurrentUser.user).to eq(current_user)
end
end
end
Hope this helps!
I am not sure what you intend to accomplish with this as it appears you just want to make sure that user is set in the block and unset afterwards. If this is the case then the following should work fine
class CurrentUser
attr_accessor :user
def set(user)
self.user = user
yield
ensure
self.user = nil
end
end
describe '.set' do
subject { CurrentUser.new }
let(:user) { OpenStruct.new(id: 1) }
it 'sets user for the block only' do
subject.set(user) do
expect(subject.user).to eq(user)
end
expect(subject.user).to be_nil
end
end
This will check that inside the block (where yield is called) that subject.user is equal to user and that afterwards subject.user is nil.
Output:
.set
sets user for the block only
Finished in 0.03504 seconds (files took 0.14009 seconds to load)
1 example, 0 failures
I failed to mention I need to clear out the user after every request.
This is what I came up with. Its kinda crazy to put the expectation inside of the lambda but does ensure the user is set prior to the request being processed and clears it after
describe '.set' do
subject { described_class }
let(:user) { OpenStruct.new(id: 1) }
let(:user_expectation) { lambda{ expect(subject.user).to eql user } }
it 'sets the user prior to the block being processed' do
subject.set(user) { user_expectation.call }
end
context 'after the block has been processed' do
# This makes sure the user is always cleared after a request
# even if there is an error and sidekiq will never have access to it.
before do
subject.set(user) { lambda{} }
end
it 'clears out the user' do
expect(subject.user).to eql nil
end
end
end
# users_show_controller.rb
class Controllers::Users::Show
include Hanami::Action
params do
required(:id).filled(:str?)
end
def call(params)
result = users_show_interactor(id: params[:id])
halt 404 if result.failure?
#user = result.user
end
end
# users_show_interactor.rb
class Users::Show::Interactor
include Hanami::Interactor
expose :user
def call(:id)
#user = UserRepository.find_by(:id)
end
end
I have a controller and a interactor like above.
And I'm considering the better way to distinguish ClientError from ServerError, on the controller.
I think It is nice if I could handle an error like below.
handle_exeption StandardError => :some_handler
But, hanami-interactor wraps errors raised inside themselves and so, controller receive errors through result object from interactor.
I don't think that re-raising an error on the controller is good way.
result = some_interactor.call(params)
raise result.error if result.failure
How about implementing the error handler like this?
I know the if statement will increase easily and so this way is not smart.
def call(params)
result = some_interactor.call(params)
handle_error(result.error) if result.faulure?
end
private
def handle_error(error)
return handle_client_error(error) if error.is_a?(ClientError)
return server_error(error) if error.is_a?(ServerError)
end
Not actually hanami-oriented way, but please have a look at dry-monads with do notation. The basic idea is that you can write the interactor-like processing code in the following way
def some_action
value_1 = yield step_1
value_2 = yield step_2(value_1)
return yield(step_3(value_2))
end
def step_1
if condition
Success(some_value)
else
Failure(:some_error_code)
end
end
def step_2
if condition
Success(some_value)
else
Failure(:some_error_code_2)
end
end
Then in the controller you can match the failures using dry-matcher:
matcher.(result) do |m|
m.success do |v|
# ok
end
m.failure :some_error_code do |v|
halt 400
end
m.failure :some_error_2 do |v|
halt 422
end
end
The matcher may be defined in the prepend code for all controllers, so it's easy to remove the code duplication.
Hanami way is validating input parameters before each request handler. So, ClientError must be identified always before actions logic.
halt 400 unless params.valid? #halt ClientError
#your code
result = users_show_interactor(id: params[:id])
halt 422 if result.failure? #ServerError
halt 404 unless result.user
#user = result.user
I normally go about by raising scoped errors in the interactor, then the controller only has to rescue the errors raised by the interactor and return the appropriate status response.
Interactor:
module Users
class Delete
include Tnt::Interactor
class UserNotFoundError < ApplicationError; end
def call(report_id)
deleted = UserRepository.new.delete(report_id)
fail_with!(UserNotFoundError) unless deleted
end
end
end
Controller:
module Api::Controllers::Users
class Destroy
include Api::Action
include Api::Halt
params do
required(:id).filled(:str?, :uuid?)
end
def call(params)
halt 422 unless params.valid?
Users::Delete.new.call(params[:id])
rescue Users::Delete::UserNotFoundError => e
halt_with_status_and_error(404, e)
end
end
end
fail_with! and halt_with_status_and_error are helper methods common to my interactors and controllers, respectively.
# module Api::Halt
def halt_with_status_and_error(status, error = ApplicationError)
halt status, JSON.generate(
errors: [{ key: error.key, message: error.message }],
)
end
# module Tnt::Interactor
def fail_with!(exception)
#__result.fail!
raise exception
end
I am building a simple web spider using Sidekiq and Mechanize.
When I run this for one domain, it works fine. When I run it for multiple domains, it fails. I believe the reason is that web_page gets overwritten when instantiated by another Sidekiq worker, but I am not sure if that's true or how to fix it.
# my scrape_search controller's create action searches on google.
def create
#scrape = ScrapeSearch.build(keywords: params[:keywords], profession: params[:profession])
agent = Mechanize.new
scrape_search = agent.get('http://google.com/') do |page|
search_result = page.form...
search_result.css("h3.r").map do |link|
result = link.at_css('a')['href'] # Narrowing down to real search results
#domain = Domain.new(some params)
ScrapeDomainWorker.perform_async(#domain.url, #domain.id, remaining_keywords)
end
end
end
I'm creating a Sidekiq job per domain. Most of the domains I'm looking for should contain just a few pages, so there's no need for sub-jobs per page.
This is my worker:
class ScrapeDomainWorker
include Sidekiq::Worker
...
def perform(domain_url, domain_id, keywords)
#domain = Domain.find(domain_id)
#domain_link = #domain.protocol + '://' + domain_url
#keywords = keywords
# First we scrape the homepage and get the first links
#domain.to_parse = ['/'] # to_parse is an array of PATHS to parse for the domain
mechanize_path('/')
#domain.verified << '/' # verified is an Array field containing valid domain paths
get_paths(#web_page) # Now we should have to_scrape populated with homepage links
#domain.scraped = 1 # Loop counter
while #domain.scraped < 100
#domain.to_parse.each do |path|
#domain.to_parse.delete(path)
#domain.scraped += 1
mechanize_path(path) # We create a Nokogiri HTML doc with mechanize for the valid path
...
get_paths(#web_page) # Fire this to repopulate to_scrape !!!
end
end
#domain.save
end
def mechanize_path(path)
agent = Mechanize.new
begin
#web_page = agent.get(#domain_link + path)
rescue Exception => e
puts "Mechanize Exception for #{path} :: #{e.message}"
end
end
def get_paths(web_page)
paths = web_page.links.map {|link| link.href.gsub((#domain.protocol + '://' + #domain.url), "") } ## This works when I scrape a single domain, but fails with ".gsub for nil" when I scrape a few domains.
paths.uniq.each do |path|
#domain.to_parse << path
end
end
end
This works when I scrape a single domain, but fails with .gsub for nil for web_page when I scrape a few domains.
You can wrap you code in another class, and then create and object of that class within your worker:
class ScrapeDomainWrapper
def initialize(domain_url, domain_id, keywords)
# ...
end
def mechanize_path(path)
# ...
end
def get_paths(web_page)
# ...
end
end
And your worker:
class ScrapeDomainWorker
include Sidekiq::Worker
def perform(domain_url, domain_id, keywords)
ScrapeDomainWrapper.new(domain_url, domain_id, keywords)
end
end
Also, bear in mind that Mechanize::Page#links may be a nil.
I'm reading a Redis set within an EventMachine reactor loop using a suitable Redis EM gem ('em-hiredis' in my case) and have to check if some Redis sets contain members in a cascade. My aim is to get the name of the set which is not empty:
require 'eventmachine'
require 'em-hiredis'
def fetch_queue
#redis.scard('todo').callback do |scard_todo|
if scard_todo.zero?
#redis.scard('failed_1').callback do |scard_failed_1|
if scard_failed_1.zero?
#redis.scard('failed_2').callback do |scard_failed_2|
if scard_failed_2.zero?
#redis.scard('failed_3').callback do |scard_failed_3|
if scard_failed_3.zero?
EM.stop
else
queue = 'failed_3'
end
end
else
queue = 'failed_2'
end
end
else
queue = 'failed_1'
end
end
else
queue = 'todo'
end
end
end
EM.run do
#redis = EM::Hiredis.connect "redis://#{HOST}:#{PORT}"
# How to get the value of fetch_queue?
foo = fetch_queue
puts foo
end
My question is: how can I tell EM to return the value of 'queue' in 'fetch_queue' to use it in the reactor loop? a simple "return queue = 'todo'", "return queue = 'failed_1'" etc. in fetch_queue results in "unexpected return (LocalJumpError)" error message.
Please for the love of debugging use some more methods, you wouldn't factor other code like this, would you?
Anyway, this is essentially what you probably want to do, so you can both factor and test your code:
require 'eventmachine'
require 'em-hiredis'
# This is a simple class that represents an extremely simple, linear state
# machine. It just walks the "from" parameter one by one, until it finds a
# non-empty set by that name. When a non-empty set is found, the given callback
# is called with the name of the set.
class Finder
def initialize(redis, from, &callback)
#redis = redis
#from = from.dup
#callback = callback
end
def do_next
# If the from list is empty, we terminate, as we have no more steps
unless #current = #from.shift
EM.stop # or callback.call :error, whatever
end
#redis.scard(#current).callback do |scard|
if scard.zero?
do_next
else
#callback.call #current
end
end
end
alias go do_next
end
EM.run do
#redis = EM::Hiredis.connect "redis://#{HOST}:#{PORT}"
finder = Finder.new(redis, %w[todo failed_1 failed_2 failed_3]) do |name|
puts "Found non-empty set: #{name}"
end
finder.go
end