Should I be using EM::Synchrony::Multi or EM::Synchrony::FiberIterator with Goliath? - ruby

Maybe this is the wrong approach, but I'm trying to parallelize em-hiredis puts and lookups in Goliath with EM::Synchrony::Multi or EM::Synchrony::FiberIterator. However, I can't seem to access basic values initialized in the config. I keep getting method_missing errors.
Here's the basic watered down version of what I'm trying to do:
/lib/config/try.rb
config['redisUri'] = 'redis://localhost:6379/0'
config['redis_db'] ||= EM::Hiredis.connect
config['user_agent'] = "MyCrawler Mozilla/5.0 Compat etc."
Here's the basic Goliath Setup
/try.rb
require "goliath"
require "em-hiredis"
require "em-synchrony/fiber_iterator"
require "em-synchrony/em-hiredis"
require "em-synchrony/em-multi"
class Try < Goliath::API
use Goliath::Rack::Params
use Goliath::Rack::DefaultMimeType
def response(env)
case env['REQUEST_PATH']
when "/start" then
start_crawl()
body = "STARTING"
[200, {}, body]
end
end
def start_crawl
urls = ["http://www.example.com/",
"http://www.example.com/photos/",
"http://www.example.com/video/",
]
EM::Synchrony::FiberIterator.new(urls, 3).each do |url|
p "#{user_agent}"
redis_db.sadd 'test_queue', url
end
# multi = EM::Synchrony::Multi.new
# urls.each_with_index do |url, index|
# p "#{user_agent}"
# multi.add index, redis_db.sadd('test_queue', url)
# end
end
end
However, I keep getting errors where Goliath doesn't know what user_agent or redis_db is, even though both were initialized in the config.
[936:INFO] 2012-09-21 23:47:10 :: Starting server on 0.0.0.0:9000 in development mode. Watch out for stones.
/Users/ewu/.rvm/gems/ruby-1.9.3-p194@crawler/gems/goliath-1.0.0/lib/goliath/api.rb:143:in `method_missing': undefined local variable or method `user_agent' for #<Try:0x007ff5a431c4e0 @opts={}> (NameError)
from ./lib/try.rb:27:in `block in start_crawl'
from /Users/ewu/.rvm/gems/ruby-1.9.3-p194@crawler/gems/em-synchrony-1.0.2/lib/em-synchrony/fiber_iterator.rb:10:in `call'
from /Users/ewu/.rvm/gems/ruby-1.9.3-p194@crawler/gems/em-synchrony-1.0.2/lib/em-synchrony/fiber_iterator.rb:10:in `block (2 levels) in each'
...
...
...
Ideally I'd be able to get FiberIterator working, because I have additional conditionals to check for:
EM::Synchrony::FiberIterator.new(urls, 3).each do |new_url|
is_member = redis_db.sismember('crawled_urls', new_url)
is_member += redis_db.sismember('queued_urls', new_url)
if is_member == 0
redis_db.lpush 'crawl_queue', new_url
redis_db.sadd 'queued_urls', new_url
end
end

I don't think your config file is getting loaded. The name of the API file (try.rb here) needs to match the name of its config file in the config directory, so a config file named robojin.rb will not be picked up for try.rb.

Related

Automatic Airbrake errors with plain Ruby (no Rails or Sinatra)

Is there a way to integrate Airbrake with a pure Ruby project (not rails or sinatra) so that unanticipated errors get reported? I have it set up and I am able to catch errors by calling Airbrake.notify_or_ignore and passing in the exception, but I can't get it to report errors without explicitly calling this.
The following is the code that works for explicitly calling Airbrake.notify but doesn't work for sending errors to Airbrake without explicitly calling notify:
require 'airbrake'
Airbrake.configure do |config|
config.api_key = ENV['AIRBRAKE_API_KEY']
config.development_environments = []
config.ignore_only = []
end
I tried adding Rack as a middleware with the following code:
require 'rack'
require 'airbrake'
Airbrake.configure do |config|
config.api_key = ENV['AIRBRAKE_API_KEY']
config.development_environments = []
config.ignore_only = []
end
app = Rack::Builder.app do
run lambda { |env| raise "Rack down" }
end
use Airbrake::Rack
run app
But I get an "undefined method `use' for main:Object (NoMethodError)"
Any thoughts?
Copied from Mark's comment's link to airbrake for future googlers:
# code at http://gist.github.com/3350
# tests at http://gist.github.com/3354
class Airbrake < ActiveResource::Base
self.site = "http://your_account.airbrake.io"
class << self
@@auth_token = 'your_auth_token'
def find(*arguments)
arguments = append_auth_token_to_params(*arguments)
super(*arguments)
end
def append_auth_token_to_params(*arguments)
opts = arguments.last.is_a?(Hash) ? arguments.pop : {}
opts = opts.has_key?(:params) ? opts : opts.merge(:params => {})
opts[:params] = opts[:params].merge(:auth_token => @@auth_token)
arguments << opts
arguments
end
end
end
class Error < Airbrake
end
# Errors are paginated. You get 30 at a time.
@errors = Error.find :all
@errors = Error.find :all, :params => { :page => 2 }
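For the original question of reporting unanticipated errors from a plain Ruby script, one possible approach is to wrap the script's entry point so that anything unhandled is passed to the notifier and then re-raised. The sketch below is only an illustration: `with_error_reporting` and `notifier` are hypothetical names, and the notifier is a stand-in lambda so the example runs without the airbrake gem.

```ruby
# Run a block and hand any unhandled StandardError to a notifier before
# re-raising it. `with_error_reporting` and `notifier` are hypothetical names.
def with_error_reporting(notifier)
  yield
rescue StandardError => e
  notifier.call(e)
  raise
end

# Stand-in notifier that just records messages. In a real script this could be
#   lambda { |e| Airbrake.notify_or_ignore(e) }
# which is the call the question already uses explicitly.
reported = []
notifier = lambda { |e| reported << e.message }

begin
  with_error_reporting(notifier) { raise "Rack down" }
rescue RuntimeError
  # The error still propagates to the caller after being reported.
end

puts reported.inspect
```

The design choice here is to re-raise after notifying, so the script still fails the same way it would have without the wrapper.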

Wrong number of arguments on Rack::Builder.new

I apparently have a Rack::Builder misunderstanding. Inside my config.ru file i've got:
require 'rack'
require 'rack/lobster'
class Shrimp
SHRIMP_STRING = 'teste'
def initialize(app)
@app = app
end
def call(env)
status, headers, response = @app.call(env)
response_body = ""
response.each { |part| response_body += part }
response_body += "<pre>#{SHRIMP_STRING}</pre>"
headers["Content-Length"] = response_body.length.to_s
[status, headers, response_body]
end
end
app = Rack::Builder.new do
use Rack::Lobster
run Shrimp.new
end
Rack::Handler::WEBrick.run app
When I do a
rackup config.ru
I get a
/home/vagrant/config.ru:7:in `initialize': wrong number of arguments (0 for 1) (ArgumentError)
from /home/vagrant/config.ru:26:in `new'
Am I missing something? According to this tutorial Rack::Builder.new only receives a block as a parameter.
EDIT:
changing this line
run Shrimp.new
to:
run Shrimp
I still get a wrong number of arguments, but this time for Rack::Builder
ERROR ArgumentError: wrong number of arguments (1 for 0)
/home/vagrant/.rbenv/versions/2.0.0-p353/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/builder.rb:86:in `initialize'
For Rack middleware, you don't need to call Shrimp.new yourself. You just need use Shrimp, and Rack will instantiate it with the next app in the chain.
You can find an example of this here.
As per that link, you only need to do the following:
# config.ru
require 'rack'
require 'rack/lobster'
require 'shrimp'
use Shrimp
run Rack::Lobster.new
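To make the middleware contract concrete, here is a minimal sketch that runs without the rack gem. It shows that `use Shrimp` amounts to the builder calling `Shrimp.new(next_app)` for you, which is why calling `Shrimp.new` with no arguments fails. The `inner_app` lambda is a hypothetical stand-in for Rack::Lobster, and the body is returned as an array because a Rack body is expected to respond to `each`.

```ruby
# Middleware in the Rack style: initialize receives the next app in the chain,
# and call(env) returns [status, headers, body].
class Shrimp
  SHRIMP_STRING = 'teste'

  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, response = @app.call(env)
    response_body = String.new
    response.each { |part| response_body << part }
    response_body << "<pre>#{SHRIMP_STRING}</pre>"
    # Content-Length counts bytes, not characters.
    headers = headers.merge("Content-Length" => response_body.bytesize.to_s)
    [status, headers, [response_body]]
  end
end

# Hypothetical stand-in for Rack::Lobster so the sketch runs without rack.
inner_app = lambda { |env| [200, { "Content-Type" => "text/html" }, ["lobster"]] }

# Roughly what `use Shrimp` followed by `run inner_app` wires up.
app = Shrimp.new(inner_app)
status, headers, body = app.call({})
puts status
puts body.join
```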

Rails 4: Undefined method on module

I have a module in app/misc/dsl/builder.rb that has this code
module Dsl
class Builder
def initialize(context, &block)
return if not block_given?
parent_context = block.binding.eval "self"
parent_context.extend Proxy
parent_context.object = context
parent_context.instance_eval &block
end
end
def self.set_context(context, &block)
Dsl::Builder.new(context, &block)
end
end
Note: this directory misc is preloaded in application.rb
config.autoload_paths += Dir[Rails.root.join('app', 'models', '{**/}'),
Rails.root.join('app', 'misc', '{**/}')
]
Then, somewhere in the code (let's say in foo.rb), I have this:
Dsl.set_context(obj) do
#some code with obj receiving messages
end
The test stack we are using consists of Zeus+Guard+Rspec. Now, let's say I rewrite the code into something that doesn't work:
Dsl.set_context(obj) do
asdqwe # this message does not exist
end
From time to time, I receive this baffling message:
1) SomeOtherClass search_hash receiving keywords params should query for those keywords
Failure/Error: subject.search_hash
NoMethodError:
undefined method `set_context' for Dsl:Module
# ./app/misc/product_query.rb:116:in `base_search_hash'
# ./app/misc/product_query.rb:25:in `search_hash'
# ./spec/misc/product_query_spec.rb:78:in `block (4 levels) in <top (required)>'
# -e:1:in `<main>'
instead of the correct message that should be regarding undefined method asdqwe
Any clue about this?
Look here
it says:
Rails 3 has been updated such that classes/modules (henceforth, C/M)
are lazy loaded from the autoload paths as they are needed
So you can do require_relative 'app/misc/dsl/builder.rb' in your rspec_helper.rb (a plain require might be better). The problem must be that the loader doesn't know in advance where to find Dsl.set_context, but it will know once you have referenced Dsl::Builder, because that reference is what makes it load builder.rb.
Hope it helps

NoMethodError: undefined method `split' for #<Proc: ...> with Faraday

I want to send a GET request with a JSON body (for search) using Faraday, but am getting the above error. I thought that self inside the Proc was messing things up, but that had nothing to do with it. I'm following the documentation on the Faraday GitHub page but have gotten stuck on this.
def perform_query
response = self.database.connection.get do |request|
request.url self.path
request.headers['Content-Type'] = 'application/json'
request.body(self.to_json)
end
end
def terms_to_json
terms_array = self.terms.keys.inject([]) do |terms_array, field|
value = self.terms[field]
terms_array.tap do |ary|
if value
ary << "\"#{field}\": \"#{value}\""
end
end
end
"{ #{terms_array.join ','} }"
end
def to_json
"{ \"queryb\" : #{self.terms_to_json} }"
end
Here is the stack trace, with the error coming somewhere in the get Proc in #perform_query :
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/request.rb:60:in `url'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/connection.rb:219:in `block in run_request'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/connection.rb:237:in `block in build_request'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/request.rb:35:in `block in create'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/request.rb:34:in `tap'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/request.rb:34:in `create'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/connection.rb:233:in `build_request'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/connection.rb:218:in `run_request'
from /Users/chrismaddox/.rvm/gems/ruby-1.9.3-p125/gems/faraday-0.8.1/lib/faraday/connection.rb:87:in `get'
from /Users/chrismaddox/Dropbox/LivingSocial/Hungry Academy/Projects/hackchat/search_ruby/elastic.rb:83:in `method_missing'
from /Users/chrismaddox/Dropbox/LivingSocial/Hungry Academy/Projects/hackchat/search_ruby/elastic.rb:112:in `perform_query'
from /Users/chrismaddox/Dropbox/LivingSocial/Hungry Academy/Projects/hackchat/search_ruby/elastic.rb:61:in `send_query'
UPDATE:
the path method returns a string of the path to the search for a given index. Eg /wombats/animals/_search
Elastic::Database#path calls Elastic::Index#index_path:
module Elastic
ELASTIC_URL = "http://localhost:9200"
class Index
attr_reader :index_name, :type_name, :last
def initialize(type)
@index_name = "#{type}-index"
@type_name = type
@last = 0
add_to_elastic
end
def add_to_elastic
index_url = URI.parse "#{ELASTIC_URL}#{index_path}/"
Connection.new(index_url).put()
end
def index_path
"/#{self.index_name}"
end
def search_path
"#{type_path}/_search/"
end
def type_path
"#{self.index_path}/#{type_name}/"
end
end
end
A call to search_path = "#{type_path}/_search/"
A call to type_path = "#{self.index_path}/#{type_name}/"
A call to index_path = "/#{self.index_name}"
So if index name is wombat and type name is animal, search_path evaluates to /wombat/animal//_search
It turns out that the path wasn't what caused the error. The cause was that Faraday's request methods are inconsistent: Faraday::Request#url and Faraday::Request#headers are themselves setter methods, whereas Faraday::Request#body= is the setter method for body. So the block needs request.body = self.to_json rather than request.body(self.to_json).
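The reader/writer distinction can be seen without Faraday. In the sketch below, `FakeRequest` is a hypothetical stand-in (a plain Struct, not Faraday's class) whose `body` reader takes no arguments, so only `body=` assigns a value. It also builds the payload with Ruby's standard json library instead of string concatenation; the `"query"` key and the terms are illustrative assumptions.

```ruby
require 'json'

# Hypothetical stand-in for a request object: `body` reads, `body=` writes.
FakeRequest = Struct.new(:headers, :body)

request = FakeRequest.new({}, nil)

begin
  request.body('{"a": 1}')   # calling the reader with an argument
rescue ArgumentError => e
  puts "reader rejected the argument: #{e.class}"
end

# Keep only terms that have a value, then let the json library do the quoting.
terms = { "name" => "wombat", "color" => nil }
payload = JSON.generate({ "query" => terms.select { |_field, value| value } })

request.headers['Content-Type'] = 'application/json'
request.body = payload       # the setter form
puts request.body
```

Generating the JSON with the library also avoids escaping problems when a term contains quotes.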

Having 'allocator undefined for Data' when saving with ActiveResource

What am I missing? I am trying to use a REST service with ActiveResource, and I have the following:
class User < ActiveResource::Base
self.site = "http://localhost:3000/"
self.element_name = "users"
self.format = :json
end
user = User.new(
:name => "Test",
:email => "test.user@domain.com")
p user
if user.save
puts "success: #{user.uuid}"
else
puts "error: #{user.errors.full_messages.to_sentence}"
end
And the following output for the user:
#<User:0x1011a2d20 @prefix_options={}, @attributes={"name"=>"Test", "email"=>"test.user@domain.com"}>
and this error:
/Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1233:in `new': allocator undefined for Data (TypeError)
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1233:in `load'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1219:in `each'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1219:in `load'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1322:in `load_attributes_from_response'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1316:in `create_without_notifications'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1314:in `tap'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1314:in `create_without_notifications'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/observing.rb:11:in `create'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1117:in `save_without_validation'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/validations.rb:87:in `save_without_notifications'
from /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/observing.rb:11:in `save'
from import_rest.rb:22
If I use curl against my REST service, the request looks like:
curl -v -X POST -H 'Content-Type: application/json' -d '{"name":"test curl", "email":"test@gmail.com"}' http://localhost:3000/users
with the response:
{"email":"test@gmail.com","name":"test curl","admin":false,"uuid":"afb8c98b-562a-4603-bbe4-f8f0816cef0d","creation_limit":5}
There is a built-in type named Data, whose purpose is rather mysterious. You appear to be bumping into it:
$ ruby -e 'Data.new'
-e:1:in `new': allocator undefined for Data (TypeError)
from -e:1
The question is, how did it get there? The last stack frame puts us here. So, it appears Data wandered out of a call to find_or_create_resource_for. The code branch here looks likely:
$ irb
>> class C
>> end
=> nil
>> C.const_get('Data')
=> Data
This leads me to suspect you have an attribute or similar floating around named :data or "data", even though you don't mention one above. Do you? Particularly, it seems we have a JSON response with a sub-hash whose key is "data".
Here's a script that can trigger the error for crafted input, but not from the response you posted:
$ cat ./activeresource-oddity.rb
#!/usr/bin/env ruby
require 'rubygems'
gem 'activeresource', '3.0.10'
require 'active_resource'
class User < ActiveResource::Base
self.site = "http://localhost:3000/"
self.element_name = "users"
self.format = :json
end
USER = User.new :name => "Test", :email => "test.user@domain.com"
def simulate_load_attributes_from_response(response_body)
puts "Loading #{response_body}.."
USER.load User.format.decode(response_body)
end
OK = '{"email":"test@gmail.com","name":"test curl","admin":false,"uuid":"afb8c98b-562a-4603-bbe4-f8f0816cef0d","creation_limit":5}'
BORKED = '{"data":{"email":"test@gmail.com","name":"test curl","admin":false,"uuid":"afb8c98b-562a-4603-bbe4-f8f0816cef0d","creation_limit":5}}'
simulate_load_attributes_from_response OK
simulate_load_attributes_from_response BORKED
produces..
$ ./activeresource-oddity.rb
Loading {"email":"test@gmail.com","name":"test curl","admin":false,"uuid":"afb8c98b-562a-4603-bbe4-f8f0816cef0d","creation_limit":5}..
Loading {"data":{"email":"test@gmail.com","name":"test curl","admin":false,"uuid":"afb8c98b-562a-4603-bbe4-f8f0816cef0d","creation_limit":5}}..
/opt/local/lib/ruby/gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1233:in `new': allocator undefined for Data (TypeError)
from /opt/local/lib/ruby/gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1233:in `load'
from /opt/local/lib/ruby/gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1219:in `each'
from /opt/local/lib/ruby/gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb:1219:in `load'
from ./activeresource-oddity.rb:17:in `simulate_load_attributes_from_response'
from ./activeresource-oddity.rb:24
If I were you, I would open /Library/Ruby/Gems/1.8/gems/activeresource-3.0.10/lib/active_resource/base.rb, find load_attributes_from_response on line 1320 and temporarily change
load(self.class.format.decode(response.body))
to
load(self.class.format.decode(response.body).tap { |decoded| puts "Decoded: #{decoded.inspect}" })
..and reproduce the error again to see what is really coming out of your json decoder.
I just ran into the same error in the latest version of ActiveResource, and I found a solution that does not require monkey-patching the lib: create a Data class in the same namespace as the ActiveResource object. E.g.:
class User < ActiveResource::Base
self.site = "http://localhost:3000/"
self.element_name = "users"
self.format = :json
class Data < ActiveResource::Base; end
end
Fundamentally, the problem has to do with the way ActiveResource chooses the classes for the objects it instantiates from your API response. It will make an instance of something for every hash in your response. For example, it'll want to create User, Data and Pet objects for the following JSON:
{
"name": "Bob",
"email": "bob@example.com",
"data": {"favorite_color": "purple"},
"pets": [{"name": "Puffball", "type": "cat"}]
}
The class lookup mechanism can be found here. Basically, it checks the resource (User) and its ancestors for a constant matching the name of the sub-resource it wants to instantiate (i.e. Data here). The exception is caused by the fact that this lookup finds Ruby's built-in top-level Data constant; you can therefore avoid it by providing a more specific constant in the resource's namespace (User::Data). Making this class inherit from ActiveResource::Base replicates the behaviour you'd get if the constant was not found at all (see here).
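The shadowing effect described above can be reproduced in plain Ruby. This is only a sketch of the constant lookup rule, not of ActiveResource's own code: `Payload` is a hypothetical top-level constant standing in for the built-in Data, and `User` here is an ordinary class rather than an ActiveResource model.

```ruby
# Hypothetical top-level constant standing in for Ruby's built-in Data.
class Payload; end

class User; end

# Lookup from User walks up through Object and finds the top-level constant.
found_before = User.const_get(:Payload)
puts found_before            # Payload

# Defining a constant inside User's namespace shadows the top-level one.
class User
  class Payload; end
end

found_after = User.const_get(:Payload)
puts found_after             # User::Payload
```

Once `User::Payload` exists, the same lookup resolves to it first, which is the mechanism the `class Data < ActiveResource::Base; end` workaround relies on.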
Thanks to phs for his analysis - it got me pointed in the right direction.
I had no choice but to hack into ActiveResource to fix this problem because an external service over which I have no control had published an API where all attributes of the response were tucked away inside a top-level :data attribute.
Here's the hack I ended up putting in config/initializers/active_resource.rb to get this working for me using active resource 3.2.8:
class ActiveResource::Base
def load(attributes, remove_root = false)
raise ArgumentError, "expected an attributes Hash, got #{attributes.inspect}" unless attributes.is_a?(Hash)
@prefix_options, attributes = split_options(attributes)
if attributes.keys.size == 1
remove_root = self.class.element_name == attributes.keys.first.to_s
end
# THIS IS THE PATCH
attributes = ActiveResource::Formats.remove_root(attributes) if remove_root
if data = attributes.delete(:data)
attributes.merge!(data)
end
# END PATCH
attributes.each do |key, value|
@attributes[key.to_s] =
case value
when Array
resource = nil
value.map do |attrs|
if attrs.is_a?(Hash)
resource ||= find_or_create_resource_for_collection(key)
resource.new(attrs)
else
attrs.duplicable? ? attrs.dup : attrs
end
end
when Hash
resource = find_or_create_resource_for(key)
resource.new(value)
else
value.duplicable? ? value.dup : value
end
end
self
end
class << self
def find_every(options)
begin
case from = options[:from]
when Symbol
instantiate_collection(get(from, options[:params]))
when String
path = "#{from}#{query_string(options[:params])}"
instantiate_collection(format.decode(connection.get(path, headers).body) || [])
else
prefix_options, query_options = split_options(options[:params])
path = collection_path(prefix_options, query_options)
# THIS IS THE PATCH
body = (format.decode(connection.get(path, headers).body) || [])
body = body['data'] if body['data']
instantiate_collection( body, prefix_options )
# END PATCH
end
rescue ActiveResource::ResourceNotFound
# Swallowing ResourceNotFound exceptions and return nil - as per
# ActiveRecord.
nil
end
end
end
end
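The unwrapping step at the heart of that patch can be tried in isolation. The sketch below is an assumption-laden illustration, not ActiveResource code: `unwrap_data` is a hypothetical helper name, and it uses string keys because that is what Ruby's json library produces by default.

```ruby
require 'json'

# Merge the contents of a top-level "data" key into the surrounding hash.
# `unwrap_data` is a hypothetical helper name, not part of ActiveResource.
def unwrap_data(attributes)
  attributes = attributes.dup          # leave the caller's hash untouched
  data = attributes.delete("data")
  attributes.merge!(data) if data.is_a?(Hash)
  attributes
end

wrapped = JSON.parse('{"data":{"email":"test@gmail.com","name":"test curl"}}')
flat = unwrap_data(wrapped)
puts flat.inspect
```

After unwrapping, no attribute named data remains, so the lookup that resolves to the built-in Data class is never triggered for it.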
I solved this using a monkey-patch approach that changes "data" to "xdata" before running find_or_create_resource_for (the offending method). This way, when find_or_create_resource_for runs, it won't search for the Data class (which would crash). It searches for the Xdata class instead, which hopefully doesn't exist and will be created dynamically by the method. This will be a proper class subclassed from ActiveResource.
Just add a file containing this inside config/initializers:
module ActiveResource
class Base
alias_method :_find_or_create_resource_for, :find_or_create_resource_for
def find_or_create_resource_for(name)
name = "xdata" if name.to_s.downcase == "data"
_find_or_create_resource_for(name)
end
end
end
