Workaround for Twitter API rate limiting - Ruby

I've collected a bunch of users and put them in a variable 'users'. I'm looping through them and trying to follow them with my new Twitter account. However, after about 15, I'm getting stopped by Twitter for exceeding the rate limit. I want to run this again, but without the users that I've already followed. How do I remove 'i' from the array of 'users' after they've been followed, or somehow return a new array out of this with the users I've yet to follow? I'm aware of methods like pop and unshift etc., but I'm not sure where 'i' is coming from within the 'users' array. I'm a perpetual newbie, so please include as much detail as possible.
Note: users is actually a 'cursor' and not an array; therefore, it has no length method.
>> users.each do |i|
?> myuseraccount.twitter.follow(i)
>> end
Twitter::Error::TooManyRequests: Rate limit exceeded

A simple hack could make use of a call to sleep(n):
>> users.each do |i|
?> myuseraccount.twitter.follow(i)
?> sleep(3)
>> end
Increment the sleep count until twitter-api stops throwing errors.
A proper solution to this problem is achieved via rate-limiting.
A possible ruby solution for method call rate limiting would be glutton_ratelimit.
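To illustrate what method-call rate limiting means here, the following is a minimal hand-rolled sketch. It is not the glutton_ratelimit API; the `Throttle` class and its `call` method are hypothetical names chosen for this example. It simply enforces a minimum interval between consecutive calls:

```ruby
# Minimal throttle: allows at most one call per `interval` seconds.
# (Hypothetical helper for illustration; not part of any gem.)
class Throttle
  def initialize(interval)
    @interval = interval
    @last_call = nil
  end

  # Sleeps just long enough to respect the interval, then yields.
  def call
    if @last_call
      wait = @interval - (now - @last_call)
      sleep(wait) if wait > 0
    end
    @last_call = now
    yield
  end

  private

  # Monotonic clock, so wall-clock adjustments do not affect the spacing.
  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

throttle = Throttle.new(0.05)
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = [1, 2, 3].map { |n| throttle.call { n * 2 } }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
```

In the follow loop, the block passed to `throttle.call` would contain the `follow` call, with an interval chosen to stay under the API's limit.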
Edit - And, as Kyle has pointed out, there is a documented solution to this problem.
Below is an enhanced version of that solution:
def rate_limited_follow(account, user)
  num_attempts = 0
  begin
    num_attempts += 1
    account.twitter.follow(user)
  rescue Twitter::Error::TooManyRequests => error
    # On every third failed attempt, sleep before retrying.
    sleep(15 * 60) if num_attempts % 3 == 0 # minutes * 60 seconds
    retry
  end
end
>> users.each do |i|
?> rate_limited_follow(myuseraccount, i)
>> end

There are a number of solutions, but the easiest in your case is probably shift:
while users.length > 0 do
myuseraccount.twitter.follow(users.first)
users.shift
end
This will remove each user from the array as they are processed.
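Since `users` in the question is a cursor rather than an array, one option is to copy it into an array first and stop at the first rate-limit error, returning whatever has not been followed yet. This is a minimal sketch that assumes the cursor is Enumerable (so `to_a` works). `RateLimitError` and `FakeFollowClient` are hypothetical stand-ins for `Twitter::Error::TooManyRequests` and the real client, so that the example runs without the gem:

```ruby
# Hypothetical stand-in for Twitter::Error::TooManyRequests.
class RateLimitError < StandardError; end

# Hypothetical client that allows only `limit` follows before raising.
class FakeFollowClient
  attr_reader :followed

  def initialize(limit)
    @limit = limit
    @followed = []
  end

  def follow(user)
    raise RateLimitError, "Rate limit exceeded" if @followed.size >= @limit
    @followed << user
  end
end

# Follows users in order and returns the ones not yet followed.
def follow_until_limited(client, users)
  remaining = users.to_a.dup   # to_a works for any Enumerable
  until remaining.empty?
    client.follow(remaining.first)
    remaining.shift            # drop the user only after a successful follow
  end
  remaining
rescue RateLimitError
  remaining                    # everything still in the array is unfollowed
end

client = FakeFollowClient.new(2)
left = follow_until_limited(client, %w[alice bob carol dave])
```

The returned array can be passed back into `follow_until_limited` on the next run, once the rate limit has reset.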

Here is what I did:
def self.careful(&block)
  begin
    client = get_current_client()
    yield client
  rescue Twitter::Error::TooManyRequests => error
    current_user = User.find_by_token(client.instance_variable_get("@oauth_token"))
    current_user.update_attribute(:rate_limit_at, Time.now)
    change_current_client()
    retry
  end
end
This block executes an API call using the current client. If it hits a rate limit, it switches to another client using the change_current_client() method, then retries the call with the new client. You can add a sleep() there if you want.
It can be used like this:
careful{|client| client.search("#something")}
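The rotation idea can be sketched in a self-contained form. Everything here is hypothetical scaffolding for illustration: `ClientPool`, `FakeSearchClient`, and `RateLimitError` stand in for the answer's helper methods, the real Twitter client, and `Twitter::Error::TooManyRequests`. Unlike the snippet above, this sketch stops retrying once every client has been tried, so it cannot loop forever when all clients are rate limited:

```ruby
# Hypothetical stand-in for Twitter::Error::TooManyRequests.
class RateLimitError < StandardError; end

# Hypothetical client: raises when it has no calls left.
class FakeSearchClient
  def initialize(name, calls_left)
    @name = name
    @calls_left = calls_left
  end

  def search(query)
    raise RateLimitError, "Rate limit exceeded" if @calls_left <= 0
    @calls_left -= 1
    "#{@name}: #{query}"
  end
end

class ClientPool
  def initialize(clients)
    @clients = clients
    @index = 0
  end

  # Yields the current client; on a rate limit, moves to the next one.
  # Re-raises once every client has been tried for this call.
  def careful
    attempts = 0
    begin
      yield @clients[@index]
    rescue RateLimitError
      attempts += 1
      raise if attempts >= @clients.size
      @index = (@index + 1) % @clients.size
      retry
    end
  end
end

pool = ClientPool.new([FakeSearchClient.new("a", 1), FakeSearchClient.new("b", 2)])
results = 3.times.map { pool.careful { |c| c.search("#ruby") } }
```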

Related

Sinatra + Ruby: Random Number keeps changing every time I guess. Scope issue?

I am using Sinatra to build a WebGuesser with Jumpstart Labs. I enter a number into a text field in my browser. I click submit and I am supposed to get a response saying if my number is too low or too high (or within 5). I use Shotgun to load the server. I want to be able to guess a number without having the random number change every time I guess.
Code:
require 'sinatra'
require 'sinatra/reloader'
def check_guess(guess)
if params["guess"].to_i == guess
"You got it right!"
elsif params["guess"].to_i > guess
if params["guess"].to_i > (guess + 5)
"Way too high!"
else
"Close.. but too high!"
end
elsif params["guess"].to_i < guess
if params["guess"].to_i < (guess - 5)
"Way too low!"
else
"Close.. but too low!"
end
end
end
# Home route
get '/' do
SECRET_NUMBER = rand(100)
message = check_guess(SECRET_NUMBER)
erb :index, :locals => { :message => message }
end
Currently, I get a new random number every time I guess which doesn't help. I feel like it may have something to do with where my SECRET_NUMBER is scope-wise. Any thoughts?
Every time there is a GET request to "/", the relevant code is executed, which generates (with warnings) a new SECRET_NUMBER.
One way to deal with this is to route to different URLs for the first guess (in which case a secret number should be generated), and the consecutive guesses (in which case a new secret number should not be generated).
Also, it is very bad practice to use a constant for something that changes over time.
You could store the initial value in the user session. For that, you would have to enable sessions in Sinatra:
configure do
enable :sessions
set :session_secret, "somesecretstring"
end
After that you can create a number by going to a certain route
get '/random' do
session[:number] = rand(100)
end
You can then check your guesses on a different route
get '/checkguess' do
check_guess(session[:number]) unless session[:number].nil?
end
That's the basic idea; you'd have to flesh it out further, though. Hope it helps you a little.
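The session approach can be sketched without the Sinatra gem. In this sketch, a plain Hash stands in for Sinatra's `session` (an assumption so that the example runs on its own), and `check_guess` takes both the guess and the secret as arguments instead of reading `params` itself:

```ruby
# Pure comparison: takes both numbers as arguments instead of reading params.
def check_guess(guess, secret)
  diff = guess - secret
  return "You got it right!" if diff.zero?

  direction = diff > 0 ? "high" : "low"
  diff.abs > 5 ? "Way too #{direction}!" : "Close.. but too #{direction}!"
end

# A plain Hash stands in for Sinatra's `session` here.
session = {}
# Set only once; later requests reuse the stored value.
# In the app this would be rand(100); 42 keeps the example deterministic.
session[:number] ||= 42

messages = [10, 45, 42].map { |guess| check_guess(guess, session[:number]) }
```

Because the secret is stored once and only read afterwards, repeated guesses are compared against the same number.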
I was just searching for the answer to this exact question, and this worked:
require "sinatra"
require "sinatra/reloader"
number = rand(100)
get '/' do
guess = params["guess"].to_i
message = check_guess(guess, number)
erb :index, :locals => {:bok => number, :alert => guess, :msg => message}
end
Putting the random number generation outside the get block just worked. The generated number stays the same until you change something in the code (even adding a space to the end of the file and saving it is enough to re-randomize the number) or restart the server completely.
As for the constant (SECRET_NUMBER): since it is defined at the top level, the check_guess method only needs one argument. (I'm new to Ruby, so someone can correct me if I'm wrong.)
SECRET_NUMBER = rand(100)
get '/' do ... end
def check_guess(guess)
  if guess < SECRET_NUMBER
    "Your Guess is Too LOW!"
  elsif guess > SECRET_NUMBER
    "Your Guess is Too HIGH!"
  else
    "Congratulations! You guessed it right:)"
  end
end
For anyone still looking for the answer: the random number should be generated outside of the get block.
require 'sinatra'
require 'sinatra/reloader'
rand = (rand() * 100).to_i
get '/' do
"The secret number is #{rand}"
end

Twitter rate limit hit while requesting friends with ruby gem

I am having trouble printing out a list of people I am following on twitter. This code worked at 250, but fails now that I am following 320 people.
Failure Description: The code request exceeds twitter's rate limit. The code sleeps for the time required for the limit to reset, and then tries again.
I think the way it's written, it just keeps retrying the same entire rejectable request, rather than picking up where it left off.
MAX_ATTEMPTS = 3
num_attempts = 0
begin
  num_attempts += 1
  @client.friends.each do |user|
    puts "#{user.screen_name}"
  end
rescue Twitter::Error::TooManyRequests => error
  if num_attempts <= MAX_ATTEMPTS
    sleep error.rate_limit.reset_in
    retry
  else
    raise
  end
end
Thanks!
The following code will return an array of usernames. The vast majority of the code was written by the author of: http://workstuff.tumblr.com/post/4556238101/a-short-ruby-script-to-pull-your-twitter-followers-who
First create the following definition.
def get_cursor_results(action, items, *args)
  result = []
  next_cursor = -1
  until next_cursor == 0
    begin
      t = @client.send(action, args[0], args[1], {:cursor => next_cursor})
      result = result + t.send(items)
      next_cursor = t.next_cursor
    rescue Twitter::Error::TooManyRequests => error
      puts "Rate limit error, sleeping for #{error.rate_limit.reset_in} seconds...".color(:yellow)
      sleep error.rate_limit.reset_in
      retry
    end
  end
  return result
end
Second, gather your Twitter friends using the following two lines:
friends = get_cursor_results('friends', 'users', 'twitterusernamehere')
screen_names = friends.collect{|x| x.screen_name}
try using a cursor: http://rdoc.info/gems/twitter/Twitter/API/FriendsAndFollowers#friends-instance_method (for example, https://gist.github.com/kent/451413)
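The key point of cursor-based paging is that a rate-limit error only needs to retry the page that failed, not the whole listing. Here is a self-contained sketch of that control flow. `FakePagedApi` and `RateLimitError` are hypothetical stand-ins for the Twitter client and `Twitter::Error::TooManyRequests`, and the page contents are made up for illustration:

```ruby
# Hypothetical stand-in for Twitter::Error::TooManyRequests.
class RateLimitError < StandardError; end

# Hypothetical paged API: returns [items, next_cursor]; next_cursor 0 means done.
# Raises once on the second page to simulate hitting the rate limit.
class FakePagedApi
  PAGES = { -1 => [%w[ann bob], 2], 2 => [%w[cat dan], 3], 3 => [%w[eve], 0] }

  def initialize
    @failed_once = false
  end

  def friends(cursor)
    if cursor == 2 && !@failed_once
      @failed_once = true
      raise RateLimitError, "Rate limit exceeded"
    end
    PAGES.fetch(cursor)
  end
end

def all_friends(api, wait: 0)
  names = []
  cursor = -1
  until cursor == 0
    begin
      page, next_cursor = api.friends(cursor)
    rescue RateLimitError
      sleep(wait)   # real code would sleep error.rate_limit.reset_in
      retry         # retries only the failed page; `names` is preserved
    end
    names.concat(page)
    cursor = next_cursor
  end
  names
end

names = all_friends(FakePagedApi.new)
```

Because `names` and `cursor` live outside the `begin` block, the retry picks up at the page that failed instead of starting over.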

How do you have threads in Ruby send strings back to a parent thread

I want to be able to call a method that repeats x amount of times on a separate thread that sends messages such as "still running" every few moments to the console while I am free to call other methods that do the same thing.
This works in my test environment and everything checks out via rspec - but when I move the code into a gem and call it from another script, it appears that the code is working in additional threads, but the strings are never sent to my console (or anywhere that I can tell).
I will put the important parts of the code below, but for a better understanding it is important to know that:
The code will check stock market prices at set intervals with the intent of notifying the user when the value of said stock reaches a specific price.
The code should print to the console a message stating that the code is still running when the price has not been met.
The code should tell the user that the stock has met the target price and then stop looping.
Here is the code:
require "trade_watcher/version"
require "market_beat"
module TradeWatcher
def self.check_stock_every_x_seconds_for_value(symbol, seconds, value)
t1 = Thread.new{(self.checker(symbol, seconds, value))}
end
private
def self.checker(symbol, seconds, value)
stop_time = get_stop_time
pp stop_time
until is_stock_at_or_above_value(symbol, value) || Time.now >= stop_time
pp "#{Time.now} #{symbol} has not yet met your target of #{value}."
sleep(seconds)
end
if Time.now >= stop_time
out_of_time(symbol, value)
else
reached_target(symbol, value)
end
end
def self.get_stop_time
Time.now + 3600 # an hour from Time.now
end
def self.reached_target(symbol, value)
pp "#{Time.now} #{symbol} has met or exceeded your target of #{value}."
end
def self.out_of_time(symbol, value)
pp "#{Time.now} The monitoring of #{symbol} with a target of #{value} has expired due to the time limit of 1 hour being reached."
end
def self.last_trade(symbol)
MarketBeat.last_trade_real_time symbol
end
def self.is_stock_at_or_above_value(symbol, value)
last_trade(symbol).to_f >= value
end
end
Here are the tests (that all pass):
require 'spec_helper'
describe "TradeWatcher" do
context "when comparing quotes to targets values" do
it "can report true if a quote is above a target value" do
TradeWatcher.stub!(:last_trade).and_return(901)
TradeWatcher.is_stock_at_or_above_value(:AAPL, 900).should == true
end
it "can report false if a quote is below a target value" do
TradeWatcher.stub!(:last_trade).and_return(901)
TradeWatcher.is_stock_at_or_above_value(:AAPL, 1000).should == false
end
end
it "checks stock value multiple times while stock is not at or above the target value" do
TradeWatcher.stub!(:last_trade).and_return(200)
TradeWatcher.should_receive(:is_stock_at_or_above_value).at_least(2).times
TradeWatcher.check_stock_every_x_seconds_for_value(:AAPL, 1, 400.01)
sleep(2)
end
it "triggers reached_target when the stock has met or surpassed the target value" do
TradeWatcher.stub!(:last_trade).and_return(200)
TradeWatcher.should_receive(:reached_target).exactly(1).times
TradeWatcher.check_stock_every_x_seconds_for_value(:AAPL, 1, 100.01)
sleep(2)
end
it "returns a 'time limit reached' message once a stock has been monitored for the maximum of 1 hour" do
TradeWatcher.stub!(:last_trade).and_return(200)
TradeWatcher.stub!(:get_stop_time).and_return(Time.now - 3700)
TradeWatcher.check_stock_every_x_seconds_for_value(:AAPL, 1, 100.01)
TradeWatcher.should_receive(:out_of_time).exactly(1).times
sleep(2)
end
end
And here is a very simple script that (in my understanding) should print "{Time.now} AAPL has not yet met your target of 800.54." every 1 second that the method is still running and should at least be visible for 20 seconds (I test this using sleep in rspec and am able to see the strings printed to the console):
require 'trade_watcher'
TradeWatcher.check_stock_every_x_seconds_for_value(:AAPL, 1, 800.54)
sleep (20)
However, I get no output, although the program does wait 20 seconds to finish. If I add other lines to print out to the console, they work just fine, but nothing within the thread triggered by my TradeWatcher method call actually works.
In short, I'm not understanding how to have threads communicate with each other appropriately, or how to sync them up with each other (I don't think thread.join is appropriate here, because it would leave the main thread hanging and unable to accept another method call if I chose to send one at a time in the future). My understanding of Ruby multithreading is weak. Is anyone able to understand what I'm trying to get at here and nudge me in the right direction?
It looks like the pp function is simply not yet loaded by ruby when you go to print. By adding:
require 'pp'
to the top of trade_watcher.rb I was able to get the output you're expecting. You might also want to consider adding:
$stdout.sync = $stderr.sync = true
to your binary/executable script so that your output is not buffered internally by the IO class and instead passed directly to the os.
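Separately from the missing require, if the goal is for worker threads to hand strings back to the parent thread rather than print them directly, Ruby's thread-safe `Queue` is one way to do it. This is a minimal sketch, not part of the gem in the question. The message text and the `:done` sentinel are choices made for this example:

```ruby
require 'thread'

messages = Queue.new

worker = Thread.new do
  3.times do |i|
    messages << "still running (#{i + 1})"
    sleep 0.01
  end
  messages << :done   # sentinel telling the parent there is nothing more
end

received = []
# Queue#pop blocks until a message is available.
while (msg = messages.pop) != :done
  received << msg
  puts msg
end
worker.join
```

Here the parent blocks while waiting for messages. If it needs to stay free for other work, it can poll with `messages.pop(true)`, which raises `ThreadError` instead of blocking when the queue is empty.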

Odd bug with DataMapper, Mutexes, and Threads?

I have a database full of URLs that I need to test HTTP response time for on a regular basis. I want to have many worker threads combing the database at all times for a URL that hasn't been tested recently, and if it finds one, test it.
Of course, this could cause multiple threads to snag the same URL from the database. I don't want this. So, I'm trying to use Mutexes to prevent this from happening. I realize there are other options at the database level (optimistic locking, pessimistic locking), but I'd at least prefer to figure out why this isn't working.
Take a look at this test code I wrote:
threads = []
mutex = Mutex.new
50.times do |i|
threads << Thread.new do
while true do
url = nil
mutex.synchronize do
url = URL.first(:locked_for_testing => false, :times_tested.lt => 150)
if url
url.locked_for_testing = true
url.save
end
end
if url
# simulate testing the url
sleep 1
url.times_tested += 1
url.save
mutex.synchronize do
url.locked_for_testing = false
url.save
end
end
end
sleep 1
end
end
threads.each { |t| t.join }
Of course there is no real URL testing here. But what should happen is at the end of the day, each URL should end up with "times_tested" equal to 150, right?
(I'm basically just trying to make sure the mutexes and worker-thread mentality are working)
But each time I run it, a few odd URLs here and there end up with times_tested equal to a much lower number, say, 37, and locked_for_testing frozen on "true"
Now as far as I can tell from my code, if any URL gets locked, it will have to unlock. So I don't understand how some URLs are ending up "frozen" like that.
There are no exceptions and I've tried adding begin/ensure but it didn't do anything.
Any ideas?
I'd use a Queue, and a master thread to pull the work you want. If you have a single master, you control what's being accessed. This isn't perfect, but it's not going to blow up because of concurrency. Remember: if you aren't locking the database, a mutex doesn't really help you if something else accesses the DB.
Code (completely untested):
require 'thread'
queue = Queue.new
keep_running = true
# trap Ctrl-C or something to reset keep_running
master = Thread.new do
while keep_running
# check if we need some work to do
if queue.size == 0
urls = URL.all(:times_tested.lt => 150)
urls.each do |u|
queue << u.id
end
# keep from spinning the queue
sleep(0.1)
end
end
end
workers = []
50.times do
workers << Thread.new do
while keep_running
# get an id
id = queue.shift
url = URL.get(id)
#do something with the url
url.save
sleep(0.1)
end
end
end
workers.each do |w|
w.join
end
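The queue-based design can be tried out without a database. In the sketch below, a plain Hash stands in for the URL table (an assumption; no DataMapper is involved), and a `nil` sentinel per worker replaces the `keep_running` flag so that the threads shut down cleanly:

```ruby
require 'thread'

# In-memory stand-in for the URL table: id => times_tested.
times_tested = { 1 => 0, 2 => 0, 3 => 0 }
lock = Mutex.new
queue = Queue.new

# The master enqueues every id the required number of times up front.
target = 5
times_tested.keys.each { |id| target.times { queue << id } }

workers = 4.times.map do
  Thread.new do
    # Queue#pop blocks until work arrives; a nil sentinel ends the loop.
    while (id = queue.pop)
      # The "test" itself would go here; only the counter update is locked.
      lock.synchronize { times_tested[id] += 1 }
    end
  end
end

workers.size.times { queue << nil }   # one sentinel per worker
workers.each(&:join)
```

Each id is handed to exactly one worker per queue entry, so no separate locked_for_testing flag is needed in this version.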

Catching timeout errors with ruby mechanize

I have a mechanize function to log me out of a site, but on very rare occasions it times out. The function involves going to a specific page and then clicking on a logout button. On the rare occasion that mechanize suffers a timeout, either when going to the logout page or when clicking the logout button, the code crashes. So I put in a small rescue, and it seems to be working, as seen in the second piece of code below.
def logmeout(agent)
page = agent.get('http://www.example.com/')
agent.click(page.link_with(:text => /Log Out/i))
end
Logmeout with rescue:
def logmeout(agent)
begin
page = agent.get('http://www.example.com/')
agent.click(page.link_with(:text => /Log Out/i))
rescue Timeout::Error
puts "Timeout!"
retry
end
end
Assuming I understand rescue correctly, it will redo both actions even if only the click timed out. To be more efficient, I was wondering if I could use a proc in this situation and pass it a code block. Would something like this work:
def trythreetimes
tries = 0
begin
yield
rescue
tries += 1
puts "Trying again!"
retry if tries <= 3
end
end
def logmeout(agent)
trythreetimes {page = agent.get('http://www.example.com/')}
trythreetimes {agent.click(page.link_with(:text => /Log Out/i))}
end
Note in my trythreetimes function I left it as generic rescue so the function would be more re-usable.
Thanks so much for any help anyone can provide, I realize there are a couple different questions in here but they are all things I am trying to learn!
Instead of retrying on timeouts for some mechanize requests, I think you'd do better to set the Mechanize::HTTP::Agent::read_timeout attribute to a reasonable number of seconds, like 2 or 5, or whatever value prevents timeout errors for this request.
Also, it seems that your logout procedure only requires a simple HTTP GET request. I mean, there is no form to fill in, so no HTTP POST request.
So if I were you, I would inspect the page source (Ctrl+U in Firefox or Chrome) to identify the link that is reached by your agent.click(page.link_with(:text => /Log Out/i)) and request it directly.
It should be faster because these types of pages are usually blank, and Mechanize will not have to load a full HTML page into memory.
Here is the code I would use:
def logmeout(agent)
begin
agent.read_timeout=2 #set the agent time out
page = agent.get('http://www.example.com/logout_url.php')
agent.history.pop() #delete this request in the history
rescue Timeout::Error
puts "Timeout!"
puts "read_timeout attribute is set to #{agent.read_timeout}s" if !agent.read_timeout.nil?
#retry #retry is no more needed
end
end
But you can use your retry function too:
def trythreetimes
tries = 0
begin
yield
rescue Exception => e
tries += 1
puts "Error: #{e.message}"
puts "Trying again!" if tries <= 3
retry if tries <= 3
puts "No more attempt!"
end
end
def logmeout(agent)
trythreetimes do
agent.read_timeout=2 #set the agent time out
page = agent.get('http://www.example.com/logout_url.php')
agent.history.pop() #delete this request in the history
end
end
hope it helps ! ;-)
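One detail of the question's two-block version is worth noting: a variable first assigned inside a block (such as `page`) is local to that block, so the second block cannot see it unless the helper returns the block's value. Below is a sketch of a retry helper that does so. The name `try_times` is hypothetical. It rescues only `StandardError` (which covers `Timeout::Error`) and re-raises after the last attempt instead of silently continuing:

```ruby
require 'timeout'

# Retries the block up to `max_tries` times and returns the block's value.
# Rescues only StandardError and re-raises after the final failure.
def try_times(max_tries = 3)
  tries = 0
  begin
    yield
  rescue StandardError => e
    tries += 1
    retry if tries < max_tries
    raise e
  end
end

# Simulated flaky call: times out twice, then succeeds.
calls = 0
result = try_times do
  calls += 1
  raise Timeout::Error, "simulated timeout" if calls < 3
  "logged out"
end
```

With a helper like this, the question's code could be written as `page = try_times { agent.get(...) }` followed by `try_times { agent.click(...) }`, so that each step is retried on its own.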
Using mechanize 1.0.0 I got this problem from a different source of error.
In my case I was blocked by proxy and then SSL. This worked for me:
ag = Mechanize.new
ag.set_proxy('yourproxy', yourport)
ag.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
ag.get( url )
