We use an API that imposes a per-hour rate limit.
I'm wondering about the best way to spread a set number of requests per hour to the API across our own scripts, i.e. making 10 requests per hour so we don't exceed our allowance and avoid overage charges.
I was thinking of just using sleep(60*6) in my loop, but API calls can take minutes, so we might end up making far fewer requests than we're allowed.
What is the best practice for spreading out our requests?
Edit:
I ended up doing something like this; what do you guys think?
while queue.size > 0 do
  Thread.new {
    element = queue.pop
    # do the rate-limited API calls and things
  }
  sleep(60 * 6)
end
Consider the rack-attack middleware.
To sum it up: you keep somewhere (in memory, or in a database like Redis) the number of requests executed by a specific client (identified by IP, identity, or in any other form) within a given time window.
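A minimal sketch of that counter idea, assuming a fixed one-hour window and a plain in-memory Hash standing in for Redis (the class and its names are illustrative; with Redis you'd use INCR plus EXPIRE on a per-window key):

```ruby
class WindowCounter
  def initialize(limit:, window: 3600)
    @limit  = limit
    @window = window
    @store  = Hash.new(0)  # key => request count (INCR/EXPIRE in real Redis)
  end

  # Returns true if the client may make another request in the current window.
  def allow?(client_id, now = Time.now.to_i)
    key = "#{client_id}:#{now / @window}"  # one key per client per window
    @store[key] += 1
    @store[key] <= @limit
  end
end
```

The first `limit` calls in a window return true; later ones return false until the next window starts.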
I have 250 users, and around 60000 samples should be hit across all the requests. I have put the requests that are supposed to get a huge sample count inside a loop, but the requests outside the loop are getting executed only 3-4 times, which is less than expected. How do I handle this?
It is not really possible to provide a comprehensive answer without knowing what you're trying to achieve and seeing your test plan, or at least your Thread Group configuration.
The easiest option is moving the requests which you want to execute more often into a separate Thread Group.
If the requests have to stay in one Thread Group, you can control the frequency using a Throughput Controller.
If the logic is more complex, consider using a Switch Controller or Weighted Switch Controller.
I am trying to load test an API while making sure I fire only 2 requests per second, due to the throttling limit set at the API Gateway level. If a third request is sent within the same second (which happens when the response time of an earlier request is < 1 sec), I get an HTTP 429 error saying 'too many requests'. Could someone suggest a timer I can use to achieve this?
Thanks,
N
The Constant Throughput Timer is the easiest of the built-in test elements; 2 requests per second is 120 requests per minute. However, it is precise enough only at the "minute" level, so you might need to play with the ramp-up period.
The Precise Throughput Timer is more "precise", but a little bit harder to use, as you need to provide the desired throughput and the test duration.
If you don't mind using JMeter Plugins, there is the Throughput Shaping Timer, which provides maximum flexibility and a visual way of defining the load.
For this particular case, I suggest the Arrivals Thread Group. This TG will let you configure exactly the desired TPS (arrival rate), and the plugin will instantiate the necessary threads to generate the load. No need to guess how many threads/vusers you'll need.
I want to build a cronjob-like system that gets all users from the database, makes multiple (I mean lots of) concurrent requests for each of them, performs some processing, and saves the results to the db. It will run every hour, 24/7.
I came up with the solution that:
Gets all users from db (that's the easy part)
Dynamically creates lambda functions and distributes all users to these functions
Each lambda function makes concurrent requests and executions
(handling results and saving them to db)
Have these functions communicate via SNS when needed
So, does my approach make sense for this situation?
The most important thing here is scaling (that's why I thought of distributing the users across lambda functions, to limit concurrent requests and resources). How can we come up with a scalable and efficient design for an exponentially increasing user count?
Or any other suggestions?
Here is my solution:
If 100 concurrent lambdas are not enough for your needs, create a ticket to increase your limit; you will only be charged for what you use.
However, you still can't predict how many lambdas will be required in the future. It is not necessary to process each user in a separate lambda; instead, you can invoke a lambda with a chunk of user data. E.g., let's say your max lambda limit is 100 and there are 1000 users; then you can do something like this (I don't know Go, so here is roughly equivalent Python code):
users = get_users_fromdb()  # users = [1, 2, 3, ... 1000]
number_of_users = len(users)
chunk_size = number_of_users // 100  # 100 is your concurrent-lambda limit

for i in range(0, number_of_users, chunk_size):
    chunk_users_data = users[i : i + chunk_size]  # e.g. [1, 2, 3 ... 10]
    invoke_lambda_to_process_users_chunk_data(chunk_users_data)
Here is what you can do in the other lambda:
users = event.get('users')
for user in users:
    try:
        process_user(user)
    except Exception as e:
        print(e)  # handle the exception / error if you want
Update:
By default, 100 is the limit for concurrently running lambdas. If you have 100K users, IMO you should open a support case to increase your account's concurrent lambda limit to 1000 or more. I am working with lambda and we have a 10K limit. One more thing to keep in mind: it is not guaranteed that one lambda invocation will be able to process all users in a chunk, so add some logic to re-invoke with the remaining users before the timeout. A lambda can run for up to a maximum of 5 minutes. You can get the remaining time, in milliseconds, from the context object.
This question is not only about Ruby.
I have many workers running that create many connections to an external API. This API has a rate limit.
Right now I use Sidekiq and Redis for limiting access,
i.e. every rate-limited API access runs through a worker.
When a worker starts, it checks when the API was last accessed; if that access is more recent than the API allows, the worker is rescheduled, otherwise it records the access time in Redis and runs the request.
ex:
def run_or_schedule
  limiter = RedisLimiter.new(account_token)
  if limiter.can_i_run?
    limiter.like
    run
  else
    ApiWorker.perform_at(limiter.next_like, *params)
  end
end
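The RedisLimiter used above isn't shown; purely to illustrate the interface (can_i_run?, like, next_like), here is a minimal in-memory stand-in that assumes one call per second per account (in production the timestamps would live in Redis, not in a class-level Hash):

```ruby
class RedisLimiter
  GAP = 1.0  # minimum number of seconds between API calls per account

  # account_token => Time of last call (this would be a Redis key in real life)
  def self.last_times
    @last_times ||= {}
  end

  def initialize(account_token)
    @token = account_token
  end

  def can_i_run?(now = Time.now)
    last = self.class.last_times[@token]
    last.nil? || now - last >= GAP
  end

  def like(now = Time.now)
    self.class.last_times[@token] = now  # record this access
  end

  def next_like
    (self.class.last_times[@token] || Time.now) + GAP
  end
end
```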
The problem is that I create many requests and they get rescheduled many times.
Can someone recommend a better solution?
Do any design patterns exist for this?
One alternative to the polling approach you are using would be to have a supervisor.
So instead of having each worker handle the question itself, you have another object/worker/process which decides when it is time for the next worker to run.
If the API imposes a time limit between requests, you could have this supervisor execute as often as the time limit allows.
If the limit is more complex (e.g. a total number of requests per interval X), you could have the supervisor running constantly and, upon hitting a limit, block (sleep) for the remainder of the interval. Upon resuming, it could continue with the next worker from the queue.
One apparent advantage of this approach is that you skip the overhead of each individual worker being instantiated, checking whether it should run, and being rescheduled if it shouldn't.
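A rough sketch of that supervisor loop, assuming a simple "N requests per interval" limit and a queue of callable workers (the class and parameter names are placeholders):

```ruby
class Supervisor
  def initialize(queue, limit:, interval:)
    @queue    = queue     # thread-safe Queue of callable workers
    @limit    = limit     # max requests per interval
    @interval = interval  # interval length in seconds
  end

  def run
    until @queue.empty?
      @limit.times do
        break if @queue.empty?
        @queue.pop.call                     # run one worker's API request
      end
      sleep @interval unless @queue.empty?  # block until the window resets
    end
  end
end
```

Workers never check or reschedule themselves; the supervisor alone decides when the next one fires.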
You can use a queue and a dedicated thread that sends the events (when there are any waiting) at the maximum allowable rate.
Say you can send one API call every second, you can do the following:
class APIProxy
  def initialize
    @like_queue = Queue.new  # create the queue before any producer can push
  end

  def like(data)
    @like_queue << data
  end

  def run
    Thread.new do
      loop do
        actual_send_like @like_queue.pop  # blocks until something is queued
        sleep 1                           # enforce at most one call per second
      end
    end
  end

  private

  def actual_send_like(data)
    # use the API you need
  end
end
I've been messing around with Ruby and threading a little bit today. I have a list of proxies that I want to check. Assuming a timeout of 10 seconds, going through a very large list of proxies will take many hours if I write something like:
proxies.each do |proxy|
  check_proxy(proxy)
end
My first problem in trying to figure out threads is how to START multiple at the exact same time. I found a neat little snippet of code online:
threads = []
for page in pages
  threads << Thread.new(page) { |myPage|
    puts "Fetching: #{myPage}\n"
    doc = Hpricot(open(myPage.to_s)).to_s
    puts "Got #{myPage}: #{doc.size}"
  }
end
threads.each(&:join)
Seems to work nicely as far as starting them all at the same time. So now I can... start checking all 7 thousand records at the same time?
How do I go through a file, take out a line for each thread, run a batch of, say, 20, and repeat the process?
Can I run a while loop that starts 20 threads at a time (which remove lines from the file) and keeps going until the file is empty?
I'm a little weak on the logic of what I'm supposed to do.
Thanks guys!
PS.
Another thought: will there be file-access issues if 20 workers are constantly messing with the file at random? What would be a good way around that if so?
The keyword you are after is thread pool. You can either try to find one for Ruby (I am sure there are at least a couple on GitHub), or roll your own.
Here's a simple implementation here on SO.
Re: the file access, IMO you shouldn't let workers alter the file directly; do it in your main thread instead. You don't want to allow simultaneous edits there.
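Rolling your own is only a few lines on top of Ruby's thread-safe Queue; a sketch (the pool size and the :stop sentinel are arbitrary choices):

```ruby
class ThreadPool
  def initialize(size)
    @jobs    = Queue.new
    @threads = size.times.map do
      Thread.new do
        # each worker pops jobs until it sees the stop sentinel
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  def schedule(&block)
    @jobs << block
  end

  def shutdown
    @threads.size.times { @jobs << :stop }  # one sentinel per worker
    @threads.each(&:join)
  end
end
```

With this, the asker's loop becomes: create a pool of 20, schedule one check_proxy call per line, then shutdown; the pool caps concurrency at 20 no matter how many proxies are queued.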
Try the delayed_job gem:
https://github.com/tobi/delayed_job
You don't need to generate that many threads to do this work. In fact, generating a lot of threads can decrease the overall performance of your application. If you handle checking each proxy asynchronously, without blocking, you can get by with far fewer threads.
You'd create a file-manager thread to process the file. Each line gets added as a request to an array (the request queue). On the other end of the request queue you can use eventmachine to send the requests without blocking. eventmachine would also be used to receive the responses and handle the timeout. Each response can then be placed on another array (the response queue), which your file-manager thread polls. The file-manager thread pulls responses from the response queue and determines whether the proxy works or not.
This gets you down to just two threads. One issue you will have is limiting the number of in-flight requests, since this model can send out all of the requests in under a second and flood the nearest router. In my experience you should be able to have around 500 outstanding requests at any one time.
There is more than one way to solve this problem asynchronously but hopefully the above is enough to help get you started with non-blocking I/O.
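A sketch of that two-queue structure, with a plain Ruby thread standing in for eventmachine (a real version would send the requests with non-blocking I/O); `checker` stands for whatever per-proxy check you already have:

```ruby
def check_all(proxies, checker)
  requests  = Queue.new  # filled by the manager from the file/list
  responses = Queue.new  # polled by the manager for results

  # "Sender" thread: pops requests, runs the check, pushes the outcome.
  sender = Thread.new do
    while (proxy = requests.pop) != :done
      responses << [proxy, (checker.call(proxy) rescue false)]
    end
  end

  # Manager side: feed the requests in, collect the results out.
  proxies.each { |p| requests << p }
  requests << :done

  results = {}
  proxies.size.times do
    proxy, alive = responses.pop
    results[proxy] = alive
  end
  sender.join
  results
end
```

The request/response queues decouple reading the file from doing the network I/O, which is the core of the model described above.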