I'm trying to implement rate limiting using Redis. I need to call an API that has a limit of 100 TPS, i.e. it can be called at most 100 times per second. I have the constraint that I cannot call the real API for testing. How can I test that the API is called no more than 100 times in a second?
Redis::throttle('key')->allow(100)->every(1)->then(function () {
    // API called that has rate limit
}, function () {
    // Could not obtain lock...
    return $this->release(10);
});
Rate limiting is not a trivial task. Getting an exact count at one-second granularity is difficult because of network round-trip time. In your case, since you've specified 100 as the max count, Laravel guarantees that it will process at most 100 messages per second.
Each method call will add roughly 1-5 ms of execution time, unless your Redis consistently responds in microseconds.
The best bet is to benchmark your Redis calls and see how many locks you can acquire using the Laravel throttle primitive. As a basic test, set the throttle to 10,000, run nothing but a print statement inside it, and count how many calls you can make (ideally do this in production). That number is the maximum number of locks you can acquire per second.
If you can acquire more than 100 locks in a second then nothing should stop you from using the primitive you have mentioned.
For testing, you can do a couple of things:
Find the p99/p95/p90 response time of the API in question.
Add a dummy API/method to your system that simply takes the p99/p95/p90 response time to complete. For simplicity, it can just sleep for the required interval.
Now hit your dummy method/API. Here you can do all sorts of counting to identify whether you're exceeding the throttle limit: log timestamps with millisecond precision and aggregate over seconds to spot any issue. Run this for an hour or so.
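The "log in milliseconds, aggregate over seconds" step could look roughly like the sketch below. It is written in Python rather than PHP only to keep it self-contained; the function names and the sample log are made up for illustration and are not part of Laravel or Redis.

```python
from collections import Counter


def calls_per_second(timestamps_ms):
    """Group millisecond timestamps into whole-second buckets and count each bucket."""
    return Counter(ts // 1000 for ts in timestamps_ms)


def seconds_over_limit(timestamps_ms, limit=100):
    """Return {second: count} for every one-second bucket that exceeds the limit."""
    counts = calls_per_second(timestamps_ms)
    return {second: n for second, n in counts.items() if n > limit}


# Made-up log: 100 calls during second 0, then 101 calls during second 1.
sample_log = [i * 10 for i in range(100)] + [1000 + i * 9 for i in range(101)]
violations = seconds_over_limit(sample_log)
print(violations)  # {1: 101}
```

Note that fixed one-second buckets can miss a burst that straddles a second boundary, so if the real API enforces a rolling window you may want to check rolling one-second windows as well.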
Related
I would like to load test the https://app-staging.servespark.com site. I have completed JMeter scripts for login and am able to navigate to any page.
How can I find out the maximum number of concurrent requests per second that the server can handle, using JMeter?
Is this possible in JMeter? Please advise.
It looks like you need to conduct a Stress Test, something like:
Start with 1 user
Gradually increase the load at the same time looking into the following charts:
Active Threads Over Time
Response Times Over Time
Transactions Per Second
At the beginning the response time should not change and the throughput (number of transactions per second) should increase by the same factor as the number of users increases
At a certain stage of the test you will notice that the response time starts growing and the number of transactions per second goes down. This indicates a bottleneck
You may continue increasing the load to see at which stage the errors will start occurring
And finally you can decrease the load gradually as well to see if the application gets back to normal when the load comes down (i.e. errors disappear, throughput grows, etc.)
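Once you have a table of (users, transactions per second) from such a stepped run, spotting the point where throughput stops growing can be automated. Below is a minimal Python sketch; the sample numbers are invented, and the 5% `min_gain` threshold is an arbitrary assumption rather than anything JMeter defines.

```python
def find_saturation_step(steps, min_gain=0.05):
    """steps: list of (users, tps) tuples ordered by increasing user count.

    Return the first step whose TPS grew by less than min_gain relative to
    the previous step (this includes a drop in TPS), or None if TPS kept growing.
    """
    for previous, current in zip(steps, steps[1:]):
        previous_tps, current_tps = previous[1], current[1]
        if current_tps < previous_tps * (1 + min_gain):
            return current
    return None


# Invented results: throughput scales up to 80 users, then falls at 160.
results = [(10, 50.0), (20, 100.0), (40, 198.0), (80, 260.0), (160, 255.0)]
saturation = find_saturation_step(results)
print(saturation)  # (160, 255.0)
```

In practice you would read this alongside the response time chart, since growing response times usually show up before throughput actually drops.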
You could try this:
https://jmeter-plugins.org/wiki/ConcurrencyThreadGroup/
And ramp up the users to a value higher than expected.
This is the sort of traffic pattern I'm consistently seeing.
I understand that RPS roughly equals number of users/(response time + sleep time), hence my RPS will be roughly flat if my number of users and my response times are increasing at a similar rate (I'm using 0 sleep time).
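As a quick sanity check of that relationship, here is a small Python sketch (the function name and the sample numbers are just for illustration):

```python
def expected_rps(users, response_time_s, sleep_time_s=0.0):
    """Approximate requests per second for a closed workload model:
    each user issues one request per (response time + sleep time)."""
    return users / (response_time_s + sleep_time_s)


# Doubling the users while the response time also doubles leaves RPS flat.
print(expected_rps(100, 0.5))  # 200.0
print(expected_rps(200, 1.0))  # 200.0
```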
I also understand that you can't help me debug the underlying system whose response time is increasing! That's another thread I'll be pursuing separately. The increasing response time is not a Locust issue.
My question is how can I get Locust to ignore response time, in order to produce a constantly increasing RPS? I would like to take response time out of the equation entirely so that RPS is proportional to number of users.
(Why do I want to do this? In order to effectively load test my particular system.)
An individual Locust user is synchronous/sequential and cannot "ignore response times" any more than any other Python program can "ignore the time spent executing a line of code".
But you can use wait_time = constant_pacing(seconds_per_iteration) to ensure a fixed iteration time for each user https://docs.locust.io/en/stable/writing-a-locustfile.html#wait-time-attribute
Or wait_time = constant_pacing(1/iterations_per_second) if you prefer.
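To illustrate what pacing buys you, here is a rough plain-Python model of the idea (not Locust's actual implementation; the helper names are made up): the wait is whatever is left of the target iteration time after the task has run, and it drops to zero once the task takes longer than that.

```python
def pacing_wait(seconds_per_iteration, task_duration_s):
    """Wait time that pads an iteration up to the target length (never negative)."""
    return max(0.0, seconds_per_iteration - task_duration_s)


def per_user_rate(seconds_per_iteration, task_duration_s):
    """Iterations per second a single user achieves under constant pacing."""
    iteration_s = task_duration_s + pacing_wait(seconds_per_iteration, task_duration_s)
    return 1.0 / iteration_s


print(per_user_rate(1.0, 0.25))  # 1.0 -> pacing hides the response time
print(per_user_rate(1.0, 2.0))   # 0.5 -> response time exceeds the pacing interval
```

So the per-user rate stays constant only while the response time is below the pacing interval; beyond that, each user slows down again.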
For a "global" version of the same type of wait, use https://github.com/SvenskaSpel/locust-plugins/blob/master/examples/constant_total_ips_ex.py
Make sure your user count is high enough, as none of these methods can launch additional users/concurrent requests.
You may also want to have a look at https://github.com/locustio/locust/wiki/FAQ#increase-my-request-raterps
Building on cyberwiz's answer, you can't make the individual Locust users ignore response time. Each has made a request and can't do anything else until it gets a response. With ever increasing response times, all you can do is make Locust spawn more and more users. You'd need to run in distributed mode and add more workers who can spawn more users. You can specify a higher user count and maybe even a higher hatch rate, depending on the behavior you're trying to achieve.
I'm using JMeter to build a performance test. To keep things short: I read the initial data from a JSON file. I have a single thread group in which, after reading the data, I randomize certain values to prevent data duplication where I need to, then pass the final data to the endpoint using variables. This ends up in a JSON body that is received by the endpoint, which basically creates a new transaction in the database. I also added a Constant Timer to put a 7-second delay between requests, with a test duration of 10 minutes and no ramp-up. I calculated the expected request rate like this:
One minute has 60 seconds and I have a delay of 7 seconds per request, so it's logical to say that I'm sending approximately 8.5 requests per minute (60/7 ≈ 8.5). If the test lasts 10 minutes, then 8.5 × 10 = 85, giving a total of 85 transactions in 10 minutes, so I should see that same number of transactions created in the database after the test completes.
This holds when I'm running 10, 20, or 40 users: after the load test I query the DB and get exactly the expected number of transactions. However, as I increase the users in the thread group this no longer happens. For example, if I set 1000 users I should be able to generate 8500 transactions in 10 minutes, but that is not the case; the DB only ends up with around 5.1k transactions.
What is happening? Why does it initially work as expected, but not as I increase the users? I can provide more information if needed. Please help.
There could be 2 possible reasons for this:
You discovered your application bottleneck. When you add more users the application response time increases therefore throughput decreases. There is a term called saturation point which stands for the maximum performance of the system, if you go beyond this point - the system will respond slower and you will get less TPS than initially. From the application under test side you should take a look into the following areas:
It might be the case that your application simply lacks resources (CPU, RAM, network, etc.). Make sure it has enough headroom to operate, for example by monitoring it with the JMeter PerfMon Plugin
Your application middleware (application server, database, load balancer, etc.) is not properly set up for high loads. Identify your application infrastructure stack and make sure to follow the performance tuning guidelines for each component
It is also possible that your application code needs optimization, you can detect the most time/resource consuming functions, largest objects, slowest DB queries, idle times, etc. using profiling tools
JMeter is not sending requests fast enough
Just like for the application under test check that JMeter machine(s) have enough resources (CPU, RAM, etc.)
Make sure to follow JMeter Best Practices
Consider going for Distributed Testing
Can you please check the CPU and memory utilization (RAM and Java heap) of the JMeter load generator while running JMeter with 1000 users? If utilization is high or reaching the maximum, it may affect requests/sec. Also, to confirm the requests/sec from the JMeter side, can you please add a listener to the JMeter script to track hits/sec or TPS?
This will also be true (8.5K requests in a 10-minute test) if your API response time is 1 second and you have provided enough ramp-up time for those 1000 users.
So the possible reasons are:
You did not provide enough ramp-up time for 1000 users.
Your API average response time is more than 1 second while you are running the test with 1000 users.
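To see why response time matters here, consider that each thread waits for the timer delay and then for the response before it can start its next iteration. A rough Python sketch (hypothetical helper name; the 1 s and 5 s response times are example values, not measurements):

```python
def requests_per_user(duration_s, timer_delay_s, response_time_s):
    """Approximate number of requests one thread completes during the test,
    assuming each iteration takes the timer delay plus the response time."""
    return duration_s / (timer_delay_s + response_time_s)


# 10-minute test with a 7-second timer, for different response times.
print(round(requests_per_user(600, 7, 0), 1))  # 85.7 (the idealized estimate)
print(requests_per_user(600, 7, 1))            # 75.0
print(requests_per_user(600, 7, 5))            # 50.0
```

The 60/7 estimate only holds while the response time is negligible compared to the 7-second delay, which is typically the case at low user counts.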
Possible workarounds:
First, try to measure the API response time for 1 user.
Then calculate how many users you need to reach 8500 requests in 10 minutes. Use this formula:
users = TPS × max response time in seconds
Give a proper ramp-up time for 1000 users. Check this thread to understand how you should calculate ramp-up time.
Check that your load generator is able to run 1000 users without any memory or health (e.g. CPU usage) issues. If required, try a distributed architecture.
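The formula above can be sketched in Python as follows. The `think_time_s` parameter is my own addition to account for a timer delay between requests, and the 1-second response time is only an example value:

```python
import math


def users_needed(target_tps, response_time_s, think_time_s=0.0):
    """Concurrent users required to sustain target_tps when each user needs
    response_time_s + think_time_s per request (rounded up)."""
    return math.ceil(target_tps * (response_time_s + think_time_s))


target_tps = 8500 / 600  # 8500 requests in 10 minutes, about 14.2 per second

print(users_needed(target_tps, 1.0))                    # 15 (no timer)
print(users_needed(target_tps, 1.0, think_time_s=7.0))  # 114 (with the 7 s timer)
```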
I have a program that makes many queries to the Google Search Analytics server. It runs the queries sequentially, one after the other, so at any instant only one query is in progress.
Google advises a throughput limit of at most 2000 queries per 100 seconds, so to make my system as efficient as possible I have two ideas in mind:
Since 2000 queries per 100 seconds is one query every 0.05 seconds, I space out my queries by sleeping the process, but only if a query takes less than 0.05 seconds; in that case the process sleeps for the time remaining in the 0.05-second interval. If the query takes 0.05 s or more, I trigger the next one without waiting.
The second idea is easier to implement, but I think it will be less efficient: I trigger the queries, noting the time when the process starts, and if I reach 2000 queries before 100 seconds have passed, I sleep for the remaining time.
So far I don't know how to measure which one is better.
What is your opinion on the two options? Is either of them better, and why? Is there any additional option I haven't figured out (especially one better than mine)?
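For reference, the two ideas could be sketched in Python roughly as below. This is only my reading of the description: the function names are made up, `queries` is assumed to be a list of zero-argument callables, and the `clock`/`sleep` parameters exist only so the logic can be exercised without real waiting.

```python
import time

INTERVAL_S = 100 / 2000  # 0.05 s per query


def run_paced(queries, clock=time.monotonic, sleep=time.sleep):
    """Idea 1: pad every query up to INTERVAL_S; slow queries are not padded."""
    for query in queries:
        started = clock()
        query()
        remaining = INTERVAL_S - (clock() - started)
        if remaining > 0:
            sleep(remaining)


def run_windowed(queries, limit=2000, window_s=100.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Idea 2: run at full speed, and once `limit` queries have been sent in
    the current window, sleep until the window has elapsed."""
    window_start = clock()
    sent = 0
    for query in queries:
        if sent == limit:
            remaining = window_s - (clock() - window_start)
            if remaining > 0:
                sleep(remaining)
            window_start = clock()
            sent = 0
        query()
        sent += 1
```

One way to compare them is to feed both the same list of simulated queries with a fake clock and look at the total elapsed time.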
What you actually need to consider is that the limit is 2000 requests per 100 seconds. You could do all 2000 requests in 10 seconds and still be on the good side of the quota.
I am curious as to why you are worried about it, though. If you get one of the following errors
403 userRateLimitExceeded
403 rateLimitExceeded
429 RESOURCE_EXHAUSTED
Google just recommends that you implement exponential backoff, which consists of making your request, getting the error, sleeping for a bit, and trying again (do this up to eight times). Google will not penalize you for getting these errors; they just ask that you wait a bit before trying again.
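A minimal Python sketch of that retry loop is shown below. `RateLimitError` is a hypothetical stand-in for however your client library surfaces the 403/429 responses, and the 1-second base delay with random jitter is an assumption, not a value taken from Google's documentation.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a 403 rateLimitExceeded / 429 response."""


def call_with_backoff(request, max_retries=8, base_delay_s=1.0,
                      sleep=time.sleep, rng=random.random):
    """Call request(); on a rate-limit error wait base_delay_s * 2**attempt
    plus up to one second of jitter, then retry, at most max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final retry
            sleep(base_delay_s * (2 ** attempt) + rng())
```

Usage would be something like `call_with_backoff(lambda: run_query(...))`, where `run_query` is whatever function issues your request and raises `RateLimitError` on a quota error.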
If you want to go crazy you can do something like what I did in my C# application: I created a request queue that I use to track how much time has passed since I made the last 100 requests. I call it Google APIs Flood Buster.
Basically I have a queue where I log each request as I make it. Before I make a new request, I check how long it has been since I started. Yes, this requires moving the items around the queue a bit. If more than 90 seconds have passed, I sleep for (100 - time elapsed) seconds. This has reduced my errors a great deal. It's not perfect, but that's because Google is not perfect with regard to tracking your quota; they are normally off by a little.
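The general idea of logging request times in a queue can be sketched in Python as a sliding-window limiter. To be clear, this is not the C# implementation described above (it does not use the 90-second heuristic); the class name and parameters are made up for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Keep the timestamps of recent requests and block while the window is full."""

    def __init__(self, limit=2000, window_s=100.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests inside the current window

    def wait_for_slot(self):
        """Block until another request fits in the window, then record it."""
        while True:
            now = self.clock()
            # Forget requests that have aged out of the window.
            while self.sent and now - self.sent[0] >= self.window_s:
                self.sent.popleft()
            if len(self.sent) < self.limit:
                self.sent.append(now)
                return
            # Window is full: sleep until the oldest request ages out.
            self.sleep(self.window_s - (now - self.sent[0]))
```

You would call `limiter.wait_for_slot()` right before each request. Since Google's own accounting may be slightly off from yours, combining this with backoff on errors is probably still a good idea.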
I tried to start 300 requests per second using the following values:
Number of Threads: 300
Ramp-up: 0
But I am getting the following results:
summary + 55 in 21s
summary + 225 in 31.1s
summary = 280 in 31.1s
What different configuration is needed to start all requests in one go?
IMHO, you have to take into account the time it takes to run the transaction itself.
I tend NOT to use synchronization, but to measure over a longer period of time, e.g. 15 minutes.
E.g., if your system is able to deliver one page in 2 seconds, you need to run AT LEAST 600 threads (300 requests per second × 2 seconds) to deliver the throughput you want, and probably more.
Also, remember that the time for a single page increases with the load, so a single measurement is not enough. AND take care of errors: you have to define an acceptable error threshold (e.g. 0.01%) and stop measuring when you go above it.
If you need to start all 300 requests at the same moment, you need to use the Synchronizing Timer
If you need to provide a constant load at a rate of 300 requests per second, you'll need the Constant Throughput Timer
The two above cover the majority of use cases; however, if you want more control over your load pattern, look into the Throughput Shaping Timer (available through JMeter Plugins)