I have a for loop that iterates over an array. For each item in the array, it calls a function that makes django-rest-framework requests. Each function call is independent of the others.
If the array has 25 items, it currently takes 30 seconds to complete. I am trying to get the total time down to less than 10 seconds.
Half the time spent in the function is taken up by DRF requests. Would it make sense to replace the for loop with a multiprocessing Pool? If so, how do I ensure each process makes requests over a separate connection using the requests package?
I tried just replacing:
for scenario_id in scenario_ids:
    step_scenario_partial(scenario_id)
with:
pool = Pool(processes=2)
pool.map(step_scenario_partial, scenario_ids)
which failed due to OpenSSL.SSL.Error: [('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')]
According to this, the error was due to re-using the same SSL connection in more than one process.
You can use the concurrent.futures module from the Python standard library (docs), which can execute tasks in parallel. An example function that returns a list of the results:
from concurrent import futures

def execute_all(scenario_ids, num_workers=5):
    '''
    Method to make parallel API calls
    '''
    with futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
        return list(executor.map(step_scenario_partial, scenario_ids))
The ThreadPoolExecutor runs the calls concurrently on a pool of threads, which helps with the portion of the time spent waiting on HTTP requests. You can experiment with values of num_workers, starting with 5, until the total execution time is under 10 seconds.
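On the question of keeping connections separate: with threads, one common approach is to give each worker thread its own session object through threading.local. The following is a minimal sketch of that idea, not the method from the answer above. The names get_session and run_all are hypothetical, and factory stands in for whatever creates a session (for example requests.Session), which is an assumption about how step_scenario_partial could be adapted to accept a session.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Storage that holds a separate value for every thread.
_local = threading.local()

def get_session(factory):
    """Return this thread's own session, creating it on first use."""
    if not hasattr(_local, "session"):
        _local.session = factory()
    return _local.session

def run_all(work, items, factory, num_workers=5):
    """Run work(session, item) for each item; results keep the input order.

    `work` and `factory` are hypothetical placeholders supplied by the caller.
    """
    def call(item):
        return work(get_session(factory), item)

    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        return list(executor.map(call, items))
```

Assuming the function were changed to take the session as its first argument, a call might look like run_all(step_scenario_partial, scenario_ids, requests.Session). Each worker thread would then reuse its own connection pool instead of sharing one.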
I am using Locust, and my code is as follows:
import time

from locust import TaskSet, constant, task
from locust.contrib.fasthttp import FastHttpUser

class RecommenderTasks(TaskSet):
    @task
    def test_recommender_multiple_platforms(self):
        # Wall-clock time around the request, in milliseconds
        start = round(time.time() * 1000)
        self.client.get('recommendations', name='Test')
        end = round(time.time() * 1000)
        print(end - start)

class RecommenderUser(FastHttpUser):
    tasks = [RecommenderTasks]
    wait_time = constant(1)
    host = "https://my-host.com/"
When I test with this code, I get the following output times
374
62
65
68
64
I am not sure why the very first task time alone is about 300+ ms and the rest are as expected. With this, my overall average time also increases. Could you please help me here?
Locust response times are measured from the time the initial request is sent to the server to the time a response is received. By default Locust reuses socket connections when available but creates new ones when none is available. When connecting via HTTPS, the first request has to establish the connection, which includes the TCP and TLS handshakes, so it takes longer than the requests that reuse that connection. The cost of that setup generally depends on what the server is doing. You could look into ways of reducing your connection setup time. How to do that will vary widely depending on your stack, but you can find general principles in SO answers like this one:
how to reduce ssl time of website
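To see how much the first, connection-establishing request skews the average, the timings printed in the question can be averaged with and without it. This is only an arithmetic sketch using the five sample values above; the variable names are illustrative.

```python
from statistics import mean

# Timings in milliseconds, copied from the output in the question.
timings_ms = [374, 62, 65, 68, 64]

# Average over every request, including the one that set up the connection.
overall = mean(timings_ms)

# Average over the requests that reused the existing connection.
warm_only = mean(timings_ms[1:])

print(f"all requests: {overall:.1f} ms")
print(f"excluding the first: {warm_only:.2f} ms")
```

With so few samples the single slow request dominates the mean. Over a longer run its effect on the average shrinks.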
I am currently doing performance testing for an application. We need to test with a number of concurrent users (e.g. 300). We are using a Stepping Thread Group and it is working fine.
The test runs for about 38 minutes. At some point, when the server memory is overloaded, the memory is cleaned and the server restarts, which takes 10 to 20 seconds. During that time we get 502 Bad Gateway responses.
We have about 6 modules (each in a Transaction Controller), and each controller has 20 to 30 API calls.
I want to pause for 20 seconds when we first encounter a 502. Is that possible? I could use an If Controller, but adding one to every call to check whether the previous sample was OK would be very time-consuming. Is there another way?
I would check the response code in a PostProcessor and, if it is 502 Bad Gateway, put the current thread to sleep. The JMeter API exposes the current thread through JMeterThread getThread() on JMeterContext:
JMeterContext jmctx = JMeterContextService.getContext();
JMeterThread currentThread = jmctx.getThread();
Note that JMeterThread implements Runnable rather than extending java.lang.Thread, so it has no sleep() method of its own. Each JMeter thread runs on its own Java thread, so calling the static Thread.sleep() from the PostProcessor pauses only that virtual user:
Thread.sleep(20000);
More samples are here:
https://www.programcreek.com/java-api-examples/?api=org.apache.jmeter.threads.JMeterContext
I want to test a rate-limiting app with Ruby where I define different behavior based on the number of requests per second.
For example, if I see 300 requests per second or more, I want it to respond with a block.
But how would I test this by generating 300 requests per second in Ruby? I understand there are hard limits imposed by the CPU, for example, but if I kept the number well below those limits, how would I send traffic at a controlled rate, both just above the threshold and just below it?
Just looping N-times doesn't guarantee me the throughput.
The quick and dirty way is to spin up 300 threads that each do one request per second. The more elegant way is to use something like EventMachine to create requests at the required rate. With the right non-blocking HTTP library it can easily generate that level of activity.
You also might try these tools:
ab, the Apache benchmarking tool, common on many systems. It's very good at abusing your system.
Siege for load testing.
How about a minimal homebrew solution:
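OPS_PER_SECOND = 300
count = 0
duration = 10 # seconds
start = Time.now
while true
  elapsed = Time.now - start
  break if elapsed >= duration
  # Request number `count` is due at count / OPS_PER_SECOND seconds,
  # so sleep only when we are ahead of that schedule.
  delay = (count - (elapsed * OPS_PER_SECOND)) / OPS_PER_SECOND
  sleep(delay) if delay > 0
  do_request
  count += 1
end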
I am running a load test with JMeter and with the Python Requests package, but I get different results when accessing the same website.
target website: http://www.somewebsite.com/
number of requests: 100
avg response time for JMeter: 1965ms
avg response time for Python Requests: 4076ms
I have checked that the response HTML from JMeter and from Python Requests is the same, so both got the correct response from the website. I am not sure why there is a twofold difference between them. Does anyone know the underlying reason?
The Python Requests sample code:
import datetime

import requests

repeat_time = 100
url = 'http://www.somewebsite.com/'
headers = {}  # placeholder: the headers used in the original test were not shown
base_time = datetime.datetime.now()
time_cost = base_time
for i in range(repeat_time):
    start_time = datetime.datetime.now()
    r = requests.get(url, headers=headers)
    end_time = datetime.datetime.now()
    print(str(r.status_code) + ';time cost: %s' % (end_time - start_time))
    # Accumulate the per-request durations on top of base_time
    time_cost += (end_time - start_time)
print('total time: %s' % (time_cost - base_time))
print('average time: %s' % ((time_cost - base_time).total_seconds() / repeat_time))
Without your JMeter code, I can't tell you what the difference is, but let me give you an idea of what's happening in that one call to requests:
1. We create a Session object, plus the urllib3 connection pools we use
2. We do a DNS look-up for 'www.somewebsite.com', which shouldn't affect this request much
3. We open a socket for 'www.somewebsite.com:80'
4. We send the request
5. We receive the first byte of the response
6. We determine whether the user wanted to stream the body of the response; if not, we read all of it and cache it locally.
Keep in mind that the three most intensive parts (usually) are:
DNS lookup (for various reasons, but as I already said, it shouldn't be a problem here)
Socket creation (this is always an expensive operation)
Reading the entirety of the body and caching it locally.
That said, each response object should have an attribute, elapsed, which will give you the time to the first byte of the response body. In other words, it measures the time between when the request is actually sent and when the end of the headers is found.
That might give you far more accurate information than what you're measuring now, which is the time to the last byte of the message.
That said, keep in mind that what you're doing in that for-loop is also invoking the garbage collector a lot:
1. Create Session, its adapters, the adapters' connection pools, etc.
2. Create socket
3. Discard socket
4. Discard Session
5. Go to 1
If you create a session once, your script will perform better in general.
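A minimal sketch of that suggestion, combined with the elapsed attribute mentioned earlier, might look like the following. The function name average_elapsed is hypothetical, and the session is passed in by the caller; the sketch assumes it behaves like requests.Session, whose get() returns a response with an elapsed timedelta.

```python
import datetime

def average_elapsed(session, url, repeat=100):
    """Average response.elapsed, in seconds, over `repeat` GETs on one shared session.

    `session` is assumed to behave like requests.Session.
    """
    total = datetime.timedelta()
    for _ in range(repeat):
        response = session.get(url)
        # elapsed covers sending the request up to parsing the response headers
        total += response.elapsed
    return total.total_seconds() / repeat
```

Called as average_elapsed(requests.Session(), 'http://www.somewebsite.com/'), the requests after the first can reuse the pooled connection rather than opening a new socket each time, provided the server keeps the connection alive.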
I need to regulate how often a Mechanize instance connects to an API: at most once every 2 seconds.
So this:
instance.pre_connect_hooks << Proc.new { sleep 2 }
I had thought this would work, and it sort of does, BUT now every method in that class sleeps for 2 seconds, as if the Mechanize instance is told to hold for 2 seconds whenever it is touched. I'm going to try a post-connect hook, but it is obvious I need something a bit more elaborate; I just don't know what at this point.
The code explains it better, so if you are interested in following along: https://github.com/blueblank/reddit_modbot. Otherwise, my question is how to efficiently and effectively rate limit a Mechanize instance to within a time frame specified by an API (where overstepping that limit results in dropped requests and bans). I'm also guessing I need to better integrate the Mechanize instance into my class, so any pointers on that are appreciated as well.
Pre and post connect hooks are called on every connect, so if there is some redirection it could trigger many times for one request. Try history_added which only gets called once:
instance.history_added = Proc.new {sleep 2}
I use SlowWeb to rate limit calls to a specific URL.
require 'slowweb'
SlowWeb.limit('example.com', 10, 60)
In this case calls to example.com domain are limited to 10 requests every 60 seconds.
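For comparison, the "one call every 2 seconds" requirement from the question can also be met by waiting only for whatever is left of the interval since the previous call, instead of always sleeping a fixed 2 seconds. The sketch below shows that idea in Python; it is not how Mechanize or SlowWeb work internally. The class name MinIntervalLimiter is hypothetical, and the clock and sleep parameters are injected only so the behaviour can be checked without real waiting.

```python
import time

class MinIntervalLimiter:
    """Ensure consecutive calls to wait() return at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous permitted call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Sleep only for the part of the interval that has not yet passed.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A caller would create one limiter with interval 2 and call wait() immediately before each API request. Code that does not touch the API is then never delayed.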