ColdFusion requests never time out for LDAP calls

I have an application running on CF8 that often calls external systems such as a search engine and LDAP servers. At times, some requests never get a response and stay in the active request list indefinitely.
Even though a request timeout is set in the administrator, it is not applied in these scenarios.
I have around 5 requests that have been pending for the last 20 hours!
My server settings are as follows:
Timeout Requests after (seconds): 300
Maximum number of simultaneous requests: 20
Maximum number of running JRun threads: 50
Maximum number of queued JRun threads: 1000
Timeout requests waiting in queue after 300 seconds
I read through some articles and found that there are cases where threads never get a response and are never killed. But I don't have a solid solution for timing these requests out or killing them automatically.
I would really appreciate any ideas on this :)

The ColdFusion timeout does not apply to 'third party' connections.
A long-running LDAP query, for example, will take as long as it needs. When the calling template gets the result from the query your timeout will apply.
This often leads to confusion when interpreting errors: the error will blame whichever function runs right after the long-running request for causing the timeout.
Further reading available here

You can (and probably should) set a timeout on the CFLDAP call itself. http://help.adobe.com/en_US/ColdFusion/9.0/CFMLRef/WSc3ff6d0ea77859461172e0811cbec22c24-7f97.html

Thanks, Antony, for recommending my blog entry CF911: Lies, Damned Lies, and CF Request Timeouts...What You May Not Realize. This problem of requests not timing out when expected can be very troublesome and a surprise for most.
But Anooj, while that at least explains WHY they don't die (and you can't kill them within CF), one thing to consider is that you may be able to kill them in the REMOTE server being called, in your case, the LDAP server.
You may be able to go to the administrator of THAT server and on showing them that CF has a long-running request, they may be able to spot and resolve the problem. And if they can, that may free the connection from CF and your request then will stop.
I have just added a new section on this idea to the bottom of that blog entry, as "So is there really nothing I can do for the hung requests?"
Hope that helps.


Response code 500 in JMeter when running with threads

I am getting the following error in JMeter while running a list of APIs (number of threads: 1-140, ramp-up period: 1).
Response code: 500
Response message: Internal Server Error
How can I overcome this error response code in order to get accurate responses?
What should I do to reduce the number of responses with this response code?
In general a 500 is an unhandled response on the part of a developer. Usually on the backend but also on the performance testing tool front end.
Ask yourself: are you validating the responses that come back from the server for appropriate content? An HTTP 200 alone is not proof of a valid response. You need to check that the response content is what you expect for the business process, because a completely valid HTTP 200 page can contain a response that sends your business process off the rails. If you do not handle the unexpected response, then one or two steps further down the business process you are pretty much guaranteed to hit a 500, because your request is completely out of context with the state of the application at that point.
Testing 101: for every step there is an expected, positive result that allows the business process to continue. Check for that result and branch your code when you do not find it.
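As a rough illustration of "check for the expected result and branch", here is a minimal sketch in plain JavaScript. The helper name checkStep, the response shape, and the marker text "Order confirmed" are hypothetical and not part of JMeter; in JMeter itself the same idea is normally expressed with an assertion on the sampler.

```javascript
// Hypothetical helper: decides whether a step produced the expected result.
// `response` is assumed to look like { status: number, body: string }.
function checkStep(response, expectedMarker) {
  // A non-2xx status is a failure regardless of content.
  if (response.status < 200 || response.status >= 300) {
    return { ok: false, reason: "unexpected status " + response.status };
  }
  // A 200 is not enough: the body must contain what the business step expects.
  if (!response.body.includes(expectedMarker)) {
    return { ok: false, reason: "expected content not found" };
  }
  return { ok: true, reason: "" };
}

// Both responses are HTTP 200, but only the first lets the process continue.
const good = checkStep({ status: 200, body: "<h1>Order confirmed</h1>" }, "Order confirmed");
const bad = checkStep({ status: 200, body: "<h1>Session expired</h1>" }, "Order confirmed");

if (!bad.ok) {
  // Branch here instead of sending the next request out of context.
  console.log("stopping this iteration:", bad.reason);
}
```

The point of the branch is that the next request is never sent when the previous step did not leave the application in the expected state.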
Or, if this is a single step business process then you are likely handing the service poor data and the developer has not fully fleshed out the graceful part of dealing with your poor data.
The general advice in JMeter is to start with ramp-up = number of threads (in your case 140 seconds) and adjust up or down as needed.
Currently you start a new thread every 1/140 of a second, which is almost simultaneous. The reason for the change is:
Ramp-up needs to be long enough to avoid too large a work-load at the start of a test
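To make the arithmetic concrete, here is a small sketch. It assumes threads are started evenly across the ramp-up period; the function name is made up for illustration.

```javascript
// Assumption: threads start at even intervals across the ramp-up period.
function secondsBetweenThreadStarts(threads, rampUpSeconds) {
  return rampUpSeconds / threads;
}

// Settings from the question: 140 threads, ramp-up of 1 second.
const currentInterval = secondsBetweenThreadStarts(140, 1);

// Suggested starting point: ramp-up equal to the number of threads.
const suggestedInterval = secondsBetweenThreadStarts(140, 140);

console.log("current:   one new thread every " + (currentInterval * 1000).toFixed(1) + " ms");
console.log("suggested: one new thread every " + suggestedInterval + " s");
```

With the original settings a new thread starts roughly every 7 ms. With ramp-up equal to the thread count, one thread starts per second.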
Status code 500 comes from the server/APIs; it is not a JMeter issue. Sometimes concurrent requests are rejected because the server is too weak to handle that number of requests. In my case, I asked my server team to scale up the servers so that we could test the underlying API. It's worth mentioning that JMeter itself sometimes runs out of memory. You can tweak the set HEAP=-Xms512m -Xmx512m property in the JMeter executable file. Listeners also consume a lot of resources, so try not to use them.

Does JMeter show the correct average response time for the first page it hits for many virtual users?

I'm load testing a system with 500 virtual users. I've set the "Ramp-Up period (in seconds)" option to zero, so as I understand it, JMeter will hit the system with all 500 virtual users at the same time. Please correct me if I'm wrong here.
Now, the summary report shows that the average response time for the first page is ~100 seconds, which is more than a minute and a half of wait time. But while JMeter was running, I manually opened the same page/URL in a browser and didn't have to wait nearly that long. It was not even close: the page response was almost immediate for me.
My question is: is there any known issue for the average response time of the first page? Is it JMeter which is taking long to trigger that many users?
Thanks in advance.
--Ishtiaque
There is no issue in JMeter related to first-page response time.
The Summary Report shows all response times in milliseconds. Did you convert the value from milliseconds to get "100" seconds?
Also, to make sure that 500 users hit concurrently, use a Synchronizing Timer.
Hope this will help.
While the response times will be accurate, you need to consider the effect of starting so many threads at once on both your server and your client.
Starting 500 threads at once is not insignificant on the client. If your server has the connections, it will start 500 threads as well.
Ramping over a period of time is more realistic loadwise, but still not really indicative of server capability until the threads have all started and settled in.
Databases can also require a settling in period which can affect response times.
An alternative to ramping is to introduce a random wait at the start of each thread before firing the first sample. You can then choose not to ramp over time, but you should still expect resources on the client to come under load suddenly, and change the settings if you hit limits. This will make the entire run much more representative of typical behaviour. However, you need to determine whether your use cases are typical.
Even after I increased the heap size, I noticed that the measured time was still longer than the actual response time. Later I realised it was the probe effect (the extra time a tool adds through the act of executing the test).

Occasional slow requests on Heroku

We are seeing inconsistent performance on Heroku that is unrelated to the recent unicorn/intelligent routing issue.
This is an example of a request which normally takes ~150ms (and 19 out of 20 times that is how long it takes). You can see that on this request it took about 4 seconds, or between 1 and 2 orders of magnitude longer.
Some things to note:
the database was not the bottleneck, and it spent only 25ms doing db queries
we have more than sufficient dynos, so I don't think this was the bottleneck (20 double dynos running unicorn with 5 workers each; we get only 1000 requests per minute with an avg response time of 150ms, which means we should be able to serve (60 / 0.150) * 20 * 5 = 40,000 requests per minute). In other words, we had 40x the capacity on dynos when this measurement was taken.
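The capacity estimate can be reproduced with a few lines. This is only a back-of-the-envelope sketch: it assumes each unicorn worker handles one request at a time at the average response time, and it ignores variance and queueing. The function name is illustrative.

```javascript
// Rough throughput estimate, assuming one request at a time per worker.
function capacityPerMinute(dynos, workersPerDyno, avgResponseSeconds) {
  const requestsPerWorkerPerMinute = 60 / avgResponseSeconds;
  return requestsPerWorkerPerMinute * dynos * workersPerDyno;
}

// Figures from the question: 20 dynos, 5 workers each, 150 ms average.
// Math.round guards against floating-point noise in 60 / 0.150.
const capacity = Math.round(capacityPerMinute(20, 5, 0.150));
const observedPerMinute = 1000;
const headroom = capacity / observedPerMinute;

console.log(capacity + " requests/minute, " + headroom + "x the observed load");
```

Headroom computed from an average says little about the slow tail: a single worker stuck on a slow request can still delay whatever is queued behind it.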
So I'm wondering what could cause these occasional slow requests. As I mentioned, anecdotally it seems to happen in about 1 in 20 requests. The only thing I can think of is there is a noisy neighbor problem on the boxes, or the routing layer has inconsistent performance. If anyone has additional info or ideas I would be curious. Thank you.
I have been chasing a similar problem myself, with not much luck so far.
I suppose the first order of business would be to recommend NewRelic. It may have some more info for you on these cases.
Second, I suggest you look at queue times: how long your request was queued. Look at NewRelic for this, or do it yourself with the "start time" HTTP header that Heroku adds to your incoming request (just print now() minus "start time" as your queue time).
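A minimal sketch of the "now() minus start time" calculation follows. It assumes the header is named X-Request-Start and carries a Unix timestamp in milliseconds; check the header your router actually sends. The function name and the sample values are made up for illustration.

```javascript
// Assumption: the router adds "x-request-start" holding epoch milliseconds.
// Returns the queue time in ms, or null if the header is missing or malformed.
function queueTimeMs(headers, nowMs) {
  const raw = headers["x-request-start"];
  if (raw === undefined || raw === "") return null;
  const startMs = Number(raw);
  if (!Number.isFinite(startMs)) return null;
  // Clock skew between router and dyno can make this negative; clamp to zero.
  return Math.max(0, nowMs - startMs);
}

// Example with fixed numbers; a real handler would pass Date.now() as nowMs.
const sampleHeaders = { "x-request-start": "1700000000000" };
const queued = queueTimeMs(sampleHeaders, 1700000000350);
console.log("queue time (ms):", queued);
```

Logging this value per request makes it possible to tell whether a slow request spent its time waiting in front of the app or inside it.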
When those failed me in my case, I tried coming up with things that could go wrong, and here's a (unorthodox? weird?) list:
1) DNS -- are you making any DNS calls in your view? These can take a while. Even DNS requests for resolving DB host names, Redis host names, external service providers, etc.
2) Log performance -- Heroku collects all your stdout using their "Logplex", which it then drains to your own defined logdrains, services such as Papertrail, etc. There is no documentation on the performance of this, and writes to stdout from your process could block, theoretically, for periods while Heroku is flushing any buffers it might have there.
3) Getting a DB connection -- not sure which framework you are using, but maybe you have a connection pool that you are getting DB connections from, and that took time? It won't show up as query time, it'll be blocking time for your process.
4) Dyno performance -- Heroku has an add-on feature that will print, every few seconds, some server metrics (load avg, memory) to stdout. I used Graphite to graph those and look for correlation between the metrics and times where I saw increased instances of "sporadic slow requests". It didn't help me, but might help you :)
Do let us know what you come up with.

Apache Makes some AJAX Request Behave Synchronously

I have this strange issue where sometimes if I make two AJAX requests to my Apache 2.2 server in rapid succession, the second request will wait for the first to finish before finishing.
For example, I have two requests: one that sleeps for 10 seconds and one that returns immediately. If I run the request that returns immediately by itself, it always returns within 300ms. However, if I call the request that takes 10 seconds and then call the request that returns right away, about 50% of the time the second request waits until the first finishes, and Chrome reports that the request took about 10 seconds before receiving a response. The other half of the time the quick request returns right away.
I can't find any pattern to make it behave one way or another, it will just randomly block the quick AJAX requests sometimes, and other times it will behave as expected. I'm working on a dev server that only I am accessing and I've set several variables such as MaxRequestsPerChild to a high value.
Does anyone have any idea why Apache, seemingly at random, is turning my AJAX requests into synchronous requests?
Here is the code I'm running:
$.ajax({async:true,dataType:'json',url:'/progressTest',success:function(d){console.log('FINAL',d)}}); // Sleeps for 10 seconds
$.ajax({async:true,dataType:'json',url:'/progressTestStatus',success:function(d){console.log('STATUS',d)}}); // Takes ~300ms
And here are two screen shots. The first where it behaved as expected and the second where it waited for the slow process to finish first (in the example the timeout was set to 3 seconds).
UPDATE: Per the comments below - this appears to be related to Chrome only performing one request at a time. Any ideas why Chrome would set such a low limit on async requests?
The problem is not with Apache but with Google Chrome limiting the number of concurrent requests to your development server. I can only make guesses as to why it's limited to one request. Here are a couple:
1) Do you have many tabs open? There is a limit to the total number of concurrent connections, and if you have many tabs making requests with KeepAlive you may be at that limit and only able to establish one connection to your server. If that's the case, you might be able to fix it by adding KeepAlive to your own output headers.
2) Do you have any extensions enabled? Some extensions do weird things to the browser. Try disabling all your extensions and making the same requests. If it works, enable them one at a time to find the culprit.

How to simulate browser timeout of ajax request?

I'm trying to secure my web application against timeouts of ajax requests.
To do it, I obviously need to simulate such a timeout.
From what I've found here:
http://kb.mozillazine.org/Network.http.connect.timeout#Background
the firefox timeout is system-dependent and from what I've found here: http://support.microsoft.com/kb/181050 the IE timeout period is 60 minutes by default.
So I see the following ways to simulate a timeout:
make the server wait 60 minutes (yuck ;))
change the IE timeout period to a smaller value (which requires registry changes)
configure a proxy between the client and the server and make it timeout
All of the options above seem like overkill to me. Does anyone know an easier way (possibly in a different browser)?
Thanks!
Wouldn't it be much easier to simply set the ajax timeout to 1 millisecond? Even on localhost it will always time out at that value. This is the method I always use. The only thing you don't exercise with this approach is the actual "feel" that your preferred timeout period gives the end user (i.e., does 3 seconds feel long, is 2 seconds too short). But if you're just looking to exercise the code in the error path, this does the trick for me.
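Here is a minimal sketch of that approach using the standard XMLHttpRequest timeout property and ontimeout handler. The function name, the URL, and the injectable constructor are assumptions added for illustration. The fake class exists only so the error path can be exercised outside a browser.

```javascript
// Sends a GET with a 1 ms timeout so the timeout handler is almost certain to fire.
// XhrCtor is optional; in a browser, omit it to use the real XMLHttpRequest.
function requestWithForcedTimeout(url, onTimeout, onSuccess, XhrCtor) {
  const Ctor = XhrCtor || XMLHttpRequest;
  const xhr = new Ctor();
  xhr.open("GET", url, true);
  xhr.timeout = 1; // milliseconds
  xhr.ontimeout = function () { onTimeout(url); };
  xhr.onload = function () { onSuccess(xhr.responseText); };
  xhr.send();
  return xhr;
}

// Illustration-only stand-in that behaves like a request that always times out.
class FakeTimingOutXhr {
  open(method, url) { this.method = method; this.url = url; }
  send() { if (this.ontimeout) this.ontimeout(); }
}

const events = [];
requestWithForcedTimeout(
  "/some/endpoint", // hypothetical URL
  function (url) { events.push("timeout:" + url); },
  function () { events.push("ok"); },
  FakeTimingOutXhr
);
console.log(events);
```

In application code the timeout value would normally come from configuration, so a test build can drop it to 1 ms without touching the handlers under test.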
In the end, the easiest way for me was to simulate the timeout by setting ReceiveTimeout in the registry under HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Internet Settings
as described here:
http://support.microsoft.com/kb/181050
Darshan's solution might also work, but I just tested the above. Thank you all for help!
What's the harm in setting KeepAliveTimeout in the registry under
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings?
More information can be found here:
http://support.microsoft.com/kb/181050
It's simple: set the timeout to 10 (the value is in milliseconds),
like this: xhr.timeout = 10;
