Why are my localhost HTTP response times so slow? - performance

Using localhost and Tomcat 7, I'm seeing between 600-800ms per request in Chrome Developer Tools for a specific webapp. Requests are JS files, CSS files, images, or the initial server response. Some responses are less than 1KB, others are over 100KB.
As a result, it's taking around 10 seconds to load one page of the webapp. When I load the same webapp on our production server, it's taking less than 1 second to load an entire page.
I'm not sure where to continue debugging the issue...
- I've ruled out it being a browser issue by testing in Safari too.
- I've turned it off and on again, which reduced responses to 500-600ms overall.
- I've cleared out my log files.
- I've ruled out the webapp's frontend entirely by hitting a resource directly, e.g. http://ts.xyz.com:9091/1.0/toolsList/javascript/toolsList.js or http://ts.xyz.com:9091/awake
- I've tested another webapp, and that one performs lightning-quick.
So it has to be this particular app, and it has to be something local.

I've seen behaviour like this a long time ago, when the webserver (Apache httpd back then) was configured to do DNS lookups for its logs - these took an awfully long time, especially when an IP could not be resolved. As it doesn't make sense for a localhost app to be orders of magnitude slower (especially when you're talking about serving static resources), I'd check for any network-related issues: database connections, logging configuration, DNS lookups, TLS server trust issues (with backends, database, LDAP or others).
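If you want a quick way to test the reverse-DNS theory before digging through the connector and access-log settings, a small script along these lines (a sketch, assuming Node.js is available on the dev machine; the IP is a placeholder) times the lookup the server would be doing:

```typescript
// Sketch: time a reverse DNS lookup from the dev machine. If this takes
// hundreds of milliseconds (or times out), log/connector host lookups are
// a plausible explanation for the slow localhost responses.
import { promises as dns } from "dns";

async function timeReverseLookup(ip: string): Promise<void> {
  const start = Date.now();
  try {
    const hosts = await dns.reverse(ip);
    console.log(`${ip} -> ${hosts.join(", ")} in ${Date.now() - start} ms`);
  } catch (err) {
    console.log(`${ip} failed after ${Date.now() - start} ms: ${err}`);
  }
}

// Placeholder IP: use the client address Tomcat actually records in its access log.
timeReverseLookup("127.0.0.1");
```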
I can't decide whether to add this as "if everything else fails" or rather as "but first try this:"... you decide:
Compare the setup of your production server with your development server (localhost) and make extra extra extra sure that there's no meaningful difference.

Related

Is this a correct scenario to use WebSocket?

I have a browser plugin which will be installed on 40,000 desktops.
This plugin will connect to a backend configuration file available via https, e.g. http://somesite/config_file.js.
The plugin is configured to poll this backend resource once/day.
But there is only one backend server, so if all 40,000 endpoints start polling at the same time, the server might crash.
I could randomize the polling schedule from the desktop plugins, but randomization still does not guarantee that the server won't be overloaded.
Would using WebSockets in this scenario solve the scalability issue?
Polling once a day is very little.
I don't see any upside for Websockets unless you switch to Push and have more notifications.
However, staggering the polling does make a lot of sense, since syncing requests for the same time is like writing a DoS attack against your own server.
Staggering doesn't necessarily have to be random and IMHO, it probably shouldn't.
You could start with a fixed time and add a second per client ID, allowing for ~86K connections in 24 hours which should be easy for any server to handle.
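A minimal sketch of that kind of deterministic staggering (my own illustration, not code from the question; it hashes the ID so it also works when client IDs aren't sequential integers):

```typescript
// Derive a stable per-client offset (0..86399 seconds) from the client ID so
// each client polls at its own fixed second of the day instead of all at once.
function pollOffsetSeconds(clientId: string): number {
  let hash = 0;
  for (const ch of clientId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple rolling hash
  }
  return hash % 86_400;
}

// Milliseconds until this client's next polling slot, measured from midnight UTC.
function msUntilNextPoll(clientId: string, now: Date = new Date()): number {
  const midnightUtc = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  const delta = midnightUtc + pollOffsetSeconds(clientId) * 1000 - now.getTime();
  return delta > 0 ? delta : delta + 86_400_000; // slot already passed? wait for tomorrow's
}
```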
As a side note, 40K concurrent connections might not be as hard to achieve as you imagine.
EDIT (relating to the comments)
Websockets vs. Server Sent Events:
IMHO, when pushing data (vs. polling), I would prefer Websockets over Server Sent Events (SSE).
Websockets have a few advantages, such as client-side communication, which allows clients to ping the server and confirm that the connection is still alive.
The Specific Use-Case:
From the description in the question and the comments it seems that you're using browser clients with a custom plugin and that the updates you wish to install daily might require the browser to be active.
This raises different questions that affect the implementation (are the client browsers open all day? do you have any control over the client browsers and their environment? can you guarantee installation while the browser is closed?).
...
IMHO, you might consider having the client plugins test for an update each morning as they load for the first time during that day (first access).
People arrive at work at different times and open their browsers for the first time on different schedules, so the 40K requests you're expecting will be naturally scattered across that timeline (probably a 20-30 minute timespan).
This approach makes sure that the browsers and computers are actually open (making the update possible) and that the update requests are staggered over a period of time (about 33.3 requests per second, if my assumption is correct).
If you're serving a pre-written static configuration file (perhaps updated by the server daily), avoiding dynamic content and minimizing any database calls, then 33 req/sec should be very easy to manage.
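A rough client-side sketch of that first-access-of-the-day idea (hypothetical: the storage key is made up, and the URL is simply the one from the question):

```typescript
// Check for a new configuration only on the first page load of each calendar
// day, so the 40K clients naturally spread their requests across the morning.
const LAST_CHECK_KEY = "configLastCheckedOn"; // assumed storage key

async function maybeFetchDailyConfig(url: string): Promise<void> {
  const today = new Date().toISOString().slice(0, 10); // e.g. "2015-06-01"
  if (localStorage.getItem(LAST_CHECK_KEY) === today) {
    return; // already checked today
  }
  const response = await fetch(url, { cache: "no-store" });
  if (response.ok) {
    localStorage.setItem(LAST_CHECK_KEY, today);
    // ...apply the fetched configuration here
  }
}

maybeFetchDailyConfig("http://somesite/config_file.js");
```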

Any reason NOT to enable WSDL Caching for SOAP in Magento?

Can anyone see a reason not to enable the WSDL Cache in Magento?
I have an EPOS system that is periodically talking to Magento from outside the network. When it does this, the site suffers from a huge dip in speed, as it appears to struggle with the SOAP API. Even hitting the site with an https request like this:
https://[site-url]/api/v2_soap?wsdl=1
The response can take up to 10 seconds. Sometimes, when lots of these requests are made, the server grinds to a halt, and there are many sleeping connections left in the MySQL database.
On checking whether Magento is configured for WSDL caching, I notice that it isn't. I didn't develop the site, however, and I'm wondering if there are any legitimate reasons not to enable this feature?
Maybe this is obvious: for debugging.
I've experienced an issue (running Magento and PHP-FPM) where the WSDL cache became corrupted during a huge spike in traffic, which resulted in 503 errors whenever a SoapClient was constructed. That cache didn't clear through restarts of PHP-FPM, Apache, and the machine. Clearing the SOAP cache solved the problem, but it took some time to debug, and cache issues tend to be extra maddening.
I should say that I have no idea if this is a common issue, but the WSDL cache is a component that, like any component, can break.

Troubleshooting MVC4 Web API Performance Issues

I have an asp.net mvc4 web api interface that gets about 54k requests a day.
http://myserv.x.com/api/123/getstuff?whatstuff=thisstuff
I have 3 web servers behind a load balancer that are setup to handle the http requests.
On average, response times are ~300ms. However, lately something has gone awry (or maybe it has always been there): response times sporadically come back in 10-20 seconds. This happens for the same request hitting the same server directly, instead of through the load balancer.
GIVEN:
- System has been passed down to me, so there may be gaps in IIS configuration, etc.
- Database: SQL Server 2008R2
- Web Servers: Windows Server 2008R2 Enterprise SP1
- IIS 7.5
- Using MemoryCache aggressively with Model and Business Objects with eviction set to 2hrs
- Looked at the logs but really don't see anything significantly relevant
- One application pool...no other LOB applications running on this server
Assumptions & Ask:
Somehow I'm thinking that something is recycling the application pool, or IIS worker threads are shutting down and restarting, causing each new request to warm up and re-cache itself. It's so sporadic that it's tough to troubleshoot right now. The same request to the same server comes back as fast as expected (back-to-back N requests), in about 300ms, since it was cached... but wait about 5-20 minutes and that same request to the same server takes 16 seconds.
I have limited tracing to go by as these are prod systems, so I can only expose so much logging detail. Any help or information on attacking this, or on similar behavior somebody else has run into, is appreciated. Thx
UPDATE:
The w3wp.exe process grows to ~3GB. Somehow it gets wiped out and the PID changes, so either it or something else is killing it every 3-4 minutes. I see tons of warnings in my webserver (IIS) log:
A process serving application pool 'MyApplication' suffered a fatal
communication error with the Windows Process Activation Service. The
process id was '1732'. The data field contains the error number.
After 4-5 days of assessing IIS and configuration vs internal code issues, I finally found the problem, with little to no help from the windbg or DebugDiag IIS tools. Those tools contain so much information, even with mini dumps or log trace stacks, that they can be red herrings. The best bet was to reproduce it by setting up an intelligently copied instance of a production system, which we did not have at the time and which took ops a while to set up.
Needless to say, the problem had to do with over-caching business objects. There was a race condition where updates on a certain table were updating an attribute on the corresponding business object (updates were coming from multiple servers), which was causing an OOC stack overflow that pretty much caused the caching to recursively cache itself to death, making the w3wp.exe process die and pseudo-recycle itself. It was one of those edge cases that was incredibly hard to test and repro in a non-production environment.

How can I increase SSL performance with Elastic Beanstalk

I really like Elastic Beanstalk and managed to get my webapp (Spring MVC, Hibernate, ...) up and running using SSL on a Tomcat7 64-bit container.
A major concern to me is performance (I thought using the Amazon cloud would help here).
To benchmark my server performance I am using blitz.io (which uses the amazon cloud to have multiple clients access my webservice simultaneously).
My very first simple performance test already got me wondering:
I benchmarked a health check url (which basically just prints "I'm ok").
Without SSL: Looks fine.
13 Hits/s with a response time of 9ms
230 Hits/s with a response time of 8ms
With SSL: Not so fine.
13 Hits/s with a response time of 44ms (Ok, this should be a bit larger due to encryption overhead)
30 Hits/s with a response time of 3.6s!
Going higher left me with connection timeouts (timeout = 10s).
I tried using a larger EC2 instance in the background with essentially the same result.
If I am not mistaken, the Load Balancer before the EC2 Instances serves as an endpoint for SSL encryption. How do I increase this performance?
Can this be done with elastic beanstalk? Or do I need to setup my own load balancer etc.?
I also did some tests using Heroku (albeit with a slightly different technology stack, play! vs. SpringMVC). Here I also saw the increased response time, but it stayed mostly constant. I am assuming they are using quite performant SSL endpoints. How do I get that for Elastic Beanstalk?
It seems my testing method was flawed.
Amazon's Elastic Load Balancers seem to go up to 10k SSL requests per second.
See this great writeup:
http://blog.mattheworiordan.com/post/24620577877/part-2-how-elastic-are-amazon-elastic-load-balancers
SSL requires a handshake before a secure transmission channel is opened. Once the handshake is done, which involves several round trips, the data is transmitted.
When you are just hitting a page using a load tester, it is doing the handshake for each and every hit. It is not reusing an already established session.
That's not how browsers behave. A browser will do the handshake once and then reuse the open encrypted session for all subsequent requests for a certain duration.
So I would not be very worried about the results. I suggest you try a tool like www.browsermob.com to see how long a full page with many images, JS, CSS, etc. takes to load over SSL vs non-SSL. That will be a fair comparison.
Helps?
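As an illustration of the handshake cost (not the original benchmark; the URL is a placeholder), the sketch below compares requests over a reused keep-alive connection with requests that open a fresh TLS connection every time, which is roughly what the load tester was doing:

```typescript
// Illustrative comparison: with keepAlive the TLS handshake happens once and
// the connection is reused; without it, every request pays the full handshake.
import * as https from "https";

function timedGet(url: string, agent: https.Agent): Promise<number> {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    https
      .get(url, { agent }, (res) => {
        res.resume(); // drain the response body
        res.on("end", () => resolve(Date.now() - start));
      })
      .on("error", reject);
  });
}

async function compare(url: string): Promise<void> {
  const reused = new https.Agent({ keepAlive: true });
  const fresh = new https.Agent({ keepAlive: false, maxSockets: 1 });
  for (let i = 0; i < 5; i++) {
    console.log(
      `keep-alive: ${await timedGet(url, reused)} ms, ` +
        `new connection: ${await timedGet(url, fresh)} ms`
    );
  }
}

compare("https://example.com/health"); // placeholder health-check URL
```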

Long delay getting data from a server for one site but not 2 others

We're running 3 different Drupal (Pressflow to be specific) sites on the same server. The 2nd and 3rd sites were cloned from the first one and load just fine. The first one, though, is taking a few seconds to connect and start sending data back from the server. Same box, same config (as far as we know), same modules, and generally the same theme. Here's what Pingdom shows...
Fast site:
http://i.stack.imgur.com/YZilC.png
Slow site:
http://i.stack.imgur.com/6Um1M.png
Edit: Those are from Pingdom, the yellow indicating "The web browser is waiting for data from the server"
The configs are the same, the performance options are the same, and the server configs, as far as we can tell, are the same. The delay occurs before any page elements are visible, so it's not an on-page object problem or a page speed problem.
Could this be a config issue with the server? Where should we be investigating?
Thanks in advance!
I would try increasing the size of your MySQL cache. It's possible the fast sites have their particular queries cached and the slow site has a query that doesn't quite fit the cache, so the MySQL query results need to be regenerated each time.
Just a guess!
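If you want to check that guess, a small script along these lines (a sketch, assuming the mysql2 client library and placeholder credentials) prints the query cache settings and hit/miss counters:

```typescript
// Print MySQL query cache configuration and runtime statistics so you can see
// whether the cache is undersized or seeing a lot of misses/prunes.
import mysql from "mysql2/promise";

async function showQueryCacheStats(): Promise<void> {
  // Placeholder credentials: replace with the Drupal site's database settings.
  const conn = await mysql.createConnection({
    host: "localhost",
    user: "drupal",
    password: "secret",
  });
  const [variables] = await conn.query("SHOW VARIABLES LIKE 'query_cache%'");
  const [status] = await conn.query("SHOW GLOBAL STATUS LIKE 'Qcache%'");
  console.table(variables);
  console.table(status);
  await conn.end();
}

showQueryCacheStats();
```

Low Qcache_hits relative to Qcache_inserts, or a steadily climbing Qcache_lowmem_prunes counter, would support the undersized-cache theory.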
