While debugging a performance issue with Dynatrace, I observed that my application is spending a lot of time in _JSP Service. What does that mean, and how can I fix it?
Note: after restarting the application the response time is 8 seconds, but with 15 users working in parallel the performance gradually degrades to 115 seconds on average. We are using WebLogic Server 11g; the Dynatrace image is attached.
We are working on an ASP.NET 5 Web API project that is now in production, but we are experiencing an issue where it intermittently becomes unresponsive throughout the day.
A few notes about the application architecture. It is an ASP.NET Web API project using a MariaDB database on a separate EC2 instance within the same private network. The connection string uses the private IP of the database server to avoid any name resolution issues. The site is hosted via IIS 10.
The application itself has been developed carefully, following the best practices provided by Microsoft: a heavy focus on async operations, minimizing query response times, and offloading more expensive operations into background services.
The app is extremely responsive. It performs with sub-100 ms responses on almost all requests, even the more complicated ones, and this high level of performance holds right up until it becomes unresponsive. We tend to see between 10-30 requests per second and 300-500 SELECT queries per second at peak usage, so nothing too extreme. However, randomly (2-3 times over a 24-hour period) it will begin hanging on requests and simply not respond. During this time the database is still extremely responsive, and we are never over 300 connections out of our 512-connection limit.
The resources on the application server itself are never really taxed much at all. The CPU never gets above ~20% and the memory usage sits around 20-30%.
If I stop the site in IIS and start it again while this is happening, it quickly comes back online. If I don't, it stays down for a few minutes until IIS finally kills it due to a failed health check. There are no real errors generated in response to the issue other than the typical errors caused by the hanging process, such as connection-terminated errors. The only thing that gave me pause was that there were a few connection timeouts when getting a connection from the pool, but as I said, the connections to the server are never close to the limit.
Also, this app and version has been in production for months, and it wasn't until the traffic volume started to grow that we started seeing these issues. At this point I am at a loss for next troubleshooting steps and am seeking suggestions.
In the IIS App Pool advanced settings, set Start Mode to AlwaysRunning.
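If you prefer to script it, the same setting can be applied with appcmd; a minimal sketch, assuming an application pool named "MyAppPool" (substitute your own pool name):

    %windir%\system32\inetsrv\appcmd set apppool "MyAppPool" /startMode:AlwaysRunning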
I never found a root cause for this issue; however, after updating to newer versions of .NET MVC the issue went away. My best guess is that changes in Kestrel resolved it, although I have no idea which specific change that might have been. I have gone through the change logs a few times and didn't see anything that specifically jumped out at me.
The web application runs on Spring Boot and is deployed on WebLogic.
We have configured a maximum of 400 threads and a JDBC pool of 100 connections.
When we perform load testing on the web application, the performance is optimal while the load is low (the response time is less than 200 ms for most of the HTTP requests we call).
When we increase the load, we can see that the thread count and the JDBC connection count increase gradually, but nowhere near the maximum. However, the response time gets much longer, and it can take more than 5 seconds to respond.
CPU usage, thread count, memory, and JDBC connections all seem normal during this period.
Another observation: while the test was running and performance was degrading, we used another machine to make an HTTP call to the server that only returns text, with no DB access or logic, and even this simple call took 10 seconds to respond. (And the server resources were still not maxed out!)
So we are wondering: what keeps the requests waiting?
Any other possible bottleneck?
If the server isn't short of resources like CPU/RAM, only a profiler can tell you where your application spends most of its time, which might be:
Waiting in a queue for the next thread/DB connection from the pool to become available
Slow database queries
Inefficient functions/algorithms that are candidates for optimization
WebLogic configuration not suitable for high loads
JVM configuration not suitable for high loads (e.g. the system does garbage collection too often or for too long)
So I would recommend re-running your test with profiler telemetry enabled while at the same time monitoring essential JVM metrics, e.g. with the JMXMon Sample Collector, which can be used for monitoring your application-specific metrics as well. It's a plugin that can be installed via the JMeter Plugins Manager.
For a detailed approach to identifying poor thread performance, I suggest you take a look at the TSA Method by Brendan Gregg.
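In the same spirit as the TSA Method, here is a minimal sketch of dumping every thread's state and top stack frame from inside the JVM (the class name is just an example; run it inside the application or use jstack/JVisualVM for the same information). If many request threads sit in WAITING or TIMED_WAITING inside the connection pool or the container's request queue, that points at the first item in the list above:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ThreadStateDump {
        public static void main(String[] args) {
            // Dump all live threads, including locked monitors and synchronizers
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
                StackTraceElement[] stack = info.getStackTrace();
                // Thread name, state (RUNNABLE/WAITING/BLOCKED/...), top stack frame
                System.out.printf("%-45s %-15s %s%n",
                        info.getThreadName(),
                        info.getThreadState(),
                        stack.length > 0 ? stack[0] : "<no frame>");
            }
        }
    }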
I am getting a 503 error when running JMeter with 400 thread users. Is it because of a server issue? When I run the thread group with 100 users and a ramp-up period of 25 seconds it works fine, but with 400 users it gives a 503 error.
Given that you don't experience any issues with 100 users but do with 400 users, it is most probably a server issue connected with overload, so congratulations on finding the bottleneck.
You can either report it as is or investigate a little deeper to find the cause. Suggested steps:
Instead of kicking off 400 users at once, try increasing the load gradually while looking at the Response Times vs Threads and Transaction Throughput vs Threads charts (see the sketch after this list). Ideally, response time should remain the same and throughput should grow as the number of threads increases. When response time starts increasing and throughput starts decreasing, that indicates the saturation point, and at this stage you can state that this is the maximum number of users your application can support.
Check your application logs and configuration, as it might not be properly tuned for high loads; you can use 15 Simple ASP.NET Performance Tuning Tips as a reference or look for a similar guide for your application's technology stack.
Ensure that your application has enough headroom to operate in terms of CPU, RAM, network, etc., as it might simply be a lack of resources; this can be checked using e.g. the JMeter PerfMon Plugin.
Repeat your test with profiler telemetry in place; this way you will be able to localize the problem and state exactly where the problematic piece of code or inefficient algorithm lives.
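A minimal sketch of stepping the load up instead of starting all 400 users at once: parameterise the Thread Group with a JMeter property and re-run the plan in non-GUI mode at increasing levels (plan.jmx and the property name "threads" are placeholders here):

    # In the Thread Group, set "Number of Threads (users)" to ${__P(threads,100)}
    jmeter -n -t plan.jmx -Jthreads=100 -l results-100.jtl
    jmeter -n -t plan.jmx -Jthreads=200 -l results-200.jtl
    jmeter -n -t plan.jmx -Jthreads=300 -l results-300.jtl
    jmeter -n -t plan.jmx -Jthreads=400 -l results-400.jtl

The level at which response times start climbing while throughput flattens is the saturation point described above.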
If the server isn't down or being restarted, then yes, a 503 indicates overload.
Common causes are a server that is down for maintenance or one that is overloaded.
You need to find what stops the server from serving 400 concurrent requests/users.
Note that if you are testing in a test environment that isn't equal or similar to the production environment, it may not reflect the load the production server can endure.
I'm working on a web application built with the following technologies: JSF, Spring, Hibernate, MySQL, MongoDB, Elasticsearch, Jetty and Tomcat. I created a stress test that simulates use of the application by one user. In the test scenario, I run the stress test with 4 parallel users while at the same time logging in as a 5th user and measuring the system response time for certain activities. If we denote the response time when the stress test is not running as T, then after 5 minutes of running the stress test the response time is ~2xT, after 30 minutes it's ~5xT, and after 45 minutes it's ~10-15xT.
After 1 hour I stopped the tests and waited 25 minutes until the heap was below 200 MB and CPU usage was near 0%; performance improved slightly, but not by much (~7xT). One hour after the tests stopped, nothing had changed.
My PC has 8 GB of RAM, and I run the application with -Xms512m -Xmx4092m -XX:PermSize=512m -XX:MaxPermSize=512m. According to what I can see in JVisualVM and JProfiler, I don't think this is a memory leak issue; however, I'm not sure what else could be the problem.
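One way to cross-check the garbage-collection angle independently of the profilers is to log GC activity for the whole run and watch whether full-GC frequency or pause times grow as the test progresses; a sketch using the Java 7-era flags that the PermSize options imply (gc.log is just an example path):

    -Xms512m -Xmx4092m -XX:PermSize=512m -XX:MaxPermSize=512m \
      -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log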
I hope someone can point out a possible reason for such a huge performance degradation, or at least where I should look for it.
Thanks in advance.
After analysing all aspects of the application, I identified Mojarra as the main reason for the performance degradation. The Ajax POST response times grew longer after a stress test while backend initialization time remained the same. After moving from Mojarra 2.1.9 to 2.2.0 the problems disappeared, and I believe this is related to the complexity of the JSF pages and how Mojarra deals with a large number of components on one page.
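For reference, if Mojarra is bundled with the application via Maven rather than provided by the container, the upgrade is just a dependency bump, roughly along these lines (assuming the com.sun.faces artifacts on Maven Central are used):

    <dependency>
      <groupId>com.sun.faces</groupId>
      <artifactId>jsf-api</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>com.sun.faces</groupId>
      <artifactId>jsf-impl</artifactId>
      <version>2.2.0</version>
    </dependency>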
We are experiencing slow processing of requests under heavy load. When looking at the currently executing requests during these bursts, I can see many requests to our web-service code.
The number of requests is not that large, but they appear to be stuck in a preprocessing state. Below is an example:
We are running an IIS7 app pool in classic mode due to the need to support some legacy code.
Other requests continue to be processed, but these stuck requests gradually seem to fill up the available threads, leading to slow processing of other pages.
Does anyone have any idea where these requests are getting stuck?
There appears to be no resource issue with the DB, and the request state suggests this is all preprocessing.
We have run load tests on the code involved on local machines and cannot replicate the issue.
Another possible factor is that we are making use of MVC and UrlRouting.
Many thanks for any help.
Unfortunately, some issues only happen on production servers, as a load test can never fully simulate real-world users.
You can try to capture hang dumps while performance is bad and then analyze them (on your own, or open a support case via http://support.microsoft.com to work with Microsoft support).
Usually in such cases you have hit the famous thread pool bottleneck, http://support.microsoft.com/kb/821268. Dump analysis can easily identify the culprit and help locate a solution.
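For reference, the tuning that KB article describes lives in machine.config; a rough sketch with the article's suggested starting values (the httpRuntime numbers shown are for a single CPU and scale with CPU count per the article; none of them are values verified for this particular application):

    <!-- machine.config fragment per KB 821268; requires autoConfig="false" -->
    <system.web>
      <processModel autoConfig="false"
                    maxWorkerThreads="100" maxIoThreads="100" minWorkerThreads="50" />
      <httpRuntime minFreeThreads="88" minLocalRequestFreeThreads="76" />
    </system.web>
    <system.net>
      <connectionManagement>
        <add address="*" maxconnection="12" /> <!-- 12 per CPU -->
      </connectionManagement>
    </system.net>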
Why not move them into their own app pool to separate them from the Classic ASP app? You'll then have more options to tune.
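A minimal sketch of doing that with appcmd, assuming a hypothetical pool name "WebServicePool" and a hypothetical application path "Default Web Site/services" (adjust the runtime version to whatever the web-service code targets):

    %windir%\system32\inetsrv\appcmd add apppool /name:"WebServicePool" /managedRuntimeVersion:v4.0
    %windir%\system32\inetsrv\appcmd set app "Default Web Site/services" /applicationPool:"WebServicePool"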