Response times in WebSphere Process Server applications degrade after the server has been running for a while - ejb-3.0

I see a degradation in response times in my applications. After a server restart, response times are acceptable. However, after some time, which depends on the workload on the system, the response times degrade and the server has to be restarted to return to good performance.

Are you monitoring the Java heap usage with verbose garbage collection (GC) logs?
The behavior you describe can happen if the heap has enough free space after a restart and then gradually fills with long-lived objects as the workload runs. This may be caused by the heap simply being too small, or by a memory leak in the application: heap is used but never released for collection when the associated work completes. When there is not enough free heap space, application work slows down because the JVM spends excessive time running GC.
You can learn more about Java GC troubleshooting in our documentation:
https://www.eclipse.org/openj9/docs/vgclog/
You can also open a support case to get assistance from WebSphere/Java troubleshooting experts, if you have a support arrangement with IBM.
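If verbose GC is not already enabled, a minimal example of enabling it follows; the exact option names vary by JVM vendor and version, so treat these as illustrative rather than definitive:
-verbose:gc -Xverbosegclog:verbosegc.log   (IBM J9 / OpenJ9)
-Xlog:gc*:file=gc.log                      (HotSpot, Java 9 or later)
The resulting log records heap occupancy before and after each collection, which helps distinguish a slow leak (free heap shrinking over days) from a heap that is simply sized too small.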

Related

GC Overhead limit exceeded in WSO2 EI 6.6

In WSO2 EI 6.6, a proxy stopped working abruptly. On analysis we observed an error in the WSO2 Carbon log, "GC Overhead limit exceeded"; after this error nothing happens in the EI.
The proxy logic is to get data from a SQL Server table, form an XML, and send it to an external API. The proxy runs at a 5-minute interval, and in every interval a maximum of 5 records is pushed to the API.
After restarting the WSO2 Carbon services, the proxy starts working again. Currently we restart the services every 3 days to avoid this issue.
I need to know how to identify the underlying issue and resolve it.
This means the JVM has run out of allocated memory. There can be many reasons for this. For example, if you haven't allocated enough memory to the JVM, you can easily run out of memory. If that's not the case, you need to analyze a memory dump and see what's occupying the memory and causing it to fill up.
Generally, when you see the mentioned error the JVM automatically creates a heap dump (heap-dump.hprof) in the <EI_HOME>/repository/logs directory. You can try analyzing that dump to find the root cause. If the server doesn't generate a memory dump, take one manually when memory usage is higher than expected and analyze it.
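For example, a manual heap dump can be taken with jmap (assuming a JDK is available on the server; <pid> stands for the Carbon JVM's process ID) and then opened in a tool such as Eclipse Memory Analyzer:
jmap -dump:live,format=b,file=/tmp/manual-heap-dump.hprof <pid>
You can also add -XX:+HeapDumpOnOutOfMemoryError to the JVM options so a dump is written automatically the next time the error occurs.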

Why does the Heap Memory Usage of a Spring Boot application keep increasing?

I created and ran a simple Spring Boot application to accept some requests. I used JConsole to check the Heap Memory Usage and saw a periodic increase followed by a GC; I don't know the reason for the increases. Are objects being created continuously (I thought the instances were created in the container when the application starts)?
Spring Boot has background processes (scheduled jobs, etc.) which may consume resources even when there are no requests to your app.
Those spikes are expected and normal for any Java app built on any moderately complex framework. The GC algorithm depends on your JVM version, but it can be overridden. The graph shows a normal situation: from time to time memory is consumed by some activity, and after a while the GC wakes up and does the cleaning.
If you want to check exactly which objects caused a memory spike, you can try Java Flight Recorder or regular heap dump analysis using the Eclipse Memory Analyzer.
For the current case, Java Flight Recorder would be the more convenient and suitable option.
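As a sketch of how such a spike could be captured with Java Flight Recorder (assuming a recent OpenJDK, 11 or later, where JFR is built in; <pid> is the Spring Boot process ID):
jcmd <pid> JFR.start duration=60s filename=spike.jfr
jcmd <pid> JFR.check
The resulting recording can be opened in JDK Mission Control to see allocations broken down by class and stack trace.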

Web Server Performance Degradation

The web application is built on Spring Boot and deployed on WebLogic.
We have configured 400 max threads and a JDBC pool of 100 connections.
When we perform load testing on the web application, the performance is optimal when the load is low (the response time is less than 200 ms for most of the HTTP requests that we call).
When we increase the load, we can see that the thread count and the JDBC connection count increase gradually, but nowhere near the maximum. However, the response time gets much longer, and it can take more than 5 seconds to respond.
CPU usage, thread count, memory, and JDBC connections all seem normal during this period.
Another observation: while testing, when we saw the performance degrading, we used another machine to make an HTTP call to the server that only retrieves text without any DB access or logic, and even this simple HTTP call took 10 s to respond. (And the server resources were still not maxed out!)
So we are wondering: what keeps them waiting?
Any other possible bottleneck?
If the server doesn't lack resources like CPU/RAM, only a profiler can tell you where your application spends the most time, which might be:
Waiting in a queue for the next thread/DB connection from the pool to become available
A slow database query
Inefficient functions/algorithms which are subject to optimization
WebLogic configuration not suitable for high loads
JVM configuration not suitable for high loads (e.g. the system is doing garbage collection too often or for too long)
So I would recommend re-running your test with profiler telemetry enabled, and at the same time monitoring essential JVM metrics using e.g. the JMXMon Sample Collector, which can be used for monitoring your application-specific metrics as well. It's a plugin which can be installed using the JMeter Plugins Manager.
For a detailed approach on how to go about identifying poor thread performance, I suggest you take a look at the TSA Method by Brendan Gregg.
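As a lightweight first step before a full profiler run, here is a small sketch (the class name and approach are mine, not from the answer above) that summarizes thread states via ThreadMXBean; many threads stuck in WAITING or BLOCKED while CPU stays low usually points at pool exhaustion or lock contention rather than processing power:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSummary {
    public static void main(String[] args) {
        // Snapshot all live threads and count them by state.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        // A pile of BLOCKED/WAITING threads under load suggests the
        // bottleneck is a pool or a lock, not raw CPU.
        counts.forEach((state, n) -> System.out.println(state + ": " + n));
    }
}

The same information is available from a plain thread dump (jstack <pid>), which is closer to the workflow the TSA Method describes.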

Is it OK to use golang pprof in production without affecting performance?

I'm kind of new to the pprof tool, and am wondering if it's OK to keep running this in production. From the articles I have seen, it seems to be OK and standard; however, I'm confused as to how this does not affect performance, since it does sampling N times every second, and how that does not lead to a degradation in performance.
Jaana Dogan does say in her article "Continuous Profiling of Go programs":
Profiling in production
pprof is safe to use in production.
We target an additional 5% overhead for CPU and heap allocation profiling.
The collection is happening for 10 seconds for every minute from a single instance. If you have multiple replicas of a Kubernetes pod, we make sure we do amortized collection.
For example, if you have 10 replicas of a pod, the overhead will be 0.5%. This makes it possible for users to keep the profiling always on.
We currently support CPU, heap, mutex and thread profiles for Go programs.
Why?
Before explaining how you can use the profiler in production, it would be helpful to explain why you would ever want to profile in production. Some very common cases are:
Debug performance problems only visible in production.
Understand the CPU usage to reduce billing.
Understand where the contention cumulates and optimize.
Understand the impact of new releases, e.g. seeing the difference between canary and production.
Enrich your distributed traces by correlating them with profiling samples to understand the root cause of latency.
So if you are using pprof for the right reason, yes, you can leave it in production.
But for basic monitoring, as commented, standard system metrics are enough.
As noted in "Continuous Profiling and Go" by Vladimir Varankin
Depending on the state of the infrastructure in the company, an “unexpected” HTTP server inside the application’s process can raise questions from your systems operations department ;)
At the same time, depending on the peculiar nature of a company, the very ability to access something inside a production application, that doesn’t directly relate to application’s business logic, can raise questions from the security department ;))
So the overhead is not the only criteria to consider when leaving active such a feature.

Tomcat - PermGen space Exception

I am using Tomcat 6, Spring, Hibernate and Struts in my application. I often get this exception in the logs regarding memory, and the site hangs and stops responding. I am not sure what's wrong. Can anybody suggest what changes have to be made to avoid this?
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable run
SEVERE: Caught exception (java.lang.OutOfMemoryError: PermGen space) executing org.apache.jk.common.ChannelSocket$SocketConnection@2c52d3, terminating thread
The PermGen space is what Tomcat uses to store class definitions (definitions only, no instantiations) and string pools that have been interned. From experience, the PermGen space issues tend to happen frequently in dev environments really since Tomcat has to load new classes every time it deploys a WAR or does a jspc (when you edit a jsp file). Personally, I tend to deploy and redeploy wars a lot when I’m in dev testing so I know I’m bound to run out sooner or later (primarily because Java’s GC cycles are still kinda crap so if you redeploy your wars quickly and frequently enough, the space fills up faster than they can manage).
This should theoretically be less of an issue in production environments since you (hopefully) don’t change the codebase on a 10 minute basis. If it still occurs, that just means your codebase (and corresponding library dependencies) are too large for the default memory allocation and you’ll just need to mess around with stack and heap allocation. I think the standards are stuff like:
-XX:MaxPermSize=SIZE
I’ve found however the best way to take care of that for good is to allow classes to be unloaded so your PermGen never runs out:
-XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled
Stuff like that worked magic for me in the past. One thing tho, there’s a significant performance tradeoff in using those, since permgen sweeps will make like an extra 2 requests for every request you make or something along those lines. You’ll need to balance your use with the tradeoffs.
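As an illustration of where these options typically go (assuming a Unix install, where bin/setenv.sh is picked up by catalina.sh at startup; the sizes here are placeholders to tune for your application):
# bin/setenv.sh
export CATALINA_OPTS="-XX:MaxPermSize=256m -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled"
On a Windows-service install the equivalent settings go through the Tomcat service configuration tool instead of setenv.sh.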
Your application may either have a memory leak, or your Tomcat may not have enough memory to run the application. Use tools like YourKit to check your memory usage.
Edit: Try increasing the Tomcat memory. By default it starts with 256 MB. If you are on Windows you can do this by configuring the Tomcat process settings. Refer to the post below.
How do I increase memory on Tomcat 7 when running as a Windows Service?
