High CPU utilisation in WebSphere Liberty - websphere

We are migrating from WebSphere Application Server (WAS) to WebSphere Liberty (WLP).
When our application is deployed in WAS, CPU utilisation is 8%. When the same application is deployed in WLP, CPU utilisation is above 50% and fluctuates.
Can anyone advise how to debug this issue and which parameters to check to reduce CPU utilisation?

My advice would be to use your favorite monitoring / profiling tool:
Check that your application isn't spending a lot of time garbage collecting. That could be a sign of the heap being too small, or another GC tuning problem.
Check which non-GC threads are using a lot of time. Does that tell you something unexpected?
Profile the code to look for performance hotspots.
Without knowing the cause, we can't suggest JVM parameter changes.
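If no profiler is at hand, the first two checks above can be approximated from inside the JVM with the standard java.lang.management beans. The following is only a minimal sketch: the class name CpuProbe and its methods are made up for illustration, and the code only sees the JVM it runs in, so it would have to be called from within the application (for example from a diagnostic endpoint).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Liberty or WAS.
final class CpuProbe {

    /** Approximate time spent in GC since JVM start, in ms, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Share of JVM uptime attributed to GC, clamped to the range 0..1. */
    static double gcFraction() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    /** Prints the accumulated CPU time of each live thread, if the JVM supports it. */
    static void printThreadCpu() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            System.out.println("Per-thread CPU time is not available on this JVM");
            return;
        }
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread has died
            ThreadInfo info = threads.getThreadInfo(id);  // null if the thread has died
            if (cpuNanos < 0 || info == null) {
                continue;
            }
            System.out.printf("%-40s %8d ms%n", info.getThreadName(), cpuNanos / 1_000_000);
        }
    }
}
```

A GC fraction that stays high points at the heap; a handful of application threads with rapidly growing CPU time points at the code they are running.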

I hope you have verified that it's the Liberty process that is hogging the CPU.
Can you turn on verbose GC in the Liberty profile and check the GC logs?

Related

Response times in WebSphere Process Server applications degrade after the server has been running for a while

I see a degradation in response times within my applications. After a server restart, response times are acceptable. However, after some time, which depends on the workload on the system, the response times degrade and the server has to be restarted to return to good performance.
Are you monitoring the Java heap usage with verbose garbage collection (GC) logs?
The behavior you describe can happen if the heap has enough free space after a restart, then gradually fills with long-lived objects as the workload runs. This may be caused by the heap simply being too small, or the application may have a memory leak, using heap and not releasing it for collection when the associated work is completed. When there is not enough free heap space, the application work slows down because the JVM spends excessive time running GC.
You can learn more about Java GC troubleshooting in our documentation:
https://www.eclipse.org/openj9/docs/vgclog/
You can also open a support case to get assistance from WebSphere/Java troubleshooting experts, if you have a support arrangement with IBM.
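As a rough complement to verbose GC logs, heap occupancy after collection can also be sampled from inside the JVM. This is a sketch under assumptions: the class HeapAfterGc and its methods are hypothetical names, and it relies on the standard MemoryPoolMXBean.getCollectionUsage() call, which returns null for pools that do not support it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper for spotting a heap that fills with long-lived objects.
final class HeapAfterGc {

    /** Bytes still in use in the heap pools after the most recent collection of each pool. */
    static long usedAfterLastGc() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if unsupported for this pool
            if (afterGc != null) {
                used += afterGc.getUsed();
            }
        }
        return used;
    }

    /** True if every sample is strictly larger than the one before it. */
    static boolean steadilyRising(long[] samples) {
        if (samples.length < 2) {
            return false;
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```

Sampling usedAfterLastGc() every few minutes under steady load and feeding the values to steadilyRising() gives a crude signal: post-GC usage that keeps climbing suggests a leak or an undersized heap, while a flat line suggests the cause lies elsewhere.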

Kubernetes Pod CPU Throttling after few (3-4) runs

I have a Spring Boot application using spring-kafka version 2.5.
Our application is containerized with Docker and deployed in a Kubernetes cluster.
This consumer fetches a record and performs database aggregation, using EclipseLink as the persistence service and an Oracle database.
I am publishing batches of 100k messages to this consumer's topic. For the first few batches, the application's CPU usage stays below 500 millicores. After a few runs, however, CPU usage spikes to its 2000-millicore limit and the pod is throttled. Nothing changes between runs that would increase CPU usage. The only workaround we have found is to redeploy the application, which brings CPU usage back down until the same pattern repeats.
We have profiled the JVM and couldn't find any issue with memory or garbage collection.
My CPU request is 1000 millicores and my CPU limit is 2000 millicores.
Please help me find the root cause of this CPU throttling after a few runs.
Appreciate your time.
The attached screenshot of the CPU usage dashboard shows the sudden spike in CPU usage after some time.
My suspicion is that some configuration is required at the Kubernetes level to resolve this issue.

RAM Usage in TeamCity

We have a large TeamCity server (10.0.3), with around 2,000 build configurations and around 50 build agents.
We frequently encounter performance issues related to garbage collection.
Inside teamcity-server.log, we found this:
[2017-11-28 12:30:54,339] WARN - jetbrains.buildServer.SERVER - GC usage exceeded 50% threshold and is now 60%. GC was fired 82987 times since server start and consumed total 18454595ms. Current memory usage: 1.09 GB.
We are unable to figure out the source of the issue.
According to the documentation, a 64-bit version of Java should be used, with only 4 GB of RAM. We encountered some issues and decided to use the -Xmx6g parameter instead.
Do you know where we can enable or find more traces in order to figure out the source of our memory over-consumption?
First, you can try disabling third-party plugins and see if it helps.
Then try benchmarking the server according to this blog post and see if increased memory limits improve the situation.
But the best way to investigate the memory over-consumption would be to capture a memory dump and investigate its contents using profiling tools. You can create a memory dump from the Administration | Server Administration | Diagnostics page of your TeamCity web UI using the Dump Memory Snapshot button.
You can investigate the dump on your own or send it to JetBrains for investigation.
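The figures in the log line quoted in the question can be put in perspective with a little arithmetic. The sketch below uses a hypothetical class name, GcLogArithmetic, and only the two numbers from that warning (82987 collections, 18454595 ms in total).

```java
// Hypothetical helper; the inputs come from the teamcity-server.log warning above.
final class GcLogArithmetic {

    /** Average time per collection in milliseconds. */
    static double averageMillisPerGc(long totalGcMillis, long gcCount) {
        return (double) totalGcMillis / gcCount;
    }

    /** Converts milliseconds to hours. */
    static double toHours(long millis) {
        return millis / 3_600_000.0;
    }
}
```

With those inputs, averageMillisPerGc(18_454_595L, 82_987L) comes out at roughly 222 ms per collection, and toHours(18_454_595L) at a little over 5 hours of accumulated GC time since server start.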

Web application very slow in Tomcat 7

I implemented a web application. Right after the Tomcat service starts, it works very quickly, but after some hours, and as more users log in (up to about 15), it gets slow.
Checking usage statistics: RAM (20%), CPU (25%).
Server Features:
RAM 8GB
Processor i7
Windows Server 2008 64bit
Tomcat 7
MySql 5.0
Struts2
-Xms1024m
-Xmx1024m
PermGen = 1024
MaxPermGen = 1024
I do not use a separate web server; we publish directly on Tomcat.
Even at midnight, with only 1 user online, the slowness persists.
The only solution I have is to restart the Tomcat service, after which response time is excellent again.
Is there anyone who has experienced this issue? Any clue would be appreciated.
Not enough details provided. Need more information :(
Use htop or top to find memory and CPU usage per process & per thread.
CPU
A constant 25% CPU usage on a 4-core system can indicate that a single-threaded application or thread is using 100% of the one core it is able to use.
Which application is eating the CPU?
Memory
20% memory is ~1.6GB. That is a bit more than I would expect for an idle server running only Tomcat + MySQL. The -Xms1024m setting tells the JVM to reserve 1GB of heap up front, so that explains it.
Change the Tomcat settings to -Xms512m and -Xmx2048m. Watch Tomcat's memory usage while you throw some users at it. If it keeps growing until it reaches 2GB and then freezes, that can indicate a memory leak.
Disk
Use df -h to check disk usage. A full partition can cause the issues you are experiencing.
Filesystem Size Used Avail Usage% Mounted on
/cygdrive/c 149G 149G 414M 100% /
(If you just noticed from this example that my laptop is running out of space, you're doing it right :D)
Logs
Logs are awesome. Yet they have a bad habit of filling up the disk. Check how much disk space the logs use. Are logs being written, erased, and rotated properly when new users connect? Does erasing the logs fix the issue? (Copy them somewhere for future analysis before you erase them.)
If not, logs are STILL awesome. They have the good habit of helping you track down bugs. Check the Tomcat logs. You may want to set the logging level to debug. What is the last thing that happens when the website dies? Any useful error message? Are user connections still being received and accepted by Tomcat?
Application
I suppose that the 25% CPU goes to Tomcat (and not MySQL). Tomcat doesn't fail by itself. The application running on it must be failing. Try removing the application from Tomcat (you can put a hello world app in its place). Can Tomcat keep working overnight without your application? It probably can, in which case the fault is in the application.
Enable full debug logging in your application and try to track the issue. Run it straight from Eclipse in debug mode and throw users at it. Does it fail consistently in the same way?
If yes, hit "pause" in the Eclipse debugger and check what the application is doing. Look at the piece of code each thread is currently running and its call stack. Repeat that a few times. If there is a deadlock, an infinite loop, or similar, you can find it this way.
You will have found the issue by now if you are lucky. If not, it's a tricky bug that might be deep inside the application and hard to trace. Determination will lead to success. Good luck =)
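When attaching a debugger to the slow server is not practical, the "pause and look at every thread" step can be approximated in code. The following is a minimal sketch, assuming it runs inside the same JVM as the application; ThreadSnapshot and its method names are hypothetical, while the java.lang.management and Thread calls are standard JDK APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

// Hypothetical helper that mimics pausing the debugger and reading each call stack.
final class ThreadSnapshot {

    /** Number of threads the JVM currently reports as deadlocked (0 if none). */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()         // monitors and ownable synchronizers
                : threads.findMonitorDeadlockedThreads(); // monitors only
        return ids == null ? 0 : ids.length;              // null means no deadlock was found
    }

    /** Plain-text dump: one header line per thread, followed by its call stack. */
    static String dumpAllStacks() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }
}
```

Taking several dumps a few seconds apart and comparing them serves the same purpose as repeating the pause: a thread that shows the same frames every time is a candidate for an infinite loop or a blocked call.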
For performance-related issues, the following guidelines help:
Set -Xms and -Xmx to the same value, raising it if memory allows, so the heap never needs to be resized at runtime:
-Xms2048m
-Xmx2048m
You can also allow PermGen to be garbage collected:
-XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
If the page changes too frequently to make this option logical, try temporarily caching the dynamic content, so that it doesn't need to be regenerated over and over again. Any techniques you can use to cache work that's already been done instead of doing it again should be used - this is the key to achieving the best Tomcat performance.
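One possible shape for such temporary caching is a small time-limited cache in front of the expensive rendering code. This is only an illustrative sketch, not something Tomcat provides: TtlCache is a hypothetical class, it assumes Java 8 or later for lambdas, and the loader must not call back into the same cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical time-limited cache for generated content.
final class TtlCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached value for key, regenerating it with loader once it has expired. */
    V get(K key, Function<K, V> loader) {
        long now = System.currentTimeMillis();
        // compute() is atomic per key, so concurrent callers do not regenerate the same entry twice.
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old != null && old.expiresAtMillis > now)
                        ? old
                        : new Entry<>(loader.apply(k), now + ttlMillis));
        return entry.value;
    }
}
```

A page fragment that takes a second to build but may be a few seconds stale could then be served through something like cache.get(pageKey, k -> renderPage(k)), where renderPage stands in for the application's own code.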
If there is a database-related issue, follow SQL query performance tuning practices.
Rotate the catalina.out log file without restarting Tomcat.
In detail, there are two ways.
The first, which is more direct, is to rotate catalina.out by adding a simple pipe to the log rotation tool of your choice in Catalina's startup shell script. This will look something like:
"$CATALINA_BASE"/logs/catalina.out WeaponOfChoice 2>&1 &
Simply replace "WeaponOfChoice" with your favorite log rotation tool.
The second way is less direct, but ultimately better. The best way to handle the rotation of catalina.out is to make sure it never needs to rotate. Simply set the "swallowOutput" property to true for all Contexts in "server.xml".
This will route System.err and System.out to whatever logging implementation you have configured, or to JULI if you haven't configured one.
See more at: Tomcat Catalina Out
I experienced a very slow stock Tomcat dashboard on a clean CentOS 7 install and found the following cause and solution:
Slow start up times for Tomcat are often related to Java's SecureRandom implementation. By default, it uses /dev/random as an entropy source. This can be slow as it uses system events to gather entropy (e.g. disk reads, key presses, etc). As the urandom manpage states:
When the entropy pool is empty, reads from /dev/random will block until additional environmental noise is gathered.
Source: https://www.digitalocean.com/community/questions/tomcat-8-5-9-restart-is-really-slow-on-my-centos-7-2-droplet
Fix it by adding the following configuration option to your tomcat.conf or (preferred) to a custom file in /tomcat/conf/conf.d/:
JAVA_OPTS="-Djava.security.egd=file:/dev/./urandom"
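To confirm that the option actually reached the JVM, the configured seed source can be read back at runtime. This is a small sketch with a hypothetical class name, EntropySource; it only reports configuration (the java.security.egd system property and the securerandom.source security property) and does not measure how long seeding takes.

```java
import java.security.Security;

// Hypothetical helper that reports which seed source the running JVM is configured with.
final class EntropySource {

    static String describe() {
        String override = System.getProperty("java.security.egd");       // set via -Djava.security.egd=...
        String fromPolicy = Security.getProperty("securerandom.source"); // from the JDK's java.security file
        if (override != null && !override.isEmpty()) {
            return "java.security.egd=" + override;
        }
        return "securerandom.source=" + (fromPolicy == null ? "(unset)" : fromPolicy);
    }
}
```

If describe() still reports securerandom.source after a restart, the JAVA_OPTS line was most likely not picked up by the script that starts Tomcat.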
We encountered a similar problem; the cause was "catalina.out". It is the standard destination log file for "System.out" and "System.err". Its size kept increasing, which slowed things down until Tomcat ultimately crashed. We solved the problem by rotating "catalina.out". We were using Red Hat, so we wrote a shell script to rotate "catalina.out".
Here are some links:-
Mulesoft article on catalina (also contains two methods of rotating):
Tomcat Catalina Introduction
If "catalina.out" is not the problem then try this instead:-
Mulesoft article on optimizing tomcat:
Tuning Tomcat Performance For Optimum Speed
We had a problem which looks similar to yours. Tomcat was slow to respond, but the access log showed just milliseconds per response. The problem was streaming responses. One of our services returned real-time data that users could subscribe to. EPOLL was becoming bloated, and network requests couldn't reach Tomcat. What's more interesting, the CPU was mostly idle (since no one could ask the server to do anything) and the acceptor/poller threads were sitting in WAIT, not RUNNING or IN_NATIVE.
At the time, we just limited the number of such requests and everything went back to normal.
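The answer does not say how the limit was enforced. One simple way to cap the number of concurrent streaming subscriptions is a counting semaphore, as in this sketch; StreamLimiter and its methods are hypothetical names, and a caller that is refused would typically respond with an error status instead of opening the stream.

```java
import java.util.concurrent.Semaphore;

// Hypothetical limiter for long-lived streaming responses.
final class StreamLimiter {

    private final Semaphore permits;

    StreamLimiter(int maxConcurrentStreams) {
        this.permits = new Semaphore(maxConcurrentStreams);
    }

    /** Returns true if a new stream may start; the caller must call finish() when it ends. */
    boolean tryStart() {
        return permits.tryAcquire(); // never blocks, so request threads are not tied up waiting
    }

    /** Frees the slot taken by a successful tryStart(). */
    void finish() {
        permits.release();
    }
}
```

Calling finish() in a finally block (or in the stream's close/error callback) matters here, since a leaked permit permanently lowers the limit.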

Lift application performance degrades

I am using the Lift framework with embedded Jetty. My application runs pretty fast when I run it in development. As soon as I build an assembly of it using SBT, performance degrades by up to 20-30 times. A request that was taking 400ms starts taking 10sec. Does Lift have something to do with the assembly?
Please give me some pointers to solve this problem.
Could you make sure that the .jar files do not contain any test resources or configuration that can slow down the application at runtime (for example, configs that turn on debug/trace level in the logger or minimize the size of the DB connection pool)?
Also, please check that the start scripts for the application set sufficient limits for heap and PermGen memory size.
The following JVM options are suitable for most small and mid-sized Lift web apps:
-server -Xms256m -Xmx2048m -XX:MaxPermSize=512m -XX:+TieredCompilation
P.S. Try to find hotspots with some profiler, and then find cause of them...
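A quick way to check what the start script really passed to the JVM is to print the effective settings at startup, in both the development run and the assembled jar, and compare them. The sketch below uses a hypothetical class name, JvmSettings; the underlying calls are standard JDK APIs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup diagnostic for comparing development and assembled runs.
final class JvmSettings {

    /** JVM flags as passed on the command line (for example -Xmx or -XX options). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    static void print() {
        System.out.println("JVM arguments: " + inputArguments());
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        System.out.println("Available processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

If the assembled application reports a much smaller maximum heap or a different set of flags than the development run, the start script is the first place to look.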

Resources