We are connecting to JBoss EAP 7.1.0 GA using JMS in our application. Over a period of 5-10 days we see that our Old Gen eventually runs out of heap because io.netty.util.ThreadDeathWatcherEntry* objects are never cleaned up and keep retaining heap.
I noticed that the JBoss client jar uses the bundled Netty library v4.1.9. Below are the Netty version properties:
netty-buffer.version=4.1.9.Final-redhat-1
netty-buffer.buildDate=2017-05-19 04\:27\:45 -0400
netty-buffer.commitDate=2017-05-19 10\:08\:15 +0200
netty-buffer.shortCommitHash=e75b1f8
netty-buffer.longCommitHash=e75b1f856d38a057ab0886166a1b2f9578c64c25
netty-buffer.repoStatus=dirty
#Generated by netty-parent/pom.xml
#Fri, 19 May 2017 04:28:16 -0400
I found a reference to this on the Red Hat site, but I could not access it. Has anyone else seen this problem, and is there a better way to control this behavior and avoid the OutOfMemoryError?
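For completeness, below is a sketch of the kind of explicit cleanup hook we are considering calling after closing our JMS connections. ThreadDeathWatcher.awaitInactivity, FastThreadLocal.removeAll and GlobalEventExecutor.awaitInactivity are the Netty 4.1 APIs we found for draining these background structures; we have not yet confirmed that this actually releases the retained entries.

import java.util.concurrent.TimeUnit;

import io.netty.util.ThreadDeathWatcher;
import io.netty.util.concurrent.FastThreadLocal;
import io.netty.util.concurrent.GlobalEventExecutor;

public final class NettyClientCleanup {

    private NettyClientCleanup() {
    }

    // Sketch only: called after all JMS connections to EAP have been closed.
    // Gives Netty's background watchers a chance to go quiet so their entries
    // are not retained; not verified against the leak described above.
    public static void drainNettyBackgroundThreads() throws InterruptedException {
        // Clear FastThreadLocal state owned by the calling thread.
        FastThreadLocal.removeAll();

        // Wait briefly for the global event executor to become inactive.
        GlobalEventExecutor.INSTANCE.awaitInactivity(5, TimeUnit.SECONDS);

        // Wait briefly for the thread-death watcher itself to become inactive.
        ThreadDeathWatcher.awaitInactivity(5, TimeUnit.SECONDS);
    }
}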
Related
We have a Spring Boot microservice with several libraries as dependencies, e.g. Jest (Elasticsearch), Hikari, Spring-Rabbit, FasterXML and many more.
After analyzing a thread dump we found that 2 unknown pools are being created. On a normal development machine these pools contain 8 to 10 threads, but in the prod environment we observed that each pool has 66 threads. The thread pool names are auto-generated, like pool-7, pool-2, etc.
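(For context, pool names like pool-7 appear to be just what java.util.concurrent.Executors.defaultThreadFactory() assigns when an executor is created without a custom ThreadFactory, as in the small sketch below, so the names themselves tell us nothing about the owner.)

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: any pool created without a custom ThreadFactory gets generic
// "pool-N-thread-M" thread names, which is why the dump only shows
// pool-2, pool-7, etc.
public class DefaultPoolNamesDemo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> System.out.println(Thread.currentThread().getName())); // e.g. pool-1-thread-1
        pool.shutdown();
    }
}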
We want to find out which Java class/library is creating these thread pools and spawning the threads. We tried Oracle Flight Recorder, but even there we could not see the origin of these threads.
Can someone please suggest a way to find out who is creating these threads?
Thanks,
Smita
It's unfortunate that the Thread Start event in Flight Recorder doesn't record the stack trace from the Thread#start method. I will see if it can be added in a future JDK release. You should, however, be able to see which thread starts the new threads.
If you can't find other tools to help you, the only way I can think of is to instrument the java.lang.Thread#start method yourself, either using bytecode instrumentation or by cloning OpenJDK, modifying the source file for java.lang.Thread, and building your own custom JDK. The last step may sound daunting, but it's not that hard if you are on JDK 8 or later.
hg clone http://hg.openjdk.java.net/jdk8/jdk8
cd jdk8
bash get_source.sh
bash configure
make images
When you clone, there is a README file in the root that will point you to further instructions if you run into problems.
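The modification itself can be small. As a sketch (illustrative only; the exact body of start() differs between JDK versions), you might add something like the following diagnostic at the top of java.lang.Thread#start in the cloned sources, which for JDK 8 typically live under jdk/src/share/classes/java/lang/Thread.java:

// Illustrative diagnostic added inside java.lang.Thread#start: prints the new
// thread's name plus the stack trace of the caller that started it, so the
// owning library shows up on stderr.
public synchronized void start() {
    // --- begin added diagnostic ---
    System.err.println("[thread-start] " + getName());
    for (StackTraceElement frame : new Throwable().getStackTrace()) {
        System.err.println("    at " + frame);
    }
    // --- end added diagnostic ---

    // ... original body of start() continues unchanged ...
}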
I have a problem that's been plaguing me for about a year now. I have Oracle 12.1.x.x installed on my machine. After a day or two the listener stops responding and the listener.log contains a bunch of TNS-12531 messages. If I reboot, the problem goes away and I'm fine for another day or two. I'm lazy and I hate rebooting, so I decided to finally track this down, but I'm having no luck. Since the alternative is to do work that I really don't want to do, I'm going to spend all my time researching this.
Some notes:
Windows 10 Pro
64-Bit
32 GB RAM
Generally, about 20GB free when the error occurs
I have several databases and it doesn't matter which DB is running
Restarting the DB doesn't help
Restarting the listener doesn't help
Only rebooting clears the problem
When I set TRACE_LEVEL_LISTENER = 16, I don't get much more info, and nothing is written to the trace files
I can connect to the DB if I bypass the listener (i.e., set ORACLE_SID=xxx and connect without a DB identifier)
All other network interactions seem to work fine after the listener stops
lsnrctl status hangs and adds another TNS-12531 to the listener.log
I have roughly the same config at home and this does not happen
Below is an example of a listener.log file:
Fri Jul 28 14:21:47 2017
System parameter file is D:\app\user\product\12.1.0\dbhome_1\network\admin\listener.ora
Log messages written to D:\app\user\diag\tnslsnr\LJ-Quad\listener\alert\log.xml
Trace information written to D:\app\user\diag\tnslsnr\LJ-Quad\listener\trace\ora_24288_14976.trc
Trace level is currently 16
Started with pid=24288
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=LJ-Quad)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\EXTPROC1521ipc)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
28-JUL-2017 14:22:06 * 12531
TNS-12531: TNS:cannot allocate memory
28-JUL-2017 14:22:47 * 12531
TNS-12531: TNS:cannot allocate memory
28-JUL-2017 14:26:24 * 12531
TNS-12531: TNS:cannot allocate memory
Thanks a bunch for any help you can provide!
Issue 1
This error can occur after approximately 2048 connections have been made via the listener when running on a non-English Windows installation.
Fix for Issue 1
Create a Windows User Group named Administrators on the computer where the listener.exe resides. This can fix the issue of the listener dying.
Reference: I'll post the link for the first issue as soon as I find it again
Issue 2
This error can also occur on Windows 64-Bit systems where the Desktop Application Heap is too small.
Fix for Issue 2
Try increasing the desktop application heap size in the Windows registry; the setting is located under
HKLM\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows
Just as a note, don't invent this value yourself; follow the relevant Microsoft documentation for this key.
Basically, locate the registry entry and alter the third value of the SharedSection=1024,20480,1024 setting. This is a trial-and-error approach, but it seems to improve the listener's stability and memory behavior.
Reference: TNS:cannot allocate memory - is there limit to the num databases on one box (Oracle Developer Community)
I'm new to Ruby on Rails and my mentor just handed me a Ruby on Rails web application. It's fairly large, but even so it's taking a ridiculously long time to load: 45 minutes! By the time it hits 20 minutes, the app's loading page already displays an error saying 'Loading seems to be taking longer than usual, please refresh.'
I'm running Rails 4.2.4 on a Linux 14.04 server (in VirtualBox). I access the website from my host machine (Windows 8). The app uses jbuilder 1.2 for building JSON.
The development.log shows a ton of GET requests to load all the assets. Here's a small selection of those:
Started GET "/assets/loader/loader.css?body=1" for 192.168.39.XXX at 2015-11-13 13:32:43 +0100
Started GET "/assets/reset.css?body=1" for 192.168.39.XXX at 2015-11-13 13:32:47 +0100
Started GET "/assets/bootstrap/bootstrap.css?body=1" for 192.168.39.XXX at 2015-11-13 13:32:50 +0100
Started GET "/assets/site/form.css?body=1" for 192.168.39.XXX at 2015-11-13 13:32:53 +0100
Started GET "/assets/temporary.css?body=1" for 192.168.39.XXX at 2015-11-13 13:32:57 +0100
Started GET "/assets/vendor/spectrum.css?body=1" for 192.168.39.XXX at 2015-11-13 13:33:01 +0100
Started GET "/assets/general.css?body=1" for 192.168.39.XXX at 2015-11-13 13:33:04 +0100
As you can see, each GET takes about 3-5 seconds, and the log file is 2225 lines long from ONE page load.
Is there any way to speed up the process?
EDIT: I copied the entire application to a different folder and tried running it from there. Loading time was down to only a couple of minutes. I still get the error 'Loading seems to be taking longer than usual, please refresh.', so it isn't fixed at all.
I haven't used VirtualBox on Windows, but I had a similar problem with VirtualBox on macOS. There was a bug related to VirtualBox shared folders. I switched to NFS sharing and the performance improved greatly!
Sorry, I don't know how to configure NFS on Windows, but I hope this helps.
Update:
Alternatively, I found a workaround for this issue. If you use Nginx (or Apache), just add this configuration option to your nginx.conf (or apache.conf) file.
In Nginx
sendfile off;
In Apache
EnableSendfile Off
Eventually I removed the VirtualBox shared folder, installed Samba on my Linux machine, and set up an anonymously shared folder with it. The folder is still shared with Windows, so I can use it as I always have, but without the loading-time issues or file system problems.
I've also followed this tutorial to set up Apache with Passenger:
https://www.digitalocean.com/community/tutorials/how-to-setup-a-rails-4-app-with-apache-and-passenger-on-centos-6
This decreased my loading time to 5 minutes total.
I have a clustered WSO2 deployment. The CPU is often at 30% (on a c2.large), and despite the CPU usage the server isn't processing requests; it just seems to be busy doing nothing.
It seems that the SVN deepsync autocommit feature is the cause of the CPU consumption: if I switch off deepsync, or simply set autocommit to false, I don't see the same CPU spiking.
The logs seem to back up this theory as I see:
TID: [0] [AM] [2015-02-20 16:30:14,100] DEBUG {org.wso2.carbon.deployment.synchronizer.subversion.SVNBasedArtifactRepository} - SVN adding files in /zzish/wso2am/repository/deployment/server {org.wso2.carbon.deployment.synchronizer.subversion.SVNBasedArtifactRepository}
TID: [0] [AM] [2015-02-20 16:30:52,932] DEBUG {org.wso2.carbon.deployment.synchronizer.subversion.SVNBasedArtifactRepository} - No changes in the local working copy {org.wso2.carbon.deployment.synchronizer.subversion.SVNBasedArtifactRepository}
TID: [0] [AM] [2015-02-20 16:30:52,932] DEBUG {org.wso2.carbon.deployment.synchronizer.internal.DeploymentSynchronizer} - Commit completed at Fri Feb 20 16:30:52 UTC 2015. Status: false {org.wso2.carbon.deployment.synchronizer.internal.DeploymentSynchronizer}
and during this time the CPU spike occurs.
As per https://docs.wso2.com/display/CLUSTER420/SVN-based+Deployment+Synchronizer I am using svnkit-1.3.9.wso2v1.jar.
I am using an external SVN service (silksvn) in order to avoid having to run my own HA subversion service.
So I have three questions:
Is it possible to reduce the frequency of the deepsync service?
How can I further debug this performance issue? Running this hot smells like a bug.
Has anyone managed to get the git deployment sync (link to project on GitHub) working with AM 1.8.0?
The following process on our Linux server is taking 100% of the CPU:
java -DMQJMS_LOG_DIR=/opt/hd/ca/mars/tmp/logs/log -DMQJMS_TRACE_DIR=/opt/hd/ca/mars/tmp/logs/trace -DMQJMS_INSTALL_PATH=/opt/isv/mqm/java com.ibm.mq.jms.admin.JMSAdmin -v -cfg /opt/hd/ca/mars/mqm/data/JMSAdmin.config
I forcibly killed the process and bounced MQ, and since then I haven't seen it. What might be the reason for this to happen?
The Java process com.ibm.mq.jms.admin.JMSAdmin is normally executed via the IBM MQ script /opt/mqm/java/bin/JMSAdmin.
The purpose of JMSAdmin is to create JNDI resources for connecting to IBM MQ. These are normally file based and stored in a file called .bindings; the location of the .bindings file is given in the configuration file that is passed to the command. In your output above the configuration file is /opt/hd/ca/mars/mqm/data/JMSAdmin.config.
JMSAdmin is an interactive process where you run commands such as:
DEFINE QCF(QueueConnectionFactory1) +
QMANAGER(XYZ) +
...
I can't tell you why it was taking 100% CPU, but the process itself does not directly interact with or connect to the queue manager, so it is safe to kill off the process without needing to restart the queue manager. The .bindings file that JMSAdmin generates is used by JMS applications in some configurations to find the details of how to connect to MQ and the names of the queues and topics to access.
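For illustration, here is a minimal sketch of how a JMS client typically consumes the .bindings file that JMSAdmin writes; the directory and the object names (QueueConnectionFactory1, ExampleQueue) are placeholders for whatever your JMSAdmin.config and DEFINE commands actually used:

import java.util.Hashtable;

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Queue;
import javax.naming.Context;
import javax.naming.InitialContext;

public class BindingsLookupSketch {

    public static void main(String[] args) throws Exception {
        // Point JNDI at the directory holding the .bindings file; this mirrors
        // the INITIAL_CONTEXT_FACTORY and PROVIDER_URL entries in JMSAdmin.config.
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY,
                "com.sun.jndi.fscontext.RefFSContextFactory");
        env.put(Context.PROVIDER_URL, "file:///path/to/bindings/directory");

        Context ctx = new InitialContext(env);

        // Names are whatever was DEFINEd via JMSAdmin (placeholders here).
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("QueueConnectionFactory1");
        Queue queue = (Queue) ctx.lookup("ExampleQueue");

        Connection connection = cf.createConnection();
        try {
            connection.start();
            // ... create a session and producers/consumers for 'queue' here ...
        } finally {
            connection.close();
        }
        ctx.close();
    }
}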
In July 2011 you would have been using IBM MQ v7.0 or lower, all of which are out of support. If anyone comes across a similar issue with a recent, supported version of MQ, I would suggest taking a Java thread dump and opening a case with IBM to investigate why it is taking up 100% of the CPU.
PS: I know this is a 9-year-old question, but I thought an answer might be helpful to someone who finds this when searching for a similar problem.