Integration server could not be started (bip2057) - ibm-integration-bus

We have a problem that has recurred several times on our IBM Integration Bus (version 10.0.0.6).
An integration server crashed, and the following exception occurred:
"BIP2057
Integration server could not be started: integration node name ; UUID ; label ."
Any ideas?
Update:
I've noticed that before this problem starts, the log contains errors saying "Failed to allocate memory", sometimes accompanied by abends.
The server has plenty of RAM, and IIB does not seem to need even half of it.
Ideas?

From my experience, there are several possible solutions, but as already proposed, opening a PMR with IBM might be the best one.
1) You can try increasing the execution group's JVM heap size (the example below sets 1 GB; I would recommend trying values up to 4 GB):
gigs=$((1 * 1024 * 1024 * 1024))
mqsichangeproperties $BROKERNAME -e $EGNAME -n jvmMaxHeapSize -o ComIbmJVMManager -v $gigs
2) Try installing the latest fix pack, as already suggested. Rolling back the upgrade is really easy:
mqsistop $BROKERNAME
. /app/iib/iib-10.0.0.11/server/bin/mqsiprofile
mqsistart $BROKERNAME ##Now running with 10.0.0.11
mqsistop $BROKERNAME
. /app/iib/iib-10.0.0.6/server/bin/mqsiprofile
mqsistart $BROKERNAME ##Now running with 10.0.0.6
Note 1: You have to log out and back in between the two steps, because sourcing mqsiprofile twice is not allowed; you can't load the profile for two different versions within the same session.
Note 2: On my side, I am working on an issue that appeared in IIB 10.0.0.8. The PMR has been open for 8 months and the issue is still present in IIB 10.0.0.11, which means I am still running 10.0.0.7.
3) Reproduce the issue with a new execution group to isolate it:
If you have multiple applications in your execution group, create one execution group per application and deploy each application to its own execution group. If one of the execution groups refuses to start, I would review the application in it.
I hope one of these fixes your problem.
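The value passed to jvmMaxHeapSize in option 1 is a plain byte count. As a minimal sketch (the helper name heap_bytes is hypothetical and not part of IIB), the same arithmetic in Python gives the numbers for the 1 GB to 4 GB range suggested above:

```python
def heap_bytes(gigabytes: int) -> int:
    """Convert a heap size in gigabytes (1 GB = 1024**3 bytes) to bytes."""
    return gigabytes * 1024 ** 3


# Print candidate values for the -v argument of mqsichangeproperties.
for gb in (1, 2, 4):
    print(f"{gb} GB -> {heap_bytes(gb)}")
```

Raising the value in steps like this makes it easier to tell which setting, if any, stops the allocation failures.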

Related

The JVM should have exited but did not

During distributed testing with JMeter 3.3 in non-GUI mode, I'm getting the error below. How can I fix this?
I'm using the same version of JMeter and JDK on the master and the slave machines.
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[main,5,main],
stackTrace:java.net.SocketInputStream#socketRead0
java.net.SocketInputStream#socketRead
java.net.SocketInputStream#read
java.net.SocketInputStream#read
java.io.BufferedInputStream#fill
java.io.BufferedInputStream#read
java.io.DataInputStream#readByte
sun.rmi.transport.StreamRemoteCall#executeCall
sun.rmi.server.UnicastRef#invoke
java.rmi.server.RemoteObjectInvocationHandler#invokeRemoteMethod
java.rmi.server.RemoteObjectInvocationHandler#invoke
com.sun.proxy.$Proxy19#rrunTest
org.apache.jmeter.engine.ClientJMeterEngine#runTest at line:149
org.apache.jmeter.engine.DistributedRunner#start at line:132
org.apache.jmeter.engine.DistributedRunner#start at line:149
org.apache.jmeter.JMeter#runNonGui at line:1005
org.apache.jmeter.JMeter#startNonGui at line:910
org.apache.jmeter.JMeter#start at line:538
sun.reflect.NativeMethodAccessorImpl#invoke0
sun.reflect.NativeMethodAccessorImpl#invoke
sun.reflect.DelegatingMethodAccessorImpl#invoke
java.lang.reflect.Method#invoke
org.apache.jmeter.NewDriver#main at line:248
I strongly recommend using this jmeter property:
jmeterengine.force.system.exit=true
This property is documented in the JMeter properties reference. A couple of Chinese-language web pages tipped me off.
You can add -Jjmeterengine.force.system.exit=true on the command line when launching JMeter, or add jmeterengine.force.system.exit=true to JMETER_HOME/bin/jmeter.properties.
How I Confirmed This Fix
With JMeter 5.1 and java version "1.8.0_231" on MS-Win10, we're using a customized version of the JMeter InfluxDB Backend Listener.
After my 60-second test run from the command line (jmeter.bat -n -t plan.jtl), the command line hung after displaying this output (very similar to the OP's):
Tidying up ... # Wed Jan 29 14:41:04 CST 2020 (1580330464874)
... end of run
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[DestroyJavaVM,5,main], stackTrace:
Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-4,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-1,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
After modifying my command line as follows, jmeter.bat exited cleanly instead of hanging, and all the ugly stack traces went away too:
jmeter.bat -n -Jjmeterengine.force.system.exit=true -t plan.jtl
To confirm that the problem was caused by our customized JMeter InfluxDB backend Listener, I removed it from the .jmx and I also removed the jmeterengine.force.system.exit=true. No hang, no ugly stacktrace (I actually love stacktraces).
I have not taken the next step to discover whether the problem is with the official JMeter InfluxDB backend Listener or with our customized variant, which is not (and will never be) available publicly.
I should mention one gap in this story. I feel this test conclusively points to our customized backend listener (or JMeter's). However, it's odd that none of the threads in the above thread dump seem to belong to the backend listener. I applaud JMeter for doing the right thing by dumping the stack traces; few other apps go to the extent of auto-dumping when appropriate for troubleshooting. But in this case, perhaps JMeter's auto-dump code needs to be enhanced, because it did not point to the culprit backend listener code. Anyone over there at Apache listening in on this?
Good luck.
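For a quick overview of which threads kept the JVM alive, the report quoted in this answer can be summarised with a short script. This is only a sketch: it assumes the report format shown above, and the function name lingering_threads is made up for illustration.

```python
import re

# Matches entries such as "Thread[pool-2-thread-3,5,main]" (name, priority, group).
THREAD_RE = re.compile(r"Thread\[([^,\]]+),\d+,[^\]]*\]")


def lingering_threads(report):
    """Return the listed thread names, skipping DestroyJavaVM (which the report says is OK)."""
    names = THREAD_RE.findall(report)
    return [name for name in names if name != "DestroyJavaVM"]


# Abbreviated copy of the report from this answer.
report = """The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[DestroyJavaVM,5,main], stackTrace:
Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park
Thread[pool-2-thread-4,5,main], stackTrace:sun.misc.Unsafe#park
Thread[pool-2-thread-1,5,main], stackTrace:sun.misc.Unsafe#park
"""
print(lingering_threads(report))
```

Here every remaining name has the default pool-N-thread-M form, which is consistent with the observation that the dump does not name the backend listener directly.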
Most probably your JMeter engines are overloaded and therefore cannot gracefully shut down running threads when you request them to do so.
Make sure you follow JMeter Best Practices:
The very first "best practice" states "Always use latest version of JMeter", so consider migrating to JMeter 5.0 or whatever the latest version is on the JMeter Downloads page.
Make sure your JMeter instances have enough headroom to operate in terms of CPU, RAM and so on. You can use the JMeter PerfMon Plugin for this if you don't have other monitoring software in place or in mind.
Take a thread dump and examine it; this way you will know exactly where your test is stuck.
Introduce reasonable timeout values in HTTP Request Defaults so that, when the server fails to respond, JMeter doesn't wait indefinitely but fails with an error.
And finally (although I wouldn't recommend this), you can suppress this check by adding the following line to the user.properties file:
jmeter.exit.check.pause=-1
If you go for this, keep in mind that JMeter slaves may still be trying to execute something even after your test ends, so you will need to kill and restart the processes manually or with a script.

Unknown Thread pools being created

We have a Spring Boot microservice with several libraries as dependencies, e.g. Jest (Elasticsearch), Hikari, Spring-Rabbit, FasterXML and many more.
After analyzing a thread dump, we found that 2 unknown pools are being created. On a normal development machine, these pools contain 8 to 10 threads, but in the prod environment we observed that each pool has 66 threads. The thread pool names are auto-generated, like pool-7, pool-2, etc.
We want to find out which Java class or library is creating these thread pools and spawning the threads. We tried Oracle Flight Recorder, but even there we could not see the origin of these threads.
Can someone please suggest a way to find out who is creating these threads?
Thanks,
Smita
It's unfortunate that the Thread Start event in Flight Recorder doesn't record the stack trace from the Thread#start method. I will see if it can be added to a future JDK release. You should, however, be able to see which thread starts the new threads.
If you can't find other tools to help you, the only way I can think of is to instrument the java.lang.Thread#start method yourself, either with bytecode instrumentation or by cloning OpenJDK, modifying the source file for java.lang.Thread and building your own custom JDK. The last option may sound daunting, but it's not that hard if you are on JDK 8 or later.
hg clone http://hg.openjdk.java.net/jdk8/jdk8
cd jdk8
bash get_source.sh
bash configure
make images
When you clone, there is a README file in the root that will point you to further instructions, if you should run into problems.
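Before instrumenting anything, it can help to confirm how many threads each anonymous pool holds. Below is a minimal Python sketch, assuming a jstack-style dump in which every thread entry starts with its quoted name; the sample text and the function name are illustrative only. It counts threads per pool but does not reveal who created them.

```python
import re
from collections import Counter

# A default-named executor thread looks like "pool-7-thread-1" at the start of a line.
POOL_RE = re.compile(r'^"(pool-\d+)-thread-\d+"', re.MULTILINE)


def threads_per_pool(dump):
    """Count thread entries per auto-generated pool name (pool-N)."""
    return Counter(POOL_RE.findall(dump))


# Hypothetical, heavily shortened thread dump.
SAMPLE_DUMP = '''"pool-7-thread-1" #31 prio=5 waiting on condition
"pool-7-thread-2" #32 prio=5 waiting on condition
"pool-2-thread-1" #18 prio=5 waiting on condition
"main" #1 prio=5 runnable
'''
print(threads_per_pool(SAMPLE_DUMP))
```

Comparing these counts between the development machine and prod shows which pools grow with the environment.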

Periodic tns-12531: TNS: Cannot allocate memory

I have a problem that's been plaguing me for about a year now. I have Oracle 12.1.x.x installed on my machine. After a day or two, the listener stops responding and listener.log contains a bunch of TNS-12531 messages. If I reboot, the problem goes away and I'm fine for another day or two. I'm lazy and I hate rebooting, so I decided to finally track this down, but I'm having no luck. Since the alternative is to do work that I really don't want to do, I'm going to spend all my time researching this.
Some notes:
Windows 10 Pro
64-Bit
32 GB RAM
Generally, about 20GB free when the error occurs
I have several databases and it doesn't matter which DB is running
Restarting the DB doesn't help
Restarting the listener doesn't help
Only rebooting clears the problem
When I set TRACE_LEVEL_LISTENER = 16, I don't get much more info; nothing is written to the trace files
I can connect to the DB if I bypass the listener (i.e., set ORACLE_SID=xxx and connect without a DB identifier)
All other network interactions seem to work fine after the listener stops
lsnrctl status hangs and adds another TNS-12531 to the listener.log
I have roughly the same config at home and this does not happen
Below is an example of a listener.log file:
Fri Jul 28 14:21:47 2017
System parameter file is D:\app\user\product\12.1.0\dbhome_1\network\admin\listener.ora
Log messages written to D:\app\user\diag\tnslsnr\LJ-Quad\listener\alert\log.xml
Trace information written to D:\app\user\diag\tnslsnr\LJ-Quad\listener\trace\ora_24288_14976.trc
Trace level is currently 16
Started with pid=24288
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=LJ-Quad)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(PIPENAME=\\.\pipe\EXTPROC1521ipc)))
Listener completed notification to CRS on start
TIMESTAMP * CONNECT DATA [* PROTOCOL INFO] * EVENT [* SID] * RETURN CODE
28-JUL-2017 14:22:06 * 12531
TNS-12531: TNS:cannot allocate memory
28-JUL-2017 14:22:47 * 12531
TNS-12531: TNS:cannot allocate memory
28-JUL-2017 14:26:24 * 12531
TNS-12531: TNS:cannot allocate memory
Thanks a bunch for any help you can provide!
Issue 1
This error can occur after approximately 2048 connections have been made via the listener when running on a non-English Windows installation.
Fix for Issue 1
Create a Windows User Group named Administrators on the computer where the listener.exe resides. This can fix the issue of the listener dying.
Reference: I'll post the link for the first issue as soon as I find it again
Issue 2
This error can also occur on Windows 64-Bit systems where the Desktop Application Heap is too small.
Fix for Issue 2
Try increasing the desktop application heap in the Windows registry. The setting is located under
HKLM\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows
Note: do not add this value yourself; rely on the documentation when changing it.
Basically, search for the registry entry and alter the third value of SharedSection=1024,20480,1024. This is a trial-and-error approach, but it seems to improve the listener's stability and reduce its memory issues.
Reference: TNS:cannot allocate memory - is there limit to the num databases on one box (Oracle Developer Community)
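To make the "third value" edit concrete, here is a small Python sketch that rewrites only that field of the SharedSection setting. The function name is hypothetical, and 2048 is an arbitrary illustration rather than a recommended size; the script only builds the new string and does not touch the registry.

```python
def set_third_shared_section_value(setting, new_value_kb):
    """Return the SharedSection setting with only its third value replaced."""
    name, _, values = setting.partition("=")
    parts = values.split(",")
    if name != "SharedSection" or len(parts) != 3:
        raise ValueError("expected SharedSection=<a>,<b>,<c>, got: " + setting)
    parts[2] = str(new_value_kb)  # the first two values are left untouched
    return name + "=" + ",".join(parts)


# 2048 is an assumed example value for a trial-and-error run.
print(set_third_shared_section_value("SharedSection=1024,20480,1024", 2048))
```

Keeping the first two values unchanged matches the advice above to alter only the third one.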

nca_connect_server: cannot communicate with host in LoadRunner 12.53

I'm currently testing HP LoadRunner 12.53 with Oracle EBS R12.2.5.
I created a simple script using both the Oracle Apps and the Oracle NCA + HTTP protocols (log in, bring up a form, then close and log out) and replayed it, but ran into the error below (the same error for both scripts):
nca_connect_server: cannot communicate with host
icx_ticket is correlated and works OK: it is picked up and replaced in the parameter.
There is no need to correlate JSessionIDForms, as EBS is running in socket mode.
It is just a simple script with a single correlation, but I can't find any clue about the error.
What could be the root cause of the error?
Where should I look for a clue? How can I make the error log more verbose and detailed?
Thanks in advance
Record it twice. If the value shifts, then correlate it.
Please make sure the environment is properly set up before recording. The following steps are needed:
1. Set the "record = names" flag for the specific user profile in the Oracle EBS application via an administrator login (search for how to do this, or simply ask your application team to do it for you).
2. Run-Time Settings and default.cfg file changes
Run-Time Settings
Set the values below to a high limit to avoid replay timeout errors:
Run-Time Settings > Internet Protocol > Preferences > Options >
Step Download Limit
HTTP-request connect timeout
HTTP-request receive timeout
Keep-Alive timeout
Run-Time Settings > Browser > Browser Emulation >
1) Simulate a new user on each iteration – checked
3. default.cfg file inside the script directory
The "RelativeURL={NCAJServSessionId}" statement in default.cfg is reset each time the script runs, so check that it is set to one of:
/forms/lservlet;JsessionIDForms={NCAJServSessionId} -- R12 version, or
/forms/formservlet?JServSessionIdforms={NCAJServSessionId} -- EBS 11i version
4. Correlation (last but not least)
Make sure every dynamic parameter is correlated correctly. The best way to do this is to record the script twice and compare the two recordings with a suitable tool, then correlate every value that changes between replays.
Note: Oracle EBS is not fully supported by LoadRunner. Please download the LoadRunner compatibility matrix and check whether your version is supported.
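The "record twice and compare" step can be done with any diff tool. As one possible sketch, Python's standard difflib lists the lines that differ between two recordings; the sample recordings and the function name below are invented for illustration and are not real LoadRunner output.

```python
import difflib


def changed_lines(first_recording, second_recording):
    """Return lines present in only one of the two recordings ('- ' = first, '+ ' = second)."""
    diff = difflib.ndiff(first_recording.splitlines(), second_recording.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical excerpts from two recordings of the same business flow.
first = "login(user)\nicx_ticket=AAA111\nopen_form()"
second = "login(user)\nicx_ticket=BBB222\nopen_form()"

for line in changed_lines(first, second):
    print(line)
```

Any value that shows up in this list across recordings is a candidate for correlation.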

Ruby Stack failed to deploy on Google Developers Console

I tried to deploy the Ruby stack using the Google Developers Console, without success. I tried several times in other projects, and the error was always the same (below).
Do you have any idea why it keeps failing?
2014/10/23 15:59:44
rubyStackBox: PENDING
2014/10/23 15:59:55~2014/10/23 16:06:01
rubyStackBox: DEPLOYING
2014/10/23 16:06:11
rubyStackBox: DEPLOYMENT_FAILED
Replica rubystackbox-eaeo failed with status PERMANENTLY_FAILING: Replica State changed to PERMANENTLY_FAILING. Replica was unhealthy 2 consecutive times.
I replicated the issue you experienced several times and it also failed. What finally worked was playing with the zones/regions when deploying the ruby stack :
Developers console > Click-to-deploy > Set MySQL password > Advanced Options, choose a different zone and click Deploy.
Another useful tool when investigating this is Console Output. Even if the deployment fails, you can go to the VM instance and check View Output towards the bottom of the page. It will list all the packages and any errors encountered. The following command will achieve the same thing:
$ gcloud compute instances get-serial-port-output <INSTANCE_NAME> --project <PROJECT_ID> --zone <ZONE_NAME>
Please advise if still seeing issues.
