Disk full made MQ dead - ibm-mq

We have an application that uses WebSphere MQ 7.0.1.3. During extensive testing in our stage environment, the disks became full.
After this, the MQ is hanging. We removed the application logs (not related to MQ) and added more disk but it didn't solve the problem.
We tried to restart the queue manager:
$ endmqlsr
$ endmqm XYZ
$ strmqm XYZ
WebSphere MQ queue manager 'XYZ' starting.
WebSphere MQ was unable to display an error message 893.
The logs from the time when the disk became full and the error occurred:
----- amqxfdcx.c : 828 --------------------------------------------------------
06/08/2018 03:36:44 AM - Process(8832.5) User(mqm) Program(amqzlaa0)
AMQ6119: An internal WebSphere MQ error has occurred (Rc=28 from write)
----- amqxfdcx.c : 783 --------------------------------------------------------
06/08/2018 03:36:44 AM - Process(8832.5) User(mqm) Program(amqzlaa0)
AMQ6184: An internal WebSphere MQ error has occurred on queue manager XYZ.
----- amqxfdcx.c : 822 --------------------------------------------------------
06/08/2018 03:36:46 AM - Process(8832.5) User(mqm) Program(amqzlaa0)
AMQ6119: An internal WebSphere MQ error has occurred (Rc=28 from write)
----- amqxfdcx.c : 783 --------------------------------------------------------
06/08/2018 03:36:46 AM - Process(8832.5) User(mqm) Program(amqzlaa0)
AMQ6184: An internal WebSphere MQ error has occurred on queue manager XYZ.
AMQ6119: An internal WebSphere MQ error has occurred ('28 - No space left on device' from semget.)
----- amqxfdcx.c : 783 --------------------------------------------------------
06/14/2018 02:35:46 PM - Process(6794.1) User(mqm) Program(amqzxma0)
AMQ6184: An internal WebSphere MQ error has occurred on queue manager XYZ.
----- amqxfdcx.c : 822 --------------------------------------------------------
06/14/2018 02:35:46 PM - Process(6794.1) User(mqm) Program(amqzxma0)
AMQ6118: An internal WebSphere MQ error has occurred (20006037)
When trying to connect with IBM WebSphere MQ Explorer, we get:
Queue manager not available for connection - reason 2059. (AMQ4043)
Severity: 20 (Error)
Explanation: The attempt to connect to the queue manager failed. This could be because the queue manager is incorrectly configured to allow a connection from this system, or the connection has been broken.
Response: Ensure that the queue manager is running. If the queue manager is running on another computer, ensure it is configured to accept remote connections.
Is there a way of clearing all messages from the queues and resetting all flags so the queue manager will start and the queues will work again?
There are only old test data in the queues, nothing of value.
Or do you have any other suggestions on how to fix this?

You can use the mqrc command to get more information on errors. Most of the time MQ reports return codes as a four-digit decimal number. Here the return code has only three digits, which usually (always?) means it is a hexadecimal return code. 893 in hex is 2195 in decimal:
$ mqrc 2195
2195 0x00000893 MQRC_UNEXPECTED_ERROR
This error is thrown when MQ hits an error condition that was not expected. Usually you will find that an FDC file was created in the /var/mqm/errors directory, which could provide some more detail.
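The hex-to-decimal step can also be checked without the mqrc tool. This is a minimal Python sketch, assuming the short code printed by strmqm (893) is hexadecimal as described above; the helper name hex_reason_to_decimal is made up for illustration and is not part of MQ.

```python
def hex_reason_to_decimal(code: str) -> int:
    """Interpret a reason code printed in hexadecimal and return its decimal value."""
    return int(code, 16)


if __name__ == "__main__":
    # 893 is the value from the strmqm output in the question.
    decimal = hex_reason_to_decimal("893")
    # Print it in the same "decimal hex" layout that mqrc uses.
    print(f"{decimal} 0x{decimal:08x}")  # prints: 2195 0x00000893
```

The decimal value is the one to pass to mqrc or to search for in the MQ documentation.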
The best course of action when you receive this type of error is to open a PMR with IBM and have them provide direction on recovery, to ensure you have the best chance of preserving any messages present on your queues. However, you are using a version of MQ (7.0) that has been out of support since September 30th 2015. The specific Fix Pack you are on (7.0.1.3) was released in August 2010. The last release of v7.0 from IBM was 7.0.1.14 in August 2016.
If you pay IBM for extended support you may be able to open a PMR with them for further support.
The best path forward once you have resolved your issue would be to migrate to a supported version of IBM MQ. At the time of writing, v8.0 and v9.0 are the only supported versions of IBM MQ.
Assuming you do not have extended support and are unable to get assistance from IBM, the following are some suggested steps:
Updating even to the latest Fix Pack (7.0.1.14) may help, and if it does not solve the problem it is still better to be at the latest Fix Pack of an unsupported version of IBM MQ.
You could try to cold start your queue manager and see if that helps. This is documented starting on Page 4 of the presentation "WebSphere MQ Disaster Recovery" given by Mark Taylor at Capitalware's MQ Technical Conference v2.0.1.3.
Create a queue manager EXACTLY like the one that failed
Use qm.ini to work out parameters to crtmqm command
Log:
LogPrimaryFiles=10
LogSecondaryFiles=10
LogFilePages=65535
LogType=CIRCULAR
Issue the crtmqm command
crtmqm -lc -lf 65535 -lp 10 -ls 10 -ld /tmp/mqlogs TEMP.QMGR
Make sure there is enough space for the new log files in that directory
Name of the dummy queue manager is irrelevant
Only care about getting the log files
Don’t start this dummy queue manager, just create it
Replace old logs and amqhlctl.lfh with the new ones
cd /var/mqm/log
mv QM1 QM1.SAVE
mv /tmp/mqlogs/TEMP!QMGR QM1
Note the “mangled” directory name … this is normal
Data in the queues is preserved if messages are persistent
Object definitions are also preserved
Objects contain their own definitions in their files
Mapping between files and object names held in QMQMOBJCAT
Once all the above is complete then try and start your queue manager.
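The step "use qm.ini to work out parameters to crtmqm" can be sketched in Python as below. This is only an illustration under assumptions: it handles just the four Log stanza attributes quoted above, the function names are hypothetical, and it only prints the command rather than running it. The flags (-lc/-ll, -lf, -lp, -ls, -ld) are the crtmqm logging options used in the steps above.

```python
# Lines copied from the Log stanza of the failed queue manager's qm.ini.
LOG_STANZA = """\
Log:
LogPrimaryFiles=10
LogSecondaryFiles=10
LogFilePages=65535
LogType=CIRCULAR
"""

# qm.ini attribute -> crtmqm flag, in the order used in the steps above.
NUMERIC_FLAGS = {
    "LogFilePages": "-lf",
    "LogPrimaryFiles": "-lp",
    "LogSecondaryFiles": "-ls",
}
LOG_TYPE_FLAGS = {"CIRCULAR": "-lc", "LINEAR": "-ll"}


def parse_log_stanza(text):
    """Collect key=value pairs; lines without '=' (such as 'Log:') are skipped."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def crtmqm_args(log, log_dir, qmgr_name):
    """Build the crtmqm argument list for a dummy queue manager with matching logs."""
    args = ["crtmqm", LOG_TYPE_FLAGS[log["LogType"].upper()]]
    for attribute, flag in NUMERIC_FLAGS.items():
        args += [flag, log[attribute]]
    args += ["-ld", log_dir, qmgr_name]
    return args


if __name__ == "__main__":
    log = parse_log_stanza(LOG_STANZA)
    # prints: crtmqm -lc -lf 65535 -lp 10 -ls 10 -ld /tmp/mqlogs TEMP.QMGR
    print(" ".join(crtmqm_args(log, "/tmp/mqlogs", "TEMP.QMGR")))
```

Printing the command instead of executing it leaves room to check the values against qm.ini before creating the dummy queue manager.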

Related

IBM MQ failed error 2058

I'm new to MQ Series and tried to start with the "Hello World" sample:
https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_7.5.0/com.ibm.mq.dev.doc/q030200_.htm
I run it on Linux as follows:
helloworld pQueueName QueueName SYSTEM.DEF.SVRCONN/TCP/hostname\(1414\)
I get this error message: ImqQueuemanager::connect failed with reset reason code 2058.
The API documentation says this error code is due to a wrong queue manager name:
http://www-01.ibm.com/support/docview.wss?uid=swg21166938
So why do I get this message, and what do they mean by "wrong queue manager name"?
No, the queue manager and queues must be created explicitly before you can use them. The setName method specifies the queue manager to connect to; it does not create a queue manager.
Watch this video from T.Rob on how to install MQ and use it - https://www.youtube.com/watch?v=wSCHLBftjDw&pbjreload=10. The video uses Linux, which is fine. You can skip the setup part (up to 2 minutes and 20 seconds or so) and start following from the crtmqm command.
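When checking a 2058, it helps to be clear about which channel, host and port the client is actually pointed at, since the queue manager name has to match whatever is listening there. The third argument in the question follows the channel/transport/host(port) layout. This is a minimal Python sketch that splits such a string; the class and function names are hypothetical, and it assumes the port is always given in parentheses.

```python
import re
from typing import NamedTuple


class ClientConnection(NamedTuple):
    channel: str
    transport: str
    host: str
    port: int


# channel/transport/host(port); assumes the port is always present.
_PATTERN = re.compile(r"^([^/]+)/([^/]+)/([^()]+)\((\d+)\)$")


def parse_connection(spec: str) -> ClientConnection:
    """Split a 'channel/transport/host(port)' string into its parts."""
    match = _PATTERN.match(spec)
    if match is None:
        raise ValueError(f"not in channel/transport/host(port) form: {spec!r}")
    channel, transport, host, port = match.groups()
    return ClientConnection(channel, transport, host, int(port))


if __name__ == "__main__":
    # The shell escapes \( and \) from the question are removed by the shell,
    # so the program itself sees plain parentheses.
    print(parse_connection("SYSTEM.DEF.SVRCONN/TCP/hostname(1414)"))
```

With the host and port in hand, the name of the queue manager listening there can be compared with the name passed to the sample.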

Websphere mq listener available but showing not found error

We are facing an error: the application is unable to connect to the queue manager, with reason code MQRC 2538.
WebSphere MQ version: v7.0.1.2.
Operating system: Solaris.
I started the listener manually with:
runmqlsr -m qmname -t tcp -p port
Afterwards I checked the status of the listener with the command:
display lsstatus(listener name)
The listener is available, but when I try to display the status of this listener it shows "MQ object not found".
We have checked the error logs, but there is no information about the failing client connections. We started the listener manually, and only listener information is available in the error logs.
We also checked /var/mqm/errors and found FDC files with probe ID XY132002. We contacted the sysadmin, who mounted additional disk space.
After the disk space was mounted on /var/mqm/, we still face the same issue.
I have already issued "start lstr(lstr name)" in script mode. It accepts the request, but when I try to display the status of this listener it shows "MQ object not found".
I have checked the queue manager error logs and the FDC error logs.
Please find below the errors written in /var/mqm/errors/AMQERR01.LOG:
Explanation: 1. An attempt has been made to run the broker (SFMSICREQMGR) but the broker has ended for reason '6119:xecF_E_UNEXPECTED_SYSTEM_RC'.
Error: AMQ6119: An internal WebSphere MQ error has occurred (failed to get memory segment: shmget(0x00000000, 16384) [rc=1 errno=28] No space left on device).
The error below is written in the queue manager level error log:
AMQ5008: An essential WebSphere MQ process 10063 (amqfgpub) cannot be found and is assumed to be terminated.
These are the errors written in the queue manager level and system level error logs.
We have added the values below:
process.max-file-descriptor=(basic,10000,deny)
project.max-sem-ids=(priv,1024,deny)
project.max-shm-ids=(priv,1024,deny)
project.max-shm-memory=(priv,4294967296,deny)
After adding these parameters we restarted the queue managers.
We have four queue managers on the server. Three queue managers and their listeners are in a running state; the fourth queue manager faces the same error.
When we stopped one queue manager and started the fourth, the fourth queue manager ran and its listener was also in a running state.
One queue manager cannot be started; we face the same error for this queue manager.
All queue managers and listeners are running fine.
We have created a local queue named error_local_queue, but when the application tries to get a message from this queue it gets the error
MQRC 2033.
Kindly help with this issue.
Thank you so much to all, the issue got resolved.
If you start a listener using the following command (as per your question):-
runmqlsr -m qmname -t tcp -p port
Then you have not specified a name for the listener anywhere (because this command does not have that capability).
It will however still show up in a DISPLAY LSSTATUS command with a system generated name. If you use the following command:-
DISPLAY LSSTATUS(*)
that will show all running listeners, and you will see that there is one with a name something like SYSTEM.LISTENER.TCP.1 which is your runmqlsr one.
Alternatively, if you want to give your listener a specific name, then you must define a listener as follows (replacing nnnn with your port number):-
DEFINE LISTENER(TCP.LSTR) TRPTYPE(TCP) CONTROL(QMGR) PORT(nnnn)
Then you are able to start it as follows:-
START LISTENER(TCP.LSTR)
and show its status as follows:-
DISPLAY LSSTATUS(TCP.LSTR) ALL
N.B. I used the name TCP.LSTR but you may choose any name you wish.
The errors you mention at the end of your question are unrelated to listeners. Please open a separate question for those.
MQ v7.0 has been out of support since September 30th 2015.
The errors you found indicate the queue manager is short on shared memory. This could cause the entire queue manager to have issues, including your listener. The current values, along with IBM's recommendations, can be found using the mqconfig script.
MQ v7.0 did not come with the mqconfig script. Download the script and verify which kernel settings are not correct; the download site is "How to configure UNIX and Linux systems for IBM MQ".
You can find more information on setting these in the IBM MQ v7 Knowledge Center page "Resource limit configuration".
The values in the Knowledge Center are recommended values for an average server with a couple of queue managers and should be treated as minimums. If you can't run 4 queue managers then I would suggest going to higher values. I would start by setting max-sem-ids and max-shm-ids to 10240 and see if that solves it; if not, then try adding 50% to the max-shm-memory value.
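The arithmetic behind that suggestion, worked through as a small Python sketch. The starting values are the ones from the question; the two function names are hypothetical and the sketch only computes numbers, it does not change any Solaris project settings.

```python
# Values currently set in the question.
CURRENT_LIMITS = {
    "project.max-sem-ids": 1024,
    "project.max-shm-ids": 1024,
    "project.max-shm-memory": 4294967296,
}


def raise_id_limits(limits, new_value=10240):
    """First step: raise the semaphore and shared memory ID limits."""
    updated = dict(limits)
    updated["project.max-sem-ids"] = new_value
    updated["project.max-shm-ids"] = new_value
    return updated


def grow_shm_memory(limits, percent=50):
    """Second step, only if the first is not enough: add a percentage to max-shm-memory."""
    updated = dict(limits)
    current = updated["project.max-shm-memory"]
    updated["project.max-shm-memory"] = current + current * percent // 100
    return updated


if __name__ == "__main__":
    step1 = raise_id_limits(CURRENT_LIMITS)
    step2 = grow_shm_memory(step1)
    for name, value in step2.items():
        print(f"{name} = {value}")
```

For the values in the question, adding 50% takes max-shm-memory from 4294967296 (4 GiB) to 6442450944 (6 GiB).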

MQSeries error with MQCONN on TESTQMGR - compcode = 2, reason = 2058

I am using a Java application to connect to WMQ in order to create a test suite that passes a message from a file to a queue and waits for a response from another queue. I am using WMQ V7.0 and the ih03_RFHutil package provided by IBM, but after configuring everything correctly I get the error message below. It looks like an authentication issue. Can someone please help me with this?
Below are the logs I have taken using log4j:
2017-03-06 17:26:01 DEBUG Runner_TMH_Tester:108 - initial sleep time 20 tune = 0
2017-03-06 17:26:01 DEBUG Runner_TMH_Tester:108 - connecting to TESTQMGR
2017-03-06 17:26:01 DEBUG Runner_TMH_Tester:108 - MQSeries error with MQCONN on TESTQMGR - compcode = 2, reason = 2058
MQ v7.0 was released June 27th 2008 and has been out of support since September 30th 2015 (almost 1.5 years). The version probably does not have anything to do with your issue, but I would strongly suggest that you move to a supported version of the MQ client. Newer MQ client versions can connect to older MQ queue managers. You can download a Java-only install of the MQ 8.0 or MQ 9.0 jar files at the links below:
IBM MQ v8.0 Client
IBM MQ v9.0 Client
The MQ client and queue manager installs come with a program called mqrc. You can run this against the MQ return code, in this case 2058, to get a more meaningful description:
$ mqrc 2058
2058 0x0000080a MQRC_Q_MGR_NAME_ERROR
This is telling you that TESTQMGR is not the name of the queue manager that exists on the host and port you are connecting to. Verify that the queue manager name, hostname, and port are all correct.

Latest WAS Liberty MQ connection with SSL certificate

Guys, I am trying to connect to an MQ Hub from a WAS Liberty application. Our MQ Hub supports only SSL certificate authentication. I have created a QCF and a JKS keystore with the certificate inside it. Then I created a defaultSSLConfig and pointed it to that keystore.
But I could not find any way to specify the SSLConfig in the QCF, and I read on some page that it is not possible. The only way was to use defaultSSLConfig and specify the keystore from there, which I did. The MQ connection still does not work. In the MQ Hub logs I see an error saying "The channel is lacking a certificate to use for the SSL handshake."
This is what my QCF looks like; there is no parameter to specify an SSL config:
<jmsConnectionFactory connectionManagerRef="ConMgr" jndiName="jms/wmqCF">
<properties.wmqJms channel="TEST_CHANNEL" hostName="REMOVED" port="1415" queueManager="ALQ.TEST" transportType="CLIENT" sslCipherSuite="SSL_RSA_WITH_AES_128_CBC_SHA"/>
</jmsConnectionFactory>
Full error on MQ side
EXPLANATION:
The channel is lacking a certificate to use for the SSL handshake. The channel
name is 'XXX.ADM.SVRCONN' (if '????' it is unknown at this stage in the SSL
processing).
The remote host is 'XXX (10.xx.xx.x)'.
The channel did not start.
ACTION:
Make sure the appropriate certificates are correctly configured in the key
repositories for both ends of the channel.
----- amqccisa.c : 7355
02/14/17 15:07:44 - Process(7510.304808) User(mqm) Program(amqrmppa)
Host(xxx) Installation(Installation1)
VRMF(7.5.0.6) QMgr(XXXXX)
AMQ9999: Channel 'XXX.ADM.SVRCONN' to host 'xxx (10.xx.xx.xx)' ended abnormally.
EXPLANATION:
The channel program running under process ID 7510 for channel 'XX.ADM.SVRCONN'
ended abnormally. The host name is 'xx (10.xx.xx.xx)'; in some cases the
host name cannot be determined and so is shown as '????'.
ACTION:
Look at previous error messages for the channel program in the error logs to
determine the cause of the failure. Note that this message can be excluded
completely or suppressed by tuning the "ExcludeMessage" or "SuppressMessage"
attributes under the "QMErrorLog" stanza in qm.ini. Further information can be
found in the System Administration Guide.
It is working now :) We think the cause of the problem was this bug: http://www-01.ibm.com/support/docview.wss?uid=swg1IT16056
The error described in the APAR above is not the same one I was getting, though. I was seeing this error on the Liberty (client) side:
Caused by: com.ibm.mq.MQException: JMSCMQ0001: IBM MQ call failed with compcode '2' ('MQCC_FAILED') reason '2059' ('MQRC_Q_MGR_NOT_AVAILABLE').
at com.ibm.msg.client.wmq.common.internal.Reason.createException(Reason.java:203)
I was using this resource adapter when the problem was manifesting itself: 9.0.1.0-IBM-MQ-Java-InstallRA.jar
Then we decided to try a lower version of the adapter which had that APAR/fix in it, and used this one: 8.0.0.6-WS-MQ-Java-InstallRA.jar
That solved the problem.
I was pretty sure that the above bug fix was included in version 9.x of the resource adapter, but as it turns out that is not the case.
I checked with IBM and they confirmed that APAR IT16056 is not included in the 9.0.1.0 CD release. They are working to correct the APAR to show the right target release for the fix.
Quote from IBM support is below.
I can confirm that the APAR in question, "IT16056" is NOT included in
the 9.0.1.0 CD release, and is currently targeted to be included in
the 9.0.2.0 CD release.
Based on this, if you want to use a version of the RA higher than 8.0 you would need to do one of the following:
Wait until the 9.0 LTS (Long Term Support) 9.0.0.1 Fix Pack is released (IBM has a site where they list that they are targeting 1Q 2017).
Wait until the 9.0 CD (Continuous Delivery) 9.0.2.0 release is out (IBM does not publish a target for CD).
Open a PMR with IBM and ask them for an IFIX to apply to the 9.0.0.1 LTS or 9.0.1.0 CD release.

MQ Explorer stops displaying most information

After using MQ Explorer 7.5 on Ubuntu 12 to add JMS Connection factories and a JMS Destination it decided to stop displaying my two queues and subsidiary info as well as the new JMS information. I tried a few things to get it to work again: stopping the queue manager/restarting, rebooting etc. even reinstalling MQ Explorer without any luck.
I can do a status on the "empty" queues folder and it then shows me my two queues; each has "queue monitoring" set to off. Is this relevant? Can I turn it on?
Am I stuck with MQ Explorer to display and manage the JMS objects (there doesn't seem to be any documentation about how to use the command line for JMS objects)?
More detail:
I created the objects using the following:
DEFINE QLOCAL (QUEUE_FROM)
DEFINE QLOCAL (QUEUE_TO)
SET AUTHREC PROFILE(QUEUE_FROM) OBJTYPE(QUEUE) PRINCIPAL('bsmith') AUTHADD(PUT,GET)
SET AUTHREC PROFILE(QUEUE_TO) OBJTYPE(QUEUE) PRINCIPAL('bsmith') AUTHADD(PUT,GET)
SET AUTHREC OBJTYPE(QMGR) PRINCIPAL('bsmith') AUTHADD(CONNECT)
DEFINE CHANNEL (CHANNEL1) CHLTYPE (SVRCONN) TRPTYPE (TCP)
SET CHLAUTH(CHANNEL1) TYPE(ADDRESSMAP) ADDRESS('127.0.0.1') MCAUSER('bsmith')
DEFINE LISTENER (LISTENER1) TRPTYPE (TCP) CONTROL (QMGR) PORT (1415)
START LISTENER (LISTENER1)
So these were all visible then in MQ Explorer using a user that was part of group mqm.
I then added, using MQ Explorer, a file based JMS context, two JMS Connection Factories, and a JMS Destination. After adding the JMS Destination the MQ Explorer stopped displaying everything except the Queue Manager and the JMS context in the MQ Explorer UI.
If I try to start the listener again using the command START LISTENER (LISTENER1), it tells me that it is already started. When I add a new queue to the queue manager using a command, it is also not visible in the UI. A refresh doesn't change this.
/etc/environment is set to:
export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_45
export MQSERVER="SWI_CHANNEL/TCP/COM22189(1415)"
export MQ_JAVA_LIB_PATH=/opt/mqm/java/lib64
export MQ_JAVA_INSTALL_PATH=/opt/mqm/java
export MQ_JAVA_DATA_PATH=/var/mqm
export LD_LIBRARY_PATH=/opt/mqm/java/lib64
CLASSPATH=.:/opt/mqm/java/lib/com.ibm.mq.jar:/opt/mqm/java/lib/com.ibm.mqjms.jar:/opt/mqm/samp/wmqjava/samples:/opt/mqm/samp/jms/samples:${JAVA_HOME}:${MQ_JAVA_LIB_PATH}:${CLASSPATH}
PATH=".:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:${JAVA_HOME}/bin:${JAVA_HOME}:/usr/lib/jvm/jdk1.6.0_45/jre/bin:${MQ_JAVA_LIB_PATH}"
Trying the suggested JMSAdmin tool gives:
/opt/mqm/java/bin$ ./JMSAdmin -v
Licensed Materials - Property of IBM
5724-H72, 5655-R36, 5724-L26, 5655-L82
(c) Copyright IBM Corp. 2008, 2011 All Rights Reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Starting WebSphere MQ classes for Java(tm) Message Service Administration
Initializing JNDI Context...
INITIAL_CONTEXT_FACTORY: com.sun.jndi.fscontext.RefFSContextFactory
PROVIDER_URL: file:/C:/JNDI-Directory
JNDI initialization failed, please check your JNDI settings and service.
The name '"/C:/JNDI-Directory"' cannot be resolved
Error: javax.naming.NameNotFoundException; remaining name '"/C:/JNDI-Directory"
The error "Error: javax.naming.NameNotFoundException; remaining name '"/C:/JNDI-Directory"" can be resolved by creating a folder named JNDI-Directory on the C: drive, which is the location PROVIDER_URL points to. This is where the .bindings file will be generated.
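The same fix can be sketched in Python: take the PROVIDER_URL shown in the JMSAdmin output, work out which directory it refers to, and create that directory if it is missing. The function names are hypothetical. On Windows the URL maps to C:\JNDI-Directory; on a Linux system such as the one in the question it would presumably map to the literal path /C:/JNDI-Directory, so pointing PROVIDER_URL at an existing directory may be the more practical choice there.

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname


def provider_directory(provider_url: str) -> str:
    """Return the local directory that a file: PROVIDER_URL refers to."""
    parsed = urlparse(provider_url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file: URL: {provider_url!r}")
    # url2pathname applies the path conventions of the current operating system.
    return url2pathname(parsed.path)


def ensure_provider_directory(provider_url: str) -> str:
    """Create the directory behind PROVIDER_URL if it does not exist yet."""
    path = provider_directory(provider_url)
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    # PROVIDER_URL value taken from the JMSAdmin output above.
    print(provider_directory("file:/C:/JNDI-Directory"))
```

Only provider_directory is called in the example, so running it does not create anything; ensure_provider_directory is the part that makes the folder.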

Resources