Parse IBM MQ v9.1 Error Logs using Splunk - ibm-mq

I'm forwarding my IBM MQ v9.1 error logs with the Splunk forwarder to a centralized cluster to see trends in common errors occurring across my distributed messaging systems.
However, I'm unable to parse the required fields because the format of the MQ error logs varies: the severity of a message can be error, warning, informational, severe, or termination, and each severity has a different, inconsistent set of fields.
Please let me know if anyone has used regex in Splunk to parse the fields of IBM MQ error logs for v9.1.
I have tried a few regex patterns, but they did not parse the logs as expected.
I have already referred to the link below, but it covers v8 and the error log format is different in v9:
https://t-rob.net/2017/12/18/parsing-mq-error-logs-in-splunk/
Also, the splunk user is unable to access the error logs. I have added the stanza below to qm.ini:
Filesystem:
ValidateAuth=No
I also ran chmod -R 755 on the /var/mqm/qmgrs/qmName/errors folder.
The permissions on the error logs do not change when the logs are updated, but when the logs rotate the permissions are reset and the splunk user can no longer read them.
Please let me know how to overcome this without adding the splunk user to the mqm group.

I would suggest enabling JSON logging and forwarding those logs to Splunk, which should be able to parse this format.
In the IBM MQ v9.0.4 CD (Continuous Delivery) release, IBM added the ability to log to a JSON-formatted file. MQ will always log to the original AMQERR0x.LOG files even if you enable JSON logging. This feature is included in all MQ 9.1 LTS and CD releases.
The IBM MQ v9.1 Knowledge Center page "Diagnostic message services" (under IBM MQ > Configuring > Changing IBM MQ and queue manager configuration information > Attributes for changing queue manager configuration information > Diagnostic message logging > Diagnostic message service stanzas) has information on the topic. You can add the following to your qm.ini to have it write the log information to a JSON-formatted file called AMQERR0x.json in the standard queue manager errors directory:
DiagnosticMessages:
Service = File
Name = JSONLogs
Format = json
FilePrefix = AMQERR
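Once the JSON file is being written, each record can be handled as a plain JSON object instead of with a regex. The following is a minimal Python sketch of that idea, assuming one JSON object per line; the field names come from the sample record shown further down in this answer, and summarize_mq_record is a hypothetical helper, not part of MQ or Splunk.

```python
import json

# Field names taken from the sample record in this answer.
# summarize_mq_record is a hypothetical helper for illustration only.
FIELDS = ("ibm_datetime", "ibm_serverName", "ibm_messageId", "loglevel", "message")


def summarize_mq_record(line):
    """Parse one JSON log line and keep only the fields of interest."""
    record = json.loads(line)
    # .get() tolerates records that lack a key, since not every
    # message is guaranteed to carry the same set of fields.
    return {name: record.get(name) for name in FIELDS}


# A shortened version of the sample record, built with json.dumps
# so that quoting inside the message text is handled correctly.
sample = json.dumps({
    "ibm_messageId": "AMQ9209E",
    "ibm_datetime": "2018-02-22T06:54:53.942Z",
    "ibm_serverName": "QM1",
    "loglevel": "ERROR",
    "message": "AMQ9209E: Connection to host 'localhost (127.0.0.1)' "
               "for channel 'SYSTEM.DEF.SVRCONN' closed.",
})

summary = summarize_mq_record(sample)
print(summary["ibm_messageId"], summary["loglevel"])
```

Because every severity uses the same key names, the same lookup works for error, warning, and informational records alike.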
As noted by the OP the JSON formatted logs do not contain the EXPLANATION or ACTION portion that you see in the normal logs.
In IBM MQ v9.1 you can use the mqrc command to convert the JSON format to the familiar format you see in AMQERR01.LOG.
One simple example is below:
cat <<EOL |mqrc -i json -o text -
{"ibm_messageId":"AMQ9209E","ibm_arithInsert1":0,"ibm_arithInsert2":0,"ibm_commentInsert1":"localhost (127.0.0.1)","ibm_commentInsert2":"TCP/IP","ibm_commentInsert3":"SYSTEM.DEF.SVRCONN","ibm_datetime":"2018-02-22T06:54:53.942Z","ibm_serverName":"QM1","type":"mq_log","host":"0df0ce19c711","loglevel":"ERROR","module":"amqccita.c:4214","ibm_sequence":"1519282493_947814358","ibm_remoteHost":"127.0.0.1","ibm_qmgrId":"QM1_2018-02-13_10.49.57","ibm_processId":4927,"ibm_threadId":4,"ibm_version":"9.1.0.5","ibm_processName":"amqrmppa","ibm_userName":"johndoe","ibm_installationName":"Installation1","ibm_installationDir":"/opt/mqm","message":"AMQ9209E: Connection to host 'localhost (127.0.0.1)' for channel 'SYSTEM.DEF.SVRCONN' closed."}
EOL
The output will be:
02/22/2018 06:54:53 AM - User(johndoe) Program(amqrmppa)
Host(0df0ce19c711) Installation(Installation1)
VRMF(9.1.0.5) QMgr(QM1)
Time(2018-02-22T11:54:53.942Z)
RemoteHost(127.0.0.1)
CommentInsert1(localhost (127.0.0.1))
CommentInsert2(TCP/IP)
CommentInsert3(SYSTEM.DEF.SVRCONN)
AMQ9209E: Connection to host 'localhost (127.0.0.1)' for channel
'SYSTEM.DEF.SVRCONN' closed.
EXPLANATION:
An error occurred receiving data from 'localhost (127.0.0.1)' over TCP/IP. The
connection to the remote host has unexpectedly terminated.
The channel name is 'SYSTEM.DEF.SVRCONN'; in some cases it cannot be determined
and so is shown as '????'.
ACTION:
Tell the systems administrator.
----- amqccita.c : 4214 -------------------------------------------------------
You can also run mqrc with just the message ID from the JSON, for example AMQ9209E:
mqrc AMQ9209E
The output will be:
536908297 0x20009209 rrcE_CONNECTION_CLOSED
536908297 0x20009209 urcMS_CONN_CLOSED
MESSAGE:
Connection to host '<insert one>' for channel '<insert three>' closed.
EXPLANATION:
An error occurred receiving data from '<insert one>' over <insert two>. The
connection to the remote host has unexpectedly terminated.
The channel name is '<insert three>'; in some cases it cannot be determined and
so is shown as '????'.
ACTION:
Tell the systems administrator.
You could take it further and specify the inserts from the JSON.
Example portion of the JSON log:
"ibm_messageId":"AMQ9209E","ibm_arithInsert1":0,"ibm_arithInsert2":0,"ibm_commentInsert1":"localhost (127.0.0.1)","ibm_commentInsert2":"TCP/IP","ibm_commentInsert3":"SYSTEM.DEF.SVRCONN"
In the command below, each ibm_arithInsert is specified in order with a preceding -n flag, followed by each ibm_commentInsert with a preceding -c flag:
mqrc AMQ9209E -n 0 -n 0 -c "localhost (127.0.0.1)" -c "TCP/IP" -c "SYSTEM.DEF.SVRCONN"
The output is below:
536908297 0x20009209 rrcE_CONNECTION_CLOSED
536908297 0x20009209 urcMS_CONN_CLOSED
MESSAGE:
Connection to host 'localhost (127.0.0.1)' for channel 'SYSTEM.DEF.SVRCONN'
closed.
EXPLANATION:
An error occurred receiving data from 'localhost (127.0.0.1)' over TCP/IP. The
connection to the remote host has unexpectedly terminated.
The channel name is 'SYSTEM.DEF.SVRCONN'; in some cases it cannot be determined
and so is shown as '????'.
ACTION:
Tell the systems administrator.
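The mapping from JSON inserts to mqrc flags described above can be automated. Below is a rough Python sketch, assuming the record has already been parsed into a dict; build_mqrc_args and the internal helper are hypothetical names introduced here, and only the -n and -c flags shown in this answer are used.

```python
def _insert_keys(record, prefix):
    """Return keys such as ibm_arithInsert1, ibm_arithInsert2 in numeric order."""
    keys = [k for k in record
            if k.startswith(prefix) and k[len(prefix):].isdigit()]
    return sorted(keys, key=lambda k: int(k[len(prefix):]))


def build_mqrc_args(record):
    """Build an argument list for mqrc from one parsed JSON log record."""
    args = ["mqrc", record["ibm_messageId"]]
    # Arithmetic inserts first, each preceded by -n.
    for key in _insert_keys(record, "ibm_arithInsert"):
        args += ["-n", str(record[key])]
    # Comment inserts next, each preceded by -c.
    for key in _insert_keys(record, "ibm_commentInsert"):
        args += ["-c", str(record[key])]
    return args


record = {
    "ibm_messageId": "AMQ9209E",
    "ibm_arithInsert1": 0,
    "ibm_arithInsert2": 0,
    "ibm_commentInsert1": "localhost (127.0.0.1)",
    "ibm_commentInsert2": "TCP/IP",
    "ibm_commentInsert3": "SYSTEM.DEF.SVRCONN",
}

print(build_mqrc_args(record))
```

On a machine where mqrc is installed, the resulting list could be passed to subprocess.run; keeping it as a list avoids shell quoting problems with inserts that contain spaces or parentheses.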

Related

Adding user to ActiveMQ Artemis fails on Windows

I'm trying to add a user to ActiveMQ Artemis on Windows. I have created an instance and started it. Then I run the command:
artemis user add --user admin --password admin --user-command-user another_admin --user-command-password another_admin --role admin --url tcp://localhost:61616
The command fails with message:
The system cannot find the path specified.
The syntax of the command is incorrect.
Connection brokerURL = tcp://localhost:61616
Failed to add user another_admin. Reason: AMQ229220: Failed to load user file: /C:/Program%20Files/apache-artemis-2.26.0-instance1/etc/artemis-users.properties
How to fix?
There is a bug related to how the broker deals with spaces in the path to the user/role files. I've sent a PR to resolve the problem. The fix should be in 2.27.0.
In the meantime you can work around the issue by putting the broker instance on a path that has no spaces.

Setup WSO2 Enterprise Integrator VFS connection towards Windows SFTP server

We are running WSO2 Enterprise Integrator 6.5.0 on RHEL 7 and are in the process of building flows to read files from an SFTP server. Setting up the SFTP connection to a Windows SFTP server fails, although we can access the same server correctly with Windows clients like FileZilla and WinSCP.
With netstat we see that a connection is established to the Windows SFTP server, but the flow isn't moving and no files are being read. When the server is stopped, the error shown below is printed in wso2carbon.log.
When setting up the connection to a Linux SFTP server (a plain RHEL 7 box with SSHD) we don't face any issues. We have the matching private key placed under .ssh/id_rsa in the home directory of the user running WSO2 EI.
Searching for the error message (see the snippet below) suggests it should be resolved by adding the transport.vfs.AvoidPermissionCheck=true parameter to the VFS URL, but unfortunately this doesn't solve our issue.
This is the VFS URL we are using.
sftp://SFTPUSER@SERVER.ACMECORP.ORG/inputdir?transport.vfs.AvoidPermissionCheck=true;vfs.passive=true
Is this a configuration that should work, and are we missing a configuration option? Or is this a bug in the WSO2 software?
These URLs mention the issue we are facing:
VFS2 Error cannot delete file and could not get the groups id of the current user (error code: -1)
https://issues.apache.org/jira/browse/VFS-617
https://github.com/wso2/product-ei/issues/3725
[2019-12-06 13:48:59,724] [-1] [] [vfs-Worker-2] ERROR {org.apache.synapse.transport.vfs.VFSTransportListener} - Error checking for existence and readability : sftp://SFTPUSER@SERVER.ACMECORP.ORG/inputdir?transport.vfs.AvoidPermissionCheck=true;vfs.passive=true
org.apache.commons.vfs2.FileSystemException: Could not determine if file "sftp://SFTPUSER@SERVER.ACMECORP.ORG/inputdir?transport.vfs.AvoidPermissionCheck=true;vfs.passive=true" is readable.
at org.apache.commons.vfs2.provider.AbstractFileObject.isReadable(AbstractFileObject.java:1494)
at org.apache.synapse.transport.vfs.VFSTransportListener.scanFileOrDirectory(VFSTransportListener.java:295)
at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:188)
at org.apache.synapse.transport.vfs.VFSTransportListener.poll(VFSTransportListener.java:134)
at org.apache.axis2.transport.base.AbstractPollingTransportListener$1$1.run(AbstractPollingTransportListener.java:67)
at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(NativeWorkerPool.java:172)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.jcraft.jsch.JSchException: Could not get the groups id of the current user (error code: -1)
at org.apache.commons.vfs2.provider.sftp.SftpFileSystem.getGroupsIds(SftpFileSystem.java:219)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.getPermissions(SftpFileObject.java:250)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.doIsReadable(SftpFileObject.java:264)
at org.apache.commons.vfs2.provider.AbstractFileObject.isReadable(AbstractFileObject.java:1492)
... 8 more
UPDATE
Using the same URL but then setting up the WSO2 flow to write a file towards the SFTP server works.
I got this resolved with support from WSO2.
The correct VFS URL to use is:
sftp://SFTPUSER@SERVER.ACMECORP.ORG/inputdir?transport.vfs.AvoidPermissionCheck=true&vfs.passive=true
So the separator between parameters is '&' instead of ';'.
The WSO2 documentation is vague about the correct syntax to use and gives different examples on different pages:
https://docs.wso2.com/display/EI650/VFS+Transport
https://docs.wso2.com/display/EI650/File+Inbound+Protocol
https://docs.wso2.com/display/EI650/Configuring+File+Inbound+Protocol+for+FTP%2C+SFTP+and+FILE+Connections
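To illustrate why the separator matters, here is a small Python sketch. It assumes the query string is split into parameters on '&' only; that is an assumption about the behaviour, not something confirmed from the WSO2 source, and split_query is a hypothetical helper written for this illustration.

```python
def split_query(query, separator="&"):
    """Split 'k=v<sep>k=v' into a dict, splitting only on the given separator."""
    params = {}
    for pair in query.split(separator):
        # partition splits on the first '=' only, so anything after it
        # (including a stray ';') ends up inside the value.
        key, _, value = pair.partition("=")
        params[key] = value
    return params


# Query strings copied from the two VFS URLs in this question.
broken = "transport.vfs.AvoidPermissionCheck=true;vfs.passive=true"
fixed = "transport.vfs.AvoidPermissionCheck=true&vfs.passive=true"

print(split_query(broken))
print(split_query(fixed))
```

Under that assumption the ';' form yields a single parameter whose value is "true;vfs.passive=true", so the permission check is never recognised as disabled, while the '&' form yields two separate parameters.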

Why I get AMQ7077 even after turning security in WebSphere MQ off?

On Windows 7, when I set MQSNOAUT=yes everything is fine and I can do whatever I want in WebSphere MQ. But on Red Hat, even after setting MQSNOAUT to yes, I get this error:
[root@RHEL6-135 bin]$ ll crtmqm
-rwxrwxrwx. 1 mqm mqm 41822 Oct 22 2015 crtmqm
[root@RHEL6-135 bin]$ crtmqm testqm
AMQ7077: You are not authorized to perform the requested operation.
[root@RHEL6-135 bin]$
Using the mqm user I can create a queue manager but cannot start it:
[mqm@RHEL6-135 bin]$ crtmqm testqm
WebSphere MQ queue manager created.
Directory '/var/mqm/qmgrs/testqm' created.
The queue manager is associated with installation 'Installation1'.
Creating or replacing default objects for queue manager 'testqm'.
Default objects statistics : 79 created. 0 replaced. 0 failed.
Completing setup.
Setup completed.
[mqm@RHEL6-135 bin]$ strmqm testqm
WebSphere MQ queue manager 'testqm' starting.
The queue manager is associated with installation 'Installation1'.
5 log records accessed on queue manager 'testqm' during the log replay phase.
Log replay for queue manager 'testqm' complete.
Transaction manager state recovered for queue manager 'testqm'.
The queue manager ended for reason 545284129, ''.
[mqm@RHEL6-135 bin]$
Unfortunately, there is no useful information in these log files:
/var/mqm/errors/AMQERR01.LOG:
----- amqxfdcx.c : 888 --------------------------------------------------------
03/14/2017 10:00:16 AM - Process(15859.1) User(mqm) Program(amqzmur0)
Host(RHEL6-135) Installation(Installation1)
VRMF(8.0.0.4)
AMQ6125: An internal WebSphere MQ error has occurred.
EXPLANATION:
An internal error has occurred with identifier 2080520F. This message is
issued in association with other messages.
ACTION:
Use the standard facilities supplied with your system to record the problem
identifier and to save any generated output files. Use either the MQ Support
site: http://www.ibm.com/software/integration/wmq/support/, or IBM Support
Assistant (ISA): http://www.ibm.com/software/support/isa/, to see whether a
solution is already available. If you are unable to find a match, contact your
IBM support center. Do not discard these files until the problem has been
resolved.
...
repeated 27 times!
/var/mqm/qmgrs/testqm/errors/AMQERR01.LOG:
03/14/2017 10:00:16 AM - Process(15840.4) User(mqm) Program(amqzmuc0)
Host(RHEL6-135) Installation(Installation1)
VRMF(8.0.0.4) QMgr(testqm)
AMQ5051: The queue manager task 'LOGGER-IO' has started.
EXPLANATION:
The critical utility task manager has started the LOGGER-IO task. This task has
now started 1 times.
ACTION:
None.
-------------------------------------------------------------------------------
....
-------------------------------------------------------------------------------
03/14/2017 10:00:16 AM - Process(15859.6) User(mqm) Program(amqzmur0)
Host(RHEL6-135) Installation(Installation1)
VRMF(8.0.0.4) QMgr(testqm)
AMQ5037: The queue manager task 'DEFERRED_DELIVERY' has started.
EXPLANATION:
The restartable utility task manager has started the DEFERRED_DELIVERY task.
This task has now started 1 times.
ACTION:
None.
-------------------------------------------------------------------------------
The mqm user is a sudoer, and the following is part of my /etc/group file:
root:x:0:root, mqm, bin
adm:x:4:root,adm,daemon, mqm, mquser
mqm:x:500:root, mqm
mquser:x:502:mqm
Regardless of all this, I think setting the MQSNOAUT variable to yes should be enough to work with WebSphere MQ as any user. Maybe something related to Red Hat caused the problem.
BTW, searching for "The queue manager ended for reason 545284129, ''." I couldn't find any solution.
Any thoughts?
UPDATE
After running chmod -R 6550 on /opt/mqm/bin, I can now start queue managers and create queues, channels, and so on using IBM MQ's command-line binaries. However, I still can't use MQ Explorer, which would be more convenient, because when I run MQExplorer I get the following error:
[mqm@RHEL6-135 bin]$ MQExplorer
No protocol specified
MQExplorer: Cannot open display:
No protocol specified
No protocol specified
MQExplorer: Cannot open display:
MQExplorer:
An error has occurred. See the log file
/var/mqm/IBM/WebSphereMQ/workspace-Installation1/.metadata/.log.
[mqm@RHEL6-135 bin]$
Running it with sudo I get this error:
[mqm@RHEL6-135 bin]$ sudo MQExplorer
[sudo] password for mqm:
/opt/mqm/java/jre64/jre/bin/java: error while loading shared libraries: libjli.so: cannot open shared object file: No such file or directory
(process:4451): Gtk-WARNING **: This process is currently running setuid or setgid.
This is not a supported use of GTK+. You must create a helper
program instead. For further details, see:
http://www.gtk.org/setuid.html
Refusing to initialize GTK+.
[mqm@RHEL6-135 bin]$
and the /var/mqm/IBM/WebSphereMQ/workspace-Installation1/.metadata/.log is as follows:
!SESSION 2017-03-15 16:41:52.369 -----------------------------------------------
eclipse.buildId=unknown
java.fullversion=JRE 1.7.0 IBM J9 2.7 Linux amd64-64 Compressed References 20150630_255653 (JIT enabled, AOT enabled)
J9VM - R27_Java727_SR3_20150630_2236_B255653
JIT - tr.r13.java_20150623_94888.01
GC - R27_Java727_SR3_20150630_2236_B255653_CMPRSS
J9CL - 20150630_255653
BootLoader constants: OS=linux, ARCH=x86_64, WS=gtk, NL=en_US
Command-line arguments: -os linux -ws gtk -arch x86_64
!ENTRY org.eclipse.osgi 4 0 2017-03-15 16:41:54.516
!MESSAGE Application error
!STACK 1
org.eclipse.swt.SWTError: No more handles [gtk_init_check() failed]
at org.eclipse.swt.SWT.error(SWT.java:4423)
at org.eclipse.swt.widgets.Display.createDisplay(Display.java:925)
at org.eclipse.swt.widgets.Display.create(Display.java:909)
at org.eclipse.swt.graphics.Device.<init>(Device.java:156)
at org.eclipse.swt.widgets.Display.<init>(Display.java:507)
at org.eclipse.swt.widgets.Display.<init>(Display.java:498)
at org.eclipse.ui.internal.Workbench.createDisplay(Workbench.java:691)
at org.eclipse.ui.PlatformUI.createDisplay(PlatformUI.java:162)
at com.ibm.mq.explorer.ui.rcp.internal.base.RcpApplication.start(RcpApplication.java:88)
at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:354)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:181)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
at java.lang.reflect.Method.invoke(Method.java:620)
at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:636)
at org.eclipse.equinox.launcher.Main.basicRun(Main.java:591)
at org.eclipse.equinox.launcher.Main.run(Main.java:1450)
at org.eclipse.equinox.launcher.Main.main(Main.java:1426)
The log and stack trace look like catch-all exception handling. I haven't yet dug into this error completely, but maybe it is also caused by permission problems. For example, authorization errors may be raised when MQExplorer tries to load its components from mqm's sub-directories. However, running chmod -R 6550 on some related paths didn't solve the problem.
Setting MQSNOAUT=ANYVALUE only turns off the MQ OAM if it is set at the time the queue manager is created. It results in a few lines being omitted from the qm.ini file of the queue manager you create while it is set.
If OAM is turned off it means only that any user connecting to the queue manager will have full authority.
The queue manager itself on Unix still needs to run under the mqm userid.
I noted that you showed the following permissions on the crtmqm binary:
-rwxrwxrwx. 1 mqm mqm
This is not correct. The MQ installation on Unix has many files with setuid permission, because the permissions on the files that are created under /var/mqm/qmgrs, /var/mqm/log, and /var/mqm/sockets are very important. From the research I did, the 545284129 and 2080520F errors are related to file permissions. I would suggest you reset the permissions to what they were previously; if you do not know what they were, I would suggest you remove the IBM MQ software and reinstall it. For reference, below are the normal permissions on the crtmqm binary:
-r-sr-s--- 1 mqm mqm
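To compare a binary's current permissions against the string above, something like the following Python sketch could be used. It assumes the expected mode is exactly the one shown here (-r-sr-s---, octal 6550); describe_mode and looks_like_mq_default are hypothetical helper names, and other MQ binaries may legitimately have different modes.

```python
import stat

# Octal form of -r-sr-s---: setuid and setgid plus read/execute
# for owner and group. Taken from the listing in this answer.
EXPECTED_MODE = 0o6550


def describe_mode(mode_bits):
    """Render permission bits the way ls -l prints them for a regular file."""
    return stat.filemode(stat.S_IFREG | stat.S_IMODE(mode_bits))


def looks_like_mq_default(mode_bits):
    """True if the permission bits match the expected crtmqm mode."""
    return stat.S_IMODE(mode_bits) == EXPECTED_MODE


print(describe_mode(0o777), looks_like_mq_default(0o777))
print(describe_mode(0o6550), looks_like_mq_default(0o6550))

# On a real system the bits would come from the file itself, e.g.:
#   import os
#   looks_like_mq_default(os.stat("/opt/mqm/bin/crtmqm").st_mode)
```

The first line printed corresponds to the -rwxrwxrwx listing from the question, which has lost both the setuid and setgid bits.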
Once the IBM MQ binary permissions are corrected I would suggest that you use dltmqm to delete your queue manager and ensure that nothing remains related to that queue manager name under /var/mqm/qmgrs, /var/mqm/log, /var/mqm/sockets, and in the /var/mqm/mqs.ini file.
Once it is cleaned up, create it again as the mqm user and try starting it.
I would suggest that you try not to disable security and instead set things up with proper permissions. Even if this is a development environment, it is much better to get things working with security enabled. When you develop with security disabled you end up needing to troubleshoot why things are not working later when security is turned on in a real environment.
Take a look at my answer to question "Provide anonymous access to IBM WebSphere MQ" for some more information on how to keep security enabled as well as disabling things if you want to continue down that path.

Websphere + MQ client

I am getting the following error message while connecting to a WebSphere MQ server using the MQ client:
/opt/mqm/samp/bin/amqssslc -x 'X.X.X.10(9110)' -c QMEIGS1.VSER.SVRCONN
QMEIGS1 -k /var/mqm/qmgrs/QMEIGS1/ssl/qmeigs1.arm -s TRIPLE_DES_SHA_US
Error Message :
LE_DES_SHA_US
Sample AMQSSSLC start
Connecting to the default queue manager
Using the server connection channel QMEIGS1.VSER.SVRCONN
on connection name 10.87.205.70(7118).
No SSL configuration specified.
MQCONNX ended with reason code 2393
We have placed the .arm file in the ssl directory at the path /var/mqm/qmgrs/QMEIGS1/ssl/qmeigs1.arm
Please tell me what needs to be done to resolve this.
We are using the following packages on the client side:
Client version : 8.0.0.4
Client OS : Redhat Linux 6.x 64bit (Non GUI)
Packages Installed on client :
MQSeriesJRE_vserv-8.0.0-4.x86_64
MQSeriesRuntime_vserv-8.0.0-4.x86_64
MQSeriesGSKit_vserv-8.0.0-4.x86_64
MQSeriesClient_vserv-8.0.0-4.x86_64
MQSeriesSamples_vserv-8.0.0-4.x86_64
Regards
Atul
The -k parameter on the client side (the amqssslc application) and the queue manager's ssl folder should contain a .kdb file. You appear to be using a .arm file. You should create a Key Database File (KDB) and add the certificate contained in the .arm file to that KDB, then rerun using the KDB as the target used by both client and queue manager instead of the .arm file.
You can find step-by-step instructions at the following page:
Running the SSL/TLS sample program

Websphere MQ server configuration

Can somebody help me configure a WebSphere MQ Server in WAS 8.5? I got the error below while creating the WAS MQ Server.
Error: WebSphere MQ server MQSERVER connection test failed for WebSphere MQ queue manager MQSERVER. CWSJP0050E: An attempt to connect to WebSphere MQ queue manager or queue sharing group MQSERVER failed. The WebSphere MQ reason code is Unknown (2538)..
MQRC 2538 means "host not available". Check that the host name and port number you have specified point to the machine where the MQ queue manager "MQSERVER" is running.
Check which port your queue manager is listening on. You can do that with MQExplorer or the runmqsc command shell on the machine where your queue manager is running. In a command prompt, run the following command:
runmqsc MQSERVER
Once the runmqsc shell opens, run the following command to display the TCP listener:
dis listener(SYSTEM.DEFAULT.LISTENER.TCP)
Check the PORT number displayed. By default it will be 0. You need to change this to a real port number. To change the port number, run the following command:
alter listener(SYSTEM.DEFAULT.LISTENER.TCP) trptype(tcp) port(1414)
Once this is done, start the listener by running the following command:
start listener(SYSTEM.DEFAULT.LISTENER.TCP)
After this you can attempt your tests.
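Before retrying from WAS, it can help to confirm that the listener port is reachable from the application server machine. The following is a minimal Python sketch of such a check; port_is_open is a hypothetical helper, and the host name in the commented usage line is a placeholder. A successful TCP connection only shows that something is listening on that port, not that it is the MQ listener.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage with a placeholder host name and the port set above:
#   print(port_is_open("mq-host.example.com", 1414))
```

If this returns False from the WAS machine but the listener shows as running on the MQ machine, a firewall or a wrong host name is a more likely cause than the queue manager itself.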

Resources