Handling invalid junk characters at the beginning of XML file in IIB - ibm-mq

I am receiving an XML file with invalid junk characters at the beginning of the file as below:
"MDE H"¸MQSTR AMQ COREDC.QM4 Ègcù; ÿÿÿÿ
xml tag starts here-
The message is failing at IIB due to parsing errors.
How can I handle this message successfully in IIB, without failing it, by extracting only the XML body and discarding the invalid junk characters at the beginning of the file?

First of all, from your question I assume you have already asked the sending application to send a proper XML message and they cannot for some reason.
It looks like they are putting some header information in the XML payload, which should not happen.
Second, if you cannot fix this at the source because you have no control over what arrives at the input node, you can always treat the message as a BLOB. Inside a Compute node you can cast it to a character string and remove whichever part you wish (you can also use Java for this). Then you can parse the result as XML, either from the Compute node or with a ResetContentDescriptor node.
I strongly recommend asking the sending application to send a proper XML message.
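The blob approach described above can be sketched outside IIB as follows. This is a minimal Python illustration of the idea, not ESQL or IIB code; the sample message bytes and the function name `strip_leading_junk` are made up for the example, and it assumes the XML body is in an ASCII-compatible encoding.

```python
import xml.etree.ElementTree as ET


def strip_leading_junk(blob: bytes) -> bytes:
    """Return the payload starting at the first '<', dropping every byte before it."""
    start = blob.find(b"<")
    if start == -1:
        raise ValueError("no XML start tag found in message")
    return blob[start:]


# Hypothetical message: header-like bytes followed by the XML body.
message = (
    b"MDE \x00\x00\x00\x02MQSTR   \xff\xff\xff\xff"
    + b'<?xml version="1.0"?><order><id>42</id></order>'
)

body = strip_leading_junk(message)   # treat as bytes, cut the junk
root = ET.fromstring(body)           # only now parse as XML
print(root.tag)
```

If the leading bytes can themselves contain a `<` (0x3C) by chance, searching for a longer marker such as `<?xml` or the known root tag would be safer, provided that marker is always present.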

Related

JMS API encoding issue when IBM MQ message is TEXT+BINARY

I am consuming messages from an IBM MQ queue. The messages have the format MQSTR, but the message data is text plus binary. The binary data is a JPG image.
When my JMS consumer consumes these messages, the binary data has an encoding problem, so the resulting image is distorted compared with the original.
I have tried everything I could think of (different IBM specifications), but could not get it to work.
If someone has already faced this issue, please suggest a possible solution.
If the message has an MQSTR MQMD.Format, then it must be a text string. MQ will convert text strings between codepages when required (e.g. ASCII to Unicode). If the message is not meant to be a string - which it sounds like from your limited description - then whoever creates the message needs to set the format suitably, and the receiving app must be prepared to parse and convert the message body components.
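To see why treating the whole body as text damages the image, here is a small Python sketch of a generic codepage conversion. It is an illustration only, not MQ's own conversion routine; the choice of Latin-1 as source and UTF-8 as target, and the sample bytes, are assumptions made for the example.

```python
# Hypothetical payload: an ASCII text part followed by the first bytes of a JPEG.
text_part = b"caption=holiday;"
jpeg_part = bytes([0xFF, 0xD8, 0xFF, 0xE0])
payload = text_part + jpeg_part


def convert_as_text(data: bytes, source: str = "latin-1", target: str = "utf-8") -> bytes:
    """Reinterpret every byte as a character and re-encode it in another codepage."""
    return data.decode(source).encode(target)


converted = convert_as_text(payload)

# The ASCII text survives unchanged, but each binary byte >= 0x80 becomes two bytes.
print(converted[: len(text_part)] == text_part)
print(converted[len(text_part):] == jpeg_part)
print(len(payload), len(converted))
```

The text part comes through intact while the binary part is rewritten, which is the kind of distortion that results when binary data travels under a string format.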

Is there a way to check for new fields in the protobuf message?

What I want to do is validate the data inside a protobuf message before I send it to an external network. This serves as a security check.
The problem is that protobufs allow sending additional fields using an updated proto file, which allows backwards compatibility.
What this means is when I go to check a message, my autogenerated code parses the object, but drops the unknown fields. So this means the transmitted bytes could have information I don't know about.
A workaround would be to transmit only the version of the data I have parsed and checked, which would mean dropping the new fields. That's the right thing to do for security, but I still won't know that someone is sending me a new version of the messages. It would be nice to log that and be told I might need to update. I also want to communicate back to the sender that some of their data is being dropped.
Is there a way to know whether the format of the message I received differs from the format I expect to receive?
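One possible way to detect this, sketched here in Python, is to scan the raw bytes for top-level field numbers before handing them to the generated parser, and compare them with the field numbers you know. This is a hand-written reader of the protobuf wire format, not part of any protobuf library; `KNOWN_FIELDS` and the sample message are assumptions for illustration.

```python
def read_varint(buf: bytes, pos: int):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def top_level_field_numbers(buf: bytes) -> set:
    """Collect the field numbers present at the top level of an encoded message."""
    fields, pos = set(), 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        fields.add(field_number)
        if wire_type == 0:            # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:          # fixed 64-bit
            pos += 8
        elif wire_type == 2:          # length-delimited (string, bytes, sub-message)
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:          # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return fields


KNOWN_FIELDS = {1}  # hypothetical: field numbers defined in the .proto you have

# Field 1 = varint 150, field 2 = string "testing" (field 2 is "new" to us).
message = bytes.fromhex("089601") + bytes.fromhex("120774657374696e67")
unknown = top_level_field_numbers(message) - KNOWN_FIELDS
print(unknown)
```

A non-empty `unknown` set could then be logged or reported back to the sender. The sketch only looks at the top level; new fields inside nested messages would need a schema-aware recursive scan.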

JMeter encoding issue on "application/soap+msbin1"

I am using JMeter to send a SOAP request to a server, and it returns the error message below.
Error message: Cannot process the message because the content type 'application/soap+msbin1' was not the expected type 'application/xml; charset=utf-8'.
We need help encoding the XML in the 'application/soap+msbin1' format.
Bit late to the party, but I encountered a similar issue - I had a template for SOAP request which uses embedded-binary XML (xop:Include cid="...") and had to scratch my head to figure out how to do that with the stock HTTP Request.
The answer: you can't, not in a simple way. To solve the issue, I ended up customizing JMeter. (I also looked at HTTPRawRequest, but it doesn't seem to support https, and I would have had to rewrite a lot of the test script to use it.) Since HTTP Request does 99% of the job, the quickest way to support binary data was to change the source code to handle it.
There are two main issues. First, the Function interface in JMeter is designed around returning String, not byte[], so __FileToString() (which I used to read an external binary file) already encodes the content of the file. Second, the HTTP Request sampler and HTTPHC4Impl itself (excluding the "upload file" bit) encode the parts of the HTTP request before sending them over the wire.
Changing that required changes in Function, AbstractFunction and CompoundVariable, plus a new function class, FileToStringBinary, which encodes the binary data in a way that can be decoded afterwards (by changes made to HTTPHC4Impl).
If I have the time I'll find somewhere to post the idea and the source (I can't submit it to JMeter because my update to HTTPHC4Impl only handles the specific requests I need to test, where the embedded binary is in a multipart/related part, and I have no time or inclination to handle the general cases), but if you still need help to make it work, drop a line.
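The idea of carrying binary data through a String-only interface and decoding it again just before sending can be sketched as follows. The answer does not say which encoding was used, so Base64 here is an assumption, and the function names are hypothetical; this is plain Python, not JMeter code.

```python
import base64


def file_bytes_to_string(data: bytes) -> str:
    """Encode arbitrary bytes as an ASCII-only string that survives String handling."""
    return base64.b64encode(data).decode("ascii")


def string_to_bytes(encoded: str) -> bytes:
    """Recover the original bytes just before they are written to the wire."""
    return base64.b64decode(encoded)


original = bytes(range(256))              # every possible byte value
as_string = file_bytes_to_string(original)
restored = string_to_bytes(as_string)
print(restored == original)
```

Because the intermediate string contains only ASCII characters, any later character-set conversion of that string leaves it intact, and the decoder gets the exact original bytes back.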

To identify the xsd of xml message which is received from MQ

In IBM MQ, I have a requirement where I can get many types of XML from the queue. The XML messages conform to already specified XSDs (say there are 5 XSDs, which means I can get 5 different kinds of XML). When I get a message from the queue, I would like to know its type (whether it is xsd1, xsd2, and so on).
The reason is that I am using a JAXB interface with a SAX implementation, for which I need to pass the Java object corresponding to the XML as a parameter. So I have to know which XSD the input conforms to and set the parameter accordingly.
One option is to set a property in the message header, but the party putting the message on the queue is not ready to do that.
What other options do I have? Can I get the file name (of the XML) from MQ and find the XSD based on it? Or do I have to do SAX parsing, identify the root tag, and derive the XSD type from that? Does anybody have a better option in mind?
Think of MQ like the Post Office. When you get a letter, the post office doesn't mess with anything on the inside (the payload) and if it changes the outside, it only changes routing information. If you want to sort incoming mail to different recipients, whoever is sending it has to put the data against which the sort criteria operate on the outside of the envelope. If that doesn't work, you must open the envelope and look for the recipient name, department, or whatever on the papers inside.
Your MQ message is that envelope. The sort criteria can be different queue names, a property of the message, a property of the message header, or something in the payload. But unless the sender explicitly sets the destination queue name based on the selection criteria, or sets the message or header property, your only option is to inspect the payload and figure it out.
If you have to inspect the payload, this is a perfect scenario for IBM Integration Bus. But you can also write an application to perform this function. Very often this is performed by a Dispatch app which gets the message, figures out where it goes, then puts it onto another queue and COMMITs the GET and PUT operations. But if the dispatch app must parse the XML to determine the correct queue, the message has to be parsed twice - once by the dispatcher, once by the receiving app.
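Inspecting the payload does not have to mean a full parse. A minimal Python sketch of the root-tag idea from the question: read only until the first start tag, then look up the schema. The mapping `ROOT_TO_XSD`, the element names and the XSD file names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from root element name to the schema it belongs to.
ROOT_TO_XSD = {"Order": "xsd1.xsd", "Invoice": "xsd2.xsd"}


def root_tag(payload: bytes) -> str:
    """Return the local name of the root element, stopping at the first start tag."""
    for _event, elem in ET.iterparse(io.BytesIO(payload), events=("start",)):
        # ElementTree reports namespaced tags as "{uri}local"; keep the local part.
        return elem.tag.rsplit("}", 1)[-1]
    raise ValueError("no root element found")


def pick_xsd(payload: bytes) -> str:
    """Choose the schema for a message based on its root element."""
    return ROOT_TO_XSD[root_tag(payload)]


sample = b'<?xml version="1.0"?><Invoice xmlns="urn:example"><total>10</total></Invoice>'
print(pick_xsd(sample))
```

The same check could live in a dispatcher that routes each message to a per-type queue, or directly in the receiving application before it selects the JAXB class.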
I think you can do the following:
Does the incoming message have the file name at the beginning of the message body? If so, after receiving the message your application can read the first few bytes to get the file name. Based on the file name, the application can select the appropriate XSD and pass on the entire message body.

HTTP POST - nameless data VS named data

Our server A notifies 3rd party server B with an XML-formatted message, sent as HTTP POST request. It's us who specify the message format and other aspects of interaction.
We can specify that the XML is sent as
a) raw data (just the XML)
b) single POST parameter having some specific name (say, xml=XML)
The question is which way is better for the 3rd party in general, if we don't know the platform and language they are using.
I thought I had seen problems parsing nameless raw data easily in certain languages, though I don't remember any specific case. My colleague, however, insists that the parameter name is redundant and that it is better to send the raw data without any name.
If you don't need to send extra information in other POST parameters, the xml parameter name is redundant and unnecessary, as your teammate said. If the 3rd party expects only XML data, send just the raw data in the POST body with the correct MIME type and encoding, and do not complicate things.
Getting the raw data is easy in most application server containers, so you need not worry about that; most of them use a Reader to get the received data and manipulate it.
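The two options from the question differ only in the request body and its Content-Type. A minimal Python sketch that builds both requests without sending them; the URL is a placeholder and the function names are made up for the example.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

URL = "http://server-b.example/notify"  # hypothetical endpoint


def raw_request(xml: str) -> Request:
    """Option (a): the XML itself is the body, labelled with its MIME type."""
    return Request(
        URL,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


def form_request(xml: str) -> Request:
    """Option (b): the XML is percent-encoded into a form parameter named 'xml'."""
    return Request(
        URL,
        data=urlencode({"xml": xml}).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


xml = "<event><id>7</id></event>"
raw = raw_request(xml)
form = form_request(xml)
print(raw.data)
print(form.data)
```

With option (a) the receiver reads the body as-is; with option (b) it has to decode the form parameter first (for instance with `parse_qs`) before it sees the XML.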
