MQ messaging architecture for managing rejected messages - ibm-mq

I am having some difficulty deciding between two approaches to managing the rejection of messages on an MQ client. Admittedly, it's more an ideological argument than a technical one.
Consider this: a message (XML) on a queue is read by a client. The client checks the digital signature (and, by extension, whether the message adheres to a certain schema), before further processing. Let's say the verification of the digital signature fails though. I don't want the message to be further processed. It needs to go back to source and be sorted out 'by hand'.
As far as I can see, there are two approaches I could take:
Option 1
Client reads message
Client acknowledges receipt
Client discovers message is somehow invalid
Client writes invalid message onto 'reject' queue
          CLIENT                MQ                  CLIENT
READ      +-----+           +-------+               +----+
OUT Q     | --- | --------> |PROCESS| ------------> |NEXT|
          | --- |           |MESSAGE|               |STEP|
          +-----+           +-------+               +----+
                                |
                                |
REJECT Q  | --- | <-------------+
          | --- |    FAILURE
          +-----+
Option 2
Client reads message
Client discovers message is somehow invalid
Client does not acknowledge receipt of message
MRRTY = 0 (?) so QM writes message onto reject Q
          CLIENT                MQ                  CLIENT
READ      +-----+           +-------+               +----+
OUT Q     | --- | --------> |PROCESS| ------------> |NEXT|
          | --- | <-------- |MESSAGE|               |STEP|
          +-----+  FAILURE  +-------+               +----+
             |
             |
             V
REJECT Q  | --- |
          | --- |
          +-----+
I'm biased towards Option 2, where the QM is responsible for writing failed messages onto a reject queue, as it seems to me to be a neater solution. It would also mean that communication with the client is in one direction only. I understand that CLIENT_ACKNOWLEDGE acknowledges receipt of all messages up to the point of acknowledgement: am I misguided in thinking that acknowledging per message would be the mechanism that allows the QM to write failed messages onto the reject queue per the MRRTY parameter?
Any opinion / discussion re standard patterns / architecture much appreciated.

Thanks to both Morag and Attila for their help and input.
What it came down to was essentially this:
The application should handle the application errors, and a malformed message is an application error. The queue manager should only handle transport errors. (Attila)
and this...
There is no mechanism for having the queue manager route a failed message to a side queue. It is the responsibility of the application. (Morag)
So in the case of application errors the client itself will be expected to write failed / malformed messages back onto a separate queue out-of-band.
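To make the chosen pattern concrete, here is a minimal sketch in JMS (Java), assuming hypothetical queue names OUT.Q and REJECT.Q and a hypothetical verifySignature() helper; the get and the compensating put run in one transacted session, so moving a bad message to the reject queue is atomic:

import javax.jms.*;

public class RejectingConsumer {

    public void processOne(ConnectionFactory factory) {
        try (JMSContext ctx = factory.createContext(JMSContext.SESSION_TRANSACTED)) {
            JMSConsumer consumer = ctx.createConsumer(ctx.createQueue("OUT.Q"));

            Message msg = consumer.receive(5000);      // read one message
            if (msg == null) return;                   // queue empty

            if (verifySignature(msg)) {
                handOffToNextStep(msg);                // normal processing path
            } else {
                // Application-level reject: the client, not the QM, moves it.
                ctx.createProducer().send(ctx.createQueue("REJECT.Q"), msg);
            }
            ctx.commit();   // get and (optional) put succeed or fail together
        }
    }

    private boolean verifySignature(Message msg) { /* hypothetical check */ return true; }
    private void handOffToNextStep(Message msg)  { /* hypothetical hand-off */ }
}

This is Option 1 from the diagrams above: the application, not the queue manager, owns the reject queue.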


What is the difference between latency and response time?

I started to read the famous Martin Fowler book (Patterns of Enterprise Application Architecture).
I should mention that I am reading the book translated into my native language, so that might be the reason for my misunderstanding.
I found these definitions (back-translated into English):
Response time - the amount of time to process some external request
Latency - the minimal amount of time before getting any response
To me they are the same. Could you please highlight the difference?
One way of looking at this is to say that transport latency + processing time = response time.
Transport latency is the time it takes for a request/response to be transmitted to/from the processing component. Then you need to add the time it takes to process the request.
As an example, say that 5 people try to print a single sheet of paper at the same time, and the printer takes 10 seconds to process (print) each sheet.
The person whose print request is processed first sees a latency of 0 seconds and a processing time of 10 seconds - so a response time of 10 seconds.
Whereas the person whose print request is processed last sees a latency of 40 seconds (the 4 people before him) and a processing time of 10 seconds - so a response time of 50 seconds.
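The same arithmetic as a tiny sketch in Java (the 10-second service time and the FIFO queue are the assumptions from the example above):

// k-th print request in a FIFO queue: latency grows with queue position.
int serviceTime = 10;                            // seconds per sheet
for (int k = 0; k < 5; k++) {
    int latency = k * serviceTime;               // waiting for the requests ahead
    int responseTime = latency + serviceTime;    // waiting + own processing
    System.out.printf("request %d: latency=%ds, response time=%ds%n",
                      k + 1, latency, responseTime);
}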
As Martin Kleppmann says in his book Designing Data-Intensive Applications:
Latency is the duration that a request is waiting to be handled - during which it is latent, awaiting service. It is used for diagnostic purposes, e.g. latency spikes.
Response time is the time between a client sending a request and receiving a response. It is the sum of round-trip latency and service time. It is used to describe the performance of an application.
This article is a good read on the difference, and is best summarized with this simple equation,
Latency + Processing Time = Response Time
where
Latency = the time the message is in transit between two points (e.g. on the network, passing through gateways, etc.)
Processing time = the time it takes for the message to be processed (e.g. translation between formats, enriched, or whatever)
Response time = the sum of these.
If processing time is reasonably short, which in well-designed systems is the case, then for practical purposes response time and latency can appear the same in terms of the perceived passage of time. That said, to be precise, use the defined terms and don't confuse or conflate the two.
I differentiate them using the example below.
A package is sent from A to C via B, where:
A-B took 10 sec, B (processing) took 5 sec, B-C took 10 sec
Latency = (10 + 10) sec = 20 sec
Response time = (10 + 5 + 10) sec = 25 sec
Latency
The time from the source sending a packet to the destination receiving it
Latency is the time it takes for a message, or a packet, to travel from its point of origin to the point of destination. That is a simple and useful definition, but it often hides a lot of useful information — every system contains multiple sources, or components, contributing to the overall time it takes for a message to be delivered, and it is important to understand what these components are and what dictates their performance.
Let’s take a closer look at some common contributing components for a typical router on the Internet, which is responsible for relaying a message between the client and the server:
Propagation delay
Amount of time required for a message to travel from the sender to receiver, which is a function of distance over speed with which the signal propagates.
Transmission delay
Amount of time required to push all the packet’s bits into the link, which is a function of the packet’s length and data rate of the link.
Processing delay
Amount of time required to process the packet header, check for bit-level errors, and determine the packet’s destination.
Queuing delay
Amount of time the packet is waiting in the queue until it can be processed.
The total latency between the client and the server is the sum of all the delays just listed.
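As a rough, purely illustrative budget (every number below is an assumption, not a measurement), the four components can simply be added up:

// Illustrative one-hop latency budget; all constants are assumptions.
double distanceKm   = 3000;                    // sender-to-receiver distance
double propagation  = distanceKm / 200_000.0;  // ~200,000 km/s in fibre
double transmission = (1500 * 8) / 100e6;      // 1500-byte packet on a 100 Mbit/s link
double processing   = 50e-6;                   // assumed header-processing delay
double queuing      = 200e-6;                  // assumed time waiting in the queue
double total        = propagation + transmission + processing + queuing;
System.out.printf("total latency = %.2f ms%n", total * 1000);   // ~15.37 ms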
Response Time
The total time between sending a packet and receiving the corresponding response from the receiver
Q: Could you please highlight the difference?
Let me start with the know-how of professionals in the ITU-T ( formerly the CCITT ), who have over decades spent many thousands of person-years of effort at the highest levels of professional experience, and have developed an accurate and responsible methodology for measuring both:
What have the industry standards adopted for coping with this?
Since the early years of international industry standards ( well, as far back as somewhere deep in the 1960s ), these industry professionals have created the concept of testing complex systems in a repeatable and re-inspectable manner.
System-under-Test (SuT), inter-connected and inter-acting across a mediation service mezzo-system:

SuT-component-[A]
|
+-------------------------------------------------------------------------------------------------------[A]-interface-A.0
|           +---------------------------------------------------------------------------------------------[A]-interface-A.1
|           |
|           |                                                                    SuT-component-[B]
|           |                                                                    |
|           |                                                                    +----------------------[B]-interface-B.1
|           |                                                                    |               +------[B]-interface-B.0
|           |                          ????????????????                          |               |
|           |                          ? mezzo-system ?                          |               |
+-----------+                          ????????????????                         +---------------+
|           | ~~~~~~~~~~~~~~~~~~~~~~~~ ??? ...  ... ??? ~~~~~~~~~~~~~~~~~~~~~~~~ |               |
|           | ~~<channel [A] to ???>~~ ??? ...  ... ??? ~~<channel ??? to [B]>~~ |               |
|           | ~~~~~~~~~~~~~~~~~~~~~~~~ ??? ...  ... ??? ~~~~~~~~~~~~~~~~~~~~~~~~ |               |
+-----------+                          ????????????????                         +---------------+
|
No matter how formal this methodology may seem, it brings both clarity and exactness when formulating ( and likewise when designing, testing and validating ) the requirements closely and explicitly related to SuT-components, SuT-interfaces, SuT-channels, and also the constraints for interactions across exo-system(s), including the limitations of responses to an appearance of any external ( typically adverse ) noise/disturbing events.
At the end, and to the benefit of clarity, all parts of the intended SuT-behaviour could be declared against a set of unambiguously defined and documented REFERENCE_POINT(s), for which the standard defines and documents all properties.
A Rule of Thumb:
The LATENCY, most often expressed as a TRANSPORT LATENCY ( between a pair of REFERENCE_POINTs ), is related to the duration of a trivial / primitive event-propagation across some sort of channel(s), where the event-processing does not transform the content of the propagated event. ( Ref. memory-access latency - it does not re-process the data, but just delivers it, taking some time to "make it" )
The PROCESSING is meant as the kind of transformation of an event, in some remarkable manner, inside a SuT-component.
The RESPONSE TIME ( observed on REFERENCE_POINT(s) of the same SuT-component ) is meant as the resulting duration of some kind of rather complex, End-to-End transaction processing, that is neither a trivial TRANSPORT LATENCY across a channel, nor a simplistic in-SuT-component PROCESSING, but some sort of composition of several ( potentially many ) such mutually interacting steps, working along the chain of causality ( adding random stimuli, where needed, to represent noise/error disturbances ). ( Ref. database-engine response-times creep with growing workloads, right due to the increased concurrent use of some processing resources that are needed for the requested information's internal retrieval, internal re-processing and final delivery re-processing, before delivering the resulting "answer" to the requesting counter-party )
Reference points and durations along the event path:

  SuT_[A.0]      : REFERENCE_POINT: receives an { external | internal } <sourceEvent>
  SuT-[A]        : <sourceEvent>-processing DURATION ( inside SuT-[A] )
  SuT_[A.1]      : REFERENCE_POINT: hands the event to a known <channel> with a known transport-LATENCY, from SuT-[A.1] into the <mezzo-system> exosystem ( not a part of the SuT, yet a part of the ecosystem, in which the SuT has to operate )
  <mezzo-system> : the sum of all unknown { transport-LATENCY | processing-DURATION } duration(s), followed by a known <channel>-transport-LATENCY from the <mezzo-system> to REFERENCE_POINT SuT_[B.1]
  SuT_[B.1]      : REFERENCE_POINT: receives the propagated <sourceEvent>
  SuT-[B]        : <propagated <sourceEvent>>-processing DURATION ( inside SuT-[B] )
  SuT_[B.0]      : REFERENCE_POINT: delivers the result == a re-processed <sourceEvent>

[A.0]       [A.1]                                                      [B.1]            [B.0]
 o<_________>o ~~< chnl from [A.1] >~~? ??? ... ... ??? ?~~< chnl to [B.1] >~~~~~~? o<______________>o

Test spans between these REFERENCE_POINT(s):

  SuT-test( A.0:A.1 )            - inside SuT-component-[A]
  SuT-test( B.1:B.0 )            - inside SuT-component-[B]
  may-test( A.1:B.1 )            - across the exo-system that is outside of your domain of control, measured indirectly, using the REFERENCE_POINT(s) that you do control
  SuT-End-to-End-test( A.0:B.0 ) - across the whole chain, end to end
Using this ITU-T / CCITT methodology, an example of a well defined RESPONSE TIME test would be a test of completing a transaction, that will measure a net duration between delivering a source-event onto REFERENCE_POINT [A.0] ( entering SuT-component-[A] ) and waiting here until the whole SuT delivers an answer from any remote part(s) ( like a delivery from [A]-to-[B], plus a processing inside a SuT-component-[B] and an answer delivery from [B]-back-to-[A] ) until an intended response is received back on a given REFERENCE_POINT ( be it the same one [A.0] or another, purpose-specific one [A.37] ).
Being as explicit as possible saves potential future misunderstanding ( which international industry standards have always fought to avoid ).
So a requirement expressed like:
1) a RESPONSE_TIME( A.0:A.37 ) must be under 125 [ms]
2) a net TRANSPORT LATENCY( A.1:B.1 ) ought to exceed 30 [ms] in less than 0.1% of cases per BAU
are clear and sound ( and easy to measure ) and everybody interested can interpret both the SuT-setup and the test-results.
Meeting these unambiguous requirements qualifies such a defined SuT-behaviour as safely compliant with an intended set of behaviours, or lets professionals cheaply detect, document and disqualify those that do not.

WebClient not emitting a response

I just recently came across an issue that leaves me puzzled. I'm happy about any advice you can give, even if it's about how to get more insight (e.g. logging).
I am using Spring Boot 2.0.0.M1 (as generated from start.spring.io) with the reactive (Netty-backed) org.springframework.web.reactive.function.client.WebClient.
When calling an old, non-reactive service that just returns a JSON object, the WebClient does not emit any event, even though the called service is fully responsive (compare the log).
2017-05-29 17:33:30,016 | reactor-http-nio-2 | INFO | | onSubscribe([Fuseable] FluxOnAssembly.OnAssemblySubscriber) | myLogClass:145 |
2017-05-29 17:33:30,016 | reactor-http-nio-2 | INFO | | request(unbounded) | myLogClass:145 |
2017-05-29 17:33:30,016 | reactor-http-nio-2 | DEBUG | | onSubscribe([Fuseable] FluxOnAssembly.OnAssemblySubscriber) | client:125 |
2017-05-29 17:33:30,016 | reactor-http-nio-2 | DEBUG | | request(unbounded) | client:125 |
When debugging this issue, I noticed something strange that leaves me thinking that I'm simply using this implementation in the wrong way: I saw a URL from another service (that I called earlier) in Netty's pool/channel handling (I will add the class I found it in later).
Other observations:
when I leave the other WebClient call out of the picture, the problematic call works
the called services are behind a gateway, so they (might) have the same IP but different URIs
So my seemingly unlikely ideas so far:
Netty's channel/pool handling somehow screws my URLs up
WebClient shouldn't be used with different URIs
there is a f'up with the URI/IP mapping on the way back from Netty to the subscribers
As I said in the beginning: I am thankful for any help, and of course I will add any information you request to reproduce this bugger.
Any pointer on how to maybe write a test for this is welcome as well!
To start with some information: this is how I use the WebClient as a minimal sample (I know - I am not planning on using it in a blocking way):
// Static imports assumed by this snippet:
//   import static org.springframework.http.MediaType.APPLICATION_JSON;
//   import static org.springframework.http.MediaType.APPLICATION_JSON_VALUE;
//   import static org.springframework.web.reactive.function.BodyInserters.fromObject;
//   import java.util.logging.Level;
return WebClient.create(serviceUri)                    // base URI of the target service
        .put()
        .uri("/mypath/" + id)
        .body(fromObject(request))                     // serialize the request object
        .accept(APPLICATION_JSON)
        .header(HttpHeaders.CONTENT_TYPE, APPLICATION_JSON_VALUE)
        .header(HttpHeaders.COOKIE, createCookie())
        .retrieve()
        .bodyToMono(Response.class)                    // expect one JSON object back
        .log("com.mypackage.myLogClass", Level.ALL)    // log all reactive signals
        .block();                                      // blocking only for this sample

How to keep track of messages exchanged between a server and clients?

My app sends a notification to the PC when a new text message is received on the phone. I am doing that over Bluetooth, if it matters.
(This is relevant to the PC side.)
What I am struggling with is keeping track of messages for each contact. I am thinking of having a linked list that grows as new contacts come in. Each node will represent a contact.
There will be another list that grows vertically; this will hold the messages for that contact.
Here is a diagram to make it clear:
=========================
| contact 1 | contact 2 | ...
=========================
     ||           ||
 =========    =========
 | msg 0 |    | msg 0 |
 =========    =========
     ||           ||
 =========    =========
 | msg 1 |    | msg 1 |
 =========    =========
     .            .
     .            .
     .            .
This will handle the messages received, but how do I keep track of the responses sent? Do I tag the messages as TAG_MSG_SENT, TAG_MSG_RECEIVED, etc.?
I have not written code for this part as I want to do the design first.
Why does it matter?
Well, when the user clicks on a contact in the list, I want to be able to display the session like this in a new window:
==============================
|          contact 1         |
==============================
| Received 0                 |
|                     Sent 0 |
|                     Sent 1 |
| Received 1                 |
==============================
I am using C/C++ on Windows.
A simple approach would be to use the existing file system to store messages, as follows:
Maintain a received file and a sent file for each contact, in a specific folder.
Name them contact-rec-file and contact-sent-file.
Every time you receive or send a message,
append the message to the corresponding sent or received file:
first write the size of the message in bytes at the end of the file,
then write the content of the message.
Whenever you need to display messages, open the file,
read the size of each message, then read its contents using that size.
Note: using main memory to store messages is pretty inefficient, as a lot of memory is consumed once many messages have been sent.
Optimization: use another file to store the number of messages and their seek positions in the sent/received files, so that you can read that file at load time and then seek directly to the correct position if you want to read only a particular message.
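A minimal sketch of that length-prefixed layout, in Java for brevity (the same byte layout is easy to reproduce from C/C++; the file naming follows the contact-rec-file / contact-sent-file convention above):

import java.io.*;
import java.nio.charset.StandardCharsets;

class MessageFile {
    // Append one message: its size in bytes first, then its content.
    static void append(File f, String msg) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream(f, true))) {            // true = append mode
            byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
            out.writeInt(bytes.length);
            out.write(bytes);
        }
    }

    // Read back all messages by repeating: read a size, then that many bytes.
    static void readAll(File f) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            while (in.available() > 0) {
                byte[] bytes = new byte[in.readInt()];
                in.readFully(bytes);
                System.out.println(new String(bytes, StandardCharsets.UTF_8));
            }
        }
    }
}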
It depends on what you want to keep track of. If you just want statistics for the sent and received messages, then two counters for each contact will do. If you just want the messages sent and received by the client, not caring about how they are interleaved, then two lists for each contact will do. If you also need to know how they are interleaved, then, as you suggested, a single list with an additional flag indicating whether it was a sent or received message will work. There are certainly other possibilities; these are just to get you started.
OK, if order matters, then here are two more ways I can think of off the top of my head:
1) In the linked list, instead of having a flag indicating the status, have three next pointers: one for the next message, one for the next sent message, and one for the next received message (a node type for this is sketched after this list). The next-message pointer will have the same value as one of the other two, but it is what tells you how the messages are interleaved. You can then easily walk the sent messages, the received messages, both, or some other weird walk.
2) Have only one linked list/array/table, where each entry includes the contact info and the SENT/RECEIVED flag. This is not good if there's a lot of other info about the contact that you wish to keep, since it would now be replicated. But for simplicity, it is one list instead of a list of lists. To remedy the replication, you could create a separate list with just the contact info and put a reference to it in the messages list. You could also add a contacts_next_message pointer to the message entries; that way you can walk it and get all of a contact's messages.
And so on, there's lots of ways you can do this.
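For illustration, a hypothetical node type for approach 1) with its three next pointers (in Java, though the question itself is C/C++):

// One node sits in the interleaved chain and in exactly one of the
// sent-only / received-only chains.
class MessageNode {
    String text;
    boolean sent;             // true = sent, false = received
    MessageNode next;         // next message in interleaved (chronological) order
    MessageNode nextSent;     // next sent message, or null
    MessageNode nextReceived; // next received message, or null
}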

Does validation in CQRS have to occur separately once in the UI, and once in the business domain?

I've recently read the article CQRS à la Greg Young and am still trying to get my head around CQRS.
I'm not sure about where input validation should happen, and if it possibly has to happen in two separate locations (thereby violating the Don't Repeat Yourself rule and possibly also Separation of Concerns).
Given the following application architecture:
#        +--------------------+                        ||
#        |     event store    |                        ||
#        +--------------------+                        ||
#            ^          |                              ||
#            |          | events                       ||
#            |          v
#        +--------------------+      events     +--------------------+
#        |       domain/      | --------------> |   (denormalized)   |
#        |   business objects |                 |  query repository  |
#        +--------------------+       ||        +--------------------+
#          ^   ^   ^   ^   ^          ||                  |
#          |   |   |   |   |          ||                  |
#        +--------------------+       ||                  |
#        |     command bus    |       ||                  |
#        +--------------------+       ||                  |
#                  ^                                      |
#                  |        +------------------+          |
#                  +------- |  user interface  | <--------+
#               commands    +------------------+    UI form data
The domain is hidden from the UI behind a command bus. That is, the UI can only send commands to the domain, but never gets to the domain objects directly.
Validation must not happen when an aggregate root is reacting to an event, but earlier.
Commands are turned into events in the domain (by the aggregate roots). This is one place where validation could happen: If a command cannot be executed, it isn't turned into a corresponding event; instead, (for example) an exception is thrown that bubbles up through the command bus, back to the UI, where it gets caught.
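As a minimal sketch of that flow (Java; every name here is hypothetical): the aggregate root either turns the command into an event or throws, and the exception travels back through the command bus to the UI:

// Hypothetical command and event types.
record WithdrawMoney(int amount) {}
record MoneyWithdrawn(int amount) {}

class Account {
    private int balance = 100;

    // A command either yields an event or an exception; no event is
    // produced for a command that cannot be executed.
    MoneyWithdrawn handle(WithdrawMoney cmd) {
        if (cmd.amount() > balance)
            throw new IllegalStateException("insufficient funds");
        balance -= cmd.amount();
        return new MoneyWithdrawn(cmd.amount());
    }
}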
Problem:
If a command won't be able to execute, I would like to disable the corresponding button or menu item in the UI. But how do I know whether a command can execute before sending it off on its way? The query side won't help me here, as it doesn't contain any business logic whatsoever; and all I can do on the command side is send commands.
Possible solutions:
For any command DoX, introduce a corresponding dummy command CanDoX that won't actually do anything, but lets the domain give feedback whether command X could execute without error.
Duplicate some validation logic (that really belongs in the domain) in the UI.
Obviously the second solution isn't favorable (due to lacking separation of concerns). But is the first one really better?
I think my question has just been solved by another article, Clarified CQRS by Udi Dahan. The section "Commands and Validation" starts as follows:
Commands and Validation
In thinking through what could make a command fail, one topic that comes up is validation. Validation is different from business rules in that it states a context-independent fact about a command. Either a command is valid, or it isn't. Business rules on the other hand are context dependent.
[…] Even though a command may be valid, there still may be reasons to reject it.
As such, validation can be performed on the client, checking that all fields required for that command are there, number and date ranges are OK, that kind of thing. The server would still validate all commands that arrive, not trusting clients to do the validation.
I take this to mean that, given that I have a task-based UI (as is often suggested for CQRS to work well, with commands as domain verbs), I would only ever gray out (disable) buttons or menu items if a command cannot yet be sent off because some data required by the command is still missing or invalid; i.e. the UI reacts to the command's validity itself, and not to the command's future effect on the domain objects.
Therefore, no CanDoX commands are required, and no domain validation logic needs to be leaked into the UI. What the UI will have, however, is some logic for command validation.
Client-side validation is basically limited to format validation, because the client side cannot know the state of the data model on the server. What is valid now may be invalid half a second from now.
So, the client side should check only whether all required fields are filled in and whether they are of the correct form (an email address field must contain a valid email address, for instance of the form (.+)@(.+)\.(.+) or the like).
All those validations, along with business rule validations, are then performed on the domain model in the Command service. Therefore, data that was validated on the client may still result in invalidated Commands on the server. In that case, some feedback should be able to make it back to the client application... but that's another story.
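A minimal sketch of such client-side command validation (Java; the command name and the regex are illustrative assumptions, not taken from the articles):

import java.util.regex.Pattern;

// Purely structural, context-independent checks; the server re-validates
// every command it receives regardless.
class RegisterUserCommand {
    String name;
    String email;

    private static final Pattern EMAIL = Pattern.compile("(.+)@(.+)\\.(.+)");

    boolean isValid() {
        return name != null && !name.isEmpty()
            && email != null && EMAIL.matcher(email).matches();
    }
}

The UI would enable the corresponding button or menu item only while isValid() returns true; whether the command ultimately succeeds is still decided in the domain.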

NTP working modes

I am new to the NTP protocol. I read RFC 1305 and have some questions about NTP.
My questions are related to NTP working modes.
According to RFC 1305 there are 8 modes:
| 0 | reserved
| 1 | symmetric active
| 2 | symmetric passive
| 3 | client
| 4 | server
| 5 | broadcast
| 6 | NTP control message
| 7 | reserved for private use
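For orientation, the mode is carried in the low three bits of the first byte of an NTP packet header (LI: 2 bits, VN: 3 bits, Mode: 3 bits); a minimal Java sketch of extracting it:

// First NTP header byte layout: LI (2 bits) | VN (3 bits) | Mode (3 bits).
static int ntpMode(byte firstByte) {
    return firstByte & 0x07;         // low three bits: mode 0-7, per the table above
}
static int ntpVersion(byte firstByte) {
    return (firstByte >> 3) & 0x07;  // next three bits: protocol version
}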
My questions:
1- What are the differences between a symmetric passive device and a symmetric active one?
2- Two symmetric active devices can sync with each other, and two symmetric passive devices can sync with each other too, but can a symmetric passive device be synced by a symmetric active one, and vice versa?
3- When a symmetric passive device is connected to a symmetric active one, which one sends the NTP packet first?
4- What happens in broadcast mode? Does the client send any NTP packet, or only the broadcaster?
5- "In order to sync some clients that have Class D IPs, the server fills the 3 timestamp fields (the receive timestamp is null), sets the mode to 5 and sends the packet to 224.0.1.1; the clients get that packet and send nothing in this procedure." Is this true?
6- Who sends the NTP control message? Client or broadcaster? What is it for? What is the appropriate answer to it? Is it always 12 bytes long?
7- "A stratum 1 NTP server (GPS connected) acts like this: it answers mode 1 requests with mode 2, mode 3 with mode 4, and mode 6 with mode 7." Is this true?
I can only reply to a few of the questions:
-4. Only the server (broadcaster) is allowed to send any NTP packet in this mode.
Clients only listen on the interface, parse the received packet and set their clock accordingly - there is no reply being sent.
Clients may still send an NTP request, but the server should not reply to it.
-5. Right. These clients are not supposed to send any answer.
Mode 6 is used by the ntpq program. It can, for example, query "a list of the peers known to the server as well as a summary of their state" (from the man page).
This has recently been exploited for DDoS reflection attacks, because it can be triggered with a spoofed IP address and the reply is larger than the query.
For this reason, mode 6 and 7 queries should be blocked from outside sources.
