Confirm JMS message into subprocess - tibco

Is there any way to confirm a JMS message in a subprocess?
I have process A that starts with a JMS Queue Receiver (or JMS Topic Subscriber). It calls process B which has to confirm the message received - I'm using Tibco EMS Explicit acknowledge mode.
This will allow me to reuse some parts. Is it possible to do it?

I'm afraid this is not possible. The confirm always has to be in the same process as the receiver.

In a well-designed architecture you do not want to split the messaging (and confirm) layer but rather push all functional processing into a sub-process having a result parameter indicating if the initial message shall be kept (defer processing to a later time by not confirming) or mark it as "processed" (and confirm it).
By default, all (JMS) messages are auto-confirmed so an explicit confirm is a design choice made by you (based on a particular consumption model) in the config tab of the process starter/step. You should only use this if you know what happens with that message and if a processing deferral is possible. Most loosely coupled messaging is not "transactional" (except you decide to take the extra mile) in a DB sense - so rather stick to the auto-confirm if you have no special handling demand! BW/EMS are quite good in handling (reasonably small) messages so NOT auto-confirming can create re-deliveries within milliseconds and flood your whole system (heap space) if not handled properly.

Related

How does a microservice return data to the caller when using a message broker? or a message queue?

I am prettty new to microservices, and I am trying to figure out how to set a micro-service architecture in which my publisher that emits an event, can receive a response with data from the consumer within the publisher?
From what i have read about message-brokers and message-queues, it seems like it's one-way communication. The producer emits an event (or rather, sends a message) which is handled by the message broker, and then the consumer consumes that event and performs some action.
This allows for decoupled code, which is part of what im looking for, but i dont understand if the consumer is able to return any data to the caller.
Say for example I have a microservice that communicates with an external API to fetch data. I want to be able to send a message or emit an event from my front-facing server, which then calls the service that fetches data, parses the data, and then returns that data back to my servver1 (front-facing server)
Is there a way to make message brokers or queues bidirectional? Or is it only useable in one direction. I keep reading message brokers allow services to communicate with each other, but I only find examples in which data flow goes one way.
Even reading rabbitMQ documentation hasn't really made it very clear to me how i could do this
In general, when talking about messaging, it's one-way.
When you send a letter to someone you're not opening up a mind-meld so that they telepathically communicate their response to you.
Instead, you include a return address (or some other means of contacting you).
So to map a request-response interaction when communicating with explicit messaging (e.g. via a message queue), the solution is the same: you include some directions which the recipient can/will interpret as "send a response here". That could, for instance be, "publish a message on this queue with this correlation ID".
Your publisher then, after sending this message, subscribes to the queue it's designated and waits for a message with the expected correlation ID.
Needless to say, this is fairly elaborate: you are, in some sense, reimplementing a decent portion of a session protocol like TCP on top of a datagram protocol like IP (albeit in this case, we may have some stronger reliability guarantees than we'd get from IP). It's worth noting that this sort of request-response interaction intrinsically couples the two parties (we can't really say "sender and receiver": each is the other's audience), so we're basically putting in some effort to decouple the two sides and then some more effort to recouple them.
With that in mind, if the actual business use case calls for a request-response interaction like this, consider implementing it with an actual request-response protocol (e.g. REST over HTTP or gRPC...) and accept that you have this coupling.
Alternatively, if you really want to pursue loose coupling, go for broke and embrace the asynchronicity at the heart of the universe (maybe that way lies true enlightenment?). Have your publisher return success with that correlation ID as soon as its sent its message. Meanwhile, have a different service be tracking the state of those correlation IDs and exposing a query interface (CQRS, hooray!). Your client can then check at any time whether the thing it wanted succeeded, even if its connection to your publisher gets interrupted.
Queues are the wrong level of abstraction for request-reply. You can build an application out of them, but it would be nontrivial to support and operate.
The solution is to use an orchestration system like temporal.io or AWS Step Functions. These services out of the box provide state management, asynchronous communication, and automatic recovery in case of various types of failures.

How does Mass Transit handle retries deduplication and message id generation when using in-memory outbox

Mass Transit has an in-memory "outbox" implementation that I think will handle the majority of the concerns / challenges I am looking to over come however I can not find a lot of documentation that describes its capabilities in the detail I am looking for. A lot of these questions came about after watching a video where Udi Dahan explains how to handle reliable messaging without distributed transactions (https://vimeo.com/111998645).
Does the in-memory outbox handle failures that may happen when trying to send a message to the queue? So for example: A consumer generates 3 messages that are collected in the outbox. The consumer completes without issue.The collected messages in the outbox start being processed
If from some reason while processing the collected message there is a network issue (or other issue) and message 2 fails to be sent what will happen to message 2 and 3? Is there any sort of retry policy?
What happens if a message being processed in the outbox is successfully added to the queue but is unsuccessfully marked as sent in the outbox? Will there be another attempt to send the message to the queue?
Assuming the outbox will retry sending a message to a queue if there is some sort of failure is the message ID guaranteed to be consistent between attempts? Having a consistent Message ID is important for de-duplication to ensure we do not process the same message multiple times.
When a message is consumed is there any de-duplication that takes place? (This ties back to 1.C)
How does Mass Transit track processed records for each consumer? Do the storage engines take care of this responsibility?
Is there any sort of "transaction" exposed to the consumer that allows you to clear the collected message in the outbox without throwing an exception or is throwing an exception the only way to rollback the outbox?
What about messages that are generated outside of a consumer, Is there a way to rollback messages collected in the outbox (example: A WebAPI controller action)?
Is there a recommendation to use the DTC features of Mass Transit instead of outbox or vice versa or use them both?
Currently Mass Transit does not have an outbox implementation that can survive a process crash. Is there a plan to include such a feature? Is there a road map this is tracked on?
The in-memory outbox defers any message send/publish/respond calls until the consumer has completed all processing. This includes regular consumers and sagas. The very last thing the consumer does is send/publish any deferred messages, after which the incoming message is acknowledged (and removed from the queue). With that said, most of the remaining items in your question aren't relevant, because it isn't writing messages to a database, and then processing them afterwards.
No
No
Don't use the DTC, it isn't even supported in .NET Core
No plans, nothing on the roadmap
As you said at the start, the in-memory outbox handles 99.9% of the cases. A well-designed saga and supporting services can push that even higher, ensuring idempotency and eventually successful command (or event) processing. Anything beyond what's there today is typically to support poorly designed systems and just creates way too much complexity with extra dependencies.

How to get data a ZMQ_PUB service?

Can I publisher service receive data from an external source and send them to the subscribers?
In the wuserver.cpp example, the data are generated from the same script.
Can I write a ZMQ_PUBLISHER entity, which receives data from external data source / application ... ?
In this affirmation:
There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.
Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
Q1: Can I write a ZMQ_PUBLISHER entity, which receives data from external data source/application?
A1: Oh sure, this is why ZeroMQ is so helping us in designing smart distributed-systems. Just imagine the PUB-side process to also have other { .bind() | .connect() }-calls, so as to establish such other links to data-feeder(s), and you are done to operate the wished to have scheme. In distributed-systems this gives you a new freedom to smart integrate heterogeneous systems to talk to each other in a very efficient way.
Q2:Does this mean, that a PUB-SUB ZeroMQ pattern is performed to a best effort - UDP style?
A2: No, it has another meaning. The newly declared subscriber entities at some uncertain moment start to negotiate their respective subscription-topic filtering and such a ( distributed ) process takes some a-priori unknown time. Unless until the new / changed topic-filter policy was established, there is nothing to go into the SUB-side exgress interface to meet a .recv()-call, so no one can indeed tell, when that will get happened, can he?
On a higher level, there is another well known dichotomy of ZeroMQ -- Zero-Warranty Principle -- expect to either get delivered a complete message or none at all, which prevents the framework users from a need to handle any kind of damaged / inconsistent message-payloads. Either OK, or None. That's a great warranty. The more for distributed-systems.

Is there an enterprise message queue which can drop duplicate messages (first value stays)?

I am looking looking for a message queue with these requirements. Couldn't find it; maybe the closest was the rabbitmq-lvc plugin (but I need the first value in the line to stick and stay in front).
Would anyone know a technology to support these?
message queue is FIFO
if a duplicate message is being enqueued, the message queue itself either rejects or drops it.
For example, producers put these three messages (each with a discriminator value) into the queue in this sequence: M1(discriminator=7654), M2(discriminator=2435), M3(discriminator=7654).
Now I want the message queue to see that M3 has the same discriminator value as M1 and thus drop/reject M3. Consumers receive only: M1, M2.
Thanks
Tom
I don't know the other transports but I know that WebSphere MQ doesn't do this and I believe that the explanation why would apply broadly across the category. I'd be very surprised to find that any messaging transport actually provides this. Here are a few reasons why:
Async messages are supposed to be atomic. Different vendors make their own accommodations for message affinity (a relationship between two or more messages) but as a rule, message affinity is to be avoided. Your use case not only requires the transport to deal with message affinity, but to do so over an indeterminate interval between related messages.
Message payload is a blob. For performance reasons, WMQ doesn't touch message payloads except for things like compression or code page conversion. Anything that requires parsing the message payload is a job for WebSphere Message Broker, DataPower or WebSphere ESB. I would expect any messaging transport which claims to be performant would face similar issues because parsing payloads results in longer code paths and non-linear performance degradation. The exception is message properties but WMQ uses these for selection only and I expect that is generally the case.
Stateless operation. As a transport, the state of the application may be stored in a persistent message but the state of the transport layer should not depend on the state of the application across different units of work. Again, an ESB type of product is best suited when you want to delegate management of some of the application state to the messaging layer and especially when such management spans many units of work.
Assured delivery. WMQ was designed to never lose your persistent message. If the app explicitly sets expiry the message might go away because the sender said it was OK to do so. If the message is non-persistent it might go away, but only in an exceptional condition and, again, because the sender said it was OK to do so. The use case you describe might result in a message going away not because the sender said it was OK, or even because the recipient said it was OK but because of an interaction with some unrelated 3rd party who happened to beat you to the queue with a duplicate value. What if that first message has an invalid header or code page problem and gets rolled back? What if I as an attacker spew out garbage messages with all possible 4-digit values for discriminator?
As I said, I don't know the other messaging products so there may be something out there which meets your requirement and if so I'll be interested to read about it. However in the event hat nobody replies, this post may shed some light on the reasons why.

ActiveMQ: Slow processing consumers

Concerning ActiveMQ: I have a scenario where I have one producer which sends small (around 10KB) files to the consumers. Although the files are small, the consumers need around 10 seconds to analyze them and return the result to the producer. I've researched a lot, but I still cannot find answers to the following questions:
How do I make the broker store the files (completely) in a queue?
Should I use ObjectMessage (because the files are small) or blob messages?
Because the consumers are slow processing, should I lower their prefetchLimit or use a round-robin dispatch policy? Which one is better?
And finally, in the ActiveMQ FAQ, I read this - "If a consumer receives a message and does not acknowledge it before closing then the message will be redelivered to another consumer.". So my question here is, does ActiveMQ guarantee that only 1 consumer will process the message (and therefore there will be only 1 answer to the producer), or not? When does the consumer acknowledge a message (in the default, automatic acknowledge settings) - when receiving the message and storing it in a session, or when the onMessage handler finishes? And also, because the consumers are so slow in processing, should I change some "timeout limit" so the broker knows how much to wait before giving the work to another consumer (this is kind of related to my previous questions)?
Not sure about others, but here are some thoughts.
First: I am not sure what your exact concern is. ActiveMQ does store messages in a data store; all data need NOT reside in memory in any single place (either broker or client). So you should actually be good in that regard; earlier versions did require that all ids needed to fit in memory (not sure if that was resolved), but even that memory usage would be low enough unless you had tens of millions of in-queue messages.
As to ObjectMessage vs blob; raw byte array (blob) should be most compact representation, but since all of these get serialized for storage, it only affects memory usage on client. Pre-fetch mostly helps with access latency; but given that they are slow to process, you probably don't need any prefetching; so yes, either set it to 1 or 2 or disable altogether.
As to guarantees: best that distributed message queues can guarantee is either at-least-once (with possible duplicates), or at-most-once (no duplicates, can lose messages). It is usually better to take at-least-once, and make clients to de-duping using client-provided ids. How acknowledgement is sent is defiend by JMS specification so you can read more about JMS; this is not ActiveMQ specific.
And yes, you should set timeout high enough that worker typically can finish up work, including all network latencies. This can slow down re-transmit of dropped messages (if worked dies), but it is probably not a problem for you.

Resources