Distributed System - How to guarantee that a message is published at least once? - microservices

How can consistency be maintained between disparate systems (aka bounded contexts) when a failure happens before the message is published to a service bus (it could be a queue)?
I figured out 3 options:
1. Use Udi Dahan's approach (Reliable Messaging - https://vimeo.com/111998645), i.e., keep a record of the published messages in the same store as the entity that generated the event.
2. Process the database transaction log and publish its entries to a message bus.
3. Use an event sourcing approach.
Are there any options besides these? What are the pros and cons of each approach?

Message brokers like RabbitMQ provide an at-least-once delivery guarantee when publisher confirms and consumer acknowledgements are used.
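Option 1 from the question can be sketched as follows. This is a minimal in-memory illustration, not the implementation from the talk: the class `OutboxSketch`, its methods and the message format are hypothetical, a `synchronized` method stands in for one local database transaction, and a `Consumer<String>` stands in for the broker client. The entity and the pending message are stored together, and a relay marks a message as sent only after the publish call has returned.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a database in which an "orders" table
// and an "outbox" table are written in the same transaction.
class OutboxSketch {
    private final Map<String, String> orders = new LinkedHashMap<>();
    private final Map<Long, String> outbox = new LinkedHashMap<>(); // pending messages by id
    private long nextId = 1;

    // Stands in for ONE local transaction: the entity and the outgoing message
    // are stored together, so neither can exist without the other.
    synchronized void placeOrder(String orderId, String details) {
        orders.put(orderId, details);
        outbox.put(nextId++, "OrderPlaced:" + orderId);
    }

    // Relay step: publish every pending message, then mark it as sent.
    // If publishing throws, the message stays pending and is retried later.
    synchronized int relay(Consumer<String> broker) {
        int published = 0;
        for (Long id : new ArrayList<>(outbox.keySet())) {
            broker.accept(outbox.get(id)); // may throw: message stays in the outbox
            outbox.remove(id);             // "mark as sent" only after success
            published++;
        }
        return published;
    }

    synchronized int pending() {
        return outbox.size();
    }

    static List<String> demo() {
        OutboxSketch db = new OutboxSketch();
        List<String> delivered = new ArrayList<>();
        db.placeOrder("42", "2 x widget");
        try {
            // First attempt: the broker is unavailable, nothing is marked as sent.
            db.relay(m -> { throw new IllegalStateException("broker unavailable"); });
        } catch (IllegalStateException expected) {
            // The message is still pending in the outbox.
        }
        db.relay(delivered::add); // Second attempt succeeds and drains the outbox.
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the relay crashed after publishing but before marking the message as sent, the same message would be published again on the next run. That is what makes the guarantee at-least-once rather than exactly-once, so consumers should be idempotent.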

Related

Microservice Event driven Design with multiple Instances

At the moment we are designing and planning the transformation of our system to a microservice architecture pattern.
To achieve loose coupling we are thinking about an event-driven design with a JMS Topic. This looks great, but I don't know how we can solve the problem with multiple instances of a microservice.
For failover and load balancing we have n instances of each service. If an event is published to the topic, each instance will receive and process that event.
It's possible to handle this with locks and processed states in the data storage. But this solution looks very expensive, and every instance does the same work. This is not load balancing to me.
Is there a good solution or best practice for this pattern?
Why not use a Queue instead of a Topic? Then your instances will compete for messages rather than all get a copy.
EDIT
RabbitMQ might be a better fit for you: publish to a fanout exchange and bind any number of queues to it, with each queue having any number of competing consumers.
I have also seen JMS topics used where competing clients connect with the same client id. Some (all?) brokers will only allow one such client to consume. The others keep trying to reconnect until the current consumer dies.
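The difference between the two delivery styles can be illustrated without a broker. The sketch below is an assumption-laden simulation using only `java.util.concurrent`, not the JMS API; `CompetingConsumersSketch` and its method names are hypothetical. With queue semantics all instances poll one shared queue, so each message is processed once in total; with topic semantics every instance gets its own copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class CompetingConsumersSketch {
    // Queue semantics: all instances poll the SAME queue, so each message
    // is taken by exactly one of them. Returns the total number processed.
    static int processedWithQueue(int messages, int instances) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messages; i++) {
            queue.add(i);
        }
        Queue<Integer> processed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        for (int n = 0; n < instances; n++) {
            pool.execute(() -> {
                Integer msg;
                while ((msg = queue.poll()) != null) {
                    processed.add(msg); // stands in for the real work
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.size();
    }

    // Topic semantics: every instance has its own subscription and receives
    // its own copy of every message. Returns the total number processed.
    static int processedWithTopic(int messages, int instances) {
        List<List<Integer>> subscriptions = new ArrayList<>();
        for (int n = 0; n < instances; n++) {
            subscriptions.add(new ArrayList<>());
        }
        for (int i = 0; i < messages; i++) {
            for (List<Integer> subscription : subscriptions) {
                subscription.add(i);
            }
        }
        int total = 0;
        for (List<Integer> subscription : subscriptions) {
            total += subscription.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("queue: " + processedWithQueue(100, 3));
        System.out.println("topic: " + processedWithTopic(100, 3));
    }
}
```

With 100 messages and 3 instances, the queue variant does 100 units of work in total, while the topic variant does 300, which is the duplicated work the question complains about.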

How to send incremental updates and snapshot sync using ActiveMQ topics

Here is my use case: I am developing a trading application and I want to send incremental stock updates (bidQty etc.) to active consumers instead of the whole quote, and a snapshot update to a new consumer (to start with).
Now, is it possible to override any ActiveMQ class (an implementor of Topic) to achieve this behavior? Any clues on this would be helpful.
If the same is possible with any other open-source provider, please let me know.
This is NOT a case where you simply can change the implementation of topic. You should actually avoid changing the implementation of core ActiveMQ features to solve specific business requirements. Fixing bugs and adding core messaging features is another thing.
There are multiple ways to solve your use case with regular ActiveMQ features.
Separate Sync and Update channel
I would probably divide the "sync/snapshot" channel from the "incremental update" channel.
One way is to implement the "snapshot-sync" as JMS request/reply where the consumer asks the provider for a sync, then continues to rely on incremental updates pushed via the topic.
Advisory messages and Selectors
You can also implement it all using a single topic using a mix of AdvisoryMessages and JMS Selectors.
An idea (you can do this in many ways):
Introduce two message properties: MsgType and Receiver.
Mark each incremental update with MsgType=inc.
Mark each snapshot with the client id of the consumer it is meant for, Receiver=<client id>.
Have the producer listen to advisory messages from ActiveMQ and fire a snapshot/sync message marked with Receiver=<client id> and MsgType=snapshot when a new client subscribes to the stock topic.
The client subscribes with a selector of something like
MsgType='inc' OR (MsgType='snapshot' AND Receiver=<me>)
This way you can trigger snapshot syncs with specific clients as well as incremental updates for all clients.
If you start thinking about the dynamics you already have, you can probably come up with another ten or so solutions.
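The filtering logic of the selector above can be mirrored in plain Java to see which messages each client ends up with. This is only a sketch under assumptions: `SelectorSketch`, `Msg` and the sample data are hypothetical, and no broker is involved. In a real JMS client the selector string itself would be passed to `Session.createConsumer(destination, selector)`, with the client id written as a quoted string literal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SelectorSketch {
    // Minimal message carrying the two properties proposed above.
    static final class Msg {
        final String msgType;
        final String receiver; // null for incremental updates
        final String body;

        Msg(String msgType, String receiver, String body) {
            this.msgType = msgType;
            this.receiver = receiver;
            this.body = body;
        }
    }

    // Plain-Java equivalent of:
    // MsgType='inc' OR (MsgType='snapshot' AND Receiver='<me>')
    static boolean matches(Msg m, String me) {
        return "inc".equals(m.msgType)
                || ("snapshot".equals(m.msgType) && me.equals(m.receiver));
    }

    // Bodies of the messages that a client with the given id would receive.
    static List<String> receivedBy(String me, List<Msg> topic) {
        List<String> received = new ArrayList<>();
        for (Msg m : topic) {
            if (matches(m, me)) {
                received.add(m.body);
            }
        }
        return received;
    }

    // Hypothetical traffic on the stock topic.
    static List<Msg> sampleTopic() {
        return Arrays.asList(
                new Msg("snapshot", "clientA", "ACME full quote for A"),
                new Msg("inc", null, "ACME bidQty=500"),
                new Msg("snapshot", "clientB", "ACME full quote for B"),
                new Msg("inc", null, "ACME askQty=300"));
    }

    public static void main(String[] args) {
        System.out.println(receivedBy("clientA", sampleTopic()));
        System.out.println(receivedBy("clientB", sampleTopic()));
    }
}
```

Each client sees every incremental update but only the snapshot addressed to it, so a late joiner does not trigger a full resend to everyone else.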
Retroactive Consumers
You might have some use of a Retroactive Consumer - the example actually shows a scenario similar to yours.

JMS 2.0: Shared-Durable-Consumer on Topic vs Asynchronous-Consumer on Queue; Ref. Official GlassFish 4.0 docs/javaee-tutorial Java EE 7

Ref: Official GlassFish 4.0 docs/javaee-tutorial Java EE 7
Firstly, let us start with the destination-type of: topic.
As per GlassFish 4.0 tutorial, section “46.4 Writing High Performance and Scalable JMS Applications”:
This section describes how to use the JMS API to write applications
that can handle high volumes of messages robustly.
In the subsection “46.4.2 Using Shared Durable Subscriptions”:
The SharedDurableSubscriberExample.java client shows how to use shared
durable subscriptions. It shows how shared durable subscriptions
combine the advantages of durable subscriptions (the subscription
remains active when the client is not) with those of shared consumers
(the message load can be divided among multiple clients).
When we run this example as per “46.4.2.1 To Run the ShareDurableSubscriberExample and Producer Clients”, it gives us the same effect/functionality as the previous example with a destination type of queue, i.e. if we follow “46.2.6.2 To Run the AsynchConsumer and Producer Clients”, points 5 onwards, and modify it slightly by using 2 consumer terminal windows and 1 producer terminal window.
Yes, section “45.2.2.2 Publish/Subscribe Messaging Style” does mention:
The JMS API relaxes this requirement to some extent by allowing
applications to create durable subscriptions, which receive messages
sent while the consumers are not active. Durable subscriptions provide
the flexibility and reliability of queues but still allow clients to
send messages to many recipients.
.. and anyway section “46.4 Writing High Performance and Scalable ..” examples are queue style – one message per consumer:
Each message added to the topic subscription is received by only one
consumer, similarly to the way in which each message added to a queue
is received by only one consumer.
What is the precise technical reason why, in this example, a Shared-Durable-Consumer on a Topic is presented under “High Performance and Scalable JMS Application” rather than an Asynchronous-Consumer on a Queue?
I was wondering about the same issue, so I found the following link. I understand that John Ament gave you the right response; maybe it was just too short to give a full understanding.
Basically, when you create a topic you are assuming that only the subscribed consumers will receive its messages. However, processing such a message may require heavy processing; in such cases you can create a shared subscription on the topic and serve it with as many threads as you want.
Why not use a queue? The answer is quite simple: if you use a queue, only one consumer will be able to handle each message.
To clarify, I will give you an example. Let's say a federal court publishes thousands of sentences every day and you have three distinct applications that depend on them.
Application A just copies the sentences to a database.
Application B parses each sentence and tries to find all relations between people across all previously saved sentences.
Application C parses each sentence and tries to find all relations between companies across all previously saved sentences.
You could use a Topic for the sentences, to which Applications A, B and C would be subscribed. However, it is easy to see that Application A can process a message very quickly, while Applications B and C may take some time. One available solution would be to create a shared subscription for Application B and another one for Application C, so that multiple threads could work on each of them simultaneously...
...Of course there are other solutions. You could, for example, use an unshared subscription (i.e. a regular one) and post all received messages to an ArrayBlockingQueue that would be handled by a pool of threads some time later; however, with that decision the developer would be the one to worry about queue handling.
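The ArrayBlockingQueue alternative might look roughly like the sketch below. It is a stand-alone illustration under assumptions: `HandOffSketch`, the `STOP` marker and the buffer size of 16 are hypothetical, and a plain loop stands in for the single topic listener that would receive the sentences.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HandOffSketch {
    private static final String STOP = "__stop__"; // hypothetical end-of-stream marker

    // Returns how many messages the worker pool processed.
    static int run(int messages, int workers) {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(16);
        AtomicInteger parsed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                try {
                    // take() blocks until a message is available.
                    for (String s = buffer.take(); !STOP.equals(s); s = buffer.take()) {
                        parsed.incrementAndGet(); // stands in for the slow parsing work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            // Stands in for the single (unshared) topic listener: it only enqueues.
            // put() blocks while the buffer is full, which throttles the listener.
            for (int i = 0; i < messages; i++) {
                buffer.put("sentence-" + i);
            }
            for (int w = 0; w < workers; w++) {
                buffer.put(STOP); // one marker per worker so every worker exits
            }
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return parsed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 4));
    }
}
```

The queue handling the answer warns about is visible here: shutdown, back-pressure and interruption are all the developer's job. In addition, a message that has already been acknowledged to the broker but is still sitting in the in-memory buffer would be lost if the process died, which a shared subscription avoids.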
Hope this can help.
The idea is that you can have multiple readers on a subscription. This allows you to read more messages faster, assuming you have threads available.
JMS Queue:
queued messages are persisted
each message is delivered to exactly one consumer, even if no consumer is running when the message is sent.
JMS Shared Subscription:
a subscription can have zero to many consumers
if a message is sent before any subscription (durable or not) exists, it will never be received.

When to use persistence with Java Messaging and Queuing Systems

I'm performing a trade study on (Java) Messaging & Queuing systems for an upcoming re-design of a back-end framework for a major web application (on Amazon's EC2 Cloud, x-large instances). I'm currently evaluating ActiveMQ and RabbitMQ.
The plan is to have 5 different queues, with one being a dead-letter queue. The number of messages sent per day will be anywhere between 40K and 400K. As I plan for the message content to be a pointer to an XML file location on a data store, I expect the messages to be about 64 bytes. However, for evaluation purposes, I would also like to consider sending raw XML in the messages, with an average file size of 3KB.
My main questions: When/how many messages should be persisted on a daily basis? Is it reasonable to persist all messages, considering the amounts I specified above? I know that persisting will decrease performance, perhaps by a lot. But, by not persisting, a lot of RAM is being used. What would some of you recommend?
Also, I know that there is a lot of information online regarding ActiveMQ (JMS) vs RabbitMQ (AMQP). I have done a ton of research and testing. It seems like either implementation would fit my needs. Considering the information that I provided above (file sizes and # of messages), can anyone point out a reason(s) to use a particular vendor that I may have missed?
Thanks!
When/how many messages should be persisted on a daily basis? Is it
reasonable to persist all messages, considering the amounts I
specified above?
JMS persistence doesn't replace a database; it should be considered a short-lived buffer between producers and consumers of data. That said, the volume and size of messages you mention won't tax the persistence adapters of any modern JMS system (configured properly, anyway), and the store can be used to buffer messages for extended durations as necessary (just use a reliable message store architecture).
I know that persisting will decrease performance, perhaps by a lot.
But, by not persisting, a lot of RAM is being used. What would some of
you recommend?
In my experience, enabling message persistence isn't a significant performance hit and is almost always done to guarantee messages. For most applications, the processes upstream (producers) or downstream (consumers) end up being the bottlenecks (especially database I/O), not the JMS persistence stores.
Also, I know that there is a lot of information online regarding
ActiveMQ (JMS) vs RabbitMQ (AMQP). I have done a ton of research and
testing. It seems like either implementation would fit my needs.
Considering the information that I provided above (file sizes and # of
messages), can anyone point out a reason(s) to use a particular vendor
that I may have missed?
I have successfully used ActiveMQ on many projects for both low- and high-volume messaging. I'd recommend using it along with a routing engine like Apache Camel to streamline integration and complex routing patterns.
A messaging system should be used as temporary storage. Applications should be designed to pull messages as soon as possible: the more messages that accumulate, the lower the performance, so consuming them promptly gives both better performance and lower memory usage. Whether persistent or not, messages still use memory, because they are kept in memory for better performance; they are backed up on disk only if they are persistent.
The decision on message persistence depends on how critical a message is and whether it needs to survive a restart of the messaging provider.
You may want to have a look at IBM WebSphere MQ. It can meet your requirements. It has JMS as well as proprietary APIs for developing applications.
ActiveMQ is a good choice for open source JMS, more expensive ones I can recommend are TIBCO EMS or maybe Solace.
But JMS is actually built for once-only delivery, and longer-term persistence is left out of the specification. You could of course use a database, but that's heavyweight and possibly expensive.
What I would recommend (Note: I work for CodeStreet) is our 'ReplayService for JMS'. It lets you store any type of JMS message (or native WebSphere MQ ones) in high-performance file-based disk storage. Each message is automatically assigned a nanosecond timestamp and a globalMsgID that you can overwrite on publication. So the XML messages could be recorded by the ReplayServer, and your actual message could just contain the globalMsgID as a reference, and maybe some properties.
Once a receiver receives the globalMsgID, it could then replay that message from the ReplayServer, if needed.
But on the other hand, 400K 3KB XML messages per day should be easily doable for ActiveMQ or others. Also, you should compress your XML messages before sending.
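For scale, 400,000 messages of about 3 KB each come to roughly 1.2 GB per day before compression. Compressing the XML on the producer side could be sketched with the JDK's own gzip streams, as below. The class `XmlCompressionSketch` and the sample document are hypothetical; in a JMS application the resulting byte array would typically travel in a `BytesMessage`, and the consumer would have to decompress it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class XmlCompressionSketch {
    // Producer side: gzip the UTF-8 bytes of the XML document.
    static byte[] compress(String xml) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // read only after the gzip stream is closed
    }

    // Consumer side: restore the original XML text.
    static String decompress(byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(payload))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Hypothetical, repetitive XML document of a few kilobytes.
    static String sampleXml() {
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < 100; i++) {
            sb.append("<order id=\"").append(i).append("\"><status>NEW</status></order>");
        }
        return sb.append("</orders>").toString();
    }

    public static void main(String[] args) {
        String xml = sampleXml();
        byte[] packed = compress(xml);
        System.out.println(xml.length() + " chars -> " + packed.length + " bytes");
    }
}
```

How much this saves depends on the documents; repetitive XML usually compresses well, but it is worth measuring on real payloads before relying on it.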

JMS Producer-Consumer-Observer (PCO)

In JMS there are Queues and Topics. As I understand it so far, queues are best used for producer/consumer scenarios, whereas topics can be used for publish/subscribe. However, in my scenario I need a way to combine both approaches and create a producer-consumer-observer architecture.
Particularly I have producers which write to some queues and workers, which read from these queues and process the messages in those queues, then write it to a different queue (or topic). Whenever a worker has done a job my GUI should be notified and update its representation of the current system state. Since workers and GUI are different processes I cannot apply a simple observer pattern or notify the GUI directly.
What is the best way to realize this using a combination of queues and/or topics? The GUI should always be notified, but it should never consume anything from a queue?
I would like to solve this with JMS directly and not use any additional technology such as RMI to implement the observer part.
To give a more concrete example:
1. I have a queue with packages (PACKAGEQUEUE), produced by a machine (PackageProducer).
2. I have a worker which takes a package from the PACKAGEQUEUE, adds an address and then writes it to a MAILQUEUE (AddressWorker).
3. Another worker processes the MAILQUEUE and sends the packages out by mail (MailWorker).
After step 2, when a message is written to the MAILQUEUE, I want to notify the GUI and update the status of the package. Of course the GUI should not consume the messages in the MAILQUEUE; only the MailWorker must consume them.
You can use a combination of queue and topic for your solution.
Your GUI application can subscribe to a topic, say MAILQUEUE_NOTIFICATION. Every time AddressWorker writes a message to MAILQUEUE (i.e. at step 2), a copy of that message should be published to the MAILQUEUE_NOTIFICATION topic. Since the GUI application has subscribed to the topic, it will get that publication, which contains the status of the package, and can update itself with its contents.
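The queue-plus-topic combination can be sketched in memory as follows. This is an illustration under assumptions, not JMS code: `MailNotificationSketch` and its members are hypothetical, a `BlockingQueue` stands in for MAILQUEUE, and a list of callbacks stands in for the subscribers of MAILQUEUE_NOTIFICATION. Following step 2 of the question, the worker that writes to MAILQUEUE is AddressWorker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class MailNotificationSketch {
    // Stands in for MAILQUEUE: only the MailWorker takes messages from it.
    final BlockingQueue<String> mailQueue = new LinkedBlockingQueue<>();
    // Stands in for the MAILQUEUE_NOTIFICATION topic: every subscriber gets a copy.
    final List<Consumer<String>> notificationTopic = new CopyOnWriteArrayList<>();

    // AddressWorker (step 2): add the address, enqueue the package for the
    // MailWorker, then publish a copy for observers such as the GUI.
    void addressWorker(String pkg, String address) {
        String addressed = pkg + " -> " + address;
        mailQueue.add(addressed);
        for (Consumer<String> subscriber : notificationTopic) {
            subscriber.accept(addressed);
        }
    }

    // Returns true if the GUI was notified and the MailWorker still got the message.
    static boolean demo() {
        MailNotificationSketch bus = new MailNotificationSketch();
        List<String> guiLog = new ArrayList<>();
        bus.notificationTopic.add(guiLog::add); // the GUI subscribes to the topic only
        bus.addressWorker("package-1", "1 Main St");
        String forMailWorker = bus.mailQueue.poll(); // the MailWorker consumes from the queue
        return guiLog.size() == 1
                && guiLog.get(0).equals(forMailWorker)
                && bus.mailQueue.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The GUI never touches the queue, so it cannot steal work from the MailWorker. In a real JMS setup the two sends are separate operations; performing both in one transacted session is one way to keep them from diverging when the worker fails between them.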
HTH

Resources