I prefer the confluent-kafka-go driver. It's fast, but it needs some workarounds. The trickiest part is committing messages: for better performance we shouldn't commit every message; it's better to commit periodically. But to make this approach robust, we have to handle rebalance events.
I have my own implementation of this approach, but how do you all do it?
Or does nobody deal with it?
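For reference, here is a minimal, broker-free sketch of the periodic-commit idea. The names (Committer, OffsetTracker, MapCommitter) are mine, not from any library. With confluent-kafka-go you would back the interface with (*kafka.Consumer).CommitOffsets, call Flush from a ticker for the periodic commit, and also call it from the rebalance callback on a RevokedPartitions event so no progress is lost when partitions move to another consumer:

```go
package main

import (
	"fmt"
	"sync"
)

// Committer abstracts the actual commit call; with confluent-kafka-go this
// would be a thin wrapper around (*kafka.Consumer).CommitOffsets.
type Committer interface {
	CommitOffsets(offsets map[int32]int64) error
}

// MapCommitter records commits in memory; it stands in for a real consumer.
type MapCommitter struct{ Committed map[int32]int64 }

func (m *MapCommitter) CommitOffsets(offsets map[int32]int64) error {
	if m.Committed == nil {
		m.Committed = make(map[int32]int64)
	}
	for p, o := range offsets {
		m.Committed[p] = o
	}
	return nil
}

// OffsetTracker keeps, per partition, the next offset to commit
// (last processed offset + 1, as Kafka expects).
type OffsetTracker struct {
	mu      sync.Mutex
	pending map[int32]int64
	c       Committer
}

func NewOffsetTracker(c Committer) *OffsetTracker {
	return &OffsetTracker{pending: map[int32]int64{}, c: c}
}

// MarkProcessed is called after each message instead of committing it.
func (t *OffsetTracker) MarkProcessed(partition int32, offset int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if next := offset + 1; next > t.pending[partition] {
		t.pending[partition] = next
	}
}

// Flush commits everything recorded so far. Run it periodically, and from
// the rebalance callback when partitions are revoked.
func (t *OffsetTracker) Flush() error {
	t.mu.Lock()
	offsets := t.pending
	t.pending = map[int32]int64{}
	t.mu.Unlock()
	if len(offsets) == 0 {
		return nil
	}
	return t.c.CommitOffsets(offsets)
}

func main() {
	mc := &MapCommitter{}
	tr := NewOffsetTracker(mc)
	tr.MarkProcessed(0, 41) // from the consumer poll loop
	tr.MarkProcessed(0, 40) // out-of-order marks are fine
	tr.Flush()              // from a ticker or a RevokedPartitions event
	fmt.Println(mc.Committed[0]) // 42
}
```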
Related
I have watched several articles about XA/Distributed Transaction Coordinators; many of them just mention that a DBMS must explicitly support XA in order to participate. They also describe how a Distributed Transaction Coordinator works. However, after reading a lot of this information, yes, I know what a DTC does, but I still don't know how to start.
I have been looking for a long time, but I can't find an out-of-the-box DTC. Do we need to implement a DTC ourselves? Isn't there any existing, usable DTC framework?
Some related StackOverflow posts:
How do two-phase commits prevent last-second failure?
2PC distributed transactions across many microservices?
How to process distributed transaction within postgresql?
I have a REST service - all its requests are persisted to its own relational database. So far, so good. But there is also some small business functionality (email notification, SMS alert) that should run on the newly received/updated data. For this process to work on the data in the background, it needs some way to learn about the persisted data - a message queue would fix that. Three common ways I see of designing this:
The REST service inserts into the database and also publishes to the queue.
The problem here is the distributed transaction - combining different resource types within one transaction, a relational database and the queue. Some tools may support this, some may not.
As usual, the REST service persists only to its database. Additionally, it inserts the data into another table, which a scheduled job queries and publishes to the queue (from which the background job starts its work).
The problem I see is the scheduler - not reactive, batch processing, limited by its time slot, not real-time, slow, and so on.
The REST endpoint publishes the data directly to a topic. One consumer persists it to the database, while another processes it in the background.
Something like event sourcing. To my understanding, it is a bit complex to implement as the number of services grows. Also, if the DB is down, the persisting service would fail to save the data, yet the background service (say, the emailer) would still send the email, which is functionally wrong. This may lead to inconsistency among the services, functional as well as in data.
I have also thought of reading the database transaction log, but that seems more complex, requires extra tooling and configuration to make it work, and also seems better suited to data-processing systems than to our use case.
What are your thoughts on this - did I miss anything? How do you manage such scenarios? What should I look out for? Thinking reactive, say Vert.x?
Apologies if this looks very naive, but I have to ask.
I think the best approach is 2, with a CDC (change data capture) system like Debezium.
See https://microservices.io/patterns/data/transactional-outbox.html
I usually recommend option 3 if you don't need immediate read-after-write consistency. The background job should retry if the database record has not yet been updated by the message it is processing.
Your post exemplifies why queues shouldn't be used for these kinds of scenarios. They are good for delivering analytical data or logs, but for task orchestration, developers have to reinvent the wheel every time.
The much better approach is to use a task orchestration system like Cadence Workflow that eliminates issues you described and makes multi-service orchestration much simpler.
See this presentation that explains the Cadence programming model.
Application 1 (A1) sends messages to A2 over MQ. A2 uses XA transactions so that a message dropped on the queue is picked up by A2, processed, and written to the DB, with the whole transaction committed at once.
I would like to test whether A2 correctly maintains system consistency if the transaction fails mid-way and whether XA has been implemented correctly.
I would like to stop the DB as soon as A2 picks up the message, but I am not sure whether I will have enough time to stop the DB, or whether I will know for sure that the message has been picked up.
Any other suggestions for testing this?
Thanks,
Yash
I am assuming you are using Java here, otherwise, some of this won't be applicable.
The quick, pragmatic solution is to inject a delay into your process, which gives you time to take your transactionally destructive action. The easiest way to do this is to run the app in a debugger: place a breakpoint at some suitable location (perhaps after the message has been received and the DB write is complete but not committed) and kill the DB while the debugger has the thread paused. Alternatively, add a test hook to your code whereby the thread sleeps if the MQ message has a header with an unlikely name like 'sleeponmessagereceived'.
A more complex but sophisticated technique is to use error injection via an AOP tool. I would definitely look at Byteman. It allows you to inject bytecode at runtime and was originally written to test XA scenarios like yours for the Arjuna transaction manager. You can inject code procedurally, or you can annotate unit-test methods. One advantage of this approach is that you can direct Byteman to trigger an error condition based on a variety of conditions, such as the nth invocation, or a method argument equalling X. Also, depending on how detailed your knowledge of your transaction manager is, you can recreate a wider set of scenarios to produce trickier XA outcomes, such as heuristic exceptions. There are some examples here that demonstrate how to use Byteman scripts to validate MQ XA recovery. This project is intended to help reproduce XA failure and recovery. It is JBoss-specific, but I would imagine you could adapt it to your environment.
I've found references in a few places to some internal logging capabilities of ZMQ. The functionality that I think might exist is the ability to connect to an inproc or ipc SUB socket (or both) and listen to messages that give information about the internal state of ZMQ. This would be quite useful when debugging a distributed application. For instance, if messages are missing or being dropped, it might shed some light on why.
The most obvious mention of this is here: http://lists.zeromq.org/pipermail/zeromq-dev/2010-September/005724.html, but it's also referred to here: http://lists.zeromq.org/pipermail/zeromq-dev/2011-April/010830.html. However, I haven't found any documentation of this feature.
Is some sort of logging functionality truly available? If so, how is it used?
Some grepping through the git history eventually answered my question. The short answer is that a way for ZMQ to transmit logging messages to the outside world was implemented, but it was never used to actually send logging messages by the rest of the code base. After a while it was removed since nothing used it.
The commit that originally added it making use of an inproc socket:
https://github.com/zeromq/libzmq/commit/ce0972dca3982538fd123b61fbae3928fad6d1e7
The commit that added a new "sys" socket type specifically to support the logging:
https://github.com/zeromq/libzmq/commit/651c1adc80ddc724877f2ebedf07d18e21e363f6
JIRA issue, pull request, and commit to remove the functionality:
https://zeromq.jira.com/browse/LIBZMQ-336
https://github.com/zeromq/libzmq/pull/277
https://github.com/zeromq/libzmq/commit/5973da486696aca389dab0f558c5ef514470bcd2
I am wondering how we can ensure message durability when using WebSphere MQ and WCF. I want my WCF process to pick messages off the queue such that, if the application encounters a problem (power outage, etc.), I don't lose the messages. I would also like to avoid using a transaction if at all possible, because I want to eliminate distributed transactions.
Thanks,
S
Well, there's transactions and there's distributed transactions. The "right" answer is to use the WMQ 1-phase commit here. That doesn't have the complexity of XA transactions but it does give you the ability to roll back a message without losing it. In fact, when using clients you really should be using at least 1-phase commit just to prevent loss of messages.
Short of that, there is always the "browse-with-lock, delete-message-under-cursor" method. I'm pretty sure everything you need to do the browsing, locking and deleting is exposed under .NET, but perhaps Shashi will comment and confirm.
The WebSphere MQ WCF custom channel has a feature, "Assured Delivery", that guarantees that a service request or reply is actioned and not lost. This is the 1-phase commit (also known as SYNC_POINT) in WMQ.
"Assured Delivery" is a service contract attribute. Here are more details about the feature.