I'm kind of confused about the syncpoint concept in WebSphere MQ. According to the WMQ Infocenter, a syncpoint is defined as follows:
The decision to commit or back out the changes is taken, in the simplest case, at the end of a task. However, it can be more useful for an application to synchronize data changes at other logical points within a task. These logical points are called sync points (or synchronization points) and the period of processing a set of updates between two sync points is called a unit of work.
Therefore, a unit of work is the processing between two sync points. From my understanding, a unit of work is actually a transaction. So I drew this picture reflecting the relationship between them:
Am I understanding it correctly?
Another question: the syncpoint is said to synchronize data for the application, but where is the data synced from and where is it synced to?
Thanks
Answer 1: Yes, you are understanding it correctly. Transactions are atomic units of work that succeed or fail in their entirety. Syncpoints are the boundaries between the transactions. The subtle difference between a transaction and a syncpoint is that the work done under a transaction can be in-doubt for a period of time, whereas the syncpoint is the state during which the transaction is not in doubt. In the event of failure, the queues are restored to the state that existed at the last syncpoint, and any pending transactions are then rolled back or, in the case of XA, possibly committed by the resource manager.
The Infocenter page Transaction management and support may explain it better than the page linked in the post.
Answer 2: The data is synchronized by the resource manager(s). For local units of work, where messages are the only participants in the transaction, the synchronization occurs only in the queue, and the queue manager acts as both the resource manager and the transaction manager. For global units of work, involving messages and database updates in the same transaction, the synchronization occurs in the QMgr and the database, which act as resource managers. The transaction manager will be either MQ or an application server, and it orchestrates the synchronization between the resource managers.
Regardless of whether the transaction is local or global (sometimes also referred to as single-phase commit or 2-phase commit) the relationship between transactions and syncpoints is the same. The syncpoint is the most recent point in time at which a known state is preserved and to which an in-doubt transaction might be rolled back.
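To make the unit-of-work boundaries concrete, here is a minimal sketch using the generic JMS API (which WebSphere MQ implements) rather than the MQ-native MQGMO_SYNCPOINT options; the queue name and the way the ConnectionFactory is obtained are assumptions for illustration only.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

public class SyncpointSketch {

    // connectionFactory would be an IBM MQ (or other JMS) factory obtained elsewhere
    void transferUnderSyncpoint(ConnectionFactory connectionFactory) throws Exception {
        Connection connection = connectionFactory.createConnection();
        connection.start();

        // A transacted session: everything sent or received before commit()
        // belongs to one unit of work.
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        Queue queue = session.createQueue("DEV.QUEUE.1"); // assumed queue name
        MessageProducer producer = session.createProducer(queue);

        try {
            producer.send(session.createTextMessage("debit account A"));
            producer.send(session.createTextMessage("credit account B"));

            // commit() establishes a new syncpoint: both messages become visible atomically
            session.commit();
        } catch (Exception e) {
            // rollback() returns the queue to the state at the previous syncpoint
            session.rollback();
            throw e;
        } finally {
            connection.close();
        }
    }
}
```

Either both messages appear on the queue or neither does; the period between the two commit points is the unit of work described above.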
Related
Currently working on the initial phase of a Microservice Architecture for a product. It is evident that for many operations a distributed transaction is required; that is, a couple of operations across different microservices are needed in order to complete one business process.
For this purpose I found that the Saga pattern is useful. In the ideal case, where everything goes correctly, it works fine, but when something goes wrong or some activity fails, we may have to roll back those operations. For this, something called a "compensating" transaction or operation is required. The complication is that after an operation completes successfully, other transactions may also have been performed on that service, so the database might be in a different state than it was when the original operation ran.
What would be the solution for this? One idea is that the state somehow needs to be preserved so it can be revisited, but values such as stock levels may have been changed by other transactions in the meantime, so I feel that the compensating transaction would be a problem.
As per what I read on the internet, a method annotated with Spring's @KafkaListener will commit the offset every 5 seconds by default.
Suppose that after 5 seconds the offset is committed but the processing is still going on, and in the meantime the consumer crashes because of some issue. In that case, after rebalancing, the partition will be assigned to another consumer, which will start processing from the next message because the previous message's offset was already committed.
This will result in loss of the message.
So, do I need to commit the offset manually after processing completes? What would be the recommended approach?
Again, if processing is done and the consumer crashes just before the commit, how do I avoid message duplication in that case?
Please suggest an approach that avoids both message loss and duplication. I am using Spring's KafkaListener with the default configuration.
As usual this depends on your use case and how you would like to deal with issues during your processing. The usage of auto-commit will change the delivery semantics of your application.
Enabling auto-commit gives you something closer to "at-most-once" semantics, because you may read the data and commit the offset before you have actually processed it. If your processing then fails, the message was already committed and you will not read it again; it is therefore "lost" for your application (for your particular consumer group, to be more precise).
Disabling auto-commit gives you something closer to "at-least-once" semantics, because you commit the offsets only after processing the data. Imagine you fetch 100 messages from the topic: 50 of them are processed successfully and your application fails while processing the 51st message. Since you disabled auto-commit and only commit all or none of the messages at the end of the processing, you have not committed any of the 100 messages, so the next time your application will read the same 100 messages again. However, you have now created 50 duplicates, as those messages were already processed successfully the first time.
To conclude, you need to figure out whether your use case can better tolerate data loss or duplicates. Duplicates can be handled by making your application idempotent.
You are asking how to prevent data loss and duplicates, which means you are referring to "exactly-once semantics". This is a big topic in distributed streaming systems; you should check the spring-kafka docs to see whether and under which configuration it is supported, and how that depends on the output operation of your application.
Please also check the comment of GaryRussell on this post:
"the Spring team does not recommend using auto commit; the listener container Ackmode (BATCH or RECORD) will commit the offsets in a deterministic manner; recent versions of the framework disable auto commit (unless specifically enabled)"
If the consumer takes 5+ seconds to process the message then you have a problem in the code that needs to be fixed.
Auto-commit is risky in production as it can lead to problem scenarios (message loss, etc.).
Better to go with manual commit to have better control.
Make the consumer idempotent so that duplicate messages and the WIP state of the consumer are not a problem. For example, maintain a processing status in the consumer's DB so that if processing is half done, then on consumer restart it can clear the WIP state and process afresh. Similarly, if the processing status is Complete, then on restart it will see the Complete status and simply acknowledge the duplicate message to Kafka.
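As a sketch of that "processing status in the consumer's DB" idea, here is one possible shape, assuming a hypothetical message_status table keyed by a message id (for example topic-partition-offset or a business key); the upsert statement shown is H2-style and the syntax varies by database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.function.Consumer;
import javax.sql.DataSource;

public class IdempotentProcessor {

    private final DataSource dataSource;

    public IdempotentProcessor(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Processes a message at most once per messageId, even if Kafka redelivers it.
    public void processOnce(String messageId, Consumer<Connection> businessLogic) throws Exception {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);

            // 1. Skip work that is already complete (duplicate delivery after a crash).
            try (PreparedStatement check = con.prepareStatement(
                    "SELECT status FROM message_status WHERE message_id = ?")) {
                check.setString(1, messageId);
                try (ResultSet rs = check.executeQuery()) {
                    if (rs.next() && "COMPLETE".equals(rs.getString("status"))) {
                        return; // already done, the caller just acknowledges the message to Kafka
                    }
                }
            }

            // 2. Do the work and record completion in the same DB transaction, so a
            //    half-done attempt leaves no COMPLETE marker behind and is redone on restart.
            businessLogic.accept(con);
            try (PreparedStatement upsert = con.prepareStatement(
                    // H2-style upsert, shown for brevity; adjust to your database
                    "MERGE INTO message_status (message_id, status) KEY (message_id) VALUES (?, 'COMPLETE')")) {
                upsert.setString(1, messageId);
                upsert.executeUpdate();
            }
            con.commit();
        }
    }
}
```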
I am a beginner with Spring. In my project we have a method to make a booking. If the available booking count is 1 and two users try to book at the same time, ideally only one booking should be allowed, but my application allows two bookings. I made the method synchronized and now it works fine, but synchronization belongs to a single JVM; if I configure my application in cluster mode there are different servers on different machines (so different JVMs), and synchronization won't work.
Can anyone please tell me possible solutions to restrict the booking, both from the Java side and from the DB side?
If the application may be deployed in a cluster, the synchronized method will indeed not be enough.
You should rely on a node shared by all server instances: the DB.
You could use DB locking features, especially optimistic locking and pessimistic locking.
With them, you can avoid collisions resulting from concurrent updates to the same row by concurrent clients.
Choose whichever one matches your use case better.
Concurrency control in databases extract:
Optimistic - Delay the checking of whether a transaction meets the isolation and other integrity rules (e.g., serializability and recoverability) until its end, without blocking any of its (read, write) operations ("...and be optimistic about the rules being met..."), and then abort a transaction to prevent the violation, if the desired rules are to be violated upon its commit. An aborted transaction is immediately restarted and re-executed, which incurs an obvious overhead (versus executing it to the end only once). If not too many transactions are aborted, then being optimistic is usually a good strategy.
Pessimistic - Block an operation of a transaction, if it may cause violation of the rules, until the possibility of violation disappears. Blocking operations is typically involved with performance reduction.
JDBC and JPA support both.
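For example, here is a hedged JPA sketch of both options, assuming a hypothetical Booking entity with an availableSlots counter; the optimistic variant relies on a @Version column, the pessimistic one on a PESSIMISTIC_WRITE lock.

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import javax.persistence.LockModeType;
import javax.persistence.Version;

@Entity
public class Booking {

    @Id
    private Long id;

    private int availableSlots;

    // Optimistic locking: JPA adds this column to every UPDATE's WHERE clause,
    // so a stale concurrent update fails instead of silently overbooking.
    @Version
    private long version;

    public long getVersion() { return version; }
    public int getAvailableSlots() { return availableSlots; }
    public void setAvailableSlots(int availableSlots) { this.availableSlots = availableSlots; }
}

class BookingService {

    // Optimistic variant: the second concurrent booker fails at commit time
    // (typically with an OptimisticLockException, possibly wrapped) and can retry.
    void bookOptimistic(EntityManager em, Long bookingId) {
        em.getTransaction().begin();
        try {
            Booking booking = em.find(Booking.class, bookingId);
            if (booking.getAvailableSlots() < 1) {
                throw new IllegalStateException("sold out");
            }
            booking.setAvailableSlots(booking.getAvailableSlots() - 1);
            em.getTransaction().commit(); // version check happens on the UPDATE issued here
        } catch (RuntimeException e) {
            if (em.getTransaction().isActive()) {
                em.getTransaction().rollback();
            }
            throw e; // caller retries or reports "already booked"
        }
    }

    // Pessimistic variant: SELECT ... FOR UPDATE makes the second booker wait
    // until the first transaction finishes, then see the updated count.
    void bookPessimistic(EntityManager em, Long bookingId) {
        em.getTransaction().begin();
        Booking booking = em.find(Booking.class, bookingId, LockModeType.PESSIMISTIC_WRITE);
        if (booking.getAvailableSlots() < 1) {
            em.getTransaction().rollback();
            throw new IllegalStateException("sold out");
        }
        booking.setAvailableSlots(booking.getAvailableSlots() - 1);
        em.getTransaction().commit();
    }
}
```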
You could try ETags - an optimistic locking mechanism.
They can be easily implemented with Spring; please check the official documentation.
Hope this will help.
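For illustration, here is a minimal sketch of the ETag idea with plain Spring Web, reusing the hypothetical Booking entity sketched above and an assumed Spring Data repository; Spring Data REST can also derive ETag headers from a @Version field automatically.

```java
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RestController;

// Assumed Spring Data repository for the Booking entity sketched above.
interface BookingRepository extends JpaRepository<Booking, Long> {}

@RestController
public class BookingController {

    private final BookingRepository repository;

    public BookingController(BookingRepository repository) {
        this.repository = repository;
    }

    @PostMapping("/bookings/{id}/reserve")
    public ResponseEntity<?> reserve(@PathVariable Long id,
                                     @RequestHeader("If-Match") String ifMatch) {
        Booking booking = repository.findById(id).orElseThrow();

        // The client sends back the ETag it received on GET; if the version has
        // moved on, someone else booked in the meantime.
        String currentEtag = "\"" + booking.getVersion() + "\"";
        if (!currentEtag.equals(ifMatch)) {
            return ResponseEntity.status(412).build(); // 412 Precondition Failed
        }

        booking.setAvailableSlots(booking.getAvailableSlots() - 1);
        repository.save(booking);
        return ResponseEntity.ok().eTag("\"" + booking.getVersion() + "\"").build();
    }
}
```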
For a couple of days I've been trying to figure out how to inform the rest of the microservices that a new entity was created in microservice A, which stores that entity in MongoDB.
I want to:
Have low coupling between the microservices
Avoid distributed transactions between microservices like Two Phase Commit (2PC)
At first a message broker like RabbitMQ seems to be a good tool for the job, but then I see the problem that committing the new document in MongoDB and publishing the message to the broker is not atomic.
Why event sourcing? by eventuate.io:
One way of solving this issue implies making the schema of the documents a bit dirtier, by adding a mark that says whether the document has been published to the broker, and having a scheduled background process that searches for unpublished documents in MongoDB and publishes them to the broker using confirmations; when the confirmation arrives, the document is marked as published (using at-least-once and idempotency semantics). This solution is proposed in this and this answer.
Reading an Introduction to Microservices by Chris Richardson, I ended up at this great presentation, Developing functional domain models with event sourcing, where one of the slides asked:
How to atomically update the database and publish events without 2PC? (the dual write problem).
The answer is simple (on the next slide)
Update the database and publish events
This is a different approach to this one that is based on CQRS a la Greg Young.
The domain repository is responsible for publishing the events, this would normally be inside a single transaction together with storing the events in the event store.
I think that delegating the responsibility of storing and publishing the events to the event store is a good thing, because it avoids the need for 2PC or a background process.
However, in a certain way it's true that:
If you rely on the event store to publish the events you'd have a tight coupling to the storage mechanism.
But we could say the same if we adopt a message broker for intercommunication between the microservices.
The thing that worries me more is that the Event Store seems to become a Single Point of Failure.
If we look this example from eventuate.io
we can see that if the event store is down, we can't create accounts or money transfers, losing one of the advantages of microservices (although the system will continue responding to queries).
So, is it correct to say that the Event Store, as used in the eventuate example, is a Single Point of Failure?
What you are facing is an instance of the Two Generals' Problem. Basically, you want two entities on a network to agree on something, but the network is not fail-safe. Leslie Lamport proved that this is impossible.
So no matter how many entities you add to your network, the message queue being one of them, you will never have 100% certainty that agreement will be reached. In fact, the opposite takes place: the more entities you add to your distributed system, the less certain you can be that an agreement will eventually be reached.
A practical answer in your case is that 2PC is not that bad when the alternative is adding even more complexity and more single points of failure. If you absolutely do not want a single point of failure and are willing to assume that the network is reliable (in other words, that the network itself cannot be a single point of failure), you can try a P2P algorithm such as a DHT, but for two peers I bet it reduces to simple 2PC.
We handle this with the Outbox approach in NServiceBus:
http://docs.particular.net/nservicebus/outbox/
This approach requires that the initial trigger for the whole operation come in as a message on the queue, but it works very well.
You could also create a flag for each entry inside the event store which tells whether that event has already been published. Another process could poll the event store for unpublished events and put them into a message queue or topic. The disadvantage of this approach is that consumers of this queue or topic must be designed to de-duplicate incoming messages, because this pattern only guarantees at-least-once delivery. Another disadvantage could be latency because of the polling frequency. But since we have already entered eventually-consistent territory here, this might not be such a big concern.
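A rough sketch of that flag-plus-poller idea with the MongoDB Java driver; the database and collection names and the EventPublisher abstraction are assumptions for illustration, not a specific library's API.

```java
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.set;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class UnpublishedEventPoller {

    private final MongoCollection<Document> events;
    private final EventPublisher publisher; // assumed wrapper around the broker client

    public UnpublishedEventPoller(MongoClient client, EventPublisher publisher) {
        this.events = client.getDatabase("app").getCollection("events"); // assumed names
        this.publisher = publisher;
    }

    // Runs periodically (e.g. from a scheduler). The order of operations gives at-least-once:
    // if we crash after publish() but before the update, the event is simply published again.
    public void publishPending() {
        for (Document event : events.find(eq("published", false))) {
            publisher.publish(event.toJson());                  // broker confirms delivery
            events.updateOne(eq("_id", event.get("_id")),       // then mark as published
                             set("published", true));
        }
    }

    // Minimal broker abstraction, just for this sketch.
    public interface EventPublisher {
        void publish(String payload);
    }
}
```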
How about if we have two event stores, and whenever a Domain Event is created, it is queued onto both of them? The event handler on the query side then handles events popped from both event stores.
Of course, every event should be idempotent.
But wouldn't this solve our problem of the event store being a single point of failure?
Not specifically a MongoDB solution, but have you considered leveraging the Streams feature introduced in Redis 5 to implement a reliable event store? Take a look at this intro here.
I find that it has a rich set of features like message tailing, message acknowledgement, as well as the ability to extract unacknowledged messages easily. This surely helps to implement at-least-once messaging guarantees. It also supports load balancing of messages using the "consumer group" concept, which can help with scaling the processing part.
Regarding your concern about it being a single point of failure: as per the documentation, streams and consumer information can be replicated across nodes and persisted to disk (using regular Redis mechanisms, I believe). This helps address the single point of failure issue. I'm currently considering using this for one of my microservices projects.
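To give a feel for the Redis Streams commands involved (XADD, XGROUP CREATE, XREADGROUP, XACK), here is a rough sketch using the Lettuce client; the stream, group, consumer, and field names are assumptions.

```java
import io.lettuce.core.Consumer;
import io.lettuce.core.RedisClient;
import io.lettuce.core.StreamMessage;
import io.lettuce.core.XReadArgs;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.List;
import java.util.Map;

public class RedisStreamSketch {

    public static void main(String[] args) {
        RedisClient client = RedisClient.create("redis://localhost:6379"); // assumed address
        StatefulRedisConnection<String, String> connection = client.connect();
        RedisCommands<String, String> redis = connection.sync();

        // Append a domain event to the stream (XADD); Redis assigns the entry id.
        redis.xadd("account-events", Map.of("type", "AccountCreated", "accountId", "42"));

        // Create the consumer group once (XGROUP CREATE); ignore the error if it already exists.
        try {
            redis.xgroupCreate(XReadArgs.StreamOffset.from("account-events", "0"), "projections");
        } catch (Exception alreadyExists) {
            // BUSYGROUP: the group was created on a previous run
        }

        // Read pending events for this consumer (XREADGROUP) and acknowledge after processing (XACK).
        List<StreamMessage<String, String>> messages = redis.xreadgroup(
                Consumer.from("projections", "consumer-1"),
                XReadArgs.StreamOffset.lastConsumed("account-events"));

        for (StreamMessage<String, String> message : messages) {
            handle(message.getBody());                       // update the read model, etc.
            redis.xack("account-events", "projections", message.getId());
        }

        connection.close();
        client.shutdown();
    }

    private static void handle(Map<String, String> event) {
        // idempotent processing goes here
    }
}
```

Unacknowledged entries stay in the group's pending list, which is what makes the at-least-once behaviour mentioned above possible.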
I have a WCF service that uses ODP.NET to read data from an Oracle database. The service also writes to the database, but indirectly, as all updates and inserts are achieved through an older layer of business logic that I access via COM+, which I wrap in a TransactionScope. The older layer connects to Oracle via ODBC, not ODP.NET.
The problem I have is that because Oracle uses a two-phase commit, and because the older business layer is using ODBC and not ODP.NET, TransactionScope.Commit() sometimes returns before the data is actually available for reads from the service layer.
I see a similar post about a Java user having trouble like this as well on Stack Overflow.
A representative from Oracle posted that there isn't much I can do about this problem:
This may be due to the way the OLETx ITransaction::Commit() method behaves. After phase 1 of the 2PC (i.e. the prepare phase), if all is successful, commit can return even if the resource managers haven't actually committed. After all, the successful "prepare" is a guarantee that the resource managers cannot arbitrarily abort after this point. Thus even though a resource manager couldn't commit because it didn't receive a "commit" notification from the MSDTC (due to, say, a communication failure), the component's commit request returns successfully. If you select rows from the table(s) immediately you may sometimes see the actual commit occur in the database after you have already executed your select. Your select will not therefore see the new rows due to consistent read semantics. There is nothing we can do about this in Oracle as the "commit success after successful phase 1" optimization is part of the MSDTC's implementation.
So, my question is this:
How should I go about dealing with the possible delay ("async" per the title) in figuring out when the second phase of the 2PC actually occurs, so I can be sure that the data I inserted (indirectly) is actually available to be selected after the Commit() call returns?
How do big systems deal with the fact that the data might not be ready for reading immediately?
I assume that the whole transaction has prepared and a commit outcome has been decided by the Transaction Manager; therefore, eventually (barring heuristic damage) the Resource Managers will receive their commit message and complete. However, there are no guarantees as to how long that might take - it could be days; no timeouts apply, and having voted "commit" in the Prepare phase, the Resource Manager must wait to hear the collective outcome.
Under these conditions, the simplest approach is to respond with "understood, we're working on it". The request has been accepted, but you don't actually know the outcome, and that's what you tell the user. Yes, in all sane circumstances the request will complete, but under some conditions operators could actually choose to intervene in the transaction manually (and maybe cause heuristic damage in doing so).
To go one step further, you could start a new transaction and perform some queries to see if the data is there. If you are populating a result screen you will naturally be doing such a query anyway. The question is what to do if the expected results are not there. So again, tell the user "your recent request is being processed, hit refresh to see if it's complete", or retry automatically (I don't much like auto-retry - I prefer to educate the user that it's effectively an asynchronous operation).
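The question is .NET-centric, but the "query again and treat the outcome as pending" idea is language-agnostic. Here is a minimal sketch (shown in Java only because the exact stack does not matter) of polling in fresh read attempts until the committed data becomes visible or a deadline passes; the query supplier and timing values are assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;
import java.util.function.Supplier;

public class CommitVisibilityPoller {

    // Polls (a fresh read each time) until the inserted data becomes visible or the
    // deadline passes; on timeout the caller tells the user the request is still
    // being processed rather than pretending it failed.
    public static <T> Optional<T> awaitVisible(Supplier<Optional<T>> query,
                                               Duration timeout,
                                               Duration pollInterval) throws InterruptedException {
        Instant deadline = Instant.now().plus(timeout);
        while (Instant.now().isBefore(deadline)) {
            Optional<T> result = query.get();   // e.g. an assumed repository.findByRequestId(id)
            if (result.isPresent()) {
                return result;
            }
            Thread.sleep(pollInterval.toMillis());
        }
        return Optional.empty();                // not yet visible: treat as "still processing"
    }
}
```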