Multiple Publish calls inside a TransactionScope fail - Oracle

I modeled an integration architecture between different subsystems. All notifications from a subsystem are sent to the subscribed subsystems using the Publish primitive. These notifications are sent in a for loop inside a Handler method, so they are all in the same TransactionScope. I made a simple example to illustrate this: a client sends a message to a server, and the server sends a variable number of messages using the Publish primitive. This is the server handler:
public void Handle(MyMessage message)
{
    for (int i = 0; i < message.numberOfNotifications; i++)
    {
        Bus.Publish<NotificationMessage>(m =>
        {
            m.myPersonalCount = i;
        });
    }
}
What I'm seeing, and can't figure out, is that when the number of notifications is 30 or less everything is OK. With 31 or more I get this error message:
could not execute query
[ SELECT this_.SubscriberEndpoint as y0_ FROM "Subscription" this_ WHERE this_.MessageType in (?) ]
Looking at the inner exception, I see Unable to enlist in a distributed transaction.
I tried the same thing using the Send primitive and everything was fine (tried with 10k messages), so this problem is specific to Publish.
I use Oracle 10g for the DBMS and the Oracle 11g client.
If the endpoint is not transactional I don't have any problem, so the issue seems to be tied only to the TransactionScope.
Any help is appreciated, Thanks

I am not an Oracle expert by any stretch of the imagination. I probably wouldn't even qualify as novice.
However, I know that NServiceBus queries the subscription storage for every single publish, in case there are changes to subscriptions between publishes.
Is it possible that the Oracle client has some sort of limitation to how many queries can enlist in a distributed transaction? Perhaps as a way to prevent N+1 types of performance problems?
That said, it seems very odd that you would want to publish more than 30 of the same type of event. I wonder what your business use case is. Events are generally supposed to announce that something irrevocable has happened. Why would there be 30 things that have happened?
If the business case is solid, it might be a good idea to implement your own subscription storage engine that uses some limited caching (even 5 seconds) so that you aren't going back to the database on every single publish.
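If you go down that route, the idea is just a short-lived cache in front of the subscription query. Here is a minimal sketch of that decorator idea; ISubscriberLookup is a made-up stand-in, not the real NServiceBus storage interface, so adapt it to whatever your version actually exposes:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Made-up stand-in for the subscription lookup performed on every publish.
public interface ISubscriberLookup
{
    IEnumerable<string> GetSubscribersFor(string messageType);
}

// Decorator that caches subscriber lists for a few seconds, so repeated
// publishes within a batch don't each hit the database.
public class CachingSubscriberLookup : ISubscriberLookup
{
    private readonly ISubscriberLookup inner;
    private readonly TimeSpan ttl;
    private readonly ConcurrentDictionary<string, Tuple<DateTime, List<string>>> cache =
        new ConcurrentDictionary<string, Tuple<DateTime, List<string>>>();

    public CachingSubscriberLookup(ISubscriberLookup inner, TimeSpan ttl)
    {
        this.inner = inner;
        this.ttl = ttl;
    }

    public IEnumerable<string> GetSubscribersFor(string messageType)
    {
        Tuple<DateTime, List<string>> entry;
        if (cache.TryGetValue(messageType, out entry) && DateTime.UtcNow - entry.Item1 < ttl)
            return entry.Item2; // still fresh: skip the database round trip

        var subscribers = inner.GetSubscribersFor(messageType).ToList();
        cache[messageType] = Tuple.Create(DateTime.UtcNow, subscribers);
        return subscribers;
    }
}

The trade-off is that a new subscriber may be missed for up to the cache lifetime, which is usually acceptable for subscription data.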

Related

How to handle Transaction in CosmosDB - "All or nothing" concept

I am trying to save multiple documents to multiple collections at once in one transaction, so if one of the saves fails, all the saved documents should roll back.
I am using Spring Boot & Spring Data and the MongoDB API to connect to Cosmos DB in Azure. I have read in their portal that this can be done by writing a stored procedure. But is there a way we can do it from code, like how Spring has the @Transactional annotation?
Any help is really appreciated.
The only way you can write transactionally is with a stored procedure, or via transactional batch operations (SDK-based, in a subset of the language SDKs, currently .NET and Java). But that won't help in your case:
Stored procedures are specific to the core (SQL) API; they're not for the MongoDB API.
Transactions are scoped to a single partition within a single collection. You cannot transactionally write across collections, regardless of whether you use the MongoDB API or the core (SQL) API.
You'll really need to ask whether transactions are absolutely necessary. Maybe you can use some type of durable-messaging approach to manage your content updates. But there's simply nothing built in that will let you do this natively in Cosmos DB.
Transactions across partitions and collections are indeed not supported. If you really need a rollback mechanism, it might be worthwhile to look at the event sourcing pattern, as you might then be able to capture events instead of updating master entities. You could then easily delete those events, but other processes might already have executed using the incorrect events.
We created a sort of unit of work. We register all changes to the data model, including the events and messages that are to be sent. Only when we call a commit are the changes persisted to the database, in the following order:
Commit updates
Commit deletes
Commit inserts
Send messages
Send events
Still, it's not watertight, but it avoids sending out messages/events/modifications to the data model as long as the calling process is not ready to do so (e.g. due to an error). This UnitOfWork is passed through our domain services so that all operations of a command are handled in one batch. It's then up to the developer to decide whether a certain operation can be committed as part of a bigger operation (same UoW) or independently (new UoW).
We then wrapped our command handlers in a Polly policy to retry in case of update conflicts. Theoretically we could still get an update conflict on the second update, which could cause an inconsistent data model, but we keep this in mind when using the UoW.
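For illustration, a stripped-down sketch of that unit-of-work idea; all type names and the persistence/messaging callbacks here are placeholders, not our actual API:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class UnitOfWork
{
    private readonly List<object> updates = new List<object>();
    private readonly List<object> deletes = new List<object>();
    private readonly List<object> inserts = new List<object>();
    private readonly List<object> messages = new List<object>();
    private readonly List<object> events = new List<object>();

    public void RegisterUpdate(object entity) { updates.Add(entity); }
    public void RegisterDelete(object entity) { deletes.Add(entity); }
    public void RegisterInsert(object entity) { inserts.Add(entity); }
    public void RegisterMessage(object message) { messages.Add(message); }
    public void RegisterEvent(object @event) { events.Add(@event); }

    // Nothing leaves this object until Commit is called, and then only in
    // the fixed order: updates, deletes, inserts, messages, events.
    public async Task CommitAsync(
        Func<object, Task> persistUpdate,
        Func<object, Task> persistDelete,
        Func<object, Task> persistInsert,
        Func<object, Task> sendMessage,
        Func<object, Task> publishEvent)
    {
        foreach (var entity in updates) await persistUpdate(entity);
        foreach (var entity in deletes) await persistDelete(entity);
        foreach (var entity in inserts) await persistInsert(entity);
        foreach (var message in messages) await sendMessage(message);
        foreach (var @event in events) await publishEvent(@event);
    }
}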
It's not watertight, but hopefully it helps!
Yes, transactions are supported in Cosmos DB with the MongoDB API. I believe it's a fairly new addition, but it's simple enough and described here.
I don't know how well it's supported in Spring Boot, but at least it's doable.
// start transaction
var session = db.getMongo().startSession();
var friendsCollection = session.getDatabase("users").friends;
session.startTransaction();

// operations in transaction
try {
    friendsCollection.updateOne({ name: "Tom" }, { $set: { friendOf: "Mike" } });
    friendsCollection.updateOne({ name: "Mike" }, { $set: { friendOf: "Tom" } });
} catch (error) {
    // abort transaction on error
    session.abortTransaction();
    throw error;
}

// commit transaction
session.commitTransaction();

How to rollback distributed transactions?

I have three different Spring Boot projects with separate databases, e.g. account-rest, payment-rest, gateway-rest.
account-rest : create a new account
payment-rest : create a new payment
gateway-rest : calls other endpoints
At gateway-rest there is an endpoint which calls the other two endpoints.
@GetMapping("/gateway-api")
@org.springframework.transaction.annotation.Transactional(rollbackFor = RuntimeException.class)
public String getApi()
{
    String accountId = restTemplate.getForObject("http://localhost:8686/account", String.class);
    restTemplate.getForObject("http://localhost:8585/payment?accid=" + accountId, String.class);
    throw new RuntimeException("rollback everything");
}
I want to roll back the transactions and revert everything when I throw an exception at the gateway or any other endpoint.
How can I do that ?
It is impossible to roll back external dependencies accessible via REST or something like that.
The only thing you can do is compensate for errors; you can use a pattern like Saga.
I hope that helps.
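To make the compensation idea concrete, here is a minimal sketch (in C#, but the shape is the same in Spring): if the payment call fails, the gateway explicitly undoes the account creation. The endpoints and the DELETE-based compensation call are assumptions for illustration only, not your actual API:

using System;
using System.Net.Http;
using System.Threading.Tasks;

public class GatewayCoordinator
{
    private static readonly HttpClient http = new HttpClient();

    public async Task CreateAccountWithPaymentAsync()
    {
        // Step 1: create the account.
        var accountId = await http.GetStringAsync("http://localhost:8686/account");
        try
        {
            // Step 2: create the payment for that account.
            await http.GetStringAsync("http://localhost:8585/payment?accid=" + accountId);
        }
        catch (Exception)
        {
            // Compensating action: undo step 1. The account service would need to
            // expose something like this, and a real saga also has to cope with the
            // compensation itself failing (retry, park for manual follow-up, ...).
            await http.DeleteAsync("http://localhost:8686/account/" + accountId);
            throw;
        }
    }
}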
You are basically doing dual persistence. That's not ideal, for two reasons:
It increases latency and thus has a direct impact on user experience.
What if one of them fails?
As the other answer pointed out, the Saga pattern is an option for posting compensating transactions.
The other option, and it's better to go with this by all means, is to avoid dual persistence by writing to only one service synchronously and then using Change Data Capture (CDC) to asynchronously update the other service. If we can design it this way, we can ensure atomicity (all or nothing), and the rollback scenario itself probably won't surface at all.
Refer to these two answers also, if they help:
https://stackoverflow.com/a/54676222/1235935
https://stackoverflow.com/a/54527066/1235935
By all means avoid distributed transactions and two-phase commit. They are not a good solution and create a lot of operational overhead, locking, etc. when the transaction coordinator fails after the prepare phase and before the commit phase. Worse things happen when the transaction coordinator's data gets corrupted.
For that purpose you need an external transaction management system. It will handle distributed transactions and commit/rollback once they are finished on all services.
Possible flow example:
A request comes in.
gateway-rest starts a distributed transaction and a local transaction, and sends a request (with the transaction id) to payment-rest. The thread with the transaction lives until all local transactions are finished.
payment-rest knows about the global transaction and starts its own local transaction.
When all local transactions are marked as committed, the TM (transaction manager) sends a request to each service to close its local transaction, and then closes the global transaction.
In your case you can use sagas, as mentioned by many others, but they require events and are async in nature.
If you want a sync kind of API, you can do something similar to this.
First let's take an example like Amazon's: creating an order, taking the balance out of your wallet and completing the order:
Create the Order in Pending state
Reserve the balance in the Account service for that order id
If the balance was reserved, change the Order state to Confirmed (also storing the transaction id of the reservation) and report reserveBalanceConsumed to the Account service
Else change the Order state to Cancelled with the reason "not enough balance"
Now there are cases where, let's say, the Account service balance is reserved but for some reason the order is never confirmed.
Then something could periodically check: if there is a reserved balance for some order and more than, say, 30 minutes have passed, check whether that order is marked as Confirmed with that transaction id; if so, call reserveBalanceConsumed, otherwise cancel the order with the reason "some error, please try again" and mark the balance as free.
NOW, THESE TYPES OF SYSTEMS ARE COMPLEX TO BUILD. Use the Saga pattern in general for a simpler structure.
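As an illustration of that periodic check, here is a rough sketch (in C#; every store interface and method name below is hypothetical, just enough to show the decision logic for stale reservations):

using System;
using System.Threading.Tasks;

public class ReservationReconciler
{
    private readonly IReservationStore reservations; // hypothetical store
    private readonly IOrderStore orders;             // hypothetical store

    public ReservationReconciler(IReservationStore reservations, IOrderStore orders)
    {
        this.reservations = reservations;
        this.orders = orders;
    }

    // Run this periodically, e.g. from a scheduled job.
    public async Task RunOnceAsync()
    {
        foreach (var reservation in await reservations.OlderThanAsync(TimeSpan.FromMinutes(30)))
        {
            var order = await orders.FindAsync(reservation.OrderId);
            if (order != null && order.State == "Confirmed" && order.TransactionId == reservation.TransactionId)
            {
                // The order did go through: consume the reserved balance.
                await reservations.MarkConsumedAsync(reservation);
            }
            else
            {
                // The order never made it: cancel it and free the balance again.
                await orders.CancelAsync(reservation.OrderId, "some error, please try again");
                await reservations.ReleaseAsync(reservation);
            }
        }
    }
}

// Hypothetical interfaces and records, just enough to make the sketch compile.
public interface IReservationStore
{
    Task<Reservation[]> OlderThanAsync(TimeSpan age);
    Task MarkConsumedAsync(Reservation reservation);
    Task ReleaseAsync(Reservation reservation);
}
public interface IOrderStore
{
    Task<Order> FindAsync(string orderId);
    Task CancelAsync(string orderId, string reason);
}
public class Reservation { public string OrderId; public string TransactionId; }
public class Order { public string State; public string TransactionId; }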

Big difference in throughput when using MSMQ as subscription storage for Nservicebus as opposed to RavenDB

Recently we noticed that our Nservicebus subscribers were not able to handle the increasing load. We have a fairly constant input stream of events (measurement data from embedded devices), so it is very important that the throughput follows the input.
After some profiling, we concluded that it was not the handling of the events that was taking a lot of time, but rather the NServiceBus process of retrieving and publishing events. To try to get a better idea of what goes on, I recreated the Pub/Sub sample (http://particular.net/articles/nservicebus-step-by-step-publish-subscribe-communication-code-first).
On my laptop, using all the NServiceBus defaults, the maximum throughput of the Ordering.Server is about 10 events/second. The only thing it does is
class PlaceOrderHandler : IHandleMessages<PlaceOrder>
{
    public IBus Bus { get; set; }

    public void Handle(PlaceOrder message)
    {
        Bus.Publish<OrderPlaced>(
            e => { e.Id = message.Id; e.Product = message.Product; });
    }
}
I then started to play around with configuration settings. None seemed to have any impact on this (very low) performance until I switched the subscription storage to MSMQ:
Configure.With()
    .DefaultBuilder()
    .UseTransport<Msmq>()
    .MsmqSubscriptionStorage();
With this configuration, the throughput instantly went up to 60 messages/sec.
I have two questions:
When using MSMQ as the subscription storage, the performance is much better than with RavenDB. Why does something as trivial as the storage for subscription data have such an impact?
I would have expected a much higher performance. Are there any other configuration settings that I should use to get at least one order of magnitude better than this? On our servers, the maximum throughput when running this sample is about 200 msg/s. This is far from spectacular for a system that doesn't even do anything useful yet.
MSMQ doesn't have native pub/sub capabilities, so NServiceBus adds support for this by storing the list of subscribers and then looping over that list, sending a copy of the event to each subscriber. This translates to X message queuing operations, where X is the number of subscribers. This also explains why RabbitMQ is faster, since it has native pub/sub and you only need one operation against the broker.
The reason the MSMQ-queue-based subscription storage is faster is that it's local storage (it can't be used if you need to scale out the endpoint), which means we can cache the data since no other endpoint instance can update the storage. In short, we get away with an in-memory lookup, which, as you can see, is the fastest option.
There are plans to add native caching across all storages:
https://github.com/Particular/NServiceBus/issues/1320
200 msg/s sounds quite low; what number do you get if you skip the Bus.Publish? (Just to get a baseline.)
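One cheap way to get that baseline is to count handled messages and log the rate, first with the Bus.Publish commented out and then with it back in. A throwaway sketch using only plain .NET (no NServiceBus-specific API):

using System;
using System.Diagnostics;
using System.Threading;

public static class ThroughputCounter
{
    private static long handled;
    private static readonly Stopwatch clock = Stopwatch.StartNew();

    // Prints the average rate once a second.
    private static readonly Timer reporter = new Timer(_ =>
    {
        var count = Interlocked.Read(ref handled);
        Console.WriteLine("{0:F1} msg/s", count / clock.Elapsed.TotalSeconds);
    }, null, 1000, 1000);

    // Call this at the end of Handle(); compare runs with and without the publish.
    public static void MessageHandled()
    {
        Interlocked.Increment(ref handled);
    }
}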
Possibility 1: distributed transactions
Distributed transactions are created when processing messages because of the queue + database combination.
Try measuring without transactional handling of the messages. How does that compare?
Possibility 2: MSMQ might not be the best queueing system for your needs
Have you ever considered switching to RabbitMQ for transport? I have very good experiences with RabbitMQ in combination with MassTransit; they far exceed the numbers you mention in your question.

NHibernate IsUpdateNecessary takes enormous amount of time

My C# 3.5 application uses SQL Server 2008 R2, NHibernate and Castle ActiveRecord. The application imports emails into a database along with their attachments. Emails and attachments are saved in batches of 50, each in a new session and transaction scope, to make sure they are not all kept in memory (there can be 100K emails in some mailboxes).
Initially emails are saved very quickly. However, closer to 20K emails performance degrades dramatically. Using dotTrace I got the following picture:
Obviously, when I save an attachment, NHibernate tries to determine whether it really needs to save it, probably by comparing it with other attachments in the session. To do so, it compares them byte by byte, which takes almost 500 seconds (for the snapshot in the picture) and 600M enumerator operations.
All this looks crazy, especially since I know for sure that SaveAndFlush should simply save the attachment without any checks: I know for sure that it is new and should be saved.
However, I cannot figure out how to instruct NHibernate to avoid this check (IsUpdateNecessary). Please advise.
P.S. I am not sure, but it may turn out that the performance degradation closer to 20K is not connected with having older mails in memory: I noticed that in the mailbox I am working with, larger emails are stored later than smaller ones, so the problem may simply be the attachment comparison.
Update:
Looks like I need something like StatelessSessionScope, but there is no documentation on it, even on the Castle Project site! If I do something like
using (TransactionScope txScope = new TransactionScope())
using (StatelessSessionScope scope = new StatelessSessionScope())
{
    mail.Save();
}
it fails with an exception saying that Save is not supported by stateless sessions. I am supposed to insert objects into the session, but I do not have a Session (only the scope, which adds to SessionScope just a single OpenSession method that accepts strange parameters).
Maybe I missed it in that long text, but are you using a stateless session for importing the data? Using one prevents a lot of checks and also bypasses the first-level cache, thus using minimal resources.
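If you can get hold of the underlying ISessionFactory rather than going through the ActiveRecord scopes, a plain NHibernate stateless session does the inserts without dirty checking or a first-level cache. Roughly like this; sessionFactory, mailBatch and the Attachments collection are placeholders from your domain, not a drop-in snippet:

// Plain NHibernate stateless-session insert: no first-level cache and no dirty
// checking, so no per-byte attachment comparison.
using (IStatelessSession session = sessionFactory.OpenStatelessSession())
using (ITransaction tx = session.BeginTransaction())
{
    foreach (var mail in mailBatch)
    {
        session.Insert(mail);               // plain INSERT, the entity is not tracked
        foreach (var attachment in mail.Attachments)
            session.Insert(attachment);     // stateless sessions don't cascade, so save children explicitly
    }
    tx.Commit();
}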
Looks like I've found an easy solution: for my Attachment class, which caused the biggest performance penalty, I overrode the following method:
protected override int[] FindDirty(
    object id,
    System.Collections.IDictionary previousState,
    System.Collections.IDictionary currentState,
    NHibernate.Type.IType[] types)
{
    // Report no dirty properties, so NHibernate skips its own comparison.
    return new int[0];
}
Thus the dirty check is short-circuited (no properties are ever reported dirty for attachments) and that crazy per-byte comparison never runs.

What's a good way to handle "async" commits?

I have a WCF service that uses ODP.NET to read data from an Oracle database. The service also writes to the database, but indirectly, as all updates and inserts are achieved through an older layer of business logic that I access via COM+, which I wrap in a TransactionScope. The older layer connects to Oracle via ODBC, not ODP.NET.
The problem I have is that because Oracle uses a two-phase commit, and because the older business layer uses ODBC rather than ODP.NET, TransactionScope.Commit() sometimes returns before the data is actually available for reads from the service layer.
I see a similar post about a Java user having trouble like this as well on Stack Overflow.
A representative from Oracle posted that there isn't much I can do about this problem:
This may be due to the way the OLETx ITransaction::Commit() method behaves. After phase 1 of the 2PC (i.e. the prepare phase), if all is successful, commit can return even if the resource managers haven't actually committed. After all, a successful "prepare" is a guarantee that the resource managers cannot arbitrarily abort after this point. Thus, even though a resource manager couldn't commit because it didn't receive a "commit" notification from the MSDTC (due to, say, a communication failure), the component's commit request returns successfully. If you select rows from the table(s) immediately, you may sometimes see the actual commit occur in the database after you have already executed your select. Your select will therefore not see the new rows, due to consistent read semantics. There is nothing we can do about this in Oracle, as the "commit success after successful phase 1" optimization is part of the MSDTC's implementation.
So, my question is this:
How should I go about dealing with the possible delay (the "async" in the title) in figuring out when the second phase of the 2PC actually occurs, so I can be sure that the data I inserted (indirectly) is actually available to be selected after the Commit() call returns?
How do big systems deal with the fact that the data might not be ready for reading immediately?
I assume that the whole transaction has prepared and a commit outcome has been decided by the transaction manager; therefore, eventually (barring heuristic damage) the resource managers will receive their commit message and complete. However, there are no guarantees as to how long that might take: it could be days, no timeouts apply, and having voted "commit" in the prepare phase the resource manager must wait to hear the collective outcome.
Under these conditions, the simplest approach is to take an "understood, we're working on it" stance. The request has been understood, but you don't actually know the outcome yet, and that's what you tell the user. Yes, in all sane circumstances the request will complete, but under some conditions operators could actually choose to intervene in the transaction manually (and maybe cause heuristic damage in doing so).
To go one step further, you could start a new transaction and perform some queries to see if the data is there. If you are populating a result screen you will naturally be doing such a query anyway. The question is what to do if the expected results are not there. So again, tell the user "your recent request is being processed, hit refresh to see if it's complete", or retry automatically (I don't much like auto-retry; I prefer to educate the user that it's effectively an async operation).
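If you do go down the polling route, keep it bounded so the UI can fall back to the "still processing" message. A small sketch; the rowsVisible delegate is a placeholder for whatever read confirms your inserted rows are visible:

using System;
using System.Threading;

public static class CommitVisibility
{
    // Bounded poll after Commit(): retry the read a few times with a short delay,
    // then give up and let the caller show "your request is still being processed".
    public static bool WaitFor(Func<bool> rowsVisible, int maxAttempts, int delayMilliseconds)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            if (rowsVisible())
                return true;                 // data is readable, safe to continue
            Thread.Sleep(delayMilliseconds); // brief pause before the next check
        }
        return false;                        // still not visible
    }
}

For example, CommitVisibility.WaitFor(() => CountNewOrders(orderId) > 0, 5, 200) before rendering the confirmation screen, where CountNewOrders stands in for whatever query your service layer already has.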
