I've created two services.
One of them (scheduler) only requests to the other (backoffice) for performing some "large" operations.
When backoffice receives a request:
first creates a mark (key on redis) in order to set that the process has started.
Each time a request is reached:
backoffice checks if the mark exist.
When it exists means that the previous process has not yet finished, and escape it.
Perform the large process.
When process is finished, the previous key in redis is removed.
It would be something like this:
if (key exists)
return;
make long process... (1);
remove key;
The problem arises when service is destroyed when the process has not already finished and then it doesn't removes the mark on redis. It means the process will never run again.
Is there any way to solve this kind of problems?
The way to solve this problem is use an existing engine as building custom scalable and robust solution for reliable service orchestration is really hard.
I recommend looking at Uber Cadence Workflow which would allow to convert your pseudocode into a real production application with minor changes.
You can fire a background job that updates timestamp under the key, e.g. every minute.
When service attempts to start the process it must verify key existence (as it does now) + timestamp under the key. If it is more than 1 minute ago then the previous attempt is stale and you can start over.
Sounds like you should be using a messaging queue to schedule tasks for the back office service. Queuing solutions like RabbitMQ allow you to manually acknowledge (or “ack”) that the process is complete. Whenever a subscriber crashes, the queue detects that the connection dropped without acknowledgement and will re-enqueue the same task which will be picked up by the next available subscriber. Here’s another thread talking about this problem specifically focused on messaging queues:
What happens to fetched messages when RabbitMQ consumer crashes?
Related
I have a simple integration flow that poll data based on a cron job from database, publish on a DirectChannel, then do split and transformations, and publish on another executor service channel, do some operations and finally publish to an output channel, its written using dsl style.
Also, I have an endpoint where I might receive an http request to trigger this flow, at this point I send the messages one of the mentioned channels to trigger the flow.
I want to make sure that the manual trigger doesn’t happen if the flow is already running due to either the cron job or another request.
I have used the isRunning method of the StandardIntegrationFlow, but it seems that it’s not thread safe.
I also tried using .wireTap(myService) and .handle(myService) where this service has an atomicBoolean flag but it got set per every message, which is not a solution.
I want to know if the flow is running without much intervention from my side, and if this is not supported how can I apply the atomic boolean logic on the overall flow and not on every message.
How can I simulate the racing condition in a test in order to make sure my implementation prevent this?
The IntegrationFlow is just a logical container for configuration phase. It does have those lifecycle methods, but only for an internal framework logic. Even if they are there, they don't help because endpoints are always running if you want to do them something by some event or input message.
It is hard to control all of that since it is in an async state as you explain. Even if we can stop a SourcePollingChannelAdapter in the beginning of that flow to let your manual call do do something, it doesn't mean that messages in other threads are not in process any more. The AtomicBoolean cannot help here for the same reason: even if you set it to true in the MessageSourceMutator.beforeReceive() and reset back to false in its afterReceive() when message is null, it still doesn't mean that messages you pushed down in other thread are already processed.
You might consider to use an aggregator for AtomicBoolean resetting in the end of batch since you mention that you pull data from DB, so perhaps there is a number of records per poll you can track downstream. This way your manual call could be skipped until aggregator collects results for that batch.
You also need to think about stopping a SourcePollingChannelAdapter at the moment when manual action is permitted, so there won't be any further race conditions with the cron.
I want to inform the seller, that the buyer is coming soon (about 2 hours before pickup time) via mail.
I would normally do it the hard way with CRON and a database table. Checking hourly if I find an order with pickup time minus 2 hours, only then sending the mail out.
Now, I would like to know if you would recommend using Queueing Jobs for sending Mails out.
With
$when = now()->addDays(10); //I would dynamically set the date
Mail::to($order->seller())
->later($when, new BuyerIsComing($order));
I can delay the delivery of a queued email message.
But how safe would this be? Especially, if someone is ordering something but is picking it up in let us exaggerate two months?
Is the Laravel queueing system rigid enough to behave correctly after long delays (i.e. 2 months)?
Edit
I'm using Redis for Queueing
You actually have nothing to worry about. Sending mail usually increases the response time of your application, so it's a good thing you want to delay the sending.
Queues are the way to go and it's pretty easy to setup in Laravel. Laravel supports a couple of them out of the box. I would advise you start with database and then try beanstalk etc.
Lastly and somehow more importantly, use a process manager like Supervisor to monitor and maintain your queue workers...
Take a look at https://laravel.com/docs/5.7/queues for more insight.
Cheers.
If by safe, you mean reliable, then it would be little different than sending an email immediately. If there's ever a possibility that your server "hiccups" and doesn't send an email, that possibility would be the same now as 10 minutes from now. Once the job is in the queue, it is persisted until completion (unless you use a memory-based driver, like Redis, which could get reset if the server reboots).
If you are using a database queue driver or remote, the log of queued jobs will remain even if the server is unavailable for a short period of time. Your queue will be honored even if the exact time stamp for when you want to send the job has expired. For instance if you schedule to send an email at 1:00pm but your server is down at that exact moment, when it comes back online it will still see the job because it is stored as incomplete and the time for the job is in the past, which will trigger the execution of the job at the next time your queue worker checks the job list.
Of course, this assumes that you have your queue worker set up to always check jobs and automatically restart, even after a server failure, but that's a different discussion with lots of solutions...such as those shown here.
If you're using database driver with Laravel queues to process your email then you don't need to worry about anything.
Jobs are only removed from Jobs table if they are successfully completed otherwise their next attempt time is set which is few minutes in future and they are executed again (if your queue worker is online).
So its completely safe to use Laravel queues
I has some strange behaviour on production deployment for azure queue messages:
Some of the messages in queues appears with big delay - minutes, and sometimes 10 minutes.
Befere you ask about setting delayTimeout when we put message to queue - we do not set delayTimeout for that message, so message should appear almost immedeatly after it was placed in queue.
At that moments we do not have a big load. So my instances has no work load, and able to process message fast, but they just don't appear.
Our service process millions of messages per month, we able to identify that 10-50 messages processed with very big delay, by that we fail SLA in front of our customers.
Does anyone have any idea what can be reason?
How to overcome?
Did anyone faced similar issues?
Some general ideas for troubleshooting:
Are you certain that the message was queued up for processing - ie the queue.addmessage operation returned successfully and then you are waiting 10 minutes - meaning you can rule out any client side retry policies etc as being the cause of the problem.
Is there any chance that the time calculation could be subject to some kind of clock skew problems. eg - if one of the worker roles pulling messages has its close out of sync with the other worker roles you could see this.
Is it possible that in the situations where the message is appearing to be delayed that a worker role responsible for pulling the messages is actually failing or crashing. If the client calls GetMessage but does not respond with an appropriate acknowledgement within the time specified by the invisibilityTimeout setting then the message will become visible again as the Queue Service assumes the client did not process the message. You could tell if this was a contributing factor by looking at the dequeue count on these messages that are taking longer. More information can be found here: http://msdn.microsoft.com/en-us/library/dd179474.aspx.
Is it possible that the number of workers you have pulling items from the queue is insufficient at certain times of the day and the delays are simply caused by the queue being populated faster than you can pull messages from the queue.
Have you enabled logging for queues and then looked to see if you can find the specific operations (look at e2elatency and serverlatency).
http://blogs.msdn.com/b/windowsazurestorage/archive/tags/analytics+2d00+logging+_2600_amp_3b00_+metrics/. You should also enable client logging and try to determine if the client is having connectivity problems and the retry logic is possibly kicking in.
And finally if none of these appear to help can you please send me the server logs (and ideally the client side logs as well) along with your account information (no passwords) to JAHOGG at Microsoft dot com.
Jason
Azure Service bus has a property in the BrokeredMessage class called ScheduledEnqueueTimeUtc, it allows you to set a time for when the message is added to the queue (effectively creating a delay).
Are you sure that in your code your not setting this property, and this might be the cause for the delay?
You can find more info on this at this url: https://www.amido.com/azure-service-bus-how-to-delay-a-message-being-sent-to-the-queue/
If you are using WebJobs to process messages from the queue, it can be due to WebJobs configuration.
From an MSDN forum post by pranav rastogi:
Starting with 0.4.0-beta, the (WebJobs) SDK implements a random exponential back-off algorithm. As a result of this if there are no messages on the queue, the SDK will back off and start polling less frequently.
The following setting allows you to configure this behavior.
MaxPollingInterval for when a queue remains empty, the longest period of time to wait before checking for a message to. Default is 10min.
static void Main()
{
JobHostConfiguration config = new JobHostConfiguration();
config.Queues.MaxPollingInterval = TimeSpan.FromMinutes(1);
JobHost host = new JobHost(config);
host.RunAndBlock();
}
I have a Mass Transit Service Bus that is listening to several queues and processing the messages. I would like to somehow pause the processing of new requests and wait for the current requests to complete so that I can run some housekeeping tasks.
A few of my own thoughts:
I have investigated the service bus BeforeConsumingMessage handler and although this would allow me to check for a 'Pause processing' flag in my database, I unsure how I would then actually pause the processing!
We are using RabbitMQ - could I use this to put the queues in a suspend state?
I have found so little on this subject that I wonder if it is an 'anti-pattern' and I should just stop my Mass Transit services if I want to run some housekeeping jobs and trust in any partially complete sagas to be picked up when the service bus starts back up. (Rather not go for this option, though).
So my question is: Is there a way to instruct the service bus to finish processing the current sagas but do not take any more messages from the queue?
Cleanly shutting down a MT service will wait for any messages in process to completely finished. Why sometimes it takes a little while for a service to shutdown. Shutting down the service is the best way to handle this, you are sure MT is not pulling any new messages.
If your sagas are serialized to a backing store, e.g. NHibernate, then the state will be saved until the service is restarted and the sagas will pick up in the state they were left after the last message was processed. You should be in good shape. We do this all the time for any maintenance periods.
If you REALLY must leave the service running, call Dispose on the IServiceBus instance. This will do the same thing, letting the current consumers finish then releasing all your resources. Once you have done maintenance you can create a new IServiceBus instance as needed.
I have the problem that I have to run very long running processes on my Webservice and now I'm looking for a good way to handle the result. The scenario : A user executes such a long running process via UI. Now he gets the message that his request was accepted and that he should return some time later. So there's no need to display him the status of his request or something like this. I'm just looking for a way to handle the result of the long running process properly. Since the processes are external programms, my application server is not aware of them. Therefore I have to wait for these programms to terminate. Of course I don't want to use EJBs for this because then they would block for the time no result is available. Instead I thought of using JMS or Spring Batch. Does anyone ever had the same problem or an advice which solution would be better?
It really depends on what forms of communication your external programs have available. JMS is a very good approach and immediately available in your app server but might not be the best option if your external program is a long running DB query which dumps the result in a text file...
The main advantage of Spring Batch over "just" using JMS as an aynchronous communcations channel is the transactional properties, allowing the infrastructure to retry failed jobs, group jobs together and such. Without knowing more about your specific setup, it is hard to give detailed advise.
Cheers,
I had a similar design requirement, users were sending XML files and I had to generate documents from them. Using JMS in this case is advantageous since you can always add new instances of these processes which can consume and execute the jobs in parallel.
You can use a timer task to check status or monitor these processes. Also, you can publish a message to a JMS queue once the processes are completed.