What happens to Celery's events when my receiver is down?
According to the documentation (https://docs.celeryproject.org/en/latest/userguide/monitoring.html#real-time-processing), I need to run a separate process that listens for Celery events and processes them.
But if I have to shut down the receiver process for maintenance or some other purpose, are all events lost forever?
Can I persist these events?
Short answer: it depends on your broker choice.
Long answer: the three most popular brokers with Celery are RabbitMQ, Redis, and SQS. Each one offers some degree of persistence. RabbitMQ and SQS are message queueing services that offer "guaranteed delivery" of messages. In practice, with RabbitMQ acknowledgements and with standard SQS queues, that means at-least-once rather than exactly-once delivery, so consumers should tolerate an occasional duplicate. The default Redis configuration keeps messages in RAM and saves them to disk after a maximum of fifteen minutes, so if Redis goes down within that fifteen-minute window, you will lose messages / tasks.
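If the aim is simply to keep a durable record of the events the receiver does see, one option is to have the receiver write each event to disk as it arrives. Below is a minimal sketch, not a definitive implementation: `persist_event`, `run_receiver`, the `events.jsonl` path and the broker URL argument are all illustrative names I am assuming, and only the `Receiver`/`capture` calls follow the real-time processing example in the Celery documentation. Note that this does not recover events emitted while the receiver is down; whether those survive depends on the broker, as described above.

```python
import json


def persist_event(event, path="events.jsonl"):
    """Append one Celery event (a dict) to a JSON-lines file."""
    with open(path, "a", encoding="utf-8") as fh:
        # default=str is a fallback so non-JSON values do not abort the write.
        fh.write(json.dumps(event, default=str) + "\n")


def run_receiver(broker_url, path="events.jsonl"):
    """Capture every event and persist it (needs Celery and a reachable broker)."""
    from celery import Celery  # imported lazily so persist_event works on its own

    app = Celery(broker=broker_url)
    with app.connection() as connection:
        receiver = app.events.Receiver(
            connection,
            handlers={"*": lambda event: persist_event(event, path)},
        )
        # Blocks and dispatches each incoming event to the handler above.
        receiver.capture(limit=None, timeout=None, wakeup=True)
```

The file can then be replayed or loaded into a database by a separate job, so the receiver itself stays small and quick to restart.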
Related
I have multiple instances of a worker connected to a queue, and all requests are distributed across the worker instances in a load-balanced way. When a new worker instance connects to the queue, I need to send a small amount of data from the main app to this new instance (a one-time job).
Currently I'm using a REST endpoint on the main app to do this at application start-up, but can we leverage the message queue for this? Once a new worker instance connects to the queue, it would request the initial data dump from the main app through the queue, and the app would reply with the initial data.
Is this possible using a message queue/topic? Kindly share your views/suggestions on how to achieve this using ActiveMQ.
If you're using ActiveMQ Artemis this kind of requirement is typically fulfilled with a queue that supports both non-destructive and last-value semantics. The last-value semantics allows the queue to stay up-to-date with the latest messages and the non-destructive semantics means that even when consumers acknowledge the messages they will remain on the queue for the next client which connects. When using this combination clients can first consume all the messages from this special "initialization" queue and then continue on with whatever other messaging work they need to do.
Unfortunately, ActiveMQ "Classic" doesn't support either of these semantics, and there is no straightforward way to get equivalent behavior.
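To make the two semantics concrete, here is a small in-memory model in Python. It is only an illustration of the behavior described above, not the Artemis API or its configuration; the class name `LastValueQueue` and the message keys are made up for the example.

```python
class LastValueQueue:
    """Toy model: keeps only the newest message per key, and reads never remove."""

    def __init__(self):
        self._latest = {}  # key -> most recent message body

    def send(self, key, body):
        # Last-value: a newer message with the same key replaces the older one.
        self._latest[key] = body

    def browse(self):
        # Non-destructive: every reader gets all current messages,
        # and the queue is left untouched for the next reader.
        return list(self._latest.items())


init_queue = LastValueQueue()
init_queue.send("config", "v1")
init_queue.send("rates", "r1")
init_queue.send("config", "v2")    # replaces "v1"

first_worker = init_queue.browse()
second_worker = init_queue.browse()  # a later worker sees the same data
print(first_worker == second_worker)
```

A newly connected worker would read this "initialization" queue once and then move on to the normal load-balanced queue.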
I worked a little with the ActiveMQ scheduler plugin. This simplifies scheduling messages for delivery with a delay at low volume, but as I get into the 100ks of messages the system breaks down in two key ways.
It's very slow (compared to queues) to enqueue messages in the scheduler.
Attempting to view the schedules in the dashboard crashes the ActiveMQ instance.
The existing scheduler feels a little bolted on and does not perform as expected. So, rethinking the problem, I would like to have a jobs queue and a jobs-scheduled queue. Messages sent to the jobs-scheduled queue will have a ttl header holding the Unix timestamp at which they should be delivered. A process run from a cron job will take messages from the jobs-scheduled queue and send them to the jobs queue, using a selector to pick out only the messages with an elapsed ttl: convert_string_expressions:ttl < %(now)s.
My two questions are:
Will this strategy work for delaying messages at scale or will I find scaling pains around the selector? These messages will be persisted if that makes a difference.
Is there an existing feature in ActiveMQ that will allow me to send messages from one queue to another with a selector query?
ActiveMQ is a message broker, not a job scheduler, so what you are trying to do is really outside the scope of what the broker is intended to do. Yes, ActiveMQ does have a scheduled message feature, but it is not intended for large-scale job-queue work; it is a simple feature that provides some minimal delayed delivery.
What you are looking for sounds more like Quartz or some other batch job scheduling library. You could develop your own job scheduler implementation for ActiveMQ or do something in a plugin, but you would be going against the grain of what a broker is meant to do, which is deliver messages as quickly as possible in a decoupled manner.
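For a sense of what a dedicated scheduler does differently from a selector scan, here is a minimal in-memory sketch in Python. It is an assumption-laden illustration, not Quartz and not an ActiveMQ feature: `DelayedJobs`, `schedule` and `pop_due` are hypothetical names, and persistence is left out entirely. The point is that keeping jobs ordered by due time lets the mover look only at the head instead of evaluating a selector against every stored message.

```python
import heapq
import itertools


class DelayedJobs:
    """In-memory stand-in for the jobs-scheduled queue, ordered by due time."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker so bodies are never compared

    def schedule(self, due_at, body):
        # due_at is a Unix timestamp, like the ttl header in the question.
        heapq.heappush(self._heap, (due_at, next(self._seq), body))

    def pop_due(self, now):
        """Remove and return every job whose due time has been reached, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

The periodic process would call `pop_due(time.time())` and publish each returned body to the jobs queue.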
Side note-- potentially off-topic.
I've had to solve a similar situation in the past where it made a lot of sense to load up the queues with messages ahead of time to cut down on the total transfer time.
I solved it by using Camel routes and a side-channel activation. Camel allows you to programmatically start and stop routes, so you can load up a queue that has no consumers with the data for a given time period. Then, using a dedicated queue for control, you send the 'start' message. The control route receives the 'start' message and then activates the main data processing route. You then need to configure some sort of 'stop' message semantic to be ready for the next time period's run.
Effectively, you get the delayed behavior pattern with much more control over scheduling and cut down on the data-to-queue loading time problem. You can also solve the scaling problem by loading the data across more than one queue.
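The control-channel idea can be modeled in a few lines of Python. This is a sketch of the pattern only, not Camel code; `GatedRoute` and its method names are invented for the illustration, and a `deque` stands in for the preloaded data queue.

```python
from collections import deque


class GatedRoute:
    """Toy model of a data route that only consumes while activated."""

    def __init__(self, handler):
        self.backlog = deque()  # stands in for the preloaded data queue
        self.active = False
        self.handler = handler

    def enqueue(self, message):
        self.backlog.append(message)
        if self.active:
            self._drain()

    def control(self, command):
        # 'start' and 'stop' arrive on a separate, dedicated control queue.
        if command == "start":
            self.active = True
            self._drain()
        elif command == "stop":
            self.active = False

    def _drain(self):
        while self.active and self.backlog:
            self.handler(self.backlog.popleft())
```

Messages enqueued before 'start' simply wait; after 'stop', new messages accumulate again until the next 'start'.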
How can I get all messages from a JMS topic in Tibco?
I know that I can use a topic subscriber, but it wouldn't fit exactly my needs. I want to start a process only once a day that will read all messages from a topic and process them. I cannot have both a timer and a topic subscriber in the same process.
I tried with "Wait for JMS Topic Message", but it seems that it gets only one message, no matter how many I have in the topic.
I would try going a different direction. You could implement this using 2 separate processes.
One process is a topic subscriber (with a durable subscription) which receives all messages. Its process starter should be disabled by default (so the listener is not active).
The second process is a timer, which will activate the first process through Hawk (Engine Command). So every time the subscriber gets activated, it will start processing events.
The problematic part here is deactivating the topic subscriber after it is done. For that you need separate logic that decides when to deactivate the subscriber. This could also be done by a separate timer or by a Hawk rule that fires when the subscriber has no more messages.
I think the best solution will be to bridge the JMS topic to a queue and use the "JMS Queue Receiver" activity at the start of your process.
Once you start the instance once a day, it will connect and process all the messages in the queue.
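The "connect, process everything that has piled up, then stop" behavior looks roughly like this when written out in Python. It is a generic sketch using the standard library `queue` module as a stand-in for the bridged JMS queue, not TIBCO BusinessWorks code; `drain` and `handle` are hypothetical names.

```python
import queue


def drain(q, handle):
    """Process every message currently in q, then return how many were handled."""
    count = 0
    while True:
        try:
            message = q.get_nowait()  # do not block once the backlog is empty
        except queue.Empty:
            return count
        handle(message)
        count += 1
```

The daily timer would call `drain` once; an empty queue ends the run instead of leaving a listener waiting.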
A much more natural solution (if it can be implemented) is to use a Topic Subscriber (or a Queue Receiver if the topic is bridged to a queue) and let the BusinessWorks engine spawn job instances whenever a message gets published.
This spreads the workload much more evenly than fetching all the messages from a topic or a queue at once.
I'm looking for a way to schedule an MDB. My requirement is that the MDB feeds one of the company's systems. This system goes down for maintenance every night, but the other systems don't know about it and may keep trying to feed it. A persistent queue is great in that my messages could pile up until the system comes back online.
How could I manage that? I've already come across "schedule a message driven bean to access a queue during certain times?", but it uses Java 7 and, worse, messages are lost if the server restarts (each message is taken out of the JMS queue and kept in memory until the timer processes it).
Another use of this would be to implement a "retry" queue. In case of error I want to retry processing my message, not immediately but only after a certain amount of time.
Any ideas to keep my MDB offline for a certain amount of time?
Most versions of JBoss publish a management MBean that allows you to stop delivery on an MDB.
If you're using EJB3, however, MDBs auto-start, so you will need to register a startup class that stops them at boot time in case a boot occurs during your MDB's blackout period. Once past that snafu, you can schedule a simple Quartz job to start and stop the MDBs according to your delivery windows.
Well, it looks like there is no way to pause an MDB in a generic way. The best solution is, as most people will answer, to use the DLQ (or DMQ).
Now, if I want to introduce a timer on a message, I set the producer's time to live to the amount of time I want the message to wait. Then I send it to a normal queue, let's say waitingQueue, which has no consumer. After expiration, the message is sent to the default destination (mq.sys.dmq for Glassfish MQ; make sure to create a JMS resource with mq.sys.dmq as imqDestinationName). I have an MDB listening to the error queue and responsible for sending the message again. Now, if I want to "close" a queue for some time, I check whether the current time is allowed when a message arrives in the queue. If it is not, I just set the time to live to the amount of time before the next opening hours and send the message to waitingQueue.
The reason I didn't use it from the beginning is that I fell into a few pitfalls. Here are a few useful properties to set when using the DMQ with Glassfish 3.1.1 and its embedded MQ.
imq.message.expiration.interval=1: this is the poll interval on each queue before timed-out messages are sent to the DMQ. The default is 60 seconds. If, like me, you want to test your application with little latency, this is what you need.
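The "time to live until the next opening hours" arithmetic can be sketched as follows. This is an illustrative Python helper, not part of Glassfish or JMS: `ttl_until_open` is a hypothetical name, and it assumes a single daily window where the opening time is earlier than the closing time on the same day. The result is in milliseconds, the unit JMS uses for time to live.

```python
from datetime import datetime, time, timedelta


def ttl_until_open(now, opens, closes):
    """Milliseconds until the next opening, or None if `now` is inside [opens, closes)."""
    if opens <= now.time() < closes:
        # Queue is open: process the message now. None is returned rather
        # than 0 because a JMS time to live of 0 means "never expires".
        return None
    next_open = datetime.combine(now.date(), opens)
    if now.time() >= closes:
        next_open += timedelta(days=1)  # already past closing: wait for tomorrow
    return int((next_open - now).total_seconds() * 1000)


# Example: system open 06:00 to 23:00, message arrives at 23:30.
print(ttl_until_open(datetime(2024, 1, 1, 23, 30), time(6, 0), time(23, 0)))
```

The returned value would be set as the producer's time to live before sending to waitingQueue.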
I have an online service that receives incoming events (a few every second). The service needs to process a job when there have been no events for 30 seconds or more. The service is distributed across several PCs and uses Amazon Web Services (SQS and SimpleDB) as a backbone.
I understand how I can schedule a job when there IS an incoming event (just put a message into a message queue and you are done), but how can I schedule a job when the condition is "NO EVENTS FOR X SECONDS"?
Ideally I would want a message queue that does not allow duplicate messages, allows scheduling for the future and allows adjusting "delivery date" on each message.
Is there such a message queue implementation?
Can this problem be solved at all without persisting some data in a database?
Thank you
Either BizTalk or SQL Server Service Broker fits your requirements. If they are too heavyweight, you could write a simple service that peeks at the queue every couple of seconds and times out if it does not see anything for 30 seconds. That would be more difficult to scale horizontally across machines, however.
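The simple polling service could be built around a small idle detector like the one below. This is a single-process sketch under my own assumptions: `IdleDetector` is a hypothetical name, timestamps are plain numbers supplied by the caller (for example from `time.monotonic()`), and nothing here addresses the cross-machine coordination mentioned above.

```python
class IdleDetector:
    """Reports once when no event has been seen for idle_seconds."""

    def __init__(self, idle_seconds=30):
        self.idle_seconds = idle_seconds
        self.last_event = None  # timestamp of the most recent event
        self.fired = False      # True once the current quiet period was reported

    def on_event(self, now):
        # Any new event restarts the quiet-period clock.
        self.last_event = now
        self.fired = False

    def poll(self, now):
        """Return True exactly once per quiet period of at least idle_seconds."""
        if self.last_event is None or self.fired:
            return False
        if now - self.last_event >= self.idle_seconds:
            self.fired = True
            return True
        return False
```

The service calls `on_event` for each message it sees and `poll` every couple of seconds; a True result is the trigger to enqueue the job.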