Resque and a multi-server architecture - resque

I haven't yet actually used Resque. I have the following questions and assumptions that I'd like verified:
1) I understand that you can have a multi-server architecture by configuring each of your resque instances to point to a central redis server. Correct?
2) Does this mean that any resque instance can add items to a queue and any workers can work on any of those queues?
3) Can multiple workers respond to the same item in a queue? I.e. one server puts "item 2 has been updated" in a queue, can workers 1, 2, and 3, on different servers, all act on that? Or would I need to create separate queues? I kind of want pub/sub-type tasks.
4) Does the Sinatra monitor app live on each instance of Resque? Or is there just one app that knows about all the queues and workers?
5) Does Resque know when a task is completed? I.e. does the monitor app show that a task is in process? Or just that a worker took it?
6) Does the monitor app show completed tasks? I.e. if a task completes quickly, will I be able to see that at some point in the recent past that task was completed?
7) Can I programmatically query whether a task has been started, is in progress, or is completed?

Since I'm using Resque extensively in our project, here are answers to your questions:
I understand that you can have a multi-server architecture by configuring each of your resque instances to point to a central redis server. Correct?
Ans: Yes, you can have multiple servers pointing to a single central Redis server. I am running a similar architecture.
Does this mean that any resque instance can add items to a queue and any workers can work on any of those queues?
Ans: This depends on how you configure your servers: you create queues and then assign workers to them.
You can have multiple queues, and each queue can have multiple workers working on it.
Can multiple workers respond to the same item in a queue? I.e. one server puts "item 2 has been updated" in a queue, can workers 1, 2, and 3, on different servers, all act on that? Or would I need to create separate queues? I kind of want pub/sub-type tasks.
Ans: This again depends on your requirements: having a single queue that all workers work on is valid, or you can have a separate queue for each server.
Any server can put jobs in any queue, but each job is picked up by only one of the workers assigned to that queue; Resque is a work queue, not a pub/sub system, so for fan-out you would enqueue one copy of the job per queue.
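To make the routing rules above concrete (any producer can push to any queue, but a job is popped by exactly one of the workers polling that queue), here is a minimal pure-Ruby sketch in which an in-memory hash stands in for Redis; the job classes and queue names are made up for illustration:

```ruby
require "json"

# In-memory stand-in for Redis: one array ("list") per queue name.
STORE = Hash.new { |h, k| h[k] = [] }

# Any producer (any app server) can push a job onto any queue.
def enqueue(queue, job_class, *args)
  STORE[queue] << JSON.generate("class" => job_class, "args" => args)
end

# A worker polls only the queues it was assigned; each pop removes
# the job, so exactly one worker ever processes a given job.
def reserve(queues)
  queues.each do |q|
    payload = STORE[q].shift
    return [q, JSON.parse(payload)] if payload
  end
  nil
end

enqueue(:images, "ResizeJob", 42)
enqueue(:mail,   "WelcomeMailJob", "user@example.com")

# A worker assigned to both queues picks up the first available job.
queue, job = reserve([:images, :mail])
# queue == :images, job["class"] == "ResizeJob"
```

Real Resque works the same way: Resque.enqueue pushes a JSON payload onto a Redis list, and a worker started with QUEUE=images,mail rake resque:work pops from those lists.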
Does the Sinatra monitor app live on each instance of Resque? Or is there just one app that knows about all the queues and workers?
Ans: The Sinatra monitor app (resque-web) is a single app that connects to the same central Redis server, so one instance gives you an interface to all workers/queues-related info: running jobs, waiting jobs, queues, failed jobs, etc.
Does Resque know when a task is completed? I.e. does the monitor app show that a task is in process? Or just that a worker took it?
Ans: It does; Resque keeps job state in Redis, and the monitor app shows which jobs are currently being processed (and by which worker), not just that a worker took them.
Does the monitor app show completed tasks? I.e. if a task completes quickly, will I be able to see that at some point in the recent past that task was completed?
Ans: The monitor app shows a running count of processed jobs (and a list of failed ones) rather than a per-task history. You can also query state programmatically: for example, Resque.working lists the workers currently processing jobs; you can check their code base and utilise anything similar.
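Resque core exposes aggregate stats (Resque.info, Resque.working) rather than a per-job lifecycle, so answering "queued / working / completed?" for a specific task usually means recording the status yourself under a job ID, which is what the resque-status gem does. A minimal pure-Ruby sketch of that idea, with a hash standing in for the Redis status keys:

```ruby
require "securerandom"

# Stand-in for Redis keys like "status:<uuid>" that resque-status uses.
STATUS = {}

# Enqueue-time: mint a UUID and record the initial state.
def create_job
  uuid = SecureRandom.uuid
  STATUS[uuid] = :queued
  uuid
end

# Worker-side: update the state as the job progresses.
def run_job(uuid)
  STATUS[uuid] = :working
  # ... do the actual work here ...
  STATUS[uuid] = :completed
end

uuid = create_job
STATUS[uuid]   # => :queued
run_job(uuid)
STATUS[uuid]   # => :completed
```

The client that enqueued the job holds on to the UUID and polls the status key to learn whether the task has started, is in progress, or is done.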
Resque is a very powerful library; we have been using it for more than a year now.
Cheers, happy coding!

Related

How to Use Heroku Background Workers with NestJS and Bull?

What is the recommended way of providing Heroku workers for heavy processes that I want running on my queue using NestJS?
I have an HTTP server running on Heroku that executes certain time-consuming tasks (e.g. communicating with certain third-party APIs) that I want to be put in a Queue and have delegated to background workers.
Reading this example, it seems that I would create a processor file, instantiate the Queue object there, and then define its process function. That seems to allow scaling up, because each process would have the Queue object and define its process therein. Spinning up more dynos would provide more workers.
Looking over here, I see that I can declare the process file when I register the queue. There I do not need to instantiate the Queue object and define its process; I can simply export a default function. Can I declare a worker process in my Procfile that points to one of these process files and scale them up? Will that work? Or am I missing something here?
Right now, I don't have separate processes set up. I defined the Processors and the Processes using the given decorators within Nest's IoC container. I thought things would queue up nicely. What I've seen is that jobs come in fast and my server can't keep up with all the requests and jobs.

Understanding the MajorDomo Pattern from NetMQ ZeroMQ

I am trying to understand how to best implement the MDP example in c# to be used in a windows service in a multiple client - single server environment.
I have read the docs but I am still unclear on the following:
Should all Worker instances be created on startup and left to run?
Should the Workers all be different types of services or just different instances of the same service?
Can I have one windows service which contains the Broker and Workers, or is it best to split them out into their own services?
The example code I am using is the MajorDomo Pattern taken from here https://github.com/NetMQ/Samples
Yes, all workers in an MDP environment should be created independently of the requests, since the broker should not know how to create them.
Each worker handles a given "service" (contract). Obviously each contract should have at least one worker.
If you need parallelized handling of requests, and a given worker can only do one at a time, having extra workers for that service could make sense. Generally you would do this if multiple machines were involved however (horizontal scaling)
You can have the broker and workers in the same process. HOWEVER, if you want to update only a worker, taking down the broker at the same time can be annoying for the clients. I would recommend letting the broker be its own process, with the workers in one or more other processes.

How does Celery work?

I have recently started working on distributed computing for increasing the computation speed. I opted for Celery. However, I am not very familiar with some terms. So, I have several related questions.
From the Celery docs:
What's a Task Queue?
...
Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task the client adds a message to the queue, the broker then delivers that message to a worker.
What are clients (here)? What is a broker? Why are messages delivered through a broker? Why would Celery use a backend and queues for interprocess communication?
When I execute the Celery console by issuing the command
celery worker -A tasks --loglevel=info --concurrency 5
Does this mean that the Celery console is a worker process which is in charge of 5 different processes and keeps track of the task queue? When a new task is pushed into the task queue, does this worker assign the task/job to any of the 5 processes?
Last question first:
celery worker -A tasks --loglevel=info --concurrency 5
You are correct - the worker controls 5 processes. The worker distributes tasks among the 5 processes.
A "client" is any code that runs celery tasks asynchronously.
There are 2 different types of communication - when you run apply_async you send a task request to a broker (most commonly rabbitmq) - this is basically a set of message queues.
When the workers finish they put their results into the result backend.
The broker and results backend are quite separate and require different kinds of software to function optimally.
You can use RabbitMQ for both, but once you reach a certain rate of messages it will not work properly. The most common combination is RabbitMQ for broker and Redis for results.
We can use the analogy of assembly-line packaging in a factory to understand how Celery works:
Each product is placed on a conveyor belt.
The products are processed by machines.
At the end, all the processed products are stored in one place, one by one.
Celery working:
Note: Instead of taking each product for processing as it is placed on the conveyor belt, in Celery a queue is maintained whose output is fed to a worker for execution, one task at a time (sometimes more than one queue is maintained).
Each request (which is a task) is sent to a queue (Redis/RabbitMQ) and an acknowledgment is sent back.
Each task is assigned to a specific worker which executes the task.
Once the worker has finished the task its output is stored in the result backend (Redis).

Golang background processing

How can one do background processing/queueing in Go?
For instance, a user signs up, and you send them a confirmation email - you want to send the confirmation email in the background as it may be slow, and the mail server may be down etc etc.
In Ruby a very nice solution is DelayedJob, which queues your job to a relational database (i.e. simple and reliable), and then uses background workers to run the tasks, and retries if the job fails.
I am looking for a simple and reliable solution, not something low level if possible.
While you could just open a goroutine and do every async task you want, this is not a great solution if you want reliability, i.e. the promise that if you trigger a task it will get done.
If you really need this to be production grade, opt for a distributed work queue. I don't know of any such queues that are specific to golang, but you can work with rabbitmq, beanstalk, redis or similar queuing engines to offload such tasks from your process and add fault tolerance and queue persistence.
A simple goroutine can do the job:
http://golang.org/doc/effective_go.html#goroutines
Open a goroutine that performs the email delivery, then answer the HTTP request or whatever.
If you wish to use a work queue you can use a RabbitMQ or Beanstalk client like:
https://github.com/streadway/amqp
https://github.com/kr/beanstalk
Or maybe you can create a queue in your process with a FIFO queue running in a goroutine:
https://github.com/iNamik/go_container
But maybe the best solution is this job queue library, which lets you set the concurrency limit, etc.:
https://github.com/otium/queue
import "github.com/otium/queue"

q := queue.NewQueue(func(email string) {
    // Your mail delivery code
}, 20)
q.Push("foo@bar.com")
I have created a library for running asynchronous tasks using a message queue (currently RabbitMQ and Memcache are supported brokers but other brokers like Redis or Cassandra could easily be added).
You can take a look. It might be good enough for your use case (and it also supports chaining and workflows).
https://github.com/RichardKnop/machinery
It is an early stage project though.
You can also use goworker library to schedule jobs.
http://www.goworker.org/
If you are coming from a Ruby background and looking for something like Sidekiq, Resque, or DelayedJob, please check out the asynq library.
Its queue semantics are very similar to Sidekiq's.
https://github.com/hibiken/asynq
If you want a library with a very simple interface, yet robust, that feels Go-like, uses Redis as a backend, and RabbitMQ as a message broker, you can try
https://github.com/Joker666/cogman

How to manage several worker processes with one master process in ruby?

I need to develop a multi-process application in Ruby: one master process managing several worker processes. The master creates (forks! I don't want multithreading, but multiprocessing) a pool of workers and then periodically asks the database for new records to be processed; if there are any, it sends the records to workers via pipes. When a worker is done with its job it must notify the master, so the master can send that worker another record to process. I can't find a solution for the notifications workers send to the master; it seems it should be done asynchronously, but IO pipes work in synchronous mode. Please give me a direction to go. Any fork best practices in the comments are also welcome! Thank you.
PS. I don't want to use external solutions like EventMachine or Parallel, only forks.
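Sticking to plain forks as requested, the usual approach is to give each worker a pipe back to the master and multiplex all the read ends with IO.select, so the master never blocks on any single worker; the pipes themselves stay synchronous, but the select loop makes the notifications effectively asynchronous. A minimal sketch (the record-processing step is a placeholder):

```ruby
# Master forks 3 workers; each worker gets a command pipe and a
# notification pipe back to the master.
workers = 3.times.map do
  cmd_r, cmd_w = IO.pipe     # master -> worker (records to process)
  note_r, note_w = IO.pipe   # worker -> master ("done" notifications)

  pid = fork do
    cmd_w.close
    note_r.close
    while (record = cmd_r.gets)  # blocks until the master sends a record
      # ... process the record here ...
      note_w.puts("done #{record.chomp}")
    end
  end

  cmd_r.close
  note_w.close
  { pid: pid, cmd: cmd_w, note: note_r }
end

records = (1..6).map(&:to_s)
results = []

# Seed every worker with one record, then hand out the rest as
# "done" notifications arrive.
workers.each { |w| w[:cmd].puts(records.shift) }

until results.size == 6
  # Wait on ALL notification pipes at once instead of blocking on one.
  ready, = IO.select(workers.map { |w| w[:note] })
  ready.each do |io|
    results << io.gets.chomp
    worker = workers.find { |w| w[:note] == io }
    worker[:cmd].puts(records.shift) unless records.empty?
  end
end

# Closing the command pipes makes each worker's gets return nil, so it exits.
workers.each { |w| w[:cmd].close; Process.wait(w[:pid]) }
```

Because IO.select reports whichever pipe has data, a fast worker gets its next record immediately while slow ones keep working, which is exactly the "notify master, receive next record" loop the question describes.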
Well, I really do think an MQ system is more suitable:
the master publishes the jobs, and the workers pick them up and process them.
Rails also has a publish/subscribe facility; see:
http://api.rubyonrails.org/classes/ActiveSupport/Notifications.html
Rails pub/sub with faye
