Golang background processing

How can one do background processing/queueing in Go?
For instance, a user signs up and you send them a confirmation email. You want to send that email in the background, because sending may be slow, the mail server may be down, and so on.
In Ruby a very nice solution is DelayedJob, which queues your job to a relational database (i.e. simple and reliable), and then uses background workers to run the tasks, and retries if the job fails.
I am looking for a simple and reliable solution, not something low level if possible.

While you could just start a goroutine for every async task, this is not a great solution if you want reliability, i.e. the promise that a task will get done once you trigger it.
If you really need this to be production grade, opt for a distributed work queue. I don't know of any such queues that are specific to Go, but you can work with RabbitMQ, Beanstalk, Redis or similar queuing engines to offload such tasks from your process and add fault tolerance and queue persistence.

A simple goroutine can do the job:
http://golang.org/doc/effective_go.html#goroutines
Start a goroutine that delivers the email, then respond to the HTTP request (or whatever triggered it).
If you want to use a work queue, you can use a RabbitMQ or Beanstalk client such as:
https://github.com/streadway/amqp
https://github.com/kr/beanstalk
Or you can create a queue inside your process, with a FIFO queue running in a goroutine:
https://github.com/iNamik/go_container
But maybe the best solution is this job queue library, which lets you set the concurrency limit, among other things:
https://github.com/otium/queue
    import "github.com/otium/queue"

    q := queue.NewQueue(func(email string) {
        // Your mail delivery code
    }, 20)
    q.Push("foo@bar.com")
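
For comparison, here is a minimal standard-library sketch of the in-process approach: a fixed number of goroutines draining a buffered channel, with a simple retry on failure. The names `Job`, `Pool`, `NewPool` and `ErrTemporary` are made up for this illustration and do not come from any of the libraries linked above. Jobs live only in memory, so anything still queued is lost if the process dies.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrTemporary is a hypothetical error used here to simulate a mail server outage.
var ErrTemporary = errors.New("temporary failure")

// Job is a hypothetical unit of work: one confirmation email to send.
type Job struct {
	Email string
}

// Pool runs jobs on a fixed number of goroutines and retries failed jobs.
type Pool struct {
	jobs     chan Job
	wg       sync.WaitGroup
	handler  func(Job) error
	attempts int
	backoff  time.Duration
}

// NewPool starts `workers` goroutines that call handler for each pushed job.
// Each job is tried at most `attempts` times, sleeping `backoff` between tries.
func NewPool(workers, attempts int, backoff time.Duration, handler func(Job) error) *Pool {
	p := &Pool{
		jobs:     make(chan Job, 100), // buffered so Push rarely blocks the caller
		handler:  handler,
		attempts: attempts,
		backoff:  backoff,
	}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go p.work()
	}
	return p
}

func (p *Pool) work() {
	defer p.wg.Done()
	for job := range p.jobs {
		for try := 1; try <= p.attempts; try++ {
			err := p.handler(job)
			if err == nil {
				break
			}
			if try == p.attempts {
				fmt.Printf("giving up on %s: %v\n", job.Email, err)
				break
			}
			time.Sleep(p.backoff)
		}
	}
}

// Push enqueues a job; it blocks only when the buffer is full.
func (p *Pool) Push(j Job) { p.jobs <- j }

// Close stops accepting jobs and waits for the queued ones to finish.
func (p *Pool) Close() {
	close(p.jobs)
	p.wg.Wait()
}

func main() {
	failedOnce := false
	// A single worker keeps the example deterministic; raise it for real concurrency.
	pool := NewPool(1, 3, 10*time.Millisecond, func(j Job) error {
		if !failedOnce {
			failedOnce = true
			return ErrTemporary // simulate one transient failure
		}
		fmt.Println("sent confirmation to", j.Email)
		return nil
	})
	pool.Push(Job{Email: "foo@bar.com"})
	pool.Close()
}
```

In an HTTP handler you would call `Push` and return the response immediately. If jobs must survive a restart, this is where a persistent broker or a database-backed queue takes over.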

I have created a library for running asynchronous tasks using a message queue (currently RabbitMQ and Memcache are supported brokers but other brokers like Redis or Cassandra could easily be added).
You can take a look. It might be good enough for your use case (and it also supports chaining and workflows).
https://github.com/RichardKnop/machinery
It is an early stage project though.

You can also use the goworker library to schedule jobs.
http://www.goworker.org/

If you are coming from a Ruby background and looking for something like Sidekiq, Resque, or DelayedJob, please check out the asynq library.
Its queue semantics are very similar to Sidekiq's.
https://github.com/hibiken/asynq

If you want a library with a very simple yet robust interface that feels Go-like, and that uses Redis as a backend and RabbitMQ as a message broker, you can try
https://github.com/Joker666/cogman

Related

ActiveMQ- Can I Replace The Scheduler Plugin With A Delayed Message Queue?

I worked a little with the ActiveMQ scheduler plugin. It simplifies scheduling messages for delayed delivery at low volume, but as I get into the hundreds of thousands of messages the system breaks down in two key ways.
It's very slow (compared to queues) to enqueue messages in the scheduler.
Attempting to view the schedules in the dashboard crashes the ActiveMQ instance.
The existing scheduler feels a little bolted on and does not perform as expected. So, rethinking the problem, I would like to have a jobs queue and a jobs-scheduled queue. Messages sent to the jobs-scheduled queue will have a ttl header holding the Unix timestamp at which they should be delivered. A process run from a cron job will take messages from the jobs-scheduled queue and send them to the jobs queue, using a selector to pick out only the messages whose ttl has elapsed: convert_string_expressions:ttl < %(now)s.
My two questions are:
Will this strategy work for delaying messages at scale or will I find scaling pains around the selector? These messages will be persisted if that makes a difference.
Is there an existing feature in ActiveMQ that will allow me to send messages from one queue to another with a selector query?
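
To make the proposed strategy concrete, here is an in-process sketch in Go of the "promote due messages" step. It is not ActiveMQ code and says nothing about how a broker-side selector performs. It only shows that if scheduled messages are kept ordered by delivery time (here with `container/heap`), the periodic job touches only the messages that are actually due instead of scanning the whole backlog. `Message`, `Scheduler` and `Promote` are hypothetical names, and the sketch treats a message as due when its timestamp is at or before `now`.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Message is a hypothetical scheduled job; DeliverAt is a Unix timestamp in seconds.
type Message struct {
	Body      string
	DeliverAt int64
}

// schedule is a min-heap of messages ordered by DeliverAt.
type schedule []Message

func (s schedule) Len() int           { return len(s) }
func (s schedule) Less(i, j int) bool { return s[i].DeliverAt < s[j].DeliverAt }
func (s schedule) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

func (s *schedule) Push(x interface{}) { *s = append(*s, x.(Message)) }

func (s *schedule) Pop() interface{} {
	old := *s
	n := len(old)
	m := old[n-1]
	*s = old[:n-1]
	return m
}

// Scheduler plays the role of the jobs-scheduled queue.
type Scheduler struct {
	pending schedule
}

// Add stores a message until its delivery time.
func (s *Scheduler) Add(m Message) { heap.Push(&s.pending, m) }

// Promote removes and returns every message with DeliverAt <= now,
// earliest first. The caller would forward these to the jobs queue.
func (s *Scheduler) Promote(now int64) []Message {
	var due []Message
	for s.pending.Len() > 0 && s.pending[0].DeliverAt <= now {
		due = append(due, heap.Pop(&s.pending).(Message))
	}
	return due
}

func main() {
	s := &Scheduler{}
	s.Add(Message{Body: "third", DeliverAt: 300})
	s.Add(Message{Body: "first", DeliverAt: 100})
	s.Add(Message{Body: "second", DeliverAt: 200})

	// Simulate the cron tick at t=150: only "first" is due.
	for _, m := range s.Promote(150) {
		fmt.Println("move to jobs queue:", m.Body)
	}
}
```

Each `Promote` call costs time proportional to the number of due messages (times a logarithmic factor), not to the total number of scheduled ones.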

ActiveMQ is a message broker, not a job scheduler, so what you are trying to do is really outside the scope of what the broker is intended to do. Yes, ActiveMQ does have a scheduled message feature, but it is not intended for large-scale job-queue work; it is a simple feature that provides some minimal delayed delivery.
What you are looking for sounds more like Quartz or some other batch job scheduling library. You could develop your own Job scheduler implementation for ActiveMQ or do something in a plugin but you are really trying to run against the grain of what a broker is meant to do which is deliver messages as quickly as possible in a decoupled manner.

Side note-- potentially off-topic.
I've had to solve a similar situation in the past where it made a lot of sense to load up the queues with messages ahead of time to cut down on the total transfer time.
I solved it by using Camel routes and a side-channel activation. Camel allows you to programmatically start and stop routes, so you can load up a queue that has no consumers with the data for a given time period. Then, using a dedicated control queue, you send the 'start' message. The control route receives the 'start' message and activates the main data processing route. You then need to configure some sort of 'stop' message semantic to be ready for the next time period's run.
Effectively, you get the delayed behavior pattern with much more control over scheduling and cut down on the data-to-queue loading time problem. You can also solve the scaling problem by loading the data across more than one queue.

Notifying golongpoll.SubscriptionManager of an event from kafka-go

I am writing a POC on long polling using Go.
The package generally used for this seems to be https://github.com/jcuga/golongpoll .
But suppose I want to publish an event to the golongpoll.SubscriptionManager from a general context. In particular, the long-poll API request may be served by one machine while the Kafka event for that consumer group is consumed by another instance in the cluster.
The examples in the documentation do not cover such a scenario at all, even though it seems like a common one. One way I can think of is to put a distributed cache like Redis in between and have all the services poll it for changes, but that sounds a bit dumb to me.

How to manage several worker processes with one master process in ruby?

I need to develop a multi-process application in Ruby: one master process manages several worker processes. The master creates a pool of workers (by forking! I don't want multithreading, I want multiprocessing), then periodically asks the database for new records to process and, if there are any, sends the records to the workers via pipes. When a worker has finished its job it must notify the master, so that the master can send that worker another record to process. I can't find a solution for the notifications that workers send to the master. It seems this should be done asynchronously, but IO pipes work in sync mode. Please point me in the right direction. Any fork best practices in the comments are also welcome! Thank you.
PS. I don't want to use external solutions like EventMachine or Parallel, only forks.

Well, I really do think an MQ system is more suitable.
The master publishes the jobs; the workers query for them and process them.
Rails also has a publish/subscribe facility,
see
http://api.rubyonrails.org/classes/ActiveSupport/Notifications.html
Rails pub/sub with faye

JMS Producer-Consumer-Observer (PCO)

In JMS there are Queues and Topics. As I understand it so far, queues are best used for producer/consumer scenarios, whereas topics can be used for publish/subscribe. However, in my scenario I need a way to combine both approaches and create a producer-consumer-observer architecture.
Specifically, I have producers which write to some queues, and workers which read the messages from these queues, process them, and then write them to a different queue (or topic). Whenever a worker has done a job, my GUI should be notified and update its representation of the current system state. Since the workers and the GUI are different processes, I cannot apply a simple observer pattern or notify the GUI directly.
What is the best way to realize this using a combination of queues and/or topics? The GUI should always be notified, but it should never consume anything from a queue?
I would like to solve this with JMS directly and not use any additional technology such as RMI to implement the observer part.
To give a more concrete example:
1. I have a queue with packages (PACKAGEQUEUE), produced by a machine (PackageProducer).
2. I have a worker which takes a package from the PACKAGEQUEUE, adds an address, and then writes it to a MAILQUEUE (AddressWorker).
3. Another worker processes the MAILQUEUE and sends the packages out by mail (MailWorker).
After step 2, when a message is written to the MAILQUEUE, I want to notify the GUI and update the status of the package. Of course the GUI should not consume the messages in the MAILQUEUE; only the MailWorker must consume them.

You can use a combination of queue and topic for your solution.
Your GUI application can subscribe to a topic, say MAILQUEUE_NOTIFICATION. Every time (i.e. at step 2) the AddressWorker writes a message to MAILQUEUE, a copy of that message should be published to the MAILQUEUE_NOTIFICATION topic. Since the GUI application has subscribed to the topic, it will get that publication, which contains the status of the package. The GUI can then be updated with the contents of that publication.
HTH
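
The queue-plus-topic idea can be illustrated without JMS. The Go sketch below is an in-process analogy only: a channel with a single consumer stands in for MAILQUEUE, and a small hypothetical `Topic` type that copies each message to every subscriber stands in for MAILQUEUE_NOTIFICATION. The types, the placeholder address, and the buffer sizes are assumptions made for this example.

```go
package main

import "fmt"

// Package is the hypothetical message passed between workers.
type Package struct {
	ID      int
	Address string
}

// Topic delivers a copy of every published message to each subscriber.
// It is not safe for concurrent Subscribe/Publish calls; a real one needs locking.
type Topic struct {
	subs []chan Package
}

// Subscribe registers a new observer and returns its private channel.
func (t *Topic) Subscribe(buffer int) <-chan Package {
	ch := make(chan Package, buffer)
	t.subs = append(t.subs, ch)
	return ch
}

// Publish sends p to every subscriber; it blocks if a subscriber's buffer is full.
func (t *Topic) Publish(p Package) {
	for _, ch := range t.subs {
		ch <- p
	}
}

// AddressWorker performs step 2: it adds an address, writes the package to
// the mail queue, and publishes a copy on the notification topic.
func AddressWorker(in Package, mailQueue chan<- Package, notify *Topic) {
	in.Address = "1 Example Street" // placeholder address for the sketch
	mailQueue <- in
	notify.Publish(in)
}

func main() {
	mailQueue := make(chan Package, 10) // consumed only by the MailWorker
	notifications := &Topic{}
	gui := notifications.Subscribe(10)

	AddressWorker(Package{ID: 1}, mailQueue, notifications)

	// The GUI observes the event without taking anything off the mail queue.
	fmt.Println("GUI notified about package", (<-gui).ID)
	fmt.Println("MailWorker still receives package", (<-mailQueue).ID)
}
```

The point is that the observer reads from its own copy of the message, so the queue's single-consumer delivery is left untouched.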

Solution/Architecture: queues or something else?

I have multiple frontends to my service written in Node.js, and workers written in Ruby. Now the question is how to make them communicate. I need to maintain a dynamic pool of workers to handle load (spawning more workers when load rises), and the messages are quite big, ~2-3 MB, because I'm sending the workers images that users upload through the Node.js frontends. Because I want nice scaling I thought about a queuing solution, but I didn't find any existing solution (or I misunderstood the guides) that provides:
Fallback mechanisms. The solutions I've found so far have a single point of failure (the message broker) and no way to provide fallbacks.
Serialization, so that tasks are not lost when the broker fails.
Ability to pass big messages.
Easy API for Ruby and Node.js.
Some API to track queue size, so I can resize the worker pool.
Preferably lightweight.
Maybe my approach is wrong? Maybe I shouldn't use queues but some other way? Or there's some queueing solution that fits requirements above?

No doubt you need a queue in order to scale, and you can monitor this queue to decide when to spawn "workers".
Apache ActiveMQ is very robust and supports a REST protocol. A Ruby client is also available to access the queue.
Interesting article on RESTful queue using Apache ActiveMQ
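
As a rough illustration of "monitor the queue to spawn workers", the sketch below computes a target pool size from the current queue depth. How you read the depth depends on the broker and is not shown. `DesiredWorkers` and its parameters are hypothetical, and the sizing rule (one worker per N queued jobs, clamped to a minimum and a maximum) is just one simple policy.

```go
package main

import "fmt"

// DesiredWorkers returns how many workers to run for the given backlog:
// one worker per jobsPerWorker queued jobs (rounded up), clamped to
// the range [minWorkers, maxWorkers].
func DesiredWorkers(queueDepth, jobsPerWorker, minWorkers, maxWorkers int) int {
	if jobsPerWorker <= 0 {
		return minWorkers // guard against division by zero
	}
	n := (queueDepth + jobsPerWorker - 1) / jobsPerWorker // ceiling division
	if n < minWorkers {
		return minWorkers
	}
	if n > maxWorkers {
		return maxWorkers
	}
	return n
}

func main() {
	// Pretend these depths were read from the broker at successive polls.
	for _, depth := range []int{0, 25, 95, 5000} {
		fmt.Printf("depth=%d -> workers=%d\n", depth, DesiredWorkers(depth, 10, 1, 20))
	}
	// Prints:
	// depth=0 -> workers=1
	// depth=25 -> workers=3
	// depth=95 -> workers=10
	// depth=5000 -> workers=20
}
```

A supervisor would call this on a timer and start or stop worker processes to match the result.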

At the end of the day I went with a ZeroMQ-based queue. It is a very fast, robust and lightweight implementation. I had to write my own broker, but that is the only downside of this solution.

Redis publish/subscribe should do the trick:
http://redis.io/topics/pubsub
