Multiple Sidekiq queues for a Sinatra application - Ruby

We have a Sinatra application in Ruby. We use Sidekiq with Redis for queue processing.
We have already implemented Sidekiq with a queue of jobs that insert records into the database, and it has worked fine so far.
Now I want to add another job that reads bulk data from the database and exports it to a CSV file.
I don't want both of these jobs in the same queue. Is it possible to create separate queues for these jobs within the same application?
Please suggest a solution.

You probably need advanced queue options. Read about them here: https://github.com/mperham/sidekiq/wiki/Advanced-Options
Create the csv queue from the command line (it can be done in a config file as well):
sidekiq -q csv -q default
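Equivalently, a minimal sidekiq.yml sketch (config/sidekiq.yml is the conventional path; adjust to your setup):
# config/sidekiq.yml
:queues:
  - csv
  - default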
Then in your worker:
class CSVWorker
  include Sidekiq::Worker
  sidekiq_options :queue => :csv

  def perform(*args)
    # perform method
  end
end
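Enqueuing then works as usual. A quick illustrative sketch (the argument is hypothetical), plus a way to confirm the job landed on the right queue:
CSVWorker.perform_async(42)      # pushes the job onto the csv queue
Sidekiq::Queue.new('csv').size   # => number of jobs waiting in that queue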

Take a look at the Sidekiq wiki: https://github.com/mperham/sidekiq/wiki/Advanced-Options
By default everything goes into the 'default' queue, but you can specify a queue in your worker:
sidekiq_options :queue => :file_queue
and to tell Sidekiq to process your queue, you have to either declare it in the configuration file:
:queues:
- file_queue
- default
or pass it as an argument to the sidekiq process: sidekiq -q file_queue
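Tying this back to the export job from the original question, here is a minimal sketch of a worker pinned to its own queue. The class name, model, and output path are illustrative, and it assumes ActiveRecord for the bulk read:
require 'csv'

class CsvExportWorker
  include Sidekiq::Worker
  sidekiq_options :queue => :file_queue

  def perform(output_path)
    # Read rows from the database in batches and stream them into a CSV file.
    CSV.open(output_path, 'w') do |csv|
      Record.find_each(batch_size: 1000) do |record|
        csv << [record.id, record.created_at]
      end
    end
  end
end

# Enqueue: CsvExportWorker.perform_async('/tmp/export.csv')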

Related

How to see Celery tasks in the Redis queue when there is no worker?

I have a container creating celery tasks, and a container running a worker.
I have removed the worker container, so I expected that tasks would accumulate in the redis list of tasks.
But I can't see any tasks in redis.
This is with Django. I need to isolate the worker and the queue, hence the settings below.
A typical queue name is 'test-dear', that is, SHORT_HOSTNAME='test-dear'
CELERY_DATABASE_NUMBER = 0
CELERY_BROKER_URL = f"redis://{REDIS_HOST}:6379/{CELERY_DATABASE_NUMBER}"
CELERY_RESULT_BACKEND = f"redis://{REDIS_HOST}:6379/{CELERY_DATABASE_NUMBER}"
CELERY_BROKER_TRANSPORT_OPTIONS = {'global_keyprefix': SHORT_HOSTNAME }
CELERY_TASK_DEFAULT_QUEUE = SHORT_HOSTNAME
CELERY_TASK_ACKS_LATE = True
After starting everything, and stopping the worker, I add tasks.
For example, on the producer container after python manage.py shell
>>> from cached_dear import tasks
>>> t1 = tasks.purge_deleted_masterdata_fast.delay()
<AsyncResult: 9c9a564a-d270-444c-bc71-ff710a42049e>
t1.get() does not return.
then in redis:
127.0.0.1:6379> llen test-dear
(integer) 0
I was not expecting 0 entries.
What am I doing wrong or not understanding?
I did this from the redis container
redis-cli monitor | grep test-dear
and sent a task.
The list is test-deartest-dear and
llen test-deartest-dear
works to show the number of tasks which have not yet been sent to a worker.
The queue name is f"{global_keyprefix}{queue_name}", i.e. the global_keyprefix is prepended to the queue name.

How do you monitor Sidekiq processes?

I'm working on a production app that has multiple Rails servers behind an nginx load balancer. We are monitoring Sidekiq processes with monit, and it works just fine - when a Sidekiq process dies, monit starts it right back up.
However, we recently encountered a situation where one of these processes was running and visible to monit, but for some reason not visible to Sidekiq. That resulted in many failed jobs, and it took us some time to notice that we were missing one process in the Sidekiq Web UI, since monit was telling us everything was fine and all processes were running. A simple restart fixed the problem.
And that brings me to my question: how do you monitor your Sidekiq processes? I know I can use something like Rollbar to notify me when jobs fail, but I'd like to know if there is a way to monitor the process count and preferably send mail when one dies. Any suggestions?
Something that would ping sidekiq/stats and verify the response.
My super simple solution to a similar problem looks like this:
# sidekiq_check.rb
namespace :sidekiq_check do
  task rerun: :environment do
    if Sidekiq::ProcessSet.new.size == 0
      exec 'bundle exec sidekiq -d -L log/sidekiq.log -C config/sidekiq.yml -e production'
    end
  end
end
and then using cron/whenever
# schedule.rb
every 5.minutes do
  rake 'sidekiq_check:rerun'
end
We ran into this problem where our sidekiq processes had stopped working off jobs overnight and we had no idea. It took us about 30 minutes to integrate http://deadmanssnitch.com by following these instructions.
It's not the prettiest or cheapest option but it gets the job done (integrates nicely with Pagerduty) and has saved our butt twice in the last few months.
One of our complaints with the service is that the shortest grace interval we can set is 15 minutes, which is too long for us. So we're evaluating similar services like Healthchecks, etc.
My approach is the following:
create a background job that does something
call the job regularly
check that the thing is being done!
So, using a cron script (or something like whenever), every 5 mins I run:
CheckinJob.perform_later
It's now up to Sidekiq (or delayed_job, or whatever ActiveJob adapter you're using) to actually run the job.
The job just has to do something which you can check.
I used to get the job to update a record in my Status table (essentially a list of key/value records). Then I'd have a /status page which returns a 500 status code if the record hasn't been updated in the last 6 minutes.
(Obviously your timing may vary.)
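A rough sketch of that pattern, assuming Rails with ActiveJob and a simple Status key/value model; all class and column names here are illustrative:
class CheckinJob < ApplicationJob
  queue_as :default

  def perform
    # Record that a background job actually ran; updated_at serves as the check-in timestamp.
    Status.find_or_create_by(key: 'last_checkin').touch
  end
end

class StatusController < ApplicationController
  def show
    checkin = Status.find_by(key: 'last_checkin')
    if checkin && checkin.updated_at > 6.minutes.ago
      head :ok
    else
      head :internal_server_error  # the monitoring service alerts on the 500
    end
  end
end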
Then I use a monitoring service to monitor the status page! (something like StatusCake)
Nowadays I have a simpler approach; I just get the background job to check in with a cron monitoring service like:
IsItWorking
Dead Mans Snitch
Health Checks
The monitoring service expects your task to check in every X minutes. If your task doesn't check in, the monitoring service will let you know.
Integration is dead simple for all the services. For Is It Working it would be:
IsItWorkingInfo::Checkin.ping(key:"CHECKIN_IDENTIFIER")
Full disclosure: I wrote IsItWorking!
I use the god gem to monitor my Sidekiq processes. God makes sure that your process is always running and can also notify you of the process status on various channels.
ROOT = File.dirname(File.dirname(__FILE__))

God.pid_file_directory = File.join(ROOT, "tmp/pids")

God.watch do |w|
  w.env = {'RAILS_ENV' => ENV['RAILS_ENV'] || 'development'}
  w.name = 'sidekiq'
  w.start = "bundle exec sidekiq -d -L log/sidekiq.log -C config/sidekiq.yml -e #{ENV['RAILS_ENV']}"
  w.log = "#{ROOT}/log/sidekiq_god.log"
  w.behavior(:clean_pid_file)
  w.dir = ROOT
  w.keepalive

  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.interval = 120.seconds
      c.above = 100.megabytes
      c.times = [3, 5] # 3 out of 5 intervals
    end

    restart.condition(:cpu_usage) do |c|
      c.interval = 120.seconds
      c.above = 80.percent
      c.times = 5
    end
  end

  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 5.minute
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 1.hours
    end
  end
end

Writing to a file in production with Sinatra

I cannot write to a file for the life of me using Sinatra in production.
In my development environment, I can use Logger without a problem and log STDOUT to a file.
It seems like in production the Logger class is overridden by the Rack middleware's logger, which makes things more complicated.
I simply want to write to a file like this:
post '/' do
  begin
    $log_file = File.open("/home/ec2-user/www/logs/app.log", "w")
    ...do..stuff...
    $log_file.write "INFO -- #{Time.now} --\n #{notification['Message']}"
    ...do..stuff...
  rescue
    $log_file.write "ERROR -- #{Time.now} --" + "\njob failed"
  ensure
    $log_file.close
  end
end
The file doesn't get created when I receive a POST request to '/'.
However the file DOES get created when I load the app running pry:
pry -r ./app.rb
I am certain the code inside the POST block is effectively running because new jobs are getting added to the database upon receiving requests.
Any help would be greatly appreciated.
I was finally able to get to the bottom of this.
I changed the nginx user in /etc/nginx/nginx.conf from nginx to ec2-user. (Ideally I would just fix the write permissions for the nginx user but this solution suits me for now.)
Then I ran ps aux | grep unicorn and saw the timestamp next to the process name: unicorn master -c unicorn.rb -D was 3 days old!!
All this time I had been pushing my code to the production server and restarting nginx, but never killed and restarted the unicorn process.
I removed all the code in my POST block and left only the file creation part:
post '/' do
  $log_file = File.open("/home/ec2-user/www/logs/app.log", "a")
  $log_file.write("test log string")
  $log_file.close
end
And the file was successfully written to upon receiving a POST request.
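For what it's worth, a minimal sketch of the same route using Ruby's standard Logger instead of reopening the file on every request (the path matches the one in the question; everything else is illustrative):
require 'sinatra'
require 'logger'

# Open the log file once at boot; Logger handles timestamps and severities.
LOG = Logger.new("/home/ec2-user/www/logs/app.log")

post '/' do
  begin
    LOG.info("received POST at #{Time.now}")
    # ...do..stuff...
  rescue => e
    LOG.error("job failed: #{e.message}")
  end
end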

How to print capistrano current thread hash?

An example output from capistrano:
INFO [94db8027] Running /usr/bin/env uptime on leehambley#example.com:22
DEBUG [94db8027] Command: /usr/bin/env uptime
DEBUG [94db8027] 17:11:17 up 50 days, 22:31, 1 user, load average: 0.02, 0.02, 0.05
INFO [94db8027] Finished in 0.435 seconds command successful.
As you can see, each line starts with "{type} {hash}". I assume the hash is some unique identifier for either the server or the running thread, as I've noticed that if I run Capistrano over several servers, each one has its own distinct hash.
My question is, how do I get this value? I want to manually output some message during execution, and I want to be able to match my output, with the server that triggered it.
Something like: puts "DEBUG ["+????+"] Something happened!"
What do I put in the ???? there? Or is there another, built in way to output messages like this?
For reference, I am using Capistrano Version: 3.2.1 (Rake Version: 10.3.2)
This hash is a command uuid. It is tied not to the server but to the specific command that is currently being run.
If all you want is to distinguish between servers, you may try the following:
task :some_task do
  on roles(:app) do |host|
    debug "[#{host.hostname}:#{host.port}] something happened"
  end
end

Sidekiq multiple workers?

I have a question regarding Sidekiq. I come from the Resque paradigm, and in the current application I launch one worker per queue, so in the terminal I would do:
rake resque:work QUEUE='first'
rake resque:work QUEUE='second'
rake resque:work QUEUE='third'
Then, if I want more workers, for example for the third queue, I just start more workers like so:
rake resque:work QUEUE='third'
My question is...
With Sidekiq, how would you start with multiple workers? I know you can do this:
sidekiq -q first -q second -q third
But that would just start one worker process that fetches from all those queues. So, how would I go about starting three workers, and telling each worker to just focus on a particular queue? Also, how would I do that on Heroku?
You could use a config file in config/sidekiq.yml:
# Sample configuration file for Sidekiq.
# Options here can still be overridden by cmd line args.
# sidekiq -C config.yml
---
:verbose: true
:pidfile: ./tmp/pids/sidekiq.pid
:concurrency: 15
:timeout: 5
:queues:
  - [first, 20]
  - [second, 20]
  - [third, 1]
staging:
  :verbose: false
  :concurrency: 25
production:
  :verbose: false
  :concurrency: 50
  :timeout: 60
That way you can configure exactly what you want. To answer your question precisely, the concurrency value is what you are looking for: it defines the number of concurrent threads each Sidekiq process runs, and the numbers next to each queue name are weights controlling how often that queue is checked.
More info here : https://github.com/mperham/sidekiq/wiki/Advanced-Options
So, how would I go about starting three workers, and telling each worker to just focus on a particular queue?
You can define at the worker level which queue the worker's jobs should be placed in via sidekiq_options.
For example, to place your worker in a queue called "first", just define it with:
class MyWorker
  include Sidekiq::Worker
  sidekiq_options :queue => :first
  ...
end
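As a quick usage note, jobs from this worker then land on the first queue, and that queue still has to be listed under :queues: in the config file (or passed with -q) for Sidekiq to pull from it. An illustrative call (the argument is hypothetical):
MyWorker.perform_async(42)   # pushed onto the "first" queue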
