identify which sidekiq process belongs to which application - ruby

I have 5+ applications running on my system, all using Sidekiq for background processing. How can I identify which Sidekiq process belongs to which application?

I can't give you a "call this Sidekiq method" sort of answer, but I can give you an approach. Using the Sidekiq server middleware, you can create a Redis key (e.g. "Process_") and assign it the name of the app, then it's just a simple matter of looking up the value of the key to determine which app created it. If you want to go the opposite direction, create a key based on the app name (e.g. "application_") as a set and add the process id as a member. There are examples of server middleware use in the Sidekiq Wiki, and you can dig through the Sidekiq code and refer to the Redis documentation to determine how to set keys in Redis.
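A minimal sketch of that approach, under stated assumptions: the class name `AppTagMiddleware` is hypothetical, the process id is assumed to be the suffix of the "Process_" key, and `redis` is assumed to be any client object that responds to `set` and `sadd` (redis-rb does; other clients may need adapting).

```ruby
# Hypothetical server middleware: records which app owns the current process.
# `redis` is an injected client responding to #set and #sadd (an assumption).
class AppTagMiddleware
  def initialize(app_name, redis)
    @app_name = app_name
    @redis = redis
  end

  # Sidekiq server middleware is invoked with the worker instance,
  # the job hash and the queue name, and must yield to run the job.
  def call(_worker, _job, _queue)
    pid = Process.pid
    # Lookup by process: "Process_<pid>" holds the app name (suffix is assumed).
    @redis.set("Process_#{pid}", @app_name)
    # Opposite direction: "application_<name>" is a set of process ids.
    @redis.sadd("application_#{@app_name}", pid)
    yield
  end
end
```

The middleware would then be added to the server middleware chain in each app's Sidekiq initializer, with a different app name per application. Writing the keys on every job is wasteful but keeps the sketch short; writing them once at startup would work as well.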
Hope this helps.

Related

Manually launch a Sidekiq job through Redis

Is there a way to manually launch a Sidekiq process without using Ruby, but by posting the appropriate message into Redis? There must be some sort of convention for the message that it expects.
This is already covered in the FAQ: https://github.com/mperham/sidekiq/wiki/FAQ#how-do-i-push-a-job-to-sidekiq-without-ruby
Not sure why you would do this, but from its documentation: "Sidekiq is compatible with Resque. It uses the exact same message format as Resque so it can integrate into an existing Resque processing farm." I know that Resque enqueues a hash of data as a string:
"{\"class\":\"NoOpWorker\",\"args\":[]}"
You can manually verify this by enqueuing a job at a console with:
Resque.enqueue_to "foo", NoOpWorker
And then see what the data is with a redis-cli command
redis-cli lrange resque:queue:foo 0 100
But before proceeding, why would you want to do this? Why not just run a script or a rake task that would enqueue the job through Sidekiq's normal API instead of hacking around it?
EDIT: Are you trying to interop between technologies?
Redis doesn't know anything about Ruby or Sidekiq.
So yeah, it's possible. It might require some work, and you might have to take versioning of the non-public (well, it is open source after all, so anything is public) API into account.
You could write a separate client process in any programming language, and analyze the redis keyspace. Read up on the implementation of the Sidekiq serialization. A quick look (I don't use Sidekiq) reveals that it uses simple JSON serialization: sidekiq/api.rb .
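To make the JSON format concrete, here is a hedged sketch that builds such a payload. It is written in Ruby only for illustration; any language with a JSON library would do. The helper name `build_job_payload` is hypothetical, and every field beyond "class" and "args" (queue, jid, timestamps) is an assumption that should be checked against the FAQ and your Sidekiq version. How the string is pushed into Redis (list name, any namespace prefix) is deliberately left out, since the FAQ is the authority on that.

```ruby
require "json"
require "securerandom"

# Builds a JSON job payload string. Only "class" and "args" come from the
# discussion above; the remaining fields are assumptions for illustration.
def build_job_payload(worker_class, args, queue: "default")
  now = Time.now.to_f
  job = {
    "class" => worker_class,
    "args" => args,
    "queue" => queue,
    "jid" => SecureRandom.hex(12), # random 24-character hex job id
    "created_at" => now,
    "enqueued_at" => now
  }
  JSON.generate(job)
end
```

Calling `build_job_payload("NoOpWorker", [])` returns a single JSON string that a non-Ruby client could produce just as easily.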
Hope this helps, TW

Reliable persisted sidekiq task

I am working on a ruby application that creates todos and meetings.
There will be reminders that are sent out with respect to each meeting or todo as you would imagine.
We are already using sidekiq and it would be nice to use sidekiq to create the scheduled jobs in x number of days/hours etc.
My concern is that we will lose the jobs if redis restarts.
Am I right in assuming that if redis restarts, we lose the jobs and if so, is there anything that can be done about it?
If not sidekiq, what else could I use?
There are several ways of doing that; see http://redis.io/topics/persistence. Snapshotting, for example, saves point-in-time snapshots of the dataset to disk.

How to identify a heroku dyno number from within the app?

Is there a way to identify the heroku dyno name (e.g. web.1, web.2) from within the application? I'd like to be able to generate a unique request id (e.g. to track requests between web and worker dynos for consolidated logging of the entire request stack) and it seems to me that the dyno identifier would make a decent starting point.
If this can't be done, does anyone have a fallback recommendation?
Recently that issue has been addressed by the Heroku team.
The Dyno Manager adds a DYNO environment variable that holds the identifier of your dyno, e.g. web.1, web.2, foo.1. However, the variable is still experimental and subject to change or removal.
I needed that value (actually instance index like 1, 2 etc) to initialize flake id generator at instance startup and this variable was working perfectly fine for me.
You can read more about the variables on Local environment variables.
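For the instance-index case mentioned above, a small sketch of pulling the number out of the dyno name. The helper name `dyno_index` is hypothetical, and the "name.number" shape is assumed from the examples (web.1, web.2, foo.1); anything else yields nil.

```ruby
# Extracts the numeric instance index from a dyno name such as "web.1".
# Returns nil when the value is missing or not in the assumed "name.number" form.
def dyno_index(dyno_name = ENV["DYNO"])
  match = /\A[\w-]+\.(\d+)\z/.match(dyno_name.to_s)
  match && match[1].to_i
end
```

Returning nil rather than raising lets the caller decide on a fallback when the variable is absent, for example when running locally.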
I asked this question of Heroku support, and since there are others here who have asked similar questions to mine I figured I should share it. Heroku staff member JD replied with the following:
No, it's not possible to see this information from inside the dyno.
We've reviewed this feature request before and have chosen not to
implement it, as this would introduce a Heroku-specific variable which
we aim to avoid in our stack. As such, we don't have plans to
implement this feature.
You can generate / add to your environment a unique identifier (e.g. a
UUID) on dyno boot to accomplish a similar result, and you can
correlate this to your app's dynos by printing it to your logs at that
time. If you ever need to find it later, you can check your logs for
that line (of course, you'll need to drain your logs using Papertrail,
Loggly, etc, or to your own server).
Unfortunately for my scenario, a UUID is too long (if I wanted such a large piece of data, I would just use a UUID to track things in the first place). But it's still good to have an official answer.
Heroku has a $DYNO environment variable, however there are some big caveats attached to it:
"The $DYNO variable is experimental and subject to change or removal." So they may take it away at any point.
"$DYNO is not guaranteed to be unique within an app." This is the more problematic one, especially if you're looking to implement something like Snowflake IDs.
For the problem you're attempting to solve, the router request ID may be more appropriate. Heroku passes a unique ID to every web request via the X-Request-ID header. You can pass that to the worker and have both the web and worker instance log the request ID anytime they log information for a particular request/bit of work. That will allow you to correlate incidents in the logs.
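A rough sketch of that idea, assuming a Rack-based app, where the X-Request-ID header is exposed in the request environment as `HTTP_X_REQUEST_ID`. The helper names `request_id_from` and `tagged` are hypothetical, and the fallback to a random UUID is an addition for requests that arrive without the header (e.g. in development).

```ruby
require "securerandom"

# Reads the router-supplied request id from a Rack env hash,
# falling back to a random UUID when the header is absent or blank.
def request_id_from(env)
  id = env["HTTP_X_REQUEST_ID"].to_s.strip
  id.empty? ? SecureRandom.uuid : id
end

# Prefixes a log message with the request id so web and worker
# log lines for the same request can be correlated later.
def tagged(request_id, message)
  "[request_id=#{request_id}] #{message}"
end
```

The web process would pass the id along as one of the job's arguments, and the worker would use the same `tagged` call when logging.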
This may not exactly answer the question, but you could have a different line in your Procfile for each worker process (using a ps:scale of 1 for each). You could then pass in the worker number as an environment variable from the Procfile.
Two lines from an example procfile might look like:
worker_1: env WORKER_NUMBER=1 node worker
worker_2: env WORKER_NUMBER=2 node worker
The foreman package which heroku local uses seems to have changed the ENV variable name again (heroku/7.54.0). You can now get the worker name via $FOREMAN_WORKER_NAME when running locally. It has the same value $DYNO will have when running on Heroku (web.1, web.2, etc)
The foreman gem still uses $PS, so to access the dyno name and have it work both on Heroku and in development (when using foreman) you can check $PS first and then $DYNO. To handle the case of a local console, check whether Rails::Console is defined:
dyno_name = ENV['PS'] || ENV['DYNO'] || (defined?(Rails::Console) ? "console" : "")
It's dangerous to use the DYNO environment variable because its value is not guaranteed to be unique. That means you can have two dynos running at the same time that briefly have the same DYNO variable value. The safe way to do this is to enable dyno metadata and then use the HEROKU_DYNO_ID environment variable. That will better let you generate unique request ids. See: https://devcenter.heroku.com/articles/dyno-metadata
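As an illustration only, a sketch of an id generator built on that variable. The class name `RequestIdGenerator` is hypothetical, the "prefix-counter" format is an arbitrary choice, and the random UUID fallback is an assumption for environments where dyno metadata is not enabled.

```ruby
require "securerandom"

# Generates ids of the form "<dyno id>-<counter>" within one process.
# Falls back to a random UUID prefix when HEROKU_DYNO_ID is not set.
class RequestIdGenerator
  def initialize(dyno_id = ENV["HEROKU_DYNO_ID"])
    @prefix = dyno_id.to_s.empty? ? SecureRandom.uuid : dyno_id
    @counter = 0
    @lock = Mutex.new
  end

  # Thread-safe: the counter is only incremented while holding the lock.
  def next_id
    n = @lock.synchronize { @counter += 1 }
    "#{@prefix}-#{n}"
  end
end
```

Note that several processes on the same dyno would share the prefix, so each would need something extra (such as its pid) mixed in.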

ROR + Detect how many ports are running for ruby application

In my Ruby on Rails project, I have to pull data from SQL Server into my MySQL database.
When I run my project on port 3000, the system becomes busy while the pull is running.
I want a way for the system to detect how many ports are in use by the Ruby application, and to close any that are not in use.
Thanks in advance.
Hard to understand exactly what you're asking for, but I'm assuming that when you are synchronizing databases, the system becomes busy and you can't serve any pages. This is a perfect example for the use of a background job that allows you to do tasks like this without affecting the Rails application. The two gems that come to mind that will allow you to do this are Delayed_job and Resque. Outstanding screencasts for doing this type of stuff are listed below as well.
http://github.com/collectiveidea/delayed_job
https://github.com/defunkt/resque/
http://railscasts.com/episodes/171-delayed-job
http://railscasts.com/episodes/271-resque

What's the best way to fetch a POP3 server for new mails every 15 minutes?

I'm developing an app that needs to fetch a POP3 account every 5-15 minutes to check for new email and process it. I have written all the code except for the part where it automatically runs every 5-15 minutes.
I'm using Sinatra, DataMapper and hosting on Heroku which means cron jobs are out of the question, because Heroku only provides hourly cron jobs at best.
I have looked into Delayed::Job which doesn't natively support Sinatra nor DataMapper but there are workarounds for both. Since my Ruby knowledge is limited I couldn't find a way to merge these two forks into one working Delayed::Job for Sinatra/DataMapper solution.
Initially I used Mailman to check for emails which has built-in polling and runs continuously, but since it's not Rack-based it doesn't run on Heroku.
Any pointers on where to go next? Before you say: a different webhost, I should add I really prefer to stick with Heroku because of its ease of use (except of course, for the above issue).
Heroku supports CloudMailin
A simple trick is to write your code contained in a loop, then sleep at the bottom of it for however long you want:
Untested sample code...
loop do
  do_something_way_cool()
  sleep 5 * 60 # sleep takes seconds, so this is 5 minutes
end
If it has to be contained in the main body of the app then use a Thread to wrap it so the thread does the work. You'll need to figure out your shared data structures to transfer the data out of the loop. Queue is your friend there.
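A small sketch of the Thread-plus-Queue variant. The helper name `start_poller` is hypothetical, and the block stands in for whatever does the actual mail check.

```ruby
# Runs the given block repeatedly in a background thread, sleeping
# `interval_seconds` between runs, and pushes each result onto a Queue.
# Returns the thread and the queue so the caller can consume or stop it.
def start_poller(interval_seconds, results = Queue.new, &work)
  thread = Thread.new do
    loop do
      results << work.call
      sleep interval_seconds
    end
  end
  [thread, results]
end

# Usage (not run here): poll every 5 minutes and consume results elsewhere.
#   thread, results = start_poller(5 * 60) { do_something_way_cool() }
#   message = results.pop   # blocks until the next result is available
#   thread.kill             # stop polling
```

Queue is thread-safe, so the main body of the app can pop results without any extra locking.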
