Ruby Background Scheduler - ruby

i'm trying to use some background scheduling in ruby, and seems like havent found yet the right solution.
so many gem are talked about on the net, rufus, sidekiq, resque, whenever , clockwwork , but i was wondering what is the best option for our need.
basically i have few jobs that i want to schedule on a cron basis upfront.
also, would like to have the ability to update those jobs, or add new jobs during runtime.
tried the following:
rufus-scheduler : was vary simple and easy to integrate. had the ability to schedule jobs by ons startup. the problem is there is no ability to dynamically integreate with it during runtime.
resque - should have the ability to schedule at runtime (schedule.yml) and also to integrate with dynamically, but uses process for each worker. bad performance.
sidekiq - should have support all of these. but by adding also some 3rd party gem "sidekiq-scheduler". had few issues on the scheduler and looks like its not really supported, which might be an issue.
so should we keep invest on the sidekiq-scheduler or have we missed some more mature gem that could help us making this works?
any help will be appriciated.
Levi

Related

The best way to schedule||automate MarkLogic data hub flows/custom steps

I use DMSDK to ingest data; I have multiple custom flows to run following data ingestion. Instead of manually running the flows one by one, What is the best way to orchestrate MarkLogic data hub flows?
gradle, trigger or other scheduling tools?
I concur with Dave Cassel that NiFi, or perhaps something like MuleSoft, or maybe even Camel is a great way to manage running your flows. Particularly if you are talking about operational management.
To answer on other mechanisms:
Crontab doesn't connect to MarkLogic itself. You'd have to write scripts or code to make something actually happen. You won't have much control either, nor logging, unless you add that as well.
We have great plugins for Gradle that make running flows real easy. Great during development and such, but perhaps less suited for scheduling or operational tasking.
Triggers inside MarkLogic only respond to insertion of data, so you'd still have to initiate an update from outside anyhow.
Scheduled Tasks inside MarkLogic has similar limitations to Crontab and Gradle. It doesn't do much by itself, so you have to write code anyhow. It provides no logging by itself, nor ways to operationally manage the tasks, other than through Admin ui.
JAR package might depend on what JAR package you actually mean. You can create a JAR of your ml-gradle project, but that doesn't give you a lot of gain over calling Gradle itself.
Personally, I'd have a close look at the operational requirements. Think of for instance: need to get status overview, interrupt schedules, loops to retry at failure, built-in logging, and facilities to send notifications when attention is needed.
HTH!
There are a variety of answers that will work, of course; my preference is NiFi. This keeps any scheduling overhead outside of MarkLogic, with the trade-off that you'll need to have NiFi running.

Issues with EventMachine (and looking into Sinatra Async)

I've been trying to find a good way of dealing with asynchronous requests and organizing jobs that need to be repeated, and eventmachine seemed a good way to go, but I found some posts trying to discourage users from eventmachine (for example https://github.com/kyledrake/sinatra-synchrony). I was wondering what the issues they are referring to are? (and if someone would be nice enough, what the alternatives are?)
Considering you're basically searching for a job queue, take a look at Background Jobs at Ruby Toolbox and you'll find a plethora of good options. Manageability vs Speed goes something like this,
Delayed Job
Sidekiq/Resque
Beanstalkd
with DJ being slowest and most manageable and beanstalkd being fastest and least manageable. Your best bet is probably sidekiq or resque, they both depend on redis for managing their queue.
I'd discourage you to use EventMachine because:
It's hard to reason about the reactor pattern.
Fibers detangle reactor pattern's callback pyramid of doom into synchronous looking code but fiber support in third party apps tend to bite you.
You're limited to a very limited eco system when it comes to net-related code.
It's hard not to block the reactor and it's often not easy to catch it when you do.
There are finished solutions for background processing, you don't need to code your own.
It's not really maintained any more, just take a look at last commits and issue list on github.
There's celluloid and celluloid-io and dcell.
Actually, the Sinatra Synchrony people sum it up good:
This gem should not be considered for a new application. It is better
to use threads with Ruby, rather than EventMachine. It also tends to
break when new releases of ruby come out, and EM itself is not
maintained very well and has some pretty fundamental problems.
I will not be maintaining this gem anymore. If anyone is interested in
maintaining it, feel free to inquire, but I recommend not using
EventMachine or sinatra-synchrony anymore.
Use EM if it fits your workflow. Callbacks can be fine to work with as long as you don't get too crazy. We built a lot of software on top of EM at my last job.
There is pretty good support for third party protocols, just take a look at the protocol implementations page.
As to blocking the reactor, you just need to make sure you don't do work on the main thread, and if you do, make sure it's work you do fast. There are some things you can do to determine if this is working. The simplest is just to add a latency check into your code. It's as simple as adding a periodic timer for every x seconds and logging a message (in development). Printing out the time between the calls will tell you how lagged the reactor has become. The greater this time is then your x value the more work you're doing on the main thread.
So, I'd say, try it for yourself. Try celluloid, try straight up threads, try EM with EM-Synchrony and fibers.
It really comes down to personal preference.

How to Monitor ruby process ?

What is the best way to monitore Ruby process ( Not Rails ), I tried to use Newrelic but it seems it is designed for Rails .
I'm using distributed Ruby processes with foreman, the processes communicate with RabbitMQ queuing system, and I'm looking for a tool that can trace my queries and other parameters specially that I'm using active record with most of my models .
NewRelic is still viable for Ruby apps that are non-Rails. The downside is you have to manually instrument your code with trace calls. NewRelic's documentation (scroll down to Adding Method Tracers (Ruby)) shows how to setup custom tracing for your application. Then you can use the NewRelic website as usual to inspect the runtime of your app.
I have done this for Java apps, but not a Ruby app. One thing to look out for is that custom traces will show up in the Background Issues area, not under Web Transactions (for obvious reasons). Database calls should be in the correct area.
The NewRelic support is very good, I have work with them several times to debug an issue. If you have an issue with the custom tracing, they will help sort it out.
try monit

Which is better, Nagios or Sensu? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I am unsure about which monitoring framework to use. Currently I am looking at either Nagios or Sensu.
Can anybody give me a good reference which shows a comparison of these two (or any other monitoring tool which may be a good solution)? My main intention is to scale-out on EC2. I am using Opscode Chef for system integration.
One important difference between Nagios and Sensu -
Nagios requires all the configuration for 1)checks 2)handlers but most importantly 3)hosts to be written in configuration files on the Nagios server. This means that each time one of the 3 above is changed (for example new hosts added, old hosts removed) you need to re-write the configuration files and restart Nagios.
Sensu is almost the same as the above, with one important difference -- when hosts are added or removed from your architecture (as is the case in most auto-scaling cloud deployments) -- the hosts themselves run a sensu-client that "subscribes" to different available checks. So when a new server comes into existence and says "I'm a webserver", the sensu-client running on it will ask the sensu-server "what checks should a webserver run on itself?" and run those.
Other than this, operations wise both Nagios (also Icinga) and Sensu are great and have a lot of facilities for checks, handlers, and visibility through a dashboard (YMMV).
From a little recent experience with Sensu and quite a bit of experience with Nagios I'd say both are excellent choices.
Sensu is definitely the new kid. It has a nice UI and nice API. It does however require Redis and RabbitMQ in your setup to work. So consider if you'll therefore want something to monitor those dependencies outside the sensu monitoring stack. Sonian provide Chef recipes for trying it out too.
https://github.com/sensu/sensu-chef
Nagios has been around for an awfully long time. It's generally packaged for most distros which makes installation simple and it has few dependencies. It's track record also means that finding people who know it or that have used it and can offer advice is easy. On the other hand the UI is ugly and programatic access is often hacky or via third party add-ons. Chef recipes also exist for Nagios:
https://github.com/bryanwb/chef-nagios
If you have time I'd try both, there is little harm in having two monitoring systems running as a trial. The main think to focus on, especially in a dynamic EC2 setup, is how easily the monitoring configuration files can be generated by your configuration management tool.
In terms of other tools I'd personally include something to record time series data, for instance requests per second or load over time. Graphs are a great help with monitoring, and can be used to drive alerting via Nagios or similar. Personally I'm a fan of both Ganglia and Graphite while Librato Metrics (https://metrics.librato.com/) is a very nice non-free option.
I tried using Nagios for a while: I got the feeling that the only reason that it's common is that 'everyone else uses it', because it's absolutely hideous to work with. Massively overcomplicated, difficult and long-winded to make it do anything new: if you find something it doesn't do, you know you're in for a week of swearing at crummy documentation of an archaic design. At the end of all your efforts and it's all working, it looks hideous. Scrapping it made me sleep better.
Cacti looks nice, but again it's unnecessarily complex when creating new plugins.
For graphing I'd recommend Munin: it's completely trivial to write new plugins in any language, there are hundreds available, and it looks reasonable. It's incredibly easy to install - one command to install and set one access rule, so works well for automated deployments, easy to wrap into a chef recipe. 2.0 is out soon and addresses most of its shortcomings (in particular adding variable update intervals, zoomable graphs, ssh transport). Munin can talk to Nagios for notifications, or it can do that itself, and it provides a basic dashboard.
For local process/file/service monitoring, monit is simpler and works better than god. I've not tried it with m/monit.
When compared with Sensu and Nagios... The pick would be Sensu monitoring systems.
Below is the are the main reasons,
1.Easy Setup.. There is lot of reduction of restarting of Clients.. which is major trouble in the large enterprise
2. Nagios Plugins can be used with the Sensu Ecosystem.
3. Scalable and easily for the Cloud environment.
Has anyone heard about Zabbix.It has lot many features and comes as a single package. I doubt the scalability
As long as enterprise it consists of databases, sap, network devices, webservers, filers, backup libraries.... there is barely an alternative to nagios (or it's cousins icinga, shinken)
Maybe one day everything will come out of clouds automagically but still a few years there will be static servers (physical or virtual, it doesn't matter) with a defined purpose resting at least for a few months. We will still have to monitor interface bandwidth, tablespaces, business processes, database sessions, logfiles, jmx metrics. All things where the plugin concept of the nagios world has an advantage.

Free Ruby scheduled tasks (off the rails) hosting?

I have some simple Ruby scripts to run as a background job. They are in an infinite while loop to monitor external database changes. Can someone recommend a hosting provider that can do it for free to start?
I looked at AWS and the EC2 micro instance is actually a good fit for the first year. Anything equivalent beyond the first year?
Then I looked at Heroku. It seems a hack and overkill to use Delayed Job in the Rails framework.
Google App Engine is also a good fit as long as I am willing to rewrite my Ruby scripts to Python.
More background on my project. I am using CouchDB + CouchApp. It requires some external scripts to monitor new users sign up and send out forget password emails.
I don't know about free, but there is SimpleWorker, which lets you offload processes (and apparently schedule them too!) to their cloud infrastructure.
Try using http://callmyapp.com
Are you able to set up an always on machine at your house? It's probably going to be just as reliable as hosting it with a free tier provider.

Resources