Ruby: Any gems for threadpooling? - ruby

Is there a gem for threadpooling anyone can recommend?

In my experience, forking/process pooling is much more effective than thread pooling in Ruby (assuming you do not need much in terms of communication between workers). Some time ago I created a gem called process_pool, which is a very basic process pool with a file-based job queue (you can check it out here: http://github.com/psyho/process_pool).

I would try https://github.com/ruby-concurrency/concurrent-ruby/ .
It's basically a port of the java.util.concurrent abstractions (including thread pools) to Ruby, except that if you install it under JRuby, it'll use the java.util.concurrent stuff. So you can write code that'll work and do the same thing semantically (not necessarily with the same performance) under any Ruby platform.
It also offers Futures, a higher level abstraction which may be more convenient to use than thread pools.
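To make concrete what a thread pool actually does, here is a minimal sketch using only Ruby's standard library (Queue and Thread). It is not concurrent-ruby's implementation or API; the class name TinyThreadPool and its methods are made up for illustration, and a real library handles errors, timeouts and shutdown far more carefully.

```ruby
# A minimal fixed-size thread pool built on the standard library only.
# TinyThreadPool is a hypothetical name, not part of any gem.
class TinyThreadPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and drained,
        # which ends the worker loop.
        while (job = @jobs.pop)
          begin
            job.call
          rescue StandardError => e
            warn "job failed: #{e.message}"
          end
        end
      end
    end
  end

  # Enqueue a block to be run by one of the worker threads.
  def post(&block)
    @jobs << block
  end

  # Stop accepting work and wait for the queued jobs to finish.
  def shutdown
    @jobs.close
    @workers.each(&:join)
  end
end

results = Queue.new # Queue is thread-safe, so workers can push concurrently
pool = TinyThreadPool.new(4)
10.times { |i| pool.post { results << i * i } }
pool.shutdown

squares = Array.new(results.size) { results.pop }.sort
```

On MRI the global interpreter lock means a pool like this mostly helps with I/O-bound jobs, which is part of why the answer above prefers processes for CPU-bound work.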

Related

Using spdy with mod_php

The documentation for SPDY says it is not compatible with mod_php because mod_php is not thread-safe:
https://developers.google.com/speed/spdy/mod_spdy/php
Much like the Apache Worker MPM, mod_spdy is a multithreaded module,
and processes multiple SPDY requests from the same connection
simultaneously. This poses a problem for other Apache modules that may
not be thread-safe, such as mod_php. Fortunately, it is fairly easy to
adjust your Apache configuration to make your existing PHP code safe
to use with mod_spdy (and with the Worker MPM as well).
I have tried using SPDY with mod_php and I haven't had any issues. What is the danger of doing this?
The PHP core has been thread-safe since PHP 5. However, many of the extensions, and the libraries those extensions use, are not.
If you're not using those extensions, you probably won't run into any problems. If you are, you might get segfaults, other memory access violations, or just strange behavior.
A partial list is available on the PHP site. Unfortunately, there doesn't seem to be a conclusive list of thread-safe and thread-unsafe extensions.

Issues with EventMachine (and looking into Sinatra Async)

I've been trying to find a good way of dealing with asynchronous requests and organizing jobs that need to be repeated, and eventmachine seemed a good way to go, but I found some posts trying to discourage users from eventmachine (for example https://github.com/kyledrake/sinatra-synchrony). I was wondering what the issues they are referring to are? (and if someone would be nice enough, what the alternatives are?)
Considering you're basically searching for a job queue, take a look at Background Jobs at Ruby Toolbox and you'll find a plethora of good options. Manageability vs. speed goes something like this:
Delayed Job
Sidekiq/Resque
Beanstalkd
with DJ being the slowest and most manageable and beanstalkd being the fastest and least manageable. Your best bet is probably Sidekiq or Resque; they both depend on Redis for managing their queue.
I'd discourage you from using EventMachine because:
It's hard to reason about the reactor pattern.
Fibers untangle the reactor pattern's callback pyramid of doom into synchronous-looking code, but fiber support in third-party apps tends to bite you.
You're restricted to a very limited ecosystem when it comes to net-related code.
It's hard not to block the reactor and it's often not easy to catch it when you do.
There are finished solutions for background processing, you don't need to code your own.
It's not really maintained any more; just take a look at the last commits and the issue list on GitHub.
There's celluloid and celluloid-io and dcell.
Actually, the Sinatra Synchrony people sum it up well:
This gem should not be considered for a new application. It is better
to use threads with Ruby, rather than EventMachine. It also tends to
break when new releases of ruby come out, and EM itself is not
maintained very well and has some pretty fundamental problems.
I will not be maintaining this gem anymore. If anyone is interested in
maintaining it, feel free to inquire, but I recommend not using
EventMachine or sinatra-synchrony anymore.
Use EM if it fits your workflow. Callbacks can be fine to work with as long as you don't get too crazy. We built a lot of software on top of EM at my last job.
There is pretty good support for third party protocols, just take a look at the protocol implementations page.
As to blocking the reactor, you just need to make sure you don't do heavy work on the main thread, and if you do, make sure it's work that finishes quickly. There are ways to check whether this is working. The simplest is to add a latency check to your code: add a periodic timer that fires every x seconds and logs a message (in development). Printing the time between calls tells you how far the reactor is lagging. The further this time exceeds your x value, the more work you're doing on the main thread.
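The latency check described above could look roughly like this. It is only a sketch: the helper names reactor_lag and install_latency_check are invented here, and the only EventMachine call used is EM.add_periodic_timer, which has to run inside a running reactor (EM.run).

```ruby
# Seconds by which a timer tick arrived later than expected (never negative).
def reactor_lag(previous_tick, now, interval)
  [(now - previous_tick) - interval, 0.0].max
end

# Hypothetical helper: call it inside EM.run { ... } in development.
# It logs how late each periodic tick fired; a growing number means
# something is blocking the reactor thread.
def install_latency_check(interval = 1.0, out = $stderr)
  require "eventmachine" # loaded lazily so reactor_lag works without the gem
  last = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  EM.add_periodic_timer(interval) do
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    out.puts format("reactor lag: %.3fs", reactor_lag(last, now, interval))
    last = now
  end
end
```

With an interval of 1 second, a tick that arrives 1.5 seconds after the previous one means the reactor was held up for about half a second.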
So, I'd say, try it for yourself. Try celluloid, try straight up threads, try EM with EM-Synchrony and fibers.
It really comes down to personal preference.

Ruby and graph DB w/o Jruby

I'm hoping to introduce a graph DB into my project w/o having to move to JRuby. As I see it, given this restriction I've got two options:
Use a graph DB that provides a RESTful interface. I don't know what impact this will have on performance. I'm planning for a crapload of data.
Find a graph DB that has a Ruby interface not requiring JRuby. In my search thus far I've not found anything, but most of the posts and blog entries I've found have been fairly dated. I'd prefer the DB and interface to be somewhat mature and reliable, of course.
Does anyone know of anything that would meet #2 above?
If you're concerned about performance, I'd recommend trying JRuby and neo4j.rb because it interacts directly with the embedded, high performance neo4j-Java-API. Ultimately I think that would be the highest-performance solution.
If you're not willing to entertain JRuby at all, there are options. Neo4j has a REST API and neography is a thin wrapper for it.
Or you use the Neo4j Server - (J)Ruby extension. This is a JRuby Rack application that exposes a REST API. It contains the Neo4J server, so it can be installed and used as a JRuby app, and your stack is Ruby all the way down, even if it is mostly MRI Ruby and the JRuby part is isolated to persistence.

Is communication between two ruby processes possible/easy?

Say I have a Ruby script, Daemon, that, as its name implies, runs as a daemon, monitoring parts of the system and able to perform commands which require authentication (for example, changing permissions). Is there an easy way to have a second Ruby script, say client, communicate with that script and send it commands or ask it for information? I'm looking for a built-in Ruby way of doing this; I'd prefer to avoid building my own server protocol here.
Ruby provides many mechanisms for this, including the standard ones such as sockets, pipes, and shared memory. But Ruby also has a higher-level library specifically for IPC that you can check out: DRb (Distributed Ruby). I haven't had a chance to play around with it too much, but it looks really cool.
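For the daemon/client case in the question, a DRb setup might look something like the sketch below. The MonitorDaemon class, its methods and the start_daemon helper are made up for illustration, and the daemon is forked from the same script only to keep the example self-contained (so it assumes a Unix-like system). Note that DRb does no authentication on its own, so anything that can reach the port can call the daemon's public methods.

```ruby
require "drb"

# Hypothetical daemon front object: its public methods are callable remotely.
class MonitorDaemon
  def ping
    "pong"
  end

  def process_id
    Process.pid
  end
end

# Fork a child that serves a MonitorDaemon over DRb; returns [pid, uri].
def start_daemon
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    trap("TERM") { exit!(0) }
    # Port 0 asks the OS for a free port; DRb.uri reports the one chosen.
    DRb.start_service("druby://127.0.0.1:0", MonitorDaemon.new)
    writer.puts(DRb.uri)
    writer.close
    DRb.thread.join # keep serving until the process is killed
  end
  writer.close
  uri = reader.gets.chomp
  reader.close
  [pid, uri]
end

daemon_pid, uri = start_daemon

# The client side: a proxy whose method calls are sent to the daemon process.
client = DRbObject.new_with_uri(uri)
reply = client.ping
remote_pid = client.process_id

Process.kill("TERM", daemon_pid)
Process.wait(daemon_pid)
```

In a real deployment the daemon and the client would be separate scripts agreeing on a fixed URI, for example a local-only address.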
You may want to look into http://rubyeventmachine.com/

Working with multiple processes in Ruby

Is there a module for Ruby that makes it easy to share objects between multiple processes? I'm looking for something similar to Python's multiprocessing, which supports process-safe queues and pipes that can be shared between processes.
I think you can do a lot of what you want using the facilities of Ruby IO; you're sharing between processes, not threads, correct?
If that's the case, IO.pipe will do what you need. Ruby doesn't have any built-in way of handling cross-process queues (to my knowledge), but you can also use FIFOs (if you're on Unix).
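As a rough sketch of the IO.pipe approach, the helper below forks a child, runs a block in it, and sends the result back to the parent through a pipe using Marshal. The name run_in_child is invented here, the result has to be something Marshal can serialize, and fork is only available on Unix-like platforms.

```ruby
# Run the given block in a forked child process and return its result.
# Hypothetical helper; the result must be serializable with Marshal.
def run_in_child
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    Marshal.dump(yield, writer) # serialize the block's result into the pipe
    writer.close
    exit!(0)                    # skip at_exit handlers inherited from the parent
  end

  writer.close                  # the parent only reads
  result = Marshal.load(reader)
  reader.close
  Process.wait(pid)
  result
end

answer = run_in_child { 6 * 7 }
```

A pipe is a one-way byte stream between two related processes, so for two-way traffic you would create a second pipe in the other direction.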
If you want something more fine-grained, and with good threading support, I'm fairly certain that you can piggyback on java.util.concurrent if you use JRuby. MRI has pretty lousy threading/concurrency support, so if that's what you're aiming for, JRuby is probably a better place to go.
I've run into this library but I haven't tried it yet.
Parallel::ForkManager — A simple parallel processing fork manager.
http://parallelforkmgr.rubyforge.org/
Combining DRb, which provides simple inter-process communication, with Queue or SizedQueue, which are both threadsafe queues, should give you what you need.
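A sketch of the DRb plus Queue combination might look like this: the parent exposes an ordinary Queue over DRb and a forked worker pushes into it from another process. The variable names are illustrative, the worker is forked from the same script only to keep the example self-contained, and it assumes a Unix-like system where fork is available.

```ruby
require "drb"

uri_reader, uri_writer = IO.pipe

# Fork the worker before starting the DRb server, so the child has no
# local server of its own and every call really crosses processes.
worker_pid = fork do
  uri_writer.close
  jobs = DRbObject.new_with_uri(uri_reader.gets.chomp)
  3.times { |i| jobs.push("job-#{i} from pid #{Process.pid}") }
  exit!(0)
end
uri_reader.close

# The parent owns the real queue and serves it over DRb.
queue = Queue.new
DRb.start_service("druby://127.0.0.1:0", queue) # port 0: let the OS pick
uri_writer.puts(DRb.uri)
uri_writer.close

# Queue#pop blocks until the worker has pushed something.
received = Array.new(3) { queue.pop }

Process.wait(worker_pid)
DRb.stop_service
```

Swapping Queue for SizedQueue would make the worker's push block once the queue is full, which gives you simple back-pressure between processes.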
You may also want to check out beanstalkd which is also hosted on github
