Using File#flock as a Ruby global lock (mutex for processes)

I am having concurrency issues between two processes, and after a short search I have seen that a temporary file is the suggested solution to this problem.
So the solution would be to create /tmp/global.lock and use it as a global lock. I found an example of this in the thread Mutex for Rails Processes.
This makes sense to me so far, but I would like to see the best practice for this solution. In particular, I wonder how to check whether a given file is locked:
fh = File.open("/some/file/path", File::CREAT)
begin
  if locked = check_file_locked?
    sleep(1)
  else
    fh.flock(File::LOCK_EX)
    # do what you need to do
  end
ensure
  fh.flock(File::LOCK_UN)
end
This is my understanding of the solution, but I am not sure how to implement the mentioned check_file_locked? method. If there is a better way, I would love to hear it.

@bjhaid's answer can cause a problem: Timeout#timeout can cause an interpreter error in Rubinius. It's also unnecessarily complicated.
Here's a simpler version, using a nonblocking lock instead of timeout:
def locked?(lockfile_name)
  f = File.open(lockfile_name, File::CREAT)
  # flock returns false if already locked, 0 if not
  ret = f.flock(File::LOCK_EX | File::LOCK_NB)
  # unlock for cleanup; this is a noop if the lock wasn't acquired
  f.flock(File::LOCK_UN)
  f.close
  !ret # ret == false means we *couldn't* get a lock, i.e. it was locked
end

When you hold an exclusive lock on a file, attempting to lock it again in Ruby will wait indefinitely until the file is unlocked. You can rely on that and set a timeout on how long Ruby should wait. This might not be the most adequate way, but I would do it as below:
fh = File.open("/some/file/path", File::CREAT)
fh.flock(File::LOCK_EX)
require 'timeout'

def check_file_locked?(file)
  f = File.open(file, File::CREAT)
  Timeout.timeout(0.001) { f.flock(File::LOCK_EX) }
  f.flock(File::LOCK_UN)
  false
rescue
  true
ensure
  f.close
end
f = File.open("/tmp/a.txt", "w+")
f.flock(File::LOCK_EX)
check_file_locked?("/tmp/a.txt") # => true
f.flock(File::LOCK_UN)
check_file_locked?("/tmp/a.txt") # => false

Related

How to stop a udp_server_loop from the outside

I've written a little UDP server in Ruby:
def listen
  puts "Started UDP server on #{@port}..."
  Socket.udp_server_loop(@port) do |message, message_source|
    puts "Got \"#{message}\" from #{message_source}"
    handle_incoming_message(message)
  end
end
I start it in a separate thread:
thread = Thread.new { listen }
Is there a way to gracefully stop the udp_server_loop from outside the thread without just killing it (thread.kill)? I also don't want to stop it from the inside by having it receive a special UDP message. Is udp_server_loop maybe not the right tool for me?
I don’t think you can do this with udp_server_loop (although you might be able to use some of the methods it uses). You are going to have to call IO::select in a loop of your own with some way of signalling it to exit, and some way of waking the thread so you don’t have to send a packet to stop it.
A simple way would be to use the timeout option to select with a variable to set to indicate you want the thread to end, something like:
@halt_loop = false

def listen
  puts "Started UDP server on #{@port}..."
  sockets = Socket.udp_server_sockets(@port)
  loop do
    readable, _, _ = IO.select(sockets, nil, nil, 1) # timeout 1 sec
    break if @halt_loop
    next unless readable # select returns nil on timeout
    Socket.udp_server_recv(readable) do |message, message_source|
      puts "Got \"#{message}\" from #{message_source}"
      handle_incoming_message(message)
    end
  end
end
You then set @halt_loop to true when you want to stop the thread.
The downside to this is that it is effectively polling. If you decrease the timeout then you potentially do more work on an empty loop, and if you increase it you have to wait longer when stopping the thread.
Another, slightly more complex solution would be to use a pipe and have the select listen on it along with the sockets. You could then signal directly to finish the select and exit the thread.
@read, @write = IO.pipe
@halt_loop = false

def listen
  puts "Started UDP server on #{@port}..."
  sockets = Socket.udp_server_sockets(@port)
  sockets << @read
  loop do
    readable, _, _ = IO.select(sockets)
    break if @halt_loop
    readable.delete @read
    Socket.udp_server_recv(readable) do |message, message_source|
      puts "Got \"#{message}\" from #{message_source}"
      handle_incoming_message(message)
    end
  end
end

def end_loop
  @halt_loop = true
  @write.puts "STOP!"
end
To exit the thread you just call end_loop, which sets the @halt_loop flag and then writes to the pipe, making the other end readable and causing the other thread to return from select.
You could have this code check the readable IOs and exit if one of them is the read end of the pipe instead of using the variable, but at least on Linux there is a potential bug where a call to select might return a file descriptor as readable when it actually isn't. I don't know if Ruby deals with this, so better safe than sorry.
Also be sure to remove the pipe from the readable array before passing it to udp_server_recv. It's not a socket, so it will cause an exception if you don't.
A downside to this technique is that pipes are “[n]ot available on all platforms”.
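A hypothetical usage sketch of the pipe-based version above (this assumes listen and end_loop live on the same object, since they share instance variables):

server_thread = Thread.new { listen }
sleep 10           # let the server run for a while (arbitrary)
end_loop           # flag + pipe write; select wakes up immediately
server_thread.join # the thread exits cleanly, no Thread#kill needed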
Although I'm not sure what would be wrong with Thread::kill and/or Thread#exit, you could use a thread-local variable for that:
def listen
  Socket.udp_server_loop(@port) do |message, message_source|
    break :interrupted if Thread.current[:break]
    handle_incoming_message(message)
  end
end
and do
thread[:break] = true
from the outside.

Is $SAFE = 4 and a timed execution limit enough to prevent eval's security vulnerabilities in Ruby?

Here is my current implementation of a safe eval in Ruby:
$mthread = Thread.new {}

class SafeEval
  def self.safeEval(code)
    $killed = false
    $mthread = Thread.new {
      $SAFE = 4
      result = begin
                 eval code
               rescue Exception => e
                 "Error in eval: #{e}"
               end
      Thread.current[:evalResult] = result
    }
    Thread.new {
      sleep 3
      if $mthread.alive?
        $killed = true
        Thread.kill $mthread
      end
    }.join
    $mthread.join
    $killed ? 'Error in eval: Maximum execution time reached' : String($mthread[:evalResult])
  end
end
It uses $SAFE = 4. From my understanding, and from this post I've read, that's not enough to stop security vulnerabilities. However, if I set a maximum execution time, and kill the thread running the code after the time expires, is that enough for a safe eval?
If not, why isn't it safe? Are there still any vulnerabilities? Is there any way to prevent these vulnerabilities as well?
Of course setting an execution time is not secure. All you're doing then is making the execution path of whatever is executed less predictable.
Security is not about saying 'Oh, no untrusted code can cause trouble if it runs for less than 4s'. Security starts with not letting untrusted code execute anywhere outside of a strict sandboxed environment.
Why are you using eval here? What are you trying to accomplish?
edit: I'm an idiot, ignore that; I read that as a timeout, not as a $SAFE level. :P That said, using the exact SafeEval class from the question, this works perfectly well on my local machine:
SafeEval.safeEval("`cat /etc/passwd > /Users/usr/development/source/tests/test.txt`")
Run that code on a web server that has a mail client or another method of connecting to remote servers, and an attacker can enumerate the user accounts on your machine and from there engage in social engineering to recover passwords.
Sandboxing is important because it prevents stuff like the above. $SAFE is not enough in and of itself, and this is one of the reasons you never put something like eval() or anything else whose core job is to execute untrusted code in an environment that could be reached by an attacker.
If you consider 'being able to kill the bot' a security vulnerability, then $SAFE = 4 is not safe enough, as we found out while testing it.
People can execute this without getting the 'unsafe eval' error:
loop { Thread.start { loop{} } }
This starts many threads within the 3 seconds, and after enough executions it will have created lots and lots of threads, which killed the bot while we were testing.
Or this:
Thread.start { loop { Thread.start { loop {} } } }
It starts a thread which keeps generating other threads. The timeout does not stop this.
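For what it's worth, the usual mitigation for both the thread bomb and the timeout problem is to run untrusted code in a separate process with OS-level resource limits, so the kernel rather than the interpreter enforces the budget. A rough sketch of that idea, not a complete sandbox (the helper name and the limits are assumptions, and code that merely sleeps would still need a wall-clock kill on top of this):

def eval_in_child(code)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    Process.setrlimit(:CPU, 3) # the kernel kills the child after ~3s of CPU time
    result = begin
               eval(code).inspect
             rescue Exception => e
               "Error in eval: #{e}"
             end
    writer.write(result[0, 4096]) # cap output size (arbitrary limit)
    exit!
  end
  writer.close
  output = reader.read # EOF once the child exits or is killed
  Process.wait(pid)
  output.empty? ? 'Error in eval: child terminated' : output
end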

Ruby 1.8.7: Forks & Pipes - Troubleshooting

I'm aware that there are great gems like Parallel, but I came up with the class below as an exercise.
It's working fine, but when doing a lot of iterations it sometimes happens that Ruby gets "stuck". When pressing CTRL+C I can see from the backtrace that it's always in lines 38 or 45 (both of the Marshal lines).
Can you see anything that is wrong here? It seems that the pipes are "hanging", so I thought I might be using them in the wrong way.
My goal was to iterate through an array (which I pass as 'objects') with a limited number of forks (max_forks) and to return some values. Additionally I wanted to guarantee that all children get killed when the parent gets killed (even in the case of kill -9), which is why I introduced the "life_line" pipe (I've read here on Stack Overflow that this might do the trick).
class Parallel
  def self.do_fork(max_forks, objects)
    waiter_threads = []
    fork_counter = []
    life_line = {}
    comm_line = {}

    objects.each do |object|
      key = rand(24 ** 24).to_s(36)
      sleep(0.01) while fork_counter.size >= max_forks
      if fork_counter.size < max_forks
        fork_counter << true

        life_line[key] = {}
        life_line[key][:r], life_line[key][:w] = IO.pipe
        comm_line[key] = {}
        comm_line[key][:r], comm_line[key][:w] = IO.pipe

        pid = fork {
          life_line[key][:w].close
          comm_line[key][:r].close
          Thread.new {
            begin
              life_line[key][:r].read
            rescue SignalException, SystemExit => e
              raise e
            rescue Exception => e
              Kernel.exit
            end
          }
          Marshal.dump(yield(object), comm_line[key][:w]) # return yield
        }

        waiter_threads << Thread.new {
          Process.wait(pid)
          comm_line[key][:w].close
          reply = Marshal.load(comm_line[key][:r])
          # process reply here
          comm_line[key][:r].close
          life_line[key][:r].close
          life_line[key][:w].close
          life_line[key] = nil
          fork_counter.pop
        }
      end
    end
    waiter_threads.each { |k| k.join } # wait for all threads to finish
  end
end
The bug was this:
A pipe can buffer only a certain amount of data (e.g. 64 KB).
Once you write more than that without anything reading from the other end, the write blocks forever.
An easy solution is to start reading the pipe in a thread before you start writing to it.
comm_line = IO.pipe

# buffered pipe reading (in case the data is bigger than 64 KB)
reply = ""
read_buffer = Thread.new {
  while !comm_line[0].eof?
    reply = Marshal.load(comm_line[0])
  end
}

child_pid = fork {
  comm_line[0].close
  Marshal.dump("HUGE DATA LARGER THAN 64 KB", comm_line[1])
  comm_line[1].close
}

Process.wait(child_pid)
comm_line[1].close
read_buffer.join
comm_line[0].close

puts reply # outputs the "HUGE DATA"
I don't think the problem is with Marshal. The more obvious one seems to be that your fork may finish execution before the waiter thread gets to it (leading to the latter to wait forever).
Try changing Process.wait(pid) to Process.wait(pid, Process::WNOHANG). The Process::WNOHANG flag instructs Ruby to not hang if there are no children (matching the given PID, if any) available. Note that this may not be available on all platforms but at the very least should work on Linux.
There are a number of other potential problems with your code, but if you just came up with it "as an exercise", they probably don't matter. For example, Marshal.load does not like to encounter EOFs, so I'd probably guard against those by writing something like Marshal.load(comm_line[key][:r]) unless comm_line[key][:r].eof?, or looping until comm_line[key][:r].eof? if you expect several objects to be read.
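To illustrate that last point, a small hedged sketch of an EOF-guarded read loop (reader is assumed to be the read end of a pipe whose write end has already been closed):

replies = []
replies << Marshal.load(reader) until reader.eof?
# replies stays empty if the child wrote nothing, instead of raising EOFError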

Odd bug with DataMapper, Mutexes, and Threads?

I have a database full of URLs that I need to test HTTP response time for on a regular basis. I want to have many worker threads combing the database at all times for a URL that hasn't been tested recently, and if they find one, test it.
Of course, this could cause multiple threads to snag the same URL from the database. I don't want this. So, I'm trying to use Mutexes to prevent this from happening. I realize there are other options at the database level (optimistic locking, pessimistic locking), but I'd at least prefer to figure out why this isn't working.
Take a look at this test code I wrote:
threads = []
mutex = Mutex.new

50.times do |i|
  threads << Thread.new do
    while true do
      url = nil
      mutex.synchronize do
        url = URL.first(:locked_for_testing => false, :times_tested.lt => 150)
        if url
          url.locked_for_testing = true
          url.save
        end
      end
      if url
        # simulate testing the url
        sleep 1
        url.times_tested += 1
        url.save
        mutex.synchronize do
          url.locked_for_testing = false
          url.save
        end
      end
    end
    sleep 1
  end
end

threads.each { |t| t.join }
Of course there is no real URL testing here. But what should happen is at the end of the day, each URL should end up with "times_tested" equal to 150, right?
(I'm basically just trying to make sure the mutexes and worker-thread mentality are working)
But each time I run it, a few odd URLs here and there end up with times_tested equal to a much lower number, say 37, and locked_for_testing frozen on "true".
Now as far as I can tell from my code, if any URL gets locked, it will have to unlock. So I don't understand how some URLs are ending up "frozen" like that.
There are no exceptions and I've tried adding begin/ensure but it didn't do anything.
Any ideas?
I'd use a Queue and a master thread to pull the work. If you have a single master, you control what's getting accessed. This isn't perfect, but it's not going to blow up because of concurrency; remember, if you aren't locking the database, a mutex doesn't really help if something else accesses the db.
Code completely untested:
require 'thread'

queue = Queue.new
keep_running = true
# trap cntrl_c or something to reset keep_running

master = Thread.new do
  while keep_running
    # check if we need some work to do
    if queue.size == 0
      urls = URL.all(:times_tested.lt => 150)
      urls.each do |u|
        queue << u.id
      end
      # keep from spinning the queue
      sleep(0.1)
    end
  end
end

workers = []
50.times do
  workers << Thread.new do
    while keep_running
      # get an id
      id = queue.shift
      url = URL.get(id)
      # do something with the url
      url.save
      sleep(0.1)
    end
  end
end

workers.each do |w|
  w.join
end
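To fill in the "trap cntrl_c" comment above, one minimal sketch is a signal handler that just flips the flag. Only the boolean is touched, since queue operations are not safe inside a trap handler; workers blocked in queue.shift will still wait for one more id before they notice the flag:

Signal.trap("INT") { keep_running = false }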

Limiting concurrent threads

I'm using threads in a program that uploads files over SFTP. The number of files to upload can potentially be very large or very small. I'd like to have 5 or fewer simultaneous uploads, and if there are more, have them wait. My understanding is that usually a condition variable would be used for this, but it looks to me like that would only allow 1 thread at a time.
cv = ConditionVariable.new
t2 = Thread.new {
  mutex.synchronize {
    cv.wait(mutex)
    upload(file)
    cv.signal
  }
}
I think that should tell it to wait for the cv to be available, then release it when done. My question is: how can I do this while allowing more than 1 at a time but still limiting the number?
edit: I'm using Ruby 1.8.7 on Windows from the 1 click installer
Use a ThreadPool instead. See Deadlock in ThreadPool (the accepted answer, specifically).
A word of caution: there is no real concurrency in Ruby unless you are using JRuby. Also, an exception in a worker thread will freeze the main loop (its :done message never arrives) unless you are in debug mode.
require "thread"
POOL_SIZE = 5
items_to_process = (0..100).to_a
message_queue = Queue.new
start_thread =
lambda do
Thread.new(items_to_process.shift) do |i|
puts "Processing #{i}"
message_queue.push(:done)
end
end
items_left = items_to_process.length
[items_left, POOL_SIZE].min.times do
start_thread[]
end
while items_left > 0
message_queue.pop
items_left -= 1
start_thread[] unless items_left < POOL_SIZE
end
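An alternative shape for the same idea, closer to the SFTP use case in the question: a fixed pool of 5 worker threads pulling file names from a shared Queue, with one sentinel per worker for shutdown. This is a hedged sketch; files and upload are assumed from the question:

require "thread"

POOL_SIZE = 5
queue = Queue.new
files.each { |f| queue << f }      # `files` is assumed: the list of uploads
POOL_SIZE.times { queue << :stop } # one shutdown sentinel per worker

workers = Array.new(POOL_SIZE) do
  Thread.new do
    # each worker uploads until it pops its sentinel, so at most
    # POOL_SIZE uploads run at any one time
    while (file = queue.pop) != :stop
      upload(file)                 # `upload` is assumed from the question
    end
  end
end
workers.each { |w| w.join }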
