I was wondering whether it's easy to make a race condition using MRI ruby(2.0.0) and some global variables, but as it turns out it's not that easy. It looks like it should fail at some point, but it doesn't and I've been running it for 10 minutes. This is the code I've been trying to achieve it:
def inc(*)
a = $x
a += 1
a *= 3000
a /= 3000
$x = a
end
THREADS = 10
COUNT = 5000
loop do
$x = 1
THREADS.times.map do Thread.new { COUNT.times(&method(:inc)) } end.each(&:join)
break puts "woo hoo!" if $x != THREADS * COUNT + 1
end
puts $x
Why am I not able to generate (or detect) the expected race condition, and get the output woo hoo! in Ruby MRI 2.0.0?
Your example does (almost instantly) work in 1.8.7.
The following variation does the trick for 1.9.3+:
def inc
a = $x + 1
# Just one microsecond
sleep 0.000001
$x = a
end
THREADS = 10
COUNT = 50
loop do
$x = 1
THREADS.times.map { Thread.new { COUNT.times { inc } } }.each(&:join)
break puts "woo hoo!" if $x != THREADS * COUNT + 1
puts "No problem this time."
end
puts $x
The sleep command is a strong hint to the interpreter that it can schedule another thread, so this is not a huge surprise.
Note if you replace the sleep with something that takes just as long or longer, e.g. b = a; 500.times { b *= 100 }, then there is no race condition detected in the above code. But take it further with b = a; 2500.times { b *= 100 }, or increase COUNT from 50 to 500, and the race condition is more reliably triggered.
The thread scheduling in Ruby 1.9.3 onwards (of course including 2.0.0) appears to be assigning CPU time in larger chunks than in 1.8.7. Opportunities to switch threads can be low in simple code, unless some kind of I/O waiting is involved.
It is even possible that the threads in the OP, each of which is performing just a few thousand calculations, are in essence occurring in series - although increasing the COUNT global to avoid this still does not trigger additional race conditions.
Generally MRI Ruby does not switch context between threads during atomic processes (e.g. during a Fixnum multiply or division) that occur within its C implementation. This means that the only opportunities for a thread context switch where all methods are calls to Ruby internals without I/O waiting, are "in-between" each line of code. In the original example, there are only 4 such fleeting opportunities, and it seems that in the scheme of things that this is not very much at all for MRI 1.9.3+ (in fact, see update below, these opportunities probably have been removed by Ruby)
When I/O waits or sleep are involved, it actually gets more complex, as Ruby MRI (1.9+) will allow a little bit of true parallel processing on multi-core CPUs. Although this is not the direct cause of race conditions with threads, it is more likely to result in them, as Ruby will usually make a thread context switch at the same time to take advantage of the parallelism.
Whilst I was researching this rough answer, I found an interesting link: Nobody understands the GIL (part 2 linked, as more relevant to this question)
Update: I suspect that the interpretter is optimising away some potential thread-switching points
in the Ruby source. Starting with my sleep version of the code, and setting:
COUNT = 500000
the following variation of inc does not seem to have a race condition affecting $x:
def inc
a = $x + 1
b = 0
b += 1
$x = a
end
However, these minor changes both trigger a race condition:
def inc
a = $x + 1
b = 0
b = b.send( :+, 1 )
$x = a
end
def inc
a = $x + 1
b = 0
b += '1'.to_i
$x = a
end
My interpretation is that the Ruby parser has optimised b += 1 to remove some of the
overhead of method despatch. One of the optimised-away steps is likely to include
the check for a possible switch to a waiting thread.
If that is the case, then the code in the question may never have the opportunity to switch threads within the inc method, because all the operations inside it can be optimised
in the same way.
Related
I have a Ruby Discord (discordrb) bot written to manage D&D characters. I notice when multiple players submit the same command, at the same time, the results they each receive are not independent. The request of one player (assigning a weapon to their character) ends up being assigned to other characters who submitted the same request at the same time. I expected each request to be executed separately, in sequence. How do I prevent crossing requests?
bot.message(contains:"$Wset") do |event|
inputStr = event.content; # this should contain "$Wset#" where # is a single digit
check_user_or_nick(event); pIndex = nil; #fetch the value of #user & set pIndex
(0..(#player.length-1)).each do |y| #find the #player pIndex within the array using 5 char of #user
if (#player[y][0].index(#user.slice(0,5)) == 0) then pIndex = y; end; #finds player Index Value (integer or nil)
end;
weaponInt = Integer(inputStr.slice(5,1)) rescue false; #will detect integer or non integer input
if (pIndex != nil) && (weaponInt != false) then;
if weaponInt < 6 then;
#player[pIndex][1]=weaponInt;
say = #player[pIndex][0].to_s + " weapon damage has be set to " + #weapon[(#player[pIndex][1])].to_s;
else;
say = "Sorry, $Wset requires this format: $Wset? where ? is a single number ( 0 to 5 )";
end;
else
say = "Sorry, $Wset requires this format: $Wset? where ? is a single number ( 0 to 5 )";
end;
event.respond say;
end;
To avoid race conditions in multithreaded code like this, the main thing you want to look for are side effects.
Think about the bot.message(contains:"$Wset") do |event| block as a mini program or a thread. Everything in here should be self contained - there should be no way for it to effect any other threads.
Looking through your code initially, what I'm searching for are any shared variables. These produce a race condition if they are read/written by multiple threads at the same time.
In this case, there are 2 obvious offenders - #player and #user. These should be refactored to local variables rather than instance variables. Define them within the block so they don't affect any other scope, for example:
# Note, for this to work, you will have to change
# the method definition to return [player, user]
player, user = check_user_or_nick(event)
Sometimes, making side effects from threads is unavoidable (say you wanted to make a counter for how many times the thread was run). To prevent race conditions in these scenarios, a Mutex is generally the solution but also sometimes a distributed lock if the code is being run on multiple machines. However, from the code you've shown, it doesn't look like you need either of these things here.
Before anything, I have read all the answers of Why doesn't Ruby support i++ or i—? and understood why. Please note that this is not just another discussion topic about whether to have it or not.
What I'm really after is a more elegant solution for the situation that made me wonder and research about ++/-- in Ruby. I've looked up loops, each, each_with_index and things alike but I couldn't find a better solution for this specific situation.
Less talk, more code:
# Does the first request to Zendesk API, fetching *first page* of results
all_tickets = zd_client.tickets.incremental_export(1384974614)
# Initialises counter variable (please don't kill me for this, still learning! :D )
counter = 1
# Loops result pages
loop do
# Loops each ticket on the paged result
all_tickets.all do |ticket, page_number|
# For debug purposes only, I want to see an incremental by each ticket
p "#{counter} P#{page_number} #{ticket.id} - #{ticket.created_at} | #{ticket.subject}"
counter += 1
end
# Fetches next page, if any
all_tickets.next unless all_tickets.last_page?
# Breaks outer loop if last_page?
break if all_tickets.last_page?
end
For now, I need counter for debug purposes only - it's not a big deal at all - but my curiosity typed this question itself: is there a better (more beautiful, more elegant) solution for this? Having a whole line just for counter += 1 seems pretty dull. Just as an example, having "#{counter++}" when printing the string would be much simpler (for readability sake, at least).
I can't simply use .each's index because it's a nested loop, and it would reset at each page (outer loop).
Any thoughts?
BTW: This question has nothing to do with Zendesk API whatsoever. I've just used it to better illustrate my situation.
To me, counter += 1 is a fine way to express incrementing the counter.
You can start your counter at 0 and then get the effect you wanted by writing:
p "#{counter += 1} ..."
But I generally wouldn't recommend this because people do not expect side effects like changing a variable to happen inside string interpolation.
If you are looking for something more elegant, you should make an Enumerator that returns integers one at a time, each time you call next on the enumerator.
nums = Enumerator.new do |y|
c = 0
y << (c += 1) while true
end
nums.next # => 1
nums.next # => 2
nums.next # => 3
Instead of using Enumerator.new in the code above, you could just write:
nums = 1.upto(Float::INFINITY)
As mentioned by B Seven each_with_index will work, but you can keep the page_number, as long all_tickets is a container of tuples as it must be to be working right now.
all_tickets.each_with_index do |ticket, page_number, i|
#stuff
end
Where i is the index. If you have more than ticket and page_number inside each element of all_tickets you continue putting them, just remember that the index is the extra one and shall stay in the end.
Could be I oversimplified your example but you could calculate a counter from your inner and outer range like this.
all_tickets = *(1..10)
inner_limit = all_tickets.size
outer_limit = 5000
1.upto(outer_limit) do |outer_counter|
all_tickets.each_with_index do |ticket, inner_counter|
p [(outer_counter*inner_limit)+inner_counter, outer_counter, inner_counter, ticket]
end
# some conditional to break out, in your case the last_page? method
break if outer_counter > 3
end
all_tickets.each_with_index(1) do |ticket, i|
I'm not sure where page_number is coming from...
See Ruby Docs.
I have implemented the following statistical computation in perl http://en.wikipedia.org/wiki/Fisher_information.
The results are correct. I know this because I have 100's of test cases that match input and output. The problem is that I need to compute this many times every single time I run the script. The average number of calls to this function is around 530. I used Devel::NYTProf to find out this out as well as where the slow parts are. I have optimized the algorithm to only traverse the top half of the matrix and reflect it onto the bottom as they are the same. I'm not a perl expert, but I need to know if there is anything I can try to speed up the perl. This script is distributed to clients so compiling a C file is not an option. Is there another perl library I can try? This needs to be sub second in speed if possible.
More information is $MatrixRef is a matrix of floating point numbers that is $rows by $variables. Here is the NYTProf dump for the function.
#-----------------------------------------------
#
#-----------------------------------------------
sub ComputeXpX
# spent 4.27s within ComputeXpX which was called 526 times, avg 8.13ms/call:
# 526 times (4.27s+0s) by ComputeEfficiency at line 7121, avg 8.13ms/call
{
526 0s my ($MatrixRef, $rows, $variables) = #_;
526 0s my $r = 0;
526 0s my $c = 0;
526 0s my $k = 0;
526 0s my $sum = 0;
526 0s my #xpx = ();
526 11.0ms for ($r = 0; $r < $variables; $r++)
{
14202 19.0ms my #temp = (0) x $variables;
14202 6.01ms push(#xpx, \#temp);
526 0s }
526 7.01ms for ($r = 0; $r < $variables; $r++)
{
14202 144ms for ($c = $r; $c < $variables; $c++)
{
198828 43.0ms $sum = 0;
#for ($k = 0; $k < $rows; $k++)
198828 101ms foreach my $RowRef (#{$MatrixRef})
{
#$sum += $MatrixRef->[$k]->[$r]*$MatrixRef->[$k]->[$c];
6362496 3.77s $sum += $RowRef->[$r]*$RowRef->[$c];
}
198828 80.1ms $xpx[$r]->[$c] = $sum;
#reflect on other side of matrix
198828 82.1ms $xpx[$c]->[$r] = $sum if ($r != $c);
14202 1.00ms }
526 2.00ms }
526 2.00ms return \#xpx;
}
Since each element of the result matrix can be calculated independently, it should be possible to calculate some/all of them in parallel. In other words, none of the instances of the innermost loop depend on the results of any other, so they could run simultaneously on their own threads.
There really isn't much you can do here, without rewriting parts in C, or moving to a better framework for mathematic operations than bare-bone Perl (→ PDL!).
Some minor optimization ideas:
You initialize #xpx with arrayrefs containing zeros. This is unneccessary, as you assign a value to every position either way. If you want to pre-allocate array space, assign to the $#array value:
my #array;
$#array = 100; # preallocate space for 101 scalars
This isn't generally useful, but you can benchmark with and without.
Iterate over ranges; don't use C-style for loops:
for my $c ($r .. $variables - 1) { ... }
Perl scalars aren't very fast for math operations, so offloading the range iteration to lower levels will gain a speedup.
Experiment with changing the order of the loops, and toy around with caching a level of array accesses. Keeping $my $xpx_r = $xpx[$r] around in a scalar will reduce the number of array accesses. If your input is large enough, this translates into a speed gain. Note that this only works when the cached value is a reference.
Remember that perl does very few “big” optimizations, and that the opcode tree produced by compilation closely resembles your source code.
Edit: On threading
Perl threads are heavyweight beasts that literally clone the current interpreter. It is very much like forking.
Sharing data structures across thread boundaries is possible (use threads::shared; my $variable :shared = "foo") but there are various pitfalls. It is cleaner to pass data around in a Thread::Queue.
Splitting the calculation of one product over multiple threads could end up with your threads doing more communication than calculation. You could benchmark a solution that divides responsibility for certain rows between the threads. But I think recombining the solutions efficiently would be difficult here.
More likely to be useful is to have a bunch of worker threads running from the beginning. All threads listen to a queue which contains a pair of a matrix and a return queue. The worker would then dequeue a problem, and send back the solution. Multiple calculations could be run in parallel, but a single matrix multiplication will be slower. Your other code would have to be refactored significantly to take advantage of the parallelism.
Untested code:
use strict; use warnings; use threads; use Thread::Queue;
# spawn worker threads:
my $problem_queue = Thread::Queue->new;
my #threads = map threads->new(\&worker, $problem_queue), 1..3; # make 3 workers
# automatically close threads when program exits
END {
$problem_queue->enqueue((undef) x #threads);
$_->join for #threads;
}
# This is the wrapper around the threading,
# and can be called exactly as ComputeXpX
sub async_XpX {
my $return_queue = Thread::Queue->new();
$problem_queue->enqueue([$return_queue, #_]);
return sub { $return_queue->dequeue };
}
# The main loop of worker threads
sub worker {
my ($queue) = #_;
while(defined(my $problem = $queue->dequeue)) {
my ($return, #args) = #$problem;
$return->enqueue(ComputeXpX(#args));
}
}
sub ComputeXpX { ... } # as before
The async_XpX returns a coderef that will eventually collect the result of the computation. This allows us to carry on with other stuff until we need the result.
# start two calculations
my $future1 = async_XpX(...);
my $future2 = async_XpX(...);
...; # do something else
# collect the results
my $result1 = $future1->();
my $result2 = $future2->();
I benchmarked the bare-bones threading code without doing actual calculations, and the communication is about as expensive as the calculations. I.e. with a bit of luck, you may start to get a benefit on a machine with at least four processors/kernel threads.
A note on profiling threaded code: I know of no way to do that elegantly. Benchmarking threaded code, but profiling with single-threaded test cases may be preferable.
I have the following code (from a Ruby tutorial):
require 'thread'
count1 = count2 = 0
difference = 0
counter = Thread.new do
loop do
count1 += 1
count2 += 1
end
end
spy = Thread.new do
loop do
difference += (count1 - count2).abs
end
end
sleep 1
puts "count1 : #{count1}"
puts "count2 : #{count2}"
puts "difference : #{difference}"
counter.join(2)
spy.join(2)
puts "count1 : #{count1}"
puts "count2 : #{count2}"
puts "difference : #{difference}"
It's an example for using Mutex.synchronize. On my computer, the results are quite different from the tutorial. After calling join, the counts are sometimes equal:
count1 : 5321211
count2 : 6812638
difference : 0
count1 : 27307724
count2 : 27307724
difference : 0
and sometimes not:
count1 : 4456390
count2 : 5981589
difference : 0
count1 : 25887977
count2 : 28204117
difference : 0
I don't understand how it is possible that the difference is still 0 even though the counts show very different numbers.
The add operation probably looks like this:
val = fetch_current(count1)
add 1 to val
store val back into count1
and something similar for count2. Ruby can switch execution between threads, so it might not finish writing to a variable, but when the CPU gets back to the thread, it should continue from the line where it was interrupted, right?
And there is still just one thread that is writing into the variable. How is it possible that, inside the loop do block, count2 += 1 is executed much more times?
Execution of
puts "count1 : #{count1}"
takes some time (although it may be short). It is not done in an instance. Therefore, it is not mysterious that the two consecutive lines:
puts "count1 : #{count1}"
puts "count2 : #{count2}"
are showing different counts. Simply, the counter thread went though some loop cycles and incremented the counts while the first puts was executed.
Similarly, when
difference += (count1 - count2).abs
is calculated, the counts may in principle increment while count1 is referenced before count2 is referenced. But there is no command executed within that time span, and I guess that the time it takes to refer to count1 is much shorter than the time it takes for the counter thread to go through another loop. Note that the operations done in the former is a proper subset of what is done in the latter. If the difference is significant enough, which means that counter thread had not gone through a loop cycle during the argument call for the - method, then count1 and count2 will appear as the same value.
A prediction will be that, if you put some expensive calculation after referencing count1 but before referencing count2, then difference will show up:
difference += (count1.tap{some_expensive_calculation} - count2).abs
# => larger `difference`
Here's the answer. I think you've made the assumption that the threads stop execution after join(2) returns.
This is not the case! The threads continue to run even though join(2) returns execution (temporarily) back to the main thread.
If you change your code to this you will see what happens:
...
counter.join(2)
spy.join(2)
counter.kill
spy.kill
puts "count1 : #{count1}"
puts "count2 : #{count2}"
puts "difference : #{difference}"
This seems to work a bit differently in ruby 1.8, where the threads do not seem to get a chance to run while the main thread is executing.
The tutorial is probably written for ruby 1.8, but the threading model has been changed since then in 1.9.
In fact it was pure "luck" that it worked in 1.8, as the threads do not finish execution when join(2) returns in neither 1.8 nor 1.9.
Ok here's a basic for loop
local a = {"first","second","third","fourth"}
for i=1,#a do
print(i.."th iteration")
a = {"first"}
end
As it is now, the loop executes all 4 iterations.
Shouldn't the for-loop-limit be calculated on the go? If it is calculated dynamically, #a would be 1 at the end of the first iteration and the for loop would break....
Surely that would make more sense?
Or is there any particular reason as to why that is not the case?
The main reason why numerical for loops limits are computed only once is most certainly for performance.
With the current behavior, you can place arbitrary complex expressions in for loops limits without a performance penalty, including function calls. For example:
local prod = 1
for i = computeStartLoop(), computeEndLoop(), computeStep() do
prod = prod * i
end
The above code would be really slow if computeEndLoop and computeStep required to be called at each iteration.
If the standard Lua interpreter and most notably LuaJIT are so fast compared to other scripting languages, it is because a number of Lua features have been designed with performance in mind.
In the rare cases where the single evaluation behavior is undesirable, it is easy to replace the for loop with a generic loop using while end or repeat until.
local prod = 1
local i = computeStartLoop()
while i <= computeEndLoop() do
prod = prod * i
i = i + computeStep()
end
The length is computed once, at the time the for loop is initialized. It is not re-computed each time through the loop - a for loop is for iterating from a starting value to an ending value. If you want the 'loop' to terminate early if the array is re-assigned to, you could write your own looping code:
local a = {"first", "second", "third", "fourth"}
function process_array (fn)
local inner_fn
inner_fn =
function (ii)
if ii <= #a then
fn(ii,a)
inner_fn(1 + ii)
end
end
inner_fn(1, a)
end
process_array(function (ii)
print(ii.."th iteration: "..a[ii])
a = {"first"}
end)
Performance is a good answer but I think it also makes the code easier to understand and less error-prone. Also, that way you can (almost) be sure that a for loop always terminates.
Think about what would happen if you wrote that instead:
local a = {"first","second","third","fourth"}
for i=1,#a do
print(i.."th iteration")
if i > 1 then a = {"first"} end
end
How do you understand for i=1,#a? Is it an equality comparison (stop when i==#a) or an inequality comparison (stop when i>=#a). What would be the result in each case?
You should see the Lua for loop as iteration over a sequence, like the Python idiom using (x)range:
a = ["first", "second", "third", "fourth"]
for i in range(1,len(a)+1):
print(str(i) + "th iteration")
a = ["first"]
If you want to evaluate the condition every time you just use while:
local a = {"first","second","third","fourth"}
local i = 1
while i <= #a do
print(i.."th iteration")
a = {"first"}
i = i + 1
end