I'm playing around with parallel computing in Julia for the first time, and I'm having a bit of a headache. Say I start Julia with julia -p 4. I then define a function on all processes and use it with pmap and also with @parallel for.
@everywhere function count_heads(n)
    c::Int = 0
    for i=1:n
        c += rand(Bool)
    end
    n, c # tuple (input, output)
end
###### first part ######
v=pmap(count_heads, 50000:1000:70000)
println("Result first part")
println(v)
###### second part ######
println("Result second part")
@parallel for i in 50000:1000:70000
    println(count_heads(i))
end
The result is the following.
Result first part
Counting heads function
Any[(50000,24894),(51000,25559),(52000,26141),(53000,26546),(54000,27056),(55000,27426),(56000,28024),(57000,28380),(58000,29001),(59000,29398),(60000,30100),(61000,30608),(62000,31001),(63000,31520),(64000,32200),(65000,32357),(66000,33063),(67000,33674),(68000,34085),(69000,34627),(70000,34902)]
Result second part
From worker 4: (61000, From worker 5: (66000, From worker 2: (50000, From worker 3: (56000
Thus, pmap is apparently working fine, but @parallel for stops early or does not give me the results. Am I doing something wrong?
Thanks!
Update
If I put sleep(10) at the end of the code, it does the work correctly:
From worker 5: (66000,33182)
From worker 3: (56000,27955)
............
From worker 3: (56000,27955)
Both of your examples work properly on my laptop, so I'm not sure, but I think this answer might solve your problem!
It should work as expected if you add @sync before the @parallel for.
From the Julia Parallel Computing docs, http://docs.julialang.org/en/release-0.4/manual/parallel-computing/:
... the reduction operator can be omitted if it is not
needed. In that case, the loop executes asynchronously, i.e. it spawns
independent tasks on all available workers and returns an array of
RemoteRef immediately without waiting for completion. The caller can
wait for the RemoteRef completions at a later point by calling fetch()
on them, or wait for completion at the end of the loop by prefixing it
with @sync, like @sync @parallel for.
So the loop probably returns before the workers have finished, and your script reaches its end before their println output has arrived.
My subroutine is called by parallel code. I want to write an if statement that is true only for the first thread that reaches it, and then make that if section inaccessible to the rest of the threads.
I use 50 threads; however, the initialization (first access) is counted 10 times, and the end is sometimes never counted.
The following code is part of a large parallel code:
if(counter.eq.0) then
   write(*,*) 'initialised'
   !x=0
endif
if(counter<totalElement)then
   counter=counter+1
   !equations etc
endif
if(counter.eq.totalElement) then
   write(*,*) 'finished'
   counter=0
endif
I expect the output to be:
initialized
finished
The actual output is:
initialized
initialized
initialized
initialized
initialized
initialized
initialized
initialized
but that is wrong, because more than one thread found counter equal to zero and initialized.
Extra details: the parallelization is over a meshed geometry with a total number of elements "totalElement". In serial code I can simply say if (Element=1) initialize, and if (Element=totalElement) finish.
In parallel loops the first element processed is not necessarily 1; the code can pick any element for the first run, and the last element processed is not necessarily totalElement. That's why I introduced the counter variable, but it is still not working correctly.
When you run your code with multiple threads, each thread does part of the job. Thus, incrementing your counter through counter=counter+1 will never allow the variable counter to reach the value of totalElement. If counter is a shared variable, you are encountering a race condition. If you are dealing with a shared variable, you can protect access to it with the atomic construct:
!$omp atomic update
counter=counter+1
and
!$omp atomic write
counter=0
This should be enough to get rid of thread racing.
If counter is a private variable and your schedule is static, then the variable counter cannot exceed totalElement/number_of_used_threads.
More details about your code are welcome.
I'm using the MATPOWER toolbox for MATLAB with parallel computing, and I have built a computer cluster to run the program shown below:
matlabpool open job1 5 % matlabpool means computer cluster
spmd % the statement from the Parallel Computing Toolbox
    % Run all the statements in parallel
    % first part of code
    if labindex==1
        runopf('casea');
    end
    % second part of code
    if labindex==2
        runopf('caseb');
    end
end
matlabpool close;
When labindex is 1, the first part of the code runs on "computer1" in the cluster; likewise, when labindex is 2, the second part of the code runs on "computer2". My question is: does the main code shown above run in sequence or in parallel?
By which I mean, does the second part of the code have to wait until the first part has been executed, or can the two parts be executed in parallel on the two different computers in the cluster?
The code between spmd and corresponding end is sent to all workers (5 in your case) and they execute these instructions in parallel. Then, in your code you instructed worker #1 to execute runopf('casea'); and worker #2 runopf('caseb');. Workers #3 to #5 will effectively do nothing.
Technically, worker #2 will execute runopf('caseb'); a little later. The delay appears because worker #2 will also check the first if statement (but will not execute the code in it).
I am working on an EventMachine-based application that periodically polls for changes to documents stored in MongoDB.
A simplified code snippet could look like:
require 'rubygems'
require 'eventmachine'
require 'em-mongo'
require 'bson'
EM.run {
  @db = EM::Mongo::Connection.new('localhost').db('foo_development')
  @posts = @db.collection('posts')
  @comments = @db.collection('comments')
  def handle_changed_posts
    EM.next_tick do
      cursor = @posts.find(state: 'changed')
      resp = cursor.defer_as_a
      resp.callback do |documents|
        handle_comments documents.map{|h| h["comment_id"]}.map(&:to_s) unless documents.length == 0
      end
      resp.errback do |err|
        raise *err
      end
    end
  end
  def handle_comments comment_ids
    # iterate over the ids passed in by handle_changed_posts
    comment_ids.each do |id|
      cursor = @comments.find({_id: BSON::ObjectId(id)})
      resp = cursor.defer_as_a
      resp.callback do |documents|
        magic_value = documents.first['weight'].to_i * documents.first['importance'].to_i
      end
      resp.errback do |err|
        raise *err
      end
    end
  end
  EM.add_periodic_timer(1) do
    puts "alive: #{Time.now.to_i}"
  end
  EM.add_periodic_timer(5) do
    handle_changed_posts
  end
}
So every 5 seconds EM iterates over all posts and selects the changed ones. For each changed post it stores the comment_id in an array. When done, that array is passed to handle_comments, which loads every comment and does some calculation.
Now I have some difficulties in understanding:
1. I know that this load_posts->load_comments->calculate cycle takes 3 seconds in a Rails console with 20000 posts, so it will not be much faster in EM. I schedule the handle_changed_posts method every 5 seconds, which is fine unless the number of posts rises and the calculation takes longer than the 5 seconds after which the same run is scheduled again. In that case I'd have a problem soon. How can I avoid that?
2. I trust em-mongo, but I do not trust my EM knowledge. To monitor that EM is still running, I print a timestamp every second. This seems to work fine but gets a bit bumpy every 5 seconds when my calculation runs. Is that a sign that I block the loop?
3. Is there any general way to find out if I block the loop?
4. Should I nice my eventmachine process with -19 to always give it top OS priority?
I have been reluctant to answer here since I've got no mongo experience so far, but considering no one is answering and some of the stuff here is general EM stuff I may be able to help:
1. Schedule the next scan when the first scan ends (resp.callback and resp.errback in handle_changed_posts seem like good candidates for chaining the next scan), either with add_timer or with next_tick.
2. Try handling your mongo trips more often so that they handle smaller chunks of data; any CPU hog inside your reactor would make the reactor loop too busy to accept events such as periodic timer ticks.
3. No simple way, no. One idea would be to measure the difference between Time.now and next_tick{Time.now}, benchmark it, and then trace possible culprits when the difference crosses a threshold. Simulating slow queries (see the question "Simulate slow query in mongodb?") and many parallel connections is a good idea.
4. I honestly don't know. I've never encountered people who do that, and I expect it depends on what else is running on that server.
To expand upon bbozo's answer, specifically in relation to your second question, there is no time when you run code that you do not block the loop. In my experience, when we talk about 'non-blocking' code what we really mean is 'code that doesn't block very long'. Typically, these are very short periods of time (less than a millisecond), but they still block while executing.
Further, the only thing next_tick really does is to say 'do this, but not right now'. What you really want to do, as bbozo mentioned, is split up your processing over multiple ticks such that each iteration blocks for as little time as possible.
To use your own benchmarks, if 20,000 records take about 3 seconds to process, 4,000 records should take about 0.6 seconds. This would be short enough to not usually affect your 1 second heartbeat. You could split it up even further to reduce the amount of blockage and make the reactor run smoother, but it really depends on how much concurrency you need from the reactor.
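As a rough illustration of the chunking idea, here is a minimal sketch in plain Ruby that does not use EventMachine at all. The method name process_in_chunks and the chunk size of 4,000 are assumptions made for the example; the loop only simulates reactor ticks, and in a real reactor each chunk would be scheduled with EM.next_tick instead.

```ruby
# Hypothetical helper: splits the records into chunks and handles one chunk
# per simulated "tick". Returns [records processed, ticks used].
def process_in_chunks(records, chunk_size)
  pending = records.each_slice(chunk_size).to_a
  processed = 0
  ticks = 0
  until pending.empty?
    chunk = pending.shift
    processed += chunk.size # placeholder for the real per-record calculation
    ticks += 1              # in EventMachine, timers could fire between ticks
  end
  [processed, ticks]
end

records = (1..20_000).to_a # stand-in for the 20,000 changed posts
processed, ticks = process_in_chunks(records, 4_000)
puts "processed #{processed} records in #{ticks} ticks"
```

With five chunks, each blocking period would be roughly a fifth of the full run, which is the point of splitting the work.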
I am getting into Ruby and have been using threads for a little while now without fully understanding them. I notice that when I add a thread to an array with a sleep() call as its first command, the thread does not run until I do a join, which is mostly what I want. So I have 2 questions.
1. Is that supposed to happen?
2. Is there a better way to do that other than the way I'm doing it? Here is some sample code to show what I'm talking about.
job = Array.new
10.times do |n|
  job << Thread.new do
    sleep 0.001
    puts "done #{n}"
  end
end
#job.each do |t|
#  t.join
#end
puts "End of script"
Output is
End of script
If I remove the comments output is
done 1
done 0
done 7
done 6
done 5
done 4
done 3
done 2
done 9
done 8
End of script
So I use this now but I don't understand why it does that. Sometimes I notice even doing something like `echo hi` instead of sleep does the trick.
Thanks in advance.
Timing of threads isn't a defined behavior. Once you put them to sleep, they will be put in a queue to be run later. You can't ever expect it to run one way or another.
Your main program doesn't take very long to run, so it is likely to finish before your other threads get picked back up to run again. Really, when you think about it, 0.001 seconds is quite a long time to a computer, so spinning off 10 threads in that time is likely to happen. But even if it takes longer, there is no guarantee the thread will resume immediately after .001 seconds. Often there's really no guarantee it won't start before .001 seconds, either, but sleep calls usually don't end early.
When you add the join calls, the main thread waits for each thread to finish before it continues. Without them, the script reaches its end and Ruby terminates any threads that are still sleeping when the main thread exits, so this behavior is expected.
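A minimal sketch of the join pattern, with the threads handing their results back instead of printing from inside the thread. The method name run_jobs is made up for the example; Thread#join and Queue are standard Ruby.

```ruby
# Hypothetical helper: starts `count` threads and waits for all of them.
def run_jobs(count)
  results = Queue.new # thread-safe container for the threads' output
  threads = Array.new(count) do |n|
    Thread.new do
      sleep 0.001
      results << n
    end
  end
  threads.each(&:join) # block until every thread has finished
  Array.new(results.size) { results.pop }
end

finished = run_jobs(10)
puts "done: #{finished.sort.inspect}"
puts "End of script"
```

The order in which the threads finish is still not defined, which is why the example sorts the results before printing them.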
I need to perform a long-running operation in Ruby/Rails asynchronously.
Googling around, one of the options I found is Sidekiq.
class WeeklyReportWorker
  include Sidekiq::Worker
  def perform(user, product, year = Time.now.year, week = Date.today.cweek)
    report = WeeklyReport.build(user, product, year, week)
    report.save
  end
end
# call WeeklyReportWorker.perform_async('user', 'product')
Everything works great! But there is a problem.
If I keep calling this async method every few seconds while the heavy operation itself takes about a minute, things won't work.
Let me give an example.
5.times { WeeklyReportWorker.perform_async('user', 'product') }
Now my heavy operation will be performed 5 times. Ideally it should be performed only once or twice, depending on whether execution of the first operation started before the 5th async call was made.
Do you have any tips on how to solve this?
Here's a naive approach. I'm a resque user, maybe sidekiq has something better to offer.
def perform(user, product, year = Time.now.year, week = Date.today.cweek)
  # first, make a name for lock key. For example, include all arguments
  # there, so that another perform with the same arguments won't do any work
  # while the first one is still running
  lock_key_name = make_lock_key_name(user, product, year, week)
  Sidekiq.redis do |redis| # sidekiq uses redis, let us leverage that
    # protection from race condition. Since incr is atomic, the very first
    # one will set value to 1. All subsequent incrs will return greater values.
    # if incr returned not 1, then another copy of this operation is already
    # running, so we quit without touching the lock.
    return if redis.incr(lock_key_name) != 1
    begin
      # finally, perform your business logic here
      report = WeeklyReport.build(user, product, year, week)
      report.save
    ensure
      # drop lock key, so that operation may run again. This runs only in the
      # copy that acquired the lock, so a skipped copy cannot release it early.
      redis.del lock_key_name
    end
  end
end
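The same lock pattern can be tried out without Redis. The sketch below is only an in-memory stand-in: the InMemoryLock class and the run_once method are hypothetical names invented for the example, and a Mutex-protected Hash plays the role of Redis's atomic INCR and DEL. It is not Sidekiq's or Redis's API.

```ruby
# Hypothetical in-memory stand-in for the Redis counter used as a lock.
class InMemoryLock
  def initialize
    @counts = Hash.new(0)
    @mutex = Mutex.new
  end

  # Mimics an atomic INCR: returns the value after incrementing.
  def incr(key)
    @mutex.synchronize { @counts[key] += 1 }
  end

  # Mimics DEL: drops the key so the operation may run again.
  def del(key)
    @mutex.synchronize { @counts.delete(key) }
  end
end

# Runs the block only if no other copy currently holds the lock for `key`.
def run_once(lock, key)
  return :skipped unless lock.incr(key) == 1
  begin
    yield
    :ran
  ensure
    lock.del(key) # only the copy that acquired the lock releases it
  end
end

lock = InMemoryLock.new
outer = run_once(lock, "report:user:product") do
  # A second call arriving while the first is still running is skipped.
  inner = run_once(lock, "report:user:product") { puts "not reached" }
  puts "inner call: #{inner}"
end
puts "outer call: #{outer}"
```

Releasing the lock only in the copy that acquired it matters: if a skipped copy also deleted the key, a third call could start while the first was still running.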
I am not sure I understood your scenario well, but how about looking at this gem:
https://github.com/collectiveidea/delayed_job
So instead of doing:
5.times { WeeklyReportWorker.perform_async('user', 'product') }
You can do:
5.times { WeeklyReportWorker.delay.perform('user', 'product') }
Out of the box, this will make the worker process the second job after the first job, but only if you use the default settings (because by default the worker process is only one).
The gem offers possibilities to:
Put jobs on a queue;
Have different queues for different jobs if that is required;
Have more than one worker to process a queue (for example, you can start 4 workers on a 4-CPU machine for higher efficiency);
Schedule jobs to run at exact times, or after set amount of time after queueing the job. (Or, by default, schedule for immediate background execution).
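To illustrate why a single worker processes queued jobs one after another, here is a small sketch built on Ruby's standard Queue and one worker thread. It is not delayed_job's API; the method name drain_with_single_worker and the :stop marker are assumptions made for the example.

```ruby
# Hypothetical helper: one worker thread drains the queue in order.
def drain_with_single_worker(jobs)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue << :stop # marker telling the worker that no more jobs will arrive

  log = []
  worker = Thread.new do
    while (job = queue.pop) != :stop
      log << "start #{job}"
      log << "finish #{job}" # a job finishes before the next one starts
    end
  end
  worker.join
  log
end

puts drain_with_single_worker(%w[report-1 report-2 report-3])
```

Note that this serializes the jobs but does not remove duplicates; all queued jobs still run, just never at the same time.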
I hope it helps you as much as it helped me.