In my experience Matlab performs publish subscribe operations with ROS slow for some reason. I work with components as defined in an object class as shown below, where I made a test-class. Normally objects of comparable structure are used to control mobile robots.
To quantify performance tested required time for an operation and got the following results:
1x publishing a message + 1x simple subscriber callback : 3.7ms
Simply counting in a callback (per count): 2.1318e-03 ms
Creating a new message with msg1 = rosmessage(obj.publisher) adds 3.6-4.3ms per iteration
Pinging myself indicated communication latency of 0.05 ms
The times required for a simple publish + start of a subscribe callback seems oddly slow.
I want to have multiple system components as objects in my workspace such that they respond to ROS topic updates or on timer events. The pc used for testing is not a monster but should not be garbage either.
Do you also think the shown time requirements are unneccesary large? this allows barely to publish a single topic at 200hz without doing anything else. Normally I have multiple lower frequency topics (e.g.20hz) but the total consumed time becomes significant.
Do you know any practices to make the system operate quicker?
What do you think of the OOP style of making control system components in general?
classdef subpubspeedMonitor < handle
% Use: call in matlab console, after initializing ros:
%
% SPM1 = subpubspeedMonitor()
%
% This will create an object which starts a set repetitive task upon creation
% and finally destructs itself after posting results in console.
properties
node
subscriber
publisher
timestart
messagetotal
end
methods
function obj = subpubspeedMonitor()
obj.node = ros.Node('subspeedmonitor1');
obj.subscriber = ros.Subscriber(obj.node,'topic1','sensor_msgs/NavSatFix',{#obj.rosSubCallback});
obj.publisher = ros.Publisher(obj.node,'topic1','sensor_msgs/NavSatFix');
obj.timestart = tic;
obj.messagetotal = 0;
msg1 = rosmessage(obj.publisher);
% Choose to evaluate subscriber + publisher loop or just counting
if 1
send(obj.publisher,msg1);
else
countAndDisplay(obj)
end
end
%% Test method one: repetitive publishing and subscribing
function rosSubCallback(obj,~,msg_) % ~3.7 ms per loop for a simple publish+subscribe action
% Latency to self is 0.05ms on average, according to "pinging" in terminal
obj.messagetotal = obj.messagetotal+1;
if obj.messagetotal <10000
%msg1 = rosmessage(obj.publisher); % this line adds 4.3000ms per loop
msg_.Longitude = 51; % this line adds 0.25000 ms per loop
send(obj.publisher,msg_)
else
% Display some results
timepassed = toc(obj.timestart);
time_per_pubsub = timepassed/obj.messagetotal
delete(obj);
end
end
%% Test method two: simply counting
function countAndDisplay(obj) % this costs 2.1318e-03 ms(!) per loop
obj.messagetotal = obj.messagetotal+1;
if obj.messagetotal <10000
%msg1 = rosmessage(obj.publisher); %adds 3.6ms per loop
%i = 1% adds 5.7532e-03 ms per loop
%msg1 = rosmessage("std_msgs/Bool"); %adds 1.5ms per loop
countAndDisplay(obj);
else
% Display some results
timepassed = toc(obj.timestart);
time_per_count_FCN = timepassed/obj.messagetotal
delete(obj);
end
end
%% Deconstructor
function delete(obj)
delete(obj.subscriber)
delete(obj.publisher)
delete(obj.node)
end
end
end
My project involves computing in parallel a map using Julia's Distributed's pmap function.
Mapping a given element could take a few seconds, or it could take essentially forever. I want a timeout or time limit for an individual map task/computation to complete.
If a map task finishes in time, great, return the result of the computation. If the task doesn't complete by the time limit, stop computation when the time limit has been reached, and return some value or message indicating a timeout occurred.
A minimal example follows. First are imported modules, and then worker processes are launched:
num_procs = 1
using Distributed
if num_procs > 1
# The main process (no calling addprocs) can be used for `pmap`:
addprocs(num_procs-1)
end
Next, the mapping task is defined for all the worker processes. The mapping task should timeout after 1 second:
#everywhere import Random
#everywhere begin
"""
Compute stuff for `wait_time` seconds, and return `wait_time`.
If `timeout` seconds elapses, stop computation and return something else.
"""
function waitForTimeUnlessTimeout(wait_time, timeout=1)
# < Insert some sort of timeout code? >
# This block of code simulates a long computation.
# (pretend the computation time is unknown)
x = 0
while time()-t0 < wait_time
x += Random.rand() - 0.5
end
# computation completed before time limit. Return wait_time.
round(wait_time, digits=2)
end
end
The function that executes the parallel map (pmap) is defined on the main process. Each map task randomly takes up to 2 seconds to complete, but should time out after 1 second.
function myParallelMapping(num_tasks = 20, max_runtime=2)
# random task runtimes between 0 and max_runtime
runtimes = Random.rand(num_tasks) * max_runtime
# return the parallel computation of the mapping tasks
pmap((runtime)->waitForTimeUnlessTimeout(runtime), runtimes)
end
print(myParallelMapping())
How should this time-limited parallel map be implemented?
You could put something like this inside your pmap body
pmap(runtimes) do runtime
t0 = time()
task = #async waitForTimeUnlessTimeout(runtime)
while !istaskdone(task) && time()-t0 < time_limit
sleep(1)
end
istaskdone(task) && (return fetch(task))
error("time over")
end
Also note that (runtime)->waitForTimeUnlessTimeout(runtime) is the same as just waitForTimeUnlessTimeout .
Following #Fredrik Bagge's very helpful answer, here is the full working example implementation with some extra explanation.
num_procs = 8
using Distributed
if num_procs > 1
addprocs(num_procs-1)
end
#everywhere import Random
#everywhere begin
function waitForTime(wait_time)
# This code block simulates a long computation.
# Pretend the computation time is unknown.
t0 = time()
x = 0
while time()-t0 < wait_time
x += Random.rand() - 0.5
yield() # CRITICAL to release computation to check if task is done.
# If you comment out #yield(), you will see timeout doesn't work!
end
return round(wait_time, digits=2)
end
end
function myParallelMapping(num_tasks = 16, max_runtime=2, time_limit=1)
# random task runtimes between 0 and max_runtime
runtimes = Random.rand(num_tasks) * max_runtime
# parallel compute the mapping tasks. See "do block" in
# the Julia documentation, it's just syntactic sugar.
return pmap(runtimes) do runtime
t0 = time()
task = #async waitForTime(runtime)
while !istaskdone(task) && time()-t0 < time_limit
# releases computation to waitForTime
sleep(0.1)
# nothing past here will run until waitForTime calls yield()
# *and* 0.1 seconds have passed.
end
# equal to if istaskdone(task); return fetch(task); end
istaskdone(task) && (return fetch(task))
return "TimeOut"
# `return error("TimeOut")` halts pmap unless pmap is
# given an error handler argument. See pmap documentation.
end
end
The output is
julia> print(myParallelMapping())
Any["TimeOut", "TimeOut", 0.33, 0.35, 0.56, 0.41, 0.08, 0.14, 0.72,
"TimeOut", "TimeOut", "TimeOut", 0.52, "TimeOut", 0.33, "TimeOut"]
Note that there are two tasks per process in this example. The original task (the "time checker") is checking every 0.1 seconds if the other task has completed computation. The other task (created with #async) is computing something, periodically calling yield() to release control to the time checker; if it doesn't call yield(), time checking cannot occur.
I have a for loop like
for tot in range(0,100000):
at first iterations it is quite fast but as it reaches toward the middle it gets extremely slow. I don't have such code lines that you say sth gets accumulated and gets large so it's computation takes longer.
It stays the same as the first iteration always, but gets processed slower.
Is it because each time tot goes from beginning of "range" to reach to the proper iteration, and as iterative method continues it takes longer to reach to middle of the "range"?
I have no idea why this happens!!!
Inside loop sample code:
for tot in range(0,nt):
for k in range(0,nx):
if k!=0 and k!=nx-1:
for i in range(1,nz-1):
phai0[i,k]=(1.0/dz**2)*(w0[i+1,k]-2.0*w0[i,k]+w0[i-1,k])+(1.0/dx**2)*(w0[i,k+1]-2.0*w0[i,k]+w0[i,k-1])
phai1[i,k]=(1.0/dz**2)*(w1[i+1,k]-2.0*w1[i,k]+w1[i-1,k])+(1.0/dx**2)*(w1[i,k+1]-2.0*w1[i,k]+w1[i,k-1])
phai2[i,k]=(2.0-Ncoef**2*dt**2)*phai1[i,k]-phai0[i,k]+1.0*(Ncoef**2)*(dt**2*1.0/dz**2)*(w1[i+1,k]-2*w1[i,k]+w1[i-1,k])
if k==0:
for i in range(1,nz-1):
phai0[i,k]=(1.0/dz**2)*(w0[i+1,k]-2.0*w0[i,k]+w0[i-1,k])+(1.0/dx**2)*(w0[i,k+1]-2.0*w0[i,k]+w0[i,k-2])
phai1[i,k]=(1.0/dz**2)*(w1[i+1,k]-2.0*w1[i,k]+w1[i-1,k])+(1.0/dx**2)*(w1[i,k+1]-2.0*w1[i,k]+w1[i,k-2])
phai2[i,k]=(2.0-Ncoef**2*dt**2)*phai1[i,k]-phai0[i,k]+1.0*(Ncoef**2)*(dt**2*1.0/dz**2)*(w1[i+1,k]-2*w1[i,k]+w1[i-1,k])
if k==nx-1:
for i in range(1,nz-1):
phai0[i,k]=(1.0/dz**2)*(w0[i+1,k]-2.0*w0[i,k]+w0[i-1,k])+(1.0/dx**2)*(w0[i,1]-2.0*w0[i,k]+w0[i,k-1])
phai1[i,k]=(1.0/dz**2)*(w1[i+1,k]-2.0*w1[i,k]+w1[i-1,k])+(1.0/dx**2)*(w1[i,1]-2.0*w1[i,k]+w1[i,k-1])
phai2[i,k]=(2.0-Ncoef**2*dt**2)*phai1[i,k]-phai0[i,k]+1.0*(Ncoef**2)*(dt**2*1.0/dz**2)*(w1[i+1,k]-2*w1[i,k]+w1[i-1,k])
for N in range(0,N_inner_iteration):
sum_max=0
for k in range(0,nx):
if k!=0 and k!=nx-1:
for i in range(1,nz):
if i!=nz-1:
w21[i,k]=0.5*(dz**2)*(w20[i,k+1]+w20[i,k-1])/(dx**2+dz**2)+0.5*(dx**2)*(w20[i+1,k]+w20[i-1,k])/(dx**2+dz**2)-(0.5/(dx**2+dz**2))*(dx**2*dz**2)*phai2[i,k]
if i==nz-1:
w21[i,k]=1/(1+(p(k*dx,1)/(dx*p(k*dx,2)))+((2/p(k*dx,2)*(p(k*dx,1))**2-p(k*dx,0))/dz)+(p(k*dx,1)/dx/dz))*(((p(k*dx,1)/(dx*p(k*dx,2)))+(p(k*dx,1)/dx/dz))*w0[i,k-1]+(((2/p(k*dx,2)*(p(k*dx,1))**2-p(k*dx,0))/dz)+(p(k*dx,1)/dx/dz))*w0[i-1,k]-((p(k*dx,1)/dx/dz))*w0[i-1,k-1])
sum_max=sum_max+abs(w21[i,k]-w20[i,k])
# w20[i,k]=w21[i,k]
if k==0:
for i in range(1,nz):
if i!=nz-1:
w21[i,k]=0.5*(dz**2)*(w20[i,1]+w20[i,k-2])/(dx**2+dz**2)+0.5*(dx**2)*(w20[i+1,k]+w20[i-1,k])/(dx**2+dz**2)-(0.5/(dx**2+dz**2))*(dx**2*dz**2)*phai2[i,k]
if i==nz-1:
k=0.00000001
w21[i,k]=1/(1+(p(k*dx,1)/(dx*p(k*dx,2)))+((2/p(k*dx,2)*(p(k*dx,1))**2-p(k*dx,0))/dz)+(p(k*dx,1)/dx/dz))*(((p(k*dx,1)/(dx*p(k*dx,2)))+(p(k*dx,1)/dx/dz))*w0[i,k-2]+(((2/p(k*dx,2)*(p(k*dx,1))**2-p(k*dx,0))/dz)+(p(k*dx,1)/dx/dz))*w0[i-1,k]-((p(k*dx,1)/dx/dz))*w0[i-1,k-2])
sum_max=sum_max+abs(w21[i,k]-w20[i,k])
if k==nx-1:
for i in range(1,nz):
# w21[i,k]=w21[i,0]
# w20[i,k]=w21[i,k]
if i!=nz-1:
w21[i,k]=0.5*(dz**2)*(w20[i,1]+w20[i,k-1])/(dx**2+dz**2)+0.5*(dx**2)*(w20[i+1,k]+w20[i-1,k])/(dx**2+dz**2)-(0.5/(dx**2+dz**2))*(dx**2*dz**2)*phai2[i,k]
if i==nz-1:
w21[i,k]=1/(1+(p(k*dx,1)/(dx*p(k*dx,2)))+((2/p(k*dx,2)*(p(k*dx,1))**2-p(k*dx,0))/dz)+(p(k*dx,1)/dx/dz))*(((p(k*dx,1)/(dx*p(k*dx,2)))+(p(k*dx,1)/dx/dz))*w0[i,k-1]+(((2/p(k*dx,2)*(p(k*dx,1))**2-p(k*dx,0))/dz)+(p(k*dx,1)/dx/dz))*w0[i-1,k]-((p(k*dx,1)/dx/dz))*w0[i-1,k-1])
sum_max=sum_max+abs(w21[i,k]-w20[i,k])
# w20[i,k]=w21[i,k]
w20[:]=w21[:]
if (1.0/(nx*nz))*sum_max<0.0000001:
break
print (1.0/(nx*nz))*sum_max, "N=",N
w0[:]=w1[:]
w1[:]=w20[:]
w_final[:,:,tot]=w1[:]
for ordstep in range(1,num_of_orders+1):
integ_sum2=0
for zstep in range (0,nz):
for xstep in range (0,nx):
integ_sum2=integ_sum2+w1[zstep,xstep]*np.sin(ordstep*(k_z*1.0/m)*zstep*dz)*np.sin(ordstep*(k_x*1.0/n)*xstep*dx-omega*tot*dt)*dx*dz
Amp[ordstep-1,tot]=4.0/(l*h)*integ_sum2/Ampref
if tot%1==0:
Ampsave=np.reshape(Amp[:,tot],(1,5))
with open('test.csv', 'a') as file:
np.savetxt(file,np.array(Ampsave))
# np.save(outfile,Amp)
oldcol = wframe
wframe = ax.plot_surface(X, Z, w1, rstride=2, cstride=2)
if oldcol is not None:
ax.collections.remove(oldcol)
plt.pause(.001)
ax.set_xlabel('(X)')
ax.set_ylabel('(Z)')
ax.set_zlabel('$ W_{Numerical} $')
plt.figure('Amplitude Evolution')
plt.axis([0, 40000, -2, 2])
plt.scatter(tot, Amp[0,tot])
plt.scatter(tot, Amp[1,tot])
print Amp[0,tot]
plt.draw()
plt.legend(bbox_to_anchor=(1, 1), loc=1, borderaxespad=0.)
plt.title('Amplitude Evolution')
plt.xlabel('Time[s]',fontsize=25)
plt.ylabel('Amplitude',fontsize=25)
plt.savefig("res.png", transparent = True, pad_inches=0)
plt.show()
I have also changed all range to xrange and avoided using np. inside the loops by predefining them.
Are you using Python2.7 ou Python3.x?
If you are using Python2.7 you should choose xrange generator:
for tot in xrange(0,100000): print tot
It will be faster for sequences large than this. :-)
I am trying to figure out how to be able to do the following.
I want to start a job at 12:00pm every day which computes a list of things and then processes these things in batches of size 'b'. and guarantee an interval of x minutes between the end of one batch and the beginning of another.
schedule.cron '00 12 * * *' do
# things = Compute the list of things
schedule.interval '10m', # Job.new(things[index, 10]) ???
end
I tried something like the following but I hate that I have to pass in the scheduler to the job or access the singleton scheduler.
class Job < Struct.new(:things, :batch_index, :scheduler)
def call
if (batch_index+1) >= things.length
ap "Done with all batches"
else
# Process batch
scheduler.in('10m', Dummy.new(things, batch_index+1, scheduler))
end
end
end
scheduler = Rufus::Scheduler.new
schedule.cron '00 12 * * *' , Dummy.new([ [1,2,3] , [4,5,6,7], [8,9,10]], 0, scheduler)
with a bit of help from https://github.com/jmettraux/rufus-scheduler#scheduling-handler-instances
I'd prepare a class like:
class Job
def initialize(batch_size)
#batch_size = batch_size
end
def call(job, time)
# > I want to start a job (...) which computes a list of things and
# > then processes these things in batches of size 'b'.
things = compute_list_of_things()
loop do
ts = things.shift(#batch_size)
do_something_with_the_batch(ts)
break if things.empty?
# > and guarantee an interval of x minutes between the end of one batch
# > and the beginning of another.
sleep 10 * 60
end
end
end
I'd pass an instance of that class to the scheduler instead of a block:
scheduler = Rufus::Scheduler.new
# > I want to start a job at 12:00pm every day which computes...
scheduler.cron '00 12 * * *', Job.new(10) # batch size of 10...
I don't bother using the scheduler for the 10 minutes wait.
I've written a feature for my library Rubikon that displays a throbber (a spinning — as you may have seen in other console apps) as long as some other code is running.
To test this feature I capture the output of the throbber in a StringIO and compare it with the expected value. As the throbber is only displayed as long as the other code is running the content of the IO gets longer when the code runs longer. In my tests I do a simple sleep 1 and should have a constant 1 second delay. This works most of the time, but sometimes (apparently due to external factors like heavy load on the CPU) it fails, because the code doesn't run for 1 second, but for a bit more, so that the throbber prints a few additional characters.
My question is: Is there any possibility to test such time critical features in Ruby?
From your github repository, I found this test for the Throbber class:
should 'work correctly' do
ostream = StringIO.new
thread = Thread.new { sleep 1 }
throbber = Throbber.new(ostream, thread)
thread.join
throbber.join
assert_equal " \b-\b\\\b|\b/\b", ostream.string
end
I'll assume that a throbber iterates over ['-', '\', '|', '/'], backspacing before each write, once per second. Consider the following test:
should 'work correctly' do
ostream = StringIO.new
started_at = Time.now
ended_at = nil
thread = Thread.new { sleep 1; ended_at = Time.now }
throbber = Throbber.new(ostream, thread)
thread.join
throbber.join
duration = ended_at - started_at
iterated_chars = " -\\|/"
expected = ""
if duration >= 1
# After n seconds we should have n copies of " -\\|/", excluding \b for now
expected << iterated_chars * duration.to_i
end
# Next append the characters we'd get from working for fractions of a second:
remainder = duration - duration.to_i
expected << iterated_chars[0..((iterated_chars.length*remainder).to_i)] if remainder > 0.0
expected = expected.split('').join("\b") + "\b"
assert_equal expected, ostream.string
end
The last assignment of expected is a bit unpleasant, but I made the assumption that the throbber would write character/backspace pairs atomically. If this is not true, you should be able to insert the \b escape sequence into the iterated_chars string and remove the last assignment entirely.
This question is similar (I think, altough I'm not completely sure) to this one:
Only real time operating system can
give you such precision. You can
assume Thread.Sleep has a precision of
about 20 ms so you could, in theory
sleep until the desired time - the
actual time is about 20 ms and THEN
spin for 20 ms but you'll have to
waste those 20 ms. And even that
doesn't guarantee that you'll get real
time results, the scheduler might just
take your thread out just when it was
about to execute the RELEVANT part
(just after spinning)
The problem is not rubby (possibly, I'm no expert in ruby), the problem is the real time capabilities of your operating system.