Failing to store agents with multiprocessing.Queue - parallel-processing

I am trying to parallelize a simulation by splitting it across several processes. In each process, I simulate N agents over T periods. For one agent and one period this is just
ag.step()
, where ag is an instance of the agent class. My worker looks like this:
import multiprocessing as mp

class Worker(mp.Process):
    def __init__(self, args):
        mp.Process.__init__(self, args=args)
        self.agents = []
        self.N = args[0]
        self.T = args[1]
        self.q = args[2]

    def run(self):
        # use the values passed in through args, not globals
        for i in range(self.N):
            ag = agent()
            for t in range(self.T):
                ag.step()
            self.agents.append(ag)
        self.q.put(self.agents)
Then I just do
q = mp.Queue()
if __name__ == '__main__':
    processes, p = [], []
    for i in range(mp.cpu_count()):
        p = Worker(args = (N,T,q))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
However, when I try to get my simulated agents with
q.get()
it does not work: the call keeps running for a very long time. But if I store some particular simulation results rather than instances of the agent class, for example by doing
self.agents.append(ag.state)
instead, then it works well and q.get() returns a list of values. Why is that?
My N and T are very small (say, 10 and 10) now, but the agent class is relatively complicated (each agent has a neural network, memory, etc.)

Apparently this happened because objects put in the queue have to be picklable. To some extent, I managed to fix it by using pathos.multiprocessing instead.

Related

Why do these two methods give me two different times?

Both methods calculate the time it takes ruby to call and run a code block. I don't see any reason why these two methods should return different results.
methodone gives me: 1.000135157
methodtwo gives me: 1.000108267
I'm a noob; am I even doing this right? Please let me know.
def methodone(a)
  start = Time.now
  a.call
  result = Time.now - start
end

def methodtwo(a)
  start_time = Time.now
  a.call
  end_time = Time.now
  result = end_time - start_time
end

a = Proc.new do {}
end
p methodone(a)
p methodtwo(a)
You won't always get the same output, because the CPU of your machine can be more or less busy with other processes running on your computer, and caching and interpreter optimizations can also come into play. Such simple methods can't be timed reliably with a single pass. If you want to benchmark something like that, it is better to run it thousands or millions of times and then take an average. This produces a more consistent result, because the "noise" of outside factors gets averaged out.
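As a rough sketch of the "run it many times and average" idea: the helper name average_time and the workload below are made up for illustration, and the monotonic clock is used here instead of Time.now so that system clock adjustments do not distort the measurement.

```ruby
# Average the wall-clock time of a block over many runs.
# average_time is a hypothetical helper, not part of the original question.
def average_time(runs)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  elapsed / runs
end

# A small stand-in workload; replace it with the code you want to time.
work = Proc.new { 100.times { |i| i * i } }

avg = average_time(10_000) { work.call }
puts format("average per call: %.9f s", avg)
```

Two runs of this script will still differ slightly, but the averages should be much closer to each other than two single-pass timings.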
You shouldn't expect them to be exactly the same. There will always be something going on outside of the Ruby process that will impact performance. You should consider a margin of error of, say, 0.1%
def time(&block)
  t = Time.now.to_f
  yield
  t2 = Time.now.to_f
  puts t2 - t
end

50.times do
  time do
    Proc.new { }
  end
end

How to IPC with the parent process when creating child processes in a loop (Ruby)

I have the following code snippet (a simplified representation of what I'm trying to do - training wheels). The sleep(2) would represent some network operation in my real code:
arr = []
5.times do |i|
  rd, wr = IO.pipe
  fork do
    sleep(2) # I'm still waiting for the sleep to happen on each process ... not good, there is no parallelism here
    rd.close
    arr[i] = i
    Marshal.dump(arr, wr)
  end
  wr.close
  result = rd.read
  arr = Marshal.load(result)
end
# Process.waitall
p arr
Q: Is it possible to somehow create new processes in a loop and pass the results back, but without waiting on each iteration? I'm pretty rusty and don't know / remember a great deal about IPC ... especially in Ruby.
Actual result: a wait time of 2s*5 = 10s
Expected: ~2s total (async processing of the sleep())
So a good comment clarifying things and explaining the theory would help a lot. Thanks.
In your loop you wait for each child process to write its results to the pipe before starting the next iteration.
The simplest fix would be to save the read ends of the pipes in an array and don’t read any of them until the loop is finished and you’ve started all the child processes:
arr = []
# array to store the IOs
pipes = []
5.times do |i|
  rd, wr = IO.pipe
  fork do
    sleep(2)
    rd.close
    # Note only returning the value of i here, not the whole array
    Marshal.dump(i, wr)
  end
  wr.close
  # store the IO for later
  pipes[i] = rd
end
# Now all child processes are started, we can read the results in turn
# Remember each child is returning i, not the whole array
pipes.each_with_index do |rd, i|
  arr[i] = Marshal.load(rd.read)
end
A more complex solution, if the wait/network times for the different child processes vary, might be to look at select (IO.select in Ruby), so you could read from whichever pipe is ready first.
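A minimal sketch of the select-based variant, assuming a Unix-like system where fork is available. The staggered sleep times and the variable names are illustrative only; they stand in for network operations of different lengths.

```ruby
# Map each pipe's read end to the index of the child that writes to it.
pipes = {}
3.times do |i|
  rd, wr = IO.pipe
  fork do
    rd.close
    sleep(0.1 * (3 - i))   # children started later finish sooner
    Marshal.dump(i, wr)
    wr.close
  end
  wr.close                 # the parent only reads
  pipes[rd] = i
end

results = []
until pipes.empty?
  # Block until at least one pipe has data (or has reached EOF).
  ready, = IO.select(pipes.keys)
  ready.each do |rd|
    results << Marshal.load(rd.read)
    rd.close
    pipes.delete(rd)
  end
end
Process.waitall            # reap the children
p results
```

The results arrive in completion order rather than start order, so if the position matters, store each value under pipes[rd] instead of appending it.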

Implementing a synchronization barrier in Ruby

I'm trying to "replicate" the behaviour of CUDA's __syncthreads() function in Ruby. Specifically, I have a set of N threads that need to execute some code, then all wait on each other at a mid-point in execution before continuing with the rest of their business. For example:
x = 0
a = Thread.new do
  x = 1
  syncthreads()
end
b = Thread.new do
  syncthreads()
  # x should have been changed
  raise if x == 0
end
[a,b].each { |t| t.join }
What tools do I need to use to accomplish this? I tried using a global hash, and then sleeping until all the threads have set a flag indicating they're done with the first part of the code. I couldn't get it to work properly; it resulted in hangs and deadlock. I think I need to use a combination of Mutex and ConditionVariable but I am unsure as to why/how.
Edit: 50 views and no answer! Looks like a candidate for a bounty...
Let's implement a synchronization barrier. It has to know the number of threads it will handle, n, up front. During the first n - 1 calls to sync, the barrier makes the calling thread wait. Call number n wakes all the threads up.
class Barrier
  def initialize(count)
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @count = count
  end

  def sync
    @mutex.synchronize do
      @count -= 1
      if @count > 0
        @cond.wait @mutex
      else
        @cond.broadcast
      end
    end
  end
end
The whole body of sync is a critical section, i.e. it cannot be executed by two threads concurrently. Hence the call to Mutex#synchronize.
When the decremented value of @count is positive, the thread is suspended. Passing the mutex as an argument to ConditionVariable#wait is critical to prevent deadlocks: it causes the mutex to be unlocked before the thread is suspended.
A simple experiment starts 1k threads and makes them add elements to an array. First they add zeros, then they synchronize and add ones. The expected result is a sorted array with 2k elements, of which 1k are zeros and 1k are ones.
mtx = Mutex.new
arr = []
num = 1000
barrier = Barrier.new num
num.times.map do
  Thread.start do
    mtx.synchronize { arr << 0 }
    barrier.sync
    mtx.synchronize { arr << 1 }
  end
end.map(&:join)
# Prints true. See it break by deleting `barrier.sync`.
puts [
  arr.sort == arr,
  arr.count == 2 * num,
  arr.count(&:zero?) == num,
  arr.uniq == [0, 1],
].all?
As a matter of fact, there's a gem named barrier which does exactly what I described above.
On a final note, don't use sleep for waiting in such circumstances. It's called busy waiting and is considered a bad practice.
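A possible variation on the barrier, sketched under two assumptions that go beyond the answer above: that the barrier should be usable more than once, and that a waiting thread should re-check its condition in a loop in case it wakes up early. The class name ReusableBarrier and the generation counter are illustrative choices, not an existing library API.

```ruby
class ReusableBarrier
  def initialize(count)
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @count = count
    @waiting = 0
    @generation = 0   # incremented each time the barrier opens
  end

  def sync
    @mutex.synchronize do
      generation = @generation
      @waiting += 1
      if @waiting == @count
        # Last thread to arrive: reset for the next round and wake everyone.
        @waiting = 0
        @generation += 1
        @cond.broadcast
      else
        # Keep waiting until the barrier has actually opened for this round.
        @cond.wait(@mutex) while generation == @generation
      end
    end
  end
end

barrier = ReusableBarrier.new(4)
log = Queue.new
threads = 4.times.map do
  Thread.new do
    3.times do |round|
      log << round      # record the round, then wait for the others
      barrier.sync
    end
  end
end
threads.each(&:join)

rounds = []
rounds << log.pop until log.empty?
p rounds
```

Because no thread can start round r + 1 before every thread has logged round r, the logged round numbers come out in non-decreasing order.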
There might be merit in having the threads wait for each other. But I think it is cleaner to have the threads actually finish at the "midpoint", because your question obviously implies that the threads need each other's results at the "midpoint". A clean design would be to let them finish, deliver the results of their work, and start a brand new set of threads based on these.
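The two-phase design can be sketched as follows. The squaring and share calculations are placeholder work invented for the example; the point is only that Thread#value joins each thread and returns the result of its block, so phase 2 starts with every phase 1 result in hand.

```ruby
inputs = [1, 2, 3, 4]

# Phase 1: each thread computes a partial result and then finishes.
# Thread#value waits for the thread and returns its block's result.
partials = inputs.map { |n| Thread.new { n * n } }.map(&:value)

# Phase 2: a new set of threads, started only after all partials exist.
total = partials.sum
shares = partials.map { |v| Thread.new { v.to_f / total } }.map(&:value)

p partials
p shares
```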

optimize this ruby code, switch arrays to sets/hash?

I need to optimize this code. If you have any suggestions to make it go faster, please tell me. I don't have a specific speedup in mind; any suggestion would be helpful. In terms of complexity, I want to keep it below O(n^2).
I'm wondering whether I should convert the array that I'm using into something like a set or a hash, because that is quicker, right? How much faster, in terms of complexity, might this allow me to run?
I think the main problem might be my use of Ruby's combination method, which runs pretty slowly. Does anyone know the exact complexity of this method? Is there a faster alternative?
The point of this code is basically to find the single point that has the shortest combined distance from all the other points (i.e. the friend's house that is most convenient for everyone to go to). There is a little extra code here which has some debugging/printing functions.
class Point
  attr_accessor :x, :y, :distance, :done, :count
  def initialize(x,y)
    @x = x
    @y = y
    @distance = 0
    @closestPoint = []
    @done = false
    @count = 0
  end
end
class Edge
  attr_accessor :edge1, :edge2, :weight
  def initialize(edge1,edge2,weight)
    @edge1 = edge1
    @edge2 = edge2
    @weight = weight
  end
end
class AdjacencyList
  attr_accessor :name, :minSumList, :current
  def initialize(name)
    @name = name
    @minSumList = []
    @current = nil
    @vList = []
    @edgeList = []
  end
  def addVertex(vertex)
    @vList.push(vertex)
  end
  def generateEdges2
    minSumNode = nil
    current = nil
    last = nil
    @vList.combination(2) { |vertex1, vertex2|
      distance = distance2points(vertex1,vertex2)
      edge = Edge.new(vertex1,vertex2,distance)
      if (current == nil)
        current = vertex1
        minSumNode = vertex1
      end
      vertex1.distance += distance
      vertex2.distance += distance
      vertex1.count += 1
      vertex2.count += 1
      if (vertex1.count == @vList.length-1)
        vertex1.done = true
      elsif (vertex2.count == @vList.length-1)
        vertex2.done = true
      end
      if ((vertex1.distance < minSumNode.distance) && (vertex1.done == true))
        minSumNode = vertex1
      end
      #@edgeList.push(edge)
    }
    return minSumNode.distance
  end
  def generateEdges
    @vList.combination(2) { |vertex1, vertex2|
      distance = distance2points(vertex1,vertex2)
      @edgeList.push(Edge.new(vertex1,vertex2,distance))
    }
  end
  def printEdges
    @edgeList.each {|edge| puts "(#{edge.edge1.x},#{edge.edge1.y}) <=> (#{edge.edge2.x},#{edge.edge2.y}) weight: #{edge.weight}"}
  end
  def printDistances
    @vList.each {|v| puts "(#{v.x},#{v.y} distance = #{v.distance})"}
  end
end
def distance2points(point1,point2)
  xdistance = (point1.x - point2.x).abs
  ydistance = (point1.y - point2.y).abs
  total_raw = xdistance + ydistance
  return totaldistance = total_raw - [xdistance,ydistance].min
end
#pointtest1 = Point.new(0,1)
#pointtest2 = Point.new(2,5)
#pointtest3 = Point.new(3,1)
#pointtest4 = Point.new(4,0)
graph = AdjacencyList.new("graph1")
gets
while (line = gets)
  graph.addVertex(Point.new(line.split[0].to_i,line.split[1].to_i))
end
#graph.addVertex(pointtest1)
#graph.addVertex(pointtest2)
#graph.addVertex(pointtest3)
#graph.addVertex(pointtest4)
puts graph.generateEdges2
#graph.printEdges
#graph.printDistances
Try to do this, and then post some more code:
ruby -rprofile your_script your_args
This will run the script under the profiler and generate a nice table of results. If you post that here, you're more likely to get better help. Plus, you will have a more exact idea of what's consuming your CPU cycles.
Sets are basically hashes, and the advantage of hashes over arrays is O(1) find operations. Since you are simply iterating over the entire array, hashes will not offer any speed improvements if you simply replace the arrays with hashes.
Your real problem is that the running time of your algorithm is O(n^2): given a set of n points, it has to perform n(n-1)/2 distance computations, since you're pairing every point with every other point.
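To make the quadratic growth concrete, here is a small check of how many pairs combination(2) produces. The point coordinates are arbitrary placeholders; only the count matters.

```ruby
n = 200
# Arbitrary distinct points, used only to have something to pair up.
points = (0...n).map { |i| [i, (i * 7) % n] }

pairs = points.combination(2).count
puts pairs               # number of pairs actually generated
puts n * (n - 1) / 2     # the closed-form count of unordered pairs
```

Doubling n roughly quadruples the number of pairs, regardless of whether the points live in an array, a set, or a hash.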
This can be somewhat improved by using hashes to cache values. For example, let's say you want the distance between point "a" and point "b". You could have a hash @distances which stores @distances["a,b"] = 52 (of course you'll have to be smart about what to use as the key). Basically, just try to remove redundant operations wherever you can.
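One way the caching idea might look, as a sketch. The class name DistanceCache, the [x, y] array representation of a point, and the computed counter are assumptions made for this example. The key is the sorted pair of points, so (a, b) and (b, a) share one entry; the distance formula is the max of the x and y differences, which is what the question's distance2points computes.

```ruby
class DistanceCache
  attr_reader :computed   # how many distances were actually calculated

  def initialize
    @distances = {}
    @computed = 0
  end

  # a and b are [x, y] arrays.
  def between(a, b)
    key = [a, b].sort     # same key for (a, b) and (b, a)
    @distances[key] ||= begin
      @computed += 1
      [(a[0] - b[0]).abs, (a[1] - b[1]).abs].max
    end
  end
end

cache = DistanceCache.new
first = cache.between([0, 1], [2, 5])
second = cache.between([2, 5], [0, 1])   # served from the cache
p first, second, cache.computed
```

This only pays off when the same pair is requested more than once. A single pass over combination(2) visits each pair exactly once, so there the cache adds overhead without saving any work.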
That said, the largest speed boost would be from a smarter algorithm, but I can't think of something applicable off the top of my head right now.
There's something many people know, and it won't cost you anything.
While you're trying to guess how to make the code faster, or scouring the internet for some kind of profiler, just run the program under the debugger and interrupt it while it's being slow.
Do it several times, and each time take careful note of what it's doing and why.
Here's an example in python.
The slower it is, the more obvious the problem will be.

Best way to program an "Excel-like" network of influence in Ruby?

I have a network of nodes, each node influencing the state of some other nodes (imagine an Excel spreadsheet with cells values depending on other cells through formulas).
I'm wondering what the cleanest way to implement this in Ruby is.
Of course I could have one process per node, but how will it perform as the number of nodes increases? I'm sure there are libraries for that, but I can't find an up-to-date one.
Thanks for your help!
Update: Sounds like EventMachine might do the job... but it seems better suited to a small number of "nodes".
This sounds like a good situation for the observer pattern. This is a sample of that in ruby:
require 'observer'

class Node
  attr_accessor :id
  @@current_node_id = 0

  def initialize
    @@current_node_id += 1
    # assign the instance variable; a bare `id = ...` would only set a local
    @id = @@current_node_id
  end

  include Observable
  attr_reader :value

  protected

  def value=(new_value)
    return if @value == new_value
    old_value = @value
    @value = new_value
    changed
    notify_observers(id, old_value, @value)
  end
end

class ValueNode < Node
  def initialize(initial_value)
    super()
    @value = initial_value
  end

  # public setter that delegates to the protected one in Node
  def value=(new_value)
    super(new_value)
  end
end

class SumNode < Node
  def initialize(*nodes)
    super()
    @value = nodes.map(&:value).inject(0, &:+)
    nodes.each do |node|
      node.add_observer(self)
    end
  end

  # called by the observed nodes whenever their value changes
  def update(id, old_value, new_value)
    self.value = self.value - old_value + new_value
  end
end

def test
  v1 = ValueNode.new 4
  v2 = ValueNode.new 8
  sum = SumNode.new(v1, v2)
  sum2 = SumNode.new(v1, sum)
  v2.value = 10
  p sum.value
  p sum2.value
end
test()
Notice how the value of SumNode isn't recalculated every time it is requested - instead it is updated when one of its value nodes is updated. This works recursively, so that inner SumNodes also trigger updates. As the notification includes the unique id of the node, it is possible to write more complex Node types, such as ones that contain formulas.
See http://www.ruby-doc.org/stdlib/libdoc/observer/rdoc/index.html for more details on Observable
This sounds similar to the oft-used Twitter paradigm, where updates by one user are pushed to all its followers. To do this efficiently, you should store two lists for a given person: one with the people he follows and one with the people that follow him. You can do the same for a list of nodes. When a node changes, you can quickly look up the nodes that are influenced by it. When a relationship disappears, you will need the 'forward' list to know from which lists to 'remove' the reverse relationship.
You can store these lists in two-dimensional arrays, or in something like Redis. I don't really understand how EventMachine would fit in.
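The two-list bookkeeping could be sketched in memory like this. The class and method names (InfluenceGraph, link, remove, affected_by) are invented for the example, and plain hashes of sets stand in for whatever storage you end up using.

```ruby
require 'set'

class InfluenceGraph
  def initialize
    # forward list: node => nodes it influences
    @influences = Hash.new { |h, k| h[k] = Set.new }
    # reverse list: node => nodes that influence it
    @influenced_by = Hash.new { |h, k| h[k] = Set.new }
  end

  def link(from, to)
    @influences[from] << to
    @influenced_by[to] << from
  end

  # Nodes to update when `node` changes.
  def affected_by(node)
    @influences.key?(node) ? @influences[node].to_a : []
  end

  # Removing a node uses each list to clean up the other one.
  def remove(node)
    @influences.delete(node)&.each { |to| @influenced_by[to].delete(node) }
    @influenced_by.delete(node)&.each { |from| @influences[from].delete(node) }
  end
end

graph = InfluenceGraph.new
graph.link(:a, :b)
graph.link(:a, :c)
graph.link(:b, :c)
before = graph.affected_by(:a).sort
graph.remove(:b)
after = graph.affected_by(:a)
p before, after
```

Without the reverse list, removing :b would require scanning every node's forward list to find the ones that point at it.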
If you have a network graph of dependencies and you want it to scale, a graph database is the best solution. Neo4J is a popular, powerful database for tracking this type of dependency.
There are several ways to interface with Neo4J from Ruby:
You can use JRuby and its java interface.
Use its REST API
Use neo4j.rb or one of the other Ruby interface libraries.
