Efficient flow graph algorithm that routes its flow more evenly

Efficient flow graph algorithm that routes its flow more evenly - algorithm

This problem I recently came back to after putting it on a backburner for a while - and that's the one of trying to create a program to calculate the flow rates of some kind of resource through a network of pipes from resource sources to resource sinks, each of which can only pass so much resource per unit of time. This is, of course, a classic problem called the "network flow problem", and in particular the typical aim is to find a flow pattern that maximizes the flow going from the sources to the sinks. And I made a program that uses a common algorithm called the Ford-Fulkerson Max-Flow Method to do this, but I found that while this algorithm certainly does a nice job at finding a flow solution, it doesn't necessarily do a good job at making one that is particularly "natural" in terms of the flow pattern.
That is to say, consider a graph like the one below.
------------- SINK 1
0 / 8 | 0 / 5
SOURCE ---------X
| 0 / 5
------------- SINK 2
where the numbers represent the current flow rate on that particular edge or "pipe", here in "units" per second, versus the maximum flow the pipe can support, the "X" is a junction node, and the other labels should be self-explanatory.
When we solve this using F-F (which requires us to temporarily add an "aggregate sink" node that ties the two sinks on the right together), we find the max flow rate is indeed 8 U/s, which should be obvious just from simple inspection for such a simple graph. However, the flow pattern it gives may look something like either
------------- SINK 1
8 / 8 | 5 / 5
SOURCE ---------X
| 3 / 5
------------- SINK 2
or
------------- SINK 1
8 / 8 | 3 / 5
SOURCE ---------X
| 5 / 5
------------- SINK 2
depending on the order on which it encounters the edges during the depth-first walk used in the calculation. Trouble is, not only is that behavior itself not ideal, that flow doesn't "feel natural" in a certain sense. Intuitively, if we were imagining pushing a fluid, we'd expect 4 U/s of flow to go to sink 1 and another 4 to go to sink 2 by symmetry. Indeed, if we actually shrink the capacity of the edge leading out of the source to 5, the Ford-Fulkerson algorithm will starve one sink entirely, and that is also a behavior I'd like to avoid - if there's not enough flow to keep everybody as happy as they'd like to be, then at least try to distribute it as evenly as possible. In this case, that'd mean that if the max flow is, say, as here, 80% of the flow needed to fully satiate all the sinks, then 80% should go to each sink, unless there's a constriction somewhere in the graph that prevents sending even that much to that sink, in which case excess flow should back up and go to the other sinks while that one still gets the maximum it can get.
So my question is, what sort of algorithms would have either this behavior or a behavior similar to it? Or, to put it another way, if F-F is a good tool to just find a maximum flow, what is a good tool for tailoring the pattern of that maximum flow to some "desirable" form like this?
One simple solution I thought of is to just repeatedly apply F-F, only instead of routing from the source to the fictitious aggregate sink, apply it from the source to each individual sink, thus giving the max flow that is capable of making it through the constrictions, then work out from that how much each sink can actually get fed based on its demand and the whole-graph max flow. Trouble is, that means running the algorithm as many times as there are sinks, so the Big-O goes up, perhaps too much. Is there a more efficient way to achieve this?

Related

FlowField Pathfinding on Large RTS Maps

When building a large map RTS game, my team are experiencing some performance issues regarding pathfinding.
A* is obviously inefficient due to not only janky path finding, but processing costs for large groups of units moving at the same time.
After research, the obvious solution would be to use FlowField pathfinding, the industry standard for RTS games as it stands.
The issue we are now having after creating the base algorithm is that the map is quite large requiring a grid of around 766 x 485. This creates a noticeable processing freeze or lag when computing the flowfield for the units to follow.
Has anybody experienced this before or have any solutions on how to make the flowfields more efficient? I have tried the following:
Adding flowfields to a list when it is created and referencing later (Works once it has been created, but obviously lags on creation.)
Processing flowfields before the game is started and referencing the list (Due to the sheer amount of cells, this simply doesn't work.)
Creating a grid based upon the distance between the furthest selected unit and the destination point (Works for short distances, not if moving from one end of the map to the other).
I was thinking about maybe splitting up the map into multiple flowfields, but I'm trying to work out how I would make them move from field to field.
Any advice on this?
Thanks in advance!

Maybe this is a bit late answer. Since you have mentioned that this is a large RTS game, then the computation should not be limited to one CPU core. There are a few advice for you to use flowfield more efficiently.
Use multithreads to compute new flow fields for each unit moving command
Group units, so that all units in same command group share the same flowfield
Partition the flowfield grids, so you only have to update any partition that had modification in path (new building/ moving units)
Pre baked flowfields grid slot cost:you prebake basic costs of the grids(based on environments or other static values that won't change during the game).
Divide, e.g. you have 766 x 485 map, set it as 800 * 500, divide it into 100 * 80 * 50 partitions as stated in advice 3.
You have a grid of 10 * 10 = 100 slots, create a directed graph (https://en.wikipedia.org/wiki/Graph_theory) using the a initial flowfield map (without considering any game units), and use A* algorihtm to search the flowfield grid before the game begins, so that you know all the connections between partitions.
For each new flowfield, Build flowfield only with partitions marked by a simple A* search in the graph. And then use alternative route if one node of the route given by A* is totally blocked from reaching the next one (mark the node as blocked and do A* again in this graph)
6.Cache, save flowfield result from step.5 for further usage (same unit spawning from home and going to the enemy base. Same partition routes. Invalidate cache if there is any change in the path, and only invalidate the cache of the changed partition first, check if this partition still connects to other sides, then only minor change will be made within the partition only)
Runtime late updating the units' command. If the map is large enough. Move the units immediately to the next partition without using the flowfield, (use A* first to search the 10*10 graph to get the next partition). And during this time of movement, in the background, build the flowfield using previous step 1-6. (in fact you only need few milliseconds to do the calculation if optimized properly, then the units changes their route accordingly. Most of the time there is no difference and player won't notice a thing. In the worst case, where we finally have to search all patitions to get the only possible route, it is only the first time there will be any delay, and the cache will minimise the time since it is the only way and the cache will be used repeatitively)
Re-do the build process above every once per few seconds for each command group (in the background), just in case anything changes in the middle of the way.
I could get this working with much larger random map (2000*2000) with no fps drop at all.
Hope this helps anyone in the future.

Finding the flow of water in a connected network

I have a set of water meters for water consumers drawn up as geojson and visualized with ol3. For each consumer house i have their usage of water for the given year, and also the water pipe system is given as linestrings, with metadata for the diameter of each pipe section.
What is the minimum required information I need to be able to visualize/calculate the amount of water that passed each pipe in total of the year when the pipes have inner loops/circles.
is there a library that makes it easy to do the calculations in javascript.
Naive approach, start from each house and move to the first pipe junction and add the used mater measurement for the house as water out of the junction and continue until the water plant is reached. This works if there was no loops within the pipe system.

This sounds more like a physics or civil engineering problem than a programming one.
But as best I can tell, you would need time series data for sources and sinks.
Consider this simple network:
Say, A is a source and B and D are sinks/outlets.
If the flow out of B is given, the flow in |CB| would be dependent on the flow out of D.
So e.g. if B and D were always open at the same time, the total volume that has passed |CB| might be close to 0. Conversely, if B and D were never open at the same time the number might be equal to the volume that flowed through |AB|.
If you can obtain time series data, so you have concurrent values of flow through D and B, I would think there would exist a standard way of determining the flow through |CB|.
Wikipedia's Pipe Network Analysis article mentions one such method: The Hardy Cross method, which:
"assumes that the flow going in and out of the system is known and that the pipe length, diameter, roughness and other key characteristics are also known or can be assumed".
If time series data are not an option, I would pretend it was always average (which might not be so bad given a large network, like in your image) and then do the same thing.

You can use the Ford-Fulkerson algorithm to find the maximum flow in a network. To use this algorithm, you need to represent your network as a graph with nodes that represent your houses and edges to represent your pipes.

You can first simplify the network by consolidating demands on "dead-ends". Next you'll need pressure data at the 3 feeds into this network, which I see as the top feed from the 90 (mm?), centre feed at the 63 and bottom feed near to the 50. These 3 clusters are linked by a 63mm running down, which have the consolidated demand and the pressure readings at the feed would be sufficient to give the flowrate across the inner clusters.

load balancing algorithms - special example

Let´s pretend i have two buildings where i can build different units in.
A building can only build one unit at the same time but has a fifo-queue of max 5 units, which will be built in sequence.
Every unit has a build-time.
I need to know, what´s the fastest solution to get my units as fast as possible, considering the units already in the build-queues of my buildings.
"Famous" algorithms like RoundRobin doesn´t work here, i think.
Are there any algorithms, which can solve this problem?

This reminds me a bit of starcraft :D
I would just add an integer to the building queue which represents the time it is busy.
Of course you have to update this variable once per timeunit. (Timeunits are "s" here, for seconds)
So let's say we have a building and we are submitting 3 units, each take 5s to complete. Which will sum up to 15s total. We are in time = 0.
Then we have another building where we are submitting 2 units that need 6 timeunits to complete each.
So we can have a table like this:
Time 0
Building 1, 3 units, 15s to complete.
Building 2, 2 units, 12s to complete.
Time 1
Building 1, 3 units, 14s to complete.
Building 2, 2 units, 12s to complete.
And we want to add another unit that takes 2s, we can simply loop through the selected buildings and pick the one with the lowest time to complete.
In this case this would be building 2. This would lead to Time2...
Time 2
Building 1, 3 units, 13s to complete
Building 2, 3 units, 11s+2s=13s to complete
...
Time 5
Building 1, 2 units, 10s to complete (5s are over, the first unit pops out)
Building 2, 3 units, 10s to complete
And so on.
Of course you have to take care of the upper boundaries in your production facilities. Like if a building has 5 elements, don't assign something and pick the next building that has the lowest time to complete.
I don't know if you can implement this easily with your engine, or if it even support some kind of timeunits.
This will just result in updating all production facilities once per timeunit, O(n) where n is the number of buildings that can produce something. If you are submitting a unit this will take O(1) assuming that you keep the selected buildings in a sorted order, lowest first - so just a first element lookup. In this case you have to resort the list after manipulating the units like cancelling or adding.
Otherwise amit's answer seem to be possible, too.

This is NPC problem (proof at the end of the answer) so your best hope to find ideal solution is trying all possibilities (this will be 2^n possibilities, where n is the number of tasks).
possible heuristic was suggested in comment (and improved in comments by AShelly): sort the tasks from biggest to smallest, and put them in one queue, every task can now take element from the queue when done.
this is of course not always optimal, but I think will get good results for most cases.
proof that the problem is NPC:
let S={u|u is a unit need to be produced}. (S is the set containing all 'tasks')
claim: if there is a possible prefect split (both queues finish at the same time) it is optimal. let this time be HalfTime
this is true because if there was different optimal, at least one of the queues had to finish at t>HalfTime, and thus it is not optimal.
proof:
assume we had an algorithm A to produce the best solution at polynomial time, then we could solve the partition problem at polynomial time by the following algorithm:
1. run A on input
2. if the 2 queues finish exactly at HalfTIme - return True.
3. else: return False
this solution solves the partition problem because of the claim: if the partition exist, it will be returned by A, since it is optimal. all steps 1,2,3 run at polynomial time (1 for the assumption, 2 and 3 are trivial). so the algorithm we suggested solves partition problem at polynomial time. thus, our problem is NPC
Q.E.D.

Here's a simple scheme:
Let U be the list of units you want to build, and F be the set of factories that can build them. For each factory, track total time-til-complete; i.e. How long until the queue is completely empty.
Sort U by decreasing time-to-build. Maintain sort order when inserting new items
At the start, or at the end of any time tick after a factory completes a unit runs out of work:
Make a ready list of all the factories with space in the queue
Sort the ready list by increasing time-til-complete
Get the factory that will be done soonest
take the first item from U, add it to thact factory
Repeat until U is empty or all queues are full.
Googling "minimum makespan" may give you some leads into other solutions. This CMU lecture has a nice overview.
It turns out that if you know the set of work ahead of time, this problem is exactly Multiprocessor_scheduling, which is NP-Complete. Apparently the algorithm I suggested is called "Longest Processing Time", and it will always give a result no longer than 4/3 of the optimal time.
If you don't know the jobs ahead of time, it is a case of online Job-Shop Scheduling
The paper "The Power of Reordering for Online Minimum Makespan Scheduling" says
for many problems, including minimum
makespan scheduling, it is reasonable
to not only provide a lookahead to a
certain number of future jobs, but
additionally to allow the algorithm to
choose one of these jobs for
processing next and, therefore, to
reorder the input sequence.
Because you have a FIFO on each of your factories, you essentially do have the ability to buffer the incoming jobs, because you can hold them until a factory is completely idle, instead of trying to keeping all the FIFOs full at all times.
If I understand the paper correctly, the upshot of the scheme is to
Keep a fixed size buffer of incoming
jobs. In general, the bigger the
buffer, the closer to ideal
scheduling you get.
Assign a weight w to each factory according to
a given formula, which depends on
buffer size. In the case where
buffer size = number factories +1, use weights of (2/3,1/3) for 2 factories; (5/11,4/11,2/11) for 3.
Once the buffer is full, whenever a new job arrives, you remove the job with the least time to build and assign it to a factory with a time-to-complete < w*T where T is total time-to-complete of all factories.
If there are no more incoming jobs, schedule the remainder of jobs in U using the first algorithm I gave.
The main problem in applying this to your situation is that you don't know when (if ever) that there will be no more incoming jobs. But perhaps just replacing that condition with "if any factory is completely idle", and then restarting will give decent results.

Time Aware Social Graph DS/Queries

Classic social networks can be represented as a graph/matrix.
With a graph/matrix one can easily compute
shortest path between 2 participants
reachability from A -> B
general statistics (reciprocity, avg connectivity, etc)
etc
Is there an ideal data structure (or a modification to graph/matrix) that enables easy computation of the above while being time aware?
For example,
Input
t = 0...100
A <-> B (while t = 0...10)
B <-> C (while t = 5...100)
C <-> A (while t = 50...100)
Sample Queries
Is A associated with B at any time? (yes)
Is A associated with B while B is associated with C? (yes. #t = 5...10)
Is C ever reachable from A (yes. # t=5 )

What you're looking for is an explicitly persistent data structure. There's a fair body of literature on this, but it's not that well known. Chris Okasaki wrote a pretty substantial book on the topic. Have a look at my answer to this question.
Given a full implementation of something like Driscoll et al.'s node-splitting structure, there are a few different ways to set up your queries. If you want to know about stuff true in a particular time range, you would only examine nodes containing data about that time range. If you wanted to know what time range something was true, you would start searching, and progressively tighten your bounds as you explore each new node. Just remember that your results might not always be contiguous - consider two people start dating, breaking up, and getting back together.
I would guess that there's probably at least one publication worth of unexplored territory in how to do interesting queries over persistent graphs, if not much more.

Need some help understanding this problem about maximizing graph connectivity

I was wondering if someone could help me understand this problem. I prepared a small diagram because it is much easier to explain it visually.
alt text http://img179.imageshack.us/img179/4315/pon.jpg
Problem I am trying to solve:
1. Constructing the dependency graph
Given the connectivity of the graph and a metric that determines how well a node depends on the other, order the dependencies. For instance, I could put in a few rules saying that
node 3 depends on node 4
node 2 depends on node 3
node 3 depends on node 5
But because the final rule is not "valuable" (again based on the same metric), I will not add the rule to my system.
2. Execute the request order
Once I built a dependency graph, execute the list in an order that maximizes the final connectivity. I am not sure if this is a really a problem but I somehow have a feeling that there might exist more than one order in which case, it is required to choose the best order.
First and foremost, I am wondering if I constructed the problem correctly and if I should be aware of any corner cases. Secondly, is there a closely related algorithm that I can look at? Currently, I am thinking of something like Feedback Arc Set or the Secretary Problem but I am a little confused at the moment. Any suggestions?
PS: I am a little confused about the problem myself so please don't flame on me for that. If any clarifications are needed, I will try to update the question.

It looks like you are trying to determine an ordering on requests you send to nodes with dependencies (or "partial ordering" for google) between nodes.
If you google "partial order dependency graph", you get a link to here, which should give you enough information to figure out a good solution.
In general, you want to sort the nodes in such a way that nodes come after their dependencies; AKA topological sort.

I'm a bit confused by your ordering constraints vs. the graphs that you picture: nothing matches up. That said, it sounds like you have soft ordering constraints (A should come before B, but doesn't have to) with costs for violating the constraint. An optimal algorithm for scheduling that is NP-hard, but I bet you could get a pretty good schedule using a DFS biased towards large-weight edges, then deleting all the back edges.

If you know in advance the dependencies of each node, you can easily build layers.
It's amusing, but I faced the very same problem when organizing... the compilation of the different modules of my application :)
The idea is simple:
def buildLayers(nodes):
layers = []
n = nodes[:] # copy the list
while not len(n) == 0:
layer = _buildRec(layers, n)
if len(layer) == 0: raise RuntimeError('Cyclic Dependency')
for l in layer: n.remove(l)
layers.append(layer)
return layers
def _buildRec(layers, nodes):
"""Build the next layer by selecting nodes whose dependencies
already appear in `layers`
"""
result = []
for n in nodes:
if n.dependencies in flatten(layers): result.append(n) # not truly python
return result
Then you can pop the layers one at a time, and each time you'll be able to send the request to each of the nodes of this layer in parallel.
If you keep a set of the already selected nodes and the dependencies are also represented as a set the check is more efficient. Other implementations would use event propagations to avoid all those nested loops...
Notice in the worst case you have O(n3), but I only had some thirty components and there are not THAT related :p

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio