I am trying to figure out an algorithm to efficiently solve the following problem:
There are w warehouses that store p different products with different quantities
A customer places an order on n out of the p products
The goal is to pick the minimum number of warehouses from which the order could be allocated.
E.g. the distribution of inventory in three warehouses is as follows
| Product 1 | Product 2 | Product 3 |
|---------------|---------------|---------------|---------------|
| Warehouse 1 | 2 | 5 | 0 |
| Warehouse 2 | 1 | 4 | 4 |
| Warehouse 3 | 3 | 1 | 4 |
Now suppose an order is placed with the following ordered quantities:
| Product 1 | Product 2 | Product 3 |
|---------------|---------------|---------------|---------------|
| Ordered Qty | 5 | 4 | 1 |
The optimal solution here would be to allocate the order from Warehouse 1 and Warehouse 3. No other smaller subset of the 3 warehouses would be a better choice
I have tried using brute force to solve this, however, for a larger number of warehouses, the algorithm performs very poorly. I have also tried a few greedy allocation algorithms, however, as expected, they are unable to minimize the number of sub-orders in many cases. Are there any other algorithms/approaches that I should look into?
Part 1 (see also Part 2 below)
Your task looks like a Set Cover Problem which is NP-complete, hence having exponential solving time.
I decided (and implemented in C++) my own solution for it, which might be sub-exponential in one case - if it happens that many sub-sets of warehouses produce same amount of products in sum. In other words if an exponential size of a set of all warehouses sub-sets (which is 2^NumWarehouses) is much bigger than a set of all possible combinations of products counts produced by all sub-sets of warehouses. It often happens like so in most of tests of such problem like your in online competition. If so happens then my solution will be sub-exponential both in CPU and in RAM.
I used Dynamic Programming approach for this. Whole algorithm may be described as following:
We create a map as a key having vector of amount of each product, and this key points to a triple, a) set of previous taken warehouses that reach current products amounts, this is to restore exact chosen warehouses, b) minimal amount of needed to take warehouses to achieve this products amounts, c) previous taken warehous that achieved this minimum of needed warehouses. This set is initialized with single key - vector of 0 products (0, 0, ..., 0).
Iterate through all warehouses in a loop and do 3.-4..
Iterate through all current products amounts (vectors) in a map and do 4..
To iterated vector of products (in a map) we add amounts of products of iterated warehouse. This sum of two vectors is a new key in a map, inside a value pointed by this key we add to set an index of iterated warehouse, while minimum and previous warehouse we set to -1 (uninitialized).
Using a recursive function for each key of a map find a minimum needed amount of warehouses and also find previous warehous achieving this minimum. This is easily done if for given key to iterate all warehouses in a Set, and find (recursively) their minimums, then minimum of current key will be minimum of all minimums plus 1.
Iterate through all keys in a map that are bigger or equal (as a vector) to ordered amount of products. All these keys will give a solution, but only some of them will give Minimal solution, save a key that gives minimal solution of all. In a case if all keys in a map are smaller than current ordered vector then there is no possible solution and we can finish program with error.
Having a minimal key we restore path backwards of all used warehouses to achieve this minimum. This is easy because for each key in a map we keep minimal amount of warehouses and previous warehouse that should be taken to achieve this minimum. Jumping by "previous" warehouses we restore whole path of needed warehouses. Finally output this found minimal solution.
As already mentioned this algorithm has Memory and Time complexity equal to amount of different distinct vectors of products that can be formed by all sub-sets of all warehouses. Which may (if we're lucky) or may not be (if we're unlucky) sub-exponential.
Full C++ code implementing algorithm above (implemented from scratch by me):
Try it online!
#include <cstdint>
#include <vector>
#include <tuple>
#include <map>
#include <set>
#include <unordered_map>
#include <functional>
#include <stdexcept>
#include <iostream>
#include <algorithm>
#define ASSERT(cond) { if (!(cond)) throw std::runtime_error("Assertion (" #cond ") failed at line " + std::to_string(__LINE__) + "!"); }
#define LN { std::cout << "LN " << __LINE__ << std::endl; }
using u16 = uint16_t;
using u32 = uint32_t;
using u64 = uint64_t;
int main() {
std::vector<std::vector<u32>> warehouses_products = {
{2, 5, 0},
{1, 4, 4},
{3, 1, 4},
};
std::vector<u32> order_products = {5, 4, 1};
size_t const nwares = warehouses_products.size(),
nprods = warehouses_products.at(0).size();
ASSERT(order_products.size() == nprods);
std::map<std::vector<u32>, std::tuple<std::set<u16>, u16, u16>> d =
{{std::vector<u32>(nprods), {{}, 0, u16(-1)}}};
for (u16 iware = 0; iware < nwares; ++iware) {
auto const & wprods = warehouses_products[iware];
ASSERT(wprods.size() == nprods);
auto dc = d;
for (auto const & [k, _]: d) {
auto prods = k;
for (size_t i = 0; i < wprods.size(); ++i)
prods[i] += wprods[i];
dc.insert({prods, {{}, u16(-1), u16(-1)}});
std::get<0>(dc[prods]).insert(iware);
}
d = dc;
}
std::function<u16(std::vector<u32> const &)> FindMin =
[&](auto const & prods) {
auto & [a, b, c] = d.at(prods);
if (b != u16(-1))
return b;
u16 minv = u16(-1), minw = u16(-1);
for (auto iware: a) {
auto const & wprods = warehouses_products[iware];
auto cprods = prods;
for (size_t i = 0; i < wprods.size(); ++i)
cprods[i] -= wprods[i];
auto const fmin = FindMin(cprods) + 1;
if (fmin < minv) {
minv = fmin;
minw = iware;
}
}
ASSERT(minv != u16(-1) && minw != u16(-1));
b = minv;
c = minw;
return b;
};
for (auto const & [k, v]: d)
FindMin(k);
std::vector<u32> minp;
u16 minv = u16(-1);
for (auto const & [k, v]: d) {
bool matched = true;
for (size_t i = 0; i < nprods; ++i)
if (order_products[i] > k[i]) {
matched = false;
break;
}
if (!matched)
continue;
if (std::get<1>(v) < minv) {
minv = std::get<1>(v);
minp = k;
}
}
if (minp.empty()) {
std::cout << "Can't buy all products!" << std::endl;
return 0;
}
std::vector<u16> answer;
while (minp != std::vector<u32>(nprods)) {
auto const & [a, b, c] = d.at(minp);
answer.push_back(c);
auto const & wprods = warehouses_products[c];
for (size_t i = 0; i < wprods.size(); ++i)
minp[i] -= wprods[i];
}
std::sort(answer.begin(), answer.end());
std::cout << "WareHouses: ";
for (auto iware: answer)
std::cout << iware << ", ";
std::cout << std::endl;
}
Input:
WareHouses Products:
{2, 5, 0},
{1, 4, 4},
{3, 1, 4},
Ordered Products:
{5, 4, 1}
Output:
WareHouses: 0, 2,
Part 2
Totally different solution I also implemented below.
Now it is based on Back Tracking using Recursive Function.
This solution although being exponential in worth case, yet it gives close to optimal solution after little time. So you just run this program as long as you can afford and whatever it has found so far you output as approximate solution.
Algorithm is as follows:
Suppose we have some products left to buy. Lets sort in descending order all not taken so far warehouses by total amount of all products that they can buy us.
In a loop we take each next warehouse from sorted descending list, but we take only first limit (this is fixed given value) elements from this sorted list. This way we take greedely warehouses in order of relevance, in order of the amount of products left to buy.
After warehouse is taken we do recursive descend into current function in which we again form a sorted list of warehouses and take another most relevant warehouse, in other words jump to 1. of this algorithm.
On each function call if we bought all products and amount of taken warehouses is less than current minimum then we output this solution and update minimum value.
Thus algorithm above starts from very greedy behaviour and then becomes slower and slower while becoming less greedy and more of brute force approach. And very good solutions appear already on first seconds.
As an example below I create 40 random warehouses with 40 random amounts of products each. This quite large task is solved Probably optimal within first second. By saying Probably I mean that next minutes of run don't give any better solution.
Try it online!
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <random>
#include <vector>
#include <functional>
#include <chrono>
#include <cmath>
using u8 = uint8_t;
using u16 = uint16_t;
using u32 = uint32_t;
using i32 = int32_t;
double Time() {
static auto const gtb = std::chrono::high_resolution_clock::now();
return std::chrono::duration_cast<std::chrono::duration<double>>(
std::chrono::high_resolution_clock::now() - gtb).count();
}
void Solve(auto const & wps, auto const & ops) {
size_t const nwares = wps.size(), nprods = ops.size(), max_depth = 1000;
std::vector<u32> prods_left = ops;
std::vector<std::vector<u16>> sorted_wares_all(max_depth);
std::vector<std::vector<u32>> prods_copy_all(max_depth);
std::vector<u16> path;
std::vector<u8> used(nwares);
size_t min_wares = size_t(-1);
auto ProdGrow = [&](auto const & prods){
size_t grow = 0;
for (size_t i = 0; i < nprods; ++i)
grow += std::min(prods_left[i], prods[i]);
return grow;
};
std::function<void(size_t, size_t, size_t)> Rec = [&](size_t depth, size_t off, size_t lim){
size_t prods_need = 0;
for (auto e: prods_left)
prods_need += e;
if (prods_need == 0) {
if (path.size() < min_wares) {
min_wares = path.size();
std::cout << std::endl << "Time " << std::setw(4) << std::llround(Time())
<< " sec, Cnt " << std::setw(3) << path.size() << ": ";
auto cpath = path;
std::sort(cpath.begin(), cpath.end());
for (auto e: cpath)
std::cout << e << ", ";
std::cout << std::endl << std::flush;
}
return;
}
auto & sorted_wares = sorted_wares_all.at(depth);
auto & prods_copy = prods_copy_all.at(depth);
sorted_wares.clear();
for (u16 i = off; i < nwares; ++i)
if (!used[i])
sorted_wares.push_back(i);
std::sort(sorted_wares.begin(), sorted_wares.end(),
[&](auto a, auto b){
return ProdGrow(wps[a]) > ProdGrow(wps[b]);
});
sorted_wares.resize(std::min(lim, sorted_wares.size()));
for (size_t i = 0; i < sorted_wares.size(); ++i) {
u16 const iware = sorted_wares[i];
auto const & wprods = wps[iware];
prods_copy = prods_left;
for (size_t j = 0; j < nprods; ++j)
prods_left[j] -= std::min(prods_left[j], wprods[j]);
path.push_back(iware);
used[iware] = 1;
Rec(depth + 1, iware + 1, lim);
used[iware] = 0;
path.pop_back();
prods_left = prods_copy;
}
for (auto e: sorted_wares)
used[e] = 0;
};
for (size_t lim = 1; lim <= nwares; ++lim) {
std::cout << "Limit " << lim << ", " << std::flush;
Rec(0, 0, lim);
}
}
int main() {
size_t const nwares = 40, nprods = 40;
std::mt19937_64 rng{std::random_device{}()};
std::vector<std::vector<u32>> wps(nwares);
for (size_t i = 0; i < nwares; ++i) {
wps[i].resize(nprods);
for (size_t j = 0; j < nprods; ++j)
wps[i][j] = rng() % 90 + 10;
}
std::vector<u32> ops;
for (size_t i = 0; i < nprods; ++i)
ops.push_back(rng() % (nwares * 20));
Solve(wps, ops);
}
Output:
Limit 1, Limit 2, Limit 3, Limit 4,
Time 0 sec, Cnt 13: 6, 8, 12, 13, 29, 31, 32, 33, 34, 36, 37, 38, 39,
Limit 5,
Time 0 sec, Cnt 12: 6, 8, 12, 13, 28, 29, 31, 32, 36, 37, 38, 39,
Limit 6, Limit 7,
Time 0 sec, Cnt 11: 6, 8, 12, 13, 19, 26, 31, 32, 33, 36, 39,
Limit 8, Limit 9, Limit 10, Limit 11, Limit 12, Limit 13, Limit 14, Limit 15,
If you want to go down the ILP route, you could formulate the following programme:
Where w is the number of warehouses, p the number of products, n_j the quantity of product j ordered, and C_ij the quantity of product j stored in warehouse i. Then, the decisions are to select warehouse i (x_i = 1) or not (x_i = 0).
Using Google's ortools and the open-source CBC solver, this could be implemented as follows in Python:
import numpy as np
from ortools.linear_solver import pywraplp
# Some test data, replace with your own.
p = 50
w = 1000
n = np.random.randint(0, 10, p)
C = np.random.randint(0, 5, (w, p))
solver = pywraplp.Solver("model", pywraplp.Solver.CBC_MIXED_INTEGER_PROGRAMMING)
x = [solver.BoolVar(f"x[{i}]") for i in range(w)]
for j in range(p):
solver.Add(C[:, j] # x >= n[j])
solver.Minimize(sum(x))
This formulation solves instances with up to a thousand warehouses in a few seconds to a minute. Smaller instances solve much quicker, for (I hope) obvious reasons.
The following outputs the solution, and some statistics:
assert solver.Solve() is not None
print("Solution:")
print(f"assigned = {[i + 1 for i in range(len(x)) if x[i].solution_value()]}")
print(f" obj = {solver.Objective().Value()}")
print(f" time = {solver.WallTime() / 1000}s")
My situation
Input: a set of rectangles
each rect is comprised of 4 doubles like this: (x0,y0,x1,y1)
they are not "rotated" at any angle, all they are "normal" rectangles that go "up/down" and "left/right" with respect to the screen
they are randomly placed - they may be touching at the edges, overlapping , or not have any contact
I will have several hundred rectangles
this is implemented in C#
I need to find
The area that is formed by their overlap - all the area in the canvas that more than one rectangle "covers" (for example with two rectangles, it would be the intersection)
I don't need the geometry of the overlap - just the area (example: 4 sq inches)
Overlaps shouldn't be counted multiple times - so for example imagine 3 rects that have the same size and position - they are right on top of each other - this area should be counted once (not three times)
Example
The image below contains thre rectangles: A,B,C
A and B overlap (as indicated by dashes)
B and C overlap (as indicated by dashes)
What I am looking for is the area where the dashes are shown
-
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAA--------------BBB
AAAAAAAAAAAAAAAA--------------BBB
AAAAAAAAAAAAAAAA--------------BBB
AAAAAAAAAAAAAAAA--------------BBB
BBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBB
BBBBBBBBBBBBBBBBB
BBBBBB-----------CCCCCCCC
BBBBBB-----------CCCCCCCC
BBBBBB-----------CCCCCCCC
CCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCC
CCCCCCCCCCCCCCCCCCC
An efficient way of computing this area is to use a sweep algorithm. Let us assume that we sweep a vertical line L(x) through the union of rectangles U:
first of all, you need to build an event queue Q, which is, in this case, the ordered list of all x-coordinates (left and right) of the rectangles.
during the sweep, you should maintain a 1D datastructure, which should give you the total length of the intersection of L(x) and U. The important thing is that this length is constant between two consecutive events q and q' of Q. So, if l(q) denotes the total length of L(q+) (i.e. L just on the rightside of q) intersected with U, the area swept by L between events q and q' is exactly l(q)*(q' - q).
you just have to sum up all these swept areas to get the total one.
We still have to solve the 1D problem. You want a 1D structure, which computes dynamically a union of (vertical) segments. By dynamically, I mean that you sometimes add a new segment, and sometimes remove one.
I already detailed in my answer to this collapsing ranges question how to do it in a static way (which is in fact a 1D sweep). So if you want something simple, you can directly apply that (by recomputing the union for each event). If you want something more efficient, you just need to adapt it a bit:
assuming that you know the union of segments S1...Sn consists of disjoints segments D1...Dk. Adding Sn+1 is very easy, you just have to locate both ends of Sn+1 amongs the ends of D1...Dk.
assuming that you know the union of segments S1...Sn consists of disjoints segments D1...Dk, removing segment Si (assuming that Si was included in Dj) means recomputing the union of segments that Dj consisted of, except Si (using the static algorithm).
This is your dynamic algorithm. Assuming that you will use sorted sets with log-time location queries to represent D1...Dk, this is probably the most efficient non-specialized method you can get.
One way-out approach is to plot it to a canvas! Draw each rectangle using a semi-transparent colour. The .NET runtime will be doing the drawing in optimised, native code - or even using a hardware accelerator.
Then, you have to read-back the pixels. Is each pixel the background colour, the rectangle colour, or another colour? The only way it can be another colour is if two or more rectangles overlapped...
If this is too much of a cheat, I'd recommend the quad-tree as another answerer did, or the r-tree.
The simplest solution
import numpy as np
A = np.zeros((100, 100))
B = np.zeros((100, 100))
A[rect1.top : rect1.bottom, rect1.left : rect1.right] = 1
B[rect2.top : rect2.bottom, rect2.left : rect2.right] = 1
area_of_union = np.sum((A + B) > 0)
area_of_intersect = np.sum((A + B) > 1)
In this example, we create two zero-matrices that are the size of the canvas. For each rectangle, fill one of these matrices with ones where the rectangle takes up space. Then sum the matrices. Now sum(A+B > 0) is the area of the union, and sum(A+B > 1) is the area of the overlap. This example can easily generalize to multiple rectangles.
This is some quick and dirty code that I used in the TopCoder SRM 160 Div 2.
t = top
b = botttom
l = left
r = right
public class Rect
{
public int t, b, l, r;
public Rect(int _l, int _b, int _r, int _t)
{
t = _t;
b = _b;
l = _l;
r = _r;
}
public bool Intersects(Rect R)
{
return !(l > R.r || R.l > r || R.b > t || b > R.t);
}
public Rect Intersection(Rect R)
{
if(!this.Intersects(R))
return new Rect(0,0,0,0);
int [] horiz = {l, r, R.l, R.r};
Array.Sort(horiz);
int [] vert = {b, t, R.b, R.t};
Array.Sort(vert);
return new Rect(horiz[1], vert[1], horiz[2], vert[2]);
}
public int Area()
{
return (t - b)*(r-l);
}
public override string ToString()
{
return l + " " + b + " " + r + " " + t;
}
}
Here's something that off the top of my head sounds like it might work:
Create a dictionary with a double key, and a list of rectangle+boolean values, like this:
Dictionary< Double, List< KeyValuePair< Rectangle, Boolean>>> rectangles;
For each rectangle in your set, find the corresponding list for the x0 and the x1 values, and add the rectangle to that list, with a boolean value of true for x0, and false for x1. This way you now have a complete list of all the x-coordinates that each rectangle either enters (true) or leaves (false) the x-direction
Grab all the keys from that dictionary (all the distinct x-coordinates), sort them, and loop through them in order, make sure you can get at both the current x-value, and the next one as well (you need them both). This gives you individual strips of rectangles
Maintain a set of rectangles you're currently looking at, which starts out empty. For each x-value you iterate over in point 3, if the rectangle is registered with a true value, add it to the set, otherwise remove it.
For a strip, sort the rectangles by their y-coordinate
Loop through the rectangles in the strip, counting overlapping distances (unclear to me as of yet how to do this efficiently)
Calculate width of strip times height of overlapping distances to get areas
Example, 5 rectangles, draw on top of each other, from a to e:
aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbb
aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbb
aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbb
aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbb
aaaaaaaadddddddddddddddddddddddddddddbbbbbb
aaaaaaaadddddddddddddddddddddddddddddbbbbbb
ddddddddddddddddddddddddddddd
ddddddddddddddddddddddddddddd
ddddddddddddddeeeeeeeeeeeeeeeeee
ddddddddddddddeeeeeeeeeeeeeeeeee
ddddddddddddddeeeeeeeeeeeeeeeeee
ccccccccddddddddddddddeeeeeeeeeeeeeeeeee
ccccccccddddddddddddddeeeeeeeeeeeeeeeeee
cccccccccccc eeeeeeeeeeeeeeeeee
cccccccccccc eeeeeeeeeeeeeeeeee
cccccccccccc
cccccccccccc
Here's the list of x-coordinates:
v v v v v v v v v
|aaaaaaa|aa|aaaa | bbbbbbbbbb|bb|bbb
|aaaaaaa|aa|aaaa | bbbbbbbbbb|bb|bbb
|aaaaaaa|aa|aaaa | bbbbbbbbbb|bb|bbb
|aaaaaaa|aa|aaaa | bbbbbbbbbb|bb|bbb
|aaaaaaaddd|dddddddddd|ddddddddddddddbb|bbb
|aaaaaaaddd|dddddddddd|ddddddddddddddbb|bbb
| ddd|dddddddddd|dddddddddddddd |
| ddd|dddddddddd|dddddddddddddd |
| ddd|ddddddddddeeeeeeeeeeeeeeeeee
| ddd|ddddddddddeeeeeeeeeeeeeeeeee
| ddd|ddddddddddeeeeeeeeeeeeeeeeee
ccccccccddd|ddddddddddeeeeeeeeeeeeeeeeee
ccccccccddd|ddddddddddeeeeeeeeeeeeeeeeee
cccccccccccc eeeeeeeeeeeeeeeeee
cccccccccccc eeeeeeeeeeeeeeeeee
cccccccccccc
cccccccccccc
The list would be (where each v is simply given a coordinate starting at 0 and going up):
0: +a, +c
1: +d
2: -c
3: -a
4: +e
5: +b
6: -d
7: -e
8: -b
Each strip would thus be (rectangles sorted from top to bottom):
0-1: a, c
1-2: a, d, c
2-3: a, d
3-4: d
4-5: d, e
5-6: b, d, e
6-7: b, e
7-8: b
for each strip, the overlap would be:
0-1: none
1-2: a/d, d/c
2-3: a/d
3-4: none
4-5: d/e
5-6: b/d, d/e
6-7: none
7-8: none
I'd imagine that a variation of the sort + enter/leave algorithm for the top-bottom check would be doable as well:
sort the rectangles we're currently analyzing in the strip, top to bottom, for rectangles with the same top-coordinate, sort them by bottom coordinate as well
iterate through the y-coordinates, and when you enter a rectangle, add it to the set, when you leave a rectangle, remove it from the set
whenever the set has more than one rectangle, you have overlap (and if you make sure to add/remove all rectangles that have the same top/bottom coordinate you're currently looking at, multiple overlapping rectangles would not be a problem
For the 1-2 strip above, you would iterate like this:
0. empty set, zero sum
1. enter a, add a to set (1 rectangle in set)
2. enter d, add d to set (>1 rectangles in set = overlap, store this y-coordinate)
3. leave a, remove a from set (now back from >1 rectangles in set, add to sum: y - stored_y
4. enter c, add c to set (>1 rectangles in set = overlap, store this y-coordinate)
5. leave d, remove d from set (now back from >1 rectangles in set, add to sum: y - stored_y)
6. multiply sum with width of strip to get overlapping areas
You would not actually have to maintain an actual set here either, just the count of the rectangles you're inside, whenever this goes from 1 to 2, store the y, and whenever it goes from 2 down to 1, calculate current y - stored y, and sum this difference.
Hope this was understandable, and as I said, this is off the top of my head, not tested in any way.
Using the example:
1 2 3 4 5 6
1 +---+---+
| |
2 + A +---+---+
| | B |
3 + + +---+---+
| | | | |
4 +---+---+---+---+ +
| |
5 + C +
| |
6 +---+---+
1) collect all the x coordinates (both left and right) into a list, then sort it and remove duplicates
1 3 4 5 6
2) collect all the y coordinates (both top and bottom) into a list, then sort it and remove duplicates
1 2 3 4 6
3) create a 2D array by number of gaps between the unique x coordinates * number of gaps between the unique y coordinates.
4 * 4
4) paint all the rectangles into this grid, incrementing the count of each cell it occurs over:
1 3 4 5 6
1 +---+
| 1 | 0 0 0
2 +---+---+---+
| 1 | 1 | 1 | 0
3 +---+---+---+---+
| 1 | 1 | 2 | 1 |
4 +---+---+---+---+
0 0 | 1 | 1 |
6 +---+---+
5) the sum total of the areas of the cells in the grid that have a count greater than one is the area of overlap. For better efficiency in sparse use-cases, you can actually keep a running total of the area as you paint the rectangles, each time you move a cell from 1 to 2.
In the question, the rectangles are described as being four doubles. Doubles typically contain rounding errors, and error might creep into your computed area of overlap. If the legal coordinates are at finite points, consider using an integer representation.
PS using the hardware accelerator as in my other answer is not such a shabby idea, if the resolution is acceptable. Its also easy to implement in a lot less code than the approach I outline above. Horses for courses.
Here's the code I wrote for the area sweep algorithm:
#include <iostream>
#include <vector>
using namespace std;
class Rectangle {
public:
int x[2], y[2];
Rectangle(int x1, int y1, int x2, int y2) {
x[0] = x1;
y[0] = y1;
x[1] = x2;
y[1] = y2;
};
void print(void) {
cout << "Rect: " << x[0] << " " << y[0] << " " << x[1] << " " << y[1] << " " <<endl;
};
};
// return the iterator of rec in list
vector<Rectangle *>::iterator bin_search(vector<Rectangle *> &list, int begin, int end, Rectangle *rec) {
cout << begin << " " <<end <<endl;
int mid = (begin+end)/2;
if (list[mid]->y[0] == rec->y[0]) {
if (list[mid]->y[1] == rec->y[1])
return list.begin() + mid;
else if (list[mid]->y[1] < rec->y[1]) {
if (mid == end)
return list.begin() + mid+1;
return bin_search(list,mid+1,mid,rec);
}
else {
if (mid == begin)
return list.begin()+mid;
return bin_search(list,begin,mid-1,rec);
}
}
else if (list[mid]->y[0] < rec->y[0]) {
if (mid == end) {
return list.begin() + mid+1;
}
return bin_search(list, mid+1, end, rec);
}
else {
if (mid == begin) {
return list.begin() + mid;
}
return bin_search(list, begin, mid-1, rec);
}
}
// add rect to rects
void add_rec(Rectangle *rect, vector<Rectangle *> &rects) {
if (rects.size() == 0) {
rects.push_back(rect);
}
else {
vector<Rectangle *>::iterator it = bin_search(rects, 0, rects.size()-1, rect);
rects.insert(it, rect);
}
}
// remove rec from rets
void remove_rec(Rectangle *rect, vector<Rectangle *> &rects) {
vector<Rectangle *>::iterator it = bin_search(rects, 0, rects.size()-1, rect);
rects.erase(it);
}
// calculate the total vertical length covered by rectangles in the active set
int vert_dist(vector<Rectangle *> as) {
int n = as.size();
int totallength = 0;
int start, end;
int i = 0;
while (i < n) {
start = as[i]->y[0];
end = as[i]->y[1];
while (i < n && as[i]->y[0] <= end) {
if (as[i]->y[1] > end) {
end = as[i]->y[1];
}
i++;
}
totallength += end-start;
}
return totallength;
}
bool mycomp1(Rectangle* a, Rectangle* b) {
return (a->x[0] < b->x[0]);
}
bool mycomp2(Rectangle* a, Rectangle* b) {
return (a->x[1] < b->x[1]);
}
int findarea(vector<Rectangle *> rects) {
vector<Rectangle *> start = rects;
vector<Rectangle *> end = rects;
sort(start.begin(), start.end(), mycomp1);
sort(end.begin(), end.end(), mycomp2);
// active set
vector<Rectangle *> as;
int n = rects.size();
int totalarea = 0;
int current = start[0]->x[0];
int next;
int i = 0, j = 0;
// big loop
while (j < n) {
cout << "loop---------------"<<endl;
// add all recs that start at current
while (i < n && start[i]->x[0] == current) {
cout << "add" <<endl;
// add start[i] to AS
add_rec(start[i], as);
cout << "after" <<endl;
i++;
}
// remove all recs that end at current
while (j < n && end[j]->x[1] == current) {
cout << "remove" <<endl;
// remove end[j] from AS
remove_rec(end[j], as);
cout << "after" <<endl;
j++;
}
// find next event x
if (i < n && j < n) {
if (start[i]->x[0] <= end[j]->x[1]) {
next = start[i]->x[0];
}
else {
next = end[j]->x[1];
}
}
else if (j < n) {
next = end[j]->x[1];
}
// distance to next event
int horiz = next - current;
cout << "horiz: " << horiz <<endl;
// figure out vertical dist
int vert = vert_dist(as);
cout << "vert: " << vert <<endl;
totalarea += vert * horiz;
current = next;
}
return totalarea;
}
int main() {
vector<Rectangle *> rects;
rects.push_back(new Rectangle(0,0,1,1));
rects.push_back(new Rectangle(1,0,2,3));
rects.push_back(new Rectangle(0,0,3,3));
rects.push_back(new Rectangle(1,0,5,1));
cout << findarea(rects) <<endl;
}
You can simplify this problem quite a bit if you split each rectangle into smaller rectangles. Collect all of the X and Y coordinates of all the rectangles, and these become your split points - if a rectangle crosses the line, split it in two. When you're done, you have a list of rectangles that overlap either 0% or 100%, if you sort them it should be easy to find the identical ones.
There is a solution listed at the link http://codercareer.blogspot.com/2011/12/no-27-area-of-rectangles.html for finding the total area of multiple rectangles such that the overlapped area is counted only once.
The above solution can be extended to compute only the overlapped area(and that too only once even if the overlapped area is covered by multiple rectangles) with horizontal sweep lines for every pair of consecutive vertical sweep lines.
If aim is just to find out the total area covered by the all the rectangles, then horizontal sweep lines are not needed and just a merge of all the rectangles between two vertical sweep lines would give the area.
On the other hand, if you want to compute the overlapped area only, the horizontal sweep lines are needed to find out how many rectangles are overlapping in between vertical (y1, y2) sweep lines.
Here is the working code for the solution I implemented in Java.
import java.io.*;
import java.util.*;
class Solution {
static class Rectangle{
int x;
int y;
int dx;
int dy;
Rectangle(int x, int y, int dx, int dy){
this.x = x;
this.y = y;
this.dx = dx;
this.dy = dy;
}
Range getBottomLeft(){
return new Range(x, y);
}
Range getTopRight(){
return new Range(x + dx, y + dy);
}
#Override
public int hashCode(){
return (x+y+dx+dy)/4;
}
#Override
public boolean equals(Object other){
Rectangle o = (Rectangle) other;
return o.x == this.x && o.y == this.y && o.dx == this.dx && o.dy == this.dy;
}
#Override
public String toString(){
return String.format("X = %d, Y = %d, dx : %d, dy : %d", x, y, dx, dy);
}
}
static class RW{
Rectangle r;
boolean start;
RW (Rectangle r, boolean start){
this.r = r;
this.start = start;
}
#Override
public int hashCode(){
return r.hashCode() + (start ? 1 : 0);
}
#Override
public boolean equals(Object other){
RW o = (RW)other;
return o.start == this.start && o.r.equals(this.r);
}
#Override
public String toString(){
return "Rectangle : " + r.toString() + ", start = " + this.start;
}
}
static class Range{
int l;
int u;
public Range(int l, int u){
this.l = l;
this.u = u;
}
#Override
public int hashCode(){
return (l+u)/2;
}
#Override
public boolean equals(Object other){
Range o = (Range) other;
return o.l == this.l && o.u == this.u;
}
#Override
public String toString(){
return String.format("L = %d, U = %d", l, u);
}
}
static class XComp implements Comparator<RW>{
#Override
public int compare(RW rw1, RW rw2){
//TODO : revisit these values.
Integer x1 = -1;
Integer x2 = -1;
if(rw1.start){
x1 = rw1.r.x;
}else{
x1 = rw1.r.x + rw1.r.dx;
}
if(rw2.start){
x2 = rw2.r.x;
}else{
x2 = rw2.r.x + rw2.r.dx;
}
return x1.compareTo(x2);
}
}
static class YComp implements Comparator<RW>{
#Override
public int compare(RW rw1, RW rw2){
//TODO : revisit these values.
Integer y1 = -1;
Integer y2 = -1;
if(rw1.start){
y1 = rw1.r.y;
}else{
y1 = rw1.r.y + rw1.r.dy;
}
if(rw2.start){
y2 = rw2.r.y;
}else{
y2 = rw2.r.y + rw2.r.dy;
}
return y1.compareTo(y2);
}
}
public static void main(String []args){
Rectangle [] rects = new Rectangle[4];
rects[0] = new Rectangle(10, 10, 10, 10);
rects[1] = new Rectangle(15, 10, 10, 10);
rects[2] = new Rectangle(20, 10, 10, 10);
rects[3] = new Rectangle(25, 10, 10, 10);
int totalArea = getArea(rects, false);
System.out.println("Total Area : " + totalArea);
int overlapArea = getArea(rects, true);
System.out.println("Overlap Area : " + overlapArea);
}
static int getArea(Rectangle []rects, boolean overlapOrTotal){
printArr(rects);
// step 1: create two wrappers for every rectangle
RW []rws = getWrappers(rects);
printArr(rws);
// steps 2 : sort rectangles by their x-coordinates
Arrays.sort(rws, new XComp());
printArr(rws);
// step 3 : group the rectangles in every range.
Map<Range, List<Rectangle>> rangeGroups = groupRects(rws, true);
for(Range xrange : rangeGroups.keySet()){
List<Rectangle> xRangeRects = rangeGroups.get(xrange);
System.out.println("Range : " + xrange);
System.out.println("Rectangles : ");
for(Rectangle rectx : xRangeRects){
System.out.println("\t" + rectx);
}
}
// step 4 : iterate through each of the pairs and their rectangles
int sum = 0;
for(Range range : rangeGroups.keySet()){
List<Rectangle> rangeRects = rangeGroups.get(range);
sum += getOverlapOrTotalArea(rangeRects, range, overlapOrTotal);
}
return sum;
}
static Map<Range, List<Rectangle>> groupRects(RW []rws, boolean isX){
//group the rws with either x or y coordinates.
Map<Range, List<Rectangle>> rangeGroups = new HashMap<Range, List<Rectangle>>();
List<Rectangle> rangeRects = new ArrayList<Rectangle>();
int i=0;
int prev = Integer.MAX_VALUE;
while(i < rws.length){
int curr = isX ? (rws[i].start ? rws[i].r.x : rws[i].r.x + rws[i].r.dx): (rws[i].start ? rws[i].r.y : rws[i].r.y + rws[i].r.dy);
if(prev < curr){
Range nRange = new Range(prev, curr);
rangeGroups.put(nRange, rangeRects);
rangeRects = new ArrayList<Rectangle>(rangeRects);
}
prev = curr;
if(rws[i].start){
rangeRects.add(rws[i].r);
}else{
rangeRects.remove(rws[i].r);
}
i++;
}
return rangeGroups;
}
static int getOverlapOrTotalArea(List<Rectangle> rangeRects, Range range, boolean isOverlap){
//create horizontal sweep lines similar to vertical ones created above
// Step 1 : create wrappers again
RW []rws = getWrappers(rangeRects);
// steps 2 : sort rectangles by their y-coordinates
Arrays.sort(rws, new YComp());
// step 3 : group the rectangles in every range.
Map<Range, List<Rectangle>> yRangeGroups = groupRects(rws, false);
//step 4 : for every range if there are more than one rectangles then computer their area only once.
int sum = 0;
for(Range yRange : yRangeGroups.keySet()){
List<Rectangle> yRangeRects = yRangeGroups.get(yRange);
if(isOverlap){
if(yRangeRects.size() > 1){
sum += getArea(range, yRange);
}
}else{
if(yRangeRects.size() > 0){
sum += getArea(range, yRange);
}
}
}
return sum;
}
static int getArea(Range r1, Range r2){
return (r2.u-r2.l)*(r1.u-r1.l);
}
static RW[] getWrappers(Rectangle []rects){
RW[] wrappers = new RW[rects.length * 2];
for(int i=0,j=0;i<rects.length;i++, j+=2){
wrappers[j] = new RW(rects[i], true);
wrappers[j+1] = new RW(rects[i], false);
}
return wrappers;
}
static RW[] getWrappers(List<Rectangle> rects){
RW[] wrappers = new RW[rects.size() * 2];
for(int i=0,j=0;i<rects.size();i++, j+=2){
wrappers[j] = new RW(rects.get(i), true);
wrappers[j+1] = new RW(rects.get(i), false);
}
return wrappers;
}
static void printArr(Object []a){
for(int i=0; i < a.length;i++){
System.out.println(a[i]);
}
System.out.println();
}
The following answer should give the total Area only once.
it comes previous answers, but implemented now in C#.
It works also with floats (or double, if you need[it doesn't itterate over the VALUES).
Credits:
http://codercareer.blogspot.co.il/2011/12/no-27-area-of-rectangles.html
EDIT:
The OP asked for the overlapping area - thats obviously very simple:
var totArea = rects.Sum(x => x.Width * x.Height);
and then the answer is:
var overlappingArea =totArea-GetArea(rects)
Here is the code:
#region rectangle overlapping
/// <summary>
/// see algorithm for detecting overlapping areas here: https://stackoverflow.com/a/245245/3225391
/// or easier here:
/// http://codercareer.blogspot.co.il/2011/12/no-27-area-of-rectangles.html
/// </summary>
/// <param name="dim"></param>
/// <returns></returns>
public static float GetArea(RectangleF[] rects)
{
List<float> xs = new List<float>();
foreach (var item in rects)
{
xs.Add(item.X);
xs.Add(item.Right);
}
xs = xs.OrderBy(x => x).Cast<float>().ToList();
rects = rects.OrderBy(rec => rec.X).Cast<RectangleF>().ToArray();
float area = 0f;
for (int i = 0; i < xs.Count - 1; i++)
{
if (xs[i] == xs[i + 1])//not duplicate
continue;
int j = 0;
while (rects[j].Right < xs[i])
j++;
List<Range> rangesOfY = new List<Range>();
var rangeX = new Range(xs[i], xs[i + 1]);
GetRangesOfY(rects, j, rangeX, out rangesOfY);
area += GetRectArea(rangeX, rangesOfY);
}
return area;
}
private static void GetRangesOfY(RectangleF[] rects, int rectIdx, Range rangeX, out List<Range> rangesOfY)
{
rangesOfY = new List<Range>();
for (int j = rectIdx; j < rects.Length; j++)
{
if (rangeX.less < rects[j].Right && rangeX.greater > rects[j].Left)
{
rangesOfY = Range.AddRange(rangesOfY, new Range(rects[j].Top, rects[j].Bottom));
#if DEBUG
Range rectXRange = new Range(rects[j].Left, rects[j].Right);
#endif
}
}
}
static float GetRectArea(Range rangeX, List<Range> rangesOfY)
{
float width = rangeX.greater - rangeX.less,
area = 0;
foreach (var item in rangesOfY)
{
float height = item.greater - item.less;
area += width * height;
}
return area;
}
internal class Range
{
internal static List<Range> AddRange(List<Range> lst, Range rng2add)
{
if (lst.isNullOrEmpty())
{
return new List<Range>() { rng2add };
}
for (int i = lst.Count - 1; i >= 0; i--)
{
var item = lst[i];
if (item.IsOverlapping(rng2add))
{
rng2add.Merge(item);
lst.Remove(item);
}
}
lst.Add(rng2add);
return lst;
}
internal float greater, less;
public override string ToString()
{
return $"ln{less} gtn{greater}";
}
internal Range(float less, float greater)
{
this.less = less;
this.greater = greater;
}
private void Merge(Range rng2add)
{
this.less = Math.Min(rng2add.less, this.less);
this.greater = Math.Max(rng2add.greater, this.greater);
}
private bool IsOverlapping(Range rng2add)
{
return !(less > rng2add.greater || rng2add.less > greater);
//return
// this.greater < rng2add.greater && this.greater > rng2add.less
// || this.less > rng2add.less && this.less < rng2add.greater
// || rng2add.greater < this.greater && rng2add.greater > this.less
// || rng2add.less > this.less && rng2add.less < this.greater;
}
}
#endregion rectangle overlapping
If your rectangles are going to be sparse (mostly not intersecting) then it might be worth a look at recursive dimensional clustering. Otherwise a quad-tree seems to be the way to go (as has been mentioned by other posters.
This is a common problem in collision detection in computer games, so there is no shortage of resources suggesting ways to solve it.
Here is a nice blog post summarizing RCD.
Here is a Dr.Dobbs article summarizing various collision detection algorithms, which would be suitable.
This type of collision detection is often called AABB (Axis Aligned Bounding Boxes), that's a good starting point for a google search.
You can find the overlap on the x and on the y axis and multiply those.
int LineOverlap(int line1a, line1b, line2a, line2b)
{
// assume line1a <= line1b and line2a <= line2b
if (line1a < line2a)
{
if (line1b > line2b)
return line2b-line2a;
else if (line1b > line2a)
return line1b-line2a;
else
return 0;
}
else if (line2a < line1b)
return line2b-line1a;
else
return 0;
}
int RectangleOverlap(Rect rectA, rectB)
{
return LineOverlap(rectA.x1, rectA.x2, rectB.x1, rectB.x2) *
LineOverlap(rectA.y1, rectA.y2, rectB.y1, rectB.y2);
}
I found a different solution than the sweep algorithm.
Since your rectangles are all rectangular placed, the horizontal and vertical lines of the rectangles will form a rectangular irregular grid. You can 'paint' the rectangles on this grid; which means, you can determine which fields of the grid will be filled out. Since the grid lines are formed from the boundaries of the given rectangles, a field in this grid will always either completely empty or completely filled by an rectangle.
I had to solve the problem in Java, so here's my solution: http://pastebin.com/03mss8yf
This function calculates of the complete area occupied by the rectangles. If you are interested only in the 'overlapping' part, you must extend the code block between lines 70 and 72. Maybe you can use a second set to store which grid fields are used more than once. Your code between line 70 and 72 should be replaced with a block like:
GridLocation gl = new GridLocation(curX, curY);
if(usedLocations.contains(gl) && usedLocations2.add(gl)) {
ret += width*height;
} else {
usedLocations.add(gl);
}
The variable usedLocations2 here is of the same type as usedLocations; it will be constructed
at the same point.
I'm not really familiar with complexity calculations; so I don't know which of the two solutions (sweep or my grid solution) will perform/scale better.
Considering we have two rectangles (A and B) and we have their bottom left (x1,y1) and top right (x2,y2) coordination. The Using following piece of code you can calculate the overlapped area in C++.
#include <iostream>
using namespace std;
int rectoverlap (int ax1, int ay1, int ax2, int ay2, int bx1, int by1, int bx2, int by2)
{
int width, heigh, area;
if (ax2<bx1 || ay2<by1 || ax1>bx2 || ay1>by2) {
cout << "Rectangles are not overlapped" << endl;
return 0;
}
if (ax2>=bx2 && bx1>=ax1){
width=bx2-bx1;
heigh=by2-by1;
} else if (bx2>=ax2 && ax1>=bx1) {
width=ax2-ax1;
heigh=ay2-ay1;
} else {
if (ax2>bx2){
width=bx2-ax1;
} else {
width=ax2-bx1;
}
if (ay2>by2){
heigh=by2-ay1;
} else {
heigh=ay2-by1;
}
}
area= heigh*width;
return (area);
}
int main()
{
int ax1,ay1,ax2,ay2,bx1,by1,bx2,by2;
cout << "Inter the x value for bottom left for rectangle A" << endl;
cin >> ax1;
cout << "Inter the y value for bottom left for rectangle A" << endl;
cin >> ay1;
cout << "Inter the x value for top right for rectangle A" << endl;
cin >> ax2;
cout << "Inter the y value for top right for rectangle A" << endl;
cin >> ay2;
cout << "Inter the x value for bottom left for rectangle B" << endl;
cin >> bx1;
cout << "Inter the y value for bottom left for rectangle B" << endl;
cin >> by1;
cout << "Inter the x value for top right for rectangle B" << endl;
cin >> bx2;
cout << "Inter the y value for top right for rectangle B" << endl;
cin >> by2;
cout << "The overlapped area is " << rectoverlap (ax1, ay1, ax2, ay2, bx1, by1, bx2, by2) << endl;
}
The post by user3048546 contains an error in the logic on lines 12-17. Here is a working implementation:
int rectoverlap (int ax1, int ay1, int ax2, int ay2, int bx1, int by1, int bx2, int by2)
{
int width, height, area;
if (ax2<bx1 || ay2<by1 || ax1>bx2 || ay1>by2) {
cout << "Rectangles are not overlapped" << endl;
return 0;
}
if (ax2>=bx2 && bx1>=ax1){
width=bx2-bx1;
} else if (bx2>=ax2 && ax1>=bx1) {
width=ax2-ax1;
} else if (ax2>bx2) {
width=bx2-ax1;
} else {
width=ax2-bx1;
}
if (ay2>=by2 && by1>=ay1){
height=by2-by1;
} else if (by2>=ay2 && ay1>=by1) {
height=ay2-ay1;
} else if (ay2>by2) {
height=by2-ay1;
} else {
height=ay2-by1;
}
area = heigh*width;
return (area);
}