Most efficient algorithm to find the biggest square in a two dimension map [duplicate] - algorithm

This question already has answers here:
Dynamic programming - Largest square block
(7 answers)
Closed 1 year ago.
I would like to know the different algorithms for finding the biggest square in a two-dimensional map dotted with obstacles.
An example, where o would be obstacles:
...........................
....o......................
............o..............
...........................
....o......................
...............o...........
...........................
......o..............o.....
..o.......o................
The biggest square would be (if we choose the first one):
.....xxxxxxx...............
....oxxxxxxx...............
.....xxxxxxxo..............
.....xxxxxxx...............
....oxxxxxxx...............
.....xxxxxxx...o...........
.....xxxxxxx...............
......o..............o.....
..o.......o................
What would be the fastest algorithm to find it? The one with the smallest complexity?
EDIT: I know that people are interested in the algorithm explained in the accepted answer, so I made a document that explains it a bit more. You can find it here:
https://docs.google.com/document/d/19pHCD433tYsvAor0WObxa2qusAjKdx96kaf3z5I8XT8/edit?usp=sharing

Here is how to do this in the optimal amount of time, O(nm). This is built on top of @dukeling's insight that you never need to check a solution of size less than your current known best solution.
The key is to be able to build a data structure that can answer this query in O(1) time.
Is there an obstacle in the square whose top left corner is at r, c and has size k?
To solve that problem, we'll support answering a slightly harder question, also in O(1).
What is the count of items in the rectangle from r1, c1 to r2, c2?
It's easy to answer the square existence question with an answer from the rectangle count question.
To answer the rectangle count question, note that if you had pre-computed the answer for every rectangle that starts in the top left, then you could answer the general question for r1, c1 to r2, c2 by an inclusion-exclusion tactic using only rectangles that start in the top left:
                c1   c2
-----------------------
|             |    |  |
|      A      | B  |  |
|_____________|____|  |  r1
|             |    |  |
|      C      | D  |  |
|_____________|____|  |  r2
|_____________________|
We want the count of stuff inside D, expressed in terms of our pre-computed counts from the top left:
Count(D) = Count(A ∪ B ∪ C ∪ D) - Count(A ∪ C) - Count(A ∪ B) + Count(A)
You can pre-compute all the top left rectangles in O(nm) by doing some clever row/column partial sums, but I'll leave that to you.
Then answering the problem you want just involves checking possible solutions, always trying a square one larger than your known best. Your known best will only get better up to min(n, m) times in total, so the best_possible increment will happen very rarely and almost all squares will be rejected in O(1) time.
best_possible = 0
for r in range(n):
    for c in range(m):
        while True:
            # this looks O(min(n, m)), but it's amortized O(1) since best_possible
            # rarely increases.
            if possible(r, c, best_possible+1):
                best_possible += 1
            else:
                break
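A minimal sketch of how the pieces above might fit together, assuming the map is a list of rows in which 'o' marks an obstacle. The names build_prefix_counts, count_obstacles and largest_square are made up for this illustration, and the prefix table is one possible way to do the row/column partial sums that the answer leaves to the reader.

```python
def build_prefix_counts(grid):
    """prefix[r][c] = number of obstacles in rows [0, r) and columns [0, c)."""
    n, m = len(grid), len(grid[0])
    prefix = [[0] * (m + 1) for _ in range(n + 1)]
    for r in range(n):
        row_sum = 0  # obstacles seen so far in the current row
        for c in range(m):
            row_sum += 1 if grid[r][c] == 'o' else 0
            prefix[r + 1][c + 1] = prefix[r][c + 1] + row_sum
    return prefix


def count_obstacles(prefix, r1, c1, r2, c2):
    """Obstacles in rows [r1, r2) and columns [c1, c2), by inclusion-exclusion."""
    return prefix[r2][c2] - prefix[r1][c2] - prefix[r2][c1] + prefix[r1][c1]


def largest_square(grid):
    """Return (side, (row, col)) of the first largest obstacle-free square found."""
    n, m = len(grid), len(grid[0])
    prefix = build_prefix_counts(grid)
    best, best_pos = 0, None
    for r in range(n):
        for c in range(m):
            # Only ever try a square one larger than the best known so far.
            while (r + best + 1 <= n and c + best + 1 <= m
                   and count_obstacles(prefix, r, c, r + best + 1, c + best + 1) == 0):
                best += 1
                best_pos = (r, c)
    return best, best_pos
```

Here the half-open ranges play the role of the r1, c1 to r2, c2 rectangle in the diagram, which avoids special-casing the first row and column.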

One idea, making use of binary search.
The basic idea:
Start off in the top-left corner. See if a 1x1 square would work.
If it will work, increase the sides lengths of the square by 1 and repeat.
If it won't work, move right and repeat. If you've reached the right-most position, move to the next line.
The naive approach:
We can simply check every possible cell of every square at each step, but this is fairly inefficient.
The optimized approach:
When increasing the square size, we can just do a binary search over the next row and column to see if that row / column contains an obstacle at any of those positions.
When moving to the right, we can do a binary search for each next column to determine if that column contains an obstacle at any of those positions.
When moving down, we can do a similar binary search on each of the columns in the target position.
Implementation note:
To start off, we'd need to go through all the rows and columns and set up arrays containing the positions of the obstacles for each of them, which we can use for the binary searches.
Running time:
We do 2 binary searches to increase the square size, and the square size is maximum the size of the grid, so that is fairly small (O(min(m,n) log max(m,n))) and gets dominated by the below.
Beyond that, for each position, we do a single binary search on a column.
So, for a grid with m columns and n rows, the overall complexity is O(mn log m).
But note how little we're actually searching below when the grid is sparse.
Example:
For your example:
 012345678901234567890123456
0...........................
1....o......................
2............o..............
3...........................
4....o......................
5...............o...........
6...........................
7......o..............o.....
8..o.......o................
We'd first try a 1x1 square in the top-left corner, which works.
Then a 2x2 square. For this, we do a binary search for the range [0,1] on the row 1, which can be represented simply by {4} - an array of a single position corresponding to where the obstacle is. And we also do a binary search for the range [0,1] on the column 1, which contains no obstacles, thus an empty array - {}.
Then a 3x3 square. For this, we do a binary search for [0,2] on the row 2, which contains 1 obstacle at position 12, thus {12}. And we also do a binary search for [0,2] on the column 2, which contains an obstacle at position 8, thus {8}.
Then a 4x4 square. For this, we do a binary search for [0,3] on the row 3 - {}. And for [0,3] on column 3 - {}.
Then a 5x5 square. For this, we do a binary search for [0,4] on the row 4 - {4}. And for [0,4] column 4 - {1,4}.
Here is the first one we actually find. In the range [0,4], we find 4 in both the row and the column (we only really need to find one of them). So this indicates a fail.
From here we do a binary search on column 4 (again - not really necessary) for [0,4]. Then binary search columns 5-8 for [0,4], none of them found, so a square starting at position 5,0 is the next possible candidate.
So from here we try to increase the square size to 5x5, which works, then 6x6 and 7x7, which also work.
Then we try 8x8, which doesn't work.
And so on.
I know binary search, but how does yours work?
So we're basically doing a range search within a set of values. This is fairly easy to do. First search for the starting value of the range, then the end value. If we get to the same point, there are no values in the range.
We don't really care what values exist in the range, just whether or not there are any.
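A small sketch of that range search using Python's bisect module, assuming each row and column keeps a sorted list of obstacle positions as described in the implementation note. The helper names index_obstacles and has_obstacle_in_range are made up for this example.

```python
from bisect import bisect_left, bisect_right


def index_obstacles(grid):
    """Sorted obstacle positions per row and per column ('o' marks an obstacle)."""
    rows = [[c for c, cell in enumerate(row) if cell == 'o'] for row in grid]
    cols = [[r for r, row in enumerate(grid) if row[c] == 'o']
            for c in range(len(grid[0]))]
    return rows, cols


def has_obstacle_in_range(sorted_positions, lo, hi):
    """True if any stored position p satisfies lo <= p <= hi."""
    first = bisect_left(sorted_positions, lo)   # search for the start of the range
    last = bisect_right(sorted_positions, hi)   # search for the end of the range
    return first != last                        # same point means nothing in between
```

For the walkthrough above, has_obstacle_in_range([4], 0, 1) is False (the 2x2 step on row 1) while has_obstacle_in_range([1, 4], 0, 4) is True (the 5x5 step on column 4).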

So here's one rough approach.
Store the x-y positions of all the obstacles.
For each obstacle O
    find obstacle C that is nearest to it column-wise.
    find obstacle R-top that is nearest to it row-wise from the top.
    find obstacle R-bottom that is nearest to it row-wise from the bottom.
    if (|R-top.y - R-bottom.y| != |O.x - C.x|) continue
    Size of the square = Abs((R-top.y - R-bottom.y) * (O.x - C.x))
    Keep track of the sizes and positions to find the largest square
Complexity is roughly O(k^2) where k is the number of obstacles. You could reduce it to O(k * log k) if you use binary search.

The following SO articles are identical/similar to the problem you're trying to solve. You may want to look over those answers as well as the responses to your question.
Dynamic programming - Largest square block
dynamic programming: finding largest non-overlapping squares
Dynamic programming: Find largest diamond (rhombus)
Here's the baseline case I'd use, written in simplified Python/pseudocode.
# obstacleMap is a list of lists of MapElements, stored in row-major order

class MapElement(object):
    def __init__(self, row, col, value):
        self.row = row
        self.col = col
        self.value = value

def is_obstacle(elem):
    return elem.value == 'o'

def has_obstacles(obstacleMap, upper_left_elem, size):
    # Determines if there are obstacles on the outside square layer.
    # For example, if U is the upper left element and size=3, then
    # has_obstacles checks the elements marked p.
    # .....
    # ..U.p
    # ....p
    # ..ppp
    last_row = upper_left_elem.row + size - 1
    last_col = upper_left_elem.col + size - 1
    # A square that sticks out of the map is treated as blocked.
    if last_row >= len(obstacleMap) or last_col >= len(obstacleMap[0]):
        return True
    periphery_row = obstacleMap[last_row][upper_left_elem.col:last_col + 1]
    periphery_col = [row[last_col] for row in obstacleMap[upper_left_elem.row:last_row + 1]]
    return any(is_obstacle(elem) for elem in periphery_row + periphery_col)

def find_largest_rect(obstacleMap, upper_left_elem):
    # Grow the square one layer at a time until the new layer hits an obstacle.
    size = 0
    while not has_obstacles(obstacleMap, upper_left_elem, size + 1):
        size += 1
    return size

max(find_largest_rect(obstacleMap, element) for row in obstacleMap for element in row)

Here is an approach using a recurrence relation:
isSquare(R,C1,C2) = noObstacle(R,C1,R,C2) && noObstacle(R,C2,R-(C2-C1),C2) && isSquare(R-1,C1,C2-1)
isSquare(R,C1,C2) = whether there is an obstacle-free square whose bottom side runs from (R,C1) to (R,C2)
noObstacle(R1,C1,R2,C2) = checks whether there is no obstacle on the line segment from (R1,C1) to (R2,C2)
Find the maximum (C2-C1+1) for which isSquare(R,C1,C2) = true
You can use dynamic programming to solve this problem in polynomial time. Use a suitable data structure for the obstacle search.
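For comparison, here is a sketch of a well-known dynamic-programming formulation for this problem. Note that it is a different recurrence from the isSquare one above, swapped in because it is short: for every free cell it stores the side of the largest obstacle-free square whose bottom-right corner is that cell. The function name largest_square_dp and the 'o' obstacle marker are assumptions for this example.

```python
def largest_square_dp(grid):
    """Side length of the largest obstacle-free square; 'o' marks an obstacle."""
    n, m = len(grid), len(grid[0])
    prev = [0] * (m + 1)  # DP values for the previous row, shifted by one column
    best = 0
    for r in range(n):
        cur = [0] * (m + 1)
        for c in range(m):
            if grid[r][c] != 'o':
                # The square ending here extends the squares ending at the
                # upper-left, upper, and left neighbours.
                cur[c + 1] = 1 + min(prev[c], prev[c + 1], cur[c])
                best = max(best, cur[c + 1])
        prev = cur
    return best
```

This runs in O(nm) time and O(m) extra memory, since only the previous row of values is kept.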

Related

Find minimum distance between points

I have a set of points (x,y).
I need to return the two points with minimal distance.
I use this:
http://www.cs.ucsb.edu/~suri/cs235/ClosestPair.pdf
but I don't really understand how the algorithm works.
Can someone explain in simpler terms how the algorithm works,
or suggest another idea?
Thanks!
If the number of points is small, you can use the brute force approach i.e:
for each point find the closest point among other points and save the minimum distance with the current two indices till now.
If the number of points is large, I think you may find the answer in this thread:
Shortest distance between points algorithm
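A minimal sketch of the brute-force approach, assuming the points are (x, y) tuples and there are at least two of them. The function name closest_pair_brute_force is made up for this example. It compares squared distances, which gives the same ordering as real distances without needing a square root.

```python
from itertools import combinations


def closest_pair_brute_force(points):
    """Return the pair of points with the smallest distance, checking all pairs."""
    def sqdist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # O(n^2) pairs; fine for small inputs.
    return min(combinations(points, 2), key=lambda pair: sqdist(*pair))
```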
The solution to the closest-pair problem with the best time complexity, O(n log n), is the divide-and-conquer method mentioned in the document you have read.
Divide-and-conquer Approach for Closest-Pair Problem
The easiest way to understand this algorithm is to read an implementation of it in a high-level language like Python (because understanding algorithm descriptions or pseudo-code can sometimes be harder than understanding real code):
# closest pairs by divide and conquer
# David Eppstein, UC Irvine, 7 Mar 2002
from __future__ import generators

def closestpair(L):
    def square(x): return x*x
    def sqdist(p,q): return square(p[0]-q[0])+square(p[1]-q[1])

    # Work around ridiculous Python inability to change variables in outer scopes
    # by storing a list "best", where best[0] = smallest sqdist found so far and
    # best[1] = pair of points giving that value of sqdist. Then best itself is never
    # changed, but its elements best[0] and best[1] can be.
    #
    # We use the pair L[0],L[1] as our initial guess at a small distance.
    best = [sqdist(L[0],L[1]), (L[0],L[1])]

    # check whether pair (p,q) forms a closer pair than one seen already
    def testpair(p,q):
        d = sqdist(p,q)
        if d < best[0]:
            best[0] = d
            best[1] = p,q

    # merge two sorted lists by y-coordinate
    def merge(A,B):
        i = 0
        j = 0
        while i < len(A) or j < len(B):
            if j >= len(B) or (i < len(A) and A[i][1] <= B[j][1]):
                yield A[i]
                i += 1
            else:
                yield B[j]
                j += 1

    # Find closest pair recursively; returns all points sorted by y coordinate
    def recur(L):
        if len(L) < 2:
            return L
        split = len(L)//2     # integer division, so split is usable as a slice index
        splitx = L[split][0]  # x-coordinate of the split line
        L = list(merge(recur(L[:split]), recur(L[split:])))
        # Find possible closest pair across split line
        # Note: this is not quite the same as the algorithm described in class, because
        # we use the global minimum distance found so far (best[0]), instead of
        # the best distance found within the recursive calls made by this call to recur().
        #
        # Assumed definition of E (it is used below): the points whose squared
        # x-distance to the split line is less than best[0].
        E = [p for p in L if square(p[0]-splitx) < best[0]]
        for i in range(len(E)):
            for j in range(1,8):
                if i+j < len(E):
                    testpair(E[i],E[i+j])
        return L

    L.sort()
    recur(L)
    return best[1]

closestpair([(0,0),(7,6),(2,20),(12,5),(16,16),(5,8),\
    (19,7),(14,22),(8,19),(7,29),(10,11),(1,13)])
# returns: (7,6),(5,8)
Taken from: https://www.ics.uci.edu/~eppstein/161/python/closestpair.py
Detailed explanation:
First we define a squared Euclidean distance function to prevent code repetition.
def square(x): return x*x # Define square function
def sqdist(p,q): return square(p[0]-q[0])+square(p[1]-q[1]) # Define squared Euclidean distance function
Then we are taking the first two points as our initial best guess:
best = [sqdist(L[0],L[1]), (L[0],L[1])]
This is a function definition for comparing Euclidean distances of next pair with our current best pair:
def testpair(p,q):
    d = sqdist(p,q)
    if d < best[0]:
        best[0] = d
        best[1] = p,q
def merge(A,B): is just a helper function that merges two lists already sorted by y-coordinate, i.e. the two halves that were split earlier.
def recur(L): is the actual body of the algorithm, so I will explain it in more detail:
if len(L) < 2:
    return L
With this part, the algorithm terminates the recursion when at most one point is left in the list of points.
Split the list in half: split = len(L)//2
Recurse (the function calls itself) on each half and merge the results: L = list(merge(recur(L[:split]), recur(L[split:])))
Lastly, the nested loops test each candidate point in E against up to 7 points that follow it in y order:
for i in range(len(E)):
    for j in range(1,8):
        if i+j < len(E):
            testpair(E[i],E[i+j])
As a result, if a closer pair is found, the best pair is updated.
So they solve the problem in many dimensions using a divide-and-conquer approach. Binary search or divide-and-conquer is mega fast. Basically, if you can split a dataset into two halves, and keep doing that until you find the info you want, you are doing it about as fast as humanly and computerly possible most of the time.
For this question, it means that we divide the data set of points into two sets, S1 and S2.
All the points are numerical, right? So we have to pick some number where to divide the dataset.
So we pick some number m and say it is the median.
So let's take a look at an example:
(14, 2)
(11, 2)
(5, 2)
(15, 2)
(0, 2)
What's the closest pair?
Well, they all have the same Y coordinate, so we can look at Xs only... X shortest distance is 14 to 15, a distance of 1.
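As a quick sanity check of that answer (separate from the divide-and-conquer walkthrough that follows), points on a single line can be handled by sorting the X values and comparing neighbours, because the closest pair must be adjacent in sorted order. The function name closest_on_line is made up for this sketch, and it assumes at least two values.

```python
def closest_on_line(xs):
    """Return the two closest values among xs (1-D case)."""
    xs = sorted(xs)
    # After sorting, only adjacent values can form the closest pair.
    return min(zip(xs, xs[1:]), key=lambda pair: pair[1] - pair[0])
```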
How can we figure that out using divide-and-conquer?
We look at the greatest value of X and the smallest value of X and we choose the median as a dividing line to make our two sets.
Our median is 7.5 in this example.
We then make 2 sets
S1: (0, 2) and (5, 2)
S2: (11, 2) and (14, 2) and (15, 2)
Median: 7.5
We must keep track of the median for every split, because that is actually a vital piece of knowledge in this algorithm. They don't show it very clearly on the slides, but knowing the median value (where you split a set to make two sets) is essential to solving this question quickly.
We keep track of a value they call delta in the algorithm. Ugh I don't know why most computer scientists absolutely suck at naming variables, you need to have descriptive names when you code so you don't forget what the f000 you coded 10 years ago, so instead of delta let's call this value our-shortest-twig-from-the-median-so-far
Since we have the median value of 7.5 let's go and see what our-shortest-twig-from-the-median-so-far is for Set1 and Set2, respectively:
Set1 : shortest-twig-from-the-median-so-far 2.5 (5 to m where m is 7.5)
Set 2: shortest-twig-from-the-median-so-far 3.5 (looking at 11 to m)
So I think the key take-away from the algorithm is that this shortest-twig-from-the-median-so-far is something that you're trying to improve upon every time you divide a set.
Since S1 in our case has 2 elements only, we are done with the left set, and we have 3 in the right set, so we continue dividing:
S2 = { (11,2) (14,2) (15,2) }
What do you do? You make a new median, call it S2-median
S2-median is halfway between 15 and 11... or 13, right? My math may be fuzzy, but I think that's right so far.
So let's look at the shortest-twig-so-far-for-our-right-side-with-median-thirteen ...
15 to 13 is... 2
11 to 13 is .... 2
14 to 13 is ... 1 (!!!)
So our m value or shortest-twig-from-the-median-so-far is improved (where we updated our median from before because we're in a new chunk or Set...)
Now that we've found it we know that (14, 2) is one of the points that satisfies the shortest pair equation. You can then check exhaustively against the points in this subset (15, 11, 14) to see which one is the closer one.
Clearly, (15,2) and (14,2) are the winning pair in this case.
Does that make sense? You must keep track of the median when you cut the set, and keep a new median every time you cut the set until you have only 2 elements remaining on each side (or in our case 3).
The magic is in the median or shortest-twig-from-the-median-so-far
Thanks for asking this question, I went in not knowing how this algorithm worked but found the right highlighted bullet point on the slide and rolled with it. Do you get it now? I don't know how to explain the median magic other than binary search is f000ing awesome.

Pairwise matching of tiles

Recently in a coding competition I came across this question.
We have a 1000 tiles where each tile is a 3x3 matrix. Each cell in the
matrix has an integer value from 0 to 9 which signifies the elevation
of the cell. The problem was to find the maximum pairs of tiles such
that they fit in perfectly. The tiles may be rotated to fit in. By fit
in it means that for tile A and tile B
A[i]+B[i]=const for i=0 to 8
The approach I thought for this problem was that I could maintain a hash value corresponding to each tile. Then I would find the possible combinations of tiles that would be
a possible fit and look it up in the hashtable.
Ex. For the tile below
5 3 2                4 6 7                        5 7 8
4 8 9  matches with  5 1 0  for const = 9 & with  6 2 1  for const = 10
1 4 5                8 5 4                        9 6 5
For this tile the 'const' would range from 9 (adding 0 to the maximum element) to 10 (adding 9 to the minimum element).
So I would get two possible combinations of tiles, which I would look up in the table.
But this method is greedy and does not give the desired answer, and I was also unable to think of a proper hash function that would take all possible rotations into account.
So what would be a good approach for solving this problem?
I am sure there is a brute force way to solve this problem but I was actually wondering whether a viable solution to the problem exists on the lines of "pairwise equal to k" problem.
For n=1000 I would stick with the O(n^2) brute force solution. However an O(n log n) algorithm is described below.
The lexicographicalish ordering is defined by the following less-than operator:
Given two matrices M1, M2, define M1' as M1 if M1[1] is positive and -M1 if M1[1] is negative, and likewise for M2'. We say that M1<M2 if M1'[1]<M2'[1], or if M1'[1] == M2'[1] and M1'[2] < M2'[2], or if M1'[1] == M2'[1] and M1'[2] == M2'[2] and M1'[3] < M2'[3] etc.
Subtract the middle element of each matrix from the rest of the elements of the matrix i.e. A'[5] = A[5] and A'[i] = A[i] - A[5]. Then A' fits with B' if A'[i] +B'[i] = 0 for i!=5, and the elevation is A'[5] + B'[5].
Create an array of matrices and a dictionary. Rotate each matrix so that the top left corner has minimal absolute value before adding it to the array. If there are multiple corners with the same absolute value then duplicate the matrix and store both rotations in the array.
If some rotation of a matrix fits with itself and i,j are indices of rotations of this matrix, add the key-value pairs (i,j) and (j, i) to the dictionary.
Create an array S of indices 1,2... and sort S using the lexicographicalish ordering.
Instead of needing O(n^2) operations to check all possible pairs of matrices, it is only necessary to check the pairs of matrices whose indices are S_i and S_(i+1). If a pair of matrices fits, use the dictionary to check that the two matrices are not rotations of the same original matrix before calculating the elevation of the pair.
Not sure if this is the most efficient way for doing this, but it sure works.
What I would do is:
Go over all tiles and check the maximum and minimum value of each tile and save it in a different array.
Check all possible pairs.
If min(A) + max(B) == min(B) + max(A) then check if some rotation of B fits perfectly on A. If it does, add 1 to your count.
Else, it does not fit so you can skip the checking for this pair.
Note: The reason for saving both maximum and minimum for each tile is that it might save us unnecessary calculations and checking rotations as in O(1) we can check if it doesn't fit.
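A rough sketch of this pairwise check, assuming each tile is stored as a flat tuple of 9 values in row-major order. The names rotate, rotations, fits and count_fitting_pairs are made up for the example. It counts every pair that fits, as described above; it does not try to pick a set of disjoint pairs, which the contest problem may additionally require.

```python
from itertools import combinations


def rotate(tile):
    """Rotate a 3x3 tile (flat, row-major) by 90 degrees clockwise."""
    return tuple(tile[3 * (2 - c) + r] for r in range(3) for c in range(3))


def rotations(tile):
    """All four rotations of a tile, starting with the tile itself."""
    result = [tuple(tile)]
    for _ in range(3):
        result.append(rotate(result[-1]))
    return result


def fits(a, b):
    """True if some rotation of b satisfies a[i] + b[i] == const for every cell."""
    # O(1) filter: the highest cell of one tile must meet the lowest of the other.
    if min(a) + max(b) != min(b) + max(a):
        return False
    for rb in rotations(b):
        total = a[0] + rb[0]
        if all(x + y == total for x, y in zip(a, rb)):
            return True
    return False


def count_fitting_pairs(tiles):
    """Number of unordered pairs of tiles that fit, by checking all O(n^2) pairs."""
    return sum(1 for a, b in combinations(tiles, 2) if fits(a, b))
```

With the three tiles from the question, the first tile fits both of the others, and rotating one of them beforehand does not change the result.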

Construct a full rank matrix by adding vectors from the standard basis

I have a nxn singular matrix. I want to add k rows (which must be from the standard basis e1, e2, ..., en) to this matrix such that the new (n+k)xn matrix is full column rank. The number of added rows k must be minimum and they can be added in any order (not just e1, e2 ,..., it can be e4, e10, e1, ...) as long as k is minimum.
Does anybody know a simple way to do this? Any help is appreciated.
You can achieve this by doing a QR decomposition with column pivoting, then taking the transpose of the last n-rank(A) columns of the permutation matrix.
In MATLAB, this is achieved by the qr function (see the MATLAB documentation):
r=rank(A);
[Q,R,E]=qr(A);
newA=[A;transpose(E(:,r+1:end))];
Each row of transpose(E(:,r+1:end)) will be a member of the standard basis, the rank of newA will be n, and n-rank(A) is also the minimal number of standard basis vectors you will need to do so.
Here is how this works:
QR decomposition with column pivoting is a standard procedure to decompose a matrix A into products:
A*E==Q*R
where Q is an orthogonal matrix if A is real, or an unitary matrix if A is complex; R is upper triangular matrix, and E is a permutation matrix.
In short, the permutations are chosen so that the diagonal elements are larger than the off-diagonals in the same row, and that size of the diagonal elements are non-increasing. More detailed description can be found on the netlib QR factorization page.
Since Q and E are both orthogonal (or unitary) matrices, the rank of R is the same as the rank of A. To bring up the rank of A, we just need to find ways to increase the rank of R; and this is much more straight forward thanks to the structure of R as the result of pivoting and the fact that it is upper-triangular.
Now, with the requirement placed on the pivoting procedure, if any diagonal element of R is 0, the entire row has to be 0. The n-rank(A) rows of 0s at the bottom of R are responsible for the nullity. If we replaced the lower right corner with an identity matrix, the new matrix would be full rank. Well, we cannot really do the replacement, but we can append those rows to the bottom of R and form a new matrix that has the same rank as the replaced one:
B==[ 0 I ] => newR=[ R ; B ]
Here the dimensionality of I is the nullity of A and that of R.
It is readily seen that rank(newR)=n. Then we can also define a new unitary Q matrix by expanding its dimensionality in a trivial manner:
newQ=[Q 0 ; 0 I]
With that, our new rank n matrix can be obtained as
newA=newQ*newR*transpose(E)=[Q*R ; B ]*transpose(E) =[A ; B*transpose(E)]
Note that B is [0 I] and E is a permutation matrix, so B*transpose(E) is simply the transpose of the last n-rank(A) columns of E, and thus a set of rows made of standard basis vectors, and that's just what you wanted!
Is n very large? The simplest solution without using any math would be to try adding e_i and seeing if the rank increases. If it does, keep e_i. proceed until finished.
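A plain-Python sketch of this greedy idea, assuming the matrix has integer or rational entries so that exact arithmetic with fractions.Fraction can be used for the rank computation. The function names rank and rows_to_add are made up for the example, and indices are 0-based, so a returned i corresponds to e_(i+1).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rk = 0
    for col in range(n_cols):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            if factor:
                m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def rows_to_add(a):
    """Indices i of standard basis rows e_i that bring a up to full column rank."""
    n = len(a[0])
    current = [list(row) for row in a]
    current_rank = rank(current)
    added = []
    for i in range(n):
        if current_rank == n:
            break
        e_i = [1 if j == i else 0 for j in range(n)]
        if rank(current + [e_i]) > current_rank:  # keep e_i only if it helps
            current.append(e_i)
            current_rank += 1
            added.append(i)
    return added
```

Every kept row raises the rank by exactly one, so the number of rows added is n - rank(A), which matches the minimum stated in the QR-based answer.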
I like @Xiaolei Zhu's solution because it's elegant, but another way to go (that's even more computationally efficient) is:
Determine if any rows, indexed by i, of your matrix A are all zero. If so, then the corresponding e_i must be concatenated.
After that process, you can simply concatenate any subset of the n - rank(A) columns of the identity matrix that you didn't add in step 1.
Rows/columns from the identity matrix can be added in any order; in general they do not need to be added in the usual order e1, e2, ... to make the matrix full rank.

Algorithm/Data Structure for finding combinations of minimum values easily

I have a symmetric matrix like shown in the image attached below.
I've made up the notation A.B which represents the value at grid point (A, B). Furthermore, writing A.B.C gives me the minimum grid point value like so: MIN((A,B), (A,C), (B,C)).
As another example A.B.D gives me MIN((A,B), (A,D), (B,D)).
My goal is to find the minimum values for ALL combinations of letters (not repeating) for one row at a time e.g for this example I need to find min values with respect to row A which are given by the calculations:
A.B = 6
A.C = 8
A.D = 4
A.B.C = MIN(6,8,6) = 6
A.B.D = MIN(6, 4, 4) = 4
A.C.D = MIN(8, 4, 2) = 2
A.B.C.D = MIN(6, 8, 4, 6, 4, 2) = 2
I realize that certain calculations can be reused which becomes increasingly important as the matrix size increases, but the problem is finding the most efficient way to implement this reuse.
Can someone point me in the right direction to an efficient algorithm/data structure I can use for this problem?
You'll want to think about the lattice of subsets of the letters, ordered by inclusion. Essentially, you have a value f(S) given for every subset S of size 2 (that is, every off-diagonal element of the matrix - the diagonal elements don't seem to occur in your problem), and the problem is to find, for each subset T of size greater than two, the minimum f(S) over all S of size 2 contained in T. (And then you're interested only in sets T that contain a certain element "A" - but we'll disregard that for the moment.)
First of all, note that if you have n letters, that this amounts to asking Omega(2^n) questions, roughly one for each subset. (Excluding the zero- and one-element subsets and those that don't include "A" saves you n + 1 sets and a factor of two, respectively, which is allowed for big Omega.) So if you want to store all these answers for even moderately large n, you'll need a lot of memory. If n is large in your applications, it might be best to store some collection of pre-computed data and do some computation whenever you need a particular data point; I haven't thought about what would work best, but for example computing data only for a binary tree contained in the lattice would not necessarily help you anything beyond precomputing nothing at all.
With these things out of the way, let's assume you actually want all the answers computed and stored in memory. You'll want to compute these "layer by layer", that is, starting with the three-element subsets (since the two-element subsets are already given by your matrix), then four-element, then five-element, etc. This way, for a given subset S, when we're computing f(S) we will already have computed all f(T) for T strictly contained in S. There are several ways that you can make use of this, but I think the easiest might be to use two such subset S: let t1 and t2 be two different elements of T that you may select however you like; let S be the subset of T that you get when you remove t1 and t2. Write S1 for S plus t1 and write S2 for S plus t2. Now every pair of letters contained in T is either fully contained in S1, or it is fully contained in S2, or it is {t1, t2}. Look up f(S1) and f(S2) in your previously computed values, then look up f({t1, t2}) directly in the matrix, and store f(T) = the minimum of these 3 numbers.
If you never select "A" for t1 or t2, then indeed you can compute everything you're interested in while not computing f for any sets T that don't contain "A". (This is possible because the steps outlined above are only interesting whenever T contains at least three elements.) Good! This leaves just one question - how to store the computed values f(T). What I would do is use a 2^(n-1)-sized array; represent each subset-of-your-alphabet-that-includes-"A" by the (n-1) bit number where the ith bit is 1 whenever the (i+1)th letter is in that set (so 0010110, which has bits 2, 4, and 5 set, represents the subset {"A", "C", "D", "F"} out of the alphabet "A" .. "H" - note I'm counting bits starting at 0 from the right, and letters starting at "A" = 0). This way, you can actually iterate through the sets in numerical order and don't need to think about how to iterate through all k-element subsets of an n-element set. (You do need to include a special case for when the set under consideration has 0 or 1 element, in which case you'll want to do nothing, or 2 elements, in which case you just copy the value from the matrix.)
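A simplified sketch of the layer-by-layer idea, assuming the matrix is given as a list of lists and letter i corresponds to bit i of a mask. Unlike the encoding described above, this version spends one bit on every letter (including "A") and fills in all subsets, which keeps the code short at the cost of twice the memory. The function name subset_minima is made up for the example.

```python
def subset_minima(matrix):
    """f[mask] = minimum matrix entry over all pairs of letters in the subset mask."""
    n = len(matrix)
    f = [None] * (1 << n)  # None for subsets with fewer than two letters
    for mask in range(1 << n):
        members = [i for i in range(n) if (mask >> i) & 1]
        if len(members) < 2:
            continue
        t1, t2 = members[0], members[1]
        if len(members) == 2:
            f[mask] = matrix[t1][t2]
        else:
            # Every pair in T lies in T minus t2, in T minus t1, or is {t1, t2}.
            # Both smaller masks are numerically smaller, so already computed.
            f[mask] = min(f[mask ^ (1 << t2)], f[mask ^ (1 << t1)], matrix[t1][t2])
    return f
```

With the values from the question (A.B = 6, A.C = 8, A.D = 4, B.C = 6, B.D = 4, C.D = 2), the entry for mask 0b0111, i.e. A.B.C, comes out as 6, and the entry for 0b1111, i.e. A.B.C.D, as 2.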
Well, it looks simple to me, but perhaps I misunderstand the problem. I would do it like this:
let P be a pattern string in your notation X1.X2. ... .Xn, where Xi is a column in your matrix
first compute the array CS = [ (X1, X2), (X1, X3), ... (X1, Xn) ], which contains all combinations of X1 with every other element in the pattern; CS has n-1 elements, and you can easily build it in O(n)
now you must compute min (CS), i.e. finding the minimum value of the matrix elements corresponding to the combinations in CS; again you can easily find the minimum value in O(n)
done.
Note: since your matrix is symmetric, given P you just need to compute CS by combining the first element of P with all other elements: (X1, Xi) is equal to (Xi, X1)
If your matrix is very large, and you want to do some optimization, you may consider prefixes of P: let me explain with an example
when you have solved the problem for P = X1.X2.X3, store the result in an associative map, where X1.X2.X3 is the key
later on, when you solve a problem P' = X1.X2.X3.X7.X9.X10.X11 you search for the longest prefix of P' in your map: you can do this by starting with P' and removing one component (Xi) at a time from the end until you find a match in your map or you end up with an empty string
if you find a prefix of P' in your map then you already know the solution for that problem, so you just have to find the solution for the problem resulting from combining the first element of the prefix with the suffix, and then compare the two results: in our example the prefix is X1.X2.X3, and so you just have to solve the problem for X1.X7.X9.X10.X11, and then compare the two values and choose the min (don't forget to update your map with the new pattern P')
if you don't find any prefix, then you must solve the entire problem for P' (and again don't forget to update the map with the result, so that you can reuse it in the future)
This technique is essentially a form of memoization.

minimum number tiles within given rectangle

I have been practicing some programming contest questions (for fun and practice for upcoming contests) and am stuck on this one: http://dwite.ca/questions/power_tiles.html
I'm not really sure where I should start =/.
How should I approach this question in order to solve it?
Looks like a Dynamic Programming problem to me
Let F(w,h) be the minimum number of squares that tile the w by h rectangle.
Find a recursive formulation for F:
if w = 0 or h = 0 then F(w, h) = 0
otherwise, F(w,h) =
For each allowable size a=i^2 <= min(w,h), try to place the a by a square (A) in the top left corner.
One of the following possibilities will describe the optimal solution:
+---+--+    +---+--+
| A | C|    | A |  |
+---+--+    +---+  |
|   B  |    | B | C|
+------+    +---+--+
So, choose the best of
(1 + F(w, h-a) + F(w-a, a)) or
(1 + F(a, h-a) + F(w-a, h))
Doing big-O analysis on a napkin, this seems to be an O(side^2 * sqrt(side))-ish algorithm. If this is too much, you can:
Try to exploit symmetries in the problem (such as F(w,h) = F(h, w))
Check the analysis again to be sure it is too slow and you need another algorithm (perhaps you don't need to calculate for all (w,h) pairs?)
Find some property of the problem that allows for a simpler, less exhaustive strategy. (For example, picking the largest square whenever possible is a simple greedy strategy... but does it work in all cases?)
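A memoized sketch of the recursive formulation above, under the assumption that the allowable tile sides are powers of two (1, 2, 4, ...), as the other answers here suggest for this problem. The function name min_tiles is made up. It only explores the two splits described above, with A in the top-left corner and the rest cut into rectangles B and C, so treat it as an illustration of the recurrence rather than a proven optimal solver.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_tiles(w, h):
    """Fewest square tiles found by the two-way split recurrence for a w by h rectangle."""
    if w == 0 or h == 0:
        return 0
    best = w * h  # upper bound: fill everything with 1x1 tiles
    a = 1
    while a <= min(w, h):
        # Split 1: C (w-a by a) to the right of A, B (w by h-a) below both.
        below = 1 + min_tiles(w, h - a) + min_tiles(w - a, a)
        # Split 2: B (a by h-a) below A, C (w-a by h) to the right of both.
        right = 1 + min_tiles(a, h - a) + min_tiles(w - a, h)
        best = min(best, below, right)
        a *= 2  # assumed tile sizes: powers of two
    return best
```

For example, a 3 by 3 rectangle needs one 2 by 2 tile plus five 1 by 1 tiles, and min_tiles(3, 3) evaluates to 6.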
I would approach it by recursion.
Write a function that receives two integer values as its inputs. The one value would be the length and the other would be the width. The biggest square you could fit in would be based on the shortest side. Its dimensions would be calculated as follows:
2^RoundDown(Log(ShortSide,Base:2))
This will give you your first square and divide the rectangle up in either 3 or 1 other rectangles, or nothing if it is square with 2^n side lengths.
It is easy to get the dimensions of the remaining rectangles by simple subtraction. After the dimensions are calculated, call the function again(within itself) for each new rectangle with its dimensions.
The function should be terminated when the differences calculated for both sides are zero, i.e. it is square with 2^n side lengths.
A bit like this:
Global int Counter
DivideRectangle(int Width, int Height)
    if Width = 0 OR Height = 0
        return
    int BigSquare = 2^RoundDown(Log(Min(Width, Height),Base:2))
    DivideRectangle(Width - BigSquare, Height - BigSquare)
    DivideRectangle(Width - BigSquare, BigSquare)
    DivideRectangle(BigSquare, Height - BigSquare)
    Counter += 1
That's about the gist of it; the counter returned after the whole operation is the number of squares needed to fill the rectangle. Obviously the code is flawed and needs refinement but it's just an outline of what should happen.
What does '1, 2, 4, 8, etc.' remind you of?
Look at the figure: what order (in terms of size) of filling will you choose?
I would start by figuring out the answer by hand for, say, a half dozen or so... then model how you did the problem in a program... Then after you have a working "brute force" answer try to solve the problem more elegantly.
I would start this problem by trying to put in as many of the biggest size tiles as possible first, then fill in where you can with the next biggest size that will fit, then smaller... until filled.
You might use arrays to spatially keep track of the filled-in space... however I suspect there is an easier way to do this via some simple calculations... like taking the dimensions and taking the smaller of the two and utilizing log base 2 or something like that...
I am sure there is a nice neat recursive solution, based on the powers of two... then you could unravel that into a non-recursive solution...
