Algorithm for minimizing the number of lines to draw?

I am currently working on optimizing a project of mine that is a few years old now. The purpose is to draw an image after hitting a certain combination of keys. The original version that I made a few years ago moved to each square section one at a time, but I've recently optimized it to draw rectangles for runs of consecutive squares, which makes it a lot faster.
The next step I want to take is optimizing the way the program draws a given layout, but I don't know where to start looking. I'm hoping someone can point me in the right direction since I can't even think of a search term for this.
Currently the program has a function called Draw which takes an input like this:
Invader =
(
00100000100
00010001000
00111111100
01101110110
11111111111
10111111101
10100000101
00011011000
)
Draw(Invader, 10) ; Where 10 is the size in pixels of each square
The layout above is for this image:
Draw will take that layout and draw it top to bottom, left to right in the following way:
In total, it takes 18 separate sections to finish the picture. What I'm looking for is some algorithm that can minimize this number. For instance, the following is one of the few ways of having only 16 sections:
Likewise, the difference between the current way and something I just made up on the spot for this image is 19 (65 compared to 46).
Where should I start with this?
Also for reference, here is the current Draw function:
Draw(Layout, BlockSize)
{
    Len := StrLen(Layout) ; Total number of characters
    RowSize := StrLen(StrSplit(Layout, "`n")[1]) ; Size of a single row
    Index := 1 ; SubStr is 1-based
    While (Index <= Len)
    {
        Length := 1
        Char := GetChar(Layout, Index) ; Get next character in string
        if (Char == "1")
        {
            ; Get the number of consecutive 1s
            While (GetChar(Layout, Index + Length) == "1")
            {
                Length := Length + 1
            }
            ; Draw the rectangle
            FillRectangle(Length, BlockSize)
        }
        else if (Char == "0")
        {
            ; Get the number of consecutive 0s
            While (GetChar(Layout, Index + Length) == "0")
            {
                Length := Length + 1
            }
            ; Skip the entire length
            MouseMove, BlockSize * Length, 0, 0, R
        }
        else
        {
            ; End of line, reset position
            MouseMove, -(RowSize * BlockSize), BlockSize, 0, R
        }
        Index := Index + Length
    }
}

FillRectangle(Width, BlockSize)
{
    MouseGetPos, mX, mY
    mY2 := mY ; Same Y for straight line
    mX2 := mX + Width * BlockSize ; Add Width of rectangle times the block size to get final X position
    Loop %BlockSize%
    {
        ; Draw line
        MouseClickDrag, L, mX, mY, mX2, mY2
        ; Move to next line
        mY -= 1
        mY2 -= 1
    }
    ; Move mouse to next position
    MouseMove, 0, BlockSize - 1, 0, R
}

GetChar(String, Index)
{
    return SubStr(String, Index, 1)
}

You should do some sort of analysis first either way. Afterwards I would propose to pass over the image both ways (horizontally and vertically) and keep the longer line, marking each cell of the longer line as passed or "black" so you do not repeat checking.
void analyze() {
    var horSize = 0, verSize = 0;
    // run horizontally & vertically from each white cell
    while (!reached_horizontal_boundary) {
        ++horSize;
    }
    while (!reached_vertical_boundary) {
        ++verSize;
    }
    someContainer.Add((horSize > verSize) ? horSize : verSize);
}
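A rough Python sketch of that greedy pass (the grid is a list of "0"/"1" strings as in the question; the function and variable names are mine, not part of any original code):

def greedy_cover(grid):
    # Cover every '1' with horizontal or vertical segments, preferring the longer run.
    rows, cols = len(grid), len(grid[0])
    covered = [[False] * cols for _ in range(rows)]

    def run_len(r, c, dr, dc):
        # length of the run of uncovered '1' cells starting at (r, c) in direction (dr, dc)
        n = 0
        while 0 <= r < rows and 0 <= c < cols and grid[r][c] == '1' and not covered[r][c]:
            n, r, c = n + 1, r + dr, c + dc
        return n

    segments = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1' and not covered[r][c]:
                h, v = run_len(r, c, 0, 1), run_len(r, c, 1, 0)
                dr, dc, n = (0, 1, h) if h >= v else (1, 0, v)
                for i in range(n):  # mark the chosen run as drawn
                    covered[r + i * dr][c + i * dc] = True
                segments.append(((r, c), (r + (n - 1) * dr, c + (n - 1) * dc)))
    return segments

invader = """00100000100
00010001000
00111111100
01101110110
11111111111
10111111101
10100000101
00011011000""".splitlines()
print(len(greedy_cover(invader)))  # number of separate drags this greedy pass needs

Each entry in segments is one drag (start cell, end cell), so len(segments) is the number of separate sections.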

This is expanded from EpiGen's answer, but I felt it needed its own post to explain the differences.
This is the current status of what I have, but it's still not 100% optimal in all cases (as shown below). If there are any improvements feel free to add them.
So, the basic flow of the algorithm is as follows:
Get horizontal length from current point
Get vertical length from current point
Pick bigger length and use that direction
However, it doesn't just give the length it sees. Instead it picks the max length that doesn't intersect a line that has a greater length. Here are the steps (a rough code sketch follows the list):
Check if next pixel is a 1 (Going right for horizontal, down for vertical)
If it is, then check the length in the opposite direction starting from that index.
If that length is longer than the current length, then save the current length and the opposite length value.
Once a character that isn't a 1 is seen, if the max length in the direction being checked is lower than the max length in the opposite direction, then return the length in our direction before that point.
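One possible reading of that rule in code, written as a variant of the run_len helper from the sketch above (the exact tie-breaking is my guess at the described behaviour, not the author's actual implementation):

def clipped_run_len(grid, covered, r, c, dr, dc):
    # Run length from (r, c) in direction (dr, dc), cut short so it never splits a
    # strictly longer perpendicular run that crosses it.
    rows, cols = len(grid), len(grid[0])

    def raw(r, c, dr, dc):
        n = 0
        while 0 <= r < rows and 0 <= c < cols and grid[r][c] == '1' and not covered[r][c]:
            n, r, c = n + 1, r + dr, c + dc
        return n

    full = raw(r, c, dr, dc)
    for i in range(1, full):
        rr, cc = r + i * dr, c + i * dc
        # perpendicular run through the cell we are about to cross
        cross = raw(rr, cc, dc, dr) + raw(rr, cc, -dc, -dr) - 1
        if cross > full:
            return i  # fall back to the length we had before crossing the longer line
    return full

greedy_cover would then call clipped_run_len(grid, covered, r, c, 0, 1) and clipped_run_len(grid, covered, r, c, 1, 0) in place of run_len.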
Here is an example of this logic in action. The grey lines represent lines that have already been drawn, the green line represents the line being checked, and the red line indicates a boundary.
Since the red line's horizontal length is greater than the current vertical length at this point, the values are saved in their current form (vertical 1, horizontal 7). After the vertical line check completes and finds a length of 2, it then sees that it crossed a line of length 7. Since it's less efficient to split this line for a smaller one, it instead changes its length back to 1 which is what it had before it crossed that line. That makes the final output look like this with a total of 16 segments, which is optimal as far as I know.
However, it fails under certain conditions; specifically the bottom left corner of this image.
The green line has a length of 10, and the row it stops at has a length of 9. Since that row's length isn't greater than or equal to the green line's, it splits the line, which leaves a single block off to the side. If this problem were fixed, then this image would be optimal as far as I'm aware (the lowest I've gotten is 44; the current logic gets 45).
Regardless, this seems to be working as well as I need it to. If any other answers with better solutions show up in the next day or so, I'll take a look at them.
As an extra, here's a gif of it running for one of the larger ones:

Related

Algorithm for calculating the area of a region in a grid of squares

I'm working on a game which uses a tilemap. Squares on the map can either be walls or they can be empty. The algorithm I'm trying to develop should take a point on the map and return the number of cells that can be reached from that point (which is equal to the area of the sector containing the point).
Let the function which carries out the algorithm take an x-coordinate, a y-coordinate and a map in the form of a 2D array.
function sectorArea(x_coord,y_coord,map) { ... }
Say the map looks like this (where 1's represent walls):
map = [[0,0,1,0,0,0],
       [0,0,1,0,0,0],
       [1,1,1,0,0,0],
       [0,0,0,0,0,0]]
Then sectorArea(0,0,map) == 4 and sectorArea(4,0,map) == 15.
My naive implementation is recursive. The target cell is passed to the go function, which then recurses on any adjacent cells which are empty - eventually spreading across all empty cells in the sector. It runs too slowly and reaches the call stack limit very quickly:
function sectorArea(x_coord, y_coord, map) {
    // First convert the map into an array of objects of the form:
    // { value: 0 or 1,
    //   visited: false }
    var objMap = convertMap(map);

    // The recursive function:
    function go(x, y) {
        if (outOfBounds(x) || outOfBounds(y) ||
            objMap[y][x].value == 1 || objMap[y][x].visited)
            return 0;
        else
            objMap[y][x].visited = true;
        return 1 + go(x+1, y) + go(x-1, y) + go(x, y+1) + go(x, y-1);
    }

    return go(x_coord, y_coord);
}
Could anyone suggest a better algorithm? A non-deterministic solution would actually be fine if it is sufficiently accurate, as speed is the main issue (the algorithm could be called 3 or 4 times on different points during a single tick).
Maybe you can speed up the algorithm itself. Wikipedia suggests that a scanline approach is efficient.
As for the repeated calls: You can cache the results so that you don't have to run the area calculation again every time.
An approach might be to keep a region map of integers alongside your tiles. This denotes several regions, where a special value, -1 for example, means no region. (This region map also serves as your visited attribute.) In addition to that, keep a (short) array of regions with their areas.
In your example above:
When you calculate the area of (0, 0), you will assign 0 to the four tiles in the northwest corner. You will also append the area, 4, to the area array.
When you calculate the area of (0, 1), you notice that the region map for that coordinate has a value of zero, not -1. That means that the area was already calculated.
When you calculate the area of (4, 4), you find -1 in the region map. That means that the region hasn't been calculated yet. Do that, mark the region with 1 and append the new area, 15, to the area array.
I don't know how often the board changes. When you must recalculate the regions, blank out the region map and empty the array list.
The region map is only created once, it isn't recreated for every tick. (I can see this as a potential bottleneck in your code, when the objMap is frequently recreated instead of just being overwritten.)
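A sketch of that caching scheme in Python, with an iterative fill so the call-stack limit is no longer an issue (helper names are mine; walls are 1 and empty cells 0, as in the question):

def make_region_map(grid):
    # region_map[y][x] holds the region index for cell (x, y), or -1 for "no region yet"
    return [[-1] * len(grid[0]) for _ in range(len(grid))]

def sector_area(x, y, grid, region_map, areas):
    rows, cols = len(grid), len(grid[0])
    if grid[y][x] == 1:
        return 0
    if region_map[y][x] != -1:      # area already computed by an earlier call
        return areas[region_map[y][x]]
    region, area = len(areas), 0
    stack = [(x, y)]                # iterative flood fill: no recursion depth limit
    while stack:
        cx, cy = stack.pop()
        if not (0 <= cx < cols and 0 <= cy < rows):
            continue
        if grid[cy][cx] == 1 or region_map[cy][cx] != -1:
            continue
        region_map[cy][cx] = region
        area += 1
        stack += [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]
    areas.append(area)
    return area

grid = [[0, 0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0]]
region_map, areas = make_region_map(grid), []
print(sector_area(0, 0, grid, region_map, areas))  # 4
print(sector_area(4, 0, grid, region_map, areas))  # 15

When the board changes, blank out region_map and empty areas again, as described above.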

Positions on grid 'used'

I have a puzzle to solve which involves taking as input the size of a grid. The grid is always square. Then a number of points on the grid are provided, and squares on the grid are 'taken' if they are immediately to the left or right of, or above or below, one of those points.
E.g. imagine a 10 x 10 grid with (1,1) at the bottom left and (10,10) at the top right. If a point (2,1) is given, then the square positions to its left and right (10 squares) and above and below (another 9 squares) are taken. So, using simple arithmetic, if the grid is n squared then n + (n-1) squares will be taken for the first point provided.
But it gets complicated if more points are provided as input. E.g. if the next point is (5,5), then another 19 squares will be 'taken', minus those squares overlapping the first point's squares, so it gets complex. And of course a point such as (3,1) could be provided, which overlaps even more.
Is there an algorithm for this type of problem?
Or is it simply a matter of holding a two-dimensional array and placing an x for each taken square, then at the end totting up the taken (or non-taken) squares? That would work, but I was wondering if there is an easier way.
Keep two sets: X (storing all x-coords) and Y (storing all y-coords). The number of squares taken will be n * (|X| + |Y|) - |X| * |Y|. This follows because each unique x-coord removes a column of n squares, and each unique y-coord removes a row of n squares. But this counts the intersections of the removed rows and columns twice, so we subtract |X| * |Y| to account for this.
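In Python that formula is only a few lines (a sketch; points is whatever collection of (x, y) pairs you have seen so far):

def squares_taken(n, points):
    xs = {x for x, _ in points}
    ys = {y for _, y in points}
    return n * (len(xs) + len(ys)) - len(xs) * len(ys)

print(squares_taken(10, [(2, 1)]))          # 19, i.e. n + (n - 1)
print(squares_taken(10, [(2, 1), (5, 5)]))  # 36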
One way to do it is to keep track of the positions that are taken in some data structure, for example a set.
At the first step this involves adding n + (n - 1) squares to that data structure.
At the second (third, fourth, etc.) step this involves checking, for each square on the horizontal and vertical lines through the given (x, y), whether it's already in the data structure. If not, then you add it to the data structure. Otherwise, if the point is already in there, then it was taken in an earlier step.
We can actually see that the first step is just a special case of the other rounds because in the first round no points are taken yet. So in general the algorithm is to keep track of the taken points and to add any new ones to a data structure.
So in pseudocode:
taken_points = an empty data structure (e.g., a set)

whenever you're processing a point (x, y):
    counter = 0
    for each point (px, py) on the horizontal and vertical lines that intersect (x, y):
        check if that point is already in taken_points
        if it is, then do nothing
        otherwise, add (px, py) to taken_points and increment counter
You've now updated taken_points to contain all the points that are taken so far and counter is the number of points that were taken in the most recent round.
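A small Python version of that pseudocode (0-based coordinates here, whereas the question counts from 1, but the idea is identical):

def take_point(x, y, n, taken_points):
    # mark every square in row y and column x as taken; return how many are new
    counter = 0
    for px, py in [(i, y) for i in range(n)] + [(x, j) for j in range(n)]:
        if (px, py) not in taken_points:
            taken_points.add((px, py))
            counter += 1
    return counter

taken_points = set()
print(take_point(1, 0, 10, taken_points))  # 19 on the first point
print(take_point(4, 4, 10, taken_points))  # 17, because of overlaps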
Here is a way to do it without using much space:

rowVisited[n] = {0}
colVisited[n] = {0}
totalrows = 0 and totalcol = 0 for total rows and columns visited
total = 0; // squares taken for point (x, y)

given point (x, y):

if (!rowVisited[x]) {
    total = total + n - totalcol;
}
if (!colVisited[y]) {
    total = total + n - 1 - totalrows + rowVisited[x];
}
if (!rowVisited[x]) {
    rowVisited[x] = 1;
    totalrows++;
}
if (!colVisited[y]) {
    colVisited[y] = 1;
    totalcol++;
}
print total

Folding a selection of points on a 3D cube

I am trying to find an effective algorithm for the following 3D Cube Selection problem:
Imagine a 2D array of Points (let's make it square, of size x size) and call it a side.
For ease of calculation, let's declare max as size - 1.
Create a Cube of six sides, keeping 0,0 at the lower left hand side and max,max at the top right.
Using z to track the side a single cube is located on, y as up and x as right:
public class Point3D {
    public int x, y, z;

    public Point3D() {}

    public Point3D(int X, int Y, int Z) {
        x = X;
        y = Y;
        z = Z;
    }
}

Point3D[,,] CreateCube(int size)
{
    Point3D[,,] Cube = new Point3D[6, size, size];
    for (int z = 0; z < 6; z++)
    {
        for (int y = 0; y < size; y++)
        {
            for (int x = 0; x < size; x++)
            {
                Cube[z, y, x] = new Point3D(x, y, z);
            }
        }
    }
    return Cube;
}
Now to select a random single point, we can just use three random numbers such that:
Point3D point = new Point3D(
    Random(0, size),  // 0 and max
    Random(0, size),  // 0 and max
    Random(0, 6));    // 0 and 5
To select a plus we could detect if a given direction would fit inside the current side.
Otherwise we find the cube located on the side touching the center point.
Using 4 functions with something like:
private T GetUpFrom<T>(T[,,] dataSet, Point3D point) where T : class {
    if (point.y < max)
        return dataSet[point.z, point.y + 1, point.x];
    else {
        switch (point.z) {
            case 0: return dataSet[1, point.x, max];         // x+
            case 1: return dataSet[5, max, max - point.x];   // y+
            case 2: return dataSet[1, 0, point.x];           // z+
            case 3: return dataSet[1, max - point.x, 0];     // x-
            case 4: return dataSet[2, max, point.x];         // y-
            case 5: return dataSet[1, max, max - point.x];   // z-
        }
    }
    return null;
}
Now I would like to find a way to select arbitrary shapes (like predefined random blobs) at a random point.
But would settle for adjusting it to either a Square or jagged Circle.
The actual surface area would be warped and folded onto itself on corners, which is fine and does not need compensating ( imagine putting a sticker on the corner on a cube, if the corner matches the center of the sticker one fourth of the sticker would need to be removed for it to stick and fold on the corner). Again this is the desired effect.
No duplicate selections are allowed, thus cubes that would be selected twice would need to be filtered somehow (or calculated in such a way that duplicates do not occur). Which could be a simple as using a HashSet or a List and using a helper function to check if the entry is unique (which is fine as selections will always be far below 1000 cubes max).
The delegate for this function in the class containing the Sides of the Cube looks like:
delegate T[] SelectShape(Point3D point, int size);
Currently I'm thinking of checking each side of the Cube to see which part of the selection is located on that side.
Calculating which part of the selection is on the same side of the selected Point3D, would be trivial as we don't need to translate the positions, just the boundary.
Next would be 5 translations, followed by checking the other 5 sides to see if part of the selected area is on that side.
I'm getting rusty in solving problems like this, so was wondering if anyone has a better solution for this problem.
@arghbleargh requested a further explanation:
We will use a Cube of 6 sides and use a size of 16. Each side is 16x16 points.
Stored as a three-dimensional array, I used z for the side, then y, x, such that the array would be initialized with new Point3D[z, y, x]. It would work almost identically for jagged arrays ([z][y][x]), which are serializable by default (so that would be nice too), but those would require separate initialization of each subarray.
Let's select a square with the size of 5x5, centered around a selected point.
To find such a 5x5 square, subtract and add 2 to the axis in question: x-2 to x+2 and y-2 to y+2.
Randomly selecting a side, the point we select is z = 0 (the x+ side of the Cube), y = 6, x = 6.
Both 6-2 and 6+2 are well within the limits of 16 x 16 array of the side and easy to select.
Shifting the selection point to x = 0 and y = 6, however, would prove a little more challenging, as x - 2 would require a lookup on the side to the left of the side we selected.
Luckily we selected side 0, or x+, because as long as we are not on the top or bottom side and not going to the top or bottom side of the cube, all axes are x+ = right, y+ = up.
So to get the coordinates on the side to the left would only require a subtraction of max (size - 1) - x. Remember size = 16, max = 15, x = 0-2 = -2, max - x = 13.
The subsection on this side would thus be x = 13 to 15, y = 4 to 8.
Adding this to the part we could select on the original side would give the entire selection.
Shifting the selection to 0,6 would prove more complicated, as now we cannot hide behind the safety of knowing all axes align easily. Some rotation might be required. There are only 4 possible translations, so it is still manageable.
Shifting to 0,0 is where the problems really start to appear, as now both left and down require wrapping around to other sides. Furthermore, even the subdivided parts would have an area that falls outside.
The only salve on this wound is that we do not care about the overlapping parts of the selection.
So we can either skip them when possible or filter them from the results later.
Now that we move from a 'normal axis' side to the bottom one, we would need to rotate and match the correct coordinates so that the points wrap around the edge correctly.
As the axes of each side are folded into a cube, some axes might need to flip or rotate to select the right points.
The question remains if there are better solutions available of selecting all points on a cube which are inside an area. Perhaps I could give each side a translation matrix and test coordinates in world space?
Found a pretty good solution that requires little effort to implement.
Create a storage for a Hollow Cube with a size of n + 2, where n is the size of the cube contained in the data. This satisfies the requirement that the sides are touching but do not overlap or share points.
This will simplify calculations and translations by creating a lookup array that uses Cartesian coordinates.
With a single translation function to take the coordinates of a selected point, get the 'world position'.
Using that function we can store each point into the cartesian lookup array.
When selecting a point, we can again use the same function (or use stored data) and subtract (to get AA or min position) and add (to get BB or max position).
Then we can just lookup each entry between the AA.xyz and BB.xyz coordinates.
Each null entry should be skipped.
Optimize if required by using a type of array that returns null if z is not 0 or size-1 and thus does not need to store null references for the 'hollow' middle of the cube.
Now that the cube can select 3D cubes, the other shapes are trivial, given a 3D point, define a 3D shape and test each part in the shape with the lookup array, if not null add it to selection.
Each point is only selected once as we only check each position once.
A little calculation overhead due to testing against the empty inside and outside of the cube, but array access is so fast that this solution is fine for my current project.
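A rough Python sketch of that lookup idea (a dict stands in for the Cartesian lookup array, and the face numbering/orientation below is my own assumption, not the asker's side order):

def face_to_world(face, u, v, n):
    # place the six n x n faces on the surface of an (n + 2)^3 hollow cube,
    # inset by one so neighbouring faces touch but never share a cell
    w = n + 1
    return [(0, u + 1, v + 1), (w, u + 1, v + 1),   # x-, x+
            (u + 1, 0, v + 1), (u + 1, w, v + 1),   # y-, y+
            (u + 1, v + 1, 0), (u + 1, v + 1, w)][face]

def build_lookup(n):
    return {face_to_world(f, u, v, n): (f, u, v)
            for f in range(6) for u in range(n) for v in range(n)}

def select_box(lookup, center, half):
    # walk every world position between AA (min corner) and BB (max corner)
    # and keep the non-empty ones; the hollow inside/outside is simply skipped
    cx, cy, cz = center
    hits = []
    for x in range(cx - half, cx + half + 1):
        for y in range(cy - half, cy + half + 1):
            for z in range(cz - half, cz + half + 1):
                cell = lookup.get((x, y, z))
                if cell is not None:
                    hits.append(cell)
    return hits

n = 16
lookup = build_lookup(n)
center = face_to_world(0, 0, 0, n)    # a point near a corner of one face
print(select_box(lookup, center, 2))  # the selection folds onto the neighbouring faces

Because each world position maps to at most one face cell, every point is selected at most once, as noted above.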

Largest rectangular sub matrix with the same number

I am trying to come up with a dynamic programming algorithm that finds the largest sub matrix within a matrix that consists of the same number:
example:
{5 5 8}
{5 5 7}
{3 4 1}
Answer : 4 elements due to the matrix
5 5
5 5
This is a question I already answered here (and here, a modified version). In both cases the algorithm was applied to the binary case (zeros and ones), but the modification for arbitrary numbers is quite easy (sorry, I keep the images for the binary version of the problem). You can do this very efficiently with a two-pass linear O(n) time algorithm, n being the number of elements. However, this is not dynamic programming - I think using dynamic programming here would be clumsy and inefficient in the end, because of the difficulties with problem decomposition the OP mentioned - unless it's homework, in which case you can try to impress with this algorithm :-) as there's obviously no faster solution than O(n).
Algorithm (pictures depict binary case):
Say you want to find largest rectangle of free (white) elements.
Here follows the two-pass linear O(n) time algorithm (n being the number of elements):
1) in a first pass, go by columns, from bottom to top, and for each element, denote the number of consecutive elements available up to this one:
repeat, until the auxiliary matrix is filled:
Pictures depict the binary case. In case of arbitrary numbers you hold 2 matrices - first with the original numbers and second with the auxiliary numbers that are filled in the image above. You have to check the original matrix and if you find a number different from the previous one, you just start the numbering (in the auxiliary matrix) again from 1.
2) in a second pass you go by rows, holding data structure of potential rectangles, i.e. the rectangles containing current position somewhere at the top edge. See the following picture (current position is red, 3 potential rectangles - purple - height 1, green - height 2 and yellow - height 3):
For each rectangle we keep its height k and its left edge. In other words, we keep track of the runs of consecutive numbers that were >= k (i.e. potential rectangles of height k). This data structure can be represented by an array, with a doubly linked list linking the occupied items, and the array size would be limited by the matrix height.
Pseudocode of 2nd pass (non-binary version with arbitrary numbers):
var m[]    // original matrix
var aux[]  // auxiliary matrix filled in the 1st pass
var rect[] // array of potential rectangles, indexed by their height
           // the occupied items are also linked in a double linked list,
           // ordered by height

foreach row = 1..N  // go by rows
    foreach col = 1..M
        if (col > 1 AND m[row, col] != m[row, col - 1])  // new number
            close_potential_rectangles_higher_than(0);   // close all rectangles

        height = aux[row, col]  // maximal height possible at current position

        if (!rect[height]) {        // rectangle with this height does not exist
            create rect[height]     // open new rectangle
            if (rect[height].next)  // rectangle with nearest higher height
                // if it exists, start from its left edge
                rect[height].left_col = rect[height].next.left_col
            else
                rect[height].left_col = col;
        }
        close_potential_rectangles_higher_than(height)
    end for  // end row
    close_potential_rectangles_higher_than(0);
    // end of row -> close all rect., supposing col is M+1 now!
end for  // end matrix
The function for closing rectangles:
function close_potential_rectangles_higher_than(height)
    close_r = rectangle with highest height (last item in dll)
    while (close_r.height > height) {  // higher? close it
        area = close_r.height * (col - close_r.left_col)
        if (area > max_area) {  // we have a maximal rectangle!
            max_area = area
            max_topleft = [row, close_r.left_col]
            max_bottomright = [row + close_r.height - 1, col - 1]
        }
        close_r = close_r.prev
        // remove the rectangle close_r from the double linked list
    }
end function
This way you can also get all maximum rectangles. So in the end you get:
And what the complexity will be? You see that the function close_potential_rectangles_higher_than is O(1) per closed rectangle. Because for each field we create 1 potential rectangle at the maximum, the total number of potential rectangles ever present in particular row is never higher than the length of the row. Therefore, complexity of this function is O(1) amortized!
So the whole complexity is O(n) where n is number of matrix elements.
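The same two-pass idea can be sketched in Python; here the second pass uses the classic stack-based largest-rectangle-under-a-histogram instead of the answer's linked list of open rectangles, which keeps the same O(n) bound (this adaptation is mine, not the answer's exact pseudocode):

def largest_same_value_rect(m):
    rows, cols = len(m), len(m[0])
    heights = [0] * cols   # pass 1, kept up to date row by row: run of equal values upwards
    best = 0
    for r in range(rows):
        for c in range(cols):
            heights[c] = heights[c] + 1 if r > 0 and m[r][c] == m[r - 1][c] else 1
        c = 0
        while c < cols:    # pass 2: split the row wherever the value changes, then histogram
            start = c
            while c < cols and m[r][c] == m[r][start]:
                c += 1
            best = max(best, largest_hist_rect(heights[start:c]))
    return best

def largest_hist_rect(hist):
    # classic O(len(hist)) stack-based largest rectangle under a histogram
    stack, best = [], 0
    for i, h in enumerate(hist + [0]):
        start = i
        while stack and stack[-1][1] >= h:
            start, prev_h = stack.pop()
            best = max(best, prev_h * (i - start))
        stack.append((start, h))
    return best

print(largest_same_value_rect([[5, 5, 8],
                               [5, 5, 7],
                               [3, 4, 1]]))  # 4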
A dynamic solution:
Define a new matrix A which will store in A[i,j] two values: the width and the height of the largest submatrix with its left upper corner at i,j. Fill this matrix starting from the bottom right corner, by rows, bottom to top. You'll find four cases:
case 1: none of the right or bottom neighbour elements in the original matrix are equal to the current one, i.e: M[i,j] != M[i+1,j] and M[i,j] != M[i,j+1] being M the original matrix, in this case, the value of A[i,j] is 1x1
case 2: the neighbour element to the right is equal to the current one but the bottom one is different, the value of A[i,j].width is A[i+1,j].width+1 and A[i,j].height=1
case 3: the neighbour element to the bottom is equal but the right one is different, A[i,j].width=1, A[i,j].height=A[i,j+1].height+1
case 4: both neighbours are equal: A[i,j].width = min(A[i+1,j].width+1, A[i,j+1].width) and A[i,j].height = min(A[i,j+1].height+1, A[i+1,j].height)
the size of the largest matrix that has the upper left corner at i,j is A[i,j].width*A[i,j].height so you can update the max value found while calculating the A[i,j]
the bottom row and the rightmost column elements are treated as if their neighbours to the bottom and to the right respectively are different
in your example, the resulting matrix A would be:
{2:2 1:2 1:1}
{2:1 1:1 1:1}
{1:1 1:1 1:1}
being w:h width:height
Modification to the above answer:
Define a new matrix A which will store in A[i,j] two values: the width and the height of the largest submatrix with its left upper corner at i,j. Fill this matrix starting from the bottom right corner, by rows, bottom to top. You'll find four cases:
case 1: none of the right or bottom neighbour elements in the original matrix are equal to the current one, i.e: M[i,j] != M[i+1,j] and M[i,j] != M[i,j+1] being M the original matrix, in this case, the value of A[i,j] is 1x1
case 2: the neighbour element to the right is equal to the current one but the bottom one is different, the value of A[i,j].width is A[i+1,j].width+1 and A[i,j].height=1
case 3: the neighbour element to the bottom is equal but the right one is different, A[i,j].width=1, A[i,j].height=A[i,j+1].height+1
case 4: both neighbours are equal:
Three rectangles are considered:
1. A[i,j].width = A[i,j+1].width+1; A[i,j].height = 1
2. A[i,j].height = A[i+1,j].height+1; A[i,j].width = 1
3. A[i,j].width = min(A[i+1,j].width+1, A[i,j+1].width) and A[i,j].height = min(A[i,j+1].height+1, A[i+1,j].height)
The one with the max area in the above three cases will be considered to represent the rectangle at this position.
The size of the largest matrix that has the upper left corner at i,j is A[i,j].width*A[i,j].height so you can update the max value found while calculating the A[i,j]
the bottom row and the rightmost column elements are treated as if their neighbours to the bottom and to the right respectively are different.
This question is a duplicate. I have tried to flag it as a duplicate. Here is a Python solution, which also returns the position and shape of the largest rectangular submatrix:
#!/usr/bin/env python3
import numpy

s = '''5 5 8
5 5 7
3 4 1'''

nrows = 3
ncols = 3
skip_not = 5
area_max = (0, [])
a = numpy.fromstring(s, dtype=int, sep=' ').reshape(nrows, ncols)
w = numpy.zeros(dtype=int, shape=a.shape)
h = numpy.zeros(dtype=int, shape=a.shape)
for r in range(nrows):
    for c in range(ncols):
        if not a[r][c] == skip_not:
            continue
        if r == 0:
            h[r][c] = 1
        else:
            h[r][c] = h[r-1][c]+1
        if c == 0:
            w[r][c] = 1
        else:
            w[r][c] = w[r][c-1]+1
        minw = w[r][c]
        for dh in range(h[r][c]):
            minw = min(minw, w[r-dh][c])
            area = (dh+1)*minw
            if area > area_max[0]:
                area_max = (area, [(r, c, dh+1, minw)])
print('area', area_max[0])
for t in area_max[1]:
    print('coord and shape', t)
Output:
area 4
coord and shape (1, 1, 2, 2)

Implementing a Hilbert map of the Internet

In xkcd comic 195, a design for a map of the Internet address space is suggested using a Hilbert curve, so that items from similar IP addresses will be clustered together.
Given an IP address, how would I calculate its 2D coordinates (in the range zero to one) on such a map?
This is pretty easy, since the Hilbert curve is a fractal, that is, it is recursive. It works by bisecting each square horizontally and vertically, dividing it into four pieces. So you take two bits of the IP address at a time, starting from the left, and use those to determine the quadrant, then continue, using the next two bits, with that quadrant instead of the whole square, and so on until you have exhausted all the bits in the address.
The basic shape of the curve in each square is horseshoe-like:
0 3
1 2
where the numbers correspond to the top two bits and therefore determine the traversal order. In the xkcd map, this square is the traversal order at the highest level. Possibly rotated and/or reflected, this shape is present at each 2x2 square.
How the "horseshoe" is oriented in each of the subsquares is determined by one rule: the 0 corner of the 0 square is in the corner of the larger square. Thus, the subsquare corresponding to 0 above must be traversed in the order
0 1
3 2
and, looking at the whole previous square and showing four bits, we get the following shape for the next division of the square:
00 01 32 33
03 02 31 30
10 13 20 23
11 12 21 22
This is how the square always gets divided at the next level. Now, to continue, just focus on the latter two bits, orient this more detailed shape according to how the horseshoe shape of those bits is oriented, and continue with a similar division.
To determine the actual coordinates, each two bits determine one bit of binary precision in the real number coordinates. So, on the first level, the first bit after the binary point (assuming coordinates in the [0,1] range) in the x coordinate is 0 if the first two bits of the address have the value 0 or 1, and 1 otherwise. Similarly, the first bit in the y coordinate is 0 if the first two bits have the value 1 or 2. To determine whether to add a 0 or 1 bit to the coordinates, you need to check the orientation of the horseshoe at that level.
EDIT: I started working out the algorithm and it turns out that it's not that hard after all, so here's some pseudo-C. It's pseudo because I use a b suffix for binary constants and treat integers as arrays of bits, but changing it to proper C shouldn't be too hard.
In the code, pos is a 3-bit integer for the orientation. The first two bits are the x and y coordinates of 0 in the square and the third bit indicates whether 1 has the same x coordinate as 0. The initial value of pos is 011b, meaning that the coordinates of 0 are (0, 1) and 1 has the same x coordinate as 0. ad is the address, treated as an n-element array of 2-bit integers, and starting from the most significant bits.
double x = 0.0, y = 0.0;
double xinc, yinc;
pos = 011b;
for (int i = 0; i < n; i++) {
    switch (ad[i]) {
        case 0: xinc = pos[0]; yinc = pos[1]; pos[2] = ~pos[2]; break;
        case 1: xinc = pos[0] ^ ~pos[2]; yinc = pos[1] ^ pos[2]; break;
        case 2: xinc = ~pos[0]; yinc = ~pos[1]; break;
        case 3: xinc = pos[0] ^ pos[2]; yinc = pos[1] ^ ~pos[2];
                pos = ~pos; break;
    }
    x += xinc / (1 << (i+1)); y += yinc / (1 << (i+1));
}
I tested it with a couple of 8-bit prefixes and it placed them correctly according to the xkcd map, so I'm somewhat confident the code is correct.
Essentially you would decompose the number, using pairs of bits, MSB to LSB. The pair of bits tells you if the location is in the Upper Left (0) Lower Left (1) Lower Right (2) or Upper Right (3) quadrant, at a scale that gets finer as you shift through the number.
Additionally, you need to track an "orientation". This is the winding that is used at the scale you are at; the initial winding is as above (UL, LL, LR, UR), and depending on which quadrant you end up in, the winding at the next scale down is (rotated -90, 0, 0, +90) from your current winding.
So you could accumulate offsets:
Suppose I start at (0, 0) and the first pair gives me a 2; I shift my offsets to (0.5, 0.5). The winding in the lower right is the same as my initial one. The next pair reduces the scale, so my adjustments are going to be 0.25 in length.
This pair is a 3, so I translate only my x coordinate and I am at (0.75, 0.5). The winding is now rotated over and my next scale down will be (LR, LL, UL, UR). The scale is now 0.125, and so on until I run out of bits in my address.
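For reference, here is a Python version of the standard index-to-coordinate conversion (the d2xy routine from the Wikipedia article on Hilbert curves); its orientation may be rotated or reflected relative to the xkcd poster, but the clustering property is the same:

def d2xy(order, d):
    # map distance d along a Hilbert curve over a 2**order x 2**order grid to (x, y)
    x = y = 0
    s, t = 1, d
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:             # rotate/flip the quadrant built so far
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def ip_to_unit_square(ip):
    # treat the 32-bit IPv4 address as a distance along a 2**16 x 2**16 Hilbert curve
    a, b, c, d = (int(part) for part in ip.split('.'))
    x, y = d2xy(16, (a << 24) | (b << 16) | (c << 8) | d)
    return x / 65536.0, y / 65536.0

print(ip_to_unit_square('127.0.0.1'))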
I expect that, based on the Wikipedia code for a Hilbert curve, you could keep track of your current position (as an (x, y) coordinate) and return that position after n cells had been visited. Then the position scaled onto [0..1] would depend on how high and wide the Hilbert curve was going to be at completion.
from turtle import left, right, forward

size = 10

def hilbert(level, angle):
    if level:
        right(angle)
        hilbert(level - 1, -angle)
        forward(size)
        left(angle)
        hilbert(level - 1, angle)
        forward(size)
        hilbert(level - 1, angle)
        left(angle)
        forward(size)
        hilbert(level - 1, -angle)
        right(angle)
Admittedly, this would be a brute force solution rather than a closed form solution.
