How to efficiently enumerate all points of sphere in n-dimensional grid

Say, we have an N-dimensional grid and some point X in it with coordinates (x1, x2, ..., xN).
For simplicity we can assume that the grid is unbounded.
Let there be a radius R and a sphere of this radius centered at X, that is, the set of all grid points whose Manhattan distance from X is equal to R.
I suspect that there will be 2*N*R such points.
My question is: how do I enumerate them in an efficient and simple way? By "enumerate" I mean an algorithm which, given N, X and R, will produce the list of points that form this sphere (where a point is the list of its coordinates).
UPDATE: Initially I called the metric I used "Hamming distance" by mistake. My apologies to all who answered the question. Thanks to Steve Jessop for pointing this out.

Consider the minimal axis-aligned hypercube that bounds the hypersphere and write a procedure to enumerate the grid points inside the hypercube.
Then you only need a simple filter function that allows you to discard the points that are on the cube but not in the hypersphere.
This is a simple and efficient solution for small dimensions. For instance, for 2D, 20% of the points enumerated for the bounding square are discarded; for 6D, almost 90% of the hypercube points are discarded.
For higher dimensions, you will have to use a more complex approach: loop over every dimension (you may need to use a recursive function if the number of dimensions is variable). For every loop you will have to adjust the minimal and maximal values depending on the values of the already calculated grid components. Well, try doing it for 2D, enumerating the points of a circle and once you understand it, generalizing the procedure to higher dimensions would be pretty simple.
update: errh, wait a minute, you want to use the Manhattan distance. Calling the cross polytope "sphere" may be correct but I found it quite confusing! In any case you can use the same approach.
If you only want to enumerate the points on the hyper-surface of the cross polytope, well, the solution is also very similar, you have to loop over every dimension with appropriate limits. For instance:
for (i = 0; i <= n; i++)
    for (j = 0; j + i <= n; j++)
        ...
            for (l = 0; l + ... + j + i <= n; l++) {
                m = n - l - ... - j - i;
                printf(pat, i, j, ..., l, m);
            }
For every point generated that way, then you will have to consider all the variations resulting of negating any of the components to cover all the faces and then displace them with the vector X.
update
Perl implementation for the case where X = 0:
#!/usr/bin/perl
use strict;
use warnings;
sub enumerate {
    my ($d, $r) = @_;
    if ($d == 1) {
        return ($r ? ([-$r], [$r]) : [0])
    }
    else {
        my @r;
        for my $i (0..$r) {
            for my $s (enumerate($d - 1, $r - $i)) {
                for my $j ($i ? (-$i, $i) : 0) {
                    push @r, [@$s, $j]
                }
            }
        }
        return @r;
    }
}
@ARGV == 2 or die "Usage:\n $0 dimension radius\n\n";
my ($d, $r) = @ARGV;
my @r = enumerate($d, $r);
print "[", join(',', @$_), "]\n" for @r;
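For readers who want the center X handled as well, the same recursion can be sketched in Python. This is only an illustration, not part of the original answer: the name `l1_sphere` and its helper `offsets` are hypothetical, and the center is assumed to be a non-empty tuple of integers.

```python
def l1_sphere(center, r):
    """Yield every grid point at Manhattan distance exactly r from center."""

    def offsets(d, r):
        # All integer vectors of length d whose absolute values sum to r.
        if d == 1:
            yield from ([(r,), (-r,)] if r else [(0,)])
            return
        for i in range(r + 1):
            # The last coordinate uses i units; the rest share r - i.
            for head in offsets(d - 1, r - i):
                for j in ((i, -i) if i else (0,)):
                    yield head + (j,)

    # Shift every offset by the center (assumed: len(center) >= 1).
    for off in offsets(len(center), r):
        yield tuple(c + o for c, o in zip(center, off))


if __name__ == "__main__":
    for p in l1_sphere((5, -2), 1):
        print(p)
```

Because each point is produced exactly once, no de-duplication step is needed, and the work is proportional to the number of points on the sphere.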

Input: radius R, dimension D
Generate all integer partitions of R with cardinality ≤ D
For each partition, permute it without repetition
For each permutation, twiddle all the signs
For example, code in python:
from itertools import permutations, product
# we have to write this function ourselves because python doesn't have it...
def partitions(n, maxSize):
    if n==0:
        yield []
    else:
        for p in partitions(n-1, maxSize):
            if len(p)<maxSize:
                yield [1] + p
            if p and (len(p)<2 or p[1]>p[0]):
                yield [ p[0]+1 ] + p[1:]
# MAIN CODE
def points(R, D):
    for part in partitions(R,D): # e.g. 4->[3,1]
        part = part + [0]*(D-len(part)) # e.g. [3,1]->[3,1,0] (padding)
        for perm in set(permutations(part)): # e.g. [1,3,0], [1,0,3], ...
            for point in product(*[ # e.g. [1,3,0], [-1,3,0], [1,-3,0], [-...
                ([-x,x] if x!=0 else [0]) for x in perm
            ]):
                yield point
Demo for radius=4, dimension=3:
>>> result = list( points(4,3) )
>>> result
[(-1, -2, -1), (-1, -2, 1), (-1, 2, -1), (-1, 2, 1), (1, -2, -1), (1, -2, 1), (1, 2, -1), (1, 2, 1), (-2, -1, -1), (-2, -1, 1), (-2, 1, -1), (-2, 1, 1), (2, -1, -1), (2, -1, 1), (2, 1, -1), (2, 1, 1), (-1, -1, -2), (-1, -1, 2), (-1, 1, -2), (-1, 1, 2), (1, -1, -2), (1, -1, 2), (1, 1, -2), (1, 1, 2), (0, -2, -2), (0, -2, 2), (0, 2, -2), (0, 2, 2), (-2, 0, -2), (-2, 0, 2), (2, 0, -2), (2, 0, 2), (-2, -2, 0), (-2, 2, 0), (2, -2, 0), (2, 2, 0), (-1, 0, -3), (-1, 0, 3), (1, 0, -3), (1, 0, 3), (-3, -1, 0), (-3, 1, 0), (3, -1, 0), (3, 1, 0), (0, -1, -3), (0, -1, 3), (0, 1, -3), (0, 1, 3), (-1, -3, 0), (-1, 3, 0), (1, -3, 0), (1, 3, 0), (-3, 0, -1), (-3, 0, 1), (3, 0, -1), (3, 0, 1), (0, -3, -1), (0, -3, 1), (0, 3, -1), (0, 3, 1), (0, -4, 0), (0, 4, 0), (0, 0, -4), (0, 0, 4), (-4, 0, 0), (4, 0, 0)]
>>> len(result)
66
(Above I used set(permutations(...)) to get permutations without repetition, which is not efficient in general, but it might not matter here due to the nature of the points. And if efficiency mattered, you could write your own recursive function in your language of choice.)
This method is efficient because it does not scale with the hypervolume, but just scales with the hypersurface, which is what you're trying to enumerate (might not matter much except for very large radii: e.g. will save you roughly a factor of 100x speed if your radius is 100).
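As a cross-check on the 66 points in the demo, the size of the sphere can be counted directly: pick which k coordinates are nonzero, pick their signs, and split R into k positive parts, which gives the sum over k of 2^k * C(D, k) * C(R-1, k-1). The helper name `l1_sphere_size` below is hypothetical; the sketch only counts and does not enumerate.

```python
from math import comb


def l1_sphere_size(d, r):
    """Number of integer points in d dimensions at Manhattan distance exactly r."""
    if r == 0:
        return 1  # only the center itself
    # k = number of nonzero coordinates; comb() returns 0 when k exceeds d or r.
    return sum(2**k * comb(d, k) * comb(r - 1, k - 1) for k in range(1, d + 1))


if __name__ == "__main__":
    print(l1_sphere_size(3, 4))
```

For D = 2 this equals 4R, in line with the 2*N*R guess from the question; for D = 3 and R = 4 it gives 66 rather than 24.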

You can work your way recursively from the center, counting zero distance once and working on symmetries. This Python implementation works on the lower-dimension "stem" vector and realizes one 1-dimensional slice at a time. One might also do the reverse, but it would imply iterating on the partial hyperspheres. While mathematically the same, the efficiency of both approaches is heavily language-dependent.
If you know the cardinality of the target space beforehand, I would recommend writing an iterative implementation.
The following enumerates the points on an R=16 hyper-LEGO block in six dimensions in about 200 ms on my laptop. Of course, performance rapidly decreases with more dimensions or larger spheres.
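def lapp(lst, el):
    # return a copy of lst with el appended (lst itself is left untouched)
    lst2 = list(lst)
    lst2.append(el)
    return lst2
def hypersphere(n, r, stem = [ ]):
    # returns every point within Manhattan distance r of the origin in n
    # dimensions (interior included), each prefixed with the coordinates in stem
    mystem = lapp(stem, 0)
    if 1 == n:
        ret = [ mystem ]
        for d in range(1, r+1):
            ret.append(lapp(stem, d))
            ret.append(lapp(stem, -d))
    else:
        ret = hypersphere(n-1, r, mystem)
        for d in range(1, r+1):
            mystem[-1] = d
            ret.extend(hypersphere(n-1, r-d, mystem))
            mystem[-1] = -d
            ret.extend(hypersphere(n-1, r-d, mystem))
    return ret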
(This implementation assumes the hypersphere is centered in the origin. It would be easier to translate all points afterwards than carrying along the coordinates of the center).

Related

Cartesian product but remove duplicates up to cyclic permutations

Given two integers n and r, I want to generate all possible combinations with the following rules:
There are n distinct numbers to choose from, 1, 2, ..., n;
Each combination should have r elements;
A combination may contain more than one of an element, for instance (1,2,2) is valid;
Order matters, i.e. (1,2,3) and (1,3,2) are considered distinct;
However, two combinations are considered equivalent if one is a cyclic permutation of the other; for instance, (1,2,3) and (2,3,1) are considered duplicates.
Examples:
n=3, r=3
11 distinct combinations
(1,1,1), (1,1,2), (1,1,3), (1,2,2), (1,2,3), (1,3,2), (1,3,3), (2,2,2), (2,2,3), (2,3,3) and (3,3,3)
n=2, r=4
6 distinct combinations
(1,1,1,1), (1,1,1,2), (1,1,2,2), (1,2,1,2), (1,2,2,2), (2,2,2,2)
What is the algorithm for it? And how to implement it in c++?
Thank you in advance for advice.

Here is a naive solution in python:
Generate all combinations from the Cartesian product of {1, 2, ...,n} with itself r times;
Only keep one representative combination for each equivalency class; drop all other combinations that are equivalent to this representative combination.
This means we must have some way to compare combinations, and for instance, only keep the smallest combination of every equivalency class.
from itertools import product
def is_representative(comb):
    return all(comb[i:] + comb[:i] >= comb
               for i in range(1, len(comb)))
def cartesian_product_up_to_cyclic_permutations(n, r):
    return filter(is_representative,
                  product(range(n), repeat=r))
print(list(cartesian_product_up_to_cyclic_permutations(3, 3)))
# [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 1), (0, 1, 2), (0, 2, 1), (0, 2, 2), (1, 1, 1), (1, 1, 2), (1, 2, 2), (2, 2, 2)]
print(list(cartesian_product_up_to_cyclic_permutations(2, 4)))
# [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
You mentioned that you wanted to implement the algorithm in C++. The product function in the python code behaves just like a big for-loop that generates all the combinations in the Cartesian product. See this related question to implement Cartesian product in C++: Is it possible to execute n number of nested "loops(any)" where n is given?.

group line segments into minimum set of polyline

Given a list of line segments, I need to construct a list of polylines while keeping the number of polylines to a minimum.
The polylines must not visit the same edge more than once.
For example, if I was given the 4 edges of a rectangle then one polyline would be sufficient.
If I was given the 6 edges of a rectangle with a cross in the middle then I would need two polylines to cover it.
This problem looks very similar to the travelling salesman problem, so I am not sure if a solution exists. I can live with a suboptimal solution in that case.
Edit:
Performance is more important than precision for us so ideally we would want to group the ones that almost join (1-2 pixels away from joining) each other into one polyline
Examples:
input for a square:
L(0, 0) - (0, 1), L(0, 1) - (1, 1), L(1, 1) - (1, 0), L(1, 0) - (0, 0)
Expected output: Polyline (0, 0), (0, 1), (1, 1), (1, 0), Close
input for a square with X:
L(0, 0) - (0, 1), L(0, 1) - (1, 1), L(1, 1) - (1, 0), L(1, 0) - (0, 0), L(0, 0) - (1, 1), L(1, 0) - (0, 1)
One possible output: Polyline1 (0, 0), (0, 1), (1, 1), (1, 0), (0, 0), (1, 1) Polyline2 (1, 0), (0, 1)
input for lines close to each other:
L(0, 0) - (1, 0), L(2, 0) - (3, 0)
Ideal output: Polyline (0, 0), (3, 0)

Polygon from a grid of squares

I'm looking for an algorithm to find the polygon that surrounds a contiguous grid of squares without holes as shown here:
[image]
I already have each of the grid squares storing data about the kind of edges with the surrounding area that they are composed of (i.e. top, top-right, top-bottom, no edges, etc.), so I'm thinking that this data could be utilized by the algorithm. If someone could provide some pseudocode for such an algorithm that would also be great.
The input to the algorithm would be a list of data objects, each with a Vector2Int describing the grid positions (note that these are simply positions within a grid, not vertices) as well as an Enum that gives the type of edges that the square has with the surrounding area. The output would be an ordered list of Vector2s describing the vertices of the surrounding polygon, assuming that each grid square is one unit in size.
I have found a similar question in the link below, but I wanted some elaboration on the kind of algorithm that would be specific to my case, especially given the data that I already have stored about the edges. I'd also prefer the algorithm to avoid calculating each of the squares' vertices and running a bunch of straightforward searches to eliminate the shared ones, as I feel that this might be too computationally expensive for my particular application. I just have a suspicion that there has to be a better way.
Outline (circumference) polygon extraction from geometry constructed from equal squares
EDIT: Now I'm beginning to think that some sort of maze walking algorithm might actually be appropriate for my situation. I'm working on a solution that I think will work, but it's very cumbersome to write (involving a tonne of conditional checks against the square edges and the direction of travel around the circumference) and probably isn't as fast as it could be.

I am not sure I understand what your data structure contains, so I assume that you have a list of squares known by the coordinates of some point (corner or center).
Compute the bounding box and create a binary bitmap of the same size. Unless the geometry is really sparse, the area of the bitmap will be of the same order as the number of squares.
For every square, paint the corresponding pixel black. Then use a contouring algorithm. To obtain the outline of the squares, you will need to design a correspondence table between the pixel-to-pixel moves and the outline fragments to be appended.

Came across this post looking for alternatives to my solution. This is what I came up with:
For a cell:
      |                |
---(0, 0)--------(1, 0)---
      |                |
      |                |
      |      R0C0      |
      |                |
      |                |
---(0, 1)--------(1, 1)---
      |                |
Calculate the borders of each cell as a pair of its corner coordinates, written as (x, y) = (column, row), so that they match the edges listed further down:
top: ((c, r), (c + 1, r))
right: ((c + 1, r), (c + 1, r + 1))
bottom: ((c + 1, r + 1), (c, r + 1))
left: ((c, r + 1), (c, r))
Notice how these are defined clockwise; this is important.
So for the grid
R0C0 R0C1 R0C2 R0C3
          R1C2 R1C3
     R2C1 R2C2
you'd get the following edges:
R0C0 (top, bottom, left): (0, 0)-(1, 0), (1, 1)-(0, 1), (0, 1)-(0, 0)
R0C1 (top, bottom): (1, 0)-(2, 0), (2, 1)-(1, 1)
R0C2 (top): (2, 0)-(3, 0)
R0C3 (top, right): (3, 0)-(4, 0), (4, 0)-(4, 1)
R1C2 (left): (2, 2)-(2, 1)
R1C3 (right, bottom): (4, 1)-(4, 2), (4, 2)-(3, 2)
R2C1 (top, bottom, left): (1, 2)-(2, 2), (2, 3)-(1, 3), (1, 3)-(1, 2)
R2C2 (right, bottom): (3, 2)-(3, 3), (3, 3)-(2, 3)
Now it's a question of ordering these so that the first coordinate of each element is the same as the second coordinate of its predecessor.
(0, 0)-(1, 0)         (0, 0)-(1, 0)
(1, 1)-(0, 1)         (1, 0)-(2, 0)
(0, 1)-(0, 0)         (2, 0)-(3, 0)
(1, 0)-(2, 0)         (3, 0)-(4, 0)
(2, 1)-(1, 1)         (4, 0)-(4, 1)
(2, 0)-(3, 0)         (4, 1)-(4, 2)
(3, 0)-(4, 0)         (4, 2)-(3, 2)
(4, 0)-(4, 1)   =>    (3, 2)-(3, 3)
(2, 2)-(2, 1)         (3, 3)-(2, 3)
(4, 1)-(4, 2)         (2, 3)-(1, 3)
(4, 2)-(3, 2)         (1, 3)-(1, 2)
(1, 2)-(2, 2)         (1, 2)-(2, 2)
(2, 3)-(1, 3)         (2, 2)-(2, 1)
(1, 3)-(1, 2)         (2, 1)-(1, 1)
(3, 2)-(3, 3)         (1, 1)-(0, 1)
(3, 3)-(2, 3)         (0, 1)-(0, 0)
Now in the result, let's take only the first coordinate, this is your polygon:
(0, 0)
(1, 0)
(2, 0)
(3, 0)
(4, 0)
(4, 1)
(4, 2)
(3, 2)
(3, 3)
(2, 3)
(1, 3)
(1, 2)
(2, 2)
(2, 1)
(1, 1)
(0, 1)
You can now simplify it by eliminating consecutive points that are on a single line (i.e. in three consecutive points that either have the same x or y coordinate, eliminate the middle one)
(0, 0)
(4, 0)
(4, 2)
(3, 2)
(3, 3)
(1, 3)
(1, 2)
(2, 2)
(2, 1)
(0, 1)
This is now your polygon in clock-wise order:
(0, 0)--------------------------------------(4, 0)
  |                                            |
  |                                            |
(0, 1)----------------(2, 1)                   |
                        |                      |
                        |                      |
           (1, 2)-----(2, 2)     (3, 2)-----(4, 2)
             |                      |
             |                      |
           (1, 3)----------------(3, 3)
This algorithm can be expanded to handle holes as well. You'd just need to account for multiple polygons when ordering the edges. Conveniently, holes will be defined counter-clockwise, which is handy if you want to draw the result with SVG paths or other 2D path algorithms that allow for polygons with overlap.
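The edge-chaining steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: cells are given as (c, r) pairs, boundary edges are found by checking for a missing neighbour rather than from stored edge flags, the region is contiguous and hole-free, and no two cells touch only at a corner (so each boundary vertex starts exactly one edge). The names `outline` and `simplify` are hypothetical.

```python
def outline(cells):
    """Return the boundary vertices of a hole-free cell region, clockwise."""
    cells = set(cells)
    nxt = {}  # start vertex -> end vertex of each clockwise boundary edge
    for c, r in cells:
        if (c, r - 1) not in cells:      # top edge is on the boundary
            nxt[(c, r)] = (c + 1, r)
        if (c + 1, r) not in cells:      # right edge
            nxt[(c + 1, r)] = (c + 1, r + 1)
        if (c, r + 1) not in cells:      # bottom edge
            nxt[(c + 1, r + 1)] = (c, r + 1)
        if (c - 1, r) not in cells:      # left edge
            nxt[(c, r + 1)] = (c, r)
    # Follow the chain: each edge ends where the next one starts.
    start = min(nxt)
    poly = [start]
    v = nxt[start]
    while v != start:
        poly.append(v)
        v = nxt[v]
    return poly


def simplify(poly):
    """Drop vertices that lie on a straight line between their two neighbours."""
    n = len(poly)
    kept = []
    for i, (x, y) in enumerate(poly):
        px, py = poly[i - 1]
        qx, qy = poly[(i + 1) % n]
        if not (px == x == qx or py == y == qy):
            kept.append((x, y))
    return kept


if __name__ == "__main__":
    grid = [(0, 0), (1, 0), (2, 0), (3, 0), (2, 1), (3, 1), (1, 2), (2, 2)]
    print(simplify(outline(grid)))
```

Storing the edges in a dictionary keyed by start vertex replaces the sorting step with a constant-time lookup per edge, so the whole pass is linear in the number of cells.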

Dynamic programming - board with multiplier

I've got a quite standard DP problem: an n×n board with integers, all positive. I want to start somewhere in the first row, end somewhere in the last row, and accumulate as large a sum as possible. From field (i,j) I can go to fields (i+1, j-1), (i+1, j), (i+1, j+1).
So far that's a standard DP problem. But we add one thing: there can be an asterisk on a field instead of a number. If we meet an asterisk, we get 0 points from it, but we increase the multiplier by 1. All numbers we collect later during our traversal are multiplied by the multiplier.
I can't figure out how to solve this problem with the multiplier. I assume it's still a DP problem, but how do I get the equations right for it?
Thanks for any help.

You can still use DP, but you have to keep track of two values: The "base" value, i.e. without any multipliers applied to it, and the "effective" value, i.e. with multipliers. You work your way backwards through the grid, starting in the previous-to-last row, get the three "adjacent" cells in the row after that (the possible "next" cells on the path), and just pick the one with the highest value.
If the current cell is a *, you get the cell where base + effective is maximal, otherwise you just get the one where the effective score is highest.
Here's an example implementation in Python. Note that instead of * I'm just using 0 for multipliers, and I'm looping the grid in order instead of in reverse, just because it's more convenient.
import random
size = 5
grid = [[random.randint(0, 5) for _ in range(size)] for _ in range(size)]
print(*grid, sep="\n")
# first value is base score, second is effective score (with multiplier)
solution = [[(x, x) for x in row] for row in grid]
for i in range(1, size):
    for k in range(size):
        # the 2 or 3 values in the previous line
        prev_values = solution[i-1][max(0, k-1):k+2]
        val = grid[i][k]
        if val == 0:
            # multiply
            base, mult = max(prev_values, key=lambda t: t[0] + t[1])
            solution[i][k] = (base, base + mult)
        else:
            # add
            base, mult = max(prev_values, key=lambda t: t[1])
            solution[i][k] = (val + base, val + mult)
print(*solution, sep="\n")
print(max(solution[-1], key=lambda t: t[1]))
Example: The random 5x5 grid, with 0 corresponding to *:
[4, 4, 1, 2, 1]
[2, 0, 3, 2, 0]
[5, 1, 3, 4, 5]
[0, 0, 2, 4, 1]
[1, 0, 5, 2, 0]
The final solution grid with base values and effective values:
[( 4, 4), ( 4, 4), ( 1, 1), ( 2, 2), ( 1, 1)]
[( 6, 6), ( 4, 8), ( 7, 7), ( 4, 4), ( 2, 4)]
[( 9, 13), ( 5, 9), ( 7, 11), (11, 11), ( 9, 9)]
[( 9, 22), ( 9, 22), ( 9, 13), (11, 15), (12, 12)]
[(10, 23), ( 9, 31), (14, 27), (13, 17), (11, 26)]
Thus, the best solution for this grid is 31 from (9, 31). Working backwards through the solution grid, this corresponds to the path 0-0-5-0-4, i.e. 3*5 + 4*4 = 31, as there are 2 * before the 5, and 3 * before the 4.
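For comparison, here is a sketch of a different formulation, not the one used above: it walks the board from the first row to the last and makes the current multiplier part of the DP state, so no choice between candidates has to be made before the multiplier is known. The names `best_score` and `STAR` are hypothetical, and the board encoding (a square list of lists with "*" for asterisks and the multiplier starting at 1) is an assumption.

```python
STAR = "*"


def best_score(board):
    """Best total for a top-to-bottom path; entries are positive ints or STAR."""
    n = len(board)
    neg = float("-inf")
    top = n + 2  # multipliers range over 1..n+1 (at most n asterisks on a path)

    # dp[j][m] = best sum of a path ending in column j with multiplier m.
    dp = [[neg] * top for _ in range(n)]
    for j, cell in enumerate(board[0]):
        if cell == STAR:
            dp[j][2] = 0
        else:
            dp[j][1] = cell

    for i in range(1, n):
        new = [[neg] * top for _ in range(n)]
        for j, cell in enumerate(board[i]):
            for m in range(1, top):
                # Best predecessor among columns j-1, j, j+1 with multiplier m.
                prev = max(dp[k][m] for k in range(max(0, j - 1), min(n, j + 2)))
                if prev == neg:
                    continue
                if cell == STAR:
                    # An asterisk scores nothing but raises the multiplier.
                    new[j][m + 1] = max(new[j][m + 1], prev)
                else:
                    new[j][m] = max(new[j][m], prev + m * cell)
        dp = new
    return max(max(row) for row in dp)


if __name__ == "__main__":
    print(best_score([[1, STAR], [5, 5]]))
```

The table has n columns and up to n + 1 multiplier values per row, so the running time is O(n^3), at the price of a larger state than a single pair per cell.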

Optimal 9-element sorting network that reduces to an optimal median-of-9 network?

I am looking into sorting and median-selection networks for nine elements based exclusively on two-input minimum / maximum operations. Knuth, TAOCP Vol. 3, 2nd ed. states (page 226) that a nine-element sorting network requires at least 25 comparisons, which translates into an equal number of SWAP() primitives or 50 min / max operations. Obviously a sorting network can be converted into a median-selection network by eliminating redundant operations. The conventional wisdom seems to be that this does not result in an optimal median-selection network. While this seems to be empirically true, I can find no proof in the literature that this is necessarily so.
Lukáš Sekanina, "Evolutionary Design Space Exploration for Median Circuits". In: EvoWorkshops, March 2004, pp. 240-249, gives the minimal number of min / max operations required for an optimal nine-input median-selection network as 30 (table 1). I verified that this is achieved both by the well-known median-selection network given by John L. Smith, "Implementing median filters in XC4000E FPGAs". XCELL magazine, Vol. 23, 1996, p. 16, and the median-of-9 network from the earlier work of Chaitali Chakrabarti and Li-Yu Wang, "Novel sorting network-based architectures for rank order filters." IEEE Transactions on Very Large Scale Integration Systems, Vol. 2, No. 4 (1994), pp. 502-507, where the latter converts into the former by simple elimination of redundant components. See variants 4 and 5 in the code below.
Examining published optimal nine-element sorting networks for suitability for conversion into efficient median-selection networks through elimination of redundant operations, the best version I managed to find is from John M. Gamble's online generator, which requires 32 min / max operations, so just two shy of the optimal operation count. This is shown as variant 1 in the code below. Other optimal sorting networks reduce to 36 min / max operations (variant 2) and 38 min / max operations (variant 3), respectively.
Is there any known nine-element sorting network (i.e. with 50 two-input min / max operations) which reduces to an optimal nine-input median-selection network (with 30 two-input min / max operations) through elimination of redundant operations alone?
The code below uses float data as a test case, since many processors offer minimum / maximum operations for floating-point data but not integer data, GPUs being one exception. Due to issues with special floating-point operands (which do not occur in my actual use case), optimal code sequences typically require the use of "fast math" modes offered by compilers, such as in this Godbolt testbed.
#include <cstdlib>
#include <cstdio>
#include <algorithm>
#define VARIANT 1
#define FULL_SORT 0
typedef float T;
#define MIN(a,b) std::min(a,b)
#define MAX(a,b) std::max(a,b)
#define SWAP(i,j) do { T s = MIN(a##i,a##j); T t = MAX(a##i,a##j); a##i = s; a##j = t; } while (0)
#define MIN3(x,y,z) MIN(a##x,MIN(a##y,a##z))
#define MAX3(x,y,z) MAX(a##x,MAX(a##y,a##z))
#define MED3(x,y,z) MIN(MAX(MIN(a##y,a##z),a##x),MAX(a##y,a##z))
#define SORT3(x,y,z) do { T s = MIN3(x,y,z); T t = MED3(x,y,z); T u = MAX3(x,y,z); a##x=s; a##y=t; a##z=u; } while (0)
/* Use sorting/median network to fully or partially sort array of nine values
and return the median value
*/
T network9 (T *a)
{
    // copy to scalars
    T a0, a1, a2, a3, a4, a5, a6, a7, a8;
    a0=a[0];a1=a[1];a2=a[2];a3=a[3];a4=a[4];a5=a[5];a6=a[6];a7=a[7];a8=a[8];
#if VARIANT == 1
    // Full sort. http://pages.ripco.net/~jgamble/nw.html
    SWAP (0, 1); SWAP (3, 4); SWAP (6, 7); SWAP (1, 2); SWAP (4, 5);
    SWAP (7, 8); SWAP (0, 1); SWAP (3, 4); SWAP (6, 7); SWAP (0, 3);
    SWAP (3, 6); SWAP (0, 3); SWAP (1, 4); SWAP (4, 7); SWAP (1, 4);
    SWAP (2, 5); SWAP (5, 8); SWAP (2, 5); SWAP (1, 3); SWAP (5, 7);
    SWAP (2, 6); SWAP (4, 6); SWAP (2, 4); SWAP (2, 3); SWAP (5, 6);
#elif VARIANT == 2
    // Full sort. Donald E. Knuth, TAOCP Vol. 3, 2nd ed., Fig 51
    SWAP (0, 1); SWAP (3, 4); SWAP (6, 7); SWAP (1, 2); SWAP (4, 5);
    SWAP (7, 8); SWAP (0, 1); SWAP (3, 4); SWAP (6, 7); SWAP (2, 5);
    SWAP (0, 3); SWAP (5, 8); SWAP (1, 4); SWAP (2, 5); SWAP (3, 6);
    SWAP (4, 7); SWAP (0, 3); SWAP (5, 7); SWAP (1, 4); SWAP (2, 6);
    SWAP (1, 3); SWAP (2, 4); SWAP (5, 6); SWAP (2, 3); SWAP (4, 5);
#elif VARIANT == 3
    // Full sort. Vinod K Valsalam and Risto Miikkulainen, "Using Symmetry
    // and Evolutionary Search to Minimize Sorting Networks". Journal of
    // Machine Learning Research 14 (2013) 303-331
    SWAP (2, 6); SWAP (0, 5); SWAP (1, 4); SWAP (7, 8); SWAP (0, 7);
    SWAP (1, 2); SWAP (3, 5); SWAP (4, 6); SWAP (5, 8); SWAP (1, 3);
    SWAP (6, 8); SWAP (0, 1); SWAP (4, 5); SWAP (2, 7); SWAP (3, 7);
    SWAP (3, 4); SWAP (5, 6); SWAP (1, 2); SWAP (1, 3); SWAP (6, 7);
    SWAP (4, 5); SWAP (2, 4); SWAP (5, 6); SWAP (2, 3); SWAP (4, 5);
#elif VARIANT == 4
    // Chaitali Chakrabarti and Li-Yu Wang, "Novel sorting network-based
    // architectures for rank order filters." IEEE Transactions on Very
    // Large Scale Integration Systems, Vol. 2, No. 4 (1994), pp. 502-507
    // sort columns
    SORT3 (0, 1, 2);
    SORT3 (3, 4, 5);
    SORT3 (6, 7, 8);
    // sort rows
    SORT3 (0, 3, 6); // degenerate: MAX3 -> a6
    SORT3 (1, 4, 7); // degenerate: MED3 -> a4
    SORT3 (2, 5, 8); // degenerate: MIN3 -> a2
    // median computation
    SORT3 (2, 4, 6); // degenerate: MED3 -> a4 has rank 4
#elif VARIANT == 5
    // John L. Smith, "Implementing median filters in XC4000E FPGAs",
    // XCELL magazine, Vol. 23, 1996, p. 16
    SORT3 (0, 1, 2);
    SORT3 (3, 4, 5);
    SORT3 (6, 7, 8);
    a3 = MAX3 (0, 3, 6); // a3 has rank 2,3,4,5,6
    a4 = MED3 (1, 4, 7); // a4 has rank 3,4,5
    a5 = MIN3 (2, 5, 8); // a5 has rank 2,3,4,5,6
    a4 = MED3 (3, 4, 5); // a4 has rank 4
#else
#error unknown VARIANT
#endif
#if FULL_SORT
    // copy back sorted results
    a[0]=a0;a[1]=a1;a[2]=a2;a[3]=a3;a[4]=a4;a[5]=a5;a[6]=a6;a[7]=a7;a[8]=a8;
#endif
    // return median-of-9
    return a4;
}

I'm not sure this will meet all the criteria for what you're looking for, but here's a way to transform variant 5 into a 25-swap, 50-min/max sorting network, and then reduce it to a 30-min/max median-selection network:
We start with the median-selection network (John L. Smith, 1996) which uses three SORT3's, one MAX3, one MIN3 and two MED3's:
We change all the MAX3, MIN3 and MED3's into SORT3's, and add four SWAP's to get a full sorting network:
(We don't need full sorting of triples 1,2,3 and 5,6,7 at the end, because 2 cannot be less than both 1 and 3, and 6 cannot be greater than both 5 and 7.)
When we replace the SORT3's by SWAP's, we get this standard 25-swap sorting network:
We can then reduce it to this 30-min/max median-selection network:
MIN = Math.min; MAX = Math.max;
function sortingNetwork9(a) { // 50x min/max
    swap(0,1); swap(3,4); swap(6,7);
    swap(1,2); swap(4,5); swap(7,8);
    swap(0,1); swap(3,4); swap(6,7);
    swap(0,3); swap(3,6); swap(0,3);
    swap(1,4); swap(4,7); swap(1,4);
    swap(5,8); swap(2,5); swap(5,8);
    swap(2,4); swap(4,6); swap(2,4);
    swap(1,3); swap(2,3);
    swap(5,7); swap(5,6);
    function swap(i,j) {var tmp = MIN(a[i],a[j]); a[j] = MAX(a[i],a[j]); a[i] = tmp;}
}
function medianSelection9(a) { // 30x min/max
    swap(0,1); swap(3,4); swap(6,7);
    swap(1,2); swap(4,5); swap(7,8);
    swap(0,1); swap(3,4); swap(6,7);
    max(0,3); max(3,6); // (0,3);
    swap(1,4); min(4,7); max(1,4);
    min(5,8); min(2,5); // (5,8);
    swap(2,4); min(4,6); max(2,4);
    // (1,3); // (2,3);
    // (5,7); // (5,6);
    function swap(i,j) {var tmp = MIN(a[i],a[j]); a[j] = MAX(a[i],a[j]); a[i] = tmp;}
    function min(i,j) {a[i] = MIN(a[i],a[j]);}
    function max(i,j) {a[j] = MAX(a[i],a[j]);}
}
var a = [5,7,1,8,2,3,6,4,0], b = [5,7,1,8,2,3,6,4,0];
sortingNetwork9(a);
medianSelection9(b);
document.write("sorted: " + a + "<br>median: " + b[4]);
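Both networks above can be checked mechanically. By the zero-one principle, a network built only from min/max operations is correct on every input if it is correct on every input of zeros and ones, so 2^9 = 512 cases are enough. The Python sketch below transcribes the comparator lists from the JavaScript above; the names `SORT9`, `MEDIAN9`, `sort9`, `median9`, `cost` and `verify` are hypothetical.

```python
from itertools import product

# The 25 compare-exchange pairs of sortingNetwork9, in order.
SORT9 = [(0, 1), (3, 4), (6, 7), (1, 2), (4, 5), (7, 8), (0, 1), (3, 4), (6, 7),
         (0, 3), (3, 6), (0, 3), (1, 4), (4, 7), (1, 4), (5, 8), (2, 5), (5, 8),
         (2, 4), (4, 6), (2, 4), (1, 3), (2, 3), (5, 7), (5, 6)]

S, LO, HI = "swap", "min", "max"
# The operations of medianSelection9: swap writes both outputs,
# min writes only a[i], max writes only a[j].
MEDIAN9 = [(S, 0, 1), (S, 3, 4), (S, 6, 7), (S, 1, 2), (S, 4, 5), (S, 7, 8),
           (S, 0, 1), (S, 3, 4), (S, 6, 7),
           (HI, 0, 3), (HI, 3, 6),
           (S, 1, 4), (LO, 4, 7), (HI, 1, 4),
           (LO, 5, 8), (LO, 2, 5),
           (S, 2, 4), (LO, 4, 6), (HI, 2, 4)]


def sort9(values):
    a = list(values)
    for i, j in SORT9:
        a[i], a[j] = min(a[i], a[j]), max(a[i], a[j])
    return a


def median9(values):
    a = list(values)
    for op, i, j in MEDIAN9:
        lo, hi = min(a[i], a[j]), max(a[i], a[j])
        if op in (S, LO):
            a[i] = lo
        if op in (S, HI):
            a[j] = hi
    return a[4]


def cost(ops):
    """Number of two-input min/max operations (a swap counts as two)."""
    return sum(2 if op == S else 1 for op, _, _ in ops)


def verify():
    """Check both networks on all 512 zero-one inputs."""
    return all(sort9(bits) == sorted(bits) and median9(bits) == sorted(bits)[4]
               for bits in product((0, 1), repeat=9))


if __name__ == "__main__":
    print(verify(), 2 * len(SORT9), cost(MEDIAN9))
```

The same harness can be pointed at any other candidate network by swapping in its comparator list.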
