Linearly reading a multi-dimensional array obeying dimensional sub-sectioning - algorithm

I have an API for reading multi-dimensional arrays which requires passing a vector of ranges to read sub-rectangles (or hypercubes) from the backing array. I want to read this array "linearly", i.e. all elements in some given order with arbitrary chunk sizes. The task is thus: given an offset off and a length len, translate the elements covered by that range into the smallest possible set of hypercubes, i.e. the smallest number of read commands issued through the API.
For example, given a linear index, we can calculate the corresponding index vector over the set of dimensions:
def calcIndices(off: Int, shape: Vector[Int]): Vector[Int] = {
  val modsDivs = shape zip shape.scanRight(1)(_ * _).tail
  modsDivs.map { case (mod, div) =>
    (off / div) % mod
  }
}
Let's say the shape is this, representing an array with rank 4 and 120 elements in total:
val sz = Vector(2, 3, 4, 5)
val num = sz.product // 120
A utility to print these index vectors for a range of linear offsets:
def printIndices(off: Int, len: Int): Unit =
  (off until (off + len)).map(calcIndices(_, sz))
    .map(_.mkString("[", ", ", "]")).foreach(println)
We can generate all those vectors:
printIndices(0, num)
[0, 0, 0, 0]
[0, 0, 0, 1]
[0, 0, 0, 2]
[0, 0, 0, 3]
[0, 0, 0, 4]
[0, 0, 1, 0]
[0, 0, 1, 1]
[0, 0, 1, 2]
[0, 0, 1, 3]
[0, 0, 1, 4]
[0, 0, 2, 0]
[0, 0, 2, 1]
[0, 0, 2, 2]
[0, 0, 2, 3]
[0, 0, 2, 4]
[0, 0, 3, 0]
[0, 0, 3, 1]
[0, 0, 3, 2]
[0, 0, 3, 3]
[0, 0, 3, 4]
[0, 1, 0, 0]
...
[1, 2, 1, 4]
[1, 2, 2, 0]
[1, 2, 2, 1]
[1, 2, 2, 2]
[1, 2, 2, 3]
[1, 2, 2, 4]
[1, 2, 3, 0]
[1, 2, 3, 1]
[1, 2, 3, 2]
[1, 2, 3, 3]
[1, 2, 3, 4]
Let's look at an example chunk that should be read, the first six elements:
val off1 = 0
val len1 = 6
printIndices(off1, len1)
I have already partitioned the output by hand into hypercubes:
// first hypercube or read
[0, 0, 0, 0]
[0, 0, 0, 1]
[0, 0, 0, 2]
[0, 0, 0, 3]
[0, 0, 0, 4]
// second hypercube or read
[0, 0, 1, 0]
So the task is to define a method
def partition(shape: Vector[Int], off: Int, len: Int): List[Vector[Range]]
which outputs the correct list and uses the smallest possible list size.
So for off1 and len1, we have the expected result:
val res1 = List(
  Vector(0 to 0, 0 to 0, 0 to 0, 0 to 4),
  Vector(0 to 0, 0 to 0, 1 to 1, 0 to 0)
)
assert(res1.map(_.map(_.size).product).sum == len1)
A second example, elements at indices 6 until 22, with manual partitioning giving three hypercubes or read commands:
val off2 = 6
val len2 = 16
printIndices(off2, len2)
// first hypercube or read
[0, 0, 1, 1]
[0, 0, 1, 2]
[0, 0, 1, 3]
[0, 0, 1, 4]
// second hypercube or read
[0, 0, 2, 0]
[0, 0, 2, 1]
[0, 0, 2, 2]
[0, 0, 2, 3]
[0, 0, 2, 4]
[0, 0, 3, 0]
[0, 0, 3, 1]
[0, 0, 3, 2]
[0, 0, 3, 3]
[0, 0, 3, 4]
// third hypercube or read
[0, 1, 0, 0]
[0, 1, 0, 1]
expected result:
val res2 = List(
  Vector(0 to 0, 0 to 0, 1 to 1, 1 to 4),
  Vector(0 to 0, 0 to 0, 2 to 3, 0 to 4),
  Vector(0 to 0, 1 to 1, 0 to 0, 0 to 1)
)
assert(res2.map(_.map(_.size).product).sum == len2)
Note that for val off3 = 6; val len3 = 21, we would need four reads.

The idea of the following algorithm is as follows:

- A point-of-interest (poi) is the left-most position at which two index representations differ (for example, for [0, 0, 0, 1] and [0, 1, 0, 0] the poi is 1).
- We recursively sub-divide the original (start, stop) linear index range.
- We use motions in two directions: first by keeping the start constant and decreasing the stop through a special "ceil" operation on the start, later by keeping the stop constant and increasing the start through a special "floor" operation on the stop.
- For each sub-range, we calculate the poi of the boundaries, and we calculate "trunc", which is the ceil or floor operation described above.
- If this trunc value is identical to its input, we add the entire region and return; otherwise we recurse.
- The special "ceil" operation takes the previous start value, increments the element at the poi index and zeroes the subsequent elements; e.g. for [0, 0, 1, 1] and poi = 2, the ceil would be [0, 0, 2, 0].
- The special "floor" operation takes the previous stop value and zeroes the elements after the poi index; e.g. for [0, 0, 1, 1] and poi = 2, the floor would be [0, 0, 1, 0].
Here is my implementation. First, a few utility functions:
def calcIndices(off: Int, shape: Vector[Int]): Vector[Int] = {
  val modsDivs = (shape, shape.scanRight(1)(_ * _).tail, shape.indices).zipped
  modsDivs.map { case (mod, div, idx) =>
    val x = off / div
    // do not wrap the outermost dimension, so that `off == size`
    // keeps a representation distinct from zero
    if (idx == 0) x else x % mod
  }
}
def calcPOI(a: Vector[Int], b: Vector[Int], min: Int): Int = {
  val res = (a.drop(min) zip b.drop(min)).indexWhere { case (ai, bi) => ai != bi }
  if (res < 0) a.size else res + min
}
def zipToRange(a: Vector[Int], b: Vector[Int]): Vector[Range] =
  (a, b).zipped.map { (ai, bi) =>
    require(ai <= bi)
    ai to bi
  }
def calcOff(a: Vector[Int], shape: Vector[Int]): Int = {
  val divs = shape.scanRight(1)(_ * _).tail
  (a, divs).zipped.map(_ * _).sum
}
def indexTrunc(a: Vector[Int], poi: Int, inc: Boolean): Vector[Int] =
  a.zipWithIndex.map { case (ai, i) =>
    if      (i < poi) ai
    else if (i > poi) 0
    else if (inc)     ai + 1
    else              ai
  }
Then the actual algorithm:
def partition(shape: Vector[Int], off: Int, len: Int): List[Vector[Range]] = {
  val rankM = shape.size - 1

  def loop(start: Int, stop: Int, poiMin: Int, dir: Boolean,
           res0: List[Vector[Range]]): List[Vector[Range]] =
    if (start == stop) res0 else {
      val last  = stop - 1
      val s0    = calcIndices(start, shape)
      val s1    = calcIndices(stop , shape)
      val s1m   = calcIndices(last , shape)
      val poi   = calcPOI(s0, s1m, poiMin)
      val ti    = if (dir) s0 else s1
      val to    = if (dir) s1 else s0
      val st    = if (poi >= rankM) to else indexTrunc(ti, poi, inc = dir)
      val trunc = calcOff(st, shape)
      val split = trunc != (if (dir) stop else start)
      if (split) {
        if (dir) {
          val res1 = loop(start, trunc, poiMin = poi+1, dir = true , res0 = res0)
          loop(trunc, stop, poiMin = 0, dir = false, res0 = res1)
        } else {
          val s1tm = calcIndices(trunc - 1, shape)
          val res1 = zipToRange(s0, s1tm) :: res0
          loop(trunc, stop, poiMin = poi+1, dir = false, res0 = res1)
        }
      } else {
        zipToRange(s0, s1m) :: res0
      }
    }

  loop(off, off + len, poiMin = 0, dir = true, res0 = Nil).reverse
}
Examples:
val sz = Vector(2, 3, 4, 5)
partition(sz, 0, 6)
// result:
List(
  Vector(0 to 0, 0 to 0, 0 to 0, 0 to 4), // first hypercube
  Vector(0 to 0, 0 to 0, 1 to 1, 0 to 0)  // second hypercube
)
partition(sz, 6, 21)
// result:
List(
  Vector(0 to 0, 0 to 0, 1 to 1, 1 to 4), // first read
  Vector(0 to 0, 0 to 0, 2 to 3, 0 to 4), // second read
  Vector(0 to 0, 1 to 1, 0 to 0, 0 to 4), // third read
  Vector(0 to 0, 1 to 1, 1 to 1, 0 to 1)  // fourth read
)
The maximum number of reads, if I'm not mistaken, would be 2 * rank.

Related

Fast approximation of simple cases of relaxed bipartite dimension of graph problem

Given a boolean matrix M, I need to find a set of submatrices A = {A1, ..., An} such that the matrices in A contain all True values in M and only them. Submatrices don't have to be contiguous, i.e. each submatrix is defined by two sets of indices {i1, ..., ik}, {j1, ..., jt} of M. (For example, a submatrix could be something like [{1, 2, 5}, {4, 7, 9, 13}]: all cells in the intersection of these rows and columns.) Optionally, submatrices may intersect if this results in a better solution. The total number of submatrices n should be minimal.
The size of matrix M can be up to 10^4 x 10^4, so I need an efficient algorithm. I suppose this problem may not have an efficient exact algorithm, because it reminds me of some NP-hard problems. If that is true, then any good and fast approximation is OK. We can also assume that the number of True values is not very big, i.e. < 1/10 of all values, but to avoid an accidental DoS in prod, a solution that does not rely on this fact is better.
I don't need any code, just a general idea of the algorithm and a justification of its properties, if it's not obvious.
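To pin down the constraints, here is a small checker of my own (a formalization, not part of the original question): a valid solution may touch only True cells and must cover all of them.
def covers(M, submatrices):
    # each submatrix is a pair (row index set, column index set), as in the
    # example [{1, 2, 5}, {4, 7, 9, 13}] above
    covered = [[False] * len(M[0]) for _ in M]
    for rows, cols in submatrices:
        for i in rows:
            for j in cols:
                if not M[i][j]:
                    return False  # submatrix touches a False cell
                covered[i][j] = True
    # every True cell must be covered by some submatrix
    return all(covered[i][j]
               for i in range(len(M)) for j in range(len(M[0])) if M[i][j])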
Background
We are calculating some expensive distance matrices for logistics applications. The points in these requests often intersect, so we are trying to develop a caching algorithm to avoid recalculating parts of some requests, and to split big requests into smaller ones containing only the unknown submatrices. Additionally, some distances in the matrix may not be needed by the algorithm. On the one hand, a small number of big groups calculates faster; on the other hand, if we include a lot of False values and our submatrices are unreasonably big, this can slow down the calculation. The exact criterion is intricate, and the time complexity of the "expensive" matrix requests is hard to estimate. As far as I know, for square matrices it is something like C*n^2.5 with a quite big C. So it's hard to formulate a good optimization criterion, but any ideas are welcome.
About data
A True value in the matrix means that the distance between these two points has never been calculated before. Most of the requests (but not all) are square matrices with the same points on both axes, so most of M is expected to be almost symmetric. There is also the simple case of several completely new points while all other distances are cached; I deal with these cases in a preprocessing stage. All the other values can be quite random. If they are too random, we can give up on the cache and calculate the full matrix M. But sometimes there are useful patterns. Because of the nature of the data, I expect it to contain bigger submatrices than random data would: mostly the True values are occasional, but they form submatrix patterns that we need to find. We cannot rely on this completely, though; if the algorithm gets a mostly random matrix, it should at least be able to detect that, so as not to run overly long and complex calculations.
Update
As stated on Wikipedia, this problem is called the Bipartite Dimension of a graph and is known to be NP-hard. So we can reformulate it into finding fast, relaxed approximations for simple cases of the problem. We can allow some percentage of False values, and we can adopt some simple but mostly effective greedy heuristic.
I started working on the algorithm below before you provided the update.
Also, in doing so I realised that while one is looking for blocks of true values, the problem is not one of a block transformation, as your update now also reflects.
The algorithm is as follows:
1. Count the trues in each row.
2. For a row with the maximum count of trues, sort the columns in the matrix so that the row's trues all move to the left.
3. Sort the matrix rows in descending order of congruent trues on the left (there will now be a rough upper-left triangle of congruent trues).
4. Get the biggest rectangle of trues cornered at the upper left.
5. Store the row ids and column ids for that rectangle (this is a sub-matrix definition).
6. Change the sub-matrix's trues to falses.
7. Repeat from the top until the upper-left triangle has no trues.
This algorithm will produce a complete cover of the boolean matrix consisting of row-column intersection sub-matrices containing only true values.
I am not sure if allowing some falses in a sub-matrix will help. While it would allow bigger sub-matrices to be found, and hence reduce the number of passes over the boolean matrix needed to find a cover, it will presumably take longer to find the biggest such sub-matrices because there are more combinations to check. Also, I am not sure how one might stop false-containing sub-matrices from overlapping. It might need the maintenance of a separate mask matrix, rather than using the boolean matrix as its own mask, in order to ensure disjoint sub-matrices.
Below is a first-cut implementation of the above algorithm in Python.
I ran it on Windows 10 on an Intel Pentium N3700 @ 1.60GHz with 4GB RAM.
As is, with randomly generated ~10% trues, it will do:
100 rows x 1000 columns < 7 secs
1000 rows x 100 columns < 6 secs
300 rows x 300 columns < 14 secs
3000 rows x 300 columns < 3 mins
300 rows x 3000 columns < 15 mins
1000 rows x 1000 columns < 8 mins
I have not tested it on approximately symmetric matrices, nor have I tested it on matrices with relatively large sub-matrices. It might perform well with relatively large sub-matrices; e.g. in the extreme case where the entire boolean matrix is true, only two passes of the algorithm loop are required.
One area where I think there can be considerable optimisation is the row sorting. The implementation below uses the built-in Python sort with a comparator function. A custom-crafted sort function will probably do much better, possibly especially so if it is a virtual sort similar to the column sorting (a key-based sketch follows).
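A possible key-based variant (my sketch, relying on the same colOrder global as the implementation below): the key is computed once per row, so the sort avoids a Python-level callback on every comparison.
def row_key(row):
    # a row's key is its values in colOrder order, inverted so that
    # leading trues sort first under ascending order
    return tuple(1 - row[0][c] for c in colOrder)
# usage: matrix.sort(key=row_key)  instead of  matrix.sort(key=cmp_to_key(sorter))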
If you can try it on some real data, i.e. a square, approximately symmetric matrix with relatively large sub-matrices, it would be good to know how it goes.
Please advise if you would like me to try some optimisation of the Python. I presume that to handle 10^4 x 10^4 boolean matrices it will need to be a lot faster.
from functools import cmp_to_key
booleanMatrix0 = [
( 0, 0, 0, 0, 1, 1 ),
( 0, 1, 1, 0, 1, 1 ),
( 0, 1, 0, 1, 0, 1 ),
( 1, 1, 1, 0, 0, 0 ),
( 0, 1, 1, 1, 0, 0 ),
( 1, 1, 0, 1, 0, 0 ),
( 0, 0, 0, 0, 0, 0 )
]
booleanMatrix1 = [
( 0, )
]
booleanMatrix2 = [
( 1, )
]
booleanMatrix3 = [
( 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0 )
]
booleanMatrix4 = [
( 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1 )
]
booleanMatrix14 = [
( 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0 ),
( 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1 ),
( 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0 ),
( 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0 ),
( 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1 ),
( 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1 ),
( 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1 ),
( 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1 ),
( 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1 ),
( 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0 ),
( 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0 )
]
booleanMatrix15 = [
( 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 ),
( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0 ),
]
booleanMatrix16 = [
( 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1 ),
( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1 ),
]
import random

booleanMatrix17 = []
for r in range(11):
    row = []
    for c in range(21):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix17.append(tuple(row))

booleanMatrix18 = []
for r in range(21):
    row = []
    for c in range(11):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix18.append(tuple(row))

booleanMatrix5 = []
for r in range(50):
    row = []
    for c in range(200):
        row.append(random.randrange(2))
    booleanMatrix5.append(tuple(row))

booleanMatrix6 = []
for r in range(200):
    row = []
    for c in range(50):
        row.append(random.randrange(2))
    booleanMatrix6.append(tuple(row))

booleanMatrix7 = []
for r in range(100):
    row = []
    for c in range(100):
        row.append(random.randrange(2))
    booleanMatrix7.append(tuple(row))

booleanMatrix8 = []
for r in range(100):
    row = []
    for c in range(1000):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix8.append(tuple(row))

booleanMatrix9 = []
for r in range(1000):
    row = []
    for c in range(100):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix9.append(tuple(row))

booleanMatrix10 = []
for r in range(317):
    row = []
    for c in range(316):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix10.append(tuple(row))

booleanMatrix11 = []
for r in range(3162):
    row = []
    for c in range(316):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix11.append(tuple(row))

booleanMatrix12 = []
for r in range(316):
    row = []
    for c in range(3162):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix12.append(tuple(row))

booleanMatrix13 = []
for r in range(1000):
    row = []
    for c in range(1000):
        if random.randrange(5) == 1:
            row.append(random.randrange(2))
        else:
            row.append(0)
    booleanMatrix13.append(tuple(row))
booleanMatrices = [ booleanMatrix0, booleanMatrix1, booleanMatrix2, booleanMatrix3, booleanMatrix4, booleanMatrix14, booleanMatrix15, booleanMatrix16, booleanMatrix17, booleanMatrix18, booleanMatrix6, booleanMatrix5, booleanMatrix7, booleanMatrix8, booleanMatrix9, booleanMatrix10, booleanMatrix11, booleanMatrix12, booleanMatrix13 ]
# The functions below use rows, cols, colOrder, maxRow and biggestField,
# which are module-level globals set in the driver loop further down.
def printMatrix(matrix, colOrder):
    for r in range(rows):
        row = ""
        for c in range(cols):
            row += str(matrix[r][0][colOrder[c]])
        print(row)
    print()

def rowUp(matrix):
    # Count the trues in each row and find a row with the maximum count
    rowCount = []
    maxRow = [0, 0]
    for r in range(rows):
        rowCount.append([r, sum(matrix[r][0])])
        if rowCount[-1][1] > maxRow[1]:
            maxRow = rowCount[-1]
    return rowCount, maxRow

def colSort(matrix):
    # For a row with the highest number of trues, sort the true columns to the left
    newColOrder = []
    otherCols = []
    for c in range(cols):
        if matrix[maxRow[0]][0][colOrder[c]]:
            newColOrder.append(colOrder[c])
        else:
            otherCols.append(colOrder[c])
    newColOrder += otherCols
    return newColOrder

def sorter(a, b):
    # Sort rows according to leading trues (in colOrder order);
    # a is a (row, id) pair, so the column count is len(a[0]), not len(a)
    length = len(a[0])
    c = 0
    while c < length:
        if a[0][colOrder[c]] == 1 and b[0][colOrder[c]] == 0:
            return -1
        if b[0][colOrder[c]] == 1 and a[0][colOrder[c]] == 0:
            return 1
        c += 1
    return 0
def allTrues(rdx, cdx, matrix):
    # Return (rdx, cdx, count) if the rectangle [0..rdx] x [0..cdx]
    # (in colOrder order) is all trues, otherwise None
    count = 0
    for r in range(rdx+1):
        for c in range(cdx+1):
            if matrix[r][0][colOrder[c]]:
                count += 1
            else:
                return
    return rdx, cdx, count

def getBiggestField(matrix):
    # Starting at (0, 0) find the biggest rectangular field of 1s
    biggestField = (None, None, 0)
    cStop = cols
    for r in range(rows):
        for c in range(cStop):
            rtn = allTrues(r, c, matrix)
            if rtn:
                if rtn[2] > biggestField[2]:
                    biggestField = rtn
            else:
                cStop = c
                break
        if cStop == 0:
            break
    return biggestField

def mask(matrix):
    # Copy the matrix, zero out the found rectangle, and record the
    # original row ids and (virtual) column ids it covers
    maskMatrix = []
    for r in range(rows):
        row = []
        for c in range(cols):
            row.append(matrix[r][0][c])
        maskMatrix.append([row, matrix[r][1]])
    maskRows = []
    for r in range(biggestField[0]+1):
        maskRows.append(maskMatrix[r][1])
        for c in range(biggestField[1]+1):
            maskMatrix[r][0][colOrder[c]] = 0
    maskCols = []
    for c in range(biggestField[1]+1):
        maskCols.append(colOrder[c])
    return maskMatrix, maskRows, maskCols
# Add a row id to each row to keep track of rearranged rows
rowIdedMatrices = []
for matrix in booleanMatrices:
    rowIdedMatrix = []
    for r in range(len(matrix)):
        rowIdedMatrix.append((matrix[r], r))
    rowIdedMatrices.append(rowIdedMatrix)
import time

for matrix in rowIdedMatrices:
    rows = len(matrix)
    cols = len(matrix[0][0])
    colOrder = []
    for c in range(cols):
        colOrder.append(c)
    subMatrices = []
    startTime = time.thread_time()
    loopStart = time.thread_time()
    loop = 1
    rowCount, maxRow = rowUp(matrix)
    ones = 0
    for row in rowCount:
        ones += row[1]
    print("_________________________\n", "Rows", rows, "Columns", cols,
          "Ones", str(int(ones * 10000 / rows / cols) / 100) + "%")
    colOrder = colSort(matrix)
    matrix.sort(key=cmp_to_key(sorter))
    biggestField = getBiggestField(matrix)
    if biggestField[2] > 0:
        maskMatrix, maskRows, maskCols = mask(matrix)
        subMatrices.append((maskRows, maskCols))
    while biggestField[2] > 0:
        loop += 1
        rowCount, maxRow = rowUp(maskMatrix)
        colOrder = colSort(maskMatrix)
        maskMatrix.sort(key=cmp_to_key(sorter))
        biggestField = getBiggestField(maskMatrix)
        if biggestField[2] > 0:
            maskMatrix, maskRows, maskCols = mask(maskMatrix)
            subMatrices.append((maskRows, maskCols))
        if loop % 100 == 0:
            print(loop, time.thread_time() - loopStart)
            loopStart = time.thread_time()
    endTime = time.thread_time()
    print("Sub-matrices:", len(subMatrices), endTime - startTime)
    for sm in subMatrices:
        print(sm)
    print()
    input("Next matrix")
LOOP over true values
    Can you grow the submatrix containing the true value in any direction?
    (i.e. can you go from
        t
    to
        tt
        tt
    )
    Keep growing for as long as possible
    Set all cells in M that are in the new submatrix to false
Repeat until every cell in M is false.
Here is a simple example of how it works (the original pictures are not reproduced here): the top picture shows the large matrix M containing a few true values, and the bottom rows show the first few iterations, with the blue submatrix growing as it finds more adjacent cells with true values. In this case I have stopped because it cannot grow any further without including false cells. If a few cells in a submatrix may be false, then you could continue a bit further.
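Since no code accompanies this answer, here is a minimal Python sketch of the region-growing idea, under one simplifying assumption of mine: rectangles are grown contiguously, as in the illustration, rather than as arbitrary row/column index sets.
def grow_rectangles(M):
    # Greedy region growing: seed at each remaining True cell, then extend
    # the rectangle one row/column at a time while the new cells are all True.
    rows, cols = len(M), len(M[0])
    M = [list(r) for r in M]  # mutable working copy
    rects = []
    for i in range(rows):
        for j in range(cols):
            if not M[i][j]:
                continue
            r0, r1, c0, c1 = i, i, j, j  # inclusive bounds
            grown = True
            while grown:
                grown = False
                if r1 + 1 < rows and all(M[r1 + 1][c] for c in range(c0, c1 + 1)):
                    r1 += 1
                    grown = True
                if c1 + 1 < cols and all(M[r][c1 + 1] for r in range(r0, r1 + 1)):
                    c1 += 1
                    grown = True
                if r0 > 0 and all(M[r0 - 1][c] for c in range(c0, c1 + 1)):
                    r0 -= 1
                    grown = True
                if c0 > 0 and all(M[r][c0 - 1] for r in range(r0, r1 + 1)):
                    c0 -= 1
                    grown = True
            for r in range(r0, r1 + 1):  # set the covered cells to false
                for c in range(c0, c1 + 1):
                    M[r][c] = 0
            rects.append((range(r0, r1 + 1), range(c0, c1 + 1)))
    return rects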
Let's say M is an s by t matrix. The trivial (but possibly useful) solution is just to take all the non-empty columns (or rows) as your submatrices. This will result in at most min(s,t) submatrices.
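A sketch of this baseline (my own, assuming M is a list of lists): one submatrix per non-empty column, whose row set is exactly the rows that are True in that column; the row variant is symmetric.
def column_cover(M):
    cover = []
    for j in range(len(M[0])):
        rows = {i for i in range(len(M)) if M[i][j]}
        if rows:
            cover.append((rows, {j}))  # ({rows}, {column j})
    return cover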

How can you efficiently flip a large range of indices's values from 1 to 0 or vice versa

You're given an array arr of size N. Suppose there's a contiguous interval arr[a..b] where you want to flip all the 1s to 0s and vice versa. Now suppose that there is a large number (millions or billions) of these intervals (they could have different starting and end points) that you need to process. Is there an efficient algorithm to get this done?
Note that a and b are inclusive. N can be any finite size, essentially. The purpose of the question is just to practice algorithms.
Consider arr = [0,0,0,0,0,0,0], and suppose we want to flip the following inclusive intervals: [1,3], [0,4].
After processing [1,3], we have arr = [0,1,1,1,0,0,0], and after processing [0,4], we have arr = [1,0,0,0,1,0,0], which is the final array.
The obvious efficient way to do that is to not do that. Instead, first collect the indices at which the flipping changes, and then do one pass to apply the collected flipping information.
Python implementation of a naive solution, the efficient solution, and testing:
def naive(arr, intervals):
    for a, b in intervals:
        for i in range(a, b+1):
            arr[i] ^= 1

def efficient(arr, intervals):
    # one extra slot so an interval ending at the last index can mark b+1
    flips = [0] * (len(arr) + 1)
    for a, b in intervals:
        flips[a] ^= 1
        flips[b+1] ^= 1
    xor = 0
    for i in range(len(arr)):
        xor ^= flips[i]
        arr[i] ^= xor

def test():
    import random
    n = 30
    arr = random.choices([0, 1], k=n)
    intervals = []
    while len(intervals) < 100:
        a = random.randrange(n)
        b = random.randrange(n)
        if a <= b:
            intervals.append((a, b))
    print(f'{arr = }')
    expect = arr * 1
    naive(expect, intervals)
    print(f'{expect = }')
    result = arr * 1
    efficient(result, intervals)
    print(f'{result = }')
    print(f'{(result == expect) = }')

test()
Demo output:
arr = [1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0]
expect = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
result = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
(result == expect) = True
Cast to an int array and use bitwise NOT if you are using C or C++. This is also a SIMD task, so it's parallelizable if you wish.
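The bit-packing idea can be illustrated in Python with an arbitrary-precision int standing in for the packed array (my own illustration; C/C++ would operate on machine words): flipping an inclusive range [a, b] is a single XOR with a mask.
def flip_range(bits, a, b):
    # flip bits a..b (inclusive) of an int-packed bit array
    mask = ((1 << (b - a + 1)) - 1) << a
    return bits ^ mask

# the example from above: start with seven 0s, flip [1, 3] then [0, 4]
bits = flip_range(flip_range(0, 1, 3), 0, 4)
print([(bits >> i) & 1 for i in range(7)])  # [1, 0, 0, 0, 1, 0, 0]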

Pandas Series correlation against a single vector

I have a DataFrame with a list of arrays as one column.
import pandas as pd
v = [1, 2, 3, 4, 5, 6, 7]
v1 = [1, 0, 0, 0, 0, 0, 0]
v2 = [0, 1, 0, 0, 1, 0, 0]
v3 = [1, 1, 0, 0, 0, 0, 1]
df = pd.DataFrame({'A': [v1, v2, v3]})
print(df)
Output:
A
0 [1, 0, 0, 0, 0, 0, 0]
1 [0, 1, 0, 0, 1, 0, 0]
2 [1, 1, 0, 0, 0, 0, 1]
I want to do a pd.Series.corr for each row of df.A against the single vector v.
I'm currently doing this with a loop over df.A, but it is very slow.
Expected Output:
A B
0 [1, 0, 0, 0, 0, 0, 0] -0.612372
1 [0, 1, 0, 0, 1, 0, 0] -0.158114
2 [1, 1, 0, 0, 0, 0, 1] -0.288675
Here's one using the correlation definition with NumPy tools meant for performance, with corr2_coeff_rowwise (sketched below) -
a = np.array(df.A.tolist()) # or np.vstack(df.A.values)
df['B'] = corr2_coeff_rowwise(a, np.asarray(v)[None])
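The helper corr2_coeff_rowwise is not defined in the post; presumably it is the well-known NumPy row-wise Pearson correlation, along these lines (B's single row broadcasts against all rows of A):
import numpy as np

def corr2_coeff_rowwise(A, B):
    # subtract row-wise means
    A_mA = A - A.mean(-1, keepdims=True)
    B_mB = B - B.mean(-1, keepdims=True)
    # row-wise sums of squares
    ssA = (A_mA**2).sum(-1)
    ssB = (B_mB**2).sum(-1)
    # row-wise correlation of A's rows against B's (broadcast) row
    return np.einsum('ij,ij->i', A_mA, B_mB) / np.sqrt(ssA * ssB)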
Runtime test -
Case #1 : 1000 rows
In [59]: df = pd.DataFrame({'A': [np.random.randint(0,9,(7)) for i in range(1000)]})
In [60]: v = np.random.randint(0,9,(7)).tolist()
# @jezrael's soln
In [61]: %timeit df['new'] = pd.DataFrame(df['A'].values.tolist()).corrwith(pd.Series(v), axis=1)
10 loops, best of 3: 142 ms per loop
In [62]: %timeit df['B'] = corr2_coeff_rowwise(np.array(df.A.tolist()), np.asarray(v)[None])
1000 loops, best of 3: 461 µs per loop
Case #2 : 10000 rows
In [63]: df = pd.DataFrame({'A': [np.random.randint(0,9,(7)) for i in range(10000)]})
In [64]: v = np.random.randint(0,9,(7)).tolist()
# @jezrael's soln
In [65]: %timeit df['new'] = pd.DataFrame(df['A'].values.tolist()).corrwith(pd.Series(v), axis=1)
1 loop, best of 3: 1.38 s per loop
In [66]: %timeit df['B'] = corr2_coeff_rowwise(np.array(df.A.tolist()), np.asarray(v)[None])
100 loops, best of 3: 3.05 ms per loop
Use corrwith, but if performance is important, Divakar's answer should be faster:
df['new'] = pd.DataFrame(df['A'].values.tolist()).corrwith(pd.Series(v), axis=1)
print (df)
A new
0 [1, 0, 0, 0, 0, 0, 0] -0.612372
1 [0, 1, 0, 0, 1, 0, 0] -0.158114
2 [1, 1, 0, 0, 0, 0, 1] -0.288675

Use Ruby to Truncate duplicate patterns in an Array

For example, I have
tt = [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0]
and I would like to slim it down to
tt_out = [0, 1, 1, 2, 2, 1, 1, 0, 0]
Also, I'd like to know when the repetition begins and ends, hence I'd like to have the following tip:
tip = '0','1.','.5','6.','.11','12.','.15','16.','.20'
tt = [0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0]
tip = []
tt_out = tt.map.with_index { |t, i|
  start_range = (i == 0 || tt[i-1] != tt[i])
  end_range = (tt[i+1] != tt[i])
  if start_range && end_range
    tip << "#{i}"
  elsif start_range
    tip << "#{i}."
  elsif end_range
    tip << ".#{i}"
  end
  t if start_range || end_range
}.compact
tip
=> ["0", "1.", ".5", "6.", ".11", "12.", ".15", "16.", ".20"]
tt_out
=> [0, 1, 1, 2, 2, 1, 1, 0, 0]
P.S.: You've got an error in your example; the last element of tip should be '.20'.

Most performant way to invert an array?

If I have an array like
ary = [0, 0, 3, 0, 0, 0, 2, 0, 1, 0, 1, 1, 0]
What is the most performant way to get a list that contains each index repeated as many times as the value stored at that index?
inverted = [2,2,2,6,6,8,10,11]
This is what I've come up with, but it seems like there is a more efficient way:
a = []
ary.each_with_index{|v,i| a << Array.new(v, i) if v != 0}
a.flatten
=> [2, 2, 2, 6, 6, 8, 10, 11]
Unless profiling proves this to be a bottleneck, the cleaner approach is a functional one:
>> ary.each_with_index.map { |x, idx| [idx]*x }.flatten(1)
=> [2, 2, 2, 6, 6, 8, 10, 11]
If you use Ruby 1.9, I'd recommend this (thanks to sawa for pointing out Enumerable#flat_map):
>> ary.flat_map.with_index { |x, idx| [idx]*x }
=> [2, 2, 2, 6, 6, 8, 10, 11]
[edit: removed examples using inject and each_with_object, it's unlikely they are faster than flat_map + with_index]
You could use Array#push instead of Array#<< to speed this up a little.
ary.each_with_index{|v,i| a.push(*Array.new(v, i)) if v != 0}
Some quick benchmarking shows me that this is about 30% faster than using <<.
> ary = [0, 0, 3, 0, 0, 0, 2, 0, 1, 0, 1, 1, 0]
# => [0, 0, 3, 0, 0, 0, 2, 0, 1, 0, 1, 1, 0]
> quick_bench(10**5) do
> a = []
> ary.each_with_index{|v,i| a << Array.new(v, i) if v != 0}
> a.flatten
> end
Rehearsal ------------------------------------
1.200000 0.020000 1.220000 ( 1.209861)
--------------------------- total: 1.220000sec
user system total real
1.150000 0.000000 1.150000 ( 1.147103)
# => nil
> quick_bench(10**5) do
> a = []
> ary.each_with_index{|v,i| a.push(*Array.new(v, i)) if v != 0}
> end
Rehearsal ------------------------------------
0.870000 0.000000 0.870000 ( 0.865190)
--------------------------- total: 0.870000sec
user system total real
0.860000 0.000000 0.860000 ( 0.858628)
# => nil
> a = []
# => []
> ary.each_with_index{|v,i| a.push(*Array.new(v, i)) if v != 0}
# => [0, 0, 3, 0, 0, 0, 2, 0, 1, 0, 1, 1, 0]
> a
# => [2, 2, 2, 6, 6, 8, 10, 11]
>
