Data structure and algorithms for 1D velocity model using layers?

This is for a geophysical analysis program I am creating. I already have code to do all this, but I am looking for inspirations and ideas (good datastructures and algorithms).
What I want to model:
Velocity as a function of depth (z)
The model is built up from multiple layers (<10)
Every layer is accessible by an index going from 0 for the top most layer to n for the bottom most layer
Every layer has velocity as a linear function of depth (gradient a_k and axis intercept b_k of the kth layer)
Every layer has a top and bottom depth (z_{k-1} and z_k)
The model is complete, there is no space between layers. The point directly between two layers belongs to the lower layer
Requirements:
Get velocity at an arbitrary depth within the model. This will be done on the order of 1k to 10k times, so it should be well optimized.
Access to the top and bottom depths, gradients and intercepts of a layer by the layer index
What I have so far:
I have working Python code where every layer is saved as a numpy array with the values of z_k (bottom depth), z_{k-1} (top depth), a_k (velocity gradient) and b_k (axis intercept). To evaluate the model at a certain depth, I get the layer index, use that to get the parameters of the layer and pass them to a function that evaluates the linear velocity gradient.

So you have a piecewise linear dependence where the z-coordinates of the piece ends are irregularly spaced, and you want the function value at a given z.
Note that there is little sense in using binary search for 10 pieces (3-4 rounds of binary search might be slower than 9 simple comparisons).
But what precision do your depth queries have? Note that you can store a table for 1-meter resolution, or even for 1-millimeter resolution: only 10^7 entries provide O(1) access to any precalculated velocity value.
For a limited number of pieces it is possible to write one long formula (involving integer division), but it will probably be slower.
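Example for an arbitrary three-piece polyline with border points 2 and 4.5 (for z >= 0; the min() keeps each step factor at 1 once z exceeds twice the border value):
f = f0 + 0.2*min(1, int(z/2.0))*(z-2.0) + 0.03*min(1, int(z/4.5))*(z-4.5)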
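The layered model from the question could be sketched roughly as below. This is only an illustration under assumptions: the class name `LayeredVelocityModel`, its method names and the sample numbers are made up, and the layer is located with the standard-library `bisect_right` (for fewer than 10 layers a plain loop over the interfaces would do just as well). Using `bisect_right` on the interior interfaces places a depth that lies exactly on an interface into the lower layer.

```python
from bisect import bisect_right


class LayeredVelocityModel:
    """1D velocity model: v(z) = a_k * z + b_k inside layer k (hypothetical sketch)."""

    def __init__(self, interfaces, gradients, intercepts):
        # interfaces holds z_0 < z_1 < ... < z_n (n + 1 depths for n layers).
        if len(interfaces) != len(gradients) + 1 or len(gradients) != len(intercepts):
            raise ValueError("need n + 1 interfaces for n layers")
        self.interfaces = list(interfaces)
        self.gradients = list(gradients)
        self.intercepts = list(intercepts)
        # Interior interfaces only; used to locate the layer of a depth.
        self._inner = self.interfaces[1:-1]

    def layer_index(self, z):
        if not self.interfaces[0] <= z <= self.interfaces[-1]:
            raise ValueError("depth outside the model")
        # bisect_right assigns a depth sitting exactly on an interface
        # to the lower layer, as the question requires.
        return bisect_right(self._inner, z)

    def layer(self, k):
        """Return (top, bottom, gradient, intercept) of layer k."""
        return (self.interfaces[k], self.interfaces[k + 1],
                self.gradients[k], self.intercepts[k])

    def velocity(self, z):
        k = self.layer_index(z)
        return self.gradients[k] * z + self.intercepts[k]


# Made-up example model with three layers.
model = LayeredVelocityModel([0.0, 2.0, 4.5, 10.0],
                             [1.0, 0.5, 0.0],
                             [1.0, 2.0, 5.0])
```

With these sample values, `model.velocity(2.0)` is evaluated in the second layer because the depth 2.0 belongs to the lower of the two layers it touches.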

Related

How does Principal Component Initialization work for determining the weights of the map vectors in Self Organizing Maps?

I was studying a fundamental SOM initialization method and want to understand exactly how this process, PCI, works for initializing weight vectors on the map. My understanding is that for a two-dimensional map, this initialization method looks at the eigenvectors for the two largest eigenvalues of the data matrix and then uses the subspace spanned by these eigenvectors to initialize the map. Does that mean that, in order to get the initial map weights, this method takes random linear combinations of these two eigenvectors? Is there a pattern?
For example, for 40 input data vectors on the map, does the lininit initialization method take combinations a1*[e1] + a2*[e2] where [e1] and [e2] are the eigenvectors of the two largest eigenvalues and a1 and a2 are random integers ranging from -3 to 3? Or is there a different mechanism? I want to make sure I know exactly how lininit takes these two eigenvectors of the input data matrix and uses them to construct the initial weight vectors for the map.
The SOM creates a map that has the neighbourhood relationship between nearby nodes. Random initialisation does not help this process, since the nodes start randomly. Therefore, the idea of using the PCA initialisation is just a shortcut to get the map closer to the final state. This saves a lot of computation.
So how does this work? The first two principal components (PCs) are used. Set the initial weights as linear combinations of the PCs. Rather than using random a1 and a2, the coefficients are set in a range that corresponds to the scale of the principal components.
For example, for a 5x3 map, a1 and a2 can both be in the range (-1, 1) with the relevant number of elements. In other words, for the 5x3 map, a1 = [-1.0 -0.5 0.0 0.5 1.0] and a2 = [-1.0 0.0 1.0], with 5 nodes and 3 nodes, respectively.
Then set each of the weights of nodes. For a rectangular SOM, each node has indices [m, n]. Use the values of a1[m] and a2[n]. Thus, for all m = [1 2 3 4 5] and n = [1 2 3]:
weight[m, n] = a1[m] * e1 + a2[n] * e2
That is how to initialize the weights using the principal components. This makes the initial state globally ordered, so now the SOM algorithm is used to create the local ordering.
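The weight formula above can be sketched in a few lines. This is an assumption-laden illustration, not the code of any particular SOM library: the helper names `linspace` and `linear_init` are hypothetical, and `e1` and `e2` are taken as given, already-scaled principal components (computing them is left out).

```python
def linspace(start, stop, count):
    """Return count evenly spaced values from start to stop inclusive."""
    if count == 1:
        return [(start + stop) / 2.0]
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def linear_init(e1, e2, rows, cols):
    """Build weight[m][n] = a1[m] * e1 + a2[n] * e2 for a rows x cols map."""
    a1 = linspace(-1.0, 1.0, rows)
    a2 = linspace(-1.0, 1.0, cols)
    return [[[a1[m] * u + a2[n] * v for u, v in zip(e1, e2)]
             for n in range(cols)]
            for m in range(rows)]


# Hypothetical, already-scaled principal components of 3-D data.
e1 = [2.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]
weights = linear_init(e1, e2, rows=5, cols=3)
```

For the 5x3 map this reproduces a1 = [-1.0, -0.5, 0.0, 0.5, 1.0] and a2 = [-1.0, 0.0, 1.0], so the corner node `weights[0][0]` is -e1 - e2 and the centre node `weights[2][1]` is the zero vector.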
The Principal Component part of the name is a reference to https://en.wikipedia.org/wiki/Principal_component_analysis.
Here is the idea. You start with data points placed at vectors of many underlying factors. But they may be correlated in your data. So, for example, if you're measuring height, weight, blood pressure, etc, you expect that tall people will weigh more. But what you want to do is replace this with vectors of factors that are not correlated with each other in your data.
So your principal component is a vector of length 1 which is as strongly correlated as possible with the variation in your dataset.
Your secondary component is the vector of length 1 at right angles to the first which is as strongly correlated as possible with the rest of the variation in your data set.
Your tertiary component is the vector of length 1 at right angles to the first two which is as strongly correlated as possible with the rest of the variation in your data set.
And so on.
In practice you may start with many factors, but most of the information is captured in just the first few. For example in the results of intelligence testing the first component is IQ and the second is the difference between how you are at verbal and quantitative reasoning.
How this applies to SOM initialization is that a simple linear model built from PCA is a pretty good guess for the answer that you're looking for, so starting there reduces how much work you have to do to finish getting the answer.

Is it better to reduce the space complexity or the time complexity for a given program?

Grid Illumination: Given an NxN grid with an array of lamp coordinates. Each lamp provides illumination to every square on their x axis, every square on their y axis, and every square that lies in their diagonal (think of a Queen in chess). Given an array of query coordinates, determine whether that point is illuminated or not. The catch is when checking a query all lamps adjacent to, or on, that query get turned off. The ranges for the variables/arrays were about: 10^3 < N < 10^9, 10^3 < lamps < 10^9, 10^3 < queries < 10^9
It seems like I can get one but not both. I tried to get this down to logarithmic time but I can't seem to find a solution. I can reduce the space complexity but it's not that fast, exponential in fact. Where should I focus on instead, speed or space? Also, if you have any input as to how you would solve this problem please do comment.
Is it better for a car to go fast or go a long way on a little fuel? It depends on circumstances.
Here's a proposal.
First, note you can number all the diagonals that the input points lie on by using the first point as the "origin" for both nw-se and ne-sw. The diagonals through this point are both numbered zero. The nw-se diagonals are numbered increasing per pixel in, e.g., the northeast direction and decreasing (negative) to the southwest. Similarly, the ne-sw diagonals are numbered increasing in, e.g., the northwest direction and decreasing (negative) to the southeast.
Given the origin, it's easy to write constant time functions that go from (x,y) coordinates to the respective diagonal numbers.
Now each set of lamp coordinates is naturally associated with 4 numbers: (x, y, nw-se diag #, sw-ne diag #). You don't need to store these explicitly. Rather you want 4 maps xMap, yMap, nwSeMap, and swNeMap such that, for example, xMap[x] produces the list of all lamp coordinates with x-coordinate x, nwSeMap[nwSeDiagonalNumber(x, y)] produces the list of all lamps on that diagonal, and similarly for the other maps.
Given a query point, look up its corresponding 4 lists. From these it's easy to deal with adjacent squares. If any list is longer than 3, removing adjacent squares can't make it empty, so the query point is lit. If a list has 3 or fewer entries, it's a constant-time operation to check whether they are all adjacent.
This solution requires the input points to be represented in 4 lists. Since they need to be represented in one list, you can argue that this algorithm requires only a constant factor of space with respect to the input. (I.e. the same sort of cost as mergesort.)
Run time is expected constant per query point for 4 hash table lookups.
Without much trouble, this algorithm can be split so it can be map-reduced if the number of lampposts is huge.
But it may be sufficient and easiest to run it on one big machine. With a billion lampposts and careful data structure choices, it wouldn't be hard to implement with 24 bytes per lamppost in an unboxed-structures language like C. So a machine with ~32 GB of RAM ought to work just fine. Building the maps with multiple threads requires some synchronization, but that's done only once. The queries can be read-only: no synchronization required. A nice 10-core machine ought to do a billion queries in well under a minute.
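A minimal sketch of this proposal is shown below, with some liberties that should be read as assumptions: it stores a count per row, column and diagonal instead of full lists, it numbers the diagonals with x - y and x + y rather than relative to the first lamp, and it treats queries as independent and read-only (lamps near a query are ignored for that query only). The names `build_index` and `is_lit` are hypothetical.

```python
from collections import Counter


def build_index(lamps):
    """Count lamps per x, per y and per diagonal; keep a set for adjacency checks."""
    lamp_set = set(lamps)
    by_x, by_y, by_diag, by_anti = Counter(), Counter(), Counter(), Counter()
    for x, y in lamp_set:
        by_x[x] += 1
        by_y[y] += 1
        by_diag[x - y] += 1   # constant along one diagonal direction
        by_anti[x + y] += 1   # constant along the other diagonal direction
    return lamp_set, by_x, by_y, by_diag, by_anti


def is_lit(index, qx, qy):
    """True if a lamp that is not on or adjacent to (qx, qy) illuminates it."""
    lamp_set, by_x, by_y, by_diag, by_anti = index
    remaining = [by_x[qx], by_y[qy], by_diag[qx - qy], by_anti[qx + qy]]
    # Discount lamps in the 3x3 block around the query (they are switched off).
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (qx + dx, qy + dy) in lamp_set:
                if dx == 0:
                    remaining[0] -= 1
                if dy == 0:
                    remaining[1] -= 1
                if dx == dy:
                    remaining[2] -= 1
                if dx == -dy:
                    remaining[3] -= 1
    return any(r > 0 for r in remaining)


index = build_index([(0, 0), (4, 4)])
```

Each query costs four dictionary lookups plus nine set lookups, independent of N and of the number of lamps; memory grows with the number of lamps only.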
There is a very easy answer that works:
Create a grid of NxN cells.
For each lamp, increment the count of every cell that is supposed to be illuminated by that lamp.
For each query, check whether the cell at the query position has a value > 0.
For each adjacent cell that holds a lamp, find all the cells it illuminates and reduce their counts by 1.
This worked fine but hit the size limit when I tried a 10000 x 10000 grid.

How to compare two shapes?

Is there a way to compare two geometric shapes (or any two more generic data structures), without using the brute force when a tolerance is involved?
The brute force (that is comparing each value of each object against each value of the other object) works but it's slow, and I can't use it.
I tried sorting the data and comparing two sorted collections. It's fast, but it only works with zero tolerance. As soon as I add the tolerance I get lost. The problem is that two values can be identical when I compare and different when I sort.
Here are some details of my problem.
In my Excel VBA add-in I have a collection of Shape objects made by a collection of Line objects made by two Point objects each. The add-in scans a CAD drawing via COM and creates the collection of Shape objects.
A simplified version could generate this:
         Shape 1     Shape 2
Point 1  0.0  5.0    0.0  4.9
Point 2  4.9  0.0    5.1  0.0
Point 3  5.0  5.0    5.0  5.0
I need to find which shapes are identical to which shapes, where identical means has the same shape, size and orientation, but not the same position (so far it's trivial) plus or minus a tolerance (not so trivial now!)
The Point.IsIdenticalTo(OtherPoint) is defined as:
Function IsIdenticalTo(OtherPoint As Point) As Boolean
IsIdenticalTo = Abs(X - OtherPoint.X) < Tolerance And Abs(Y - OtherPoint.Y) < Tolerance
End Function
The brute force implementation of Shape.IsIdenticalTo(OtherShape) works but it's too slow: if each Line(I) has an identical OtherShape.Line(J) and vice versa, then the two shapes are identical. Sometimes I have hundreds of shapes with hundreds of lines each, so the brute force solution doesn't work for me.
I tried two approaches involving sorted collections. Both are fast because comparing two sorted collections is faster than the brute force way, but both fail in some conditions:
A SortedValues collection contains all the X and Y values of all the lines. The values are sorted, so the info about whether a value is an X or a Y is lost. I have used this approach for months without problems, but it fails for example when the only difference between two shapes is between the points (10, 20) and (20, 10). I added the line angle to the list of values, things have improved, but there are still cases where this approach fails, because some info is lost when the values are sorted together. In the example above this approach would work with the following collections:
Shape 1  Shape 2
0.0      0.0
0.0      0.0
4.9      4.9
5.0      5.0
5.0      5.0
5.0      5.1
A SortedLines collection contains all the lines sorted counter-clockwise and starting from the point closest to the origin. This approach doesn't lose any info, but it fails in the example above because the sorting algorithm doesn't agree with the equality comparison. If the tolerance is 0.5 they should be identical, but the sorting algorithm produces the collections shown below. Things get more difficult because my shapes contain sub-shapes, so there are many starting points on each shape.
         Shape 1     Shape 2
Point 1  4.9  0.0    0.0  4.9
Point 2  5.0  5.0    5.1  0.0
Point 3  0.0  5.0    5.0  5.0
EDIT:
Shapes are imported from an external graphical application via COM. A shape can be as simple as a rectangle or as complex as any fancy outline with 10 circles inside, 20 internal shapes and 30 lines. They represent panels with holes and simple decorations, and sometimes they have a saw-tooth shape, which makes dozens of edges.
1. handle shape as polygon
convert your points (each line) to a set of lines (length, angle).
This ensures invariance to rotation/translation. If you see several consecutive lines with angle=PI, join them together to avoid mismatches between identical shapes with different sampling. Also try to use the same CW/CCW polygon winding rule for both shapes.
2. find start point
Can be the biggest or smallest angle, length ... or a specific order of angles+lengths. Then reorder the lines of one polygon (cyclic shift) so that your shapes are compared from the 'same point' if possible.
3. comparison - for exact match
the number of lines has to be the same
the perimeters must be the same +/- some accuracy
so for example:
fabs (sum of all lengths of poly1 - sum of all lengths of poly2) <= 1e-3
If not, the shapes are different. Then compare all lengths and angles. If any value differs by more than the accuracy value, the shapes are different.
4. comparison - size does not matter
compute the perimeters l1,l2 of both polygons and resize all lengths of the compared poly2 to match the perimeter of poly1, so all lengths of poly2 are multiplied by value = l1/l2. After this use the comparison from bullet #3
5. comparison - shape deviations can still do positive match (size must be the same)
try to set the number of lines to the same value (join all lines with angle close to PI). Then the perimeters should "match" ... fabs(l1-l2)<=1e-3*l1. You can use the bullet #4 comparison
6. comparison - size and shape deviations can still match
just resize poly2 to match the perimeter of poly1 as in bullet #4 and then use bullet #5
7. If you can not find the start point in both polygons (bullet #2)
Then you have to check all start point shifts, so if both of your polygons have 5 lines:
poly1: l11,l12,l13,l14,l15
poly2: l21,l22,l23,l24,l25
Then you have to compare all 5 combinations (unless you find a match sooner):
cmp (l11,l12,l13,l14,l15),(l21,l22,l23,l24,l25)
cmp (l11,l12,l13,l14,l15),(l22,l23,l24,l25,l21)
cmp (l11,l12,l13,l14,l15),(l23,l24,l25,l21,l22)
cmp (l11,l12,l13,l14,l15),(l24,l25,l21,l22,l23)
cmp (l11,l12,l13,l14,l15),(l25,l21,l22,l23,l24)
[Notes]
There are also faster ways to compare but they can miss in some cases
you can compare histograms of lines, angles
you can use neural network (I do not like them but they are ideal for classifications like this)
if your shapes have to be oriented in the same ways (no rotation invariance)
then instead of vertex angle use the line direction angle
if you can not ensure the same winding rule for both compared polygons
then you have to check them both:
cmp (l11,l12,l13,l14,l15),(l21,l22,l23,l24,l25)
cmp (l11,l12,l13,l14,l15),(l25,l24,l23,l22,l21)
I know it is a bit vague answer but still hope it helps at least a little ...
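The (length, angle) comparison with cyclic start-point shifts described in this answer might look roughly like the sketch below. The function names and the tolerance values are hypothetical, both polygons are assumed to be closed, to use the same winding direction and to have already had collinear edges merged, and the turning angle makes the test rotation-invariant (use the edge direction angle instead if the orientation must match).

```python
import math


def wrap_angle(a):
    """Map an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def edge_signature(points):
    """List of (edge length, turning angle at the edge's end) for a closed polygon."""
    n = len(points)
    signature = []
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        x2, y2 = points[(i + 2) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        turn = wrap_angle(math.atan2(y2 - y1, x2 - x1)
                          - math.atan2(y1 - y0, x1 - x0))
        signature.append((length, turn))
    return signature


def same_shape(poly1, poly2, len_tol=0.5, ang_tol=0.1):
    """Compare two polygons edge by edge, trying every start point of poly2."""
    s1, s2 = edge_signature(poly1), edge_signature(poly2)
    n = len(s1)
    if n != len(s2):
        return False
    # Quick rejection: perimeters must agree within the summed tolerance.
    if abs(sum(l for l, _ in s1) - sum(l for l, _ in s2)) > n * len_tol:
        return False
    for shift in range(n):
        if all(abs(s1[i][0] - s2[(i + shift) % n][0]) <= len_tol
               and abs(wrap_angle(s1[i][1] - s2[(i + shift) % n][1])) <= ang_tol
               for i in range(n)):
            return True
    return False


# The two triangles from the question.
shape1 = [(0.0, 5.0), (4.9, 0.0), (5.0, 5.0)]
shape2 = [(0.0, 4.9), (5.1, 0.0), (5.0, 5.0)]
```

Because every cyclic shift is tried, the result does not depend on which vertex the sorting step happens to pick as the first one, which is where the SortedLines approach in the question breaks down. The cost is O(n^2) per pair in the worst case, with the perimeter check rejecting most non-matching pairs early.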
I am not sure how you want to solve this problem: whether you want to go deep or just want a solution. I can suggest the OpenCV function "matchShapes". This function is based on Hu moments and performs well for rigid shapes. After you extract the target and the main contours, use the code below to compare them.
dif = cv.matchShapes(Contour1, Contour2, cv.CONTOURS_MATCH_I1, 0)
A smaller "dif" value means more similarity between the contours.
I have the same problem. I compute the adjacency matrix of the vertices, weighted with the distances. This computes all the side lengths and diagonals. Then, if the norm of each row or column of the matrix is the same as in the other matrix, the two shapes are the same. For the tolerance, just apply round() to the values before starting. The complexity is O(n^2 / 2), because you only have to compute half of the adjacency matrix, which is symmetric. The problem is that I cannot detect whether a shape is flipped.

Benefits of nearest neighbor search with Morton-order?

While working on the simulation of particle interactions, I stumbled across grid indexing in Morton order (Z-order), which is regarded as providing an efficient nearest neighbor cell search. The main reason that I've read is the almost sequential ordering of spatially close cells in memory.
Being in the middle of a first implementation, I can not wrap my head around how to efficiently implement the algorithm for the nearest neighbors, especially in comparison to a basic uniform grid.
1. Given a cell (x,y) it is trivial to obtain the 8 neighbor cell indices and compute the respective z-index. Although this provides constant access time to the elements, the z-index has to be either calculated or looked up in predefined tables (separate for each axis and OR'ing). How can this possibly be more efficient? Is it true that accessing elements in an array A in an order like A[0] -> A[1] -> A[3] -> A[4] -> ... is more efficient than in an order like A[1023] -> A[12] -> A[456] -> A[56] -> ...?
2. I expected that there exists a simpler algorithm to find the nearest neighbors in z-order. Something along the lines of: find the first cell of the neighbors, iterate. But this can't be true, as this works nicely only within 2^4 sized blocks. There are two problems, however: when the cell is not on the boundary, one can easily determine the first cell of the block and iterate through the cells in the block, but one has to check whether each cell is a nearest neighbor. Worse is the case when the cell lies on the boundary; then one has to take into account 2^5 cells. What am I missing here? Is there a comparatively simple and efficient algorithm that will do what I need?
The question in point 1. is easily testable, but I'm not very familiar with the underlying instructions that the described access pattern generates and would really like to understand what is going on behind the scenes.
Thanks in advance for any help, references, etc...
EDIT:
Thank you for clarifying point 1! So, with Z-ordering, the cache hit rate is increased on average for neighbor cells, interesting. Is there a way to profile cache hit/miss rates?
Regarding point 2:
I should add that I understand how to build the Morton-ordered array for a point cloud in R^d where the index i = f(x1, x2, ..., xd) is obtained from bitwise interlacing etc. What I try to understand is whether there is a better way than the following naive ansatz to get the nearest neighbors (here in d=2, "pseudo code"):
// Get the z-indices of cells adjacent to the cell containing (x, y)
// Accessing the contents of the cells is irrelevant here
(x, y) \elem R^2
point = (x, y)
zindex = f(x, y)
(zx, zy) = f^(-1)(zindex) // grid coordinates
nc = [(zx - 1, zy - 1), (zx - 1, zy), (zx - 1, zy + 1), // neighbor grid
(zx , zy - 1), (zx, zy + 1), // coordinates
(zx + 1, zy - 1), (zx + 1, zy), (zx + 1, zy + 1)]
ni= [f(x[0], x[1]) for x in nc] // neighbor indices
In modern multi-level cache-based computer systems, spatial locality is an important factor in optimising access time to data elements.
Put simply, this means that if you access a data element in memory, then accessing another data element in memory that is nearby (has an address that is close to the first) can be cheaper by several orders of magnitude than accessing a data element that is far away.
When 1-d data is accessed sequentially, as in simple image processing or sound processing, or when iterating over data structures and processing each element the same way, then arranging the data elements in memory in order tends to achieve spatial locality - i.e. since you access element N+1 just after accessing element N, the two elements should be placed next to each other in memory.
Standard C arrays (and many other data structures) have this property.
The point of Morton ordering is to support schemes where data is accessed two dimensionally instead of one dimensionally. In other words, after accessing element (x,y), you may go on to access (x+1,y) or (x,y+1) or similar.
The Morton ordering means that (x,y), (x+1,y) and (x,y+1) are near to each other in memory. In a standard C multidimensional array, this is not necessarily the case. For example, in the array myArray[10000][10000], (x,y) and (x,y+1) are 10000 elements apart - too far apart to take advantage of spatial locality.
In a Morton ordering, a standard C array can still be used as a store for the data, but the calculation to work out where (x,y) is stored is no longer as simple as store[x+y*rowsize].
To implement your application using Morton ordering, you need to work out how to transform a coordinate (x,y) into the address in the store. In other words, you need a function f(x,y) that can be used to access the store as in store[f(x,y)].
Looks like you need to do some more research - follow the links from the wikipedia page, particularly the ones on the BIGMIN function.
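One common way to write such a function f(x,y) is bit interleaving with mask-and-shift steps, sketched below for 16-bit coordinates. The function names are hypothetical and the neighbor helper simply re-encodes each of the 8 adjacent grid coordinates, which is the naive approach from the question rather than anything smarter.

```python
def part1by1(n):
    """Spread the lower 16 bits of n so that a zero bit separates each of them."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def compact1by1(n):
    """Inverse of part1by1: collect every second bit of n."""
    n &= 0x55555555
    n = (n | (n >> 1)) & 0x33333333
    n = (n | (n >> 2)) & 0x0F0F0F0F
    n = (n | (n >> 4)) & 0x00FF00FF
    n = (n | (n >> 8)) & 0x0000FFFF
    return n


def morton_encode(x, y):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    return part1by1(x) | (part1by1(y) << 1)


def morton_decode(z):
    """Recover the grid coordinates (x, y) from a Morton index."""
    return compact1by1(z), compact1by1(z >> 1)


def neighbor_indices(x, y, size):
    """Morton indices of the up to 8 cells around (x, y) on a size x size grid."""
    return [morton_encode(x + dx, y + dy)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dx, dy) != (0, 0)
            and 0 <= x + dx < size and 0 <= y + dy < size]
```

The store is then accessed as `store[morton_encode(x, y)]`. The encoding costs a handful of shifts and masks per coordinate; whether that is repaid by better cache behaviour depends on the access pattern and has to be measured.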
Yes, accessing array elements in order is indeed faster. The CPU loads memory from RAM into cache in chunks. If you access sequentially, the CPU can preload the next chunk easily, and you won't notice the load time. If you access randomly, it can't. This is called locality of reference (spatial locality), and what it means is that accessing memory near to memory you've already accessed is faster.
In your example, when loading A[1], A[2], A[3] and A[4], the processor probably loaded several of those elements at once, making those accesses very cheap. Moreover, if you then go on to try to access A[5], it can pre-load that chunk while you operate on A[1] and such, making the load time effectively nothing.
However, if you load A[1023], the processor must load that chunk. Then it must load A[12]- which it hasn't already loaded and thus must load a new chunk. Et cetera, et cetera. I have no idea about the rest of your question, however.

How to quickly count the number of neighboring voxels?

I have got a 3D grid (voxels), where some of the voxels are filled, and some are not. The 3D grid is sparsely filled, so I have got a set filledVoxels with coordinates (x, y, z) of the filled voxels. What I am trying to find out is, for each filled voxel, how many neighboring voxels are filled too.
Here is an example:
filledVoxels contains the voxels (1, 1, 1), (1, 2, 1), and (1, 3, 1).
Therefore, the neighbor counts are:
(1,1,1) has 1 neighbor
(1,2,1) has 2 neighbors
(1,3,1) has 1 neighbor.
Right now I have this algorithm:
voxelCount = new Map<Voxel, Integer>();
for (voxel v in filledVoxels)
count = checkAllNeighbors(v, filledVoxels);
voxelCount[v] = count;
end
checkAllNeighbors() looks up all 26 surrounding voxels. So in total I am doing 26*filledVoxels.size() lookups, which is quite slow.
Is there any way to cut down the number of required lookups? When you look at the above example you can see that I am checking the same voxels several times, so it might be possible to get rid of lookups with some clever caching.
If this helps in any way, the voxels represent a voxelized 3D surface (but there might be holes in it). I usually want to get a list of all voxels that have 5 or 6 neighbors.
You can transform your voxel space into an octree in which every node contains a flag that specifies whether it contains filled voxels at all.
When a node does not contain filled voxels, you don't need to check any of its descendants.
I'd say if each of your lookups is slow (O(size)), you should optimize it by binary search in an ordered list (O(log(size))).
I wouldn't worry much about the constant 26. If you iterate smarter, you could cache something and get from 26 down to 10 or so, I think, but unless you have profiled the whole application and found out decisively that it is the bottleneck I would concentrate on something else.
As ilya states, there's not much you can do to get around the 26 neighbor look-ups. You have to make your biggest gains in efficiently identifying whether a given neighbor is filled or not. Given that the brute force solution is essentially O(N^2), you have a lot of possible ground to gain in that area. Since you have to iterate over all filled voxels at least once, I would take an approach similar to the following:
voxelCount = new Map<Voxel, Integer>();
visitedVoxels = new EfficientSpatialDataType();
for (voxel v in filledVoxels)
for (voxel n in neighbors(v))
if (visitedVoxels.contains(n))
voxelCount[v]++;
voxelCount[n]++;
end
next
visitedVoxels.add(v);
next
For your efficient spatial data type, a kd-tree, as Zifre suggested, might be a good idea. In any case, you're going to want to reduce your search space by binning visited voxels.
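As a rough illustration of the pseudocode above, the sketch below uses a plain Python `set` of coordinate tuples as the spatial data type (an assumption; a kd-tree or binning scheme would replace it for other access patterns). The function name `count_neighbors` is hypothetical. Each adjacent pair is credited to both voxels the first time it is seen, so the counts come out right regardless of iteration order.

```python
from itertools import product

# The 26 offsets to the surrounding voxels (everything except the voxel itself).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def count_neighbors(filled_voxels):
    """Return {voxel: number of filled neighbors} for every filled voxel."""
    filled = set(filled_voxels)
    counts = dict.fromkeys(filled, 0)
    visited = set()
    for v in filled:
        for dx, dy, dz in OFFSETS:
            n = (v[0] + dx, v[1] + dy, v[2] + dz)
            # Only voxels processed earlier are in visited, so each
            # adjacent pair is counted exactly once, for both members.
            if n in visited:
                counts[v] += 1
                counts[n] += 1
        visited.add(v)
    return counts


counts = count_neighbors([(1, 1, 1), (1, 2, 1), (1, 3, 1)])
```

With hash-based lookups each of the 26 probes is expected constant time, so the whole pass is linear in the number of filled voxels.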
If you're marching along the voxels one at a time, you can keep a lookup table corresponding to the grid, so that after you've checked a voxel once using IsFullVoxel() you put the value in this table. For each voxel you're marching through you can check if its lookup table value is valid, and only call IsFullVoxel() if it isn't.
OTOH it seems like you can't avoid iterating over all neighboring voxels, either using IsFullVoxel() or the LUT. If you had some more a priori information it could help. For instance, if you knew that there were at most x neighboring filled voxels, or you knew that there were at most y neighboring filled voxels in each direction. For instance, if you know you're looking for voxels with 5 to 6 neighbors, you can stop after you've found 7 full neighbors or 22 empty neighbors.
I'm assuming that a function IsFullVoxel() exists that returns true if a voxel is full.
If most of the moves in your iteration were to neighbors, you could reduce your checking by around 25% by not looking back at the ones you just checked before you made the step.
You may find a Z-order curve a useful concept here. It lets you (with certain provisos) keep a sliding window of data around the point you're currently querying, so that when you move to the next point, you don't have to throw away many of the queries you've already performed.
Um, your question is not very clear. I'm assuming you just have a list of the filled points. In that case, this is going to be very slow, because you have to iterate through it (or use some kind of tree structure such as a kd-tree, but this will still be O(log n)).
If you can (i.e. the grid is not too big), just make a 3d array of bools. 26 lookups in a 3d array shouldn't really take that long (and there really is no way to cut down on the number of lookups).
Actually, now that I think of it, you could make it a 3d array of longs (64 bits). Each 64 bit block would hold 64 (4 x 4 x 4) voxels. When you are checking the neighbors of a voxel in the middle of the block, you could do a single 64 bit read (which would be much faster).
Is there any way to cut down the number of required lookups?
You will, at minimum, have to perform at least 1 lookup per voxel. Since that's the minimum, then any algorithm which only performs one lookup per voxel will meet your requirement.
One simplistic idea is to initialize an array to hold the count for each voxel, then look at each voxel and increment the neighbors of that voxel in the array.
Pseudo C might look something like this:
#define MAXX 100
#define MAXY 100
#define MAXZ 100
int x, y, z;
char countArray[MAXX][MAXY][MAXZ];
initializeCountArray(MAXX, MAXY, MAXZ); // Set all array elements to 0
for(x=0; x<MAXX; x++)
    for(y=0; y<MAXY; y++)
        for(z=0; z<MAXZ; z++)
            if(VoxelExists(x,y,z))
                incrementNeighbors(x,y,z); // add 1 to each of the up to 26 neighbors
You'll need to write initializeCountArray so it sets all array elements to 0.
More importantly you'll also need to write incrementNeighbors so that it won't increment outside the array. A slight speed increase here is to only perform the above algorithm on all voxels in the interior, then do a separate run on all the outside edge voxels with a modified incrementNeighbors routine that understands there won't be neighbors on one side.
This algorithm results in 1 lookup per voxel, and at most 26 byte additions per voxel. If your voxel space is sparse then this will result in very few (relative) additions. If your voxel space is very dense, you might consider reversing the algorithm - initialize the array to the value of 26 for each entry, then decrement the neighbors when a voxel doesn't exist.
The results for a given voxel (ie, how many neighbors do I have?) reside in the array. If you need to know how many neighbors voxel 2,3,5 has, just look at the byte in countArray[2][3][5].
The array will consume 1 byte per voxel. You could use less space, and possibly increase the speed a little bit by packing the bytes.
There are better algorithms if you know details about your data. For instance, a very sparse voxel space will benefit greatly from an octree, where you can skip large blocks of lookups when you already know there are no filled voxels inside. Most of these algorithms, however, would still require at least one lookup per voxel to fill their matrix, but if you are performing several operations then they may benefit more than this one operation.
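For a sparse set of filled voxels, the increment-the-neighbors idea from this answer can be sketched without a dense array by keeping the counts in a dictionary. This is an adaptation under assumptions, not the answer's C code: the name `neighbor_counts_by_scatter` is hypothetical and a `collections.Counter` stands in for `countArray`, which also removes the need for bounds checks.

```python
from collections import Counter
from itertools import product

# The 26 offsets to the surrounding voxels (everything except the voxel itself).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def neighbor_counts_by_scatter(filled_voxels):
    """Each filled voxel adds 1 to all 26 surrounding positions."""
    filled = set(filled_voxels)
    hits = Counter()
    for x, y, z in filled:
        for dx, dy, dz in OFFSETS:
            hits[(x + dx, y + dy, z + dz)] += 1
    # Only the totals that landed on filled voxels are of interest.
    return {v: hits[v] for v in filled}


scatter_counts = neighbor_counts_by_scatter([(1, 1, 1), (1, 2, 1), (1, 3, 1)])
# Voxels with 5 or 6 filled neighbors, as asked for in the question.
wanted = [v for v, c in scatter_counts.items() if c in (5, 6)]
```

The dictionary grows to roughly 26 entries per isolated voxel in the worst case, so the dense array remains the better choice when the grid is small or densely filled.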

Resources