Been stuck on a problem for a while, hope some of you have ideas.
Given an N*M matrix of binary values (0/1), come up with an approach to count the number of 1's that is more efficient than simply iterating over the matrix.
The key, in my opinion, is a bitmap. I thought about allocating a new N*M matrix and manipulating the two together, but I haven't got a solution yet.
Any ideas?
From a theoretical point of view, unless the matrix has special properties, you must test all N*M elements, and a simple loop achieves exactly that. So this construction is optimal and unbeatable.
In practice, maybe you are looking for a way to get some speedup from a naïve implementation that handles a single element at a time. The answer will be highly dependent on the storage format of the elements and the processor architecture.
If the bits are packed 8 per byte, you can set up a lookup table of bit counts for every possible byte value. This yields a potential speedup of 8x.
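For illustration, here is a minimal Python sketch of that lookup-table idea, assuming the matrix is already packed into a bytes object (the names are made up for the example):

# Counting set bits with a 256-entry lookup table, assuming the matrix
# is stored as packed bits in a bytes object.
POPCOUNT = [bin(b).count("1") for b in range(256)]   # built once

def count_ones(packed: bytes) -> int:
    # One table lookup per byte instead of eight single-bit tests.
    return sum(POPCOUNT[b] for b in packed)

print(count_ones(bytes([0b10110010, 0b00001111])))   # -> 8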
If you know that the black zones are simply connected (no holes), then it is not necessary to visit their interiors; a contouring algorithm suffices. But you still have to scan the white areas. This lets you break the N*M bound and reduce the work to Nw + Lb, where Nw is the number of white pixels and Lb is the total length of the black outlines.
If in addition you know that there is a single, simply connected black zone and you know one pixel on its outline, the complexity drops to Lb, which can be significantly smaller than N*M.
I'm sorry if this is a duplicate of some thread, but I'm really not sure how to describe the question.
I'm wondering what the minimal data structure is to prevent a 2D-grid traveler from repeating itself (i.e. traveling to a point it has already visited). The traveler can only move horizontally or vertically, one step at a time. For my particular case (below), the 2D grid is actually a lower-left triangle where one coordinate never exceeds the other.
For example, in the 1D case this can be done simply by recording the direction of the last move: if the direction changes, the traveler is repeating itself.
In the 2D case it becomes complicated. The most trivial way would be to keep a list of the points traveled so far, but I'm wondering whether there are more efficient ways to do that.
I'm implementing a more-or-less "4-finger" algorithm for 4-sum, where the two fingers in the middle can move in both directions (the fingers are i, j, k, and l):
i=> <=j=> <=k=> <=l
1 2 3 ... 71 72 ... 123 124 ... 201 202 203
The directions the fingers travel are decided (or suggested) by some algorithm, but that might lead to an endless loop. Therefore I have to refuse a suggestion whenever the two middle fingers start to repeat a previously visited position.
EDIT
Over the last few days I found 2 solutions. Neither is an ideal solution to this problem, but both are at least somewhat usable:
As @Sorin mentions below, one solution is to keep a bit array representing the state of all cells. For the triangular grid in this example, we can even condense the array to cut the memory cost in half (though computing the bit position then takes O(k^2) time, where k is the number of degrees of freedom, i.e. 2 here; a standard array needs only linear time).
The other solution is to avoid backward travel altogether: set up the algorithm so that j and k only ever move in one direction (this is essentially greedy).
But since the 2D-grid traveler has the nice property that it moves along an axis one step at a time, I'm still wondering whether there is a more "specialized" representation for this kind of movement.
Thanks for your help!
If you are looking for optimal lookup complexity, then a hash set is the best thing. You need O(N) memory, but all lookups and insertions are O(1).
If you tend to visit most of the cells, you can even skip the hashing and store a plain bit array: one bit per cell, and you just check whether the corresponding bit is 0 or 1. This is much more compact in memory (at least 32x, one bit vs. one int, and likely more, since you also skip the 64-bit pointers a hash-based structure stores internally).
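For what it's worth, a minimal Python sketch of that bit-array idea, adapted to the lower-triangular grid from the question (the class name and index formula are illustrative; a full rectangular grid would use i * M + j instead):

# "Visited" bit array for the lower-triangular grid (j <= i), using a
# Python int as the bit vector; the triangular index i*(i+1)//2 + j
# halves the memory compared with a full N*N bit array.
class TriVisited:
    def __init__(self):
        self.bits = 0

    @staticmethod
    def _index(i, j):
        assert 0 <= j <= i
        return i * (i + 1) // 2 + j

    def visit(self, i, j):
        self.bits |= 1 << self._index(i, j)

    def seen(self, i, j):
        return (self.bits >> self._index(i, j)) & 1 == 1

v = TriVisited()
v.visit(3, 1)
print(v.seen(3, 1), v.seen(3, 2))   # -> True False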
If this still takes too much space, you could use a Bloom filter, but that will give you some false positives (it may tell you that you've visited a cell when in fact you haven't). If that's something you can live with, the space savings are fairly huge.
Other structures like BSP or k-d trees could work as well. Once a whole region is either entirely free or entirely occupied (ignoring the unused cells in the upper triangle), you can store all of that information in a single node.
This is hard to recommend because of its complexity, and because it will likely still use O(N) memory in many cases, only with a larger constant. Also, every check becomes O(log N).
I'm looking for an algorithm to chunk a non-rectangular image (i.e. an image with transparency) into blocks of (for example) 16x16 pixels. The blocks may overlap, but the goal is to get the smallest number of blocks.
Example
Summary
Blocks must have equal sizes
Blocks may overlap
The smallest number of blocks is the goal
Thank you in advance
This is a special case of set cover. You could try an integer programming solver, but there may simply be too many possible blocks. The integer program would be amenable to column generation / branch-and-price, but that's an advanced technique and would require some experimentation to get right.
I think that you could do pretty well with a greedy algorithm that repeatedly chooses the block covering as many uncovered pixels as possible, including at least one boundary pixel.
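To make the greedy idea concrete, here is a deliberately naive Python sketch (brute force over candidate anchors, with made-up names; a real implementation would prune candidates, for instance to blocks touching the boundary as suggested above):

# Greedy cover: repeatedly place the B x B block that covers the most
# still-uncovered pixels. The image is assumed to be given as a set of
# (x, y) coordinates of opaque pixels.
def greedy_blocks(pixels, B=16):
    uncovered = set(pixels)
    blocks = []
    while uncovered:
        best_anchor, best_covered = None, set()
        # Candidate anchors: every position that puts some uncovered
        # pixel somewhere inside the block.
        candidates = {(px - dx, py - dy)
                      for (px, py) in uncovered
                      for dx in range(B) for dy in range(B)}
        for (ax, ay) in candidates:
            covered = {(x, y) for (x, y) in uncovered
                       if ax <= x < ax + B and ay <= y < ay + B}
            if len(covered) > len(best_covered):
                best_anchor, best_covered = (ax, ay), covered
        blocks.append(best_anchor)
        uncovered -= best_covered
    return blocks

# Example: a small L-shaped region covered with 4x4 blocks.
shape = {(x, y) for x in range(6) for y in range(2)} | \
        {(x, y) for x in range(2) for y in range(2, 6)}
print(greedy_blocks(shape, B=4))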
Suppose I have a rod which I cut into pieces. Given a point on the original rod, is there a way to find out which piece it belongs to in constant time?
For example:
|------------------|---------|---------------|
0.0 4.5 7.8532 9.123
Given a position:
^
|
8.005
I would like to get the 3rd piece.
It is easy to get such an answer in O(log n) time with binary search, but is it possible to do it in O(1) if I pre-process the "cut" positions somehow?
If you assume the query point is chosen uniformly at random along the rod, then you can get an EXPECTED constant-time solution, without a crazy memory explosion, as follows. Break the rod into N equally spaced pieces, where N is the number of original, irregularly spaced segments, and record for each equal-sized piece which of the original irregular segments it overlaps. To answer a query, do a simple round-off on the query point to find which equally spaced piece it lies in, use that index to look up which original segments intersect that piece, and check each of them to see whether it contains your point (you can use binary search there if you want the worst case to stay logarithmic). The expected running time of this approach is constant if the query point is chosen uniformly at random along the rod, and the memory is O(N) if the rod was originally cut into N irregular pieces, so there are no crazy memory requirements.
PROOF OF EXPECTED O(1) RUNNING TIME:
When you count the total number of intersection pairs between the N original irregular segments and the N equally spaced pieces I propose constructing, the total is no more than 2(N+1): if you sort all the endpoints of the regular and irregular segments, each new intersection pair can be charged to one of the endpoints defining either a regular or an irregular segment. So you have a multiset of at most 2(N+1) irregular-segment references, distributed in some fashion among the N regular pieces they intersect. The actual distribution doesn't matter: with a uniform query point, each regular piece is chosen with probability 1/N, so the expected number of intersected irregular segments that need to be checked is 2(N+1)/N = O(1).
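A minimal Python sketch of this construction, using the example rod from the question (cut positions are assumed to be given in increasing order and to include both ends; the names are illustrative):

cuts = [0.0, 4.5, 7.8532, 9.123]            # piece i spans [cuts[i], cuts[i+1])
n = len(cuts) - 1                           # number of pieces
bucket_width = (cuts[-1] - cuts[0]) / n     # N equally spaced buckets

# Precompute, for every bucket, the pieces that overlap it.
buckets = [[] for _ in range(n)]
for piece in range(n):
    first = int((cuts[piece] - cuts[0]) / bucket_width)
    last = min(int((cuts[piece + 1] - cuts[0]) / bucket_width), n - 1)
    for b in range(first, last + 1):
        buckets[b].append(piece)

def find_piece(x):
    b = min(int((x - cuts[0]) / bucket_width), n - 1)   # O(1) round-off
    for piece in buckets[b]:                            # expected O(1) scan
        if cuts[piece] <= x < cuts[piece + 1]:
            return piece
    raise ValueError("position outside the rod")

print(find_piece(8.005) + 1)   # -> 3 (the 3rd piece)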
For arbitrary cuts and precisions, not really: you have to compare the position with the various start or end points.
But if you're only talking about a small number of cuts, performance shouldn't really be an issue.
For example, even with ten segments you only have nine comparisons, not a huge amount of computation.
Of course, you can always turn the situation into a polynomial formula (such as ax^4 + bx^3 + cx^2 + dx + e), generated from simultaneous equations, which will give you a segment number, but the highest power tends to rise with the segment count, so it's not necessarily more efficient than simple checks.
You're not going to do better than lg n with a comparison-based algorithm. Reinterpreting the 31 non-sign bits of a positive IEEE float as a 31-bit integer is an order-preserving transformation, so tries and van Emde Boas trees are both options. I would steer you first toward a three-level trie.
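A quick Python check of that order-preserving reinterpretation (just a sanity test of the claim, not part of the trie itself):

import struct

def float_bits(x):
    # Reinterpret a positive 32-bit IEEE-754 float's bits as an unsigned int.
    return struct.unpack(">I", struct.pack(">f", x))[0]

a, b = 4.5, 7.8532
print((a < b) == (float_bits(a) < float_bits(b)))   # -> True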
You could assign an integral number to every position and use it as an index into a lookup table, which would give you constant-time lookup. This is pretty easy if your rod is short and you don't cut it into pieces that are fractions of a millimeter long. If you can get by with such an approximation, that would be my way to go.
There is an enhanced version that generalizes this further. In each element of the lookup table, you store the boundary position that falls inside it and the segment IDs to its left and right. A query is then one lookup (O(1)) plus one comparison (O(1)). The downside is that the lookup table has to be fine enough that no table element's range ever spans more than two different segments. Again, whether this works depends on your requirements and input data.
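A small Python sketch of that enhanced table, using the rod from the question (the cell size of 1.0 is an arbitrary choice that happens to leave at most one cut per cell, which is the requirement described above):

cuts = [0.0, 4.5, 7.8532, 9.123]
cell = 1.0                                  # must leave at most one cut per cell
n_cells = int(cuts[-1] / cell) + 1

# Each entry: (boundary position in the cell, piece to its left, piece to its right)
table = []
for c in range(n_cells):
    lo = c * cell
    inside = [i for i in range(1, len(cuts) - 1) if lo <= cuts[i] < lo + cell]
    if inside:
        i = inside[0]
        table.append((cuts[i], i - 1, i))
    else:
        # No boundary in this cell: both sides map to the same piece.
        piece = max(i for i in range(len(cuts) - 1) if cuts[i] <= lo)
        table.append((lo, piece, piece))

def find_piece(x):
    mid, left, right = table[int(x / cell)]  # one lookup
    return left if x < mid else right        # one comparison

print(find_piece(8.005) + 1)   # -> 3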
For example, the restricted space is 100 x 100 x 100 and the radius of each ball is 5. I need to generate 100 of these balls at random positions within this space, with no overlap allowed. I came up with two approaches:
Use srand to get 100 positions, then check for overlaps and delete any balls that overlap each other (two balls overlap if the distance between their centers is less than twice the radius), then generate another x balls (where x is the number of balls deleted), and keep repeating the process until the 100 balls don't overlap.
First divide the space into 100 cubes, and place each ball inside its allocated cube using srand; this way they won't overlap at all.
I feel the first way is more proper in terms of randomness but too time-consuming, while the second way is fast and easy but I'm not sure how random it really is. This model is trying to simulate the positions of molecules in the air. Maybe neither of these ways is good; please let me know if there's a better way. Thanks in advance!
Edit:
@Will suggests an option that's similar to, but much cleaner than, my original first approach: every time a new ball is added, check whether it overlaps any existing one, and if it does, regenerate it. The number of checks is 1+2+3+...+(n-1), which is O(n^2). I still wonder if there's a faster algorithm, though.
You can get an O((n + f) log n) algorithm, where f is the number of failed attempts. Essentially, the part that takes too long is finding which neighboring balls you overlap with. You can use a spatial data structure called a k-d tree to store the ball positions efficiently, and query it for the nearest neighboring ball in O(log n) time. Check whether that neighbor overlaps, then add the new ball to the space and to the k-d tree; insertion is also an O(log n) operation. In total, n balls each taking O(log n) gives O(n log n), and accounting for failed attempts gives O((n + f) log n). CGAL (the Computational Geometry Algorithms Library) provides a nice k-d tree implementation. Here is a link to CGAL and a link to k-d trees:
http://www.cgal.org/
https://en.wikipedia.org/wiki/K-d_tree
There are other structures similar to a k-d tree, but this would be the easiest to use for your case.
If you would like to avoid a fancy data structure, you can lay a grid over the space. Insert each random ball into its grid cell; then, when checking for overlap, you only need to check the balls in the same and adjacent cells (assuming a ball cannot overlap anything more than one cell away). This will not improve the overall time complexity, but it is a common method in computer graphics for speeding up neighbor-finding routines.
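A rough Python sketch of that grid approach for the numbers in the question (100 x 100 x 100 box, radius 5, cells of side 10 = one diameter, so only the 27 surrounding cells ever need checking; keeping centers at least one radius from the walls is an extra assumption so the balls stay fully inside):

import random

SIZE, R, N = 100.0, 5.0, 100
CELL = 2 * R

def cell_of(p):
    return tuple(int(c // CELL) for c in p)

def overlaps(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) < (2 * R) ** 2

grid = {}        # cell index -> list of ball centers in that cell
balls = []
while len(balls) < N:
    p = tuple(random.uniform(R, SIZE - R) for _ in range(3))
    cx, cy, cz = cell_of(p)
    neighbours = [grid.get((cx + dx, cy + dy, cz + dz), [])
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)]
    if any(overlaps(p, q) for bucket in neighbours for q in bucket):
        continue                     # rejected: overlaps an existing ball
    balls.append(p)
    grid.setdefault((cx, cy, cz), []).append(p)

print(len(balls), "non-overlapping balls placed")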
Instead of dividing the space into 100 cubes, you could divide it into 1,000 cubes of side 10 (one ball diameter) and place balls at the centers of 100 randomly chosen cubes. This way the balls are still placed randomly in the space, but since neighboring centers are a full diameter apart, they can touch but never overlap.
Edit: Also, when checking whether the balls overlap, you might want to think about using a data structure that lets you check only the balls closest to the one you are testing. Checking all of them is wasteful, because balls on totally different sides of the space have no chance of overlapping. I'm not too familiar with octrees, but you might want to look into them if you really want to optimize your code.
The volume of one of your spheres is about 1/1900th of the volume of your space, so if you just pick random locations and check for overlap, you won't have to regenerate many. And if you really only need 100, using a fancy structure like an octree to check for collisions would be a waste.
Of course as soon as you code it up, someone will ask you to do it for 10,000 spheres instead of 100, so everything I just said will be wrong.
I like Chris's suggestion of just putting them in randomly chosen cubes. Not the most realistic perhaps, but close and much simpler.
I currently have an algorithm that operates on an n-by-m adjacency matrix. In my algorithm, I need to zero out entire rows or columns at a time. My implementation is currently O(m) or O(n), depending on whether it's a row or a column.
Is there any way to zero out a column or row in O(1) time?
Essentially this depends on the chip architecture you're dealing with. On most CPUs it isn't possible to zero out a whole swathe of memory in one go, so each word requires a separate memory operation, no matter what facilities your programming language provides.
Contiguous memory helps access time tremendously: memory adjacent to memory just accessed will be cached, so subsequent accesses hit the cache and are fast.
The result is that if your matrix is large, it may be faster to zero out a row at a time than a column at a time (or vice versa), depending on whether your data is laid out by row or by column.
EDIT: I have assumed that your matrices aren't sparse, triangular, or otherwise special, since you talk about "zeroing out a whole row". If you know that your matrix is mostly empty or otherwise fits a special pattern, you could represent it differently (not as a plain n x m array) and the story would be different. But if what you have right now is an n x m matrix, then this is the case.
Is the matrix a distance matrix, and is the graph undirected? In that case the matrix is symmetric, and you could operate on just the lower (or upper) triangular part throughout the program. Then you only have to zero out one row (or column, if you are dealing with the upper triangle), and even then it won't be a whole row; on average, half of one.
It depends on how your matrices are implemented.
If you have a representation such as an array of arrays, you can repoint a row to a shared zeroed array, as long as you check that you don't subsequently write to it. That means one of a row or a column (whichever matches the storage orientation) can be zeroed in O(1), using O(N) memory for the shared array and adding a constant cost to all other write operations.
You could also keep a pair of arrays, one for rows and one for columns, that scale the values in the matrix. Putting a zero in either one masks out a row or column in O(1), at the cost of extra processing on every read; that may be worth it as a way of temporarily removing a node from the graph, if that's a common use case. It also leaves the original matrix untouched, so you could parallelise your algorithm (assuming the only operation it requires is pruning all edges into or out of a node).
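A minimal Python sketch of that masking idea (the class and method names are illustrative; every read pays for two extra multiplications, but zeroing a row or column is O(1)):

class MaskedMatrix:
    def __init__(self, data):
        self.data = data                     # n x m adjacency matrix
        self.row_on = [1] * len(data)
        self.col_on = [1] * len(data[0])

    def get(self, i, j):
        return self.data[i][j] * self.row_on[i] * self.col_on[j]

    def zero_row(self, i):                   # O(1)
        self.row_on[i] = 0

    def zero_col(self, j):                   # O(1)
        self.col_on[j] = 0

m = MaskedMatrix([[1, 2], [3, 4]])
m.zero_row(0)
print(m.get(0, 1), m.get(1, 1))   # -> 0 4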