How to get an oriented contour from vtkMarchingSquares - computational-geometry

I have a 2D array of double values, and I want to extract oriented contours at a particular threshold. By "oriented" I mean that the values to the left of the direction of travel should be larger than the threshold, and the values to the right should be smaller.
I am trying to use vtkMarchingSquares, but it just gives me a bunch of tiny line segments across individual grid cells. So I'm trying to stitch them together with vtkStripper.
But the contours come out fragmented, and not in a consistent orientation with respect to the underlying values.
I have also tried setting a vtkMergePoints locator in the vtkMarchingSquares object, but as far as I can tell, the locator's SetTolerance method has no effect on the output.

Related

What do pixel coordinates translated from longitude/latitude represent in d3.js?

If I do nothing else but this:
let projection = d3.geoAzimuthalEqualArea()
console.log(projection([-3.0026, 16.7666])) // passing long/lat
I get [473.67353385539417, 213.6120079887163], which are pixel coordinates.
But pixel coordinates relative to WHAT? I haven't even defined the size of the svg container, the center of the map etc. How can it already know the exact pixel position on the screen if I haven't even specified how big the map will be on the page? If it's a bigger map, the point will be further to the right and down, which means a different pixel value.
Also why are those returned values floats with many decimal places if screen pixels can only ever be whole numbers?
And are there maximum values for these pixel coordinates? (I.e. is this projected on some kind of a default sized map?)
Where is this in the API documentation?
The documentation is not very clear about the default dimensions of a projection's output. Somewhat hidden, you will find the value in the section on projection.translate():
# projection.translate([translate])
[…]
The default translation offset places ⟨0°,0°⟩ at the center of a 960×500 area.
Those 960×500 are the dimensions of the 2D-plane the geographic data is projected upon.
The output of a projection is the result of a sequence of mathematical operations on the input values, i.e. the geographic data. Floating-point math is used to get as close to the exact values as possible, and no assumption is made about the actual use of the result. Although those coordinates may be interpreted as pixel coordinates by simply rounding the values, their use is not restricted to this interpretation. Especially with vector formats like SVG, it is quite normal to have floating-point coordinate values, with the user client, e.g. the browser, doing the calculations to further project those vectors onto the screen, taking into account possible translations, rotations, skews, view boxes, etc.
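To make that default concrete, the last step of a projection is just an affine placement onto that 960×500 plane. Here is a hypothetical sketch (Python used only for illustration; translate=(480, 250) is the documented default, while scale=150 is an assumed illustrative value, since the actual default scale differs per projection):

```python
def to_plane(px, py, scale=150, translate=(480, 250)):
    """Affine placement of raw projected coordinates onto the default
    960x500 plane. (480, 250) is the centre of that plane; y is flipped
    because SVG's y axis points down."""
    tx, ty = translate
    return (tx + scale * px, ty - scale * py)

print(to_plane(0.0, 0.0))  # → (480.0, 250.0): ⟨0°,0°⟩ lands at the centre
```

Changing the scale or translate simply moves every output point; nothing about the SVG container is ever consulted.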

Efficiently retrieve Z ordered objects in viewport in 2D game

Imagine a 2D game with a large play area, say 10000x10000 pixels. Now imagine there are thousands of objects spread across this area. All the objects are in a Z order list, so every object has a well-defined position relative to every other object even if they are far away.
Suppose there is a viewport into this play area showing, say, a 500x500 region. Obviously, if the algorithm is "for each object in the Z order list, if inside viewport, render it", then you waste a lot of time iterating over the thousands of objects far outside the viewport. A better way would be to maintain a Z-ordered list of objects near or inside the viewport.
If both the objects and the viewport are moving, what is an efficient way of maintaining a Z-ordered list of objects which are candidates to draw? This is for a general-purpose game engine so there are not many other assumptions or details you can add in to take advantage of: the problem is pretty much just that.
You do not need to keep your memory layout strongly ordered by Z. Instead, store your objects in a space-partitioning structure oriented along the viewing surface.
A typical partitioning structure in 2D is a quadtree. You can also use a binary tree, a grid, or a spatial hashing scheme, and you can even mix these techniques and nest one inside another.
There is no single "best"; weigh the ease of writing and maintaining the code against the memory you have available.
Let us consider the grid: it is the simplest to implement, the fastest to access, and the easiest to traverse (traversing means visiting neighboring cells).
Imagine you allow yourself 20 MB of RAM for the grid skeleton, and each cell holds just a small object (like a std::vector or a C# List) of, say, 50 bytes. For a 10k-pixel square surface you then have:
sqrt(20*1024*1024 / 50) ≈ 647
That is 647 cells per dimension, therefore 10000/647 ≈ 15-pixel-wide cells.
Still very small, so I suppose perfectly acceptable. You can adjust the numbers to get cells of 512 pixels, for example; it should be a good fit when a few cells cover the viewport.
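The budget arithmetic above can be sketched as follows (Python used here just for the back-of-the-envelope numbers; the names are illustrative):

```python
import math

def cells_for_budget(world_px, budget_bytes, bytes_per_cell):
    """Cells per dimension that fit in the memory budget, and the
    resulting cell width in pixels."""
    per_dim = int(math.sqrt(budget_bytes / bytes_per_cell))
    return per_dim, world_px / per_dim

cells, width = cells_for_budget(10_000, 20 * 1024 * 1024, 50)
print(cells, round(width))  # → 647 15
```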
Then it is trivially easy to determine which cells the viewport activates: divide the top-left corner by the cell size and floor the result, which gives you the cell index directly (provided both your viewport space and your grid space start at (0,0); otherwise you need to apply an offset).
Finally, take the bottom-right corner, determine its cell coordinate the same way, and run a double loop (over x and y) between the min and max indices to iterate over the activated cells.
When processing a cell, you can draw the objects it contains by walking the list of objects you previously stowed there.
Beware of objects that span two or more cells. You have to make a choice: either store each object once and only once, but then your search algorithms must always know the size of the biggest element in the region and also search the lists of neighboring cells (going as far as necessary to be sure to cover at least the size of that biggest element).
Or you can store it multiple times (my preferred way) and simply make sure, when you iterate cells, that you process each object only once per frame. This is very easily achieved by keeping a frame id in the object structure (as a mutable member).
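A minimal sketch of that scheme (Python for brevity; a real engine would do this in its own language, and all names here are illustrative). Objects overlapping several cells are inserted into each, and the per-object frame id keeps the viewport query from returning duplicates:

```python
import math

class Sprite:
    """Toy renderable with an axis-aligned bounding box and a frame id."""
    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.frame_id = -1              # last frame this sprite was visited

class GridIndex:
    """Uniform grid; an object overlapping several cells is stored in each
    of them (the 'store multiple times' option)."""
    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = {}                 # (cx, cy) -> list of sprites

    def insert(self, s):
        x0 = math.floor(s.x / self.cell_size)
        y0 = math.floor(s.y / self.cell_size)
        x1 = math.floor((s.x + s.w) / self.cell_size)
        y1 = math.floor((s.y + s.h) / self.cell_size)
        for cy in range(y0, y1 + 1):
            for cx in range(x0, x1 + 1):
                self.cells.setdefault((cx, cy), []).append(s)

    def query(self, vx, vy, vw, vh, frame_id):
        """Visit every sprite in the cells covered by the viewport, once."""
        x0 = math.floor(vx / self.cell_size)
        y0 = math.floor(vy / self.cell_size)
        x1 = math.floor((vx + vw) / self.cell_size)
        y1 = math.floor((vy + vh) / self.cell_size)
        out = []
        for cy in range(y0, y1 + 1):
            for cx in range(x0, x1 + 1):
                for s in self.cells.get((cx, cy), ()):
                    if s.frame_id != frame_id:   # dedup across cells
                        s.frame_id = frame_id
                        out.append(s)
        return out

grid = GridIndex(512)
grid.insert(Sprite(500, 10, 24, 24))   # straddles cells (0,0) and (1,0)
print(len(grid.query(0, 0, 600, 600, frame_id=1)))  # → 1 (not 2)
```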
The same logic applies to more flexible partitions such as binary trees.
I have implementation for both available in my engine, check the code out, it may help you get through the details: http://sourceforge.net/projects/carnage-engine/
Final words about your Z ordering: if you had separate storage per Z value, you had already done a space partitioning, just not along the right axis.
This can be called layering.
As an optimization, instead of storing plain lists of objects in your cells, you can store (ordered) maps keyed by each object's Z, so that iteration within a cell is already ordered along Z.
A typical solution to this sort of problem is to group your objects according to their approximate XY location. For example, you can bucket them into 500x500 regions, i.e. objects intersecting [0,500]x[0,500], objects intersecting [500,1000]x[0,500], etc. Very large objects might be listed in multiple buckets, but presumably there are not too many very large objects.
For each viewport, you would need to check at most 4 buckets for objects to render. You will generally only look at about as many objects as you need to render anyway, so it should be efficient. This does require a bit more work updating when you reposition objects. However, assuming that a typical object is only in one bucket, it should still be pretty fast.

Invoice / OCR: Detect two important points in invoice image

I am currently working on OCR software and my idea is to use templates to try to recognize data inside invoices.
However scanned invoices can have several 'flaws' with them:
Not all invoices, based on a single template, are correctly aligned under the scanner.
People can write on invoices
etc.
Example of invoice: (Have to google it, sadly cannot add a more concrete version as client data is confidential obviously)
I find my data in the invoices based on the x-values of the text.
However I need to know the scale of the invoice and the offset from left/right, before I can do any real calculations with all data that I have retrieved.
What have I tried so far?
1) Making the image monochrome and use the left and right bounds of the first appearance of a black pixel. This fails due to the fact that people can write on invoices.
2) Divide the invoice up in vertical sections, use the sections that have the highest amount of black pixels. Fails due to the fact that the distribution is not always uniform amongst similar templates.
I could really use your help on (1) how to identify important points in invoices and (2) on what I should focus as the important points.
I hope the question is clear enough as it is quite hard to explain.
Detecting rotation
I would suggest you start by detecting straight lines.
Look (perhaps randomly) for small areas with high contrast, i.e. mostly white but with a fair number of very black pixels as well. Then try to fit a line to these black pixels, e.g. using the least-squares method. Drop the outliers and fit another line to the remaining points; iterate as required. Evaluate how good that fit is, i.e. how many of the pixels in the observed area lie really close to the line, and how far the line extends beyond the observed area. Do this for a number of regions, and you should get a weighted list of lines.
For each line you can compute its own direction and the direction orthogonal to it. One of these numbers can be chosen from the interval [0°, 90°); the other will be 90° plus that value, so storing one is enough. Take all these directions and find the single angle that best matches them. You can do that using a sliding window of e.g. 5°: slide across that (cyclic) range, find the position where the maximal number of lines fall within the window, then compute the average or median of the angles within that window. All of this can be done taking the line weights into account.
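A rough sketch of that voting step, assuming the directions have already been reduced to [0°, 90°) (Python for illustration; the naive averaging here ignores wrap-around at the 0°/90° seam, which a robust version must handle):

```python
def dominant_angle(angles_deg, weights, window=5.0):
    """Weighted vote for the rotation angle: slide a cyclic window over
    [0, 90) and average the angles inside the best-scoring window."""
    def cyc_dist(a, c):
        d = abs(a - c) % 90.0
        return min(d, 90.0 - d)

    best_score, best_center = -1.0, 0.0
    for i in range(180):                     # candidate centres, 0.5° apart
        center = i * 0.5
        score = sum(w for a, w in zip(angles_deg, weights)
                    if cyc_dist(a, center) <= window / 2)
        if score > best_score:
            best_score, best_center = score, center
    inside = [(a, w) for a, w in zip(angles_deg, weights)
              if cyc_dist(a, best_center) <= window / 2]
    return sum(a * w for a, w in inside) / sum(w for _, w in inside)

# Three strong near-horizontal lines outvote one weak diagonal outlier.
print(dominant_angle([1.0, 2.0, 1.5, 45.0], [1, 1, 1, 0.2]))  # → 1.5
```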
Once you have found the direction of lines, you can rotate your image so that the lines are perfectly aligned to the coordinate axes.
Detecting translation
Assuming the image wasn't scaled at any point, you can then try to use an FFT-based correlation to match the image to the template. Convert both images to gray and pad them with zeros until the originals take up at most half the edge length of the padded image, which preferably should be a power of two. FFT both images, multiply one element-wise by the complex conjugate of the other (plain element-wise multiplication would give a convolution rather than a correlation), and inverse-FFT the result. The resulting image encodes how well the two images agree for each possible shift relative to one another. Simply find the maximum, and you know how to make them match.
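A minimal sketch of that correlation using NumPy (names are illustrative; a real pipeline should pad as described above, since the FFT correlation wraps around circularly):

```python
import numpy as np

def find_shift(template, scanned):
    """Correlation theorem: multiply one spectrum by the conjugate of the
    other; the peak of the inverse transform gives the (circular) shift
    of `scanned` relative to `template`."""
    corr = np.fft.ifft2(np.fft.fft2(scanned) *
                        np.conj(np.fft.fft2(template))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)

# Demo with a synthetic "logo": a bright block, then the same image shifted.
a = np.zeros((64, 64))
a[10:20, 12:22] = 1.0
b = np.roll(np.roll(a, 5, axis=0), 7, axis=1)   # scan = template moved by (5, 7)
print(find_shift(a, b))  # → (5, 7)
```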
Added text will cause no problems at all. This method will work best for large areas, like the company logo and gray background boxes. Thin lines will provide a poorer match, so in those cases you might have to blur the picture before doing the correlation, to broaden the features. You don't have to use the blurred image for further processing; once you know the offset you can return to the rotated but unblurred version.
Now you know both rotation and translation, and assumed no scaling or shearing, so you know exactly which portion of the template corresponds to which portion of the scan. Proceed.
If rotation is solved already, I'd just sum up all pixel color values horizontally and vertically to a single horizontal / vertical "line". This should provide clear spikes where you have horizontal and vertical lines in the form.
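With NumPy, those projection profiles are one line each; dark form lines show up as sharp minima, since black is 0 (a sketch with synthetic data):

```python
import numpy as np

def projection_profiles(img):
    """Collapse a grayscale page (0 = black, 255 = white) to one value
    per row and one per column."""
    return img.sum(axis=1), img.sum(axis=0)

page = np.full((100, 200), 255, dtype=np.int64)
page[40, :] = 0                      # a horizontal rule in the form
page[:, 120] = 0                     # a vertical rule in the form
rows, cols = projection_profiles(page)
print(rows.argmin(), cols.argmin())  # → 40 120
```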
p.s. Generated a corresponding horizontal image with Gimp's scaling capabilities, attached below (it's a bit hard to see because it's only one pixel high and may get scaled down because it's > 700 px wide; the url is http://i.stack.imgur.com/Zy8zO.png ).

Efficient collision detection between balls (ellipses)

I'm running a simple sketch in an HTML5 canvas using Processing.js that creates "ball" objects which are just ellipses that have position, speed, and acceleration vectors as well as a diameter. In the draw function, I call on a function called applyPhysics() which loops over each ball in a hashmap and checks them against each other to see if their positions make them crash. If they do, their speed vectors are reversed.
Long story short, the number of calculations as it is now is (number of balls)^2, which ends up being a lot when I get into the hundreds of balls. This sort of check slows down the sketch too much, so I'm looking for ways to do smarter collisions some other way.
Any suggestions? Using PGraphics somehow maybe?
I assume you're already simplifying the physics by treating the ellipses as if they were circles.
Other than that, check out quadtree collision detection:
http://gamedev.tutsplus.com/tutorials/implementation/quick-tip-use-quadtrees-to-detect-likely-collisions-in-2d-space/
I don't know your project, but if the balls have non-random forces applied to them (e.g. gravity) you might use predictive analytics also.
If you grid the space and create a data structure that reflects this (e.g. an array of row objects each containing an array of column objects each containing an ArrayList of ball objects), you can just consider interactions within each cell (or also with neighbouring cells). You can reassign data location for balls that cross boundaries. You then have far fewer interactions.
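A sketch of such a grid broad phase (Python for illustration; the Processing.js version would be analogous). The cell size should be at least one ball diameter so that checking a cell plus its eight neighbours is guaranteed to catch every colliding pair:

```python
from collections import defaultdict

def broad_phase_pairs(balls, cell_size):
    """balls: list of (x, y, radius) circles. Buckets centres into a grid
    and only distance-tests pairs in the same or adjacent cells."""
    grid = defaultdict(list)
    for i, (x, y, _) in enumerate(balls):
        grid[(int(x // cell_size), int(y // cell_size))].append(i)
    pairs = set()
    for (cx, cy), members in grid.items():
        # gather this cell plus its 8 neighbours
        nearby = [j for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  for j in grid.get((cx + dx, cy + dy), ())]
        for i in members:
            xi, yi, ri = balls[i]
            for j in nearby:
                if i < j:
                    xj, yj, rj = balls[j]
                    if (xi - xj) ** 2 + (yi - yj) ** 2 <= (ri + rj) ** 2:
                        pairs.add((i, j))
    return pairs

print(broad_phase_pairs([(0, 0, 5), (6, 0, 5), (100, 100, 5)], 20))  # → {(0, 1)}
```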

Raytracing (LoS) on 3D hex-like tile maps

Greetings,
I'm working on a game project that uses a 3D variant of hexagonal tile maps. Tiles are actually cubes, not hexes, but are laid out just like hexes (because a square can be extruded into a cube to extrapolate from 2D to 3D, but there is no 3D version of a hex). Rather than a verbose description, here is an example of a 4x4x4 map:
(I have highlighted an arbitrary tile (green) and its adjacent tiles (yellow) to help describe how the whole thing is supposed to work; but the adjacency functions are not the issue, that's already solved.)
I have a struct type to represent tiles, and maps are represented as a 3D array of tiles (wrapped in a Map class to add some utility methods, but that's not very relevant).
Each tile is supposed to represent a perfectly cubic space, and they are all exactly the same size. Also, the offset between adjacent "rows" is exactly half the size of a tile.
That's enough context; my question is:
Given the coordinates of two points A and B, how can I generate a list of the tiles (or, rather, their coordinates) that a straight line between A and B would cross?
That would later be used for a variety of purposes, such as determining Line-of-sight, charge path legality, and so on.
BTW, this may be useful: my maps use (0,0,0) as a reference position. The 'jagging' of the map can be defined as offsetting each tile by ((y+z) mod 2) * tileSize/2.0 to the right of the position it would have in a "sane" cartesian system. For the non-jagged rows that yields 0; for rows where (y+z) mod 2 is 1, it yields half a tile.
I'm working on C#4 targeting the .Net Framework 4.0; but I don't really need specific code, just the algorithm to solve the weird geometric/mathematical problem. I have been trying for several days to solve this at no avail; and trying to draw the whole thing on paper to "visualize" it didn't help either :( .
Thanks in advance for any answer
Until one of the clever SOers turns up, here's my dumb solution. I'll explain it in 2D 'cos that makes it easier to explain, but it will generalise to 3D easily enough. I think any attempt to try to work this entirely in cell index space is doomed to failure (though I'll admit it's just what I think and I look forward to being proved wrong).
So you need to define a function to map from cartesian coordinates to cell indices. This is straightforward, if a little tricky. First, decide whether point(0,0) is the bottom left corner of cell(0,0) or the centre, or some other point. Since it makes the explanations easier, I'll go with bottom-left corner. Observe that any point(x,floor(y)==0) maps to cell(floor(x),0). Indeed, any point(x,even(floor(y))) maps to cell(floor(x),floor(y)).
Here, I invent the boolean function even, which returns True if its argument is an even integer. I'll use odd next: any point(x, odd(floor(y))) maps to cell(floor(x-0.5), floor(y)).
Now you have the basics of the recipe for determining lines-of-sight.
You will also need a function to map from cell(m,n) back to a point in cartesian space. That should be straightforward once you have decided where the origin lies.
Now, unless I've misplaced some brackets, I think you are on your way. You'll need to:
decide where in cell(0,0) you position point(0,0); and adjust the function accordingly;
decide where points along the cell boundaries fall; and
generalise this into 3 dimensions.
Depending on the size of the playing field you could store the cartesian coordinates of the cell boundaries in a lookup table (or other data structure), which would probably speed things up.
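The even/odd-row mapping described above might be sketched like this in 2D (Python for illustration, using the bottom-left-corner convention chosen earlier; the questioner's C# version would mirror it):

```python
import math

def point_to_cell(x, y):
    """Map a cartesian point to the (m, n) index of the staggered cell
    containing it, per the even/odd-row rule."""
    n = math.floor(y)
    if n % 2 == 0:                     # unshifted row
        return (math.floor(x), n)
    return (math.floor(x - 0.5), n)    # row shifted right by half a cell

def cell_to_point(m, n):
    """Bottom-left corner of cell (m, n)."""
    return (m + (0.5 if n % 2 else 0.0), float(n))

print(point_to_cell(1.2, 0.5))   # → (1, 0)
print(point_to_cell(1.2, 1.5))   # → (0, 1)  (odd row, shifted half a cell)
```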
Perhaps you can avoid all the complex math if you look at your problem in another way:
I see that you only shift your blocks (alternating) along the first axis, by half the block size. If you split your blocks in half along this axis, the above example becomes (with the shifts) a simple 9x4x4 cartesian coordinate system with regularly stacked blocks. Raytracing in that system becomes much simpler and less error-prone.
