How to determine visibility in 2D - algorithm

I'm developing an AI sandbox and I would like to calculate what every living entity can see.
The rule is to simply hide what's behind the edges of the shapes from the point of view of the entity. The image clarifies everything:
alt text http://img231.imageshack.us/img231/2972/shadows.png
I need it either as an input to the artificial intelligence either graphically, to show it for a specific entity while it moves..
Any cool ideas?

This isn't the fastest algorithm but it produces a polygon which might be useful for further analysis by your AI:
For each line segment, calculate the angle between the entity and the two endpoints.
Sort the points by angle.
“Sweep” around 360°, keeping track of which line segments intersect with the sweep line. When you cross the beginning-of-segment point, you add that segment to the set; when you cross the end-of-segment point, you remove that segment from the set.
The closest line segments form a polygon of what's visible. The polygon is the union of triangle slivers.
I realize this explanation isn't great, but I have an online demo here that you can play with to get a sense for how it works. Extending it to work with circles probably isn't too bad (famous last words).

If you're using simple shapes to block the entity's view, there is an easy way to do this that I have implemented:
Create a VisionWave object which can move either horizontally or vertically. You can define a VisionWave using a source coordinate, two lines that intersect that point, and a distance from the source point.
You should have 4 waves: one going up, one down, one left, and one right, and the lines that define them should have a slope of 1 and -1 (i.e., an X). My crude drawing below shows one wave (going right) as represented by the > character.
\ /
\ />
\ / >
# >
/ \ >
/ \>
/ \
Make a loop that propagates each wave one pixel at a time. When you propagate the wave, you want to do the following:
Mark every pixel that the wave is touching as visible.
If any of the pixels that the wave touches block light, then you want to split the wave into two, and recursively propagate each one.
I implemented a system like this in my Roguelike and it is very fast. Make sure to profile your code for bottlenecks.
If you're very clever you might try circular waves instead of straight lines, but I don't know if it would be faster due to the trigonometric calculations.

Determine which vertices are visible to your eye point, then project those vertices back away from the eye point on a straight line to make new vertices. Close the shape and you will have created a polygon representing the invisible area.
See shadow volumes for the 3D equivalent.

Related

How to identify if a set of lines is similar to a shape

Currently I have a program that allows the user to paint on it by capturing the mouse position every 0.05 seconds and drawing a line between a point and the next. With that setup I am looking for a way to identify shapes like a circle, a rectangle or the letter 'P'.
My current algorithm divides the screen on sections, then marks the sections with points recorded by the player and makes a matrix with the marked sections, then compares that matrix with every shape matrix.
This lacks any kind of support for rotations, sizes or positions. Also the control of the threshold is tricky returning in most cases fake results.
I need an algorithm that allows to identify for example a ' P ' as a ' P '.
Note: My current application is running on a c++ framework so any libraries or tools are welcome but I am interested on the algorithm behind.
Edit: After thinking around the problem I have changed the current grid on the screen, instead of that I capture the points and shift them to resize
the shape so it fits on a grid and over that grid compare with the known shapes.
Picture of the process
This solves the position and size problems while being fast enough, also rotating the input and then resizing in a loop may solve the rotation problem (seems though it would have an high cost and won't be very reliable)
I would gladly welcome alternative methods of handling shape comparison or the rotation.
After thinking around the problem I have changed the current grid on the screen, instead of that I capture the points and shift them to resize
the shape so it fits on a grid and over that grid compare with the known shapes.
Picture of the process
This solves the position and size problems while being fast enough, also rotating the input and then resizing in a loop may solve the rotation problem (seems though it would have an high cost and won't be very reliable)

Game Maker - Touch Event

I´m making my first game in Game Maker.
In the game i need to the user to draw a figure, for example a rectangle, and the game has to recognize the figure. How can i do this?
Thanks!
Well, that is a pretty complex task. To simplify it, you could ask him to place a succession of points, using the mouse coordinates in the click event, and automatically connect them with lines. If you store every point in the same ds_list structure, you will be able to check conditions of angle, distance, etc. This way, you can determine the shape. May I ask why you want to do this ?
The way I would solve this problem is pretty simple. I would create a few variables for each point when someone clicked on one of the points it would equal true. and wait for the player to click on the next point. If the player clicked on the next point i would call in a sprite as a line using image_angle to line both points up and wait for the player to click the next point.
Next I would have a step event waiting to see if all points were clicked and when they were then to either draw a triangle at those coordinates or place an sprite at the correct coordinates to fill in the triangle.
Another way you could do it would be to decide what those points would be and check against mouse_x, and mouse_y to see if that was a point and if it was then do as above. There are many ways to solve this problem. Just keep trying you will find one that works for your skill level and what you want to do.
You need to use draw_rectangle(x1, y1, x2, y2, outline) function. As for recognition of the figure, use point_in_rectangle(px, py, x1, y1, x2, y2).
I'm just wondering around with ideas cause i can't code right now. But listen to this, i think this could work.
We suppose that the user must keep his finger on touchscreen or an event is triggered and all data from the touch event is cleaned.
I assume that in future you could need to recognize other simple geometrical figures too.
1 : Set a fixed amount of pixels of movement defined dependent on the viewport dimension (i'll call this constant MOV from now on), for every MOV you store in a buffer (pointsBuf) the coordinates of the point where the finger is.
2 : Everytime a point is stored you calculate the average of either X and Y coordinates for every point. (Hold the previous average and a counter to reduce time complexity). Comparing them we now can know the direction and versus of the line. Store them in a 2D buffer (dirVerBuf).
3 : If a point is "drastically" different from the most plain average between the X and Y coordinates we can assume that the finger changed direction. This is where the test part of MOV comes critical, we must assure to calculate an angle now. Since only a Parkinsoned user would make really distorted lines we can assume at 95% that we're safe to take the 2nd point that didn't changed the average of the coordinate as vertex and let's say the last and the 2nd point before vertex to calculate the angle. You have now one angle. Test the best error margin of the user to find if the angle is about to be a 90, 60, 45, ecc.. degrees angle. Store in a new buffer (angBuf)
4 : Delete the values from pointsBuf and repeat step 2 and 3 until the user's finger leaves the screen.
5 : if four of the angles are of 90 degrees, the 4 versus and two of the directions are different, the last point is somewhat near (depending from MOV) the first angle stored and the two X lines and the Y lines are somewhat equal, but of different length between them, then you can connect the 4 angles using the four best values next to the 4 coordinates to make perfect rectangular shape.
It's late and i could have forgotten something, but with this method i think you could even figure out a triangle, a circle, ecc..
With just some edit and confronting.
EDIT: If you are really lazy you could instead use a much more space complexity heavy strategy. Just create a grid of rectangles or even triangles of a fixed dimension and check which one the finger has touched, connect their centers after you'have figured out the shape, obviously ignoring the "touched for mistake" ones. This would be extremely easy to draw even circles using the native functions. Gg.

distinguishing objects with opencv

I want to identify lego bricks for building a lego sorting machine (I use c++ with opencv).
That means I have to distinguish between objects which look very similar.
The bricks are coming to my camera individually on a flat conveyer. But they might lay in any possible way: upside down, on the side or "normal".
My approach is to teach the sorting machine the bricks by taping them with the camera in lots of different positions and rotations. Features of each and every view are calculated by surf-algorythm.
void calculateFeatures(const cv::Mat& image,
std::vector<cv::KeyPoint>& keypoints,
cv::Mat& descriptors)
{
// detector == cv::SurfFeatureDetector(10)
detector->detect(image,keypoints);
// extractor == cv::SurfDescriptorExtractor()
extractor->compute(image,keypoints,descriptors);
}
If there is an unknown brick (the brick that i want to sort) its features also get calculated and matched with known ones.
To find wrongly matched features I proceed as described in the book OpenCV 2 Cookbook:
with the matcher (=cv::BFMatcher(cv::NORM_L2)) the two nearest neighbours in both directions are searched
matcher.knnMatch(descriptorsImage1, descriptorsImage2,
matches1,
2);
matcher.knnMatch(descriptorsImage2, descriptorsImage1,
matches2,
2);
I check the ratio between the distances of the found nearest neighbours. If the two distances are very similar it's likely that a false value is used.
// loop for matches1 and matches2
for(iterator matchIterator over all matches)
if( ((*matchIterator)[0].distance / (*matchIterator)[1].distance) > 0.65 )
throw away
Finally only symmatrical match-pairs are accepted. These are matches in which not only n1 is the nearest neighbour to feature f1, but also f1 is the nearest neighbour to n1.
for(iterator matchIterator1 over all matches)
for(iterator matchIterator2 over all matches)
if ((*matchIterator1)[0].queryIdx == (*matchIterator2)[0].trainIdx &&
(*matchIterator2)[0].queryIdx == (*matchIterator1)[0].trainIdx)
// good Match
Now only pretty good matches remain. To filter out some more bad matches I check which matches fit the projection of img1 on img2 using the fundamental matrix.
std::vector<uchar> inliers(points1.size(),0);
cv::findFundamentalMat(
cv::Mat(points1),cv::Mat(points2), // matching points
inliers,
CV_FM_RANSAC,
3,
0.99);
std::vector<cv::DMatch> goodMatches
// extract the surviving (inliers) matches
std::vector<uchar>::const_iterator itIn= inliers.begin();
std::vector<cv::DMatch>::const_iterator itM= allMatches.begin();
// for all matches
for ( ;itIn!= inliers.end(); ++itIn, ++itM)
if (*itIn)
// it is a valid match
The result is pretty good. But in cases of extreme alikeness faults still occur.
In the picture above you can see that a similar brick is recognized well.
However in the second picture a wrong brick is recognized just as well.
Now the question is how I could improve the matching.
I had two different ideas:
The matches in the second picture trace back to the features really fitting, but only if the visual field is intensely changed. To recognize a brick I have to compare it in many different positions anyway (at least as shown in figure three). This means I know that I am only allowed to minimally change the visual field. The information how intensely the visual field is changed should be hidden in the fundamental matrix. How can I read out of this matrix how far the position in the room has changed? Especially the rotation and strong scaling should be of interest; if the brick once is taped farer on the left side this shouldn't matter.
Second idea:
I calculated the fundamental matrix out of 2 pictures and filtered out features that don't fit the projections - shouldn't there be a way to do the same using three or more pictures? (keyword Trifocal tensor). This way the matching should become more stable. But I neither know how to do this using OpenCV nor could I find any information on this on google.
I don't have a complete answer, but I have a few suggestions.
On the image analysis side:
It looks like your camera setup is pretty constant. Easy to just separate the brick from the background. I also see your system finding features in the background. This is unnecessary. Set all non-brick pixels to black to remove them from the analysis.
When you have located just the brick, your first step should be to just filter likely candidates based on the size (i.e. number of pixels) in the brick. That way the example faulty match you show is already less likely.
You can take other features into account such as the aspect ratio of the bounding box of the brick, the major and minor axes (eigevectors of the covariance matrix of the central moments) of the brick etc.
These simpler features will give you a reasonable first filter to limit your search space.
On the mechanical side:
If bricks are actually coming down a conveyor you should be able to "straighten" the bricks along a straight edge using something like a rod that lies at an angle to the direction of the conveyor across the belt so that the bricks arrive more uniformly at your camera like so.
Similar to the previous point, you could use something like a very loose brush suspended across the belt to topple bricks standing up as they pass.
Again both these points will limit your search space.

Best approach for specific Object/Image Recognition task?

I'm searching for an certain object in my photograph:
Object: Outline of a rectangle with an X in the middle. It looks like a rectangular checkbox. That's all. So, no fill, just lines. The rectangle will have the same ratios of length to width but it could be any size or any rotation in the photograph.
I've looked a whole bunch of image recognition approaches. But I'm trying to determine the best for this specific task. Most importantly, the object is made of lines and is not a filled shape. Also, there is no perspective distortion, so the rectangular object will always have right angles in the photograph.
Any ideas? I'm hoping for something that I can implement fairly easily.
Thanks all.
You could try using a corner detector (e.g. Harris) to find the corners of the box, the ends and the intersection of the X. That simplifies the problem to finding points in the right configuration.
Edit (response to comment):
I'm assuming you can find the corner points in your image, the 4 corners of the rectangle, the 4 line endings of the X and the center of the X, plus a few other corners in the image due to noise or objects in the background. That simplifies the problem to finding a set of 9 points in the right configuration, out of a given set of points.
My first try would be to look at each corner point A. Then I'd iterate over the points B close to A. Now if I assume that (e.g.) A is the upper left corner of the rectangle and B is the lower right corner, I can easily calculate, where I would expect the other corner points to be in the image. I'd use some nearest-neighbor search (or a library like FLANN) to see if there are corners where I'd expect them. If I can find a set of points that matches these expected positions, I know where the symbol would be, if it is present in the image.
You have to try if that is good enough for your application. If you have too many false positives (sets of corners of other objects that accidentially form a rectangle + X), you could check if there are lines (i.e. high contrast in the right direction) where you would expect them. And you could check if there is low contrast where there are no lines in the pattern. This should be relatively straightforward once you know the points in the image that correspond to the corners/line endings in the object you're looking for.
I'd suggest the Generalized Hough Transform. It seems you have a fairly simple, fixed shape. The generalized Hough transform should be able to detect that shape at any rotation or scale in the image. You many need to threshold the original image, or pre-process it in some way for this method to be useful though.
You can use local features to identify the object in image. Feature detection wiki
For example, you can calculate features on some referent image which contains only the object you're looking for and save the results, let's say, to a plain text file. After that you can search for the object just by comparing newly calculated features (on images with some complex scenes containing the object) with the referent ones.
Here's some good resource on local features:
Local Invariant Feature Detectors: A Survey

Raytracing (LoS) on 3D hex-like tile maps

Greetings,
I'm working on a game project that uses a 3D variant of hexagonal tile maps. Tiles are actually cubes, not hexes, but are laid out just like hexes (because a square can be turned to a cube to extrapolate from 2D to 3D, but there is no 3D version of a hex). Rather than a verbose description, here goes an example of a 4x4x4 map:
(I have highlighted an arbitrary tile (green) and its adjacent tiles (yellow) to help describe how the whole thing is supposed to work; but the adjacency functions are not the issue, that's already solved.)
I have a struct type to represent tiles, and maps are represented as a 3D array of tiles (wrapped in a Map class to add some utility methods, but that's not very relevant).
Each tile is supposed to represent a perfectly cubic space, and they are all exactly the same size. Also, the offset between adjacent "rows" is exactly half the size of a tile.
That's enough context; my question is:
Given the coordinates of two points A and B, how can I generate a list of the tiles (or, rather, their coordinates) that a straight line between A and B would cross?
That would later be used for a variety of purposes, such as determining Line-of-sight, charge path legality, and so on.
BTW, this may be useful: my maps use the (0,0,0) as a reference position. The 'jagging' of the map can be defined as offsetting each tile ((y+z) mod 2) * tileSize/2.0 to the right from the position it'd have on a "sane" cartesian system. For the non-jagged rows, that yields 0; for rows where (y+z) mod 2 is 1, it yields 0.5 tiles.
I'm working on C#4 targeting the .Net Framework 4.0; but I don't really need specific code, just the algorithm to solve the weird geometric/mathematical problem. I have been trying for several days to solve this at no avail; and trying to draw the whole thing on paper to "visualize" it didn't help either :( .
Thanks in advance for any answer
Until one of the clever SOers turns up, here's my dumb solution. I'll explain it in 2D 'cos that makes it easier to explain, but it will generalise to 3D easily enough. I think any attempt to try to work this entirely in cell index space is doomed to failure (though I'll admit it's just what I think and I look forward to being proved wrong).
So you need to define a function to map from cartesian coordinates to cell indices. This is straightforward, if a little tricky. First, decide whether point(0,0) is the bottom left corner of cell(0,0) or the centre, or some other point. Since it makes the explanations easier, I'll go with bottom-left corner. Observe that any point(x,floor(y)==0) maps to cell(floor(x),0). Indeed, any point(x,even(floor(y))) maps to cell(floor(x),floor(y)).
Here, I invent the boolean function even which returns True if its argument is an even integer. I'll use odd next: any point point(x,odd(floor(y)) maps to cell(floor(x-0.5),floor(y)).
Now you have the basics of the recipe for determining lines-of-sight.
You will also need a function to map from cell(m,n) back to a point in cartesian space. That should be straightforward once you have decided where the origin lies.
Now, unless I've misplaced some brackets, I think you are on your way. You'll need to:
decide where in cell(0,0) you position point(0,0); and adjust the function accordingly;
decide where points along the cell boundaries fall; and
generalise this into 3 dimensions.
Depending on the size of the playing field you could store the cartesian coordinates of the cell boundaries in a lookup table (or other data structure), which would probably speed things up.
Perhaps you can avoid all the complex math if you look at your problem in another way:
I see that you only shift your blocks (alternating) along the first axis by half the blocksize. If you split up your blocks along this axis the above example will become (with shifts) an (9x4x4) simple cartesian coordinate system with regular stacked blocks. Now doing the raytracing becomes much more simple and less error prone.

Resources