How to detect a Triangle gesture with kinect? - algorithm

I am trying to implement a gesture recognition system that interprets the geometric gestures a user makes and draws them on screen.
I have some idea of how a circle can be recognized, but I have no clue how to get started with triangle recognition.
The data I have is the X and Y coordinates of all points the gesture passed through. I get this data by tracking the right hand.
I found something online called the Hough Transform, which is used for detecting lines, but I am not sure whether it will work for discrete collections of points.
Any ideas folks?

If you already have an x,y pair for the hand, the simplest thing that comes to mind is to try the $1 Unistroke Recognizer.
Another handy thing to look at is Dynamic Time Warping (DTW).
I've seen a fun Processing/SimpleOpenNI project called KineticSpace that makes
use of that technique and the full skeleton.
Since it's open-source, it might be worth having a peek.
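To give a feel for DTW on hand-tracking data, here is a minimal Python sketch. It is not taken from KineticSpace; the function names and the template dictionary are made up for illustration, and it assumes the strokes have already been scaled to a comparable size (a real recognizer would normally resample and normalise first).

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def classify(stroke, templates):
    """Return the name of the template closest to the stroke under DTW."""
    return min(templates, key=lambda name: dtw_distance(stroke, templates[name]))

# Hypothetical templates: a triangle outline and a straight line.
templates = {
    "triangle": [(0, 0), (1, 2), (2, 0), (0, 0)],
    "line": [(0, 0), (1, 0), (2, 0)],
}
stroke = [(0, 0), (0.5, 1), (1, 2), (1.5, 1), (2, 0), (1, 0), (0, 0)]
print(classify(stroke, templates))
```

Because DTW lets one sequence stretch against the other, the stroke does not need the same number of samples as the template, which is handy when the hand moves at uneven speed.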
I'd recommend trying the $1 Unistroke Recognizer first. You will probably
need to work out a way to mimic press/release, perhaps using
the sign of the hand's velocity on z (positive-to-negative and
negative-to-positive transitions).
HTH

You can look into space-filling curves. A space-filling curve reduces the two dimensions to one and reorders the points, while preserving some spatial information. Maybe you can then train on, or compare, the reordered 1D index with something like simulated annealing or ant colony optimization. Space-filling curves are also used in map tiling programs.
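As a rough illustration of the 2D-to-1D reduction, the sketch below uses a Z-order (Morton) curve, which is one of the simplest space-filling curves; the answer does not say which curve to use, so this choice, the function names and the cell size are all assumptions. It expects non-negative coordinates.

```python
def interleave_bits(x, y, bits=16):
    """Z-order (Morton) index of a non-negative integer grid cell (x, y)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return code

def stroke_to_indices(points, cell=1.0):
    """Map each (x, y) sample of a stroke to the Z-order index of its grid cell."""
    return [interleave_bits(int(x // cell), int(y // cell)) for x, y in points]

print(stroke_to_indices([(0.5, 0.5), (1.5, 0.2), (0.1, 1.9)]))
```

The resulting list is a 1D sequence in which nearby indices usually (not always) correspond to nearby cells, so it can be fed to whatever 1D comparison you prefer.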

Related

How to avoid hole filling in surface reconstruction?

I am using the Poisson surface reconstruction algorithm to reconstruct a triangulated mesh surface from points. However, Poisson will always generate a watertight surface, which fills all holes by interpolation.
For small holes that result from missing data, this hole filling is desirable. But for big holes, I do not want hole filling and just want the surface to remain open.
The figure above shows my idea: the left one is a point set with normals, the right one is the reconstructed surface. I want the top of this surface to remain open rather than getting the current watertight result.
Can anyone give me some advice on how to keep these big holes in Poisson surface reconstruction? Or are there other algorithms that could solve this?
P.S.
Based on the accepted answer to this question, I understand that surface reconstruction algorithms can be categorized as explicit or implicit. Poisson is an implicit one, and explicit ones handle the big-hole problem naturally. But since my point data is mostly sparse and noisy, I would prefer an implicit one like Poisson.
Your screenshots look like you might be using MeshLab's implementation, which is based on an older implementation that is not capable of trimming the surface.
The latest implementation, however, contains the SurfaceTrimmer that does exactly what you want. Take a look at the examples at the bottom of the page to see how to use it.
To use the SurfaceTrimmer program, you first have to run the SSDRecon program with --density to reconstruct a mesh surface; setting a trim value then removes the faces that fall below that threshold.
Below is a sample usage of these programs on the demo eagle data:
./SSDRecon --in eagle.points.ply --out eagle.screened.color.ply --depth 10 --density
./SurfaceTrimmer --in eagle.screened.color.ply --out eagle.screened.color.trimmed.ply --trim 7

Is there some well-known algorithm which turns user's drawings into smoothed shapes?

My requirements:
A user should be able to draw something by hand. Then after he takes off his pen (or finger) an algorithm smooths and transforms it into some basic shapes.
To get started I want to transform a drawing into a rectangle which resembles the original as much as possible. (Naturally this won't work if the user intentionally draws something else.) Right now I'm calculating an average x and y position, and I'm distinguishing between horizontal and vertical lines. But it's not yet a rectangle but some kind of orthogonal lines.
I wondered if there is some well-known algorithm for that, because I saw it a few times at some touchscreen applications. Do you have some reading tip?
Update: Maybe a pattern recognition algorithm would help me. There are some phones which ask the user to draw a pattern to unlock their keys.
P.S.: I think this question is not related to a particular programming language, but if you're interested, I will build a web application with RaphaelGWT.
The Douglas-Peucker algorithm is used in geography (to simplify a GPS track, for instance). I guess it could be used here as well.
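For reference, a compact recursive sketch of Douglas-Peucker in Python follows. The function names and the epsilon value are illustrative. One deliberate deviation: it measures distance to the segment rather than to the infinite line, so a closed stroke whose first and last points coincide does not cause a division by zero.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)   # degenerate segment (a == b)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, epsilon):
    """Drop points that lie within epsilon of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]                  # everything in between is noise
    # Keep the farthest point and simplify both halves around it.
    left = simplify(points[:index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right

wobbly_corner = [(0, 0), (1, 0.02), (2, -0.03), (3, 0), (3.02, 1), (3, 2)]
print(simplify(wobbly_corner, 0.1))
```

On a hand-drawn rectangle this should leave roughly the corner points, which you could then snap to horizontal and vertical edges.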
Based on your description I guess you're looking for a vectorization algorithm. Here are some pointers that might help you:
https://en.wikipedia.org/wiki/Image_tracing
http://outliner.codeplex.com/ - open source vectorizer of the edges in the raster pictures.
http://code.google.com/p/shapelogic/wiki/vectorization - describes different vectorization algorithm implementations
http://cardhouse.com/computer/vector.htm
There are a lot of resources on vectorization algorithms, and I'm sure you'll be able to find something that fits your needs. I don't know how complex these algorithms are to implement, though.

Looking for ways for a robot to locate itself in the house

I am hacking a vacuum cleaner robot to control it with a microcontroller (Arduino). I want to make it more efficient when cleaning a room. For now, it just goes straight and turns when it hits something.
But I have trouble finding the best algorithm or method for knowing its position in the room. I am looking for an idea that stays cheap (less than $100) and is not too complex (one that doesn't require a PhD thesis in computer vision). I can add some discrete markers in the room if necessary.
Right now, my robot has:
One webcam
Three proximity sensors (around 1 meter range)
Compass (not used for now)
Wi-Fi
Its speed can vary if the battery is full or nearly empty
A netbook Eee PC is embedded on the robot
Do you have any ideas for doing this? Does any standard method exist for this kind of problem?
Note: if this question belongs on another website, please move it, I couldn't find a better place than Stack Overflow.
The problem of figuring out a robot's position in its environment is called localization. Computer science researchers have been trying to solve this problem for many years, with limited success. One problem is that you need reasonably good sensory input to figure out where you are, and sensory input from webcams (i.e. computer vision) is far from a solved problem.
If that didn't scare you off: one of the approaches to localization that I find easiest to understand is particle filtering. The idea goes something like this:
You keep track of a bunch of particles, each of which represents one possible location in the environment.
Each particle also has an associated probability that tells you how confident you are that the particle really represents your true location in the environment.
When you start off, all of these particles might be distributed uniformly throughout your environment and be given equal probabilities. Here the robot is gray and the particles are green.
When your robot moves, you move each particle. You might also degrade each particle's probability to represent the uncertainty in how the motors actually move the robot.
When your robot observes something (e.g. a landmark seen with the webcam, a wifi signal, etc.) you can increase the probability of particles that agree with that observation.
You might also want to periodically replace the lowest-probability particles with new particles based on observations.
To decide where the robot actually is, you can either use the particle with the highest probability, the highest-probability cluster, the weighted average of all particles, etc.
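The steps above can be sketched in a few lines of Python. This is a toy, not a drop-in solution: the robot lives on a 1D track, its only sensor is a range reading to a wall at a known position, and instead of replacing only the weakest particles it resamples the whole set in proportion to the weights, which is a common variant. All names, noise levels and the simulated readings are assumptions.

```python
import math
import random

def likelihood(expected, measured, sigma):
    """Unnormalised Gaussian score: how well a particle explains a reading."""
    return math.exp(-0.5 * ((expected - measured) / sigma) ** 2)

def step(particles, move, measured, wall, rng, motion_sigma=0.05, sensor_sigma=0.2):
    """One move / weight / resample cycle for a robot on a 1D track."""
    # Move every particle, adding noise for the imperfect motors.
    moved = [p + move + rng.gauss(0.0, motion_sigma) for p in particles]
    # Particles whose predicted range to the wall matches the reading score higher.
    weights = [likelihood(wall - p, measured, sensor_sigma) for p in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)   # nothing agrees: fall back to uniform
    # Resample: likely particles are duplicated, unlikely ones die out.
    return rng.choices(moved, weights=weights, k=len(moved))

def estimate(particles):
    """After resampling all particles count equally, so the mean is the estimate."""
    return sum(particles) / len(particles)

rng = random.Random(0)
wall = 10.0
particles = [rng.uniform(0.0, wall) for _ in range(500)]   # no idea where we are
true_pos = 2.0
for _ in range(10):
    true_pos += 0.5                       # robot drives 0.5 m
    reading = wall - true_pos             # simulated, noise-free range reading
    particles = step(particles, 0.5, reading, wall, rng)
print(round(estimate(particles), 2), true_pos)
```

In a real room the particles would carry x, y and heading, and the weighting step would compare whatever you can actually observe (markers, proximity readings, Wi-Fi strength) against what each particle expects to see.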
If you search around a bit, you'll find plenty of examples: e.g. a video of a robot using particle filtering to determine its location in a small room.
Particle filtering is nice because it's pretty easy to understand. That makes implementing and tweaking it a little less difficult. There are other similar techniques (like Kalman filters) that are arguably more theoretically sound but can be harder to get your head around.
A QR Code poster in each room would not only make an interesting Modern art piece, but would be relatively easy to spot with the camera!
If you can place some markers in the room, using the camera could be an option. If 2 known markers have an angular displacement (left to right) then the camera and the markers lie on a circle whose radius is related to the measured angle between the markers. I don't recall the formula right off, but the arc segment (on that circle) between the markers will be twice the angle you see. If you have the markers at known height and the camera is at a fixed angle of inclination, you can compute the distance to the markers. Either of these methods alone can nail down your position given enough markers. Using both will help do it with fewer markers.
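For what it's worth, the two relations mentioned here are short enough to write down. By the inscribed angle theorem, a chord of length d seen under an angle θ lies on a circle of radius d / (2 sin θ), and a marker at a known height above the camera seen at elevation angle α is at horizontal distance h / tan α. A small Python sketch, with made-up function and parameter names and angles in radians:

```python
import math

def circle_radius(marker_separation, subtended_angle):
    """Radius of the circle through both markers and the camera.

    subtended_angle is the angle between the two markers as seen
    from the camera, in radians (inscribed angle theorem).
    """
    return marker_separation / (2.0 * math.sin(subtended_angle))

def ground_distance(height_above_camera, elevation_angle):
    """Horizontal distance to a marker mounted at a known height.

    elevation_angle is the angle above the horizontal at which the
    camera sees the marker, in radians.
    """
    return height_above_camera / math.tan(elevation_angle)

# Two markers 2 m apart seen 90 degrees apart: the camera is on a circle of radius 1 m.
print(circle_radius(2.0, math.pi / 2))
# A marker 1 m above the camera seen 45 degrees up is about 1 m away.
print(ground_distance(1.0, math.pi / 4))
```

A single pair of markers only puts you somewhere on that circle; a third marker (or the distance estimate) is what pins down the actual point.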
Unfortunately, those methods are imperfect due to measurement errors. You get around this by using a Kalman estimator to combine multiple noisy measurements into a good position estimate; you can then feed in some dead reckoning information (which is also imperfect) to refine it further. This part goes pretty deep into math, but I'd say it's a requirement for doing a great job at what you're attempting. You can do OK without it, but if you want an optimal solution (in terms of the best position estimate for a given input) there is no better way. If you actually want a career in autonomous robotics, this will play a large part in your future.
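To make the idea less intimidating, here is the scalar (one-dimensional) form of the Kalman predict and update steps in Python. It is only a sketch: a real robot would track x, y and heading with matrices, and the variable names and noise values below are assumptions.

```python
def kalman_predict(x, p, u, q):
    """Dead-reckoning step: move by u, uncertainty grows by process noise q."""
    return x + u, p + q

def kalman_update(x, p, z, r):
    """Measurement step: blend estimate x (variance p) with reading z (variance r)."""
    k = p / (p + r)                 # Kalman gain: how much to trust the reading
    return x + k * (z - x), (1.0 - k) * p

x, p = 0.0, 1.0                           # initial guess and its variance
x, p = kalman_update(x, p, z=1.0, r=1.0)  # marker-based position fix
x, p = kalman_predict(x, p, u=1.0, q=0.1) # drive forward one unit
print(x, p)
```

The gain k is what makes the estimate "optimal" in the sense described: a noisy reading (large r) moves the estimate only a little, a precise one moves it a lot.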
Once you can determine your position you can cover the room in any pattern you'd like. Keep using the bump sensor to help construct a map of obstacles and then you'll need to devise a way to scan incorporating the obstacles.
Not sure if you've got the math background yet, but here is the book:
http://books.google.com/books/about/Applied_optimal_estimation.html?id=KlFrn8lpPP0C
This doesn't replace the accepted answer (which is great, thanks!) but I might recommend getting a Kinect and using that instead of your webcam, either through Microsoft's recently released official drivers or with the hacked drivers if your EeePC doesn't have Windows 7 (presumably it does not).
That way the positioning will be improved by the 3D vision. Observing landmarks will now tell you how far away the landmark is, and not just where in the visual field that landmark is located.
Regardless, the accepted answer doesn't really address how to pick out landmarks in the visual field, and simply assumes that you can. While the Kinect drivers may already have feature detection included (I'm not sure) you can also use OpenCV for detecting features in the image.
One solution would be to use a strategy similar to "flood fill" (wikipedia). To get the controller to accurately perform sweeps, it needs a sense of distance. You can calibrate your bot using the proximity sensors: e.g. run motor for 1 sec = xx change in proximity. With that info, you can move your bot for an exact distance, and continue sweeping the room using flood fill.
Assuming you are not looking for a generalised solution, you may actually know the room's shape, size, potential obstacle locations, etc. When the bot leaves the factory there is no info about its future operating environment, which kind of forces it to be inefficient from the outset.
If that's your case, you can hardcode that info, and then use basic measurements (i.e. rotary encoders on the wheels plus the compass) to figure out its location in the room/house precisely. No need for Wi-Fi triangulation or crazy sensor setups in my opinion, at least for a start.
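A minimal dead-reckoning sketch along those lines, in Python for readability (on the Arduino it would be the same arithmetic in C). The encoder resolution, wheel diameter and the heading convention (radians, counter-clockwise from the x axis) are all assumptions, not values from the question.

```python
import math

def ticks_to_distance(ticks, ticks_per_rev, wheel_diameter):
    """Distance rolled by a wheel, from its rotary encoder tick count."""
    return ticks / ticks_per_rev * math.pi * wheel_diameter

def dead_reckon(x, y, distance, heading):
    """Advance the position estimate by `distance` along the compass heading."""
    return x + distance * math.cos(heading), y + distance * math.sin(heading)

# Hypothetical hardware: 20 ticks per revolution, 6.5 cm wheels.
x, y = 0.0, 0.0
d = ticks_to_distance(40, ticks_per_rev=20, wheel_diameter=0.065)
x, y = dead_reckon(x, y, d, heading=0.0)           # drive along x
x, y = dead_reckon(x, y, d, heading=math.pi / 2)   # turn left, drive along y
print(round(x, 3), round(y, 3))
```

The error accumulates with every step, so it helps to reset the estimate whenever the robot bumps into a wall whose position is hardcoded.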
Ever considered GPS? Every position on earth has unique GPS coordinates, with a resolution of 1 to 3 metres, and with differential GPS you can go down to the sub-10 cm range. More info here:
http://en.wikipedia.org/wiki/Global_Positioning_System
And Arduino does have lots of options of GPS-modules:
http://www.arduino.cc/playground/Tutorials/GPS
After you have collected the coordinates of all the key points of the house, you can write the routine for the Arduino to move the robot from point to point (as collected above), assuming it handles all the obstacle avoidance itself.
More information can be found here:
http://www.google.com/search?q=GPS+localization+robots&num=100
And inside the list I found this - specifically for your case: Arduino + GPS + localization:
http://www.youtube.com/watch?v=u7evnfTAVyM
I was thinking about this problem too. But I don't understand why you can't just triangulate. Have two or three beacons (e.g. IR LEDs of different frequencies) and a rotating IR sensor 'eye' on a servo. You could then get an almost constant fix on your position. I expect the accuracy would be in the low cm range and it would be cheap. You can then easily map anything you bump into.
Maybe you could also use any interruption in the beacon beams to plot objects that are quite far from the robot too.
You said you have a camera. Did you consider looking at the ceiling? There is little chance that two rooms have identical dimensions, so you can identify which room you are in. The position in the room can be computed from the angular distance to the borders of the ceiling, and the direction can probably be extracted from the position of the doors.
This will require some image processing, but since a vacuum cleaner has to move slowly to clean efficiently, it will have enough time to compute.
Good luck !
Use an ultrasonic sensor such as the HC-SR04 or similar.
As suggested above, sense the distance from the robot to the walls with the sensors, and identify the part of the room with a QR code.
When you are near a wall, turn 90 degrees, move forward by the width of your robot, turn 90 degrees again in the same direction (i.e. another 90 degree left turn) and move on again. I think it will help :)

Algorithm for following the path of ridges on a 3D image

I'm trying to find an algorithm (or algorithm ideas) for following a ridge on a 3D image, derived from a digital elevation model (DEM). I've managed to get a very basic program working which just iterates across each row of the image, marking a ridge line wherever it finds a large change in aspect (i.e. from < 180 degrees to > 180 degrees).
However, the lines this produces aren't brilliant, there are often gaps and various strange artefacts. I'm hoping to try and extend this by using some sort of algorithm to follow the ridge lines, thus producing lines that are complete (that is, no gaps) and more accurate.
A number of people have mentioned snake algorithms to me, but they don't seem to be quite what I'm looking for. I've also done a lot of searching about path-finding algorithms, but again, they don't seem to be quite the right thing.
Does anyone have any suggestions for types of algorithms or specific algorithms I should look at?
Update: I've been asked to add some more detail on the exact area I'll be applying this to. It's working with gridded elevation data of sand dunes. I'm trying to extract the crests of these sand dunes, which look similar to the boundaries between drainage basins, but can be far more complex (for example, there can be multiple sand dunes very close to each other with gradually merging crests).
You can get a good estimate of the ridges using sign changes of the curvature. Note that 1/curvature (the radius of curvature) will be near infinity in flat regions and near zero along sharp ridges. Hence possible pseudo-code for a ridge detection algorithm could be:
for each face in the mesh
    compute 1/curvature
    if abs(1/curvature) < zeroTolerance
        flag face as ridge
    else
        continue
(zeroTolerance is a number near but not equal to zero, e.g. 0.003)
Also Meshlab provides a module for normal & curvature estimation on most formats. You can test the idea using it, before you code it up.
I don't know what your data is like or how much automation you need. This won't work if it consists of peaks without clear ridges (but then you probably wouldn't be asking the question).
startPoint = highest point in DEM (or on ridge)
curPoint = startPoint
line += curPoint
Loop
    curPoint = highest point adjacent to curPoint not in line   // (don't backtrack)
    line += curPoint
Repeat
Curious what the real solution turns out to be.
Edited to add: depending on the coarseness of your data set, 'point' can be a single point or a smoothed average of a local region of points.
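The loop above translates to Python roughly as follows. The DEM is assumed to be a plain list of rows of heights. The pseudocode has no stopping rule, so the min_height cut-off is an addition of mine: the walk ends when no unvisited neighbour is at least that high.

```python
def follow_ridge(dem, start=None, min_height=float("-inf")):
    """Greedy walk from the highest cell, always stepping to the highest unvisited neighbour."""
    rows, cols = len(dem), len(dem[0])
    if start is None:
        start = max(((r, c) for r in range(rows) for c in range(cols)),
                    key=lambda rc: dem[rc[0]][rc[1]])
    line = [start]
    visited = {start}
    cur = start
    while True:
        r, c = cur
        # 8-connected neighbours that are inside the grid and not yet on the line.
        candidates = [(r + dr, c + dc)
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                      if (dr, dc) != (0, 0)
                      and 0 <= r + dr < rows and 0 <= c + dc < cols
                      and (r + dr, c + dc) not in visited
                      and dem[r + dr][c + dc] >= min_height]
        if not candidates:
            break                      # dead end or fell below the cut-off
        cur = max(candidates, key=lambda rc: dem[rc[0]][rc[1]])
        visited.add(cur)
        line.append(cur)
    return line

dem = [
    [1, 2, 1],
    [1, 5, 1],
    [1, 4, 1],
    [1, 3, 1],
]
print(follow_ridge(dem, min_height=2))
```

Note that the walk only heads one way from the start point; to get the whole crest you would run it a second time from the same start with the first line already marked as visited.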
http://en.wikipedia.org/wiki/Ridge_detection
You can treat the elevation as you would a grayscale color, then use a 2D edge recognition filter. There are lots of edge recognition methods available. The best would depend on your specific needs.

Raytracing (LoS) on 3D hex-like tile maps

Greetings,
I'm working on a game project that uses a 3D variant of hexagonal tile maps. Tiles are actually cubes, not hexes, but are laid out just like hexes (because a square can be turned to a cube to extrapolate from 2D to 3D, but there is no 3D version of a hex). Rather than a verbose description, here goes an example of a 4x4x4 map:
(I have highlighted an arbitrary tile (green) and its adjacent tiles (yellow) to help describe how the whole thing is supposed to work; but the adjacency functions are not the issue, that's already solved.)
I have a struct type to represent tiles, and maps are represented as a 3D array of tiles (wrapped in a Map class to add some utility methods, but that's not very relevant).
Each tile is supposed to represent a perfectly cubic space, and they are all exactly the same size. Also, the offset between adjacent "rows" is exactly half the size of a tile.
That's enough context; my question is:
Given the coordinates of two points A and B, how can I generate a list of the tiles (or, rather, their coordinates) that a straight line between A and B would cross?
That would later be used for a variety of purposes, such as determining Line-of-sight, charge path legality, and so on.
BTW, this may be useful: my maps use the (0,0,0) as a reference position. The 'jagging' of the map can be defined as offsetting each tile ((y+z) mod 2) * tileSize/2.0 to the right from the position it'd have on a "sane" cartesian system. For the non-jagged rows, that yields 0; for rows where (y+z) mod 2 is 1, it yields 0.5 tiles.
I'm working in C# 4 targeting the .NET Framework 4.0, but I don't really need specific code, just the algorithm to solve the weird geometric/mathematical problem. I have been trying for several days to solve this to no avail, and trying to draw the whole thing on paper to "visualize" it didn't help either :( .
Thanks in advance for any answer
Until one of the clever SOers turns up, here's my dumb solution. I'll explain it in 2D 'cos that makes it easier to explain, but it will generalise to 3D easily enough. I think any attempt to try to work this entirely in cell index space is doomed to failure (though I'll admit it's just what I think and I look forward to being proved wrong).
So you need to define a function to map from cartesian coordinates to cell indices. This is straightforward, if a little tricky. First, decide whether point(0,0) is the bottom left corner of cell(0,0) or the centre, or some other point. Since it makes the explanations easier, I'll go with bottom-left corner. Observe that any point(x,floor(y)==0) maps to cell(floor(x),0). Indeed, any point(x,even(floor(y))) maps to cell(floor(x),floor(y)).
Here, I invent the boolean function even, which returns True if its argument is an even integer. I'll use odd next: any point(x,odd(floor(y))) maps to cell(floor(x-0.5),floor(y)).
Now you have the basics of the recipe for determining lines-of-sight.
You will also need a function to map from cell(m,n) back to a point in cartesian space. That should be straightforward once you have decided where the origin lies.
Now, unless I've misplaced some brackets, I think you are on your way. You'll need to:
decide where in cell(0,0) you position point(0,0); and adjust the function accordingly;
decide where points along the cell boundaries fall; and
generalise this into 3 dimensions.
Depending on the size of the playing field you could store the cartesian coordinates of the cell boundaries in a lookup table (or other data structure), which would probably speed things up.
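Here is how the 2D version of that recipe might look, written in Python for brevity (it ports directly to C#). It assumes unit-sized tiles, point(0,0) at the bottom-left corner of cell(0,0), and odd rows shifted right by half a tile. The line walk simply samples points along the segment, which is an approximation: with a coarse step it can miss a tile that the line only clips at a corner.

```python
import math

def point_to_cell(x, y):
    """Map a cartesian point to (column, row) on the half-offset grid."""
    row = math.floor(y)
    shift = 0.5 if row % 2 else 0.0        # odd rows sit half a tile to the right
    return (math.floor(x - shift), row)

def cell_to_point(col, row):
    """Cartesian centre of a cell (inverse of point_to_cell for cell centres)."""
    shift = 0.5 if row % 2 else 0.0
    return (col + shift + 0.5, row + 0.5)

def cells_on_line(a, b, step=0.05):
    """Cells visited by the segment a-b, found by sampling every `step` units."""
    dist = math.hypot(b[0] - a[0], b[1] - a[1])
    n = max(1, math.ceil(dist / step))
    cells = []
    for i in range(n + 1):
        t = i / n
        cell = point_to_cell(a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if not cells or cells[-1] != cell:
            cells.append(cell)             # record each cell once, in order
    return cells

print(cells_on_line((0.25, 0.5), (2.75, 0.5)))
print(cells_on_line((0.75, 0.5), (0.75, 1.5)))
```

For 3D, add a z coordinate and use (y + z) mod 2 instead of the row parity to decide the shift. An exact grid traversal would be more robust than sampling, but sampling is easy to get right first.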
Perhaps you can avoid all the complex math if you look at your problem in another way:
I see that you only shift your blocks (alternating) along the first axis by half the block size. If you split your blocks in half along this axis, the above example becomes (with shifts) a simple (9x4x4) cartesian coordinate system with regularly stacked blocks. Now doing the raytracing becomes much simpler and less error-prone.
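A small Python sketch of that conversion, assuming the jagging rule from the question (rows where (y+z) mod 2 is 1 are shifted right by half a tile). The function names are made up. Each tile covers two half-cells along x, so a 4-wide map needs 9 half-cells.

```python
def tile_to_half_cells(x, y, z):
    """The two cells of the regular half-width grid covered by tile (x, y, z)."""
    shift = (y + z) % 2                   # 1 for the jagged rows, 0 otherwise
    return [(2 * x + shift, y, z), (2 * x + shift + 1, y, z)]

def half_cell_to_tile(hx, y, z):
    """Tile that owns a half-width cell; the x may fall outside the map at the edges."""
    shift = (y + z) % 2
    return ((hx - shift) // 2, y, z)

print(tile_to_half_cells(3, 1, 0))
print(half_cell_to_tile(7, 1, 0))
```

You can then run any standard ray traversal on the regular grid, convert each visited half-cell back with half_cell_to_tile, and drop consecutive duplicates and out-of-range tiles.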
