Google Tango: Aligning Depth and Color Frames - google-project-tango

I would like to align a (synchronous) depth/color frame pair, using the Google Tango tablet, such that, assuming that both frames have the same resolution, each pixel in the depth frame corresponds to the same pixel in the color frame, i.e., I would like to achieve a retinotopic mapping. How can this be achieved using the latest C API (Hilbert Release Version 1.6)? Any help on this will be greatly appreciated.

Generating simple, crude UV coordinates to map Tango point cloud points back onto the source image (texture coordinates) - see the comments above for more details; we've rather messed this thread up :-( (The language is C#, the classes are .NET.) FieldOfView(true) returns the horizontal field of view, FieldOfView(false) the vertical one.
public PointF PictureUV(Vector3D imagePlaneLocation)
{
    // u is a function of x where y is 0
    double u = Math.Atan2(imagePlaneLocation.X, imagePlaneLocation.Z);
    u += (FieldOfView(true) / 2.0);
    u = u / FieldOfView(true);
    // v is built the same way from the vertical field of view
    double v = Math.Atan2(imagePlaneLocation.Y, imagePlaneLocation.Z);
    v += (FieldOfView() / 2.0);
    v = v / FieldOfView();
    // flip v so the texture origin is at the top left
    return new PointF((float)u, (float)(1.0 - v));
}

Mark, thanks for your quick response. My question was probably a bit imprecise. You are of course right that a retinotopic mapping between a 2D and a 3D image cannot be established. Shame on me. Nonetheless,
what I need is a mapping in which every depth sample (x_n, y_n, d_n), 1 <= n <= N, N being the number of depth values, corresponds to the same pixel (x_n, y_n) in the (synchronized) color frame. It is well taken that the depth sensor cannot provide depth information for troublesome areas in the visual field.

One of your conditions is not possible - there is no guarantee that Tango will hand you a point cloud measurement for something in the visual field if it has trouble seeing it. Also, there isn't a 1:1 correspondence between color pixels and the depth frame, since the depth info is a 3D point cloud rather than an image.

I have not tried this, but we can probably do the following. For each point (X, Y, Z) of the point cloud, first compute the normalized image-plane coordinates:
x = X/Z, y = Y/Z
Then apply the radial distortion correction (k1, k2, k3 come from the distortion[] part of TangoCameraIntrinsics, and r^2 = x^2 + y^2):
x_corrected = x * (1 + k1 * r^2 + k2 * r^4 + k3 * r^6)
y_corrected = y * (1 + k1 * r^2 + k2 * r^4 + k3 * r^6)
Then convert the normalized x_corrected, y_corrected to raster coordinates using the reverse of the usual pixel-to-normalized formula (x = (u - cx)/Fx), i.e.
x_raster = x_corrected * Fx + cx, y_raster = y_corrected * Fy + cy
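Untested, but here is the same recipe as a small Python sketch. The intrinsics (fx, fy, cx, cy) and radial distortion coefficients (k1, k2, k3) are assumed to come from the color camera's TangoCameraIntrinsics, and the point is assumed to already be expressed in the color camera frame:
def point_to_pixel(X, Y, Z, fx, fy, cx, cy, k1, k2, k3):
    # Normalized image-plane coordinates (pinhole model, Z pointing forward)
    x = X / Z
    y = Y / Z
    # 3-parameter radial (Brown) distortion
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2 + k3 * r2 * r2 * r2
    x_corrected = x * scale
    y_corrected = y * scale
    # Back to raster (pixel) coordinates
    u = x_corrected * fx + cx
    v = y_corrected * fy + cy
    return u, v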

Is there an Octave function or general solution for locating dead space in a scatter plot?

I deal with plots of data on the order of half a million points in Octave. I am trying to find the centers of the empty spaces that are (deliberately) present in the data.
I know how many such spaces to look for. My idea was to feed in starting locations and then expand a circle in one direction until it hits valid data points, and keep doing that in a few directions until I have a circle that contains no data but touches valid data points. The center of that circle would be the center of the void. I'm not entirely sure how to write that, since I'm very green at coding.
Obviously a graphical solution probably isn't the best method, but I don't know how else to find big x and y gaps in a huge matrix of x/y locations.
[Images: a section of the data showing the hole whose center I'm trying to find, plus a larger sample of the data set. Each data point is an x and y location with a z height that isn't relevant here; the values do not line up at consistent intervals.]
I know you said your data does not line up in x or y, but it still seems suspiciously grid-like.
In that case, you can probably express each grid point as a 'pixel' in an image; this gives you access to excellent functions from the image package, such as imregionalmin. This will give you the connected components of 'holes', in your case. For each component you can find its centre of mass easily by taking the average coordinate over the pixels within that component. You can then perform a distance transform (e.g. using bwdist) to find the radius of the circle you describe, as the distance from that centre of mass to the nearest non-hole pixel. Alternatively, you can start with bwdist and then use immaximas to detect the centres directly. If you have multiple such regions, you can use bwconncomp to find the connected components first (or run it over the output of imregionalmin).
If your data is not actually grid-like, you could probably interpolate it onto such a grid first.
Example:
pkg load image
t = 0 : 0.1 : 2 * pi; % for use when plotting circles later
[X0, Y0] = ndgrid( 1:100, 1:100 ); % Create 'index' grid
X = X0 - 0.25 * Y0; Y = 0.25 * X0 + Y0; % Create transformed grid
Z = 0.5 * (X0 - 50) .^ 2 + (Y0 - 50) .^ 2 > 250; % Assign a logical value to each 'index' point on grid
M = imregionalmin ( Z ); % Find 'hole' as mask
C = { round(mean(X0(M))), round(mean(Y0(M))) }; % Find centre of mass (as index)
R = bwdist( ~M )(C{:}); % Find distance from centre of mass to nearest pixel
R = min( abs( X(C{1}+R, C{2}) - X(C{:}) ), abs( Y(C{1}, C{2}+R) - Y(C{:}) ) ); % Adjust for transformed grid
figure(1); hold on
plot( X(Z), Y(Z), '.', 'markerfacecolor', 'b' ) % Draw original transformed grid data
plot( X(C{:}), Y(C{:}), 'o', 'markerfacecolor', 'r' ); % Draw centre of mass in transformed grid
plot( X(C{:}) + R * cos(t), Y(C{:}) + R * sin(t), 'r-' ) % Draw optimal circle on top
axis equal; hold off

Distance to Object Webcam C920HD or use OpenCV calibrate.py

I am trying to determine the distance and height of an object relative to my camera. Is this possible with the data below, or do I need to use OpenCV's calibrate.py to gather more information? I am confused because the Logitech C920HD has a 3 MP sensor and scales to 15 MP via software.
I have following info:
Resolution (pixel): 1920x1080
Focal Length (mm): 3.67mm
Pixel Size (µm): 3.98
Sensor Size (inches): 1/2.88
Object real height (mm): 180
Object image height (px): 370
I checked this formula:
distance (mm) = 3.67 (mm) * 180 (mm) * 1080 (px) / ( 511 (px) * (1/2.88) * 25.4 (mm/inch) )
which gives me 15.8 cm, although it should be about 60 cm.
What am I doing wrong?
Thanks for help!
Your formula looks correct; however, for it to hold over the entire image plane, you should correct lens distortions first, e.g. following the answer to
Camera calibration, reverse projection of pixel to direction
Along the way, the OpenCV lens calibration module will also estimate your true focal length.
Filling in the formula gives
Distance = 3.67 mm * 180 mm * 1080/511 / sensor_height_mm = 1396 mm^2 / sensor_height_mm
leaving sensor_height_mm unknown. Given that your camera has a 16:9 format:
w^2 + h^2 = D^2
(16x)^2+(9x)^2 = D^2
<=>
x = sqrt( D^2/337 )
<=>
h = 9x = 9*sqrt( D^2/337 )
Remember the rule of 16:
https://photo.stackexchange.com/questions/24952/why-is-a-1-sensor-actually-13-2-%C3%97-8-8mm/24954
Most importantly, a 1/2.88" sensor has an image circle diameter of 16/2.88 mm, not 25.4/2.88 mm - the "inch" in sensor sizes is not a true inch. Thus the sensor diagonal is
D = 16 mm/ 2.88 = 5.556 mm
and
sensor_height_mm = h = 2.72 mm
giving
Distance = 513 mm
Note that this distance is measured with respect to the lens's first principal point, not the sensor position or the lens's front element.
As you correct the barrel distortion, the reading should get more accurate; the distortion is quite strong for this camera (I have a similar one).
Hope this helps.
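For reference, here is the arithmetic above as a small Python sketch (the numbers are the ones from the question, and 511 px is the object height used in the formula):
import math
f_mm = 3.67             # focal length
obj_height_mm = 180.0   # real object height
img_height_px = 1080    # image height in pixels
obj_height_px = 511     # object height in pixels, as used in the formula above
# Rule of 16: a 1/2.88" sensor has a 16/2.88 mm image circle (diagonal)
diag_mm = 16.0 / 2.88
sensor_height_mm = 9.0 * math.sqrt(diag_mm ** 2 / 337.0)   # 16:9 aspect ratio
distance_mm = f_mm * obj_height_mm * img_height_px / (obj_height_px * sensor_height_mm)
print(round(sensor_height_mm, 2), round(distance_mm))       # -> 2.72 513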

Finding initial speed and angle to hit a known position (parabolic trajectory)

I am currently making a small turn-based cannon game with XNA 4.0. The game is very simple: the player chooses the speed and angle at which to shoot his rocket in order to hit another player. There is also a randomly generated wind vector that affects the X trajectory of the rocket. I would like to add an AI so that the player can play against the computer in a single-player mode.
The way I would like to implement the AI is very simple: find the velocity and angle that would make the rocket hit the player directly, and add a random modifier to those fields so that the AI doesn't hit another player each time.
This is the code I use in order to update the position and speed of the rocket:
Vector2 gravity = new Vector2(0, (float)400); // 400 is the sweet-spot value I have found works best for gravity
Vector2 totalAcceleration = gravity + _terrain.WindDirection;
float deltaT = (float)gameTime.ElapsedGameTime.TotalSeconds; // Elapsed time since the last Update() call
foreach (Rocket rocket in _instantiatedRocketList)
{
    rocket.RocketSpeed += Vector2.Multiply(gravity, deltaT);                // Only changes the Y component
    rocket.RocketSpeed += Vector2.Multiply(_terrain.WindDirection, deltaT); // Only changes the X component
    rocket.RocketPosition += Vector2.Multiply(rocket.RocketSpeed, deltaT) + Vector2.Multiply(totalAcceleration, (float)0.5) * deltaT * deltaT;
    // Update the angle of the rocket accordingly
    rocket.RocketAngle = (float)Math.Atan2(rocket.RocketSpeed.X, -rocket.RocketSpeed.Y);
    rocket.CreateSmokeParticles(3);
}
I know that the basic equations to find the final X and Y coordinates are:
X = V0 * cos(theta) * totalFlightTime
Y = V0 * sin(theta) * totalFlightTime - 0.5 * g * totalFlightTime^2
where X and Y are the coordinates of the player I want to hit, V0 is the initial speed, theta is the angle at which the rocket is shot, totalFlightTime is, as the name says, the total flight time of the rocket until it reaches (X, Y), and g is the gravity (400 in my game).
Questions:
What I am having problems with is knowing where to add the wind in those formulas (is it just a matter of adding "+ windDirection * totalFlightTime" to the X equation?), and also what to do with these equations in order to find the initial speed and the theta angle, since there are 3 unknowns (V0, theta and totalFlightTime) and only 2 equations.
Thanks for your time.
You can do this as follows:
Assuming there is no specific limit to V0 (i.e. the robot can fire the rocket at any desired speed) and using the substitutions
T = totalFlightTime
Vx = V0 * cos(theta)
Vy = V0 * sin(theta)
Choose an arbitrary value for Vx. Now your first equation simplifies to
X = Vx * T, so T = X / Vx
to solve for T. Now substitute that value of T into the second equation and solve for Vy:
Y = Vy * T - g * T^2 / 2, so Vy = (Y + g * T^2 / 2) / T
Finally you can now solve for V0 and theta
V0 = sqrt(Vx^2 + Vy^2) and theta = atan(Vy / Vx)
Note that your initial choice of Vx will determine the trajectory the missile takes - if Vx is large then T will be small and the trajectory will be almost a straight line (like a bullet fired at a nearby target); if Vx is small then T will be large and the trajectory will be a high arc (like a mortar round's path). You did start with three unknowns (V0, totalFlightTime and theta), but they are dependent, so choosing any one of them (or, in this case, Vx) plus the two equations determines the other two. You could also pre-determine the flight time and solve for Vx, Vy, theta and V0, or pre-determine theta (although this would be tricky, as some values of theta wouldn't give a real solution).
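Untested, but here is that recipe as a small Python sketch, using the Y-up convention of the question's equations and ignoring wind; Vx is the arbitrarily chosen horizontal speed:
import math
def aim(X, Y, g, Vx):
    # Returns (V0, theta, T) so that a projectile launched at speed V0 and
    # angle theta reaches (X, Y) under gravity g (Y-up convention, no wind).
    T = X / Vx                        # from X = Vx * T
    Vy = (Y + 0.5 * g * T * T) / T    # from Y = Vy * T - 0.5 * g * T^2
    V0 = math.hypot(Vx, Vy)
    theta = math.atan2(Vy, Vx)
    return V0, theta, T
# Example: hit a target 300 units away at the same height, with g = 400
print(aim(300.0, 0.0, 400.0, 120.0))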

Xcode Graph Calculator sin(X) cos(x) tan(x)

I want to create a graphing calculator and I'm stuck on the graph part. I want to know how to plot graphs of sin(x), cos(x) and tan(x). I have made the grid already. I don't want to use the Core Plot framework.
Any help would be appreciated.
Thanks.
To actually plot the function, do as you would with paper and pencil: evaluate the function for a number of inputs, then draw lines connecting the resulting points.
Not that I would actually do this (I would look at Core Plot), but you could plot such a graph using a Core Image generator filter, like this:
//wavelength and magnitude are distances in destination pixels. Think of them as the width and height of each wave.
kernel vec4 sineWave(float wavelength, float magnitude, __color color)
{
    vec2 coord = destCoord();
    coord.y -= magnitude;
    coord /= vec2(wavelength, magnitude / 2.0);
    float pi = radians(180.0);
    float value = sin(coord.x * pi);
    //Smaller threshold = finer wave line. For a gradient, replace the comparison with 1.0 - abs(…).
    float threshold = 0.1;
    float alpha = abs(coord.y - value) <= threshold;
    return color * alpha;
}
Here is some pseudo-code that could answer your question:
for i = xmin to xmax do
{
    draw XY point at X = (i * x_scale_factor + x_offset) and Y = (sin(i) * y_scale_factor + y_offset);
}
And beware: don't use floats as loop counters in for loops.
EDIT in response to comments
The easiest way to proceed, IMHO, would be to get the bounds of your view and the min and max values of your data on both the X and Y axes.
You can then use an NSAffineTransform instance to transform the coordinates of your drawing, so everything can be done in your graph coordinates, which is easier. You can write a label at coordinates (4.6, 3.2*10^-7) if you wish to. This is a key point to get you started. The road is long, but using NSAffineTransform will make it easier.
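The same idea sketched in Python rather than Objective-C (the view size and data bounds are made-up example values, and draw_line stands in for whatever drawing call your framework provides): sample the function, then map each data point into view coordinates with a single affine (scale plus offset) transform, which is exactly what the NSAffineTransform would do for you.
import math
view_w, view_h = 320.0, 240.0                 # pixel size of the drawing area (example values)
x_min, x_max = -2.0 * math.pi, 2.0 * math.pi
y_min, y_max = -1.5, 1.5                      # data bounds for sin(x)
def to_view(x, y):
    # Affine map from graph coordinates to view (pixel) coordinates
    sx = view_w / (x_max - x_min)
    sy = view_h / (y_max - y_min)
    return (x - x_min) * sx, view_h - (y - y_min) * sy   # flip Y: view origin is at the top
# Sample sin(x) at N points; connect consecutive points with line segments
N = 200
xs = [x_min + i * (x_max - x_min) / (N - 1) for i in range(N)]
points = [to_view(x, math.sin(x)) for x in xs]
# for p0, p1 in zip(points, points[1:]): draw_line(p0, p1)   # drawing call is framework-specific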

Circular Hough Transform Improvements

I'm working on an iris recognition algorithm that processes these kinds of images into unique codes for identification and authentication purposes.
After filtering, intelligent thresholding, and edge detection, the next step is obviously to fit circles to the pupil and iris. I've looked around, and the technique to use is the circular Hough transform. Here is the code for my implementation; sorry about the cryptic variable names.
print "Populating Accumulator..."
# Loop over image rows
for x in range(w):
# Loop over image columns
for y in range(h):
# Only process black pixels
if inp[x,y] == 0:
# px,py = 0 means pupil, otherwise pupil center
if px == 0:
ra = r_min
rb = r_max
else:
rr = sqrt((px-x)*(px-x)+(py-y)*(py-y))
ra = int(rr-3)
rb = int(rr+3)
# a is the width of the image, b is the height
for _a in range(a):
for _b in range(b):
for _r in range(rb-ra):
s1 = x - (_a + a_min)
s2 = y - (_b + b_min)
r1 = _r + ra
if (s1 * s1 + s2 * s2 == r1 * r1):
new = acc[_a][_b][_r]
if new >= maxVotes:
maxVotes = new
print "Done"
# Average all circles with the most votes
for _a in range(a):
for _b in range(b):
for _r in range(r):
if acc[_a][_b][_r] >= maxVotes-1:
total_a += _a + a_min
total_b += _b + b_min
total_r += _r + r_min
amount += 1
top_a = total_a / amount
top_b = total_b / amount
top_r = total_r / amount
print top_a,top_b,top_r
This is written in Python and uses the Python Imaging Library for image processing. As you can see, this is a very naive, brute-force method of finding circles. It works, but takes several minutes. The basic idea is to draw circles from r_min to r_max wherever there is a black pixel (from thresholding and edge detection), then build an accumulator array counting the number of times each location in the image is "voted" on. Whichever x, y and r has the most votes is the circle of interest. I tried to use the fact that the iris and pupil have roughly the same center (variables ra and rb) to reduce the range of the r loop, but the pupil detection takes so long that it doesn't matter.
Now, obviously my implementation is very naive. It uses a three dimensional parameter space (x, y, and r), which unfortunately makes it run slower than is acceptable. What kind of improvements can I make? Is there any way to reduce this to a two-dimensional parameter space? Is there a more efficient way of accessing and setting pixels that I'm not aware of?
On a side note, are there any other techniques for improving the overall runtime of this algorithm that I'm not aware of? Such as methods to approximate the maximum radius of the pupil or iris?
Note: I've tried to use OpenCV for this as well, but I could not tune the parameters enough to be consistently accurate.
Let me know if there's any other information that you need.
NOTE: Once again I misinterpreted my own code. It is technically 5-dimensional, but the 3-dimensional x,y,r loop only operates on black pixels.
Assuming you want the position of the circle rather than a measure of R: if you have a decent estimate of the possible range of R, then a common technique is to run the algorithm for a first guess of a fixed R, adjust the guess, and try again.
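Untested, but here is a minimal NumPy sketch of that idea: with R fixed, the accumulator becomes 2-D, and you can sweep a handful of candidate radii and keep the strongest peak. Here edges is assumed to be a binary edge image (non-zero marks edge pixels).
import numpy as np
def hough_fixed_r(edges, R, n_angles=100):
    # Vote in a 2-D (a, b) accumulator for circles of a single, fixed radius R.
    h, w = edges.shape
    acc = np.zeros((h, w), dtype=np.int32)
    ys, xs = np.nonzero(edges)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for y, x in zip(ys, xs):
        # Every edge pixel votes for all centres lying at distance R from it
        a = np.round(x - R * np.cos(thetas)).astype(int)
        b = np.round(y - R * np.sin(thetas)).astype(int)
        ok = (a >= 0) & (a < w) & (b >= 0) & (b < h)
        np.add.at(acc, (b[ok], a[ok]), 1)
    b_best, a_best = np.unravel_index(np.argmax(acc), acc.shape)
    return acc, (a_best, b_best)
# Sweep a few candidate radii and keep the one whose accumulator peak is strongest:
# best_votes, best_R = max((hough_fixed_r(edges, R)[0].max(), R) for R in range(40, 80, 5))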
