My project combines a projection screen with a head tracking device, where the screen should act as a window through which I could see my virtual "world". Basically, this.
Initially, I thought this would be easy: Map the camera position to the head tracking, have it point towards my window in the virtual world, adjust camera parameters to fit its frustum to the window, and voilà!
Except it doesn't work, because I'm viewing the window (both real and virtual) at an angle, so the regular perspective camera doesn't do the trick: if I understand correctly, that camera's 'input' is always rectangular, but I need to fit it to a trapezoid instead.
I think I should be able to achieve that by making my own projection matrix, but I'm a bit lost on how to do that: I have played a bit with basic matrix transforms (translate, scale, rotate), but I have zero experience with more complex stuff (i.e. perspective).
My best guess for now is trying to deduce the projection matrix from known transformed points (the corners of my window => the corners of the screen) but I feel like it's going to be quite expensive to do that each frame, and that doesn't account for the perspective inside the "window".
Thanks for any help!
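For reference, one commonly described way to get this "window" effect is an off-axis (asymmetric) frustum: the virtual camera sits at the tracked head position but stays aligned with the screen instead of pointing at it, and only the frustum extents change each frame. The sketch below is a minimal Python illustration under that assumption. The function names, the corner convention (lower-left, lower-right, upper-left) and the OpenGL-style matrix layout are choices made for the example, not something from the question. It only needs a handful of dot products, so recomputing it every frame should be cheap.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def off_axis_extents(pa, pb, pc, eye, near):
    """Frustum extents (left, right, bottom, top) at the near plane.

    pa, pb, pc are the lower-left, lower-right and upper-left corners of the
    physical screen, and eye is the tracked head position, all in one space.
    """
    vr = normalize(sub(pb, pa))          # screen "right" axis
    vu = normalize(sub(pc, pa))          # screen "up" axis
    vn = normalize(cross(vr, vu))        # screen normal, towards the viewer
    va, vb, vc = sub(pa, eye), sub(pb, eye), sub(pc, eye)
    d = -dot(va, vn)                     # distance from the eye to the screen plane
    s = near / d                         # scale from the screen plane to the near plane
    return (dot(vr, va) * s, dot(vr, vb) * s,
            dot(vu, va) * s, dot(vu, vc) * s)

def frustum_matrix(l, r, b, t, n, f):
    """OpenGL-style asymmetric perspective matrix, as a list of rows."""
    return [
        [2 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]
```

The view transform still has to place the camera at the eye position with its axes parallel to the screen axes; that part is not shown here.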
Related
I am curious about the limits of three.js. The following question is asked mainly as a challenge, not because I actually need the specific knowledge/code right away.
Say you have a game/simulation world model around a sphere geometry representing a planet, like the worlds of the game Populous. The resolution of polygons and textures is sufficient to look smooth when the globe fills the view of an ordinary camera. There are animated macroscopic objects on the surface.
The challenge is to project everything from the model to a global map projection on the screen in real time. The choice of projection is yours, but it must be seamless/continuous, and it must be possible for the user to rotate it, placing any point on the planet surface in the center of the screen. (It is not an option to maintain an alternative model of the world only for visualization.)
There are no limits on the number of cameras etc. allowed, but the performance must be expected to be "realtime", say double-digit FPS or more.
I don't expect any proof in the form of a running application (although that would be cool), but some explanation as to how it could be done.
My own initial idea is to place a lot of cameras, in fact one for every pixel in the map projection, around the globe, within a Group object that is attached to some kind of orbit controls (with rotation only), but I expect the number of object culling operations to become a huge performance issue. I am sure there must exist more elegant (and faster) solutions. :-)
Why not just use a spherical camera model (think of a 360° camera) and virtually center it on the sphere? This camera would (if it were physically possible) be wrapped all around the sphere, looking toward the center from all directions.
This camera could be implemented in shaders (instead of the regular projection-matrix) and would produce an equirectangular image of the planet-surface (or in fact any other projection you want, like spherical mercator-projection).
As far as I can tell the vertex shader can implement any projection you want, and it doesn't need to represent a camera that is physically possible. It just needs to produce consistent clip-space coordinates for all vertices. Fragment shaders for lighting would still need to operate on the original coordinates, normals etc., but that should be achievable. So the vertex shader would just need to compute (x,y,z) => (phi,theta,r) and go on with that.
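To make the (x,y,z) => (phi,theta,r) step concrete, here is a rough sketch of the math in Python rather than GLSL. It assumes the planet is centered at the origin with y as the up axis, and that `r_max` (a made-up parameter) is the largest distance from the center that should still be visible. Depth is derived from r so that things further from the center end up in front.

```python
import math

def to_spherical(x, y, z):
    """Convert a model-space position to (phi, theta, r).

    phi is the longitude in (-pi, pi], theta the latitude in [-pi/2, pi/2].
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the planet center has no direction")
    phi = math.atan2(z, x)
    theta = math.asin(y / r)
    return phi, theta, r

def equirect_clip(x, y, z, r_max):
    """Map a position to equirectangular clip-space coordinates in [-1, 1]."""
    phi, theta, r = to_spherical(x, y, z)
    clip_x = phi / math.pi
    clip_y = theta / (math.pi / 2)
    # Larger r means closer to the wrapped-around camera, so it gets a smaller depth.
    clip_z = 1.0 - 2.0 * (r / r_max)
    return clip_x, clip_y, clip_z
```

One thing this sketch ignores: triangles that cross the ±180° meridian get vertices on opposite edges of the map and would be stretched across the whole screen, so they would need to be split or drawn twice.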
Occlusion-culling would need to be disabled, but iirc three.js doesn't do that anyway.
I would like to make a game where I use a camera with infrared tracking, so that I can track people's heads (from a top view). For example, each player will get a helmet so that the camera or infrared sensor can track him/her.
After that I need to know the exact position of each person in Unity, to place a 3D game object at the player's position.
Maybe there is another workaround to get people's positions in Unity. I know I could use a Kinect, but I need to track at least 10 people at the same time.
Thanks
Note: This is not really a closed answer, just a collection of my thoughts regarding your question on how to transfer recorded positions into unity.
If you really need full 3D positions, I believe you won't be happy when using only one sensor. In order to obtain depth information, which can further be used to calculate 3D positions in a reference coordinate system, you would have to use at least 2 sensors.
Another thing you could do is fix the camera position and assume that all persons are moving in the same plane (e.g. fixed y-component), which would allow you to determine 3D positions using the projection formula given the camera parameters (so the camera has to be calibrated).
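As a rough illustration of that projection formula, here is a sketch for the simplest case: a pinhole camera without lens distortion, looking straight down, mounted at a known height above the plane in which the heads move. The intrinsics `fx`, `fy`, `cx`, `cy` would come from your calibration; all names here are placeholders.

```python
def pixel_to_plane(u, v, fx, fy, cx, cy, height):
    """Back-project pixel (u, v) onto a plane `height` units below the camera.

    Assumes an ideal pinhole camera whose optical axis is perpendicular to the
    plane. Returns the offset (x, y) on the plane, relative to the point
    directly under the camera, in the same unit as `height`.
    """
    if height <= 0:
        raise ValueError("height must be positive")
    # Direction of the viewing ray in camera coordinates is
    # ((u - cx) / fx, (v - cy) / fy, 1); scale it until it reaches the plane.
    x = (u - cx) / fx * height
    y = (v - cy) / fy * height
    return x, y
```

If the camera is tilted, you would additionally need its rotation and intersect the rotated ray with the plane.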
What also comes to my mind is: You could try to simulate your real camera with a virtual camera in unity. This way you can use the virtual camera to project image coordinates (coming from the real camera) into unity's 3D world. I haven't tried this myself, but there was someone who tried it, you can have a look at that: https://community.unity.com/t5/Editor/How-to-simulate-Unity-Pinhole-Camera-from-its-intrinsic/td-p/1922835
Edit given your comment:
Okay, sticking to your soccer example, you could proceed as follows:
Setup: Say you define your playing area to be rectangular with its origin in the bottom left corner (think of UVs). You set these points in the real world (and in Unity's representation of it) as (0,0) (bottom left) and (width, height) (top right), choosing whichever measure you like (e.g. meters, as this is Unity's default unit). As your camera is stationary, you can assign the corresponding corner points in image coordinates (pixel coordinates) as well. To make things easier, work with normalized coordinates instead of pixels, thus bottom left is (0,0) and top right is (1,1).
Tracking: When tracking persons in the image, you can calculate their normalized position (x,y) (with x and y in [0,1]). These normalized positions can be transferred into Unity's 3D space (in Unity you will have a playable area of the same width and height) by simply calculating a Vector3 as (x*width, 0, y*height) (in Unity x is pointing right, y is pointing up and z is pointing forward).
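The two steps can be sketched like this (Python only to show the arithmetic; in Unity the last tuple would become a Vector3). It assumes the playing area appears as an axis-aligned rectangle in the image, which is only roughly true for a camera looking straight down. The function names and corner parameters are made up for the example.

```python
def pixel_to_normalized(px, py, bottom_left, top_right):
    """Normalize a pixel position using the image positions of two field corners.

    bottom_left and top_right are (x, y) pixel coordinates of the corners.
    Image y usually grows downwards; the formula handles that automatically.
    """
    x = (px - bottom_left[0]) / (top_right[0] - bottom_left[0])
    y = (py - bottom_left[1]) / (top_right[1] - bottom_left[1])
    return x, y

def normalized_to_world(x, y, width, height):
    """Map a normalized position to (x, 0, z) on the playing area."""
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("position is outside the playing area")
    return (x * width, 0.0, y * height)
```

If the area shows up as a skewed quadrilateral in the image, a perspective (homography) mapping from its four corners would be needed instead of this linear one.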
Edit on Tracking:
For top-view tracking in a game, I would say you are on the right track with using some sort of helmet, which enables you to use some sort of marker-based tracking (in my opinion markerless multi-target tracking is not reliable enough for use in a video game). If you want to learn more about object tracking, there are lots of resources in the field of computer vision.
Independent of the sensor you are using (IR or camera), you would create a unique marker for each helmet, thus enabling you to identify each helmet (and also the player). A marker in that case is some sort of unique pattern that can be recognized by an algorithm in each recorded frame. In IR you can arrange square IR markers to form a specific pattern, and for normal cameras you can use markers like QR codes (there are also libraries for augmented reality related content that offer functionality for creating and recognizing markers, e.g. ArUco or ARToolkit, although I don't know if they offer C# libraries; I have only used ArUco with C++ a while ago).
When you have your markers of choice, the tracking procedure is then pretty straightforward, for each recorded image:
- detect all markers in the current image (these correspond to all players currently visible)
- follow the steps from my last edit using the detected positions
I hope that helps, feel free to contact me again.
When several objects overlap on the same plane, they start to flicker. How do I tell the renderer to put one of the objects in front?
I tried to use .renderDepth, but it only works partly -
see example here: http://liveweave.com/ahTdFQ
When both boxes have the same size, it works as intended: I can change which of the boxes is visible by setting .renderDepth. But if one of the boxes is a bit smaller (say 40,50,50), the touching faces flicker and the render depth doesn't work anymore.
How to fix that issue?
When .renderDepth doesn't work, you have to set the depths yourself.
Moving whole meshes around is indeed not really efficient.
What you are looking for are offsets bound to materials:
material.polygonOffset = true;
material.polygonOffsetFactor = -0.1;
should solve your issue. See update here: http://liveweave.com/syC0L4
Use negative factors to display and positive factors to hide.
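For background, the OpenGL polygon offset is computed as m * factor + r * units, where m is the polygon's maximum depth slope and r is the smallest depth difference the buffer can resolve. The sketch below just evaluates that formula; the value used for r (one step of a 24-bit buffer) is an assumption, since the real value is implementation-dependent.

```python
def polygon_offset(max_depth_slope, factor, units, bits=24):
    """Depth offset added to a polygon's fragments: m * factor + r * units.

    r is approximated here as one step of a fixed-point buffer with `bits` bits.
    """
    r = 1.0 / 2 ** bits
    return max_depth_slope * factor + r * units
```

Note that a face seen exactly head-on has a depth slope of zero, so the factor alone gives no offset there. In that case setting material.polygonOffsetUnits as well (for example to -1) may be needed.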
For starters, try to reduce the far range on your camera. Try with 1000. Generally speaking, you shouldn't have overlapping faces in your 3D scene, unless they are treated in a VERY specific way (look up the term 'decal textures'/'decals'). So basically, you have to create depth offsets, and perhaps even pre-sort the objects when doing this, which all requires pretty low-level tinkering.
If the far range reduction helps, then you're experiencing a lack of precision (depending on the device). Also look up 'z fighting'.
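To get a feeling for the precision involved, here is a small estimate of how far apart two surfaces must be before the depth buffer can tell them apart. It assumes a standard perspective projection and a fixed-point depth buffer (24 bits by default); the function name is made up and real hardware may differ.

```python
def depth_resolution(z, near, far, bits=24):
    """Approximate smallest separation between two surfaces at view distance z
    that still lands in different depth-buffer steps (same unit as z)."""
    if not (0 < near < far and near <= z <= far):
        raise ValueError("expected 0 < near <= z <= far")
    # Window depth is d(z) = far * (z - near) / (z * (far - near)),
    # so its slope is dd/dz = far * near / (z^2 * (far - near)).
    slope = far * near / (z * z * (far - near))
    # One buffer step is 1 / 2^bits of the depth range.
    return 1.0 / (slope * 2 ** bits)
```

Under these assumptions, depth_resolution(500, 0.1, 10000) comes out at roughly 0.15 units. Lowering far to 1000 changes that only slightly, while raising near from 0.1 to 1 improves it by about a factor of ten, so the near plane is usually worth checking too.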
UPDATE
Don't overlap planes.
How do I tell the renderer to put one of the objects in front?
You put one object in front of the other :)
For example if you have a camera at 0,0,0 looking at an object at 0,0,10, if you want another object to be behind the first object put it at 0,0,11 it should work.
UPDATE2
What is z-buffering:
http://en.wikipedia.org/wiki/Z-buffering
http://msdn.microsoft.com/en-us/library/bb976071.aspx
Take note of "floating point in range of 0.0 - 1.0".
What is z-fighting:
http://en.wikipedia.org/wiki/Z-fighting
...have similar values in the z-buffer. It is particularly prevalent with
coplanar polygons, where two faces occupy essentially the same space,
with neither in front. Affected pixels are rendered with fragments
from one polygon or the other arbitrarily, in a manner determined by
the precision of the z-buffer.
"The renderer cannot reposition anything."
I think that this is completely untrue. The renderer can reposition everything, and probably does if it's not shadertoy, or some video filter or something. Every time you move your camera the renderer repositions everything (the camera is actually the only thing that DOES NOT MOVE).
It seems that you are missing some crucial concepts here. I'd start with this:
http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/
About the depth offset mentioned:
Here is how this would work. Say you want to draw a decal on a surface. You can 'draw' another mesh on this surface by, say, projecting a quad onto it. You want to draw a bullet hole over a concrete wall and end up with two coplanar surfaces: the wall and the bullet hole. You can figure out the depth buffer precision, find the smallest value, and then move the bullet hole mesh by that value towards the camera. The object does not get scaled (you're doing this in NDC, which you can visualize as a cube, moving planes back and forth in the smallest possible increment), but it does translate in the depth direction, ending up in front of the other.
I don't see any flicker. The cube movement in 3D seems to be super smooth. Can you try on a different computer (maybe a faster one)? I used Chrome on a MacBook Pro.
A quick introduction:
We're developing a positioning system that works the following way. Our camera is situated on a robot and is pointed upwards (looking at the ceiling). On the ceiling we have something like landmarks, thanks to which we can compute the position of the robot. It looks like this:
Our problem:
The camera is tilted a bit (0-4 degrees I think), because the surface of the robot is not perfectly even. That means, when the robot turns around but stays at the same coordinates, the camera looks at a different position on the ceiling and therefore our positioning program yields a different position of the robot, even though it only turned around and wasn't moved a bit.
Our current (hardcoded) solution:
We've taken some test photos from the camera, turning it around the lens axis. From the pictures we've deduced that it's tilted ca. 4 degrees in the "up direction" of the picture. Using some simple geometrical transformations we've managed to reduce the tilt effect and find the real camera position. On the following pictures the grey dot marks the center of the picture, the black dot is the real place on the ceiling under which the camera is situated. The black dot was transformed from the grey dot (its position was computed correcting the grey dot position). As you can easily notice, the grey dots form a circle on the ceiling and the black dot is the center of this circle.
The problem with our solution:
Our approach is completely unportable. If we moved the camera to a new robot, the angle and direction of tilt would have to be completely recalibrated. Therefore we wanted to leave the calibration phase to the user, which would require them to take some pictures, assess the tilt parameters themselves and then set them in the program. My question to you is: can you think of any better (more automatic) solution for computing the tilt parameters or correcting the tilt in the pictures?
Nice work. To have an automatic calibration is a nice challenge.
An idea would be to use the parallel lines from the roof tiles:
If the camera is perfectly level, then all lines will be parallel in the picture too.
If the camera is tilted, then the lines will converge (they intersect at a vanishing point).
Now, this is probably very hard to implement. With the camera you're using, distortion needs to be corrected first so that lines are indeed straight.
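Just to illustrate the geometry (not the hard part, which is detecting the lines): once two ceiling lines have been found in the undistorted image, their intersection can be computed with homogeneous coordinates. The helper names below are made up. A result of `None` means the lines are parallel in the image, i.e. no tilt is visible along that direction.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(line_a, line_b, eps=1e-12):
    """Intersection of two homogeneous lines, or None if they are parallel."""
    x, y, w = cross(line_a, line_b)
    if abs(w) < eps:
        return None
    return (x / w, y / w)
```

With more than two detected lines you would probably want a least-squares intersection rather than a single pair, to average out detection noise.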
Your practical approach is probably simpler and more robust. As you describe it, it seems it can be automated to become user friendly. Make the robot turn on the spot and identify programmatically which point remains at the same place in the picture.
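One possible way to automate this, building on the observation in the question that the grey dots form a circle: record the ceiling position of the picture center for several headings while the robot turns on the spot, then fit a circle to those points. The center of the circle is the point under the camera, and the radius is related to the tilt. The sketch below uses a simple algebraic least-squares fit; it is an illustration under those assumptions, not the method from the question.

```python
import math

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_circle(points):
    """Least-squares fit of x^2 + y^2 + D*x + E*y + F = 0.

    Returns (center_x, center_y, radius). Needs at least 3 non-collinear points.
    """
    n = len(points)
    sx = sum(x for x, y in points)
    sy = sum(y for x, y in points)
    sxx = sum(x * x for x, y in points)
    syy = sum(y * y for x, y in points)
    sxy = sum(x * y for x, y in points)
    # Normal equations A * (D, E, F) = rhs
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    rhs = [-sum(x * (x * x + y * y) for x, y in points),
           -sum(y * (x * x + y * y) for x, y in points),
           -sum(x * x + y * y for x, y in points)]
    det = det3(a)
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or too few")
    solution = []
    for col in range(3):  # Cramer's rule: replace one column with rhs
        m = [row[:] for row in a]
        for row in range(3):
            m[row][col] = rhs[row]
        solution.append(det3(m) / det)
    d, e, f = solution
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)
```

If the dots are measured at a known distance h from the ceiling, the tilt angle would then be roughly atan(radius / h).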
I'd like to implement a dragging feature where users can drag objects around the workspace. That of course is the easy bit. The hard bit is to try and make it a physically correct drag which incorporates rotation due to torque moments (imagine dragging a book around on a table using only one finger, how does it rotate as you drag?).
Does anyone know where I can find explanations on how to code this (2D only, rectangles only, no friction required)?
Much obliged,
David
EDIT:
I wrote a small app (with clearly erroneous behaviour) that I hope will convey what I'm looking for much better than words could. C# (VS 2008) source and compiled exe here
EDIT 2:
Adjusted the example project to give acceptable behaviour. New source (and compiled exe) is available here. Written in C# 2008. I provide this code free of any copyright, feel free to use/modify/whatever. No need to inform me or mention me.
Torque is the component of the applied force perpendicular to the vector from the centroid of the object to the point where the force is applied, multiplied by the length of that vector. So, if you pull perpendicular to that vector, the full force contributes to the torque. If you pull directly away from the centroid, the torque is zero.
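In 2D this boils down to the z-component of a cross product. A tiny sketch, with made-up names:

```python
def torque_2d(centroid, point, force):
    """Torque about `centroid` from `force` applied at `point` (2D).

    Positive values mean counter-clockwise rotation in a y-up coordinate system.
    """
    rx = point[0] - centroid[0]
    ry = point[1] - centroid[1]
    # z-component of r x F: lever arm times the perpendicular force component
    return rx * force[1] - ry * force[0]
```

For example, pushing "up" with force (0, 3) at a point 2 units to the right of the centroid gives a torque of 6, while pulling straight outwards with (3, 0) at the same point gives 0.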
You'd typically want to do this by modeling a spring connecting the original mouse-down point to the current position of the mouse (in object-local coordinates). Using a spring and some friction smooths out the motions of the mouse a bit.
I've heard good things about Chipmunk as a 2D physics package:
http://code.google.com/p/chipmunk-physics/
Okay, it's getting late, and I need to sleep. But here are some starting points. You can either do all the calculations in one coordinate space, or you can define a coordinate space per object. In most animation systems, people use coordinate spaces per object, and use transformation matrices to convert, because it makes the math easier.
The basic sequence of calculations is:
1. On mouse-down, you do your hit-test, and store the coordinates of the event (in the object coordinate space).
2. When the mouse moves, you create a vector representing the distance moved.
3. The force exerted by the spring is k * M, where M is the distance between the initial mouse-down point from step 1 and the current mouse position, and k is the spring constant.
4. Project that vector onto two direction vectors, starting from the initial mouse-down point. One direction is towards the center of the object, the other is 90 degrees from that.
5. The force projected towards the center of the object will move it towards the mouse cursor, and the other force produces the torque around the axis. How much the object accelerates depends on its mass, and the rotational acceleration depends on its moment of inertia.
6. The friction and viscosity of the medium the object is moving in cause drag, which simply reduces the motion of the object over time.
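A rough sketch of one simulation step along these lines is below. It deviates slightly from the description: instead of splitting the force into two components, it applies the whole spring force to the center of mass and uses the cross product for the torque, which is the usual rigid-body formulation. The `Body` class, the per-step `damping` factor and the simple Euler integration are all choices made for this example.

```python
import math
from dataclasses import dataclass

@dataclass
class Body:
    x: float
    y: float
    angle: float
    vx: float = 0.0
    vy: float = 0.0
    omega: float = 0.0       # angular velocity
    mass: float = 1.0
    inertia: float = 1.0     # for a w-by-h rectangle: mass * (w*w + h*h) / 12

def step(body, grab_local, mouse, k, damping, dt):
    """Advance the body by dt while a spring pulls the grabbed point to the mouse.

    grab_local is the mouse-down point in object coordinates, mouse is the
    current mouse position in world coordinates, damping is in (0, 1].
    """
    # Vector from the center to the grabbed point, rotated into world space.
    c, s = math.cos(body.angle), math.sin(body.angle)
    rx = c * grab_local[0] - s * grab_local[1]
    ry = s * grab_local[0] + c * grab_local[1]
    # Spring force F = k * M, pulling the grabbed point towards the mouse.
    fx = k * (mouse[0] - (body.x + rx))
    fy = k * (mouse[1] - (body.y + ry))
    torque = rx * fy - ry * fx
    # Integrate velocities, apply drag, then integrate positions.
    body.vx = (body.vx + fx / body.mass * dt) * damping
    body.vy = (body.vy + fy / body.mass * dt) * damping
    body.omega = (body.omega + torque / body.inertia * dt) * damping
    body.x += body.vx * dt
    body.y += body.vy * dt
    body.angle += body.omega * dt
```

You would call `step` once per frame with the latest mouse position; larger k makes the object follow the cursor more tightly, and smaller damping makes it settle faster.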
Or, maybe you just want to fake it. In that case, just store the (x,y) location of the rectangle, and its current rotation, phi. Then, do this:
Capture the mouse-down location in world coordinates
When the mouse moves, move the box according to the change in mouse position
Calculate the angle between the mouse and the center of the object (atan2 is useful here), and between the center of the object and the initial mouse-down point. Add the difference between the two angles to the rotation of the rectangle.
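The angle part of this could look something like the sketch below. It only computes the difference between the two angles, wrapped so that the rectangle always turns the short way round; how often you apply it (per mouse move, or once relative to the mouse-down state) is left open, and the function name is made up.

```python
import math

def rotation_delta(center, grab_point, mouse):
    """Angle from the center-to-grab direction to the center-to-mouse direction.

    The result is wrapped into [-pi, pi).
    """
    angle_mouse = math.atan2(mouse[1] - center[1], mouse[0] - center[0])
    angle_grab = math.atan2(grab_point[1] - center[1], grab_point[0] - center[0])
    delta = angle_mouse - angle_grab
    return (delta + math.pi) % (2.0 * math.pi) - math.pi
```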
This would seem to be a basic physics problem.
You would need to know where they click, as that will tell you whether they are pushing or pulling. So, though you are doing this in 2D, your calculations will need to be in 3D, and your awareness of where they clicked will be in 3D.
Each item will have properties, such as mass, and perhaps information for air resistance, since the air will help to provide the motion.
You will also need to react differently based on how fast the user is moving the mouse.
So, they may be able to move the 2 ton weight faster than is possible, and you will just need to adapt to that, as the user will not be happy if the object being dragged is slower than the mouse pointer.
Which language?
Here's a bunch of 2d transforms in C