I'm trying to use the coordinate frame pair START_OF_SERVICE to AREA_DESCRIPTION, post localisation to the AD. I'd expect this to allow me to reconcile the original SOS origin to a proper location, and I'd like to use the incoming pose data to do so.
My test process is to create the ADF by centering my device in my area at a known position and orientation in the world, then creating the ADF file. When I run my test app, if I provide a Unity world-space offset that matches my ADF origin, everything looks exactly as expected. E.g. if I create my ADF origin at (0, 1, 0) in Unity world-space coordinates by centering the physical device at (0, 0) on my ground plane and 1 m up on the Y axis, it matches what I'd expect in my Unity scene given a starting AD frame offset of (0, 1, 0).
If I then start the device at exactly the same real-world position such that the SOS frame should exactly match the AD frame, when the app localises to the AD, I get a translation of (close to) zero, but a rotation of 90 degrees around the Z axis in the quaternion.
As both the base and target frame share the same coordinate space, I'd expect the translation and rotation, given an SOS origin that matches pretty accurately with the AD origin, to be a zero offset and an identity rotation.
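For concreteness, the kind of query I'm making looks roughly like this (a minimal sketch using the Tango Unity SDK; field names follow the Tango C API conventions and may differ slightly in the wrapper):

```csharp
using Tango;
using UnityEngine;

// Minimal sketch: query the pose of the start-of-service frame expressed in
// the area-description frame once the device has localised to the ADF.
public class AdfOffsetCheck : MonoBehaviour
{
    void Update()
    {
        TangoCoordinateFramePair pair = new TangoCoordinateFramePair();
        pair.baseFrame = TangoEnums.TangoCoordinateFrameType.TANGO_COORDINATE_FRAME_AREA_DESCRIPTION;
        pair.targetFrame = TangoEnums.TangoCoordinateFrameType.TANGO_COORDINATE_FRAME_START_OF_SERVICE;

        TangoPoseData pose = new TangoPoseData();
        PoseProvider.GetPoseAtTime(pose, 0.0, pair);  // 0.0 = most recent available pose

        if (pose.status_code == TangoEnums.TangoPoseStatusType.TANGO_POSE_VALID)
        {
            // With the SOS origin physically coincident with the ADF origin,
            // I expected translation ~ (0,0,0) and rotation ~ identity here.
            Debug.Log("t = (" + pose.translation[0] + ", " + pose.translation[1] + ", " + pose.translation[2] + ")");
            Debug.Log("q = (" + pose.orientation[0] + ", " + pose.orientation[1] + ", "
                              + pose.orientation[2] + ", " + pose.orientation[3] + ")");
        }
    }
}
```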
Can anyone shed any light on what I'm doing wrong here? Thanks in advance!
Our main idea is that we take a picture with the HoloLens and get the 2D coordinates of objects (the thermos and the printer) in that picture. We then deproject those 2D screenshot coordinates back to 3D coordinates in the Unreal world and draw a box at each position.
However, as you can see, we marked the thermos (the first picture) and the printer (the second picture) with a static mesh placed at the 3D coordinates we calculated from their 2D screenshot coordinates, but they have an obvious offset down and to the left. We suspect the problem is that our camera centre is wrong.
Have you met or solved this kind of problem? Can you give me some advice? Thanks a lot.
We noticed that you already have a new question with more information on the Microsoft Q&A community platform. It seems that what GetActorLocation returns is the headset position, which is offset from the location of the PV camera. It is therefore recommended that you use the GetPVCameraToWorldTransform API in the HoloLensARFunctionLibrary.h header file to find the camera position in world space. GetWorldSpaceRayFromCameraPoint can then help to find what exists in world space at a particular pixel coordinate. For more detail about how to implement this solution, please go through this section: Find Camera Positions in World Space
I am building an application in AFrame and I want to constrain the viewer's movement, that is, I want to limit where the camera can go in the scene. For example, I have an a-plane that is the floor and I want the camera to stop moving when it reaches 0 on the Z axis, to stop the camera from going through the floor, or stop again if it reaches 20 on the Z axis. I also wish to limit the movement in the X and Y directions. There are no obstacles in the scene besides the a-plane. Is creating a navigation mesh my only option or is there an easier way to constrain movement? Thanks!
I don't know of built-in tools to do this, but you could do it with programming (this sounds pretty easy). You could create a custom component, attached to the camera, with a tick handler that records the position of the camera in world space and stores it in a variable (camPosPrevFrame). Then create a function to test whether the current position is outside of the bounds. If so, set the camera coordinate on the axis that has exceeded its limit back to the previously recorded position (camPosPrevFrame). If you are simply testing whether the camera is on one side of an axis-aligned plane (say the world-space XY plane), that is pretty simple math (camera.getWorldPosition.x > someAmount). If you have a more complex situation, there are ways to test which side of an arbitrary plane a point is on (it involves the dot product).
I'm using Tango motion tracking and it is very easy to get the pose of the device relative to TANGO_START_OF_SERVICE. The translation works fine for me, but I'd like my orientation to be aligned with gravity, so that the pitch and roll angles are referenced to gravity rather than to the arbitrary orientation the device had when the Tango service started. I'm fine with an arbitrary azimuth angle.
I can do this by using the accelerometer data to get the absolute orientation at one point in time and then use that going forward, but is there an easier way?
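To illustrate, the one-shot accelerometer approach I mentioned would look roughly like this (an illustrative Unity-flavoured sketch; the sampled gravity vector and all names are placeholders):

```csharp
using UnityEngine;

// Sketch of the one-shot approach: sample gravity while the device is still,
// build a correction rotation that maps the measured "up" onto world up, and
// apply that correction to every subsequent device pose.
public static class GravityAlignment
{
    public static Quaternion ComputeCorrection(Vector3 measuredAcceleration)
    {
        // At rest the accelerometer measures the reaction to gravity,
        // i.e. a vector pointing "up" in the device frame.
        Vector3 measuredUp = measuredAcceleration.normalized;

        // Rotation taking the measured up direction onto world up.
        // Yaw (azimuth) stays arbitrary, which is fine for my case.
        return Quaternion.FromToRotation(measuredUp, Vector3.up);
    }

    public static Quaternion Apply(Quaternion correction, Quaternion devicePose)
    {
        // Pre-multiply so the pose is expressed in the gravity-aligned frame.
        return correction * devicePose;
    }
}
```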
I think the Z axis of TANGO_COORDINATE_FRAME_CAMERA_DEPTH frame is always aligned with gravity.
I would like to make a game where I use a camera with infrared tracking so that I can track people's heads (from a top view). For example, each player will get a helmet so that the camera or infrared sensor can track him/her.
After that I need to know the exact position of each person in Unity, to place a 3D GameObject at the player's position.
Maybe there is another workaround to get people's positions in Unity. I know I could use a Kinect, but I need to track at least 10 people at the same time.
Thanks
Note: This is not really a complete answer, just a collection of my thoughts regarding your question on how to transfer recorded positions into Unity.
If you really need full 3D positions, I believe you won't be happy when using only one sensor. In order to obtain depth information, which can further be used to calculate 3D positions in a reference coordinate system, you would have to use at least 2 sensors.
Another thing you could do is fix the camera position and assume that all people are moving in the same plane (e.g. a fixed y-component), which would allow you to determine 3D positions using the projection formula, given the camera parameters (so the camera has to be calibrated).
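To illustrate what I mean by the projection formula, here is a rough sketch (all intrinsics/extrinsics values are placeholders for your actual calibration, and image axis conventions may need flipping depending on your camera):

```csharp
using UnityEngine;

// Hedged sketch of the "fixed camera + fixed plane" idea: with a calibrated
// pinhole camera (intrinsics fx, fy, cx, cy) at a known pose, a pixel can be
// turned into a ray and intersected with the known ground plane.
public static class PixelToGround
{
    public static Vector3 Deproject(
        float u, float v,                       // pixel coordinates
        float fx, float fy, float cx, float cy, // camera intrinsics
        Matrix4x4 cameraToWorld,                // extrinsics from calibration
        float groundY = 0f)                     // the fixed plane, e.g. y = 0
    {
        // Ray direction in the camera frame (pinhole model, z forward).
        Vector3 dirCam = new Vector3((u - cx) / fx, (v - cy) / fy, 1f);

        // Transform ray origin and direction into world space.
        Vector3 origin = cameraToWorld.MultiplyPoint3x4(Vector3.zero);
        Vector3 dir = cameraToWorld.MultiplyVector(dirCam).normalized;

        // Intersect with the horizontal plane y = groundY
        // (assumes the ray is not parallel to the plane).
        float t = (groundY - origin.y) / dir.y;
        return origin + t * dir;
    }
}
```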
What also comes to my mind is: you could try to simulate your real camera with a virtual camera in Unity. This way you can use the virtual camera to project image coordinates (coming from the real camera) into Unity's 3D world. I haven't tried this myself, but there was someone who tried it; you can have a look at that: https://community.unity.com/t5/Editor/How-to-simulate-Unity-Pinhole-Camera-from-its-intrinsic/td-p/1922835
Edit given your comment:
Okay, sticking to your soccer example, you could proceed as follows:
Setup: Say you define your playing area to be rectangular with its origin in the bottom left corner (think of UVs). You set these points in the real world (and in Unity's representation of it) as (0,0) (bottom left) and (width, height) (top right), choosing whichever unit you like (e.g. meters, as this is Unity's default unit). As your camera is stationary, you can assign the corresponding corner points in image coordinates (pixel coordinates) as well. To make things easier, work with normalized coordinates instead of pixels, so bottom left is (0,0) and top right is (1,1).
Tracking: When tracking persons in the image, you can calculate their normalized position (x,y) (with x and y in [0,1]). These normalized positions can be transferred into Unity's 3D space (in Unity you will have a playable area of the same width and height) by simply calculating a Vector3 as (x*width, 0, y*height) (in Unity, x points right, y points up and z points forward).
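A minimal sketch of that mapping in Unity C# (width and height are example values standing in for your measured playing area):

```csharp
using UnityEngine;

// Maps a normalized image position (x, y in [0,1]) into the playable area.
public class PlayerPlacer : MonoBehaviour
{
    public float width = 20f;   // playable area size in meters (example values)
    public float height = 10f;

    public Vector3 ToWorld(Vector2 normalized)
    {
        // Unity: x points right, y points up, z points forward.
        return new Vector3(normalized.x * width, 0f, normalized.y * height);
    }
}
```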
Edit on Tracking:
For top-view tracking in a game, I would say you are on the right track with using some sort of helmet, which enables you to use marker-based tracking (in my opinion markerless multi-target tracking is not reliable enough for use in a video game). If you want to learn more about object tracking, there are lots of resources in the field of computer vision.
Independent of the sensor you are using (IR or camera), you would create a unique marker for each helmet, enabling you to identify each helmet (and thus the player). A marker in this case is some sort of unique pattern that can be recognized by an algorithm in each recorded frame. With IR you can arrange square IR markers to form a specific pattern, and for normal cameras you can use markers like QR codes (there are also augmented-reality libraries that offer functionality for creating and recognizing markers, e.g. ArUco or ARToolKit, although I don't know if they offer C# libraries; I have only used ArUco with C++ a while ago).
When you have your markers of choice, the tracking procedure is then pretty straightforward, for each recorded image:
- detect all markers in the current image (these correspond to all players currently visible)
- follow the steps from my last edit using the detected positions
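Putting those two steps together, a rough per-frame sketch could look like this (DetectMarkers is a placeholder for whatever your tracking library actually returns, not a real API):

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hedged sketch of the per-frame loop: each detected marker id gets a
// persistent GameObject that is moved to the mapped world position.
public class PlayerTracking : MonoBehaviour
{
    public PlayerPlacer placer;          // mapping from the previous sketch
    public GameObject playerPrefab;

    private readonly Dictionary<int, GameObject> _players = new Dictionary<int, GameObject>();

    void Update()
    {
        foreach (var detection in DetectMarkers())   // hypothetical tracker call
        {
            if (!_players.TryGetValue(detection.Key, out GameObject go))
            {
                go = Instantiate(playerPrefab);
                _players[detection.Key] = go;
            }
            go.transform.position = placer.ToWorld(detection.Value);
        }
    }

    // Placeholder for the real marker detection; returns id -> normalized (x, y).
    private Dictionary<int, Vector2> DetectMarkers()
    {
        return new Dictionary<int, Vector2>();
    }
}
```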
I hope that helps, feel free to contact me again.
My use case is only concerned with locationing, in fact only 2D locationing, so a lot of the cool capabilities in Tango are probably not useful to me. So I'm trying to see if I could implement the location algorithm myself.
From teardown reports it seems the 9-DOF sensors are pretty commodity hardware, and the basic integration-based location algorithm (even with magnetic field calibration) is mature knowledge. What algorithm does Tango use?
From the description it seems that Tango tries to aid navigation by using the images it sees as a reference, sort of like the "terrain-following" mode in cruise missiles. Is this right? That would be too complex for me to implement.
You can easily get a 2D position using the TangoPoseData with the correct coordinate system:
Project Tango uses a right-handed, local-level frame for the START_OF_SERVICE and AREA_DESCRIPTION coordinate frames. This convention sets the Z-axis aligned with gravity, with Z+ pointed upwards, and the X-Y plane is perpendicular to gravity and locally level with the ground plane. This local-level convention is based on the local east-north-up (ENU) earth-based coordinate system. Instead of true north, Project Tango uses the direction the back of the device is pointed when the service started as the Y axis, and the X axis is pointed to the right. The START_OF_SERVICE and AREA_DESCRIPTION base coordinate frames of the API will use this local-level frame convention.
Said more simply, use the pose data y/x coordinates for your space as you would latitude/longitude for the earth.
Heading data is also derived from the TangoPoseData and can be converted from a quaternion to Euler angles. Euler angles may be easier for you to use in your 2D location app.
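A rough sketch of that extraction (assuming the quaternion is stored in x, y, z, w order as in the Tango C API; the wrapper you use may expose the fields slightly differently):

```csharp
using UnityEngine;

// Pull a 2D position and a heading angle out of a Tango pose. In the
// START_OF_SERVICE / AREA_DESCRIPTION frames Z is up, so heading is the
// rotation about the Z axis (yaw).
public static class Pose2D
{
    public static void Extract(double[] translation, double[] orientation,
                               out Vector2 position, out float headingDegrees)
    {
        // 2D position: use the X/Y components, as suggested above.
        position = new Vector2((float)translation[0], (float)translation[1]);

        // Quaternion (x, y, z, w); standard yaw extraction about the up axis.
        double x = orientation[0], y = orientation[1], z = orientation[2], w = orientation[3];
        double yaw = System.Math.Atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z));
        headingDegrees = (float)(yaw * Mathf.Rad2Deg);
    }
}
```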
Tango uses 3D to increase the confidence of its position within the space, even if you don't need 3D. I would let Tango do the hard stuff and just extract the 2D position, so you can focus on your app.
Tango uses the camera images to detect changes in position, and uses the IMU for device rotation and acceleration. Try blocking the camera while using the Motion Tracking app; it will fail.