Three.js matrix precision for real worlds

I'm experiencing some issues when working with real-world coordinates.
The center of my camera is at 280000, 45787254 (for example).
The extent of my world is about 500 x 500 (not too big).
My data is in metric units (meters).
I have created a tile map structure built with simple planes.
I see small gaps between the plane borders, even though these planes are built to be contiguous (that is, the xmin of each plane is equal to the xmax of the previous one).
In the past I have had issues related to raycasting.
The projection matrix has low precision with such large coordinates.
Changing the near value to a number greater than 10 can be a fix. However, using that value hurts the visualization (you can't place the camera very close to the scene, or it disappears).
I talked with the developer of Potree and he told me he had to move the LiDAR worlds to 0,0 to get them to work properly.
So... the final solution is to work in worlds centered at 0,0, isn't it?
Or is there any trick we can do in the matrix calculations?
I'd like to hear from the three.js developers.

Floating-point math is best at ranges close to zero; you just end up compounding errors as you move far away. You can always do as much math as possible near the origin and then translate the result to wherever you need it, which helps with some of it, but if you can, work in local coordinates.
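As a minimal sketch of that idea (my own illustration, assuming a tiled ground built from PlaneGeometry meshes): keep the large world offset once on the JavaScript side, where plain numbers are doubles, and only ever hand three.js the small local coordinates.

    import * as THREE from 'three';

    // Hypothetical offset, roughly the centre of the dataset in metres.
    // Plain JS numbers are doubles, so full precision is kept on the CPU.
    const worldOffset = new THREE.Vector3(280000, 0, 45787254);

    // Real-world (map) coordinates -> small local scene coordinates.
    function toLocal(wx: number, wy: number, wz: number): THREE.Vector3 {
      return new THREE.Vector3(wx - worldOffset.x, wy - worldOffset.y, wz - worldOffset.z);
    }

    // Local scene coordinates -> real-world coordinates (e.g. after a raycast hit).
    function toWorld(local: THREE.Vector3): THREE.Vector3 {
      return local.clone().add(worldOffset);
    }

    // Each tile is built around the origin and positioned with small numbers,
    // so vertex data, model matrices and the depth buffer stay in a range
    // where float32 still has roughly millimetre resolution.
    function makeTile(worldMinX: number, worldMinZ: number, size: number): THREE.Mesh {
      const geometry = new THREE.PlaneGeometry(size, size);
      geometry.rotateX(-Math.PI / 2); // lie flat on the XZ plane
      const mesh = new THREE.Mesh(geometry, new THREE.MeshBasicMaterial({ wireframe: true }));
      mesh.position.copy(toLocal(worldMinX + size / 2, 0, worldMinZ + size / 2));
      return mesh;
    }

The camera then also works in local coordinates, so near/far can stay at sensible values and raycasting is done against small numbers.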
Potree probably gets odd ripple-looking aliasing effects when too far from the origin, no?

Related

Future prospects for improvement of depth data on Project Tango tablet

I am interested in using the Project Tango tablet for 3D reconstruction using arbitrary point features. In the current SDK version, we seem to have access to the following data.
A 1280 x 720 RGB image.
A point cloud with 0-~10,000 points, depending on the environment. This seems to average between 3,000 and 6,000 in most environments.
What I really want is to be able to identify a 3D point for key points within an image. Therefore, it makes sense to project depth into the image plane. I have done this, and I get something like this:
The problem with this process is that the depth points are sparse compared to the RGB pixels. So I took it a step further and performed interpolation between the depth points. First, I did Delaunay triangulation, and once I got a good triangulation, I interpolated between the 3 points on each facet and got a decent, fairly uniform depth image. Here are the zones where the interpolated depth is valid, superimposed on the RGB image.
Now, given the camera model, it's possible to project depth back into Cartesian coordinates at any point on the depth image (since the depth image was made such that each pixel corresponds to a point on the original RGB image, and we have the camera parameters of the RGB camera). However, if you look at the triangulation image and compare it to the original RGB image, you can see that depth is valid for all of the uninteresting points in the image: blank, featureless planes mostly. This isn't just true for this single set of images; it's a trend I'm seeing for the sensor. If a person stands in front of the sensor, for example, there are very few depth points within their silhouette.
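As a sketch of that back-projection step (a standard pinhole model, not the actual Tango SDK call; the intrinsic values below are placeholders, not the real calibration):

    // Given a dense depth image aligned with the RGB image and the RGB camera
    // intrinsics, recover a 3D point in the camera frame for any pixel.
    interface Intrinsics { fx: number; fy: number; cx: number; cy: number; }

    function backProject(u: number, v: number, depth: number, k: Intrinsics): [number, number, number] {
      // Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy, z = depth.
      const z = depth;
      const x = ((u - k.cx) * z) / k.fx;
      const y = ((v - k.cy) * z) / k.fy;
      return [x, y, z];
    }

    // Usage: look up the interpolated depth at a keypoint's pixel location and
    // back-project it; keypoints outside the triangulated region are skipped.
    const k: Intrinsics = { fx: 1042, fy: 1042, cx: 640, cy: 360 }; // placeholder values
    const point = backProject(800, 420, 1.75, k); // pixel (800, 420) at 1.75 m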
As a result of this characteristic of the sensor, if I perform visual feature extraction on the image, most of the areas with corners or interesting textures fall in areas without associated depth information. Just an example: I detected 1000 SIFT keypoints from an RGB image from an Xtion sensor, and 960 of those had valid depth values. If I do the same thing with this system, I get around 80 keypoints with valid depth. At the moment, this level of performance is unacceptable for my purposes.
I can guess at the underlying reasons for this: it seems like some sort of plane extraction algorithm is being used to get depth points, whereas Primesense/DepthSense sensors are using something more sophisticated.
So anyway, my main question here is: can we expect any improvement in the depth data at a later point in time, through improved RGB-IR image processing algorithms? Or is this an inherent limit of the current sensor?
I am from the Project Tango team at Google. I am sorry you are experiencing trouble with depth on the device. Just so that we are sure your device is in good working condition, can you please test the depth performance against a flat wall? Instructions are below:
https://developers.google.com/project-tango/hardware/depth-test
Even with a device in good working condition, the depth library is known to return sparse depth points on scenes with low IR reflectance objects, small sized objects, high dynamic range scenes, surfaces at certain angles and objects at distances larger than ~4m. While some of these are inherent limitations in the depth solution, we are working with the depth solution provider to bring improvements wherever possible.
Attached is an image of a typical conference room scene and the corresponding point cloud. As you can see, no depth points are returned from the laptop screen (low reflectance), the tabletop objects such as post-its and the pencil holder (small object sizes), large portions of the table (surfaces at an angle), or the room corner at the far right (distance >4m).
But as you move around the device, you will start getting depth point returns. Accumulating depth points is a must to get denser point clouds.
Please also keep us posted on your findings at project-tango-hardware-support@google.com.
In my very basic initial experiments, you are correct with respect to the depth information returned from the visual field; however, the return of surface points is anything but constant. I find that as I move the device I get major shifts in where depth information is returned, i.e. there's a lot of transitory opacity in the image with respect to depth data, probably due to the characteristics of the surfaces.
So while no single return frame is enough, the real question seems to be the construction of a larger model (point clouds to begin with, possibly voxel spaces as one scales up) that brings successive scans into a common model. It's reminiscent of synthetic aperture algorithms in spirit, but the letters in the equations are from a whole different set of laws.
In short, I think a more interesting approach is to synthesize a more complete model by successive accumulation of point cloud data. For this to work, the device team has to have their dead reckoning on the money at whatever scale this is done. It also addresses an issue that no sensor improvement can fix: even if your visual sensor is perfect, it still does nothing to help you relate the sides of an object so that they are at least in the close neighborhood of the front of the object.

Raytracing via diffusion algorithm

Most resources about raytracing describe something like:
"shoot rays, find the first obstacle to cut it"
"shoot secondary rays..."
"or, do it reverse and approximate/interpolate"
I haven't seen any algorithm that uses diffusion. Let's assume a point light is a cell with more density than the other cells (all of space is divided into cells). Every step/iteration of lighting/tracing makes that source cell diffuse into its neighbours using a velocity field, then their neighbours, and so on. After a satisfactory number of iterations (such as 30-40), the density of each cell is used to light the objects in that cell.
Point light and velocity field:
But the grid would have to be something like 1000x1000x1000, which would take too much time and memory to compute. Maybe computing just 10x10x10 and, when an obstacle is found, partitioning that area into 100x100x100 (in a dynamic kd-tree fashion) could help generate lighting/shadows at an acceptable resolution? Especially for vertex-based rather than triangle-based illumination.
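To make the iteration concrete, here is a rough sketch of one density step on such a coarse grid (isotropic diffusion only; the directional velocity field and the kd-tree refinement are left out, and the grid size, diffusion rate and obstacle mask are made-up parameters, not a tested lighting solution):

    const N = 10; // 10x10x10 cells, as suggested above
    const idx = (x: number, y: number, z: number) => x + N * (y + N * z);

    function diffuseStep(density: Float32Array, obstacle: Uint8Array, rate = 0.16): Float32Array {
      const next = new Float32Array(density.length);
      for (let z = 1; z < N - 1; z++) {
        for (let y = 1; y < N - 1; y++) {
          for (let x = 1; x < N - 1; x++) {
            const i = idx(x, y, z);
            if (obstacle[i]) continue; // light does not enter obstacle cells
            // Density exchanged with the six face neighbours.
            const neighbours =
              density[idx(x - 1, y, z)] + density[idx(x + 1, y, z)] +
              density[idx(x, y - 1, z)] + density[idx(x, y + 1, z)] +
              density[idx(x, y, z - 1)] + density[idx(x, y, z + 1)];
            next[i] = density[i] + rate * (neighbours - 6 * density[i]);
          }
        }
      }
      return next;
    }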
Has anyone tried this approach?
Note: the velocity field is there to make the light diffuse mostly outwards (not 100% but 99%, so there is some global illumination). A finite-element method can make this embarrassingly parallel.
Edit: any object hit by a positive density becomes an obstacle that generates a new velocity field around its surface, so light cannot pass through the object but can be mirrored in another direction (if it is a lens-like object, light diffuses through it more slowly). The reflected light can then affect other objects, given a higher iteration limit.
The same kd-tree can be used in object-collision algorithms :)
Just to take with a grain of salt: a neural network could be trained for advection & diffusion on a 30x30x30 grid and used in a "GPU (OpenCL/CUDA) --> neural network --> finite element method --> shadows" pipeline.
There's a couple problems with this as it stands.
The first problem is that, fundamentally, a photon in the Newtonian sense doesn't react or change based on the density of the other photons around it. So using a density field and trying to make light follow classic Navier-Stokes-style solutions (which is what you're describing, based on the density-field explanation you gave) would give incorrect results. It would also, given enough iterations, result in complete entropy over the scene, which is also not what happens to light.
Even if you were to get rid of the density problem, you're still left with the problem of multiple photons going in different directions within the same cell, which is required for global illumination and diffuse lighting.
So, stripping away the problem portions of your idea, what you're left with is a particle system for photons :P
Now, to be fair, pseudo-particle systems are currently used for global illumination solutions. This type of thing is called Photon Mapping, but it's only simple to implement a direct lighting solution using it :P

Is it possible to import a Collada model that aligns to pixels?

Assume I have a model that is simply a cube. (It is more complicated than a cube, but for the purposes of this discussion, we will simplify.)
So when I am in Sketchup, the cube is Xmm by Xmm by Xmm, where X is an integer. I then export a Collada file and subsequently load that into three.js.
Now if I look at the geometry bounding box, the values are floats, not integers.
So now assume I am putting cubes next to each other with a small space in between, say 1 pixel. Because screens can't draw half pixels, sometimes I see one pixel and sometimes I see two, which causes a lack of uniformity.
I think I can resolve this satisfactorily if I can somehow get the imported model to have integer dimensions. I have full access to all parts of the model starting with Sketchup, so any point in the process is fair game.
Is it possible?
Thanks.
Clarification: My app will have two views. The view that this is concerned with is using an OrthographicCamera that is looking straight down on the pieces, so this is really a 2D view. For purposes of this question, after importing the model, it should look like a grid of squares with uniform spacing in between.
UPDATE: I would ask that you please not respond unless you can provide an actual answer. If I need help finding a way to accomplish something, I will post a new question. For this question, I am only interested in knowing if it is possible to align an imported Collada model to full pixels and if so how. At this point, this is mostly to serve my curiosity and increase my knowledge of what is and isn't possible. Thank you community for your kind help.
Now you have to learn this thing about 3D programming: numbers don't mean anything :)
In the real world 1mm, 2.13cm and 100Kg specify something that can be measured and reproduced. But for a drawing library, those numbers don't mean anything.
In a drawing library, 3D points are always represented with 3 float values. You submit your points to the library, it transforms them into 2D points (they must be viewed on a 2D surface), and finally these 2D points are passed to a rasterizer, which translates the floating point values into integer values (the screen has a resolution of NxM pixels, both N and M being integers) and colors the actual pixels.
Your problem simply is not a problem. A cube of 1mm really means nothing: if you are designing an astronomical application, that object will never be seen, but if it's a microscopic one, it will be way larger than the screen. What matters are the coordinates of the points and the scale of the overall application.
Now back to your cubes, don't try to insert 1px in between two adjacent ones. Your cubes are defined in terms of mm, so try to choose the distance in mm appropriate to your world, and let the rasterizer do its job and translate them to pixels.
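That said, if the goal is still pixel alignment in the top-down 2D view, one approach (my own sketch, not something confirmed by the answers here) is to make the OrthographicCamera frustum match the canvas size in pixels, so one scene unit is exactly one pixel and integer sizes and positions land on whole pixels:

    import * as THREE from 'three';

    const width = 800;   // canvas size in pixels (placeholder values),
    const height = 600;  // assuming renderer.setSize(width, height)

    // One world unit == one screen pixel for geometry on the ground plane.
    const camera = new THREE.OrthographicCamera(
      -width / 2, width / 2,    // left, right
      height / 2, -height / 2,  // top, bottom
      0.1, 1000                 // near, far
    );
    camera.position.set(0, 500, 0);  // straight down, matching the top-down 2D view
    camera.up.set(0, 0, -1);         // avoid a degenerate up vector when looking down -Y
    camera.lookAt(new THREE.Vector3(0, 0, 0));

    // If the imported Collada geometry arrives with float dimensions, snap its
    // bounding box to integers so pieces placed at integer offsets keep uniform gaps.
    function snapToIntegerSize(mesh: THREE.Mesh): void {
      mesh.geometry.computeBoundingBox();
      const size = new THREE.Vector3();
      mesh.geometry.boundingBox!.getSize(size);
      const snap = (s: number) => (s > 0 ? Math.max(1, Math.round(s)) / s : 1);
      mesh.scale.set(snap(size.x), snap(size.y), snap(size.z));
    }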
Two co-workers that I tracked down have informed me that this is indeed impossible by normal means.

3D Matrices - How to "Wobble" an object?

I'm trying to make a 3D object do a wobble effect, very much like a boss in StarFox 64 did when it teleported (see this video at 5:17 for reference). This seems like either a skewing effect, or perhaps a non-uniform scale that rotated around and was applied without rotating the object itself.
Does anyone have any idea how this might be done, or perhaps does anyone have any links to programs where I can play with the matrices directly to see how this is done?
You can use a skew based on the roll axis in the Euler angles coordinate system (see the sketch after the links below).
See Euler angles
http://en.wikipedia.org/wiki/Euler_angles
Euler angles-matrix transformation ("General rotations" part of the article):
http://en.wikipedia.org/wiki/Rotation_matrix
An Euler angles to matrix conversion utility in the DirectX SDK:
http://msdn.microsoft.com/en-us/library/microsoft.windowsmobile.directx.matrix.rotationyawpitchroll%28v=VS.85%29.aspx
And threads about skew matrices
skew matrix algorithm
http://www.quantunet.com/flash8/knowledgebase/actionscript/advanced/matrix/matrix_skew.html
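A rough three.js sketch of the idea (my own construction, not code from the links above): build a shear matrix whose strength and direction oscillate over time and compose it with the object's normal transform, so the mesh appears to wobble without actually rotating.

    import * as THREE from 'three';

    function wobbleMatrix(time: number, amount = 0.25): THREE.Matrix4 {
      // Shear X and Z as a function of Y, with the shear direction rotating over time.
      const shearXY = amount * Math.sin(time * 4.0); // X picks up a fraction of Y
      const shearZY = amount * Math.cos(time * 4.0); // Z picks up a fraction of Y
      return new THREE.Matrix4().set(
        1, shearXY, 0, 0,
        0, 1,       0, 0,
        0, shearZY, 1, 0,
        0, 0,       0, 1
      );
    }

    // Per-frame usage on a mesh whose regular position/rotation/scale are left alone.
    function applyWobble(mesh: THREE.Mesh, time: number): void {
      mesh.matrixAutoUpdate = false;            // we compose the matrix ourselves
      mesh.updateMatrix();                      // position/quaternion/scale -> mesh.matrix
      mesh.matrix.multiply(wobbleMatrix(time)); // shear applied in the object's local space
    }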

A 360 degree sphere panorama into cube panorama transformation algorithm (pseudocode or at least full logic wanted)

So we can take an image like this from Wikipedia
and try to map it onto a cube, or something like a cube,
and then distort it for the top and bottom like this.
Someone may think that doing the distortion for only one half and then trying to fill in the rest would work,
but it would not =( and content-aware filling would not help fill that square =(
and it looks bad if you try to render such a cubic panorama.
Another way I can imagine is to render the 3D panorama onto a sphere and then somehow take snapshots/projections of it onto a cube... but I do not know how to write that down with simple math operations (the idea here is not to use rendering engines but to do it as mathematically as possible).
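A minimal sketch of the pure-math route (no rendering engine): for every pixel of a cube face, build the 3D direction it looks along, convert that direction to longitude/latitude, and sample the equirectangular panorama there. The face orientation vectors and the nearest-neighbour sampling are my own simplifications.

    type Vec3 = [number, number, number];

    // Forward, right and up vectors for one cube face (here: the +X face).
    const FACE_PX = { f: [1, 0, 0] as Vec3, r: [0, 0, -1] as Vec3, u: [0, 1, 0] as Vec3 };

    function renderFace(src: ImageData, faceSize: number, face = FACE_PX): Uint8ClampedArray {
      const out = new Uint8ClampedArray(faceSize * faceSize * 4);
      for (let j = 0; j < faceSize; j++) {
        for (let i = 0; i < faceSize; i++) {
          // Position on the face plane in [-1, 1]; row 0 is the top of the face.
          const a = (2 * (i + 0.5)) / faceSize - 1;
          const b = 1 - (2 * (j + 0.5)) / faceSize;
          // Direction = forward + a * right + b * up (not normalised; atan2 doesn't care).
          const d: Vec3 = [
            face.f[0] + a * face.r[0] + b * face.u[0],
            face.f[1] + a * face.r[1] + b * face.u[1],
            face.f[2] + a * face.r[2] + b * face.u[2],
          ];
          const lon = Math.atan2(d[2], d[0]);                   // [-pi, pi]
          const lat = Math.atan2(d[1], Math.hypot(d[0], d[2])); // [-pi/2, pi/2]
          // Map lon/lat to equirectangular pixel coordinates and copy the sample.
          const sx = Math.min(src.width - 1, Math.floor(((lon + Math.PI) / (2 * Math.PI)) * src.width));
          const sy = Math.min(src.height - 1, Math.floor(((Math.PI / 2 - lat) / Math.PI) * src.height));
          const s = (sy * src.width + sx) * 4;
          const t = (j * faceSize + i) * 4;
          out.set(src.data.subarray(s, s + 4), t);
        }
      }
      return out;
    }

Repeating this with the forward/right/up vectors of the other five faces gives the full cube panorama; bilinear sampling would remove the blockiness of the nearest-neighbour lookup.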
Jim,
I am Ken Chan, the primary architect of the Quadrilateralized Spherical Cube (QLSC). You can search Google for many references to the 1975 report "Feasibility Study of a Quadrilateralized Earth Data Base", which I co-authored with my colleague Mike O'Neill. I did all the formulation and mathematical analysis and Mike did all the software design and coding. I still have the report somewhere. I believe the code is in an appendix in the back, but I cannot testify to that.
There was an earlier report "Organizational Structures for Constant Resolution Earth Data Bases" in 1973 which I co-authored with two other colleagues (Paul Beaudet and Leon Goldshlak) at Computer Sciences Corporation (CSC). Leon was the project manager. Paul proposed one structure and I proposed four. The QLSC was one of my four conceptualizations and was subsequently chosen by the Navy for adoption. No code was developed for any of these models.
I have been away from that area of work for more than 35 years, but I was aware that NASA Goddard in Greenbelt, Maryland eventually used the QLSC for its COBE mission. I also became aware that the QLSC (or some derivative of it) was used by astronomers and astrophysicists in the US and Europe for star mapping because of its equal-area properties as well as its hierarchical indexing scheme.
Lately, I have also become aware that the basic organizational structure has been used in Hyperspectral Data Management and Compression.
I just turned 70 years old a few days ago and nothing makes me feel more satisfied than knowing I am leaving behind something that other people can use. The thought of patenting it never crossed my mind when I developed the approach. Also, the thought of naming it the "Chan Spherical Cube" (to be abbreviated CSC) was rejected by Computer Sciences Corporation and by me.
I hope this gives you some idea of the history of the QLSC.
Ken
There's a map projection called the Quadrilateralized Spherical Cube that's used in astrophysics to represent all-sky maps. It has a nice property that the pixels are within a few percent of having equal areas all over the sky, so that geometric distortions are reduced.
Basically, the celestial globe is projected onto a cube, and each cube face is divided into pixels; but rather than being a rectilinear grid, the row and column boundaries are slightly curved so that each pixel maps to a roughly equally sized area on the sphere.
The pixel addressing is kind of interesting. Suppose you have a pixel with coordinates X,Y on one of the cube faces. If X has binary representation abcd, and Y is ABCD, then the pixel address on that face has X and Y interleaved: aAbBcCdD. So to rebin the image to larger pixels, all you need to do is shift right 2 bits to get the pixel address at the lower resolution.
With 32-bit pixel addresses, you can use 3 bits to represent the cube face, and 28 bits to represent the interleaved X and Y coordinates within that face. At this resolution, each pixel covers an area of about 20x20 arcsec, or about a third of a mile square(ish) -- so one could make good use of this as a sort of geographic or celestial coordinate hashing technique.
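A small sketch of that interleaved (Morton / Z-order) addressing, following the 3-bit-face / 28-bit layout described above; the function names are mine:

    function interleaveBits(x: number, y: number, bits = 14): number {
      let addr = 0;
      for (let i = bits - 1; i >= 0; i--) {
        addr = (addr << 2) | (((x >> i) & 1) << 1) | ((y >> i) & 1); // ...aAbBcCdD
      }
      return addr;
    }

    function pixelAddress(face: number, x: number, y: number): number {
      return (face << 28) | interleaveBits(x, y); // 3 bits of face, 28 bits interleaved
    }

    // Rebinning to larger pixels is just a 2-bit right shift of the interleaved part.
    function rebin(address: number): number {
      const face = address >>> 28;
      return (face << 28) | ((address & 0x0fffffff) >>> 2);
    }

    // Example: x = 0b1010, y = 0b0110 -> interleaved 0b10011100 (decimal 156).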
To use this, you'd have to implement forward transformations (long, lat) or (RA, dec) to pixel numbers, and inverse transformations going from pixel numbers to (long, lat) or (RA, dec). And of course there are tons of well-known map projections from image coordinates to (long,lat) and back.
I didn't find any code for this in a few minutes of Googling -- maybe I can dig up some code I wrote about 20 years ago when I worked on the EUVE astrophysics mission, which used this projection for their all-sky survey maps.

Resources