How to best use a scene graph when either rasterizing or ray-tracing?

So this was a question on my Computer Graphics final to which I still don't know an answer.
What is a scene graph? How is it best used when rasterizing or ray-tracing an image, respectively?
A scene graph is a way to manage hierarchical transformations.
However, I do not know whether it makes a difference if you generate an image by rasterizing or by ray-tracing it.
Hoping somebody can enlighten me.

When rasterizing, you usually traverse the scene graph recursively and build up a transformation matrix, which you then apply to your base geometry (object space) to transform it into screen space.
When ray-tracing, you recursively traverse the scene graph as well, but instead of transforming the geometry you usually transform the ray: at each node you apply the inverse of the node's transform to bring the ray into that node's local space.
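To make that concrete, here is a rough sketch of the ray-transform approach; Mat4, Ray, Hit, Geometry, transformRay and inverse are placeholder names, not a specific library:
#include <vector>

// Sketch: intersect a ray against a scene graph by transforming the ray into each
// node's local space instead of transforming the geometry.
struct Node {
    Mat4 localTransform;                 // transform relative to the parent node
    std::vector<Node*> children;
    Geometry* geometry = nullptr;        // non-null for leaves, stored in object space

    bool intersect(const Ray& parentRay, Hit& hit) const {
        // Bring the incoming ray into this node's local space with the inverse transform.
        Ray localRay = transformRay(inverse(localTransform), parentRay);
        bool anyHit = false;
        if (geometry)
            anyHit |= geometry->intersect(localRay, hit);
        for (const Node* child : children)
            anyHit |= child->intersect(localRay, hit);
        return anyHit;
    }
};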
I'm not sure if that's what they meant, but that's the main difference I'm aware of.

Related

Best data structure for point cloud updates?

I'm working on a robot using the new Jetson Nano. I've got points generated from the depth image of my camera and am working towards creating a scene as the robot moves around. My issue is that just throwing points into the data structure every frame would make me run out of memory very quickly. So I want some heuristic that says: if a point meets some condition, don't add it.
For this I imagine I need an acceleration structure like an octree, k-d tree, BVH, or maybe something else. While I am familiar with them and can find lots of info on how to build them, I'm a little confused about which of them would be easiest to update each frame, and whether some require complete rebuilds rather than incremental updates. Could some be parallelized? Any insight on what type of data structure to use, ideally with a link about it, would be super helpful.
Edit:
I believe the best structure for this is likely a Sparse Voxel Octree. You can find some general ideas of how to build one in this blog post from Nvidia: https://devblogs.nvidia.com/thinking-parallel-part-iii-tree-construction-gpu/ .
If a Morton code maps to a specific voxel, that voxel is 'filled'. Redundant points are automatically taken care of, as a voxel is either filled or unfilled. For removal I think I can do ray tracing on the octree: if I hit a filled voxel before I expect to, I delete the existing voxel. There are some resolution problems, but I think I can handle those with a hybrid approach.
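For reference, the filled/unfilled bookkeeping can be done with nothing more than a hash set keyed by Morton codes; a full octree can then be built over the sorted codes as in the Nvidia post. This is just a sketch, and the SparseVoxelSet name, the 5 cm voxel size and the coordinate offset are my own choices:
#include <cstdint>
#include <unordered_set>

// Spread the lower 21 bits of v so there are two zero bits between each bit.
static uint64_t expandBits(uint64_t v) {
    v &= 0x1FFFFF;                                   // keep 21 bits per axis
    v = (v | (v << 32)) & 0x1F00000000FFFFull;
    v = (v | (v << 16)) & 0x1F0000FF0000FFull;
    v = (v | (v << 8))  & 0x100F00F00F00F00Full;
    v = (v | (v << 4))  & 0x10C30C30C30C30C3ull;
    v = (v | (v << 2))  & 0x1249249249249249ull;
    return v;
}

// Interleave three 21-bit coordinates into a 63-bit Morton code.
static uint64_t mortonCode(uint32_t x, uint32_t y, uint32_t z) {
    return (expandBits(x) << 2) | (expandBits(y) << 1) | expandBits(z);
}

struct SparseVoxelSet {
    float voxelSize = 0.05f;                         // 5 cm voxels, tune for your sensor
    std::unordered_set<uint64_t> filled;

    // Returns true if the point created a new voxel, false if it was redundant.
    bool insert(float x, float y, float z) {
        // Offset so negative coordinates map onto the unsigned grid
        // (assumes |coordinate| stays below roughly voxelSize * 2^20).
        auto q = [&](float c) { return static_cast<uint32_t>(c / voxelSize + (1 << 20)); };
        return filled.insert(mortonCode(q(x), q(y), q(z))).second;
    }
};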

Is there a way to create simple animations "on the fly" in modern OpenGL?

I think this requires a bit of background information:
I have been modding Minecraft for a while now, but I always wanted to make my own game, so I started digging into the freshly released LWJGL3 to actually get things done. Yes, I know it's a bit low level and I should use an engine and so on... indeed, I already tried some engines and they never quite matched what I want to do, so I decided I want to tackle the problem at its root.
So far, I kind of understand how to render meshes, move the "camera", etc. and I'm willing to take the learning curve.
But the thing is, at some point all the tutorials start to explain how to load models and create skeletal animations and so on... but I don't think I really want to go that way. A lot of things about working with Minecraft code were awful, but I liked how I could create models and animations from Java code. Sure, it did not look super realistic, but since I'm not great with Blender either, I doubt having "classic" models and animations would help. Anyway, in that code I could rotate a box around to make a creature look at a player, and I could use a sine function to move legs and arms (or wings, in my case), and that worked, since Minecraft used immediate mode and Java could directly tell the graphics card where to draw each vertex.
So, actual question(s): Is there any good way to make dynamic animations in modern (3.3+) OpenGL? My models would basically be a hierarchy of shapes (boxes or whatever) and I want to be able to rotate them on the fly. But I'm not sure how to organize that. Would I store all the translation/rotation matrices for each sub-shape? Would that put a hard limit on the number of sub-shapes a model could have? Did anyone try something like that?
Edit: For clarification, what I did looked something like this:
Create a model: https://github.com/TheOnlySilverClaw/Birdmod/blob/master/src/main/java/silverclaw/birds/client/model/ModelOstrich.java
The model is created as a bunch of boxes in the constructor, the render and setRotationAngles methods set scale and rotations.
You should follow an OpenGL tutorial in order to understand the basics.
Let me suggest "Learning Modern 3D Graphics Programming", and especially this chapter, where you move a robot arm with multiple joints.
I did a port in Java using JOGL here, but you can easily port it over to LWJGL.
What you are looking for is exactly skeletal animation, the only difference being that you do not want to load animations for your bones but want to compute/generate the transforms on the fly.
You basically have a hierarchy of bones with geometry attached to it. It looks like you want to manipulate this geometry "rigidly", so before sending your meshes/transforms to the GPU (the classic way), you start by computing the new transforms in model or world space, then send those freshly computed matrices to draw your geometry on the GPU the standard way.
As Sorin said, to compute each transform you simply iterate over your hierarchy and accumulate transforms, given the transform of the parent bone and your local transform w.r.t. the parent.
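A minimal sketch of that accumulation step; Mat4, Mesh and uploadModelMatrix are placeholder names, in practice the matrix type would come from a library like GLM/JOML and the upload would be a glUniformMatrix4fv call:
#include <vector>

// Accumulate transforms down a bone/shape hierarchy on the CPU and hand the final
// matrices to the GPU as uniforms.
struct Bone {
    Mat4 localTransform;           // e.g. recomputed every frame from a sine function
    std::vector<Bone*> children;
    Mesh* mesh = nullptr;          // geometry rigidly attached to this bone, if any
};

void drawHierarchy(const Bone& bone, const Mat4& parentWorld) {
    // World transform = parent's world transform * this bone's local transform.
    Mat4 world = parentWorld * bone.localTransform;
    if (bone.mesh) {
        uploadModelMatrix(world);  // placeholder for setting the model-matrix uniform
        bone.mesh->draw();
    }
    for (const Bone* child : bone.children)
        drawHierarchy(*child, world);
}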
Yes and no.
You can have your hierarchy of shapes and store a relative transform for each.
For example the "player" whould have a translation to 100,100, 10 (where the player is), and then the "head" subcomponent would have an additional translation of 0,0,5 (just a bit higher on the z axis).
You can store these as matrices (they can encode translation, roation and scaling) and use glPushMatrix and glPop matrix to add and remove a matrix to a stack maintained by openGL.
The draw() function (or whatever you call it) should look something like this:
glPushMatrix();
glMultMatrixf(my_transform); // or individual glTranslatef / glRotatef / glScalef calls
// Draw this node's own mesh here
for (auto& child : children) { child.draw(); }
glPopMatrix();
This gives you a hierarchical setup, so objects move with their parent. Alternatively, you can keep a stack in main memory and do the multiplications yourself (use a math library). The OpenGL stack may have a limit (it's implementation dependent), but if you handle it yourself the only limit is the amount of RAM you can use. Once all the matrices are multiplied, rendering takes the same amount of time, i.e. it doesn't matter for performance how deep a mesh is in the hierarchy.
For actual animations you need to compute the intermediate transformations. For example, for a crouch animation you probably want a few frames in between, so that the camera doesn't just jump to the low position. You can do this with a time-based linear interpolation between the start and end positions, but this only covers simple animations and you still have to implement it yourself.
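For example, a minimal sketch of such a time-based interpolation; the Vec3/Quat types and the lerp/slerp helpers are assumed to come from whatever math library you use:
struct Pose { Vec3 position; Quat rotation; };

// Blend between a start and an end pose over a fixed duration.
Pose interpolate(const Pose& start, const Pose& end, float elapsed, float duration) {
    float t = elapsed / duration;
    if (t < 0.0f) t = 0.0f;      // clamp so the animation stops cleanly at the end pose
    if (t > 1.0f) t = 1.0f;
    Pose out;
    out.position = lerp(start.position, end.position, t);   // linear for translation
    out.rotation = slerp(start.rotation, end.rotation, t);  // spherical for rotation
    return out;
}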
Anything more complicated (e.g. modifying the mesh based on the bone links) you would need to implement yourself.

Finding cross on the image

I have a set of binary images on which I need to find a cross (examples attached). I use findContours to extract borders from the binary image. But I can't figure out how to determine whether a given shape (border) is a cross or not. Maybe OpenCV has some built-in methods that could help solve this problem. I thought about solving it with machine learning, but I think there is a simpler way. Thanks!
Viola-Jones object detection could be a good start. Though the main usage of the algorithm (AFAIK) is face detection, it was actually designed for detecting any object, such as your cross.
It is a machine-learning based algorithm (so you will need a set of labeled "crosses" and a set of labeled "not crosses"), and you will need to identify the significant "features" (patterns) that will help the algorithm recognize crosses.
The algorithm is implemented in OpenCV as cvHaarDetectObjects().
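In the C++ API the same thing is exposed through cv::CascadeClassifier; a rough sketch (the cross_cascade.xml file is hypothetical: you would have to train it yourself on positive/negative samples, e.g. with opencv_traincascade):
#include <opencv2/objdetect.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main() {
    // Load a cascade trained on cross images (hypothetical file, trained by you).
    cv::CascadeClassifier cascade;
    if (!cascade.load("cross_cascade.xml"))
        return 1;

    cv::Mat image = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    std::vector<cv::Rect> crosses;
    cascade.detectMultiScale(image, crosses);   // candidate bounding boxes of crosses
    return crosses.empty() ? 1 : 0;
}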
From the original image, let's say you've extracted a set of polygons that could potentially be your cross. Assuming the whole cross is visible, to the extent that every edge has a distinguishable length, you could try the following:
1. Reject all polygons that do not have exactly the 12 vertices required to form your cross.
2. Re-order the vertices so that the shortest edge comes first.
3. Create a best-fit perspective transformation that maps your vertices onto a cross of uniform size.
4. Examine the residuals generated by using this transformation to project your polygon onto the uniform cross, where the residual for any given point is the distance between the projected point and the corresponding uniform point.
5. If all the residuals are within your defined tolerance, you've found a cross.
Note that this works primarily due to the simplicity of the geometric shape you're searching for. Your contours will also need to have noise removed for this to work, e.g. each line within the cross needs to be reduced to a single straight segment.
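A rough OpenCV sketch of those steps; instead of fixing the vertex order by the shortest edge, this version simply tries all 12 cyclic orderings against a reference cross, and the reference shape and the tolerance are assumptions for illustration:
#include <opencv2/imgproc.hpp>
#include <opencv2/calib3d.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// 12 vertices of a reference "+" shape traced in order around its outline.
static std::vector<cv::Point2f> referenceCross() {
    return { {1,0},{2,0},{2,1},{3,1},{3,2},{2,2},{2,3},{1,3},{1,2},{0,2},{0,1},{1,1} };
}

static bool looksLikeCross(const std::vector<cv::Point>& contour, double tolerance) {
    // 1. Simplify the contour so each side of the cross becomes a single segment.
    std::vector<cv::Point> approx;
    cv::approxPolyDP(contour, approx, 0.02 * cv::arcLength(contour, true), true);
    if (approx.size() != 12)
        return false;                              // a cross outline has 12 vertices

    std::vector<cv::Point2f> poly(approx.begin(), approx.end());
    const std::vector<cv::Point2f> ref = referenceCross();

    // 2.-4. For each cyclic vertex ordering, fit a perspective transform onto the
    // reference cross and check the reprojection residuals.
    for (size_t shift = 0; shift < poly.size(); ++shift) {
        cv::Mat H = cv::findHomography(poly, ref);
        if (!H.empty()) {
            std::vector<cv::Point2f> projected;
            cv::perspectiveTransform(poly, projected, H);
            bool ok = true;
            for (size_t i = 0; i < projected.size() && ok; ++i) {
                cv::Point2f d = projected[i] - ref[i];
                ok = std::sqrt(d.x * d.x + d.y * d.y) <= tolerance;
            }
            if (ok)
                return true;                       // 5. all residuals within tolerance
        }
        std::rotate(poly.begin(), poly.begin() + 1, poly.end());
    }
    return false;
}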
Depending on your requirements, you could try a local feature detector like SIFT or SURF. Check out OpenSURF, which is an interesting implementation of the latter.
After some days of struggle, I came to the conclusion that the only robust way here is to use SVM + HOG. That's all.
You could erode each blob and analyze how its number of pixels goes down. No matter the rotation or scaling of the crosses, the counts should always go down with the same ratio, except when you're closing in on the remaining center. Also, when the blob is small enough you should expect it to be in the center of the original blob. You won't need any machine learning algorithm or training data to solve this.
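A small sketch of that idea with OpenCV; the kernel size and what you compare the resulting ratio curve against (e.g. profiles measured on known crosses) are up to you:
#include <opencv2/imgproc.hpp>
#include <vector>

// Repeatedly erode a single binary blob and record how its pixel count shrinks.
static std::vector<int> erosionProfile(const cv::Mat& blob /* 8-bit binary, one blob */) {
    cv::Mat current = blob.clone();
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(3, 3));
    std::vector<int> pixelCounts;
    for (int count = cv::countNonZero(current); count > 0; count = cv::countNonZero(current)) {
        pixelCounts.push_back(count);
        cv::erode(current, current, kernel);       // peel one layer off the blob
    }
    return pixelCounts;
}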

What's the best depth map generation algorithm?

I'm working on a 2D-to-3D application project and I'm looking for a method to produce the depth map of a single input image, without any other external information. I know that's a sort of "artificial intelligence" matter, but maybe an efficient algorithm exists.
At the moment I've found this one: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.7959&rep=rep1&type=pdf but I'm wondering if there is a better method before I start implementing. Suggestions? Thanks!
I've written quite a few automatic depth map generators. I don't think there's one that's better than all others in all cases; it all depends on the stereo pair you're starting with. I personally think a depth map generator based on a local method (window or block based) with an edge-preserving smoother is probably the best all-around choice.
In any case, on this page:
depth map generation software
you can find depth map generator software based on optical flow, weight-based windows, graph cuts, and many other things that relate to depth map generation and lenticular creation. The best part is that it's all free.
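If you do have a stereo pair, a window/block based local method of the kind mentioned above is available out of the box in OpenCV; a minimal sketch, where the file names and parameters are placeholders:
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>

int main() {
    cv::Mat left = cv::imread("left.png", cv::IMREAD_GRAYSCALE);
    cv::Mat right = cv::imread("right.png", cv::IMREAD_GRAYSCALE);

    // numDisparities must be a multiple of 16; blockSize is the matching window size.
    cv::Ptr<cv::StereoBM> matcher = cv::StereoBM::create(64, 21);
    cv::Mat disparity;
    matcher->compute(left, right, disparity);           // 16-bit fixed-point disparities

    cv::Mat disparity8;
    disparity.convertTo(disparity8, CV_8U, 255.0 / (64 * 16.0));  // rescale for viewing
    cv::imwrite("disparity.png", disparity8);            // depth is proportional to 1/disparity
    return 0;
}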
For 2D-to-3D conversion (which is more what you are asking), there's a piece of software called DMAG4 that uses a sparsely populated depth map (typically done in Gimp with the paint brush) to indicate the main depths, and then fills the unfilled areas using interpolation while maintaining the edges of the objects (edge-preserving).
DMAG4 can be found here (it's free to use):
2d to 3d conversion software DMAG4
Another way to do 2D-to-3D conversion is to use a sculpting program like Gimpel3D or Blender, both free. Clearly, this goes beyond a depth map, since you're essentially creating a 3D scene in which you can then move around (using the camera movement in Blender). This is often referred to as "camera mapping".
Well, I have recently come upon this:
http://make3d.cs.cornell.edu/code.html
which comes together with code, although the license might be too restrictive
("Noncommercial — You may not use this work for commercial purposes").
The gallery is impressive:
http://make3d.stanford.edu/images/showall

Shapes dragging problem

I need to design a piece of software where the user can drag shapes in a window. The problem is that there might be thousands of shapes, and there might be some restrictions, e.g. one shape cannot be over another shape.
So I actually need to know how to organize the data storage, and I need an algorithm to quickly determine whether a shape can be placed at some particular position.
I think this problem has been solved many times, but I don't know how to google it properly. Could you please provide me with some information on this topic?
Thanks!
A quadtree (2D; octree if 3D) is often used in collision detection. The idea is to recursively divide the space into squares/cubes and place the shapes into the correct squares/cubes. When you need to perform collision detection on a given shape, you then only have to test the shapes in the same square/cube.
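A bare-bones sketch of such a quadtree over axis-aligned bounding rectangles; the Rect type, the maximum depth and the "store straddling shapes at the node" policy are my own choices for illustration, and an exact overlap test on the surviving candidates would still be done by the caller:
#include <memory>
#include <vector>

struct Rect {
    float x, y, w, h;
    bool intersects(const Rect& o) const {
        return x < o.x + o.w && o.x < x + w && y < o.y + o.h && o.y < y + h;
    }
    bool contains(const Rect& o) const {
        return o.x >= x && o.y >= y && o.x + o.w <= x + w && o.y + o.h <= y + h;
    }
};

class QuadTree {
public:
    QuadTree(const Rect& bounds, int depth = 0) : bounds_(bounds), depth_(depth) {}

    void insert(const Rect& shape) {
        if (depth_ < kMaxDepth) {
            subdivide();   // create the four children the first time they are needed
            for (auto& child : children_)
                if (child->bounds_.contains(shape)) { child->insert(shape); return; }
        }
        shapes_.push_back(shape);   // straddles children or max depth reached
    }

    // Collect every stored rectangle whose bounds overlap the query rectangle.
    void query(const Rect& area, std::vector<Rect>& out) const {
        if (!bounds_.intersects(area)) return;
        for (const Rect& s : shapes_)
            if (s.intersects(area)) out.push_back(s);
        for (const auto& child : children_)
            child->query(area, out);
    }

private:
    static constexpr int kMaxDepth = 8;
    void subdivide() {
        if (!children_.empty()) return;
        float hw = bounds_.w / 2, hh = bounds_.h / 2;
        children_.push_back(std::make_unique<QuadTree>(Rect{bounds_.x,      bounds_.y,      hw, hh}, depth_ + 1));
        children_.push_back(std::make_unique<QuadTree>(Rect{bounds_.x + hw, bounds_.y,      hw, hh}, depth_ + 1));
        children_.push_back(std::make_unique<QuadTree>(Rect{bounds_.x,      bounds_.y + hh, hw, hh}, depth_ + 1));
        children_.push_back(std::make_unique<QuadTree>(Rect{bounds_.x + hw, bounds_.y + hh, hw, hh}, depth_ + 1));
    }
    Rect bounds_;
    int depth_;
    std::vector<Rect> shapes_;
    std::vector<std::unique_ptr<QuadTree>> children_;
};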
There are other structures, each with pros/cons depending on the constraints you have. If the other shapes are static, BSP trees can also be a good structure.
