Get pure view matrix in Vuforia

I am using Vuforia SDK to build my AR app.
By using trackableResult->getPose(), I can get a model view matrix of the target frame marker.
But I also need the pure view matrix to do some calculations. Is there any way to get it?

To follow on from peedee's comment above, here is a picture I found extremely helpful, linked to from this web page.
Sorry I couldn't leave this as a comment, but they don't allow pictures in comments.
Note that different texts may use slightly different names for each space, but the overall idea should be the same!

I first learned OpenGL with Vuforia two months ago, and I have since concluded that the variable naming in Vuforia's examples is confusing.
I changed the name of what they call "modelViewMatrix_Vuforia" to "viewMatrix", because that's really what it is. It transforms points from world space to view space / eye space.
In my understanding, the "model matrix" is something unique to each model that places your models with respect to each other. You don't necessarily need one; if your models are already defined in world space, you can just forget about it.
But if your models are imported (for example mine are made with Blender) then they are probably defined in model space, with the origin at their own center of mass. In that case you need a model matrix to place them away from the origin, otherwise all your models will be stacked onto each other.
Maybe what I wrote is not clear enough... I only just figured this out this week myself. I wanted to do environment reflection with a cubemap, and there I needed an inverted view matrix, which brought me exactly to the question you asked.
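To make this concrete: if you treat the image target as the origin of world space, the pose from getPose() already is the view matrix (the model matrix is identity, or whatever places your meshes relative to the marker). Here is a minimal numpy sketch (the helper name is mine, not Vuforia's) of inverting such a rigid view matrix, which is exactly what the cubemap reflection case needs:

```python
import numpy as np

def invert_rigid(view):
    """Invert a 4x4 rigid-body (rotation + translation) view matrix.

    For a rigid transform the inverse is [R^T | -R^T t], which is
    cheaper and more numerically stable than a general inverse.
    """
    R = view[:3, :3]
    t = view[:3, 3]
    inv = np.eye(4)
    inv[:3, :3] = R.T
    inv[:3, 3] = -R.T @ t
    return inv

# Example: a view matrix that rotates 90 degrees about Z and translates.
theta = np.pi / 2
view = np.array([
    [np.cos(theta), -np.sin(theta), 0.0, 1.0],
    [np.sin(theta),  np.cos(theta), 0.0, 2.0],
    [0.0,            0.0,           1.0, 3.0],
    [0.0,            0.0,           0.0, 1.0],
])
assert np.allclose(invert_rigid(view) @ view, np.eye(4))
```

The same math applies to the matrix you get from Tool::convertPose2GLMatrix on the C++ side; only the storage convention (column-major vs row-major) differs.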

Related

Removing skew/distortion based on known dimensions of a shape

I have an idea for an app that takes a printed page with four squares, one in each corner, and allows you to measure objects on the paper as long as at least two squares are visible. I want users to be able to take a picture from less-than-perfect angles and still have the objects measured accurately.
I'm unable to figure out exactly how to find information on this subject due to my lack of knowledge in the area. I've been able to find examples of opencv code that does some interesting transforms and the like but I've yet to figure out what I'm asking in simpler terms.
Does anyone know of papers or mathematical concepts I can look up to get further into this project?
I'm not quite sure how or who to ask other than people on this forum, sorry for the somewhat vague question.
What you describe is very reminiscent of augmented reality marker tracking. Maybe you can start by searching these words on a search engine of your choice.
A single marker, if designed correctly, can be identified without being confused with other markers AND used to determine how the surface is placed in 3D space in front of the camera.
But that's all very difficult and advanced stuff; I'd strongly advise NOT trying to implement something like this yourself, as it would take years of research... Your only realistic option is a ready-made open source library that outputs the data your app needs.
One may not even exist. In that case you'll have to buy one. Given how niche your problem is, that would be perfectly plausible.
Here I cover only the programming aspect; if you want, you can dig into the mathematical side starting from these examples. Most of the functions you need are available in OpenCV. Here are some examples in Python:
To detect the printed paper, you can use the cv2.findContours function. The outermost contour is probably the paper, but you need to test on actual images. https://docs.opencv.org/3.1.0/d4/d73/tutorial_py_contours_begin.html
If the paper is skewed (not captured at a perfect angle), you can find the angle with cv2.minAreaRect, which returns the angle of the contour you found above. https://docs.opencv.org/3.1.0/dd/d49/tutorial_py_contour_features.html (part 7b).
If you want to rotate the paper, use cv2.warpAffine. https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html
To detect the objects on the paper, there are several methods. The easiest is to use the contours above. If the objects have distinctive colors, you can detect them with a color filter. https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html

3d modeling of human body in the browser? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I am currently working on a personal project (ecommerce site for clothing), I want to create a virtual trial room for the customers. For this I am taking their height, weight and body shape, etc. as an input and based on these inputs I want to create a 3D model of their body on the fly. Then they can pick from the clothes and see how it looks on these models.
After doing some research I came to know that, in order to achieve this, three.js and WebGL should be my primary weapons.
I also came across this awesome site https://zygotebody.com/ which I found very inspiring. It's more advanced than what I need, though; I only want to create the 3D body model.
I would really appreciate it if you could guide me in the right direction and point out some methods to achieve this. I know I have not done full research on this yet, but asking this question here is also part of my research. I don't want to start off in the wrong direction, so I thought some pro advice couldn't hurt.
Thank you.
One approach would be to create morph targets in 3D software (Blender, Lightwave, 3DMAX, Maya, etc.). Such software gives you many more tools to manipulate the base shape. You could also look into Poser from Smith Micro, which can morph body models internally and export them as morph targets (but check the license for usage of these exported objects).
These morph targets can then be used to interpolate the base object into various shapes - even combined.
You would create one morph target for legs, one for arms, one for the belly, etc. These are known as morph channels. Then interpolate each point between the base and the morph target using linear interpolation to morph the shape into what you need.
This approach doesn't require you to remap points or vertices etc. Just go point by point and interpolate.
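The interpolation itself is just a weighted blend per vertex. A minimal numpy sketch (tiny toy mesh; the channel names are made up for illustration) of combining morph channels:

```python
import numpy as np

# Base mesh vertices (N x 3) and two morph targets with identical topology.
base = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0]])
target_arms = np.array([[0.0, 0, 0], [1.5, 0, 0], [0, 1, 0]])
target_legs = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1.4, 0]])

def morph(base, channels):
    """Linearly interpolate the base mesh toward several morph targets.

    channels: list of (target_vertices, weight) pairs, weights in [0, 1].
    Each channel adds weight * (target - base), so channels combine.
    """
    out = base.copy()
    for target, weight in channels:
        out += weight * (target - base)
    return out

shape = morph(base, [(target_arms, 0.5), (target_legs, 1.0)])
```

Because each channel only contributes its delta from the base, a "long arms" channel and a "big belly" channel can be driven independently by two sliders, which is exactly why no remapping of vertices is needed.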
Kind of echoing an existing answer, but I would also recommend using a handful of base meshes (possibly even just one) and relative morph targets (relative vertex offsets that change the shape of the model in some way), which you can interpolate, possibly adding variation through textures/decals.
I've even seen artists use the same base mesh for both male and female, though more often male and female base meshes use different topology. On top of programs like Poser, which might give you some nice ideas, game engines also use this technique for character customization.
It's basically the same model, but sliders control how much of a morph (relative vertex deltas) to apply, along with affecting textures and some shader parameters; often these are just linearly interpolated to create a new appearance. You can also use morph targets for facial animation, like making characters smile, which is sometimes awkward to do with bones and skinning.
For adjusting height in a way that keeps the proportions identical, a morph target might be overkill; in that case you might just use a transformation matrix. If you want to adjust the proportions, though, like making the torso and legs shorter without affecting the head size as much (which generally produces a more convincing effect), then morph targets make sense again.
All of this does require some work from a skilled 3D character modeler to create convincing content, but it's a very economical way from the programming side to achieve all these variations. If you populate a world filled with such characters, it's also very economical in terms of memory and processing since only things like vertex positions and shader parameters need to be made unique while the rest of data (texture coordinates, polygons, shader associations, etc) can be shared/instanced across characters.
I just found this old thread, but I am working on something similar.
I was able to find what I would call a beautiful model to work with from:
https://github.com/fashiontec/bodyapps-viz
This one includes a male, female, and child. I was able to clone that repo and fire it up in my browser within seconds.
I am working on porting it into React right now, so I am finding this thread for different reasons, but I thought I should share my earlier findings.
It comes with dat.GUI, so you can use that to modulate parameters such as neck girth, torso girth, etc. It's probably exactly what the original question was asking for.

ThreeJS: is it possible to simplify an object / reduce the number of vertexes?

I'm starting to learn ThreeJS. I have some very complex models to display.
These models come from Autocad files that my customer provides.
But sometimes the amount of details in the model is just way too much for the purpose of the website.
I would like to reduce the amount of vertexes in the model to simplify the display and enhance performance.
Is this possible from within ThreeJS? Or is there maybe an other solution for this?
There's a modifier called SimplifyModifier that works very well. You'll find it in the Three.js examples
https://threejs.org/examples/#webgl_modifier_simplifier
If you can import the model into Blender, you could try Decimate Modifier. In the latest version of Blender, it features three different methods with configurable "amount" parameters. Depending on your mesh topology, it might reduce the poly count drastically with virtually no visual changes, or it might completely break the model with even a slight reduction attempt. Other 3d packages should include a similar functionality, but I've never used those.
Another thing that came into mind: Sometimes when I've encountered a too high-poly Blender model, a good start has been checking if it has a Subdivision Modifier applied and removing that if so. While I do not know if there's something similar in Autocad, it might be worth investigating.
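If you're curious what a simplifier does under the hood, here is a minimal numpy sketch of vertex clustering: a much cruder technique than the edge-collapse style used by tools like SimplifyModifier or Decimate, but it shows the basic trade of detail for vertex count (toy data, triangle faces assumed):

```python
import numpy as np

def cluster_simplify(vertices, faces, cell=0.5):
    """Naive vertex-clustering simplification.

    Snap each vertex to a grid cell, merge vertices that share a cell
    (averaging their positions), and drop faces that collapse to fewer
    than 3 unique vertices.
    """
    keys = np.floor(vertices / cell).astype(int)
    unique, remap = np.unique(keys, axis=0, return_inverse=True)
    new_vertices = np.zeros((len(unique), 3))
    counts = np.zeros(len(unique))
    np.add.at(new_vertices, remap, vertices)
    np.add.at(counts, remap, 1)
    new_vertices /= counts[:, None]          # average position per cluster
    new_faces = []
    for face in faces:
        f = tuple(remap[list(face)])
        if len(set(f)) == 3:                 # drop degenerate faces
            new_faces.append(f)
    return new_vertices, new_faces

verts = np.array([[0.0, 0, 0], [0.1, 0, 0], [1.0, 0, 0], [1.0, 1, 0]])
faces = [(0, 1, 2), (0, 2, 3)]
new_verts, new_faces = cluster_simplify(verts, faces, cell=0.5)
```

The two nearby vertices merge into one cluster, so one triangle collapses and gets dropped. Real simplifiers preserve shape far better, but the input/output contract is the same: fewer vertices, remapped faces.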
I updated the SimplifyModifier function so it works with textured models. Here is an example:
https://rigmodels.com/3d_LOD.php?view=5OOOEWDBCBR630L4R438TGJCD
You can extract the JS code and use it in your project.

Augmented Reality display

A quick question.
Why is paper required when viewing a 3D AR image through your phone? Whenever I watch a YouTube video of someone demonstrating AR, they always place paper within the view of the camera. Is it used as a point of reference or something? Can any paper be used, or does it need to be specific to the image being displayed? Does it need to be paper, or could any square or rectangular surface, such as a kitchen table, be used?
Thanks
Yes, it is used as a point of reference. Objects can be placed on top of the paper and so will be parallel to the ground. Otherwise you would need some calibration method to recognize the ground; the IMU could be used, but it's not always accurate enough.
With the Vuforia SDK you can either predefine a printed target or allow the user to take a picture of any planar surface. When you predefine a printed target, you can be sure it is appropriate for tracking, and you can use it as an advert for your app. On the other hand, allowing the user to use any paper with enough detail is more user-friendly, as they won't need a printer.
Target papers don't need to be rectangular; they only need to contain enough features. Most frameworks use corners as "features", so circular shapes are inappropriate. You also have to make sure the surface has enough contrast.
Note that there are methods to do it without paper, but you have to solve the calibration issue and accept measurement errors. At the moment there isn't any free framework that supports this out of the box, but as a starting point you could read Luca's answer to one of my questions.

Dilemma about image cropping algorithm - is it possible?

I am building a web application using .NET 3.5 (ASP.NET, SQL Server, C#, WCF, WF, etc) and I have run into a major design dilemma. This is a uni project btw, but it is 100% up to me what I develop.
I need to design a system whereby I can take an image and automatically crop a certain object within it, without user input. So for example, cut out the car in a picture of a road. I've given this a lot of thought, and I can't see any feasible method. I guess this thread is to discuss the issues and the feasibility of achieving this goal.
Eventually, I would get the dimensions of the car (or whatever it may be) and pass these into a custom 3D modelling app as parameters to render a 3D model. That last step is a lot more feasible; it's the cropping that's the problem.
I have thought of all sorts of ideas, like getting the colour of the car and then tracing the outline around that colour. So if the car (for example) is yellow, when there is a yellow pixel in the image, trace around it. But this would fail if there are two yellow cars in the photo.
Ideally, I would like the system to be completely automated. But I guess I can't have everything my way. Also, my skills are in what I mentioned above (.NET 3.5, SQL Server, AJAX, web design) as opposed to C++ but I would be open to any solution just to see the feasibility.
I also found this patent: US Patent 7034848 - System and method for automatically cropping graphical images
Thanks
This is one of the problems that needed to be solved to finish the DARPA Grand Challenge. Google video has a great presentation by the project lead from the winning team, where he talks about how they went about their solution, and how some of the other teams approached it. The relevant portion starts around 19:30 of the video, but it's a great talk, and the whole thing is worth a watch. Hopefully it gives you a good starting point for solving your problem.
What you are talking about is an open research problem, or even several research problems. One way to tackle this is image segmentation. If you can safely assume that there is one object of interest in the image, you can try a figure-ground segmentation algorithm. There are many such algorithms, and none of them are perfect. They usually output a segmentation mask: a binary image where the figure is white and the background is black. You would then find the bounding box of the figure and use it to crop. The thing to remember is that none of the existing segmentation algorithms will give you what you want 100% of the time.
Alternatively, if you know ahead of time what specific type of object you need to crop (car, person, motorcycle), then you can try an object detection algorithm. Once again, there are many, and none of them are perfect either. On the other hand, some of them may work better than segmentation if your object of interest is on very cluttered background.
To summarize: if you wish to pursue this, you will have to read a fair number of computer vision papers and try a fair number of different algorithms. You will also increase your chances of success if you constrain your problem domain as much as possible: for example, restrict yourself to a small number of object categories, assume there is only one object of interest per image, or restrict yourself to a certain type of scene (nature, sea, etc.). Also keep in mind that even the accuracy of state-of-the-art approaches to this type of problem has a lot of room for improvement.
And by the way, the choice of language or platform for this project is by far the least difficult part.
A method often used for face detection in images is through the use of a Haar classifier cascade. A classifier cascade can be trained to detect any objects, not just faces, but the ability of the classifier is highly dependent on the quality of the training data.
This paper by Viola and Jones explains how it works and how it can be optimised.
Although it is C++ you might want to take a look at the image processing libraries provided by the OpenCV project which include code to both train and use Haar cascades. You will need a set of car and non-car images to train a system!
Some of the best attempts at this I've seen use a large database of images to help understand the image you have. These days you have Flickr, which is not only a giant corpus of images but is also tagged with meta-information about what each image contains.
Some projects that do this are documented here:
http://blogs.zdnet.com/emergingtech/?p=629
Start with analyzing the images yourself. That way you can formulate the criteria on which to match the car. And you get to define what you cannot match.
If all cars have the same background, for example, it need not be that complex. But your example states a car on a street. There may be parked cars. Should they be recognized?
If you have access to MatLab, you could test your pattern recognition filters with specialized software like PRTools.
When I was studying (a long time ago :) I used Khoros Cantata and found that an edge filter can simplify the image greatly.
But again, first define the conditions on the input. If you don't, you will not succeed, because pattern recognition is really hard (think about how long it took to crack captchas).
I did say photo, so this could be a black car against a black background. I did think of specifying the colour of the object and then, when that colour is found, tracing around it (a high-level explanation). But with a black object on a black background (no contrast, in other words), it would be a very difficult task.
Better still, I've come across several sites with 3D models of cars. I could always use one of those, stick it into the 3D modelling app, and render it.
A 3D model would be easier to work with, a real world photo much harder. It does suck :(
If I'm reading this right... This is where AI shines.
I think the "simplest" solution would be to use a neural-network based image recognition algorithm. Unless you know that the car will look the exact same in each picture, then that's pretty much the only way.
If it IS the exact same, then you can just search for the pixel pattern, and get the bounding rectangle, and just set the image border to the inner boundary of the rectangle.
I think that you will never get good results without a real user telling the program what to do. Think of it this way: how should your program decide when there is more than one interesting object present (for example, two cars)? What if the object you want is actually the mountain in the background? What if nothing of interest is in the picture, so there is nothing to select as the object to crop out? And so on...
With that said, if you can make assumptions like "only one object will be present", then you can have a go at using image recognition algorithms.
Now that I think of it, I recently attended a lecture about artificial intelligence in robots and robotic research techniques. Their research focused on language interaction, evolution, and language recognition, but in order to do that they also needed some simple image recognition algorithms to process the perceived environment. One of the tricks they used was to make a 3D plot of the image, where x and y were the normal image axes and the z axis was the brightness of that particular point; then they applied the same technique to red-green and blue-yellow values. And lo and behold, they had something (relatively) easy they could use to pick out objects from the perceived environment.
(I'm terribly sorry, but I can't find a link to the nice charts they had that showed how it all worked).
Anyway, the point is that they were not interested (that much) in image recognition, so they created something that worked well enough using less advanced and thus less time-consuming techniques. So it is possible to create something simple for this complex task.
Also, any good image editing program has some kind of magic wand that will select, with the right amount of tweaking, the object of interest you point it at; maybe it's worth your time to look into that as well.
So, basically, it means that you:
have to make some assumptions, otherwise it will fail terribly
will probably best be served with techniques from AI, and more specifically image recognition
can take a look at paint.NET and their algorithm for their magic wand
try to use the fact that a good photo will have the object of interest somewhere in the middle of the image
...but I'm not saying that this is the solution to your problem; maybe something simpler can be used.
Oh, and I will continue to look for those links, they hold some really valuable information about this topic, but I can't promise anything.
