3d modeling of human body in the browser? [closed] - three.js

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I am currently working on a personal project (ecommerce site for clothing), I want to create a virtual trial room for the customers. For this I am taking their height, weight and body shape, etc. as an input and based on these inputs I want to create a 3D model of their body on the fly. Then they can pick from the clothes and see how it looks on these models.
After doing some research I came to know that, In order to acheiving this, three.js and webgl should be my primary weapons.
Also I came across this awesome site https://zygotebody.com/ which I found very inspiring, Here its too advance, I only want to create their 3D model.
I would really appreciate If you guys could guide me in the right direction, and point out some methods to acheive this. I know I have not done the full research for this, but asking this question here is also a part of my research. I don't want to get started in the wrong direction, So i thought some pros advice is no harm.
Thank you.

One approach would be to create morph targets in a 3D software (Blender, Lightwave, 3DMAX, Maya etc.). Using a 3D software gives you much more tools to manipulate the base shape. You could also look into Poser from Smith Micro which can morph body models internally and export those as morph target (but check license for usage of these exported objects).
These morph targets can then be used to interpolate the base object into various shapes - even combined.
You would create one morph target for legs, one for arms, one for belly etc. These will be known as morph channels. Then interpolate each point in the morph target using linear-interpolation with the base to morph the shape into what you need.
This approach doesn't require you to remap points or vertices etc. Just go point by point and interpolate.

Kind of echoing an existing answer but would also recommend using a handful of base meshes (possibly even just one) and using relative morph targets (relative vertex positions to change the shape of the model in some way) which you can interpolate along with possibly adding variation through textures/decals.
I've even seen artists use the same base mesh for both male and female, though more often a male and female base mesh would use different topology. On top of programs like Poser which might give you some nice ideas, game engines also use this technique to customize your character, like so:
It's basically the same model but the sliders control how much of a morph (relative vertex deltas) to apply along with affecting textures and some shader parameters, and often these are just linearly interpolated to create a new appearance. You can also use morph targets to do facial animation like make characters smile which is sometimes awkward to do with bones and skinning, e.g.
For adjusting height in ways that keeps the proportions identical, a morph target might be overkill. In that case you might just use a transformation matrix. If you want to adjust the proportions though like make the torso and legs shorter without affecting the head size as much (which would generally produce a more convincing effect), then morph targets make sense again.
All of this does require some work from a skilled 3D character modeler to create convincing content, but it's a very economical way from the programming side to achieve all these variations. If you populate a world filled with such characters, it's also very economical in terms of memory and processing since only things like vertex positions and shader parameters need to be made unique while the rest of data (texture coordinates, polygons, shader associations, etc) can be shared/instanced across characters.

I just found this old thread, but I am working on something similar.
I was able to find what I would call a beautiful model to work with from:
https://github.com/fashiontec/bodyapps-viz
This one includes a male, female, and child. I was able to clone that repo and fire it up in my browser within seconds.
I am working on porting it into React right now, so I am finding this thread for different reasons, but I thought I should share my earlier findings.
It comes with Data.GUI() so you can use that to modulate the parameters such as neck girth, torso girth, etc. It's probably exactly what the original question was asking for.

Related

Is there a way to create simple animations "on the fly" in modern OpenGL?

I think this requires a bit of background information:
I have been modding Minecraft for a while now, but I alway wanted to make my own game, so I started digging into the freshly released LWJGL3 to actually get things done. Yes, I know it's a bit ow level and I should use an engine and stuff...indeed, I already tried some engines and they never quite match what I want to do, so I decided I want to tackle the problem at its root.
So far, I kind of understand how to render meshes, move the "camera", etc. and I'm willing to take the learning curve.
But the thing is, at some point all the tutorials start to explain how to load models and create skeletal animations and so on...but I think I do not really want to go that way. A lot of things in working with Minecraft code was awful, but I liked how I could create models and animations from Java code. Sure, it did not look super realistic, but since I'm not great with Blender either, I doubt having "classic" models and animations would help. Anyway, in that code, I could rotate a box around to make a creature look at a player, I could use a sinus function to move legs and arms (or wings, in my case) and that was working, since Minecraft used immediate mode and Java could directly tell the graphics card where to draw each vertex.
So, actual question(s): Is there any good way to make dynamic animations in modern (3.3+) OpenGL? My models would basically be a hierarchy of shapes (boxes or whatever) and I want to be able to rotate them on the fly. But I'm not sure how to organize that. Would I store all the translation/rotation-matrices for each sub-shape? Would that put a hard limit on the amount of sub-shapes a model could have? Did anyone try something like that?
Edit: For clarification, what I did looked something like this:
Create a model: https://github.com/TheOnlySilverClaw/Birdmod/blob/master/src/main/java/silverclaw/birds/client/model/ModelOstrich.java
The model is created as a bunch of boxes in the constructor, the render and setRotationAngles methods set scale and rotations.
You should follow one opengl tutorial in order to understand the basics.
Let me suggest "Learning Modern 3D Graphics Programming", and especially this chapter, where you move one robot arm with multiple joints.
I did a port in java using jogl here, but you can easily port it over lwjgl.
What you are looking for is exactly skeletal animation, the only difference being the fact you do not want to load animations for your bones but want to compute / generate transforms on the fly.
You basically have a hierarchy of bones, and geometry attached to it. It looks like you want to manipulate this geometry "rigidly", so before sending your meshes / transforms to the GPU (the classic way), you want to start by computing the new transforms in model or world space, then send those freshly computed matrices to draw your geometries on the gpu the standard way.
As Sorin said, to compute each transform you simply have to iterate over your hierarchy and accumulate transforms given the transform of the parent bone and your local transform w.r.t the parent.
Yes and no.
You can have your hierarchy of shapes and store a relative transform for each.
For example the "player" whould have a translation to 100,100, 10 (where the player is), and then the "head" subcomponent would have an additional translation of 0,0,5 (just a bit higher on the z axis).
You can store these as matrices (they can encode translation, roation and scaling) and use glPushMatrix and glPop matrix to add and remove a matrix to a stack maintained by openGL.
The draw() function(or whatever you call it) should look something like :
glPushMatrix();
glMultMatrix(my_transform); // You can also just have glTranslate, glRotate or anything else.
// Draw my mesh
for (child : children) { child.draw(); }
glPopMatrix();
This gives you a hierarchical setup so that objects move with their parent. Alternatively you can have a stack in the main memory and do the multiplications yourself (use a library). I think the openGL stack may have a limit (implementation dependent), but if you handle it yourself the only limit is the amount of ram you can use. Once all the matrices are multiplied rendering is done in the same amount of time, that is it doesn't matter for performance how deep a mesh is in the hierarchy.
For actual animations you need to compute the intermediate transformations. For example for a crouch animation you probably want to have a few frames in between so that the camera doesn't just jump to the low position. You can do this with a time based linear interpolation between the start and end positions, but this only covers simple animations and you still have to implement it yourself.
Anything more complicated (i.e. modify the mesh based on the bone links) you would need to implement yourself.

Get Pure View Matrix in Vuforia

I am using Vuforia SDK to build my AR app.
By using trackableResult->getPose(), I can get a model view matrix of the target frame marker.
But I also need pure view matrix to do some calculations. Is there any way to get it?
To follow on from peedee's comment above, here is a picture I found extremely helpful, linked to from this web page.
Sorry I couldn't leave this as a comment, but they don't allow pictures in comments.
Note that different texts may use slightly different names for each space, but the overall idea should be the same!
My first learning OpenGL was actually with Vuforia two months ago, and I have now concluded for myself that Vuforia's naming of variables in their examples is confusing.
I changed the name of what they call "modelViewMatrix_Vuforia" to "viewMatrix", because that's really what it is. It transforms points from world space to view space / eye space.
In my understanding the "model matrix" is something that will be unique to each model and will place my models with respect to each other. You don't necessarily need one; if your models are already defined in world space then you can just forget about it.
But if your models are imported (for example mine are made with Blender) then they are probably defined in model space, with the origin at their own center of mass. In that case you need a model matrix to place them away from the origin, otherwise all your models will be stacked onto each other.
Maybe what I wrote is not clear enough... I only just figured this out this week myself. I wanted to do environment reflection with a cubemap, and there I needed an inverted view matrix, which brought me exactly to the question you asked.

Differentiate objects?

i want to identify a ball in the picture. I am thiking of using sobel edge detection algorithm,with this i can detect the round objects in the image.
But how do i differentiate between different objects. For example, a foot ball is there in one picture and in another picture i have a picture of moon.. how to differentiate what object has been detected.
When i use my algorithm i get ball in both the cases. Any ideas?
Well if all the objects you would like to differentiate are round, you could even use a hough transformation for round objects. This is a very good way of distinguishing round objects.
But your basic problem seems to be classification - sorting the objects on your image into different classes.
For this you don't really need a Neural Network, you could simply try with a Nearest Neighbor match. It's functionalities are a bit like neural networks since you can give it several reference pictures where you tell the system what can be seen there and it will optimize itself to the best average values for each attribute you detected. By this you get a dictionary of clusters for the different types of objects.
But for this you'll of course first need something that distinguishes a ball from a moon.
Since they are all real round objects (which appear as circles) it will be useless to compare for circularity, circumference, diameter or area (only if your camera is steady and if you know a moon will always have the same size on your images, other than a ball).
So basically you need to look inside the objects itself and you can try to compare their mean color value or grayscale value or the contrast inside the object (the moon will mostly have mid-gray values whereas a soccer ball consists of black and white parts)
You could also run edge filters on the segmented objects just to determine which is more "edgy" in its texture. But for this there are better methods I guess...
So basically what you need to do first:
Find several attributes that help you distinguish the different round objects (assuming they are already separated)
Implement something to get these values out of a picture of a round object (which is already segmented of course, so it has a background of 0)
Build a system that you feed several images and their class to have a supervised learning system and feed it several images of each type (there are many implementations of that online)
Now you have your system running and can give other objects to it to classify.
For this you need to segment the objects in the image, by i.e Edge filters or a Hough Transformation
For each of the segmented objects in an image, let it run through your classification system and it should tell you which class (type of object) it belongs to...
Hope that helps... if not, please keep asking...
When you apply an edge detection algorithm you lose information.
Thus the moon and the ball are the same.
The moon has a diiferent color, a different texture, ... you can use these informations to differnentiate what object has been detected.
That's a question in AI.
If you think about it, the reason you know it's a ball and not a moon, is because you've seen a lot of balls and moons in your life.
So, you need to teach the program what a ball is, and what a moon is. Give it some kind of dictionary or something.
The problem with a dictionary of course would be that to match the object with all the objects in the dictionary would take time.
So the best solution would probably using Neural networks. I don't know what programming language you're using, but there are Neural network implementations to most languages i've encountered.
You'll have to read a bit about it, decide what kind of neural network, and its architecture.
After you have it implemented it gets easy. You just give it a lot of pictures to learn (neural networks get a vector as input, so you can give it the whole picture).
For each picture you give it, you tell it what it is. So you give it like 20 different moon pictures, 20 different ball pictures. After that you tell it to learn (built in function usually).
The neural network will go over the data you gave it, and learn how to differentiate the 2 objects.
Later you can use that network you taught, give it a picture, and it a mark of what it thinks it is, like 30% ball, 85% moon.
This has been discussed before. Have a look at this question. More info here and here.

image feature identification

I am looking for a solution to do the following:
( the focus of my question is step 2. )
a picture of a house including the front yard
extract information from the picture like the dimensions and location of the house, trees, sidewalk, and car. Also, the textures and colors of the house, cars, trees, and sidewalk.
use extracted information to generate a model
How can I extract that information?
You could also consult Tatiana Jaworska research on this. As I understood, this details at least 1 new algorithm to feature extraction (targeted at roofs, doors, ...) by colour (RGB). More intriguing, the last publication also uses parameterized objects to be identified in the house images... that must might be a really good starting point for what you're trying to do.
link to her publications:
http://www.springerlink.com/content/w518j70542780r34/
http://portal.acm.org/citation.cfm?id=1578785
http://www.ibspan.waw.pl/~jaworska/TJ_BOS2010.pdf
Yes. You can extract these information from a picture.
1. You just identify these objects in a picture using some detection algorithms.
2. Measure these objects dimensions and generate a model using extracted information.
well actually your desired goal is not so easy to achieve. First of all you'll need a good way to figure what what is what and what is where on your image. And there simply is no easy "algorithm" for detecting houses/cars/whatsoever on an image. There are ways to segment different objects (like cars) from an image, but those don't work generally. Especially on houses this would be hard since each house looks different and it's hard to find one solid measurement for "this is house and this is not"...
Am I assuming it right that you are trying to simply photograph a house (with front yard) and build a texturized 3D-model out of it? This is not going to work since you need several photos of the house to get positions of walls/corners and everything in 3D space (There are approaches that try a mesh reconstruction with one image only but they lack of depth information and results are fairly poor). So if you would like to create 3D-mdoels you will need several photos of different angles of the house.
There are several different approaches that use this kind of technique to reconstruct real world objects to triangle-meshes.
Basically they work after the principle:
Try to find points in images of different viewpoint which are the same on an object. Considering you are photographing a house this could be salient structures likes corners of windows/doors or corners or edges on the walls/roof/...
Knowing where one and the same point of your house is in several different photos and knowing the position of the camera of both photos you can reconstruct this point in 3D-space.
Doing this for a lot of equal points will "empower" you to reconstruct the shape of your house as a 3D-model by triangulating the points.
Taking parts of the image as textures and mapping them on the generated model would work as well since you know where what is.
You should have a look at these papers:
http://www.graphicon.ru/1999/3D%20Reconstruction/Valiev.pdf
http://people.csail.mit.edu/wojciech/pubs/LabeledRec.pdf
http://people.csail.mit.edu/sparis/publi/2006/oceans/Paris_06_3D_Reconstruction.ppt
The second paper even has an example of doing exactly what you try to achieve, namely reconstruct a textured 3D-model of a house photographed from different angles.
The third link is a powerpoint presentation that shows how the reconstruction works and shows the drawbacks there are.
So you should get familiar with these papers to see what problems you are up to... If you then want to try this on your own have a look at OpenCV. This library provides some methods for feature extraction in images. You then can try to find salient points in each image and try to match them.
Good luck on your project... If you have problems, please keep asking!
I suggest to look at this blog
https://jwork.org/main/node/35
that shows how to identify certain features on images using a convolutional neural network. This particular blog discusses how to identify human faces on images from a large set of random images. You can adjust this example to train neural network using some other images. Note that even in the case of human faces, the identification rate is about 85%, therefore, more complex objects can be even harder to identify

Dilemma about image cropping algorithm - is it possible?

I am building a web application using .NET 3.5 (ASP.NET, SQL Server, C#, WCF, WF, etc) and I have run into a major design dilemma. This is a uni project btw, but it is 100% up to me what I develop.
I need to design a system whereby I can take an image and automatically crop a certain object within it, without user input. So for example, cut out the car in a picture of a road. I've given this a lot of thought, and I can't see any feasible method. I guess this thread is to discuss the issues and feasibility of achieving this goal. Eventually, I would get the dimensions of a car (or whatever it may be), and then pass this into a 3d modelling app (custom) as parameters, to render a 3d model. This last step is a lot more feasible. It's the cropping issue which is an issue. I have thought of all sorts of ideas, like getting the colour of the car and then the outline around that colour. So if the car (example) is yellow, when there is a yellow pixel in the image, trace around it. But this would fail if there are two yellow cars in a photo.
Ideally, I would like the system to be completely automated. But I guess I can't have everything my way. Also, my skills are in what I mentioned above (.NET 3.5, SQL Server, AJAX, web design) as opposed to C++ but I would be open to any solution just to see the feasibility.
I also found this patent: US Patent 7034848 - System and method for automatically cropping graphical images
Thanks
This is one of the problems that needed to be solved to finish the DARPA Grand Challenge. Google video has a great presentation by the project lead from the winning team, where he talks about how they went about their solution, and how some of the other teams approached it. The relevant portion starts around 19:30 of the video, but it's a great talk, and the whole thing is worth a watch. Hopefully it gives you a good starting point for solving your problem.
What you are talking about is an open research problem, or even several research problems. One way to tackle this, is by image segmentation. If you can safely assume that there is one object of interest in the image, you can try a figure-ground segmentation algorithm. There are many such algorithms, and none of them are perfect. They usually output a segmentation mask: a binary image where the figure is white and the background is black. You would then find the bounding box of the figure, and use it to crop. The thing to remember is that none of the existing segmentation algorithm will give you what you want 100% of the time.
Alternatively, if you know ahead of time what specific type of object you need to crop (car, person, motorcycle), then you can try an object detection algorithm. Once again, there are many, and none of them are perfect either. On the other hand, some of them may work better than segmentation if your object of interest is on very cluttered background.
To summarize, if you wish to pursue this, you would have to read a fair number of computer vision papers, and try a fair number of different algorithms. You will also increase your chances of success if you constrain your problem domain as much as possible: for example restrict yourself to a small number of object categories, assume there is only one object of interest in an image, or restrict yourself to a certain type of scenes (nature, sea, etc.). Also keep in mind, that even the accuracy of state-of-the-art approaches to solving this type of problems has a lot of room for improvement.
And by the way, the choice of language or platform for this project is by far the least difficult part.
A method often used for face detection in images is through the use of a Haar classifier cascade. A classifier cascade can be trained to detect any objects, not just faces, but the ability of the classifier is highly dependent on the quality of the training data.
This paper by Viola and Jones explains how it works and how it can be optimised.
Although it is C++ you might want to take a look at the image processing libraries provided by the OpenCV project which include code to both train and use Haar cascades. You will need a set of car and non-car images to train a system!
Some of the best attempts I've see of this is using a large database of images to help understand the image you have. These days you have flickr, which is not only a giant corpus of images, but it's also tagged with meta-information about what the image is.
Some projects that do this are documented here:
http://blogs.zdnet.com/emergingtech/?p=629
Start with analyzing the images yourself. That way you can formulate the criteria on which to match the car. And you get to define what you cannot match.
If all cars have the same background, for example, it need not be that complex. But your example states a car on a street. There may be parked cars. Should they be recognized?
If you have access to MatLab, you could test your pattern recognition filters with specialized software like PRTools.
Wwhen I was studying (a long time ago:) I used Khoros Cantata and found that an edge filter can simplify the image greatly.
But again, first define the conditions on the input. If you don't do that you will not succeed because pattern recognition is really hard (think about how long it took to crack captcha's)
I did say photo, so this could be a black car with a black background. I did think of specifying the colour of the object, and then when that colour is found, trace around it (high level explanation). But, with a black object in a black background (no constrast in other words), it would be a very difficult task.
Better still, I've come across several sites with 3d models of cars. I could always use this, stick it into a 3d model, and render it.
A 3D model would be easier to work with, a real world photo much harder. It does suck :(
If I'm reading this right... This is where AI shines.
I think the "simplest" solution would be to use a neural-network based image recognition algorithm. Unless you know that the car will look the exact same in each picture, then that's pretty much the only way.
If it IS the exact same, then you can just search for the pixel pattern, and get the bounding rectangle, and just set the image border to the inner boundary of the rectangle.
I think that you will never get good results without a real user telling the program what to do. Think of it this way: how should your program decide when there is more than 1 interesting object present (for example: 2 cars)? what if the object you want is actually the mountain in the background? what if nothing of interest is inside the picture, thus nothing to select as the object to crop out? etc, etc...
With that said, if you can make assumptions like: only 1 object will be present, then you can have a go with using image recognition algorithms.
Now that I think of it. I recently got a lecture about artificial intelligence in robots and in robotic research techniques. Their research went on about language interaction, evolution, and language recognition. But in order to do that they also needed some simple image recognition algorithms to process the perceived environment. One of the tricks they used was to make a 3D plot of the image where x and y where the normal x and y axis and the z axis was the brightness of that particular point, then they used the same technique for red-green values, and blue-yellow. And lo and behold they had something (relatively) easy they could use to pick out the objects from the perceived environment.
(I'm terribly sorry, but I can't find a link to the nice charts they had that showed how it all worked).
Anyway, the point is that they were not interested (that much) in image recognition so they created something that worked good enough and used something less advanced and thus less time consuming, so it is possible to create something simple for this complex task.
Also any good image editing program has some kind of magic wand that will select, with the right amount of tweaking, the object of interest you point it on, maybe it's worth your time to look into that as well.
So, it basically will mean that you:
have to make some assumptions, otherwise it will fail terribly
will probably best be served with techniques from AI, and more specifically image recognition
can take a look at paint.NET and their algorithm for their magic wand
try to use the fact that a good photo will have the object of interest somewhere in the middle of the image
.. but i'm not saying that this is the solution for your problem, maybe something simpler can be used.
Oh, and I will continue to look for those links, they hold some really valuable information about this topic, but I can't promise anything.

Resources