Is there a simple algorithm, usable in real time, that can morph two images without any user input (completely automatic, with no control points to set)?
Basically, I don't want to morph faces or realistic scenes; the images would be completely abstract, a combination of drawn patterns and regular shapes such as lines.
Thanks in advance.
I have written a tool that doesn't require setting manual keypoints and is not restricted to a domain (like faces). However, the images have to be similar (e.g. two faces or two cars from the same perspective). It's still a work in progress but already works great!
https://github.com/kallaballa/Poppy
example usage:
poppy -o video.mkv image1.png image2.png
I have up to 200'000 individual images in a scene (done with sprites, so far). I want to look at these sprites, and when I fly around they should always face the camera (as sprites do).
My question is: How can I achieve the best performance WebGL-wise? Are Sprites with useScreenCoordinates:false rendered as with GL_POINT?
At the moment the fps already drops at very low image counts. I'm using mipmapping and sprites so far. And since they need to turn around to face me, I didn't want to use BufferGeometry.
I'd highly appreciate some ideas and inputs :) Thanks!
PS: Point of it all is that you can "fly" through 200'000 images and stop/select the ones you figure to be interesting
My team needed to accomplish this too, and sadly Doidel's notes trail off before the project is completed. We developed PixPlot, a three.js visualization layer for images.
I put together a blog post outlining the details here: http://douglasduhaime.com/posts/visualizing-tsne-maps-with-three-js.html
In short, if others face this problem, you'll want to create one geometry (ideally) with one large image atlas (a single jpg of size 2048px by 2048px containing lots of smaller images) serving as the texture for the geometry. Add vertices, faces, and vertex UVs for each of the little images to display, and pull each image from the atlas texture.
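To make the atlas idea concrete, here is a minimal sketch of how the UV rectangle for one small image could be computed. The 32px cell size, the row-by-row packing from the top-left, and the function name atlas_uvs are assumptions for illustration only; they are not taken from PixPlot. It is written in Python to show the arithmetic, which carries over directly to three.js.

```python
def atlas_uvs(index, cell_px=32, atlas_px=2048):
    """Return the UV rectangle (u0, v0, u1, v1) of cell `index` in a square atlas.

    Assumptions (hypothetical layout): cells are packed row by row starting at
    the top-left corner, and UV space has its origin at the bottom-left.
    """
    cells_per_row = atlas_px // cell_px
    if not 0 <= index < cells_per_row * cells_per_row:
        raise IndexError("cell index outside the atlas")
    row, col = divmod(index, cells_per_row)
    u0 = col * cell_px / atlas_px
    u1 = (col + 1) * cell_px / atlas_px
    # Flip vertically: row 0 is at the top of the image but at v = 1 in UV space.
    v1 = 1.0 - row * cell_px / atlas_px
    v0 = 1.0 - (row + 1) * cell_px / atlas_px
    return u0, v0, u1, v1
```

Each quad in the single geometry would then get these four values as the UVs of its corners, so every little image samples its own region of the one atlas texture.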
I used a ton of techniques; I'll be writing about them on http://blogs.fhnw.ch/threejs/ once I've got it all working.
I'm thinking of stitching images from 2 or more (currently maybe 3 or 4) cameras in real time using OpenCV 2.3.1 on Visual Studio 2008.
However, I'm curious about how it is done.
Recently I've studied some feature-based image stitching techniques.
Most of them require at least the following steps:
1. Feature detection
2. Feature matching
3. Finding the homography
4. Transforming the target images to the reference image
... etc.
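As a small illustration of steps 3 and 4, here is a sketch of what "transforming" with a homography means once the 3x3 matrix H is known. How H is estimated (steps 1 to 3) is not shown; the function names and the nested-list representation of H are assumptions made for this example.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H (nested lists, row-major)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ZeroDivisionError("point maps to infinity")
    # Divide by the homogeneous coordinate to get back to pixel coordinates.
    return xh / w, yh / w


def warped_bounds(H, width, height):
    """Bounding box (min_x, min_y, max_x, max_y) of a width x height image after warping."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    xs, ys = zip(*(apply_homography(H, x, y) for x, y in corners))
    return min(xs), min(ys), max(xs), max(ys)
```

The bounding box is what you would use to size the canvas of the stitched output before copying the warped pixels into it.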
Now most of the techniques I've read only deal with images "ONCE", while I would like it to deal with a series of images captured from a few cameras and I want it to be "REAL-TIME".
This may still sound confusing, so here is the setup in detail:
Put 3 cameras at different angles and positions, while each of them must have overlapping areas with its adjacent one so as to build a REAL-TIME video stitching.
What I would like to do is similar to the content in the following link, where ASIFT is used.
http://www.youtube.com/watch?v=a5OK6bwke3I
I tried to contact the owner of that video, but I got no reply from him :(
Can I use image-stitching methods to deal with video stitching?
Video itself is composed of a series of images so I wonder if this is possible.
However, detecting feature points seems to be very time-consuming whatever feature detector (SURF, SIFT, ASIFT, etc.) you use. This makes me doubt that real-time video stitching is possible.
I have worked on a real-time video stitching system and it is a difficult problem. I can't disclose the full solution we used due to an NDA, but I implemented something similar to the one described in this paper. The biggest problem is coping with objects at different depths (simple homographies are not sufficient); depth disparities must be determined and the video frames appropriately warped so that common features are aligned. This essentially is a stereo vision problem. The images must first be rectified so that common features appear on the same scan line.
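To illustrate why rectification matters, here is a toy sketch of a disparity search along a single scan line. It is not the NDA'd solution or the method from the paper, just a plain sum-of-absolute-differences (SAD) search. It assumes the two rows come from an already rectified pair and that a feature at column x in the left row appears at column x - d in the right row; the function name and parameters are made up for this example.

```python
def scanline_disparity(left_row, right_row, x, half_window=2, max_disparity=16):
    """Estimate the disparity of column x by SAD matching along one scan line.

    Assumes a rectified pair, so the match for left_row[x] lies on the same
    row of the right image, shifted left by the disparity d.
    """
    lo, hi = x - half_window, x + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window falls outside the row")
    patch = left_row[lo:hi]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # the shifted window would leave the image
        candidate = right_row[lo - d:hi - d]
        cost = sum(abs(a - b) for a, b in zip(patch, candidate))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Near objects produce larger disparities than far ones, which is why a single homography cannot align both at once and the frames have to be warped according to depth.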
You might also be interested in my project from a few years back. It's a program which lets you experiment with different stitching parameters and watch the results in real-time.
Project page - https://github.com/lukeyeager/StitcHD
Demo video - https://youtu.be/mMcrOpVx9aY?t=3m38s
Given a set of 2D images that cover all dimensions of an object (e.g. a car and its roof/sides/front/rear), how could I transform this into a 3D object?
Are there any libraries that could do this?
Thanks
These "2D images" are usually called "textures". You probably want a 3D library that lets you specify a 3D model with bitmap textures. The library will depend on the platform you are using, but start by looking at OpenGL!
OpenGL for PHP
OpenGL for Java
... etc.
I've heard of the program "Poser" doing this using heuristics for human forms, but otherwise I don't believe this is actually theoretically possible. You are asking to construct volumetric data from flat data (inferring the third dimension.)
I think you'd have to make a ton of assumptions about your geometry, and even then, you'd only really have a shell of the object. If you did this well, you'd have a contiguous surface representing the boundary of the object - not a volumetric object itself.
What you can do, like Tomas suggested, is slap these 2d images onto something. However, you still will need to construct a triangle mesh surface, and actually do all the modeling, for this to present a 3D surface.
I hope this helps.
What there is currently that can do anything close to what you are asking for automagically is extremely proprietary. No libraries, but there are some products.
The core issue is matching corresponding points in the images and being able to say: this spot in image A is this spot in image B, and they both match this spot in image C, etc.
There are three ways to go about this: manual matching (you have the photos and have to use your own brain to find the corresponding points), coded targets, and texture matching.
PhotoModeller, www.photomodeller.com, $1,145.00US, supports manual matching and coded targets. You print out a bunch of images, attach them to your object, shoot your photos, and the software finds the targets in each picture and creates a 3D object based on those points.
PhotoModeller Scanner, $2,595.00US, adds texture matching. Tiny bits of the images are compared to see if they represent the same source area.
Both PhotoModeller products depend on shooting the images with a calibrated camera, where you use a consistent focal length for every shot and go through a calibration process to map the lens distortion of the camera.
If you can do manual matching, the Match Photo feature of Google SketchUp may do the job, and SketchUp is free. If you can shoot new photos, you can add your own targets like colored sticker dots to the object to help you generate contours.
If your images are drawings, like profile, plan view, etc. PhotoModeller will not help you, but SketchUp may be just the tool you need. You will have to build up each part manually because you will have to supply the intelligence to recognize which lines and points correspond from drawing to drawing.
I hope this helps.
I'm trying to find an efficient way, of acceptable complexity, to:
detect an object in an image so I can isolate it from its surroundings
segment that object into its sub-parts and label them so I can then fetch them at will
It's been 3 weeks since I entered the image processing world and I've read about so many algorithms (SIFT, snakes, more snakes, Fourier-related, etc.) and heuristics that I don't know where to start or which one is "best" for what I'm trying to achieve. Bearing in mind that the image dataset of interest is a pretty large one, I don't even know if I should use an algorithm implemented in OpenCV or implement one on my own.
To summarize:
Which methodology should I focus on? Why?
Should I use OpenCV for that kind of stuff or is there some other 'better' alternative?
Thank you in advance.
EDIT -- More info regarding the datasets
Each dataset consists of 80K images of products sharing the same
concept e.g. t-shirts, watches, shoes
size
orientation (90% of them)
background (95% of them)
All pictures in each dataset look almost identical apart from the product itself. To make things a little clearer, let's consider only the 'watch dataset':
All the pictures in the set look almost exactly like this:
(again, apart from the watch itself). I want to extract the strap and the dial. The thing is that there are lots of different watch styles and therefore shapes. From what I've read so far, I think I need a template algorithm that allows bending and stretching so as to be able to match straps and dials of different styles.
Instead of creating three distinct templates (upper part of strap, lower part of strap, dial), it would be reasonable to create only one and segment it into 3 parts. That way, I would be confident that the parts were detected in the intended positions relative to each other, e.g. the dial would not be detected below the lower part of the strap.
Of all the algorithms/methodologies I've encountered, active shape/appearance models seem to be the most promising. Unfortunately, I haven't managed to find a decent implementation, and I'm not confident enough that this is the best approach to go ahead and write one myself.
If anyone could point out what I should be really looking for (algorithm/heuristic/library/etc.), I would be more than grateful. If again you think my description was a bit vague, feel free to ask for a more detailed one.
From what you've said, here are a few things that pop up at first glance:
The simplest thing to do is binarize the image and run connected components using OpenCV or the cvBlob library. For simple images with a non-complex background this usually yields the objects.
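For reference, here is a minimal pure-Python sketch of the binarize-then-label idea, so you can see what the library call does under the hood. It assumes a grayscale image given as a list of rows, a foreground that is brighter than a fixed threshold of 128, and 4-connectivity; in practice you would use the OpenCV or cvBlob routines, which are far faster.

```python
from collections import deque


def label_components(gray, threshold=128):
    """Binarize `gray` and label its 4-connected foreground blobs.

    Returns (labels, count): labels holds 0 for background and 1..count for
    blobs. Assumes foreground pixels are the ones >= threshold.
    """
    height, width = len(gray), len(gray[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if gray[y][x] < threshold or labels[y][x]:
                continue  # background, or already part of a labelled blob
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill of this blob
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and gray[ny][nx] >= threshold and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

The bounding box of each label then gives you a candidate object region to crop out.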
However, looking at your sample image, texture-based segmentation techniques may work better: the watch dial, the straps and the background vary widely in texture/roughness, and this could be an ideal way to separate them.
The roughness of a region can be found easily with the Eigen transform (explained a bit on SO, check the link to the research paper provided there); the Mean Shift filter can then be applied to the output of the Eigen transform. This will give regions clearly separated according to texture. Both the pyramidal Mean Shift and finding eigenvalues by SVD are implemented in OpenCV, so unless you can optimize your own code, it's better (and easier) to use the built-in functions (if present) as far as speed and efficiency are concerned.
I think I would turn the problem around. Instead of hunting for the dial, I would use a set of robust features from the watch to 'stitch' the target image onto a template. The first watch has a set of squares in the dial that are white; the second watch has a number of white circles. I would, per type of watch:
Segment out the squares or circles in the dial. Segmentation steps can be tricky as they are usually both scale and light dependent
Estimate the centers or corners of the above found feature areas. These are the new feature points.
Use the Hungarian algorithm to match features between the template watch and the target watch. Alternatively, one can take the surroundings of each feature point in the original image and match these using cross correlation
Use matching features between the template and the target to estimate scaling, rotation and translation
Stitch the image
As the image is now in a known form, one can extract the regions simply via preset coordinates.
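The estimation step in the list above (scaling, rotation and translation from matched features) can be sketched as a least-squares fit. This assumes the features have already been paired up, for instance by the Hungarian algorithm, and it treats 2D points as complex numbers so that the transform is simply w = a*z + b. The function name fit_similarity is made up for this example.

```python
import cmath


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping matched points src -> dst.

    Points are (x, y) tuples, already matched pairwise. Returns
    (scale, angle_in_radians, (tx, ty)) such that dst ~ scale * R(angle) * src + t.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need at least two matched point pairs")
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    # Closed-form least-squares solution of w = a*z + b for complex a and b.
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("source points are all identical")
    a = num / den
    b = w_mean - a * z_mean
    scale, angle = cmath.polar(a)
    return scale, angle, (b.real, b.imag)
```

With the target's feature points as src and the template's as dst, the result tells you how to warp the target onto the template before cutting out the preset regions.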
I am looking for a solution to do the following:
( the focus of my question is step 2. )
a picture of a house including the front yard
extract information from the picture like the dimensions and location of the house, trees, sidewalk, and car. Also, the textures and colors of the house, cars, trees, and sidewalk.
use extracted information to generate a model
How can I extract that information?
You could also consult Tatiana Jaworska's research on this. As I understand it, it details at least one new algorithm for feature extraction (targeted at roofs, doors, ...) by colour (RGB). More intriguingly, the latest publication also uses parameterized objects to be identified in the house images; that might be a really good starting point for what you're trying to do.
Links to her publications:
http://www.springerlink.com/content/w518j70542780r34/
http://portal.acm.org/citation.cfm?id=1578785
http://www.ibspan.waw.pl/~jaworska/TJ_BOS2010.pdf
Yes. You can extract this information from a picture.
1. Identify these objects in the picture using detection algorithms.
2. Measure these objects' dimensions and generate a model using the extracted information.
Well, actually your goal is not so easy to achieve. First of all, you'll need a good way to figure out what is what and where it is in your image. There simply is no easy "algorithm" for detecting houses, cars, or anything else in an image. There are ways to segment different objects (like cars) from an image, but they don't work in general. This would be especially hard for houses, since each house looks different and it's hard to find one solid criterion for "this is a house and this is not".
Am I right in assuming that you are trying to simply photograph a house (with front yard) and build a textured 3D model out of it? That is not going to work from a single photo, since you need several photos of the house to get the positions of walls, corners, and everything else in 3D space. (There are approaches that attempt a mesh reconstruction from one image only, but they lack depth information and the results are fairly poor.) So if you would like to create 3D models, you will need several photos of the house from different angles.
There are several different approaches that use this kind of technique to reconstruct real world objects to triangle-meshes.
Basically, they work on the following principle:
Try to find points in images taken from different viewpoints that correspond to the same spot on the object. When photographing a house, these could be salient structures like corners of windows/doors, or corners and edges of the walls/roof/...
Knowing where one and the same point of your house is in several different photos, and knowing the camera position for each of those photos, you can reconstruct this point in 3D space.
Doing this for a lot of corresponding points lets you reconstruct the shape of your house as a 3D model by triangulating the points.
Taking parts of the image as textures and mapping them on the generated model would work as well since you know where what is.
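The reconstruction of a single point from two views can be sketched with the midpoint method. This is only one simple way to do it and is not taken from the papers below. It assumes you already know each camera's centre and the direction of the viewing ray through the matched pixel (which in practice comes from camera calibration); the function name is made up for this example.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Return the 3D point closest to both rays c1 + s*d1 and c2 + t*d2.

    c1, c2 are camera centres and d1, d2 the viewing-ray directions through
    the matched pixels (assumed known from calibration).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    r = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, r), dot(d2, r)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    # Parameters of the closest points on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    # With noisy matches the rays rarely intersect, so take the midpoint.
    return tuple((u + v) / 2 for u, v in zip(p1, p2))
```

Repeating this for many matched points gives the point cloud that is then meshed and textured.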
You should have a look at these papers:
http://www.graphicon.ru/1999/3D%20Reconstruction/Valiev.pdf
http://people.csail.mit.edu/wojciech/pubs/LabeledRec.pdf
http://people.csail.mit.edu/sparis/publi/2006/oceans/Paris_06_3D_Reconstruction.ppt
The second paper even has an example of doing exactly what you try to achieve, namely reconstruct a textured 3D-model of a house photographed from different angles.
The third link is a powerpoint presentation that shows how the reconstruction works and shows the drawbacks there are.
So you should get familiar with these papers to see what problems you are up against... If you then want to try this on your own, have a look at OpenCV. This library provides some methods for feature extraction in images. You can then try to find salient points in each image and match them.
Good luck on your project... If you have problems, please keep asking!
I suggest looking at this blog post:
https://jwork.org/main/node/35
It shows how to identify certain features in images using a convolutional neural network. This particular post discusses how to identify human faces in a large set of random images. You can adapt this example to train a neural network on other images. Note that even in the case of human faces, the identification rate is about 85%, so more complex objects can be even harder to identify.