I have some images where I wish to distinguish between the hood of the car, and the rest of the objects by any means necessary.
Specifically, is there any way I can accurately segment the 'hood' of the car in all the images below? It contains some light reflections, so using basic filters becomes tricky.
To be clear: I do not have any labelled data, but I think it is possible to achieve this with simple filters only.
A few samples:-
Is there any way I can somehow separate the hood of the car from the rest of the image?
A filter that turns the hood, say, black would be enough; it doesn't matter if parts of the surroundings are oddly identified as black or white. The only requirement is a clear demarcation between the hood and the surrounding road.
Any other way to accurately generalize and filter/segment/extract the hood based on other features is also welcome!
==> The real difficulty here is the reflective surfaces - I am well aware that simple color-based filters might have worked, but the reflections wreak havoc with the simple thresholding-based methods I have tried! :)
There is no way to achieve this because the reflections that you see are legit image content, and the edge of the hood is not visible enough. Even the best semantic segmentation would fail.
I assume that the hood area is fixed with respect to the camera, so the simplest solution is to draw the outline by hand.
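If the hood really is fixed with respect to the camera, the hand-drawn outline can be turned into a reusable mask once and applied to every frame. The following is only a minimal sketch in plain Python: the frame size and the `HOOD_OUTLINE` coordinates are made up for illustration, and the inside test is a standard ray-casting point-in-polygon check.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def hood_mask(width, height, polygon):
    """Binary mask (list of rows): 1 inside the outline, 0 elsewhere.

    Each pixel is tested at its centre, hence the +0.5 offsets.
    """
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical outline: a trapezoid at the bottom of a tiny 8x6 frame.
HOOD_OUTLINE = [(2, 3), (6, 3), (8, 6), (0, 6)]
mask = hood_mask(8, 6, HOOD_OUTLINE)
```

The same mask can then be reused for every image from that camera, so the reflections never have to be analysed at all.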
Related
I am trying to separate the different kinds of grains in an image. Sometimes the image also contains impurities, which need to be treated as an extra type.
Here are some example images:
corn and beans
long rice and wheat
I tried to find a general method for the different pictures, but the result is not good enough.
I used flood fill and some gradient methods to get the regions, and tried to use a clustering method to classify the contents, but feature selection is a hard problem. I tried a Gabor filter, but it does not give me a clear boundary, and neither do classification methods such as k-means.
Any ideas about segmentation, getting the contours or classification will be appreciated. Thanks!
I tried to post some more pictures of my current results, but I am sorry to say that beginners here are restricted to two pictures.
Dealing with image processing problems is almost a craft. I would suggest using a robust library (such as OpenCV, of course) and its cvFindContours function to identify the contours. Also, look into mathematical morphology. Basic operators such as erosion and dilation may help: erosion shrinks areas of foreground pixels and enlarges the holes within them, and dilation does the opposite. Color segmentation is also helpful, but you might run into trouble since grain color is not uniform. Lastly, feature extraction is another way out. The scale-invariant feature transform (SIFT) can be used to identify individual grains in the image, since it is invariant to scale and rotation and fairly robust to illumination changes. Hope it helps.
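To make the morphology suggestion concrete, here is a minimal sketch of binary erosion and dilation with a 3x3 square structuring element, written in plain Python rather than with OpenCV. The tiny `noisy` image is invented for illustration; pixels outside the image are assumed to be background.

```python
def _neighbourhood(img, r, c):
    """Values in the 3x3 window around (r, c); out-of-bounds counts as 0."""
    h, w = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def erode(img):
    """A pixel stays 1 only if its whole 3x3 neighbourhood is 1."""
    return [[min(_neighbourhood(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1."""
    return [[max(_neighbourhood(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(img))


# Toy image: one 3x3 "grain" plus a single noise pixel at row 2, column 5.
noisy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = opening(noisy)
```

Here the opening removes the isolated pixel while restoring the grain to its original size. Touching grains would typically need a larger structuring element or further steps.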
I have a series of mostly identical images taken over a period of time. However, the objects in the images drift over time, and I would like to correct for this. What is a good way to do this?
[EDIT] Okay, I may have to explain why I'm doing this. I've taken some series of X-ray images of objects at different X-ray energies. I now want to compare the object at the various energies, but since it drifts I have to correct for the drift first. The object has no sharp edges or anything else which would be easy to use for alignment. Therefore I'm looking for a more general method.
In its general form this problem is known as image registration, and it is a large topic of research in the image processing community. There are a variety of different methods and algorithms, often specialized for a particular image modality. Depending on your images, this could be easy or it could be difficult. I would recommend using one of the registration methods found in the file-exchange.
Based on your description of your images, it seems a rigid transformation should be enough. In that case, this method should work nicely.
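As an illustration of the simplest rigid case (pure translation), the drift can be estimated by trying every integer shift in a small window and keeping the one with the lowest mean squared intensity difference over the overlapping area. This is only a sketch, not the method linked above: it assumes whole-pixel drift, similar intensities between the two images, and a `max_shift` smaller than the image size. `soft_blob` is a made-up helper that generates an edge-free test object.

```python
import math


def estimate_shift(reference, moving, max_shift):
    """Return the integer (dy, dx) such that moving[r + dy][c + dx]
    best matches reference[r][c] in the mean-squared-error sense."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            # Only compare pixels where both images are defined.
            for r in range(max(0, -dy), min(height, height - dy)):
                for c in range(max(0, -dx), min(width, width - dx)):
                    diff = reference[r][c] - moving[r + dy][c + dx]
                    total += diff * diff
                    count += 1
            score = total / count
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


def soft_blob(height, width, cy, cx, sigma=2.0):
    """Synthetic object without sharp edges: a Gaussian bump."""
    return [[math.exp(-((r - cy) ** 2 + (c - cx) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)]


reference = soft_blob(12, 12, 5, 5)
drifted = soft_blob(12, 12, 7, 4)   # object moved 2 rows down, 1 column left
shift = estimate_shift(reference, drifted, max_shift=3)
```

Because the comparison uses intensities directly, it does not rely on sharp edges. For images taken at different X-ray energies the intensities may differ considerably, in which case a similarity measure such as normalized cross-correlation or mutual information would likely be a better score than the plain squared difference used here.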
I'm trying to find an efficient way of acceptable complexity to:
detect an object in an image so I can isolate it from its surroundings
segment that object to its sub-parts and label them so I can then fetch them at will
It's been 3 weeks since I entered the image processing world and I've read about so many algorithms (SIFT, snakes, more snakes, Fourier-related, etc.) and heuristics that I don't know where to start and which one is "best" for what I'm trying to achieve. Bearing in mind that the image dataset of interest is a pretty large one, I don't even know if I should use some algorithm implemented in OpenCV or if I should implement one on my own.
To summarize:
Which methodology should I focus on? Why?
Should I use OpenCV for that kind of stuff or is there some other 'better' alternative?
Thank you in advance.
EDIT -- More info regarding the datasets
Each dataset consists of 80K images of products sharing the same
concept e.g. t-shirts, watches, shoes
size
orientation (90% of them)
background (95% of them)
All pictures in each dataset look almost identical apart from the product itself, obviously. To make things a little more clear, let's consider only the 'watch dataset':
All the pictures in the set look almost exactly like this:
(again, apart from the watch itself). I want to extract the strap and the dial. The thing is that there are lots of different watch styles and therefore shapes. From what I've read so far, I think I need a template algorithm that allows bending and stretching so as to be able to match straps and dials of different styles.
Instead of creating three distinct templates (upper part of strap, lower part of strap, dial), it would be reasonable to create only one and segment it into 3 parts. That way, I would be confident enough that each part was detected relative to the others as intended, e.g. the dial would not be detected below the lower part of the strap.
From all the algorithms/methodologies I've encountered, active shape/appearance models seem to be the most promising. Unfortunately, I haven't managed to find a decent implementation, and I'm not confident enough that this is the best approach to go ahead and write one myself.
If anyone could point out what I should be really looking for (algorithm/heuristic/library/etc.), I would be more than grateful. If again you think my description was a bit vague, feel free to ask for a more detailed one.
From what you've said, here are a few things that pop up at first glance:
The simplest thing to do is to binarize the image and run connected components using OpenCV or the cvBlob library. For simple images with a non-complex background this usually yields the objects.
However, looking at your sample image, texture-based segmentation techniques may work better: the watch dial, the straps and the background vary widely in texture/roughness, and this could be an ideal way to separate them.
The roughness of a portion can be easily found by the Eigen transform (explained a bit on SO, check the link to the research paper provided there), then the Mean Shift filter can be applied on the output of the Eigen transform. This will give regions clearly separated according to texture. Both the pyramidal Mean Shift and finding eigenvalues by SVD are implemented in OpenCV, so unless you can optimize your own code it's better (and easier) to use built-in functions (if present) as far as speed and efficiency are concerned.
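For the first suggestion above (binarize, then label connected components), the idea can be sketched without any library. This is a plain-Python illustration using 4-connectivity and a fixed, arbitrarily chosen threshold of 128; in practice OpenCV's own routines would be much faster on 80K images.

```python
from collections import deque


def binarize(gray, threshold):
    """1 where the grey value exceeds the threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


def label_components(binary):
    """4-connected component labelling via breadth-first flood fill.

    Returns (labels, count); labels are 1..count, background stays 0.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not labels[r][c]:
                count += 1                      # found a new, unlabelled object
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Toy grey image with two bright objects on a dark background.
gray = [
    [200, 210,  10,  10,  10],
    [205,  10,  10, 180, 190],
    [ 10,  10,  10, 185,  10],
]
labels, count = label_components(binarize(gray, 128))
```

Each label can then be treated as one candidate object, for example by taking its bounding box.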
I think I would turn the problem around. Instead of hunting for the dial, I would use a set of robust features from the watch to 'stitch' the target image onto a template. The first watch has a set of squares in the dial that are white, the second watch has a number of white circles. I would per type of watch:
Segment out the squares or circles in the dial. Segmentation steps can be tricky as they are usually both scale and light dependent
Estimate the centers or corners of the above found feature areas. These are the new feature points.
Use the Hungarian algorithm to match features between the template watch and the target watch. Alternatively, one can take the surroundings of each feature point in the original image and match these using cross correlation
Use matching features between the template and the target to estimate scaling, rotation and translation
Stitch the image
As the image is now in a known form, one can extract the regions simply via preset coordinates.
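The step of estimating scaling, rotation and translation from matched feature points has a closed-form least-squares solution, sketched here with complex numbers (a 2D point (x, y) becomes x + iy, and a similarity transform becomes w = a*z + b). The sketch assumes the matching step has already produced corresponding point pairs in the same order; the function names are illustrative only.

```python
import cmath
import math


def fit_similarity(template_pts, target_pts):
    """Least-squares scale, rotation (radians) and translation that map
    template points onto their matched target points."""
    z = [complex(x, y) for x, y in template_pts]
    w = [complex(x, y) for x, y in target_pts]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    # Optimal complex factor a = scale * exp(i * angle)
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    a = num / den
    b = w_mean - a * z_mean
    return abs(a), cmath.phase(a), (b.real, b.imag)


def to_template_coords(point, scale, angle, shift):
    """Map a target-image point back into the template's coordinate frame."""
    a = cmath.rect(scale, angle)
    z = (complex(*point) - complex(*shift)) / a
    return z.real, z.imag


# Toy correspondences: template scaled by 2, rotated 90 degrees, moved by (3, 4).
template = [(0, 0), (1, 0), (0, 1), (1, 1)]
target = [(3, 4), (3, 6), (1, 4), (1, 6)]
scale, angle, shift = fit_similarity(template, target)
```

With the transform known, preset template coordinates for the dial and strap can be mapped into each target image (or the other way round with `to_template_coords`). Real matches will contain outliers, so a robust wrapper such as RANSAC around this fit is usually advisable.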
I want to identify a ball in a picture. I am thinking of using the Sobel edge detection algorithm; with it I can detect the round objects in the image.
But how do I differentiate between different objects? For example, there is a football in one picture and the moon in another. How do I tell which object has been detected?
When I use my algorithm I get a ball in both cases. Any ideas?
Well, if all the objects you would like to differentiate are round, you could even use a Hough transform for circles. This is a very good way of detecting round objects.
But your basic problem seems to be classification - sorting the objects on your image into different classes.
For this you don't really need a neural network; you could simply try a nearest-neighbor match. Its functionality is a bit like that of a neural network: you give it several reference pictures, tell the system what can be seen in each, and it settles on the best average values for each attribute you measured. This gives you a dictionary of clusters for the different types of objects.
But for this you'll of course first need something that distinguishes a ball from a moon.
Since they are all real round objects (which appear as circles) it will be useless to compare circularity, circumference, diameter or area (unless your camera is steady and you know that the moon will always appear at a fixed size on your images, different from that of a ball).
So basically you need to look inside the objects themselves. You can try to compare their mean color value, their grayscale value, or the contrast inside the object (the moon will mostly have mid-gray values, whereas a soccer ball consists of black and white parts).
You could also run edge filters on the segmented objects just to determine which is more "edgy" in its texture. But for this there are better methods I guess...
So basically what you need to do first:
Find several attributes that help you distinguish the different round objects (assuming they are already separated)
Implement something to get these values out of a picture of a round object (which is already segmented of course, so it has a background of 0)
Build a system that you feed several images and their class to have a supervised learning system and feed it several images of each type (there are many implementations of that online)
Now you have your system running and can give other objects to it to classify.
For this you need to segment the objects in the image, e.g. with edge filters or a Hough transform.
For each of the segmented objects in an image, let it run through your classification system and it should tell you which class (type of object) it belongs to...
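The steps above can be sketched with two of the attributes mentioned earlier, mean grey value and contrast (here taken as the standard deviation), and a nearest-centroid classifier, which is the "best average values per class" variant of nearest-neighbour matching. The pixel lists are invented toy data standing in for the foreground pixels of already segmented objects.

```python
import math
from statistics import mean, pstdev


def features(pixels):
    """Feature vector of one segmented object: (mean grey, contrast)."""
    return (mean(pixels), pstdev(pixels))


def train(samples):
    """samples: list of (pixels, label). Returns {label: average feature vector}."""
    grouped = {}
    for pixels, label in samples:
        grouped.setdefault(label, []).append(features(pixels))
    return {label: tuple(mean(column) for column in zip(*vectors))
            for label, vectors in grouped.items()}


def classify(pixels, centroids):
    """Label of the class whose average feature vector is closest."""
    vector = features(pixels)
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# Toy training data: moons are mid-grey, soccer balls are black and white.
training = [
    ([120, 130, 125, 135, 128], "moon"),
    ([110, 118, 122, 115, 119], "moon"),
    ([0, 255, 10, 250, 5, 245], "ball"),
    ([15, 240, 20, 235, 0, 255], "ball"),
]
centroids = train(training)
prediction = classify([118, 126, 131, 122], centroids)
```

In this toy data the mean grey values of the two classes are similar, so it is the contrast that separates them. With features on very different scales it would be worth normalizing them before measuring distances.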
Hope that helps... if not, please keep asking...
When you apply an edge detection algorithm you lose information.
Thus the moon and the ball are the same.
The moon has a different color, a different texture, ... You can use this information to differentiate which object has been detected.
That's a question in AI.
If you think about it, the reason you know it's a ball and not a moon, is because you've seen a lot of balls and moons in your life.
So, you need to teach the program what a ball is, and what a moon is. Give it some kind of dictionary or something.
The problem with a dictionary of course would be that to match the object with all the objects in the dictionary would take time.
So the best solution would probably be to use neural networks. I don't know what programming language you're using, but there are neural network implementations for most languages I've encountered.
You'll have to read a bit about it, decide what kind of neural network, and its architecture.
After you have it implemented it gets easy. You just give it a lot of pictures to learn (neural networks get a vector as input, so you can give it the whole picture).
For each picture you give it, you tell it what it is. So you give it like 20 different moon pictures, 20 different ball pictures. After that you tell it to learn (built in function usually).
The neural network will go over the data you gave it, and learn how to differentiate the 2 objects.
Later you can take the trained network, give it a picture, and it will return a score for what it thinks it is, like 30% ball, 85% moon.
This has been discussed before. Have a look at this question. More info here and here.
This is a difficult question to search in Google since it has other meaning in finance.
Of course, what I mean here is "Drawing" as in .. computer graphics.. not money..
I am interested in preventing overdrawing for both 3D Drawing and 2D Drawing.
(should I make them into two different questions?)
I realize that this might be a very broad question since I didn't specify which technology to use. If it is too broad, maybe some hints on some resources I can read up will be okay.
EDIT:
What I mean by overdrawing is:
when you draw too many objects, rendering a single frame will be very slow
when you draw more area than what you need, rendering a single frame will be very slow
It's quite a complex topic.
The first thing to consider is frustum culling. It filters out objects that are not in the camera’s field of view, so you can simply skip them at the render stage.
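A common way to implement this test is to wrap each object in a bounding sphere and check it against the six planes of the view volume. The sketch below assumes each plane is given as (nx, ny, nz, d) with a unit normal pointing into the volume, so a point p is inside when n·p + d >= 0. For readability the example volume is an axis-aligned box rather than a true perspective frustum; the test itself is the same.

```python
def sphere_in_frustum(center, radius, planes):
    """False only if the sphere lies completely outside at least one plane.

    planes: iterable of (nx, ny, nz, d) with unit normals pointing inward.
    """
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        signed_distance = nx * cx + ny * cy + nz * cz + d
        if signed_distance < -radius:
            return False          # entirely on the outside of this plane
    return True                   # inside or intersecting the volume


# Simplified view volume: -10 <= x <= 10, -10 <= y <= 10, 1 <= z <= 100.
VIEW_PLANES = [
    (1, 0, 0, 10), (-1, 0, 0, 10),     # left, right
    (0, 1, 0, 10), (0, -1, 0, 10),     # bottom, top
    (0, 0, 1, -1), (0, 0, -1, 100),    # near, far
]

objects = {
    "tree": ((0, 0, 50), 1.0),         # well inside
    "rock": ((20, 0, 50), 1.0),        # far off to the side
    "fence": ((10.5, 0, 50), 1.0),     # straddles the right plane
}
visible = sorted(name for name, (center, radius) in objects.items()
                 if sphere_in_frustum(center, radius, VIEW_PLANES))
```

Objects that only partly overlap the volume are kept, which is the conservative choice: culling must never remove something that is actually visible.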
The second thing is Z-sorting of the objects that are in view. It is better to render opaque objects from front to back, so that near objects write their “near” value to the depth buffer and the pixels of far objects are not drawn, since they fail the depth test. This saves your GPU’s fill rate and pixel-shader work. Note, however, that if you have semitransparent objects in the scene, they should be drawn after the opaque ones, in back-to-front order, so that alpha blending works correctly.
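A minimal sketch of such a draw order, assuming the usual convention that opaque objects go first (nearest first) and blended objects follow (farthest first). The object records and their names are hypothetical, and distance is measured from the camera to each object's position.

```python
def draw_order(objects, camera_pos):
    """Opaque objects front to back, then transparent objects back to front."""
    def distance_squared(obj):
        return sum((p - c) ** 2 for p, c in zip(obj["pos"], camera_pos))

    opaque = sorted((o for o in objects if not o["transparent"]),
                    key=distance_squared)
    transparent = sorted((o for o in objects if o["transparent"]),
                         key=distance_squared, reverse=True)
    return opaque + transparent


scene = [
    {"name": "wall",   "pos": (0, 0, 10), "transparent": False},
    {"name": "window", "pos": (0, 0, 5),  "transparent": True},
    {"name": "crate",  "pos": (0, 0, 3),  "transparent": False},
    {"name": "smoke",  "pos": (0, 0, 8),  "transparent": True},
]
order = [obj["name"] for obj in draw_order(scene, camera_pos=(0, 0, 0))]
```

Sorting per object is only a rough ordering; large or intersecting objects can still overlap incorrectly, which is why the opaque pass relies on the depth buffer for the final decision.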
Both are achievable if you use some kind of space partitioning, such as an octree or a quadtree. Which is better depends on your game: a quadtree suits big open spaces, while an octree suits indoor spaces with many levels.
And don't forget simple back-face culling, which can be enabled with a single line in DirectX and OpenGL, to avoid drawing faces that are turned away from the camera.
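The graphics API performs this test for you once culling is enabled, but the underlying check is simple enough to sketch. The example assumes triangles whose vertices are listed counter-clockwise when seen from the front, which is a common default but still a convention you should verify for your own meshes.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def is_back_facing(v0, v1, v2, camera_pos):
    """True if the triangle (counter-clockwise front face) points away
    from the camera and can therefore be skipped."""
    normal = _cross(_sub(v1, v0), _sub(v2, v0))   # front-face normal
    to_camera = _sub(camera_pos, v0)
    return _dot(normal, to_camera) <= 0


# A triangle in the z = 0 plane whose front side faces +z.
triangle = ((0, 0, 0), (1, 0, 0), (0, 1, 0))
seen_from_front = is_back_facing(*triangle, camera_pos=(0, 0, 5))
seen_from_behind = is_back_facing(*triangle, camera_pos=(0, 0, -5))
```

For closed, opaque meshes this typically removes about half of the triangles before any pixel work is done.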
Question is really too broad :o) Check out these "pointers" and ask more specifically.
Typical overdraw inhibitors are:
Z-buffer
Occlusion based techniques (various buffer techniques, HW occlusions, ...)
Stencil test
on little bit higher logic level:
culling (usually by view frustum)
scene organization techniques (usually trees or tiling)
rough drawing front to back (this is obviously supporting technique :o)
EDIT: added stencil test, has indeed interesting overdraw prevention uses especially in combination of 2d/3d.
Reduce the number of objects you consider for drawing based on distance, and on position (i.e. reject those outside of the viewing frustum).
Also consider using some sort of object-based occlusion system to allow large objects to obscure small ones. However this may not be worth it unless you have a lot of large objects with fairly regular shapes. You can pre-process potentially visible sets for static objects in some cases.
Your API will typically reject polygons that are not facing the viewpoint also, since you typically don't want to draw the rear-face.
When it comes to actual rendering time, it's often helpful to render opaque objects from front-to-back, so that the depth-buffer tests end up rejecting entire polygons. This works for 2D too, if you have depth-buffering turned on.
Remember that this is a performance optimisation problem. Most applications will not have a significant problem with overdraw. Use tools like Pix or NVIDIA PerfHUD to measure your problem before you spend resources on fixing it.