I'd like to detect regions in an image which contain a comparatively large amount of small details, but equally I need to ignore strong edges. For example I would like to (approximately) identify regions of small text on a poster which is located on a building, but I also want to ignore the strong edges of the building itself.
I guess I'm probably looking for specific frequency bands, so approaches that spring to mind include: hand-tuning one or more convolution kernels until I hit what I need, using specific DCT coefficients, or applying a histogram to directional filter responses. But perhaps I'm missing something more obvious?
To answer a question in the comments below: I'm developing in MATLAB.
I'm open to any suggestions for how to achieve this - thanks!
Here is something unscientific, but maybe not bad to get folks talking. I start with this image.
and use the excellent, free ImageMagick to divide it up into 400x400-pixel tiles, like this:
convert -crop 400x400 cinema.jpg tile%d.jpg
Now I measure the entropy of each tile, and sort by increasing entropy:
for f in tile*.jpg; do
convert $f -print '%[entropy] %f\n' null:
done | sort -n
and I get this output:
0.142574 tile0.jpg
0.316096 tile15.jpg
0.412495 tile9.jpg
0.482801 tile5.jpg
0.515268 tile4.jpg
0.534078 tile18.jpg
0.613911 tile12.jpg
0.629857 tile14.jpg
0.636475 tile11.jpg
0.689776 tile17.jpg
0.709307 tile10.jpg
0.710495 tile16.jpg
0.824499 tile6.jpg
0.826688 tile3.jpg
0.849991 tile8.jpg
0.851871 tile1.jpg
0.863232 tile13.jpg
0.917552 tile7.jpg
0.971176 tile2.jpg
So, if I look at the last 3 tiles (i.e. those with the most entropy), I get the most detailed regions of the image.
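Since the question mentions MATLAB, here is a rough, hedged sketch of the same tile-entropy idea done directly in MATLAB. It assumes the Image Processing Toolbox, and the file name and tile size are placeholders.

% Rough MATLAB equivalent of the tiling + entropy measurement above.
% Assumes the Image Processing Toolbox; file name and tile size are placeholders.
I = rgb2gray(imread('cinema.jpg'));
tileSize = 400;
entropyPerTile = @(block) entropy(block.data);          % one scalar entropy value per tile
E = blockproc(I, [tileSize tileSize], entropyPerTile);  % small matrix: one entry per tile
imagesc(E); colorbar;                                    % bright tiles = lots of fine detail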
The question itself is too broad for me to answer without writing a paper. That being said, I can offer some advice on narrowing the question down.
First off, go to Google Scholar and search for the keywords your work revolves around. In your case, one of them would probably be edge detection.
Look through the most recent papers (no more than 5 years old) for work that satisfies your needs. If you don't find anything, expand the search criteria or try different terms.
If you have something more specific, please edit your question and let me know.
Always remember to split the big question into smaller chunks and then split them into even smaller chunks, until you have a plate of delicious, manageable bites.
EDIT: From what I've gathered, you're interested in an edge detection and feature selection algorithm? Here are a couple of links which might prove useful:
- MATLAB feature detection
- MATLAB edge detection
Also, this MATLAB edge detection write-up, which is part of their extensive documentation, will hopefully prove useful enough to get you digging through the MATLAB Image Processing Toolbox documentation for specific answers to your question.
You'll find Maximally Stable Extremal Regions (MSER) useful for this. You should be able to impose an area constraint to filter out large MSERs, and then compute an MSER density, for example by dividing the image into tiles as Mark did in his answer.
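A hedged sketch of that idea in MATLAB, assuming the Computer Vision Toolbox; the file name, area range and tile size are placeholders to tune.

% Hedged sketch: small-MSER density per tile (Computer Vision Toolbox assumed).
I = rgb2gray(imread('poster.jpg'));
regions = detectMSERFeatures(I, 'RegionAreaRange', [30 2000]);   % keep only small MSERs
pts = regions.Location;                                          % [x y] centroids of surviving MSERs

tileSize = 100;
density = zeros(ceil(size(I,1)/tileSize), ceil(size(I,2)/tileSize));
for k = 1:size(pts,1)
    r = min(floor(pts(k,2)/tileSize) + 1, size(density,1));
    c = min(floor(pts(k,1)/tileSize) + 1, size(density,2));
    density(r,c) = density(r,c) + 1;                             % count MSERs per tile
end
imagesc(density); colorbar;    % dense tiles are candidates for small-text regions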
Given an image of the region containing the lips and other "noise" (teeth, skin), how can we isolate and recolor only the lips (simulating a "lipstick" effect)?
Attached is a photo describing the lips/mouth states.
What we have tried so far is a three-part process:
Color matching the lips using a stable point on the lips (provided by internal API).
Use this color as the base color for the lips isolation.
Recolor the lips (lipstick behavior)
We tried a few approaches, such as hue difference, HSV difference, and ΔE after converting to a CIE color space. Unfortunately, nothing has panned out, or the results have had artifacts, due to the skin's similarity in color to the lips and the discoloration from shadows cast by the nose and mouth.
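For reference, here is a minimal sketch (in MATLAB, purely for illustration) of the ΔE comparison mentioned above. lipRGB is assumed to be the colour sampled at the stable lip point, and the threshold is a guess.

% Minimal sketch of the Delta-E idea; lipRGB is a 1x3 RGB triplet in [0,1]
% sampled at the stable lip point (assumed given). Threshold needs tuning.
rgb = im2double(imread('mouth.jpg'));            % placeholder file name
lab = rgb2lab(rgb);
lipLab = rgb2lab(reshape(lipRGB, 1, 1, 3));      % seed colour in Lab
dE = sqrt(sum((lab - lipLab).^2, 3));            % per-pixel CIE76 Delta-E (implicit expansion, R2016b+)
lipMask = dE < 12;                               % threshold is a guess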
What are we missing? Is there a better way to approach it?
We are looking for a solution/direction based on classic Computer Vision color algorithms, not on the Machine Learning/Deep Learning domain. Thanks!
You probably won't like this answer, but your question is ill-posed: there is no measurable way to say one solution is better than another; there are only people's opinions.
In cases like this, the best answer you can hope for is usually:
Ask an expert for a large set of examples that would be acceptable in practice.
Your problem can easily be solved by an appropriate artist (whom you trust to produce usable results) with access to the right tools (for example, Photoshop), but a single artist (or even a group of them) can't possibly scale to millions (or whatever large number you care about) of examples.
To address the shortcoming of the artist-based solution, you can use the following strategy:
Collect a sufficiently large set of before and after images created by artists whom you deem trustworthy.
Apply your favorite machine learning algorithm to learn a mapping from the before images to the after images. There are many possible choices, and it almost doesn't matter which one you choose as long as you know how to use it well.
Note that the above two steps are usually not one-and-done. Once the product is in use, you will come across pathological or badly behaved examples for the ML solution above. The key is to collect these examples, pass them through the artist, and retrain or update your ML model. Repeat this enough times and you will have a state-of-the-art solution to your problem.
Whether you have the funding, time, motivation and resources to accomplish this is another matter.
You could try semantic segmentation techniques; they generalize well and should give you very good results.
In image processing, each of the following methods can be used to get the orientation of a blob region:
Using second order central moments
Using PCA to find the axis
Using distance transform to get the skeleton and axis
Other techniques, such as fitting an ellipse to the contour of the region.
When should I consider using a specific method? How do they compare, in terms of accuracy and performance?
I'll give you a vague general answer, and I'm sure others will give you more details. This issue comes up all the time in image processing: there are N ways to solve my problem, which one should I use? The answer is: start with the simplest one that you understand best. For most people, that's probably 1 or 2 in your example. In most cases, they will be nearly identical and sufficient. If for some reason the techniques don't work on your data, you have now learned for yourself a case where the techniques fail. Now you need to start exploring other techniques. This is where the hard work of being an image processing practitioner comes in. There are no silver bullets; there's a grab bag of techniques that work in specific contexts, which you have to learn and figure out. When you learn this for yourself, you will become god-like among your peers.
For this specific example, if your data is roughly ellipsoidal, all these techniques will give similar results. As your data moves away from ellipsoidal (say, spider-like), the PCA / second-order moments / contour methods will start to give poor results. The skeleton approaches become more robust, but mapping a complex skeleton to a single axis/orientation can become a very difficult problem, and may require more a priori knowledge about the blob.
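To make the comparison concrete, here is a hedged MATLAB sketch contrasting option 1 (second-order central moments via regionprops) with option 2 (PCA on the pixel coordinates) for a single binary blob BW; for a roughly ellipsoidal blob the two angles should nearly agree.

% Hedged sketch: compare moment-based and PCA-based orientation on one blob BW.
stats = regionprops(BW, 'Orientation');
angleMoments = stats(1).Orientation;              % degrees, from second-order central moments

[y, x] = find(BW);
coords = [x - mean(x), y - mean(y)];              % centred pixel coordinates
[~, ~, V] = svd(coords, 'econ');                  % principal axes of the point cloud
anglePCA = atan2d(-V(2,1), V(1,1));               % negate y to match image-axis convention
anglePCA = mod(anglePCA + 90, 180) - 90;          % fold into [-90, 90) like regionprops

fprintf('moments: %.1f deg, PCA: %.1f deg\n', angleMoments, anglePCA);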
I understand how to do classification problems and am starting to understand convolutional networks, which I think are the answer to some extent. I'm a bit confused about how to set up a network to give me the output position.
Let's say you have the position of the end point of the nose for a data set of faces. To find the end point, do you just pose a 'classification'-type problem where your output layer has something like 64x64 = 4096 units, and if the nose is at row 43 and column 20 of your grid, you set the output to all zeros except at element 43*64 + 20 = 2772, which you set to 1? Then just map it back to your image dimensions.
I can't find much info on how this part of identification works, and this is my best guess. I'm currently working towards a project using this methodology, but it is going to be a lot of work and I want to know if I'm at least on the right track. This seems to be a solved problem, but I just can't seem to find how people do it.
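For concreteness, a minimal sketch of that encoding (in MATLAB; the grid size and coordinates are just the example numbers, and note that MATLAB's 1-based indexing shifts the flat index relative to the 0-based 43*64 + 20 above):

% Minimal sketch: encode (row, col) on a 64x64 grid as a one-hot target vector.
gridSize = 64;
row = 43; col = 20;                   % example nose-tip cell
target = zeros(gridSize*gridSize, 1);
target((row-1)*gridSize + col) = 1;   % one-hot label (1-based flat index)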
Although what you describe could feasibly work, generally neural networks (convolutional and otherwise) are not used to determine the position of a feature in an image. In particular, Convolutional Neural Networks (CNNs) are specifically designed to be translation invariant so that they will detect features regardless of their position in the input image - this is sort of the inverse of what you're looking for.
One common and effective solution for the kind of problem you're describing is a cascade classifier. Cascade classifiers have some limitations, but for the kind of application you're describing, they would probably work quite well. In particular, they are designed to provide good performance thanks to their staged approach, in which most sections of the input image are very quickly dismissed by the first couple of stages.
Don't get me wrong, it may be interesting to experiment with using the approach you described; just be aware that it may prove difficult to get it to scale well.
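As a hedged illustration of the cascade route: MATLAB's Computer Vision Toolbox ships pretrained cascade detectors, including a nose model (the image file name below is a placeholder).

% Hedged sketch, assuming the Computer Vision Toolbox is installed.
detector = vision.CascadeObjectDetector('Nose');   % pretrained nose cascade model
I = imread('face.jpg');                            % placeholder face image
bboxes = step(detector, I);                        % [x y width height] per detection
imshow(insertShape(I, 'Rectangle', bboxes));       % draw detections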
I have a big problem detecting objects within an image. I know this topic has already been discussed in many forums, but I spent the last 4 days searching for an answer without success.
In fact: I have a picture of a branch (http://cl.ly/image/343Y193b2m1c). My goal is to count every single needle in this picture. So I have to face several problems:
Separate the branch with its needles from the background (which in this case is no problem).
Find the borders of the needles. This is a huge problem; I tried different approaches, including all of the edge() methods, but the result is always the same: the borders around the needles are not closed, which leads to the last problem:
Needles are overlapping! This results in "squares between the needles" which, if I use imfill() or similar, get filled in instead of the needles. And the places where the needles are concentrated (many needles in one place) are nearly impossible to distinguish.
I tried watershed, I tried to enhance the contrast, k-means clustering, and I tried imerode, imdilate and related functions with subsequent edge detection. I also tried to filter and smooth the picture a bit in order to "unsharpen" the needles, so that not every small change in color is recognized as a border (which is another problem).
I am relatively new to MATLAB, so I don't know what to look for. I tried to follow the MATLAB tutorial on nuclei detection, but with that I can only get all of the green objects (all needles at once).
I hope this question did not come up before - if it did, I apologize deeply for the duplicate post. If anybody has an idea of what to do or which methods to use, it would be awesome and would save this really bad start to the week.
Thank you very much in advance,
Phillip
Distinguishing overlapping objects is very, very hard, particularly if you do not know how many objects you have to distinguish. Your brain is much better at distinguishing overlapping objects than any segmentation algorithm I'm aware of, since it is able to integrate a lot of information that is difficult to encode. Therefore: If you're not able to distinguish some of the features yourself, forget about doing it via code.
Having said that, there may be a way for you to get an approximate count of the needles: if you can segment the image pixels into two classes, "needle" versus "not needle", and you know how much area in your picture is covered by a single needle (it may help to include a ruler when you take the picture), you can divide the number of "needle" pixels by the number of pixels covered by a single needle to estimate the total number of needles in the image. This will somewhat underestimate the needle count due to overlaps, and it will underestimate more the denser the needles are (due to more overlaps), but it should allow you to compare automatically between branches with lots of needles and branches with few needles, as well as to identify changes over time, should that be one of your goals.
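A hedged MATLAB sketch of that estimate, assuming the needles are the greenish foreground; the colour thresholds and per-needle area are placeholders that would need calibrating (for example against a ruler in the shot).

% Hedged sketch of the area-based needle-count estimate.
rgb = imread('branch.jpg');                                            % placeholder file name
hsv = rgb2hsv(rgb);
needleMask = hsv(:,:,1) > 0.2 & hsv(:,:,1) < 0.45 & hsv(:,:,2) > 0.3;  % rough green threshold
needleMask = bwareaopen(needleMask, 50);                               % remove tiny specks
needleAreaPx = 350;                         % pixels covered by one isolated needle (calibrate!)
estimatedCount = nnz(needleMask) / needleAreaPx;
fprintf('Estimated needle count: %.0f\n', estimatedCount);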
I agree with @Jonas - you've got yourself one HUGE problem.
Let me make a few suggestions.
First, along @Jonas' direction: instead of trying to get an accurate count, another way of getting a rough estimate is to count the tips of the needles. Obviously, not all the tips are clearly visible. But if you can get a clean mask of the branch, it might be relatively easy to identify the tips of the needles using some of the morphological operations you mentioned yourself.
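A rough sketch of that tip-counting idea in MATLAB, assuming you already have a clean binary mask branchMask of the branch plus needles (getting that mask is not solved here).

% Rough sketch: count needle tips as skeleton end points, given a binary mask.
skel = bwmorph(branchMask, 'skel', Inf);     % reduce branch + needles to a skeleton
tips = bwmorph(skel, 'endpoints');           % skeleton end points ~ needle tips
fprintf('Approximate needle-tip count: %d\n', nnz(tips));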
Second, is there any way you can get more information? For example, if you could have depth information it might help a little in distinguishing the needles from one another (it will not completely solve the task but it may help). You may get depth information from stereo - that is, taking two pictures of the branch while moving the camera a bit. If you have a Kinect device at your disposal (or some other range-camera) you can get a depth map directly...
I have a bit of a difficult algorithm question. I can't find a suitable algorithm despite a lot of searching, so I am hoping that someone here on Stack Overflow might know the answer.
I have a set of x,y coordinates for a vehicle as it moves through a 2D space, the coordinates are recorded at "decision points" in the time period (i.e. they have stopped and made a determination of where to move next).
What I want to do is find a mechanism for comparing these trails efficiently (i.e. not going through each point individually). Compounding this is that I am interested in the "pattern" of their movement, not necessarily the individual points they went to. This means that the "path" is considered the same if you reflect it around an axis, or if you rotate it by 90,180 or 270 degrees.
Basically I am trying to distil some sort of "behaviour" to the way they move through the space, then examine the different "behaviours" for classification purposes.
Cheers,
Aidan
This may be way more complicated than you need, but it sounds like what the guys at astrometry.net did may be similar to what you're looking for. Essentially, you can upload a picture of some stars, and it will figure out where in the sky it belongs, along with the rotation; you may be able to use similar pattern matching for what you're doing.
They have a great pdf explaining how it works here, and apparently you can email them and they'll send you the source code (details are in the pdf).
Edit: apparently you can download the code directly here.
Hope it helps.
There are several approaches you could take:
Use vector paths and translation matrices together with two algorithms: the A* (A-star) algorithm (which locates best routes using what are called greedy/heuristic functions) and the "nearest neighbour" algorithm. Both are commonly used for comparing the efficiency of routes.
You may not know it, but the issue you have is known as the "travelling salesman" problem, and it has many, many approaches.
So look up:
travelling salesman problem
A*
nearest neighbour
Also look at:
random walk algorithm - for the most basic approach
For a learned-behaviour approach, try neural networks (ANNs) or genetic algorithms.
The mathematics for this type of problem is covered under what is called "graph theory".
It seems that basically what is needed is some metric to compare two (N in general) paths and choose the best one?
If that's the case, then I'd suggest plain statistics. I'd start with a heading (orientation) histogram, a relative heading histogram (relative to the previous heading), and so on. Another thing that comes to mind is the covariance of the distance/orientation between points. Or simply make up some kind of "statistics" (number of turns, etc.) and compare the paths using those.
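A hedged MATLAB sketch of that relative-heading idea, assuming pathA and pathB are Nx2 [x y] matrices of decision points (all names are placeholders).

% Hedged sketch: relative-heading (turning angle) histograms, which are rotation-invariant.
nBins = 16;
edges = linspace(-pi, pi, nBins+1);

dA = diff(pathA);                                   % step vectors between decision points
turnsA = diff(atan2(dA(:,2), dA(:,1)));             % change of heading per step
turnsA = mod(turnsA + pi, 2*pi) - pi;               % wrap to [-pi, pi)
hA = histcounts(turnsA, edges, 'Normalization', 'probability');

dB = diff(pathB);
turnsB = diff(atan2(dB(:,2), dB(:,1)));
turnsB = mod(turnsB + pi, 2*pi) - pi;
hB = histcounts(turnsB, edges, 'Normalization', 'probability');

similarity = 1 - 0.5*sum(abs(hA - hB));             % 1 means identical turn statistics
% use abs(turns) instead of turns if reflection invariance is also required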