Dominant "color" of an image - image

I have the following image:
What I want to do is identify the individual strips based on their dominant color. What is the best approach to do this?
What I've done is take the image's value channel (HSV) and build a distribution of how often each value occurs. The problem is that strip 0 gives values [27=32191, 28=5433, others=8] and strip 1 gives [26=7107, 27=23111, others=22], so I can't get a definitive distinction between them.
The project's main goal is to compare an actual yellow-colored paper to the strips and determine which strip is the most similar.

First, since you know the boundaries of each strip in the reference image, the only possible problem here is that your reference image is noisy. A somewhat overkill way to handle that is to cluster the colors in each strip and take the centroid of the dominant cluster as the representative color of the strip. To get a perceptually meaningful result, do this step in the CIELAB colorspace. Doing this and converting the results back to RGB, for the first strip I get the RGB triplet (0.949375, 0.879872, 0.147898), and for the second strip (0.945324, 0.857322, 0.129756) (each channel in range [0, 1]).
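Here is a minimal sketch of that clustering step using OpenCV's k-means (assuming the strip has already been cropped out of the reference image; k=2 is an assumption to separate the strip color from noise):

import cv2
import numpy as np

def strip_color(strip_bgr):
    # Representative color of a strip: centroid of the largest
    # k-means cluster, computed in CIELAB.
    lab = cv2.cvtColor(strip_bgr, cv2.COLOR_BGR2LAB)
    samples = lab.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(samples, 2, None, criteria, 5,
                                    cv2.KMEANS_PP_CENTERS)
    dominant = centers[np.bincount(labels.ravel()).argmax()]
    # Convert the single Lab centroid back to BGR for display.
    patch = dominant.reshape(1, 1, 3).astype(np.uint8)
    return cv2.cvtColor(patch, cv2.COLOR_LAB2BGR)[0, 0]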
When you get a new image, you perform the same operation. But there are a lot of potential problems here. For instance, how are you handling the white balance in the input image? Supposing you have no such problem, it is then only a matter of finding the nearest reference color to the one you just computed by the same process. To find the nearest color you have to use a colorspace that is meaningful for this too, and CIELAB is recommended again since the well-established Delta-E functions are defined on it. See http://en.wikipedia.org/wiki/Color_difference for some such metrics, the simplest being the Euclidean distance in CIELAB.
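The simplest of those Delta-E variants (CIE76) is just that Euclidean distance; a sketch, where sample_lab and reference_lab_colors are placeholder names for values produced by the clustering step above:

import numpy as np

def delta_e_76(lab1, lab2):
    # CIE76 color difference: Euclidean distance in CIELAB.
    return np.linalg.norm(np.asarray(lab1, float) - np.asarray(lab2, float))

# Nearest reference strip to the measured color:
# best = min(reference_lab_colors, key=lambda ref: delta_e_76(sample_lab, ref))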

Calibrate your equipment. If you do not calibrate your equipment, you will have arbitrary errors between the test sample and the reference. Lighting is part of your equipment.
Use edge detection and your knowledge of the reference strip's geometry (strips are equal width) to determine sampling regions. For each sampling region, extract an internal patch.
For the test strip, compute an image where each pixel is the max difference within a sampling window (e.g. 5x5). This will let you identify a relatively homogeneous region which is dissimilar to the outside region (i.e. the paper). Extract a patch.
Use downsampling to find an integrated color for each patch per svnpenn's advice. You can look at other computation methods later, but this should work quite well.
For weights wh, ws, wv, compute similarity = wh*abs(h0-h1) + ws*abs(s0-s1) + wv*abs(v0-v1) between the test color and each reference color. You can look at other distance measures later, but this should work quite well. Start with equal weights. One perk to this method is that it behaves well regardless of the dimension or combination of dimensions under which the reference strip varies.
Sort the results to find the most similar and second most similar matches. Note that similarity is set up so zero is an exact match, and a big number is a poor match. Use the ratio of these two results to estimate the quality of the most similar match - if the first two matches are very close, it's probably not a great match to either.
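A sketch of that weighted similarity and the ratio check (the weights are assumptions, and note that hue is really cyclic, which this ignores for simplicity):

def hsv_similarity(test, ref, w=(1.0, 1.0, 1.0)):
    # Weighted L1 distance in HSV; 0 means an exact match.
    wh, ws, wv = w
    h0, s0, v0 = test
    h1, s1, v1 = ref
    return wh * abs(h0 - h1) + ws * abs(s0 - s1) + wv * abs(v0 - v1)

def best_match(test, references):
    # Return the closest reference plus a quality ratio; a ratio
    # near 1 means the runner-up was almost as close, i.e. a poor
    # match to either. Assumes at least two reference colors.
    scores = sorted((hsv_similarity(test, r), r) for r in references)
    (d1, match), (d2, _) = scores[0], scores[1]
    return match, (d1 / d2 if d2 else 1.0)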

You can scan through all the colors and use a hashtable to keep track of how many pixels of each color there are.
Take those numbers and, remembering which colors they correspond to, sort them in decreasing order.
Look at the sorted list of numbers and find the difference between each consecutive pair of numbers. Keep track of the indices in the list of the two numbers that produced each difference. Sort this difference list.
Look at the maximum number in the difference list. You now have the biggest drop-off between two sets of pixels. Go find which was the bigger one. Everything with this number of pixels and above is a dominant color. Everything below is a sub-dominant color. Now you know how many dominant colors you have, and what they are.
Should be pretty easy from there to do whatever it is you want to do.
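For instance, a sketch of that drop-off clustering in Python (pixels is assumed to be an iterable of (r, g, b) tuples):

from collections import Counter

def dominant_colors(pixels):
    # Split colors into dominant vs. sub-dominant at the biggest
    # drop-off in the sorted pixel counts.
    ranked = Counter(pixels).most_common()   # counts in decreasing order
    if len(ranked) < 2:
        return [color for color, _ in ranked]
    # The largest gap between consecutive counts marks the cut-off.
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    return [color for color, _ in ranked[:cut + 1]]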
The only time this wouldn't work is if some of the noise was of the same color as a strip, so much so that it corrupted your data.
In this case, you would use a different approach, which you can also use in the first case - looking at runs. Go through the pixels, and each time you find a new color, look at how many of the following pixels are of the same color.
Use the method described earlier to cluster the colors into dominant and non-dominant, for the same result.
In both cases, if you know that the picture is of vertical strips, you could limit the number of horizontal lines of colors you look at to make things go faster.

You could split the image into sections, then resize each section to one pixel. This is an example using the whole image:
$ convert Y82IirS.jpg -resize 1x1 txt:
# ImageMagick pixel enumeration: 1,1,255,srgb
0,0: (220,176, 44) #DCB02C srgb(220,176,44)
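And the per-section version of the same idea, sketched in Python with Pillow rather than ImageMagick (the filename and strip count are placeholders; vertical strips of equal width are assumed):

from PIL import Image

img = Image.open("strips.jpg")   # placeholder filename
n_strips = 4                     # placeholder strip count
w, h = img.size
for i in range(n_strips):
    box = (i * w // n_strips, 0, (i + 1) * w // n_strips, h)
    r, g, b = img.crop(box).resize((1, 1), Image.LANCZOS).getpixel((0, 0))[:3]
    print("strip %d: #%02X%02X%02X" % (i, r, g, b))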

Related

Any way to algorithmically remove discolorations from aerial imagery

I don't know much about image processing so please bear with me if this is not possible to implement.
I have several sets of aerial images of the same area originating from different sources. The pictures have been taken during different seasons, under different lighting conditions, etc. Unfortunately some images look patchy and suffer from discolorations, or are partially obstructed by clouds or pixelated, as in picture1 and picture2 for example.
I would like to take as input several images of the same area and (by some kind of averaging over them) produce one picture of improved quality. I know some C/C++ so I could use an image processing library.
Can anybody propose any image processing algorithm to achieve it or knows any research done in this field?
I would try a "color twist" transform, i.e. a 3x3 matrix applied to the RGB components. To implement it, you need to pick color samples in areas that are split by a border, on both sides. You should find three significantly different reference colors (hence six samples). This will allow you to write the nine linear equations needed to determine the matrix coefficients.
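A sketch of solving for that matrix with numpy (the six sample values below are made-up placeholders; in practice you would read them off your image):

import numpy as np

# Three reference colors sampled on the good side of a border...
good = np.array([[142.0, 165.0, 101.0],
                 [201.0, 198.0, 187.0],
                 [ 90.0, 112.0,  60.0]])
# ...and the same three spots on the discolored side.
bad = np.array([[120.0, 150.0, 130.0],
                [180.0, 190.0, 210.0],
                [ 70.0, 100.0,  85.0]])

# Solve bad @ M = good for the 3x3 color-twist matrix M.
M, *_ = np.linalg.lstsq(bad, good, rcond=None)
corrected = np.clip(bad @ M, 0, 255)   # demo: corrects the sampled colors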
Then you will correct the altered areas by means of this color twist. As the geometry of these areas is intertwined with the field patches, I don't see a better way than contouring the regions by hand.
In the case of the second picture, the limits of the regions are blurred so that you will need to blur the region mask as well and perform blending.
In any case, don't expect a perfect repair of those problems as the transform might be nonlinear, and completely erasing the edges will be difficult. I also think that colors are so washed out at places that restoring them might create ugly artifacts.
For the sake of illustration, here is a quick attempt with Photoshop using manual HLS adjustment (less powerful than a color twist).
The first thing I thought of was a kernel matrix of sorts.
Do a first pass of the photo and use an edge detection algorithm to determine the borders between the photos - this should be fairly trivial, but you will need to eliminate any overlap/fading (it looks like there's a bit in picture 2); you'll see why in a minute.
Do a second pass right along each border you've detected, and assume that the pixel on either side of the border should be the same color. Determine the difference between the red, green and blue values and average them along the entire length of the line, then divide it by two. The image with the lower red, green or blue value gets this new value added. The one with the higher red, green or blue value gets this value subtracted.
On either side of this line, every pixel should now be the exact same. You can remove one of these rows if you'd like, but if the lines don't run the length of the image this could cause size issues, and the line will likely not be very noticeable.
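A rough numpy sketch of that per-seam correction, assuming a vertical border at column x (the function name and the float image format are illustrative):

import numpy as np

def blend_seam(img, x):
    # Equalize colors across a vertical seam at column x.
    # img: float array of shape (H, W, 3).
    left = img[:, x - 1, :]      # pixels just left of the border
    right = img[:, x, :]         # pixels just right of the border
    # Average per-channel difference along the seam, divided by two.
    half_diff = (left - right).mean(axis=0) / 2.0
    img[:, :x, :] -= half_diff   # shift the left side toward the right...
    img[:, x:, :] += half_diff   # ...and the right side toward the left
    return np.clip(img, 0.0, 255.0)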
This could be made far more complicated by generating a filter by passing along this line - I'll leave that to you.
The issue with this could be areas where there was development, fall colors, etc.; these might mess with your algorithm, but there's only one way to find out!

Invoice / OCR: Detect two important points in invoice image

I am currently working on OCR software and my idea is to use templates to try to recognize data inside invoices.
However, scanned invoices can have several 'flaws':
Not all invoices, based on a single template, are correctly aligned under the scanner.
People can write on invoices
etc.
Example of invoice: (you'll have to google one; sadly I cannot add a more concrete version, as client data is obviously confidential)
I find my data in the invoices based on the x-values of the text.
However I need to know the scale of the invoice and the offset from left/right, before I can do any real calculations with all data that I have retrieved.
What have I tried so far?
1) Making the image monochrome and using the left and right bounds of the first appearance of a black pixel. This fails because people can write on invoices.
2) Dividing the invoice up into vertical sections and using the sections that have the highest number of black pixels. This fails because the distribution is not always uniform amongst similar templates.
I could really use your help on (1) how to identify important points in invoices and (2) what I should focus on as the important points.
I hope the question is clear enough as it is quite hard to explain.
Detecting rotation
I would suggest you start by detecting straight lines.
Look (perhaps randomly) for small areas with high contrast, i.e. mostly white but with a fair number of very black pixels as well. Then try to fit a line to these black pixels, e.g. using the least-squares method. Drop the outliers, and fit another line to the remaining points. Iterate this as required. Evaluate how good the fit is, i.e. how many of the pixels in the observed area are really close to the line, and how far that line extends beyond the observed area. Do this for a number of regions, and you should get a weighted list of lines.
For each line, you can compute the direction of the line itself and the direction orthogonal to it. One of these numbers can be chosen from the interval [0°, 90°); the other will be 90° plus that value, so storing one is enough. Take all these directions and find the one angle which best matches all of them. You can do that using a sliding window of e.g. 5°: slide across that (cyclic) range and find the position where the maximal number of lines fall within the window, then compute the average or median of the angles within that window. All of this can be done taking the weights of the lines into account.
Once you have found the direction of lines, you can rotate your image so that the lines are perfectly aligned to the coordinate axes.
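A condensed sketch of that rotation step, leaning on OpenCV's probabilistic Hough transform instead of hand-rolled line fitting (all thresholds here are guesses you would tune):

import cv2
import numpy as np

def deskew(gray):
    # Estimate the dominant line angle and rotate the scan upright.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                            minLineLength=100, maxLineGap=5)
    # Fold every angle into [-45, 45) so horizontal and vertical
    # lines vote for the same skew.
    angles = [(np.degrees(np.arctan2(y2 - y1, x2 - x1)) + 45) % 90 - 45
              for x1, y1, x2, y2 in lines[:, 0]]
    angle = float(np.median(angles))
    h, w = gray.shape
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(gray, rot, (w, h), borderValue=255)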
Detecting translation
Assuming the image wasn't scaled at any point, you can then try to use an FFT-based correlation of the image to match it to the template. Convert both images to gray, pad them with zeros until the originals take up at most 1/2 the edge length of the padded image, which preferably should be a power of two. FFT both images in both directions, multiply them element-wise (one of them conjugated, so you get correlation rather than convolution) and iFFT back. The resulting image encodes how well the two images agree for a given shift relative to one another. Simply find the maximum, and you know how to make them match.
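A sketch of that correlation with numpy (names are illustrative; both inputs are assumed gray and already deskewed):

import numpy as np

def find_shift(template, scan):
    # Return the (dy, dx) shift that best aligns scan to template.
    sh = [2 * max(a, b) for a, b in zip(template.shape, scan.shape)]
    corr = np.fft.ifft2(np.fft.fft2(template, s=sh) *
                        np.conj(np.fft.fft2(scan, s=sh))).real
    dy, dx = np.unravel_index(corr.argmax(), corr.shape)
    # Indices past the middle wrap around to negative shifts.
    dy = dy - sh[0] if dy > sh[0] // 2 else dy
    dx = dx - sh[1] if dx > sh[1] // 2 else dx
    return dy, dx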
Added text will cause no problems at all. This method will work best for large areas, like the company logo and gray background boxes. Thin lines will provide a poorer match, so in those cases you might have to blur the picture before doing the correlation, to broaden the features. You don't have to use the blurred image for further processing; once you know the offset you can return to the rotated but unblurred version.
Now you know both rotation and translation, and assuming no scaling or shearing, you know exactly which portion of the template corresponds to which portion of the scan. Proceed.
If rotation is solved already, I'd just sum up all pixel color values horizontally and vertically to a single horizontal / vertical "line". This should provide clear spikes where you have horizontal and vertical lines in the form.
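For instance (a minimal numpy sketch, assuming dark ink on white paper; the peak threshold factor is a guess):

import numpy as np

def line_profiles(gray, ink_thresh=128):
    # Sum ink pixels per row and per column; spikes mark form lines.
    ink = (gray < ink_thresh).astype(int)
    row_profile = ink.sum(axis=1)
    col_profile = ink.sum(axis=0)
    # A peak well above the mean is probably a ruled line.
    rows = np.where(row_profile > 4 * row_profile.mean())[0]
    cols = np.where(col_profile > 4 * col_profile.mean())[0]
    return rows, cols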
p.s. I generated a corresponding horizontal image with Gimp's scaling capabilities, attached below (it's a bit hard to see because it's only one pixel high and may get scaled down because it's > 700 px wide; the url is http://i.stack.imgur.com/Zy8zO.png).

Algorithm for counting objects in an image

I got a school task again. This time, my teacher gave me the task of creating an algorithm to count how many ducks are in a picture.
The picture is similar to this one:
I think I should use pattern recognition to find how many ducks are in it, but I don't know which pattern would match each duck.
I think that you can solve this problem by segmenting the ducks' beaks and counting the number of connected components in the binary image.
To segment the ducks' beaks, first convert the image to the HSV color space and then perform a binarization using the hue component. Note that the hue of the ducks' beaks differs from that of the other parts of the image.
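A sketch with OpenCV (the filename and the orange hue band are assumptions to tune against your picture):

import cv2
import numpy as np

img = cv2.imread("ducks.jpg")                    # placeholder filename
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Orange-ish band for the beaks (OpenCV hue runs 0-179).
mask = cv2.inRange(hsv, (5, 100, 100), (25, 255, 255))
# Drop speckle noise before counting connected components.
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
n_labels, _ = cv2.connectedComponents(mask)
print(n_labels - 1, "beak-colored blobs")        # label 0 is the background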
Here's one way:
Hough transform for circles:

    initialize an accumulator array indexed by (x, y, radius)
    for each pixel:
        calculate an edge (e.g. the Sobel operator will provide both magnitude and direction); if the magnitude exceeds some threshold, then:
            increment every accumulator for which this edge could possibly lend evidence (only the (x, y) in the direction of the edge, only radii between min_duck_radius and max_duck_radius)
Now smooth and threshold the accumulator array, and the coordinates of highest accumulators show you where the heads are. The threshold may leap out at you if you histogram the values in the accumulators (there may be a clear difference between "lots of evidence" and "noise").
So that's very terse, but it can get you started.
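OpenCV packages this accumulator machinery as cv2.HoughCircles; a minimal sketch (the filename, radius bounds and thresholds are guesses to tune):

import cv2

gray = cv2.cvtColor(cv2.imread("ducks.jpg"), cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)   # smooth so the accumulator is less noisy
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                           param1=100, param2=30,
                           minRadius=10, maxRadius=40)
print(0 if circles is None else circles.shape[1], "duck-head candidates")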
It might be just because I'm working with SIFT right now, but to me it looks like it could be good for your problem.
It is an algorithm that matches the same object on two different pictures, where the objects can have different orientations, scales and be viewed from different perspectives on the two pictures. It can also work when an object is partially hidden (as your ducks are) by another object.
I'd suggest finding a good clear picture of a rubber ducky ( :D ) and then use some SIFT implementation (VLFeat - C library with SIFT but no visualization, SIFT++ - based on VLFeat, but in C++ , Rob Hess in C with OpenCV...).
You should bear in mind that matching with SIFT (and anything else) is not perfect - so you might not get the exact number of rubber duckies in the picture.
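If you'd rather stay in one environment, OpenCV also ships a SIFT implementation; a sketch of matching a clear duck template against the scene using Lowe's ratio test (the filenames are placeholders):

import cv2

template = cv2.imread("duck_template.jpg", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("ducks.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(template, None)
kp2, des2 = sift.detectAndCompute(scene, None)

# Lowe's ratio test keeps only distinctive matches.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(len(good), "good matches")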

Mapping a list of numeric values to colors

I have a list of numeric values. I may normalize the values if needed.
I need to transform this list to a list of colors (in HSL, RGB or any other color model — I can always do conversion later).
For any given value the color must be the same every time.
The more different two given numeric values are, the more contrasting the corresponding colors should be.
All colors used must contrast with each other as much as possible (this is a soft constraint; a rough solution would do).
Note that list is rather large (thousands of numbers), so simply squeezing all numbers into a single color channel would produce too dense results.
You could consider using a 3D space-filling curve through your chosen colour space. I'll second Mark's CIELAB suggestion, wish I'd known about that last time I had to solve a similar problem.
Whatever algorithm you finally settle on, you might try the CIELAB color space. It normalizes the differences in human color perception, so that equal numeric spacing gives equal perceptual differences.
See: How to automatically generate N "distinct" colors?
It would be best to normalize your values, and run them through the code I suggested (where hue == your value), building a map/hash. (You can use a hash-style function instead, which is probably more efficient.)
You can "randomize" lightness (or brightness, depending on your model) and saturation using some predetermined bits of your number, for example.
Why not use shades of gray? Just calculate the min/max values and use that to translate each number into a different shade from white to black.
I know it's not colors, but in my opinion it'll be easier to interpret the results. I can tell what it means when something is darker vs. lighter, but who is to say that, for example, green is a higher value than orange?

How does Google's image color search work?

Let's say I query for
http://images.google.com.sg/images?q=sky&imgcolor=black
and I get all the black-colored skies, how does the algorithm behind this actually work?
Based on this paper published by Google engineers Henry Rowley, Shumeet Baluja, and Dr. Yushi Jing, it seems the most important implication of your question about recognizing colors in images relates to Google's "saferank" algorithm for pictures, which can detect flesh tones without any text around them.
The paper begins by describing the "classical" methods, which are typically based on normalizing color brightness and then using a Gaussian distribution, or on a three-dimensional histogram built up from the RGB values of the pixels (each color is an 8-bit integer value from 0-255 representing how much of that color is included in the pixel). Methods have also been introduced that rely on properties such as "luminance" (often incorrectly called "luminosity"), the density of luminous intensity as perceived by the naked eye from a given image.
The Google paper mentions that they need to process roughly 10^9 images with their algorithm, so it needs to be as efficient as possible. To achieve this, they perform the majority of their calculations on an ROI (region of interest): a rectangle centered in the image and inset by 1/6 of the image dimensions on all sides. Once they've determined the ROI, they apply many different algorithms to the image, including face-detection and color-constancy algorithms, which as a whole find statistical trends in the image's coloring and, most importantly, find the color shades with the highest frequency in the statistical distribution.
They also use other features such as entropy, edge detection, and texture definitions.
In order to extract lines from the images, they use the OpenCV implementation (Bradski, 2000) of the probabilistic Hough transform (Kiryati et al., 1991) computed on the edges of the skin-color connected components. This allows them to find straight lines that are probably not body parts, and additionally lets them better determine which colors are most important in an image, a key factor in their image color search.
For more on the technicalities of this topic, including the math equations, read the Google paper linked at the beginning and look at the Research section of their web site.
Very interesting question and subject!
Images are just pixels. Pixels are just RGB values. We know what black is in RGB, so we can look for it in an image.
Well, one method is, in very basic terms:
Given a corpus of images, determine the high concentrations of a given color range (this is actually fairly trivial), store this data, index accordingly (index the images according to colors determined from the previous step). Now, you have essentially the same sort of thing as finding documents containing certain words.
This is a very, very basic description of one possible method.
There are various ways of extracting color from an image, and I think other answers addressed them (K-Means, distributions, etc).
Assuming you have extracted the colors, there are a few ways to search by color. One slow, but obvious approach would be to calculate the distance between the search color and the dominant colors of the image using some metric (e.g. Color Difference), and then weight the results based on "closeness."
Another, much faster, approach would be to essentially downscale the resolution of your color space. Rather than deal with all possible RGB color values, limit the extraction to a smaller range like Google does (just Blue, Green, Black, Yellow, etc). Then the user can search with a limited set of color swatches and calculating color distance becomes trivial.
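A toy sketch of that downscaled color space (the swatch palette is made up; a real system would choose its swatches carefully):

import numpy as np

# A tiny made-up swatch palette: name -> RGB.
SWATCHES = {"black": (0, 0, 0), "white": (255, 255, 255),
            "red": (220, 30, 30), "green": (40, 160, 60),
            "blue": (40, 80, 200), "yellow": (230, 200, 40)}

def nearest_swatch(rgb):
    # Snap an extracted dominant color to the closest swatch so that
    # images can be indexed by a handful of color keywords.
    rgb = np.asarray(rgb, float)
    return min(SWATCHES, key=lambda name:
               np.linalg.norm(rgb - np.asarray(SWATCHES[name], float)))

# e.g. index each image under nearest_swatch(its_dominant_color)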
