I am playing with computer graphics programming for the first time. I want to convert RGB (24-bit) images to indexed-palette (8-bit) images (like GIF). My initial thought is to use k-means (with k=256).
How would one go about picking the optimal palette for a given image? This is a learning experience for me, so I would prefer an overview-type answer to source code.
Edit: Dithering is currently off-topic. I am only referring to "simple" color conversion, psycho-visual/perceptual models aside; color-space is also currently off-topic, though moving between color-spaces is what got me thinking about this in the first place :)
http://en.wikipedia.org/wiki/Color_quantization
Octree
Median-cut
K-means
Gamut subdivision
http://www.cs.berkeley.edu/~dcoetzee/downloads/scolorq/
The reference links people have provided are good, and there are several solutions to this problem, but since I've been working on this problem recently (with complete ignorance as to how others have solved it), I offer my approach in plain English:
Firstly, realize that (human perceived) color is 3-dimensional. This is fundamentally because the human eye has 3 distinct receptors: red, green, and blue. Likewise your monitor has red, green, and blue pixel elements. Other representations, like, hue, saturation, luminance (HSL) can be used, but basically all representations are 3-dimensional.
This means RGB color space is a cube, with red, green, and blue axes. From a 24-bit source image, this cube has 256 discrete levels on each axis. A naive approach to reducing the image to 8-bit is to simply reduce the levels per axis. For instance, an 8x8x4 cube palette with 8 levels for red and green, 4 levels for blue is easily created by taking the high 3 bits of the red and green values, and the high 2 bits of the blue value. This is easy to implement, but has several disadvantages. In the resulting 256 color palette, many entries will not be used at all. If the image has detail using very subtle color shifts, these shifts will disappear as the colors snap into the same palette entry.
An adaptive palette approach needs to account for not just averaged/common colors in the image, but which areas of color space have the greatest variance. That is, an image that has thousands of subtle shades of light green requires a different palette than an image that has thousands of pixels of exactly the same shade of light green, since the latter would ideally use a single palette entry for that color.
To this end, I took an approach that results in 256 buckets containing exactly the same number of distinct values each. So if the original image contained 256000 distinct 24-bit colors, this algorithm results in 256 buckets each containing 1000 of the original values. This is accomplished by binary spatial partitioning of color space using the median of distinct values present (not the mean).
In English, this means we first divide the whole color cube into the half of pixels with less than the median red value and the half with more than the median red value. Then, divide each resulting half by green value, then by blue, and so on. Each split requires a single bit to indicate the lower or higher half of pixels. After 8 splits, variance has effectively been split into 256 equally important clusters in color space.
In psuedo-code:
// count distinct 24-bit colors from the source image
// to minimize resources, an array of arrays is used
paletteRoot = {colors: [ [color0,count],[color1,count], ...]} // root node has all values
for (i=0; i<8; i++) {
colorPlane = i%3 // red,green,blue,red,green,blue,red,green
nodes = leafNodes(paletteRoot) // on first pass, this is just the root itself
for (node in nodes) {
node.colors.sort(colorPlane) // sort by red, green, or blue
node.lo = { colors: node.colors[0..node.colors.length/2] }
node.hi = { colors: node.colors[node.colors.length/2..node.colors.length] }
delete node.colors // free up space! otherwise will explode memory
node.splitColor = node.hi.colors[0] // remember the median color used to partition
node.colorPlane = colorPlane // remember which color this node split on
}
}
You now have 256 leaf nodes, each containing the same number of distinct colors from the original image, clustered spatially in the color cube. To assign each node a single color, find the weighted average using the color counts. The weighting is an optimization that improves perceptual color matching, but is not that important. Make sure to average each color channel independently. The results are excellent. Note that it is intentional that blue is divided once less than red and green, since the blue receptors in the eye are less sensitive to subtle changes than red and green.
There are other optimizations possible. By using HSL you could instead put the higher quantizing in the luminance dimension instead of blue. Also the above algorithm will slightly reduce overall dynamic range (since it ultimately averages color values), so dynamically expanding the resulting palette is another possibility.
EDIT:
updated to support palette of 256 colors
If you need simplest method then I would suggest histogram based approach:
Calculate histograms of R/G/B channels
Define 4 intensity ranges
For each channel in intensity range
Split histogram into 4 equal parts
For each histogram part
Extract most frequent value of that part
Now you will have 4*4^3=256 colors palette. When assigning pixel to palette color, just calculate average intensity of pixel to see what intensity region you must use. After that just map one of those 64 colors of intensity region to pixel value.
Good luck.
This might be a little late to answer, but try this:
make a set of each color in the image,
sort them according to red, then green then blue (if preceding channels are equal), (they are now into a list)
suppress nearby colors if they are too similar, ie distance in rgb space is smaller than 4.
if they are still to much colors, suppress the least used ones.
Each time you supress a color, you will have to add into a hash_map the colors and it's destination.
Related
I have segmentation masks with indexed colors. Unfortunately there is (colored) noise at the edges of objects. At the transition from one color region to the next, there are small pixel regions in different colors, separating the two color regions (caused by converting transparent pixels at the edges).
I want to remove this noise (with MATLAB) by assigning a color of one of the neighboring large regions. It doesn't matter, which one - main thing is to remove the small areas.
It can be assumed that small regions of ANY color may be removed in this way (reassign to neighboring large region).
In case of a binary image, I could use bwareaopen (suggested in this Q&A: Remove small chunks of labels in an image). Converting the image to a binary image for each color might be a workaround, however this is costly (for many colors) and leaves the question of reassignment open. I hope there are more elegant ways to do this.
Check the following:
Convert RGB to indexed image.
Apply median filter on indexed map.
Convert back to RGB
RGB = imread('GylzKm.png');
%Convert RGB to indexed image with 4 levels
[X, map] = rgb2ind(RGB, 4);
%Apply median filter on 4 levels images
X = medfilt2(X, [5, 5]);
%Convert indexed image back to RGB.
J = ind2rgb(X, map);
figure;imshow(J);
The black border is a little problematic.
I recently asked a question about using matlab to reduce the number of colors in an image. However, when I attempted this, I was only able to get color approximations which then matched the pixel to the nearest color within the color map.
For example, using a color map with only three colors [red, green, blue], it would scan each color and then map either red green or blue. However, this process did not vary the RGB densities to create realistic looking color.
I'm curious if there is any sort of built in function that would use these three colors and vary the density of them to achieve the average color of a certain "pixel field".
I realize this would lose resolution, but I'm essentially trying to make realistic looking images, using only three colors by varying the amounts of RGB within a certain region.
You are looking for the function rgb2ind and its 'dither' option.
I have more then 1 week reading about selective color change of an image. It meand selcting a color from a color picker and then select a part of image in which I want to change the color and apply the changing of color form original color to color of color picker.
E.g. if I select a blue color in color picker and I also select a red part in the image I should be able to change red color to blue color in all the image.
Another example. If I have an image with red apples and oranges and if I select an apple on the image and a blue color in the color picket, then all apples should be changing the color from red to blue.
I have some ideas but of course I need something more concrete on how to do this
Thank you for reading
As a starting point, consider clustering the colors of your image. If you don't know how many clusters you want, then you will need methods to determine whether to merge or not two given clusters. For the moment, let us suppose that we know that number. For example, given the following image at left, I mapped its colors to 3 clusters, which have the mean colors as shown in the middle, and representing each cluster by its mean color gives the figure at right.
With the output at right, now what you need is a method to replace colors. Suppose the user clicks (a single point) somewhere in your image, then you know the positions in the original image that you will need to modify. For the next image, the user (me) clicked on a point that is contained by the "orange" cluster. Then he clicked on some blue hue. From that, you make a mask representing the points in the "orange" cluster and play with that. I considered a simple gaussian filter followed by a flat dilation 3x5. Then you replace the hues in the original image according to the produced mask (after the low pass filtering, the values on it are also considered as a alpha value for compositing the images).
Not perfect at all, but you could have a better clustering than me and also a much-less-primitive color replacement method. I intentionally skipped the details about clustering method, color space, and others, because I used only basic k-means on RGB without any pre-processing of the input. So you can consider the results above as a baseline for anything else you can do.
Given the image, a selected color, and a target new color - you can't do much that isn't ugly. You also need a range, some amount of variation in color, so you can say one pixel's color is "close enough" while another is clearly "different".
First step of processing: You create a mask image, which is grayscale and varying from 0.0 to 1.0 (or from zero to some maximum value we'll treat as 1.0), and the same size as the input image. For each input pixel, test if its color is sufficiently near the selected color. If it's "the same" or "close enough" put 1.0 in the mask. If it's different, put 0.0. If is sorta borderline, put an in-between value. Exactly how to do this depends on the details of the image.
This might work best in LAB space, and testing for sameness according to the angle of the A,B coordinates relative to their origin.
Once you have the mask, put it aside. Now color-transform the whole image. This might be best done in HSV space. Don't touch the V channel. Add a constant to S, modulo 360deg (or mod 256, if S is stored as bytes) and multiply S by a constant chosen so that the coordinates in HSV corresponding to the selected color is moved to the HSV coordinates for the target color. Convert the transformed S and H, with the unchanged L, back to RGB.
Finally, use the mask to blend the original image with the color-transformed one. Apply this to each channel - red, green, blue:
output = (1-mask)*original + mask*transformed
If you're doing it all in byte arrays, 0 is 0.0 and 255 is 1.0, and be careful of overflow and signed/unsigned problems.
I am playing with computer graphics programming for the first time. I want to convert RGB (24-bit) images to indexed-palette (8-bit) images (like GIF). My initial thought is to use k-means (with k=256).
How would one go about picking the optimal palette for a given image? This is a learning experience for me, so I would prefer an overview-type answer to source code.
Edit: Dithering is currently off-topic. I am only referring to "simple" color conversion, psycho-visual/perceptual models aside; color-space is also currently off-topic, though moving between color-spaces is what got me thinking about this in the first place :)
http://en.wikipedia.org/wiki/Color_quantization
Octree
Median-cut
K-means
Gamut subdivision
http://www.cs.berkeley.edu/~dcoetzee/downloads/scolorq/
The reference links people have provided are good, and there are several solutions to this problem, but since I've been working on this problem recently (with complete ignorance as to how others have solved it), I offer my approach in plain English:
Firstly, realize that (human perceived) color is 3-dimensional. This is fundamentally because the human eye has 3 distinct receptors: red, green, and blue. Likewise your monitor has red, green, and blue pixel elements. Other representations, like, hue, saturation, luminance (HSL) can be used, but basically all representations are 3-dimensional.
This means RGB color space is a cube, with red, green, and blue axes. From a 24-bit source image, this cube has 256 discrete levels on each axis. A naive approach to reducing the image to 8-bit is to simply reduce the levels per axis. For instance, an 8x8x4 cube palette with 8 levels for red and green, 4 levels for blue is easily created by taking the high 3 bits of the red and green values, and the high 2 bits of the blue value. This is easy to implement, but has several disadvantages. In the resulting 256 color palette, many entries will not be used at all. If the image has detail using very subtle color shifts, these shifts will disappear as the colors snap into the same palette entry.
An adaptive palette approach needs to account for not just averaged/common colors in the image, but which areas of color space have the greatest variance. That is, an image that has thousands of subtle shades of light green requires a different palette than an image that has thousands of pixels of exactly the same shade of light green, since the latter would ideally use a single palette entry for that color.
To this end, I took an approach that results in 256 buckets containing exactly the same number of distinct values each. So if the original image contained 256000 distinct 24-bit colors, this algorithm results in 256 buckets each containing 1000 of the original values. This is accomplished by binary spatial partitioning of color space using the median of distinct values present (not the mean).
In English, this means we first divide the whole color cube into the half of pixels with less than the median red value and the half with more than the median red value. Then, divide each resulting half by green value, then by blue, and so on. Each split requires a single bit to indicate the lower or higher half of pixels. After 8 splits, variance has effectively been split into 256 equally important clusters in color space.
In psuedo-code:
// count distinct 24-bit colors from the source image
// to minimize resources, an array of arrays is used
paletteRoot = {colors: [ [color0,count],[color1,count], ...]} // root node has all values
for (i=0; i<8; i++) {
colorPlane = i%3 // red,green,blue,red,green,blue,red,green
nodes = leafNodes(paletteRoot) // on first pass, this is just the root itself
for (node in nodes) {
node.colors.sort(colorPlane) // sort by red, green, or blue
node.lo = { colors: node.colors[0..node.colors.length/2] }
node.hi = { colors: node.colors[node.colors.length/2..node.colors.length] }
delete node.colors // free up space! otherwise will explode memory
node.splitColor = node.hi.colors[0] // remember the median color used to partition
node.colorPlane = colorPlane // remember which color this node split on
}
}
You now have 256 leaf nodes, each containing the same number of distinct colors from the original image, clustered spatially in the color cube. To assign each node a single color, find the weighted average using the color counts. The weighting is an optimization that improves perceptual color matching, but is not that important. Make sure to average each color channel independently. The results are excellent. Note that it is intentional that blue is divided once less than red and green, since the blue receptors in the eye are less sensitive to subtle changes than red and green.
There are other optimizations possible. By using HSL you could instead put the higher quantizing in the luminance dimension instead of blue. Also the above algorithm will slightly reduce overall dynamic range (since it ultimately averages color values), so dynamically expanding the resulting palette is another possibility.
EDIT:
updated to support palette of 256 colors
If you need simplest method then I would suggest histogram based approach:
Calculate histograms of R/G/B channels
Define 4 intensity ranges
For each channel in intensity range
Split histogram into 4 equal parts
For each histogram part
Extract most frequent value of that part
Now you will have 4*4^3=256 colors palette. When assigning pixel to palette color, just calculate average intensity of pixel to see what intensity region you must use. After that just map one of those 64 colors of intensity region to pixel value.
Good luck.
This might be a little late to answer, but try this:
make a set of each color in the image,
sort them according to red, then green then blue (if preceding channels are equal), (they are now into a list)
suppress nearby colors if they are too similar, ie distance in rgb space is smaller than 4.
if they are still to much colors, suppress the least used ones.
Each time you supress a color, you will have to add into a hash_map the colors and it's destination.
Sometimes I have a true colored image, by using dithering algorithm, I can reduce the color to just 256. I want to know how the dithering algorithm achieve this.
I understand that dithering can reduce the error, but how can the algorithm decrease color depth, especially from true color to just 256 colors or even less.
Dithering simulates a higher color depth by "mixing" the colors in a defined palette to create the illusion of a color that isn't really there. In reality, it's doing the same thing that your computer monitor is already doing: taking a color, decomposing it into primary colors, and displaying those right next to each other. Your computer monitor does it with variable-intensity red, green, and blue, while dithering does it with a set of fixed-intensity colors. Since your eye has limited resolution, it sums the inputs, and you perceive the average color.
In the same way, a newspaper can print images in grayscale by dithering the black ink. They don't need lots of intermediate gray colors to get a decent grayscale image; they simply use smaller or larger dots of black ink on the page.
When you dither an image, you lose information, but your eye perceives it in largely the same way. In this sense, it's a little like JPEG or other lossy compression algorithms which discard information that your eye can't see.
Dithering by itself does not decrease the number of colors. Rather, dithering is applied during the process of reducing the colors to make the artifacts of the color reduction less visible.
A color that is halfway between two other colors can be simulated by a pattern that is half of one color and half of the other. This can be generalized to other percentages as well. A color that is a mixture of 10% of one color and 90% of the other can be simulated by having 10% of the pixels be the first color and 90% of the pixels be the second. This is because the eye will tend to consider the random variations as noise and average them into the overall impression of the color of an area.
The most effective dithering algorithms will track the difference between the original image and the color-reduced one, and account for that difference while converting future pixels. This is called error diffusion - the errors on the current pixel are diffused into the conversions of other pixels.
The process of selecting the best 256 colors for the conversion is separate from dithering.