Mapping a list of numeric values to colors - algorithm

I have a list of numeric values. I may normalize the values if needed.
I need to transform this list to a list of colors (in HSL, RGB or any other color model — I can always do conversion later).
For any given value the color must be the same every time.
The more two given numeric values differ, the more the corresponding colors should contrast.
All colors used should contrast with each other as much as possible (this is a soft constraint; a rough solution would do).
Note that the list is rather large (thousands of numbers), so simply squeezing all the numbers into a single color channel would produce results that are too dense.

You could consider using a 3D space-filling curve through your chosen colour space. I'll second Mark's CIELAB suggestion; I wish I'd known about it the last time I had to solve a similar problem.

Whatever algorithm you finally settle on, you might try the CIELAB color space. It normalizes the differences in human color perception, so that equal numeric spacing gives equal perceptual differences.

See: How to automatically generate N "distinct" colors?
It would be best to normalize your values, and run them through the code I suggested (where hue == your value), building a map/hash. (You can use a hash-style function instead, which is probably more efficient.)
You can "randomize" lightness (or brightness, depending on your model) and saturation using some predetermined bits of your number, for example.
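As a rough sketch of that idea (not the code from the linked answer), the normalized value can drive the hue while a few bits of a quantized index pick the lightness and saturation. The function name, the 0.8 hue span and the bit layout are all assumptions made for illustration.

```python
import colorsys

def value_to_color(value, lo, hi):
    """Map a number in [lo, hi] to an (r, g, b) tuple with channels in [0, 1]."""
    # Normalize to [0, 1]; a degenerate range maps everything to 0.
    t = (value - lo) / (hi - lo) if hi > lo else 0.0
    # Assumption: use only 80% of the hue circle so the top of the
    # range does not wrap around to the same red as the bottom.
    hue = t * 0.8
    # Assumption: quantize to 12 bits and reuse the low bits to vary
    # lightness and saturation deterministically.
    index = int(t * 4095)
    lightness = 0.35 + 0.10 * (index & 0b11)          # four lightness levels
    saturation = 0.60 + 0.20 * ((index >> 2) & 0b1)   # two saturation levels
    return colorsys.hls_to_rgb(hue, lightness, saturation)
```

The same input always gives the same color. The trade-off is that neighbouring values get different lightness, which helps in a dense list but loosens the "closer values look closer" property.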

Why not use shades of gray? Just calculate the min/max values and use that to translate each number into a different shade from white to black.
I know it's not colors, but in my opinion it'll be easier to interpret the results. I can tell what it means when something is darker vs. lighter, but who is to say that, for example, green is a higher value than orange?
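A minimal sketch of the min/max translation, assuming 8-bit gray levels and that the smallest value should be white and the largest black (the function name is made up for this example):

```python
def to_gray_levels(values):
    """Translate numbers into 8-bit gray levels, white (255) for min, black (0) for max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are equal, so any single shade will do.
        return [128 for _ in values]
    return [round(255 * (hi - v) / span) for v in values]
```

For example, `to_gray_levels([0, 1, 4])` gives `[255, 191, 0]`.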

Related

Dominant "color" of an image

I have the following image:
What I want to do is "id" the individual strips based on their dominant color. What is the best approach to do this?
What I've done is take each pixel's value channel (HSV) and build a distribution of how often each value occurs. The problem is that for strip0 the counts are [27=32191, 28=5433, others=8] and for strip1 they are [26=7107, 27=23111, others=22], so I can't get a definitive distinction.
The project's main goal is to compare an actual yellow-colored paper to the strips and determine which strip is the most similar.
First, since you know the boundaries of each strip in the reference image, the only problem possible here is that your reference image is noisy. A relatively overkill way to handle that is clustering the colors in each strip and taking the cluster's centroid as the representative color of the strip. In order to get a more meaningful response here, consider the CIELAB colorspace for this step. Doing this, and converting the results back to RGB, for the first strip I get the rgb triplet (0.949375, 0.879872, 0.147898), and for the second strip (0.945324, 0.857322, 0.129756) (each channel in range [0, 1]).
When you get a new image, you perform the same operation. But there are a lot of potential problems here. For instance, how are you handling the white balance in this input image? Supposing you have no such problem, it is now only a matter of finding the nearest reference color to the one you just found by the same process. To find the nearest color you again need a meaningful colorspace, and CIELAB is recommended again since the well-established Delta-E functions are defined on it. See http://en.wikipedia.org/wiki/Color_difference for some such metrics, the simplest being the Euclidean distance in CIELAB.
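To make the nearest-color step concrete, here is a small sketch using the simplest metric mentioned, Euclidean distance in CIELAB. It assumes the inputs are sRGB triplets in [0, 1] and a D65 white point; the function names are invented for this example, and the clustering step is not shown.

```python
import math

def srgb_to_lab(rgb):
    """Convert an sRGB triplet (channels in [0, 1]) to CIELAB, assuming D65 white."""
    def to_linear(c):
        # Undo the sRGB gamma encoding.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    # Linear sRGB to CIE XYZ.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    # Divide by the D65 reference white before applying f.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def nearest_reference(sample_rgb, reference_rgbs):
    """Return the index of the reference color closest to the sample in CIELAB."""
    sample = srgb_to_lab(sample_rgb)
    distances = [math.dist(sample, srgb_to_lab(ref)) for ref in reference_rgbs]
    return min(range(len(distances)), key=distances.__getitem__)
```

With the two strip colors quoted above as references, a sample such as `(0.945, 0.858, 0.13)` should come out nearest to the second strip.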
Calibrate your equipment. If you do not calibrate your equipment, you will have arbitrary errors between the test sample and the reference. Lighting is part of your equipment.
Use edge detection and your knowledge of the reference strip's geometry (strips are equal width) to determine sampling regions. For each sampling region, extract an internal patch.
For the test strip, compute an image where each pixel is the max difference within a sampling window (e.g. 5x5). This will let you identify a relatively homogeneous region which is dissimilar to the outside region (i.e. the paper). Extract a patch.
Use downsampling to find an integrated color for each patch per svnpenn's advice. You can look at other computation methods later, but this should work quite well.
For weights wh, ws, wv, compute similarity = wh*abs(h0-h1) + ws*abs(s0-s1) + wv*abs(v0-v1) between the test color and each reference color. You can look at other distance measures later, but this should work quite well. Start with equal weights. One perk to this method is that it behaves well regardless of the dimension or combination of dimensions under which the reference strip varies.
Sort the results to find the most similar and second most similar matches. Note that similarity is set up so zero is an exact match, and a big number is a poor match. Use the ratio of these two results to estimate the quality of the most similar match - if the first two matches are very close, it's probably not a great match to either.
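The last two steps could be sketched like this. Colors are assumed to be (h, s, v) tuples with every channel in [0, 1]; the hue wrap-around is an addition that is not in the formula above, and the function names are made up.

```python
def hsv_distance(c0, c1, wh=1.0, ws=1.0, wv=1.0):
    """Weighted absolute difference between two (h, s, v) colors; 0 is an exact match."""
    dh = abs(c0[0] - c1[0])
    # Assumption: hue is circular, so 0.95 and 0.05 are only 0.1 apart.
    dh = min(dh, 1.0 - dh)
    return wh * dh + ws * abs(c0[1] - c1[1]) + wv * abs(c0[2] - c1[2])

def best_match(test, references):
    """Return (index of the closest reference, ratio of best to second-best distance)."""
    ranked = sorted(range(len(references)),
                    key=lambda i: hsv_distance(test, references[i]))
    d_best = hsv_distance(test, references[ranked[0]])
    d_second = hsv_distance(test, references[ranked[1]])
    # A ratio near 0 means a clear winner; near 1 means the top two are a toss-up.
    ratio = d_best / d_second if d_second > 0 else 1.0
    return ranked[0], ratio
```

`best_match` needs at least two reference colors, since the quality estimate compares the best match against the runner-up.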
You can scan through all the colors and use a hashtable to keep track of how many pixels of each color there are.
Take those numbers and, remembering which colors they correspond to, sort them in decreasing order.
Look at the sorted list of numbers and find the difference between each consecutive pair of numbers. Keep track of the indices in the list of the two numbers that resulted in each difference. Sort this difference list.
Look at the maximum number in the difference list. You now have the biggest drop-off between two sets of pixels. Go find which was the bigger one. Everything with this number of pixels and above is a dominant color. Everything below is a sub-dominant color. Now you know how many dominant colors you have, and what they are.
Should be pretty easy from there to do whatever it is you want to do.
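A compact sketch of the counting and drop-off steps, assuming the pixels arrive as a flat list of hashable color values (the function name is hypothetical). Instead of sorting the difference list, it just takes the position of the largest difference, which gives the same cut.

```python
from collections import Counter

def dominant_colors(pixels):
    """Return the colors that sit above the biggest drop-off in pixel counts."""
    # (color, count) pairs, sorted by decreasing count.
    counts = Counter(pixels).most_common()
    if len(counts) < 2:
        return [color for color, _ in counts]
    # Difference between each consecutive pair of counts.
    drops = [counts[i][1] - counts[i + 1][1] for i in range(len(counts) - 1)]
    # Position of the largest drop; everything up to and including it is dominant.
    cut = max(range(len(drops)), key=drops.__getitem__)
    return [color for color, _ in counts[:cut + 1]]
```

For counts of 100, 90, 3 and 2, the largest drop is between 90 and 3, so the first two colors are reported as dominant.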
The only time this wouldn't work is if some of the noise was of the same color as a strip, so much so that it corrupted your data.
In this case, you would use a different approach, which you can also use in the first case - looking at runs. Go through the pixels, and each time you find a new color, look at how many of the following pixels are of the same color.
Use the method described earlier to cluster the colors into dominant and non-dominant, for the same result.
In both cases, if you know that the picture is of vertical strips, you could limit the number of horizontal lines of colors you look at to make things go faster.
You could split the image into sections, then resize each section to one pixel. This is an example using the whole image:
$ convert Y82IirS.jpg -resize 1x1 txt:
# ImageMagick pixel enumeration: 1,1,255,srgb
0,0: (220,176, 44) #DCB02C srgb(220,176,44)
Average colour of an image

Value as colour representation

Converting a value to a colour is a well-known problem, and I understand the following two approaches (very well described in changing rgb color values to represent a value):
Value as shades of grey
Value as brightness of a base colour (e.g. brightness of blue)
But what is the best algorithm when I want to use the full colour range ("all colours"). When I use "greys" with 8bit RGB values, I actually do have a representation of 256 shades (white to black). But if I use the whole range, I could use more shades. Something like this. Also this would be easier to recognize.
Basically I need the algorithm in Javascript, but I guess all code such as C#, Java, pseudo code would do as well. The legend at the bottom shows the encoding, and I am looking for the algorithm for this.
So having a range of values(e.g. 1-1000), I could represent 1 as white and 1000 as black, but I could also represent 1 as yellow and 1000 as blue. But is there a standard algorithm for this? Looking at the example here, it is shown that they use colour intervals. I do not only want to use greys or change the brightness, but use all colours.
This is a visual demonstration (Flash required). Given values that are represented in a color scheme, my goal is to calculate the colours.
I do have a linear colour range, e.g. from 1-30000
-- Update --
Here I found that there is something called Lab space:
Lab space is a way of representing colours where points that are close to each other are those that look similar to each other to humans.
So what I would need is an algorithm to represent the linear values in this lab space.
There are two basic ways to specify colors. One is a pre-defined list of colors (a palette) and then your color value is an index into this list. This is how old 8-bit color systems worked, and how GIF images still work. There are lists of web-safe colors, eg http://en.wikipedia.org/wiki/Web_colors, that typically fit into an 8-bit value. Often similar colors are adjacent, but sometimes not.
A palette has the advantage of requiring a small amount of data per pixel, but the disadvantage that you're limited in the number of different colors that can be on the screen at the same time.
The other basic way is to specify the coordinates of a color. One way is RGB, with a separate value for each primary color. Another is Hue/Saturation/Luminance. CMYK (Cyan, Magenta, Yellow and sometimes blacK) is used for print. This is what's typically referred to as true color and when you use a phrase like "all colors" it sounds like you're looking for a solution like this. For gradients and such HSL might be a perfect fit for you. For example, a gradient from a color to grey simply reduces the saturation value. If all you want are "pure" colors, then fix the saturation and luminance values and vary the hue.
Nearly all drawing systems require RGB, but the conversion from HSL to RGB is straightforward. http://en.wikipedia.org/wiki/HSL_and_HSV
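As an illustration of "fix the saturation and luminance values and vary the hue", here is a small sketch that maps a value in a linear range to a hex color. The 1-30000 default range comes from the question; the function name and the choice to sweep only two thirds of the hue circle (red to blue) are assumptions.

```python
import colorsys

def value_to_hex(value, lo=1, hi=30000, hue_span=2 / 3):
    """Map a value in [lo, hi] to a '#rrggbb' string by varying only the hue."""
    # Normalize and clamp to [0, 1].
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    # Fixed lightness 0.5 and saturation 1.0 give "pure" colors.
    r, g, b = colorsys.hls_to_rgb(t * hue_span, 0.5, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
```

With these defaults the low end of the range is red and the high end is blue, passing through yellow, green and cyan on the way.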
If you can't spare the full 24 bits per color (8 bits per channel; 32-bit color is the same but adds a transparency channel), you can use 15- or 16-bit color. It's the same idea, but instead of 8 bits per channel you get 5 each (15-bit) or 5-6-5 (16-bit; green gets the extra bit because our eyes are more sensitive to shades of green). That fits into a short integer.
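For the 5-6-5 layout, the packing is just shifts and masks. This is a sketch with made-up function names; expanding back to 8 bits by replicating the high bits is one common choice, not the only one.

```python
def pack_rgb565(r, g, b):
    """Pack 8-bit r, g, b into a 16-bit value: 5 bits red, 6 bits green, 5 bits blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def unpack_rgb565(value):
    """Expand a 16-bit 5-6-5 value back to approximate 8-bit r, g, b."""
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    # Replicate the high bits into the low bits so full scale maps back to 255.
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))
```

The round trip is lossy: the low 3 bits of red and blue and the low 2 bits of green are discarded when packing.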
It depends on the purposes of your datasets.
For example, you can assign a color to each range of values (0-100 - red, 100-200 - green, 200-300 - blue) by changing the brightness within the range.
Horst,
The example you gave does not create gradients. Instead, they use N preset colors from an array and pick the next color as umbr points out. Something like this:
a = { "#ffffff", "#ff00ff", "#ff0000", "#888888", ... };
c = a[pos / 1000];
where pos is your value from 1 to 30,000 and c is the color you want to use. (You'd need to define the index better than pos / 1000 for this to work right in all situations.)
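One way the index could be defined more carefully is to scale the position by the palette length and clamp the result, so any palette size works and the top value never runs off the end. This is only a sketch; the function name and default range are assumptions.

```python
def palette_color(pos, palette, lo=1, hi=30000):
    """Pick the preset color for pos, splitting [lo, hi] into equal-sized bins."""
    # Fraction of the way through the range, in [0, 1) for pos in [lo, hi].
    t = (pos - lo) / (hi - lo + 1)
    index = int(t * len(palette))
    # Clamp so out-of-range values still land on a valid entry.
    index = max(0, min(index, len(palette) - 1))
    return palette[index]
```

With a four-color palette, positions 1 to 7500 land in the first bin, 7501 to 15000 in the second, and so on.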
If you want a gradient effect, you can just use the simple math shown on the other answer you pointed out, although if you want to do that with any number of points, it has to be done with triangles. You'll have a lot of work to determine the triangles and properly define every point.
In JavaScript, it will be dog slow. (with OpenGL it would be instantaneous and you would not even have to compute the gradients, and that would be "faster than realtime.")
What you need is a transfer function.
Given a floating-point number, a transfer function generates a color.
See this:
http://http.developer.nvidia.com/GPUGems/gpugems_ch39.html
and this:
http://graphicsrunner.blogspot.com/2009/01/volume-rendering-102-transfer-functions.html
The second article says that the isovalue is in the range [0,255], but it doesn't have to be in that range.
Normally, we scale any floating-point number to the [0,1] range and apply the transfer function to get the color value.
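A very small transfer function can be sketched as linear interpolation between a few control points. The control points below (blue, white, red) are purely hypothetical, and this is not the code from either linked article.

```python
from bisect import bisect_right

# Hypothetical control points: (position in [0, 1], (r, g, b) with 8-bit channels).
CONTROL_POINTS = [
    (0.0, (0, 0, 255)),
    (0.5, (255, 255, 255)),
    (1.0, (255, 0, 0)),
]

def transfer(x, points=CONTROL_POINTS):
    """Map a scalar in [0, 1] to an (r, g, b) color by piecewise-linear interpolation."""
    x = min(max(x, 0.0), 1.0)
    positions = [p for p, _ in points]
    # Index of the control point at the right end of the segment containing x.
    i = min(bisect_right(positions, x), len(points) - 1)
    (p0, c0), (p1, c1) = points[i - 1], points[i]
    t = (x - p0) / (p1 - p0)
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))
```

Values outside [0, 1] are clamped, so scaling the raw data into that range first is still the caller's job.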

DensityPlot command in mathematica

The info for DensityPlot says that the "default generates colorized grayscale output, in which larger values are shown lighter." What on earth is colorized grayscale? Is there a way to make it truly grayscale without the blue and purple colors that it generates? And when I do it, it appears a little pixelated. Is there a way to evaluate it at more points so that it doesn't look so choppy?
By colorized grayscale, I think they mean that it's monochrome, or maybe bichromatic - that is, there's a linear scale from one color to another, rather than fully varying across the whole color space. It's not a very good term, I agree.
Specifying ColorFunction->GrayLevel should give you pure grayscale. This is distinct from the built-in gradient GrayTones (ColorFunction->"GrayTones"), which appears to stop a bit short of pure black and white on the ends and is a bit warm. There are plenty of other built-in gradients - see the return value of ColorData["Gradients"]. You can also specify your own function, of course - it will take as input a real number from 0 to 1, and should return a color specification, e.g. the return values of GrayLevel, RGBColor, Hue, or CMYKColor.
To make it less choppy, as with basically all plotting functions, try specifying a higher value for PlotPoints (the number of initial sampling points) or MaxRecursion (how many times it can resample).

Color generation based on random number

I would like to create a color generator based on random numbers, which might differ just slightly, but I need the colors to be easily distinguishable from each other. I was thinking about generating them in RGB format, which would probably be easiest. I'm afraid simply multiplying the given arguments wouldn't work very well. What algorithm do you suggest using? Also, each generated color should not be the same as the previous one, but I don't want to store them, and multiplying by (micro)time wouldn't work well either, since parts of the script usually run faster than that.
If you wanted truly random colors, then generating the same color 10 times in a row would be acceptable. To get values that are perceived as random, you have to strip out true randomness.
The easiest way to do this is probably with a cycling index into a list of colors. Say you pick web colors, a list of 216 colors. Each time you want a new color, add a random number to the index, wrapping as needed. To prevent getting the same color, limit random numbers to less than the number of colors.
colorIndex = ( colorIndex + ( random() % 100 ) + 1 ) % 216;
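The same cycling-index step, written out as a small Python sketch. The palette size of 216 and the 1-100 step come from the text above; the function name and the `rng` parameter are assumptions added so the behaviour can be made repeatable.

```python
import random

PALETTE_SIZE = 216  # number of web colors in the lookup table

def next_color_index(current, rng=random):
    """Advance the palette index by a random step that can never land on the same color."""
    # Step is between 1 and 100, always less than PALETTE_SIZE,
    # so (current + step) % PALETTE_SIZE is never equal to current.
    step = rng.randrange(1, 101)
    return (current + step) % PALETTE_SIZE
```

Only the current index has to be remembered between calls, not the history of colors.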
If you do not want a lookup table, then generate HSB colors but limit the hue to part of the circle that does not include the previous color. If the previous hue was 60 degrees, then pick the next hue above 90 or below 30 degrees, for example. You probably want to limit the saturation and brightness to be above 50% or so.
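A sketch of the lookup-free variant: pick the next hue at least 30 degrees away from the previous one (around the circle), and keep saturation and brightness above 50%. The function names and the `rng` parameter are invented for this example.

```python
import colorsys
import random

def next_hue(previous_hue, min_gap=30.0, rng=random):
    """Return a hue in degrees that is at least min_gap away from previous_hue."""
    # Rotating by an offset in [min_gap, 360 - min_gap] excludes the
    # arc of width 2 * min_gap centred on the previous hue.
    offset = rng.uniform(min_gap, 360.0 - min_gap)
    return (previous_hue + offset) % 360.0

def next_color(previous_hue, rng=random):
    """Return (hue in degrees, (r, g, b) in [0, 1]) for the next color."""
    hue = next_hue(previous_hue, rng=rng)
    saturation = rng.uniform(0.5, 1.0)
    brightness = rng.uniform(0.5, 1.0)
    return hue, colorsys.hsv_to_rgb(hue / 360.0, saturation, brightness)
```

Only the previous hue needs to be kept around to generate the next color.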
There are 256*256*256 (about 16.7 million) possible combinations of colors if you generate a random number for each of the R, G and B values.
I wouldn't be afraid of color collision, but if you want to make sure that there will be no collisions whatsoever you will need to record the previous color.
This simple pseudocode illustrates how to avoid some unnecessary comparisons (the check can stop at the first channel that differs):
if red is not equal to previous_red
    or green is not equal to previous_green
    or blue is not equal to previous_blue then
    use this color
else
    generate again
Not an answer, but just to share a nice picture of xkcd:
It's not easy to model what constitutes "easily recognizable colors". The Euclidean distance between the R, G, B components of two colors is a rough measure, but the human eye is not an RGB color receptor! E.g. if one pair of colors has some Euclidean distance between them, and another pair of colors has exactly the same distance between them, you don't really know whether the two pairs are equally distinguishable unless you see them!
For a true random number generator, have a look here. I'm sure you can bound it within a range of numbers too.
Let me suggest this:
Create a pseudo-random number algorithm (search Google to find thousands) and create an array with the colors.
You didn't specify the language, but anyway you can have something like:
colors = [0xFF0000, 0x00FF00, 0x0000FF]
Red, Green and Blue
And you can have something like:
position = fn_random();
draw(colors[position]);
Hope it's what you are looking for...
Let me know!!

Generate unique colours

I want to draw some data into a texture: many items in a row. They aren't created in order, and they may all be different sizes (think of a memory heap). Each data item is a small rectangle and I want to be able to distinguish them apart, so I'd like each of them to have a unique colour.
Now I could just use rand() to generate RGB values and hope they are all different, but I suspect I won't get good distribution in RGB space. Is there a better way than this? E.g. what is a good way of cycling through different colours before they (nearly) repeat?
The colours don't have to match with any data in the items. I just want to be able to look at many values and see that they are different, as they are adjacent.
I could figure something out but I think this is an interesting question. :)
Using the RGB color model is not a good way to get a good color mix. It's better to use another color model to generate your color, and then convert from that color model to RGB.
I suggest the HSV or HSL color model instead; in particular, you want to vary the hue.
If you want X different color values, vary them from 0 to 360 with a step size of 360 divided by X.
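In code, stepping the hue by 360/X could look like the sketch below. Full saturation and brightness are an assumption, and `colorsys` takes the hue as a fraction of a full turn rather than in degrees.

```python
import colorsys

def distinct_colors(count):
    """Return count (r, g, b) tuples in [0, 1] with hues evenly spaced around the circle."""
    # i / count is the hue as a fraction of a full turn, i.e. i * 360 / count degrees.
    return [colorsys.hsv_to_rgb(i / count, 1.0, 1.0) for i in range(count)]
```

For a large X, neighbouring hues become hard to tell apart, so varying saturation or brightness as well may be worth considering.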
What's your sample space? How many items are we talking about?
You could build up an array of RGB Triples from
for(int r = 0; r < 255; r = r+16)
    for(int g = 0; g < 255; g = g+16)
        for(int b = 0; b < 255; b = b+16)
            // take r, g, b and add it to a list
Then randomise your list and iterate through it.
that'd give you 16^3 (4096) different colors before a repeated color.
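The build-then-shuffle idea can be sketched in a few lines of Python. The function name and the optional `seed` parameter are assumptions added for this example.

```python
import random
from itertools import product

def shuffled_grid_colors(step=16, seed=None):
    """Return every (r, g, b) on a coarse grid of 8-bit values, in random order."""
    # 0, 16, ..., 240: sixteen levels per channel when step is 16.
    levels = range(0, 256, step)
    colors = list(product(levels, repeat=3))
    # A seeded generator makes the order repeatable when a seed is given.
    random.Random(seed).shuffle(colors)
    return colors
```

With the default step this yields the 4096 colors mentioned above, none of them repeated.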
In general RGB isn't a great color space for doing these sorts of things because it's perceptually nonlinear, for starters. This means that equal distances moved between RGB triplets do not look equally different to our eyes.
I'd probably work in the L*c*h* space or HSL space, and just generate a uniform spacing in hue. These spaces have been designed to be approximately perceptually linear.
Google "delta e cie 2000"; the colour-difference formula is useful for determining apparent (visual) distance between 2 colours. (On a monitor; there's a different formula for pigments.) It operates on colours in Lab space (props to simon), but applies a perceptual calculation of difference.
We found that a number around 1.5 was sufficient to ensure visually distinct colours (i.e. you can tell the difference if they are near one another), but if you want identifiable colours (you can find any colour in a legend) you'll need to bump that up.
As to creating a set of colours... I'd probably start at some corner of Lab space, and walk around it using a step size that gives large enough visual differences (note: it's not linear, so step size will probably have to be adaptive) and then randomize the list.
This is very similar to the four-colour problem relating to colouring maps; this might yield some interesting solutions for you:
Four colour theorem
If you just need a set of perceptually-distinct colors (and not an algorithm to generate them) I have created a free tool on my website that does just that:
http://phrogz.net/css/distinct-colors.html
Instead of just using even spacing in RGB or HSV space (which are not uniformly distributed with respect to human perception), the tool allows you to generate a grid of values in HSV space, and it then uses the CMC(l:c) standard for color distance to throw out colors that are perceptually too close to each other. (The 'threshold' slider on the second tab allows you to control how visually distinct the colors must be, showing you the results in realtime.)
In the end, you can sort your list of generated colors by various criteria, and then evenly 'shuffle' that list so that you are guaranteed to have visually-distinct values adjacent to each other in the list. (I recommend an 'Interleave' value of about 5.)
As of this writing the tool works well with Chrome, Safari, and (via a shim) Firefox; IE9 does not support HTML5 range input sliders, which the UI uses extensively for interactive exploration.

Resources