create random colors that can be instantly distinguished by humans

I was looking on Stack Overflow for an answer, but wasn't really satisfied with what I found.
I need several different colors, and I only find out how many I need at runtime.
I currently create each color from three random numbers. But with that approach, two of the generated colors (a light green and a light brown, in my test) were already quite difficult to distinguish from each other.
In the worst case, two generated colors can be completely identical or differ in only one component.
So I wanted to ask: how do you generate colors in a color space or color scheme so that the generated colors can be distinguished?
And as an additional question: how can you generate them so that they are muted, because you don't want an object to be immediately eye-catching?
I wrote this in Python, but any answer would help; I can then adapt it to Python myself.
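A minimal sketch of one standard approach (my own illustration, not from the thread; the function name and parameter defaults are my choices): instead of drawing three independent random numbers, space the hues evenly around the HSV color wheel and keep saturation and value fixed. A lower saturation gives the softened, non-eye-catching look asked about.

```python
import colorsys

def distinct_colors(n, saturation=0.45, value=0.85):
    """Generate n visually distinct, muted RGB colors.

    Evenly spaced hues guarantee a minimum hue difference between any
    two colors; the reduced saturation keeps them soft. Returns a list
    of (r, g, b) tuples with components in 0..255.
    """
    colors = []
    for i in range(n):
        hue = i / n  # evenly spaced in [0, 1)
        r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
        colors.append((round(r * 255), round(g * 255), round(b * 255)))
    return colors

print(distinct_colors(5))
```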

Related

improve cartographic visualization

I need some advice about how to improve the visualization of cartographic information.
Users can select different species and the web-mapping app shows each one's geographical distribution (polygonal degree cells), each species with a range of one color (e.g. darker orange where we have more records, lighter orange where we have fewer).
The problem is when more than one species overlaps. What I currently do is just calculate the additive color mix of the two colors using http://www.xarg.org/project/jquery-color-plugin-xcolor/
The resulting color where two species overlap (mixed blue and yellow) is not intuitive at all.
Does anyone have an idea, or know of similar tools to draw inspiration from? For creating the polygons I use d3.js, so if more complex SVG features have to be created I can give it a try.
Some ideas I had are...
1) The more data in a polygon, the thicker the border (or each part of the border in its corresponding color).
2) Add a label at the center of the polygon saying how many species overlap.
3) Divide the polygon into different parts, each one with the corresponding species color.
thanks in advance,
Pere
My suggestion is something along the lines of option #3 that you listed, with a twist. Rather than painting the entire cell with species colors, place a dot in each cell, one for each species. You can vary the color of each dot in the same way that you currently do: darker for more, lighter for less. This doesn't require you to blend colors, and it will expose more of your map, providing more context for the data. I'd try this approach both with and without the cell border, and see which one works best.
Your visualization might also benefit from some interactivity. A tooltip providing more detailed information, and perhaps a further breakdown of the data, could be displayed when the user hovers over a cell.
All of this is very subjective. However, one thing's for sure: when you're dealing with multi-dimensional data as you are, the less you project separate dimensions down onto the same visual/perceptual axis, the better. I've seen some examples of "4-dimensional heatmaps" succeed at this (for example, visualizing latency on a heatmap while identifying different sources by color), but I don't think any attempt is made there to combine colors.
My initial thoughts about what you are trying to create (a customized variant of a heat map for a slightly crowded data set, I believe):
One strategy is to employ the commonly suggested formula of n + 1 breaks for bin spacing, though this causes me some concern about how many outliers your set has.
"Equally-spaced breaks are ideal for compact data sets without outliers. In many real data sets, especially proteomics data sets, outliers can make this representation less effective."
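To make the n + 1 idea concrete, here is a small sketch (my own illustration, not from the answer): n bins need n + 1 equally spaced break points, and a single outlier stretches the range exactly as the quoted passage warns.

```python
import numpy as np

def equal_breaks(data, n_bins):
    """Return n_bins + 1 equally spaced break points covering the data."""
    data = np.asarray(data, dtype=float)
    return np.linspace(data.min(), data.max(), n_bins + 1)

# The outlier (100) stretches the range, squashing 1-4 into one bin.
print(equal_breaks([1, 2, 3, 4, 100], 4))
```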
One suggestion I have would be to consider adding some filters to your categories, if you have not already. This would slim down the rendered data for faster reading by the user.
Another solution would be to use something like (Comprehensive) R, or maybe even DanteR.
Tutorial in displaying mass spectrometry-based proteomic data using heat maps
(Particularly worth noting, I felt, was the 'Color mapping' section.)

Algorithm to simulate color blindness?

There are many tools online that take an image and simulate what it might look like to someone with color blindness. However, I can't find any descriptions of the algorithms they use.
Is there a standard algorithm used to simulate color blindness? I'm aware that there are many types of color blindness (see the Wikipedia page on the subject for more details), but I'm primarily interested in algorithms for simulating dichromacy.
I had the same frustration and wrote an article comparing open-source color blindness simulations. In short, there are four main algorithms:
Coblis and the "HCIRN Color Blind Simulation function". You'll find this one in many places, including a JavaScript implementation by MaPePeR. The full HCIRN simulation function was never properly evaluated, but it is reasonable in practice. However, the "ColorMatrix" approximation by colorjack is very inaccurate and should be avoided entirely (the author himself said so). Unfortunately it is still widespread, as it was easy to copy/paste.
"Computerized simulation of color appearance for dichromats" by Brettel, Viénot, and Mollon (1997). A very solid reference. Works for all kinds of dichromacies. I wrote a public domain C implementation in libDaltonLens.
"Digital video colourmaps for checking the legibility of displays by dichromats" by Viénot, Brettel and Mollon (1999). A solid reference too, simplifies the 1997 paper for protanopia and deuteranopia (2 of the 3 kinds of color blindness). Also in libDaltonLens.
"A Physiologically-based Model for Simulation of Color Vision Deficiency" by Machado et al. (2009). Precomputed matrices are available on their website, which makes it easy to implement yourself. You just need to add the conversion from sRGB to linearRGB.
Looks like your answer is in the Wikipedia entry you linked.
For example:
"Protanopia (1% of males): Lacking the long-wavelength sensitive retinal cones, those with this condition are unable to distinguish between colors in the green–yellow–red section of the spectrum. They have a neutral point at a greenish wavelength around 492 nm – that is, they cannot discriminate light of this wavelength from white."
So you need to de-saturate any colors in the green-yellow-red spectrum to white.
See: Image color saturation
The other 2 types of dichromacy can be handled similarly.
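A crude sketch of this desaturation idea (my own illustration; note it is far less faithful than the Brettel/Viénot/Machado approaches from the other answer): pull hues in the red-yellow-green band toward gray and leave blues alone.

```python
import colorsys

def crude_protanopia(r, g, b):
    """Very rough protanopia approximation via desaturation.

    Hue is in [0, 1): red ~0.0, yellow ~0.17, green ~0.33. Hues in
    that band lose most of their saturation; blues are untouched.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if h <= 0.45 or h >= 0.95:  # the green-yellow-red section
        s *= 0.2                # push toward white/gray
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return round(r2 * 255), round(g2 * 255), round(b2 * 255)

print(crude_protanopia(200, 40, 40))
```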
First we have to understand how the eye works:
A regular/healthy eye has three types of cones and one type of rod, each with its own activation function over the visible spectrum of light.
Their activations then pass through some function to produce the signal that goes to your brain. Roughly speaking, the function takes 4 channels as input and produces 3 channels as output (namely lightness, yellow-blue and red-green).
A colorblind person would have one of those two stages work differently (as far as I know it is usually, perhaps always, the first), so for example the person would be missing one type of cone, or that cone's activation function would be different.
The best thing to do would be:
Convert each pixel from RGB space to a combination of frequencies (with intensities). To do this, first calculate the activations of each of the three cones (of a healthy person), then find a "natural" solution for a set of frequencies (plus intensities) that would produce the same activations. Of course, one solution is just the original three RGB primaries with their intensities, but it is unlikely that the original scene actually looked like that. A natural solution would be, for example, a normal distribution around some frequency (or even just a single frequency).
Then (again for each pixel), calculate the activations that the colorblind person's cones would produce in response to your combination of frequencies.
Finally, find an RGB value such that a healthy person would have the same activations as the ones the colorblind person has.
Note that, if the way these activations are combined is also different for the relevant type of colorblindness, you might want to carry that out as well in the above steps. (So instead of matching activations, you are matching the result of the function over the activations).
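In practice, the usual shortcut skips the spectral reconstruction and matches activations directly in LMS cone space. A sketch along those lines (the constants are the widely circulated ones derived from Viénot, Brettel and Mollon (1999); treat them as illustrative, and note that strictly the matrix should be applied to linear, not gamma-encoded, RGB):

```python
import numpy as np

# RGB -> LMS cone activations (Viénot et al. 1999, as commonly quoted).
RGB2LMS = np.array([
    [17.8824,    43.5161,   4.11935],
    [ 3.45565,   27.1554,   3.86714],
    [ 0.0299566,  0.184309, 1.46709],
])
LMS2RGB = np.linalg.inv(RGB2LMS)

def simulate_protanope(rgb):
    """Replace the missing L-cone response with a combination of M and S,
    then map back to RGB so a normal observer sees what a protanope sees."""
    lms = RGB2LMS @ np.asarray(rgb, dtype=float)
    l_sim = 2.02344 * lms[1] - 2.52581 * lms[2]  # L estimated from M and S
    sim = np.array([l_sim, lms[1], lms[2]])      # M and S stay unchanged
    return np.clip(LMS2RGB @ sim, 0, 255).round().astype(int)

print(simulate_protanope((255, 0, 0)))
```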

How to compare two contours? (font comparison)

I'm trying to analyse two contours and produce a percentage expressing their similarity. Assuming I have all the point coordinates describing these contours (just like an SVG path), on what basis should I decide that they're almost identical?
After some Google searches, I found something about Fourier descriptors; are they relevant to my case?
Edit
What I want to do is compare several fonts to another one, just like What the Font does, but not with an image. With the resulting algorithm, it would be possible to find an equivalent font based on the similarity percentage.
Some scripts just compare the bounding box of each letter, but that's not enough. I need a way to tell that Arial is closer to Verdana than to Webdings. So, assuming I can extract the contours from the fonts, I need a way to compare two contours.
For example (illustrated in the original with letter-G glyphs in different fonts and "logical" percent values).
there are two basic ways to approach the general problem (font matching): symbolic and statistical. a good solution will probably combine both in some way.
a symbolic approach uses your knowledge of the problem in a direct way. for example, you can make a list of the things you (as an intelligent human) would use to characterise fonts. the kind of questions that identifont uses. this approach means writing routines that are smart enough to detect the various properties (eg stroke width, whether certain loops are closed, existence of serifs, etc) plus a decision tree (or "rule engine") that puts the yes/no/unsure answers together and comes up with an answer.
the statistical approach sounds more like what you were thinking about, and is probably how What the Font works. here the idea is to find some general properties and use those as weights to find a "best" selection. for example, if you have lots of fonts then you can train a neural net (input being pixels at some sample resolution). there you don't need to know "how" the net decides - just that, given enough training data, it will find a way to do so. or you could just look at the sum of all the dark pixels - that would likely give you results similar to your percentages above.
this sounds simple, but often it's not so easy to find simple statistical measurements that show differences well in all the ways you want.
so then there's a large middle ground between the two. the idea being that if you can pull in some of the ideas from the first group then you can make the approaches in the second much more efficient. while the simplest neural net approach is "all in one" (it includes the calculations and the decisions) you can separate those out. so instead of just giving the net a bunch of pixels you can give it more "meaningful" inputs - things that you know help detect between different fonts. things like stroke width, or the number of "holes" in the character. you can also add some smarts to remove things that might otherwise confuse results - for example, pre-scaling to the same height (if you have a full font set then you can scale everything so that the height of a lowercase "m", say, is constant).
fourier descriptors are a way of characterising the "outside shape" of something, and so could be used as an input to a statistical approach as i've described above. in the example you give, the fourier descriptors will pick up the "spikiness" of the serifs in the lower G, and so would indicate that it is very different from the G on the left. but they care much less about stroke width, and nothing at all about scale (magnification/zoom) (which can be a good or a bad thing - if you're being given random letters of different sizes, you don't want to be sensitive to size, but if you've normalized to a standard "m" for an entire alphabet then you certainly do want to include it). since the output is just a spectrum, you can compare different letters by cross-correlation, or use something like PCA to categorize different types of letter.
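to make that concrete, here's a minimal sketch of fourier descriptors for a closed contour (my own illustration - the function names, the normalisation choices and the toy similarity score are mine, not from the thread):

```python
import numpy as np

def fourier_descriptors(points, k=16):
    """Translation/rotation/scale-invariant descriptors of a closed
    contour given as an (n, 2) array of (x, y) points."""
    z = points[:, 0] + 1j * points[:, 1]  # contour as a complex signal
    coeffs = np.fft.fft(z)
    coeffs[0] = 0.0                       # drop DC term: translation invariance
    mags = np.abs(coeffs)                 # drop phase: rotation invariance
    mags /= mags[1]                       # divide by 1st harmonic: scale invariance
    return mags[2:2 + k]                  # low harmonics capture gross shape

def similarity(a, b):
    """Toy similarity score in (0, 100]; 100 means identical descriptors."""
    return 100.0 / (1.0 + np.linalg.norm(a - b))

t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
circle = np.c_[np.cos(t), np.sin(t)]
big_rotated = 3.0 * np.c_[np.cos(t + 1.0), np.sin(t + 1.0)]  # scaled + rotated
r = 1 + 0.3 * np.cos(8 * t)                                  # "spiky" circle
spiky = np.c_[r * np.cos(t), r * np.sin(t)]

print(similarity(fourier_descriptors(circle), fourier_descriptors(big_rotated)))  # ~100
print(similarity(fourier_descriptors(circle), fourier_descriptors(spiky)))        # lower
```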
other ideas would be 2d cross-correlation (the maximum of the normalised correlation gives you some idea of how similar two things are) or simply seeing what fraction of pixels are common in both letters.
as the comments say, this is a huge problem (and i am not an expert - the above is just random bullshit from being an interested bystander).
but, to finally answer your question, if what you have is an outline, then a fourier descriptor would be a good place to start. since that focuses on shape rather than "weight" i would combine that with something like total area enclosed by the outline. then write some code to calculate those and see what numbers you get for some example alphabets. if it seems to distinguish some letters, but not others, then look for some other measurements that would help in those cases. you will likely end up combining quite a few approaches to get something both fast and reliable.
alternatively, if you just want something simple, try using some easy-to-measure values like height, width, total number of pixels "inside" the contours, how many strokes you cross along vertical or horizontal lines, etc. combining a bunch of those could get you something "good enough" for some purposes, if you aren't comfortable with the maths involved in fourier transforms etc.
Have you considered using a neural network based approach? This paper uses a Self-Organizing Tree map to perform content based image retrieval. With a good training set, it should be possible to create a multilayer network (or SOM) that can give you an accurate similarity measure.

How to choose n different color automatically for plotting n different objects?

I need to draw n different objects on a chart. I want to pick a different color for each of them, to make them distinguishable. The objects will be moved around, so I cannot count on ideas like the "four color theorem" to assign the same color to non-adjacent items. So far my problem calls for up to 20 different items.
Is there a good way to pick n different colors to make them as distinguishable from each other as possible?
First of all, I have since changed the design so that it is no longer important to use 20 distinct colors. The default palette of 10 colors shows up quite well.
Secondly, I've found an answer to my own question. The thing I want to do is called a color scale for categorical coding. Here is a paper that proposes a method to do it:
An algorithm for generating color scales for both categorical and ordinal coding - Breslow - 2009 - Color Research & Application - Wiley Online Library
http://onlinelibrary.wiley.com/doi/10.1002/col.20559/full
I'm going to give the paper a look. It is probably more technical than what I'm prepared to do.
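As a rough stand-in for the paper's method (my own sketch, not Breslow et al.'s algorithm): greedily pick each new color to maximize its minimum distance to the colors already chosen, measured in the roughly perceptually uniform CIELAB space.

```python
import itertools
import numpy as np

def srgb_to_lab(rgb):
    """Convert an (n, 3) array of sRGB 0..255 colors to CIELAB (D65 white)."""
    c = np.asarray(rgb, dtype=float) / 255.0
    c = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    m = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = (c @ m.T) / np.array([0.95047, 1.0, 1.08883])
    f = np.where(xyz > (6 / 29) ** 3, np.cbrt(xyz),
                 xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    return np.c_[116 * f[:, 1] - 16,
                 500 * (f[:, 0] - f[:, 1]),
                 200 * (f[:, 1] - f[:, 2])]

def pick_distinct(n, levels=6):
    """Greedy farthest-point sampling over a coarse RGB grid."""
    grid = np.array(list(itertools.product(np.linspace(0, 255, levels),
                                           repeat=3)))
    lab = srgb_to_lab(grid)
    chosen = [0]  # start from an arbitrary color (here: black)
    dist = np.linalg.norm(lab - lab[0], axis=1)
    for _ in range(n - 1):
        nxt = int(dist.argmax())  # candidate farthest from everything chosen
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(lab - lab[nxt], axis=1))
    return grid[chosen].astype(int)

print(pick_distinct(20))
```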
I'd say colour distinction is a very subjective matter, and you're probably better off looking for an existing colour palette and working your way from there. The higher your n, the higher the chance of two automatically generated colours being indistinguishable to your users, even though by some colour-theoretic criterion they are very different.
And don't forget to make sure you don't use colour as the only distinction between objects, or:
you'll be in for a lot of hate mail from colour blind people
you risk people mistaking objects of similar colours as having some sort of implicit grouping
Do you really need to use 20 different colors? That is a lot of colors if you still want people to be able to distinguish them. Also realize that people who are colorblind will be lost looking at your charts; about 8% of males are color blind. It would be better if you could further break down your objects into two to five groups. Then you could use different shapes as well as color to distinguish objects. For instance, you might have crosses, circles, triangles, stars, and squares in four different colors.
For the choice of colors, I would check out ColorBrewer. Note, however, that it doesn't go up to 20 colors.

Display of colorblind images

Quick question: at this website, http://www.vischeck.com/examples/, there are a few pictures of numbers hidden within a field of another color, used to test for color blindness. Is there any way these images can be generated algorithmically?
They are based on ready-made dot fields: you overlay a number on the field, and color each dot as a whole whenever it is partially covered by the number. If you know the correct colors to use, that will do ;)
What language are you coding in? It's impossible to give any definitive answer without knowing your problem well.
If you're in .NET, GDI is your best bet for generating such a dot field, but it is not simple to do algorithmically, and it's possible that these images were hand-drawn.
One easier possibility is to use a field of evenly spaced circles, even if it's not as elegant.
Then you'd pick two colors that aren't supposed to be (easily) distinguishable by (certain) color-blind people.
Now, draw a number into the circle field (using one of ten matrices for the digits 0-9, each representing a digit character at a size compatible with the circle field), using (limited) random variations of the two colors the person shouldn't be able to distinguish.
In other words, if the person isn't supposed to distinguish red and green, you'd draw the character using shades of red on top of a background of shades of green.
You'd possibly need hue-to-RGB conversion functions; for .NET you'd have to look for a library (I remember using one from CodeProject).
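The thread is .NET-oriented, but the same recipe is easy to sketch in Python with Pillow (all names, sizes, and color choices below are mine): render the digit into a mask, scatter non-overlapping circles, and shade each circle as a whole from one of two confusable palettes depending on where its center falls.

```python
import random
from PIL import Image, ImageDraw, ImageFont

SIZE = 400
REDS   = [(190, 60, 50), (210, 90, 70), (170, 80, 60)]    # figure palette
GREENS = [(110, 150, 70), (140, 170, 90), (90, 130, 60)]  # background palette

# 1. Render the digit into a black-and-white mask.
mask = Image.new("L", (SIZE, SIZE), 0)
ImageDraw.Draw(mask).text(
    (110, 50), "7", fill=255,
    font=ImageFont.load_default(size=300),  # Pillow >= 10.1; else use truetype()
)

# 2. Scatter random non-overlapping circles; color each whole circle
#    from the palette chosen by where its center lies in the mask.
plate = Image.new("RGB", (SIZE, SIZE), "white")
draw = ImageDraw.Draw(plate)
placed, attempts = [], 0
while len(placed) < 350 and attempts < 50_000:
    attempts += 1
    r = random.randint(4, 10)
    x, y = random.randint(r, SIZE - r), random.randint(r, SIZE - r)
    if any((x - px) ** 2 + (y - py) ** 2 < (r + pr) ** 2 for px, py, pr in placed):
        continue  # overlaps an existing dot; try again
    palette = REDS if mask.getpixel((x, y)) > 0 else GREENS
    draw.ellipse((x - r, y - r, x + r, y + r), fill=random.choice(palette))
    placed.append((x, y, r))

plate.save("plate.png")
```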
