Correct usage of SetDeviceGammaRamp - winapi

I'd like to add the ability to adjust screen gamma at application startup and reset it at exit. While it's debatable whether one should tamper with gamma at all (personally I find it useless and detrimental), some people expect to be able to do that kind of thing.
It's just one simple API call too, so it's all easy, right?
MSDN says: "The gamma ramp is specified in three arrays of 256 WORD elements each [...] values must be stored in the most significant bits of each WORD to increase DAC independence.". This means, in my understanding, something like word_value = byte_value<<8, which sounds rather weird, but it's how I read it.
The Doom3 source code contains a function that takes three arrays of char values and converts them into an array of uint16_t values that have the same byte value in both the upper and lower half. In other words, something like word_value = (byte_value<<8)|byte_value. This is equally weird, but what's worse, it is not the same as the above.
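For concreteness, here is what the two readings produce (a minimal sketch of my understanding, with uint16_t standing in for WORD):

#include <cstdint>

// The two readings, side by side:
uint16_t msdn_style(uint8_t v)  { return (uint16_t)(v << 8); }        // 0xFF -> 0xFF00
uint16_t doom3_style(uint8_t v) { return (uint16_t)((v << 8) | v); }  // 0xFF -> 0xFFFF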
Also, there exist a few code snippets on various hobby programmer sites (apparently stolen from one another, because they're identical to the letter) which do some obscure math: multiplying the linear index by a value, biasing by 128, and clamping to 65535. I'm not quite sure what this is about, but it looks like total nonsense to me, and again it is not the same as either of the above two.
What gives? It must be well-defined -- without guessing -- what the data that you supply must look like. In the end, one will read the original values and let the user tweak some sliders anyway (and optionally save that blob to disk with the user's config), but still... in order to modify these values, one needs to know what they are and what's expected.
Has anyone done (and tested!) this before and knows which one is right?

While investigating the ability to change screen brightness programmatically, I came across the article Changing the screen brightness programmingly - By using the Gama Ramp API.
Using the debugger, I took a look at the values provided by the GetDeviceGammaRamp() function. The output is a two-dimensional array defined as something like WORD GammaArray[3][256]; and is a table of 256 values used to modify the Red, Green, and Blue values of displayed pixels. The values I saw started with zero (0) at index 0, adding 256 to calculate each subsequent value. So the sequence is 0, 256, 512, ..., 65024, 65280 for indices 0, 1, 2, ..., 254, 255.
My understanding is that these values are used to modify the RGB value for each pixel. By modifying the table value you can modify the display brightness. However the effectiveness of this technique may vary depending on display hardware.
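That sequence is exactly index << 8, i.e. the identity ramp with the byte value in the most significant bits, matching the MSDN wording above. A minimal sketch of building it (again with uint16_t standing in for WORD):

#include <cstdint>

// Identity gamma ramp: entry i maps level i back to itself,
// with the value stored in the high-order byte.
void buildIdentityRamp(uint16_t ramp[3][256]) {
    for (int c = 0; c < 3; ++c)
        for (int i = 0; i < 256; ++i)
            ramp[c][i] = (uint16_t)(i << 8);  // 0, 256, 512, ..., 65280
}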
You may find this brief article, Gamma Controls, of interest, as it describes gamma ramp levels, though from a Direct3D perspective. The article has this to say about gamma ramp levels:
In Direct3D, the term gamma ramp describes a set of values that map
the level of a particular color component—red, green, blue—for all
pixels in the frame buffer to new levels that are received by the DAC
for display. The remapping is performed by way of three look-up
tables, one for each color component.
Here's how it works: Direct3D takes a pixel from the frame buffer and
evaluates its individual red, green, and blue color components. Each
component is represented by a value from 0 to 65535. Direct3D takes
the original value and uses it to index a 256-element array (the
ramp), where each element contains a value that replaces the original
one. Direct3D performs this look-up and replace process for each color
component of each pixel in the frame buffer, thereby changing the
final colors for all of the on-screen pixels.
According to the online documentation for GetDeviceGammaRamp() and SetDeviceGammaRamp(), these functions are supported in the Windows API beginning with Windows 2000 Professional.
I used their source condensed down to the following example inserted into a Windows application to test the effect using values from the referenced article. My testing was done with Windows 7 and an AMD Radeon HD 7450 Graphics adapter.
With this test, both of my displays (I have two) were affected.
//Generate the 256-colors array for the specified wBrightness value.
WORD GammaArray[3][256];
HDC hGammaDC = ::GetDC(NULL);
WORD wBrightness;

::GetDeviceGammaRamp(hGammaDC, GammaArray);

wBrightness = 64;   // reduce the brightness
for (int ik = 0; ik < 256; ik++) {
    int iArrayValue = ik * (wBrightness + 128);
    if (iArrayValue > 0xffff) iArrayValue = 0xffff;
    GammaArray[0][ik] = (WORD)iArrayValue;
    GammaArray[1][ik] = (WORD)iArrayValue;
    GammaArray[2][ik] = (WORD)iArrayValue;
}
::SetDeviceGammaRamp(hGammaDC, GammaArray);
Sleep(3000);

wBrightness = 128;  // set the brightness back to normal
for (int ik = 0; ik < 256; ik++) {
    int iArrayValue = ik * (wBrightness + 128);
    if (iArrayValue > 0xffff) iArrayValue = 0xffff;
    GammaArray[0][ik] = (WORD)iArrayValue;
    GammaArray[1][ik] = (WORD)iArrayValue;
    GammaArray[2][ik] = (WORD)iArrayValue;
}
::SetDeviceGammaRamp(hGammaDC, GammaArray);
Sleep(3000);

::ReleaseDC(NULL, hGammaDC);
As an additional note, I made a slight change to the above source: instead of modifying each of the RGB values equally, I commented out the first two assignments so that only GammaArray[2][ik] was modified. The result was a yellowish cast to the display.
I also tried putting the above source in a loop to check how the display changed, and there was quite a difference from wBrightness=0 to wBrightness=128.
for (wBrightness = 0; wBrightness <= 128; wBrightness += 16) {
    for (int ik = 0; ik < 256; ik++) {
        int iArrayValue = ik * (wBrightness + 128);
        if (iArrayValue > 0xffff) iArrayValue = 0xffff;
        GammaArray[0][ik] = (WORD)iArrayValue;
        GammaArray[1][ik] = (WORD)iArrayValue;
        GammaArray[2][ik] = (WORD)iArrayValue;
    }
    ::SetDeviceGammaRamp(hGammaDC, GammaArray);
    Sleep(3000);
}
Microsoft provides an on-line MSDN article, Using gamma correction, that is part of the Direct3D documentation which describes the basics of gamma as follows:
At the end of the graphics pipeline, just where the image leaves the
computer to make its journey along the monitor cable, there is a small
piece of hardware that can transform pixel values on the fly. This
hardware typically uses a lookup table to transform the pixels. This
hardware uses the red, green and blue values that come from the
surface to be displayed to look up gamma-corrected values in the table
and then sends the corrected values to the monitor instead of the
actual surface values. So, this lookup table is an opportunity to
replace any color with any other color. While the table has that level
of power, the typical usage is to tweak images subtly to compensate
for differences in the monitor’s response. The monitor’s response is
the function that relates the numerical value of the red, green and
blue components of a pixel with that pixel’s displayed brightness.
Additionally the software application Redshift has a page Windows gamma adjustments which has this to say about Microsoft Windows.
When porting Redshift to Windows I ran into trouble when setting a
color temperature lower than about 4500K. The problem is that Windows
sets limitations on what kinds of gamma adjustments can be made,
probably as a means of protecting the user against evil programs that
invert the colors, blank the display, or play some other annoying
trick with the gamma ramps. This kind of limitation is perhaps
understandable, but the problem is the complete lack of documentation
of this feature (SetDeviceGammaRamp on MSDN). A program that tries to
set a gamma ramp that is not allowed will simply fail with a generic
error leaving the programmer wondering what went wrong.

I haven't tested this, but if I had to guess: early graphics cards were non-standard in their implementation of SetDeviceGammaRamp() when Doom 3 was written, and sometimes used the LOBYTE and sometimes the HIBYTE of the WORD value. The consensus moved to using only the HIBYTE, hence word_value = byte_value<<8.
Here's another data point, from the PsychoPy library (in Python), which just swaps LOBYTE and HIBYTE:
"""Sets the hardware look-up table, using platform-specific ctypes functions.
For use with pyglet windows only (pygame has its own routines for this).
Ramp should be provided as 3x256 or 3x1024 array in range 0:1.0
"""
if sys.platform=='win32':
newRamp= (255*newRamp).astype(numpy.uint16)
newRamp.byteswap(True)#necessary, according to pyglet post from Martin Spacek
success = windll.gdi32.SetDeviceGammaRamp(pygletWindow._dc, newRamp.ctypes)
if not success: raise AssertionError, 'SetDeviceGammaRamp failed'
It also appears that Windows doesn't allow all gamma settings, see:
http://jonls.dk/2010/09/windows-gamma-adjustments/
Update:
The first Windows APIs to offer gamma control are Windows Graphics Device Interface (GDI)’s SetDeviceGammaRamp and GetDeviceGammaRamp. These APIs work with three 256-entry arrays of WORDs, with each WORD encoding zero up to one, represented by WORD values 0 and 65535. The extra precision of a WORD typically isn’t available in actual hardware lookup tables, but these APIs were intended to be flexible. These APIs, in contrast to the others described later in this section, allow only a small deviation from an identity function. In fact, any entry in the ramp must be within 32768 of the identity value. This restriction means that no app can turn the display completely black or to some other unreadable color.
http://msdn.microsoft.com/en-us/library/windows/desktop/jj635732(v=vs.85).aspx
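Based on that documentation note, a pre-flight check is possible. This is a sketch of my own helper, not part of the API; the exact rule Windows enforces is not documented beyond the quote above, so treat it as a heuristic:

#include <windows.h>

// True if every entry stays within 32768 of the identity ramp (i << 8),
// the restriction described in the MSDN article quoted above.
bool RampWithinAllowedBand(const WORD ramp[3][256]) {
    for (int c = 0; c < 3; ++c)
        for (int i = 0; i < 256; ++i) {
            int delta = (int)ramp[c][i] - (i << 8);
            if (delta < -32768 || delta > 32768)
                return false;
        }
    return true;
}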

Related

Blob detection on embedded platform, memory restricted

I have an STM32H7 MCU with 1MB of RAM and 1MB of ROM. I need to implement a blob detection algorithm on a binary image array of max size 1280x1024.
I have searched for blob detection algorithms and found that they are mainly divided into two categories, LINK:
Algorithms based on label propagation (one component at a time):
They first search for an unlabeled object pixel and label that pixel with a new label; then, in later processing, they propagate the same label to all object pixels that are connected to it. Demo code would look something like this:
void setLabels() {
    int m = 2;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (getPixel(x, y) == 1) compLabel(x, y, m++);
        }
    }
}

void compLabel(int i, int j, int m) {
    if (i < 0 || i >= width || j < 0 || j >= height) return; // stay inside the image
    if (getPixel(i, j) == 1) {
        setPixel(i, j, m); // assign label
        compLabel(i - 1, j - 1, m);
        compLabel(i - 1, j,     m);
        compLabel(i - 1, j + 1, m);
        compLabel(i,     j - 1, m);
        compLabel(i,     j + 1, m);
        compLabel(i + 1, j - 1, m);
        compLabel(i + 1, j,     m);
        compLabel(i + 1, j + 1, m);
    }
}
Algorithms based on label-equivalence resolving (two-pass): These consist of two steps. In the first step, they assign a provisional label to each object pixel. In the second step, they integrate all provisional labels assigned to each object, which are called equivalent labels, into a unique label, called the representative label, and replace the provisional label of each object pixel with its representative label.
The downside of the 1st algorithm is that it uses recursive calls for all the pixels around the original pixel. I am afraid this will cause hard fault errors on the STM32 because of the limited stack.
The downside of the 2nd algorithm is that it requires a lot of memory for the label image. For instance, at the max resolution of 1280x1024 and with a max label count of 255 (0 meaning no label), the label image is 1.25MB -- way more than we have available.
I am looking for some advice on how to proceed. How can I get the center coordinates and area of all blobs in the image without using too much memory? Any help is appreciated. I presume the 2nd algorithm is out of the picture, since there is no memory available.
Firstly, you should go over your image with a scaling kernel to scale it down to something that can be processed; 4:1 or 9:1 are good possibilities. Otherwise you are going to need more RAM, because the situation seems unworkable as it stands. Bit access is not really fast and is going to kill your efficiency, and I don't even think you need that big an image (at least that is my experience with vision systems).
You can then store the pixels in a straight unsigned char array, which can be labeled with the first method you named. It doesn't have to be a recursive process: you can also detect that a blob was relabeled to another blob and set a flag to do another pass.
This makes it possible to have an externally visible function with a while loop that keeps calling your labeling function without creating a big stack.
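A sketch of that idea (my own code, assuming each object pixel already carries a provisional non-zero label from a first marking pass): sweep the image repeatedly, propagating the minimum neighbouring label, until a sweep changes nothing.

#include <cstdint>

// One propagation sweep: every labeled pixel takes the smallest non-zero
// label among its 8 neighbours. Returns true if any label changed.
bool propagateOnce(uint8_t *img, int width, int height) {
    bool changed = false;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            uint8_t v = img[y * width + x];
            if (!v) continue;  // background
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    uint8_t n = img[ny * width + nx];
                    if (n && n < v) { v = n; changed = true; }
                }
            img[y * width + x] = v;
        }
    return changed;
}

// Caller: while (propagateOnce(img, w, h)) {}  -- flat stack, no recursion.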
Area determination is then done by going over the image and counting the occurrences of each label.
The center of a certain blob can be found by calculating the moments of the blob and then computing its center of mass. This is some pretty hefty math, so don't be discouraged; it is a tough nut to crack, but it is a great solution.
(small hint: you can take the C++ code from OpenCV and look through their code to find out how it's done)
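To make the moments step concrete, here is a sketch (helper names are mine) that accumulates area and centroid sums for up to 255 labels in a single pass:

#include <cstdint>

// Raw moments per label: m00 = area, (m10/m00, m01/m00) = centroid.
struct Moments { uint32_t m00 = 0; uint64_t m10 = 0, m01 = 0; };

void accumulateMoments(const uint8_t *labels, int width, int height,
                       Moments mom[256]) {
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            uint8_t l = labels[y * width + x];
            if (l) {
                ++mom[l].m00;                 // area: count of pixels
                mom[l].m10 += (uint32_t)x;    // sum of x coordinates
                mom[l].m01 += (uint32_t)y;    // sum of y coordinates
            }
        }
    // Centroid of blob L: (mom[L].m10 / mom[L].m00, mom[L].m01 / mom[L].m00)
}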

How to change dynamic range of an RGB image?

I have a 16-bit raw image (12 effective bits). I convert it to RGB, and now I want to change the dynamic range. I created 2 map functions; you can see them visualized below. As you can see, the first function maps values 0-500 to 0-100 and the second one maps the remaining values to 101-255.
Now I want to apply the map functions to the RGB image. What I'm doing is iterating through each pixel, finding the appropriate function for each channel, and applying it to the channel. For example, say a pixel is RGB=[100 2000 4000]. To the R channel I'll apply the first function, since 100 is in the 0-500 range. But to the G and B channels I'll apply the second function, since their values are in 501-4095.
But in doing it this way I'm actually changing the color of the pixel, since I apply different functions to the different channels of the pixel.
Can you suggest how to do it or at least give me a direction or show some articles?
What you're doing is a very straightforward imaging operation, frequently applied in image and video processing. Sometimes it's (imprecisely) called a lookup table (LUT), even though it's not always implemented via an actual lookup table. Examples of this are gamma adjustment or log encoding.
For instance, an example of this kind of encoding is sRGB, which is a gamma encoding from linear light. You can read about it here: http://en.wikipedia.org/wiki/SRGB. You'll see that it has a nonlinear adjustment.
The name LUT implies a good way of doing it. If you can make your image a uint8- or uint16-valued set, you can create a vector of desired output values for any input value. The lookup table has the same number of elements as the possible range of the variable type. If you were using uint8, you'd have a lookup table of 256 values. Then the lookup is easy: you just use the image value as an index into your LUT to get the resulting value. That computational efficiency is why LUTs are so widely used.
In your case, since you're working in RGB space, it is acceptable to apply the curves in exactly the same way to each of the three color channels. RGB space is nice for that reason. However, for various reasons, sometimes different LUTs are implemented per-channel.
So if you had an image (we'll use one included in MATLAB and pretend it's 12 bit by scaling it):
someimage = uint16(imread('autumn.tif')).*16;
image(someimage.*16); % Need to multiply again to display 16 bit data scaled properly
For your LUT, you would implement this as:
lut = uint8([(0:500).*(1/5), (501:4095).*((255-101)/(4095-501)) + 79.5326]);
plot(lut); %Take a look at the lut
This makes the piecewise calculation you described in your question.
You could make a new image this way:
convertedimage = lut(double(someimage)+1);
image(convertedimage);
Note that because MATLAB indexes with doubles--one based--you need to cast properly and add one. This doesn't slow things down as much as you may think; MATLAB is made to do this. I've been using MATLAB for decades and this still looks odd to me.
This method lets you get fancy with the LUT creation (logs, exp, whatever) and it still runs very fast.
In your case, your LUT only needs 4096 elements since your input data is only 12 bits. You may want to be careful with the bounds, since it's possible a uint16 could have higher values. One clean way to bound this is to use the min and end functions:
convertedimage = lut(min(double(someimage)+1, end));
Now, this has implemented your function, but perhaps you want a slightly different function. For instance, a common function of this type is a simple gamma adjustment. A gamma of 2.2 means that the incoming image values are scaled by taking them to the 1/2.2 power (if scaled between 0 and 1). We can create such a LUT as follows:
lutgamma = uint8(255.*(((0:4095)./4095).^(1/2.2)));
plot(lutgamma);
Again, we apply the LUT with a simple indexing:
convertedimage = lutgamma(min(double(someimage)+1, end));
And we get the gamma-adjusted image.
Using a smooth LUT will usually improve overall image quality. A piecewise linear LUT will tend to cause the resulting image to have odd discontinuities in the shaded regions.
These are so common in many imaging systems that LUTs have file formats. To see what I mean, look at this LUT generator from a major camera company. LUTs are a big deal, and it looks like you're on the right track.
I think you are referring to something that Photoshop calls "Enhance Monochromatic Contrast", which is described here - look at "Step 3: Try Out The Different Algorithms".
Basically, I think you find a single min and a single max across all 3 channels and apply the same scaling to all the channels, rather than scaling each channel individually with its own min and max.
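A sketch of that idea in C++; the interleaved 12-bit uint16_t input buffer and the 8-bit output are my assumptions:

#include <algorithm>
#include <cstdint>
#include <vector>

// "Monochromatic contrast" stretch: one global min/max across all three
// channels, one common linear scale, so channel ratios (hues) survive.
std::vector<uint8_t> stretchCommon(const std::vector<uint16_t> &rgb) {
    uint16_t lo = *std::min_element(rgb.begin(), rgb.end());
    uint16_t hi = *std::max_element(rgb.begin(), rgb.end());
    std::vector<uint8_t> out(rgb.size(), 0);
    if (hi == lo) return out;  // flat image, nothing to stretch
    for (size_t i = 0; i < rgb.size(); ++i)
        out[i] = (uint8_t)((uint32_t)(rgb[i] - lo) * 255 / (hi - lo));
    return out;
}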
Alternatively, you can convert to Lab mode (Lightness plus a and b channels), apply your function to the Lightness channel (without affecting the a and b channels, which hold the colour information), then transform back to RGB with the colour unaffected.

Size limitation when drawing to CanvasRenderingContext2D

I HAVE HEAVILY EDITED THIS QUESTION TO PROVIDE MORE DETAILS
I came across a limitation when drawing to CanvasRenderingContext2D via the EaselJS framework. When the position of a drawn object surpasses a couple million pixels, the drawing starts to crumble apart. For example, an object at x position 58524928 (with the parent container moved to -58524928 so that the object is still visible on stage) renders visibly distorted, and the more I offset the object the more it crumbles. Also, when I try to move the object - drag it with the mouse - it "jumps" as if it were snapped to a large grid.
This is the EaselJS framework, and the shapes are ultimately drawn to the CanvasRenderingContext2D via the drawImage() method. Here is a snippet from the code:
ctx.drawImage(cacheCanvas, this._cacheOffsetX+this._filterOffsetX, this._cacheOffsetY+this._filterOffsetY, cacheCanvas.width/scale, cacheCanvas.height/scale);
I suppose it has something to do with the limited number of real numbers in JavaScript:
Note that there are infinitely many real numbers, but only a finite
number of them (18437736874454810627, to be exact) can be represented
exactly by the JavaScript floating-point format. This means that when
you're working with real numbers in JavaScript, the representation of
the number will often be an approximation of the actual number.
Source: JavaScript: The Definitive Guide
Can someone confirm or reject my assumption? 58 million (58524928) does not seem like much to me -- is it some inefficiency in EaselJS, or is it a limit of the Canvas?
PS:
Scaling has no effect. I have drawn everything 1000 times smaller and 1000 times closer with no effect. Equally, if you scale the object up 1000 times while keeping x at 58 million, it will not look crumbled; but move it to 50 billion and you are back where you started. Basically, offset divided by size is a constant limit for detail.
EDIT
Here is an example: jsfiddle.net/wzbsbtgc/2. Basically there are two separate problems:
If I use huge numbers as parameters for the drawing itself (red curve), it is distorted. This can be avoided by using smaller numbers and moving the DisplayObject instead (blue curve).
In both cases it is not possible to move the DisplayObject by 1px. I think this is explained in GameAlchemist's post.
Any advice/workaround for the second problem is welcome.
It appears that Context2D uses lower precision numbers for transforms. I haven't confirmed the precision yet, but my guess is that it is using floats instead of doubles.
As such, with higher values, the transform method (and other similar C2D methods) that EaselJS relies on loses precision, similar to what GameAlchemist describes. You can see this issue reproduced using pure C2D calls here:
http://jsfiddle.net/9fLff2we/
The best workaround that I can think of, would be to precalculate the "final" values external to the transform methods. Normal JS numbers are higher precision than what C2D is using, so this should solve the issue. Really rough example to illustrate:
http://jsfiddle.net/wzbsbtgc/3/
The behavior you see is related to the way numbers are represented in the IEEE 754 standard.
While JavaScript uses 64-bit floats, WebGL uses only 32-bit floats, and since most (if not all) canvases are WebGL accelerated, all your numbers will be (down)converted before the draw.
The IEEE 754 32-bit format uses 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa. Together with the mantissa's implicit leading bit, that gives 24 bits of precision, so the largest integer that can be represented exactly is
2^24 = 16,777,216 (16+ million).
We have full precision for integers only in the [-16777216, 16777216] range. Beyond that point the exponent kicks in and we lose the low bits of the mantissa. For instance, 16,777,217 = 2^24 + 1 needs 25 bits, so it cannot fit: it is stored as 16,777,216, and the final 1 is lost.
The grid effect you see is this precision loss at work. A figure such as 58,524,928 needs 26 bits, so the lowest 2 bits are lost and only multiples of 4 are representable in that range. For instance:
58524928 + 1 == 58524928 (in 32-bit floats)
So a coordinate near 58,524,928 will either be rounded to 58,524,928 or 'jump' to the nearest representable figure 4 away: hence your grid effect.
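You can reproduce the rounding outside the browser; a quick C++ check, since the effect is purely a property of 32-bit floats:

#include <cstdio>

int main() {
    printf("%.0f\n", (double)(16777216.0f + 1.0f));  // 16777216: 2^24 + 1 is lost
    printf("%.0f\n", (double)(58524928.0f + 1.0f));  // 58524928: snaps to a multiple of 4
    return 0;
}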
Solution?
Change the units you are using for your application so you deal with much smaller figures. Maybe you're using mm -- use meters or kilometers instead.
Mind that the precision you are losing is an illusion anyway: display resolution is the first limit, and the mouse is at best 1 pixel precise, so once your units fit the display, even on a 4K screen, 32-bit floats are not the limiting factor.
Choose the right measurement unit to fit all your coordinates in a smaller range and you'll solve your issue.
More clearly: you must change the units you use for display. This does not mean trading accuracy away: you just do the translation + scaling yourself before drawing. That way you still use JavaScript's IEEE 64-bit accuracy and no longer hit the 32-bit rounding issue.
(You might override the x, y properties with getters/setters:
Object.defineProperty(targetObject, 'x', {
    get: function () { return view.pixelWidth * (this.rx - view.left) / view.width; }
});
)
You can use drawing coordinates of any size you desire.
Canvas will clip your drawing to the display area of the canvas element.
For example, here's a demo that starts drawing a line from x = -50000000 and finishes on the canvas. Only the visible portion of the line is rendered. All non-visible (off-canvas) points are clipped.
var canvas=document.getElementById("canvas");
var ctx=canvas.getContext("2d");
var cw=canvas.width;
var ch=canvas.height;
ctx.beginPath();
ctx.moveTo(-50000000,100);
ctx.lineTo(150,100);
ctx.stroke();
body{ background-color: ivory; padding:10px; }
#canvas{border:1px solid red;}
<h4>This line starts at x = negative 50 million!</h4>
<canvas id="canvas" width=300 height=300></canvas>
Remember that the target audience for a W3C standard is mainly browser vendors. The unsigned long value (i.e. 2^32) addresses more the underlying system for creating a bitmap in the browser. The standard says values in this range are valid, but there is no guarantee the underlying system will be able to provide a bitmap that large (most browsers today limit the bitmap to much smaller sizes). You stated that you don't mean the canvas element itself, but the link you reference is the interface definition of the element, so I just wanted to point that out in regard to the number range.
From the JavaScript side of things, where we developers usually are, and with the exception of typed arrays, there is no such thing as ulong etc. -- only Number (aka unrestricted double), which is signed and stores numbers in 64 bits, formatted as IEEE 754.
The valid range for Number is:
Number.MIN_VALUE = 5e-324
Number.MAX_VALUE = 1.7976931348623157e+308
You can use any values in this range with canvas for your vector paths. Canvas will clip them to the bitmap based on the current transformation matrix when the paths are rasterized.
If by drawing you mean another bitmap (i.e. Image, Canvas, Video), then it will be subject to the same system and browser capabilities/restrictions as the target canvas itself. Positioning (direct or via transformation) is limited (in sum) by the range of a Number.

zoom a large picture

There is a very large picture that cannot be loaded into memory all at once, because it would cause an out-of-memory exception. I need to zoom this picture down to a small size. So what should I do?
The simple thought is to open an input stream and process one buffer-sized chunk at a time. But what about the zoom algorithm?
If you can access the picture row-by-row (e.g. it's a bitmap), the simplest thing you could do is just downsample it, e.g. only read every nth pixel of every nth row.
// n is an integer downsampling factor
// width, height are the width and height of the original image, in pixels
// down is a new image that is (height/n * width/n) pixels in size
for (y = 0; y < height; y += n) {
    row = ... // read row y from original image into a buffer
    for (x = 0; x < width; x += n) {
        down[y/n, x/n] = row[x]; // image[row,col] -- shorthand for accessing a pixel
    }
}
This is a quick-and-dirty way to resize the original image quickly and cheaply without ever loading the whole thing into memory. Unfortunately, it also introduces aliasing in the output image (down). Dealing with aliasing requires interpolation -- still possible using the above row-by-row approach, but a bit more involved; see the sketch below.
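For reference, here is a sketch of the averaging (box filter) variant that reduces the aliasing while still reading only one source row at a time. The grayscale 8-bit format and the readRow callback are my assumptions:

#include <algorithm>
#include <cstdint>
#include <vector>

// Downsample by an integer factor n, replacing each n*n block by its average.
// readRow is a caller-supplied callback that loads one source row into buf.
void downsampleBox(int width, int height, int n,
                   void (*readRow)(int y, uint8_t *buf),
                   std::vector<uint8_t> &down) {
    const int dw = width / n, dh = height / n;
    down.assign((size_t)dw * dh, 0);
    std::vector<uint8_t> row(width);
    std::vector<uint32_t> acc(dw);
    for (int dy = 0; dy < dh; ++dy) {
        std::fill(acc.begin(), acc.end(), 0u);
        for (int k = 0; k < n; ++k) {          // accumulate n source rows
            readRow(dy * n + k, row.data());
            for (int x = 0; x < dw * n; ++x)
                acc[x / n] += row[x];
        }
        for (int dx = 0; dx < dw; ++dx)        // average each n*n block
            down[(size_t)dy * dw + dx] = (uint8_t)(acc[dx] / (uint32_t)(n * n));
    }
}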
If you can't easily access the image row-by-row, e.g. it's a JPEG, which encodes data in 8x8 blocks, you can still do something similar to the approach I described above. You would simply read a row of blocks instead of a row of pixels -- the remainder of the algorithm would work the same. Furthermore, if you're downsampling by a factor of 8, then it's really easy with JPEG -- you just take the DC coefficient of each block. Downsampling by factors that are multiples of 8 is also possible using this approach.
I've glossed over many other details (such as color channels, pixel stride, etc), but it should be enough to get you started.
There are a lot of different resizing algorithms offering varying levels of quality, with the trade-off being CPU time.
I believe with any of these you should be able to process a massive file in chunks relatively easily; however, you should probably try existing tools first to see whether they can already handle the massive file anyway.
The GD graphics library allows you to define how much working memory it can use, I believe, so it clearly already has logic for processing files in chunks.

Generate unique colours

I want to draw some data into a texture: many items in a row. They aren't created in order, and they may all be different sizes (think of a memory heap). Each data item is a small rectangle and I want to be able to distinguish them apart, so I'd like each of them to have a unique colour.
Now I could just use rand() to generate RGB values and hope they are all different, but I suspect I won't get good distribution in RGB space. Is there a better way than this? E.g. what is a good way of cycling through different colours before they (nearly) repeat?
The colours don't have to match with any data in the items. I just want to be able to look at many values and see that they are different, as they are adjacent.
I could figure something out but I think this is an interesting question. :)
Using the RGB color model is not a good way to get a good color mix. It's better to generate your colors in another color model and then convert them to RGB.
I suggest the HSV or HSL color model instead; in particular, you want to vary the hue.
If you want X different color values, vary the hue from 0 to 360 with a step size of 360 divided by X.
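A sketch of that hue-stepping idea (the HSV-to-RGB conversion is the standard sector formula; full saturation and value are assumed):

#include <cmath>
#include <cstdint>

// Convert H in [0,360), S,V in [0,1] to 8-bit RGB (standard HSV sector math).
void hsvToRgb(double h, double s, double v, uint8_t rgb[3]) {
    double c = v * s;
    double x = c * (1.0 - std::fabs(std::fmod(h / 60.0, 2.0) - 1.0));
    double m = v - c;
    double r, g, b;
    switch ((int)(h / 60.0) % 6) {
        case 0:  r = c; g = x; b = 0; break;
        case 1:  r = x; g = c; b = 0; break;
        case 2:  r = 0; g = c; b = x; break;
        case 3:  r = 0; g = x; b = c; break;
        case 4:  r = x; g = 0; b = c; break;
        default: r = c; g = 0; b = x; break;
    }
    rgb[0] = (uint8_t)((r + m) * 255);
    rgb[1] = (uint8_t)((g + m) * 255);
    rgb[2] = (uint8_t)((b + m) * 255);
}

// X visually spread colors: equal hue steps of 360/X degrees.
// for (int i = 0; i < X; ++i) hsvToRgb(i * 360.0 / X, 1.0, 1.0, colors[i]);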
What's your sample space... how many items are we talking about?
You could build up an array of RGB triples from:
for (int r = 0; r < 255; r += 16)
    for (int g = 0; g < 255; g += 16)
        for (int b = 0; b < 255; b += 16)
            // take r, g, b and add it to a list
Then randomise your list and iterate through it.
that'd give you 16^3 (4096) different colors before a repeated color.
In general RGB isn't a great color space for doing these sorts of things because it's perceptually nonlinear, for starters. This means that equal distances moved between RGB triplets do not look equally different to our eyes.
I'd probably work in the L*c*h* space, or HSL space, and just generate a uniform spacing in hue. These spaces were designed to be approximately perceptually linear.
Google "delta e cie 2000"; the colour-difference formula is useful for determining apparent (visual) distance between 2 colours. (On a monitor; there's a different formula for pigments.) It operates on colours in Lab space (props to simon), but applies a perceptual calculation of difference.
We found that a number around 1.5 was sufficient to ensure visually distinct colours (i.e. you can tell the difference if they are near one another), but if you want identifiable colours (you can find any colour in a legend) you'll need to bump that up.
As to creating a set of colours... I'd probably start at some corner of Lab space, and walk around it using a step size that gives large enough visual differences (note: it's not linear, so step size will probably have to be adaptive) and then randomize the list.
This is very similar to the four-colour problem relating to colouring maps; it might yield some interesting solutions for you:
Four colour theorem
If you just need a set of perceptually-distinct colors (and not an algorithm to generate them) I have created a free tool on my website that does just that:
http://phrogz.net/css/distinct-colors.html
Instead of just using even spacing in RGB or HSV space (which are not uniformly distributed with respect to human perception), the tool lets you generate a grid of values in HSV space and then uses the CMC(l:c) standard for color distance to throw out colors that are perceptually too close to each other. (The 'threshold' slider on the second tab controls how visually distinct the colors must be, showing the results in real time.)
In the end, you can sort your list of generated colors by various criteria, and then evenly 'shuffle' that list so that you are guaranteed to have visually-distinct values adjacent to each other in the list. (I recommend an 'Interleave' value of about 5.)
As of this writing the tool works well with Chrome, Safari, and (via a shim) Firefox; IE9 does not support HTML5 range input sliders, which the UI uses extensively for interactive exploration.
