Otsu Threshold in general

Can I use Otsu's method on a histogram that contains data other than grey values? I have a set of measurements in mm which I multiplied by 1000 to get integer values, and now I want to cut off the left part of the histogram with Otsu.

Sure, why not? The algorithm doesn't know what the values mean, it just separates the values into two classes. You just have to make sure that the code you're using iterates over the full range of your int values.
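For illustration, here is a minimal NumPy sketch of Otsu's method computed straight from a histogram of arbitrary non-negative integers, rather than 0-255 grey levels (the otsu_threshold helper and the sample data are made up for this example):
import numpy as np

def otsu_threshold(values):
    # Build a histogram over the full integer range of the data,
    # not just 0..255 as in the usual grey-level implementation.
    values = np.asarray(values, dtype=np.int64)
    offset = values.min()                      # shift so bins start at 0
    hist = np.bincount(values - offset)
    bins = np.arange(hist.size)
    total = hist.sum()

    best_t, best_var = 0, -1.0
    for t in range(1, hist.size):
        w0 = hist[:t].sum()                    # weight of the left class
        w1 = total - w0                        # weight of the right class
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (hist[:t] * bins[:t]).sum() / w0
        mu1 = (hist[t:] * bins[t:]).sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2   # between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t + offset                     # threshold in original units

# e.g. mm readings scaled by 1000 to integers
ints = [1850, 1900, 2100, 2200, 4800, 5100, 5300, 5600]
print(otsu_threshold(ints))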


OpenCV matchTemplate threshold values for different methods

I searched a lot to find the threshold values for the methods mentioned below.
methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']
I also tried to figure them out myself, but I could only find thresholds for the 3 methods that have a maximum value of 1.0. The other methods' values were orders of magnitude larger. I would like to know the bounds of these methods.
Can somebody point me in the right direction? My goal is to loop through all the methods for template matching and get the best outcome. I went through the documentation and source code, but no luck.
These are the values I got; I understand that the *_NORMED methods have values in the range 0-1.
cv2.TM_CCOEFF -- 25349100.0
cv2.TM_CCOEFF_NORMED -- 0.31208357214927673
cv2.TM_CCORR -- 616707328.0
cv2.TM_CCORR_NORMED -- 0.9031367897987366
cv2.TM_SQDIFF -- 405656000.0
cv2.TM_SQDIFF_NORMED -- 0.737377941608429
As described in the OpenCV documentation, the matchTemplate result is a sum of per-pixel differences (the exact formula varies with the method), so for the non-normalized methods the thresholds vary with the size of the template.
You can look up the formula for each method and calculate thresholds for your template type, considering that the maximum difference between pixels is 255 for a CV_8UC1 image.
So let's say you have 2 grayscale images and the smaller one is 10x10.
In that case, for TM_SQDIFF the minimum distance would be 10x10x0^2=0 (the images are identical) and the maximum would be 10x10x255^2=6502500 (one image is completely black and the other is white), which gives the bounds [0, 6502500].
Of course, you can do the same calculation symbolically for a template of size [A, B].
For TM_CCORR it would be AxBxmax(T(x',y')*I(x+x',y+y')) = 65025*A*B.
You can go on and calculate that for the remaining methods; remember that if you have image types other than CV_8UC (like 32FC or 32SC), you would need to replace 255 with the corresponding maximum values (max(float), max(int32)).
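To make that concrete, here is a rough Python sketch (not from the original answer) that derives the theoretical maxima for the non-normalised methods from the template size and uses them to bring every score into a comparable 0-1 range. The file names 'scene.png' and 'template.png' are placeholders, and TM_CCOEFF is left out because its bound also depends on the mean-subtracted pixel values:
import cv2

img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)
tmpl = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)
h, w = tmpl.shape

# Theoretical maxima for the non-normalised methods on CV_8UC1 images
bounds = {
    'cv2.TM_SQDIFF': h * w * 255.0 ** 2,   # worst case: black vs. white
    'cv2.TM_CCORR':  h * w * 255.0 ** 2,   # best case: both saturated at 255
}

for name in ['cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED',
             'cv2.TM_CCORR', 'cv2.TM_CCORR_NORMED',
             'cv2.TM_CCOEFF_NORMED']:
    method = eval(name)
    res = cv2.matchTemplate(img, tmpl, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    if name in bounds:                     # scale raw scores into [0, 1]
        min_val /= bounds[name]
        max_val /= bounds[name]
    print(name, min_val, max_val)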

Sort labels of segmented image in kmeans based on cluster mean

I have a simple but interesting question. As you know, k-means can give a different result after each run due to the random initial cluster centres. However, assume I know that cluster 1 has a smaller mean value than cluster 2, cluster 2 has a smaller mean value than cluster 3, and so on. I want an algorithm that assigns the cluster with the smaller mean value the smaller cluster index.
This is my MATLAB code. If you have a shorter or clearer way, please suggest it.
%% K-mean
num_cluster=2;
nrows = size(Img_original,1);
ncols = size(Img_original,2);
I_1D = reshape(Img_original,nrows*ncols,1);
[cluster_idx mu]=kmeans(double(I_1D),num_cluster,'distance','sqEuclidean','Replicates',3);
cluster_label = reshape(cluster_idx,nrows,ncols);
%% Sort based on mu
[mu_sort id_sort]=sort(mu);
idx=cell(1,num_cluster);
%% Save indices in order of mu
for i=1:num_cluster
idx{i}=find(cluster_label==id_sort(i));
end
%% Sort cluster label based on mu
for i=1:num_cluster
cluster_label(idx{i})=i;
end
It's unclear to me as to why you'd want to relabel the clusters based on the ordering of each centroid. You can simply use the labelling vector that is output from k-means to reference which cluster / centroid each point belongs to.
Nevertheless, the initial idea that you had to sort the centroids is a good one. The last part of your code seems rather inefficient because you're looping over each label and doing the reassignment. One thing I could perhaps suggest is to have a lookup table where the input is the original label and the output is the reordered labels based on the sorted centroids.
If you want to pursue this route, you can use a containers.Map where the keys are the labels given from the sort order that is output from sort, and the values are the reordered labels... namely, a vector that goes from 1 up to as many classes as you have. You need to do this because the second output of sort tells you where each value in the original array would appear in the sorted result, so you must use this ordering to properly perform the relabelling.
In addition, I would use the sortrows function in MATLAB, not raw sort. With how you're doing it, you are sorting each column / variable independently and that will give the wrong centroids. This will work for grayscale images where you only have one feature to consider, namely the grayscale intensity, but if you go beyond grayscale and perhaps go into RGB or whatever colour space you desire, using raw sort will give you incorrect results. You need to consider each row as a single point, then sort the rows jointly.
Given your code, you'd do something like this:
%% K-mean
num_cluster=2;
nrows = size(Img_original,1);
ncols = size(Img_original,2);
I_1D = reshape(Img_original,nrows*ncols,1);
[cluster_idx mu]=kmeans(double(I_1D),num_cluster,'distance','sqEuclidean','Replicates',3);
%% Sort based on mu
[mu_sort id_sort]=sortrows(mu);
%// New - Create lookup
lookup = containers.Map(id_sort, 1:size(mu_sort,1));
%// Relabel the vector
cluster_idx_sort = lookup.values(num2cell(cluster_idx));
cluster_idx_sort = [cluster_idx_sort{:}];
%// Reshape back to original image dimensions
cluster_label = reshape(cluster_idx_sort,nrows,ncols);
This should hopefully give you some speedup in your code.
To double check, I tried this on the cameraman.tif image, that's part of the image processing toolbox. Running the code gives me these cluster centres:
>> mu
mu =
153.3484
23.7291
Once I sort the clusters in ascending order, this is what I get for the ordering and for the centroids:
>> mu_sort
mu_sort =
23.7291
153.3484
>> id_sort
id_sort =
2
1
So that works as we expected... now if we display the original cluster label map before sorting on the centroids with:
cluster_label = reshape(cluster_idx, nrows, ncols);
imshow(cluster_label,[]);
... we get this image:
Now, if we run through the sorting logic and display the relabelled cluster map:
imshow(cluster_label, []);
... we get this image:
This works as expected. Because the centroid order flipped, so does the colouring.

Normalization of an image

I applied some operations to a grayscale image and now I am getting new values, but the problem is that some of the intensity values are now less than 0 or greater than 255. For values within [0, 255] there is no problem, but intensity values < 0 and > 255 cannot occur in a grayscale image.
Therefore, I need to normalize the values so that every value, whether negative, greater than 255, or anything else, ends up in the range 0 to 255 so that the image can be displayed.
For that I know two methods:
Method #1
newImg = ((255-0)/(max(img(:))-min(img(:))))*(img-min(img(:)))
where min(img(:)) and max(img(:)) are the minimum and maximum values obtained after doing some operations on the input image img. The min can be less than 0 and the max can be greater than 255.
Method #2
I just make all the values less than 0 as 0 and all the values greater than 255 as 255, so:
img(img < 0) = 0;
img(img > 255) = 255;
I tried both methods, but I am getting good results with the second method and not with the first one. Can anyone tell me what the problem is?
That totally depends on the image content itself. Both of those methods are valid to ensure that the range of values is between [0,255]. However, before you decide on what method you're using, you need to ask yourself the following questions:
Question #1 - What is my image?
The first question you need to ask is what does your image represent? If this is the output of an edge detector for example, the method you choose will depend on the dynamic range of the values seen in the result (more below in Question #2). For example, it's preferable that you use the second method if there is a good distribution of pixels and a low variance. However, if the dynamic range is a bit smaller, then you'll want to use the first method to push up the contrast of your result.
If the output is an image subtraction, then it's preferable to use the first method because you want to visualize the exact differences between pixels. Truncating the result will not give you a good visualization of the differences.
Question #2 - What's the dynamic range of the values?
Another thing you need to take note of is how wide the dynamic range of the minimum and maximum values are. For example, if the minimum and maximum are not that far off from the limits of [0,255], then you can use the first or second method and you won't notice much of a difference. However, if your values are within a small range that is within [0,255], then doing the first method will increase contrast whereas the second method won't do anything. If it is your goal to also increase the contrast of your image and if the intensities are within the valid [0,255] range, then you should do the first method.
However, if you have minimum and maximum values that are quite far away from the [0,255] range, like min=-50 and max=350, then the first method won't bode very well - especially if the grayscale intensities have huge variance. What I mean by huge variance is that you would have values in the high range, values in the low range and nothing else. If you rescaled using the first method, the minimum gets pushed to 0, the maximum gets shrunk to 255 and the rest of the intensities get compressed in between, so the darker values get pulled up and visualized as grey rather than black, reducing the overall contrast.
Question #3 - Do I have a clean or noisy image?
This is something that not many people think about. Is your image very clean, or are there a couple of spurious noisy spots? The first method is very bad when it comes to noisy pixels. If you only had a couple of pixel values that have a very large value but the other pixels are within the range of [0,255], this would make all of the other pixels get rescaled accordingly and would thus decrease the contrast of your image. You probably want to ignore the contribution made by these pixels and so the second method is preferable.
Conclusion
Therefore, there is nothing wrong with either of the methods you have talked about. You need to be cognizant of what the image is, the dynamic range of values that you see once you examine the output, and whether this is a clean or noisy image. You simply have to make a smart choice keeping those three factors in mind. So in your case, the first output probably didn't work because you have very large negative values and large positive values, and perhaps very few of those values too. Doing a truncation is probably better for your application.
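For what it's worth, here is a small NumPy sketch of the two methods side by side; the synthetic result array with a couple of extreme outliers is just a made-up example of the noisy case from Question #3:
import numpy as np

def rescale_to_uint8(img):
    # Method #1: linear rescale so min -> 0 and max -> 255
    img = img.astype(np.float64)
    out = (img - img.min()) * (255.0 / (img.max() - img.min()))
    return out.astype(np.uint8)

def clip_to_uint8(img):
    # Method #2: truncate everything outside [0, 255]
    return np.clip(img, 0, 255).astype(np.uint8)

# A result with a few extreme outliers: clipping keeps the contrast of the
# well-behaved pixels, rescaling squashes them all towards the middle.
result = np.random.uniform(0, 255, (100, 100))
result[0, 0] = -5000.0
result[0, 1] = 5000.0
print(rescale_to_uint8(result).std(), clip_to_uint8(result).std())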

Color generation based on random number

I would like to create a color generator based on random numbers, which might differ only slightly, but I need the colors to be easily recognizable from each other. I was thinking about generating them in RGB format, which would probably be easiest. I'm afraid simply multiplying the given arguments wouldn't work very well. What algorithm do you suggest? Also, the next generated color should not be the same as the previous one, but I don't want to store previous colors; nor would multiplying by the (micro)time work well, since parts of the script usually run faster than that.
If you wanted truly random colors, then generating the same color 10 times in a row would be acceptable. To get values that are perceived as random, you have to strip out true randomness.
The easiest way to do this is probably with a cycling index into a list of colors. Say you pick web colors, a list of 216 colors. Each time you want a new color, add a random number to the index, wrapping as needed. To prevent getting the same color, limit random numbers to less than the number of colors.
colorIndex = ( colorIndex + ( random() % 100 ) + 1 ) % 216;
If you do not want a lookup table, then generate HSB colors but limit the hue to part of the circle that does not include the previous color. If the previous hue was 60 degrees, then pick the next hue above 90 or below 30 degrees, for example. You probably want to limit the saturation and brightness to be above 50% or so.
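A minimal sketch of that hue-offset idea in Python (colorsys is in the standard library; the 30-degree exclusion zone and the 50% saturation/brightness floors are arbitrary choices, not fixed rules):
import colorsys
import random

def next_color(prev_hue):
    # Jump at least 30 and at most 330 degrees away from the previous hue,
    # so the new hue can never land on (or right next to) the old one.
    hue = (prev_hue + random.uniform(30, 330)) % 360
    s = random.uniform(0.5, 1.0)      # keep saturation above 50%
    v = random.uniform(0.5, 1.0)      # keep brightness above 50%
    r, g, b = colorsys.hsv_to_rgb(hue / 360.0, s, v)
    return hue, (int(r * 255), int(g * 255), int(b * 255))

hue = random.uniform(0, 360)
for _ in range(5):
    hue, rgb = next_color(hue)
    print(rgb)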
There are 256*256*256 possible combinations of colors if you generate a random number for each value of R, G and B.
I wouldn't be afraid of color collision, but if you want to make sure that there will be no collisions whatsoever you will need to record the previous color.
This simple pseudocode illustrates how to avoid repeating the exact same color while skipping unnecessary comparisons:
if red is not equal to previous_red then
    use this color
else if blue is not equal to previous_blue then
    use this color
else if green is not equal to previous_green then
    use this color
else
    generate again
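The same idea as a runnable Python sketch (comparing the whole RGB tuple at once instead of channel by channel):
import random

def random_color(prev=None):
    while True:
        color = (random.randint(0, 255),
                 random.randint(0, 255),
                 random.randint(0, 255))
        if color != prev:              # regenerate only on an exact repeat
            return color

prev = None
for _ in range(5):
    prev = random_color(prev)
    print(prev)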
Not an answer, but just to share a nice picture from xkcd.
It's not easy to model what constitutes "easily recognizable colors". The Euclidean distance between the R,G,B components of two colors is a rough measure, but the human eye is not an RGB color receptor! E.g. if one pair of colors has some Euclidean distance between them, and another pair of colors has exactly the same distance between them, you don't really know whether each pair is equally distinguishable unless you see them!
For a true random number generator, have a look here. I'm sure you can bound it within a range of numbers too.
Let me suggest this:
Create a pseudo-random number algorithm (search Google to find thousands) and create an array with the colors.
You didn't specify the language, but anyway you can have something like:
colors = [0xFF0000, 0x00FF00, 0x0000FF]
Red, Green and Blue
And you can have something like:
position = fn_random();
draw(colors[position]);
Hope it's what you are looking for...
Let me know!!

Mapping a list of numeric values to colors

I have a list of numeric values. I may normalize the values if needed.
I need to transform this list to a list of colors (in HSL, RGB or any other color model — I can always do conversion later).
For any given value the color must be the same every time.
The more different two given numeric values are, the more contrasting the corresponding colors should be.
All used colors must be as contrasting to each other as possible (this is a soft requirement; a rough solution will do).
Note that list is rather large (thousands of numbers), so simply squeezing all numbers into a single color channel would produce too dense results.
You could consider using a 3D space-filling curve through your chosen colour space. I'll second Mark's CIELAB suggestion, wish I'd known about that last time I had to solve a similar problem.
Whatever algorithm you finally settle on, you might try the CIELAB color space. It normalizes the differences in human color perception, so that equal numeric spacing gives equal perceptual differences.
See: How to automatically generate N "distinct" colors?
It would be best to normalize your values, and run them through the code I suggested (where hue == your value), building a map/hash. (You can use a hash-style function instead, which is probably more efficient.)
You can "randomize" lightness (or brightness, depending on your model) and saturation using some predetermined bits of your number, for example.
Why not use shades of gray? Just calculate the min/max values and use that to translate each number into a different shade from white to black.
I know it's not colors, but in my opinion it'll be easier to interpret the results. I can tell what it means when something is darker vs. lighter, but who is to say that, for example, green is a higher value than orange?
