I applied some operations on a grayscale image and now I am getting new values but the problem is the intensity values are now less than 0, between 0 and 255 and greater than 255. For values between [0-255] there is no problem but for intensity values < 0 and intensity values > 255 there is a problem as these values cannot occur in a grayscale image.
Therefore, I need to normalize the values so that all the values whether they are negative or greater than 255 or whatever other values are, comes in the range 0 to 255 so that the image can be displayed.
For that I know two methods:
Method #1
newImg = ((255-0)/(max(img(:))-min(img(:))))*(img-min(img(:)))
where min(img(:)) and max(img(:)) are the minimum and maximum values obtained after doing some operations on the input image img. The min can be less than 0 and the max can be greater than 255.
Method #2
I just make all the values less than 0 as 0 and all the values greater than 255 as 255, so:
img(img < 0) = 0;
img(img > 255) = 255;
I tried to use both the methods but I am getting good results using second method but not with the first one. Can anyone of you please tell me what the problem is?
That totally depends on the image content itself. Both of those methods are valid to ensure that the range of values is between [0,255]. However, before you decide on what method you're using, you need to ask yourself the following questions:
Question #1 - What is my image?
The first question you need to ask is what does your image represent? If this is the output of an edge detector for example, the method you choose will depend on the dynamic range of the values seen in the result (more below in Question #2). For example, it's preferable that you use the second method if there is a good distribution of pixels and a low variance. However, if the dynamic range is a bit smaller, then you'll want to use the first method to push up the contrast of your result.
If the output is an image subtraction, then it's preferable to use the first method because you want to visualize the exact differences between pixels. Truncating the result will not give you a good visualization of the differences.
Question #2 - What's the dynamic range of the values?
Another thing you need to take note of is how wide the dynamic range of the minimum and maximum values are. For example, if the minimum and maximum are not that far off from the limits of [0,255], then you can use the first or second method and you won't notice much of a difference. However, if your values are within a small range that is within [0,255], then doing the first method will increase contrast whereas the second method won't do anything. If it is your goal to also increase the contrast of your image and if the intensities are within the valid [0,255] range, then you should do the first method.
However, if you have minimum and maximum values that are quite far away from the [0,255] range, like min=-50 and max=350, then doing the first method won't bode very well - especially if the grayscale intensities have huge variance. What I mean by huge variance is that you would have values that are in the high range, values in the low range and nothing else. If you rescaled using the first method, this would mean that the minimum gets pushed to 0, the maximum gets shrunk to 255 and the rest of the intensities get scaled in between so for those values that are lower, they get scaled so that they're visualized as gray.
Question #3 - Do I have a clean or noisy image?
This is something that not many people think about. Is your image very clean, or are there a couple of spurious noisy spots? The first method is very bad when it comes to noisy pixels. If you only had a couple of pixel values that have a very large value but the other pixels are within the range of [0,255], this would make all of the other pixels get rescaled accordingly and would thus decrease the contrast of your image. You probably want to ignore the contribution made by these pixels and so the second method is preferable.
Conclusion
Therefore, there is nothing wrong with either of those methods that you have talked about. You need to be cognizant of what the image is, the dynamic range of values that you see once you examine the output and whether or not this is a clear or noisy image. You simply have to make a smart choice keeping those two factors in mind. So in your case, the first output probably didn't work because you have very large negative values and large positive values and perhaps very few of those values too. Doing a truncation is probably better for your application.
Related
I searched a lot for finding the threshold values for the below mention methods.
methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', cv2.TM_SQDIFF_NORMED']
I also tried to figure them out by myself but I could only find thresholds for 3 methods which have max value of 1.0. The other methods values were in range of 10^5. I would like to know the bounds of these methods.
Can somebody point me in the right direction. My agenda is to loop through all the methods for template matching and get the best outcome.I went through the documentation and source code, but no luck.
These are the values I got , I could understand that *NORMED methods have values 0-1.
cv2.TM_CCOEFF -- 25349100.0
cv2.TM_CCOEFF_NORMED -- 0.31208357214927673
cv2.TM_CCORR -- 616707328.0
cv2.TM_CCORR_NORMED -- 0.9031367897987366
cv2.TM_SQDIFF -- 405656000.0
cv2.TM_SQDIFF_NORMED -- 0.737377941608429
As described in opencv documentation matchTemplate result is a sum of differences (varies with method) for each pixel, so for not normalized methods - thresholds would vary with size of template.
You can see formulas for each method and calculate thresholds for your template type considering that max difference between pixels is 255 for CV_8UC1 image.
So lets say you have 2 grayscale images and smallest one is 10x10.
In that case for TM_SQDIFF minimum distance would be 10x10x0^2=0 (images are identical) and maximum would be 10x10x255^2=6502500 (one image is completely black and other is white), which results in [0, 6502500] boundaries.
Of course it is possible to calculate that for the undefined sizes [A, B].
For TM_CCORR it would be AxBxmax(T(x',y')I(x+x',y+y')) = 65025AB
You can go on and calculate that for remaining methods, remember that if you have different from CV_8UC image types (like 32FC or 32SC) - you would need to replace 255 with corresponding values (max(float) max(int32))
I modified some Labview code I found online to use in my program. It works, I understand nearly all of it, but there's one section that confuses me. This is the program:
This program takes 2 images, subtracts them, and returns the picture plus a percentage difference. What I understand is it takes the pictures, subtracts them, converts the subtracted image into an array of colored pixels, then math happens, and the pixels are compared to the threshold. It adds a 1 for every pixel greater than the threshold, divides it by the image size, and out comes a percentage. The part I don't understand is the math part, the whole quotient and remainder section with a "random" 256. Because I don't understand how to get these numbers, I have a percentage, but I don't understand what they mean. Here's a picture of the front panel with 2 different tests.
In the top one, I have a percentage of 15, and the bottom a percentage of 96. This tells me that the bottom one is "96 percent different". But is there anyway to make sure this is accurate?
The other question I have is threshold, as I don't know exactly what that does either. Like if I change the threshold on the bottom image to 30, my percentage is 8%, with the same picture.
I'm sure once I understand the quotient/remainder part, it'll all make sense, but I can't seem to get it. Thank you for your help.
My best guess is that someone tried to characterize difference between 2 images with a single number. The remainder-quotient part is a "poor man" approach to split each 2D array element of the difference into 2 lower bytes (2 remainders) and the upper 2 byte word. Then lower 2 bytes of the difference are summed and the result is added to the upper 2 bytes (as a word). Maybe 3 different bytes each represented different channel of the camera (e.g. RGB color)?
Then, the value is compared against the threshold, and number of pixels above the threshold are calculated. This number is divided by the total number of pixels to calculate the %% difference. So result is a %% of pixels, which differ from the master image by the threshold.
E.g. if certain pixel of your image was 0x00112233 and corresponding master image pixel had a value of 0x00011122, then the number compared to the threshold is (0x11 - 0x01) + (0x22 - 0x11) + (0x33 - 0x22) = 0x10 + 0x11 + 0x11 = 0x32 = 50 decimal.
Whether this is the best possible comparison/difference criteria is the question well outside of this topic.
I have 16-bit raw image (12 effective bits). I convert it to rgb and now I want to change the dynamic range. I created 2 map functions. You can see them visualized below. As you can see the first function maps values 0-500 to 0-100 and the second one maps the rest values to 101-255.
Now I want to apply the map-functions on the rgb image. What I'm doing is iterating through each pixel, find appropriate function for each channel and apply it on the channel. For example, the pixel is RGB=[100 2000 4000]. On R channel I'll apply the first function since 100 is in 0-500 range. But, on G and B channels I'll apply the second function since their values are in 501-4095.
But, in doing this way I'm actually changing the actual color of the pixel since I apply different functions on the channels of the pixel.
Can you suggest how to do it or at least give me a direction or show some articles?
What you're doing is a very straightforward imaging operation, frequently applied in image and video processing. Sometimes it's (imprecisely) called a lookup table (LUT), even though it's not always implemented via an actual lookup table. Examples of this are gamma adjustment or log encoding.
For instance, an example of this kind of encoding is sRGB, which is a gamma encoding from linear light. You can read about it here: http://en.wikipedia.org/wiki/SRGB. You'll see that it has a nonlinear adjustment.
The name LUT implies a good way of doing it. If you can make your image a uint8 or uint16 valued set, you can create a vector of desired output values for any input value. The lookup table has the same number of elements as the possible range of the variable type. If you were using a uint8, you'd have a lookup table of 256 values. Then the lookup is easy, you just use the image value as an index into your LUT to get the resulting value. That computational efficiency is why LUTS are so widely used.
In your case, since you're working in RGB space, it is acceptable to apply the curves in exactly the same way to each of the three color channels. RGB space is nice for that reason. However, for various reasons, sometimes different LUTs are implemented per-channel.
So if you had an image (we'll use one included in MATLAB and pretend it's 12 bit by scaling it):
someimage = uint16(imread('autumn.tif')).*16;
image(someimage.*16); % Need to multiply again to display 16 bit data scaled properly
For your LUT, you would implement this as:
lut = uint8([(0:500).*(1/5), (501:4095).*((255-101)/(4095-501)) + 79.5326]);
plot(lut); %Take a look at the lut
This makes the piecewise calculation you described in your question.
You could make a new image this way:
convertedimage = lut(double(someimage)+1);
image(convertedimage);
Note that because MATLAB indexes with doubles--one based--you need to cast properly and add one. This doesn't slow things down as much as you may think; MATLAB is made to do this. I've been using MATLAB for decades and this still looks odd to me.
This method lets you get fancy with the LUT creation (logs, exp, whatever) and it still runs very fast.
In your case, your LUT only needs 4096 elements since your input data is only 12 bits. You may want to be careful with the bounds, since it's possible a uint16 could have higher values. One clean way to bound this is to use the min and end functions:
convertedimage = lut(min(double(someimage)+1, end));
Now, this has implemented your function, but perhaps you want a slightly different function. For instance, a common function of this type is a simple gamma adjustment. A gamma of 2.2 means that the incoming image values are scaled by taking them to the 1/2.2 power (if scaled between 0 and 1). We can create such a LUT as follows:
lutgamma = uint8(256.*(((0:4095)./4095).^(1/2.2)));
plot(lutgamma);
Again, we apply the LUT with a simple indexing:
convertedimage = lutgamma(min(double(someimage)+1, end));
And we get the following image:
Using a smooth LUT will usually improve overall image quality. A piecewise linear LUT will tend to cause the resulting image to have odd discontinuities in the shaded regions.
These are so common in many imaging systems that LUTs have file formats. To see what I mean, look at this LUT generator from a major camera company. LUTs are a big deal, and it looks like you're on the right track.
I think you are referring to something that Photoshop calls "Enhance Monochromatic Contrast", which is described here - look at "Step 3: Try Out The Different Algorithms".
Basically, I think you find a single min from all the channels and a single max from across all 3 channels and apply the same scaling to all the channels, rather than doing each channel individually with its own min and max.
Alternatively, you can convert to Lab (Lightness plus a and b) mode and apply your function to the Lightness channel (without affecting the a and b channels which hold the colour information) then transform back to RGB, your colour unaffected.
UPDATE
Here is my code that is meant to add up the two matrices and using element by element addition and then divide by two.
function [ finish ] = stackAndMeanImage (initFrame, finalFrame)
cd 'C:\Users\Disc-1119\Desktop\Internships\Tracking\Octave\highway\highway (6-13-2014 11-13-41 AM)';
pkg load image;
i = initFrame;
f = finalFrame;
astr = num2str(i);
tmp = imread(astr, 'jpg');
d = f - i
for a = 1:d
a
astr = num2str(i + 1);
read_tmp = imread(astr, 'jpg');
read_tmp = rgb2gray(read_tmp);
tmp = tmp :+ read_tmp;
tmp = tmp / 2;
end
imwrite(tmp, 'meanimage.JPG');
finish = 'done';
end
Here are two example input images
http://imgur.com/5DR1ccS,AWBEI0d#1
And here is one output image
http://imgur.com/aX6b0kj
I am really confused as to what is happening. I have not implemented what the other answers have said yet though.
OLD
I am working on an image processing project where I am now manually choosing images that are 'empty' or only have the background, so that my algorithm can compute the differences and then do some more analysis, I have a simple piece of code that computes the mean of the two images, which I have converted to grayscale matrices, but this only works for two images, because when I find the mean of two, then take this mean and find the mean of this versus the next image, and do this repeatedly, I end up with a washed out white image that is absolutely useless. You can't even see anything.
I found that there is a function in Matlab called imFuse that is able to average images. I was wondering if anyone knew the process that imFuse uses to combine images, I am happy to implement this into Octave, or if anyone knew of or has already written a piece of code that achieves something similiar to this. Again, I am not asking for anyone to write code for me, just wondering what the process for this is and if there are already pre-existing functions out there, which I have not found after my research.
Thanks,
AeroVTP
You should not end up with a washed-out image. Instead, you should end up with an image, which is technically speaking temporally low-pass filtered. What this means is that half of the information content is form the last image, one quarter from the second last image, one eight from the third last image, etc.
Actually, the effect in a moving image is similar to a display with slow response time.
If you are ending up with a white image, you are doing something wrong. nkjt's guess of type challenges is a good one. Another possibility is that you have forgotten to divide by two after summing the two images.
One more thing... If you are doing linear operations (such as averaging) on images, your image intensity scale should be linear. If you just use the RGB values or some grayscale values simply calculated from them, you may get bitten by the nonlinearity of the image. This property is called the gamma correction. (Admittedly, most image processing programs just ignore the problem, as it is not always a big challenge.)
As your project calculates differences of images, you should take this into account. I suggest using linearised floating point values. Unfortunately, the linearisation depends on the source of your image data.
On the other hand, averaging often the most efficient way of reducing noise. So, there you are in the right track assuming the images are similar enough.
However, after having a look at your images, it seems that you may actually want to do something else than to average the image. If I understand your intention correctly, you would like to get rid of the cars in your road cam to give you just the carless background which you could then subtract from the image to get the cars.
If that is what you want to do, you should consider using a median filter instead of averaging. What this means is that you take for example 11 consecutive frames. Then for each pixel you have 11 different values. Now you order (sort) these values and take the middle (6th) one as the background pixel value.
If your road is empty most of the time (at least 6 frames of 11), then the 6th sample will represent the road regardless of the colour of the cars passing your camera.
If you have an empty road, the result from the median filtering is close to averaging. (Averaging is better with Gaussian white noise, but the difference is not very big.) But your averaging will be affected by white or black cars, whereas median filtering is not.
The problem with median filtering is that it is computationally intensive. I am very sorry I speak very broken and ancient Octave, so I cannot give you any useful code. In MatLab or PyLab you would stack, say, 11 images to a M x N x 11 array, and then use a single median command along the depth axis. (When I say intensive, I do not mean it couldn't be done in real time with your data. It can, but it is much more complicated than averaging.)
If you have really a lot of traffic, the road is visible behind the cars less than half of the time. Then the median trick will fail. You will need to take more samples and then find the most typical value, because it is likely to be the road (unless all cars have similar colours). There it will help a lot to use the colour image, as cars look more different from each other in RGB or HSV than in grayscale.
Unfortunately, if you need to resort to this type of processing, the path is slightly slippery and rocky. Average is very easy and fast, median is easy (but not that fast), but then things tend to get rather complicated.
Another BTW came into my mind. If you want to have a rolling average, there is a very simple and effective way to calculate it with an arbitrary length (arbitrary number of frames to average):
# N is the number of images to average
# P[i] are the input frames
# S is a sum accumulator (sum of N frames)
# calculate the sum of the first N frames
S <- 0
I <- 0
while I < N
S <- S + P[I]
I <- I + 1
# save_img() saves an averaged image
while there are images to process
save_img(S / N)
S <- -P[I-N] + S + P[I]
I <- I + 1
Of course, you'll probably want to use for-loops, and += and -= operators, but still the idea is there. For each frame you only need one subtraction, one addition, and one division by a constant (which can be modified into a multiplication or even a bitwise shift in some cases if you are in a hurry).
I may have misunderstood your problem but I think what you're trying to do is the following. Basically, read all images into a matrix and then use mean(). This is providing that you are able to put them all in memory.
function [finish] = stackAndMeanImage (ini_frame, final_frame)
pkg load image;
dir_path = 'C:\Users\Disc-1119\Desktop\Internships\Tracking\Octave\highway\highway (6-13-2014 11-13-41 AM)';
imgs = cell (1, 1, d);
## read all images into a cell array
current_frame = ini_frame;
for n = 1:(final_frame - ini_frame)
fname = fullfile (dir_path, sprintf ("%i", current_frame++));
imgs{n} = rgb2gray (imread (fname, "jpg"));
endfor
## create 3D matrix out of all frames and calculate mean across 3rd dimension
imgs = cell2mat (imgs);
avg = mean (imgs, 3);
## mean returns double precision so we cast it back to uint8 after
## rescaling it to range [0 1]. This assumes that images were all
## originally uint8, but since they are jpgs, that's a safe assumption
avg = im2uint8 (avg ./255);
imwrite (avg, fullfile (dir_path, "meanimage.jpg"));
finish = "done";
endfunction
For a thumbnail-engine I would like to develop an algorithm that takes x random thumbnails (crop, no resize) from an image, analyzes them for contrast and chooses the one with the highest contrast. I'm working with PHP and Imagick but I would be glad for some general tips about how to compute contrast of imagery.
It seems that many things are easier than computing contrast, for example counting colors, computing luminosity,etc.
What are your experiences with the analysis of picture material?
I'd do it that way (pseudocode):
L[256] = {0,0,0...}
loop over each pixel:
luminance = avg(R,G,B)
increment L[luminance] by 1
for i = 0 to 255:
if L[i] < C: L[i] = 0 // C = threshold of your chose
find index of first and last non-zero value of L[]
contrast = last - first
In looking for the image "with the highest contrast," you will need to be very careful in how you define contrast for the image. In the simplest way, contrast is the difference between the lowest intensity and the highest intensity in the image. That is not going to be very useful in your case.
I suggest you use a histogram approach to describe the contrast of a given image and then compare the properties of the histograms to determine the image with the highest contrast as you define it. You could use a variety of well known containers to represent the histogram in code, or construct a class to meet your specific needs. (I am not implying that you need to create a histogram in the form of a chart – just a statistical representation of the intensity values.) You could use the variance of each histogram directly as a measure of contrast, or use the standard deviation if that is easier to work with.
The key really lies in how you define the contrast of the image. In general, I would define a high contrast image as one with values present for all, or nearly all, the possible values. And I would further add that in this definition of a high contrast image, the intensity values of the image will tend to be distributed across the range of possible values in a uniform way.
Using this approach, a low contrast image would tend to have relatively few discrete intensity values and they would tend to be closely grouped together rather than uniformly distributed. (As a general rule, they will also tend to be grouped toward the center of the range.)