I am working on an image processing project and I have to use entropyfilt (from MATLAB).
I researched and found some information, but not enough. I can calculate the entropy value of an image, but I don't know how to write an entropy filter. There is a similar question on the site, but I didn't understand it either.
Can anybody help me understand the entropy filter?
From the MATLAB documentation:
J = entropyfilt(I) returns the array J, where each output pixel contains the entropy value of the 9-by-9 neighborhood around the corresponding pixel in the input image I. I can have any dimension. If I has more than two dimensions, entropyfilt treats it as a multidimensional grayscale image and not as a truecolor (RGB) image. The output image J is the same size as the input image I.
For each pixel, you look at the 9-by-9 area around it and calculate the entropy. Since entropy is a nonlinear function of the neighborhood values, it is not something you can do with a simple convolution kernel. You have to loop over each pixel and do the calculation on a per-pixel basis.
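As a rough illustration of that per-pixel loop (a sketch in Python/NumPy, not MATLAB's actual implementation; the symmetric border padding is my assumption), it could look like this:

import numpy as np

def local_entropy(img, win=9):
    # img is assumed to be a 2-D uint8 (grayscale) array.
    pad = win // 2
    padded = np.pad(img, pad, mode='symmetric')   # give border pixels a full window
    out = np.zeros(img.shape, dtype=float)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            patch = padded[r:r + win, c:c + win]
            hist = np.bincount(patch.ravel(), minlength=256).astype(float)
            p = hist / hist.sum()
            p = p[p > 0]                          # skip empty bins (0 * log 0 = 0)
            out[r, c] = -np.sum(p * np.log2(p))   # entropy of the 9x9 neighborhood
    return out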
Related
I want to compute the depth entropy of a depth image in MATLAB (same as in this work). Unfortunately, the authors haven't replied to my emails. There is a function, entropyfilt, that computes the local entropy of a grayscale image. I've used this function with a depth image captured by a Kinect, but it hasn't worked properly. Here is my input depth image:
Here is the code used for entropy computing:
J = entropyfilt(Depth);
imshow(mat2gray(J))
Sorry, my reputation isn't enough, so I can't upload my result image.
How can I compute the entropy image of a depth image? I want to obtain an image like Figure 4 in the above paper.
Thanks in advance.
As written in the paper, for each pixel you first extract two patches from the image and then calculate the entropy of each patch. The formula is also in the paper and is well known in statistics.
If you want to use the function entropyfilt, you need to provide as a second argument a neighborhood mask that describes the patch (all pixels within the patch need to be 1 in the mask, the others 0). This is detailed in the documentation of that function.
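If you are open to doing this outside MATLAB, here is a hedged Python sketch of the same idea: scipy.ndimage.generic_filter takes a boolean footprint that plays the role of entropyfilt's neighborhood mask. The patch shape below is a placeholder, and depth8 is assumed to be the depth image already scaled to 8 bits:

import numpy as np
from scipy import ndimage

def patch_entropy(values):
    # values: the pixels inside the footprint around one output pixel
    hist = np.bincount(values.astype(np.uint8), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Placeholder mask (a 15x3 patch of ones); replace it with the patch the paper uses.
footprint = np.ones((15, 3), dtype=bool)
entropy_img = ndimage.generic_filter(depth8, patch_entropy, footprint=footprint)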
The authors generate one color image from the two entropy images; how they do so, they seemingly forgot to mention.
I think the paper is low quality.
image1 = imread('path to the image file');   % load the image from disk
entropy(image1)   % global entropy of the whole image (a single scalar, not a filtered image)
imshow(image1)
I am using Laplacian of Gaussian for edge detection using a combination of what is described in http://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm and http://wwwmath.tau.ac.il/~turkel/notes/Maini.pdf
Simply put, I'm using this equation: LoG(x, y) = -1/(\pi\sigma^4) * (1 - (x^2 + y^2)/(2\sigma^2)) * exp(-(x^2 + y^2)/(2\sigma^2)), implemented as:
double[][] kernel = new double[kernelSize][kernelSize];
for (int i = -(kernelSize / 2); i <= kernelSize / 2; i++)
{
    for (int j = -(kernelSize / 2); j <= kernelSize / 2; j++)
    {
        // LoG(i, j) = -1/(pi*sigma^4) * (1 - r^2/(2*sigma^2)) * exp(-r^2/(2*sigma^2))
        double r2 = i * i + j * j;
        double L_xy = -1.0 / (Math.PI * Math.pow(sigma, 4))
                * (1 - r2 / (2 * sigma * sigma))
                * Math.exp(-r2 / (2 * sigma * sigma));
        L_xy *= 426.3; // empirical scaling factor
        kernel[i + kernelSize / 2][j + kernelSize / 2] = L_xy;
    }
}
and using the L_xy values to build up the LoG kernel.
The problem is that when the image is larger, applying the same kernel makes the filter more sensitive to noise. The edge sharpness is also not the same.
Let me put an example here...
Suppose we've got this image:
Using a value of sigma = 0.9 and a kernel size of 5 x 5 matrix on a 480 × 264 pixel version of this image, we get the following output:
However, if we use the same values on a 1920 × 1080 pixels version of this image (same sigma value and kernel size), we get something like this:
[Both images are scaled-down versions of an even larger image. The scaling down was done with a photo editor, so the data in the two images is not exactly the same, but it should be very close.]
Given that the larger image is roughly 4 times the smaller one, I also tried scaling sigma by a factor of 4 (sigma *= 4), and the output was... you guessed it, a black canvas.
Could you please help me figure out how to implement a LoG edge detector that finds the same features even when the input image is scaled up or down (the scaling factor will be given)?
Looking at your images, I assume you are working with 24-bit RGB. When you increase sigma, the response of your filter weakens accordingly, so what you get in the larger image with a larger kernel are values close to zero, which are either truncated or too close to zero for your display to distinguish.
To make the differentials comparable across different scales, you should use the scale-space (\gamma-normalized) differential operator (Lindeberg et al.):

LoG(x, y; \sigma) = \sigma^{\gamma} \nabla^{2} (G_{\sigma} * L)(x, y)

Essentially, the differential operator is applied to the Gaussian kernel function G_{\sigma}, and either the resulting kernel or the filtered image is scaled by \sigma^{\gamma} (it is just a scalar multiplier either way). Here L is the input image and LoG is the Laplacian-of-Gaussian image.
When the order of the differential is 2, \gamma is typically set to 2.
Then you should get quite similar magnitude in both images.
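As a sketch of that normalization (in Python/SciPy rather than your Java code; scipy.ndimage.gaussian_laplace computes the LoG at a given sigma, and the sigma**gamma factor is the normalization described above):

import numpy as np
from scipy import ndimage

def normalized_log(gray, sigma, gamma=2.0):
    # Scale-normalized Laplacian of Gaussian: sigma**gamma * LoG_sigma(image).
    return (sigma ** gamma) * ndimage.gaussian_laplace(gray.astype(float), sigma)

# The response magnitudes should then be comparable across scales, e.g.:
# small = normalized_log(small_gray, sigma=0.9)
# large = normalized_log(large_gray, sigma=0.9 * 4)   # 4x larger image, 4x sigma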
Sources:
[1] Lindeberg: "Scale-space theory in computer vision" 1993
[2] Frangi et al. "Multiscale vessel enhancement filtering" 1998
I want to write my own focus stacking software but haven't been able to find a suitable explanation of any algorithm for extracting the in-focus portions of each image in the stack.
For those who are not familiar with focus stacking, this Wikipedia article does a nice job of explaining the idea.
Can anyone point me in the right direction for finding an algorithm? Even some key words to search would be helpful.
I realise this is over a year old but for anyone who is interested...
I have had a fair bit of experience in machine vision and this is how I would do it:
Load every image in memory
Perform a Gaussian blur on each image on one of the channels (maybe Green):
The simplest Gaussian kernel is:
1 2 1
2 4 2
1 2 1
The idea is to loop through every pixel and look at the pixels immediately adjacent to it. The pixel you are on is multiplied by 4 and the neighboring pixels by the corresponding kernel values above; the results are summed and divided by 16 (the sum of the kernel weights) so the overall brightness is preserved.
You can make a larger Gaussian kernel by using the equation:
exp(-(x*x + y*y) / (2*c*c))
where c is the standard deviation, i.e. the strength of the blur
Perform a Laplacian Edge Detection kernel on each Gaussian Blurred image but do not apply a threshold
The simplest Laplacian operator is:
-1 -1 -1
-1 8 -1
-1 -1 -1
Same deal as the Gaussian: slide the kernel over the entire image and generate a result.
An equation to work out larger kernels is:
(-1/(pi*c*c*c*c)) * (1 - (x*x + y*y)/(2*c*c)) * exp(-(x*x + y*y)/(2*c*c))
Take the absolute value of the Laplacian-of-Gaussian result. This quantifies the strength of the edges with respect to the size and strength of your kernel.
Now create a blank image, loop through each pixel, find the strongest edge in the LoG stack (i.e. the highest absolute value across the stack), and take the RGB value for that pixel from the corresponding source image (see the sketch just below).
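Putting the steps above together, a rough Python/NumPy sketch (using OpenCV for the blur and the Laplacian; the function and variable names are mine, not a reference implementation, and the frames are assumed to be aligned already) could look like this:

import cv2
import numpy as np

def focus_stack(images):
    # images: list of aligned BGR frames of the same size; returns the merged frame.
    sharpness = []
    for img in images:
        green = img[:, :, 1]                       # blur just one channel (green)
        blurred = cv2.GaussianBlur(green, (5, 5), 0)
        log = cv2.Laplacian(blurred, cv2.CV_64F)   # Laplacian, no threshold
        sharpness.append(np.abs(log))              # edge strength per pixel
    sharpness = np.stack(sharpness)                # shape: (n_images, h, w)

    # For each pixel, pick the frame with the strongest edge and copy its color.
    best = np.argmax(sharpness, axis=0)            # the "focus map"
    stack = np.stack(images)                       # shape: (n_images, h, w, 3)
    rows, cols = np.indices(best.shape)
    return stack[best, rows, cols]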
Here is an example in MATLAB that I have created:
http://www.magiclantern.fm/forum/index.php?topic=11886.0
You are free to use it for whatever you like. It will create a file called Outsharp.bmp which is what you are after.
To better your output image you could:
- Compensate for differences in lightness levels between images (i.e. histogram matching or simple level adjustment)
- Create a custom algorithm to reject image noise
- Manually adjust the stack after you have generated it
- Apply a Gaussian blur (be sure to divide the result by 16) on the focus map so that the individual images are better merged
Good luck!
Folks,
I have read a number of articles on Discrete Wavelet Transform (DWT) and looked at some sample code as well. However, I am not clear on what exactly does DWT achieve.
Here is what I understand. For a two-dimensional image in YUV format, I can pass the Y plane (brightness) to the DWT function as a parameter. The function returns a matrix of the original width and height containing coefficient values.
What are these coefficient values telling me? Is it how fast or slowly the brightness of a pixel changes compared to its neighbors?
Further, the returned matrix is rearranged into four quarters. As the coefficients have been rearranged, I no longer know which coefficient belongs to which pixel. This is confusing. If I cannot associate a coefficient with its corresponding pixel location, how can I really use the coefficients?
A little bit of background. I am looking at hiding some information in an image as an invisible watermark. From what I understand, DWT can help me identify the best region to hide the information. However, I have not been able to put the whole picture together.
OK, I figured out how DWT works. I was under the assumption that the coefficients generated would have a direct spatial relationship with the original image. However, the transform converts the input luma into a completely different set of coefficients. It is possible to run the inverse transform on the new values to obtain the original values again.
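To see this concretely, here is a small sketch using the PyWavelets package (pywt), which may not be the DWT routine you are using; it shows the four sub-band quarters and the exact reconstruction:

import numpy as np
import pywt

luma = np.random.rand(256, 256)            # stand-in for the Y plane

# One level of the 2-D DWT: an approximation band plus horizontal,
# vertical and diagonal detail bands, each half the size per axis.
cA, (cH, cV, cD) = pywt.dwt2(luma, 'haar')
print(cA.shape, cH.shape, cV.shape, cD.shape)

# The transform is invertible: the original luma comes back (up to rounding).
reconstructed = pywt.idwt2((cA, (cH, cV, cD)), 'haar')
print(np.allclose(luma, reconstructed))    # True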
Regards,
Peter
I'm trying to calculate an average value of one color over the whole image in order to determine how the color, saturation, intensity or any other value describing it changes between frames of the video.
However, I would like to get just one value that describes the whole frame (and a single, chosen color in it). Calculating a simple average color value per frame gives me very small differences between video frames, just 2-3 on a 0..255 scale.
Is there any method other than a histogram to determine the color of the image? A histogram, as I understand it, will give me more than one value describing a single frame.
Which library are you using for image processing? If it's OpenCV (or Matlab) then the steps here will be quite easy. Otherwise you'd need to look around and experiment a bit.
Use a Mean Shift filter on RGB (or gray, whichever) to cluster the colors in the image - nearly similar colors are clustered together. This lessens the number of colors to deal with.
Change to gray-level and compute a frequency histogram with bins [0...255] of pixel values that are present in the image
The bin with the highest frequency - the mode - corresponds to the gray level (color) that is present the most; the frequency of each bin gives you the number of pixels of that color in the frame.
Take that modal value as the single color describing your frame - the color present in the largest amount.
The key point here is if the above steps are fast enough for realtime video. You'd have to try to find out I guess.
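For what it's worth, a hedged OpenCV/Python sketch of those steps (the mean-shift window parameters are arbitrary) could look like this:

import cv2
import numpy as np

def dominant_gray_level(frame_bgr):
    # One value per frame: the most frequent gray level after mean-shift clustering.
    clustered = cv2.pyrMeanShiftFiltering(frame_bgr, 10, 20)   # spatial/color windows
    gray = cv2.cvtColor(clustered, cv2.COLOR_BGR2GRAY)
    hist = np.bincount(gray.ravel(), minlength=256)            # frequency histogram, bins 0..255
    return int(np.argmax(hist))                                # the bin present the most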
Worst case scenario, you could loop over all the pixels in the image and do a count. I'm not sure what you are using programming-wise, but I use Python with NumPy, something similar to this, where pb is a GTK pixbuf holding my image:
def pull_color_out(self, pb, *args):
    counter = 0
    dat = pb.get_pixels_array().copy()
    for y in range(0, pb.get_width()):
        for x in range(0, pb.get_height()):
            p = dat[x][y]          # pixel as [R, G, B] (rows index height, columns index width)
            # count pure red pixels
            if p[0] == 255 and p[1] == 0 and p[2] == 0:
                counter += 1
    return counter
Other than that, I would normally use a histogram and get the data I need from that. Admittedly, this will not be your fastest option, especially for video, but if you have time or just a few frames, then hack away :P