I am new in image processing and I don't know the use of basic terms, I know the basic definition of sparsity, but can anyone please elaborate the definition in term of image processing?
Well Sajid, I actually was doing image processing a few months ago, and I had found a website that gave me what I thought was the best definition of sparsity.
Sparsity and density are terms used to describe the percentage of
cells in a database table that are not populated and populated,
respectively. The sum of the sparsity and density should equal 100%.
A table that is 10% dense has 10% of its cells populated with non-zero
values. It is therefore 90% sparse – meaning that 90% of its cells are
either not filled with data or are zeros.
I took this in the context of on/off for black and white image processing. If many pixels were off, then the pixels were sparse.
As The Obscure Question said, sparsity is when a vector or matrix is mostly zeros. To see a real world example of this, just look at the wavelet transform, which is known to be sparse for any real-world image.
(all the black values are 0)
Sparsity has powerful impacts. It can transform matrix multiplication of two NxN matrices, normally a O(N^3) operation, into an O(k) operation (with k non-zero elements). Why? Because it's a well-known fact that for all x, x * 0 = 0.
What does sparsity mean? In the problems I've been exposed to, it means similarity in some domain. For example, natural images are largely the same color in areas (the sky is blue, the grass is green, etc). If you take the wavelet transform of that natural image, the output is sparse through the recursive nature of the wavelet (well, at least recursive in the Haar wavelet).
Related
Why are the basis blocks corresponding to reflected waves in the quantisation matrix given seemingly random priorities in the standard JPEG quantisation matrices. Also, why aren't the priorities monotonic with respect to frequency?
I haven't been able to find any explanation and all I can come up with is possible tiling patterns occurring with symmetric quantisation matrices or an adaptation to the arrangement of photoreceptors in the eye.
The quantization tables are a set of fudge factors that attempt to model human perception.
The specific quantization table values are more art than science, because human perception is quirky and complex, and ideal coefficients depend on specific viewing conditions that can only be roughly guessed in advance.
Tables are not always monotonic with respect to frequency, because blocks of certain frequencies form patterns that are more useful than others, e.g. for straight horizontal and vertical lines.
I need a (preferably simple and fast) image hashing algorithm. The hash value is used in a lookup table, not for cryptography.
Some of the images are "computer graphic" - i.e. solid-color filled rects, rasterized texts and etc., whereas there are also "photographic" images - containing rich color spectrum, mostly smooth, with reasonable noise amplitude.
I'd also like the hashing algorithm to be able to be applied to specific image parts. I mean, the image can be divided into a grid cells, and the hash function of each cell should depend only on the contents of this cell. So that one may spot quickly if two images have common areas (in case they're aligned appropriately).
Note: I only need to know if two images (or their parts) are identical. That is, I don't need to match similar images, there's no need in feature recognition, correlation, and other DSP techniques.
I wonder what is the preferred hashing algorithm.
For "photographic" images just XOR-ing all the pixels within a grid cell is ok more-or-less. The probability of the same hash value for different images is pretty low, especially because the presence of the (nearly white) noise breaks all the potential symmetries. Plus the spectrum of such a hash function looks good (any value is possible with nearly the same probability).
But such a naive algorithm may not be used with "artificial" graphics. Identical pixels, repeating patterns, geometrical offset invariance are very common for such images. XOR-ing all the pixels will give 0 for any image with even number of identical pixels.
Using something like CRT-32 looks somewhat promising, but I'd like to figure-out something faster. I thought about iterative formula, each new pixel mutates the current hash value, like this:
hashValue = (hashValue * /*something*/ | newPixelValue) % /* huge prime */
Doing modulo prime number should probably give a good dispersion, so that I'm leaning toward this option. But I'd like to know if there are better varians.
Thanks in advance.
Have a look at this tutorial on the phash algorithm http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html which is used to find closely matching images.
If you want to make it very fast, you should consider taking a random subset of the pixels to avoid reading the entire image. Next, compute a hash function on the sequence of values at those pixels. The random subset should be selected by a deterministic pseudo-random number generator with fixed seed so that identical images produce identical subsets and consequently identical hash values.
This should work reasonably well even for artificial images. However, if you have images which differ from each other by a small number of pixels, this is going to give hash collisions. More iterations give better reliability. If that is the case, for instance, if your images set is likely to have pairs with one different pixel, you must read every pixel to compute the hash value. Taking a simple linear combination with pseudo-random coefficients would be good enough even for artificial images.
pseudo-code of a simple algorithm
Random generator = new generator(2847) // Initialized with fixed seed
int num_iterations = 100
int hash(Image image) {
generator.reset() //To ensure consistency on each evaluation
int value = 0
for num_iteration steps {
int nextValue = image.getPixel(generator.nextInt()%image.getSize()).getValue()
value = value + nextValue*generator.nextInt()
}
return value
}
I need to compare 3 216x216 matrix (data correlation matrix, events etc) . can someone suggest a way to plot these in matlab or someother plotting tools that can easily visualise and compare them ... does a 3d mesh plot be useful ? I thought mesh would be good .. but I need others opinion too.
Thanks in advance,
Sparse matrices
You can use the spy() method to visualize a "sparsity pattern", as Matlab calls it. It plots a dot (or any other marker) where the matrix element is non-zero.
spy() can also be used to visualize non-sparse matrices where a lot of entries are close to zero - just threshold the matrix first:
a=eye(50)+0.01*randn(50);
spy(a) % Not very useful
b=a; b(b<0.02)=0;
figure, spy(b) % Much more useful
More generally, you can apply upper and lower thresholds to visualize the location of matrix entries whose value is within a specific range.
Corellation
It may be useful to just display the matrix using imagesc(). This may give you an idea of the degree of corellation in your data - i.e. an uncorellated signal will have a corellation matrix with dominant diagonal elements, which will be clearly visible. I find Matlab's default color map distracting, so I usually do something like
colormap(gray);imagesc(a);
Miscellaneous
Of course, there's a whole host of non-visual comparisons you can make - various norm()'s, std(), spectral analysis using eig() for square matrices, or svd() more generally. You can compare eigenvalue magnitudes, or compare the eigenvectors. This may be very useful or complete garbage, depending on what your data is.
Thus, to conclude (for now), depending on what specifically your matrices contain, you may get more useful suggestions.
I have implemented a nice algorithm ("Non Local Means") for reducing noise in image.
It is based on it's Matlab implementation.
The problem with NLMeans is that the original algorithm is slow even on compiled languages like c/c++ and i am trying to run it using scripting language.
One of best solutions is to use improved Blockwise NLMeans algorithm which is ~60-80 times faster. The problem is that the paper which describes it is written in a complex mathematical language and it's really hard for me to understand an idea and program it
(yes, i didn't learn math at college).
That is why i am desperately looking for a pseudo code of this algorithm.
(modification of original Matlab implementation would be just perfect)
I admit, I was intrigued until I saw the result – 260+ seconds on a dual core, and that doesn't assume the overhead of a scripting language, and that's for the Optimized Block Non Local Means filter.
Let me break down the math for you – My idea of pseudo-code is writing in Ruby.
Non Local Means Filtering
Assume an image that's 200 x 100 pixels (20000 pixels total), which is a pretty tiny image. We're going to have to go through 20,000 pixels and evaluate each one on the weighted average of the other 19,999 pixels: [Sorry about the spacing, but the equation is interpreted as a link without it]
NL [v] (i) = ∑ w(i,j)v(j) [j ∈ I]
where 0 ≤ w(i,j) ≤ 1 and ∑j w(i,j) = 1
Understandably, this last part can be a little confusing, but this is really nothing more than a convolution filter the size of the whole image being applied to each pixel.
Blockwise Non Local Means Filtering
The blockwise implementation takes overlapping sets of voxels (volumetric pixels - the implementation you pointed us to is for 3D space). Presumably, taking a similar approach, you could apply this to 2D space, taking sets of overlapping pixels. Let's see if we can describe this...
NL [v] (ijk) = 1/|Ai|∑ w(ijk, i)v(i)
Where A is a vector of the pixels to be estimated, and similar circumstances as above are applied.
[NB: I may be slightly off; It's been a few years since I did heavy image processing]
Algorithm
In all likelihood, we're talking about reducing complexity of the algorithm at a minimal cost to reduction quality. The larger the sample vector, the higher the quality as well as the higher the complexity. By overlapping then averaging the sample vectors from the image then applying that weighted average to each pixel we're looping through the image far fewer times.
Loop through the image to collect the sample vectors and store their weighted average to an array.
Apply each weighted average (a number between 0 and 1) to each pixel times the pixels value.
Pretty simple, but the processing time is going to be horrid with larger images.
Final Thoughts
You're going to have to make some tough decisions. If you're going to use a scripting language, you're already dealing with significant interpretive overhead. It's far from optimal to use a scripting language for heavy duty image processing. If you're not processing medical images, in all likelihood, there are far better algorithms to use with lesser O's.
Hope this is helpful... I'm not terribly good at making a point clearly and concisely, so if I can clarify anything, let me know.
I'm cross-posting this from math.stackexchange.com because I'm not getting any feedback and it's a time-sensitive question for me.
My question pertains to linear separability with hyperplanes in a support vector machine.
According to Wikipedia:
...formally, a support vector machine
constructs a hyperplane or set of
hyperplanes in a high or infinite
dimensional space, which can be used
for classification, regression or
other tasks. Intuitively, a good
separation is achieved by the
hyperplane that has the largest
distance to the nearest training data
points of any class (so-called
functional margin), since in general
the larger the margin the lower the
generalization error of the
classifier.classifier.
The linear separation of classes by hyperplanes intuitively makes sense to me. And I think I understand linear separability for two-dimensional geometry. However, I'm implementing an SVM using a popular SVM library (libSVM) and when messing around with the numbers, I fail to understand how an SVM can create a curve between classes, or enclose central points in category 1 within a circular curve when surrounded by points in category 2 if a hyperplane in an n-dimensional space V is a "flat" subset of dimension n − 1, or for two-dimensional space - a 1D line.
Here is what I mean:
That's not a hyperplane. That's circular. How does this work? Or are there more dimensions inside the SVM than the two-dimensional 2D input features?
This example application can be downloaded here.
Edit:
Thanks for your comprehensive answers. So the SVM can separate weird data well by using a kernel function. Would it help to linearize the data before sending it to the SVM? For example, one of my input features (a numeric value) has a turning point (eg. 0) where it neatly fits into category 1, but above and below zero it fits into category 2. Now, because I know this, would it help classification to send the absolute value of this feature for the SVM?
As mokus explained, support vector machines use a kernel function to implicitly map data into a feature space where they are linearly separable:
Different kernel functions are used for various kinds of data. Note that an extra dimension (feature) is added by the transformation in the picture, although this feature is never materialized in memory.
(Illustration from Chris Thornton, U. Sussex.)
Check out this YouTube video that illustrates an example of linearly inseparable points that become separable by a plane when mapped to a higher dimension.
I am not intimately familiar with SVMs, but from what I recall from my studies they are often used with a "kernel function" - essentially, a replacement for the standard inner product that effectively non-linearizes the space. It's loosely equivalent to applying a nonlinear transformation from your space into some "working space" where the linear classifier is applied, and then pulling the results back into your original space, where the linear subspaces the classifier works with are no longer linear.
The wikipedia article does mention this in the subsection "Non-linear classification", with a link to http://en.wikipedia.org/wiki/Kernel_trick which explains the technique more generally.
This is done by applying what is know as a [Kernel Trick] (http://en.wikipedia.org/wiki/Kernel_trick)
What basically is done is that if something is not linearly separable in the existing input space ( 2-D in your case), it is projected to a higher dimension where this would be separable. A kernel function ( can be non-linear) is applied to modify your feature space. All computations are then performed in this feature space (which can be possibly of infinite dimensions too).
Each point in your input is transformed using this kernel function, and all further computations are performed as if this was your original input space. Thus, your points may be separable in a higher dimension (possibly infinite) and thus the linear hyperplane in higher dimensions might not be linear in the original dimensions.
For a simple example, consider the example of XOR. If you plot Input1 on X-Axis, and Input2 on Y-Axis, then the output classes will be:
Class 0: (0,0), (1,1)
Class 1: (0,1), (1,0)
As you can observe, its not linearly seperable in 2-D. But if I take these ordered pairs in 3-D, (by just moving 1 point in 3-D) say:
Class 0: (0,0,1), (1,1,0)
Class 1: (0,1,0), (1,0,0)
Now you can easily observe that there is a plane in 3-D to separate these two classes linearly.
Thus if you project your inputs to a sufficiently large dimension (possibly infinite), then you'll be able to separate your classes linearly in that dimension.
One important point to notice here (and maybe I'll answer your other question too) is that you don't have to make a kernel function yourself (like I made one above). The good thing is that the kernel function automatically takes care of your input and figures out how to "linearize" it.
For the SVM example in the question given in 2-D space let x1, x2 be the two axes. You can have a transformation function F = x1^2 + x2^2 and transform this problem into a 1-D space problem. If you notice carefully you could see that in the transformed space, you can easily linearly separate the points(thresholds on F axis). Here the transformed space was [ F ] ( 1 dimensional ) . In most cases , you would be increasing the dimensionality to get linearly separable hyperplanes.
SVM clustering
HTH
My answer to a previous question might shed some light on what is happening in this case. The example I give is very contrived and not really what happens in an SVM, but it should give you come intuition.