Is it possible to calculate the mathematical function of a 2D image?

The question basically says it all, but let me elaborate: suppose I have an image, a photograph, and I wish to calculate its mathematical function, so that when I input x and y pixel coordinates, it returns a vector consisting of the R, G, B values at that (x, y) point. I could then use a for loop to reconstruct the whole image from just that function. I am not asking for the whole solution or algorithm here, just whether this is possible and, if so, which direction I should take to go about doing it. References to relevant papers would be really nice.
Thanks
Azmuh

Yes, it is absolutely always possible. Basically, if you choose some points, there are always (infinitely many) smooth explicit functions (that is, nice functions) whose values at those points are exactly the ones you chose.
For example, you can have a look at http://en.wikipedia.org/wiki/Lagrange_polynomial or http://en.wikipedia.org/wiki/Trigonometric_interpolation. These are two different methods for computing an explicit function that passes exactly through the data points you have. So you can apply those methods to your image, seen as a set of data points, separately for R, G, and B.
At the end, you get one explicit function (a polynomial or a trigonometric series, depending on what you chose), and you can compute its values wherever you want.
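For instance, here is a minimal sketch in Python (assuming NumPy and SciPy are available; the data is made up) of the Lagrange approach applied to one color channel along a single row. A real image would need 2D interpolation, and at real image sizes the degree blows up:

import numpy as np
from scipy.interpolate import lagrange

x = np.arange(5)                         # pixel x-coordinates along one row
red = np.array([10, 200, 30, 90, 150])   # made-up R values for those pixels

p_red = lagrange(x, red)   # explicit degree-4 polynomial through the samples
print(p_red(2))            # ~30.0: reproduces the original pixel value
print(p_red(2.5))          # the polynomial also takes values "between" pixels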
However, note that I would definitely not recommend using those methods to actually store or retrieve the data. The functions you get are absolutely not optimized: they have a very high degree (for an n×m image, each color will have degree nm-1) and very large coefficients, and furthermore they take extremely large values between your original points (look up Runge's phenomenon).

This is not possible in general. Imagine an image that has been generated by drawing a random value for each pixel: you can't find a mathematical expression that will give you the value of a pixel given its 2D coordinates.
Now, it may be possible for some images that have been generated using a function. In that case, it's not a problem specific to image processing; it's recovering a function from some of its points (in your case, you have all the points). It's exactly the same thing as fitting a curve to a set of points when you draw a graph in Excel. The more points you have, the more precise the function you find will be.
Look for information about regression analysis. I can't help you much beyond that, but algorithms for this do exist.
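For instance, a minimal regression sketch with NumPy (my example, not the poster's data): fit a low-degree polynomial to noisy samples instead of passing through every point exactly:

import numpy as np

x = np.linspace(0, 5, 20)
y = 2 * x + 1 + np.random.normal(scale=0.3, size=x.shape)   # noisy samples of a line

coeffs = np.polyfit(x, y, deg=1)   # least-squares fit, not exact interpolation
print(coeffs)                      # roughly [2.0, 1.0]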

Related

Obtaining the functional form of a curve

The following is the plot of a curve f(r), where r is the radial coordinate, and plotted for different values of a parameter as shown:
However, I don't know the functional form of the curve and I am interested in finding it. Are there any numerical methods which can be used to find the functional form of f(r) in terms of the radial coordinate and the parameter?
I found a solution to the problem based on the suggestion by ja72 to use the Eureqa software, which churns through the data to create accurate predictive models using an evolutionary search algorithm.
In the question, the different curves correspond to different values of the parameter. So, initially I obtained the best-fit equation for different values of the parameter and found that the following model equation is suitable for my purpose:
Then, I repeated the process for a large number of parameter values, calculated the values of the four functions at each of those parameter values, and then fitted these four functions individually. The following are the results that I obtained:
N.B.: Eureqa gave several other formulas that fit better than those mentioned in the answer, but the formulas I mentioned are sufficiently accurate for my purpose and have minimal complexity.
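The model equations themselves were images in the original answer and are not reproduced here, but the same fit-then-fit-the-coefficients workflow can be sketched with SciPy in place of Eureqa (the model form below is made up purely for illustration):

import numpy as np
from scipy.optimize import curve_fit

def model(r, c0, c1, c2, c3):
    # a made-up four-coefficient model form, NOT the answer's actual equation
    return c0 + c1 / (r + c2) + c3 * r

r = np.linspace(0.5, 10, 50)          # placeholder data for one parameter value
f = 1.0 + 2.0 / (r + 0.3) + 0.1 * r

coeffs, _ = curve_fit(model, r, f)    # fit c0..c3 for this parameter value
print(coeffs)
# Repeat for many parameter values, then fit how c0..c3 vary with the parameter.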
A blind curve fit without an underlying model is a dangerous thing.
You need an understanding of the physical model behind the data to create a successful fit. The reason is that if r is a distance and the best-fit curve uses, for example, r^0.4072, then a dimensioned quantity raised to a decimal power has no meaning, and it hides underlying assumptions, like some other length scale l not included in the model, where only the dimensionless quantity (r/l) would make sense to raise to a decimal power.
From a function analysis standpoint
These curves do not look like the result of any standard math function. Granted, I am not that familiar with Bessel functions, gamma functions, and Legendre polynomials, but none of the standard functions you find on a scientific calculator jumps out here.
If r is assumed to be dimensionless, then you try to match the asymptotic behavior as r -> 0 and as r -> ∞. That would be the baseline curve. To me it does not look hyperbolic, but rather close to 1/LN(1+r).
So change variables: let g(r) = 1/LN(1+r), plot f(r) against g(r), and see what that looks like. Then try another round of curve fitting on the new curves, and so on.
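A sketch of that change of variables in Python (NumPy and matplotlib assumed; the arrays r and f below are placeholders for your data):

import numpy as np
import matplotlib.pyplot as plt

r = np.linspace(0.1, 10, 200)        # placeholder: replace with your data
f = 1.0 / np.log1p(r) + 0.05 * r     # placeholder curve, not the real f(r)

g = 1.0 / np.log1p(r)                # g(r) = 1/ln(1+r)
plt.plot(g, f, ".")                  # if this looks like a straight line, f ≈ a*g + b
plt.xlabel("g(r) = 1/ln(1+r)")
plt.ylabel("f(r)")
plt.show()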
Nobody can answer this question
Nobody else can effectively answer this question but you, because a) you have the data, and b) you need to make assumptions about which regions are important and what deviation is acceptable.

Finding optimal solution to multivariable function with non-negligible solution time?

So I have this problem where I have to find the distribution that, when passed through a function, best matches a known surface. I have written a script that creates the distribution given some parameters and spits out a metric comparing the resulting surface to the known one, but this script takes a non-negligible time, so I can't just sweep a very large set of parameters to find the optimal set. I looked into the simplex method, and it seems to be on the right path, but it's not quite what I need, because I don't exactly have a set of linear equations and don't know the constraints on the parameters; I just have one method that gives a single output (and that's all). Can anyone point me in the right direction on how to solve this problem? Thanks!
To quickly go over my process/problem again: I have a set of parameters (two at this point, but it will be expanded to more later) that defines a distribution. This distribution is used to create a surface, which is compared to a known surface, and an error metric is produced. I want to find the optimal set of parameters, but cannot run through an arbitrarily large number of parameter combinations due to the time constraint.
One situation consistent with what you have asked is a model in which a reasonably tractable probability distribution generates an unknown value. This unknown value goes through a complex and not mathematically nice process and generates an observation. Your surface corresponds to the observed probability distribution of the observations. You would be happy finding the parameters that give a good least-squares fit between the theoretical and real-life surface distributions.
One approximation for the fitting process is to compute a grid of values in the output space of the probability distribution. Each set of parameters gives you a probability for each point on this grid. The not-nice process maps each grid point to the nearest grid point in the space of the surface. The least-squares fit is then a quadratic in the probabilities calculated for the first grid, because the probability calculated for a grid point on the surface is the sum of the probabilities calculated for the values in the first grid that map closer to that point than to any other point on the surface. This means it has first (and even second) derivatives that you can calculate. If your probability distribution is nice enough, you can use the chain rule to calculate derivatives of the least-squares fit with respect to the initial parameters. You can then use optimization methods that require not just a means to calculate the function to be optimized but also its derivatives, and these are generally more efficient than optimization methods which require only function values, such as Nelder-Mead or the Torczon simplex. See e.g. http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/optim/package-summary.html.
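If derivatives are out of reach because the objective is a black box, a derivative-free method is the fallback. A minimal sketch with SciPy (my choice of library; the answer's link is to Apache Commons Math), where expensive_error() stands in for the poster's script, parameters in, one error metric out:

import numpy as np
from scipy.optimize import minimize

def expensive_error(params):
    # placeholder for the poster's script: parameters in, one error metric out
    a, b = params
    return (a - 1.2) ** 2 + (b + 0.7) ** 2

result = minimize(expensive_error, x0=np.array([0.0, 0.0]),
                  method="Nelder-Mead")   # derivative-free simplex search
print(result.x)   # best parameters found within the evaluation budget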
Another possible approach is the EM algorithm, where EM stands for Expectation-Maximization. It can be used to find maximum-likelihood fits in cases where the problem would be easy if you could see some hidden state that you cannot actually observe. In this case, the output produced by the initial distribution might be such a hidden state. One starting point is http://www-prima.imag.fr/jlc/Courses/2002/ENSI2.RNRF/EM-tutorial.pdf.
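If EM is unfamiliar, here is a self-contained toy illustration (my example, not tied to the poster's surfaces): fitting a two-component Gaussian mixture in 1D, where the hidden state is which component produced each sample:

import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 200)])

mu = np.array([-1.0, 1.0])        # initial guesses
sigma = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibility of each component for each data point
    pdf = np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    resp = w * pdf
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the responsibilities
    n = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / n
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / n)
    w = n / len(data)

print(mu, sigma, w)   # the means should land near -2 and 3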

Rotational invariant hash function for binary matrix

I am looking for a hash function that assigns a scalar value to a small binary matrix (7x7). I want it to give different values for two different matrices unless one matrix is a 90-, 180-, or 270-degree rotation of the other.
Do you have any suggestions on how I could do this? I expected to find a method in image processing, since this is equivalent to hashing a 7x7 binary image, but I could not find anything.
Converting my comment to an answer:
If you're trying to find a way to test if two objects are equivalent after doing some sort of transformation, it often helps to pick a single "canonical form" for the object that can easily be computed. In your case, it would probably help a lot to pick a single rotation of the matrix as the "canonical" rotation and compare things that way. One simple option would be to pick the lexicographically first matrix out of all the rotations possible, then use that.
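For instance, a minimal sketch in Python/NumPy (names are mine; any hash of the canonical bytes would do):

import numpy as np

def rotation_invariant_hash(m):
    # m is a square binary NumPy array, e.g. 7x7
    rotations = [np.rot90(m, k) for k in range(4)]
    canonical = min(r.tobytes() for r in rotations)   # lexicographically first rotation
    return hash(canonical)

a = np.random.randint(0, 2, (7, 7))
assert rotation_invariant_hash(a) == rotation_invariant_hash(np.rot90(a))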

Theory on how to find the equation of a curve given a variable number of data points

I have recently started working on a project. One of the problems I ran into was converting changing accelerations into velocity. Accelerations at different points in time are provided by sensors. If you get the equation of these data points, the derivative of that equation at a certain time (x) will be the velocity.
I know how to do this on a computer, but how would I get the equation to start with? I have searched around but have not found any existing programs that can form an equation from a set of points. In the past, I created a neural-net algorithm to form an equation, but it takes an incredibly long time to run.
If someone can link me a program or explain the process of doing this, that would be fantastic.
Sorry if this is in the wrong forum. I would post it in math, but a programming background is needed to know what a computer can do quickly.
This started out as a comment but ended up being too big.
Just to make sure you're familiar with the terminology...
Differentiation takes a function f(t) and spits out a new function f'(t) that tells you how f(t) changes with time (i.e. f'(t) gives the slope of f(t) at time t). This takes you from displacement to velocity or from velocity to acceleration.
Integration takes a function f(t) and spits out a new function F(t) which measures the area under f(t) from the beginning of time up until a given point t. What's not obvious at first is that integration is actually the reverse of differentiation, a fact called the Fundamental Theorem of Calculus. So integration takes you from acceleration to velocity, or from velocity to displacement.
You don't need to understand the rules of calculus to do numerical integration. The simplest (and most naive) method for integrating a function numerically is to approximate the area by dividing it into small slices between time points and summing the areas of rectangles. This approximating sum is called a Riemann sum.
This tends to overshoot and undershoot parts of the function. A more accurate but still very simple method is the trapezoid rule, which also approximates the function with a series of slices, except the tops of the slices are straight lines between the function values rather than constant values.
More complicated still, but a better approximation, is Simpson's rule, which approximates the function with parabolas between time points.
You can think of each of these methods as getting a better approximation of the integral because they each use more information about the function. The first method uses just one data point per area (a constant flat line), the second method uses two data points per area (a straight line), and the third method uses three data points per area (a parabola).
You could read up on the math behind these methods here or in the first page of this pdf.
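To see the three rules side by side, here is a small numerical check (my example, plain NumPy): integrate a(t) = cos(t) over [0, pi/2], where the exact answer is sin(pi/2) - sin(0) = 1:

import numpy as np

t = np.linspace(0, np.pi / 2, 9)   # 9 samples -> 8 slices (even count, as Simpson's rule needs)
a = np.cos(t)
dt = t[1] - t[0]

riemann = np.sum(a[:-1]) * dt                    # left Riemann sum (rectangles)
trapezoid = np.sum((a[:-1] + a[1:]) / 2) * dt    # trapezoid rule (straight tops)
simpson = dt / 3 * (a[0] + a[-1] + 4 * a[1:-1:2].sum() + 2 * a[2:-2:2].sum())  # parabolas

print(riemann, trapezoid, simpson)   # each is closer to 1 than the one before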
I agree with the comments that numerical integration is probably what you want. In case you still want a function going through your data, let me further argue against doing that.
It's usually a bad idea to find a curve that goes exactly through some given points. In almost any applied math context you have to accept that there is a little noise in the inputs, and a curve going exactly through the points may be very sensitive to that noise. This can produce garbage outputs. Finding a curve going exactly through a set of points is asking for overfitting: you get a function that memorizes the data rather than understanding it, and it does not generalize.
For example, take the points (0,0), (1,1), (2,4), (3,9), (4,16), (5,25), (6,36). These are seven points on y=x^2, which is fine. The value of x^2 at x=-1 is 1. Now what happens if you replace (3,9) with (2.9,9.1)? There is a sixth order polynomial passing through all 7 points,
4.66329x - 8.87063x^2 + 7.2281x^3 - 2.35108x^4 + 0.349747x^5 - 0.0194304x^6.
The value of this at x=-1 is -23.4823, very far from 1. While the curve looks ok between 0 and 2, in other examples you can see large oscillations between the data points.
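You can reproduce this with NumPy (a quick check; values may differ slightly in the last digits from those quoted above):

import numpy as np

x = np.array([0, 1, 2, 2.9, 4, 5, 6])
y = np.array([0, 1, 4, 9.1, 16, 25, 36])

coeffs = np.polyfit(x, y, deg=6)   # the degree-6 polynomial through all 7 points
print(np.polyval(coeffs, -1.0))    # about -23.48, nowhere near (-1)^2 = 1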
Once you accept that you want an approximation rather than a curve going exactly through the points, you have what is known as a regression problem. There are many types of regression. Typically, you choose a set of functions and a way to measure how well a function approximates the data. If you use a simple set of functions like lines (linear regression), you just find the best fit. If you use a more complicated family of functions, you should use regularization to penalize overly complicated functions, such as high-degree polynomials with large coefficients that memorize the data. If you either use a simple family or regularization, the function tends not to change much when you add or withhold a few data points, which indicates that it captures a meaningful trend in the data.
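To see regularization in action, here is a minimal sketch (my example, NumPy only) of a ridge-penalized polynomial fit to the seven points above; the penalty suppresses the oscillatory high-degree component:

import numpy as np

def ridge_polyfit(x, y, degree, lam):
    V = np.vander(x, degree + 1)               # polynomial features, highest power first
    A = V.T @ V + lam * np.eye(degree + 1)     # L2 penalty on the coefficients
    return np.linalg.solve(A, V.T @ y)

x = np.array([0, 1, 2, 2.9, 4, 5, 6], dtype=float)
y = np.array([0, 1, 4, 9.1, 16, 25, 36])

coeffs = ridge_polyfit(x, y, degree=6, lam=1e-2)
print(np.polyval(coeffs, -1.0))   # much closer to 1 than the exact interpolant's -23.48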
Unfortunately, integrating accelerometer data to get velocity is a numerically unstable problem. For most applications, your error will diverge far too soon to get results of any practical value.
Recall that velocity is the integral of acceleration:

v(t) = v(0) + integral of a(τ) dτ from 0 to t.

So, with samples at times t_0 < t_1 < ... < t_n, the estimate is built up one sampling interval at a time:

v(t_n) = v(0) + sum over i of (integral of a(τ) dτ from t_(i-1) to t_i).

However well you fit a function to your accelerometer data, you will still essentially be doing a piecewise interpolation of the underlying acceleration function:

v(t_n) ≈ v(0) + sum over i of (a_i + ε_i)·Δt_i,

where the error terms ε_i from each integration will add!
Typically you will see wildly inaccurate results after just a few seconds.
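To see the drift concretely, here is a quick simulation (my example, plain NumPy): integrate noisy samples of a known acceleration and compare with the exact velocity:

import numpy as np

dt = 0.01
t = np.arange(0, 10, dt)
a_true = np.sin(t)                                             # "true" acceleration
a_meas = a_true + np.random.normal(scale=0.05, size=t.shape)   # add sensor noise

v_true = 1 - np.cos(t)                 # exact integral of sin(t) with v(0) = 0
v_est = np.cumsum(a_meas) * dt         # naive numerical integration of the samples

print(np.abs(v_est - v_true).max())    # the error is a random walk that grows with time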

Plotting Issue - Comparing 3 matrices where one is a sparse matrix

I need to compare 3 matrices of size 216x216 (a data correlation matrix, events, etc.). Can someone suggest a way to plot these in Matlab, or in some other plotting tool, that makes it easy to visualise and compare them? Would a 3D mesh plot be useful? I thought mesh would be good, but I'd like other opinions too.
Thanks in advance,
Sparse matrices
You can use the spy() method to visualize a "sparsity pattern", as Matlab calls it. It plots a dot (or any other marker) where the matrix element is non-zero.
spy() can also be used to visualize non-sparse matrices where a lot of entries are close to zero - just threshold the matrix first:
a=eye(50)+0.01*randn(50);
spy(a) % Not very useful
b=a; b(b<0.02)=0;
figure, spy(b) % Much more useful
More generally, you can apply upper and lower thresholds to visualize the location of matrix entries whose value is within a specific range.
Correlation
It may be useful to just display the matrix using imagesc(). This may give you an idea of the degree of correlation in your data; e.g., an uncorrelated signal will have a correlation matrix with dominant diagonal elements, which will be clearly visible. I find Matlab's default color map distracting, so I usually do something like
colormap(gray);imagesc(a);
Miscellaneous
Of course, there's a whole host of non-visual comparisons you can make: various norms via norm(), std(), spectral analysis using eig() for square matrices, or svd() more generally. You can compare eigenvalue magnitudes or compare the eigenvectors. This may be very useful or complete garbage, depending on what your data is.
To conclude (for now): if you say more about what your matrices specifically contain, you may get more useful suggestions.
