Probability normal distribution with an equation P(|x-3| > 5) for X~N(2,6)

I'm confused about how to go about solving this problem. I don't quite understand what |x-3| represents in this case, and how it impacts the outcome when the variable is normally distributed. What would be the steps required to solve this?

The bars denote absolute value, so P(|X-3|>5) is the probability that X lies more than 5 away from the point x=3. In other words, out of the whole (-infinity, +infinity) range, the interval of half-width 5 centered at x=3, that is [-2, 8], is excluded.
So X lies in one of the ranges (-infinity, -2) or (8, +infinity).
Given the N(x;2,6) distribution, the probability is the sum of two integrals
P(|X-3|>5) = S[-infinity...-2] N(x;2,6) dx + S[8...+infinity] N(x;2,6) dx
where S denotes integration, or, equivalently,
P(|X-3|>5) = 1 - S[-2...8] N(x;2,6) dx
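The integral above can be evaluated with the normal CDF. The following is a minimal Python sketch, not part of the original answer. The question does not say whether the 6 in N(2,6) is the variance or the standard deviation, so both readings are computed; the function names normal_cdf and prob_outside are made up for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_outside(lo, hi, mean, sd):
    """P(X < lo or X > hi) = 1 - integral of the density over [lo, hi]."""
    return 1.0 - (normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd))

# |X - 3| > 5 means X < -2 or X > 8.
# Assumption 1: the 6 in N(2,6) is the variance, so sd = sqrt(6).
p_if_variance = prob_outside(-2.0, 8.0, 2.0, math.sqrt(6.0))
# Assumption 2: the 6 in N(2,6) is the standard deviation.
p_if_sd = prob_outside(-2.0, 8.0, 2.0, 6.0)

print("6 read as variance:          ", p_if_variance)
print("6 read as standard deviation:", p_if_sd)
```

Which of the two numbers applies depends on the convention used in your course or textbook.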

How to find least square fit for two combined functions

I have a curvefit problem
I have two functions
y = ax+b
y = ax^2+bx-2.3
I have one set of data each for the above functions
I need to find a and b using least square method combining both the functions
I was using fminsearch function to minimize the sum of squares of errors of these two functions.
I am unable to use this method in lsqcurvefit
Kindly help me
Regards
Ram
I think you'll need to worry less about which library routine to use and more about the math. Assuming you mean vertical offset least squares, then you'll want
D = sum_{i=1..m}(y_Li - a x_Li - b)^2 + sum_{j=1..n}(y_Pj - a x_Pj^2 - b x_Pj + 2.3)^2
where there are m points (x_Li, y_Li) on the line and n points (x_Pj, y_Pj) on the parabola. Now find partial derivatives of D with respect to a and b. Setting them to zero provides two linear equations in 2 unknowns, a and b. Solve this linear system.
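As a rough illustration of the steps just described, here is a minimal Python sketch that accumulates the two normal equations and solves them by Cramer's rule. It assumes vertical-offset least squares with both data sets weighted equally; the function name fit_combined and the sample data are hypothetical.

```python
def fit_combined(line_pts, parab_pts, c=-2.3):
    """Fit a, b shared by y = a*x + b and y = a*x^2 + b*x + c.

    Each data point contributes one row (u, v) with target t so that the
    model reads t = a*u + b*v; setting dD/da = dD/db = 0 then gives a
    2x2 linear system in a and b.
    """
    saa = sab = sbb = ta = tb = 0.0
    for x, y in line_pts:        # line: row (x, 1), target y
        saa += x * x
        sab += x
        sbb += 1.0
        ta += x * y
        tb += y
    for x, y in parab_pts:       # parabola: row (x^2, x), target y - c
        u, v, t = x * x, x, y - c
        saa += u * u
        sab += u * v
        sbb += v * v
        ta += u * t
        tb += v * t
    det = saa * sbb - sab * sab
    if det == 0.0:
        raise ValueError("normal equations are singular")
    a = (ta * sbb - tb * sab) / det
    b = (saa * tb - sab * ta) / det
    return a, b

# Made-up noise-free data generated with a = 1.5, b = 0.7
line_pts = [(x, 1.5 * x + 0.7) for x in (0.0, 1.0, 2.0, 3.0)]
parab_pts = [(x, 1.5 * x * x + 0.7 * x - 2.3) for x in (0.5, 1.5, 2.5)]
a, b = fit_combined(line_pts, parab_pts)
print(a, b)
```

Because the problem is linear in a and b, no iterative minimizer is needed here.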
y = ax+b
y = ax^2+bx-2.3
To avoid confusing the y of the first equation with the y of the second equation, we use distinct notations:
u = ax+b
v = ax^2+bx+c, with c = -2.3
The combined linear regression for the two functions is shown on the attached page:
HINT: If you want to derive the matrix equation yourself, follow Gene's answer.

how to define the probability distribution

I have small question and I will be very happy if you can give me a solution or any idea for solution of probability distribution of the following idea:
I have a random variable x which follows an exponential distribution with parameter lambda1, and one more variable y which follows an exponential distribution with parameter lambda2. z is a discrete value. How can I define the probability distribution of k in the following formula?
k=z-x-y
Thank you so much
OK, let's start by rewriting the formula a bit:
k = z-x-y = -(x+y) + z = -(x + y + (-z))
The part in the parentheses looks manageable. Let's start with x+y. For random variables x and y (assumed independent), the PDF of their sum is the convolution of their PDFs.
q = x+y
PDF(q) = S PDFx(q-t) PDFy(t) dt
where S denotes integration. For exponential x and y the convolution integral is known in closed form: when the lambdas differ it is the two-rate sum expression given in the Exponential distribution wiki, and when they are equal it is Gamma(2,lambda), Gamma being the Gamma distribution.
If z is some constant value, then -z can be expressed as a continuous RV with PDF
PDF(t) = 𝛿(t+z)
where 𝛿 is the Dirac delta function, and we take into account that the peak is at -z, as expected. It is normalized, so its integral over t is equal to 1. This is easily extended to a discrete RV: use a sum of 𝛿-functions at the possible values, multiplied by probabilities that sum to 1.
Again, we have a sum of two RVs with known PDFs, and the solution is a convolution, which is easy to compute thanks to the sifting property of the 𝛿-function. So the final PDF of x + y + (-z) would be
PDF(q+z) dq
where PDF is the density of x+y, taken from the sum expression in the Exponential distribution wiki, or from the Gamma distribution in the Gamma wiki.
You just have to negate the argument to get the density of k, and that's it.
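To make the result concrete, here is a minimal Python sketch for the case where z is a fixed constant and x, y are independent. The closed form for different rates is the standard two-rate (hypoexponential) density; the function names pdf_sum and pdf_k are made up for this illustration.

```python
import math

def pdf_sum(q, lam1, lam2):
    """Density of q = x + y for independent exponentials with rates lam1, lam2."""
    if q < 0.0:
        return 0.0
    if math.isclose(lam1, lam2, rel_tol=1e-9):
        # Equal rates: Gamma distribution with shape 2 and rate lam1
        return lam1 * lam1 * q * math.exp(-lam1 * q)
    # Different rates: convolution of the two exponential densities
    return lam1 * lam2 / (lam2 - lam1) * (math.exp(-lam1 * q) - math.exp(-lam2 * q))

def pdf_k(k, z, lam1, lam2):
    """Density of k = z - x - y for constant z: shift by z and negate the argument."""
    return pdf_sum(z - k, lam1, lam2)

# k can never exceed z, so the density is zero to the right of z.
print(pdf_k(4.0, 5.0, 1.0, 2.0))
print(pdf_k(6.0, 5.0, 1.0, 2.0))
```

For a discrete random z, the density of k would be the probability-weighted sum of pdf_k over the possible values of z.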

Implementing the square root method through successive approximation

Determining the square root through successive approximation is implemented using the following algorithm:
1. Begin by guessing that the square root is x / 2. Call that guess g.
2. The actual square root must lie between g and x/g. At each step in the successive approximation, generate a new guess by averaging g and x/g.
3. Repeat step 2 until the values of g and x/g are as close together as the precision of the hardware allows. In Java, the best way to check for this condition is to test whether the average is equal to either of the values used to generate it.
What really confuses me is the last statement of step 3. I interpreted it as follows:
private double sqrt(double x) {
    double g = x / 2;
    while (true) {
        double average = (g + x/g) / 2;
        if (average == g || average == x/g) break;
        g = average;
    }
    return g;
}
This seems to just cause an infinite loop. I am following the algorithm exactly, if the average equals either g or x/g (the two values used to generate it) then we have our answer ?
Why would anyone ever use that approach, when they could simply use the formulas (2n)^2 = 4n^2 and (n + 1)^2 = n^2 + 2n + 1 to populate each bit in the mantissa, and divide the exponent by two, multiplying the mantissa by two iff the mod of the exponent with two equals 1?
To check if g and x/g are as close as the HW allows, look at the relative difference and compare it with the epsilon for your floating point format. If it is within a small integer multiple of epsilon, you are OK.
Relative difference of x and y, see https://en.wikipedia.org/wiki/Relative_change_and_difference
The epsilon for 32-bit IEEE floats is about 1.0e-7, as in one of the other answers here, but that answer used the absolute rather than the relative difference.
In practice, that means something like:
Math.abs(g-x/g)/Math.max(Math.abs(g),Math.abs(x/g)) < 3.0e-7
Never compare floating point values for equality. The result is not reliable.
Use an epsilon like so:
if(Math.abs(average-g) < 1e-7 || Math.abs(average-x/g) < 1e-7)
You can change the epsilon value to be whatever you need. Probably best is something related to the original x.
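The relative-difference test suggested above can be sketched as follows. This is an illustrative Python version, not the Java from the question: Python floats are IEEE doubles, so the epsilon comes from sys.float_info.epsilon, and the factor 4 and the iteration cap are arbitrary safety choices. The name sqrt_newton is made up.

```python
import sys

def sqrt_newton(x, max_iter=2000):
    """Successive approximation of sqrt(x) with a relative stopping test."""
    if x < 0.0:
        raise ValueError("square root of a negative number")
    if x == 0.0:
        return 0.0            # avoids the 0/0 that would never converge
    g = x / 2.0
    if g == 0.0:              # x was the smallest subnormal; x/2 underflowed
        g = x
    tol = 4.0 * sys.float_info.epsilon   # small integer multiple of epsilon
    for _ in range(max_iter):
        q = x / g
        # Stop when g and x/g agree to within a few units of relative precision
        if abs(g - q) <= tol * max(abs(g), abs(q)):
            break
        g = (g + q) / 2.0
    return g

print(sqrt_newton(2.0))
```

The iteration cap is only a guard; starting from x / 2 the guess roughly halves each step until it gets near the root, so very large or very small inputs need several hundred iterations.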

least square approximation: how this matrix calculation equation is deducted?

I am reading a book "kernel methods for pattern analysis". For the least square approximation, it is to minimise the sum of the square of the discrepancies:
e=y-Xw
Therefore it is to minimize
L(w,S)=(y-Xw)'(y-Xw)
Leading to
$$ w=(X'X)^{-1} X'y $$
I understand until now.
But how does it lead to this? What is alpha exactly? Is it constant?
The same way you would solve for the minima (or maxima) of a quadratic function in only one variable: By solving for the zero in the first derivative:
diff((y-Xw)' (y-Xw), w) = 0
(only that this "0" is a row vector with as many elements as w.)
after performing the differentiation we get the following. (note that ' is the transpose, not a differentiation operator.)
-2y'X + 2w'X'X = 0
we transpose the whole expression (so 0 is a column vector) and divide by two:
-X'y + X'Xw = 0
and finally solve for w:
w = (X'X)^-1 X'y
Regarding your second question: alpha is simply the whole expression X(X'X)^-2X'y. The point is that w can be written as the product of X' and some vector alpha, w = X'alpha, which means that w is a linear combination of the columns of X' (rows of X).
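A quick numerical check of both statements, written as a small pure-Python sketch with made-up data (a two-column X, so a hand-written 2x2 inverse is enough). The helper names transpose, matmul and inv2 are hypothetical, and this is only a sanity check, not the book's derivation.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

# Made-up data: 4 observations, 2 features (first column is a constant 1)
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [[1.0], [2.9], [5.1], [7.0]]

Xt = transpose(X)
G_inv = inv2(matmul(Xt, X))               # (X'X)^-1
w = matmul(G_inv, matmul(Xt, y))          # w = (X'X)^-1 X'y
alpha = matmul(X, matmul(G_inv, w))       # alpha = X (X'X)^-2 X'y
w_from_alpha = matmul(Xt, alpha)          # X' alpha, should reproduce w

# Gradient condition -X'y + X'Xw = 0, written as X'(y - Xw)
residual = [[yi[0] - pi[0]] for yi, pi in zip(y, matmul(X, w))]
gradient = matmul(Xt, residual)

print(w, w_from_alpha, gradient)
```

Note that alpha has one entry per data point, so it depends on the data and is not a fixed constant.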

matlab: optimum amount of points for linear fit

I want to make a linear fit to a few data points, as shown on the image. Since I know the intercept (in this case say 0.05), I want to fit only points which are in the linear region with this particular intercept. In this case it will be, let's say, points 5:22 (but not 22:30).
I'm looking for the simple algorithm to determine this optimal amount of points, based on... hmm, that's the question... R^2? Any Ideas how to do it?
I was thinking about probing R^2 for fits using points 1 to 2:30, 2 to 3:30, and so on, but I don't really know how to enclose it into clear and simple function. For fits with fixed intercept I'm using polyfit0 (http://www.mathworks.com/matlabcentral/fileexchange/272-polyfit0-m) . Thanks for any suggestions!
EDIT:
sample data:
intercept = 0.043;
x = 0.01:0.01:0.3;
y = [0.0530642513911393,0.0600786706929529,0.0673485248329648,0.0794662409166333,0.0895915873196170,0.103837395346484,0.107224784565365,0.120300492775786,0.126318699218730,0.141508831492330,0.147135757370947,0.161734674733680,0.170982455701681,0.191799936622712,0.192312642057298,0.204771365716483,0.222689541632988,0.242582251060963,0.252582727297656,0.267390860166283,0.282890010610515,0.292381165948577,0.307990544720676,0.314264952297699,0.332344368808024,0.355781519885611,0.373277721489254,0.387722683944356,0.413648156978284,0.446500064130389;];
What you have here is a rather difficult problem to find a general solution of.
One approach would be to compute all the slopes/intercepts between all consecutive pairs of points, and then do cluster analysis on the intercepts:
slopes = diff(y)./diff(x);
intercepts = y(1:end-1) - slopes.*x(1:end-1);
idx = kmeans(intercepts(:), 3); % kmeans expects one observation per row
x([idx; 3] == 2) % the points with the intercepts closest to the linear one (cluster labels are arbitrary, so check which label that is)
This requires the statistics toolbox (for kmeans). This is the best of all methods I tried, although the range of points found this way might have a few small holes in it; e.g., when the slopes of two points in the start and end range lie close to the slope of the line, these points will be detected as belonging to the line. This (and other factors) will require a bit more post-processing of the solution found this way.
Another approach (which I failed to construct successfully) is to do a linear fit in a loop, each time increasing the range of points from some point in the middle towards both of the endpoints, and see if the sum of the squared error remains small. This I gave up very quickly, because defining what "small" is is very subjective and must be done in some heuristic way.
I tried a more systematic and robust approach of the above:
function test
    %% example data
    slope = 2;
    intercept = 1.5;
    x = linspace(0.1, 5, 100).';
    y = slope*x + intercept;
    y(1:12) = log(x(1:12)) + y(12)-log(x(12));
    y(74:100) = y(74:100) + (x(74:100)-x(74)).^8;
    y = y + 0.2*randn(size(y));
    %% simple algorithm
    [X,fn] = fminsearch(@(ii)P(ii, x,y,intercept), [0.5 0.5])
    [~,inds] = P(X, x,y,intercept)
end
function [C, inds] = P(ii, x,y,intercept)
    % ii represents fraction of range from center to end,
    % So ii lies between 0 and 1.
    N = numel(x);
    n = round(N/2);
    ii = round(ii*n);
    inds = min(max(1, n+(-ii(1):ii(2))), N);
    % Solve linear system with fixed intercept
    A = x(inds);
    b = y(inds) - intercept;
    % and return the sum of squared errors, divided by
    % the number of points included in the set. This
    % last step is required to prevent fminsearch from
    % reducing the set to 1 point (= minimum possible
    % squared error).
    C = sum(((A\b)*A - b).^2)/numel(inds);
end
which only finds a rough approximation to the desired indices (12 and 74 in this example).
When fminsearch is run a few dozen times with random starting values (really just rand(1,2)), it gets more reliable, but I still wouldn't bet my life on it.
If you have the statistics toolbox, use the kmeans option.
Depending on the number of data values, I would split the data into a relative small number of overlapping segments, and for each segment calculate the linear fit, or rather the 1-st order coefficient, (remember you know the intercept, which will be same for all segments).
Then, for each coefficient calculate the MSE between this hypothetical line and entire dataset, choosing the coefficient which yields the smallest MSE.
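The segment idea from this last answer could be sketched roughly as follows. This is a Python illustration rather than MATLAB; the segment length, the step between segments and the function names are assumptions made for the example and are not given in the answer.

```python
def slope_fixed_intercept(xs, ys, intercept):
    """Least-squares slope of y = m*x + intercept with the intercept held fixed."""
    num = sum(x * (y - intercept) for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den

def best_segment_slope(xs, ys, intercept, seg_len, step):
    """Fit each overlapping segment, score it by MSE against the whole data set."""
    best = None
    for start in range(0, len(xs) - seg_len + 1, step):
        seg = slice(start, start + seg_len)
        m = slope_fixed_intercept(xs[seg], ys[seg], intercept)
        mse = sum((m * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        if best is None or mse < best[1]:
            best = (m, mse, start)
    return best  # (slope, mse, index of first point in the winning segment)

# Made-up, perfectly linear data: y = 2x + 1
xs = [0.1 * i for i in range(1, 31)]
ys = [2.0 * x + 1.0 for x in xs]
m, mse, start = best_segment_slope(xs, ys, 1.0, seg_len=10, step=5)
print(m, mse, start)
```

With step smaller than seg_len the segments overlap, as the answer suggests; how long the segments should be still has to be chosen by hand.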