EstimatedDistribution for StableDistribution - wolfram-mathematica

I have 2 sets of data:
d1= {0.119894,0.430666,0.0831885,0.0319174,0.120422,0.113005,0.396407,0.286316,0.0846212,0.0380193,0.047136,0.0362305,0.0445161,0.142403,0.0540607,0.133119,0.10831,0.173586,0.162465,0.0704632,0.0856676,0.086322,0.31334,0.210488,0.165907,0.119317,0.0995894,0.103821,0.135736,0.245069,0.0814167,0.142331,0.321499,0.0576824,0.0535766,0.0546975,0.121395,0.0608112,0.0606295,0.133289,0.0468469,0.0501325,0.0641351,0.0846396,0.317252,0.0779754,0.105217,0.0749865,0.302625,0.301864,0.0929992,0.12178,0.279253,0.245539,0.198353,0.107202,0.17784,0.145572,0.055006,0.0770127,0.0861758,0.189966,0.21403,0.0834313,0.206845,0.2087,0.263422,0.0767717,0.162445,0.0542824,0.0553086,0.141381,0.052898,0.0945407,0.0776741,0.0367623,0.0565677,0.166219,0.035447,0.120121,0.0418321,0.11264,0.0540176,0.120358,0.074417,0.242225,0.398622,0.308373,0.15192,0.278717};
d2={0.170719,0.099203,0.0539713,0.15749,0.150455,0.142714,0.0705496,0.0690684,0.0630756,0.0372223,0.0885515,0.0305229,0.0869673,0.0426363,0.0504665,0.0371966,0.0766164,0.0402321,0.0334813,0.0489499,0.0753463,0.0942363,0.0786223,0.335095,0.0706324,0.0764047,0.0682716,0.0699429,0.0355438,0.0755698,0.10206,0.199187,0.0560379,0.0342713,0.0500202,0.0558365,0.0624332,0.0418887,0.0531662,0.0499419,0.0273659,0.0228881,0.0893776,0.0643183,0.0171277,0.0373337,0.0457631,0.0764322,0.0963383,0.0633643,0.107952,0.0570244,0.19336,0.0428824,0.0629954,0.120787,0.0924894,0.0562895,0.125588,0.116919,0.196895,0.264337,0.0787541,0.318374,0.193144,0.147134,0.0456675,0.0419496,0.057378,0.0577714,0.0706519,0.0410366,0.0716635,0.0547774,0.0157382,0.030444,0.0769898,0.0121786,0.0586156,0.0314843,0.0942514,0.1627,0.0781299,0.148406,0.423559,0.276206,0.0708934,0.0812794,0.159947};
Now I want to find an estimated distribution using StableDistribution[].
For the first data set I do the following:
dist1 = EstimatedDistribution[d1, StableDistribution[alpha, beta, mu, sigma]]
I get a message and output
FindMaximum::sdprec: Line search unable to find a sufficient increase in the function value with MachinePrecision digit precision. >>
StableDistribution[1,0.863446,1.,-0.0781627,0.0345779]
The output looks OK (not a great fit for the data, but not too bad), but what does the message imply about the output?
For the second data set, d2
dist2 = EstimatedDistribution[d2, StableDistribution[alpha, beta, mu, sigma]]
I get a different message.
Optimization`ModifiedCholeskyDecomposition::herm: The matrix {{2.76856*10^157,-1.75574*10^159,-1.84519*10^157,-2.26892*10^157},{7.88598*10^159,0.,6.41507*10^159,7.88598*10^159},{1.82386*10^157,6.41507*10^159,1.13495*10^157,1.82386*10^157},{-2.26892*10^157,-1.75574*10^159,-1.84519*10^157,1.68961*10^157}} is not Hermitian or real and symmetric.
and output:
StableDistribution[1,0.834688,1.,-0.0101189,0.0181306]
So, I've got a couple of questions. Can anyone explain these messages and their relevance? It looks to me that Mathematica tries a number of different ways to estimate the distribution and some just don't work very well.
Thx.
J.

To make parameter estimation for StableDistribution efficient, a multivariate interpolation of the PDF as a function of (alpha, beta, x) is constructed, and the resulting interpolation is used for estimation. Polynomial interpolation exhibits small-scale oscillations, which can throw off the maximization routines, so when working with stable estimation it is better to use PrecisionGoal -> 3, AccuracyGoal -> 3.
Doing this does not get rid of your messages, but it does speed up estimation, which matters for larger problems.
Since your data sets are small, the statistical uncertainties of the estimators are large anyway.
The first message is benign, but the second is probably a bug, since the log-likelihood
of the estimated distribution on data is too low.
As an aside, it seems that StableDistribution is not a very good fit for your data:
In[44]:= LogLikelihood[
EstimatedDistribution[d1, StableDistribution[a, b, c, d]],
d1] // Quiet
Out[44]= 101.926
In[45]:= LogLikelihood[
EstimatedDistribution[d1, HyperbolicDistribution[a, b, c, d]],
d1] // Quiet
Out[45]= 111.847
In[46]:= LogLikelihood[
EstimatedDistribution[d2, StableDistribution[a, b, c, d]],
d2] // Quiet
Out[46]= -10.2194
In[47]:= LogLikelihood[
EstimatedDistribution[d2, HyperbolicDistribution[a, b, c, d]],
d2] // Quiet
Out[47]= 143.04
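For comparison outside Mathematica (purely illustrative), the same kind of check, fitting candidate families by maximum likelihood and comparing log-likelihoods, can be sketched in Python with SciPy. Here levy_stable is SciPy's stable distribution (its MLE fit can be slow) and gamma is only a hypothetical stand-in for a second candidate family, since SciPy has no HyperbolicDistribution; the synthetic sample stands in for your d1:
import numpy as np
from scipy import stats
# Synthetic positive, right-skewed sample standing in for d1 (paste your own data instead).
data = stats.gamma(a=2.0, scale=0.07).rvs(size=90, random_state=0)
for name, dist in [("levy_stable", stats.levy_stable), ("gamma", stats.gamma)]:
    params = dist.fit(data)                    # maximum-likelihood parameter estimates
    loglik = dist.logpdf(data, *params).sum()  # analogue of LogLikelihood[...] above
    print(name, round(loglik, 3))
The family with the clearly higher log-likelihood is the better candidate, which is exactly the comparison made above between StableDistribution and HyperbolicDistribution.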

A general comment about numerical optimizer warnings: I had a similar issue using FindMaximum and getting "sufficient decrease" warnings, even though the output seemed fine. It came down to the fact that the default AccuracyGoal of 6 could not be guaranteed, but a smaller goal could be met without warnings.
You can globally turn the warning off with Off[FindMaximum::sdprec], or suppress it on a per-command basis with
Quiet[EstimatedDistribution[d1,StableDistribution[alpha, beta, mu, sigma]], FindMaximum::sdprec]

Related

MSE giving negative results in High-Level Synthesis

I am trying to calculate the mean squared error in Vitis HLS. I am using hls::pow(..., 2) and dividing by n, but all I receive is a negative value, for example -0.004. This does not make sense to me. Could anyone point out the problem or offer a proper explanation for this?
Besides, calculating the mean squared error using hls::pow does not give the same results as (a - b) * (a - b). For information, I am using ap_fixed<> types and not normal float or double precision.
Thanks in advance!
It sounds like an overflow and/or underflow issue, meaning that the values reach the sign bit and are interpreted as negative while actually just being very large.
Have you tried tuning the representation precision or the different saturation/rounding options for the fixed point class? This tuning will depend on the data you're processing.
For example, if you handle data that you know will range between -128.5 and 1023.4, you might need very few fractional bits, say 3 or 4, leaving the rest for the integer part (which might roughly be log2((1023+128)^2)).
Alternatively, if n is very large, you can try a moving average and calculate the mean in small "chunks" of length m < n.
p.s. Getting the absolute value of a - b and storing it into an ap_ufixed before the multiplication can already give you one extra bit, but it adds an instruction/operation/logic to the algorithm (which might not be a problem if the design is pipelined, but requires space if the size of ap_ufixed is very large).
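As an illustration of the chunking idea above, here is the arithmetic in plain Python (not HLS C++; the ap_fixed types, saturation settings and pragmas are deliberately omitted, so this only shows how the accumulators stay small):
def chunked_mse(a, b, m):
    # MSE computed as the mean of per-chunk means, so no accumulator ever holds
    # more than m squared errors at once. Assumes len(a) == len(b) and len(a) % m == 0.
    n = len(a)
    total = 0.0
    for start in range(0, n, m):
        chunk = 0.0
        for i in range(start, start + m):
            d = a[i] - b[i]
            chunk += d * d          # small partial sum
        total += chunk / m          # divided down before joining the running total
    return total / (n // m)
a = [float(i) for i in range(64)]
b = [x + 0.5 for x in a]
print(chunked_mse(a, b, 8))         # 0.25, same as the direct MSE
In the fixed-point version each partial accumulator then only needs enough integer bits for m squared errors rather than n, which keeps it away from the sign bit.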

How to set a convergence tolerance for a specific variable using Dymola?

So, I have a model of a tube with pressure loss, where the unknown is the mass flow rate. Normally, and in most models of this problem, the conservation equations are used to calculate the mass flow rate, but such models have lots of convergence issues (because of the blocked flow at the end of the tube, which results in an infinite pressure derivative at the end). See the figure below for a representation of the problem on the left and, on the right, a graph showing the infinite pressure derivative.
Because of that I'm using a model which is more robust, though it outputs not the mass flow rate but the tube length, which is already known. Therefore an iterative loop is needed to determine the mass flow rate. So I coded a function length that, given the tube geometry, mass flow rate and boundary conditions, outputs the calculated tube length, and wrote the equations like so:
parameter Real L;
Real m_flow;
...
equation
L = length(geometry, boundary, m_flow);
It simulates fine, but it takes ages... And it shouldn't, because the mass flow rate is rather insensitive to the tube length; e.g. if L=3 I could say that m_flow has converged if the output of length is within L ± 0.1. On the other hand, the default convergence tolerance of DASSL in Dymola is 0.0001, which is fine for all other variables but a major setback for my model here...
That being said, I'd like to know if there's a (hacky) way of setting a specific tolerance for L (from annotations or something). I was unable to find any solution online or in Dymola's user manual... So far I managed a workaround by making a second function which uses a Newton-Raphson method to determine the mass flow rate, something like:
function massflowrate
input geometry, boundary, m_flow_start, tolerance;
output m_flow;
protected
Real error, L, dL, dLdm_flow, Delta_m_flow;
algorithm
error = geometry.L;
m_flow = m_flow_start;
while error>tolerance loop
L = length(geometry, boundary, m_flow);
error = abs(geometry.L - L);
dL = length(geometry, boundary, m_flow*1.001);
dLdm_flow = (dL - L)/(0.001*m_flow); // finite-difference slope of length w.r.t. m_flow
Delta_m_flow = (geometry.L - L)/dLdm_flow;
m_flow = m_flow + Delta_m_flow;
end while;
end massflowrate;
And then I use it in the equations section:
parameter Real L;
Real m_flow;
...
equation
m_flow = massflowrate(geometry, boundary, delay(m_flow,10), tolerance);
Nevertheless, this solution is not without its problems: the real equations are very non-linear and, depending on the boundary conditions, the solver gets stuck in a never-ending loop... =/
PS: I'm sorry for the long post and the lack of an MWE; the real equations are very long and full of thermodynamics, which I believe would not be of any help. Be that as it may, if necessary I'm able to provide the real model.
Is the length function smooth? To me, it being non-smooth seems like a likely cause of problems, and the suggestions by @Phil might also be good ideas.
However, it should also be possible to do what you want as follows:
Real m_flow(nominal=1e9);
Explanation: The equations are normally solved to a certain tolerance in unknowns - in this case m_flow.
The tolerance for each variable is a relative/absolute tolerance taking the nominal value into account, and Dymola does not allow you to set different tolerances for different variables.
Thus the simple way to compute m_flow less accurately is by setting a high nominal value for it, since the error tolerance will be tol*(abs(m_flow)+abs(nominal(m_flow))) or something like that.
The downside is that it may be too inaccurate, e.g. causing additional events, or that the error is so random that the solver is still slowed down.
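To see concretely how the nominal value loosens the per-variable tolerance, here is the approximate formula quoted above in plain Python (the exact expression Dymola/DASSL uses may differ):
def allowed_error(x, nominal, tol=1e-4):
    # approximate per-variable error weight: tol*(abs(x) + abs(nominal))
    return tol * (abs(x) + abs(nominal))
print(allowed_error(0.5, 1.0))   # ~1.5e-4: tight when the nominal value is of order 1
print(allowed_error(0.5, 1e9))   # ~1e5: with nominal=1e9 the solver barely constrains m_flow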

Wolfram Alpha is able to integrate an indefinite integral but not a definite integral of the same function?

My question is regarding integration. I have a complex function that needs to be integrated, and it's a definite integral. The thing is, when I use Wolfram Alpha to integrate this function it gives me nothing, i.e. it's unable to compute it. However, if I remove the boundaries of integration, i.e. I make my integral an indefinite integral, Wolfram Alpha is able to compute it. Now my question is:
Can I take the result I obtained for the indefinite integral and just evaluate it at the boundary limits to compute my definite integral?
If my analysis is correct, then why wouldn't Wolfram Alpha give the result anyway?
Using Wolfram Alpha, if I try
integrate(exp(-v)/(1+sv^-1))
then I get the following result
-e^(-v)-e^s s Ei(-s-v)
While if I try
integrate(exp(-v)/(1+sv^-1),{v,1,+infinity})
I get nothing!
Since you tagged this Mathematica:
By specifying an appropriate assumption on s we get the expected result:
Integrate[Exp[-v]/(1 + s/v) , {v, 1, Infinity}, Assumptions -> {s > -1}]
--> 1/E + E^s s ExpIntegralEi[-1 - s]
I don't know if Alpha has similar syntax for adding assumptions.
Additionally, if we try a finite integral:
Integrate[Exp[-v]/(1 + s/v) , {v, 1, 2} ]
Mathematica returns a conditional expression that tells us the result is valid for s > -1 or s < -2. For some reason it doesn't give such a result for the infinite case, however.
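As a quick numeric sanity check of that closed form (done here in Python with SciPy rather than Mathematica, purely for illustration; scipy.special.expi is the exponential integral Ei), the integral from 1 to infinity agrees with 1/E + E^s s Ei(-1 - s) for a sample s > -1:
import numpy as np
from scipy.integrate import quad
from scipy.special import expi
s = 1.0                                                   # any sample value with s > -1
numeric, _ = quad(lambda v: np.exp(-v) / (1 + s / v), 1, np.inf)
closed = np.exp(-1) + np.exp(s) * s * expi(-1 - s)
print(numeric, closed)                                    # both about 0.235 for s = 1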
Yes, you can take the result obtained for the indefinite integral and use it to calculate the definite integral. When I try to run your request at Wolfram Alpha, it doesn't complete: it reports that the computation exceeded the standard computation time. This is because they need to offer some extra features for Wolfram Alpha Pro users to pay for; one of these features is extended computation time.
Wolfram Alpha is a business, and this is one of the ways it makes money. See for yourself, it'll offer you the pro service if you click the "Try again with additional computational time" on the bottom right.
If you break the definite integration down into first computing the indefinite integral (which Wolfram Alpha can handle) and then evaluating it at the boundary values and taking the difference, it seems to work fine.
This is mathematically correct, because that is how definite integrals are calculated.
However, your input has an sv in the denominator. Wolfram Alpha is taking it to mean s*v, which might not be what you meant: if sv is a variable on its own, I suggest you rename it to s or something else. The point is that if s is indeed a variable, then if you take a look at a plot of the integrand, there seems to be a ridge due to the -∞ term, so for some values of s that ridge might lie within your integration range, and then the integral can't be calculated, as Bill pointed out in his comment to your question.

Are Haskell List Comprehensions Inefficient?

I started doing Project Euler and got to problem number 9. Since I was using Project Euler to learn Haskell, I decided to use list comprehensions (as shown in Learn You a Haskell). I do that and GHCi takes a while to figure out the triplet, which I figured was normal because of the calculations involved. Now, at work yesterday (I don't work as a programmer professionally, yet) I was talking to a friend who knows VBA and he wanted to try to find the answer in VBA. I thought it would be a fun challenge as well, and I churned out some basic for loops and if statements, but what got me was that it was much faster than my Haskell was.
My question is: are Haskell's list comprehensions incredibly inefficient? At first I thought it was just because I was in GHC's interactive mode, but then I realized VBA is interpreted too.
Please note, I didn't post my code because it is an answer to a Project Euler problem. If it will help answer my question (as in, I'm doing something wrong) then I will gladly post the code.
[edit]
Here is my Haskell list comprehension:
[(a,b,c) | c <- [1..1000], b <- [1..c], a <- [1..b], a+b+c==1000, a^2+b^2==c^2]
I guess I could've lowered the range on c but is that what is really slowing it down?
There are two things you could be doing with this problem that could make your code slow. One is how you are trying values for a, b and c. If you loop through all possible values for a, b, c from 1 to 1000, you'll be spending a long time. To give a hint, you can make use of a+b+c=1000 if you rearrange it for c. The other is that if you only use a list comprehension, it will process every possible value for a, b and c. The problem tells you that there is only one unique set of numbers that satisfies the problem, so if you change your answer from this:
[ a * b * c | .... ]
to:
head [ a * b * c | ... ]
then Haskell's lazy evaluation means that it will stop after finding the first answer. This is the Haskell equivalent of breaking out of your VBA loop when you find the first answer. When I used both these tips, I had an answer that completed very quickly (under a second) in ghci.
Addendum: I missed at first the condition a < b < c. You can also make use of this in your list comprehensions; it is valid to say things along the lines of:
[(a, b) | b <- [1..100], a <- [1..b-1]]
Consider this simplified version of your list comprehension:
[(a,b,c) | a <- [1..1000], b <- [1..1000], c <- [1..1000]]
This will give all possible combinations of a, b, and c. It's kind of like saying, "how many ways can three one-thousand-sided dice land?" The answer is 1000*1000*1000 = 1,000,000,000 different combinations. If it took 0.001 seconds to generate each combination, it would therefore take 1,000,000 seconds (~11.5 days) to finish all combinations. (OK, 0.001 seconds is actually pretty slow for a computer, but you get the idea)
When you add predicates to your list comprehension, it still takes the same amount of time to compute; in fact, it takes longer since it needs to check the predicate for each of the 1 billion combinations it computes.
Now consider your comprehension. It looks like it should be much faster, right?
[(a,b,c) | c <- [1..1000], b <- [1..c], a <- [1..b], a+b+c==1000, a^2+b^2==c^2]
There are 1000 choices for c. How many are there for b and a? Well, the average choice for c is 500. For all choices of c, then, there are an average of 500 choices for b (since b can range from 1 to c). Likewise, for all choices of c and b, there are an average of 250 choices for a. That's very hand-wavy, but I'm fairly sure it's accurate. So 1000 choices for c * 1000/2 choices for b * 1000/4 choices for a = 1 billion / 8 ~= 125 million. It's 8x faster, but if you paid attention, you'll notice it's actually the same big-Oh complexity as the simplified version above. If we compared "simplified" vs "improved" versions of the same problem, but from [1..100000] instead of [1..1000], the "improved" would still only be 8x faster than the "simplified".
Don't get me wrong, 8x is a wonderful constant-factor speedup. But unless you want to wait a couple hours to get the solution, you'll need to get a better big-Oh.
As Neil noted, the way to reduce the complexity of this problem is, for a given b and c, to choose the a that satisfies a+b+c=1000. That way, you're not trying a bunch of a values that will fail. This will drop the big-Oh complexity; you'll only be considering approximately 1000 * 500 * 1 = 500,000 combinations, instead of ~125,000,000.
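To make the reduced search concrete, here is a sketch in Python rather than Haskell (just to show the counting; the target sum 1000 comes from the problem statement). For each (c, b) pair, a is derived directly from a + b + c = 1000, so only one candidate a is ever checked:
def first_triple(target):
    for c in range(1, target):
        for b in range(1, c):
            a = target - b - c                 # derived, not searched
            if 0 < a < b and a * a + b * b == c * c:
                return (a, b, c)
    return None
print(first_triple(12))                        # (3, 4, 5); use 1000 for the Euler problem
The same restructuring in the Haskell comprehension (computing a with a let binding instead of generating it) gives the drop from ~125,000,000 to ~500,000 combinations described above.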
Once you get the solution to the problem you can check out other people's Haskell solutions on the Project Euler site to get an idea of how others have solved it. Incidentally, here is a link to the referenced problem: http://projecteuler.net/index.php?section=problems&id=9
In addition to what everyone else has said about generating fewer elements in the generators, you can also switch to using Int instead of Integer as the type of the numbers. The default is Integer, but your numbers are small enough to fit in an Int.
(Also, to nitpick, Haskell list comprehensions have no speed. Haskell is a language definition with very little operational semantics. A particular Haskell implementation might have slow list comprehensions, though.)

What is a "good" R value when comparing 2 signals using cross correlation?

I apologize for being a bit verbose in advance: if you want to skip all the background mumbo jumbo you can see my question down below.
This is pretty much a follow up to a question I previously posted on how to compare two 1D (time dependent) signals. One of the answers I got was to use the cross-correlation function (xcorr in MATLAB), which I did.
Background information
Perhaps a little background information will be useful: I'm trying to implement an Independent Component Analysis algorithm. One of my informal tests is to (1) create the test case by (a) generating 2 random vectors (1x1000), (b) combining the vectors into a 2x1000 matrix (called "S"), and (c) multiplying this by a 2x2 mixing matrix (called "A") to give me a new matrix (let's call it "T").
In summary: T = A * S
(2) I then run the ICA algorithm to generate the inverse of the mixing matrix (called "W"), (3) multiply "T" by "W" to (hopefully) give me a reconstruction of the original signal matrix (called "X")
In summary: X = W * T
(4) I now want to compare "S" and "X". Although "S" and "X" are 2x1000, I simply compare S(1,:) to X(1,:) and S(2,:) to X(2,:), each of which is 1x1000, making them 1D signals. (I have another step which makes sure that these vectors are the proper vectors to compare to each other, and I also normalize the signals.)
So my current quandary is how to 'grade' how close S(1,:) matches to X(1,:), and likewise with S(2,:) to X(2,:).
So far I have used something like: r1 = max(abs(xcorr(S(1,:), X(1,:))))
My question
Assuming that using the cross correlation function is a valid way to go about comparing the similarity of two signals, what would be considered a good R value to grade the similarity of the signals? Wikipedia states that this is a very subjective area, and so I defer to the better judgment of those who might have experience in this field.
As you might realize, I'm not coming from an EE/DSP/statistical background at all (I'm a medical student), so I'm going through a sort of "baptism by fire" right now, and I appreciate all the help I can get. Thanks!
(edit: as far as directly answering your question about R values, see below)
One way to approach this would be to use cross-correlation. Bear in mind that you have to normalize amplitudes and correct for delays: if you have signal S1, and signal S2 is identical in shape, but half the amplitude and delayed by 3 samples, they're still perfectly correlated.
For example:
>> t = 0:0.001:1;
>> y = @(t) sin(10*t).*exp(-10*t).*(t > 0);
>> S1 = y(t);
>> S2 = 0.4*y(t-0.1);
>> plot(t,S1,t,S2);
These should have a perfect correlation coefficient. A way to compute this is to use maximum cross-correlation:
>> f = @(S1,S2) max(xcorr(S1,S2));
f =
@(S1,S2) max(xcorr(S1,S2))
>> disp(f(S1,S1)); disp(f(S2,S2)); disp(f(S1,S2));
12.5000
2.0000
5.0000
The maximum value of xcorr() takes care of the time-delay between signals. As far as correcting for amplitude goes, you can normalize the signals so that their self-cross-correlation is 1.0, or you can fold that equivalent step into the following:
ρ^2 = f(S1,S2)^2 / (f(S1,S1)*f(S2,S2))
In this case ρ^2 = 5 * 5 / (12.5 * 2) = 1.0
You can solve for ρ itself, i.e. ρ = f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2)), just bear in mind that both 1.0 and -1.0 are perfectly correlated (-1.0 has opposite sign)
Try it on your signals!
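If MATLAB isn't at hand, the same normalized peak-cross-correlation check can be sketched in Python with NumPy (np.correlate plays the role of xcorr here; this is only an illustration of the formula above):
import numpy as np
t = np.linspace(0, 1, 1001)
y = lambda tt: np.sin(10 * tt) * np.exp(-10 * tt) * (tt > 0)
S1 = y(t)
S2 = 0.4 * y(t - 0.1)                          # same shape, scaled and delayed
def max_xcorr(a, b):
    # peak of the full cross-correlation, analogous to max(xcorr(a, b)) in MATLAB
    return np.max(np.correlate(a, b, mode="full"))
rho = max_xcorr(S1, S2) / np.sqrt(max_xcorr(S1, S1) * max_xcorr(S2, S2))
print(rho)                                     # close to 1.0 despite the scaling and delay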
With respect to what threshold to use for acceptance/rejection, that really depends on what kind of signals you have. 0.9 and above is fairly good, but that can be misleading. I would consider looking at the residual signal you get after you subtract out the correlated version. You can do this by looking at the time index of the maximum value of xcorr():
>> t = 0:0.001:1;
>> y = @(a,t) sin(a*t).*exp(-a*t).*(t > 0);
>> S1=y(10,t);
>> S2=0.4*y(9,t-0.1);
>> f(S1,S2)/sqrt(f(S1,S1)*f(S2,S2))
ans =
0.9959
This looks pretty darn good for a correlation. But let's try fitting S2 with a scaled/shifted multiple of S1:
>> [A,i]=max(xcorr(S1,S2)); tshift = i-length(S1);
>> S2fit = zeros(size(S2)); S2fit(1-tshift:end) = A/f(S1,S1)*S1(1:end+tshift);
>> plot(t,[S2; S2fit]); % fit S2 using S1 as a basis
>> plot(t,[S2-S2fit]); % residual
Residual has some energy in it; to get a feel for how much, you can use this:
>> S2res=S2-S2fit;
>> dot(S2res,S2res)/dot(S2,S2)
ans =
0.0081
>> sqrt(dot(S2res,S2res)/dot(S2,S2))
ans =
0.0900
This says that the residual has about 0.81% of the energy (9% of the root-mean-square amplitude) of the original signal S2. (the dot product of a 1D signal with itself will always be equal to the maximum value of cross-correlation of that signal with itself.)
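The same residual check can be sketched in NumPy (again only illustrative; the lag bookkeeping follows numpy.correlate's "full" output rather than MATLAB's xcorr indices):
import numpy as np
t = np.linspace(0, 1, 1001)
y = lambda a, tt: np.sin(a * tt) * np.exp(-a * tt) * (tt > 0)
S1 = y(10, t)
S2 = 0.4 * y(9, t - 0.1)                     # similar but not identical shape, as above
xc = np.correlate(S2, S1, mode="full")       # lags -(N-1) .. N-1
k = int(np.argmax(np.abs(xc)))
lag = k - (len(S1) - 1)                      # samples by which S1 must be delayed to match S2
gain = xc[k] / np.dot(S1, S1)                # least-squares amplitude at that lag
S2fit = gain * np.roll(S1, lag)
if lag >= 0:
    S2fit[:lag] = 0.0                        # discard wrap-around introduced by np.roll
else:
    S2fit[lag:] = 0.0
res = S2 - S2fit
print(np.dot(res, res) / np.dot(S2, S2))     # fraction of S2's energy left unexplained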
I don't think there's a silver bullet for answering how similar two signals are with each other, but hopefully I've given you some ideas that might be applicable to your circumstances.
A good starting point is to get a sense of what a perfect match will look like by calculating the auto-correlations for each signal (i.e. do the "cross-correlation" of each signal with itself).
THIS IS A COMPLETE GUESS - but I'm guessing max(abs(xcorr(S(1,:),X(1,:)))) > 0.8 implies success. Just out of curiosity, what kind of values do you get for max(abs(xcorr(S(1,:),X(2,:))))?
Another approach to validate your algorithm might be to compare A and W. If W is calculated correctly, it should be A^-1, so can you calculate a measure like |A*W - I|? Maybe you have to normalize by the trace of A*W.
Getting back to your original question, I come from a DSP background, so I get to deal with fairly noise-free signals. I understand that's not a luxury you get in biology :) so my 0.8 guess might be very optimistic. Perhaps looking at some literature in your field, even if they aren't using cross-correlation exactly, might be useful.
Usually in such cases people talk about "false acceptance rate" and "false rejection rate".
The first one describes how often the algorithm says "similar" for non-similar signals; the second one is the opposite.
Selecting a threshold thus becomes a trade-off between these criteria. To make FAR = 0 the threshold should be 1; to make FRR = 0 the threshold should be -1.
So probably, you will need to decide which trade-off between FAR and FRR is acceptable in your situation and this will give the right value for threshold.
Mathematically this can be expressed in different ways. Just a couple of examples:
1. Fix one of the rates at an acceptable value and minimize the other one.
2. Minimize max(FRR, FAR).
3. Minimize a*FRR + b*FAR.
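A small sketch of option 2 in Python (the genuine/impostor score distributions here are made up purely for illustration; substitute your own measured similarity scores):
import numpy as np
rng = np.random.default_rng(0)
genuine = np.clip(rng.normal(0.9, 0.05, 500), -1, 1)    # hypothetical scores for matching pairs
impostor = np.clip(rng.normal(0.3, 0.20, 500), -1, 1)   # hypothetical scores for non-matching pairs
thresholds = np.linspace(-1, 1, 201)
far = np.array([(impostor >= th).mean() for th in thresholds])  # false acceptance rate
frr = np.array([(genuine < th).mean() for th in thresholds])    # false rejection rate
best = thresholds[np.argmin(np.maximum(far, frr))]              # minimize max(FRR, FAR)
print(best)
The same arrays let you evaluate option 1 or option 3 by changing the quantity that is minimized.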
Since they should be equal, the correlation coefficient should be high, between .99 and 1. I would take the max and abs functions out of your calculation, too.
EDIT:
I spoke too soon. I confused cross-correlation with correlation coefficient, which is completely different. My answer might not be worth much.
I would agree that the result would be subjective. Something that would involve the sum of the squares of the differences, element by element, would have some value. Two identical arrays would give a value of 0 in that form. You would have to decide what value then becomes "bad". Make up 2 different vectors that "aren't too bad" and find their cross-correlation coefficient to be used as a guide.
(parenthetically: if you were doing a correlation coefficient where 1 or -1 would be great and 0 would be awful, I've been told by bio-statisticians that a real-life value of 0.7 is extremely good. I understand that this is not exactly what you are doing but the comment on correlation coefficient came up earlier.)
