I am trying to calculate the logarithm of a modified Bessel function of the second kind in MATLAB, i.e. something like this:
log(besselk(nu, Z))
where e.g.
nu = 750;
Z = 1;
The problem is that log(besselk(nu, Z)) evaluates to infinity, because besselk(nu, Z) overflows to infinity in double precision. However, log(besselk(nu, Z)) itself should be small enough to represent.
I am trying to write something like
f = double(sym('ln(besselk(double(nu), double(Z)))'));
However, I get the following error:
Error using mupadmex Error in MuPAD command: DOUBLE cannot convert the input expression into a double array. If the input expression contains a symbolic variable, use the VPA function instead.
Error in sym/double (line 514) Xstr = mupadmex('symobj::double', S.s, 0);
How can I avoid this error?
You're doing a few things incorrectly. It makes no sense to use double for your two arguments to besselk and then convert the output to symbolic. You should also avoid the old string-based input to sym. Instead, you want to evaluate besselk symbolically (which will return about 1.02×10^2055, much greater than realmax), take the log of the result symbolically, and then convert back to double precision.
The following is sufficient – when one or more of the input arguments is symbolic, the symbolic version of besselk will be used:
f = double(log(besselk(sym(750), sym(1))))
or in the old string form:
f = double(sym('log(besselk(750, 1))'))
If you want to keep your parameters symbolic and evaluate at a later time:
syms nu Z;
f = log(besselk(nu, Z))
double(subs(f, {nu, Z}, {750, 1}))
Make sure that you haven't flipped the nu and Z values in your math as large orders (nu) aren't very common.
As njuffa pointed out, DLMF gives asymptotic expansions of K_nu(z) for large nu. From 10.41.2 we find for real positive arguments z:
besselk(nu,z) ~ sqrt(pi/(2*nu)) * (e*z/(2*nu))^(-nu)
which gives after some simplification
log( besselk(nu,z) ) ~ 1/2*log(pi) + (nu-1/2)*log(2*nu) - nu*(1 + log(z))
So it is O(nu log(nu)). No surprise the direct calculation fails for nu > 750.
I don't know how accurate this approximation is. Perhaps you can compare it for values where besselk is still below the numerical infinity, to see if it fits your purpose?
EDIT: I just tried for nu=750 and z=1: The above approximation gives 4.7318e+03, while with the result of horchler we get log(1.02*10^2055) = 2055*log(10) + log(1.02) = 4.7318e+03. So it is correct to at least 5 significant digits, for nu >= 750 and z=1! If this is good enough for you this will be much faster than symbolic math.
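The large-order approximation above can be evaluated without ever forming besselk itself. Here is a minimal sketch in Python (chosen only for illustration; the function name log_besselk_large_order is hypothetical) that uses nothing beyond the leading-order formula quoted above and compares it with the symbolic value of about 1.02×10^2055:

```python
import math

def log_besselk_large_order(nu, z):
    """Leading-order estimate of log(K_nu(z)) for large nu and real z > 0."""
    return (0.5 * math.log(math.pi)
            + (nu - 0.5) * math.log(2.0 * nu)
            - nu * (1.0 + math.log(z)))

approx = log_besselk_large_order(750, 1.0)
# log(1.02e2055), rebuilt from the rounded symbolic result quoted above
reference = 2055 * math.log(10) + math.log(1.02)
print(approx, reference)
```

Because the reference is itself rounded to three digits, agreement beyond about four decimal places of the logarithm should not be expected from this comparison.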
Have you tried the integral representation?
Log[Integrate[Cosh[Nu t]/E^(Z Cosh[t]), {t, 0, Infinity}]]
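The integral above overflows if the integrand is evaluated directly for large Nu, but it can be handled in the log domain. The following Python sketch (illustrative only; the function names, the truncation point t_peak + 6 and the step count are assumptions, not tuned values) integrates with the trapezoidal rule after factoring out the largest value of the log-integrand:

```python
import math

def log_cosh(x):
    """log(cosh(x)) without overflow for large |x|."""
    x = abs(x)
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)

def log_besselk_integral(nu, z, steps=20000):
    """log of the integral of cosh(nu*t) * exp(-z*cosh(t)) over t in [0, inf)."""
    t_peak = math.asinh(nu / z)      # rough location of the integrand's maximum
    t_max = t_peak + 6.0             # assumed cutoff; the integrand is negligible beyond it
    h = t_max / steps
    logs = [log_cosh(nu * i * h) - z * math.cosh(i * h) for i in range(steps + 1)]
    m = max(logs)                    # factor out the peak so exp() cannot overflow
    total = 0.0
    for i, lf in enumerate(logs):
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end-point weights
        total += weight * math.exp(lf - m)
    return m + math.log(total * h)

print(log_besselk_integral(750, 1.0))
```

For nu = 750 and z = 1 this should land close to the 4.7318e+03 reported in the other answers, at the cost of a few tens of thousands of function evaluations.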
I'm having an issue with the num2str function in Octave.
string = num2str(8.395,3);
returns "8.39"
whereas,
string = num2str(1.395,3);
returns "1.4".
How can it be possible?
How can I get a consistent number of decimals?
IEEE floating point rounds to the nearest even number when exactly half-way. So in the first case it rounds down towards 8, and in the second one up towards 2.
I think this will always show 3 digits (ref):
num2str(x,'%4.2f')
The question seems to be: why is 8.395 rounded "down", whereas 1.395 is rounded "up"? Is that a bug?
Yes and no. This is an inherent limitation of floating-point arithmetic. Effectively, neither of those two numbers can be expressed exactly in floating-point format, therefore their nearest approximation is used. It just so happens that the nearest approximation for 8.395 is just "under" that value, whereas for 1.395 it is just "above" it. Hence Octave rounds the first "downwards", and the latter "upwards". You can confirm this if you print more significant digits for each:
sprintf("%.20f", 8.395) % ans = 8.39499999999999957367
sprintf("%.20f", 1.395) % ans = 1.39500000000000001776
So, as far as the "actual numbers in memory" are concerned, octave is doing the right thing.
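The same check can be made in any language that uses IEEE doubles. As an illustration in Python (not Octave; the helper name stored_side is hypothetical), the decimal module can show the exact value that is stored for each literal and on which side of the literal it falls:

```python
from decimal import Decimal

def stored_side(literal):
    """Report whether the double nearest to `literal` lies below or above it."""
    stored = Decimal(float(literal))   # exact binary value, written out in decimal
    intended = Decimal(literal)        # the decimal number that was typed
    if stored < intended:
        return "below"
    if stored > intended:
        return "above"
    return "exact"

for literal in ("8.395", "1.395"):
    print(literal, "is stored", stored_side(literal), "as", Decimal(float(literal)))
```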
If you do not care about 'actual precision' and you just want to see 'visually-desired precision', then you can create a simple function that uses "round" (i.e. to the nearest integer) under the hood. E.g.
ndigits = @(x) 10 ^ floor( log10( x ) );
sigdigits = @(x, s) round( x / ndigits(x) * 10^s ) / 10^s * ndigits(x);
sigdigits(8.395, 2) % ans = 8.4000
sigdigits(1.395, 2) % ans = 1.4000
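A different way to get "visually expected" rounding, sketched here in Python rather than Octave, is to round the shortest decimal text of the number instead of its binary value. The helper name visual_round is hypothetical, and the approach assumes that the shortest printed form of the number is what the user had in mind:

```python
from decimal import Decimal, ROUND_HALF_UP

def visual_round(x, decimals):
    """Round the shortest decimal text of x half-up to `decimals` places."""
    quantum = Decimal(1).scaleb(-decimals)   # 0.01 when decimals == 2
    return Decimal(repr(x)).quantize(quantum, rounding=ROUND_HALF_UP)

print(visual_round(8.395, 2))
print(visual_round(1.395, 2))
```

Both calls round the typed value 0.005 upwards, so the two inputs are treated consistently.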
Consider the operation (a-b)/(c-d), where a, b, c and d are floating-point numbers (namely, of type double in C++). Both (a-b) and (c-d) are (sum-correction) pairs, as in the Kahan summation algorithm. Briefly, the distinguishing feature of these (sum-correction) pairs is that sum holds a value that is large relative to what is in correction. More precisely, correction contains what didn't fit in sum during summation due to numerical limitations (53 bits of mantissa in the double type).
What is the numerically most precise way to calculate (a-b)/(c-d), given the above property of the numbers?
Bonus question: it would be better to get the result also as a (sum-correction) pair, as in the Kahan summation algorithm. That is, to find (e-f)=(a-b)/(c-d), rather than just e=(a-b)/(c-d).
The div2 algorithm of Dekker (1971) is a good approach.
It requires a mul12(p,q) algorithm that exactly computes a pair u+v = p*q. Dekker uses a method known as Veltkamp splitting, but if you have access to an fma function, then a much simpler method is
u = p*q
v = fma(p,q,-u)
The actual division then looks like this (I've had to change some of the signs, since Dekker uses additive pairs instead of subtractive ones):
r = a/c
u,v = mul12(r,c)
s = (a - u - v - b + r*d)/c
Then the sum r+s is an accurate approximation to (a-b)/(c-d).
UPDATE: The subtraction and addition are assumed to be left-associative, i.e.
s = ((((a-u)-v)-b)+r*d)/c
This works because if we let rr be the error in the computation of r (i.e. r + rr = a/c exactly), then since u+v = r*c exactly, we have that rr*c = a-u-v exactly, so therefore (a-u-v-b)/c gives a fairly good approximation to the correction term of (a-b)/c.
The final r*d arises due to the following:
(a-b)/(c-d) = (a-b)/c * c/(c-d) = (a-b)/c *(1 + d/(c-d))
= [a-b + (a-b)/(c-d) * d]/c
Now r is also a fairly good initial approximation to (a-b)/(c-d) so we substitute that inside the [...], so we find that (a-u-v-b+r*d)/c is a good approximation to the correction term of (a-b)/(c-d)
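The steps above can be tried out directly. The sketch below is in Python (whose float is an IEEE double) rather than C++, and it uses Veltkamp splitting for mul12 instead of fma so that it runs without any extra library support. The function names and the sample inputs are illustrative assumptions, and the splitting constant 2^27 + 1 assumes a 53-bit mantissa:

```python
from fractions import Fraction

def split(x):
    """Veltkamp splitting: x == hi + lo, each half fitting in about 26 bits."""
    t = 134217729.0 * x          # 2**27 + 1
    hi = t - (t - x)
    lo = x - hi
    return hi, lo

def mul12(p, q):
    """Return (u, v) with u + v == p * q exactly (barring overflow/underflow)."""
    u = p * q
    ph, pl = split(p)
    qh, ql = split(q)
    v = (((ph * qh - u) + ph * ql) + pl * qh) + pl * ql
    return u, v

def div2(a, b, c, d):
    """Approximate (a - b) / (c - d) as an additive pair (r, s)."""
    r = a / c
    u, v = mul12(r, c)
    s = ((((a - u) - v) - b) + r * d) / c   # left-associative, as in the text
    return r, s

# Compare against exact rational arithmetic on one sample input.
a, b, c, d = 1.0, 1e-17, 3.0, 1e-17
r, s = div2(a, b, c, d)
exact = (Fraction(a) - Fraction(b)) / (Fraction(c) - Fraction(d))
pair_error = abs(Fraction(r) + Fraction(s) - exact)
naive_error = abs(Fraction((a - b) / (c - d)) - exact)
print(float(pair_error), float(naive_error))
```

The comparison with Fraction is only there to measure the error; it is not part of the algorithm.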
For tiny corrections, maybe think of
(a - b) / (c - d) = a/c (1 - b/a) / (1 - d/c) ~ a/c (1 - b/a + d/c)
I'm writing MATLAB code which uses the digits of an irrational number. I tried finding them using a Taylor expansion of $\sqrt{1+x}$. Since division by large numbers could be a bad idea in MATLAB, this method does not seem like a good one to me.
I wonder if there is any simpler and efficient method to do this?
If you have the Symbolic Toolbox, vpa does that. You can specify the number of significant digits you want:
x = '2'; %// define x as a *string*. This avoids loss of precision
n = 100; %// desired number of *significant* digits
result = vpa(['sqrt(' x ')'], n);
The result is a symbolic variable. If needed, convert to a string:
result = char(result);
In the example above,
result =
1.414213562373095048801688724209698078569671875376948073176679737990732478462107038850387534327641573
Note that this is subject to rounding. For example, the result with n = 7 is 1.414214 instead of 1.414213.
In newer Matlab versions (tested on R2017b), using a char input with vpa is discouraged, and support for this may be removed in the future. The recommended approach is to first define the variable as symbolic, and then apply the required operations to it:
x = sym(2);
n = 100;
result = vpa(sqrt(x), n);
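Outside MATLAB, the digits of a square root can also be obtained with exact integer arithmetic alone. As a hedged illustration in Python (the function name sqrt_digits is hypothetical, and decimals is assumed to be at least 1), the integer square root of n·10^(2k) yields the first k decimals of sqrt(n):

```python
import math

def sqrt_digits(n, decimals):
    """Return sqrt(n) truncated (not rounded) to `decimals` >= 1 decimal places."""
    # floor(sqrt(n) * 10**decimals), computed exactly on integers
    scaled = math.isqrt(n * 10 ** (2 * decimals))
    digits = str(scaled).rjust(decimals + 1, "0")
    return digits[:-decimals] + "." + digits[-decimals:]

print(sqrt_digits(2, 100))
```

Unlike vpa, this truncates instead of rounding, so sqrt_digits(2, 6) gives 1.414213 where a rounded seven-digit result would be 1.414214.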
It seems you need a method of digit-by-digit root calculation, which was discovered long before the computer era.
Determining the square root through successive approximation is implemented using the following algorithm:
1. Begin by guessing that the square root is x / 2. Call that guess g.
2. The actual square root must lie between g and x/g. At each step in the successive approximation, generate a new guess by averaging g and x/g.
3. Repeat step 2 until the values of g and x/g are as close together as the precision of the hardware allows. In Java, the best way to check for this condition is to test whether the average is equal to either of the values used to generate it.
What really confuses me is the last statement of step 3. I interpreted it as follows:
private double sqrt(double x) {
    double g = x / 2;
    while(true) {
        double average = (g + x/g) / 2;
        if(average == g || average == x/g) break;
        g = average;
    }
    return g;
}
This seems to just cause an infinite loop. I am following the algorithm exactly: if the average equals either g or x/g (the two values used to generate it), then we have our answer?
Why would anyone ever use that approach, when they could simply use the formulas (2n)^2 = 4n^2 and (n + 1)^2 = n^2 + 2n + 1 to populate each bit in the mantissa, and divide the exponent by two, multiplying the mantissa by two iff the exponent mod two equals 1?
To check whether g and x/g are as close as the hardware allows, look at the relative difference and compare it with the epsilon for your floating-point format. If it is within a small integer multiple of epsilon, you are OK.
Relative difference of x and y, see https://en.wikipedia.org/wiki/Relative_change_and_difference
The epsilon for 32-bit IEEE floats is about 1.0e-7, as in one of the other answers here, but that answer used the absolute rather than the relative difference.
In practice, that means something like:
Math.abs(g-x/g)/Math.max(Math.abs(g),Math.abs(x/g)) < 3.0e-7
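Put together, the successive-approximation loop with a relative-difference stopping test might look like the following. This is a sketch in Python rather than Java; Python floats are 64-bit doubles, so the machine epsilon is about 2.2e-16 instead of the 32-bit figure quoted above. The factor of 4 and the iteration cap are assumptions, not tuned values:

```python
import sys

def sqrt_newton(x, max_iter=200):
    """Square root by repeatedly averaging g and x/g."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    tol = 4 * sys.float_info.epsilon     # "small integer multiple of epsilon"
    g = x / 2
    for _ in range(max_iter):            # cap guards against any non-terminating case
        q = x / g
        # stop when g and x/g agree to within a few units of relative precision
        if abs(g - q) <= tol * max(abs(g), abs(q)):
            break
        g = (g + q) / 2
    return g

print(sqrt_newton(2.0))
```

The exact-equality test from the question is replaced by the tolerance test, so the loop no longer depends on g and x/g ever becoming bit-for-bit identical.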
Never compare floating point values for equality. The result is not reliable.
Use an epsilon like so:
if(Math.abs(average-g) < 1e-7 || Math.abs(average-x/g) < 1e-7)
You can change the epsilon value to be whatever you need. Probably best is something related to the original x.
So this is weird. I'm in Ruby 1.9.3, and float addition is not working as I expect it would.
0.3 + 0.6 + 0.1 = 0.9999999999999999
0.6 + 0.1 + 0.3 = 1
I've tried this on another machine and get the same result. Any idea why this might happen?
Floating point operations are inexact: they round the result to nearest representable float value.
That means that each float operation is:
float(a op b) = mathematical(a op b) + rounding-error( a op b )
As suggested by above equation, the rounding error depends on operands a & b.
Thus, if you perform operations in different order,
float(float(a op b) op c) != float(a op float(b op c))
In other words, floating point operations are not associative.
They are commutative though...
As others said, converting the decimal representation 0.1 (that is, 1/10) into a base-2 representation (that is, 1/16 + 1/32 + 1/256 + ...) would lead to an infinite series of digits. So float(0.1) is not exactly equal to 1/10; it also carries a rounding error and has a long series of binary digits, which explains why the subsequent operations have a nonzero rounding error (the mathematical result is not representable in floating point).
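The same behaviour shows up in any language that uses IEEE doubles. As an illustration in Python (not Ruby), the two groupings from the question disagree, while swapping the operands of a single addition makes no difference:

```python
left = (0.3 + 0.6) + 0.1     # same order as 0.3 + 0.6 + 0.1
right = (0.6 + 0.1) + 0.3    # same order as 0.6 + 0.1 + 0.3

print(repr(left), repr(right))
print(left == right)               # grouping matters: addition is not associative
print(0.3 + 0.6 == 0.6 + 0.3)      # operand order within one addition does not matter
print((0.1).hex())                 # the binary digits actually stored for 0.1
```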
It has been said many times before but it bears repeating: Floating point numbers are by their very nature approximations of decimal numbers. There are some decimal numbers that cannot be represented precisely due to the way the floating point numbers are stored in binary. Small but perceptible rounding errors will occur.
To avoid this kind of mess, you should always format your numbers to an appropriate number of places for presentation:
'%.3f' % (0.3 + 0.6 + 0.1)
# => "1.000"
'%.3f' % (0.6 + 0.1 + 0.3)
# => "1.000"
This is why using floating point numbers for currency values is risky and you're generally encouraged to use fixed point numbers or regular integers for these things.
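To make the alternative concrete, here is a small sketch in Python (not Ruby) that sums the same three amounts as binary floats, as decimal numbers, and as integer counts of hundredths. The variable names are illustrative only:

```python
from decimal import Decimal

float_total = 0.3 + 0.6 + 0.1                                     # binary floating point
decimal_total = Decimal("0.3") + Decimal("0.6") + Decimal("0.1")  # base-10 arithmetic
cents_total = 30 + 60 + 10                                        # integers, in hundredths

print(float_total, decimal_total, cents_total)
```

The decimal and integer versions are exact here because each amount is representable in the chosen representation.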
First, the numerals “0.3”, “.6”, and “.1” in the source text are converted to floating-point numbers, which I will call a, b, and c. These values are near .3, .6, and .1 but not equal to them, but that is not directly the reason you see different results.
In each floating-point arithmetic operation, there may be a little rounding error, some small number e_i. So the exact mathematical results your two expressions calculate are:
(a + b + e0) + c + e1 and
(b + c + e2) + a + e3.
That is, in the first expression, a is added to b, and there is a slight rounding error e0. Then c is added, and there is a slight rounding error e1. In the second expression, b is added to c, and there is a slight rounding error e2. Finally, a is added, and there is a slight rounding error e3.
The reason your results differ is that e0 + e1 ≠ e2 + e3. That is, the rounding that was necessary when a and b were added was different from the rounding that was necessary when b and c were added and/or the roundings that were necessary in the second additions of the two cases were different.
There are rules that govern these errors. If you know the rules, you can make deductions about them that bound the size of the errors in final results.
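The individual rounding errors can be measured exactly with rational arithmetic. The following Python sketch (illustrative; the helper name rounding_error is hypothetical) computes e0 through e3 for the three numbers from the question and shows that the two sums of errors differ:

```python
from fractions import Fraction

a, b, c = 0.3, 0.6, 0.1   # the doubles nearest to the decimal literals

def rounding_error(x, y):
    """Exact difference between the floating-point sum x + y and the true sum."""
    return Fraction(x + y) - (Fraction(x) + Fraction(y))

e0 = rounding_error(a, b)        # error of a + b
e1 = rounding_error(a + b, c)    # error of (a + b) + c
e2 = rounding_error(b, c)        # error of b + c
e3 = rounding_error(b + c, a)    # error of (b + c) + a

print(float(e0 + e1), float(e2 + e3))
print(e0 + e1 == e2 + e3)
```

Fraction converts each double to the exact rational number it stores, so no further rounding enters the comparison.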
This is a common limitation of floating point numbers, due to their being encoded in base 2 instead of base 10. It can be difficult to understand, but once you do, you can easily avoid problems like this. I recommend this guide, which goes in depth to explain it.
For this problem specifically, you might try rounding your result to the nearest millionths place:
result = (0.3+0.6+0.1)
=> 0.9999999999999999
(result*1000000.0).round/1000000.0
=> 1.0
As for why the order matters, it has to do with rounding. When those numbers are turned into floats, they are converted to binary, and all of them become repeating fractions, like ⅓ is in decimal. Since the result gets rounded during each addition, the final answer depends on the order of the additions. It appears that in one of those, you get a round-up, where in the other, you get a round-down. This explains the discrepancy.
It is worth noting what the actual difference is between those two answers: approximately 0.0000000000000001.
In a Rails view you can also use the number_with_precision helper:
result = 0.3 + 0.6 + 0.1
result = number_with_precision result, :precision => 3