What does 1. mean in a Mathematica solution (of a sum)

I'm trying to evaluate a difficult sum. Mathematica seems to evaluate it, but gives the message "Solve was unable to solve the system with inexact coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result."
The solution contains expressions with "1." in them, such as (0.5 + 1. I).
What does the 1. mean?

You can look at a similar question here. Mathematica interprets any input containing 0.5, for example, as "numerical," so its attempts to solve it will be numerical in nature: it treats 0.5 as an approximate real number that equals 1/2 only to within the working precision. Even though 0.5 == 1/2 returns True, Mathematica still treats the two expressions very differently.
If you input commands using "numerical" (i.e., decimal) numbers, Mathematica falls back on numerical methods (such as NIntegrate, NSolve, NDSolve, and numerical versions of the arithmetic operations) rather than the ones it applies to integers, rationals, and other exact quantities.
The message you saw comes from how Solve handles such input. Confronted with inexact coefficients, it makes the equations exact (it does know, after all, that 0.5 == 1/2), finds an exact solution, and then "numericizes" the result (applies N to it) to give you the numerical equivalent.
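You can reproduce the message with a small equation (a sketch; whether the message appears may depend on your Mathematica version):
Solve[0.5 x^2 == 1, x]
(* Solve::ratnz: Solve was unable to solve the system with inexact
   coefficients. The answer was obtained by solving a corresponding
   exact system and numericizing the result. *)
(* {{x -> -1.41421}, {x -> 1.41421}} *)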
Type in N[1/2 + I] and see what you get. It should be 0.5 + 1. I. All this means is that you have a quantity that is roughly 1.0000000000000000 in the imaginary direction and 0.50000000000000 in the real direction.
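The full form shows the machine numbers explicitly:
N[1/2 + I]      (* 0.5 + 1. I *)
FullForm[%]     (* Complex[0.5, 1.] *)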
To see the difference explicitly, try:
Head[1]
Head[1.]
The decimal point tells Mathematica that the second of the two is a "real" number, i.e., one to be handled with floating-point arithmetic of some sort. The first is an integer, for which Mathematica often uses different sorts of algorithms.
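Precision makes the same distinction visible:
Precision[1]    (* Infinity: an exact integer *)
Precision[1.]   (* MachinePrecision: an approximate machine number *)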

The "1." is there to guarantee that subsequent use of that expression doesn't lose that the expression was obtained numerically, and is therefore subject to numerical precision. For example,
In[121]:= Pi/3.14`2 * x
Out[121]= 1.0 x
Even though you might think that 1.0*x == x, it's certainly not true that Pi == 3.14; rather, Pi is only 3.14 to the given precision of 2. Because the 1.0 is included in the answer (and InputForm shows it is actually 1.00050721451904243263141509021640261145`2 internally), the next evaluation,
In[122]:= % /. x -> 3
Out[122]= 3.0
comes out correct instead of incorrectly giving an exact 3.
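If you want an exact result instead, give Mathematica exact input in the first place, e.g. the rational 314/100 in place of 3.14`2:
In[123]:= Pi/(314/100) * x /. x -> 3
Out[123]= (150 Pi)/157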

Related

Numerical accuracy of an integer solution to an exponential equation

I have an algorithm that relies on integer inputs x, y and s.
For input checking, and to raise an exception for invalid arguments, I have to make sure there is a natural number n such that x*s^n = y.
Or in words: how many times do I have to multiply x by s until I arrive at y?
And more importantly: Do I arrive at y exactly?
This problem can be further abstracted by dividing by x:
x*s^n=y => s^n=y/x => s^n=z
with z = y/x. In general z is not an integer, but repeated integer multiplication can only arrive at y if y is divisible by x. So this property can easily be tested first, after which z is guaranteed to be an integer as well, and the problem is down to solving s^n = z.
There is already a question related to that.
There are lots of solutions. Some are iterative, and some solve the equation using a logarithm and then either truncate, round, or compare with an epsilon. I am particularly interested in the solutions with logarithms. The general idea is:
from math import log

def check(z, s):
    n = log(z) / log(s)
    return n == int(n)
Comparing floating-point numbers for equality does seem pretty sketchy, though. Under normal circumstances I would not count that as a general and exact solution to the problem. Answers that suggest this approach don't mention the precision issue, and answers that use an epsilon for the comparison just pick an arbitrarily small number.
I wonder how robust this method (with straight equality) really is, because it seems to work pretty well and I couldn't break it with trial and error. And if it does break down at some point, I wonder how small or large the epsilon has to be.
So basically my question is:
Can the logarithm approach be guaranteed to be exact under specific circumstances? E.g. limited integer input range.
I have thought about this for a long time now, and I think it is possible that this solution is exact and robust, at least under some circumstances. But I don't have a proof.
My line of thinking was:
Can I find a combination of x, y, s so that the chain of multiplications just barely misses y, which means that n will be very close to an integer but not quite?
The answer is no. Because x, y and s are integers, the multiplication will also be an integer. So if the result just barely misses y, it has to miss by at least 1.
Well, that is how far I've gotten. My theory is that choosing only integers makes the calculation very precise. I would consider it a method with good numerical stability, and with very specific behaviour regarding that stability. So I believe it is possible that this calculation is precise enough to simply truncate the decimals. It would be amazing if someone could prove or disprove that.
If a guarantee of correctness can be given for a specific value range, I am interested in the general approach, but a fairly applicable range of values would be the positive part of int32 for the integers, with double-precision floating point for the computation.
Testing with an epsilon is also an option, but then the question is how small that epsilon has to be. This is probably related to the "miss by at least 1" logic.
You’re right to be skeptical of floating point. Math libraries typically don’t provide correctly rounded transcendental functions (the Table Maker’s Dilemma), so the exact test is suspect. Indeed, it’s not difficult to find counterexamples (see the Python below).
Since the input z is an integer, however, we can do an error analysis to determine an appropriate epsilon. Using calculus, one can prove the bound
log(z+1) − log(z) = log(1 + 1/z) ≥ 1/z − 1/(2z^2).
If log(z)/log(s) is not an integer, then z must be at least one away from a power of s, putting this bound in play. If 2 ≤ z, s < 2^31 (i.e., they have representations as signed 32-bit integers), then log(z)/log(s) is at least (1/2^31 − 1/2^63)/log(2^31) away from an integer. An epsilon of 1.0e-12 is comfortably less than this, yet large enough that, if we lose a couple of ulps (1 ulp is on the order of 3.6e-15 in the worst case here) to rounding, we don’t get a false negative, even with a rather poor-quality implementation of log.
import math
import random

while True:
    x = random.randrange(2, 2**15)
    if math.log(x**2) / math.log(x) != 2:
        print("#", x)
        break
# 19143
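For what it's worth, the bound quoted above is easy to evaluate in Mathematica (the language of the rest of this page):
N[(1/2^31 - 1/2^63)/Log[2^31]]
(* ≈ 2.16712*10^-11 -- an epsilon of 1.0*10^-12 sits comfortably below this *)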

How to find accuracy of matrix multiplication with floating-point numbers?

I am trying to analyze how floating-point computation becomes less accurate as the size of the floating-point type decreases. In order to do that, I wanted to perform simple matrix operations using different floating-point representations, such as float64, float32, and float16. Since float64 computation gives the most precise and accurate result of the three, I treat every float64 result as the expected result (i.e., error = 0).
The issue is that when I compare a calculated result with the expected result, I don't have a clear idea of how to quantify all the individual errors into a single metric. I know of certain ways to go about it, such as taking the mean of the errors or the sum of squared errors (SSE), but I wanted to know if there is a standard way of calculating the overall error of a given matrix computation.
Perhaps a variant of the condition number can be helpful? See here: https://en.wikipedia.org/wiki/Condition_number#Matrices
"...if there was a standard way of calculating the overall error of a given matrix computation."
Consider the case where the matrix has size 1. Then we are in a familiar one-dimensional domain.
How should y_computed_as_float be compared with y_expected? Even in this simple case there is no standard for how the two should compare as floating-point numbers. Subtract? Divide? It is often context sensitive. So "no" to OP's question.
Yet there are common practices, so a potential "yes" to OP's question for select cases.
Floating point computations are often assessed by the difference between computed and math expected values scaled by the Unit in the last place*.
error = (y_computed_as_float - y_expected)/ulpf((float) y_expected);
For an N-by-N matrix, the matrix error could use a root mean square of the N^2 element errors.
* Scaling by ULP has some issues near each power of 2, and more near 0.0. There are ways to mitigate that, but we are getting into the weeds.
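A minimal sketch of that metric in Mathematica (the language of the rest of this page): the ulp of a machine number y is approximated here by Abs[y]*$MachineEpsilon, which is within a factor of two of the true ulp for normal doubles; the function names are mine, and the expected matrix is assumed to contain no exact zeros.
ulpError[computed_, expected_] :=
  (computed - expected)/(Abs[expected] $MachineEpsilon)

rmsUlpError[computed_, expected_] :=
  Sqrt[Mean[Flatten[ulpError[computed, expected]^2]]]

(* example: a low-precision product versus the full-precision reference *)
a = RandomReal[{-1, 1}, {4, 4}];
exact = a.a;
rough = N[SetPrecision[a, 4].SetPrecision[a, 4]];
rmsUlpError[rough, exact]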

A particular paragraph in Deep Learning - Bengio

This question concerns the chapter on RNNs in the Deep Learning book by Prof. Bengio.
In section 10.2.2 on page 336 in the last paragraph, the book talks about
"...because the outputs are the result of a softmax, it must be that the input sequence is a sequence of symbols...".
This seems to suggest that the output is treated as a probability distribution over the possible 'bits' and the next input x(t+1) is sampled using this joint probability distribution over the output bits. Is this interpretation correct?
No, the interpretation is not correct (unless my interpretation of your interpretation is incorrect). x is an input, and it is fixed in advance, so x(t+1) does not depend on the predicted value for timestep t.
In that paragraph he discusses a particular case of an RNN, where y(t) is a prediction of x(t+1); in other words, the network is trying to predict the next symbol given all the previous symbols.
My understanding of the sentence you are referring to is that since y is the result of a softmax, y has a limited range of values it can assume, and therefore x itself has to be limited to the same range of values; hence x has to be a "symbol or bounded integer". Otherwise, if x were, for instance, a double, y could not predict it, since the output of a softmax is discrete.
UPDATE: as a matter of fact, Bengio has a great paper:
http://arxiv.org/abs/1506.03099
in which he actually suggests that on some training iterations we use y(t) instead of x(t+1) as the input when predicting y(t+1) (which is along the lines of your understanding in your question).

Mathematica is able to compute an indefinite integral but not the corresponding definite one

I'm trying to compute the following definite integral in Mathematica:
Integrate[Sqrt[3]/Sqrt[1 + Sqrt[1 + 12*u^2 - 24*\[Mu]]],
{u, -Sqrt[1 + 8*\[Mu]]/2, Sqrt[1 + 8*\[Mu]]/2}]
Only for some specific values of \[Mu] does Mathematica seem able to compute it. The "funny" things are that:
if I keep the integral indefinite, it returns a solution (quite ugly, but at least it's something);
if I give precise values for the boundaries (e.g., u = ±1/2), it just returns the definite integral unevaluated after a long time;
if I additionally specify a precise value of \[Mu] (so it knows both \[Mu] and the boundaries of integration), in one lucky case it is able to give me the result for the definite integral directly; this, however, does not match the value I would obtain by using the fundamental theorem of calculus (i.e., substituting the values of \[Mu] and u into the ugly formula and taking the difference).
Does anyone have an idea what the problem could be? I would also like to point out that the square roots are always well defined for the values I consider.
Thank you.
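(One thing worth trying, as a sketch rather than a verified fix: definite integration forces Mathematica to reason about convergence and branch cuts over the whole interval, so unstated conditions on \[Mu] can block it. The range below is the one that keeps both square roots real, which seems to be what "well defined" means here.)
Integrate[Sqrt[3]/Sqrt[1 + Sqrt[1 + 12 u^2 - 24 \[Mu]]],
  {u, -Sqrt[1 + 8 \[Mu]]/2, Sqrt[1 + 8 \[Mu]]/2},
  Assumptions -> -1/8 < \[Mu] < 1/24]

(* numerical cross-check of the symbolic result for one value, say \[Mu] = 1/100 *)
NIntegrate[Sqrt[3]/Sqrt[1 + Sqrt[1 + 12 u^2 - 24/100]],
  {u, -Sqrt[1 + 8/100]/2, Sqrt[1 + 8/100]/2}]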

Mathematica - Solving for the input of a Taylor series such that coefficients are minimized

I need to find the value of a variable s such that the Taylor expansion of an expression involving s:
has a minimum (preferably zero, but since we are working in binary a minimum is sufficient) in as many coefficients other than the 0th order as possible (preferably in more than one coefficient, but the 2nd and 3rd have priority);
and I need the search to report the best n values of s that fulfill the condition within the region (i.e., show me the 3 best values of s and what the coefficients look like for each).
I have no idea how even to get the output of a Series[] command into any other Mathematica command without receiving an error, much less how to actually solve the problem. The equation I am working with is too complex to post here (a multi-regional but continuous polynomial expression that can be expanded). Does anyone know what commands to use for this?
The first thing you should realize is that the output of Series is not a sum but a SeriesData object. To convert it into an ordinary sum you have to wrap it in Normal[Series[...]]. Since the question doesn't provide details, I can't say much more.
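As a minimal sketch of the pipeline, though (the expression below is a hypothetical stand-in, since the real one wasn't posted): extract the coefficients with CoefficientList after Normal, build a penalty from the coefficients you care about, and minimize it over s.
expr = Exp[(s - 1) x];   (* hypothetical stand-in for the real expression *)
coeffs = CoefficientList[Normal[Series[expr, {x, 0, 3}]], x]
(* {1, -1 + s, (-1 + s)^2/2, (-1 + s)^3/6} *)

(* sum of squares of the 2nd- and 3rd-order coefficients *)
penalty[sv_?NumericQ] := Total[(coeffs[[{3, 4}]] /. s -> sv)^2]

NMinimize[{penalty[s], -2 <= s <= 2}, s]
(* {0., {s -> 1.}} -- both coefficients vanish at s = 1 *)
For the "best n values of s", one simple approach is to tabulate penalty over a grid of s values, then sort and keep the smallest few.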
