Display the exact form of quadratic equations' roots - algorithm

I notice that almost all of new calculators are able to display the roots of quadratic equations in exact form. For example:
x^2-16x+14=0
x1=8+5sqrt2
x2=8-5sqrt2
What algorithm could I use to achieve that? I've been searching around but I found no results related to this problem

Assuming your quadratic equation is in the form
y = ax^2+bx+c
you get the two roots by
x_1,x_2 = ( -b +- sqrt(b^2-4ac)) / 2a
when for one you use the + between b and the square root, and for the other the -.
If you want to take something out of the square root, just compute the factors of the argument and take out the ones with exponent greater than 2.
By the way, the two root you posted are wrong.

The “algorithm” is exactly the same as on paper. Depending on the programming language, it may start with int delta = b*b - 4*a*c;.
You may want to define a datatype of terms and simplifications on them, though, in case the coefficients of the equation are not simply integer but themselves solutions of previous equations. If this is the sort of thing you are after, look up “symbolic computation”. Some languages are better for this purpose than others. I expect that elementary versions of what you are asking is actually used as an example in some ML tutorials, for instance (see chapter 9).

Related

Finding more than one root with root finding algorithms

What is the best way to find the roots of an equation with more than one root. I understand that no one method which can solve every equation, and that you have to use more than one, but I can't find a root finding algorithm that can solve for more than one root in even the simplest instance.
For example: y = x^2
Although a root solving algorithm to solve a basic equation like this is helpful, it would need to be something I could adapt to solve an equation with more than two roots.
One more thing to note is that the equations wouldn't be your typical polynomials, but could be something such as ln(x^2) + x - 15 = 0
What is a root finding algorithm that could solve for this, or how could you edit a root finding algorithm such as the Bisection/Newton/Brent method to solve this problem, (Assuming I'm correct in that Newton and Brent's method can only solve for one root).
I'd say that there's no general method to find all the roots of a general equation. However, one can try and devise methodologies once sufficient conditions have been specified. Even simple quadratic equations ax2 + bx + c = 0 aren't completely trivial, because the existence of real roots depends on the sign of b2-4ac, which isn't immediately obvious. So there are lots of techniques to apply, e.g Newton-Raphson, but no general method for the general case, especially for equations like ln(x2)+x-15 = 0.
Bottom line: You need to isolate the roots yourself.
Details depend on the algorithm:
If you're using bisection or Brent's method, you need to come up with a set of intervals each containing a unique root.
If using the Newton's method, you need to come up with a set of starting estimates (since it converges to a root given a starting point, and with different starting points it may or may not converge to different roots).
As everyone has said, it is impossible to provide a general algorithm to find all, or some of the roots (where some is greater than one.) And how many roots is some? You cannot find all of the roots in general, since many functions will have infinitely many roots.
Even methods like Newton do not always converge to a solution. I tend to like a good, fairly stable method that will converge to a solution under reasonable circumstances, such as a bracket where the function is known to change sign. You can find such a code that has good convergence behavior on single roots, yet still is protected to behave basically like a bisection scheme when the function is less well behaved.
So, given a decent root finding scheme, you can try simple things like deflation. Thus, consider a simple function, like a first kind Bessel function. I'll do all my examples using MATLAB, but any tool that has a stable well written rootfinder like fzero in MATLAB will suffice.
ezplot(#(x) besselj(0,x),[0,10])
grid on
f0 = #(x) besselj(0,x);
xroots(1) = fzero(f0,1)
xroots =
2.4048
From the plot, we can see there is a second root around 5 or 6.
Now, deflate f0 for that root, creating a new function based on f0, but one that lacks a root at xroots(1).
f1 = #(x) f0(x)./(x-xroots(1));
ezplot(f1,[0,10])
grid on
Note that in this curve, the root of f0 at xroots(1) has now been zapped away, as if it did not exist. Can we find a second root?
xroots(2) = fzero(f1,2)
xroots =
2.4048 5.5201
We can go on of course, but at some point this scheme will fail due to numerical issues. And that failure won't take too terribly long either.
A better scheme might be to (for 1-d problems) use a bracketing scheme, conjoined with a sampling methodology. I say better because it does not require modifying the initial function to deflate the roots. (For 2-d or higher, things get far more hairy of course.)
xlist = (0:1:10)';
flist = f0(xlist);
[xlist,flist]
ans =
0 1.0000
1.0000 0.7652
2.0000 0.2239
3.0000 -0.2601
4.0000 -0.3971
5.0000 -0.1776
6.0000 0.1506
7.0000 0.3001
8.0000 0.1717
9.0000 -0.0903
10.0000 -0.2459
As you can see, the function has sign changes in the intervals [2,3], [5,6], and [8,9]. A rootfinder that can search in a bracket will do here.
fzero(f0,[2,3])
ans =
2.4048
fzero(f0,[5,6])
ans =
5.5201
fzero(f0,[8,9])
ans =
8.6537
Just look for sign changes, then throw the known bracket into a root finder. This will give as many solutions as you can find brackets.
Be advised, there are serious problems with the above scheme. It will completely fail to find the double root of a simple function like f(x)=x^2, because no sign change exists. And if you choose too coarse of a sampling, then you may have an interval with TWO roots in it, but you won't see a sign change at the endpoints.
For example, consider the function f(x) = x^2-x, which has single roots at 0 and at 1. But if you sample that function at -1 and 2, you will find that it is positive at both points. There is no sign change, but there are two roots.
Again, NO method can be made perfect. You can ALWAYS devise a function that will cause any such numerical method to fail.

How do I determine whether my calculation of pi is accurate?

I was trying various methods to implement a program that gives the digits of pi sequentially. I tried the Taylor series method, but it proved to converge extremely slowly (when I compared my result with the online values after some time). Anyway, I am trying better algorithms.
So, while writing the program I got stuck on a problem, as with all algorithms: How do I know that the n digits that I've calculated are accurate?
Since I'm the current world record holder for the most digits of pi, I'll add my two cents:
Unless you're actually setting a new world record, the common practice is just to verify the computed digits against the known values. So that's simple enough.
In fact, I have a webpage that lists snippets of digits for the purpose of verifying computations against them: http://www.numberworld.org/digits/Pi/
But when you get into world-record territory, there's nothing to compare against.
Historically, the standard approach for verifying that computed digits are correct is to recompute the digits using a second algorithm. So if either computation goes bad, the digits at the end won't match.
This does typically more than double the amount of time needed (since the second algorithm is usually slower). But it's the only way to verify the computed digits once you've wandered into the uncharted territory of never-before-computed digits and a new world record.
Back in the days where supercomputers were setting the records, two different AGM algorithms were commonly used:
Gauss–Legendre algorithm
Borwein's algorithm
These are both O(N log(N)^2) algorithms that were fairly easy to implement.
However, nowadays, things are a bit different. In the last three world records, instead of performing two computations, we performed only one computation using the fastest known formula (Chudnovsky Formula):
This algorithm is much harder to implement, but it is a lot faster than the AGM algorithms.
Then we verify the binary digits using the BBP formulas for digit extraction.
This formula allows you to compute arbitrary binary digits without computing all the digits before it. So it is used to verify the last few computed binary digits. Therefore it is much faster than a full computation.
The advantage of this is:
Only one expensive computation is needed.
The disadvantage is:
An implementation of the Bailey–Borwein–Plouffe (BBP) formula is needed.
An additional step is needed to verify the radix conversion from binary to decimal.
I've glossed over some details of why verifying the last few digits implies that all the digits are correct. But it is easy to see this since any computation error will propagate to the last digits.
Now this last step (verifying the conversion) is actually fairly important. One of the previous world record holders actually called us out on this because, initially, I didn't give a sufficient description of how it worked.
So I've pulled this snippet from my blog:
N = # of decimal digits desired
p = 64-bit prime number
Compute A using base 10 arithmetic and B using binary arithmetic.
If A = B, then with "extremely high probability", the conversion is correct.
For further reading, see my blog post Pi - 5 Trillion Digits.
Undoubtedly, for your purposes (which I assume is just a programming exercise), the best thing is to check your results against any of the listings of the digits of pi on the web.
And how do we know that those values are correct? Well, I could say that there are computer-science-y ways to prove that an implementation of an algorithm is correct.
More pragmatically, if different people use different algorithms, and they all agree to (pick a number) a thousand (million, whatever) decimal places, that should give you a warm fuzzy feeling that they got it right.
Historically, William Shanks published pi to 707 decimal places in 1873. Poor guy, he made a mistake starting at the 528th decimal place.
Very interestingly, in 1995 an algorithm was published that had the property that would directly calculate the nth digit (base 16) of pi without having to calculate all the previous digits!
Finally, I hope your initial algorithm wasn't pi/4 = 1 - 1/3 + 1/5 - 1/7 + ... That may be the simplest to program, but it's also one of the slowest ways to do so. Check out the pi article on Wikipedia for faster approaches.
You could use multiple approaches and see if they converge to the same answer. Or grab some from the 'net. The Chudnovsky algorithm is usually used as a very fast method of calculating pi. http://www.craig-wood.com/nick/articles/pi-chudnovsky/
The Taylor series is one way to approximate pi. As noted it converges slowly.
The partial sums of the Taylor series can be shown to be within some multiplier of the next term away from the true value of pi.
Other means of approximating pi have similar ways to calculate the max error.
We know this because we can prove it mathematically.
You could try computing sin(pi/2) (or cos(pi/2) for that matter) using the (fairly) quickly converging power series for sin and cos. (Even better: use various doubling formulas to compute nearer x=0 for faster convergence.)
BTW, better than using series for tan(x) is, with computing say cos(x) as a black box (e.g. you could use taylor series as above) is to do root finding via Newton. There certainly are better algorithms out there, but if you don't want to verify tons of digits this should suffice (and it's not that tricky to implement, and you only need a bit of calculus to understand why it works.)
There is an algorithm for digit-wise evaluation of arctan, just to answer the question, pi = 4 arctan 1 :)

Locally weighted logistic regression

I have been trying to implement a locally-weighted logistic regression algorithm in Ruby. As far as I know, no library currently exists for this algorithm, and there is very little information available, so it's been difficult.
My main resource has been the dissertation of Dr. Kan Deng, in which he described the algorithm in what I feel is pretty light detail. My work so far on the library is here.
I've run into trouble when trying to calculate B (beta). From what I understand, B is a (1+d x 1) vector that represents the local weighting for a particular point. After that, pi (the probability of a positive output) for that point is the sigmoid function based on the B for that point. To get B, use the Newton-Raphson algorithm recursively a certain number of times, probably no more than ten.
Equation 4-4 on page 66, the Newton-Raphson algorithm itself, doesn't make sense to me. Based on my understanding of what X and W are, (x.transpose * w * x).inverse * x.transpose * w should be a (1+d x N) matrix, which doesn't match up with B, which is (1+d x 1). The only way that would work, then, is if e were a (N x 1) vector.
At the top of page 67, under the picture, though, Dr. Deng just says that e is a ratio, which doesn't make sense to me. Is e Euler's Constant, and it just so happens that that ratio is always 2.718:1, or is it something else? Either way, the explanation doesn't seem to suggest, to me, that it's a vector, which leaves me confused.
The use of pi' is also confusing to me. Equation 4-5, the derivative of the sigmoid function w.r.t. B, gives a constant multiplied by a vector, or a vector. From my understanding, though, pi' is just supposed to be a number, to be multiplied by w and form the diagonal of the weight algorithm W.
So, my two main questions here are, what is e on page 67 and is that the 1xN matrix I need, and how does pi' in equation 4-5 end up a number?
I realize that this is a difficult question to answer, so if there is a good answer then I will come back in a few days and give it a fifty point bounty. I would send an e-mail to Dr. Deng, but I haven't been able to find out what happened to him after 1997.
If anyone has any experience with this algorithm or knows of any other resources, any help would be much appreciated!
As far as I can see, this is just a version of Logistic regression in which the terms in the log-likelihood function have a multiplicative weight depending on their distance from the point you are trying to classify. I would start by getting familiar with an explanation of logistic regression, such as http://czep.net/stat/mlelr.pdf. The "e" you mention seems to be totally unconnected with Euler's constant - I think he is using e for error.
If you can call Java from Ruby, you may be able to make use of the logistic classifier in Weka described at http://weka.sourceforge.net/doc.stable/weka/classifiers/functions/Logistic.html - this says "Although original Logistic Regression does not deal with instance weights, we modify the algorithm a little bit to handle the instance weights." If nothing else, you could download it and look at its source code. If you do this, note that it is a fairly sophisticated approach - for instance, they check beforehand to see if all the points actually lie pretty much in some subspace of the input space, and project down a few dimensions if they do.

Create a function for given input and ouput

Imagine, there are two same-sized sets of numbers.
Is it possible, and how, to create a function an algorithm or a subroutine which exactly maps input items to output items? Like:
Input = 1, 2, 3, 4
Output = 2, 3, 4, 5
and the function would be:
f(x): return x + 1
And by "function" I mean something slightly more comlex than [1]:
f(x):
if x == 1: return 2
if x == 2: return 3
if x == 3: return 4
if x == 4: return 5
This would be be useful for creating special hash functions or function approximations.
Update:
What I try to ask is to find out is whether there is a way to compress that trivial mapping example from above [1].
Finding the shortest program that outputs some string (sequence, function etc.) is equivalent to finding its Kolmogorov complexity, which is undecidable.
If "impossible" is not a satisfying answer, you have to restrict your problem. In all appropriately restricted cases (polynomials, rational functions, linear recurrences) finding an optimal algorithm will be easy as long as you understand what you're doing. Examples:
polynomial - Lagrange interpolation
rational function - Pade approximation
boolean formula - Karnaugh map
approximate solution - regression, linear case: linear regression
general packing of data - data compression; some techniques, like run-length encoding, are lossless, some not.
In case of polynomial sequences, it often helps to consider the sequence bn=an+1-an; this reduces quadratic relation to linear one, and a linear one to a constant sequence etc. But there's no silver bullet. You might build some heuristics (e.g. Mathematica has FindSequenceFunction - check that page to get an impression of how complex this can get) using genetic algorithms, random guesses, checking many built-in sequences and their compositions and so on. No matter what, any such program - in theory - is infinitely distant from perfection due to undecidability of Kolmogorov complexity. In practice, you might get satisfactory results, but this requires a lot of man-years.
See also another SO question. You might also implement some wrapper to OEIS in your application.
Fields:
Mostly, the limits of what can be done are described in
complexity theory - describing what problems can be solved "fast", like finding shortest path in graph, and what cannot, like playing generalized version of checkers (they're EXPTIME-complete).
information theory - describing how much "information" is carried by a random variable. For example, take coin tossing. Normally, it takes 1 bit to encode the result, and n bits to encode n results (using a long 0-1 sequence). Suppose now that you have a biased coin that gives tails 90% of time. Then, it is possible to find another way of describing n results that on average gives much shorter sequence. The number of bits per tossing needed for optimal coding (less than 1 in that case!) is called entropy; the plot in that article shows how much information is carried (1 bit for 1/2-1/2, less than 1 for biased coin, 0 bits if the coin lands always on the same side).
algorithmic information theory - that attempts to join complexity theory and information theory. Kolmogorov complexity belongs here. You may consider a string "random" if it has large Kolmogorov complexity: aaaaaaaaaaaa is not a random string, f8a34olx probably is. So, a random string is incompressible (Volchan's What is a random sequence is a very readable introduction.). Chaitin's algorithmic information theory book is available for download. Quote: "[...] we construct an equation involving only whole numbers and addition, multiplication and exponentiation, with the property that if one varies a parameter and asks whether the number of solutions is finite or infinite, the answer to this question is indistinguishable from the result of independent tosses of a fair coin." (in other words no algorithm can guess that result with probability > 1/2). I haven't read that book however, so can't rate it.
Strongly related to information theory is coding theory, that describes error-correcting codes. Example result: it is possible to encode 4 bits to 7 bits such that it will be possible to detect and correct any single error, or detect two errors (Hamming(7,4)).
The "positive" side are:
symbolic algorithms for Lagrange interpolation and Pade approximation are a part of computer algebra/symbolic computation; von zur Gathen, Gerhard "Modern Computer Algebra" is a good reference.
data compresssion - here you'd better ask someone else for references :)
Ok, I don't understand your question, but I'm going to give it a shot.
If you only have 2 sets of numbers and you want to find f where y = f(x), then you can try curve-fitting to give you an approximate "map".
In this case, it's linear so curve-fitting would work. You could try different models to see which works best and choose based on minimizing an error metric.
Is this what you had in mind?
Here's another link to curve-fitting and an image from that article:
It seems to me that you want a hashtable. These are based in hash functions and there are known hash functions that work better than others depending on the expected input and desired output.
If what you want a algorithmic way of mapping arbitrary input to arbitrary output, this is not feasible in the general case, as it totally depends on the input and output set.
For example, in the trivial sample you have there, the function is immediately obvious, f(x): x+1. In others it may be very hard or even impossible to generate an exact function describing the mapping, you would have to approximate or just use directly a map.
In some cases (such as your example), linear regression or similar statistical models could find the relation between your input and output sets.
Doing this in the general case is arbitrarially difficult. For example, consider a block cipher used in ECB mode: It maps an input integer to an output integer, but - by design - deriving any general mapping from specific examples is infeasible. In fact, for a good cipher, even with the complete set of mappings between input and output blocks, you still couldn't determine how to calculate that mapping on a general basis.
Obviously, a cipher is an extreme example, but it serves to illustrate that there's no (known) general procedure for doing what you ask.
Discerning an underlying map from input and output data is exactly what Neural Nets are about! You have unknowingly stumbled across a great branch of research in computer science.

Efficient evaluation of hypergeometric functions

Does anyone have experience with algorithms for evaluating hypergeometric functions? I would be interested in general references, but I'll describe my particular problem in case someone has dealt with it.
My specific problem is evaluating a function of the form 3F2(a, b, 1; c, d; 1) where a, b, c, and d are all positive reals and c+d > a+b+1. There are many special cases that have a closed-form formula, but as far as I know there are no such formulas in general. The power series centered at zero converges at 1, but very slowly; the ratio of consecutive coefficients goes to 1 in the limit. Maybe something like Aitken acceleration would help?
I tested Aitken acceleration and it does not seem to help for this problem (nor does Richardson extrapolation). This probably means Pade approximation doesn't work either. I might have done something wrong though, so by all means try it for yourself.
I can think of two approaches.
One is to evaluate the series at some point such as z = 0.5 where convergence is rapid to get an initial value and then step forward to z = 1 by plugging the hypergeometric differential equation into an ODE solver. I don't know how well this works in practice; it might not, due to z = 1 being a singularity (if I recall correctly).
The second is to use the definition of 3F2 in terms of the Meijer G-function. The contour integral defining the Meijer G-function can be evaluated numerically by applying Gaussian or doubly-exponential quadrature to segments of the contour. This is not terribly efficient, but it should work, and it should scale to relatively high precision.
Is it correct that you want to sum a series where you know the ratio of successive terms and it is a rational function?
I think Gosper's algorithm and the rest of the tools for proving hypergeometric identities (and finding them) do exactly this, right? (See Wilf and Zielberger's A=B book online.)

Resources