Upgrading a binary search algorithm to something more sophisticated - algorithm

I solved an analytically unsolvable problem with numerical methods. I am searching for X, based on a desired Y value. f(x)=y is possible, x=f^-1(y) is not.
Currently the algorithm does a binary search. It starts at X=50%, calculates Y, and returns Y_err=Y-Y_demand. It keeps stepping by intervals of 5% in the direction of shrinking Y_err until Y_err changes sign, then it reduces the step and steps in the opposite direction. This works, but it's embarrassingly slow and inefficient.
Below, an example chart of x=f^-1(y). I chose one with high coefficients for the nonlinear part.
Example chart of x=f^-1(y)
It varies depending on coefficients, but always has this pseudoparabolic shape. It's of course nonlinear and even 9th order polynomial approximations don't offer satisfactory precision.
For simplicity's sake let's say the inflection point is at X=50%, and I am looking only for solutions where X>50%.
How should I proceed? I'm looking to optimise as much as possible. What are some good algorithms? Thanks.
EDIT: Thank you for pointing out that this is not in fact a binary search. I've updated the code and now have much better results by comparison.
I'm not sure if Newton's method applies here, or at least I don't know how to apply it. One-way trial and error is all I can do. When I have some more time I will try to learn and implement regula falsi.
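A minimal sketch of two bracketing methods that need only evaluations of f, assuming f is monotonic on the search interval (the question restricts itself to X>50%) and that the interval endpoints bracket the target. The names `bisect_solve`, `regula_falsi`, and the toy `f` are hypothetical. The second function is the Illinois variant of regula falsi, swapped in here because plain regula falsi can stall on strongly curved functions.

```python
def bisect_solve(f, y_target, a, b, tol=1e-9, max_iter=200):
    """True bisection: halve the bracket [a, b] until f(x) is close to y_target."""
    err_a = f(a) - y_target
    err_b = f(b) - y_target
    if err_a * err_b > 0:
        raise ValueError("y_target is not bracketed by [a, b]")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        err_mid = f(mid) - y_target
        if abs(err_mid) < tol or abs(b - a) < tol:
            return mid
        # Keep the half whose endpoint errors still have opposite signs.
        if err_a * err_mid <= 0:
            b = mid
        else:
            a, err_a = mid, err_mid
    return 0.5 * (a + b)


def regula_falsi(f, y_target, a, b, tol=1e-9, max_iter=200):
    """Illinois variant: interpolate linearly between the bracket endpoints."""
    err_a = f(a) - y_target
    err_b = f(b) - y_target
    if err_a * err_b > 0:
        raise ValueError("y_target is not bracketed by [a, b]")
    x = b
    for _ in range(max_iter):
        # Where the straight line through both endpoints crosses zero error.
        x = b - err_b * (b - a) / (err_b - err_a)
        err = f(x) - y_target
        if abs(err) < tol:
            return x
        if err * err_b < 0:
            # Sign change between x and b: b becomes the retained endpoint.
            a, err_a = b, err_b
        else:
            # Same side as b again: halve the retained error to avoid stalling.
            err_a *= 0.5
        b, err_b = x, err
    return x


def f(x):
    # Toy stand-in for the real forward model, increasing for x > 0.5.
    return (x - 0.5) ** 2 + 0.2 * (x - 0.5) ** 4


x_bisect = bisect_solve(f, f(0.8), 0.5, 1.0)
x_falsi = regula_falsi(f, f(0.8), 0.5, 1.0)
```

Both functions return an X for which f(X) is within `tol` of the demanded Y; the interpolating version typically needs far fewer evaluations of f than stepping by a fixed percentage.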

Related

What string distance algorithm is best for measuring typing accuracy?

I'm trying to write a function that detects how accurately the user typed a particular phrase, sentence, or word. My objective is to build an app to train the user's typing accuracy on certain phrases.
My initial instinct is to use the basic Levenshtein distance algorithm (mostly because that's the only algorithm I knew off the top of my head).
But after a bit more research, I saw that Jaro-Winkler is a slightly more interesting algorithm because of its consideration for transpositions.
I even found a link that talks about the differences between these algorithms:
Difference between Jaro-Winkler and Levenshtein distance?
Having read all that, in addition to the respective Wikipedia posts, I am still a little clueless as to which algorithm fits my objective the best.
Since you are grading the quality of typing, and you want to train the student to make zero mistakes, you should use Levenshtein distance, because it is less forgiving.
Additionally, Levenshtein score is more intuitive to understand, and easier to represent graphically, than the Jaro-Winkler results. You can modify Levenshtein algorithm to report insertions, deletions, and mistypes separately, and show end-users a list of corrections. Jaro-Winkler, on the other hand, gives you a score that is hard to show to end-user, because penalties for misspelling in the middle are lower than penalties at the end.
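A minimal sketch of the modification described above: the standard Levenshtein table, followed by a walk back through it that tallies insertions, deletions, and mistypes separately. The function name and the `expected`/`typed` parameters are hypothetical, and when several minimal edit scripts exist this walk simply reports one of them.

```python
def levenshtein_report(expected, typed):
    """Return (distance, counts) for turning `expected` into `typed`."""
    rows, cols = len(expected) + 1, len(typed) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if expected[i - 1] == typed[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # expected char was left out
                dist[i][j - 1] + 1,         # extra char was typed
                dist[i - 1][j - 1] + cost,  # match or mistype
            )

    # Walk back from the bottom-right corner to classify each edit.
    counts = {"insertions": 0, "deletions": 0, "mistypes": 0}
    i, j = len(expected), len(typed)
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and expected[i - 1] != typed[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + mismatch:
            if mismatch:
                counts["mistypes"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1
    return dist[-1][-1], counts


distance, counts = levenshtein_report("kitten", "sitting")
```

The three counts always add up to the distance, so the single score and the per-category breakdown can be shown side by side.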
Slightly tongue-in-cheek, but only slightly: build a generative model for typing that gives high (prior) probability to hitting the right letter, and apportion out some probabilities for hitting two neighboring keys at once, two keys from different hands in the wrong order, two keys from the same hand in the wrong order, a key near the correct one, a key far from the correct one, etc. Or perhaps less ad-hoc: give your model a probability for a given sequence of keypresses given the current pair of keys needed to continue the passage. You could do a lot of things with such a model; for example, you could get a "distance"-like metric by giving a likelihood score for the learner's actual performance. But even better would be to give them a report summarizing which kinds of errors they make the most -- after all, why boil their performance down to a single number when many numbers would do? Bonus points if you learn the probabilities for the different kinds of errors from a large corpus of real typists' work.
I mostly agree with the answer given by dasblinkenlight, but I would suggest using the Damerau-Levenshtein distance instead of plain Levenshtein, that is, including transpositions. Transpositions are fairly frequent and easy to make while typing, and there is no good reason why they should incur a double distance penalty compared with the other possible errors (insertions, deletions, and substitutions).
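A minimal sketch of how a transposition can be charged a single edit. Note the assumption: this is the restricted "optimal string alignment" form of Damerau-Levenshtein, which never edits the same substring twice, not the unrestricted distance. The function name is hypothetical.

```python
def osa_distance(expected, typed):
    """Levenshtein distance plus adjacent transpositions at cost 1."""
    rows, cols = len(expected) + 1, len(typed) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if expected[i - 1] == typed[j - 1] else 1
            best = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
            # Two adjacent characters typed in swapped order.
            if (i > 1 and j > 1
                    and expected[i - 1] == typed[j - 2]
                    and expected[i - 2] == typed[j - 1]):
                best = min(best, dist[i - 2][j - 2] + 1)
            dist[i][j] = best
    return dist[-1][-1]


swapped = osa_distance("the", "teh")
```

Plain Levenshtein scores the same pair as two substitutions, which is the double penalty mentioned above.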

How to simplify a spline?

I have an interesting algorithmic challenge in a project I am working on. I have a sorted list of coordinate points pointing at buildings on either side of a street that, sufficiently zoomed in, looks like this:
I would like to take this zigzag and smooth it out to linearize the underlying street.
I can think of a couple of solutions:
Calculate centroids using rolling averages of six or so points, and use those.
Spline regression.
Is there a better or best way to approach this problem? (I am using Python 3.5)
Based on your description and your comments, you are looking for a line simplification algorithm.
The Ramer-Douglas-Peucker algorithm (suggested in the comment) is probably the best-known algorithm in this family, but there are many more.
For example, Visvalingam’s algorithm works by repeatedly removing the point whose removal causes the smallest change, measured as the area of the triangle the point forms with its two neighbours. This makes it very easy to code and intuitive to understand. If the research paper is hard to read, you can read this easy article.
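A minimal sketch of the triangle-area idea, assuming points are `(x, y)` tuples in order along the line. The function names are hypothetical, and this naive version recomputes every area after each removal, so it is quadratic in the number of points; practical implementations usually keep the areas in a priority queue.

```python
def triangle_area(a, b, c):
    """Area of the triangle with corners a, b, c."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def visvalingam(points, n_keep):
    """Drop interior points, smallest triangle first, until n_keep remain."""
    pts = list(points)
    while len(pts) > max(n_keep, 2):
        # One area per interior point; the two endpoints are never removed.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(range(len(areas)), key=areas.__getitem__)
        del pts[smallest + 1]
    return pts


line = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
simplified = visvalingam(line, 4)
```

In the sample line the nearly collinear point `(1, 0.1)` forms the smallest triangle, so it is the first to go.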
Other algorithms in this family are:
Opheim
Lang
Zhao
Read about them, understand what they are trying to minimize, and select the one most suitable for you.
Dali's post correctly surmises that a line simplification algorithm is useful for this task. Before posting this question I actually examined a few such algorithms but wasn't quite comfortable with them because even though they resulted in the simplified geometry that I liked, they didn't directly address the issue I had of points being on either side of the feature and never in the middle.
Thus I used a two-step process:
I computed the centroids of the polyline by using a rolling average of the coordinates of the five surrounding points. This didn't help much with smoothing the function but it did mostly succeed in remapping them to the middle of the street.
I applied Visvalingam’s algorithm to the new polyline, with n=20 points specified (using this wonderful implementation).
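The first step above can be sketched as follows, assuming the points are `(x, y)` tuples sorted along the street. The function name is hypothetical, and shrinking the window at both ends of the list is an assumption; the post does not say how the ends were handled.

```python
def rolling_centroids(points, window=5):
    """Replace each point by the mean of the points in a window centred on it."""
    half = window // 2
    centroids = []
    for i in range(len(points)):
        # The window is truncated where it would run past either end.
        chunk = points[max(0, i - half):i + half + 1]
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        centroids.append((mean_x, mean_y))
    return centroids


# Buildings alternating between the two sides of a street along y = 0.
zigzag = [(i, 1 if i % 2 else -1) for i in range(10)]
centred = rolling_centroids(zigzag)
```

The averaged points sit much closer to the centre line than the originals, and the result can then be passed to a line simplification routine.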
The result wasn't quite perfect but it was good enough:
Thanks for the help everyone!

Algorithm for highest value inside budget

I wasn't entirely sure the best way to ask this question (or do the research to see if it has been previously answered).
Given a data set where each entry has a Point value and a Dollar value, I'm looking to generate a list of length N entries that yields the highest aggregate Point value whilst staying within budget B.
Example data set:
Item Points Dollars
Apple 3.0 $1.00
Pear 2.5 $0.75
Peach 2.8 $0.88
And with this (small) data set, say my budget (B) is $2.25, and list length (N) must be 2. You MUST use the fixed list length, but are not required to use ALL of the budget.
Obviously the example provided is easy to do in one's head, but given a much larger data set, and both higher N and B values, I'm looking for an algorithm that can generate the list. Having a hard time wrapping my head around this one.
Just looking for a pseudo-algorithm, but if you prefer any given language feel free to respond with that!
I am quite positive that this is a variant of the knapsack problem, which is NP-complete, so it is not really worth trying to develop a process that always gives you the 'correct' answer efficiently over a large data set; many people have tried and failed. However, you can use a much more efficient approximation technique. It will not guarantee the correct answer, but many popular approximation algorithms achieve a high degree of accuracy.
Hope this helps you out :)
This problem is NP-Complete (in NP and NP-Hard), meaning that no algorithm has been found so far that solves it in time polynomial in the input size. If you find one, you will have solved one of the greatest open problems in computer science (P=NP), which would bring you at least a million-dollar reward.
If you are satisfied with an approximation, I would recommend a greedy algorithm:
https://en.wikipedia.org/wiki/Greedy_algorithm
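For modest budgets an exact answer may still be practical. The sketch below is a swapped-in technique, not the greedy approach linked above: a pseudo-polynomial dynamic program over (items chosen, cents spent). It assumes prices can be expressed in whole cents and that each entry may be used at most once, neither of which the question states; the function name is hypothetical.

```python
def best_selection(items, n, budget_cents):
    """Pick exactly n distinct items with maximum points within the budget.

    items: list of (name, points, cost_cents).
    Returns (total_points, names), or None if no n items fit the budget.
    """
    # best[k][c]: best (points, names) using exactly k items costing exactly c.
    best = [[None] * (budget_cents + 1) for _ in range(n + 1)]
    best[0][0] = (0.0, ())
    for name, points, cost in items:
        # Go through k downwards so each item is added at most once.
        for k in range(n, 0, -1):
            for c in range(budget_cents, cost - 1, -1):
                prev = best[k - 1][c - cost]
                if prev is None:
                    continue
                candidate = (prev[0] + points, prev[1] + (name,))
                if best[k][c] is None or candidate[0] > best[k][c][0]:
                    best[k][c] = candidate
    feasible = [cell for cell in best[n] if cell is not None]
    return max(feasible, key=lambda cell: cell[0]) if feasible else None


fruit = [("Apple", 3.0, 100), ("Pear", 2.5, 75), ("Peach", 2.8, 88)]
result = best_selection(fruit, n=2, budget_cents=225)
```

The running time grows with the number of items times N times the budget in cents, so this only helps while B stays reasonably small; beyond that an approximation is the realistic option.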

Gradient descent implementation

I've implemented both the batch and stochastic gradient descent. I'm experiencing some issues though. This is the stochastic rule:
for i = 1 to m {
    theta(j) := theta(j) - step * derivative   (for all j, using training example i)
}
The issue I have is that, even though the cost function is becoming smaller and smaller the testing says it's not good. If I change the step a bit and change the number of iterations, the cost function is a bit bigger in value but the results are ok. Is this an overfitting "symptom"? How do I know which one is the right one? :)
As I said, even though the cost function is more minimized the testing says it's not good.
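To make the stochastic rule concrete, here is a minimal sketch assuming a least-squares cost and a straight-line model; the question names neither, so both are assumptions, as are the function name and the default step and epoch count.

```python
import random


def sgd_linear(xs, ys, step=0.05, epochs=500, seed=0):
    """Fit y ~ theta[0] + theta[1] * x with one update per training example."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the m examples in a fresh random order
        for i in order:
            residual = theta[0] + theta[1] * xs[i] - ys[i]
            # Derivatives of 0.5 * residual**2 with respect to theta[0], theta[1].
            theta[0] -= step * residual
            theta[1] -= step * residual * xs[i]
    return theta


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]
theta = sgd_linear(xs, ys)
```

Running the same loop while also recording the cost on held-out data is one way to compare how training cost and test performance move as the step and iteration count change.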
Gradient descent is a local search method for minimizing a function. When it reaches a local minimum in the parameter space, it won't be able to go any further. This makes gradient descent (and other local methods) prone to getting stuck in local minima, rather than reaching the global minimum. The local minima may or may not be good solutions for what you're trying to achieve. What to expect will depend on the function that you're trying to minimize.
In particular, high-dimensional NP-complete problems can be tricky. They often have exponentially many local optima, with many of them nearly as good as the global optimum in terms of cost, but with parameter values orthogonal to those for the global optimum. These are hard problems: you don't generally expect to be able to find the global optimum, instead just looking for a local minimum that is good enough. These are also relevant problems: many interesting problems have just these properties.
I'd suggest first testing your gradient descent implementation with an easy problem. You might try finding the minimum in a polynomial. Since it's a one-parameter problem, you can plot the progress of the parameter values along the curve of the polynomial. You should be able to see if something is drastically wrong, and can also observe how the search gets stuck in local minima. You should also be able to see that the initial parameter choice can matter quite a lot.
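A minimal sketch of such a test, assuming the quartic x^4 - 3x^2 + x as the "easy problem"; the polynomial, the step size, and the function names are illustrative choices, not anything from the question.

```python
def f(x):
    return x ** 4 - 3 * x ** 2 + x


def grad_f(x):
    return 4 * x ** 3 - 6 * x + 1


def gradient_descent(grad, x0, step=0.01, iterations=2000):
    """Plain gradient descent on one parameter, recording every iterate."""
    x = x0
    path = [x]
    for _ in range(iterations):
        x -= step * grad(x)
        path.append(x)
    return x, path


# The same search from two different starting points.
left, left_path = gradient_descent(grad_f, -2.0)
right, right_path = gradient_descent(grad_f, 2.0)
```

This polynomial has two minima of different depth, so the two runs settle in different places; plotting `left_path` and `right_path` against the curve shows how much the initial parameter choice matters.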
For dealing with harder problems, you might modify your algorithm to help it escape the local minima. A few common approaches:
Add noise. This reduces the precision of the parameters you've found, which can "blur" out local minima. The search can then jump out of local minima that are small compared to the noise, while still being trapped in deeper minima. A well-known approach for adding noise is simulated annealing.
Add momentum. Along with using the current gradient to define the step, also continue in the same direction as the previous step. If you take a fraction of the previous step as the momentum term, there is a tendency to keep going, which can take the search past the local minimum. By using a fraction, the steps decay exponentially, so poor steps aren't a big problem. This has long been a popular modification to gradient descent when training neural networks, where the gradients are computed by backpropagation.
Use a hybrid search. First use a global search (e.g., genetic algorithms, various Monte Carlo methods) to find some good starting points, then apply gradient descent to take advantage of the gradient information in the function.
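Of the approaches listed above, momentum is the shortest to sketch. The version below is a minimal one-parameter example; the function names, the test polynomial, and the values of `step` and `momentum` are assumptions chosen for illustration.

```python
def grad_f(x):
    # Gradient of x**4 - 3*x**2 + x, a curve with two minima.
    return 4 * x ** 3 - 6 * x + 1


def momentum_descent(grad, x0, step=0.01, momentum=0.9, iterations=2000):
    """Gradient descent where each step keeps a fraction of the previous one."""
    x, velocity = x0, 0.0
    for _ in range(iterations):
        # Old steps decay by `momentum` each iteration; the new gradient is added.
        velocity = momentum * velocity - step * grad(x)
        x += velocity
    return x


x_final = momentum_descent(grad_f, 2.0)
```

Whether the search carries past a shallow minimum depends on the step size, the momentum fraction, and the shape of the function, so it is worth plotting a few runs with different settings.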
I won't make a recommendation on which to use. Instead, I'll suggest doing a little research to see what others have done with problems related to what you're working on. If it's purely a learning experience, momentum is probably the easiest to get working.
There are lots of things that could be going on:
your step could be a bad choice
your derivative might be off
your "expected value" might be mistaken
your gradient descent could simply be slow to converge
I would try increasing the run length, and plot runs with a variety of step values. A smaller step will have a better chance of avoiding the problems of, er, steps that are too big.

Optimization of multivariate function with an initial solution close to the optimum

I was wondering if anyone knows which kind of algorithm could be used in my case. I have already run the optimizer on my multivariate function and found a solution to my problem, assuming that my function is regular enough. I then slightly perturb the problem and would like to find the optimum solution, which is close to my last solution. Is there a very fast algorithm for this case, or should I just fall back to a regular one?
We probably need a bit more information about your problem; but since you know you're near the right solution, and if derivatives are easy to calculate, Newton-Raphson is a sensible choice, and if not, Conjugate-Gradient may make sense.
If you already have an iterative optimizer (for example, based on Powell's direction set method, or CG), why don't you use your initial solution as a starting point for the next run of your optimizer?
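A minimal sketch of the warm-start idea with a Newton iteration, assuming a two-variable problem whose gradient and Hessian are cheap to evaluate. The toy objective (x-a)^4 + (x-a)^2 + (y-b)^4 + (y-b)^2, the helper `make_problem`, and the parameters `a` and `b` are all hypothetical stand-ins for the real function and its perturbation.

```python
def make_problem(a, b):
    """Gradient and Hessian of (x-a)**4 + (x-a)**2 + (y-b)**4 + (y-b)**2."""
    def grad(x, y):
        return (4 * (x - a) ** 3 + 2 * (x - a),
                4 * (y - b) ** 3 + 2 * (y - b))

    def hess(x, y):
        return ((12 * (x - a) ** 2 + 2, 0.0),
                (0.0, 12 * (y - b) ** 2 + 2))

    return grad, hess


def newton_minimize(grad, hess, start, tol=1e-8, max_iter=50):
    """Newton iteration in two variables; returns (point, iterations used)."""
    x, y = start
    for iteration in range(max_iter):
        gx, gy = grad(x, y)
        if (gx * gx + gy * gy) ** 0.5 < tol:
            return (x, y), iteration
        (hxx, hxy), (hyx, hyy) = hess(x, y)
        det = hxx * hyy - hxy * hyx
        # Solve H * d = -g for the step d with Cramer's rule.
        dx = (-gx * hyy + gy * hxy) / det
        dy = (-gy * hxx + gx * hyx) / det
        x, y = x + dx, y + dy
    return (x, y), max_iter


# Solve the original problem, then the perturbed one from two starting points.
old_optimum, _ = newton_minimize(*make_problem(3.0, -2.0), start=(0.0, 0.0))
grad, hess = make_problem(3.05, -1.95)
warm_point, warm_iters = newton_minimize(grad, hess, start=old_optimum)
cold_point, cold_iters = newton_minimize(grad, hess, start=(0.0, 0.0))
```

Comparing `warm_iters` with `cold_iters` on the real problem is a quick way to see how much the previous solution is worth as a starting point.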
EDIT: due to your comment: if calculating the Jacobian or the Hessian matrix gives you performance problems, try BFGS (http://en.wikipedia.org/wiki/BFGS_method), which avoids calculating the Hessian completely; at
http://www.alglib.net/optimization/lbfgs.php you will find a (free-for-non-commercial) implementation of BFGS. You will find a good description of the details here.
And don't expect to get anything from finding your initial solution with a less sophisticated algorithm.
So this is all about unconstrained optimization. If you need information about constrained optimization, I suggest you google for "SQP".
There are a bunch of algorithms for finding the roots of equations. If you know approximately where the root is, there are algorithms that will get you arbitrarily close very quickly, in ln n time or better.
One is Newton's method.
Another is the bisection method.
Note that these algorithms are for single variable functions, but can be expanded to multivariate functions.
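A minimal single-variable sketch of Newton's method, assuming the derivative is available and the starting guess is already near the root; the function name and the example equation are illustrative only.

```python
def newton_root(f, df, x0, tol=1e-12, max_iter=50):
    """Find x with f(x) close to 0, starting from the guess x0."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        # Follow the tangent line at x down to where it crosses zero.
        x -= fx / df(x)
    raise RuntimeError("Newton's method did not converge")


root = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.5)
```

To use a root finder for minimization, apply it to the derivative of the function being minimized, which in turn requires the second derivative for the Newton step.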
Every minimization algorithm performs better (read: performs at all) if you have a good initial guess. In your case, the initial guess for the perturbed problem will be the minimum point of the unperturbed problem.
Then, you have to specify your requirements: you want speed. What accuracy do you want? Does space efficiency matter? Most importantly: what information do you have: only the value of the function, or do you also have the derivatives (possibly second derivatives)?
Some background on the problem would help too. Looking for a smooth function which has been discretized will be very different than looking for hundreds of unrelated parameters.
Global information (i.e., is the function convex, is there a guaranteed global minimum or many local ones, etc.) can be left aside for now. If you have trouble finding the minimum point of the perturbed problem, though, this is something you will have to investigate.
Answering these questions will allow us to select a particular algorithm. There are many choices (and trade-offs) for multivariate optimization.
Also, which is quicker will very much depend on the problem (rather than on the algorithm), and should be determined by experimentation.
Though I don't know much about using computers in this capacity, I remember an article that used neuroevolutionary techniques to find "best-fit" equations relatively efficiently, given a known function complexity (linear, Nth-polynomial, exponential, logarithmic, etc.) and a set of point plots. As I recall it was one of the earliest uses of what we now know as computational neuroevolution; because the functional complexity (and thus the number of terms) of the equation is known and fixed, a static neural net can be used and seeded with your closest values, then "mutated" and tested for fitness, with heuristics to make new nets closer to existing nets with high fitness. Using multithreading, many nets can be created, tested and evaluated in parallel.

Resources