Is there some way to bruteforce a mathematical function? [closed] - algorithm

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I'm not sure if this is a stupid question but I couldn't really find anything on Google. Given a few data points for a function f(x) would it be possible to bruteforce what the function f(x) itself might be?

This will rely on some prior knowledge of f(x).
If you know that the function is constant, one point is enough; if it is a line, two points; and in general a polynomial of degree n is determined by n + 1 points with distinct x values.
But if you have no restrictions, this isn't possible. Assuming function here means something like a real-valued function on the real numbers, there are (uncountably) infinitely many functions which will take the specified values on any finite set of data points.
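Both halves of this answer can be illustrated with a small sketch. It assumes the data really come from a polynomial and uses Lagrange interpolation, which is only one of several ways to recover it; the sample points and the names `lagrange`, `p`, and `q` are made up for the illustration.

```python
from fractions import Fraction


def lagrange(points):
    """Return a function evaluating the unique polynomial of degree
    < len(points) that passes through the given (x, y) points.
    The x values must be distinct."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]

    def p(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (xi, yi) in enumerate(pts):
            term = yi
            for j, (xj, _) in enumerate(pts):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return p


# Three hypothetical samples, here taken from f(x) = x^2 + 1.
data = [(0, 1), (1, 2), (2, 5)]
p = lagrange(data)


def q(x):
    """A different function that matches every data point: it adds a term
    that vanishes at x = 0, 1 and 2."""
    x = Fraction(x)
    return p(x) + x * (x - 1) * (x - 2)
```

Here `p` and `q` agree on all three data points but differ everywhere else, which is why the data alone cannot single out one function without a restriction such as "polynomial of degree at most 2".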

This is mostly a math question, and the answer depends on the number of data points available. You are basically fitting data to a function; for example, you need two data points for a straight line. A commercial solution is TableCurve 2D, http://en.wikipedia.org/wiki/TableCurve_2D. I would search for "nonlinear fit" on Google.
Fitting algorithms are also described in Numerical Recipes (http://en.wikipedia.org/wiki/Numerical_Recipes). The simplest algorithm minimizes the deviations between the assumed function and the data points. If you assume a certain error on your data points, you can calculate chi-square and the goodness of your fit.
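As a rough sketch of that idea, the following fits a straight line by least squares and then computes chi-square. The data values, the assumed measurement error `sigma`, and the function names are all invented for the example; a real fit would use whatever model and error estimate suit the data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def chi_square(xs, ys, sigma, model):
    """Sum of squared deviations between data and model, in units of sigma."""
    return sum(((y - model(x)) / sigma) ** 2 for x, y in zip(xs, ys))


# Hypothetical measurements with an assumed error of 0.2 on each y value.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
sigma = 0.2

intercept, slope = fit_line(xs, ys)
chi2 = chi_square(xs, ys, sigma, lambda x: intercept + slope * x)

# Degrees of freedom: number of points minus number of fitted parameters.
reduced_chi2 = chi2 / (len(xs) - 2)
```

A reduced chi-square near 1 suggests the assumed function is consistent with the data at the stated error; a much larger value suggests the wrong model or an underestimated error.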

Related

What is the best algorithm for optimizing a combination of variables? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 20 hours ago.
I want to write software that calculates the best combination of materials to use, based on parameters such as tensile strength, elastic modulus, stiffness, and the results of certain tests on those materials. Each of those factors is going to be weighted differently in a WDM. Is there an algorithm that would allow me to find the best combination without actually going through all the combinations and doing each individual calculation? I will be working with a lot of data, so efficiency is important.
I tried researching algorithms like Kruskal's and others, but I'm not very familiar with them.
First step is to write down an equation to calculate a number that you want to optimize.
If you can do that, and the equation contains no squares, products of variables, or other nonlinear terms, then this is the classical linear programming problem https://en.wikipedia.org/wiki/Linear_programming
Your equation needs to look something like this:
max O = n1 * p1 + n2 * p2 - n3 * p3 ...
If so, then your best bet is to choose a linear programming package (a web search will turn up several) with a good introductory tutorial and plug your problem into that. After a day or so on a steep learning curve, your problem will become almost trivial.
If you cannot do that, then you will need to use some sort of hill-climbing algorithm; it is probably best to hire an expert to help with that.
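For orientation only, a minimal hill-climbing sketch might look like the following. The objective `score` is a made-up stand-in (it is not derived from any materials data), and the search moves one integer step along one variable at a time, so it only finds a local optimum in general.

```python
def hill_climb(objective, start, step=1, max_iters=1000):
    """Greedy coordinate-wise hill climbing on integer tuples.
    Accepts any single-variable move of +/- step that improves the
    objective, and stops when no such move exists."""
    current = tuple(start)
    best = objective(current)
    for _ in range(max_iters):
        improved = False
        for i in range(len(current)):
            for delta in (step, -step):
                candidate = current[:i] + (current[i] + delta,) + current[i + 1:]
                value = objective(candidate)
                if value > best:
                    current, best = candidate, value
                    improved = True
        if not improved:
            break
    return current, best


def score(v):
    """Hypothetical objective with a single peak at (3, -1)."""
    x, y = v
    return -(x - 3) ** 2 - (y + 1) ** 2


solution, value = hill_climb(score, start=(0, 0))
```

On an objective with several peaks, the result depends on the starting point, which is why restarts from different starting points are a common addition.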

Polynomial Regression - results accuracy between two algorithms [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
I know that I can find a polynomial regression's coefficients doing (X'X)^-1 * X'y (where X' is the transpose, see Wikipedia for details).
This is one way of finding the coefficients; there is (as far as I know) at least one other way, which is to minimize a cost function using gradient descent. The former method seems to be the easiest to implement (I did it in C++; I have the latter in Matlab).
What I wanted to know is the advantage of one of these methods over the other.
Upon a particular dataset, with very few points, I found that I couldn't find a satisfactory solution using (X'X)^-1 * X'y, but gradient descent worked fine and I could get an estimation function that made sense.
So what's wrong with the matrix solution compared with gradient descent? And how would one test a regression's results when all the details are hidden from the user?
Both methods minimize the same least-squares cost, so they should agree when the solution is unique. The iterative method can be much more computationally efficient, thanks to lower storage and the avoidance of the matrix inverse calculation. It outperforms the closed-form (matrix equation) method especially when X is huge and sparse.
Make sure X has more rows than columns to avoid an underdetermined problem. Also check the condition number of X'X to see whether the problem is ill-conditioned. If it is, you may add a small regularization term to the closed form, (X'X + lambda * I)^(-1) * X'y, where lambda is a small positive value and I is the identity matrix.
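To make the comparison concrete, here is a small sketch for the simplest case, a straight line y = a + b*x, where X'X is only 2x2 and can be inverted by hand. The data, learning rate, step count, and function names are assumptions chosen for the illustration; they are not tuned for any real dataset.

```python
def normal_equations(xs, ys, lam=0.0):
    """Solve (X'X + lam * I) w = X'y for the model y = a + b * x.
    X has a column of ones and a column of x values, so X'X is 2x2."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a11, a12, a22 = n + lam, sx, sxx + lam
    det = a11 * a22 - a12 * a12
    a = (a22 * sy - a12 * sxy) / det
    b = (a11 * sxy - a12 * sy) / det
    return a, b


def gradient_descent(xs, ys, lr=0.1, steps=5000):
    """Minimize the mean squared error of y = a + b * x by batch gradient descent."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [a + b * x - y for x, y in zip(xs, ys)]
        grad_a = sum(errors) / n
        grad_b = sum(e * x for e, x in zip(errors, xs)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical noise-free data from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

closed = normal_equations(xs, ys)
ridge = normal_equations(xs, ys, lam=1e-8)   # tiny regularization term
iterative = gradient_descent(xs, ys)
```

On a well-conditioned problem like this one, all three results coincide to within numerical tolerance. The differences the question describes tend to show up when X'X is nearly singular, which is where the regularization term helps.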

generating random variable having an exponential density function [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I would like to generate a random variable having an exponential density function:
f(x) = e^x / (e - 1), 0 <= x <= 1
I know I can use a uniform random number generator with the inversion method for a simple function like e^-x, but I am not sure how to apply it to the function given above.
Any suggestions?
Per Wolfram Alpha, the integral of that density function from 0 to a is (e^a-1)/(e-1), which inverts to y=log((e-1)*x+1). So the inverse transform method should work fine.
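A minimal sketch of that inverse transform, using only Python's standard library, could look like this. The function name, seed, and sample count are arbitrary choices for the illustration.

```python
import math
import random


def sample_inverse(rng):
    """Draw one value with density f(x) = e^x / (e - 1) on [0, 1]
    by applying the inverse CDF to a uniform random number."""
    u = rng.random()                      # uniform on [0, 1)
    return math.log((math.e - 1) * u + 1)


rng = random.Random(0)                    # fixed seed for repeatability
samples = [sample_inverse(rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

# The exact mean of this density is 1 / (e - 1), a handy sanity check.
exact_mean = 1 / (math.e - 1)
```

Comparing `sample_mean` with `exact_mean` gives a quick check that the inversion was done correctly.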
In the more general case where the integral doesn't pan out or the inversion doesn't pan out, stochastic sampling methods are the most widely applicable methods for sampling a random variable given its probability density. The easiest to understand and implement is Rejection Sampling. After that, you're looking at Metropolis-Hastings, which is immensely powerful but not necessarily the simplest to get your head around.
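As a sketch of rejection sampling for the same density, assuming a uniform proposal on [0, 1] and using the fact that the density is largest at x = 1 (the names and seed are illustrative only):

```python
import math
import random


def density(x):
    """Target density f(x) = e^x / (e - 1) on [0, 1]."""
    return math.exp(x) / (math.e - 1)


# Upper bound of the density on [0, 1]; it is increasing, so the maximum is at x = 1.
F_MAX = math.e / (math.e - 1)


def sample_rejection(rng):
    """Propose x uniformly on [0, 1) and accept it with probability f(x) / F_MAX."""
    while True:
        x = rng.random()
        if rng.random() * F_MAX <= density(x):
            return x


rng = random.Random(1)                    # fixed seed for repeatability
draws = [sample_rejection(rng) for _ in range(50_000)]
draws_mean = sum(draws) / len(draws)
```

With this proposal roughly 63% of candidates are accepted, since the acceptance rate equals 1 / F_MAX. For densities with tall narrow peaks a uniform proposal wastes most candidates, which is one motivation for methods such as Metropolis-Hastings.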
The first step is to integrate f(x) from 0 to x to obtain the cumulative distribution function; call its value U. Then (pseudo-)randomly pick a number uniformly between 0 and 1, set U equal to it, and find the x that satisfies this equation.
Your function appears to be simple enough that direct inversion will work. If you have a more complicated function, you would have to use a numerical root finder such as the Newton-Raphson method to solve for x given U.
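For illustration, the Newton-Raphson route can be sketched on this same density, where the closed-form inverse is available as a cross-check. The starting point, tolerance, and iteration cap are assumptions, and a production version would want safeguards for functions that are less well behaved.

```python
import math


def cdf(x):
    """Cumulative distribution: integral of e^t / (e - 1) from 0 to x."""
    return (math.exp(x) - 1) / (math.e - 1)


def pdf(x):
    """Density, which is also the derivative of the CDF."""
    return math.exp(x) / (math.e - 1)


def invert_cdf_newton(u, x0=0.5, tol=1e-12, max_iters=50):
    """Solve cdf(x) = u for x with Newton-Raphson iterations."""
    x = x0
    for _ in range(max_iters):
        step = (cdf(x) - u) / pdf(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def invert_cdf_exact(u):
    """Closed-form inverse, used here only to check the numerical result."""
    return math.log((math.e - 1) * u + 1)
```

Feeding uniform random numbers into `invert_cdf_newton` then produces samples in the same way as the closed-form inverse.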

two whole texts similarity using levenshtein distance [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I have two text files which I'd like to compare. What I did is:
I've split both of them into sentences.
I've measured the Levenshtein distance between each sentence from the first file and each sentence from the second file.
I'd like to calculate the average similarity between those two text files, but I have trouble producing any meaningful value. Obviously the arithmetic mean (the sum of all the [normalized] distances divided by the number of comparisons) is a bad idea.
How to interpret such results?
edit:
Distance values are normalized.
The Levenshtein distance has a maximum value, namely the length of the longer of the two input strings. It cannot get worse than that. So a normalized similarity index (0=bad, 1=match) for two strings a and b can be calculated as 1 - distance(a,b)/max(a.length, b.length).
Take one sentence from File A. You said you'd compare this to each sentence of File B. I guess you are looking for the sentence in B that has the smallest distance (i.e. the highest similarity index).
Simply calculate the average of all those best-match similarity indexes (one per sentence of A, each being the highest index found in B). This should give you a rough estimate of the similarity of the two texts.
But what makes you think that two texts which are similar might have their sentences shuffled? My personal opinion is that you should also introduce stop word lists, synonyms and all that.
Nevertheless: Please also check trigram matching which might be another good approach to what you are looking for.
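The averaging scheme described in this answer can be sketched as follows. The function names are illustrative, sentences are assumed to be already split into lists of strings, and both lists are assumed to be non-empty.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute),
    computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Normalized similarity index: 1 for identical strings, 0 for the worst case."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest


def text_similarity(sentences_a, sentences_b):
    """Average, over the sentences of A, of the best similarity found in B."""
    best = [max(similarity(s, t) for t in sentences_b) for s in sentences_a]
    return sum(best) / len(best)
```

Note that this measure is not symmetric: `text_similarity(a, b)` and `text_similarity(b, a)` can differ, so averaging the two directions may be worth considering.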

Optimization similar to Knapsack [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I am trying to find a way to solve an Optimization problem as follows:
I have 22 different objects that can each be selected more than once. I have an evaluation function f that takes the multiplicities and calculates the total value.
f is a product over fractions of linear (affine) terms and as such, differentiable and even smooth in the allowed region.
I want to optimize f with respect to the 22 variables, with the additional conditions that certain sums may not exceed certain values (for example, if a,...,v are my variables, a + e + i + m + q + s <= 9). By this, all of the variables are bounded.
If f were strictly monotonic, this could be solved optimally by a (minimally modified) knapsack solution. However, the function isn't convex. That means one cannot even assume that if taking an object A is better than taking B on an empty knapsack, this choice still holds after adding a third object C (as C could change B's benefit to be better than A's). This means that a greedy algorithm cannot be used.
Are there similar algorithms that solve such a problem in an optimal (or at least nearly optimal) way?
EDIT: As requested, an example of what the problem is (I chose 5 variables a,b,c,d,e for simplicity)
for example,
f(a,b,c,d,e) = e*(a*0.45+b*1.2-1)/(c+d)
(Every variable only appears once, if this helps at all)
Also, for example, a+b+c=4, d+e=3
The problem is to optimize that with respect to a,b,c,d,e as integers. There is a bunch of optimization algorithms that hold for convex functions, but very few for non-convex...
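For the small five-variable example, exhaustive enumeration is feasible and gives a reference answer to compare other methods against. The sketch below assumes the variables are non-negative integers, that the goal is to maximize f, and that points with c + d = 0 are excluded because f is undefined there; none of these assumptions is stated explicitly in the question. It is not expected to scale to the full 22-variable problem.

```python
from itertools import product


def f(a, b, c, d, e):
    """Objective from the example above."""
    return e * (a * 0.45 + b * 1.2 - 1) / (c + d)


def brute_force():
    """Enumerate all non-negative integer points with a + b + c = 4 and
    d + e = 3, skipping points where the denominator c + d is zero."""
    best_value, best_point = None, None
    for a, b, d in product(range(5), range(5), range(4)):
        c = 4 - a - b          # fixed by the first constraint
        e = 3 - d              # fixed by the second constraint
        if c < 0 or c + d == 0:
            continue
        value = f(a, b, c, d, e)
        if best_value is None or value > best_value:
            best_value, best_point = value, (a, b, c, d, e)
    return best_point, best_value


point, value = brute_force()
```

Under these assumptions the search visits at most 100 candidate points, so the exact optimum of the toy instance is cheap to obtain.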
