Drawing a polynomial - algorithm

I have a polynomial of the form:
w(x) = a + b(x-x1) + c(x-x1)^2*(x-x2) + d(x-x1)^2*(x-x2)^2 + ...
Does anybody know a fast algorithm for evaluating this polynomial?
I want to draw this polynomial, but first I have to compute its values, and I can't find any fast and interesting method for that.

I don't think there is a faster algorithm than iterating over the whole polynomial. As it is not obvious from your description exactly what rule forms the terms, I cannot offer a complete solution, but it is even better if you come up with it yourself.
From what I see, the x-dependent part of each consecutive term is formed by multiplying what you have so far by another monomial. If that is so, keep the running product between iterations of the loop.
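A minimal Python sketch of that idea; the step_roots encoding (which monomials extend the product before each coefficient) is an assumption of mine, since the exact term rule isn't stated:

    # Evaluate w(x) = a + b(x-x1) + c(x-x1)^2(x-x2) + d(x-x1)^2(x-x2)^2 + ...
    # by reusing the running product of monomials between terms.
    # coeffs = [a, b, c, d, ...]; step_roots[i] lists the roots whose monomials
    # are multiplied in before coefficient i+1, e.g. [(x1,), (x1, x2), (x2,)].
    def eval_poly(x, coeffs, step_roots):
        value = coeffs[0]
        factor = 1.0
        for coef, roots in zip(coeffs[1:], step_roots):
            for r in roots:
                factor *= (x - r)      # extend the product built so far
            value += coef * factor
        return value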

Write a parser to split the expression into individual tokens: identifiers such as a, b, x, x1; numbers such as 2; and the operators and parentheses *, +, ^, (, ).
Then use the Shunting-yard algorithm to transform the expression into Reverse Polish Notation.
Then evaluate it using a stack or a tree.
If you intend to evaluate the same expression many times, you may want to eliminate common subexpressions (e.g. x - x1 repeats multiple times, so you may calculate it just once). There are ways to do that as well. But before you go there, first see whether what you get without such optimizations is really insufficient.
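A compact sketch of that pipeline in Python, assuming a simple grammar of numbers, identifiers, the binary operators + - * / ^, and parentheses (no unary minus):

    import operator
    import re

    # precedence, associativity, implementation for each binary operator
    OPS = {'+': (1, 'L', operator.add), '-': (1, 'L', operator.sub),
           '*': (2, 'L', operator.mul), '/': (2, 'L', operator.truediv),
           '^': (3, 'R', operator.pow)}

    def tokenize(expr):
        return re.findall(r'\d+\.?\d*|[A-Za-z_]\w*|[-+*/^()]', expr)

    def to_rpn(tokens):
        """Shunting-yard: infix token list -> Reverse Polish Notation."""
        out, stack = [], []
        for t in tokens:
            if t in OPS:
                prec, assoc, _ = OPS[t]
                while (stack and stack[-1] in OPS and
                       (OPS[stack[-1]][0] > prec or
                        (OPS[stack[-1]][0] == prec and assoc == 'L'))):
                    out.append(stack.pop())
                stack.append(t)
            elif t == '(':
                stack.append(t)
            elif t == ')':
                while stack[-1] != '(':
                    out.append(stack.pop())
                stack.pop()                      # discard the '('
            else:
                out.append(t)                    # number or identifier
        return out + stack[::-1]

    def eval_rpn(rpn, env):
        """Evaluate RPN with a stack; env maps identifiers to values."""
        stack = []
        for t in rpn:
            if t in OPS:
                b, a = stack.pop(), stack.pop()
                stack.append(OPS[t][2](a, b))
            elif t[0].isalpha() or t[0] == '_':
                stack.append(env[t])
            else:
                stack.append(float(t))
        return stack[0]

    # e.g. eval_rpn(to_rpn(tokenize("a + b*(x - x1)^2")),
    #               {'a': 1, 'b': 2, 'x': 3, 'x1': 1})  ->  9.0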

Related

Ordering equations in the face of sometimes-unordered terms

Superposition calculus is used for reasoning with equations; it reduces the size of the search space by applying an order to equations, based on an ordering of terms.
A suitable ordering of terms, such as Knuth-Bendix, must sometimes answer 'unordered'. For example, consider f(x) vs f(y), where x and y are variables: a suitable order must be stable under substitution of terms for variables, yet no matter which answer you give for f(x) vs f(y) (less, same, greater), some substitution of terms for the two variables would turn out to be inconsistent with that initial answer. In this domain, comparison needs a fourth possible answer: 'unordered'.
Superposition calculus orders equations relative to each other, based on the constituent terms and the polarity. There are ways of constructing this based on the multiset extension of term order, but perhaps the simplest correct algorithm is:
Compare the larger terms; if they are unequal, that's the answer.
Compare the polarities; if they are unequal, negative is greater than positive.
Compare the smaller terms.
It is tempting to implement this by first sorting each equation, larger term first, and then implementing the above algorithm directly. The problem is that an equation may not have a larger term: it is quite possible that the component terms of one or both equations are unordered relative to each other, so a correct algorithm for comparing equations must take that into account.
This could be derived from first principles by going through all the possibilities, but it also looks like there would be many opportunities to make a subtle error that would take a while to track down.
Is there a known/canonical algorithm already worked out, for comparing equations in this context?
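For concreteness, here is a minimal Python sketch of the multiset-extension construction mentioned above, assuming a term order cmp(t1, t2) that returns one of four answers; the Ord names and the helpers are illustrative, not a canonical implementation:

    from collections import Counter
    from enum import Enum

    class Ord(Enum):
        LT, EQ, GT, UN = range(4)      # 'UN' is the fourth answer: unordered

    def multiset_ext(cmp, terms1, terms2):
        """Multiset extension of a (partial) term order cmp."""
        a, b = Counter(terms1), Counter(terms2)
        common = a & b
        a, b = a - common, b - common           # strip shared occurrences
        if not a and not b:
            return Ord.EQ
        def dominates(big, small):
            # every leftover element of `small` lies below some element of `big`
            return all(any(cmp(m, n) == Ord.GT for m in big) for n in small)
        if a and dominates(a, b):
            return Ord.GT
        if b and dominates(b, a):
            return Ord.LT
        return Ord.UN

    def compare_equations(cmp, eq1, eq2):
        """eq = (s, t, positive). A negative equation counts its terms twice,
        which makes negative greater than positive when the terms tie."""
        def to_multiset(eq):
            s, t, positive = eq
            return [s, t] if positive else [s, s, t, t]
        return multiset_ext(cmp, to_multiset(eq1), to_multiset(eq2))

This reproduces the three-step comparison above, including the unordered cases, without ever needing to pick out a "larger term" first.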

How to implement a superoptimizer

[Related to https://codegolf.stackexchange.com/questions/12664/implement-superoptimizer-for-addition from Sep 27, 2013]
I am interested in how to write superoptimizers, in particular to find small logical formulae for sums of bits. This was previously set as a challenge on Code Golf, but it seems a lot harder than one might imagine.
I would like to write code that finds the smallest possible propositional logical formula to check if the sum of y binary 0/1 variables equals some value x. Let us call the variables x1, x2, x3, x4 etc. In the simplest approach the logical formula should be equivalent to the sum. That is, the logical formula should be true if and only if the sum equals x.
Here is a naive way to do that. Say y = 15 and x = 5. Pick all 3003 different ways of choosing 5 variables, and for each make a new clause that is the AND of those variables together with the AND of the negations of the remaining variables. You end up with 3003 clauses, each of length exactly 15, for a total cost of 45045.
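A sketch of this naive construction in Python (formulas as plain strings, variables named x1..xy):

    from itertools import combinations

    def naive_formula(y, x):
        """OR over all ways to pick which x of the y variables are true."""
        names = [f"x{i}" for i in range(1, y + 1)]
        clauses = []
        for true_vars in combinations(names, x):
            chosen = set(true_vars)
            literals = [v if v in chosen else f"not {v}" for v in names]
            clauses.append("(" + " and ".join(literals) + ")")
        return " or ".join(clauses)

    # naive_formula(15, 5) yields 3003 clauses of 15 literals each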
However, if you are allowed to introduce new variables into your solution then you can potentially reduce this a lot by eliminating common subformulae. So in this case your logical formula consists of the y binary variables, x and some new variables. The whole formula would be satisfiable if and only if the sum of the y variables equals x. The only allowed operators are and, or and not.
It turns out there is a clever method for solving this problem when x = 1, at least in theory. However, I am looking for a computationally intensive method to search for small solutions.
How can you make a superoptimizer for this problem?
Examples. Take as an example two variables where we want a logical formula that is True exactly when they sum to 1. One possible answer is:
(((not y0) and (y1)) or ((y0) and (not y1)))
To introduce a new variable such as z0 to represent (y0 and not y1), we can add a new clause ((y0 and not y1) or not z0) and replace (y0 and not y1) by z0 throughout the rest of the formula. Of course this is pointless in this example, as it makes the expression longer.
Write your desired sum in binary. First look at the least significant bit, y0. Clearly,
x1 xor x2 xor ... xor xn = y0 - that's your first formula. The final formula will be a conjunction of formulae, one for each bit of the desired sum.
Now, do you know how an adder is implemented? See http://en.wikipedia.org/wiki/Adder_(electronics). Take inspiration from it: group your input into pairs/triples of bits, calculate the carry bits, and use them to build formulae for y1...yk, as sketched below. If you need further hints, let me know.
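To make the hint concrete, here is a rough Python sketch along these lines, building formulas as plain strings and reducing columns of bits with full and half adders, carry-save style; all function names here are mine, not from the answer:

    def xor(a, b):
        return f"(({a} and not {b}) or (not {a} and {b}))"

    def full_add(a, b, c):
        """Formulas for the sum and carry bits of a + b + c."""
        s = xor(xor(a, b), c)
        carry = f"(({a} and {b}) or ({c} and {xor(a, b)}))"
        return s, carry

    def sum_bit_formulas(bits):
        """One formula per binary digit of sum(bits), least significant first."""
        columns = [list(bits)]                 # columns[i] holds weight-2^i bits
        result = []
        i = 0
        while i < len(columns):
            col = columns[i]
            while len(col) >= 3:               # full adder: 3 bits -> sum + carry
                a, b, c = col.pop(), col.pop(), col.pop()
                s, carry = full_add(a, b, c)
                col.append(s)
                if i + 1 == len(columns):
                    columns.append([])
                columns[i + 1].append(carry)
            if len(col) == 2:                  # half adder for a leftover pair
                a, b = col.pop(), col.pop()
                col.append(xor(a, b))
                if i + 1 == len(columns):
                    columns.append([])
                columns[i + 1].append(f"({a} and {b})")
            result.append(col[0])              # each visited column is nonempty
            i += 1
        return result

To assert that the sum equals x, AND together the returned bit formulas, negating those where the corresponding bit of x is 0. Introducing a fresh variable for each adder output, as described in the question, is what stops the shared subformulae from being duplicated.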
If I understand what you're asking, you'll want to look into the general topics of logic minimization and/or Boolean function simplification. The references are mostly about general methods for eliminating redundancy in Boolean formulas that are disjunctions ("or"s) of terms that are conjunctions ("and"s).
By hand, the standard method is called a Karnaugh map. The equivalent algorithm, expressed in a way that's more amenable to computer implementation, is Quine-McCluskey (also called the method of prime implicants). The minimization problem is NP-hard, and QM solves it exactly.
Therefore I think QM is what you want for the "super-optimizer" you're trying to build.
But the combination of NP-hardness and exact solution means that QM is impractical for large problems, or even just non-trivial ones.
The QM Algorithm lays out the conjunctive terms (called minterms in this context) in a table and conducts searches for 1-bit differences between pairs of terms. These terms can be combined and the factor for the differing bit labeled "don't care" in further combinations. This is repeated with 2-bit, 4-bit, etc. subsets of bits. The exponential behavior results because choices are involved for the combinations of larger bit sets: choosing one rules out another. Therefore it is essentially a search problem.
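A toy sketch of a single such pass in Python, with minterms written as strings over '0', '1', '-' ('-' marks a don't-care position); a full implementation repeats this until nothing merges and then solves the prime-implicant cover problem, which is where the exponential search comes in:

    def merge(a, b):
        """Combine two terms if they differ in exactly one non-'-' position."""
        diffs = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diffs) == 1 and '-' not in (a[diffs[0]], b[diffs[0]]):
            i = diffs[0]
            return a[:i] + '-' + a[i + 1:]
        return None

    def qm_pass(terms):
        """One Quine-McCluskey pass: returns (merged terms, prime implicants)."""
        merged, used = set(), set()
        for a in terms:
            for b in terms:
                m = merge(a, b)
                if m is not None:
                    merged.add(m)
                    used.update((a, b))
        return merged, set(terms) - used   # unmerged terms are prime implicants

    # e.g. qm_pass({'000', '001', '011'}) -> ({'00-', '0-1'}, set())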
There is an enormous literature on heuristics to trim the search space, yet find "good" solutions that aren't necessarily optimal. A famous one is Espresso. However, since algorithm improvements translate directly to dollars in semiconductor manufacture, it's entirely possible that the best are proprietary and closely held.

Levenshtein Transpose Distance

How can I implement the transpose/swap/twiddle/exchange distance alone, using dynamic programming? I must stress that I do not want to check for the other operations (i.e. copy, delete, insert, kill, etc.), just transpose/swap.
I wish to apply Levenshtein's algorithm just for swap distance. What would the code look like?
I'm not sure that Levenshtein's algorithm can be used in this case. Without insert or delete operations, the distance is well defined only between strings of the same length containing the same characters. Here are examples of string pairs that cannot be transformed into one another using only transpositions:
AB, ABC
AAB, ABB
Given that, one algorithm would be to find all possible permutations of the positions of the characters that are not in the same places in both strings, and look for the permutation that can be realized with the minimum number of transpositions or swaps.
An efficient application of dynamic programming usually requires that the task decompose into several instances of the same task for a shorter input. In the case of the Levenshtein distance, this boils down to prefixes of the two strings and the number of edits required to get from one to the other. I don't see how such a decomposition can be achieved in your case. At least I don't see one that would result in a polynomial-time algorithm.
Also, it is not quite clear what operations you are talking about. Depending on the context, a swap or exchange can mean either the same thing as a transposition, or the replacement of a letter with an arbitrary other letter, e.g. test -> text. If by "transpose/swap/twiddle/exchange" you mean just "transpose", then you should have a look at Counting the adjacent swaps required to convert one permutation into another. If not, please clarify the question.
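If adjacent transpositions on anagrams are indeed what you mean, here is a short sketch: map each character of one string to its matching position in the other (matching repeated characters in order avoids needless crossings), then count inversions; each adjacent swap removes exactly one inversion:

    from collections import defaultdict, deque

    def adjacent_swap_distance(s, t):
        """Minimum adjacent transpositions turning s into t (anagrams assumed)."""
        positions = defaultdict(deque)
        for i, ch in enumerate(t):
            positions[ch].append(i)
        perm = [positions[ch].popleft() for ch in s]
        # count inversions; O(n^2) for clarity, merge sort gives O(n log n)
        return sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])

    # adjacent_swap_distance("ab", "ba") -> 1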

String analysis

Given a sequence of operations:
a*b*a*b*a*a*b*a*b
is there a way to get the optimal subdivision that enables reuse of substrings?
Making
a*b*a*b*a*a*b*a*b => c*a*c, where c = a*b*a*b
and then seeing that
a*b*a*b => d*d, where d = a*b
all in all reducing the 8 initial operations to the 4 described here:
(c = (d = a*b)*d)*a*c
The goal, of course, is to minimize the number of operations.
I'm considering a suffix tree of sorts.
I'm especially interested in linear-time heuristics or solutions.
The '*' operations are actually matrix multiplications.
This whole problem is known as "Common Subexpression Elimination", or CSE. It is a slightly smaller version of the problem called "graph reduction" faced by implementers of compilers for functional programming languages. Googling "common subexpression elimination algorithm" gives lots of solutions, though none that I can see tailored specifically to the constraints of matrix multiplication.
The pages linked to give a lot of references.
My old answer is below. However, having researched a bit more, the solution is simply building a suffix tree. This can be done in O(N) time (lots of references on the wikipedia page). Having done this, the sub-expressions (c, d etc. in your question) are just nodes in the suffix tree - just pull them out.
However, I think MarcoS is on to something with the suggestion of Longest repeating Substring, as graph reduction precedence might not allow optimisations that can be allowed here.
sketch of algorithm:
optimise(s):
    sub = longestRepeatingSubstring(s)
    if sub is too short to be worth naming, return s
    optimisedSub = optimise(sub)
    return s with every occurrence of sub replaced by a new symbol bound to optimisedSub
Each run of longest repeating substring takes time N. You can probably re-use the suffix tree you build to solve the whole thing in time N.
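For completeness, a simple O(N^2) Python sketch of longestRepeatingSubstring (a suffix tree achieves the same in O(N), but takes much more code); overlapping occurrences are excluded, since an overlapping repeat cannot be reused as a subproduct:

    def longest_repeating_substring(s):
        """Longest substring occurring at least twice without overlap, O(n^2)."""
        n = len(s)
        # lcp[i][j] = length of the longest common prefix of s[i:] and s[j:]
        lcp = [[0] * (n + 1) for _ in range(n + 1)]
        best = ""
        for i in range(n - 1, -1, -1):
            for j in range(n - 1, i, -1):
                if s[i] == s[j]:
                    lcp[i][j] = lcp[i + 1][j + 1] + 1
                    length = min(lcp[i][j], j - i)   # cap to forbid overlap
                    if length > len(best):
                        best = s[i:i + length]
        return best

    # longest_repeating_substring("abababaabab") -> "abab"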
edit: The orders of growth given in this answer are needed in addition to the accepted answer in order to run CSE or matrix-chain multiplication.
Interestingly, a compression algorithm may be what you want: a compression algorithm seeks to reduce the size of what it's compressing, and if the only way it can do that is substitution, you can trace it and obtain the necessary subcomponents for your algorithm. This may not give nice results though for small inputs.
What subsets of your operations are commutative will be an important consideration in choosing such an algorithm. [edit: OP says no operations are commutative in his/her situation]
We can also define an optimal solution, if we ignore effects such as caching:
input: [some product of matrices to compute]
Given that multiplying two NxN matrices is O(N^2.376), and that we can visualize the product as follows:
[AxB][BxC][CxD][DxE]...
we must, for example, perform O(max(A,B,C)^2.376) or so operations to combine [AxB][BxC] -> [AxC]. The max(...) is an estimate based on how fast it is to multiply two square matrices; a better estimate of cost(A,B,C) for multiplying an AxB matrix by a BxC matrix can be gotten from actually looking at the algorithm, or from running benchmarks if you don't know which algorithm is used.
However, note that multiplying the same matrix by itself, i.e. calculating a power, can be much more efficient. At worst it takes log_2(power) multiplications, each of O(N^2.376), and this could be made more efficient still by diagonalizing the matrix first.
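The log_2(power) bound is ordinary square-and-multiply; a sketch, where mul and identity are placeholders for whatever matrix-multiplication routine and identity matrix are in use:

    def matrix_power(m, power, mul, identity):
        """Compute m^power with about log2(power) squarings (square-and-multiply)."""
        result = identity
        while power:
            if power & 1:
                result = mul(result, m)
            m = mul(m, m)
            power >>= 1
        return result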
There is the question of whether a greedy approach is feasible or not: whether one SHOULD compress repeating substrings at each step. This may not be the case, e.g. for
aaaaabaab
compressing 'aa' into a new symbol c results in ccabcb, and compressing 'aab' is then impossible.
However, I have a hunch that if we try all orders of compressing substrings, we will probably not run into this issue too often.
Thus, having written down what we want (the costs) and considered possible issues, we already have a brute-force algorithm which can do this, and it will run for very small numbers of matrices:
# pseudocode
def compress(problem, substring):
    x = Problem(problem)                       # copy the problem
    symbol = x.fresh_symbol()
    x.string = x.string.replace(substring, symbol)
    x.subcomputations.append(Subcomputation(symbol, substring))
    return x

def best_compression(problem):
    candidates = [compress(problem, sub) for sub in substrings(problem.string)]
    # etc., recursively return the candidate with minimum cost
    # dynamic programming may help make this more efficient, but one must
    # watch out for the note above, how it may be hard to be greedy
Note: according to another answer by Asgeir, this is known as the matrix chain multiplication optimization problem. Nick Fortescue notes it is also known more generally as common subexpression elimination (http://en.wikipedia.org/wiki/Common_subexpression_elimination) -- thus one could take any generic CSE or matrix-chain-multiplication algorithm/library from the literature and plug in the cost orders of magnitude I mentioned earlier (you will need those no matter which solution you use). Note also that the costs of the above calculations (multiplication, exponentiation, etc.) assume they are being done efficiently with state-of-the-art algorithms; if this is not the case, replace the exponents with values appropriate to the way the operations will actually be carried out.
If you want to use the fewest arithmetic operations then you should have a look at matrix chain multiplication, which can be solved in O(n log n) time.
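The O(n log n) algorithm (due to Hu and Shing) is intricate; the textbook O(n^3) dynamic program is what usually gets implemented. A sketch, where dims has length n+1 and matrix i has dimensions dims[i] x dims[i+1]:

    def matrix_chain_order(dims):
        """Classic O(n^3) DP: minimal scalar multiplications and split points."""
        n = len(dims) - 1                      # number of matrices in the chain
        cost = [[0] * n for _ in range(n)]
        split = [[0] * n for _ in range(n)]
        for length in range(2, n + 1):         # subchain length
            for i in range(n - length + 1):
                j = i + length - 1
                cost[i][j] = float('inf')
                for k in range(i, j):          # try every split point
                    c = (cost[i][k] + cost[k + 1][j]
                         + dims[i] * dims[k + 1] * dims[j + 1])
                    if c < cost[i][j]:
                        cost[i][j], split[i][j] = c, k
        return cost[0][n - 1], split

    # e.g. matrix_chain_order([10, 30, 5, 60]) -> (4500, ...)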
Off the top of my head, the problem seems to be in NP. Depending on the substitutions you make, other substitutions become possible or impossible; for example, for the string
d*e*a*b*c*d*e*a*b*c*d*e*a there are several possibilities.
If you take the longest common substring, it will be:
f = d*e*a*b*c, and you could substitute to get f*f*d*e*a, leaving you with four multiplications at the end plus four intermediate ones (a total of eight).
If you instead substitute as follows:
f = d*e*a, you get f*b*c*f*b*c*f, which you can substitute further using g = f*b*c into
g*g*f, for a total of six multiplications.
There are other possible substitutions in this problem, but I do not have the time to count them all right now.
I am guessing that for a completely minimal substitution it is not only necessary to find the longest common substring but also the number of times each substring repeats, which probably means you have to track all substitutions so far and backtrack. Still, it might be faster than the actual multiplications.
Isn't this the Longest repeated substring problem?

Given an integer number, find the smallest function that generates it

I have a very large positive integer (a million digits). I need to represent it with the smallest possible function; the number varies, which means I need an algorithm that generates the smallest possible function for any given number.
Example: for the number 29512665430652752148753480226197736314359272517043832886063884637676943433478020332709411004889, the algorithm must return "9^99". It must be able to analyze numbers and always return a math function that represents the number. For example, the number 21847450052839212624230656502990235142567050104912751880812823948662932355202 must return "9^5^16+1".
Heard of Kolmogorov complexity?
To answer your question: unless you restrict yourself to some specific set of functions, it's impossible.
EDIT: Even in your example, how do you know that the shortest representation of 21847450052839212624230656502990235142567050104912751880812823948662932355202 is actually 9^5^16+1? Isn't it quite hard to prove, even in this specific case?
If you restrict yourself to some set of functions, then you can use the following algorithm:
for i = 1 to n
    enumerate all strings s of length i
    if s represents a valid expression according to rules chosen a priori,
       and evaluates to the number in the input,
        return s
It is guaranteed to halt, because on the last iteration of the outer loop (i = n) you will eventually reach the string that contains the input verbatim.
Of course, this is not very efficient: specifically, it is O(b^n), where n is the length of the input and b is the size of the alphabet.
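A direct, deliberately impractical Python sketch of this enumeration; using Python's eval as the "rules chosen a priori", and this particular alphabet, are assumptions for brevity (in practice you would also bound intermediate values):

    from itertools import product

    def shortest_expression(target, alphabet="0123456789+*^()"):
        """Smallest expression string over `alphabet` evaluating to target."""
        # halts: the decimal digits of target are themselves a valid expression
        for length in range(1, len(str(target)) + 1):
            for chars in product(alphabet, repeat=length):
                s = "".join(chars)
                try:
                    value = eval(s.replace("^", "**"))   # '^' means power here
                except Exception:
                    continue                             # not a valid expression
                if value == target:
                    return s

    # shortest_expression(512) -> "2^9"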
Expanding on @ybungalobill's terse answer, your function is equivalent to a function that computes the Kolmogorov complexity of an arbitrary string. (The equivalence is obvious if you treat each digit of your very large numbers as a character, and the numbers as sequences of characters.)
According to the Wikipedia page on Kolmogorov complexity, the K(s) function that gives the complexity of a string s is not a computable function. (The page includes a proof.)
In other words, the algorithm you want simply does not exist.
@BlueRaja - Danny Pflughoeft: yes, it is. I'm trying to create some compression that uses this algorithm, but as it turns out, this is impossible.
That's because it's technically impossible to compress arbitrary data, for the same reason, but that doesn't stop us from doing it :)
There are much better ways of compressing data, however. Take a look at, for instance, LZ. It is so ubiquitous that you can almost certainly find a library to do the compression for you, regardless of what language you're writing in. DEFLATE is another popular one.
Hope that helps!
If you're not looking for optimality, just a reasonably good job, then there are a bunch of heuristics you can use. For example, try to decompose n using all of the following
n = a^k + b
for k = 2, 3, ..., log n, and pick the one with the smallest a + b, say. You can compute a and b using a = floor(n^(1/k)) and b = n-a^k. Then recurse on a and b.
Of course, this uses only exponentiation and addition to find a good compression. If you allow subtraction as well, use a=round(n^(1/k)) instead and let b be negative.
Allowing multiplication as well makes it quite a bit harder because you would probably need to factor n.
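A sketch of this heuristic in Python with exact integer arithmetic, so it works on very large numbers; the shortest-string scoring and the small-number cutoff are choices of mine, not part of the answer above (and as written it is far too slow for a million digits, shown only for the structure of the recursion):

    def integer_kth_root(n, k):
        """floor(n ** (1/k)) by binary search, exact for big integers."""
        lo, hi = 0, 1
        while hi ** k <= n:
            hi *= 2
        while lo + 1 < hi:
            mid = (lo + hi) // 2
            if mid ** k <= n:
                lo = mid
            else:
                hi = mid
        return lo

    def express(n):
        """Greedy decomposition n = a^k + b, recursing on a and b."""
        best = str(n)
        if n < 16:                           # arbitrary cutoff: emit verbatim
            return best
        k = 2
        while (1 << k) <= n:                 # k ranges up to log2(n)
            a = integer_kth_root(n, k)
            if a < 2:
                break
            b = n - a ** k
            candidate = f"{express(a)}^{k}" + (f"+{express(b)}" if b else "")
            if len(candidate) < len(best):
                best = candidate
            k += 1
        return best

    # express(9**99) -> "9^99"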
