Two strange efficiency problems in Mathematica - performance

FIRST PROBLEM
I have timed how long it takes to compute the following statements (where V[x] is a time-intensive function call):
Alice = Table[V[i],{i,1,300},{1000}];
Bob = Table[Table[V[i],{i,1,300}],{1000}]^tr;
Chris_pre = Table[V[i],{i,1,300}];
Chris = Table[Chris_pre,{1000}]^tr;
Alice, Bob, and Chris are identical matricies computed 3 slightly different ways. I find that Chris is computed 1000 times faster than Alice and Bob.
It is not surprising that Alice is computed 1000 times slower because, naively, the function V must be called 1000 more times than when Chris is computed. But it is very surprising that Bob is so slow, since he is computed identically to Chris except that Chris stores the intermediate step Chris_pre.
Why does Bob evaluate so slowly?
SECOND PROBLEM
Suppose I want to compile a function in Mathematica of the form
f(x)=x+y
where "y" is a constant fixed at compile time (but which I prefer not to directly replace in the code with its numerical because I want to be able to easily change it). If y's actual value is y=7.3, and I define
f1=Compile[{x},x+y]
f2=Compile[{x},x+7.3]
then f1 runs 50% slower than f2. How do I make Mathematica replace "y" with "7.3" when f1 is compiled, so that f1 runs as fast as f2?
EDIT:
I found an ugly workaround for the second problem:
f1=ReleaseHold[Hold[Compile[{x},x+z]]/.{z->y}]
There must be a better way...

You probably should've posted these as separate questions, but no worries!
Problem one
The problem with Alice is of course what you expect. The problem with Bob is that the inner Table is evaluated once per iteration of the outer Table. This is clearly visible with Trace:
Trace[Table[Table[i, {i, 1, 3}], {3}]]
{
Table[Table[i,{i,1,2}],{2}],
{Table[i,{i,1,2}],{i,1},{i,2},{1,2}},{Table[i,{i,1,2}],{i,1},{i,2},{1,2}},
{{1,2},{1,2}}
}
Line breaks added for emphasis, and yeah, the output of Trace on Table is a little weird, but you can see it. Clearly Mathematica could optimize this better, knowing that the outside table has no iterator, but for whatever reason, it doesn't take that into account. Only Chris does what you want, though you could modify Bob:
Transpose[Table[Evaluate[Table[V[i],{i,1,300}]],{1000}]]
This looks like it actually outperforms Chris by a factor of two or so, because it doesn't have to store the intermediate result.
Problem two
There's a simpler solution with Evaluate, though I expect it won't work with all possible functions to be compiled (i.e. ones that really should be Held):
f1 = Compile[{x}, Evaluate[x + y]];
You could also use a With:
With[{y=7.3},
f1 = Compile[{x}, x + y];
]
Or if y is defined elsewhere, use a temporary:
y = 7.3;
With[{z = y},
f1 = Compile[{x}, x + z];
]
I'm not an expert on Mathematica's scoping and evaluation mechanisms, so there could easily be a much better way, but hopefully one of those does it for you!

Your first problem has already been explained, but I want to point out that ConstantArray was introduced in Mathematica 6 to address this issue. Prior to that time Table[expr, {50}] was used for both fixed and changing expressions.
Since the introduction of ConstantArray there is clear separation between iteration with reevaluation, and simple duplication of an expression. You can see the behavior using this:
ConstantArray[Table[Pause[1]; i, {i, 5}], {50}] ~Monitor~ i
It takes five seconds to loop through Table because of Pause[1], but after that loop is complete it is not reevaluated and the 50 copies are immediately printed.

First problem
Have you checked the output of the Chris_pre computation? You will find that it is not a large matrix at all, since you're trying to store an intermediate result in a pattern, rather than a Variable. Try ChrisPre, instead. Then all the timings are comparable.
Second problem
Compile has a number of tricky restrictions on it's use. One issue is that you cannot refer to global variables. The With construct that was already suggested is the suggested way around this. If you want to learn more about Compile, check out Ted Ersek's tricks:
http://www.verbeia.com/mathematica/tips/Tricks.html

Related

Sparse matrix to speed up octave

I have a loop where "i" depends on "i-1" value, so I cannot vectorize it.
I've read that I can use a sparse matrix in order to vectorize it and so to speed up my code, but I don't understand how this work.
Any help?
Thanks
You are referring to this technique, as referenced from this (rather old) how to speed up octave article.
I'll rephrase the gist here in case the link dies in the future.
Suppose you have the following loop:
p1(1) = 0;
for i = 2 : N
t = t + dt;
p1(i) = p1(i - 1) + dt * 2 * t;
endfor
You note here that, purely from a mathematical point of view, the last step in the loop could be rephrased as:
-1 * p1(i - 1) + 1 * p1(i) = dt * 2 * t
This makes it possible to recast the problem as a sparse matrix solve, by thinking of p1 as the vector of unknowns, and each iteration of the loop as a row in a (sparse) system of equations. E.g.:
Given that t is a known vector, this makes the above a straightforward problem that can be solved via a simple matrix division operation, which is guaranteed to be fast.
Having said that, presumably this 'trick' is only useful if you are able to recast the problem in this manner in the first place. Presumably this will only be the case for linear problems of your unknown. I don't think this can necessarily be used for more complicated loops.
Also, as Cris has mentioned in the comments, if this method does not work for you, there's a chance you can optimize your loop in other ways (or even that the loop solution may not necessarily be slow in the first place).
By the way, in theory, Octave provides jit-speedup like matlab does, though unlike matlab you need to enable it explicitly (in the sense that you need to compile your octave with jit options, which tends not to be the default), and my personal experience is that this is mostly experimental and may not do much except in the simplest of loops (see this post).

How I predict how some formula will behave with integers?

I am making some software that need to work with integers.
Also I need to apply some formula to those integers, repeatedly over time (example, do x/=z several times in a row for a indefinite amount).
All tools, algorithms and formulas I could think or find, or don't work with integers at all, or work as approximations at best.
For example the x/=z several times in a row for example, you can theoretically calculate what x will be in the 10th time by doing x = x/(z^10), but that will be wrong if the result is fractional, you can use floor(x/(z^10)), but the result will STILL be wrong.
Plotting software that I found also don't have integers at all, or has "floor()/ceil()" functions support, at best, and still the result would fall in the problem of the previous paragraph.
So how I do it?
Here's something to get you going for the iteration of x/=z:
(that should have ended in "all three terms are 0 with regard to integer division")
Now if x or z are negative, you can try and see whether this still holds; I did not invest the time to make the necessary case distinctions, but they should be fairly analogous.
As Karoly Horvath mentions in a comment, without a clear specification of the kinds of functions for which you would like to find a shortcut to replace iterative evaluation, helping you out won't be possible since there are uncountably many functions over the integers, and the same approach won't work for all of them.

Are Haskell List Comprehensions Inefficient?

I started doing Project Euler and got to problem number 9. Since I was using Project Euler to learn Haskell, I decided to use list comprehensions (as shown in Learn You A Haskell). I do that and GHCI takes awhile to figure out the triplet, which I figured is normal because of the calculations involved. Now, at work yesterday (I don't work as a programmer professionally, yet) I was talking to a friend who knows VBA and he wanted to try to find the answers in VBA. I thought it would be a fun challenge as well, and I churn out some basic for loops and if statements, but what got me was that it was much faster than Haskell was.
My question is: are Haskell's list comprehension incredibly inefficient? At first I thought it was just because I was in GHC's interactive mode, but then I realized VBA is interpreted too.
Please note, I didn't post my code because of it being an answer to project euler. If it will answer my question (as in I'm doing something wrong) then I will gladly post the code.
[edit]
Here is my Haskell list comprehension:
[(a,b,c) | c <- [1..1000], b <- [1..c], a <- [1..b], a+b+c=1000, a^2+b^2=c^2]
I guess I could've lowered the range on c but is that what is really slowing it down?
There are two things you could be doing with this problem that could make your code slow. One is how you are trying values for a, b and c. If you loop through all possible values for a, b, c from 1 to 1000, you'll be spending a long time. To give a hint, you can make use of a+b+c=1000 if you rearrange it for c. The other is that if you only use a list comprehension, it will process every possible value for a, b and c. The problem tells you that there is only one unique set of numbers that satisfies the problem, so if you change your answer from this:
[ a * b * c | .... ]
to:
head [ a * b * c | ... ]
then Haskell's lazy evaluation means that it will stop after finding the first answer. This is the Haskell equivalent of breaking out of your VBA loop when you find the first answer. When I used both these tips, I had an answer that completed very quickly (under a second) in ghci.
Addendum: I missed at first the condition a < b < c. You can also make use of this in your list comprehensions; it is valid to say things along the lines of:
[(a, b) | b <- [1..100], a <- [1..b-1]]
Consider this simplified version of your list comprehension:
[(a,b,c) | a <- [1..1000], b <- [1..1000], c <- [1..1000]]
This will give all possible combinations of a, b, and c. It's kind of like saying, "how many ways can three one-thousand-sided dice land?" The answer is 1000*1000*1000 = 1,000,000,000 different combinations. If it took 0.001 seconds to generate each combination, it would therefore take 1,000,000 seconds (~11.5 days) to finish all combinations. (OK, 0.001 seconds is actually pretty slow for a computer, but you get the idea)
When you add predicates to your list comprehension, it still takes the same amount of time to compute; in fact, it takes longer since it needs to check the predicate for each of the 1 billion combinations it computes.
Now consider your comprehension. It looks like it should be much faster, right?
[(a,b,c) | c <- [1..1000], b <- [1..c], a <- [1..b], a+b+c=1000, a^2+b^2=c^2]
There are 1000 choices for c. How many are there for b and a? Well, the average choice for c is 500. For all choices of c, then, there are an average of 500 choices for b (since b can range from 1 to c). Likewise, for all choices of c and b, there are an average of 250 choices for a. That's very hand-wavy, but I'm fairly sure it's accurate. So 1000 choices for c * 1000/2 choices for b * 1000/4 choices for a = 1 billion / 8 ~= 100 million. It's 8x faster, but if you paid attention, you'll notice it's actually the same big-Oh complexity as the simplified version above. If we compared "simplified" vs "improved" versions of the same problem, but from [1..100000] instead of [1..1000], the "improved" would still only be 8x faster than the "simplified".
Don't get me wrong, 8x is a wonderful constant-factor speedup. But unless you want to wait a couple hours to get the solution, you'll need to get a better big-Oh.
As Neil noted, the way to reduce the complexity of this problem is, for a given b and c, choose the a that satisfies a+b+c=1000. That way, you're not trying a bunch of as that will fail. This will drop the big-Oh complexity; you'll only be considering approximately 1000 * 500 * 1 = 500,000 combinations, instead of ~100,000,000.
Once you get the solution to the problem you can check out other peoples versions of Haskell solutions on the Project Euler site to get an idea of how other people have solved the problem. Incidentally, here is a link to the referenced problem: http://projecteuler.net/index.php?section=problems&id=9
In addition to what everyone else has said about generating fewer elements in the generators, you can also switch to using Int instead of Integer as the type of the numbers. The default is Integer, but your numbers are small enough to fit in an Int.
(Also, to nitpick, Haskell list comprehensions have no speed. Haskell is a language definition with very little operational semantics. A particular Haskell implementation might have slow list comprehensions, though.)

Iterative solving for unknowns in a fluids problem

I am a Mechanical engineer with a computer scientist question. This is an example of what the equations I'm working with are like:
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
The situation is this:
I need r to find x, but I need x to find z. I also need x to find f which is a part of finding z. So I guess a value for x, and then I use that value to find r and f. Then I go back and use the value I found for r and f to find x. I keep doing this until the guess and the calculated are the same.
My question is:
How do I get the computer to do this? I've been using mathcad, but an example in another language like C++ is fine.
The very first thing you should do faced with iterative algorithms is write down on paper the sequence that will result from your idea:
Eg.:
x_0 = ..., f_0 = ..., r_0 = ...
x_1 = ..., f_1 = ..., r_1 = ...
...
x_n = ..., f_n = ..., r_n = ...
Now, you have an idea of what you should implement (even if you don't know how). If you don't manage to find a closed form expression for one of the x_i, r_i or whatever_i, you will need to solve one dimensional equations numerically. This will imply more work.
Now, for the implementation part, if you never wrote a program, you should seriously ask someone live who can help you (or hire an intern and have him write the code). We cannot help you beginning from scratch with, eg. C programming, but we are willing to help you with specific problems which should arise when you write the program.
Please note that your algorithm is not guaranteed to converge, even if you strongly think there is a unique solution. Solving non linear equations is a difficult subject.
It appears that mathcad has many abstractions for iterative algorithms without the need to actually implement them directly using a "lower level" language. Perhaps this question is better suited for the mathcad forums at:
http://communities.ptc.com/index.jspa
If you are using Mathcad, it has the functionality built in. It is called solve block.
Start with the keyword "given"
Given
define the guess values for all unknowns
x:=2
f:=3
r:=2
...
define your constraints
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
calculate the solution
find(x, y, z, r, ...)=
Check Mathcad help or Quicksheets for examples of the exact syntax.
The simple answer to your question is this pseudo-code:
X = startingX;
lastF = Infinity;
F = 0;
tolerance = 1e-10;
while ((lastF - F)^2 > tolerance)
{
lastF = F;
X = ?;
R = ?;
F = FunctionOf(X,R);
}
This may not do what you expect at all. It may give a valid but nonsense answer or it may loop endlessly between alternate wrong answers.
This is standard substitution to convergence. There are more advanced techniques like DIIS but I'm not sure you want to go there. I found this article while figuring out if I want to go there.
In general, it really pays to think about how you can transform your problem into an easier problem.
In my experience it is better to pose your problem as a univariate bounded root-finding problem and use Brent's Method if you can
Next worst option is multivariate minimization with something like BFGS.
Iterative solutions are horrible, but are more easily solved once you think of them as X2 = f(X1) where X is the input vector and you're trying to reduce the difference between X1 and X2.
As the commenters have noted, the mathematical aspects of your question are beyond the scope of the help you can expect here, and are even beyond the help you could be offered based on the detail you posted.
However, I think that even if you understood the mathematics thoroughly there are computer science aspects to your question that should be addressed.
When you write your code, try to make organize it into functions that depend only upon the parameters you are passing in to a subroutine. So write a subroutine that takes in values for y, z, and r and returns you x. Make another that takes in f,L,D,G and returns z. Now you have testable routines that you can check to make sure they are computing correctly. Check the input values to your routines in the routines - for instance in computing x you will get a divide by 0 error if you pass in a 0 for r. Think about how you want to handle this.
If you are going to solve this problem interatively you will need a method that will decide, based on the results of one iteration, what the values for the next iteration will be. This also should be encapsulated within a subroutine. Now if you are using a language that allows only one value to be returned from a subroutine (which is most common computation languages C, C++, Java, C#) you need to package up all your variables into some kind of data structure to return them. You could use an array of reals or doubles, but it would be nicer to choose to make an object and then you can reference the variables by their name and not their position (less chance of error).
Another aspect of iteration is knowing when to stop. Certainly you'll do so when you get a solution that converges. Make this decision into another subroutine. Now when you need to change the convergence criteria there is only one place in the code to go to. But you need to consider other reasons for stopping - what do you do if your solution starts diverging instead of converging? How many iterations will you allow the run to go before giving up?
Another aspect of iteration of a computer is round-off error. Mathematically 10^40/10^38 is 100. Mathematically 10^20 + 1 > 10^20. These statements are not true in most computations. Your calculations may need to take this into account or you will end up with numbers that are garbage. This is an example of a cross-cutting concern that does not lend itself to encapsulation in a subroutine.
I would suggest that you go look at the Python language, and the pythonxy.com extensions. There are people in the associated forums that would be a good resource for helping you learn how to do iterative solving of a system of equations.

Optimizing the computation of a recursive sequence

What is the fastest way in R to compute a recursive sequence defined as
x[1] <- x1
x[n] <- f(x[n-1])
I am assuming that the vector x of proper length is preallocated. Is there a smarter way than just looping?
Variant: extend this to vectors:
x[,1] <- x1
x[,n] <- f(x[,n-1])
Solve the recurrence relationship ;)
In terms of the question of whether this can be fully "vectorized" in any way, I think the answer is probably "no". The fundamental idea behind array programming is that operations apply to an entire set of values at the same time. Similarly for questions of "embarassingly parallel" computation. In this case, since your recursive algorithm depends on each prior state, there would be no way to gain speed from parallel processing: it must be run serially.
That being said, the usual advice for speeding up your program applies. For instance, do as much of the calculation outside of your recursive function as possible. Sort everything. Predefine your array lengths so they don't have to grow during the looping. Etc. See this question for a similar discussion. There is also a pseudocode example in Tim Hesterberg's article on efficient S-Plus Programming.
You could consider writing it in C / C++ / Fortran and use the handy inline package to deal with the compiling, linking and loading for you.
Of course, your function f() may be a real constraint if that one needs to remain an R function. There is a callback-from-C++-to-R example in Rcpp but this requires a bit more work than just using inline.
Well if you need the entire sequence how fast it can be? assuming that the function is O(1), you cannot do better than O(n), and looping through will give you just that.
In general, the syntax x$y <- f(z) will have to reallocate x every time, which would be very slow if x is a large object. But, it turns out that R has some tricks so that the list replacement function [[<- doesn't reallocate the whole list every time. So I think you can reasonably efficiently do:
x[[1]] <- x1
for (m in seq(2, n))
x[[m]] <- f(x[[m-1]])
The only wasteful aspect here is that you have to generate an array of length n-1 for the for loop, which isn't ideal, but it's probably not a giant issue. You could replace it by a while loop if you preferred. The usual vectorization tricks (lapply, etc.) won't work here...
(The double brackets give you a list element, which is what you probably want, rather than a singleton list.)
For more details, see Chambers (2008). Software for Data Analysis. p. 473-474.

Resources