Do all programming languages store the output of function calls? - performance

This is a general question about whether programming languages remember/store the output of function calls.
Suppose I need to calculate a quantity X which depends on some number of simpler calculations. Let's say
X=sin(t)+cos(t)+(cos(t)-sin(t))^2.
Naively, I could compute X as above, calling sin(t) twice, and cos(t) twice.
Or I could call sin(t) and cos(t) once:
a=sin(t)
b=cos(t)
and do
X=a+b+(b-a)^2
Intuitively, the second method should be twice as fast, right? Is this the case in all programming languages?
I ask because, doing such a calculation in Julia, I noticed that computing the simpler quantities once versus calling them at each point they appear in the expression for X does not change the runtime.

It depends on how clever your compiler is, and on properties of the function.
First, your compiler would need to figure out that you are calling sin(t) twice. That's not too difficult (this is common subexpression elimination).
Second, it needs to convince itself that t has the same value for each call. For example, if t were a static variable and you were calling not sin(t) but some other function, that function call could modify t, so the second call would receive a different argument and the function would have to be called twice.
Third, it needs to convince itself that it doesn't matter whether sin(t) is called once or twice, i.e. that the function is pure (free of side effects). For example, if you called a function that writes a message to a logfile, the compiler would have to call it twice, or only one message would be written to the logfile instead of two.
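To make the third point concrete, here is a small illustration (written in Python purely for the sake of example; the names logged_sin and calls are made up for this sketch) of why duplicate calls can only be collapsed when the function has no observable side effects:

import math

calls = []

def logged_sin(t):
    # side effect: every call is recorded in a log
    calls.append(t)
    return math.sin(t)

t = 0.5

# math.sin is pure, so it is safe to evaluate it once and reuse the result:
a = math.sin(t)
x_pure = a + a                       # identical to math.sin(t) + math.sin(t)

# logged_sin is not pure: the number of calls is observable,
# so a compiler could not silently merge the two calls below into one.
x_logged = logged_sin(t) + logged_sin(t)
print(len(calls))                    # 2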

Related

In Julia set random seed and not generating same values if calling a function from different parts of program

I have a Julia program which runs some Monte Carlo simulations. To do this, I set a seed before calling the function that generates my random values. This function can be called from different parts of my program, and the problem is that I am not generating the same random values if I call the function from different parts of my program.
My question is: in order to obtain the same reproducible results, do I have to call this function always from the same part of the program? Maybe this question is more about computer science than about the programming language itself.
I thought of seeding inside the function with an increasing index, but I still do not get the same results.
It is easier to control randomness according to your needs if you pass an RNG (random number generator) object to your function as a parameter.
For example:
using Random
function foo(a=3; rng=MersenneTwister(123))
    return rand(rng, a)
end
Now, you can call the function getting the same results at each function call:
foo() # the same result on any call
However, this is rarely what you want. More often you want to guarantee the replicability of your overall experiment.
In this case, set an RNG at the beginning of your script and call the function with it. This will cause the output of the function to be different at each call, but the same between one run of the whole script and another run.
myGlobalProgramRNG = MersenneTwister() # at the beginning of your script...
foo(rng=myGlobalProgramRNG)
Two further notes. First, be careful with multithreading: if you want to guarantee replicability in the multithreaded case, and in particular independently of whether your script runs with 1, 2, 4, ... threads, look for example at how I dealt with it in the aML library using a generateParallelRngs() function.
Second, even if you create your own RNG at the beginning of your script, or seed the default one, different Julia versions may (and indeed do) change the RNG stream, so replicability is guaranteed only within the same Julia version.
To overcome this problem you can use a package like StableRNG.jl, which guarantees the stability of the RNG stream (at the cost of losing a bit of performance)...

Is having only one argument functions efficient? Haskell

I have started learning Haskell, and I have read that every function in Haskell takes only one argument. I can't understand what magic happens under the hood of Haskell that makes this possible, and I am wondering whether it is efficient.
Example
>:t (+)
(+) :: Num a => a -> a -> a
The signature above means that the (+) function takes one Num, then returns another function which takes one Num and returns a Num.
Example 1 is relatively easy but I have started wondering what happens when functions are a little more complex.
My Questions
For the sake of the example I have written a zipWith function and executed it in two ways: once passing one argument at a time, and once passing all arguments at once.
zipwithCustom f (x:xs) (y:ys) = f x y : zipwithCustom f xs ys
zipwithCustom _ _ _ = []
zipWithAdd = zipwithCustom (+)
zipWithAddTo123 = zipWithAdd [1,2,3]
test1 = zipWithAddTo123 [1,1,1]
test2 = zipwithCustom (+) [1,2,3] [1,1,1]
>test1
[2,3,4]
>test2
[2,3,4]
1. Is passing one argument at a time (scenario_1) as efficient as passing all arguments at once (scenario_2)?
2. Are those scenarios any different in terms of what Haskell is actually doing to compute test1 and test2 (except that scenario_1 probably takes more memory, as it needs to save zipWithAdd and zipWithAddTo123)?
3. Is this correct, and why? In scenario_1 I iterate over [1,2,3] and then over [1,1,1].
4. Is this correct, and why? In scenario_1 and scenario_2 I iterate over both lists at the same time.
I realise that I have asked a lot of questions in one post but I believe those are connected and will help me (and other people who are new to Haskell) to better understand what actually is happening in Haskell that makes both scenarios possible.
You ask about "Haskell", but Haskell the language specification doesn't care about these details. It is up to implementations to choose how evaluation happens -- the only thing the spec says is what the result of the evaluation should be, and carefully avoids giving an algorithm that must be used for computing that result. So in this answer I will talk about GHC, which, practically speaking, is the only extant implementation.
For (3) and (4) the answer is simple: the iteration pattern is exactly the same whether you apply zipWithCustom to arguments one at a time or all at once. (And that iteration pattern is to iterate over both lists at once.)
Unfortunately, the answer for (1) and (2) is complicated.
The starting point is the following simple algorithm:
When you apply a function to an argument, a closure is created (allocated and initialized). A closure is a data structure in memory, containing a pointer to the function and a pointer to the argument. When the function body is executed, any time its argument is mentioned, the value of that argument is looked up in the closure.
That's it.
However, this algorithm kind of sucks. It means that if you have a 7-argument function, you allocate 7 data structures, and when you use an argument, you may have to follow a 7-long chain of pointers to find it. Gross. So GHC does something slightly smarter. It uses the syntax of your program in a special way: if you apply a function to multiple arguments, it generates just one closure for that application, with as many fields as there are arguments.
(Well... that might not be quite true. Actually, it tracks the arity of every function, defined again in a syntactic way as the number of arguments used to the left of the = sign when that function was defined. If you apply a function to more arguments than its arity, you might get multiple closures or something, I'm not sure.)
So that's pretty nice, and from that you might think that your test1 would then allocate one extra closure compared to test2. And you'd be right... when the optimizer isn't on.
But GHC also does lots of optimization stuff, and one of those is to notice "small" definitions and inline them. Almost certainly with optimizations turned on, your zipWithAdd and zipWithAddTo123 would both be inlined anywhere they were used, and we'd be back to the situation where just one closure gets allocated.
Hopefully this explanation gets you to where you can answer questions (1) and (2) yourself, but just in case it doesn't, here's explicit answers to those:
Is passing one argument at the time as efficient as passing all arguments at once?
Maybe. It's possible that passing arguments one at a time will be converted via inlining to passing all arguments at once, and then of course they will be identical. In the absence of this optimization, passing one argument at a time has a (very slight) performance penalty compared to passing all arguments at once.
Are those scenarios any different in terms of what Haskell is actually doing to compute test1 and test2?
test1 and test2 will almost certainly be compiled to the same code -- possibly even to the point that only one of them is compiled and the other is an alias for it.
If you want to read more about the ideas in the implementation, the Spineless Tagless G-machine paper is much more approachable than its title suggests, and only a little bit out of date.

how to call iter_cb every iteration rather than every function evaluation?

The minimize function in lmfit allows the specification of a function that is called every iteration with the keyword iter_cb. This function is called every function evaluation (so not every iteration in the least squares process). What I want to do is to call a function every iteration of the least squares process, after the parameters have been updated. For example, with 3 parameters, the residuals are evaluated 4 times to get 3 derivatives (4 function evaluations). After the parameters are updated I need to call a function before the next iteration (again 4 function evaluations). Is that possible?
Well, the answer is sort of "no", in that the underlying compiled code (scipy.optimize.leastsq uses MINPACK lmdif) makes no distinction between objective-function calls used to estimate the finite-difference Jacobian and those used to advance the solution. In fact, it doesn't really count "iterations" of the estimated solution, but it does count the number of function evaluations.
Also, it's not true that it will always do Nvars+1 function evaluations between updates of the values. That is, there will often be Nvars+1 evaluations, but sometimes there are Nvars+2 (or rather, one extra evaluation with rather unusual values).
You could check "iter" (the count of the current function evaluation) that is passed into the iteration callback, and return quickly if the value isn't what you think it should be. Or you could check whether the parameter values have changed significantly since the last time the iteration callback actually ran completely.
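As a sketch of that second suggestion (assuming lmfit's documented callback signature iter_cb(params, iter, resid, *args, **kws); the class name and tolerance value are made up for this example), a callback that only does its per-iteration work when the parameter values have moved by more than the finite-difference step might look like this:

import numpy as np

class OnParamChange:
    # Run action(params, iteration) only when the parameter values have
    # changed by more than tol since the last time the action ran.
    def __init__(self, action, tol=1e-6):
        self.action = action
        self.tol = tol      # choose this larger than the finite-difference step
        self.last = None

    def __call__(self, params, iter, resid, *args, **kws):
        values = np.array([p.value for p in params.values()])
        if self.last is None or np.any(np.abs(values - self.last) > self.tol):
            self.last = values
            self.action(params, iter)
        # returning True from an iter_cb aborts the fit, so return nothing

# hypothetical usage:
# result = lmfit.minimize(residual, params,
#                         iter_cb=OnParamChange(lambda p, i: print(i, p.valuesdict())))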

Is this a correct way to think about recursivity in programming? (example)

I've been trying to learn what recursion in programming is, and I need someone to confirm whether I have truly understood what it is.
The way I'm trying to think about it is through collision-detection between objects.
Let's say we have a function. The function is called when it's certain that a collision has occurred, and it's called with a list of objects to determine which object collided and which object it collided with. It does this by first checking whether the first object in the list collided with any of the other objects. If true, the function returns the objects in the list that collided. If false, the function calls itself with a shortened list that excludes the first object, and then repeats the process to determine whether it was the next object in the list that collided.
This is a finite recursive function because, if the desired conditions aren't met, it calls itself with a shorter and shorter list until it deductively meets the desired conditions. This is in contrast to a potentially infinite recursive function, where, for example, the list it calls itself with is not shortened, but the order of the list is randomized.
So... is this correct? Or is this just another example of iteration?
Thanks!
Edit: I was fortunate enough to get three VERY good answers by #rici, #Evan and #Jack. They all gave me valuable insight on this, in both technical and practical terms from different perspectives. Thank you!
Any iteration can be expressed recursively. (And, with auxiliary data structures, vice versa, but not so easily.)
I would say that you are thinking iteratively. That's not a bad thing; I don't say it to criticise. Simply, your explanation is of the form "Do this and then do that and continue until you reach the end".
Recursion is a slightly different way of thinking. I have some problem, and it's not obvious how to solve it. But I observe that if I knew the answer to a simpler problem, I could easily solve the problem at hand. And, moreover, there are some very simple problems which I can solve directly.
The recursive solution is based on using a simpler (smaller, fewer, whatever) problem to solve the problem at hand. How do I find out which pairs of objects in a set of objects collide?
If the set has fewer than 2 elements, there are no pairs. That's the simplest problem, and it has an obvious solution: the empty set.
Otherwise, I select some object. All colliding pairs either include this object, or they don't. So that gives me two subproblems.
The set of collisions which don't involve the selected object is obviously the same problem which I started with, but with a smaller set. So I've replaced this part of the problem with a smaller problem. That's one recursion.
But I also need the set of objects which the selected object collides with (which might be an empty set). That's a simpler problem, because now one element of each pair is known. I can solve that problem recursively as well:
I need the set of pairs which include the object X and a set S of objects.
If the set is empty, there are no pairs. Simple.
Otherwise, I choose some element from the set. Then I find all the collisions between X and the rest of the set (a simpler but otherwise identical problem).
If there is a collision between X and the selected element, I add that to the set I just found.
Then I return the set.
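For what it's worth, here is the scheme above written out as a sketch in Python; collides(a, b) is a placeholder for whatever collision test you already have, and the function names are just illustrative:

def collides(a, b):
    # placeholder for the real collision test between two objects
    raise NotImplementedError

def pairs_with(x, objects):
    # All pairs (x, y) such that x collides with some y in objects.
    if not objects:                      # base case: nothing left to check
        return []
    first, rest = objects[0], objects[1:]
    pairs = pairs_with(x, rest)          # simpler problem: the rest of the set
    if collides(x, first):
        pairs.append((x, first))
    return pairs

def colliding_pairs(objects):
    # All colliding pairs within objects.
    if len(objects) < 2:                 # base case: no pairs are possible
        return []
    first, rest = objects[0], objects[1:]
    # pairs that involve first, plus pairs entirely within the rest
    return pairs_with(first, rest) + colliding_pairs(rest)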
Technically speaking, you have the right mindset of how recursion works.
Practically speaking, you would not want to use recursion for an instance such as the one you described above. The reason is that every recursive call adds a frame to the stack (which is finite in size), and recursive calls are expensive for the processor; with enough objects you are going to run into some serious bottlenecking in a large application. With enough recursive calls you would end up with a stack overflow, which is exactly what you get with "infinite recursion". You never want something to recurse infinitely; it goes against the fundamental principle of recursion.
Recursion works on two defining characteristics:
A base case can be defined: it is possible to eventually reach an input small enough (reaching 0 or 1, depending on your needs) to be solved directly.
A general case can be defined: the general case is called repeatedly, reducing the problem set until your base case is reached.
Once you have defined both cases, you can define a recursive solution.
The point of recursion is to take a very large and difficult-to-solve problem and continually break it down until it's easy to work with.
Once our base case is reached, the calls "recurse out". This means they bounce backwards, back into the functions that called them, bringing with them the data from the calls below!
It is at this point that our operations actually occur.
Once the original function is reached, we have our final result.
For example, let's say you want the summation of the first 3 integers. The first recursive call is passed the number 3.
public static int summation(int num) {
    // Base case
    if (num == 1) {
        return 1;
    }
    // General case
    return num + summation(num - 1);
}
Walking through the function calls:
summation(3); // initial function call
// becomes...
3 + summation(2)  =>  3 + (2 + summation(1))  =>  3 + (2 + 1)
This gives us a result of 6.
Your scenario seems to me like iterative programming, but your function is simply calling itself as a way of continuing its comparisons. That is simply re-tasking your function to be able to call itself with a smaller list.
In my experience, a recursive function has more potential to branch out into multiple 'threads' (so to speak), and is used to process information the same way the hierarchy in a company works for delegation: the boss hands a contract down to the managers, who divide up the work and hand it to their respective staff, the staff get it done and hand it back to the managers, who report back to the boss.
The best example of a recursive function is one that iterates through all files on a file system. ( I will do this in pseudo code because it works in all languages).
function find_all_files (directory_name)
{
    - Check the given directory name for sub-directories within it
    - For each sub-directory:
          find_all_files(directory_name + subdirectory_name)
    - Check the given directory for files
    - Do your processing of the filename; it is located at directory_name + filename
}
You use the function by calling it with a directory path as the parameter. The first thing it does is, for each subdirectory it finds, build the full path to that subdirectory and call find_all_files() with it. As long as there are sub-directories in the given directory, it will keep calling itself.
Now, when the function reaches a directory that contains only files, it is allowed to proceed to the part where it processes the files. Once it has done that, it exits and returns to the previous instance of itself, which is still iterating through directories.
It continues to process directories and files until it has completed all iterations and returns to the main program flow where you called the original instance of find_all_files in the first place.
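For reference, here is the same idea written out in runnable form in Python (using the standard os module; the print call stands in for whatever processing you need):

import os

def find_all_files(directory_name):
    entries = sorted(os.listdir(directory_name))
    # For each sub-directory, call ourselves with its full path.
    for entry in entries:
        path = os.path.join(directory_name, entry)
        if os.path.isdir(path):
            find_all_files(path)
    # Then process the files found in this directory.
    for entry in entries:
        path = os.path.join(directory_name, entry)
        if os.path.isfile(path):
            print(path)  # replace with your own processing

# find_all_files("/some/starting/directory")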
One additional note: Sometimes global variables can be handy with recursive functions. If your function is merely searching for the first occurrence of something, you can set an "exit" variable as a flag to "stop what you are doing now!". You simply add checks for the flag's status during any iterations you have going on inside the function (as in the example, the iteration through all the sub-directories). Then, when the flag is set, your function just exits. Since the flag is global, all generations of the function will exit and return to the main flow.

Derivative of a program

Let us assume you can represent a program as a mathematical function; that is possible. What does the program representation of the first derivative of that function look like? Is there a way to transform a program into its "derivative" form, and does this make sense at all?
Yes it does make sense, it's known as Automatic Differentiation. There are one or two experimental compilers which can do this, for example NAGware's Differentiation Enabled Fortran Compiler Technology. And there are a lot of research papers on the topic. I suggest you get Googling.
First, it only makes sense to try to get the derivative of a pure function (one that does not affect external state and returns the exact same output for every input). Second, the type systems of many programming languages involve a lot of step functions (e.g. integers), meaning you'd have to get your program to work in terms of continuous functions in order to get a valid first derivative. Third, getting the derivative of any function involves breaking it down and manipulating it symbolically. Thus, you can't get the derivative of a function without knowing what operations it is made of. This could be achieved with reflection.
You could create a derivative approximation function if your programming language supports closures (that is, nested functions and the ability to put functions into variables and return them). Here is a JavaScript example taken from http://en.wikipedia.org/wiki/Closure_%28computer_science%29 :
function derivative(f, dx) {
    return function(x) {
        return (f(x + dx) - f(x)) / dx;
    };
}
Thus, you could say:
function f(x) { return x*x; }
f_prime = derivative(f, 0.0001);
Here, f_prime will approximate function(x) {return 2*x;}
If a programming language implemented higher-order functions and enough algebra, one could implement a real derivative function in it. That would be really cool.
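For instance, in Python the SymPy library already provides this kind of symbolic differentiation (SymPy is not mentioned above, so take this as just one possible illustration):

from sympy import symbols, diff, lambdify

x = symbols('x')
expr = x**2

d_expr = diff(expr, x)         # symbolic derivative: 2*x
print(d_expr)

f_prime = lambdify(x, d_expr)  # turn the symbolic result back into a callable
print(f_prime(3))              # 6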
See Lambda the Ultimate discussions on Derivatives and dissections of data types and Derivatives of Regular Expressions
How do you define the mathematical function of a program?
A derivative represents the rate of change of a function. If your function isn't continuous, its derivative will be undefined over most of the domain.
I'm just gonna say that this doesn't make a lot of sense, as a program is much more abstract and "ruleless" than a mathematical function. As a derivative is a measure of the change in output as the input changes, there are certainly some programs where this could apply. However, you'd need to be able to quantify both your input and your output in numerical terms.
Since input and output would both be numerical, it's reasonable to assume that your program represents, or operates similarly to, a mathematical function or series of functions. Hence, you can easily represent a derivative, but it would be no different than converting the mathematical derivative of a function to a computer program.
If the program is denoted as a distribution (in the Schwartz sense), then you have some notion of a derivative, assuming that test functions model your postcondition (you can still take the limit to get a characteristic function). For instance, the assignment x := x+1 is associated with the Dirac distribution \delta_{x_0+1}, where x_0 is the initial value of the variable x. However, I have no idea what the computational meaning of \delta_{x_0+1}' is.
I am wondering: what if the program you're trying to "derive" uses some form of heuristics? How can it be derived then?
Half-jokingly, we all know that all real programs use at least a rand().

Resources