What does "target" do? - wake

In Wake code, I know that def indicates something is a function. However, it looks like there is another keyword, target that also defines functions. For example (taken from here):
global target makeBitstream plan =
...
global def makeMCS plan =
...
Both of these are callable from the command-line. What is the difference between def and target?

I've just written an article about this very topic:
While the wake language is mostly functional, it is not completely pure. Wake has side-effects like running jobs and printing. It also has memoization, a contained means of storing computation described below.
Consider the Fibonacci function:
def fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)
Evaluating this function on 20 completes relatively quickly, but running it on 40 is another story entirely. Wake appears to run forever (really, just a very very long time).
The problem is that each invocation of fib causes two more invocations of fib. This results in a chain reaction where fib is called an exponentially increasing number of times depending on its input.
Let’s try the special target keyword:
target fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)
Now fib 40 completes quickly! In fact, fib 40000 also returns a result.
What has happened here is that target fib now remembers and re-uses the results of previous invocations. fib 4 will call fib 3 and fib 2. fib 3 will call fib 2 and fib 1. However, the common invocation of fib 2 now happens only once. The target remembers the result; this is called memoization.
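The same effect can be sketched in Python rather than Wake (an analogy only; here `functools.lru_cache` plays the role that `target` plays in wake):

```python
from functools import lru_cache

# Python analogy for Wake's `target`: lru_cache remembers and re-uses
# the results of previous invocations, so each distinct n is computed once.
@lru_cache(maxsize=None)
def fib(n):
    return 1 if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(40))  # completes instantly instead of taking a very long time
```

Without the `@lru_cache` line this is the exponential version; with it, the call tree collapses to one evaluation per distinct argument.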
While target is useful for speeding up toy functions, its real use is in saving work in a build system. A wake build system typically includes build rules which invoke further rules upon which they depend. Imagine job C depends on jobs B and A, but job B also depends on job A. We don’t want A to be executed twice!
Fortunately, wake’s Plan API includes by default the option to run jobs once. Internally, this API uses a target to prevent re-execution of the job. However, this use of target will not suffice in a large build.
In a large build, top-level functions which produce a Path for other functions should generally be defined using target. That way, even if the function is invoked twice by dependencies, it will only need to be evaluated once. In a build involving many targets which depend on many targets, the result can be an exponential speed-up, like we saw in the fib example.
A target can also be defined inside a function. These targets only retain their saved values while the enclosing function can access them.
def wrappedFib n =
    target fib n = if n < 2 then 1 else fib (n-1) + fib (n-2)
    fib n
In this example, wrappedFib uses an internal target fib to compute the Fibonacci result. However, between invocations of wrappedFib the partial results are not retained. Nested targets can be useful because they don’t consume memory for the entire execution of wake. For example, a function might need to compute a large number of uninteresting intermediate values in order to compute the value of interest (which might be saved).
One way to think about a target is that it defines a table, like in a database or a key-value map. E.g., target foo x y = z defines a table with the key Pair x y and the value z. From that point of view, it is perhaps unsurprising that it is sometimes useful to compute z with some inputs which are not part of the key.
target myWrite filename \ contents = write filename contents
In the above example, myWrite "file" "content" will create a file called file and fill it with the string content, returning a Path for the created file. If someone tries to write the same file with the same contents again, then the same Path will be returned.
However, what if someone tried to write the same file, but with different contents? If we allowed that, there would be a race condition in the build system! Let’s see what happens:
$ wake -x '("bar", "bar", Nil) | map (myWrite "foo")'
Path "foo", Path "foo", Nil
$ wake -x '("bar", "baz", Nil) | map (myWrite "foo")'
ERROR: Target subkey mismatch for 'myWrite filename \ contents' (demo.wake:1:[8-34])
In the first invocation, both calls succeed, and foo was only created once. In the second invocation, one of the calls fails with a Target subkey mismatch. These failures are fatal in wake because it is never clear which invocation failed, due to the out-of-order parallel evaluation strategy used by wake. Nevertheless, it is probably better for a buggy build to fail spectacularly than to succeed only sporadically.
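One way to picture the table-with-subkey behaviour is a small Python sketch (purely illustrative; this is not how wake is actually implemented):

```python
# Hypothetical sketch of a target's table with a subkey check.
# The key identifies the table row; the subkey is remembered alongside
# the value and must match on every later lookup.
class Target:
    def __init__(self, fn):
        self.fn = fn
        self.table = {}  # key -> (subkey, saved value)

    def __call__(self, key, subkey):
        if key in self.table:
            saved_subkey, value = self.table[key]
            if saved_subkey != subkey:
                # Same key, different subkey: a race-prone build, so fail hard.
                raise RuntimeError("Target subkey mismatch")
            return value
        value = self.fn(key, subkey)
        self.table[key] = (subkey, value)
        return value

# Stand-in for myWrite: the filename is the key, the contents the subkey.
my_write = Target(lambda filename, contents: "Path " + filename)
```

Calling `my_write("foo", "bar")` twice returns the same saved value; calling it again with different contents raises, mirroring the fatal mismatch above.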
Finally, be warned of this common target gotcha:
target foo x = match _
    None = None
    Some y = x + y
Normally in wake, for ordinary definitions (def), the function above would be the same as this:
target foo x y = match y
    None = None
    Some y = x + y
However, in the target situation, these are quite different. The first target foo is memoizing a function result, while the second target foo is memoizing an Integer result (probably what was intended).


What does C(C) do in BASIC?

I'm currently trying to understand this BASIC program. I especially have issues with this part:
DIM C(52),D(52)
FOR D=D TO 1 STEP -1
C=C-1
C(C)=D(D)
NEXT D
I guess that it is a for-loop which starts at D where the last executed iteration is D=1 (hence inclusive?)
What does C(C) do? C is an array with 52 elements and I assumed C(X) is an access to the X-th element of the array C. But what does it do when the parameter is C itself?
In the original BASIC program, there is a GOTO 1500 on line 90, which comes before lines 16-19, which you’ve reproduced here. Line 1500 is the start of the program’s main loop. This particular programmer uses the (not uncommon) pattern of placing subroutines at the beginning of their BASIC program, using a GOTO to jump to the main code.
The code you’ve reproduced from the Creative Computing program you’ve linked is a subroutine to “get a card”, as indicated by the comment above that section of code:
100 REM--SUBROUTINE TO GET A CARD. RESULT IS PUT IN X.
REM is a BASIC statement; it stands for “remark”. In modern parlance, it’s a comment.
In BASIC, arrays, strings, and numbers are in separate namespaces. This means that you can (and commonly do) have the same variable name for arrays as for the integer that you use to access the array. The following variables would all be separate in BASIC and not overwrite each other:
C = 12
C(5) = 33
C$ = "Jack of Spades"
C$(5) = "Five of Hearts"
Line 1 is a numeric variable called C.
Line 2 is a numeric array called C.
Line 3 is a string called C.
Line 4 is a string array called C.
A single program could contain all four of those variables without conflict. This is not unknown in modern programming languages; Perl, for example, has very similar behavior. A Perl script can have a number, a string, an array, and a hash all with the same name without conflicting.
If you look at line 1500 of the program you linked and follow through, you’ll see that the variable C is initialized to 53. This means that the first time this subroutine is called, C starts at 53, and gets immediately decremented to 52, which is the number of cards. After the program has run a bit, the value of C will vary.
Basically, this bit of code copies to the array C some values in the array D. It chooses which values of D() to copy to C() using the (most likely integer) numeric variables C and D. As the code steps through D from D’s initial value down to 1, C is also decremented by 1.
If D begins with value 3, and C begins with value 10, this happens:
C(9) = D(3)
C(8) = D(2)
C(7) = D(1)
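That stepping pattern can be sketched in Python (not BASIC; the starting values are the same hypothetical ones used above, and index 0 of each list is simply left unused to mimic BASIC's 1-based arrays):

```python
# Python simulation of the BASIC loop, line by line.
def get_card(c_arr, d_arr, c, d):
    while d >= 1:            # FOR D=D TO 1 STEP -1
        c -= 1               # C=C-1
        c_arr[c] = d_arr[d]  # C(C)=D(D)
        d -= 1               # NEXT D
    return c

c_arr = [0] * 53
d_arr = [0] * 53
d_arr[1], d_arr[2], d_arr[3] = 101, 102, 103  # illustrative card values
c = get_card(c_arr, d_arr, c=10, d=3)
# c_arr[9], c_arr[8], c_arr[7] now hold d_arr[3], d_arr[2], d_arr[1]
```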
Note that this example is purely hypothetical; I have not examined the code closely enough to verify that this combination of values is one that can occur in a program run.
A couple of caveats. There are many variations of BASIC, and few absolutes among them. For example, some BASIC dialects will use what looks like a string array as a means of accessing substrings and sometimes even modifying substrings within a string. In these dialects, C$(2) will be the second (or third, if zero-based) character in the string C$. The BASIC program you’ve linked does not appear to be one of those variants, since it uses LEFT$ and MID$ to access substrings.
Second, many BASIC dialects include a DEFSTR command, which defines a variable as a string variable without having to use the “$” marker. If a variable were defined in this manner as a string, it is no longer available as a number. This will often be true of both the scalar and the array forms. For example, consider this transcript using TRS-80 Model III BASIC:
READY
>10 DEFSTR C
>20 C = "HELLO, WORLD"
>30 PRINT C
>40 C(3) = 5
>RUN
HELLO, WORLD
?TM Error IN 40
READY
>
The program successfully accepts a string into the variable C, and prints it; it displays a “Type Mismatch Error” on attempting to assign a number to element 3 of the array C. That’s because DEFSTR C defines both C and C() as strings, and it becomes an error to attempt to assign a number to either of them.
The program you’ve linked likely (but not definitely) runs on a BASIC that supports DEFSTR. However, the program does not make use of it.
Finally, many variants will have a third type of variable for integers, which will not conflict with the others; often, this variable is identified by a “%” in the same way that a string is identified by a “$”:
C = 3.5
C% = 4
C$ = "FOUR"
In such variants, all three of these are separate variables and do not conflict with each other. You’ll often see a DEFINT C at the top of code that uses integers, to define that variable (and the array with the same name) as an integer, to save memory and to make the program run more quickly. BASICs of the era often performed integer calculations significantly faster than floating point/real calculations.

Is having only one argument functions efficient? Haskell

I have started learning Haskell, and I have read that every function in Haskell takes only one argument. I can't understand what magic happens under the hood that makes this possible, and I am wondering whether it is efficient.
Example
>:t (+)
(+) :: Num a => a -> a -> a
The signature above means that the (+) function takes one Num and then returns another function, which takes one Num and returns a Num.
Example 1 is relatively easy but I have started wondering what happens when functions are a little more complex.
My Questions
For the sake of the example, I have written a zipWith function and executed it in two ways: once passing one argument at a time, and once passing all arguments at once.
zipwithCustom f (x:xs) (y:ys) = f x y : zipwithCustom f xs ys
zipwithCustom _ _ _ = []
zipWithAdd = zipwithCustom (+)
zipWithAddTo123 = zipWithAdd [1,2,3]
test1 = zipWithAddTo123 [1,1,1]
test2 = zipwithCustom (+) [1,2,3] [1,1,1]
>test1
[2,3,4]
>test2
[2,3,4]
Is passing one argument at a time (scenario_1) as efficient as passing all arguments at once (scenario_2)?
Are those scenarios any different in terms of what Haskell is actually doing to compute test1 and test2 (except that scenario_1 probably takes more memory, as it needs to save zipWithAdd and zipWithAddTo123)?
Is this correct and why? In scenario_1 I iterate over [1,2,3] and then over [1,1,1]
Is this correct and why? In scenario_1 and scenario_2 I iterate over both lists at the same time
I realise that I have asked a lot of questions in one post but I believe those are connected and will help me (and other people who are new to Haskell) to better understand what actually is happening in Haskell that makes both scenarios possible.
You ask about "Haskell", but Haskell the language specification doesn't care about these details. It is up to implementations to choose how evaluation happens -- the only thing the spec says is what the result of the evaluation should be, and carefully avoids giving an algorithm that must be used for computing that result. So in this answer I will talk about GHC, which, practically speaking, is the only extant implementation.
For (3) and (4) the answer is simple: the iteration pattern is exactly the same whether you apply zipWithCustom to arguments one at a time or all at once. (And that iteration pattern is to iterate over both lists at once.)
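For comparison, the same two call patterns can be written in Python with functools.partial (a loose analogy only; Python evaluates eagerly and has none of GHC's closure machinery):

```python
from functools import partial
from operator import add

def zipwith_custom(f, xs, ys):
    # Iterates over both lists at once, like the Haskell version.
    return [f(x, y) for x, y in zip(xs, ys)]

zip_with_add = partial(zipwith_custom, add)          # one argument at a time
zip_with_add_to_123 = partial(zip_with_add, [1, 2, 3])
test1 = zip_with_add_to_123([1, 1, 1])
test2 = zipwith_custom(add, [1, 2, 3], [1, 1, 1])    # all arguments at once
assert test1 == test2 == [2, 3, 4]
```

Either way, the list traversal itself is identical; only how the arguments are accumulated beforehand differs.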
Unfortunately, the answer for (1) and (2) is complicated.
The starting point is the following simple algorithm:
When you apply a function to an argument, a closure is created (allocated and initialized). A closure is a data structure in memory, containing a pointer to the function and a pointer to the argument. When the function body is executed, any time its argument is mentioned, the value of that argument is looked up in the closure.
That's it.
However, this algorithm kind of sucks. It means that if you have a 7-argument function, you allocate 7 data structures, and when you use an argument, you may have to follow a 7-long chain of pointers to find it. Gross. So GHC does something slightly smarter. It uses the syntax of your program in a special way: if you apply a function to multiple arguments, it generates just one closure for that application, with as many fields as there are arguments.
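The naive one-closure-per-argument scheme can be sketched with Python closures (Python scopes standing in for GHC's heap-allocated closures; this is only an illustration of the shape, not of GHC's actual data layout):

```python
# Each application allocates one closure capturing one argument, so the
# innermost body reaches x through a chain of enclosing scopes.
def add3(x):
    def need_y(y):        # closure capturing x
        def need_z(z):    # closure capturing x and y
            return x + y + z
        return need_z
    return need_y

step = add3(1)    # first closure allocated
step = step(2)    # second closure allocated
result = step(3)  # body finally runs, looking x and y up in the chain
```

Applying all three arguments in one source expression is what lets a compiler build a single closure with three fields instead of this chain.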
(Well... that might be not quite true. Actually, it tracks the arity of every function -- defined again in a syntactic way as the number of arguments used to the left of the = sign when that function was defined. If you apply a function to more arguments than its arity, you might get multiple closures or something, I'm not sure.)
So that's pretty nice, and from that you might think that your test1 would then allocate one extra closure compared to test2. And you'd be right... when the optimizer isn't on.
But GHC also does lots of optimization stuff, and one of those is to notice "small" definitions and inline them. Almost certainly with optimizations turned on, your zipWithAdd and zipWithAddTo123 would both be inlined anywhere they were used, and we'd be back to the situation where just one closure gets allocated.
Hopefully this explanation gets you to where you can answer questions (1) and (2) yourself, but just in case it doesn't, here are explicit answers to those:
Is passing one argument at the time as efficient as passing all arguments at once?
Maybe. It's possible that passing arguments one at a time will be converted via inlining to passing all arguments at once, and then of course they will be identical. In the absence of this optimization, passing one argument at a time has a (very slight) performance penalty compared to passing all arguments at once.
Are those scenarios any different in terms of what Haskell is actually doing to compute test1 and test2?
test1 and test2 will almost certainly be compiled to the same code -- possibly even to the point that only one of them is compiled and the other is an alias for it.
If you want to read more about the ideas in the implementation, the Spineless Tagless G-machine paper is much more approachable than its title suggests, and only a little bit out of date.

In Tensorflow, what is the difference between Session.partial_run and Session.run?

I always thought that Session.run required all placeholders in the graph to be fed, while Session.partial_run needed only the ones specified through Session.partial_run_setup, but on looking further, that is not the case.
So how exactly do the two methods differentiate? What are the advantages/disadvantages of using one over the other?
With tf.Session.run, you usually give some inputs and expected outputs, and TensorFlow runs the operations in the graph to compute and return those outputs. If you later want to get some other output, even if it is with the same input, you have to run again all the necessary operations in the graph, even if some intermediate results will be the same as in the previous call. For example, consider something like this:
import tensorflow as tf
input_ = tf.placeholder(tf.float32)
result1 = some_expensive_operation(input_)
result2 = another_expensive_operation(result1)
with tf.Session() as sess:
    x = ...
    sess.run(result1, feed_dict={input_: x})
    sess.run(result2, feed_dict={input_: x})
Computing result2 will require to run both the operations from some_expensive_operation and another_expensive_operation, but actually most of the computation is repeated from when result1 was calculated. tf.Session.partial_run allows you to evaluate part of a graph, leave that evaluation "on hold" and complete it later. For example:
import tensorflow as tf
input_ = tf.placeholder(tf.float32)
result1 = some_expensive_operation(input_)
result2 = another_expensive_operation(result1)
with tf.Session() as sess:
    x = ...
    h = sess.partial_run_setup([result1, result2], [input_])
    sess.partial_run(h, result1, feed_dict={input_: x})
    sess.partial_run(h, result2)
Unlike before, here the operations from some_expensive_operation will only be run once in total, because the computation of result2 is just a continuation of the computation of result1.
This can be useful in several contexts, for example if you want to split the computational cost of a run into several steps, but also if you need to do some mid-evaluation checks out of TensorFlow, such as computing an input to the second half of the graph that depends on an output of the first half, or deciding whether or not to complete an evaluation depending on an intermediate result (these may also be implemented within TensorFlow, but there may be cases where you do not want that).
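Outside of TensorFlow, the saving is easy to mimic in plain Python by holding on to the intermediate result instead of recomputing it (a toy stand-in with made-up operations, not real TensorFlow semantics):

```python
# Toy cost model: count how often the first expensive stage runs.
calls = {"expensive": 0}

def some_expensive_operation(x):
    calls["expensive"] += 1
    return x * 2

def another_expensive_operation(r):
    return r + 1

# Like two independent sess.run calls: the first stage runs twice.
r1 = some_expensive_operation(10)
r2 = another_expensive_operation(some_expensive_operation(10))
assert calls["expensive"] == 2

# Like partial_run: keep the intermediate result and continue from it.
calls["expensive"] = 0
r1 = some_expensive_operation(10)
r2 = another_expensive_operation(r1)
assert calls["expensive"] == 1
```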
Note too that it is not only a matter of avoiding repeated computation. Many operations have state that changes on each evaluation, so the result of two separate evaluations and of one evaluation divided into two partial ones may actually be different. This is the case with random operations, where you get a new value per run, and with other stateful objects like iterators. Variables are also obviously stateful, so operations that change variables (like tf.assign or optimizers) will not produce the same results when they are run once as when they are run twice.
In any case, note that, as of v1.12.0, partial_run is still an experimental feature and is subject to change.

OCaml: Stack_overflow exception in pervasives.ml

I got a Stack_overflow error in my OCaml program lately. If I turn on backtracing, I see the exception is raised by a "primitive operation" in pervasives.ml, line 270. I went into the OCaml source code and saw that line 270 defines the function @ (i.e. list append). I don't get any other information from the backtrace, not even where the exception gets thrown in my program. I switched to bytecode and tried ocamldebug, and it doesn't help (no backtrace generated).
I thought this was an extremely weird situation. The only places in my program where I use a list are (a) building a list containing integers 1 to 1000000, (b) in-order traversing an RBT and putting the result into a list, and (c) printing a list of integers containing ostensibly 1000000 numbers. I've tested all the functions and none of them could contain an infinite loop, and I thought 1000000 isn't even a huge number. Moreover, I've tried the equivalent of my program in Haskell (GHC), Scala and SML (MLton), and all of those versions worked perfectly and in a reasonably short amount of time. So, the question is: what could be going on? Can I debug it?
The @ operator is not tail-recursive in the OCaml standard library,
let rec ( @ ) l1 l2 =
  match l1 with
  | [] -> l2
  | hd :: tl -> hd :: (tl @ l2)
Thus calling it with large lists (as the left argument) will overflow your stack.
It could be possible, that you're building your list by appending a new element to the end of the already generated list, e.g.,
let rec init n x = if n > 0 then init (n-1) x @ [x] else []
This has time complexity n^2 and will consume n slots in the stack space.
Concerning the general question - how to debug such stack overflows, my usual recipe is to reduce the stack size, so that the problem is triggered as soon as possible before the trace is bloated, e.g.,
OCAMLRUNPARAM=b,l=1024 ocaml ./test.ml
If you're compiling your OCaml code to the native code, then you need to pass the -g option to the compiler, so that it can produce backtraces. Also, in the native execution, the size of the stack is controlled by the operating system and should be set using the corresponding mechanism of your OS, for example with ulimit in GNU/Linux, e.g., ulimit -s 1024.
As a bonus track, the following init function is tail recursive and will have O(N) time complexity and will take O(1) stack space:
let init n x =
  let rec loop n xs =
    if n = 0 then xs else loop (n-1) (x :: xs) in
  loop n []
The idea is to use an accumulator list and build the list in the heap space.
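The same contrast can be sketched in Python (only an analogy for the shape of the computation; Python lists are arrays, not cons cells, and Python has no tail-call optimisation):

```python
# Quadratic: rebuilds the whole list at each step, like `init (n-1) x @ [x]`.
def init_slow(n, x):
    if n > 0:
        return init_slow(n - 1, x) + [x]  # copies the list on every step
    return []

# Linear: threads an accumulator through a loop, like the tail-recursive `loop`.
def init_fast(n, x):
    acc = []
    while n > 0:
        acc.append(x)  # constant-time step, like consing onto the accumulator
        n -= 1
    return acc
```

Both produce the same list, but only the accumulator version does an amount of work proportional to n.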
If you don't like thinking about tail-recursiveness then you can use Janestreet Base library (or Core), or Batteries library. They both provide tail-recursive versions of the init function, as well as guarantees that all other functions are tail-recursive.
List functions in the standard library are optimised for small lists and are not necessarily tail-recursive, with the partial justification that lists are not an efficient data structure for storing large amounts of data (note that Haskell lists are lazy and thus are quite different from OCaml's eager lists).
In particular, if you get a stack overflow error using @, you are quite probably implementing an algorithm with quadratic time complexity, due to the fact that @'s complexity is linear in the size of its left argument.
There are probably far better data structures than lists for your problem; if you want iteration, for instance, the sequence library or any other form of iterator would be far more efficient.
With all the caveats stated before, it is relatively straightforward to define a tail-recursive but inefficient version of the standard library function, e.g.:
let ( @! ) x y = List.rev_append (List.rev x) y
Another option is to use the containers library or any of the extended standard libraries (batteries or base essentially): all of those libraries reimplement tail-recursive version of list functions.

Haskell: Caches, memoization, and referential transparency [duplicate]

I can't figure out why m1 is apparently memoized while m2 is not in the following:
m1 = ((filter odd [1..]) !!)
m2 n = ((filter odd [1..]) !! n)
m1 10000000 takes about 1.5 seconds on the first call, and a fraction of that on subsequent calls (presumably it caches the list), whereas m2 10000000 always takes the same amount of time (rebuilding the list with each call). Any idea what's going on? Are there any rules of thumb as to if and when GHC will memoize a function? Thanks.
GHC does not memoize functions.
It does, however, compute any given expression in the code at most once per time that its surrounding lambda-expression is entered, or at most once ever if it is at top level. Determining where the lambda-expressions are can be a little tricky when you use syntactic sugar like in your example, so let's convert these to equivalent desugared syntax:
m1' = (!!) (filter odd [1..]) -- NB: See below!
m2' = \n -> (!!) (filter odd [1..]) n
(Note: The Haskell 98 report actually describes a left operator section like (a %) as equivalent to \b -> (%) a b, but GHC desugars it to (%) a. These are technically different because they can be distinguished by seq. I think I might have submitted a GHC Trac ticket about this.)
Given this, you can see that in m1', the expression filter odd [1..] is not contained in any lambda-expression, so it will only be computed once per run of your program, while in m2', filter odd [1..] will be computed each time the lambda-expression is entered, i.e., on each call of m2'. That explains the difference in timing you are seeing.
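A loose Python analogy (Python has none of GHC's sharing machinery, but the once-per-program versus once-per-call distinction is the same; the counter is just there to make the difference observable):

```python
# A top-level value is computed once at module load, like m1's shared list;
# an expression inside a function body is re-run on every call, like m2's.
calls = {"n": 0}

def build_odds():
    calls["n"] += 1
    return [i for i in range(1, 1000) if i % 2 == 1]

ODDS = build_odds()            # like m1': computed once, then shared

def m1(n):
    return ODDS[n]

def m2(n):
    return build_odds()[n]     # like m2': the list is rebuilt per call
```

Calling m1 repeatedly never rebuilds the list; every call to m2 does.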
Actually, some versions of GHC, with certain optimization options, will share more values than the above description indicates. This can be problematic in some situations. For example, consider the function
f = \x -> let y = [1..30000000] in foldl' (+) 0 (y ++ [x])
GHC might notice that y does not depend on x and rewrite the function to
f = let y = [1..30000000] in \x -> foldl' (+) 0 (y ++ [x])
In this case, the new version is much less efficient because it will have to read about 1 GB from memory where y is stored, while the original version would run in constant space and fit in the processor's cache. In fact, under GHC 6.12.1, the function f is almost twice as fast when compiled without optimizations than it is compiled with -O2.
m1 is computed only once because it is a Constant Applicative Form, while m2 is not a CAF, and so is computed for each evaluation.
See the GHC wiki on CAFs: http://www.haskell.org/haskellwiki/Constant_applicative_form
There is a crucial difference between the two forms: the monomorphism restriction applies to m1 but not m2, because m2 has explicitly given arguments. So m2's type is general but m1's is specific. The types they are assigned are:
m1 :: Int -> Integer
m2 :: (Integral a) => Int -> a
Most Haskell compilers and interpreters (all of them that I know of actually) do not memoize polymorphic structures, so m2's internal list is recreated every time it's called, where m1's is not.
I'm not sure, because I'm quite new to Haskell myself, but it appears that it's because the second function is parametrized and the first one is not. The nature of a function is that its result depends on the input value, and in the functional paradigm especially it depends ONLY on the input. The obvious implication is that a function with no parameters always returns the same value, over and over, no matter what.
Apparently there's an optimizing mechanism in the GHC compiler that exploits this fact to compute the value of such a function only once for the whole program runtime. It does it lazily, to be sure, but does it nonetheless. I noticed it myself when I wrote the following function:
primes = filter isPrime [2..]
  where
    isPrime n = null [factor | factor <- [2..n-1], factor `divides` n]
      where f `divides` n = (n `mod` f) == 0
Then to test it, I entered GHCi and wrote: primes !! 1000. It took a few seconds, but finally I got the answer: 7927. Then I called primes !! 1001 and got the answer instantly. Similarly, I got the result of take 1000 primes in an instant, because Haskell had already computed the whole thousand-element list in order to return the 1001st element before.
Thus if you can write your function such that it takes no parameters, you probably want it. ;)
