Anonymous Scala function syntax - syntax

I'm learning more about Scala, and I'm having a little trouble understanding the example of anonymous functions in http://www.scala-lang.org/node/135. I've copied the entire code block below:
object CurryTest extends Application {
def filter(xs: List[Int], p: Int => Boolean): List[Int] =
if (xs.isEmpty) xs
else if (p(xs.head)) xs.head :: filter(xs.tail, p)
else filter(xs.tail, p)
def modN(n: Int)(x: Int) = ((x % n) == 0)
val nums = List(1, 2, 3, 4, 5, 6, 7, 8)
println(filter(nums, modN(2)))
println(filter(nums, modN(3)))
}
I'm confused with the application of the modN function
def modN(n: Int)(x: Int) = ((x % n) == 0)
In the example, it's called with one argument
modN(2) and modN(3)
What does the syntax of modN(n: Int)(x: Int) mean?
Since it's called with one argument, I'm assuming they're not both arguments, but I can't really figure out how the values from nums get used by the mod function.

This is a fun thing in functional programming called currying. Basically Moses Schönfinkel and latter Haskell Curry (Schonfinkeling would sound weird though...) came up with the idea that calling a function of multiple arguments, say f(x,y) is the same as the chain of calls {g(x)}(y) or g(x)(y) where g is a function that produces another function as its output.
As an example, take the function f(x: Int, y: Int) = x + y. A call to f(2,3) would produce 5, as expected. But what happens when we curry this function - redefine it as f(x:Int)(y: Int)and call it as f(2)(3). The first call, f(2) produces a function taking an integer y and adding 2 to it -> therefore f(2) has type Int => Int and is equivalent to the function g(y) = 2 + y. The second call f(2)(3) calls the newly produced function g with the argument 3, therefore evaluating to 5, as expected.
Another way to view it is by stepping through the reduction (functional programmers call this beta-reduction - it's like the functional way of stepping line by line) of the f(2)(3) call (note, the following is not really valid Scala syntax).
f(2)(3) // Same as x => {y => x + y}
|
{y => 2 + y}(3) // The x in f gets replaced by 2
|
2 + 3 // The y gets replaced by 3
|
5
So, after all this talk, f(x)(y) can be viewed as just the following lambda expression (x: Int) => {(y: Int) => x + y} - which is valid Scala.
I hope this all makes sense - I tried to give a bit of a background of why the modN(3) call makes sense :)

You are partially applying the ModN function. Partial function application is one of the main features of functional languages. For more information check out these articles on Currying and Pointfree style.

In that example, modN returns a function that mods by the particular N. It saves you from having to do this:
def mod2(x:Int): Boolean = (x%2) == 0
def mod3(x:Int): Boolean = (x%3) == 0
The two pairs of parens delimit where you can stop passing arguments to the method. Of course, you can also just use a placeholder to achieve the same thing even when the method only has a single argument list.
def modN(n: Int, x: Int): Boolean = (x % n) == 0
val nums = List(1, 2, 3, 4, 5)
println(nums.filter(modN(2, _)))
println(nums.filter(modN(3, _)))

Related

PROLOG Asignation problems

I have to solve a problem involving the WWESupercard game. The idea is to to take 2 cards and make them fight, comparing each stat alone and every combination, making 10 comparations in total. A fighter wins if it gets more than 50% of the fights (6/10).
The problem I have is that the "is" statement is not working for me.
This are some of the "solutions" I've tried so far, using only the base stats and no combinations for faster testing, none of them are working as I intend:
leGana(X,Y) :- T is 0,
((A is 0, (pwr(X,Y) -> A1 is A+1));
(B is 0, (tgh(X,Y) -> B1 is B+1));
(C is 0, (spd(X,Y) -> C1 is C+1));
(D is 0, (cha(X,Y) -> D1 is D+1))),
T1 is T+A1+B1+C1+D1, T1 > 2.
leGana(X,Y) :- T is 0,
((pwr(X,Y) -> T is T+1);
(tgh(X,Y) -> T is T+1);
(spd(X,Y) -> T is T+1);
(cha(X,Y) -> T is T+1)),
T1 > 2.
I've tried other ways, but I keep getting the same 2 error, most of the times:
It fails on "T is T+1" saying "0 is 0+1" and failing
On example 1, it does the "A1 is A+1", getting "1 is 0+1", but then it jumps directly to the last line of the code, getting "T1 is 0+1+_1543+_1546 ..."
Also, sorry if the way I wrote the code is not correct, it's my first time asking for help here, so any advice is welcome.
Prolog is a declarative language, which means that a variable can only have a single value.
If you do this:
X is 1, X is 2.
it will fail because you're trying to say that X has both the value 1 and the value 2, which is impossible ... that is is is not assignment but a kind of equality that evaluates the right-hand side.
Similarly
X = 1, X = 2.
will fail.
If you want to do the equivalent of Python's
def f(x):
result = 1
if x == 0:
result += 1
return result
then you need to do something like this:
f(X, Result) :-
Result1 = 1,
( X = 0
-> Result is Result1 + 1
; Result = Result1
).

Haskell State monad vs state as parameter performance test

I start to learn a State Monad and one idea bother me. Instead of passing accumulator as parameter, we can wrap everything to the state monad.
So I wanted to compare performance between using State monad vs passing it as parameter.
So I created two functions:
sum1 :: Int -> [Int] -> Int
sum1 x [] = x
sum1 x (y:xs) = sum1 (x + y) xs
and
sumState:: [Int] -> Int
sumState xs = execState (traverse f xs) 0
where f n = modify (n+)
I compared them on the input array [1..1000000000].
sumState running time was around 15s
sum1 around 5s
We can see clear winner, but the I realised that sumState can be optimised as:
We can use strict version of modify
We do not need necessary the map list output, so we can use traverse_ instead
So the new optimised state function is:
sumState:: [Int] -> Int
sumState xs = execState (traverse_ f xs) 0
where f n = modify' (n+)
which has running time around 350ms. This is a huge improvement. It was shocking.
Why the modified sumState has better performance then sum1? Can sum1 be optimised to match or even be better then sumState?
I also tried other different implementation of sum as
using built in sum function, which gives me around 240ms ((sum [1..x] ::Int))
using strict foldl', which gives me the same result around 240ms (with implicit [Int] -> Int)
Does it actually mean that it is better to use foldl function or State monad to pass accumulator instead of passing it as argument to the function?
Thank you for help.
EDIT:
Each function was in separate file with own main function and compiled with "-O2" flag.
main = do
x <- (read . head ) <$> getArgs
print $ <particular sum function> [1..x]
Runtime was measured via time command on linux.
To give a bit more explanation as to why traverse is slower: traverse f xs has has type State [()] and that [()] (list of unit tuples) is built up during the summation. This prevents further optimizations and would cause a memory leak if you were not using lazy state.
Update: I think GHC should have been able to notice that that list of unit tuples is never used, so I opened a GHC issue.
In both cases, To get the best performance we want to combine (or fuse) the summation with the enumeration [1..x] into a tight recursive loop which simply increments and adds until it reaches x. The resulting code would look something like this:
sumFromTo :: Int -> Int -> Int -> Int
sumFromTo s x y
| x == y = s + x
| otherwise = sumFromTo (s + x) (x + 1) y
This avoids allocations for the list [1..x].
The base library achieves this optimization using foldr/build fusion, also known as short cut fusion. The sum, foldl' and traverse (for lists) functions are implemented using the foldr function and [1..x] is implemented using the build function. The foldr and build function have special optimization rules so that they can be fused. Your custom sum1 function doesn't use foldr and so it can never be fused with [1..x] in this way.
Ironically, the same problem that plagued your implementation of sumState is also the problem with sum1. You don't have strict accumulation, so you build up thunks like so:
sum 0 [1, 2, 3]
sum (0 + 1) [2, 3]
sum ((0 + 1) + 2) [3]
sum (((0 + 1) + 2) + 3) []
(((0 + 1) + 2) + 3)
((1 + 2) + 3)
(3 + 3)
6
If you add strictness to sum1, you should see a dramatic improvement in efficiency because you eliminate the non-tail-recursive evaluation of the thunk (((0 + 1) + 2) + 3), which is the costly part of sum1. Using strict accumulation makes this much more efficient:
sum1 x [] = []
sum1 x (y : xs) = x `seq` sum1 (x + y) xs
should give you comparable performance to sum (although as noted in another answer, GHC may not be able to use fusion properly to give you the truly magical performance of sum on the list [1..x]).

What does it mean to "close over" something?

I'm trying to understand closures, but literally every definition of a closure that I can find uses the same cryptic and vague phrase: "closes over".
What's a closure? "Oh, it's a function that closes over another function."
But nowhere can I find a definition of what "closes over" means. Can someone explain what it means for Thing A to "close over" Thing B?
A closure is a pair consisting of a code pointer and an environment pointer. The environment pointer contains all of the free variables of a given function. For example:
fun f(a, b) =
let fun g(c, d) = a + b + c + d
in g end
val g = f(1, 2)
val result = g(3, 4) (*should be 10*)
The function g contains two free variables: a and b. If you are not familiar with the term free variable, it is a variable that is not defined within the scope of a function. In this context, to close over something, means to remove any occurrences of a free variable from a function. The above example provides good motivation for closures. When the function f returns, we need to be able to remember what the values of a and b are for later. The way this is compiled, is to treat function g as a code pointer and a record containing all the free variables, such as:
fun g(c, d, env) = env.a + env.b + c + d
fun f(a, b, env) = (g, {a = a, b = b})
val (g, gEnv) = f(1, 2)
val result = g(3, 4, gEnv)
When we apply the function g, we supply the environment that was returned when calling function f. Note that now function g no longer has any occurrences of a variable that is not defined in its scope. We typically call a term that doesn't have any free variables as closed. If you are still unclear, Matt Might has an excellent in depth explanation of closure conversion at http://matt.might.net/articles/closure-conversion/
Same example in Javascript
Before closure conversion
function f(a, b){
function g(c, d) {
return a + b + c + d
}
return g
}
var g = f(1, 2)
var result = g(3, 4)
After closure conversion:
function g(c, d, env) {
return env.a + env.b + c + d
}
function f(a, b, env) {
return [g, {"a": a, "b": b}]
}
var [g, gEnv] = f(1, 2)
var result = g(3, 4, gEnv)
From apple documentation
Closures are self-contained blocks of functionality that can be passed
around and used in your code. Closures in Swift are similar to blocks
in C and Objective-C and to lambdas in other programming languages.
But what that means?
It means that a closure captures the variables and constants of the context in which it is defined, referred to as closing over those variables and constants.
I hope that helps!

What's the formal term for a function that can be written in terms of `fold`?

I use the LINQ Aggregate operator quite often. Essentially, it lets you "accumulate" a function over a sequence by repeatedly applying the function on the last computed value of the function and the next element of the sequence.
For example:
int[] numbers = ...
int result = numbers.Aggregate(0, (result, next) => result + next * next);
will compute the sum of the squares of the elements of an array.
After some googling, I discovered that the general term for this in functional programming is "fold". This got me curious about functions that could be written as folds. In other words, the f in f = fold op.
I think that a function that can be computed with this operator only needs to satisfy (please correct me if I am wrong):
f(x1, x2, ..., xn) = f(f(x1, x2, ..., xn-1), xn)
This property seems common enough to deserve a special name. Is there one?
An Iterated binary operation may be what you are looking for.
You would also need to add some stopping conditions like
f(x) = something
f(x1,x2) = something2
They define a binary operation f and another function F in the link I provided to handle what happens when you get down to f(x1,x2).
To clarify the question: 'sum of squares' is a special function because it has the property that it can be expressed in terms of the fold functional plus a lambda, ie
sumSq = fold ((result, next) => result + next * next) 0
Which functions f have this property, where dom f = { A tuples }, ran f :: B?
Clearly, due to the mechanics of fold, the statement that f is foldable is the assertion that there exists an h :: A * B -> B such that for any n > 0, x1, ..., xn in A, f ((x1,...xn)) = h (xn, f ((x1,...,xn-1))).
The assertion that the h exists says almost the same thing as your condition that
f((x1, x2, ..., xn)) = f((f((x1, x2, ..., xn-1)), xn)) (*)
so you were very nearly correct; the difference is that you are requiring A=B which is a bit more restrictive than being a general fold-expressible function. More problematically though, fold in general also takes a starting value a, which is set to a = f nil. The main reason your formulation (*) is wrong is that it assumes that h is whatever f does on pair lists, but that is only true when h(x, a) = a. That is, in your example of sum of squares, the starting value you gave to Accumulate was 0, which is a does-nothing when you add it, but there are fold-expressible functions where the starting value does something, in which case we have a fold-expressible function which does not satisfy (*).
For example, take this fold-expressible function lengthPlusOne:
lengthPlusOne = fold ((result, next) => result + 1) 1
f (1) = 2, but f(f(), 1) = f(1, 1) = 3.
Finally, let's give an example of a functions on lists not expressible in terms of fold. Suppose we had a black box function and tested it on these inputs:
f (1) = 1
f (1, 1) = 1 (1)
f (2, 1) = 1
f (1, 2, 1) = 2 (2)
Such a function on tuples (=finite lists) obviously exists (we can just define it to have those outputs above and be zero on any other lists). Yet, it is not foldable because (1) implies h(1,1)=1, while (2) implies h(1,1)=2.
I don't know if there is other terminology than just saying 'a function expressible as a fold'. Perhaps a (left/right) context-free list function would be a good way of describing it?
In functional programming, fold is used to aggregate results on collections like list, array, sequence... Your formulation of fold is incorrect, which leads to confusion. A correct formulation could be:
fold f e [x1, x2, x3,..., xn] = f((...f(f(f(e, x1),x2),x3)...), xn)
The requirement for f is actually very loose. Lets say the type of elements is T and type of e is U. So function f indeed takes two arguments, the first one of type U and the second one of type T, and returns a value of type U (because this value will be supplied as the first argument of function f again). In short, we have an "accumulate" function with a signature f: U * T -> U. Due to this reason, I don't think there is a formal term for these kinds of function.
In your example, e = 0, T = int, U = int and your lambda function (result, next) => result + next * next has a signaturef: int * int -> int, which satisfies the condition of "foldable" functions.
In case you want to know, another variant of fold is foldBack, which accumulates results with the reverse order from xn to x1:
foldBack f [x1, x2,..., xn] e = f(x1,f(x2,...,f(n,e)...))
There are interesting cases with commutative functions, which satisfy f(x, y) = f(x, y), when fold and foldBack return the same result. About fold itself, it is a specific instance of catamorphism in category theory. You can read more about catamorphism here.

ML Expression, help line by line

val y=2;
fun f(x) = x*y;
fun g(h) = let val y=5 in 3+h(y) end;
let val y=3 in g(f) end;
I'm looking for a line by line explanation. I'm new to ML and trying to decipher some online code. Also, a description of the "let/in" commands would be very helpful.
I'm more familiar with ocaml but it all looks the same to me.
val y=2;
fun f(x) = x*y;
The first two lines bind variables y and f. y to an integer 2 and f to a function which takes an integer x and multiplies it by what's bound to y, 2. So you can think of the function f takes some integer and multiplies it by 2. (f(x) = x*2)
fun g(h) = let val y=5
in
3+h(y)
end;
The next line defines a function g which takes some h (which turns out to be a function which takes an integer and returns an integer) and does the following:
Binds the integer 5 to a temporary variable y.
You can think of the let/in/end syntax as a way to declare a temporary variable which could be used in the expression following in. end just ends the expression. (this is in contrast to ocaml where end is omitted)
Returns the sum of 3 plus the function h applying the argument y, or 5.
At a high level, the function g takes some function, applies 5 to that function and adds 3 to the result. (g(h) = 3+h(5))
At this point, three variables are bound in the environment: y = 2, f = function and g = function.
let val y=3
in
g(f)
end;
Now 3 is bound to a temporary variable y and calls function g with the function f as the argument. You need to remember that when a function is defined, it keeps it's environment along with it so the temporary binding of y here has no affect on the functions g and f. Their behavior does not change.
g (g(h) = 3+h(5)), is called with argument f (f(x) = x*2). Performing the substitutions for parameter h, g becomes 3+((5)*2) which evaluates to 13.
I hope this is clear to you.

Resources