Let and construct versus let in sequence - coding-style

Consider this OCaml code:
let coupe_inter i j cases =
let lcases = Array.length cases in
let low,_,_ = cases.(i)
and _,high,_ = cases.(j) in
low,high,
Array.sub cases i (j-i+1),
case_append (Array.sub cases 0 i) (Array.sub cases (j+1) (lcases-(j+1)))
Why is the expression let ... and ... in used in place of a let ... in let ... in sequence (as F# forces you to do)? This construct seems quite frequent in OCaml code.
Thanks!

let x = a and y = b in c has the effect of defining x and y "simultaneously". This means that the order of evaluation (a after or before b) is unspecified (you must not assume that a will be evaluated before b), and that x is not bound in b and y is not bound in a; they are only available in c.
I rarely use this construction, because I have been bitten in the past by the evaluation order thing. I often use the recursive variant of it, let rec ... and ... in ... (where all bound variables are available everywhere), however, to define mutually recursive functions.
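A small sketch of the scoping difference, not taken from the original answer; the names simultaneous and sequential are made up for illustration:

```ocaml
(* With `and`, both right-hand sides see the OUTER x and y. *)
let simultaneous =
  let x = 1 and y = 2 in
  let x = x + y and y = x + y in (* both use x = 1, y = 2 *)
  (x, y)

(* With a sequence of lets, the second binding sees the NEW x. *)
let sequential =
  let x = 1 and y = 2 in
  let x = x + y in (* x is now 3 *)
  let y = x + y in (* uses the new x: 3 + 2 *)
  (x, y)
```

Here simultaneous evaluates to (3, 3) and sequential to (3, 5).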
let rec even n = (n = 0) || odd (n - 1)
and odd n = (n <> 0) && even (n - 1)

In F# let ... and ... is prohibited, but you still can write:
let rec low,_,_ = cases.[i]
and _,high,_ = cases.[j]
As @gasche said, let rec ... and ... is mainly used for defining mutually recursive functions/types. I think using a sequence of let is more intuitive and less error-prone, and hence should be preferred.

Related

How can I emulate the results of this if then then statement while using correct syntax?

I'm working on an exercise for a university class and can't seem to express what I am trying to do with correct syntax in OCaml. I want the function sum_positive to sum all the positive integers in the list into a single int value and return that value.
let int x = 0 in
let rec sum_positive (ls: int list) = function
|h::[] -> x (*sum of positive ints in list*)
|[] -> 0
|h::t -> if (h >= 0) then x + h then sum_positive t else sum_positive t (*trying to ensure that sum_positive t will still run after the addition of x + h*)
On compiling I am met with this error,
File "functions.ml", line 26, characters 34-38:
Error: Syntax error
This points to the then then statement I have in there. I know it cannot work, but I can't think of any other representation that would.
You have if ... then ... then which is not syntactically valid.
It seems what you're asking is how to write what you have in mind in a way that is syntactically valid. But it's not clear what you have in mind.
You can evaluate two expressions in OCaml sequentially (one after the other) by separating them with ;. Possibly that is what you have in mind.
However it seems to me your code has bigger problems than just syntax. It appears you're trying to use x as an accumulated sum for the calculation. You should be aware that OCaml variables like x are immutable. Once you say let x = 0, the value can't be changed later. x will always be 0. The expression x + h doesn't change the value of x. It just evaluates to a new value.
The usual way to make this work is to pass x as a function parameter.
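A minimal sketch of that idea, assuming the goal is to sum the strictly positive elements; the inner helper go and its parameter acc are names chosen here for illustration, not part of the original question:

```ocaml
(* The running total is passed along as the parameter acc
   instead of being stored in a variable that would need mutation. *)
let sum_positive ls =
  let rec go acc = function
    | [] -> acc
    | h :: t -> go (if h > 0 then acc + h else acc) t
  in
  go 0 ls
```

For example, sum_positive [1; -2; 3; 0] evaluates to 4. Because the recursive call is the last thing go does, this version also runs in constant stack space.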
I was getting an issue that had involved the parameter of , I believe it was because I was trying to add an int value to function of type int list. This is what I ended up with.
let rec sum_positive = function
|[] -> 0
|h::t -> if h > 0 then h + (sum_positive t) else sum_positive t
A lot simpler than I thought it would be.

Reordering match clauses in a recursive function

I have some OCaml courses at school, and for an exercise we must write the function length.
My teacher showed us how Xavier Leroy wrote his function:
let rec length_aux len = function
[] -> len
| a::l -> length_aux (len + 1) l
let length l = length_aux 0 l
When my teacher explained to us why he wrote the length function like that, he said he didn't know why Xavier Leroy didn't write:
let rec length_aux len = function
a::l -> length_aux (len + 1) l
| [] -> len
let length l = length_aux 0 l
... in order to make it faster (since in most cases the list is nonempty).
So if someone knows why the second one is no better than the first one, could you please tell me?
Thank you.
For OCaml, this is the same function. The pattern matching will be compiled to a test on whether the list is empty or not, followed by a jump to one branch or the other.
Similar code in C would be reordering cases in a switch statement.

Efficient summation in OCaml

Please note I am almost a complete newbie in OCaml. In order to learn a bit, and test its performance, I tried to implement a module that approximates Pi using the Leibniz series.
My first attempt led to a stack overflow (the actual error, not this site). Knowing from Haskell that this may come from too many "thunks", or promises to compute something, while recursing over the addends, I looked for some way of keeping just the last result while summing with the next. I found the following tail-recursive implementations of sum and map in the notes of an OCaml course, here and here, and expected the compiler to produce an efficient result.
However, the resulting executable, compiled with ocamlopt, is much slower than a C++ version compiled with clang++. Is this code as efficient as possible? Is there some optimization flag I am missing?
My complete code is:
let (--) i j =
let rec aux n acc =
if n < i then acc else aux (n-1) (n :: acc)
in aux j [];;
let sum_list_tr l =
let rec helper a l = match l with
| [] -> a
| h :: t -> helper (a +. h) t
in helper 0. l
let rec tailmap f l a = match l with
| [] -> a
| h :: t -> tailmap f t (f h :: a);;
let rev l =
let rec helper l a = match l with
| [] -> a
| h :: t -> helper t (h :: a)
in helper l [];;
let efficient_map f l = rev (tailmap f l []);;
let summand n =
let m = float_of_int n
in (-1.) ** m /. (2. *. m +. 1.);;
let pi_approx n =
4. *. sum_list_tr (efficient_map summand (0 -- n));;
let n = int_of_string Sys.argv.(1);;
Printf.printf "%F\n" (pi_approx n);;
Just for reference, here are the measured times on my machine:
❯❯❯ time ocaml/main 10000000
3.14159275359
ocaml/main 10000000 3,33s user 0,30s system 99% cpu 3,625 total
❯❯❯ time cpp/main 10000000
3.14159
cpp/main 10000000 0,17s user 0,00s system 99% cpu 0,174 total
For completeness, let me state that the first helper function, an equivalent to Python's range, comes from this SO thread, and that this is run using OCaml version 4.01.0, installed via MacPorts on a Darwin 13.1.0.
As I noted in a comment, OCaml's floats are boxed, which puts OCaml at a disadvantage compared to Clang.
However, I may be noticing another typical rough edge trying OCaml after Haskell:
if I understand what your program is doing, you are creating a list of stuff, then mapping a function over that list, and finally folding it into a result.
In Haskell, you could more or less expect such a program to be automatically “deforested” at compile-time, so that the resulting generated code was an efficient implementation of the task at hand.
In OCaml, the fact that functions can have side-effects, and in particular functions passed to higher-order functions such as map and fold, means that it would be much harder for the compiler to deforest automatically. The programmer has to do it by hand.
In other words: stop building huge short-lived data structures such as 0 -- n and (efficient_map summand (0 -- n)). When your program decides to tackle a new summand, make it do all it wants to do with that summand in a single pass. You can see this as an exercise in applying the principles in Wadler's article (again, by hand, because for various reasons the compiler will not do it for you despite your program being pure).
Here are some results:
$ ocamlopt v2.ml
$ time ./a.out 1000000
3.14159165359
real 0m0.020s
user 0m0.013s
sys 0m0.003s
$ ocamlopt v1.ml
$ time ./a.out 1000000
3.14159365359
real 0m0.238s
user 0m0.204s
sys 0m0.029s
v1.ml is your version. v2.ml is what you might consider an idiomatic OCaml version:
let rec q_pi_approx p n acc =
if n = p
then acc
else q_pi_approx (succ p) n (acc +. (summand p))
let n = int_of_string Sys.argv.(1);;
Printf.printf "%F\n" (4. *. (q_pi_approx 0 n 0.));;
(reusing summand from your code)
It might be more accurate to sum from the last terms to the first, instead of from the first to the last. This is orthogonal to your question, but you may consider it as an exercise in modifying a function that has been forcefully made tail-recursive. Besides, the (-1.) ** m expression in summand is mapped by the compiler to a call to the pow() function on the host, and that's a bag of hurt you may want to avoid.
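As a rough sketch of both suggestions (not the answerer's code), the version below adds the terms from the last to the first and derives the sign from the parity of the index instead of calling `**`. The names q_pi_approx_rev and pi_estimate are invented for this example:

```ocaml
(* Sum terms n-1 down to 0, so the smallest terms are added first. *)
let q_pi_approx_rev n =
  let rec go k acc =
    if k < 0 then acc
    else
      (* (-1)^k without calling pow: even k gives +1, odd k gives -1 *)
      let sign = if k land 1 = 0 then 1. else -1. in
      go (k - 1) (acc +. sign /. float_of_int (2 * k + 1))
  in
  go (n - 1) 0.

let pi_estimate n = 4. *. q_pi_approx_rev n
```

It is still tail-recursive, since the accumulator carries the partial sum; only the direction of the index changes.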
I've also tried several variants, here are my conclusions:
Using arrays
Using recursion
Using imperative loop
The recursive function is about 30% faster than the array implementation. The imperative loop is approximately as fast as the recursion (maybe even a little slower).
Here're my implementations:
Array:
open Core.Std
let pi_approx n =
let f m = (-1.) ** m /. (2. *. m +. 1.) in
let qpi = Array.init n ~f:Float.of_int |>
Array.map ~f |>
Array.reduce_exn ~f:(+.) in
qpi *. 4.0
Recursion:
let pi_approx n =
let rec loop n acc m =
if m = n
then acc *. 4.0
else
let acc = acc +. (-1.) ** m /. (2. *. m +. 1.) in
loop n acc (m +. 1.0) in
let n = float_of_int n in
loop n 0.0 0.0
This can be further optimized, by moving local function loop outside, so that compiler can inline it.
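A sketch of what moving loop to the top level might look like, assuming nothing else about the surrounding module; the body is the same as in the local version:

```ocaml
(* loop no longer closes over anything: all of its state is in its arguments. *)
let rec loop n acc m =
  if m = n then acc *. 4.0
  else loop n (acc +. ((-1.) ** m /. (2. *. m +. 1.))) (m +. 1.0)

let pi_approx n = loop (float_of_int n) 0.0 0.0
```

Whether this actually helps depends on the compiler version and flags, so it is worth measuring rather than assuming.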
Imperative loop:
let pi_approx n =
let sum = ref 0. in
for m = 0 to n -1 do
let m = float_of_int m in
sum := !sum +. (-1.) ** m /. (2. *. m +. 1.)
done;
4.0 *. !sum
But in the code above, creating a ref to the sum will incur boxing/unboxing on each step, so we can further optimize this code by using the float_ref trick:
type float_ref = { mutable value : float}
let pi_approx n =
let sum = {value = 0.} in
for m = 0 to n - 1 do
let m = float_of_int m in
sum.value <- sum.value +. (-1.) ** m /. (2. *. m +. 1.)
done;
4.0 *. sum.value
Scoreboard
for-loop (with float_ref) : 1.0
non-local recursion : 0.89
local recursion : 0.86
Pascal's version : 0.77
for-loop (with float ref) : 0.62
array : 0.47
original : 0.08
Update
I've updated the answer, as I've found a way to give a 40% speedup (or 33% in comparison with @Pascal's answer).
I would like to add that although floats are boxed in OCaml, float arrays are unboxed. Here is a program that builds a float array corresponding to the Leibniz sequence and uses it to approximate π:
open Array
let q_pi_approx n =
let summand n =
let m = float_of_int n
in (-1.) ** m /. (2. *. m +. 1.) in
let a = Array.init n summand in
Array.fold_left (+.) 0. a
let n = int_of_string Sys.argv.(1);;
Printf.printf "%F\n" (4. *. (q_pi_approx n));;
Obviously, it is still slower than a code that doesn't build any data structure at all. Execution times (the version with array is the last one):
time ./v1 10000000
3.14159275359
real 0m2.479s
user 0m2.380s
sys 0m0.104s
time ./v2 10000000
3.14159255359
real 0m0.402s
user 0m0.400s
sys 0m0.000s
time ./a 10000000
3.14159255359
real 0m0.453s
user 0m0.432s
sys 0m0.020s

Evaluate all possible interpretations in OCaml

I need to evaluate whether two formulas are equivalent or not. Here, I use a simple definition of formula, which is a prefix formula.
For example, And(Atom("b"), True) means b and true, while And(Atom("b"), Or(Atom("c"), Not(Atom("c")))) means (b and (c or not c))
My idea is simple: get all atoms and apply every combination (for my cases, I will have 4 combinations, which are true-true, true-false, false-true, and false-false). The thing is, I don't know how to create these combinations.
For now, I know how to get all the atoms involved, so if there are 5 atoms, I should create 32 combinations. How do I do it in OCaml?
Ok, so what you need is a function combinations n that will produce all the booleans combinations of length n; let's represent them as lists of lists of booleans (i.e. a single assignment of variables will be a list of booleans). Then this function would do the job:
let rec combinations = function
| 0 -> [[]]
| n ->
let rest = combinations (n - 1) in
let comb_f = List.map (fun l -> false::l) rest in
let comb_t = List.map (fun l -> true::l) rest in
comb_t @ comb_f
There is only one empty combination of length 0 and for n > 0 we produce combinations of n-1 and prefix them with false and with true to produce all possible combinations of length n.
You could write a function to print such combinations, let's say:
let rec combinations_to_string = function
| [] -> ""
| x::xs ->
let rec bools_to_str = function
| [] -> ""
| b::bs -> Printf.sprintf "%s%s" (if b then "T" else "F") (bools_to_str bs)
in
Printf.sprintf "[%s]%s" (bools_to_str x) (combinations_to_string xs)
and then test it all with:
let _ =
let n = int_of_string Sys.argv.(1) in
let combs = combinations n in
Printf.eprintf "combinations(%d) = %s\n" n (combinations_to_string combs)
to get:
> ./combinations 3
combinations(3) = [TTT][TTF][TFT][TFF][FTT][FTF][FFT][FFF]
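To tie this back to the original question, here is a hedged sketch of an equivalence check built on that enumeration. The formula type is an assumption modelled on the constructors shown in the question (False is added for completeness), and atoms, eval and equivalent are names invented for this example:

```ocaml
(* Hypothetical formula type, modelled on the question's constructors. *)
type formula =
  | True
  | False
  | Atom of string
  | Not of formula
  | And of formula * formula
  | Or of formula * formula

(* Same enumeration as in the answer: all boolean lists of length n. *)
let rec combinations = function
  | 0 -> [ [] ]
  | n ->
    let rest = combinations (n - 1) in
    List.map (fun l -> true :: l) rest @ List.map (fun l -> false :: l) rest

(* Collect atom names (possibly with duplicates). *)
let rec atoms = function
  | True | False -> []
  | Atom a -> [ a ]
  | Not f -> atoms f
  | And (f, g) | Or (f, g) -> atoms f @ atoms g

(* Evaluate a formula under an association list of atom values. *)
let rec eval env = function
  | True -> true
  | False -> false
  | Atom a -> List.assoc a env
  | Not f -> not (eval env f)
  | And (f, g) -> eval env f && eval env g
  | Or (f, g) -> eval env f || eval env g

(* Two formulas are equivalent if they agree under every assignment. *)
let equivalent f g =
  let names = List.sort_uniq compare (atoms f @ atoms g) in
  combinations (List.length names)
  |> List.for_all (fun values ->
         let env = List.combine names values in
         eval env f = eval env g)
```

With this, the two example formulas from the question, And (Atom "b", True) and And (Atom "b", Or (Atom "c", Not (Atom "c"))), come out as equivalent.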
If you think of a list of booleans as a list of bits of fixed length, there is a very simple solution: Count!
If you want to have all combinations of 4 booleans, count from 0 to 15 (2^4 - 1) -- then interpret each bit as one of the booleans. For simplicity I'll use a for-loop, but you can also do it with a recursion:
let size = 4 in
(* '1 lsl size' computes 2^size *)
for i = 0 to (1 lsl size) - 1 do
(* from: is the least significant bit '1'? *)
let b0 = 1 = ((i / 1) mod 2) in
let b1 = 1 = ((i / 2) mod 2) in
let b2 = 1 = ((i / 4) mod 2) in
(* to: is the most significant bit '1'? *)
let b3 = 1 = ((i / 8) mod 2) in
(* do your thing *)
compute b0 b1 b2 b3
done
Of course you can make the body of the loop more general so that it e.g. creates a list/array of booleans depending on the size given above etc.;
The point is that you can solve this problem by enumerating all values you are searching for. If this is the case, compute all integers up to your problem size. Write a function that generates a value of your original problem from an integer. Put it all together.
This method has the advantage that you do not need to first create all combinations before starting your computation. For large problems this might well save you. For a rather small size=16 you will already need 65536 * sizeof(type) memory -- and this is growing exponentially with the size! The above solution will require only a constant amount of memory of sizeof(type).
And for science's sake: checking whether two propositional formulas are equivalent is co-NP-complete, so no polynomial-time algorithm is known; if you want the exact solution, it will take exponential time in the worst case.
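One possible way to generalize the loop body, sketched under the assumption that a list of booleans is a convenient representation; bools_of_int and iter_assignments are made-up names:

```ocaml
(* Bit k of i becomes the k-th boolean (least significant bit first). *)
let bools_of_int size i = List.init size (fun k -> (i lsr k) land 1 = 1)

(* Call f once per assignment; only one assignment is alive at a time. *)
let iter_assignments size f =
  for i = 0 to (1 lsl size) - 1 do
    f (bools_of_int size i)
  done
```

For instance, bools_of_int 3 5 gives [true; false; true], since 5 is 101 in binary.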

What's the way to determine if an Int is a perfect square in Haskell?

I need a simple function
is_square :: Int -> Bool
which determines if an Int N is a perfect square (is there an integer x such that x*x = N).
Of course I can just write something like
is_square n = sq * sq == n
  where sq = floor $ sqrt $ (fromIntegral n::Double)
but it looks terrible! Maybe there is a common simple way to implement such a predicate?
Think of it this way, if you have a positive int n, then you're basically doing a binary search on the range of numbers from 1 .. n to find the first number n' where n' * n' = n.
I don't know Haskell, but this F# should be easy to convert:
let is_perfect_square n =
    let rec binary_search low high =
        let mid = (high + low) / 2
        let midSquare = mid * mid
        if low > high then false
        elif n = midSquare then true
        else if n < midSquare then binary_search low (mid - 1)
        else binary_search (mid + 1) high
    binary_search 1 n
Guaranteed to be O(log n). Easy to modify for perfect cubes and higher powers.
There is a wonderful library for most number theory related problems in Haskell included in the arithmoi package.
Use the Math.NumberTheory.Powers.Squares library.
Specifically the isSquare' function.
is_square :: Int -> Bool
is_square = isSquare' . fromIntegral
The library is optimized and well vetted by people much more dedicated to efficiency than you or I. While it currently doesn't have this kind of shenanigans going on under the hood, it could in the future as the library evolves and gets more optimized. View the source code to understand how it works!
Don't reinvent the wheel, always use a library when available.
I think the code you provided is the fastest that you are going to get:
is_square n = sq * sq == n
  where sq = floor $ sqrt $ (fromIntegral n::Double)
The complexity of this code is: one sqrt, one double multiplication, one cast (dbl->int), and one comparison. You could try to use other computation methods to replace the sqrt and the multiplication with just integer arithmetic and shifts, but chances are it is not going to be faster than one sqrt and one multiplication.
The only place where it might be worth using another method is if the CPU on which you are running does not support floating point arithmetic. In this case the compiler will probably have to generate sqrt and double multiplication in software, and you could gain an advantage by optimizing for your specific application.
As pointed out by another answer, there is still a limitation with big integers, but unless you are going to run into those numbers, it is probably better to take advantage of the floating point hardware support than to write your own algorithm.
In a comment on another answer to this question, you discussed memoization. Keep in mind that this technique helps when your probe patterns exhibit good density. In this case, that would mean testing the same integers over and over. How likely is your code to repeat the same work and thus benefit from caching answers?
You didn't give us an idea of the distribution of your inputs, so consider a quick benchmark that uses the excellent criterion package:
module Main
where
import Criterion.Main
import System.Random
is_square n = sq * sq == n
  where sq = floor $ sqrt $ (fromIntegral n::Double)
is_square_mem =
  let check n = sq * sq == n
        where sq = floor $ sqrt $ (fromIntegral n :: Double)
  in (map check [0..] !!)
main = do
  g <- newStdGen
  let rs = take 10000 $ randomRs (0,1000::Int) g
      direct = map is_square
      memo = map is_square_mem
  defaultMain [ bench "direct" $ whnf direct rs
              , bench "memo" $ whnf memo rs
              ]
This workload may or may not be a fair representative of what you're doing, but as written, the cache miss rate appears too high.
Wikipedia's article on Integer Square Roots has algorithms that can be adapted to suit your needs. Newton's method is nice because it converges quadratically, i.e., you get twice as many correct digits each step.
I would advise you to stay away from Double if the input might be bigger than 2^53, after which not all integers can be exactly represented as Double.
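For illustration, here is a sketch of an integer-only Newton iteration, written in OCaml like the other examples on this page rather than in Haskell. It assumes native ints on a 64-bit platform; isqrt and is_square are names chosen for this example, not a library API:

```ocaml
(* Integer square root by Newton's method: largest r with r * r <= n. *)
let isqrt n =
  if n < 0 then invalid_arg "isqrt: negative argument"
  else if n < 2 then n
  else
    (* Start from n / 2 + 1, which is >= sqrt n and keeps x + n / x from overflowing. *)
    let rec go x =
      let y = (x + n / x) / 2 in
      if y >= x then x else go y
    in
    go (n / 2 + 1)

(* No floating point involved, so there is no 2^53 precision limit. *)
let is_square n = n >= 0 && (let r = isqrt n in r * r = n)
```

The iterates decrease until they reach the floor of the square root, at which point the next step no longer gets smaller and the loop stops.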
Oh, today I needed to determine if a number is a perfect cube, and a similar solution was VERY slow.
So, I came up with a pretty clever alternative:
cubes = map (\x -> x*x*x) [1..]
is_cube n = n == (head $ dropWhile (<n) cubes)
Very simple. I think I need to use a tree for faster lookups, but for now I'll try this solution; maybe it will be fast enough for my task. If not, I'll edit the answer with a proper data structure.
Sometimes you shouldn't divide problems into too small parts (like checks is_square):
intersectSorted [] _ = []
intersectSorted _ [] = []
intersectSorted xs (y:ys) | head xs > y = intersectSorted xs ys
intersectSorted (x:xs) ys | head ys > x = intersectSorted xs ys
intersectSorted (x:xs) (y:ys) | x == y = x : intersectSorted xs ys
squares = [x*x | x <- [ 1..]]
weird = [2*x+1 | x <- [ 1..]]
perfectSquareWeird = intersectSorted squares weird
There's a very simple way to test for a perfect square - quite literally, you check if the square root of the number has anything other than zero in the fractional part of it.
I'm assuming a square root function that returns a floating point, in which case you can do (Pseudocode):
func IsSquare(N)
sq = sqrt(N)
return (sq modulus 1.0) equals 0.0
It's not particularly pretty or fast, but here's a cast-free, FPA-free version based on Newton's method that works (slowly) for arbitrarily large integers:
import Control.Applicative ((<*>))
import Control.Monad (join)
import Data.Ratio ((%))
isSquare = (==) =<< (^2) . floor . (join g <*> join f) . (%1)
  where
    f n x = (x + n / x) / 2
    g n x y | abs (x - y) > 1 = g n y $ f n y
            | otherwise = y
It could probably be sped up with some additional number theory trickery.
