Learning F# - printing prime numbers - algorithm

Yesterday I started looking at F# during some spare time. I thought I would start with the standard problem of printing out all the prime numbers up to 100. Heres what I came up with...
#light
open System
let mutable divisable = false
let mutable j = 2
for i = 2 to 100 do
j <- 2
while j < i do
if i % j = 0 then divisable <- true
j <- j + 1
if divisable = false then Console.WriteLine(i)
divisable <- false
The thing is I feel like I have approached this from a C/C# perspective and not embraced the true functional language aspect.
I was wondering what other people could come up with - and whether anyone has any tips/pointers/suggestions. I feel good F# content is hard to come by on the web at the moment, and the last functional language I touched was HOPE about 5 years ago in university.

Here is a simple implementation of the Sieve of Eratosthenes in F#:
let rec sieve = function
| (p::xs) -> p :: sieve [ for x in xs do if x % p > 0 then yield x ]
| [] -> []
let primes = sieve [2..50]
printfn "%A" primes // [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47]
This implementation won't work for very large lists but it illustrates the elegance of a functional solution.

Using a Sieve function like Eratosthenes is a good way to go. Functional languages work really well with lists, so I would start with that in mind for struture.
On another note, functional languages work well constructed out of functions (heh). For a functional language "feel" I would build a Sieve function and then call it to print out the primes. You could even split it up--one function builds the list and does all the work and one goes through and does all the printing, neatly separating functionality.
There's a couple of interesting versions here.
And there are well known implementations in other similar languages. Here's one in OCAML that beats one in C.

Here are my two cents:
let rec primes =
seq {
yield 2
yield! (Seq.unfold (fun i -> Some(i, i + 2)) 3)
|> Seq.filter (fun p ->
primes
|> Seq.takeWhile (fun i -> i * i <= p)
|> Seq.forall (fun i -> p % i <> 0))
}
for i in primes do
printf "%d " i
Or maybe this clearer version of the same thing as isprime is defined as a separate function:
let rec isprime x =
primes
|> Seq.takeWhile (fun i -> i*i <= x)
|> Seq.forall (fun i -> x%i <> 0)
and primes =
seq {
yield 2
yield! (Seq.unfold (fun i -> Some(i,i+2)) 3)
|> Seq.filter isprime
}

You definitely do not want to learn from this example, but I wrote an F# implementation of a NewSqueak sieve based on message passing:
type 'a seqMsg =
| Die
| Next of AsyncReplyChannel<'a>
type primes() =
let counter(init) =
MailboxProcessor.Start(fun inbox ->
let rec loop n =
async { let! msg = inbox.Receive()
match msg with
| Die -> return ()
| Next(reply) ->
reply.Reply(n)
return! loop(n + 1) }
loop init)
let filter(c : MailboxProcessor<'a seqMsg>, pred) =
MailboxProcessor.Start(fun inbox ->
let rec loop() =
async {
let! msg = inbox.Receive()
match msg with
| Die ->
c.Post(Die)
return()
| Next(reply) ->
let rec filter' n =
if pred n then async { return n }
else
async {let! m = c.AsyncPostAndReply(Next)
return! filter' m }
let! testItem = c.AsyncPostAndReply(Next)
let! filteredItem = filter' testItem
reply.Reply(filteredItem)
return! loop()
}
loop()
)
let processor = MailboxProcessor.Start(fun inbox ->
let rec loop (oldFilter : MailboxProcessor<int seqMsg>) prime =
async {
let! msg = inbox.Receive()
match msg with
| Die ->
oldFilter.Post(Die)
return()
| Next(reply) ->
reply.Reply(prime)
let newFilter = filter(oldFilter, (fun x -> x % prime <> 0))
let! newPrime = oldFilter.AsyncPostAndReply(Next)
return! loop newFilter newPrime
}
loop (counter(3)) 2)
member this.Next() = processor.PostAndReply( (fun reply -> Next(reply)), timeout = 2000)
interface System.IDisposable with
member this.Dispose() = processor.Post(Die)
static member upto max =
[ use p = new primes()
let lastPrime = ref (p.Next())
while !lastPrime <= max do
yield !lastPrime
lastPrime := p.Next() ]
Does it work?
> let p = new primes();;
val p : primes
> p.Next();;
val it : int = 2
> p.Next();;
val it : int = 3
> p.Next();;
val it : int = 5
> p.Next();;
val it : int = 7
> p.Next();;
val it : int = 11
> p.Next();;
val it : int = 13
> p.Next();;
val it : int = 17
> primes.upto 100;;
val it : int list
= [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71;
73; 79; 83; 89; 97]
Sweet! :)

Simple but inefficient suggestion:
Create a function to test whether a single number is prime
Create a list for numbers from 2 to 100
Filter the list by the function
Compose the result with another function to print out the results
To make this efficient you really want to test for a number being prime by checking whether or not it's divisible by any lower primes, which will require memoisation. Probably best to wait until you've got the simple version working first :)
Let me know if that's not enough of a hint and I'll come up with a full example - thought it may not be until tonight...

Here is my old post at HubFS about using recursive seq's to implement prime number generator.
For case you want fast implementation, there is nice OCaml code by Markus Mottl
P.S. if you want to iterate prime number up to 10^20 you really want to port primegen by D. J. Bernstein to F#/OCaml :)

While solving the same problem, I have implemented Sieve of Atkins in F#. It is one of the most efficient modern algorithms.
// Create sieve
let initSieve topCandidate =
let result = Array.zeroCreate<bool> (topCandidate + 1)
Array.set result 2 true
Array.set result 3 true
Array.set result 5 true
result
// Remove squares of primes
let removeSquares sieve topCandidate =
let squares =
seq { 7 .. topCandidate}
|> Seq.filter (fun n -> Array.get sieve n)
|> Seq.map (fun n -> n * n)
|> Seq.takeWhile (fun n -> n <= topCandidate)
for n2 in squares do
n2
|> Seq.unfold (fun state -> Some(state, state + n2))
|> Seq.takeWhile (fun x -> x <= topCandidate)
|> Seq.iter (fun x -> Array.set sieve x false)
sieve
// Pick the primes and return as an Array
let pickPrimes sieve =
sieve
|> Array.mapi (fun i t -> if t then Some i else None)
|> Array.choose (fun t -> t)
// Flip solutions of the first equation
let doFirst sieve topCandidate =
let set1 = Set.ofList [1; 13; 17; 29; 37; 41; 49; 53]
let mutable x = 1
let mutable y = 1
let mutable go = true
let mutable x2 = 4 * x * x
while go do
let n = x2 + y*y
if n <= topCandidate then
if Set.contains (n % 60) set1 then
Array.get sieve n |> not |> Array.set sieve n
y <- y + 2
else
y <- 1
x <- x + 1
x2 <- 4 * x * x
if topCandidate < x2 + 1 then
go <- false
// Flip solutions of the second equation
let doSecond sieve topCandidate =
let set2 = Set.ofList [7; 19; 31; 43]
let mutable x = 1
let mutable y = 2
let mutable go = true
let mutable x2 = 3 * x * x
while go do
let n = x2 + y*y
if n <= topCandidate then
if Set.contains (n % 60) set2 then
Array.get sieve n |> not |> Array.set sieve n
y <- y + 2
else
y <- 2
x <- x + 2
x2 <- 3 * x * x
if topCandidate < x2 + 4 then
go <- false
// Flip solutions of the third equation
let doThird sieve topCandidate =
let set3 = Set.ofList [11; 23; 47; 59]
let mutable x = 2
let mutable y = x - 1
let mutable go = true
let mutable x2 = 3 * x * x
while go do
let n = x2 - y*y
if n <= topCandidate && 0 < y then
if Set.contains (n % 60) set3 then
Array.get sieve n |> not |> Array.set sieve n
y <- y - 2
else
x <- x + 1
y <- x - 1
x2 <- 3 * x * x
if topCandidate < x2 - y*y then
go <- false
// Sieve of Atkin
let ListAtkin (topCandidate : int) =
let sieve = initSieve topCandidate
[async { doFirst sieve topCandidate }
async { doSecond sieve topCandidate }
async { doThird sieve topCandidate }]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore
removeSquares sieve topCandidate |> pickPrimes
I know some don't recommend to use Parallel Async, but it did increase the speed ~20% on my 2 core (4 with hyperthreading) i5. Which is about the same increase I got using TPL.
I have tried rewriting it in functional way, getting read of loops and mutable variables, but performance degraded 3-4 times, so decided to keep this version.

Related

Boggle - count all possible paths on a N*N grid. Performance

While reading up this question, I wondered why no one would "simply" iterate all the possible paths on the boggle grid and have the word-tries follow and then cancel the path if there is no match in the word-trie. Cannot be that many paths on a tiny 4 by 4 grid, right? How many paths are there? So I set out to code a path-counter function in F#. The results yield what no one stated on that other page: Way more paths on the grid than I would have guessed (more paths than words in the word-set, actually).
While all that is pretty much the back story to my question, the code I ended up with was running rather slow and I found that I could not give good answers to a few aspects of the code. So here, the code first, then below it, you will find points which I think deserve explanations...
let moves n state square =
let allSquares = [0..n*n-1] |> Set.ofList
let right = Set.difference allSquares (Set.ofList [n-1..n..n*n])
let left = Set.difference allSquares (Set.ofList [0..n..n*n-1])
let up = Set.difference allSquares (Set.ofList [0..n-1])
let down = Set.difference allSquares (Set.ofList [n*n-n..n*n-1])
let downRight = Set.intersect right down
let downLeft = Set.intersect left down
let upRight = Set.intersect right up
let upLeft = Set.intersect left up
let appendIfInSet se v res =
if Set.contains square se then res # v else res
[]
|> appendIfInSet right [square + 1]
|> appendIfInSet left [square - 1]
|> appendIfInSet up [square - n]
|> appendIfInSet down [square + n]
|> appendIfInSet downRight [square + n + 1]
|> appendIfInSet downLeft [square + n - 1]
|> appendIfInSet upRight [square - n + 1]
|> appendIfInSet upLeft [square - n - 1]
|> List.choose (fun s -> if ((uint64 1 <<< s) &&& state) = 0UL then Some s else None )
let block state square =
state ||| (uint64 1 <<< square)
let countAllPaths n lmin lmax =
let mov = moves n // line 30
let rec count l state sq c =
let state' = block state sq
let m = mov state' sq
match l with
| x when x <= lmax && x >= lmin ->
List.fold (fun acc s -> count (l+1) state' s acc) (c+1) m
| x when x < lmin ->
List.fold (fun acc s -> count (l+1) state' s acc) (c) m
| _ ->
c
List.fold (fun acc s -> count 0 (block 0UL s) s acc) 0 [0..n*n-1]
[<EntryPoint>]
let main args =
printfn "%d: %A" (Array.length args) args
if 3 = Array.length args then
let n = int args.[0]
let lmin = int args.[1]
let lmax = int args.[2]
printfn "%d %d %d -> %d" n lmin lmax (countAllPaths n lmin lmax)
else
printfn "usage: wordgames.exe n lmin lmax"
0
In line 30, I curried the moves function with the first argument, hoping that maybe code optimization would benefit from it. Maybe optimizing the 9 sets I create in move which are only a function of n. After all, they need not be generated over and over again, right? On the other hand, I would not really bet on it actually happening.
So, question #1 is: How could I enforce this optimization in an as little code bloating way as possible? (I could of course create a type with 9 members and then an array of that type for each possible n and then do a look up table like usage of the pre-computed sets but that would be code bloat in my opinion).
Many sources hint that parallel folds are considered critical. How could I create a parallel version of the counting function (which runs on multiple cores)?
Does anyone have clever ideas how to speed this up? Maybe some pruning or memoization etc?
At first, when I ran the function for n=4 lmin=3 lmax=8 I thought it takes so long because I ran it in fsi. But then I compiled the code with -O and it still took about the same time...
UPDATE
While waiting for input from you guys, I did the code bloated manual optimization version (runs much faster) and then found a way to make it run on multiple cores.
All in all those 2 changes yielded about a speed up by a factor of 30. Here the (bloated) version I came up with (still looking for a way to avoid the bloat):
let squareSet n =
let allSquares = [0..n*n-1] |> Set.ofList
let right = Set.difference allSquares (Set.ofList [n-1..n..n*n])
let left = Set.difference allSquares (Set.ofList [0..n..n*n-1])
let up = Set.difference allSquares (Set.ofList [0..n-1])
let down = Set.difference allSquares (Set.ofList [n*n-n..n*n-1])
let downRight = Set.intersect right down
let downLeft = Set.intersect left down
let upRight = Set.intersect right up
let upLeft = Set.intersect left up
[|right;left;up;down;upRight;upLeft;downRight;downLeft|]
let RIGHT,LEFT,UP,DOWN = 0,1,2,3
let UPRIGHT,UPLEFT,DOWNRIGHT,DOWNLEFT = 4,5,6,7
let squareSets =
[|Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;|]
::
[ for i in 1..8 do
yield squareSet i
]
|> Array.ofList
let moves n state square =
let appendIfInSet se v res =
if Set.contains square se then res # v else res
[]
|> appendIfInSet squareSets.[n].[RIGHT] [square + 1]
|> appendIfInSet squareSets.[n].[LEFT] [square - 1]
|> appendIfInSet squareSets.[n].[UP] [square - n]
|> appendIfInSet squareSets.[n].[DOWN] [square + n]
|> appendIfInSet squareSets.[n].[DOWNRIGHT] [square + n + 1]
|> appendIfInSet squareSets.[n].[DOWNLEFT] [square + n - 1]
|> appendIfInSet squareSets.[n].[UPRIGHT] [square - n + 1]
|> appendIfInSet squareSets.[n].[UPLEFT] [square - n - 1]
|> List.choose (fun s -> if ((uint64 1 <<< s) &&& state) = 0UL then Some s else None )
let block state square =
state ||| (uint64 1 <<< square)
let countAllPaths n lmin lmax =
let mov = moves n
let rec count l state sq c =
let state' = block state sq
let m = mov state' sq
match l with
| x when x <= lmax && x >= lmin ->
List.fold (fun acc s -> count (l+1) state' s acc) (c+1) m
| x when x < lmin ->
List.fold (fun acc s -> count (l+1) state' s acc) (c) m
| _ ->
c
//List.fold (fun acc s -> count 0 (block 0UL s) s acc) 0 [0..n*n-1]
[0..n*n-1]
|> Array.ofList
|> Array.Parallel.map (fun start -> count 0 (block 0UL start) start 0)
|> Array.sum
[<EntryPoint>]
let main args =
printfn "%d: %A" (Array.length args) args
if 3 = Array.length args then
let n = int args.[0]
let lmin = int args.[1]
let lmax = int args.[2]
printfn "%d %d %d -> %d" n lmin lmax (countAllPaths n lmin lmax)
else
printfn "usage: wordgames.exe n lmin lmax"
0
As for the non-optimization of the generation of sets.
The second version posted in the update to the question showed, that this is actually the case (not optimized by compiler) and it yielded a significant speed up. The final version (posted below in this answer) carries that approach even further and speeds up the path counting (and the solving of a boggle puzzle) even further.
Combined with parallel execution on multiple cores, the initially really slow (maybe 30s) version could be sped up to only around 100ms for the n=4 lmin=3 lmax=8 case.
For n=6 class of problems, the parallel and hand tuned implementation solves a puzzle in around 60ms on my machine. It makes sense, that this is faster than the path counting, as the word list probing (used a dictionary with around 80000 words) along with the dynamic programming approach pointed out by #GuyCoder renders the solution of the puzzle a less complex problem than the (brute force) path counting.
Lesson learned
The f# compiler does not seem to be all too mystical and magical if it comes to code optimizations. Hand tuning is worth the effort if performance is really required.
Turning a single threaded recursive search function into a parallel (concurrent) function was not really hard in this case.
The final version of the code
Compiled with:
fsc --optimize+ --tailcalls+ wordgames.fs
(Microsoft (R) F# Compiler version 14.0.23413.0)
let wordListPath = #"E:\temp\12dicts-6.0.2\International\3of6all.txt"
let acceptableWord (s : string) : bool =
let s' = s.Trim()
if s'.Length > 2
then
if System.Char.IsLower(s'.[0]) && System.Char.IsLetter(s'.[0]) then true
else false
else
false
let words =
System.IO.File.ReadAllLines(wordListPath)
|> Array.filter acceptableWord
let squareSet n =
let allSquares = [0..n*n-1] |> Set.ofList
let right = Set.difference allSquares (Set.ofList [n-1..n..n*n])
let left = Set.difference allSquares (Set.ofList [0..n..n*n-1])
let up = Set.difference allSquares (Set.ofList [0..n-1])
let down = Set.difference allSquares (Set.ofList [n*n-n..n*n-1])
let downRight = Set.intersect right down
let downLeft = Set.intersect left down
let upRight = Set.intersect right up
let upLeft = Set.intersect left up
[|right;left;up;down;upRight;upLeft;downRight;downLeft|]
let RIGHT,LEFT,UP,DOWN = 0,1,2,3
let UPRIGHT,UPLEFT,DOWNRIGHT,DOWNLEFT = 4,5,6,7
let squareSets =
[|Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;Set.empty;|]
::
[ for i in 1..8 do
yield squareSet i
]
|> Array.ofList
let movesFromSquare n square =
let appendIfInSet se v res =
if Set.contains square se then v :: res else res
[]
|> appendIfInSet squareSets.[n].[RIGHT] (square + 1)
|> appendIfInSet squareSets.[n].[LEFT] (square - 1)
|> appendIfInSet squareSets.[n].[UP] (square - n)
|> appendIfInSet squareSets.[n].[DOWN] (square + n)
|> appendIfInSet squareSets.[n].[DOWNRIGHT] (square + n + 1)
|> appendIfInSet squareSets.[n].[DOWNLEFT] (square + n - 1)
|> appendIfInSet squareSets.[n].[UPRIGHT] (square - n + 1)
|> appendIfInSet squareSets.[n].[UPLEFT] (square - n - 1)
let lutMovesN n =
Array.init n (fun i -> if i > 0 then Array.init (n*n-1) (fun j -> movesFromSquare i j) else Array.empty)
let lutMoves =
lutMovesN 8
let moves n state square =
let appendIfInSet se v res =
if Set.contains square se then v :: res else res
lutMoves.[n].[square]
|> List.filter (fun s -> ((uint64 1 <<< s) &&& state) = 0UL)
let block state square =
state ||| (uint64 1 <<< square)
let countAllPaths n lmin lmax =
let mov = moves n
let rec count l state sq c =
let state' = block state sq
let m = mov state' sq
match l with
| x when x <= lmax && x >= lmin ->
List.fold (fun acc s -> count (l+1) state' s acc) (c+1) m
| x when x < lmin ->
List.fold (fun acc s -> count (l+1) state' s acc) (c) m
| _ ->
c
//List.fold (fun acc s -> count 0 (block 0UL s) s acc) 0 [0..n*n-1]
[|0..n*n-1|]
|> Array.Parallel.map (fun start -> count 0 (block 0UL start) start 0)
|> Array.sum
//printfn "%d " (words |> Array.distinct |> Array.length)
let usage() =
printfn "usage: wordgames.exe [--gen n count problemPath | --count n lmin lmax | --solve problemPath ]"
let rng = System.Random()
let genProblem n (sb : System.Text.StringBuilder) =
let a = Array.init (n*n) (fun _ -> char (rng.Next(26) + int 'a'))
sb.Append(a) |> ignore
sb.AppendLine()
let genProblems nproblems n (sb : System.Text.StringBuilder) : System.Text.StringBuilder =
for i in 1..nproblems do
genProblem n sb |> ignore
sb
let solve n (board : System.String) =
let ba = board.ToCharArray()
let testWord (w : string) : bool =
let testChar k sq = (ba.[sq] = w.[k])
let rec testSquare state k sq =
match k with
| 0 -> testChar 0 sq
| x ->
if testChar x sq
then
let state' = block state x
moves n state' x
|> List.exists (testSquare state' (x-1))
else
false
[0..n*n-1]
|> List.exists (testSquare 0UL (String.length w - 1))
words
|> Array.splitInto 32
|> Array.Parallel.map (Array.filter testWord)
|> Array.concat
[<EntryPoint>]
let main args =
printfn "%d: %A" (Array.length args) args
let nargs = Array.length args
let sw = System.Diagnostics.Stopwatch()
match nargs with
| x when x >= 2 ->
match args.[0] with
| "--gen" ->
if nargs = 4
then
let n = int args.[1]
let nproblems = int args.[2]
let outpath = args.[3]
let problems = genProblems nproblems n (System.Text.StringBuilder())
System.IO.File.WriteAllText (outpath,problems.ToString())
0
else
usage()
0
| "--count" ->
if nargs = 4
then
let n = int args.[1]
let lmin = int args.[2]
let lmax = int args.[3]
sw.Start()
let count = countAllPaths n lmin lmax
sw.Stop()
printfn "%d %d %d -> %d (took: %d)" n lmin lmax count (sw.ElapsedMilliseconds)
0
else
usage ()
0
| "--solve" ->
if nargs = 2
then
let problems = System.IO.File.ReadAllLines(args.[1])
problems
|> Array.iter
(fun (p : string) ->
let n = int (sqrt (float (String.length p)))
sw.Reset()
sw.Start()
let found = solve n p
sw.Stop()
printfn "%s\n%A\n%dms" p found (sw.ElapsedMilliseconds)
)
0
else
usage ()
0
| _ ->
usage ()
0
| _ ->
usage ()
0

How to improve performance with F# idioms

I'm using this course on Machine-Learning to learn F# at the same time. I've done the following homework exercise which is the first exercise of the second week:
Run a computer simulation for flipping 1,000 virtual fair coins. Flip
each coin independently 10 times. Focus on 3 coins as follows: c1
is the first coin flipped, crand is a coin chosen randomly from
the 1,000, and cmin is the coin which had the minimum frequency of
heads (pick the earlier one in case of a tie).
Let ν1 , νrand
, and νmin be the fraction of heads obtained for the 3 respective
coins out of the 10 tosses. Run the experiment 100,000 times in order
to get a full distribution of ν1 , νrand, and νmin (note that c rand
and c min will change from run to run).
What is the average value of νmin?
I have produced the following code, which works fine and gives the correct answer:
let private rnd = System.Random()
let FlipCoin() = rnd.NextDouble() > 0.5
let FlipCoinNTimes N = List.init N (fun _ -> FlipCoin())
let FlipMCoinsNTimes M N = List.init M (fun _ -> FlipCoinNTimes N)
let ObtainFrequencyOfHeads tosses =
let heads = tosses |> List.filter (fun toss -> toss = true)
float (List.length (heads)) / float (List.length (tosses))
let GetFirstRandMinHeadsFraction allCoinsLaunchs =
let first = ObtainFrequencyOfHeads(List.head (allCoinsLaunchs))
let randomCoin = List.item (rnd.Next(List.length (allCoinsLaunchs))) allCoinsLaunchs
let random = ObtainFrequencyOfHeads(randomCoin)
let min =
allCoinsLaunchs
|> List.map (fun coin -> ObtainFrequencyOfHeads coin)
|> List.min
(first, random, min)
module Exercice1 =
let GetResult() =
Seq.init 100000 (fun _ -> FlipMCoinsNTimes 1000 10)
|> Seq.map (fun oneExperiment -> GetFirstRandMinHeadsFraction oneExperiment)
|> Seq.map (fun (first, random, min) -> min)
|> Seq.average
However, it takes roughly 4 minutes to run in my machine. I know that it is doing a lot of work, but I'm wondering if there are some modifications that could be made to optimize it.
As I'm trying lo learn F#, I'm asking for optimizations that use F# idioms, not to change the code to a C-style.
Feel free to suggest any kind of improvement, in style, good practices, etc.
[UPDATE]
I have written some code to compare the proposed solutions, it is accesible here.
These are the results:
Base - result: 0.037510, time elapsed: 00:00:55.1274883, improvement:
0.99 x
Matthew Mcveigh - result: 0.037497, time elapsed: 00:00:15.1682052, improvement: 3.61 x
Fyodor Soikin - result:0.037524, time elapsed: 00:01:29.7168787, improvement: 0.61 x
GuyCoder - result: 0.037645, time elapsed: 00:00:02.0883482, improvement: 26.25 x
GuyCoder MathNet- result: 0.037666, time elapsed:
00:00:24.7596117, improvement: 2.21 x
TheQuickBrownFox - result:
0.037494, time elapsed: 00:00:34.2831239, improvement: 1.60 x
The winner concerning the improvement in time is the GuyCoder, so I will accept his answer. However, I find that his code is more difficult to understand.
Allocating a large amount of lists up front is heavy work, the algorithm can be processed online e.g. via sequences or recursion. I transformed all the work into tail recursive functions for some raw speed (will be transformed into loops by the compiler)
not guaranteed to be 100% correct, but hopefully gives you a gist of where I was going with it:
let private rnd = System.Random()
let flipCoin () = rnd.NextDouble() > 0.5
let frequencyOfHeads flipsPerCoin =
let rec countHeads numHeads i =
if i < flipsPerCoin then
let isHead = flipCoin ()
countHeads (if isHead then numHeads + 1 else numHeads) (i + 1)
else
float numHeads
countHeads 0 0 / float flipsPerCoin
let getFirstRandMinHeadsFraction numCoins flipsPerCoin =
let randomCoinI = rnd.Next numCoins
let rec run first random min i =
if i < numCoins then
let frequency = frequencyOfHeads flipsPerCoin
let first = if i = 0 then frequency else first
let random = if i = randomCoinI then frequency else random
let min = if min > frequency then frequency else min
run first random min (i + 1)
else
(first, random, min)
run 0.0 0.0 System.Double.MaxValue 0
module Exercice1 =
let getResult () =
let iterations, numCoins, numFlips = 100000, 1000, 10
let getMinFromExperiment () =
let (_, _, min) = getFirstRandMinHeadsFraction numCoins numFlips
min
let rec sumMinFromExperiments i sumOfMin =
if i < iterations then
sumMinFromExperiments (i + 1) (sumOfMin + getMinFromExperiment ())
else
sumOfMin
let sum = sumMinFromExperiments 0 0.0
sum / float iterations
Running your code on my computer and timing I get:
seconds: 68.481918
result: 0.47570994
Running my code on my computer and timing I get:
seconds: 14.003861
vOne: 0.498963
vRnd: 0.499793
vMin: 0.037675
with vMin being closest to the correct answer of b being 0.01
That is almost 5x faster.
I did not tinker with each method and data structure to figure out why and what worked, I just used many decades of experience to guide me. Clearly not storing the intermediate values but just the results is a big improvement. Specifically coinTest just returns the number of heads which is an int and not a list of the results. Also instead of getting a random number for each coin flip but getting a random number for each coin and then using each part of that random number as a coin flip is advantageous. That saves number of flips - 1 calls to a function. Also I avoided using float values until the very end; I don't consider that saving time on the CPU, but it did simplify the thought process of thinking only in int which allowed me to concentrate on other efficiencies. I know that may sound weird but the less I have to think about the better the answers I get. I also only ran coinTest when it was necessary, e.g. only the first coin, only the random coin, and looked for all tails as an exit condition.
namespace Workspace
module main =
[<EntryPoint>]
let main argv =
let rnd = System.Random()
let randomPick (limit : int) : int = rnd.Next(limit) // [0 .. limit) it's a Python habit
let numberOfCoins = 1000
let numberOfFlips = 10
let numberOfExperiements = 100000
let coinTest (numberOfFlips : int) : int =
let rec countHeads (flips : int) bitIndex (headCount : int) : int =
if bitIndex < 0 then headCount
else countHeads (flips >>> 1) (bitIndex-1) (headCount + (flips &&& 0x01))
countHeads (randomPick ((pown 2 numberOfFlips) - 1)) numberOfFlips 0
let runExperiement (numberOfCoins : int) (numberOfFlips : int) : (int * int * int) =
let (randomCoin : int) = randomPick numberOfCoins
let rec testCoin coinIndex (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone) : (int * int * int) =
if (coinIndex < numberOfCoins) then
if (not cFirstDone || not cRanDone || not cMinDone) then
if (cFirstDone && cMinDone && (coinIndex <> randomCoin)) then
testCoin (coinIndex+1) (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone)
else
let headsTotal = coinTest numberOfFlips
let (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone) =
let cFirst = if coinIndex = 0 then headsTotal else cFirst
let cRnd = if coinIndex = randomCoin then headsTotal else cRnd
let cMin = if headsTotal < cMin then headsTotal else cMin
let cRanDone = if (coinIndex >= randomCoin) then true else cRanDone
let cMinDone = if (headsTotal = 0) then true else cMinDone
(cFirst, cRnd, cMin, true, cRanDone, cMinDone)
testCoin (coinIndex+1) (cFirst, cRnd, cMin, cFirstDone, cRanDone, cMinDone)
else
(cFirst, cRnd, cMin)
else
(cFirst, cRnd, cMin)
testCoin 0 (-1,-1,10, false, false, false)
let runExperiements (numberOfExperiements : int) (numberOfCoins : int) ( numberOfFlips : int) =
let rec accumateExperiements index aOne aRnd aMin : (int * int * int) =
let (cOne,cRnd,cMin) = runExperiement numberOfCoins numberOfFlips
if index > numberOfExperiements then (aOne, aRnd, aMin)
else accumateExperiements (index + 1) (aOne + cOne) (aRnd + cRnd) (aMin + cMin)
let (aOne, aRnd, aMin) = accumateExperiements 0 0 0 0
let (vOne : double) = (double)(aOne) / (double)numberOfExperiements / (double)numberOfFlips
let (vRnd : double) = (double)(aRnd) / (double)numberOfExperiements / (double)numberOfFlips
let (vMin : double) = (double)(aMin) / (double)numberOfExperiements / (double)numberOfFlips
(vOne, vRnd, vMin)
let timeIt () =
let stopWatch = System.Diagnostics.Stopwatch.StartNew()
let (vOne, vRnd, vMin) = runExperiements numberOfExperiements numberOfCoins numberOfFlips
stopWatch.Stop()
printfn "seconds: %f" (stopWatch.Elapsed.TotalMilliseconds / 1000.0)
printfn "vOne: %A" vOne
printfn "vRnd: %A" vRnd
printfn "vMin: %A" vMin
timeIt ()
printf "Press any key to exit: "
System.Console.ReadKey() |> ignore
printfn ""
0 // return an integer exit code
========================================================================
This is just an intermediate answer because I inquired if the OP considered using MathNet Numerics idiomatic F# and the OP wanted to see what that looked like. After running his version and this first cut version on my machine the OP version is faster. OP: 75 secs, mine: 84 secs
namespace Workspace
open MathNet.Numerics.LinearAlgebra
module main =
[<EntryPoint>]
let main argv =
let rnd = System.Random()
let flipCoin() =
let head = rnd.NextDouble() > 0.5
if head then 1.0 else 0.0
let numberOfCoins = 1000
let numberOfFlips = 10
let numberOfExperiements = 100000
let numberOfValues = 3
let randomPick (limit : int) : int = rnd.Next(limit) // [0 .. limit) it's a Python habit
let headCount (m : Matrix<float>) (coinIndex : int) : int =
System.Convert.ToInt32((m.Row coinIndex).Sum())
let minHeads (m : Matrix<float>) (numberOfCoins : int) (numberOfFlips : int) : int =
let rec findMinHeads currentCoinIndex minHeadsCount minHeadsIndex =
match currentCoinIndex,minHeadsCount with
| -1,_ -> minHeadsCount
| _,0 -> minHeadsCount // Can't get less than zero so stop searching.
| _ ->
let currentMinHeadCount = (headCount m currentCoinIndex)
let nextIndex = currentCoinIndex - 1
if currentMinHeadCount < minHeadsCount
then findMinHeads nextIndex currentMinHeadCount currentCoinIndex
else findMinHeads nextIndex minHeadsCount minHeadsIndex
findMinHeads (numberOfCoins - 1) numberOfFlips -1
// Return the values for cOne, cRnd, and cMin as int values.
// Will do division on final sum of experiments instead of after each experiment.
let runExperiement (numberOfCoins : int) (numberOfFlips : int) : (int * int * int) =
let (flips : Matrix<float>) = DenseMatrix.init numberOfCoins numberOfFlips (fun i j -> flipCoin())
let cOne = headCount flips 0
let cRnd = headCount flips (randomPick numberOfCoins)
let cMin = minHeads flips numberOfCoins numberOfFlips
(cOne,cRnd,cMin)
let runExperiements (numberOfExperiements : int) (numberOfCoins : int) (numberOfFlips : int) : (int [] * int [] * int []) =
let (cOneArray : int[]) = Array.create numberOfExperiements 0
let (cRndArray : int[]) = Array.create numberOfExperiements 0
let (cMinArray : int[]) = Array.create numberOfExperiements 0
for i = 0 to (numberOfExperiements - 1) do
let (cOne,cRnd,cMin) = runExperiement numberOfCoins numberOfFlips
cOneArray.[i] <- cOne
cRndArray.[i] <- cRnd
cMinArray.[i] <- cMin
(cOneArray, cRndArray, cMinArray)
let (cOneArray, cRndArray, cMinArray) = runExperiements numberOfExperiements numberOfCoins numberOfFlips
let (vOne : double) = (double)(Array.sum cOneArray) / (double)numberOfExperiements / (double)numberOfFlips
let (vRnd : double) = (double)(Array.sum cRndArray) / (double)numberOfExperiements / (double)numberOfFlips
let (vMin : double) = (double)(Array.sum cMinArray) / (double)numberOfExperiements / (double)numberOfFlips
printfn "vOne: %A" vOne
printfn "vRnd: %A" vRnd
printfn "vMin: %A" vMin
Halfway through the coding I realized I could do all of the calculations using just int, it was only the last calculations that generated the percentages that needed to be a float or double and even then that is only because the list of answers is a percentage; in theory the numbers can be compared as int to get the same understanding. If I use only int then I would have to create an int Matrix type and that is more work than I want to do. When I get time I will switch the MathNet Matrix to an F# Array2D or something similar and check that. Note if you tag this with MathNet then the maintainer of MathNet might answer (Christoph Rüegg)
I made an change to this method and it is faster by 5 seconds.
// faster
let minHeads (m : Matrix<float>) (numberOfCoins : int) (numberOfFlips : int) : int =
let (mins : float[]) = m.FoldByRow((fun (x : float) y -> x + y), 0.0)
let (minHead : float) = Array.min mins
System.Convert.ToInt32(minHead)
I tried to find the smallest possible changes to your code to make it faster.
The biggest performance improvement I found was by changing the ObtainFrequencyOfHeads function so that it counts true values in the collection instead of creating an intermediate filtered collection and then counting that. I did this by using fold:
let ObtainFrequencyOfHeads tosses =
let heads = tosses |> List.fold (fun state t -> if t then state + 1 else state) 0
float heads / float (List.length (tosses))
Another improvement came from changing all of the lists into arrays. This was as simple as replacing every instance of List. with Array. (including the new function above).
Some might say this is less functional, because it's using a mutable collection instead of an immutable one. However, we're not mutating any arrays, just using the fact that they are cheap to create, check the length of, and look up by index. We have removed a restriction on mutation but we are still not using mutation. It is certainly idiomatic F# to use arrays for performance if required.
With both of these changes I got almost a 2x performance improvement in FSI.

Generate two different randoms in F#

I have a F# list and I'm taking two elements of that list.
If the list has 10 elements in it :
let rnd = new Random()
let elem1 = list.Item(rnd.Next(0,9))
let elem2 = list.Item(rnd.Next(0,9))
There is a chance elem1 and elem2 are equal.
I have checked some workarounds and most of them work using a do while, but I don't want to implement a function that may never end in F#.
Is there a way to create a restriction in the random function?
First random : 0 <= x <= 9
Second random : 0 <= y <= 9 <> x
A simple solution:
let rnd = new Random()
let ndx1 = rnd.Next(9)
let ndx2 =
let x = rnd.Next(8)
if x < ndx1 then x else x + 1
let elem1, elem2 = list.[ndx1], list.[ndx2]
Another way, using maths and calling the random function once:
let r = Random().Next(9 * 8)
let x = 1 + r + r / 9
let elem1, elem2 = list.[x / 9], list.[x % 9]
which may be generalised to:
let getTwoElements lst =
let c = List.length lst
let x, y = Math.DivRem(Random().Next(c * (c-1)) * (c+1) / c + 1, c)
lst.[x], lst.[y]
A more declarative approach, taking into account your comment about points in the image:
let rnd = System.Random()
/// this will build you a list of 10 pairs of indices where a <> b.
let indices =
Seq.initInfinite (fun _ -> rnd.Next(0,10), rnd.Next(0,10))
|> Seq.filter (fun (a,b) -> a <> b)
|> Seq.take 10
|> List.ofSeq
/// map indices into actual points.
let elems =
let points = list |> Array.ofList
List.map (fun (a, b) -> points.[a], points.[b]) indices
As a side note, do not use random access on lists. They're not made for that and performance of that is poor. Convert them to an array first.
There are lots of way to achieve this. A simple one would be something like this:
open System
open System.Linq
let rnd = new Random()
let elem1 = list.Item(rnd.Next(0,9))
let elem2 = list.Where(fun x->x <> elem1).ElementAt(rnd.Next(0,8))

Optimising F# answer for Euler #4

I have recently begun learning F#. Hoping to use it to perform any mathematically heavy algorithms in C# applications and to broaden my knowledge
I have so far avoided StackOverflow as I didn't want to see the answer to this until I came to one myself.
I want to be able to write very efficient F# code, focused on performance and then maybe in other ways, such as writing in F# concisely (number of lines etc.).
Project Euler Question 4:
A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.
Find the largest palindrome made from the product of two 3-digit numbers.
My Answer:
let IsPalindrome (x:int) = if x.ToString().ToCharArray() = Array.rev(x.ToString().ToCharArray()) then x else 0
let euler4 = [for i in [100..999] do
for j in [i..999] do yield i*j]
|> Seq.filter(fun x -> x = IsPalindrome(x)) |> Seq.max |> printf "Largest product of two 3-digit numbers is %d"
I tried using option and returning Some(x) and None in IsPalindrome but kept getting compiling errors as I was passing in an int and returning int option. I got a NullRefenceException trying to return None.Value.
Instead I return 0 if the number isn't a palindrome, these 0's go into the Sequence, unfortunately.
Maybe I could order the sequence and then get the top value? instead of using Seq.Max? Or filter out results > 1?
Would this be better? Any advice would be much appreciated, even if it's general F# advice.
Efficiency being a primary concern, using string allocation/manipulation to find a numeric palindrome seems misguided – here's my approach:
module NumericLiteralG =
let inline FromZero () = LanguagePrimitives.GenericZero
let inline FromOne () = LanguagePrimitives.GenericOne
module Euler =
let inline isNumPalindrome number =
let ten = 1G + 1G + 1G + 1G + 1G + 1G + 1G + 1G + 1G + 1G
let hundred = ten * ten
let rec findHighDiv div =
let div' = div * ten
if number / div' = 0G then div else findHighDiv div'
let rec impl n div =
div = 0G || n / div = n % ten && impl (n % div / ten) (div / hundred)
findHighDiv 1G |> impl number
let problem004 () =
{ 100 .. 999 }
|> Seq.collect (fun n -> Seq.init (1000 - n) ((+) n >> (*) n))
|> Seq.filter isNumPalindrome
|> Seq.max
Here's one way to do it:
/// handy extension for reversing a string
type System.String with
member s.Reverse() = String(Array.rev (s.ToCharArray()))
let isPalindrome x = let s = string x in s = s.Reverse()
seq {
for i in 100..999 do
for j in i..999 -> i * j
}
|> Seq.filter isPalindrome
|> Seq.max
|> printfn "The answer is: %d"
let IsPalindrom (str:string)=
let rec fn(a,b)=a>b||str.[a]=str.[b]&&fn(a+1,b-1)
fn(0,str.Length-1)
let IsIntPalindrome = (string>>IsPalindrom)
let sq={100..999}
sq|>Seq.map (fun x->sq|>Seq.map (fun y->(x,y),x*y))
|>Seq.concat|>Seq.filter (snd>>IsIntPalindrome)|>Seq.maxBy (snd)
just my solution:
let isPalin x =
x.ToString() = new string(Array.rev (x.ToString().ToCharArray()))
let isGood num seq1 = Seq.exists (fun elem -> (num % elem = 0 && (num / elem) < 999)) seq1
{998001 .. -1 .. 10000} |> Seq.filter(fun x -> isPalin x) |> Seq.filter(fun x -> isGood x {999 .. -1 .. 100}) |> Seq.nth 0
simplest way is to go from 999 to 100, because is much likley to be product of two large numbers.
j can then start from i because other way around was already tested
other optimisations would go in directions where multiplactions would go descending order, but that makes everything little more difficult. In general it is expressed as list mergeing.
Haskell (my best try in functional programming)
merge f x [] = x
merge f [] y = y
merge f (x:xs) (y:ys)
| f x y = x : merge f xs (y:ys)
| otherwise = y : merge f (x:xs) ys
compare_tuples (a,b) (c,d) = a*b >= c*d
gen_mul n = (n,n) : merge compare_tuples
( gen_mul (n-1) )
( map (\x -> (n,x)) [n-1,n-2 .. 1] )
is_product_palindrome (a,b) = x == reverse x where x = show (a*b)
main = print $ take 10 $ map ( \(a,b)->(a,b,a*b) )
$ filter is_product_palindrome $ gen_mul 9999
output (less than 1s)- first 10 palindromes =>
[(9999,9901,99000099),
(9967,9867,98344389),
(9999,9811,98100189),
(9999,9721,97200279),
(9999,9631,96300369),
(9999,9541,95400459),
(9999,9451,94500549),
(9767,9647,94222249),
(9867,9547,94200249),
(9999,9361,93600639)]
One can see that this sequence is lazy generated from large to small
Optimized version:
let Euler dgt=
let [mine;maxe]=[dgt-1;dgt]|>List.map (fun x->String.replicate x "9"|>int)
let IsPalindrom (str:string)=
let rec fn(a,b)=a>b||str.[a]=str.[b]&&fn(a+1,b-1)
fn(0,str.Length-1)
let IsIntPalindrome = (string>>IsPalindrom)
let rec fn=function
|x,y,max,a,_ when a=mine->x,y,max
|x,y,max,a,b when b=mine->fn(x,y,max,a-1,maxe)
|x,y,max,a,b->a*b|>function
|m when b=maxe&&m<max->x,y,max
|m when m>max&&IsIntPalindrome(m)->fn(a,b,m,a-1,maxe)
|m when m>max->fn(x,y,max,a,b-1)
|_->fn(x,y,max,a-1,maxe)
fn(0,0,0,maxe,maxe)
Log (switch #time on):
> Euler 2;;
Real: 00:00:00.004, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
val it : int * int * int = (99, 91, 9009)
> Euler 3;;
Real: 00:00:00.004, CPU: 00:00:00.015, GC gen0: 0, gen1: 0, gen2: 0
val it : int * int * int = (993, 913, 906609)
> Euler 4;;
Real: 00:00:00.002, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
val it : int * int * int = (9999, 9901, 99000099)
> Euler 5;;
Real: 00:00:00.702, CPU: 00:00:00.686, GC gen0: 108, gen1: 1, gen2: 0
val it : int * int * int = (99793, 99041, 1293663921) //int32 overflow
Extern to BigInteger:
let Euler dgt=
let [mine;maxe]=[dgt-1;dgt]|>List.map (fun x->new System.Numerics.BigInteger(String.replicate x "9"|>int))
let IsPalindrom (str:string)=
let rec fn(a,b)=a>b||str.[a]=str.[b]&&fn(a+1,b-1)
fn(0,str.Length-1)
let IsIntPalindrome = (string>>IsPalindrom)
let rec fn=function
|x,y,max,a,_ when a=mine->x,y,max
|x,y,max,a,b when b=mine->fn(x,y,max,a-1I,maxe)
|x,y,max,a,b->a*b|>function
|m when b=maxe&&m<max->x,y,max
|m when m>max&&IsIntPalindrome(m)->fn(a,b,m,a-1I,maxe)
|m when m>max->fn(x,y,max,a,b-1I)
|_->fn(x,y,max,a-1I,maxe)
fn(0I,0I,0I,maxe,maxe)
Check:
Euler 5;;
Real: 00:00:02.658, CPU: 00:00:02.605, GC gen0: 592, gen1: 1, gen2: 0
val it :
System.Numerics.BigInteger * System.Numerics.BigInteger *
System.Numerics.BigInteger =
(99979 {...}, 99681 {...}, 9966006699 {...})

Most idiomatic way to write batchesOf size seq in F#

I'm trying to learn F# by rewriting some C# algorithms I have into idiomatic F#.
One of the first functions I'm trying to rewrite is a batchesOf where:
[1..17] |> batchesOf 5
Which would split the sequence into batches with a max of five in each, i.e:
[[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
My first attempt at doing this is kind of ugly where I've resorted to using a mutable ref object after running into errors trying to use mutable type inside the closure. Using ref is particularly unpleasant since to dereference it you have to use the ! operator which when inside a condition expression can be counter intuitive to some devs who will read it as logical not. Another problem I ran into is where Seq.skip and Seq.take are not like their Linq aliases in that they will throw an error if size exceeds the size of the sequence.
let batchesOf size (sequence: _ seq) : _ list seq =
seq {
let s = ref sequence
while not (!s |> Seq.isEmpty) do
yield !s |> Seq.truncate size |> List.ofSeq
s := System.Linq.Enumerable.Skip(!s, size)
}
Anyway what would be the most elegant/idiomatic way to rewrite this in F#? Keeping the original behaviour but preferably without the ref mutable variable.
Implementing this function using the seq<_> type idiomatically is difficult - the type is inherently mutable, so there is no simple nice functional way. Your version is quite inefficient, because it uses Skip repeatedly on the sequence. A better imperative option would be to use GetEnumerator and just iterate over elements using IEnumerator. You can find various imperative options in this snippet: http://fssnip.net/1o
If you're learning F#, then it is better to try writing the function using F# list type. This way, you can use idiomatic functional style. Then you can write batchesOf using pattern matching with recursion and accumulator argument like this:
let batchesOf size input =
// Inner function that does the actual work.
// 'input' is the remaining part of the list, 'num' is the number of elements
// in a current batch, which is stored in 'batch'. Finally, 'acc' is a list of
// batches (in a reverse order)
let rec loop input num batch acc =
match input with
| [] ->
// We've reached the end - add current batch to the list of all
// batches if it is not empty and return batch (in the right order)
if batch <> [] then (List.rev batch)::acc else acc
|> List.rev
| x::xs when num = size - 1 ->
// We've reached the end of the batch - add the last element
// and add batch to the list of batches.
loop xs 0 [] ((List.rev (x::batch))::acc)
| x::xs ->
// Take one element from the input and add it to the current batch
loop xs (num + 1) (x::batch) acc
loop input 0 [] []
As a footnote, the imperative version can be made a bit nicer using computation expression for working with IEnumerator, but that's not standard and it is quite advanced trick (for example, see http://fssnip.net/37).
A friend asked me this a while back. Here's a recycled answer. This works and is pure:
let batchesOf n =
Seq.mapi (fun i v -> i / n, v) >>
Seq.groupBy fst >>
Seq.map snd >>
Seq.map (Seq.map snd)
Or an impure version:
let batchesOf n =
let i = ref -1
Seq.groupBy (fun _ -> i := !i + 1; !i / n) >> Seq.map snd
These produce a seq<seq<'a>>. If you really must have an 'a list list as in your sample then just add ... |> Seq.map (List.ofSeq) |> List.ofSeq as in:
> [1..17] |> batchesOf 5 |> Seq.map (List.ofSeq) |> List.ofSeq;;
val it : int list list = [[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
Hope that helps!
This can be done without recursion if you want
[0..20]
|> Seq.mapi (fun i elem -> (i/size),elem)
|> Seq.groupBy (fun (a,_) -> a)
|> Seq.map (fun (_,se) -> se |> Seq.map (snd));;
val it : seq<seq<int>> =
seq
[seq [0; 1; 2; 3; ...]; seq [5; 6; 7; 8; ...]; seq [10; 11; 12; 13; ...];
seq [15; 16; 17; 18; ...]; ...]
Depending on how you think this may be easier to understand. Tomas' solution is probably more idiomatic F# though
Hurray, we can use List.chunkBySize, Seq.chunkBySize and Array.chunkBySize in F# 4, as mentioned by Brad Collins and Scott Wlaschin.
This isn't perhaps idiomatic but it works:
let batchesOf n l =
let _, _, temp', res' = List.fold (fun (i, n, temp, res) hd ->
if i < n then
(i + 1, n, hd :: temp, res)
else
(1, i, [hd], (List.rev temp) :: res))
(0, n, [], []) l
(List.rev temp') :: res' |> List.rev
Here's a simple implementation for sequences:
let chunks size (items:seq<_>) =
use e = items.GetEnumerator()
let rec loop i acc =
seq {
if i = size then
yield (List.rev acc)
yield! loop 0 []
elif e.MoveNext() then
yield! loop (i+1) (e.Current::acc)
else
yield (List.rev acc)
}
if size = 0 then invalidArg "size" "must be greater than zero"
if Seq.isEmpty items then Seq.empty else loop 0 []
let s = Seq.init 10 id
chunks 3 s
//output: seq [[0; 1; 2]; [3; 4; 5]; [6; 7; 8]; [9]]
My method involves converting the list to an array and recursively chunking the array:
let batchesOf (sz:int) lt =
let arr = List.toArray lt
let rec bite curr =
if (curr + sz - 1 ) >= arr.Length then
[Array.toList arr.[ curr .. (arr.Length - 1)]]
else
let curr1 = curr + sz
(Array.toList (arr.[curr .. (curr + sz - 1)])) :: (bite curr1)
bite 0
batchesOf 5 [1 .. 17]
[[1; 2; 3; 4; 5]; [6; 7; 8; 9; 10]; [11; 12; 13; 14; 15]; [16; 17]]
I found this to be a quite terse solution:
let partition n (stream:seq<_>) = seq {
let enum = stream.GetEnumerator()
let rec collect n partition =
if n = 1 || not (enum.MoveNext()) then
partition
else
collect (n-1) (partition # [enum.Current])
while enum.MoveNext() do
yield collect n [enum.Current]
}
It works on a sequence and produces a sequence. The output sequence consists of lists of n elements from the input sequence.
You can solve your task with analog of Clojure partition library function below:
let partition n step coll =
let rec split ss =
seq {
yield(ss |> Seq.truncate n)
if Seq.length(ss |> Seq.truncate (step+1)) > step then
yield! split <| (ss |> Seq.skip step)
}
split coll
Being used as partition 5 5 it will provide you with sought batchesOf 5 functionality:
[1..17] |> partition 5 5;;
val it : seq<seq<int>> =
seq
[seq [1; 2; 3; 4; ...]; seq [6; 7; 8; 9; ...]; seq [11; 12; 13; 14; ...];
seq [16; 17]]
As a premium by playing with n and step you can use it for slicing overlapping batches aka sliding windows, and even apply to infinite sequences, like below:
Seq.initInfinite(fun x -> x) |> partition 4 1;;
val it : seq<seq<int>> =
seq
[seq [0; 1; 2; 3]; seq [1; 2; 3; 4]; seq [2; 3; 4; 5]; seq [3; 4; 5; 6];
...]
Consider it as a prototype only as it does many redundant evaluations on the source sequence and not likely fit for production purposes.
This version passes all my tests I could think of including ones for lazy evaluation and single sequence evaluation:
let batchIn batchLength sequence =
let padding = seq { for i in 1 .. batchLength -> None }
let wrapped = sequence |> Seq.map Some
Seq.concat [wrapped; padding]
|> Seq.windowed batchLength
|> Seq.mapi (fun i el -> (i, el))
|> Seq.filter (fun t -> fst t % batchLength = 0)
|> Seq.map snd
|> Seq.map (Seq.choose id)
|> Seq.filter (fun el -> not (Seq.isEmpty el))
I am still quite new to F# so if I'm missing anything - please do correct me, it will be greatly appreciated.

Resources