To understand it better, I am trying to rewrite this code without "match ... with", but I am struggling:
let rec blast list =
  match list with
  | x :: y :: [] -> x
  | hd :: tl -> blast tl
  | _ -> failwith "not enough";;
Any ideas? Thanks!
Sure, we could "manually" test for each pattern.
The first applies when there are exactly 2 elements, the second when there is at least 1 element (but not exactly 2), and the third in all other cases (0 elements).
The 1-element part of the second case can be folded into the last case (with 1 element, the recursive call receives the empty list and just fails).
So now we have 3 cases: exactly 2, more than 2, and fewer than 2.
Perfect for List.compare_length_with: 'a list -> int -> int:
let rec beforelast list =
  let cmp = List.compare_length_with list 2 in
  if cmp = 0 then (* Exactly 2 elements *)
    List.hd list
  else if cmp > 0 then (* More than 2 elements *)
    beforelast (List.tl list)
  else (* 1 or 0 elements *)
    failwith "not enough"
Though note that you are still pattern matching under the hood, because that's what OCaml data types are made for. For example, List.hd might be implemented like:
let hd = function
  | head :: _ -> head
  | [] -> raise (Failure "hd")
So the match ... with way should be the way that leads to a better understanding.
I stumbled upon a strange time complexity behaviour when using Seq.unfold. Here's the minimal case I could come up with to reproduce this.
let idUnfolder sequence =
    sequence
    |> Seq.tryHead
    |> Option.map (fun head -> (head, Seq.tail sequence))
let seqIdWithUnfold sequence =
    Seq.unfold idUnfolder sequence
The function seqIdWithUnfold returns the given sequence itself. I would expect the resulting sequence to be iterated in linear time as Seq.unfold is O(n), Seq.tryHead and Seq.tail are O(1) (correct me if I'm wrong). However, against all my knowledge, it has a cubic complexity.
I tested the execution time with the following function with a set of n values.
let test n =
    let start = System.DateTime.Now
    Seq.init n id
    |> seqIdWithUnfold
    |> Seq.iter ignore
    let duration = System.DateTime.Now - start
    printfn "%f" duration.TotalSeconds
What makes this operation cubic in complexity?
Operations on a seq are almost always O(n).
A seq aka IEnumerable<T> is essentially:
type Enumerator<'a> = {
    getNext : unit -> 'a option
}
type Seq<'a> = {
    getEnumerator: unit -> Enumerator<'a>
}
Every time you evaluate a sequence, a new Enumerator is created which captures the state of enumeration. getNext is then repeatedly called till the sequence terminates.
You can see this for yourself if you replace a seq at any point with:
source |> Seq.map(fun x -> printfn "Eval %A" x; x)
Let's show the calls to getEnumerator as well:
let sq =
    seq {
        let mutable ctr = 0
        do printfn "Init _"
        while true do
            ctr <- ctr + 1
            printfn "Yield %d" ctr
            yield ctr
    }
seqIdWithUnfold (sq |> Seq.take 3) |> Seq.toList |> printfn "%A"
And there's the output:
Init _
Yield 1
Init _
Yield 1
Yield 2
Init _
Yield 1
Yield 2
Yield 3
Init _
Yield 1
Yield 2
Yield 3
[1; 2; 3]
This shows you the classic n(n+1)/2 pattern.
You can see that the complexity will have n + n^2 terms in it.
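To check the growth rate empirically rather than by reading the trace, one option is to count how many elements the source produces in total. This is only a sketch: `countedSeq` and `productions` are hypothetical helpers introduced here, and `idUnfolder` is copied from the question.

```fsharp
// Hypothetical helper: a sequence of 1..n that bumps `counter`
// every time an element is produced.
let countedSeq (counter: int ref) n =
    seq {
        for i in 1 .. n do
            counter.Value <- counter.Value + 1
            yield i
    }

// Same unfolder as in the question.
let idUnfolder sequence =
    sequence
    |> Seq.tryHead
    |> Option.map (fun head -> (head, Seq.tail sequence))

// Hypothetical helper: total number of elements produced by the source
// while the unfolded sequence is iterated once.
let productions n =
    let counter = ref 0
    countedSeq counter n
    |> Seq.unfold idUnfolder
    |> Seq.iter ignore
    counter.Value

for n in [10; 20; 40] do
    printfn "n = %d -> %d elements produced" n (productions n)
```

If the count roughly quadruples each time n doubles, the cost is quadratic, which is what the n(n+1)/2 pattern predicts.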
If you can use a list, you'll get O(n) instead of O(n^2).
If you really want O(1) access to each element, use arrays.
As Philip Carter mentioned in the comments, Seq.tail is O(n).
I'm not sure why an F# List would be unacceptable: F# Lists are singly linked lists, so you get a constant-time tail. If you need to accept any sequence in your signature, just convert it to a List before unfolding; the whole operation is still O(n). The example you presented can't be lazy anyway, unless you really didn't need the tail.
let seqIdWithUnfold sequence =
    sequence
    |> Seq.toList
    |> List.unfold idUnfolder
    |> List.toSeq
Your processing example still works with the List module:
let idUnfolder listSeq =
    listSeq
    |> List.tryHead
    |> Option.map (fun head -> (head, List.tail listSeq))
But I think it would look a bit cleaner as:
let idUnfolder =
    function
    | [] -> None
    | h::t -> Some(h, t)
Benchmarks
| Method | Mean | Error | StdDev |
|---------------- |------------:|----------:|----------:|
| Original | 4,683.88 us | 36.462 us | 34.106 us |
| ListConversion | 15.63 us | 0.202 us | 0.179 us |
// * Hints *
Outliers
Benchmarkem.ListConversion: Default -> 1 outlier was removed (16.53 us)
// * Legends *
Mean : Arithmetic mean of all measurements
Error : Half of 99.9% confidence interval
StdDev : Standard deviation of all measurements
1 us : 1 Microsecond (0.000001 sec)
From an unordered list of int, I want to find the smallest difference between two elements. I have code that works but is way too slow. Can anyone suggest some changes to improve the performance? Please explain why you made the change and what the performance gain will be.
let allInt = [ 5; 8; 9 ]
let sortedList = allInt |> List.sort
let N = sortedList.Length
let differenceList = [ for a in 0 .. N-2 do yield sortedList.Item(a + 1) - sortedList.Item(a) ]
printfn "%i" (List.min differenceList) // prints 1 (because 9-8 is the smallest difference)
I think I'm doing too much list creation or iteration, but I don't know how to write it differently in F#... yet.
Edit: I'm testing this code on lists with 100 000 items or more.
Edit 2: I believe that if I could calculate the differences and find the min in one go it would improve the performance a lot, but I don't know how to do that. Any ideas?
Thanks in advance
List.Item runs in O(n) time and is probably the main performance bottleneck in your code. The evaluation of differenceList accesses the elements of sortedList by index, which means the performance is around O((N-2)(2(N-2))), which simplifies to O(N^2), where N is the number of elements in sortedList. For long lists, this will eventually perform badly.
What I would do is eliminate the calls to Item and use the List.pairwise operation instead:
let data =
    [ let rnd = System.Random()
      for i in 1..100000 do yield rnd.Next() ]
#time
let result =
    data
    |> List.sort
    |> List.pairwise // convert list from [a;b;c;...] to [(a,b); (b,c); ...]
    |> List.map (fun (a,b) -> a - b |> abs) // Calculates the absolute difference
    |> List.min
#time
The #time directive lets me measure execution time in F# Interactive, and the output I get when running this code is:
--> Timing now on
Real: 00:00:00.029, CPU: 00:00:00.031, GC gen0: 1, gen1: 1, gen2: 0
val result : int = 0
--> Timing now off
F#'s built-in list type is implemented as a linked list, which means accessing an element by index has to walk the list all the way to that index each time. In your case you have two index accesses repeated N-1 times, getting slower and slower with each iteration, because as the index grows each access has to walk a longer part of the list.
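To make the cost of indexing concrete, here is a minimal sketch of what indexed access on a linked list has to do. The function `nth` is a hypothetical stand-in written for illustration; it is not the actual FSharp.Core implementation of `Item`.

```fsharp
// Hypothetical re-implementation of indexed access on an F# list, for illustration only.
// Every call starts at the head and walks `index` links before it can return.
let rec nth index list =
    match list with
    | [] -> invalidArg "index" "The index is outside the list."
    | head :: tail ->
        if index = 0 then head
        else nth (index - 1) tail

printfn "%d" (nth 2 [5; 8; 9])
```

Calling a function like this once per index in a loop over the whole list is what adds up to quadratic time.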
First way out of this would be using an array instead of a list, which is a trivial change, but grants you faster index access.
(*
    [| and |] let you define an array literal,
    alternatively use List.toArray allInt
*)
let allInt = [| 5; 8; 9 |]
let sortedArray = allInt |> Array.sort
let N = sortedArray.Length
let differenceList = [ for a in 0 .. N-2 do yield sortedArray.[a + 1] - sortedArray.[a] ]
Another approach might be pairing up the neighbours in the list, subtracting them and then finding a min.
let differenceList =
    sortedList
    |> List.pairwise
    |> List.map (fun (x,y) -> y - x) // y >= x because the list is sorted
List.pairwise takes a list of elements and returns a list of the neighbouring pairs. E.g. in your example List.pairwise [ 5; 8; 9 ] = [ (5, 8); (8, 9) ], so that you can easily work with the pairs in the next step, the subtraction mapping.
This way is better, but these functions from the List module take a list as input and produce a new list as output, so the data is traversed 3 times (once for pairwise, once for map, and once for min at the end). To avoid this, you can use functions from the Seq module, which work with .NET's IEnumerable<'a> interface and allow lazy evaluation, usually resulting in fewer passes.
Fortunately in this case Seq defines alternatives for all the functions we use here, so the next step is trivial:
let differenceSeq =
    sortedList
    |> Seq.pairwise
    |> Seq.map (fun (x,y) -> y - x) // y >= x because the list is sorted
let minDiff = Seq.min differenceSeq
This should need only one enumeration of the list (excluding the sorting phase of course).
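As a variation on the same idea, the difference and the running minimum can also be combined in a single fold. This is a sketch only: `minAdjacentDifference` is a hypothetical name, and returning `Int32.MaxValue` for inputs with fewer than two elements is an arbitrary choice made for the example.

```fsharp
// Hypothetical helper; returns Int32.MaxValue for inputs with fewer than two elements.
let minAdjacentDifference (values: int list) =
    values
    |> List.sort
    |> Seq.pairwise // lazily pairs up neighbours of the sorted list
    |> Seq.fold (fun best (x, y) -> min best (y - x)) System.Int32.MaxValue

printfn "%d" (minAdjacentDifference [9; 5; 8])
```

The fold keeps only the best difference seen so far, so no intermediate list of pairs or differences is built after the sort.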
But I cannot guarantee which approach will be fastest. My bet would be on simply using an array instead of the list, but to find out, you will have to try it out and measure for yourself, on your data and your hardware. The BenchmarkDotNet library can help you with that.
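For a quick first impression before setting up a proper benchmark, a stopwatch is often enough. The sketch below assumes a hypothetical helper `timeIt` and a fixed random seed, both introduced here for illustration; a single run like this is far less reliable than a real benchmarking library.

```fsharp
open System.Diagnostics

// Hypothetical helper: runs `work` once and reports the elapsed wall-clock time.
let timeIt label (work: unit -> int) =
    let stopwatch = Stopwatch.StartNew()
    let result = work ()
    stopwatch.Stop()
    printfn "%s: %d ms" label stopwatch.ElapsedMilliseconds
    result

// Test data; values are capped so that differences cannot overflow.
let random = System.Random(42)
let data = List.init 100000 (fun _ -> random.Next(1000000000))

let viaList =
    timeIt "List.pairwise" (fun () ->
        data |> List.sort |> List.pairwise |> List.map (fun (x, y) -> y - x) |> List.min)

let viaArray =
    timeIt "Array indexing" (fun () ->
        let sorted = data |> List.toArray |> Array.sort
        seq { for i in 0 .. sorted.Length - 2 -> sorted.[i + 1] - sorted.[i] } |> Seq.min)
```

Both variants should return the same minimum, so comparing the two results is a cheap sanity check on top of the timings.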
The rest of your question is adequately covered by the other answers, so I won't duplicate them. But nobody has yet addressed the question you asked in your Edit 2. To answer that question, if you're doing a calculation and then want the minimum result of that calculation, you want List.minBy. One clue that you want List.minBy is when you find yourself doing a map followed by a min operation (as both the other answers are doing): that's a classic sign that you want minBy, which does that in one operation instead of two.
There's one gotcha to watch out for when using List.minBy: It returns the original value, not the result of the calculation. I.e., if you do ints |> List.pairwise |> List.minBy (fun (a,b) -> abs (a - b)), then what List.minBy is going to return is a pair of items, not the difference. It's written that way because if it gives you the original value but you really wanted the result, you can always recalculate the result; but if it gave you the result and you really wanted the original value, you might not be able to get it. (Was that difference of 1 the difference between 8 and 9, or between 4 and 5?)
So in your case, you could do:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> List.sort // neighbours are only the closest values once the list is sorted
    |> List.pairwise
    |> List.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
The List.minBy operation also exists on sequences, so if your list is large enough that you want to avoid creating an intermediate list of pairs, then use Seq.pairwise and Seq.minBy instead:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> Seq.sort // neighbours are only the closest values once the input is sorted
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
EDIT: Yes, I see that you've got a list of 100,000 items. So you definitely want the Seq version of this. The F# seq type is just IEnumerable, so if you're used to C#, think of the Seq functions as LINQ expressions and you'll have the right idea.
P.S. One thing to note here: see how I'm doing let a, b = minPair? That's called destructuring assignment, and it's really useful. I could also have done this:
let a, b =
    allInt
    |> Seq.sort
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
and it would have given me the same result. Seq.minBy returns a tuple of two integers, and the let a, b = (tuple of two integers) expression takes that tuple, matches it against the pattern a, b, and thus assigns a to have the value of that tuple's first item, and b to have the value of that tuple's second item. Notice how I used the phrase "matches it against the pattern": this is the exact same thing as when you use a match expression. Explaining match expressions would make this answer too long, so I'll just point you to an excellent reference on them if you haven't already read it:
https://fsharpforfunandprofit.com/posts/match-expression/
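To illustrate the point about destructuring being pattern matching, here is a small sketch. The names `pair`, `a`, `b` and `difference` are chosen for this example only.

```fsharp
let pair = (8, 9)

// A `let` with a tuple pattern on the left-hand side...
let a, b = pair

// ...binds values the same way as the tuple pattern in a `match` expression.
let difference =
    match pair with
    | (x, y) -> abs (x - y)

printfn "%d %d %d" a b difference
```

In both cases the tuple is taken apart by the pattern, and each name is bound to the corresponding component.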
Here is my solution:
let minPair xs =
    let foo (x, y) = abs (x - y)
    xs
    |> List.allPairs xs
    |> List.filter (fun (x, y) -> x <> y)
    |> List.minBy foo
    |> foo
I have some code with a few hundred lines. Many small pieces of it have the following structure:
let soa =
    election
    |> Series.observations
printfn "%A" <| soa
Frequently two things happen:
1) Mysteriously the last line is changed to:
printfn "%A" <|
so that the code above and what follows becomes
let soa =
    election
    |> Series.observations
printfn "%A" <|
let sls =
    election
    |> Series.sample (seq ["Party A"; "Party R"])
printfn "%A" <| sls
This happens hundreds of lines above where I am editing the file in the editor.
2) When this happens F# Interactive does not flag the error. No error messages are generated. However, if I try to access sls I get the message:
error FS0039: The value or constructor 'sls' is not defined.
Any ideas on why a bit of code is erased in the editor? (This happens quite frequently)
And why doesn't F# Interactive issue an error message?
The second let block is interpreted as the argument of the preceding printfn, because the pipe, being an operator, provides an exception to the offside rule: the second argument of an operator does not have to be indented farther than the first argument. And since the second let block is not at top level, but rather is part of the printfn's argument, its definitions are not accessible outside of it.
Let's try some experimentation:
let f x = x+1
// Normal application
f 5
// Complex expression as argument
f (5+6)
// Let-expression as argument
f (let x = 5 in x + 6)
// Replacing the `in` with a newline
f ( let x = 5
    x + 6 )
// Replacing parentheses with pipe
f <|
  let x = 5
  x + 6
// Operators (of which the pipe is one) have an exception to the offside rule.
// This is done to support flows like this:
[1;2;3] |>
List.map ((+) 1) |>
List.toArray
// Applying this exception to the `f` + `let` expression:
f <|
let x = 5
x + 6
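The scoping consequence can be seen in a small sketch. The names `f` and `result` are used here for illustration; the point is that the `x` bound inside the argument is not visible after it.

```fsharp
let f x = x + 1

// The indented `let` block is the argument of `f`; its `x` is local to that argument.
let result =
    f <|
        let x = 5
        x + 6

printfn "%d" result
```

Referring to `x` after this point would fail in the same way as `sls` does in the question, because the binding only exists inside the argument expression.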
In F#, the following statement fails to compile with the errors shown below it:
let listx2 = [1..10]
    |> List.map(fun x -> x * 2)
    |> List.iter (fun x -> printf "%d " x)
Block following this 'let' is unfinished. Expect an expression.
Unexpected infix operator in binding. Expected incomplete structured construct at or before this point or other token.
However, the following will compile:
let listx2 =
    [1..10]
    |> List.map(fun x -> x * 2)
    |> List.iter (fun x -> printf "%d " x)
I also noticed that this compiles, but with a warning:
let listx2 = [1..10] |>
    List.map(fun x -> x * 2) |>
    List.iter (fun x -> printf "%d " x)
Possible incorrect indentation: this token is offside of context started at position (10:18). Try indenting this token further or using standard formatting conventions.
What is the difference between the first two statements?
When you have
let listx2 = [1..10]
you are implicitly setting the indent level of the following lines to the column of the [. This follows from one of the situations in which, according to the spec, a new offside line is set:
Immediately after an = token is encountered in a Let or Member context.
So in the first example, the |> is indented less than the [, so you get an error, but in the second they are at the same column, so it all works.
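Both accepted layouts can be put side by side. This is a sketch with hypothetical names `ownLine` and `afterEquals`; it uses `List.map` alone so that the bindings hold a value that can be compared.

```fsharp
// Pipeline starts on its own line: each `|>` lines up with the first token.
let ownLine =
    [1 .. 10]
    |> List.map (fun x -> x * 2)

// Pipeline starts right after `=`: continuation lines are indented to the column of `[`.
let afterEquals = [1 .. 10]
                  |> List.map (fun x -> x * 2)

printfn "%b" (ownLine = afterEquals)
```

The first layout is usually easier to maintain, because renaming the binding does not shift the column that the continuation lines have to match.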
I am not quite sure why moving the |> to the end of the line only gives a warning though.