Why doesn't F# Interactive flag this error?

I have some code with a few hundred lines. Many small pieces of it have the following structure:
let soa =
    election
    |> Series.observations
printfn "%A" <| soa
Frequently two things happen:
1) Mysteriously the last line is changed to:
printfn "%A" <|
so that the code above and what follows becomes
let soa =
    election
    |> Series.observations
printfn "%A" <|
let sls =
    election
    |> Series.sample (seq ["Party A"; "Party R"])
printfn "%A" <| sls
This happens hundreds of lines above where I am editing the file in the editor.
2) When this happens F# Interactive does not flag the error. No error messages are generated. However, if I try to access sls I get the message:
error FS0039: The value or constructor 'sls' is not defined.
Any ideas on why a bit of code is erased in the editor? (This happens quite frequently)
And why doesn't F# Interactive issue an error message?

The second let block is interpreted as the argument of the preceding printfn, because the pipe, being an operator, gets an exception to the offside rule: the second argument of an operator does not have to be indented farther than the first argument. And since the second let block is not at top level, but is instead part of printfn's argument, its definitions are not accessible outside it.
Let's try some experimentation:
let f x = x+1
// Normal application
f 5
// Complex expression as argument
f (5+6)
// Let-expression as argument
f (let x = 5 in x + 6)
// Replacing the `in` with a newline
f ( let x = 5
    x + 6 )
// Replacing parentheses with pipe
f <|
    let x = 5
    x + 6
// Operators (of which the pipe is one) have an exception to the offside rule.
// This is done to support flows like this:
[1;2;3] |>
List.map ((+) 1) |>
List.toArray
// Applying this exception to the `f` + `let` expression:
f <|
let x = 5
x + 6
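The scoping consequence can be checked in isolation. This is a minimal sketch that reuses f from the experiments above in its parenthesised form, so that layout plays no role; the name hidden is made up for illustration.

```fsharp
let f x = x + 1

// The let-expression is the argument of f, so `hidden` exists only inside the parentheses.
let result = f (let hidden = 5 in hidden + 6)

printfn "%d" result // prints 12

// Uncommenting the next line fails with error FS0039, just like `sls` in the question:
// printfn "%d" hidden
```

The same holds when the parentheses are replaced by a dangling `<|`: whatever the following let block defines stays local to the argument expression.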

Related

F# int list versus unit list

open System

let rec quick (cast: int list) mmm =
    match mmm with
    | [] -> []
    | first::rest ->
        let small = (rest |> List.filter (fun x -> x < first))
        let large = (rest |> List.filter (fun x -> x >= first))
        quick small |> ignore
        quick large |> ignore
        //[small # [first] # large]
        List.concat [small; [first]; large]

[<EntryPoint>]
let main argv =
    printfn "%A" (quick [3;5;6;7;8;7;5;4;3;4;5;6]);;
    0
Trying to implement a simple quicksort function in F#.
I'm relatively new to the language, but by all accounts, from what I've read and from my understanding of the syntax, this should produce an integer list. Instead it produces the ambiguous "unit list".
Why does this give a unit list and not an int list?
It errors out at "%A" saying the types do not match.
As given in the OP, quick is a function that takes two parameters: cast and mmm. The type of the function is int list -> int list -> int list.
The function call quick [3;5;6;7;8;7;5;4;3;4;5;6], however, only supplies one argument. Since F# functions are curried, the return value is a new function:
> quick [3;5;6;7;8;7;5;4;3;4;5;6];;
val it : (int list -> int list) = <fun:it#3-4>
This function (in my F# Interactive window called it#3-4) has the type int list -> int list - that is: It's a function that 'still waits' for an int list argument before it runs.
When you print it with the %A format specifier, it prints <fun:it#4-5> to the console. The return value of printfn is () (unit):
> printfn "%A" (quick [3;5;6;7;8;7;5;4;3;4;5;6]);;
<fun:it#4-5>
val it : unit = ()
You probably only want the function to take a single list parameter. Additionally, the steps you ignore are having no effect, so you might consider another way to recursively call quick.
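One way to follow both suggestions is sketched below. It assumes the goal is a plain recursive quicksort over int list: the unused cast parameter is dropped, and the results of the recursive calls are used rather than ignored. The parameter name xs is chosen here for illustration.

```fsharp
// Single-parameter quicksort: the recursive results are concatenated, not discarded.
let rec quick (xs: int list) : int list =
    match xs with
    | [] -> []
    | first :: rest ->
        let small = rest |> List.filter (fun x -> x < first)
        let large = rest |> List.filter (fun x -> x >= first)
        quick small @ [first] @ quick large

let sorted = quick [3;5;6;7;8;7;5;4;3;4;5;6]
printfn "%A" sorted // prints [3; 3; 4; 4; 5; 5; 5; 6; 6; 7; 7; 8]
```

With a single parameter, the call is fully applied and printfn receives an int list rather than a function.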

how to do this without "match ... with"

For a better understanding, I am trying to rewrite this code without "match ... with", but I am struggling:
let rec blast list =
  match list with
  | x :: y :: [] -> x
  | hd :: tl -> blast tl
  | _ -> failwith "not enough";;
Any ideas? Thanks!
Sure, we could "manually" try to match each pattern.
The first applies when there are exactly 2 elements, the second when there is at least 1 element (but not exactly 2), and the third in all other cases (0 elements).
The 1-element instance of the second case can be folded into the last case, since with 1 element the recursive call just fails.
So now we have 3 cases: exactly 2, more than 2, and fewer than 2.
Perfect for List.compare_length_with: 'a list -> int -> int:
let rec beforelast list =
  let cmp = List.compare_length_with list 2 in
  if cmp = 0 then (* Exactly 2 elements *)
    List.hd list
  else if cmp > 0 then (* More than 2 elements *)
    beforelast (List.tl list)
  else (* 1 or 0 elements *)
    failwith "not enough"
Though note that you are still pattern matching under the hood, because that's what OCaml data types are made for. For example, List.hd might be implemented like:
let hd = function
  | head :: _ -> head
  | [] -> raise (Failure "hd")
So the match ... with way should be the way that leads to a better understanding.

F# List optimisation

From an unordered list of int, I want to find the smallest difference between two elements. I have code that works but is way too slow. Can anyone suggest changes to improve the performance? Please explain why you made each change and what the performance gain will be.
let allInt = [ 5; 8; 9 ]
let sortedList = allInt |> List.sort
let N = sortedList.Length
let differenceList = [ for a in 0 .. N-2 do yield sortedList.Item(a + 1) - sortedList.Item(a) ]
printfn "%i" (List.min differenceList) // prints 1 (because 9-8 is the smallest difference)
I think I'm doing too much list creation or iteration, but I don't know how to write it differently in F#... yet.
Edit: I'm testing this code on lists with 100 000 items or more.
Edit 2: I believe that if I could calculate the differences and find the min in one go it would improve the performance a lot, but I don't know how to do that. Any ideas?
Thanks in advance
List.Item runs in O(n) time and is probably the main performance bottleneck in your code. The evaluation of differenceList iterates the elements of sortedList by index, which means the performance is around O((N-2)(2(N-2))), which simplifies to O(N^2), where N is the number of elements in sortedList. For long lists, this will eventually perform badly.
What I would do is eliminate the calls to Item and use the List.pairwise operation instead:
let data =
    [ let rnd = System.Random()
      for i in 1..100000 do yield rnd.Next() ]

#time

let result =
    data
    |> List.sort
    |> List.pairwise // convert list from [a;b;c;...] to [(a,b); (b,c); ...]
    |> List.map (fun (a,b) -> a - b |> abs) // Calculates the absolute difference
    |> List.min

#time
The #time directives let me measure execution time in F# Interactive, and the output I get when running this code is:
--> Timing now on
Real: 00:00:00.029, CPU: 00:00:00.031, GC gen0: 1, gen1: 1, gen2: 0
val result : int = 0
--> Timing now off
F#'s built-in list type is implemented as a linked list, which means accessing an element by index has to walk the list all the way to that index each time. In your case you have two index accesses repeated N-1 times, getting slower with each iteration, because as the index grows each access has to walk a longer part of the list.
First way out of this would be using an array instead of a list, which is a trivial change, but grants you faster index access.
(*
[| and |] let you define an array literal,
alternatively use List.toArray allInt
*)
let allInt = [| 5; 8; 9 |]
let sortedArray = allInt |> Array.sort
let N = sortedArray.Length
let differenceList = [ for a in 0 .. N-2 do yield sortedArray.[a + 1] - sortedArray.[a] ]
Another approach might be pairing up the neighbours in the list, subtracting them and then finding a min.
let differenceList =
    sortedList
    |> List.pairwise
    |> List.map (fun (x,y) -> y - x) // y >= x because the list is sorted
List.pairwise takes a list of elements and returns a list of the neighbouring pairs. E.g. in your example List.pairwise [ 5; 8; 9 ] = [ (5, 8); (8, 9) ], so that you can easily work with the pairs in the next step, the subtraction mapping.
This way is better, but these functions from the List module take a list as input and produce a new list as output, so the data is traversed 3 times (once for pairwise, once for map, once for min at the end). To avoid this, you can use functions from the Seq module, which work with .NET's IEnumerable<'a> interface and allow lazy evaluation, usually resulting in fewer passes.
Fortunately in this case Seq defines alternatives for all the functions we use here, so the next step is trivial:
let differenceSeq =
    sortedList
    |> Seq.pairwise
    |> Seq.map (fun (x,y) -> y - x) // y >= x because the list is sorted
let minDiff = Seq.min differenceSeq
This should need only one enumeration of the list (excluding the sorting phase of course).
But I cannot guarantee which approach will be fastest. My bet would be on simply using an array instead of the list, but to find out you will have to try it and measure for yourself, on your data and your hardware. The BenchmarkDotNet library can help you with that.
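For a quick first impression before reaching for a benchmarking library, a rough comparison can be sketched with System.Diagnostics.Stopwatch. The helper time and the functions viaList, viaSeq and viaArray are names made up for this sketch, and a single Stopwatch run is only indicative, not a proper benchmark.

```fsharp
open System.Diagnostics

// Fixed seed so that repeated runs see the same data.
let rnd = System.Random(42)
let data = List.init 100000 (fun _ -> rnd.Next())

// Runs f once, prints its result and the elapsed wall-clock time, and returns the result.
let time label (f: unit -> int) =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    printfn "%s: result %d in %d ms" label result sw.ElapsedMilliseconds
    result

// List functions: each step builds a new intermediate list.
let viaList () =
    data |> List.sort |> List.pairwise |> List.map (fun (x, y) -> y - x) |> List.min

// Seq functions after the sort: pairs and differences are produced lazily.
let viaSeq () =
    data |> List.sort |> Seq.pairwise |> Seq.map (fun (x, y) -> y - x) |> Seq.min

// Array with O(1) indexing and a running minimum.
let viaArray () =
    let sorted = data |> List.toArray |> Array.sort
    let mutable best = System.Int32.MaxValue
    for i in 0 .. sorted.Length - 2 do
        best <- min best (sorted.[i + 1] - sorted.[i])
    best

let fromList = time "List" viaList
let fromSeq = time "Seq" viaSeq
let fromArray = time "Array" viaArray
```

All three variants should report the same minimum difference; only the timings differ, and those depend on the machine and the data.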
The rest of your question is adequately covered by the other answers, so I won't duplicate them. But nobody has yet addressed the question you asked in your Edit 2. To answer that question, if you're doing a calculation and then want the minimum result of that calculation, you want List.minBy. One clue that you want List.minBy is when you find yourself doing a map followed by a min operation (as both the other answers are doing): that's a classic sign that you want minBy, which does that in one operation instead of two.
There's one gotcha to watch out for when using List.minBy: It returns the original value, not the result of the calculation. I.e., if you do ints |> List.pairwise |> List.minBy (fun (a,b) -> abs (a - b)), then what List.minBy is going to return is a pair of items, not the difference. It's written that way because if it gives you the original value but you really wanted the result, you can always recalculate the result; but if it gave you the result and you really wanted the original value, you might not be able to get it. (Was that difference of 1 the difference between 8 and 9, or between 4 and 5?)
So in your case, you could do:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> List.sort // sort first, so that neighbouring pairs are the closest values
    |> List.pairwise
    |> List.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
The List.minBy operation also exists on sequences, so if your list is large enough that you want to avoid creating an intermediate list of pairs, then use Seq.pairwise and Seq.minBy instead:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> Seq.sort // sort first, so that neighbouring pairs are the closest values
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
EDIT: Yes, I see that you've got a list of 100,000 items. So you definitely want the Seq version of this. The F# seq type is just IEnumerable, so if you're used to C#, think of the Seq functions as LINQ expressions and you'll have the right idea.
P.S. One thing to note here: see how I'm doing let a, b = minPair? That's called destructuring assignment, and it's really useful. I could also have done this:
let a, b =
    allInt
    |> Seq.sort
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
and it would have given me the same result. Seq.minBy returns a tuple of two integers, and the let a, b = (tuple of two integers) expression takes that tuple, matches it against the pattern a, b, and thus assigns a to have the value of that tuple's first item, and b to have the value of that tuple's second item. Notice how I used the phrase "matches it against the pattern": this is the exact same thing as when you use a match expression. Explaining match expressions would make this answer too long, so I'll just point you to an excellent reference on them if you haven't already read it:
https://fsharpforfunandprofit.com/posts/match-expression/
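To make the connection concrete, here is a small sketch that takes the same tuple apart once with a let binding and once with a match expression. The values are picked only for illustration.

```fsharp
let pair = (8, 9)

// Destructuring in a let binding: the pattern `a, b` is matched against the tuple.
let a, b = pair

// The same pattern written as a match expression.
let difference =
    match pair with
    | (x, y) -> abs (x - y)

printfn "%d %d %d" a b difference // prints 8 9 1
```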
Here is my solution:
let minPair xs =
    let foo (x, y) = abs (x - y)
    xs
    |> List.allPairs xs
    |> List.filter (fun (x, y) -> x <> y)
    |> List.minBy foo
    |> foo

Why does putting the first line of the expression on the same line as let not compile?

In F#, the following statement fails with the errors shown below it:
let listx2 = [1..10]
    |> List.map(fun x -> x * 2)
    |> List.iter (fun x -> printf "%d " x)
Block following this 'let' is unfinished. Expect an expression.
Unexpected infix operator in binding. Expected incomplete structured construct at or before this point or other token.
However the following will compile
let listx2 =
    [1..10]
    |> List.map(fun x -> x * 2)
    |> List.iter (fun x -> printf "%d " x)
I also noticed that this compiles but has a warning
let listx2 = [1..10] |>
    List.map(fun x -> x * 2) |>
    List.iter (fun x -> printf "%d " x)
Possible incorrect indentation: this token is offside of context started at position (10:18). Try indenting this token further or using standard formatting conventions.
What is the difference between the first two statements?
When you have
let listx2 = [1..10]
you are implicitly setting the indent level of the following lines to the column of the [ character, as given by this rule from the spec for where an offside line is set:
Immediately after an = token is encountered in a Let or Member context.
So in the first example the |> is indented less than the [, and you get an error; in the second they are in the same column, so it all works.
I am not quite sure why moving the |> to the end of the line only gives a warning though.
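The rule can be tried out with two layouts that both satisfy it. This is a sketch with made-up binding names; in the second binding the continuation line is aligned exactly under the `[` that follows the `=`.

```fsharp
// Expression starts on its own line: `[` sets the offside line and `|>` sits on it.
let doubled =
    [1..10]
    |> List.map (fun x -> x * 2)

// Expression starts right after `=`: the continuation line is aligned under `[`.
let doubledAligned = [1..10]
                     |> List.map (fun x -> x * 2)

printfn "%A" (doubled = doubledAligned) // prints true
```

The first layout is the more common convention, because it does not have to be re-aligned when the binding is renamed.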

Why is using a sequence so much slower than using a list in this example

Background:
I have a sequence of contiguous, time-stamped data.
The data-sequence has holes in it, some large, others just a single missing value.
Whenever the hole is just a single missing value, I want to patch the holes using a dummy-value (larger holes will be ignored).
I would like to use lazy generation of the patched sequence, and I am thus using Seq.unfold.
I have made two versions of the method to patch the holes in the data.
The first consumes the sequence of data with holes in it and produces the patched sequence. This is what I want, but the method runs horribly slowly when the number of elements in the input sequence rises above 1000, and it gets progressively worse the more elements the input sequence contains.
The second method consumes a list of the data with holes and produces the patched sequence and it runs fast. This is however not what I want, since this forces the instantiation of the entire input-list in memory.
I would like to use the (sequence -> sequence) method rather than the (list -> sequence) method, to avoid having the entire input-list in memory at the same time.
Questions:
1) Why is the first method so slow (getting progressively worse with larger input lists)?
(I am suspecting that it has to do with repeatedly creating new sequences with Seq.skip 1, but I am not sure)
2) How can I make the patching of holes in the data fast, while using an input sequence rather than an input list?
The code:
open System

// Method 1 (Slow)
let insertDummyValuesWhereASingleValueIsMissing1 (timeBetweenContiguousValues : TimeSpan) (values : seq<(DateTime * float)>) =
    let sizeOfHolesToPatch = timeBetweenContiguousValues.Add timeBetweenContiguousValues // Only insert dummy-values when the gap is twice the normal
    (None, values) |> Seq.unfold (fun (prevValue, restOfValues) ->
        if restOfValues |> Seq.isEmpty then
            None // Reached the end of the input seq
        else
            let currentValue = Seq.hd restOfValues
            if prevValue.IsNone then
                Some(currentValue, (Some(currentValue), Seq.skip 1 restOfValues )) // Only happens to the first item in the seq
            else
                let currentTime = fst currentValue
                let prevTime = fst prevValue.Value
                let timeDiffBetweenPrevAndCurrentValue = currentTime.Subtract(prevTime)
                if timeDiffBetweenPrevAndCurrentValue = sizeOfHolesToPatch then
                    let dummyValue = (prevTime.Add timeBetweenContiguousValues, 42.0) // 42 is chosen here for obvious reasons, making this comment superfluous
                    Some(dummyValue, (Some(dummyValue), restOfValues))
                else
                    Some(currentValue, (Some(currentValue), Seq.skip 1 restOfValues))) // Either the two values were contiguous, or the gap between them was too large to patch

// Method 2 (Fast)
let insertDummyValuesWhereASingleValueIsMissing2 (timeBetweenContiguousValues : TimeSpan) (values : (DateTime * float) list) =
    let sizeOfHolesToPatch = timeBetweenContiguousValues.Add timeBetweenContiguousValues // Only insert dummy-values when the gap is twice the normal
    (None, values) |> Seq.unfold (fun (prevValue, restOfValues) ->
        match restOfValues with
        | [] -> None // Reached the end of the input list
        | currentValue::restOfValues ->
            if prevValue.IsNone then
                Some(currentValue, (Some(currentValue), restOfValues )) // Only happens to the first item in the list
            else
                let currentTime = fst currentValue
                let prevTime = fst prevValue.Value
                let timeDiffBetweenPrevAndCurrentValue = currentTime.Subtract(prevTime)
                if timeDiffBetweenPrevAndCurrentValue = sizeOfHolesToPatch then
                    let dummyValue = (prevTime.Add timeBetweenContiguousValues, 42.0)
                    Some(dummyValue, (Some(dummyValue), currentValue::restOfValues))
                else
                    Some(currentValue, (Some(currentValue), restOfValues))) // Either the two values were contiguous, or the gap between them was too large to patch

// Test data
let numbers = {1.0..10000.0}
let contiguousTimeStamps = seq { for n in numbers -> DateTime.Now.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))

// The fast sequence-patching (method 2)
dataWithOccationalHoles |> List.of_seq |> insertDummyValuesWhereASingleValueIsMissing2 timeBetweenContiguousValues |> Seq.iter (fun pair -> printfn "%f %s" (snd pair) ((fst pair).ToString()))

// The SLOOOOOOW sequence-patching (method 1)
dataWithOccationalHoles |> insertDummyValuesWhereASingleValueIsMissing1 timeBetweenContiguousValues |> Seq.iter (fun pair -> printfn "%f %s" (snd pair) ((fst pair).ToString()))
Any time you break apart a seq using Seq.hd (called Seq.head in current versions of FSharp.Core) and Seq.skip 1, you are almost surely falling into the trap of going O(N^2). IEnumerable<T> is an awful type for recursive algorithms (including e.g. Seq.unfold), since these algorithms almost always have the structure of 'first element' and 'remainder of elements', and there is no efficient way to create a new IEnumerable that represents the 'remainder of elements'. (IEnumerator<T> is workable, but its API programming model is not so fun/easy to work with.)
If you need the original data to 'stay lazy', then you should use a LazyList (in the F# PowerPack). If you don't need the laziness, then you should use a concrete data type like 'list', which you can 'tail' into in O(1).
(You should also check out Avoiding stack overflow (with F# infinite sequences of sequences) as an FYI, though it's only tangentially applicable to this problem.)
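The quadratic behaviour can be made visible by counting how often the underlying source is asked for an element. This is a sketch only: countedSource, pulls and sumByHeadAndSkip are names made up for it, and it uses Seq.head, the current name of Seq.hd. It walks a sequence the same way Method 1 does.

```fsharp
// Counts how many elements the underlying source is asked to produce.
let pulls = ref 0

let countedSource n =
    seq {
        for i in 1 .. n do
            pulls.Value <- pulls.Value + 1
            yield i
    }

// Sums a sequence the way Method 1 walks it: take the head, then Seq.skip 1, repeatedly.
// Every Seq.skip 1 wraps the previous sequence, so each later access starts again
// from the first element of the original source.
let rec sumByHeadAndSkip (s: seq<int>) acc =
    if Seq.isEmpty s then acc
    else sumByHeadAndSkip (Seq.skip 1 s) (acc + Seq.head s)

let n = 100
let total = sumByHeadAndSkip (countedSource n) 0
printfn "sum = %d, elements pulled from the source = %d" total pulls.Value
```

For 100 elements the source is asked for far more than 100 values, and the count grows roughly with the square of the input length, which matches the slowdown described in the question.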
Seq.skip constructs a new sequence. I think that is why your original approach is slow.
My first inclination is to use a sequence expression and Seq.pairwise. This is fast and easy to read.
let insertDummyValuesWhereASingleValueIsMissingSeq (timeBetweenContiguousValues : TimeSpan) (values : seq<(DateTime * float)>) =
    let sizeOfHolesToPatch = timeBetweenContiguousValues.Add timeBetweenContiguousValues // Only insert dummy-values when the gap is twice the normal
    seq {
        yield Seq.hd values
        for ((prevTime, _), ((currentTime, _) as next)) in Seq.pairwise values do
            let timeDiffBetweenPrevAndCurrentValue = currentTime.Subtract(prevTime)
            if timeDiffBetweenPrevAndCurrentValue = sizeOfHolesToPatch then
                let dummyValue = (prevTime.Add timeBetweenContiguousValues, 42.0) // 42 is chosen here for obvious reasons, making this comment superfluous
                yield dummyValue
            yield next
    }
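The pairwise approach can be checked on a small, deterministic input. The sketch below is a compact variant written for illustration, not the answer's exact code: patchSingleGaps is a hypothetical name, Seq.head replaces the older Seq.hd, an emptiness guard is added, and the timestamps are built from a fixed start time so that the gaps are exact multiples of the step.

```fsharp
open System

// Hypothetical compact variant of the pairwise approach, for illustration only.
let patchSingleGaps (step: TimeSpan) (values: seq<DateTime * float>) =
    let gapToPatch = step + step // a gap of exactly two steps means one value is missing
    seq {
        if not (Seq.isEmpty values) then
            yield Seq.head values
            for ((prevTime, _), ((currentTime, _) as next)) in Seq.pairwise values do
                if currentTime - prevTime = gapToPatch then
                    yield (prevTime + step, 42.0) // dummy value, as in the question
                yield next
    }

let start = DateTime(2020, 1, 1)
let step = TimeSpan.FromMinutes 1.0

// Minutes 0, 1, 3, 4, 7: minute 2 is a single missing value, minutes 5-6 are a larger hole.
let input =
    [ 0.0; 1.0; 3.0; 4.0; 7.0 ]
    |> List.map (fun m -> (start.AddMinutes m, m))

let patched = patchSingleGaps step input |> Seq.toList
patched |> List.iter (fun (t, v) -> printfn "%s %f" (t.ToString("HH:mm")) v)
```

Only the single missing minute receives a dummy value; the two-minute hole is left alone. Note that values is enumerated more than once here (for the emptiness check, the head and the pairs), which is cheap for an in-memory list but may matter for an expensive source.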
