Working with missing values in Deedle Time Series in F# (2) - filter

This question is related to
Working with missing values in Deedle Time Series in F# (1)
Suppose I have a Series<'K,'T opt> with some missing values.
For example, I have obtained a series:
series4;;
 val it : Series<int,int opt> =
 1 -> 1
 2 -> 2
 3 -> 3
 4 -> <missing>
I could have got it this way:
let series1 = Series.ofObservations [(1,1);(2,2);(3,3)]
let series2 = Series.ofObservations [(1,2);(2,2);(3,1);(4,4)]
let series3 = series1.Zip(series2,JoinKind.Outer);;
let series4 = series3 |> Series.mapValues fst
However, in Deedle if you do
let series1_plus_2 = series1+series2
val series1_plus_2 : Series<int,int> =
1 -> 3
2 -> 4
3 -> 4
4 -> <missing>
You can see that the type Series<int,int> also naturally allows for missing values, and this seems to be the natural way to use the Deedle functions that handle missing values.
So my question is: given series4 of type Series<int,int opt>, how do I get back a series with the "same" values but of type Series<int,int>?
Notably, strange things happen.
For example, Series.dropMissing does not behave as expected when applied to series4:
Series.dropMissing series4;;
 val it : Series<int,int opt> =
 1 -> 1
 2 -> 2
 3 -> 3
 4 -> <missing>
It is NOT dropping the missing value!

The main problem here is that int opt is not Deedle's standard way of handling missing values. As far as the series is concerned, the value at key 4 in series4 is not missing: it is present, and it holds the value OptionalValue.Missing. You can convert a Series<int, int opt> to a Series<int, int>, for example, this way:
let series4' = series4 |> Series.mapAll (fun _ v -> v |> Option.bind OptionalValue.asOption)
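As a plain-F# analogue of that conversion (a sketch only: nested `option` values stand in for Deedle's `opt` type, and the names `nested`, `flatten`, `flat` and `present` are hypothetical), the idea is that an outer "does the series hold anything at this key" layer and an inner "is the stored value a missing marker" layer are collapsed into one:

```fsharp
// Some (Some v): a real value; Some None: a stored "missing" marker.
let nested : (int * int option option) list =
    [ 1, Some (Some 1); 2, Some (Some 2); 3, Some (Some 3); 4, Some None ]

// Option.bind id collapses the two layers into a single option.
let flatten (v: 'a option option) : 'a option = Option.bind id v

let flat = nested |> List.map (fun (k, v) -> k, flatten v)

// Once flattened, dropping missing entries behaves as expected.
let present = flat |> List.choose (fun (k, v) -> Option.map (fun x -> k, x) v)
```

After flattening, key 4 carries a plain `None`, which is why an operation in the spirit of dropMissing can then remove it.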

Related

F# Bjorklund algorithm: convert while-loop to recursive function: type-constraint issue

I'm finally doing the deep dive into F#. Long-time C-style imperative guy, but lover of all languages. I'm attempting the Bjorklund algorithm for Euclidean rhythms. Bjorklund: most equal spacing of 1's in a binary string up to rotation, e.g. 1111100000000 -> 1001010010100.
https://erikdemaine.org/papers/DeepRhythms_CGTA/paper.pdf
I initially based my attempt off a nice js/lodash implementation. I tried from scratch but got all tied up in old concepts.
https://codepen.io/teropa/details/zPEYbY
Here's my 1:1 translation:
let mutable pat = "1111100000000" // more 0 than 1
//let mutable pat = "1111111100000" // more 1 than 0
// https://stackoverflow.com/questions/17101329/f-sequence-comparison
let compareSequences = Seq.compareWith Operators.compare
let mutable apat = Array.map (fun a -> [a]) ( Seq.toArray pat )
let mutable cond = true
while cond do
    let (head, rem) = Array.partition (fun v -> (compareSequences v apat.[0]) = 0) apat
    cond <- rem.Length > 1
    match cond with
    | false -> ()
    | true ->
        for i=0 to (min head.Length rem.Length)-1 do
            apat.[i] <- apat.[i] @ apat.[^0]
            apat <- apat.[.. ^1]
let tostring (ac : char list) = (System.String.Concat(Array.ofList(ac)))
let oned = (Array.map (fun a -> tostring a) apat )
let res = Array.reduce (fun a b -> a+b) oned
printfn "%A" res
That seems to work. But since I want to (learn to) be as functional as possible, not necessarily idiomatic, I wanted to lose the while loop and recurse the main algorithm.
Now I have this:
let apat = Array.map (fun a -> [a]) ( Seq.toArray pat )
let rec bjork bpat:list<char> array =
    let (head, rem) = Array.partition (fun v -> (compareSequences v bpat.[0]) = 0) bpat
    match rem.Length > 1 with
    | false -> bpat
    | true ->
        for i=0 to (min head.Length rem.Length)-1 do
            bpat.[i] <- bpat.[i] @ bpat.[^0]
        bjork bpat.[.. ^1]
let ppat = bjork apat
The issue is the second argument to compareSequences, bpat.[0]. I am getting the error:
The operator 'expr.[idx]' has been used on an object of indeterminate type based on information prior to this program point. Consider adding further type constraints
I'm a bit confused since this seems so similar to the while-loop version. I can see that the signature of compareSequences is different but don't know why. apat has the same type in each version save the mutability. bpat in 2nd version is same type as apat.
while-loop: char list -> char list -> int
rec-funct : char list -> seq<char> -> int
I will say I've had some weird errors learning F# that ended up having to do with issues elsewhere in the code, so hopefully this is not a lark.
Also, there may be other ways to do this, including Bresenham's line algorithm, but I'm on the learning track and this seemed a good algorithm for several functional concepts.
Can anyone see what I am missing here? Also, if someone who is well versed in the functional/f# paradigm has a nice way of approaching this, I'd like to see that.
Thanks
Ted
EDIT:
The recursive version above does not work; I just couldn't test it. This works, but still has a mutable:
let rec bjork (bbpat:list<char> array) =
    let mutable bpat = bbpat
    let (head, rem) = Array.partition (fun v -> (compareSequences v bpat.[0]) = 0) bpat
    match rem.Length > 1 with
    | false -> bpat
    | true ->
        for i=0 to (min head.Length rem.Length)-1 do
            bpat.[i] <- bpat.[i] @ bpat.[^0]
            bpat <- bpat.[.. ^1]
        bjork bpat
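You need to put parentheses around the parameter and its annotation, (bpat:list<char> array). Otherwise the type annotation applies to the return type of bjork, not to the parameter, so the parameter's type is still unknown at the point where it is indexed with .[0]: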
let rec bjork (bbpat:list<char> array) =
...
Also note that calculating the length and indexing are both O(n) operations on F# linked lists. Consider pattern matching instead.
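For the functional approach the question asks about, one possible sketch replaces the array mutation with list operations and pattern matching. It assumes the input is all 1s followed by all 0s, so that the groups equal to the first one always sit at the front and the remaining groups are identical to each other. The wrapper name `bjorklund` is hypothetical, and this is not the implementation from the linked paper or CodePen, only a restatement of the loop in the question:

```fsharp
// Recursive pass over a list of groups: no mutation, no indexing.
let rec bjork (groups: char list list) : char list list =
    match groups with
    | [] -> []
    | first :: _ ->
        // head: groups equal to the first one; rem: everything else
        let head, rem = List.partition ((=) first) groups
        match rem with
        | [] | [ _ ] -> groups // at most one remainder group left: done
        | _ ->
            let n = min (List.length head) (List.length rem)
            // append one remainder group to each of the first n head groups
            let paired = List.map2 (@) (List.take n head) (List.take n rem)
            bjork (paired @ List.skip n head @ List.skip n rem)

// Hypothetical wrapper: string in, string out.
let bjorklund (pat: string) : string =
    let chars =
        pat
        |> Seq.map (fun c -> [ c ])
        |> List.ofSeq
        |> bjork
        |> List.concat
        |> Array.ofList
    System.String(chars)

printfn "%s" (bjorklund "1111100000000") // prints 1001010010100
```

Each recursive call receives a fresh list, so the `mutable` binding from the edited version is no longer needed. The remaining `List.length` calls are still O(n), which is acceptable here because the lists shrink on every pass.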

How can I follow F# Lint's suggestion to use `id`

I am comparing two lists of thangs. Since I'm more familiar with Linq than F#, I did this:
let r1 = (rows1.Zip (rows2, fun r1 r2 -> rowComparer r1 r2)) .All (fun f -> f)
This raises two complaints from the F# linter.
Lint: If `rowComparer` has no mutable arguments partially applied then the lambda can be removed.
Lint: `fun x -> x` might be able to be refactored into `id`.
Of these, I could understand the latter, and tried this:
let r1 = (rows1.Zip (rows2, fun r1 r2 -> rowComparer r1 r2)) .All id
But this made the F# compiler complain:
This expression was expected to have type
'System.Func<bool,bool>'
but here has type
''a -> 'a'
Can someone say how this code can be more righteous?
I would suggest using the F# List or Seq modules instead of LINQ methods. Then you'll be able to use F# function types like 'a -> 'a instead of System.Func<'a, 'a>, and you can pass id to Seq.forall. If you could post a complete example, it would be easier to give you a complete answer, but I think something like this would be roughly equivalent to what you're doing with LINQ:
let compare (rowComparer: ('a * 'a) -> bool) rows =
    Seq.zip rows >> Seq.map rowComparer >> Seq.forall id
This creates a function that takes two sequences and compares each value in the first to the corresponding value in the second, generating a sequence of booleans. It then returns true if all of the values in the sequence are true, otherwise it returns false. This is achieved using function composition and partial application to build a new function with the required signature.
You can then partially apply a row comparer function to create a specialized compare function for each of your scenarios, as follows:
let compareEqual = compare (fun (a,b) -> a = b)
compareEqual [0; 1; 2] [0; 1; 2] // true
compareEqual [0; 1; 2] [2; 1; 2] // false
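If the comparer is curried, as `rowComparer r1 r2` in the question suggests, a shorter variant is possible with `Seq.forall2`, which walks both sequences in step and needs neither `id` nor an explicit zip. This is a sketch; the name `rowsMatch` is hypothetical:

```fsharp
// Compare two sequences element by element with a curried comparer.
let rowsMatch (rowComparer: 'a -> 'b -> bool) (rows1: seq<'a>) (rows2: seq<'b>) : bool =
    Seq.forall2 rowComparer rows1 rows2

let same = rowsMatch (=) [ 0; 1; 2 ] [ 0; 1; 2 ]      // true
let different = rowsMatch (=) [ 0; 1; 2 ] [ 2; 1; 2 ] // false
```

Like LINQ's Zip, `Seq.forall2` ignores the surplus elements of the longer sequence, so a length check is still needed if sequences of different lengths should count as unequal.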
You can supply the standard function id as an argument if you create an instance of System.Func with the correct number of generic type parameters from it. When employing a lambda expression, the F# compiler does that for you.
open System.Linq
let foo rowComparer (rows1 : seq<_>) (rows2 : seq<_>) =
    (rows1.Zip (rows2, fun r1 r2 -> rowComparer r1 r2)).All(System.Func<_,_>(id))
// val foo :
// rowComparer:('a -> 'b -> bool) -> rows1:seq<'a> -> rows2:seq<'b> -> bool

Seq.take won't return elements

When I run the following code:
getTheData() |> Seq.take 3
it does not return the elements, instead it outputs this:
val it : seq<Collections.Generic.KeyValuePair<ID,Data>>
I am using Visual Studio 2017 and F# Interactive.
What is wrong? Should it not output the first 3 items?
The getTheData function:
let getTheData() =
    (@"C:\Users\data.xlsx")
    |> (ParseExcel >> datap)
    |> Seq.distinct
    |> Seq.map(fun b -> b.ID, b)
    |> Map.ofSeq
Seq.take is not considered a terminal operation on a sequence in F#. As mentioned in the comments, sequences are lazily evaluated, and only operations that are considered "terminal" will cause a sequence to be iterated. Terminal operations include Seq.iter (if you want to perform an action on each element) and Seq.toList (if you want a materialized list of each element), as well as others like Seq.exactlyOne.
In F# interactive, you can probably just evaluate it to see the first few values. In the following example mirroring yours, evaluating it at the end will display the 3 values taken:
open System
let getTheData() =
    seq {
        for n in {0..1000} -> Guid.NewGuid(), n
    } |> Map.ofSeq
getTheData()
|> Seq.take 3;;
it;;
val it : seq<Collections.Generic.KeyValuePair<Guid,int>> =
  seq
    [[001830fe-9ce3-4649-8609-571e4aedb4c7, 791]
       {Key = 001830fe-9ce3-4649-8609-571e4aedb4c7;
        Value = 791;};
     [001bf0a9-5981-4bc0-bcaf-046af7f4866a, 383]
       {Key = 001bf0a9-5981-4bc0-bcaf-046af7f4866a;
        Value = 383;};
     [004b44a7-85d2-4ce5-91bf-49bcc44f03ba, 91]
       {Key = 004b44a7-85d2-4ce5-91bf-49bcc44f03ba;
        Value = 91;}]
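To make the laziness visible without any Excel file, here is a small sketch that counts how many elements are actually produced. The names `evaluated`, `numbers` and `firstThree` are hypothetical; only `Seq.take` and `Seq.toList` come from the answer above:

```fsharp
// Counter that records how many elements the sequence has produced.
let evaluated = ref 0

let numbers =
    seq {
        for n in 1 .. 1000 do
            evaluated.Value <- evaluated.Value + 1
            yield n
    }

// Seq.take only describes the work; nothing has been enumerated yet.
let firstThree = numbers |> Seq.take 3
let countBefore = evaluated.Value

// Seq.toList is a terminal operation: it forces the enumeration.
let materialized = firstThree |> Seq.toList
let countAfter = evaluated.Value

printfn "before: %d, after: %d, items: %A" countBefore countAfter materialized
```

The counter stays at zero until `Seq.toList` runs, and even then only the first few of the 1000 elements are produced.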

How to shuffle a "DenseMatrix" in F#

In Matlab we can write the following code to shuffle a matrix:
data = data(:, randperm(size(data,2)));
Here is what I wrote with Math.NET:
let csvfile = @"../UFLDL-tutorial-F#/housing.csv"
let housingAsLines =
    File.ReadAllLines(csvfile)
    |> Array.map (fun t -> t.Split(',')
                           |> Array.map (fun t -> float t))
let housingAsMatrix= DenseMatrix.OfRowArrays housingAsLines
let housingAsMatrixT = housingAsMatrix.Transpose()
let v1 = DenseVector.Create(housingAsMatrixT.ColumnCount,1.0)
housingAsMatrixT.InsertRow(0,v1)
// How to shuffle a "DenseMatrix" in F#
To simulate the Matlab matrix operation, I tried the F# slice syntax with zero-based indexing. However, it doesn't work:
housingAsMatrixT.[*,0]
I get this error message in VS Code:
The field, constructor or member 'GetSlice' is not defined
You actually have two questions: 1) how to slice matrices à la Matlab, and 2) how to shuffle the columns of a matrix.
For 1), Issue 277, which you linked in the comment, does indeed provide the solution. However, you might be using an old version, or you might not be referencing the F# extensions correctly:
#r @"..\packages\MathNet.Numerics.3.13.1\lib\net40\MathNet.Numerics.dll"
#r @"..\packages\MathNet.Numerics.FSharp.3.13.1\lib\net40\MathNet.Numerics.FSharp.dll"
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.Distributions
open System
//let m = DenseMatrix.randomStandard<float> 5 5
let m = DenseMatrix.random<float> 5 5 (ContinuousUniform(0., 1.))
let m' = m.[*,0]
m'
//val it : Vector<float> =
//seq [0.4710989485; 0.2220238937; 0.566367266; 0.2356496324; ...]
This extracts the first column of the matrix.
Now for 2), assuming you need to shuffle the matrix or the arrays containing a matrix, you can use some of the approaches below. There might be a more elegant method within Math.NET Numerics.
To permute the vector above, use m'.SelectPermutation(), or SelectPermutationInplace for arrays. There are other convenience functions such as .Column(idx), .EnumerateColumnsIndexed() or EnumerateColumns().
So m'.SelectPermutation() will shuffle the elements of m'. Or, to shuffle the columns (which is what your Matlab code does):
let idx = Combinatorics.GeneratePermutation 5
idx
//val it : int [] = [|2; 0; 1; 4; 3|]
let m2 = idx |> Seq.map (fun i -> m.Column(i)) |> DenseMatrix.ofColumnSeq
m2.Column(1) = m.Column(0)
//val it : bool = true
Since the first column of the original matrix moved to the second column of the new matrix, the two should be equal.
With Neural Networks I had to shuffle an array of Matrices and used the following code. Note the base data structure is an array ([]) and each item in the array is a Matrix. This is not shuffling a matrix, but an array. It should give you some idea of how to proceed for your problem.
type Random() =
    static member Shuffle (a : 'a[]) =
        let rand = new System.Random()
        let swap (a: _[]) x y =
            let tmp = a.[x]
            a.[x] <- a.[y]
            a.[y] <- tmp
        Array.iteri (fun i _ -> swap a i (rand.Next(i, Array.length a))) a
and called it like
Random.Shuffle trainingData
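The same Fisher-Yates idea can be applied to column indices to mirror the Matlab one-liner. The sketch below deliberately uses a plain `float[,]` rather than a Math.NET matrix so that it runs without any package references; `shuffleInPlace` and `shuffleColumns` are hypothetical names, not part of Math.NET Numerics:

```fsharp
// Fisher-Yates shuffle of an array, in place.
let shuffleInPlace (rng: System.Random) (a: 'a[]) =
    for i in 0 .. a.Length - 2 do
        let j = rng.Next(i, a.Length) // i <= j < a.Length
        let tmp = a.[i]
        a.[i] <- a.[j]
        a.[j] <- tmp

// Returns a new matrix whose columns are a random permutation of the input's.
let shuffleColumns (rng: System.Random) (data: float[,]) : float[,] =
    let rows, cols = Array2D.length1 data, Array2D.length2 data
    let perm = Array.init cols id // 0, 1, ..., cols - 1
    shuffleInPlace rng perm
    Array2D.init rows cols (fun r c -> data.[r, perm.[c]])

let data = array2D [ [ 1.0; 2.0; 3.0 ]; [ 10.0; 20.0; 30.0 ] ]
let shuffled = shuffleColumns (System.Random(42)) data
```

Because whole columns are moved together, each column of `shuffled` still pairs a value from the first row with its counterpart in the second row.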
Addendum
Here is the code to convert a byte[] to a DenseMatrix of double
let byteArrayToMatrix (bytes : byte[]) : Matrix<double> =
    let (x : Vector<byte>) = Vector<byte>.Build.DenseOfArray bytes
    let (y : Vector<double>) = x.Map(fun x -> double x)
    let (z : Matrix<double>) = Matrix<double>.Build.DenseOfRowVectors y
    z
@GuyCoder and @s952163, thanks for your help. I implemented a quick-and-dirty version. It is not good enough but it works.
Please feel free to comment. Thank you.
#load "../packages/FsLab.1.0.2/FsLab.fsx"
open System
open System.IO
open MathNet.Numerics.LinearAlgebra.Double
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.Distributions
// implementation of the Fisher-Yates shuffle by Mathias
// http://www.clear-lines.com/blog/post/Optimizing-some-old-F-code.aspx
let swap fst snd i =
    if i = fst then snd else
    if i = snd then fst else
    i
let shuffle items (rng: Random) =
    let rec shuffleTo items upTo =
        match upTo with
        | 0 -> items
        | _ ->
            let fst = rng.Next(upTo)
            let shuffled = List.permute (swap fst (upTo - 1)) items
            shuffleTo shuffled (upTo - 1)
    let length = List.length items
    shuffleTo items length
let csvfile = @"/eUSB/sync/fsharp/UFLDL-tutorial-F#/housing.csv"
let housingAsLines =
    File.ReadAllLines(csvfile)
    |> Array.map (fun t -> t.Split(',')
                           |> Array.map (fun t -> float t))
let housingAsMatrix= DenseMatrix.OfRowArrays housingAsLines
let housingAsMatrixTmp = housingAsMatrix.Transpose()
let v1 = DenseVector.Create(housingAsMatrixTmp.ColumnCount,1.0)
let housingAsMatrixT = housingAsMatrixTmp.InsertRow(0,v1)
let m = housingAsMatrixT.RowCount - 1
let listOfArray = [0..m]
let random = new Random()
let shuffled = shuffle listOfArray random
let z = [for i in shuffled -> (housingAsMatrixT.[i, *])]
let final = DenseMatrix.OfRowVectors z

F# equivalent of LINQ Single

Ok, so for most LINQ operations there is a F# equivalent.
(Generally in the Seq module, since Seq= IEnumerable)
I can't find the equivalent of IEnumerable.Single. I prefer Single over First (which is Seq.find) because it is more defensive: it asserts for me that the state is what I expect.
So I see a couple of solutions (other than using Seq.find).
(These could be written as extension methods)
The type signature for this function, which I'm calling only, is
('a->bool) -> seq<'a> -> 'a
let only = fun predicate src -> System.Linq.Enumerable.Single<'a>(src, predicate)
let only2 = Seq.filter >> Seq.exactlyOne
only2 is preferred, however it won't compile (any clues on that?).
In F# 2.0, this is a solution that works without enumerating the whole sequence (close to your second approach):
module Seq =
    let exactlyOne seq =
        match seq |> Seq.truncate 2 with
        | s when Seq.length s = 1 -> s |> Seq.head |> Some
        | _ -> None
    let single predicate =
        Seq.filter predicate >> exactlyOne
I chose to return an option type, since raising exceptions is quite unusual in F# higher-order functions.
EDIT:
In F# 3.0, as @Oxinabox mentioned in his comment, Seq.exactlyOne exists in the Seq module.
What about
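let Single source f =
    let worked = ref false
    let newf = fun a ->
        match f a with
        |true ->
            if !worked = true then failwith "not single"
            worked := true
            Some(a)
        |false -> None
    let r = source |> Seq.choose newf
    // Note: Seq.choose is lazy and Seq.nth 0 stops at the first match,
    // so elements after the first match are never tested.
    Seq.nth 0 r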
Very unidiomatic but probably close to optimal
EDIT:
Solution with exactlyOne
let only2 f s = Seq.filter f s |> Seq.exactlyOne
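On why `Seq.filter >> Seq.exactlyOne` does not compile: `>>` passes the result of `Seq.filter predicate`, which is still a function waiting for a sequence, straight into `Seq.exactlyOne`, which expects a sequence. Supplying the predicate before composing fixes that. The sketch below shows both a throwing and an option-returning variant; `only`, `onlyComposed` and `tryOnly` are hypothetical names, and `Seq.tryExactlyOne` is assumed to be available (it is in recent FSharp.Core versions):

```fsharp
// Throws if the number of matches is not exactly one, like LINQ's Single.
let only predicate source =
    source |> Seq.filter predicate |> Seq.exactlyOne

// Same thing with composition: the predicate is applied to Seq.filter first.
let onlyComposed predicate = Seq.filter predicate >> Seq.exactlyOne

// Returns None instead of throwing when there are zero or several matches.
let tryOnly predicate source =
    source |> Seq.filter predicate |> Seq.tryExactlyOne

let three = only (fun x -> x > 2) [ 1; 2; 3 ]            // 3
let alsoThree = onlyComposed (fun x -> x > 2) (seq { 1 .. 3 })
let tooMany = tryOnly (fun x -> x > 1) [ 1; 2; 3 ]       // None
let noMatch = tryOnly (fun x -> x > 5) [ 1; 2; 3 ]       // None
```

The option-returning form follows the preference stated in the first answer for avoiding exceptions in higher-order functions.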

Resources