In Matlab we can write the following code to shuffle a matrix:
data = data(:, randperm(size(data,2)));
there is what I write with Math.NET:
let csvfile = #"../UFLDL-tutorial-F#/housing.csv"
let housingAsLines =
File.ReadAllLines(csvfile)
|> Array.map (fun t -> t.Split(',')
|> Array.map (fun t -> float t))
let housingAsMatrix= DenseMatrix.OfRowArrays housingAsLines
let housingAsMatrixT = housingAsMatrix.Transpose()
let v1 = DenseVector.Create(housingAsMatrixT.ColumnCount,1.0)
housingAsMatrixT.InsertRow(0,v1)
// How to shuffle a "DenseMatrix" in F#
To simulate matrix operation in Matlab, using the F# slice syntax and zero-based indexing. However, it doesn't work.
housingAsMatrixT.[*,0]
And I got the error message in vscode.
The field, constructor or member 'GetSlice' is not defined
You actually have two questions, 1) how to slice Matrices ala matlab and 2) how to shuffle the columns of a matrix.
For 1) actually Issue 277 you linked in the comment does indeed provide the solution. However you might be using an old version or you might not be referencing the F# extensions correctly:
#r #"..\packages\MathNet.Numerics.3.13.1\lib\net40\MathNet.Numerics.dll"
#r #"..\packages\MathNet.Numerics.FSharp.3.13.1\lib\net40\MathNet.Numerics.FSharp.dll"
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.Distributions
open System
//let m = DenseMatrix.randomStandard<float> 5 5
let m = DenseMatrix.random<float> 5 5 (ContinuousUniform(0., 1.))
let m' = m.[*,0]
m'
//val it : Vector<float> =
//seq [0.4710989485; 0.2220238937; 0.566367266; 0.2356496324; ...]
This extracts the first column of the matrix.
Now for 2), assuming you need to shuffle the matrix or the arrays containing a matrix you can use some of the approaches below. There might be a more elegant method within mathnet.numerics.
To permute the vector above: m'.SelectPermutation() or SelectPermutationInplace for arrays. There are other convenience function like .Column(idx),.EnumerateColumnsIndexed() or EnumerateColumns(), etc.
So m'.SelectPermutation() will shuffle the elements of m'. Or to shuffle the columns (which your matlab function does):
let idx = Combinatorics.GeneratePermutation 5
idx
//val it : int [] = [|2; 0; 1; 4; 3|]
let m2 = idx |> Seq.map (fun i -> m.Column(i)) |> DenseMatrix.ofColumnSeq
m2.Column(1) = m.Column(0)
//val it : bool = true
Since the first column of the original matrix moved to the second column of the new matrix, the two should be equal.
With Neural Networks I had to shuffle an array of Matrices and used the following code. Note the base data structure is an array ([]) and each item in the array is a Matrix. This is not shuffling a matrix, but an array. It should give you some idea of how to proceed for your problem.
type Random() =
static member Shuffle (a : 'a[]) =
let rand = new System.Random()
let swap (a: _[]) x y =
let tmp = a.[x]
a.[x] <- a.[y]
a.[y] <- tmp
Array.iteri (fun i _ -> swap a i (rand.Next(i, Array.length a))) a
and called it like
Random.Shuffle trainingData
Addendum
Here is the code to convert a byte[] to a DenseMatrix of double
let byteArrayToMatrix (bytes : byte[]) : Matrix<double> =
let (x : Vector<byte>) = Vector<byte>.Build.DenseOfArray bytes
let (y : Vector<double>) = x.Map(fun x -> double x)
let (z : Matrix<double>) = Matrix<double>.Build.DenseOfRowVectors y
z
#GuyCoder and #s952163, thanks for your help. I implemented a quick-and-dirty version. It is not good enough but it works.
Please feel free to comment. Thank you.
#load "../packages/FsLab.1.0.2/FsLab.fsx"
open System
open System.IO
open MathNet.Numerics.LinearAlgebra.Double
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.Distributions
// implementation of the Fisher-Yates shuffle by Mathias
// http://www.clear-lines.com/blog/post/Optimizing-some-old-F-code.aspx
let swap fst snd i =
if i = fst then snd else
if i = snd then fst else
i
let shuffle items (rng: Random) =
let rec shuffleTo items upTo =
match upTo with
| 0 -> items
| _ ->
let fst = rng.Next(upTo)
let shuffled = List.permute (swap fst (upTo - 1)) items
shuffleTo shuffled (upTo - 1)
let length = List.length items
shuffleTo items length
let csvfile = #"/eUSB/sync/fsharp/UFLDL-tutorial-F#/housing.csv"
let housingAsLines =
File.ReadAllLines(csvfile)
|> Array.map (fun t -> t.Split(',')
|> Array.map (fun t -> float t))
let housingAsMatrix= DenseMatrix.OfRowArrays housingAsLines
let housingAsMatrixTmp = housingAsMatrix.Transpose()
let v1 = DenseVector.Create(housingAsMatrixTmp.ColumnCount,1.0)
let housingAsMatrixT = housingAsMatrixTmp.InsertRow(0,v1)
let m = housingAsMatrixT.RowCount - 1
let listOfArray = [0..m]
let random = new Random()
let shuffled = shuffle listOfArray random
let z = [for i in shuffled -> (housingAsMatrixT.[i, *])]
let final = DenseMatrix.OfRowVectors z
Related
What approach would you use to generate the set of NxN matrices containing only zeros and ones which represents all possible distinct combinations?
let matrix Array2D.init N N (fun x y -> something)
If you don't know F# then pseudocode will be a contribution aswell.
So what I want is a list/array of all the distinct matrix combinations
So, I think the hard part is the generating the list of elements. We can do it recursively.
The base case is easy. For a 1x1 matrix, you have 1 element which can only have two combinations: [|[|0|]; [|1|]|].
For a 2x2 elements, we have 2^2 = 4 elements. Each one of these can be either 1 or 0, so there are 2^4 = 16 combinations possible. To get all the combinations possible for this 2x2 array, we can think of it as an array of length 4.
But first, let's think about an array of length 2. Then we have to find all the combinations between [|[|0|]; [|1|]|] and [|[|0|]; [|1|]|]. This would be [|[|0; 0|]; [|0;1|]; [|1;0|]; [|1; 1|]|]. Luckily, there's a function called Array.allPairs which will generate the array of all possible combinations between two arrays, which already does this for us!
So, we can apply Array.allPairs to each element of our array of length 4 sequentially to get all the possible combinations for the entire matrix using Array.reduce. I make a function called pairsToArray to basically flatten the data structure.
let pairsToArray x = Array.concat [|fst x; snd x|]
let rec binary N =
match N with
| 0 -> [||]
| 1 -> [|[|0|]; [|1|]|]
| n -> let elements = n*n
let combinations = Array.init elements (fun i -> binary 1)
let result = Array.reduce (fun acc i -> Array.allPairs acc i |> Array.map pairsToArray) combinations
result
Now, all that remains is converting this to a Array2D.
Something like should do the trick
let c = binary 2
c |> Array.map (fun i -> Array2D.init 2 2 (fun j k -> i.[j+k*2]))
for the 2x2 case
Maybe something like this
let rec addOne (N1: int, N2: int) (M: int[,]) (i: int, j: int)=
if M.[i,j] = 0
then M.[i,j] <- 1
true
else M.[i,j] <- 0
let newi, newj =
if i < N1-1
then (i+1,j)
else (0,j+1)
if newj = N2
then false
else addOne (N1, N2) M (newi,newj)
combined with this
let N = 3
let M: int[,] = Array2D.zeroCreate N N
let mylist =
[ yield M;
while addOne (N,N) M (0,0)
do yield Array2D.copy M ]
I don't know if it makes sense.
It is a method to find the "next" matrix, and then make a list of all the matrices that we encounter that way.
edit: replaced bool with int (0 and 1) to better fit the original question.
I'm new to F# and I'm trying to figure out how to return a random string value from a list/array of strings.
I have a list like this:
["win8FF40", "win10Chrome45", "win7IE11"]
How can I randomly select and return one item from the list above?
Here is my first try:
let combos = ["win8FF40";"win10Chrome45";"win7IE11"]
let getrandomitem () =
let rnd = System.Random()
fun (combos : string[]) -> combos.[rnd.Next(combos.Length)]
Both the answers given here by latkin and mydogisbox are good, but I still want to add a third approach that I sometimes use. This approach isn't faster, but it's more flexible and more composable, and fast enough for small sequences. Depending on your needs, you can use one of higher performance options given here, or you can use the following.
Single-argument function using Random
Instead of directly enabling you to select a single element, I often define a shuffleR function like this:
open System
let shuffleR (r : Random) xs = xs |> Seq.sortBy (fun _ -> r.Next())
This function has the type System.Random -> seq<'a> -> seq<'a>, so it works with any sort of sequence: lists, arrays, collections, and lazily evaluated sequences (although not with infinite sequences).
If you want a single random element from a list, you can still do that:
> [1..100] |> shuffleR (Random ()) |> Seq.head;;
val it : int = 85
but you can also take, say, three randomly picked elements:
> [1..100] |> shuffleR (Random ()) |> Seq.take 3;;
val it : seq<int> = seq [95; 92; 12]
No-argument function
Sometimes, I don't care about having to pass in that Random value, so I instead define this alternative version:
let shuffleG xs = xs |> Seq.sortBy (fun _ -> Guid.NewGuid())
It works in the same way:
> [1..100] |> shuffleG |> Seq.head;;
val it : int = 11
> [1..100] |> shuffleG |> Seq.take 3;;
val it : seq<int> = seq [69; 61; 42]
Although the purpose of Guid.NewGuid() isn't to provide random numbers, it's often random enough for my purposes - random, in the sense of being unpredictable.
Generalised function
Neither shuffleR nor shuffleG are truly random. Due to the ways Random and Guid.NewGuid() work, both functions may result in slightly skewed distributions. If this is a concern, you can define an even more general-purpose shuffle function:
let shuffle next xs = xs |> Seq.sortBy (fun _ -> next())
This function has the type (unit -> 'a) -> seq<'b> -> seq<'b> when 'a : comparison. It can still be used with Random:
> let r = Random();;
val r : Random
> [1..100] |> shuffle (fun _ -> r.Next()) |> Seq.take 3;;
val it : seq<int> = seq [68; 99; 54]
> [1..100] |> shuffle (fun _ -> r.Next()) |> Seq.take 3;;
val it : seq<int> = seq [99; 63; 11]
but you can also use it with some of the cryptographically secure random number generators provided by the Base Class Library:
open System.Security.Cryptography
open System.Collections.Generic
let rng = new RNGCryptoServiceProvider ()
let bytes = Array.zeroCreate<byte> 100
rng.GetBytes bytes
let q = bytes |> Queue
FSI:
> [1..100] |> shuffle (fun _ -> q.Dequeue()) |> Seq.take 3;;
val it : seq<int> = seq [74; 82; 61]
Unfortunately, as you can see from this code, it's quite cumbersome and brittle. You have to know the length of the sequence up front; RNGCryptoServiceProvider implements IDisposable, so you should make sure to dispose of rng after use; and items will be removed from q after use, which means it's not reusable.
Cryptographically random sort or selection
Instead, if you really need a cryptographically correct sort or selection, it'd be easier to do it like this:
let shuffleCrypto xs =
let a = xs |> Seq.toArray
use rng = new RNGCryptoServiceProvider ()
let bytes = Array.zeroCreate a.Length
rng.GetBytes bytes
Array.zip bytes a |> Array.sortBy fst |> Array.map snd
Usage:
> [1..100] |> shuffleCrypto |> Array.head;;
val it : int = 37
> [1..100] |> shuffleCrypto |> Array.take 3;;
val it : int [] = [|35; 67; 36|]
This isn't something I've ever had to do, though, but I thought I'd include it here for the sake of completeness. While I haven't measured it, it's most likely not the fastest implementation, but it should be cryptographically random.
Your problem is that you are mixing Arrays and F# Lists (*type*[] is a type notation for Array). You could modify it like this to use lists:
let getrandomitem () =
let rnd = System.Random()
fun (combos : string list) -> List.nth combos (rnd.Next(combos.Length))
That being said, indexing into a List is usually a bad idea since it has O(n) performance since an F# list is basically a linked-list. You would be better off making combos into an array if possible like this:
let combos = [|"win8FF40";"win10Chrome45";"win7IE11"|]
I wrote a blog post on exactly this topic a while ago: http://latkin.org/blog/2013/11/16/selecting-a-random-element-from-a-linked-list-3-approaches-in-f/
3 approaches are given there, with discussion of performance and tradeoffs of each.
To summarize:
// pro: simple, fast in practice
// con: 2-pass (once to get length, once to select nth element)
let method1 lst (rng : Random) =
List.nth lst (rng.Next(List.length lst))
// pro: ~1 pass, list length is not bound by int32
// con: more complex, slower in practice
let method2 lst (rng : Random) =
let rec step remaining picks top =
match (remaining, picks) with
| ([], []) -> failwith "Don't pass empty list"
// if only 1 element is picked, this is the result
| ([], [p]) -> p
// if multiple elements are picked, select randomly from them
| ([], ps) -> step ps [] -1
| (h :: t, ps) ->
match rng.Next() with
// if RNG makes new top number, picks list is reset
| n when n > top -> step t [h] n
// if RNG ties top number, add current element to picks list
| n when n = top -> step t (h::ps) top
// otherwise ignore and move to next element
| _ -> step t ps top
step lst [] -1
// pro: exactly 1 pass
// con: more complex, slowest in practice due to tuple allocations
let method3 lst (rng : Random) =
snd <| List.fold (fun (i, pick) elem ->
if rng.Next(i) = 0 then (i + 1, elem)
else (i + 1, pick)
) (0, List.head lst) lst
Edit: I should clarify that above shows a few ways to get a random element from a list, assuming you must use a list. If it fits with the rest of your program's design, it is definitely more efficient to take a random element from an array.
Suppose I have the following matrix:
The matrix can be broken down into chunks such that each chunk must, for all rows, have the same number of columns where the value is marked true for that row.
For example, the following chunk is valid:
This means that rows do not have to be contiguous.
Columns do not have to be contiguous either, as the following is a valid chunk:
However, the following is invalid:
That said, what is an algorithm that can be used to select chunks such that the minimal number of chunks will be used when finding all the chunks?
Given the example, above, the proper solution is (items with the same color represent a valid chunk):
In the above example, three is the minimal number of chunks that this can be broken down into.
Note that the following is also a valid solution:
There's not a preference to the solutions, really, just to get the least number of chunks.
I thought of counting using adjacent cells, but that doesn't account for the fact that the column values don't have to be contiguous.
I believe the key lies in finding the chunks with the largest area given the constraints, removing those items, and then repeating.
Taking that approach, the solution is:
But how to traverse the matrix and find the largest area is eluding me.
Also note, that if you want to reshuffle the rows and/or columns during the operations, that's a valid operation (in order to find the largest area), but I'd imagine you can only do it after you remove the largest areas from the matrix (after one area is found and moving onto the next).
You are doing circuit minimization on a truth table. For 4x4 truth tables, you can use a K map. The Quine-McCluskey algorithm is a generalization that can handle larger truth tables.
Keep in mind the problem is NP-Hard, so depending on the size of your truth tables, this problem can quickly grow to a size that is intractable.
This problem is strongly related to Biclustering, for which there are many efficient algorithms (and freely available implementations). Usually you will have to specify the number K of clusters you expect to find; if you don't have a good idea what K should be, you can proceed by binary search on K.
In case the biclusters don't overlap, you are done, otherwise you need to do some geometry to cut them into "blocks".
The solution I propose is fairly straightforward, but very time consuming.
It can be decomposed in 4 major steps:
find all the existing patterns in the matrix,
find all the possible combinations of these patterns,
remove all the incomplete pattern sets,
scan the remaining list to get the set with the minimum number of elements
First of, the algorithm below works on either column or row major matrices. I chose column for the explanations, but you may swap it for rows at your convenience, as long as it remains consistent accross the whole process.
The sample code accompanying the answer is in OCaml, but doesn't use any specific feature of the language, so it should be easy to port to other ML dialects.
Step 1:
Each column can be seen as a bit vector. Observe that a pattern (what you call chunk in your question) can be constructed by intersecting (ie. and ing) all the columns, or all the rows composing it, or even a combinations. So the first step is really about producing all the combinations of rows and columns (the powerset of the matrix' rows and columns if you will), intersecting them at the same time, and filter out the duplicates.
We consider the following interface for a matrix datatype:
module type MATRIX = sig
type t
val w : int (* the width of the matrix *)
val h : int (* the height ........ *)
val get : t -> int -> int -> bool (* cell value getter *)
end
Now let's have a look at this step's code:
let clength = M.h
let rlength = M.w
(* the vector datatype used throughought the algorithm
operator on this type are in the module V *)
type vector = V.t
(* a pattern description and comparison operators *)
module Pattern = struct
type t = {
w : int; (* width of thd pattern *)
h : int; (* height of the pattern *)
rows : vector; (* which rows of the matrix are used *)
cols : vector; (* which columns... *)
}
let compare a b = Pervasives.compare a b
let equal a b = compare a b = 0
end
(* pattern set : let us store patterns without duplicates *)
module PS = Set.Make(Pattern)
(* a simple recursive loop on #f #k times *)
let rec fold f acc k =
if k < 0
then acc
else fold f (f acc k) (pred k)
(* extract a column/row of the given matrix *)
let cr_extract mget len =
fold (fun v j -> if mget j then V.set v j else v) (V.null len) (pred len)
let col_extract m i = cr_extract (fun j -> M.get m i j) clength
let row_extract m i = cr_extract (fun j -> M.get m j i) rlength
(* encode a single column as a pattern *)
let col_encode c i =
{ w = 1; h = count c; rows = V.set (V.null clength) i; cols = c }
let row_encode r i =
{ h = 1; w = count r; cols = V.set (V.null rlength) i; rows = r }
(* try to add a column to a pattern *)
let col_intersect p c i =
let col = V.l_and p.cols c in
let h = V.count col in
if h > 0
then
let row = V.set (V.copy p.rows) i in
Some {w = V.count row; h = h; rows = row; clos = col}
else None
let row_intersect p r i =
let row = V.l_and p.rows r in
let w = V.count row in
if w > 0
then
let col = V.set (V.copy p.cols) i in
Some { w = w; h = V.count col; rows = row; cols = col }
else None
let build_patterns m =
let bp k ps extract encode intersect =
let build (l,k) =
let c = extract m k in
let u = encode c k in
let fld p ps =
match intersect p c k with
None -> l
| Some npc -> PS.add npc ps
in
PS.fold fld (PS.add u q) q, succ k
in
fst (fold (fun res _ -> build res) (ps, 0) k)
in
let ps = bp (pred rlength) PS.empty col_extract col_encode col_intersect in
let ps = bp (pred clength) ps row_extract row_encode row_intersect in
PS.elements ps
The V module must comply with the following signature for the whole algorithm:
module type V = sig
type t
val null : int -> t (* the null vector, ie. with all entries equal to false *)
val copy : t -> t (* copy operator *)
val get : t -> int -> bool (* get the nth element *)
val set : t -> int -> t (* set the nth element to true *)
val l_and : t -> t -> t (* intersection operator, ie. logical and *)
val l_or : t -> t -> t (* logical or *)
val count : t -> int (* number of elements set to true *)
val equal : t -> t -> bool (* equality predicate *)
end
Step 2:
Combining the patterns can also be seen as a powerset construction, with some restrictions: A valid pattern set may only contain patterns which don't overlap. The later can be defined as true for two patterns if both contain at least one common matrix cell.
With the pattern data structure used above, the overlap predicate is quite simple:
let overlap p1 p2 =
let nullc = V.null h
and nullr = V.null w in
let o v1 v2 n = not (V.equal (V.l_and v1 v2) n) in
o p1.rows p2.rows nullr && o p1.cols p2.cols nullc
The cols and rows of the pattern record indicate which coordinates in the matrix are included in the pattern. Thus a logical and on both fields will tell us if the patterns overlap.
For including a pattern in a pattern set, we must ensure that it does not overlap with any pattern of the set.
type pset = {
n : int; (* number of patterns in the set *)
pats : pattern list;
}
let overlap sp p =
List.exists (fun x -> overlap x p) sp.pats
let scombine sp p =
if overlap sp p
then None
else Some {
n = sp.n + 1;
pats = p::sp.pats;
}
let build_pattern_sets l =
let pset l p =
let sp = { n = 1; pats = [p] } in
List.fold_left (fun l spx ->
match scombine spx p with
None -> l
| Some nsp -> nsp::l
) (sp::l) l
in List.fold_left pset [] l
This step produces a lot of sets, and thus is very memory and computation intensive. It's certainly the weak point of this solution, but I don't see yet how to reduce the fold.
Step 3:
A pattern set is incomplete if when rebuilding the matrix with it, we do not obtain the original one. So the process is rather simple.
let build_matrix ps w =
let add m p =
let rec add_col p i = function
| [] -> []
| c::cs ->
let c =
if V.get p.rows i
then V.l_or c p.cols
else c
in c::(add_col p (succ i) cs)
in add_col p 0 m
in
(* null matrix as a list of null vectors *)
let m = fold (fun l _ -> V.null clength::l) [] (pred rlength) in
List.fold_left add m ps.pats
let drop_incomplete_sets m l =
(* convert the matrix to a list of columns *)
let m' = fold (fun l k -> col_extract m k ::l) [] (pred rlength) in
let complete m sp =
let m' = build_matrix sp in
m = m'
in List.filter (fun x -> complete m' x) l
Step 4:
The last step is just selecting the set with the smallest number of elements:
let smallest_set l =
let smallest ps1 ps2 = if ps1.n < ps2.n then ps1 else ps2 in
match l with
| [] -> assert false (* there should be at least 1 solution *)
| h::t -> List.fold_left smallest h t
The whole computation is then just the chaining of each steps:
let compute m =
let (|>) f g = g f in
build_patterns m |> build_pattern_sets |> drop_incomplete_sets m |> smallest_set
Notes
The algorithm above constructs a powerset of a powerset, with some limited filtering. There isn't as far as I know a way to reduce the search (as mentioned in a comment, if this is a NP hard problem, there isn't any).
This algorithm checks all the possible solutions, and correctly returns an optimal one (tested with many matrices, including the one given in the problem description.
One quick remark regarding the heuristic you propose in your question:
it could be easily implemented using the first step, removing the largest pattern found, and recursing. That would yeld a solution much more rapidly than my algorithm. However, the solution found may not be optimal.
For instance, consider the following matrix:
.x...
.xxx
xxx.
...x.
The central 4 cell chunck is the largest which may be found, but the set using it would comprise 5 patterns in total.
.1...
.223
422.
...5.
Yet this solution uses only 4:
.1...
.122
334.
...4.
Update:
Link to the full code I wrote for this answer.
I have implemented the Baum-Welch algorithm and I am playing with some toy data, generated with a known distribution. The data is normally distributed, had different mean and standard deviation depending on the hidden state. There are 2 states. The algorithm seems to converge for most of the parameters apart from the initial distribution of the hidden state, which always converges to either (0; 1) or (1; 0) depending on the random data.
Is it normal in this algorithm? If so I would appreciate some references, if not some hints how to find the bug.
Code (F#). First a helper module:
module MyMath
let sqr (x:float) = x*x
let inline (./) (array:float[]) (d:float) =
Array.map (fun x -> x/d) array
let inline (.*) (array:float[]) (d:float) =
Array.map (fun x -> x*d) array
let map f s =
s |> Seq.map f |> Seq.toArray
let normalize v =
let sum = Seq.sum v
map (fun x -> x/sum) v
let row i array = seq { for j in 0 .. (Array2D.length2 array)-1 do yield array.[i,j]}
let column j array = seq { for i in 0 .. (Array2D.length1 array)-1 do yield array.[i,j]}
let sum (v:float[]) = v |> Array.sum
let sumTo N (f:int->float) = Seq.init N f |> Seq.sum
let sum_column j (array:float[,]) = column j array |> Seq.sum
let sum_row i (array:float[,]) = row i array |> Seq.sum
let mean data = (sum data)/(float (Array.length data))
let var data =
let m=mean data
let N=Array.length data
let sum=Seq.sumBy (fun x -> sqr(x)) data
sum/(float N)
let induction start T nextRow =
let result = Array.zeroCreate T
result.[0] <- start
for t=1 to T-1 do
result.[t] <- nextRow t result.[t-1]
result
let backInduction last T previousRow =
let result = Array.zeroCreate T
result.[T-1] <- last
for t=T-2 downto 0 do
result.[t] <- previousRow t result.[t+1]
result
let inductionNormalized start T nextRow =
let result = Array.zeroCreate T
let norm = Array.zeroCreate T
norm.[0] <- sum start
result.[0] <- start./norm.[0]
for t=1 to T-1 do
result.[t] <- nextRow t result.[t-1]
norm.[t] <- sum result.[t]
result.[t] <- result.[t]./norm.[t]
(result, norm)
The main module:
module BaumWelch
open System
open MyMath
let mu (theta : float[,]) q = theta.[q,0]
let sigma (theta : float[,]) q = theta.[q,1]
let likelihood getDrift getVol dt parameters state observation =
let mu = getDrift parameters state
let sigma = Math.Abs (getVol parameters state:float)
let sqrt_dt = Math.Sqrt dt
let residueSquared =
let r = Likelihood.normalizedResidue mu sigma dt sqrt_dt observation in r*r
let result = (Math.Exp (-0.5*residueSquared))/(sigma * (Math.Sqrt (2.0*Math.PI*dt)))
if result<0.0 then failwith "Negative density, it certainly shouldn't have happened"
else result
let alphaBeta b (initialPi:float[]) initialA observations= //notation in comments from the Erratum for Rabiner
let T = Array.length observations
let N = Array2D.length1 initialA
let alphaStart = Array.init N (fun i -> initialPi.[i] * (b i observations.[0])) //this contains \bar{\alpha}
let alpha_j_t (previousRow:float[]) t j = (sumTo N (fun i -> previousRow.[i]*initialA.[i, j]))* (b j observations.[t]) //this contains \bar{\alpha}
let alphaInductionStep t previousRow = Array.init N (alpha_j_t previousRow t)
let (alpha, norm) = inductionNormalized alphaStart T alphaInductionStep
let betaStart = Array.init N (fun i -> 1.0/norm.[T-1])
let beta_j_t (nextRow:float[]) t j = (sumTo N (fun i -> initialA.[j, i]*nextRow.[i]*(b i observations.[t+1])))/norm.[t]
let betaInductionStep t nextRow = Array.init N (beta_j_t nextRow t)
let beta = backInduction betaStart T betaInductionStep
(alpha, beta, norm) //c_t = 1/norm_t
let log_P_O norm =
let result = norm |> Seq.sumBy (fun norm_t -> Math.Log norm_t)//c_t = 1/norm_t
if Double.IsNaN result then failwith "log likelihood is NaN"
else result
let gamma (alpha:float[][], beta:float[][], norm:float[]) i t =
alpha.[t].[i]*beta.[t].[i]*norm.[t]
let xi b (initialA:float[,]) (alpha:float[][]) (beta:float[][]) (observations:float[]) i j t =
alpha.[t].[i]*initialA.[i,j]*(b j observations.[t+1])*beta.[t+1].[j]
let oneStep llFunction dt (initialPi, initialA, initialTheta) observations =
let T = Array.length observations
let N = Array2D.length1 initialA
let b = llFunction dt initialTheta
let (alpha, beta, norm) = alphaBeta b initialPi initialA observations
let gamma = gamma (alpha, beta, norm)
let xi = xi b initialA alpha beta observations
let pi = Array.init N (fun i -> gamma i 0) //Rabiner (40a)
let A = //Rabiner (40b)
let A_func i j = (sumTo (T-1) (xi i j))/(sumTo (T-1) (gamma i))
Array2D.init N N A_func
let mean i = (sumTo T (fun t -> (gamma i t) * observations.[t]))/(sumTo T (gamma i))//Rabiner (53)
let var i =
let numerator = sumTo T (fun t -> (gamma i t) * (sqr (observations.[t]-(mean i))))
let denumerator = sumTo T (gamma i)
numerator/denumerator
let mu i = ((mean i) + 0.5*(var i))/dt
let sigma i = Math.Sqrt ((var i)/dt)
let theta = Array2D.init N 2 (fun i k -> if k=0 then mu i else sigma i)
let logLikelihood = log_P_O norm //Rabiner (103)
(logLikelihood, (pi, A, theta))
let print (ll, (pi, A, theta)) =
printfn "pi = %A" pi
printfn "A = %A" A
printfn "theta = %A" theta
printfn "logLikelihood = %f" ll
let baumWelch likelihood dt initialParams observations =
let tolerance = 10e-5
let rec doStep parameters previousLL =
//print (previousLL, parameters)
let (logLikelihood, parameters) = oneStep likelihood dt parameters observations
if Math.Abs(previousLL - logLikelihood) < tolerance then (logLikelihood, parameters)
else doStep parameters logLikelihood
doStep initialParams -10e100
I haven't tried to guess my way through the F#, but here are some observations:
1) How many initial states do you have observations for? If the answer is "just one" then the probability of the observations can be written as P(state 0) P(obs | state is 0) + P(state 1) P(obs | state 1). Depending on which of the two P(obs | state is X) is higher, the maximum likelihood solution will have either P(state 0) = 1 or P(state 1) = 1. I would only expect to see intermediate probabilities for the initial state when it is possible that you are observing observations that derive from a number of different initial states - for instance, if you have more than one stretch of toy data to analyse at the same time.
2) In looking for a bug, it can help to produce toy data where the answer is entirely obvious. If I had n stretches of data of the form {0, 0, 0, 0...} and m stretches of data of the form {1, 1, 1, 1...} I might hope to see state 0 assigned initial probability n/(m +n) - or of course m/(n + m), since the program doesn't know which state I wish to link with which sequence.
3) Another way to check programs is to look for some sort of consistency or conservation check. Since the model of two initial states can be made the same as a model with just one initial state, a special set of transition probabilities for the first observation, and possibly a special dummy first observation, you could check its behaviour with two initial states against its behaviour with just one initial state and some fudging.
My guess is that using only one observation sequence almost always leads to probs converging to 0/1.
After thinking about it for some time, I think the behaviour I described might actually be correct. The reason being that in the observed data this distribution was "used" only once, so intuitively there is not enough statistics to infer the distribution. Having said that I think the algorithm should be able to recover (with reasonable accuracy that is) the actual value of the hidden state variable at time 0 - because this had impact on the whole time series.
How would you make the folowing code functional with the same speed? In general, as an input I have a list of objects containing position coordinates and other stuff and I need to create a 2D array consisting those objects.
let m = Matrix.Generic.create 6 6 []
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
pos |> List.iter (fun (pz,py) ->
let z, y = int pz, int py
m.[z,y] <- (pz,py) :: m.[z,y]
)
It could be probably done in this way:
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
Matrix.generic.init 6 6 (fun z y ->
pos |> List.fold (fun state (pz,py) ->
let iz, iy = int pz, int py
if iz = z && iy = y then (pz,py) :: state else state
) []
)
But I guess it would be much slower because it loops through the whole matrix times the list versus the former list iteration...
PS: the code might be wrong as I do not have F# on this computer to check it.
It depends on the definition of "functional". I would say that a "functional" function means that it always returns the same result for the same parameters and that it doesn't modify any global state (or the value of parameters if they are mutable). I think this is a sensible definition for F#, but it also means that there is nothing "dis-functional" with using mutation locally.
In my point of view, the following function is "functional", because it creates and returns a new matrix instead of modifying an existing one, but of course, the implementation of the function uses mutation.
let performStep m =
let res = Matrix.Generic.create 6 6 []
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
for pz, py in pos do
let z, y = int pz, int py
res.[z,y] <- (pz,py) :: m.[z,y]
res
Mutation-free version:
Now, if you wanted to make the implementation fully functional, then I would start by creating a matrix that contains Some(pz, py) in the places where you want to add the new list element to the element of the matrix and None in all other places. I guess this could be done by initializing a sparse matrix. Something like this:
let sp = pos |> List.map (fun (pz, py) -> int pz, int py, (pz, py))
let elementsToAdd = Matrix.Generic.initSparse 6 6 sp
Then you should be able to combine the original matrix m with the newly created elementsToAdd. This can be certainly done using init (however, having something like map2 would be maybe nicer):
let res = Matrix.init 6 6 (fun i j ->
match elementsToAdd.[i, j], m.[i, j] with
| Some(n), res -> n::res
| _, res -> res )
There is still quite likely some mutation hidden in the F# library functions (such as init and initSparse), but at least it shows one way to implement the operation using more primitive operations.
EDIT: This will work only if you need to add at most single element to each matrix cell. If you wanted to add multiple elements, you'd have to group them first (e.g. using Seq.groupBy)
You can do something like this:
[1.3, 4.3; 5.6, 5.4; 1.5, 4.8]
|> Seq.groupBy (fun (pz, py) -> int pz, int py)
|> Seq.map (fun ((pz, py), ps) -> pz, py, ps)
|> Matrix.Generic.initSparse 6 6
But in your question you said:
How would you make the folowing code functional with the same speed?
And in a later comment you said:
Well, I try to avoid mutability so that the code would be simple to paralelize in the future
I am afraid this is a triumph of hope over reality. Functional code generally has poor absolute performance and scales badly when parallelized. Given the huge amount of allocation this code is doing, you're not likely to see any performance gain from parallelism at all.
Why do you want to do it functionally? The Matrix type is designed to be mutated, so the way you're doing it now looks good to me.
If you really want to do it functionally, though, here's what I'd do:
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
let addValue m k v =
if Map.containsKey k m then
Map.add k (v::m.[k]) m
else
Map.add k [v] m
let map =
pos
|> List.map (fun (x,y) -> (int x, int y),(x,y))
|> List.fold (fun m (p,q) -> addValue m p q) Map.empty
let m = Matrix.Generic.init 6 6 (fun x y -> if (Map.containsKey (x,y) map) then map.[x,y] else [])
This runs through the list once, creating an immutable map from indices to lists of points. Then, we initialize each entry in the matrix, doing a single map lookup for each entry. This should take total time O(M + N log N) where M and N are the number of entries in your matrix and list respectively. I believe that your original solution using mutation takes O(M+N) time and your revised solution takes O(M*N) time.