From an unordered list of ints, I want to find the smallest difference between any two elements. I have code that works but is way too slow. Can anyone suggest changes to improve the performance? Please explain why you made each change and what the performance gain will be.
let allInt = [ 5; 8; 9 ]
let sortedList = allInt |> List.sort
let N = List.length sortedList
let differenceList = [ for a in 0 .. N-2 do yield sortedList.Item(a + 1) - sortedList.Item(a) ]
printfn "%i" (List.min differenceList) // prints 1 (because 9 - 8 is the smallest difference)
I think I'm doing too much list creation or iteration, but I don't know how to write it differently in F#... yet.
Edit: I'm testing this code on list with 100 000 items or more.
Edit 2: I believe that if I could calculate the differences and find the minimum in one go, it should improve the performance a lot, but I don't know how to do that. Any ideas?
Thanks in advance
List.Item runs in O(n) time and is probably the main performance bottleneck in your code. The evaluation of differenceList accesses the elements of sortedList by index, which means the performance is around O((N-2)(2(N-2))), which simplifies to O(N^2), where N is the number of elements in sortedList. For long lists, this will eventually perform badly.
What I would do is eliminate the calls to Item and use the List.pairwise operation instead:
let data =
    [ let rnd = System.Random()
      for i in 1..100000 do yield rnd.Next() ]
#time
let result =
    data
    |> List.sort
    |> List.pairwise // convert list from [a;b;c;...] to [(a,b); (b,c); ...]
    |> List.map (fun (a,b) -> a - b |> abs) // Calculates the absolute difference
    |> List.min
#time
The #time directive lets me measure execution time in F# Interactive. The output I get when running this code is:
--> Timing now on
Real: 00:00:00.029, CPU: 00:00:00.031, GC gen0: 1, gen1: 1, gen2: 0
val result : int = 0
--> Timing now off
F#'s built-in list type is implemented as a linked list, which means accessing an element by index has to walk the list all the way to that index each time. In your case you have two index accesses repeated N-1 times, getting slower and slower with each iteration, because as the index grows each access has to walk a longer part of the list.
The first way out of this is to use an array instead of a list, which is a trivial change but gives you constant-time index access.
(*
   [| and |] let you define an array literal,
   alternatively use List.toArray allInt
*)
let allInt = [| 5; 8; 9 |]
let sortedArray = allInt |> Array.sort
// subtract each element from its right-hand neighbour so the differences are non-negative
let differenceList = [ for a in 0 .. sortedArray.Length - 2 do yield sortedArray.[a + 1] - sortedArray.[a] ]
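To illustrate the array approach carried all the way through to the minimum, here is a minimal sketch. The helper name minDifference is made up for this example, and it assumes the differences fit in an int (no overflow handling).

```fsharp
// Sketch: smallest difference between any two elements, via a sorted array.
// `minDifference` is a hypothetical helper name, not from the original post.
let minDifference (xs: int list) =
    let sorted = xs |> List.toArray |> Array.sort   // copy to an array, then sort
    if sorted.Length < 2 then invalidArg "xs" "need at least two elements"
    let mutable best = System.Int32.MaxValue
    for i in 0 .. sorted.Length - 2 do
        // Array indexing is O(1); only neighbours in sorted order can be closest.
        let d = sorted.[i + 1] - sorted.[i]
        if d < best then best <- d
    best

printfn "%i" (minDifference [ 5; 8; 9 ]) // prints 1
```

The loop computes each difference and tracks the minimum in the same pass, so no intermediate list of differences is built.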
Another approach might be pairing up the neighbours in the list, subtracting them and then finding a min.
let differenceList =
    sortedList
    |> List.pairwise
    |> List.map (fun (x,y) -> y - x) // y follows x in the sorted list, so y - x >= 0
List.pairwise takes a list of elements and returns a list of the neighbouring pairs. E.g. in your example List.pairwise [ 5; 8; 9 ] = [ (5, 8); (8, 9) ], so that you can easily work with the pairs in the next step, the subtraction mapping.
This is better, but the functions from the List module take a list as input and produce a new list as output, so the data is traversed 3 times (once for pairwise, once for map, and once for min at the end). To avoid this, you can use functions from the Seq module, which work with .NET's IEnumerable<'a> interface and allow lazy evaluation, usually resulting in fewer passes.
Fortunately in this case Seq defines alternatives for all the functions we use here, so the next step is trivial:
let differenceSeq =
    sortedList
    |> Seq.pairwise
    |> Seq.map (fun (x,y) -> y - x) // y follows x in the sorted list, so y - x >= 0
let minDiff = Seq.min differenceSeq
This should need only one enumeration of the list (excluding the sorting phase of course).
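The single-enumeration claim can be checked with a small experiment. The sketch below wraps an already sorted sample in a sequence that counts how many elements are read; the counter and the sample values are illustrative and not part of the original code.

```fsharp
// Sketch: count how often the source is read while the Seq pipeline runs.
let pulled = ref 0                      // illustrative counter
let source =
    seq {
        for x in [ 1; 3; 6; 10 ] do     // sample data, already sorted
            pulled.Value <- pulled.Value + 1
            yield x
    }

let minDiff =
    source
    |> Seq.pairwise
    |> Seq.map (fun (x, y) -> y - x)
    |> Seq.min

printfn "min = %d, elements read = %d" minDiff pulled.Value
// prints "min = 2, elements read = 4"
```

Each of the four elements is read exactly once, even though three Seq functions are chained.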
But I cannot guarantee which approach will be fastest. My bet would be on simply using an array instead of the list, but to find out you will have to try it and measure for yourself, on your data and your hardware. The BenchmarkDotNet library can help you with that.
The rest of your question is adequately covered by the other answers, so I won't duplicate them. But nobody has yet addressed the question you asked in your Edit 2. To answer that question, if you're doing a calculation and then want the minimum result of that calculation, you want List.minBy. One clue that you want List.minBy is when you find yourself doing a map followed by a min operation (as both the other answers are doing): that's a classic sign that you want minBy, which does that in one operation instead of two.
There's one gotcha to watch out for when using List.minBy: It returns the original value, not the result of the calculation. I.e., if you do ints |> List.pairwise |> List.minBy (fun (a,b) -> abs (a - b)), then what List.minBy is going to return is a pair of items, not the difference. It's written that way because if it gives you the original value but you really wanted the result, you can always recalculate the result; but if it gave you the result and you really wanted the original value, you might not be able to get it. (Was that difference of 1 the difference between 8 and 9, or between 4 and 5?)
So in your case, you could do:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> List.sort // neighbours only give the smallest difference once the list is sorted
    |> List.pairwise
    |> List.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
The List.minBy operation also exists on sequences, so if your list is large enough that you want to avoid creating an intermediate list of pairs, then use Seq.pairwise and Seq.minBy instead:
let allInt = [5; 8; 9]
let minPair =
    allInt
    |> Seq.sort // neighbours only give the smallest difference once the input is sorted
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
let a, b = minPair
let minDifference = abs (a - b)
printfn "The difference between %d and %d was %d" a b minDifference
EDIT: Yes, I see that you've got a list of 100,000 items. So you definitely want the Seq version of this. The F# seq type is just IEnumerable, so if you're used to C#, think of the Seq functions as LINQ expressions and you'll have the right idea.
P.S. One thing to note here: see how I'm doing let a, b = minPair? That's called destructuring assignment, and it's really useful. I could also have done this:
let a, b =
    allInt
    |> Seq.sort
    |> Seq.pairwise
    |> Seq.minBy (fun (x,y) -> abs (x - y))
and it would have given me the same result. Seq.minBy returns a tuple of two integers, and the let a, b = (tuple of two integers) expression takes that tuple, matches it against the pattern a, b, and thus assigns a to have the value of that tuple's first item, and b to have the value of that tuple's second item. Notice how I used the phrase "matches it against the pattern": this is the exact same thing as when you use a match expression. Explaining match expressions would make this answer too long, so I'll just point you to an excellent reference on them if you haven't already read it:
https://fsharpforfunandprofit.com/posts/match-expression/
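For a quick taste before following the link, here is a minimal sketch of the same tuple pattern used inside a match expression. The function name describe and its messages are invented for this illustration.

```fsharp
// Sketch: the tuple pattern from `let a, b = ...`, used inside a match expression.
// `describe` is a hypothetical helper for illustration only.
let describe (pair: int * int) =
    match pair with
    | (a, b) when a = b -> "identical values"
    | (a, b) -> sprintf "%d and %d differ by %d" a b (abs (a - b))

printfn "%s" (describe (8, 9)) // prints "8 and 9 differ by 1"
```

The first case adds a guard (when a = b) on top of the pattern; the second case matches every remaining pair.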
Here is my solution:
let minPair xs =
    let foo (x, y) = abs (x - y)
    xs
    |> List.allPairs xs
    |> List.filter (fun (x, y) -> x <> y)
    |> List.minBy foo
    |> foo
In Matlab we can write the following code to shuffle a matrix:
data = data(:, randperm(size(data,2)));
Here is what I wrote with Math.NET:
let csvfile = @"../UFLDL-tutorial-F#/housing.csv"
let housingAsLines =
    File.ReadAllLines(csvfile)
    |> Array.map (fun t ->
        t.Split(',')
        |> Array.map (fun t -> float t))
let housingAsMatrix = DenseMatrix.OfRowArrays housingAsLines
let housingAsMatrixT = housingAsMatrix.Transpose()
let v1 = DenseVector.Create(housingAsMatrixT.ColumnCount,1.0)
housingAsMatrixT.InsertRow(0,v1)
// How to shuffle a "DenseMatrix" in F#
To mimic Matlab's matrix operations, I tried the F# slice syntax with zero-based indexing, but it doesn't work.
housingAsMatrixT.[*,0]
I get this error message in VS Code:
The field, constructor or member 'GetSlice' is not defined
You actually have two questions, 1) how to slice Matrices ala matlab and 2) how to shuffle the columns of a matrix.
For 1) actually Issue 277 you linked in the comment does indeed provide the solution. However you might be using an old version or you might not be referencing the F# extensions correctly:
#r @"..\packages\MathNet.Numerics.3.13.1\lib\net40\MathNet.Numerics.dll"
#r @"..\packages\MathNet.Numerics.FSharp.3.13.1\lib\net40\MathNet.Numerics.FSharp.dll"
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.Distributions
open System
//let m = DenseMatrix.randomStandard<float> 5 5
let m = DenseMatrix.random<float> 5 5 (ContinuousUniform(0., 1.))
let m' = m.[*,0]
m'
//val it : Vector<float> =
//seq [0.4710989485; 0.2220238937; 0.566367266; 0.2356496324; ...]
This extracts the first column of the matrix.
Now for 2), assuming you need to shuffle the matrix or the arrays containing a matrix you can use some of the approaches below. There might be a more elegant method within mathnet.numerics.
To permute the vector above, use m'.SelectPermutation(), or SelectPermutationInplace for arrays. There are other convenience functions such as .Column(idx), .EnumerateColumnsIndexed() and EnumerateColumns().
So m'.SelectPermutation() will shuffle the elements of m'. Or to shuffle the columns (which your matlab function does):
let idx = Combinatorics.GeneratePermutation 5
idx
//val it : int [] = [|2; 0; 1; 4; 3|]
let m2 = idx |> Seq.map (fun i -> m.Column(i)) |> DenseMatrix.ofColumnSeq
m2.Column(1) = m.Column(0)
//val it : bool = true
Since the first column of the original matrix moved to the second column of the new matrix, the two should be equal.
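The same column permutation can be sketched without Math.NET, using a plain 2D array. This only illustrates the indexing; permuteColumns is a hypothetical helper and the fixed index array stands in for a randomly generated permutation.

```fsharp
// Sketch with plain 2D arrays (no Math.NET): build a new matrix whose
// column j is column idx.[j] of the input. `permuteColumns` is hypothetical.
let permuteColumns (idx: int[]) (m: float[,]) =
    Array2D.init (Array2D.length1 m) (Array2D.length2 m) (fun i j -> m.[i, idx.[j]])

let original = array2D [ [ 1.0; 2.0; 3.0 ]; [ 4.0; 5.0; 6.0 ] ]
// A fixed permutation for the example; in practice it would be generated randomly.
let shuffledCols = permuteColumns [| 2; 0; 1 |] original
printfn "%A" shuffledCols
// rows are now [3.0; 1.0; 2.0] and [6.0; 4.0; 5.0]
```

This mirrors the Matlab expression data(:, perm): the permutation picks, for each output column, which input column to copy.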
With Neural Networks I had to shuffle an array of Matrices and used the following code. Note the base data structure is an array ([]) and each item in the array is a Matrix. This is not shuffling a matrix, but an array. It should give you some idea of how to proceed for your problem.
type Random() =
    static member Shuffle (a : 'a[]) =
        let rand = new System.Random()
        let swap (a: _[]) x y =
            let tmp = a.[x]
            a.[x] <- a.[y]
            a.[y] <- tmp
        Array.iteri (fun i _ -> swap a i (rand.Next(i, Array.length a))) a
and called it like
Random.Shuffle trainingData
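If you would rather not modify the input array, a variant that shuffles a copy is easy to sketch. The name shuffleCopy and the seed 42 are illustrative choices, not part of the code above.

```fsharp
// Sketch: Fisher-Yates shuffle that leaves the input untouched and returns a copy.
// `shuffleCopy` and the seed 42 are illustrative, not from the original answer.
let shuffleCopy (rng: System.Random) (items: 'a[]) =
    let a = Array.copy items
    for i in 0 .. a.Length - 2 do
        // Pick a partner from the not-yet-fixed tail (i inclusive, Length exclusive).
        let j = rng.Next(i, a.Length)
        let tmp = a.[i]
        a.[i] <- a.[j]
        a.[j] <- tmp
    a

let input = [| 1 .. 10 |]
let shuffled = shuffleCopy (System.Random 42) input
printfn "%A" shuffled
```

Passing the random generator in as an argument makes the shuffle reproducible when a seed is supplied.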
Addendum
Here is the code to convert a byte[] to a DenseMatrix of double
let byteArrayToMatrix (bytes : byte[]) : Matrix<double> =
    let (x : Vector<byte>) = Vector<byte>.Build.DenseOfArray bytes
    let (y : Vector<double>) = x.Map(fun x -> double x)
    let (z : Matrix<double>) = Matrix<double>.Build.DenseOfRowVectors y
    z
@GuyCoder and @s952163, thanks for your help. I implemented a quick-and-dirty version. It is not good enough, but it works.
Please feel free to comment. Thank you.
#load "../packages/FsLab.1.0.2/FsLab.fsx"
open System
open System.IO
open MathNet.Numerics.LinearAlgebra.Double
open MathNet.Numerics
open MathNet.Numerics.LinearAlgebra
open MathNet.Numerics.Distributions
// implementation of the Fisher-Yates shuffle by Mathias
// http://www.clear-lines.com/blog/post/Optimizing-some-old-F-code.aspx
let swap fst snd i =
    if i = fst then snd else
    if i = snd then fst else
    i
let shuffle items (rng: Random) =
    let rec shuffleTo items upTo =
        match upTo with
        | 0 -> items
        | _ ->
            let fst = rng.Next(upTo)
            let shuffled = List.permute (swap fst (upTo - 1)) items
            shuffleTo shuffled (upTo - 1)
    let length = List.length items
    shuffleTo items length
let csvfile = @"/eUSB/sync/fsharp/UFLDL-tutorial-F#/housing.csv"
let housingAsLines =
    File.ReadAllLines(csvfile)
    |> Array.map (fun t ->
        t.Split(',')
        |> Array.map (fun t -> float t))
let housingAsMatrix = DenseMatrix.OfRowArrays housingAsLines
let housingAsMatrixTmp = housingAsMatrix.Transpose()
let v1 = DenseVector.Create(housingAsMatrixTmp.ColumnCount,1.0)
let housingAsMatrixT = housingAsMatrixTmp.InsertRow(0,v1)
let m = housingAsMatrixT.RowCount - 1
let listOfArray = [0..m]
let random = new Random()
let shuffled = shuffle listOfArray random
let z = [for i in shuffled -> (housingAsMatrixT.[i, *])]
let final = DenseMatrix.OfRowVectors z
Suppose I have the following matrix:
The matrix can be broken down into chunks such that each chunk must, for all rows, have the same number of columns where the value is marked true for that row.
For example, the following chunk is valid:
This means that rows do not have to be contiguous.
Columns do not have to be contiguous either, as the following is a valid chunk:
However, the following is invalid:
That said, what is an algorithm that can be used to select chunks such that the minimal number of chunks will be used when finding all the chunks?
Given the example, above, the proper solution is (items with the same color represent a valid chunk):
In the above example, three is the minimal number of chunks that this can be broken down into.
Note that the following is also a valid solution:
There's not a preference to the solutions, really, just to get the least number of chunks.
I thought of counting using adjacent cells, but that doesn't account for the fact that the column values don't have to be contiguous.
I believe the key lies in finding the chunks with the largest area given the constraints, removing those items, and then repeating.
Taking that approach, the solution is:
But how to traverse the matrix and find the largest area is eluding me.
Also note, that if you want to reshuffle the rows and/or columns during the operations, that's a valid operation (in order to find the largest area), but I'd imagine you can only do it after you remove the largest areas from the matrix (after one area is found and moving onto the next).
You are doing circuit minimization on a truth table. For 4x4 truth tables, you can use a K map. The Quine-McCluskey algorithm is a generalization that can handle larger truth tables.
Keep in mind the problem is NP-Hard, so depending on the size of your truth tables, this problem can quickly grow to a size that is intractable.
This problem is strongly related to Biclustering, for which there are many efficient algorithms (and freely available implementations). Usually you will have to specify the number K of clusters you expect to find; if you don't have a good idea what K should be, you can proceed by binary search on K.
In case the biclusters don't overlap, you are done, otherwise you need to do some geometry to cut them into "blocks".
The solution I propose is fairly straightforward, but very time consuming.
It can be decomposed in 4 major steps:
find all the existing patterns in the matrix,
find all the possible combinations of these patterns,
remove all the incomplete pattern sets,
scan the remaining list to get the set with the minimum number of elements
First off, the algorithm below works on either column-major or row-major matrices. I chose columns for the explanation, but you may swap them for rows at your convenience, as long as you remain consistent across the whole process.
The sample code accompanying the answer is in OCaml, but doesn't use any specific feature of the language, so it should be easy to port to other ML dialects.
Step 1:
Each column can be seen as a bit vector. Observe that a pattern (what you call a chunk in your question) can be constructed by intersecting (i.e. AND-ing) all the columns, or all the rows, composing it, or even a combination of both. So the first step is really about producing all the combinations of rows and columns (the powerset of the matrix's rows and columns, if you will), intersecting them at the same time, and filtering out the duplicates.
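To make the bit-vector idea concrete, here is a small sketch in F# (another ML dialect) rather than in the OCaml used below. It stores each column as an int bit mask; the sample columns and the helper names are assumptions made for the illustration.

```fsharp
// Sketch: each column of a small boolean matrix stored as an int bit mask
// (bit r set = row r is true). All names and values here are illustrative.
let columns = [| 0b0110; 0b0111; 0b0011 |]

// Intersecting columns = bitwise AND; the surviving bits are the rows
// that are true in every selected column.
let intersect (cols: int list) =
    cols |> List.map (fun c -> columns.[c]) |> List.reduce (&&&)

// Count the set bits, i.e. the number of rows in the resulting pattern.
let rec popCount n = if n = 0 then 0 else (n &&& 1) + popCount (n >>> 1)

let shared = intersect [ 0; 1 ]   // 0b0110: rows 1 and 2
printfn "rows mask = %d, height = %d" shared (popCount shared)
// prints "rows mask = 6, height = 2"
```

A pattern is then the pair (set of chosen columns, mask of shared rows), and a mask of 0 means the chosen columns have no row in common.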
We consider the following interface for a matrix datatype:
module type MATRIX = sig
type t
val w : int (* the width of the matrix *)
val h : int (* the height ........ *)
val get : t -> int -> int -> bool (* cell value getter *)
end
Now let's have a look at this step's code:
let clength = M.h
let rlength = M.w
(* the vector datatype used throughout the algorithm;
operators on this type are in the module V *)
type vector = V.t
(* a pattern description and comparison operators *)
module Pattern = struct
type t = {
w : int; (* width of the pattern *)
h : int; (* height of the pattern *)
rows : vector; (* which rows of the matrix are used *)
cols : vector; (* which columns... *)
}
let compare a b = Pervasives.compare a b
let equal a b = compare a b = 0
end
(* pattern set : let us store patterns without duplicates *)
module PS = Set.Make(Pattern)
(* a simple recursive loop applying f for k, k-1, ..., 0 *)
let rec fold f acc k =
if k < 0
then acc
else fold f (f acc k) (pred k)
(* extract a column/row of the given matrix *)
let cr_extract mget len =
fold (fun v j -> if mget j then V.set v j else v) (V.null len) (pred len)
let col_extract m i = cr_extract (fun j -> M.get m i j) clength
let row_extract m i = cr_extract (fun j -> M.get m j i) rlength
(* encode a single column as a pattern *)
let col_encode c i =
{ w = 1; h = count c; rows = V.set (V.null clength) i; cols = c }
let row_encode r i =
{ h = 1; w = count r; cols = V.set (V.null rlength) i; rows = r }
(* try to add a column to a pattern *)
let col_intersect p c i =
let col = V.l_and p.cols c in
let h = V.count col in
if h > 0
then
let row = V.set (V.copy p.rows) i in
Some {w = V.count row; h = h; rows = row; cols = col}
else None
let row_intersect p r i =
let row = V.l_and p.rows r in
let w = V.count row in
if w > 0
then
let col = V.set (V.copy p.cols) i in
Some { w = w; h = V.count col; rows = row; cols = col }
else None
let build_patterns m =
let bp k ps extract encode intersect =
let build (l,k) =
let c = extract m k in
let u = encode c k in
let fld p ps =
match intersect p c k with
None -> ps (* nothing in common: keep the set accumulated so far *)
| Some npc -> PS.add npc ps
in
PS.fold fld (PS.add u l) l, succ k
in
fst (fold (fun res _ -> build res) (ps, 0) k)
in
let ps = bp (pred rlength) PS.empty col_extract col_encode col_intersect in
let ps = bp (pred clength) ps row_extract row_encode row_intersect in
PS.elements ps
The V module must comply with the following signature for the whole algorithm:
module type V = sig
type t
val null : int -> t (* the null vector, ie. with all entries equal to false *)
val copy : t -> t (* copy operator *)
val get : t -> int -> bool (* get the nth element *)
val set : t -> int -> t (* set the nth element to true *)
val l_and : t -> t -> t (* intersection operator, ie. logical and *)
val l_or : t -> t -> t (* logical or *)
val count : t -> int (* number of elements set to true *)
val equal : t -> t -> bool (* equality predicate *)
end
Step 2:
Combining the patterns can also be seen as a powerset construction, with one restriction: a valid pattern set may only contain patterns which don't overlap. Two patterns overlap if they contain at least one common matrix cell.
With the pattern data structure used above, the overlap predicate is quite simple:
let overlap p1 p2 =
let nullc = V.null h
and nullr = V.null w in
let o v1 v2 n = not (V.equal (V.l_and v1 v2) n) in
o p1.rows p2.rows nullr && o p1.cols p2.cols nullc
The cols and rows of the pattern record indicate which coordinates in the matrix are included in the pattern. Thus a logical and on both fields will tell us if the patterns overlap.
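The predicate can also be sketched with index sets, which makes the reasoning easy to test. The field names mirror the OCaml record, but the Set-based representation and the sample patterns are assumptions made for brevity.

```fsharp
// Sketch: a pattern as two index sets. Field names mirror the OCaml record,
// but using Set<int> instead of bit vectors is an assumption for brevity.
type Pattern = { rows: Set<int>; cols: Set<int> }

// Two patterns share a cell only if they share at least one row index
// AND at least one column index.
let overlap p1 p2 =
    let sharedRows = Set.intersect p1.rows p2.rows
    let sharedCols = Set.intersect p1.cols p2.cols
    not (Set.isEmpty sharedRows) && not (Set.isEmpty sharedCols)

let a = { rows = set [ 0; 1 ]; cols = set [ 0; 1 ] }
let b = { rows = set [ 1; 2 ]; cols = set [ 1; 2 ] }   // shares cell (1, 1) with a
let c = { rows = set [ 0; 1 ]; cols = set [ 2; 3 ] }   // same rows as a, disjoint columns
printfn "%b %b" (overlap a b) (overlap a c)            // prints "true false"
```

Sharing rows alone is not enough: a and c use the same rows but different columns, so they cover different cells.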
For including a pattern in a pattern set, we must ensure that it does not overlap with any pattern of the set.
type pset = {
n : int; (* number of patterns in the set *)
pats : pattern list;
}
let overlap sp p =
List.exists (fun x -> overlap x p) sp.pats
let scombine sp p =
if overlap sp p
then None
else Some {
n = sp.n + 1;
pats = p::sp.pats;
}
let build_pattern_sets l =
let pset l p =
let sp = { n = 1; pats = [p] } in
List.fold_left (fun l spx ->
match scombine spx p with
None -> l
| Some nsp -> nsp::l
) (sp::l) l
in List.fold_left pset [] l
This step produces a lot of sets, and thus is very memory and computation intensive. It's certainly the weak point of this solution, but I don't see yet how to reduce the fold.
Step 3:
A pattern set is incomplete if when rebuilding the matrix with it, we do not obtain the original one. So the process is rather simple.
let build_matrix ps =
let add m p =
let rec add_col p i = function
| [] -> []
| c::cs ->
let c =
if V.get p.rows i
then V.l_or c p.cols
else c
in c::(add_col p (succ i) cs)
in add_col p 0 m
in
(* null matrix as a list of null vectors *)
let m = fold (fun l _ -> V.null clength::l) [] (pred rlength) in
List.fold_left add m ps.pats
let drop_incomplete_sets m l =
(* convert the matrix to a list of columns *)
let m' = fold (fun l k -> col_extract m k ::l) [] (pred rlength) in
let complete m sp =
let m' = build_matrix sp in
m = m'
in List.filter (fun x -> complete m' x) l
Step 4:
The last step is just selecting the set with the smallest number of elements:
let smallest_set l =
let smallest ps1 ps2 = if ps1.n < ps2.n then ps1 else ps2 in
match l with
| [] -> assert false (* there should be at least 1 solution *)
| h::t -> List.fold_left smallest h t
The whole computation is then just a chaining of the steps:
let compute m =
let (|>) f g = g f in
build_patterns m |> build_pattern_sets |> drop_incomplete_sets m |> smallest_set
Notes
The algorithm above constructs a powerset of a powerset, with some limited filtering. As far as I know there is no way to reduce the search (as mentioned in a comment, if this is an NP-hard problem, none is known).
This algorithm checks all the possible solutions and correctly returns an optimal one (tested with many matrices, including the one given in the problem description).
One quick remark regarding the heuristic you propose in your question: it could be easily implemented using the first step, removing the largest pattern found, and recursing. That would yield a solution much more rapidly than my algorithm. However, the solution found may not be optimal.
For instance, consider the following matrix:
.x...
.xxx
xxx.
...x.
The central 4-cell chunk is the largest that can be found, but the set using it would comprise 5 patterns in total.
.1...
.223
422.
...5.
Yet this solution uses only 4:
.1...
.122
334.
...4.
Update:
Link to the full code I wrote for this answer.
I have implemented the Baum-Welch algorithm and I am playing with some toy data generated from a known distribution. The data is normally distributed, with a different mean and standard deviation depending on the hidden state. There are 2 states. The algorithm seems to converge for most of the parameters, apart from the initial distribution of the hidden state, which always converges to either (0; 1) or (1; 0) depending on the random data.
Is this normal for this algorithm? If so, I would appreciate some references; if not, some hints on how to find the bug.
Code (F#). First a helper module:
module MyMath

let sqr (x:float) = x*x
let inline (./) (array:float[]) (d:float) =
    Array.map (fun x -> x/d) array
let inline (.*) (array:float[]) (d:float) =
    Array.map (fun x -> x*d) array
let map f s =
    s |> Seq.map f |> Seq.toArray
let normalize v =
    let sum = Seq.sum v
    map (fun x -> x/sum) v
let row i array = seq { for j in 0 .. (Array2D.length2 array)-1 do yield array.[i,j]}
let column j array = seq { for i in 0 .. (Array2D.length1 array)-1 do yield array.[i,j]}
let sum (v:float[]) = v |> Array.sum
let sumTo N (f:int->float) = Seq.init N f |> Seq.sum
let sum_column j (array:float[,]) = column j array |> Seq.sum
let sum_row i (array:float[,]) = row i array |> Seq.sum
let mean data = (sum data)/(float (Array.length data))
let var data =
    let m=mean data
    let N=Array.length data
    let sum=Seq.sumBy (fun x -> sqr(x-m)) data // squared deviation from the mean
    sum/(float N)

let induction start T nextRow =
    let result = Array.zeroCreate T
    result.[0] <- start
    for t=1 to T-1 do
        result.[t] <- nextRow t result.[t-1]
    result

let backInduction last T previousRow =
    let result = Array.zeroCreate T
    result.[T-1] <- last
    for t=T-2 downto 0 do
        result.[t] <- previousRow t result.[t+1]
    result

let inductionNormalized start T nextRow =
    let result = Array.zeroCreate T
    let norm = Array.zeroCreate T
    norm.[0] <- sum start
    result.[0] <- start./norm.[0]
    for t=1 to T-1 do
        result.[t] <- nextRow t result.[t-1]
        norm.[t] <- sum result.[t]
        result.[t] <- result.[t]./norm.[t]
    (result, norm)
The main module:
module BaumWelch

open System
open MyMath

let mu (theta : float[,]) q = theta.[q,0]
let sigma (theta : float[,]) q = theta.[q,1]

let likelihood getDrift getVol dt parameters state observation =
    let mu = getDrift parameters state
    let sigma = Math.Abs (getVol parameters state:float)
    let sqrt_dt = Math.Sqrt dt
    let residueSquared =
        let r = Likelihood.normalizedResidue mu sigma dt sqrt_dt observation in r*r
    let result = (Math.Exp (-0.5*residueSquared))/(sigma * (Math.Sqrt (2.0*Math.PI*dt)))
    if result<0.0 then failwith "Negative density, it certainly shouldn't have happened"
    else result

let alphaBeta b (initialPi:float[]) initialA observations= //notation in comments from the Erratum for Rabiner
    let T = Array.length observations
    let N = Array2D.length1 initialA
    let alphaStart = Array.init N (fun i -> initialPi.[i] * (b i observations.[0])) //this contains \bar{\alpha}
    let alpha_j_t (previousRow:float[]) t j = (sumTo N (fun i -> previousRow.[i]*initialA.[i, j]))* (b j observations.[t]) //this contains \bar{\alpha}
    let alphaInductionStep t previousRow = Array.init N (alpha_j_t previousRow t)
    let (alpha, norm) = inductionNormalized alphaStart T alphaInductionStep
    let betaStart = Array.init N (fun i -> 1.0/norm.[T-1])
    let beta_j_t (nextRow:float[]) t j = (sumTo N (fun i -> initialA.[j, i]*nextRow.[i]*(b i observations.[t+1])))/norm.[t]
    let betaInductionStep t nextRow = Array.init N (beta_j_t nextRow t)
    let beta = backInduction betaStart T betaInductionStep
    (alpha, beta, norm) //c_t = 1/norm_t

let log_P_O norm =
    let result = norm |> Seq.sumBy (fun norm_t -> Math.Log norm_t)//c_t = 1/norm_t
    if Double.IsNaN result then failwith "log likelihood is NaN"
    else result

let gamma (alpha:float[][], beta:float[][], norm:float[]) i t =
    alpha.[t].[i]*beta.[t].[i]*norm.[t]

let xi b (initialA:float[,]) (alpha:float[][]) (beta:float[][]) (observations:float[]) i j t =
    alpha.[t].[i]*initialA.[i,j]*(b j observations.[t+1])*beta.[t+1].[j]

let oneStep llFunction dt (initialPi, initialA, initialTheta) observations =
    let T = Array.length observations
    let N = Array2D.length1 initialA
    let b = llFunction dt initialTheta
    let (alpha, beta, norm) = alphaBeta b initialPi initialA observations
    let gamma = gamma (alpha, beta, norm)
    let xi = xi b initialA alpha beta observations
    let pi = Array.init N (fun i -> gamma i 0) //Rabiner (40a)
    let A = //Rabiner (40b)
        let A_func i j = (sumTo (T-1) (xi i j))/(sumTo (T-1) (gamma i))
        Array2D.init N N A_func
    let mean i = (sumTo T (fun t -> (gamma i t) * observations.[t]))/(sumTo T (gamma i))//Rabiner (53)
    let var i =
        let numerator = sumTo T (fun t -> (gamma i t) * (sqr (observations.[t]-(mean i))))
        let denumerator = sumTo T (gamma i)
        numerator/denumerator
    let mu i = ((mean i) + 0.5*(var i))/dt
    let sigma i = Math.Sqrt ((var i)/dt)
    let theta = Array2D.init N 2 (fun i k -> if k=0 then mu i else sigma i)
    let logLikelihood = log_P_O norm //Rabiner (103)
    (logLikelihood, (pi, A, theta))

let print (ll, (pi, A, theta)) =
    printfn "pi = %A" pi
    printfn "A = %A" A
    printfn "theta = %A" theta
    printfn "logLikelihood = %f" ll

let baumWelch likelihood dt initialParams observations =
    let tolerance = 10e-5
    let rec doStep parameters previousLL =
        //print (previousLL, parameters)
        let (logLikelihood, parameters) = oneStep likelihood dt parameters observations
        if Math.Abs(previousLL - logLikelihood) < tolerance then (logLikelihood, parameters)
        else doStep parameters logLikelihood
    doStep initialParams -10e100
I haven't tried to guess my way through the F#, but here are some observations:
1) How many initial states do you have observations for? If the answer is "just one" then the probability of the observations can be written as P(state 0) P(obs | state is 0) + P(state 1) P(obs | state 1). Depending on which of the two P(obs | state is X) is higher, the maximum likelihood solution will have either P(state 0) = 1 or P(state 1) = 1. I would only expect to see intermediate probabilities for the initial state when it is possible that you are observing observations that derive from a number of different initial states - for instance, if you have more than one stretch of toy data to analyse at the same time.
2) In looking for a bug, it can help to produce toy data where the answer is entirely obvious. If I had n stretches of data of the form {0, 0, 0, 0...} and m stretches of data of the form {1, 1, 1, 1...} I might hope to see state 0 assigned initial probability n/(m +n) - or of course m/(n + m), since the program doesn't know which state I wish to link with which sequence.
3) Another way to check programs is to look for some sort of consistency or conservation check. Since the model of two initial states can be made the same as a model with just one initial state, a special set of transition probabilities for the first observation, and possibly a special dummy first observation, you could check its behaviour with two initial states against its behaviour with just one initial state and some fudging.
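Point 1) can be illustrated numerically. With a single sequence the likelihood is linear in p = P(state 0), so its maximum over [0, 1] sits at an endpoint. In the sketch below the two conditional likelihoods are made-up numbers and the function names are hypothetical.

```fsharp
// Sketch: with one observed sequence, the likelihood as a function of p = P(state 0) is
//   L(p) = p * l0 + (1 - p) * l1,
// where l0 and l1 (made-up numbers below) stand for P(obs | initial state 0 / 1).
let likelihood l0 l1 p = p * l0 + (1.0 - p) * l1

// Scan a grid of candidate values for p and keep the one with the highest likelihood.
let bestP l0 l1 =
    [ for k in 0 .. 10 -> float k / 10.0 ]
    |> List.maxBy (likelihood l0 l1)

printfn "%.1f" (bestP 0.03 0.01)  // l0 > l1: prints 1.0
printfn "%.1f" (bestP 0.01 0.03)  // l0 < l1: prints 0.0
```

Because L(p) is a straight line, no intermediate value of p can beat both endpoints unless l0 and l1 are exactly equal.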
My guess is that using only one observation sequence almost always leads to probs converging to 0/1.
After thinking about it for some time, I think the behaviour I described might actually be correct. The reason being that in the observed data this distribution was "used" only once, so intuitively there is not enough statistics to infer the distribution. Having said that I think the algorithm should be able to recover (with reasonable accuracy that is) the actual value of the hidden state variable at time 0 - because this had impact on the whole time series.
How would you make the following code functional with the same speed? In general, as input I have a list of objects containing position coordinates and other data, and I need to create a 2D array containing those objects.
let m = Matrix.Generic.create 6 6 []
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
pos |> List.iter (fun (pz,py) ->
    let z, y = int pz, int py
    m.[z,y] <- (pz,py) :: m.[z,y]
)
It could be probably done in this way:
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
Matrix.Generic.init 6 6 (fun z y ->
    pos |> List.fold (fun state (pz,py) ->
        let iz, iy = int pz, int py
        if iz = z && iy = y then (pz,py) :: state else state
    ) []
)
But I guess it would be much slower because it loops through the whole matrix times the list versus the former list iteration...
PS: the code might be wrong as I do not have F# on this computer to check it.
It depends on the definition of "functional". I would say that a "functional" function means that it always returns the same result for the same parameters and that it doesn't modify any global state (or the value of parameters if they are mutable). I think this is a sensible definition for F#, but it also means that there is nothing "dis-functional" with using mutation locally.
In my point of view, the following function is "functional", because it creates and returns a new matrix instead of modifying an existing one, but of course, the implementation of the function uses mutation.
let performStep m =
    let res = Matrix.Generic.create 6 6 []
    let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
    for pz, py in pos do
        let z, y = int pz, int py
        res.[z,y] <- (pz,py) :: m.[z,y]
    res
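The same idea of local mutation behind a pure interface can be sketched with a plain 2D array, which runs without any extra library. Using Array2D instead of the Matrix type is an assumption made so the example is self-contained, and bucket is a hypothetical name.

```fsharp
// Sketch using a plain 2D array instead of the Matrix type (an assumption,
// so the example runs without extra references). `bucket` is a hypothetical name.
let bucket (rows: int) (cols: int) (pos: (float * float) list) =
    let grid = Array2D.create rows cols []       // created locally, returned at the end
    for (pz, py) in pos do
        let z, y = int pz, int py
        grid.[z, y] <- (pz, py) :: grid.[z, y]   // the mutation stays inside the function
    grid

let m = bucket 6 6 [ (1.3, 4.3); (5.6, 5.4); (1.5, 4.8) ]
printfn "%A" m.[1, 4]   // prints [(1.5, 4.8); (1.3, 4.3)]
```

Callers always get a fresh grid for the same arguments, so from the outside the function behaves like a pure one.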
Mutation-free version:
Now, if you wanted to make the implementation fully functional, then I would start by creating a matrix that contains Some(pz, py) in the places where you want to add the new list element to the element of the matrix and None in all other places. I guess this could be done by initializing a sparse matrix. Something like this:
let sp = pos |> List.map (fun (pz, py) -> int pz, int py, (pz, py))
let elementsToAdd = Matrix.Generic.initSparse 6 6 sp
Then you should be able to combine the original matrix m with the newly created elementsToAdd. This can be certainly done using init (however, having something like map2 would be maybe nicer):
let res = Matrix.Generic.init 6 6 (fun i j ->
    match elementsToAdd.[i, j], m.[i, j] with
    | Some(n), res -> n::res
    | _, res -> res )
There is still quite likely some mutation hidden in the F# library functions (such as init and initSparse), but at least it shows one way to implement the operation using more primitive operations.
EDIT: This will work only if you need to add at most a single element to each matrix cell. If you wanted to add multiple elements, you'd have to group them first (e.g. using Seq.groupBy).
You can do something like this:
[1.3, 4.3; 5.6, 5.4; 1.5, 4.8]
|> Seq.groupBy (fun (pz, py) -> int pz, int py)
|> Seq.map (fun ((pz, py), ps) -> pz, py, ps)
|> Matrix.Generic.initSparse 6 6
But in your question you said:
How would you make the following code functional with the same speed?
And in a later comment you said:
Well, I try to avoid mutability so that the code would be simple to parallelize in the future
I am afraid this is a triumph of hope over reality. Functional code generally has poor absolute performance and scales badly when parallelized. Given the huge amount of allocation this code is doing, you're not likely to see any performance gain from parallelism at all.
Why do you want to do it functionally? The Matrix type is designed to be mutated, so the way you're doing it now looks good to me.
If you really want to do it functionally, though, here's what I'd do:
let pos = [(1.3,4.3); (5.6,5.4); (1.5,4.8)]
let addValue m k v =
    if Map.containsKey k m then
        Map.add k (v::m.[k]) m
    else
        Map.add k [v] m
let map =
    pos
    |> List.map (fun (x,y) -> (int x, int y),(x,y))
    |> List.fold (fun m (p,q) -> addValue m p q) Map.empty
let m = Matrix.Generic.init 6 6 (fun x y -> if (Map.containsKey (x,y) map) then map.[(x,y)] else [])
This runs through the list once, creating an immutable map from indices to lists of points. Then, we initialize each entry in the matrix, doing a single map lookup for each entry. This should take total time O(M + N log N) where M and N are the number of entries in your matrix and list respectively. I believe that your original solution using mutation takes O(M+N) time and your revised solution takes O(M*N) time.
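For comparison, here is a sketch of the same map-then-initialise idea using List.groupBy and a plain 2D array. Array2D stands in for the Matrix type so that the example is self-contained, and the names byCell and grid are made up.

```fsharp
// Sketch of the same idea with List.groupBy and a plain 2D array
// (Array2D stands in for the Matrix type so the example is self-contained).
let pos = [ (1.3, 4.3); (5.6, 5.4); (1.5, 4.8) ]

// One pass over the list: cell index -> points that fall into that cell.
let byCell =
    pos
    |> List.groupBy (fun (x, y) -> int x, int y)
    |> Map.ofList

// One map lookup per cell; cells without points get the empty list.
let grid =
    Array2D.init 6 6 (fun x y ->
        match Map.tryFind (x, y) byCell with
        | Some points -> points
        | None -> [])

printfn "%A" grid.[1, 4]   // prints [(1.3, 4.3); (1.5, 4.8)]
```

Map.tryFind does the containment check and the lookup in one step, and List.groupBy keeps the points of each cell in their original order.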