Difference Between J48 and Markov Chains - algorithm

I am attempting to do some evaluation of the relative rates of different algorithms in the C# and F# realms using WekaSharp and one of the algorithms I was interested in was Markov Chains. I know Weka has an HMM application but I have not been able to implement this into WekaSharp and was wondering if there was a way to modify the J48 Algorithm to suit this purpose. I know there is some similarity between J48 and first order Markov chains but am trying to determine what needs to be modified and if this is a reasonable thing to do. Here is the J48 as implemented in Yin Zhu's WekaSharp:
type J48() =
static member DefaultPara = "-C 0.25 -M 2"
static member MakePara(?binarySplits, ?confidenceFactor, ?minNumObj, ?unpruned, ?useLaplace) =
let binarySplitsStr =
let b = match binarySplits with
| Some (v) -> v
| None -> false
if not b then "-B" else ""
let confidenceFactorStr =
let c = match confidenceFactor with
| Some (v) -> v
| None -> 0.25 // default confi
"-C " + c.ToString()
let minNumObjStr =
let m = match minNumObj with
| Some (v) -> v
| None -> 2
"-M " + m.ToString()
let unprunedStr =
let u = match unpruned with
| Some (v) -> v
| None -> false
if u then "-U" else ""
let useLaplaceStr =
let u = match useLaplace with
| Some (v) -> v
| None -> false
if u then "-A" else ""
binarySplitsStr + " " + confidenceFactorStr + " " + minNumObjStr + " " + unprunedStr + " " + useLaplaceStr
Thank you very much.

J48 is just an implementation of the C4.5 algorithm that learns decision trees by considering the entropy of each attribute (dimension) and taking the attribute that has maximum entropy as root of the current subtree. This algorithm does not need reinforcement.
I guess that by Markov Chains you mean Hidden Markov Model that is used in reinforcement learning.
You should take a look to HMMWeka.
A related question is:
What is the equivalent for a Hidden Markov Model in the WEKA toolkit?

Related

A specific push down automaton for a language (PDA)

I'm wondering how could I design a pushdown automaton for this specific language.
I can't solve this..
L2 = { u ∈ {a, b}∗ : 3 ∗ |u|a = 2 ∗ |u|b + 1 }
So the number of 'a's multiplied by 3 is equals to number of 'b's multiplied by 2 and added 1.
The grammar corresponding to that language is something like:
S -> ab | ba |B
B -> abB1 | baB1 | aB1b | bB1a | B1ab | B1ba
B1 -> aabbbB1 | baabbB1 | [...] | aabbb | baabb | [...]
S generates the base case (basically strings with #a = 1 = #b) or B
B generates the base case + B1 (in every permutation)
B1 adds 2 'a' and 3 'b' to the base case (in fact if you keep adding this number of 'a' and 'b' the equation 3#a = 2#b + 1 will always be true!). I didn't finish writing B1, basically you need to add every permutation of 2 'a' and 3 'b'. I think you'll be able to do it on your own :)
When you're finished with the grammar, designing the PDA is simple. More info here.
3|u|a = 2|u|b + 1 <=> 3|u|a - 2|u|b = 1
The easiest way to design a PDA for this is to implement this equation directly.
For any string x, let f(x) = 3|x|a - 2|x|b. Then design a PDA such that, after processing any string x:
The stack depth is always equal to abs( floor( f(x)/3 ) );
The symbol on the top of the stack (if any), reflects the sign of floor( f(x)/3 ). You only need 2 kinds of stack symbols
The current state number = f(x) mod 3. Of course you only need 3 states.
From the state number and the symbol on top of the stack, you can detect when f(x) = 1, and at that condition the PDA accepts x as a string in the language.

Re-generate random numbers in a loop

I would like to create a basic genetic algorithm in order to output a set of input to enter in an emulator. Basically, what it does is :
Generate an input sheet
List item
Run said input
slightly modify it
Run it
See whichever input set performed better and "fork" it and repeat until the problem is solved
So : here is my code to generate the first set of inputs :
(* RNG initialization
* unit *)
Random.self_init();;
(* Generating a starting input file
* array
* 500 inputs long *)
let first_input =
let first_array = Array.make 500 "START" in
for i = 1 to 499 do
let input =
match Random.int(5) with
| 0 -> "A "
| 1 -> "B "
| 2 -> "DOWN "
| 3 -> "LEFT "
| 4 -> "RIGHT "
| _ -> "START " in
first_array.(i) <- input
done;
first_array;;
And here is my "mutation" function that randomly alters some inputs :
(* Mutating input_file
* Rate : in percent, must be positive and <= 100
* a must be an array of strings *)
let mutation a n=
let mutation_rate = n in
for i = 0 to ((Array.length(a) * mutation_rate / 100) - 1) do
let input =
match Random.int(5) with
| 0 -> "A "
| 1 -> "B "
| 2 -> "DOWN "
| 3 -> "LEFT "
| 4 -> "RIGHT "
| _ -> "START " in
a.( Random.int(498) + 1) <- input
done;;
However, I don't feel like my function is efficient because I had to paste the pattern matching part in the mutation function and I think there has to be a smarter way to proceed. If I define my "input" function as a global function, then it is only evaluated once (let's say as "RIGHT" and all occurrences of "input" will return "RIGHT" which is not really useful.
Thanks.
There isn't anything wrong with putting that into it's own function. What you are missing is an argument to make the function deal with the side-effect of Random.int. Since you are not using this argument, it's often/always the case people use unit.
let random_input () = match Random.int 5 with
| 0 -> "A "
| 1 -> "B "
| 2 -> "DOWN "
| 3 -> "LEFT "
| 4 -> "RIGHT "
| _ -> "START "
What you are doing here is pattern matching the argument, and since there is only one constructor this matching is exhaustive. But technically, you can replace the () above with an _. This will match anything making the function polymorphic against it's argument, 'a -> string. In this case it's bad form since it may lead to confusion as to what the parameter is for.

Breadth-first Search in F# (BFS)

I want to implement search using BFS. The Algorithm say that i must use a queue to get FIFO effect.
I read Chris Okasaki's Purely Functional Data Structures book and found how to make a queue (i wrote using F#) :
type 'a queue = 'a list * 'a list
let emtpy = [],[]
let isEmpty = function
| [],_ -> true
| _ -> false
let checkf = function
| [],r -> List.rev r,[]
| q -> q
let snoc (f,r) x = checkf (f,x :: r)
let head = function
| ([],_) -> failwith "EMPTY"
| (x::f,r) -> x
let tail = function
| ([],_) -> failwith "EMPTY"
| (x::f,r) -> checkf (f,r)
anyone know how to implement this to BFS?
and i have this code to make a tree from a list:
let data = [4;3;8;7;10;1;9;6;5;0;2]
type Tree<'a> =
| Node of Tree<'a> * 'a * Tree<'a>
| Leaf
let rec insert tree element =
match element,tree with
| x,Leaf -> Node(Leaf,x,Leaf)
| x,Node(l,y,r) when x <= y -> Node((insert l x),y,r)
| x,Node(l,y,r) when x > y -> Node(l,y,(insert r x))
| _ -> Leaf
let makeTree = List.fold insert Leaf data
(want to combine these two codes)
the BFS algorithm is this:
Initialise the search by placing the starting vertex in the queue.
While the queue is not empty.
Remove the front vertex from the queue.
If this is a solution then we're finished -- report success.
Otherwise, compute the immediate children of this vertex and enqueue them.
Otherwise we have exhausted the queue and found no solution -- report failure.
My F# syntax is a bit wobbly, but here's how I'd sketch out the solution:
bfs start = bfsLoop ([start], [])
bfsLoop q0 =
if isEmpty q0
then failWith "No solution"
else v = head q0
if isSolution v
then v
else q1 = tail q0
vs = childrenOf v
q = foldl snoc vs q1
bfsLoop q
Hope this helps.
Might still be useful 11 years later?
BFS in F# is not hard: Instead of a while loop you can use recursion to keep it mutable-free.
I enqueue each node with its trace so we can calculate the solution path.
let data = [4;3;8;7;10;1;9;6;5;0;2]
type Tree<'a> =
| Node of Tree<'a> * 'a * Tree<'a>
| Leaf
let rec insert tree element =
match element,tree with
| x,Leaf -> Node(Leaf,x,Leaf)
| x,Node(l,y,r) when x <= y -> Node((insert l x),y,r)
| x,Node(l,y,r) when x > y -> Node(l,y,(insert r x))
| _ -> Leaf
let tree = List.fold insert Leaf data
// BFS
let rec find goal queue =
match queue with
| [] -> None
| (Leaf, _)::tail -> find goal tail
| (Node (l,y,r), trace)::tail ->
if y = goal then Some (List.rev (y::trace)) else
find goal (tail # [ l, y::trace; r, y::trace ])
// for example, to find the 5 in your tree
find 5 [tree, []]
|> printfn "%A"
// it will return: Some [4; 8; 7; 6; 5]
// because your tree looks like this:
// 4
// / \
// 3 8
// / / \
// 1 7 10
// / \ / /
// 0 2 6 9
// /
// 5

Create (pseudo) Cyclic Discriminated Unions in F#

I've run into a small problem here. I wrote the Tortoise and Hare cycle detection algorithm.
type Node =
| DataNode of int * Node
| LastNode of int
let next node =
match node with
|DataNode(_,n) -> n
|LastNode(_) -> failwith "Error"
let findCycle(first) =
try
let rec fc slow fast =
match (slow,fast) with
| LastNode(a),LastNode(b) when a=b -> true
| DataNode(_,a), DataNode(_,b) when a=b -> true
| _ -> fc (next slow) (next <| next fast)
fc first <| next first
with
| _ -> false
This is working great for
let first = DataNode(1, DataNode(2, DataNode(3, DataNode(4, LastNode(5)))))
findCycle(first)
It shows false. Right. Now when try to test it for a cycle, I'm unable to create a loop!
Obviously this would never work:
let first = DataNode(1, DataNode(2, DataNode(3, DataNode(4, first))))
But I need something of that kind! Can you tell me how to create one?
You can't do this with your type as you've defined it. See How to create a recursive data structure value in (functional) F#? for some alternative approaches which would work.
As an alternative to Brian's solution, you might try something like:
type Node =
| DataNode of int * NodeRec
| LastNode of int
and NodeRec = { node : Node }
let rec cycle = DataNode(1, { node =
DataNode(2, { node =
DataNode(3, { node =
DataNode(4, { node = cycle}) }) }) })
Here is one way:
type Node =
| DataNode of int * Lazy<Node>
| LastNode of int
let next node = match node with |DataNode(_,n) -> n.Value |LastNode(_) -> failwith "Error"
let findCycle(first) =
try
let rec fc slow fast =
match (slow,fast) with
| LastNode(a),LastNode(b) when a=b->true
| DataNode(a,_), DataNode(b,_) when a=b -> true
| _ -> fc (next slow) (next <| next fast)
fc first <| next first
with
| _ -> false
let first = DataNode(1, lazy DataNode(2, lazy DataNode(3, lazy DataNode(4, lazy LastNode(5)))))
printfn "%A" (findCycle(first))
let rec first2 = lazy DataNode(1, lazy DataNode(2, lazy DataNode(3, lazy DataNode(4, first2))))
printfn "%A" (findCycle(first2.Value))
Even though both Brian and kvb posted answers that work, I still felt I needed to see if it was possible to achieve the same thing in a different way. This code will give you a cyclic structure wrapped as a Seq<'a>
type Node<'a> = Empty | Node of 'a * Node<'a>
let cyclic (n:Node<_>) : _ =
let rn = ref n
let rec next _ =
match !rn with
| Empty -> rn := n; next Unchecked.defaultof<_>
| Node(v, x) -> rn := x; v
Seq.initInfinite next
let nodes = Node(1, Node(2, Node(3, Empty)))
cyclic <| nodes |> Seq.take 40 // val it : seq<int> = seq [1; 2; 3; 1; ...]
The structure itself is not cyclic, but it looks like it from the outside.
Or you could do this:
//removes warning about x being recursive
#nowarn "40"
type Node<'a> = Empty | Node of 'a * Lazy<Node<'a>>
let rec x = Node(1, lazy Node(2, lazy x))
let first =
match x with
| Node(1, Lazy(Node(2,first))) -> first.Value
| _ -> Empty
Can you tell me how to create one?
There are various hacks to get a directly cyclic value in F# (as Brian and kvb have shown) but I'd note that this is rarely what you actually want. Directly cyclic data structures are a pig to debug and are usually used for performance and, therefore, made mutable.
For example, your cyclic graph might be represented as:
> Map[1, 2; 2, 3; 3, 4; 4, 1];;
val it : Map<int,int> = map [(1, 2); (2, 3); (3, 4); (4, 1)]
The idiomatic way to represent a graph in F# is to store a dictionary that maps from handles to vertices and, if necessary, another for edges. This approach is much easier to debug because you traverse indirect recursion via lookup tables that are comprehensible as opposed to trying to decipher a graph in the heap. However, if you want to have the GC collect unreachable subgraphs for you then a purely functional alternative to a weak hash map is apparently an unsolved problem in computer science.

Inconsistent behaviour with Haskell

I was reading on perceptrons and trying to implement one in haskell. The algorithm seems to be working as far as I can test. I'm going to rewrite the code entirely at some point, but before doing so I thought of asking a few questions that have arosen while coding this.
The neuron can be trained when returning the complete neuron. let neuron = train set [1,1] works, but if I change the train function to return an incomplete neuron without the inputs, or try to pattern match and create only an incomplete neuron, the code falls into neverending loop.
tl;dr when returning complete neuron everything works, but when returning curryable neuron, the code falls into a loop.
module Main where
import System.Random
type Inputs = [Float]
type Weights = [Float]
type Threshold = Float
type Output = Float
type Trainingset = [(Inputs, Output)]
data Neuron = Neuron Threshold Weights Inputs deriving Show
output :: Neuron -> Output
output (Neuron threshold weights inputs) =
if total >= threshold then 1 else 0
where total = sum $ zipWith (*) weights inputs
rate :: Float -> Float -> Float
rate t o = 0.1 * (t - o)
newweight :: Float -> Float -> Weights -> Inputs -> Weights
newweight t o weight input = zipWith nw weight input
where nw w x = w + (rate t o) * x
learn :: Neuron -> Float -> Neuron
learn on#(Neuron tr w i) t =
let o = output on
in Neuron tr (newweight t o w i) i
converged :: (Inputs -> Neuron) -> Trainingset -> Bool
converged n set = not $ any (\(i,o) -> output (n i) /= o) set
train :: Weights -> Trainingset -> Neuron
train w s = train' s (Neuron 1 w)
train' :: Trainingset -> (Inputs -> Neuron) -> Neuron
train' s n | not $ converged n set
= let (Neuron t w i) = train'' s n
in train' s (Neuron t w)
| otherwise = n $ fst $ head s
train'' :: Trainingset -> (Inputs -> Neuron) -> Neuron
train'' ((a,b):[]) n = learn (n a) b
train'' ((a,b):xs) n = let
(Neuron t w i) = learn (n a) b
in
train'' xs (Neuron t w)
set :: Trainingset
set = [
([1,0], 0),
([1,1], 1),
([0,1], 0),
([0,0], 0)
]
randomWeights :: Int -> IO [Float]
randomWeights n =
do
g <- newStdGen
return $ take n $ randomRs (-1, 1) g
main = do
w <- randomWeights 2
let (Neuron t w i) = train w set
print $ output $ (Neuron t w [1,1])
return ()
Edit: As per comments, specifying a little more.
Running with the code above, I get:
perceptron: <<loop>>
But by editing the main method to:
main = do
w <- randomWeights 2
let neuron = train w set
print $ neuron
return ()
(Notice the let neuron, and print rows), everything works and the output is:
Neuron 1.0 [0.71345896,0.33792675] [1.0,0.0]
Perhaps I am missing something, but I boiled your test case down to this program:
module Main where
data Foo a = Foo a
main = do
x ← getLine
let (Foo x) = Foo x
putStrLn x
This further simplifies to:
main = do
x ← getLine
let x = x
putStrLn x
The problem is that binding (Foo x) to something that depends on x
is a cyclic dependency. To evaluate x, we need to know the value of
x. OK, so we just need to calculate x. To calculate x, we need to
know the value of x. That's fine, we'll just calculate x. And so on.
This isn't C, remember: it's binding, not assignment, and the binding
is evaluated lazily.
Use better variable names, and it all works:
module Main where
data Foo a = Foo a
main = do
line ← getLine
let (Foo x) = Foo line
putStrLn x
(The variable in question, in your case, is w.)
This is a common mistake in Haskell. You cannot say things like:
let x = 0
let x = x + 1
And have it mean what it would in a language with assignment, or even nonrecursive binding. The first line is irrelevant, it gets shadowed by the second line, which defines x as x+1, that is, it defines recursively x = ((((...)+1)+1)+1)+1, which will loop upon evaluation.

Resources