Understanding candidate elimination algorithm

I am trying to get to grips with how to do the candidate elimination algorithm by hand. I know the answer, but I don't know the steps to get there. Can anyone guide me or point me in the right direction? Here is the question that I am working on:
Consider a concept description language with three attributes predefined as follows:
attribute1   attribute2   attribute3
----------   ----------   ----------
  a   b       c   d   e      f   g
Demonstrate version space learning using the following positive and negative training examples:
1. ( a c f ) +)
2. ( b c f ) +)
3. ( a e g ) -)
4. ( a c g ) -)
5. ( b d f ) -)
Show how the candidate elimination algorithm changes the boundary sets after
processing each example.
This is what I have so far:
1. ( a c f ) +) Generalize..
G:(???)
S:(acf)
2. ( b c f ) +) Generalize...
G:(a??), (?e?), (?d?), (??g) - Not even sure if this is correct
S:(?cf)
Can someone guide me or give me advice please? Thanks

This website really helps: http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/vspace/vs_prob1.html
I made the tree for this exercise, and the answer I got is G = S = (?cf).
It goes like this:
1.
G: (???)
S: (acf)
2.
G: (???)
S: (?cf)
3.
G: (???)
becomes
(b??),(?c?),(?d?),(??f)
after pruning =
(?c?),(??f)
S: (?cf)
4.
G: (?c?),(??f)
becomes (?cf),(??f)
(because (?c?) no longer holds for this example)
S: (?cf)
5.
G: (?cf),(??f)
becomes (?cf),(?cf)
(because < ??f > no longer holds for this example)
S: (?cf)
Final answer is:
G: (?cf),(?cf) = (?cf)
S: (?cf)
the final hypothesis is (?cf)
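To check a hand trace like the one above, here is a minimal Python sketch of candidate elimination for purely conjunctive hypotheses. The names (`DOMAINS`, `covers`, `more_general_or_equal`, `candidate_elimination`) are my own, and it assumes S can be kept as a single hypothesis, which holds for this hypothesis language. It also keeps only the maximally general members of G, so after example 4 it reports just (??f) where the hand trace also lists (?cf). The final boundary sets are the same.

```python
# Attribute domains from the question (attribute1, attribute2, attribute3).
DOMAINS = [("a", "b"), ("c", "d", "e"), ("f", "g")]


def covers(h, x):
    """True if hypothesis h (a tuple with '?' wildcards) matches example x."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))


def more_general_or_equal(h1, h2):
    """True if h1 is at least as general as h2."""
    return all(a == "?" or a == b for a, b in zip(h1, h2))


def candidate_elimination(examples, domains):
    """Return the list of (S, G) pairs after each example.

    Assumption: conjunctive hypotheses only, so S is a single hypothesis.
    S is None until the first positive example has been seen.
    """
    S = None
    G = [("?",) * len(domains)]
    trace = []
    for x, positive in examples:
        if positive:
            # Drop general hypotheses that miss the positive example,
            # then minimally generalize S so that it covers x.
            G = [g for g in G if covers(g, x)]
            S = x if S is None else tuple(
                s if s == v else "?" for s, v in zip(S, x))
        else:
            candidates = []
            for g in G:
                if not covers(g, x):
                    candidates.append(g)  # already excludes the negative
                    continue
                # Minimal specializations: fix one '?' to a value != x[i].
                for i, values in enumerate(domains):
                    if g[i] != "?":
                        continue
                    for v in values:
                        h = g[:i] + (v,) + g[i + 1:]
                        if v != x[i] and (S is None or more_general_or_equal(h, S)):
                            candidates.append(h)
            candidates = list(dict.fromkeys(candidates))  # dedupe, keep order
            # Keep only the maximally general candidates.
            G = [g for g in candidates
                 if not any(o != g and more_general_or_equal(o, g)
                            for o in candidates)]
        trace.append((S, list(G)))
    return trace


EXAMPLES = [
    (("a", "c", "f"), True),
    (("b", "c", "f"), True),
    (("a", "e", "g"), False),
    (("a", "c", "g"), False),
    (("b", "d", "f"), False),
]

if __name__ == "__main__":
    for step, (S, G) in enumerate(candidate_elimination(EXAMPLES, DOMAINS), 1):
        print(step, "S:", S, "G:", G)
```

The sketch does not check whether S itself starts covering a negative example (which would mean the version space has collapsed); that check is left out to keep it short.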
Hope this helps.

Related

A specific push down automaton for a language (PDA)

I'm wondering how I could design a pushdown automaton for this specific language.
I can't solve this.
L2 = { u ∈ {a, b}* : 3 * |u|a = 2 * |u|b + 1 }
So the number of 'a's multiplied by 3 equals the number of 'b's multiplied by 2, plus 1.
The grammar corresponding to that language is something like:
S -> ab | ba |B
B -> abB1 | baB1 | aB1b | bB1a | B1ab | B1ba
B1 -> aabbbB1 | baabbB1 | [...] | aabbb | baabb | [...]
S generates the base case (basically strings with #a = 1 = #b) or B
B generates the base case + B1 (in every permutation)
B1 adds 2 'a' and 3 'b' to the base case (in fact if you keep adding this number of 'a' and 'b' the equation 3#a = 2#b + 1 will always be true!). I didn't finish writing B1, basically you need to add every permutation of 2 'a' and 3 'b'. I think you'll be able to do it on your own :)
When you're finished with the grammar, designing the PDA is simple. More info here.
3|u|a = 2|u|b + 1 <=> 3|u|a - 2|u|b = 1
The easiest way to design a PDA for this is to implement this equation directly.
For any string x, let f(x) = 3|x|a - 2|x|b. Then design a PDA such that, after processing any string x:
The stack depth is always equal to abs( floor( f(x)/3 ) );
The symbol on top of the stack (if any) reflects the sign of floor( f(x)/3 ). You only need 2 kinds of stack symbols.
The current state number = f(x) mod 3. Of course you only need 3 states.
From the state number and the symbol on top of the stack, you can detect when f(x) = 1, and at that condition the PDA accepts x as a string in the language.
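The construction above can be simulated in a few lines of Python as a sanity check. This is only a sketch: the function name `pda_accepts` and the stack symbols '+' and '-' are my own choices, and the empty-stack test stands in for the bottom-of-stack marker a real PDA would use. Reading an 'a' adds 3 to f(x), so only the stack changes. Reading a 'b' subtracts 2, so the state changes and the stack value drops by one unless the state was 2.

```python
from itertools import product


def pda_accepts(word):
    """Simulate the PDA: state = f(x) mod 3, stack encodes floor(f(x)/3)."""
    state = 0   # f(x) mod 3
    stack = []  # all '+' or all '-'; depth is abs(floor(f(x)/3))

    def increment():
        # floor(f/3) goes up by one: cancel a '-' or push a '+'.
        if stack and stack[-1] == "-":
            stack.pop()
        else:
            stack.append("+")

    def decrement():
        # floor(f/3) goes down by one: cancel a '+' or push a '-'.
        if stack and stack[-1] == "+":
            stack.pop()
        else:
            stack.append("-")

    for ch in word:
        if ch == "a":        # f += 3: same remainder, quotient + 1
            increment()
        elif ch == "b":      # f -= 2
            if state == 2:   # 3q + 2 -> 3q + 0: quotient unchanged
                state = 0
            else:            # 3q + r -> 3(q - 1) + (r + 1) for r in {0, 1}
                state += 1
                decrement()
        else:
            return False     # symbol outside the alphabet
    # f(x) = 1 exactly when the remainder is 1 and the quotient is 0.
    return state == 1 and not stack


def first_mismatch(max_len=10):
    """Compare the simulation with the defining equation on all short words."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            expected = 3 * w.count("a") == 2 * w.count("b") + 1
            if pda_accepts(w) != expected:
                return w
    return None
```

Calling `first_mismatch()` compares the simulation against the equation 3|u|a = 2|u|b + 1 on every word up to the given length and returns the first disagreement, or `None` if there is none.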

Candidate elimination algorithm lecture example

I am looking through some lecture slides and cannot understand why the bold hypotheses in the last G are simply discarded. I can arrive at the same answer, but I don't understand why they are discarded.
sky           temperature   humidity
Sunny Rainy   Warm Cool     Normal Low
and the set of positive and negative training examples:
1. ( S W N )+)
2. ( R C L )-)
3. ( S C N )+)
4. ( S W L )-)
Training with the first example: ( S W N ) +) generalizing…
G = [( ? ? ? )]
S = [( S W N )]
Training with the second example: ( R C L ) -) specializing…
G = [( S ? ? ) ( ? W ? ) ( ? ? N )]
S = [( S W N )]
Training with the third example: ( S C N ) +) generalizing…
G = [( S ? ? )( ? ? N )] (the other is discarded )
S = [( S ? N )]
Training with the fourth example: ( S W L ) -) specializing…
G = [( S C ? )( S ? N )( R ? N )(? C N)] (all except ( S ? N ) were in bold and are discarded)
S = [( S ? N )]
Convergence, the learned concept must be: [( S ? N )]
G = [( S C ? )( S ? N )( R ? N )(? C N)] (all except ( S ? N ) are discarded)
This can be explained with the candidate elimination algorithm itself. The reasons can be summarized as follows.
Inconsistent hypotheses: according to the algorithm, we first have to remove the hypotheses that are not consistent with the training data (D).
In this case ( R ? N ) is removed because it is inconsistent with ( S ? N ).
General-boundary hypotheses that are not more general than the specific boundary:
every member of G has to be at least as general as the specific boundary; otherwise the two boundaries would overlap.
If we compare the derived ( S C ? ) with ( S ? N ), the middle attribute is C in the former and ? in the latter. Having a constant in that position makes the derived hypothesis more specific than the specific boundary there, so it should be removed. The same goes for ( ? C N ).
I see the question is a bit old, but I hope someone finds this useful.

Cleaner way to represent languages accepted by DFAs?

I am given 2 DFAs. * denotes final states and -> denotes the initial state, defined over the alphabet {a, b}.
1) ->A with a goes to A. -> A with b goes to *B. *B with a goes to *B. *B with b goes to ->A.
The regular expression for this is clearly:
E = a* b(a* + (a* ba* ba*)*)
And the language that it accepts is L1 = {w over {a,b} | w is a b preceded by any number of a's and followed by any number of a's, or w is a b preceded by any number of a's and followed by any number of bb pairs with any number of a's before, between, or after the two b's of each pair}
2) ->* A with b goes to ->* A. ->*A with a goes to *B. B with b goes to -> A. *B with a goes to C. C with a goes to C. C with b goes to C.
Note: A is both final and initial state. B is final state.
Now the regular expression that I get for this is:
E = b* ((ab) * + a(b b* a)*)
Finally the language that this DFA accepts is:
L2 = {w over {a, b} | w is n b's followed by either k ab's or an a followed by m copies of b b^r a, where n, k, m, r >= 0}
Now the question is, is there a cleaner way to represent the languages L1 and L2 because it does seem ugly. Thanks in advance.
E = a* b(a* + (a* ba* ba*)*)
= a*ba* + a*b(a* ba* ba*)*
= a*ba* + a*b(a*ba*ba*)*a*
= a*b(a*ba*ba*)*a*
= a*b(a*ba*b)*a*
This is the language of all strings of a and b containing an odd number of bs. This might be most compactly denoted symbolically as {w in {a,b}* | #b(w) = 1 (mod 2)}.
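A brute-force check of this simplification is easy to sketch in Python. The helper names `words` and `regex_counterexample` are mine, and the union written `+` above becomes `|` in Python's `re` syntax. The check compares both expressions with the "odd number of b's" description on every word up to a small length. That is evidence, not a proof.

```python
import re
from itertools import product

# The original expression and the simplified one from the derivation.
ORIGINAL = re.compile(r"a*b(a*|(a*ba*ba*)*)")
SIMPLIFIED = re.compile(r"a*b(a*ba*b)*a*")


def words(max_len):
    """Yield every word over {a, b} of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            yield "".join(letters)


def regex_counterexample(max_len=10):
    """Return a word on which the three descriptions disagree, else None."""
    for w in words(max_len):
        odd_bs = w.count("b") % 2 == 1
        in_original = ORIGINAL.fullmatch(w) is not None
        in_simplified = SIMPLIFIED.fullmatch(w) is not None
        if in_original != odd_bs or in_simplified != odd_bs:
            return w
    return None
```

If `regex_counterexample()` returns `None`, the two expressions and the parity description agree on all words up to that length.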
For the second one: the only way to get to state B is to see an a in A, and the only way to get to C from outside C is to see an a in B. C is a dead state and the only way to get to it is to see aa starting in A. That is: if you ever see two as in a row, the string is not in the language; the language is the set of all strings over a and b not containing the substring aa. This might be most compactly denoted symbolically as {(a+b)*aa(a+b)*}^c where ^c means "complement".
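The same kind of check works for the second automaton. Below is a small Python sketch that encodes the transition table exactly as described in the question (the names `DFA2`, `ACCEPTING`, `dfa2_accepts` and `dfa2_counterexample` are my own) and compares it with the "no substring aa" description on all short words.

```python
from itertools import product

# Transitions of the second DFA as listed in the question.
DFA2 = {
    ("A", "b"): "A", ("A", "a"): "B",
    ("B", "b"): "A", ("B", "a"): "C",
    ("C", "a"): "C", ("C", "b"): "C",
}
ACCEPTING = {"A", "B"}  # A is also the initial state; C is the dead state


def dfa2_accepts(word):
    """Run the DFA from state A and report whether it ends in a final state."""
    state = "A"
    for ch in word:
        state = DFA2[(state, ch)]
    return state in ACCEPTING


def dfa2_counterexample(max_len=10):
    """Return a word where the DFA and 'no aa' disagree, else None."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if dfa2_accepts(w) != ("aa" not in w):
                return w
    return None
```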

Finding largest f satisfying a property given f is non-decreasing in its arguments

This has been bugging me for a while.
Let's say you have a function f x y where x and y are integers and you know that f is non-decreasing in each of its arguments,
i.e. f (x+1) y >= f x y and f x (y+1) >= f x y.
What would be the fastest way to find the largest f x y satisfying a property, given that x and y are bounded?
I was thinking that this might be a variation of saddleback search, and I was wondering if there is a name for this type of problem.
Also, more specifically, I was wondering if there is a faster way to solve this problem if you know that f is the multiplication operator.
Thanks!
Edit: Seeing the comments below, the property can be anything.
Given a property g (where g takes a value and returns a boolean), I am simply looking for the largest f x y such that g (f x y) == True.
For example, a naive implementation (in haskell) would be:
import Data.List (sort)

maximise :: (Int -> Int -> Int) -> (Int -> Bool) -> Int -> Int -> Int
maximise f g xLim yLim = head . filter g . reverse . sort $ results
  where results = [f x y | x <- [1..xLim], y <- [1..yLim]]
Let's draw an example grid for your problem to help think about it. Here's an example plot of f for each x and y. It is monotone in each argument, which is an interesting constraint we might be able to do something clever with.
+------- x --------->
| 0 0 1 1 1 2
| 0 1 1 2 2 4
y 1 1 3 4 6 6
| 1 2 3 6 6 7
| 7 7 7 7 7 7
v
Since we don't know anything about the property, we can't really do better than to list the values in the range of f in decreasing order. The question is how to do that efficiently.
The first thing that comes to mind is to traverse it like a graph starting at the lower-right corner. Here is my attempt:
import Data.Maybe (listToMaybe)

-- enumIncreasing is the author's own helper (built on meldable-heap);
-- its definition is not shown here.
maximise :: (Ord b, Num b) => (Int -> Int -> b) -> (b -> Bool) -> Int -> Int -> Maybe b
maximise f p xLim yLim =
    listToMaybe . filter p . map (negate . snd) $
        enumIncreasing measure successors (xLim,yLim)
  where
    measure (x,y) = negate $ f x y
    successors (x,y) = [ (x-1,y) | x > 0 ] ++ [ (x,y-1) | y > 0 ]
The signature is not as general as it could be (Num should not be necessary, but I needed it to negate the measure function because enumIncreasing returns an increasing rather than a decreasing list -- I could have also done it with a newtype wrapper).
Using this function, we can find the largest odd number which can be written as a product of two numbers <= 100:
ghci> maximise (*) odd 100 100
Just 9801
I wrote enumIncreasing using meldable-heap on hackage to solve this problem, but it is pretty general. You could tweak the above to add additional constraints on the domain, etc.
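Since enumIncreasing itself is not shown, here is a rough Python sketch of the same idea using the standard `heapq` module: a best-first traversal from the corner (xLim, yLim) that visits values in non-increasing order and stops at the first one satisfying the property. It is not the original helper. The name `maximise_best_first`, the lower bound of 1 on both coordinates (taken from the naive version in the question), and the `seen` set are my assumptions.

```python
import heapq


def maximise_best_first(f, p, x_lim, y_lim):
    """Largest f(x, y) with 1 <= x <= x_lim, 1 <= y <= y_lim and p(value) true.

    Assumes f is non-decreasing in each argument and returns numbers.
    Returns None if no value in the grid satisfies p.
    """
    start = (x_lim, y_lim)
    # heapq is a min-heap, so store negated values to pop the largest first.
    heap = [(-f(*start), start)]
    seen = {start}
    while heap:
        neg_value, (x, y) = heapq.heappop(heap)
        value = -neg_value
        if p(value):
            return value
        # Monotonicity means the neighbours below and to the left are the
        # only cells that can come next in decreasing order.
        for nxt in ((x - 1, y), (x, y - 1)):
            if nxt[0] >= 1 and nxt[1] >= 1 and nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, (-f(*nxt), nxt))
    return None


if __name__ == "__main__":
    # Largest odd product of two numbers <= 100, as in the ghci example.
    print(maximise_best_first(lambda x, y: x * y, lambda v: v % 2 == 1, 100, 100))
```

In the worst case this still evaluates f on the whole grid, but it stops early whenever a large value satisfies the property.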
The answer depends on what's expensive. The case that might be interesting is when f is expensive.
What you might want to do is look at pareto-optimality. Suppose you have two points
(1, 2) and (3, 4)
Then you know that the latter point is going to be a better solution, so long as f is a nondecreasing function. However, of course, if you have points,
(1, 2) and (2, 1)
then you can't know. So, one solution would be to establish a Pareto-optimal frontier of points that the predicate g permits, and then evaluate these through f.

Wine Tasting problem

I spent almost the entire competition (3 h) trying to solve this problem, in vain :( Maybe you could help me find the solution.
A group of Facebook employees just had a very successful product launch. To celebrate, they have decided to go wine tasting. At the vineyard, they decide to play a game. One person is given some glasses of wine, each containing a different wine. Every glass of wine is labelled to indicate the kind of wine the glass contains. After tasting each of the wines, the labelled glasses are removed and the same person is given glasses containing the same wines, but unlabelled. The person then needs to determine which of the unlabelled glasses contains which wine. Sadly, nobody in the group can tell wines apart, so they just guess randomly. They will always guess a different type of wine for each glass. If they get enough right, they win the game. You must find the number of ways that the person can win, modulo 1051962371.
Input
The first line of the input is the number of test cases, N. The next N lines each contain a test case, which consists of two integers, G and C, separated by a single space. G is the total number of glasses of wine and C is the minimum number that the person must correctly identify to win.
Constraints
N = 20
1 ≤ G ≤ 100
1 ≤ C ≤ G
Output
For each test case, output a line containing a single integer, the number of ways that the person can win the game modulo 1051962371.
Example input
5
1 1
4 2
5 5
13 10
14 1
Example output
1
7
1
651
405146859
Here's one that doesn't need prior knowledge of Rencontres numbers. (Well, it's basically the proof of a formula from the wiki, but I thought I'd share it anyway.)
First find f(n): the number of permutations of n elements that have no fixed point. This follows from the inclusion-exclusion formula: the number of permutations that fix k given points is (n-k)!, and these k points can be chosen in C(n,k) ways. So, f(n) = n! - C(n,1)(n-1)! + C(n,2)(n-2)! - C(n,3)(n-3)! + ...
Now find the number of permutations that have exactly k fixed points. These points can be chosen in C(n,k) ways and the rest n-k points can be rearranged in f(n-k) ways. So, it's C(n,k)f(n-k).
Finally, the answer to the problem is the sum of C(g,k)f(g-k) over k = c, c+1, ..., g.
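The three steps above translate almost directly into Python. This is a sketch under my own naming (`derangements`, `ways_to_win`). It uses `math.comb`, which needs Python 3.8 or later, and works with exact big integers, reducing modulo 1051962371 only at the end.

```python
from math import comb, factorial

MOD = 1051962371


def derangements(n):
    """f(n): permutations of n elements with no fixed point (inclusion-exclusion)."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def ways_to_win(g, c):
    """Permutations of g glasses with at least c fixed points, modulo MOD."""
    # Exactly k fixed points: choose them in C(g, k) ways, derange the rest.
    return sum(comb(g, k) * derangements(g - k) for k in range(c, g + 1)) % MOD


if __name__ == "__main__":
    for g, c in [(1, 1), (4, 2), (5, 5), (13, 10), (14, 1)]:
        print(ways_to_win(g, c))
```

Run on the example input from the problem statement, this reproduces the example output (1, 7, 1, 651, 405146859).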
My solution involved the use of Rencontres Numbers.
A Rencontres Number D(n,k) is the number of permutations of n elements where exactly k elements are in their original places. The problem asks for at least k elements, so I just took the sum over k, k+1,...,n.
Here's my Python submission (after cleaning up):
from sys import stdin, stderr, setrecursionlimit as recdepth
from math import factorial as fact

recdepth(100000)
MOD = 1051962371
# cache[n][k] memoizes D(n, k); -1 marks "not computed yet"
cache = [[-1 for i in range(101)] for j in range(101)]

def ncr(n, k):
    # binomial coefficient; integer division keeps the result exact
    return fact(n) // fact(k) // fact(n - k)

def D(n, k):
    # Rencontres number: permutations of n elements with exactly k fixed points
    if cache[n][k] == -1:
        if k == 0:
            if n == 0:
                cache[n][k] = 1
            elif n == 1:
                cache[n][k] = 0
            else:
                # derangement recurrence
                cache[n][k] = (n - 1) * (D(n - 1, 0) + D(n - 2, 0))
        else:
            cache[n][k] = ncr(n, k) * D(n - k, 0)
    return cache[n][k]

def answer(total, match):
    return sum(D(total, i) for i in range(match, total + 1)) % MOD

if __name__ == '__main__':
    cases = int(stdin.readline())
    for case in range(cases):
        stderr.write("case %d:\n" % case)
        G, C = map(int, stdin.readline().split())
        print(answer(G, C))
Like everyone else, I computed the function that I now know is Rencontres Numbers, but I derived the recursive equation myself in the contest. Without loss of generality, we simply assume the correct labels of wines are 1, 2, .., g, i.e., not permuted at all.
Let's denote the function as f(g,c). Given g glasses, we look at the first glass, and we could either label it right, or label it wrong.
If we label it right, we reduce the problem to getting c-1 right out of g-1 glasses, i.e., f(g-1, c-1).
If we label it wrong, we have g-1 choices for the first glass. For the remaining g-1 glasses, we must get c glasses correct, but this subproblem is different from the f we're computing, because out of the g-1 glasses, there's already a mismatching glass. To be more precise, for the first glass, our answer is j instead of the correct label 1. Let's assume there's another function h that computes it for us.
So we have f(g,c) = f(g-1,c-1) + (g-1) * h(g-1, c).
Now to compute h(g,c), we need to consider two cases at the jth glass.
If we label it 1, we reduce the problem to f(g-1,c).
If we label it k, we have g-1 choices, and the problem is reduced to h(g-1,c).
So we have h(g,c) = f(g-1,c) + (g-1) * h(g-1,c).
Here's the complete program in Haskell, with memoization and some debugging support.
import Control.Monad
import Data.MemoTrie
--import Debug.Trace

trace = flip const

add a b = mod (a+b) 1051962371
mul a b = mod (a*b) 1051962371

main = do
  (_:input) <- liftM words getContents
  let map' f [] = []
      map' f (a:c:xs) = f (read a) (read c) : map' f xs
  mapM print $ map' ans input

ans :: Integer -> Integer -> Integer
ans g c = foldr add 0 $ map (f g) [c..g]

memoF = memo2 f
memoH = memo2 h

-- Exactly c correct in g
f :: Integer -> Integer -> Integer
f g c = trace ("f " ++ show (g,c) ++ " = " ++ show x) x
  where x = if c < 0 || g < c then 0
            else if g == c then 1
            else add (memoF (g-1) (c-1)) (mul (g-1) (memoH (g-1) c))

-- There's one mismatching position in g positions
h :: Integer -> Integer -> Integer
h g c = trace ("h " ++ show (g,c) ++ " = " ++ show x) x
  where x = if c < 0 || g < c then 0
            else add (memoF (g-1) c) (mul (g-1) (memoH (g-1) c))

Resources