Odd DFA for Language - algorithm

Can somebody please help me draw a NFA that accepts this language:
{ w | the length of w is 6k + 1 for some k ≥ 0 }
I have been stuck on this problem for several hours now. I do not understand where the k comes into play and how it is used in the diagram...

{ w | the length of w is 6k + 1 for some k ≥ 0 }
We can use the Myhill-Nerode theorem to constructively produce a provably minimal DFA for this language. This is a useful exercise. First, a definition:
Two strings w and x are indistinguishable with respect to a language L iff: (1) for every string y such that wy is in L, xy is in L; (2) for every string z such that xz is in L, wz is in L.
The insight in Myhill-Nerode is that if two strings are indistinguishable w.r.t. a regular language, then a minimal DFA for that language will see to it that the machine ends up in the same state for either string. Indistinguishability is reflexive, symmetric and transitive so we can define equivalence classes on it. Those equivalence classes correspond directly to the set of states in the minimal DFA. Now, to find the equivalence classes for our language. We consider strings of increasing length and see for each one whether it's indistinguishable from any of the strings before it:
e, the empty string, has no strings before it. We need a state q0 to correspond to the equivalence class this string belongs to. The set of strings that can come after e to reach a string in L is L itself; also written c(c^6)*
c, any string of length one, has only e before it. These are not, however, indistinguishable; we can add e to c to get ce = c, a string in L, but we cannot add e to e to get a string in L, since e is not in L. We therefore need a new state q1 for the equivalence class to which c belongs. The set of strings that can come after c to reach a string in L is (c^6)*.
It turns out we need a new state q2 here; the set of strings that take cc to a string in L is ccccc(c^6)*. Show this.
It turns out we need a new state q3 here; the set of strings that take ccc to a string in L is cccc(c^6)*. Show this.
It turns out we need a new state q4 here; the set of strings that take cccc to a string in L is ccc(c^6)*. Show this.
It turns out we need a new state q5 here; the set of strings that take ccccc to a string in L is cc(c^6)*. Show this.
Consider the string cccccc. What strings take us to a string in L? Well, c does. So does c followed by any string of length 6. Interestingly, this is the same as L itself. And we already have an equivalence class for that: e could also be followed by any string in L to get a string in L. cccccc and e are indistinguishable. What's more: since all strings of length 6 are indistinguishable from shorter strings, we no longer need to keep checking longer strings. Our DFA is guaranteed to have one the states q0 - q5 we have already identified. What's more, the work we've done above defines the transitions we need in our DFA, the initial state and the accepting states as well:
The DFA will have a transition on symbol c from state q to state q' if x is a string in the equivalence class corresponding to q and xc is a string in the equivalence class corresponding to q';
The initial state will be the state corresponding to the equivalence class to which e, the empty string, belongs;
A state q is accepting if any string (hence all strings) belonging to the equivalence class corresponding to the language is in the language; alternatively, if the set of strings that take strings in the equivalence class to a string in L includes e, the empty string.
We may use the notes above to write the DFA in tabular form:
q x q'
-- -- --
q0 c q1 // e + c = c
q1 c q2 // c + c = cc
q2 c q3 // cc + c = ccc
q3 c q4 // ccc + c = cccc
q4 c q5 // cccc + c = ccccc
q5 c q0 // ccccc + c = cccccc ~ e
We have q0 as the initial state and the only accepting state is q1.

Here's a NFA which goes 6 states forward then if there is one more character it stops on the final state. Otherwise it loops back non-deterministcally to the start and past the final state.
(Start) S1 -> S2 -> S3 -> S5 -> S6 -> S7 (Final State) -> S8 - (loop forever)
^ |
^ v |_|
|________________________| (non deterministically)

Related

Cleaner way to represent languages accepted by DFAs?

I am given 2 DFAs. * denotes final states and -> denotes the initial state, defined over the alphabet {a, b}.
1) ->A with a goes to A. -> A with b goes to *B. *B with a goes to *B. *B with b goes to ->A.
The regular expression for this is clearly:
E = a* b(a* + (a* ba* ba*)*)
And the language that it accepts is L1= {w over {a,b} | w is b preceeded by any number of a's followed by any number of a's or w is b preceeded by any number of a's followed by any number of bb with any number of a's in middle of(middle of bb), end or beginning.}
2) ->* A with b goes to ->* A. ->*A with a goes to *B. B with b goes to -> A. *B with a goes to C. C with a goes to C. C with b goes to C.
Note: A is both final and initial state. B is final state.
Now the regular expression that I get for this is:
E = b* ((ab) * + a(b b* a)*)
Finally the language that this DFA accepts is:
L2 = {w over {a, b} | w is n 1's followed by either k 01's or a followed by m 11^r0' s where n,km,r >= 0}
Now the question is, is there a cleaner way to represent the languages L1 and L2 because it does seem ugly. Thanks in advance.
E = a* b(a* + (a* ba* ba*)*)
= a*ba* + a*b(a* ba* ba*)*
= a*ba* + a*b(a*ba*ba*)*a*
= a*b(a*ba*ba*)*a*
= a*b(a*ba*b)*a*
This is the language of all strings of a and b containing an odd number of bs. This might be most compactly denoted symbolically as {w in {a,b}* | #b(w) = 1 (mod 2)}.
For the second one: the only way to get to state B is to see an a in A, and the only way to get to C from outside C is to see an a in B. C is a dead state and the only way to get to it is to see aa starting in A. That is: if you ever see two as in a row, the string is not in the language; the language is the set of all strings over a and b not containing the substring aa. This might be most compactly denoted symbolically as {(a+b)*aa(a+b)*}^c where ^c means "complement".

Theory of Computation. Turing Machine

Click here for the answer. Turing Machine
The question is to construct a Turing Machine which accepts the regular expression,
L = {a^n b^n | n>= 1}.
I am not sure if my answer is correct or wrong. Thank you in advance for your reply.
You cannot "accept the regular expression", only the language it describes. And what you provide is not a regular expression, but a set description. In fact, the language is not regular and therefore cannot be described by standard regular expressions.
The machine from your answer accepts the language described by the regular expression a^+ b^+.
A TM could mark the first a (e.g. by converting it to A) then delete the first b. And for each n one loop. If you and up with a string only of A, then accept.
As stated before, language L = {a^nb^n; n >= 1} cannot be described by regular expressions, it doesn't belong into the category of regular grammars. This language in particular is an example of context-free grammar, and thus it can be described by context-free grammar and recognized by pushdown automaton (an automaton with LIFO memory, a stack).
Grammar for this language would look something like this:
G = (V, S, R, P)
Where:
V is finite set of non-terminal characters, V = { S }
S is finite set of terminal characters, S = { a, b }
R is relation that describes "rewrites" from non-terminal characters to non-terminals and terminals, in this case R = { S -> aSb, S -> ab }
P is starting non-terminal character, P = S
A pushdown automata recognizing this language would be more complex, as it is a 7-tuple M = (Q, S, G, D, q0, Z, F)
Q is set of states
S is input alphabet
G is stack alphabet
D is the transition relation
q0 is start state
Z is initial stack symbol
F is set of accepting states
For our case, it would be:
Q = { q0, q1, qF }
S = { a, b }
G = { z0, X }
D will take a form of relation (current state, input character, top of stack) -> (output state, top of stack) (meaning you can move to a different state and rewrite top of stack (erase it, rewrite it or let it be)
(q0, a, z0) -> (q0, Xz0) - reading the first a
(q0, a, X) -> (q0, XX) - reading consecutive a's
(q0, b, X) -> (q1, e) - reading first b
(q1, b, X) -> (q1, e) - reading consecutive b's
(q1, e, z0) -> (qF, e) - reading last b
where e is empty word (sometimes called epsilon)
q0 = q0
Z = z0
F = { qF }
The language L = {a^n b^n | n≥1} represents a kind of language where we use only 2 character, i.e., a, b. In the beginning language has some number of a’s followed by equal number of b’s . Any such string which falls in this category will be accepted by this language. The beginning and end of string is marked by $ sign.
Step-1:
Replace a by X and move right, Go to state Q1.
Step-2:
Replace a by a and move right, Remain on same state
Replace Y by Y and move right, Remain on same state
Replace b by Y and move right, go to state Q2.
Step-3:
Replace b by b and move left, Remain on same state
Replace a by a and move left, Remain on same state
Replace Y by Y and move left, Remain on same state
Replace X by X and move right, go to state Q0.
Step-5:
If symbol is Y replace it by Y and move right and Go to state Q4
Else go to step 1
Step-6:
Replace Y by Y and move right, Remain on same state
If symbol is $ replace it by $ and move left, STRING IS ACCEPTED, GO TO FINAL STATE Q4

Is these algorithms equivalent?

E is a logic variable (T/F), P and Q are programs
(P)
If E then R
Else
S
(Q)
bool c = E
bool d = not E
While c do
Begin
R
c = d
End
While d do
Begin
S
d = c
End
We knew that, the same input mean the same output, so they are weak-equivalency, but what about the execute time numbers (R)? I am not sure R is for (R,S) or E?
As it is now, both variants are equivalent in the sense that R and S are fulfilled or not in both variants according to the same start conditions.
But the second variant also sets two variables c and d, and obviously they will be used somehow later, for otherwards there is no use in their setting inside while cycles. So, the second variant has additional and independent consequences (c and d are defined and set to false).
If the S and R can cancel the whole algorithm, that additional part becomes NOT independent.

Languages to context free grammars

Provide context free grammars for the following languages.
(a) {a^mb^nc^n | m ≥ 0 and n ≥ 0 }
(b) {a^nb^nc^m | m ≥ 0 and n ≥ 0 }
If there were any other rules involved such as m = n or anything like that, I could get it, but the general m greater than or equal to zero? I'm pretty confused. and also I don't understand how a and b would be any different.
Here was my shot at making a grammar out of this:
S1 --> S2 | e
S2 --> aS2bS2c | S3
S3 --> aS3 | S4
S4 --> bS4 | S5
S5 --> cS5 | c
Is a^mb^n m repetitions of a followed by n repetitions of b? It looks like you copy-pasted an assignment and neglected even to format it to be readable on this site.
Assuming I'm reading that correctly, moving on…
The key is that (in the first language) b and c are repeated an equal number of times. When you match a b you must simultaneously match a c. A production matching this subsequence would be
S1 => e
S1 => b S1 c
Note that there are two languages there so you need two answers. You aren't being asked for one grammar that handles both languages. (The main problem with that would be ambiguity in the case that n = m).

string pattern match,the suffix array can solve this or have more solution?

i have a string that random generate by a special characters (B,C,D,F,X,Z),for example to generate a following string list:
B D Z Z Z C D C Z
B D C
B Z Z Z D X
D B Z F
Z B D C C Z
B D C F Z
..........
i also have a pattern list, that is to match the generate string and return a best pattern and extract some string from the string.
string pattern
B D C [D must appear before the C >> DC]
B C F
B D C F
B X [if string have X,must be matched.]
.......
for example,
B D Z Z Z C D C Z,that have B and DC,so that can match by B D C
D B Z C F,that have B and C and F,so that can match by B C F
D B Z D F,that have B and F,so that can match by B F
.......
now,i just think about suffix array.
1.first convert a string to suffix array object.
2.loop each a pattern,that find which suffix array can be matched.
3.compare all matched patterns and get which is a best pattern.
var suffix_array=Convert a string to suffix array.
var list=new List();
for (int i=0;i<pattern length;i++){
if (suffix_array.match(pattern))
list.Add(pattern);
}
var max=list[0];
for (int i=1;i<list.length;i++){
{
if (list[i]>max)
max=list[i];
Write(list[i]);
}
i just think this method is to complex,that need to build a tree for a pattern ,and take it to match suffix array.who have a more idea?
====================update
i get a best solution now,i create a new class,that have a B,C,D,X...'s property that is array type.each property save a position that appear at the string.
now,if the B not appear at the string,we can immediately end this processing.
we can also get all the C and D position,and then compare it whether can sequential appear(DC,DCC,CCC....)
I'm not sure what programming language you are using; have you checked its capabilities with regular expressions ? If you are not familiar with these, you should be, hit Google.
var suffix_array=Convert a string to suffix array.
var best=(worst value - presumably zero - pattern);
for (int i=0;i<pattern list array length;i++){
if (suffix_array.match(pattern[i])){
if(pattern[i]>best){
best=pattern[i];
}
(add pattern[i] to list here if you still want a list of all matches)
}
}
write best;
Roughly, anyway, if I understand what you're looking for that's a slight improvement though I'm sure there may be a better solution.

Resources