Assuming 0 1 design a regular expression that generates the
Assuming = {0, 1}, design a regular expression that generates the language L = {w | every third position contains a ‘0’}.
Solution
assume = {0,1}{ w | w has length 3 and its 3rd symbol is 0 }
(0+1)(0+1)0(0+1)*
DFAs NFAs Regular Expressions!
L can be represented by some regexp L is regular
L can be represented by some regexp
L is regular
Induction Step: Suppose every regexp of length < k represents some regular language.Consider a regexp R of length k > 1
Three possibilities for R:
R = R1 + R2
R = R1 R2
R = (R1)*
Induction Step: Suppose every regexp of length < k represents some regular language.
Consider a regexp R of length k > 1
Three possibilities for R:
R = R1 + R2
R = R1 R2
R = (R1)*
By induction, R1 and R2 represent some regular languages, L1 and L2.But L(R) = L(R1 + R2) = L1 L2
so L(R) is regular, by the union theorem!
Induction Step: Suppose every regexp of length < k represents some regular language.
Consider a regexp R of length k > 1
Three possibilities for R:
R = R1 + R2
R = R1 R2
R = (R1)*
By induction, R1 and R2 represent some regular languages, L1 and L2
But L(R) = L(R1•R2) = L1••L2
so L(R) is regular by the concatenation theorem
Induction Step: Suppose every regexp of length < k represents some regular language.
Consider a regexp R of length k > 1
Three possibilities for R:
R = R1 + R2
R = R1 R2
R = (R1)*
By induction, R1 and R2 represent some regular languages, L1 and L2
But L(R) = L(R1*) = L1*
so L(R) is regular, by the star theorem
Induction Step: Suppose every regexp of length < k represents some regular language.
Consider a regexp R of length k > 1
Three possibilities for R:
R = R1 + R2
R = R1 R2
R = (R1)*
Therefore:
By induction, R1 and R2 represent some regular languages, L1 and L2
But L(R) = L(R1*) = L1*
so L(R) is regular, by the star theorem
If L is represented by a regexp, then L is regular
Give an NFA that accepts the language represented by (1(0 + 1))* 1 1,0
Regular expression: ( 1(0+1))*
Generalized NFAs (GNFA)
L can be represented by a regexp
L is a regular language
Idea: Transform an NFA for L into a regular expression by removing states and re-labeling the arcs with regular expressions
Rather than reading in just 0 or 1 letters from the string on a step, we can read in entire substrings
A GNFA is a 5-tuple G = (Q, , R, qstart, qaccept) Q, are states and alphabet
R : (Q-{qaccept}) (Q-{qstart}) R
is the transition function
qstart Q is the start state
qaccept Q is the (unique) accept state
R = set of all regular expressions over
A GNFA is a 5-tuple G = (Q, , R, qstart, qaccept)
Let w * and let G be a GNFA.
G accepts w if w can be written as w = w1 L wk where wi * and there is a sequence
r0, r1, ..., rk Q such that
• r0 = qstart
• wi matches R(ri-1, ri) for all i = 1, ..., k, and
• rk qaccept
L(G) = set of all strings that G accepts = “the language recognized by G”
Generalized NFA (GNFA)
cb
a*b q1 a q2
q0
This GNFA recognizes L(a*b(cb)*a) Is aaabcbcba accepted or rejected?
Is bba accepted or rejected? Is bcba accepted or rejected?
NFA
Add unique start and accept states
NFA
While the machine has more than 2 states:
Pick an internal state, rip it out and re-label the arrows with regexps,
to account for paths through the missing state
0 0
1
NFA
While the machine has more than 2 states:
Pick an internal state, rip it out and re-label the arrows with regexps,
to account for paths through the missing state
01*0
GNFA
While the machine has more than 2 states:
In general:
R(q1,q3)
R(q1,q2) Q2 R(q2,q3)
Q1 Q3
R(q2,q2)
GNFA
While the machine has more than 2 states:
In general:
R(q1,q2)R(q2,q2)*R(q2,q3) + R(q1,q3)
Q1 Q3
a a,b
q1 b q2 q3
q0
R(q0,q3) = (a*b)(a+b)*
represents L(N)
a,b
a*b q2 q3
q0
R(q0,q3) = (a*b)(a+b)*
represents L(N)
(a*b)(a+b)*
q0 q3
R(q0,q3) = (a*b)(a+b)*
represents L(N)
defines a new GNFA G
Formally: Given an DFA, add qstart and qacc to create G For all q,q’, define R(q,q’) to be if (q,) = q’, else
CONVERT(G): (Takes a GNFA, outputs a regexp)
If #states = 2 return R(qstart, qacc) If #states > 2
select qripQ different from qstart and qacc
define Q = Q – {qrip}
define R on Q’-{qacc} x Q’-{qstart} as:
R(qi,qj) = R(qi,qrip)R(qrip,qrip)*R(qrip,qj) + R(qi,qj)
return CONVERT(G) Claim:
L(G’) = L(G)
Theorem: Let R = CONVERT(G). Then L(R) = L(G).
Proof by induction on k, the number of states in G
Base Case: k = 2 CONVERT outputs R(qstart, qacc)
Inductive Step:
Assume theorem is true for k-1 state GNFAs
Let G have k states. Let G’ be the k-1 state GNFA obtained by ripping out a state.
We already claimed that L(G) = L(G)
G’ has k-1 states, so by induction,
L(G) = L(CONVERT(G’)) = L(R)
Therefore L(R)=L(G). QED
b
a
q1 q2
a
b
b a
q3
bb
b
a
q1 q2
a + ba
b
bb + (a + ba)b*a
q1
b + (a + ba)b*
(bb + (a + ba)b*a)* (b + (a + ba)b*)
Convert the NFA to a regular expression
a, b b
q1 q2
b
a b
q3
Convert the NFA to a regular expression
a, b b
q1 q2
b
a b
q3
Convert the NFA to a regular expression
q1 (a + b)b*b
a bb*b
q3
Convert the NFA to a regular expression
(a + b)b*b(bb*b)*q1
(a + b)b*b(bb*b)
((a + b)b*b(bb*b)*a)*( + (a + b)b*b(bb*b)*)
DFAs NFAs
DEFINITION
Regular Regular
Languages Expressions
Some Languages Are Not Regular:
Limitations on DFAs
Regular or Not?
C = { w | w has equal number of 1s and 0s}
NOT REGULAR!
D = { w | w has equal number of occurrences of 01 and 10 }
REGULAR!
{ w | w has equal number of occurrences of 01 and 10}= { w | w = 1, w = 0, or w = , or
w starts with a 0 and ends with a 0, or w starts with a 1 and ends with a 1 }
1 + 0 + + 0(0+1)*0 + 1(0+1)*1
Claim:
A string w has equal occurrences of 01 and 10 w starts and ends with the same bit.
The Pumping Lemma:
Structure in Regular Languages
Let L be a regular language
Then there is a positive integer P s.t.for all strings w L with |w| P there is a way to write w = xyz, where:
1. |y| > 0 (that is, y )
2. |xy| P
3. For all i 0, xyiz L
Why is it called the pumping lemma? The word w gets pumped into longer and longer strings…
Proof: Let M be a DFA that recognizes L
Let P be the number of states in M
Let w be a string where w L and |w| P
We show: w = xyz 1. |y| > 0
y 2. |xy| P
3. xyiz L for ALL i 0
x z
…
q0 j q|w|
There must exist j and k such that 0 j < k P, and qj = qk
Applying the Pumping Lemma
Let’s prove that
B = {0n1n | n 0} is not regular
By contradiction. Assume B is regular.
Let P be the number of states in a DFA for B. Let w = 0P1P







