Currently I´m trying to learn and understand formal languages and grammatics. I understand the Chomsky hierarchy but I found an task where I don´t know how they got the solution. The task is:
G=({S},{a,b},S,P)
P={S->epsilon, S->aS, S->Sb}
- What is the maximal type of this grammar?
- What is the maximal type of
L(G)
?
I know that the grammar is type 2, but in the answer was written that L(G)
is
type 3.
It seems that there is also a type 3 grammar describing this language, but how do you know which is the maximal type of a formal language? Is there some trick?
I don't think there's some easy, algorithmic way to always tell what class of language
L(G)
is; I mean to say, in general, I think there are cases that simply cannot be proven one way or the other. This from incompleteness/undecidability, given that unrestricted grammars can encode arithmetic. This is very hand-wavy.Heuristically, you can try to understand what your language is, you can transform the grammar to try to force the grammar to be simpler, etc., but these do not guarantee success.
In this particular case, it's not too bad to figure out what the language is, recognize it is regular, and write down a grammar for it.
Any string derived from
S
can begin with any number ofa
s byS -> aS
, and can end with any number ofb
s byS -> Sb
. We might hypothesize that the language isa*b*
, and then prove it by showing (1) L(G) is a subset of ab and (2) ab is a subset of L(G).(1) The proof is by induction on the number of applications of
S -> aS
andS -> Sb
. Base case: if neither of these is applied, only the stringe
is derivable.e
is ina*b*
, so the base case is confirmed. Induction hypothesis: assume any string derived by up to and includingk
applications ofS -> aS
andS -> Sb
is ina*b*
. Induction step: we show that any string derived usingk+1
applications of these productions is also ina*b*
. Any string derived ink+1
productions has one derived ink
productions by skipping thek+1
st production. This is because the productions keep the nonterminal sets the same. We already know this shorter string derived ink
applications is ina*b*
. We have two cases to consider:S -> aS
: this adds onea
to the front of the shorter string ina*b*
, keeping it ina*b*
.S -> Sb
: this adds oneb
to the end of the shorter string ina*b*
, keeping it ina*b*
.Since both cases keep the string in
a*b*
, all strings derived ink+1
applications of the productions are ina*b*
and, by induction, all strings generated by the grammar are.(2) Consider any string
a^n b^m
ina*b*
. It is generated by the grammar as follows:n
applications ofA -> aS
,m
applications ofB -> Sb
, and one application ofS -> e
. Thus all strings ina*b*
are generated by the grammar.