What is a context-free grammar?

A context-free grammar (CFG) is a set of recursive rules we use to form strings of a particular context-free language. In simple words, we use CFGs to form context-free languages.

Components

The following are the components of a CFG:

  • Terminal symbols: These are the set of characters involved in a language, and the building blocks of the strings. These remain on the right side of the production rule.

  • Non-terminal symbols: They are involved in the string production process. That is, they generate terminal symbols. These always appear on the left side of the production rule.

  • Start symbol: This is a non-terminal symbol we specify to start the production chain.

  • Production rules: These are the rules that specify the replacement of non-terminals. They are of the following form: nonterminal  set of nonterminals and terminalsnon-terminal\space \to \space set\space of\space non-terminals\space and\space terminals

The components above constitute the 4 tuples (S, V, T, P) of a CFG:

  • S: The starting variable/symbol

  • V: A set of non-terminals

  • T: A set of terminals

  • P: A set of production rules

The steps to produce a string

A parse tree forms from a CFG that allows us to keep a step-by-step record of the string production. The steps are as follows:

  1. Begin the parse tree with the start symbol. This is the root.

  2. Select a production path from the right side of the start symbol and branch it out of the root of the parse tree.

  3. Repeat the process of step 2 until all the branches reach the terminals or epsilon (null) at their leaves.

Examples

CFGs are formed by the hit and trial method. That is, there is no formula we can use to create CFGs.

  • Consider a language L=(ε,a,aa,aaa,aaa,.......)L = (ε, a, aa, aaa, aaa,.......)or L=(a)L = (a^*).

By enumerating Kleene's star, we see that if we produce an aa from a non-terminal and apply recursion to it, we form a string of aas.

Let's form a parse tree for aaaa from our created CFG to check if our strings form in the correct order.

%0 node_1 S node_2 a node_1->node_2 node_3 S node_1->node_3 node_1657106827673 a node_3->node_1657106827673 node_1657106870157 S node_3->node_1657106870157 node_1657106881813 ε node_1657106870157->node_1657106881813

We read the leaves from left to right to form the final string. In this case, it is aaεaaε. As epsilon is useless to us, we write aaaa as our final string.

  • Now, let's suppose L=(Palindrome on (a,b)).L = (Palindrome \space on \space (a,b)^*).

So, we want a string's right and left sides to be identical. In this case, we can keep the production symbol in the middle and put the same character on both sides of the production symbol, like this:

Now, let's build a parse tree for the palindrome ababaababa.

%0 node_1 S node_2 a node_1->node_2 node_3 S node_1->node_3 node_4 a node_1->node_4 node_1657108021792 b node_3->node_1657108021792 node_1657108018265 S node_3->node_1657108018265 node_1657108022722 b node_3->node_1657108022722 node_1657108052299 a node_1657108018265->node_1657108052299

As the string starts and ends with aa, we use the first production of SS for the second level. Then, we use the second production for the second level, and so on. If we read the leaves from left to right, we get ababaababa.

Note: By checking the CFG with a few more examples, we can know if our grammar generates such strings for all palindromes.

Copyright ©2024 Educative, Inc. All rights reserved