...

/

Derivation Trees and Ambiguous Grammars

Derivation Trees and Ambiguous Grammars

Learn about derivation trees and their advantages.

Derivation trees

Derivations of strings in context-free languages have an associated tree representation called a derivation tree (or parse tree or syntax tree). The advantage of derivation trees is that they reveal the hierarchical structure of a derivation. The following diagram shows the tree for the string, “The cow annoys the student.” For those who diagrammed sentences in elementary school, this will have a somewhat familiar look.

The rewrite rules are also called production rules, or just productions for short. Consider the following simplistic grammar, which generates random sentences:

Press + to interact
Derivation tree for "The cow annoys the student"
Derivation tree for "The cow annoys the student"

The derivation tree clearly depicts the order of substitution in a top-down fashion. The leaves of the tree, enclosed in boxes here for emphasis, are the terminal substrings, which we read from the tree left-to-right to construct the sentence. (We arbitrarily insert spaces between the words for readability.)


The following grammar forms simple mathematical expressions using addition, multiplication, and parentheses for grouping:

Let’s derive the expression a+bca+b*c and draw its derivation tree:

Press + to interact
Derivation tree for a+b*c
Derivation tree for a+b*c

It is interesting that with this grammar, it is possible to obtain a different leftmost derivation and associated derivation tree for the very same string:

Press + to interact
Another derivation tree for a+b*c
Another derivation tree for a+b*c

Both trees yield the same sentence, but there is an important structural difference between the two trees. Which one is correct?

From a mathematical perspective, the first tree in the derivation of a+bca+b*c is correct because it represents the correct order of operations. The first tree says, “The entire expression is the sum of two subexpressions.” What are those two subexpressions? The first subexpression is aa and the second is “The product of bb and cc.” Operations further down in the tree evaluate first so that their results are available for the operators higher up in the tree. This is how derivation trees represent the precedence of operations: the operations further down in the tree evaluate first.

Note: Operations at the lower levels of a derivation tree evaluate first.

Operator precedence

A grammar that gives two distinct derivation trees (or two distinct leftmost derivations) for the same string is an ambiguous grammar. As with nondeterminism, ambiguity is not our friend—it makes implementations difficult. In this case, However, in this case, we can modify the grammar to be unambiguous by reflecting the order of operations in the grammar. The usual order of precedence in mathematics is:

  1. Expressions grouped by parentheses.
  2. Exponents.
  3. Multiplication and division.
  4. Addition and subtraction.

Therefore, we want the higher-precedence operations to occur further down in derivation trees, so we will represent them by variables “further down” in the grammar. See the revised grammar below:

Addition is at the top of the grammar, so it will appear at the top of a derivation tree, unless it is inside parentheses. Let’s now derive the expression a+bca+b*c ...

Access this course and 1400+ top-rated courses and projects.