Derivation Trees and Ambiguous Grammars
Learn about derivation trees and their advantages.
Derivation trees
Derivations of strings in context-free languages have an associated tree representation called a derivation tree (or parse tree or syntax tree). The advantage of derivation trees is that they reveal the hierarchical structure of a derivation. The following diagram shows the tree for the string, “The cow annoys the student.” For those who diagrammed sentences in elementary school, this will have a somewhat familiar look.
The rewrite rules are also called production rules, or just productions for short. Consider the following simplistic grammar, which generates random sentences:
The derivation tree clearly depicts the order of substitution in a top-down fashion. The leaves of the tree, enclosed in boxes here for emphasis, are the terminal substrings, which we read from the tree left-to-right to construct the sentence. (We arbitrarily insert spaces between the words for readability.)
The following grammar forms simple mathematical expressions using addition, multiplication, and parentheses for grouping:
Let’s derive the expression and draw its derivation tree:
It is interesting that with this grammar, it is possible to obtain a different leftmost derivation and associated derivation tree for the very same string:
Both trees yield the same sentence, but there is an important structural difference between the two trees. Which one is correct?
From a mathematical perspective, the first tree in the derivation of is correct because it represents the correct order of operations. The first tree says, “The entire expression is the sum of two subexpressions.” What are those two subexpressions? The first subexpression is and the second is “The product of and .” Operations further down in the tree evaluate first so that their results are available for the operators higher up in the tree. This is how derivation trees represent the precedence of operations: the operations further down in the tree evaluate first.
Note: Operations at the lower levels of a derivation tree evaluate first.
Operator precedence
A grammar that gives two distinct derivation trees (or two distinct leftmost derivations) for the same string is an ambiguous grammar. As with nondeterminism, ambiguity is not our friend—it makes implementations difficult. In this case, However, in this case, we can modify the grammar to be unambiguous by reflecting the order of operations in the grammar. The usual order of precedence in mathematics is:
- Expressions grouped by parentheses.
- Exponents.
- Multiplication and division.
- Addition and subtraction.
Therefore, we want the higher-precedence operations to occur further down in derivation trees, so we will represent them by variables “further down” in the grammar. See the revised grammar below:
Addition is at the top of the grammar, so it will appear at the top of a derivation tree, unless it is inside parentheses. Let’s now derive the expression ...