Decision Algorithms

Learn about the dynamic programming-based CYK algorithm to parse context-free languages.

We'll cover the following...

- The CYK algorithm for parsing CFLs
  - Stage 1
  - Stage 2
  - Stage 3
  - Stage 4
- Tabular method for CYK Algorithm
- Diagonal (Stage) 4
  - Diagonal approach in action

The CYK algorithm for parsing CFLs

The most fundamental question to ask about a context-free language is whether a given string is in the language (this is the parsing problem). If a deterministic pushdown automaton (DPDA) for the language is available, then the answer is as simple as running the string through it, but many context-free languages are not deterministic. There is an algorithm, however, that works for all context-free languages, and this is where the Chomsky normal form (CNF) for context-free grammars is useful.

Recall that rules in a CNF grammar come in only two forms: the right-hand side of a rule is either a single terminal or exactly two nonterminals. The algorithm we have in mind begins by matching the symbols of the string to the first type of rule, inferring the variables that produce each symbol. It then looks for rules whose right-hand sides consist of those variables, so we can trace back to the variables from which they originated. Continuing in this manner, if we eventually reach the start variable for the entire string, then we know the string is in the language.
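The bottom-up idea described above can be sketched in Python. This is a minimal illustration, not the lesson's own presentation: the function name `cyk`, the dictionary-based grammar encoding, and the toy grammar for $a^n b^n$ used in the example are all assumptions made for this sketch, and the special rule producing the empty string is ignored.

```python
from itertools import product


def cyk(word, terminal_rules, binary_rules, start):
    """Return True if `word` can be derived from `start` in a CNF grammar.

    terminal_rules: maps a terminal to the set of variables that produce it.
    binary_rules:   maps a pair (B, C) to the set of variables A with A -> B C.
    (Both encodings are assumptions made for this sketch.)
    """
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle the empty string

    # table[(i, length)] = variables that derive word[i : i + length]
    table = {}

    # Substrings of length 1: match each symbol against the terminal rules.
    for i, symbol in enumerate(word):
        table[(i, 1)] = set(terminal_rules.get(symbol, ()))

    # Longer substrings: combine results already computed for shorter ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = set()
            for split in range(1, length):
                left = table[(i, split)]
                right = table[(i + split, length - split)]
                for pair in product(left, right):
                    cell |= binary_rules.get(pair, set())
            table[(i, length)] = cell

    return start in table[(0, n)]


# Hypothetical toy grammar (NOT the lesson's grammar), generating a^n b^n, n >= 1:
#   S -> A T | A B,   T -> S B,   A -> a,   B -> b
toy_terminal_rules = {"a": {"A"}, "b": {"B"}}
toy_binary_rules = {("A", "T"): {"S"}, ("A", "B"): {"S"}, ("S", "B"): {"T"}}

print(cyk("aabb", toy_terminal_rules, toy_binary_rules, "S"))  # True
print(cyk("aab", toy_terminal_rules, toy_binary_rules, "S"))   # False
```

Keying the table by start position and substring length means every entry is computed only from entries for strictly shorter substrings, which is the dynamic programming structure the algorithm relies on.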

This algorithm, sometimes referred to as the "CYK algorithm" (in honor of Cocke, Younger, and Kasami, who discovered it independently), uses a problem-solving technique called dynamic programming to obtain the result. Dynamic programming solves problems in stages and uses the results from previous stages to solve later ones. We’ll illustrate the algorithm by testing the string $baaa$ with the following CFG in Chomsky normal form:

We can easily see that this grammar does generate the string $baaa$. In fact, the grammar is ambiguous and there are multiple, distinct parse trees for this string. But let’s get to the algorithm.

Stage 1

We begin by finding all the variables that can produce substrings of length 1, that is, the variables that directly produce the terminal symbols $a$ or $b$. For the grammar in question, we observe that $b$ comes from $X$ and $a$ is generated by the variables $X$, $Y$, and $A$, which we denote as follows:
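One possible way to record these Stage 1 sets is sketched below in Python. The terminal rules come from the observation above ($b$ from $X$; $a$ from $X$, $Y$, and $A$), while the names `terminal_rules` and `stage1` are illustrative assumptions rather than the lesson's notation.

```python
# Terminal rules as stated in the text: b comes from X; a comes from X, Y, and A.
terminal_rules = {"a": {"X", "Y", "A"}, "b": {"X"}}

word = "baaa"

# stage1[i] holds the variables that directly produce the i-th symbol of the word.
stage1 = [terminal_rules[symbol] for symbol in word]

for position, (symbol, variables) in enumerate(zip(word, stage1), start=1):
    print(position, symbol, sorted(variables))
```

Each entry corresponds to one substring of length 1, so a string of four symbols yields four sets at this stage.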
