Home/Blog/Data Science/Deterministic finite automata
Home/Blog/Data Science/Deterministic finite automata

Deterministic finite automata

Malik Jahan
11 min read

In the theory of computation, a language is a set of strings composed of alphabets. There are different categories of languages. Each category represents a set of computational problems. Commonly-known categories of languages are listed below:

One primitive category is regular languages.

A regular language is one that can be represented by a regular expression or finite automata.

A deterministic finite automata (DFA) is an abstract mathematical model composed of the following components:

  • Set of alphabets Σ\Sigma
  • Set of states QQ
  • Set of final states (F)(F) such that FQF \subseteq Q
  • Dedicated start state q0q_0 where q0Qq_0 \in Q

It also contains a set of transition functions where each transition function δ:Q×ΣQ\delta: Q \times \Sigma \rightarrow Q is defined as:

δ(q,a)=p\delta(q,a)=p

where p,qQp, q \in Q and aΣa \in \Sigma. If we read aa while being at the state qq, the transition will take us to the state pp.

The language of the DFA is defined as the set of strings that can be accepted by the DFA. A string is considered an accepted string if a sequence of transition functions can lead from q0q_0 to one of the final states of the DFA by reading each character of the string from left to right. The acceptance can be notated using an extended transition function δ\delta^*, as shown below:

δ(q0,w)F\delta^*(q_0,w) \in F

Here, ww is a string composed of the characters of Σ\Sigma. The language comprises all strings wΣw \in \Sigma^* for which δ(q0,w)F\delta^*(q_0,w) \in F.

Characteristics of a DFA#

A deterministic finite automata should fulfill the following characteristics:

  • It has a finite number of states.
  • It should have a dedicated start state.
  • A set of final states should be clearly designated.
  • From each state qQq \in Q, there must be exactly one transition for each alphabet aΣa \in \Sigma. This results in QΣ|Q|^{|\Sigma|} transitions in a DFA.

Notations#

A DFA is defined as a tuple (Q,Σ,δ,q0,FQ(Q,\Sigma,\delta,q_0,F \subseteq Q):

  • State: Each state qQq \in Q is notated as a labeled circle.
  • Start state: The start state q0q_0 of the DFA is notated as a labeled circle having an unlabeled incoming edge (arrow).
  • Final state: Each final state qFq' \in F is notated as a labeled double circle (circle in a circle).
  • Transition: Each transition is defined as a transition function δ:Q×ΣQ\delta: Q \times \Sigma \rightarrow Q notated as a labeled directed edge (arrow) going from a state to a state. The label has to be aΣa \in \Sigma.
Basic symbols used to define a DFA
Basic symbols used to define a DFA

Examples#

Let’s design DFAs for a few sample languages composed of different sets of alphabets.

Example 1: Binary strings of odd length#

Formally, the language will be described over Σ={0,1}\Sigma=\{0,1\} as follows:

L1={wΣ:w=2k+1 where kZ+}L_1=\{w \in \Sigma^*: |w|=2k+1 \text{ where }k\in\mathbb{Z}^+\}

Σ\Sigma^* notates a combination of alphabets of any size including zero. For example, 000000, 110110 and 1111111111 are members of L1L_1 because their length (number of characters) is odd.

To construct the deterministic finite automata for this language, we need to create two states. Each state is expected to take the responsibility of memorizing the odd and even lengths of the input. In other words, each of the states will be used to memorize w%2|w|\%2, i.e. 00 or 11. We don’t have any external memory in DFAs. Therefore, the limited amount of information can be hardwired in the states. For our convenience, we can label the states as qevenq_{even} and qoddq_{odd}. With every new input character, the state will change from odd to even or even to odd. The resultant DFA will look like the one shown below:

DFA to accept strings of odd length
DFA to accept strings of odd length
Question 1

Does the DFA above fulfill all the required characteristics of a DFA?

Show Answer
Question 2

What will be the behavior of the DFA on the input (ww) 110110?

Show Answer
Question 3

What will be the behavior of the DFA on the input (ww) 10011001?

Show Answer

Example 2: Counting characters of each type#

Formally, the language will be described over Σ={a,b}\Sigma=\{a,b\} as follows:

L2={wΣ:the number of a’s is even and the number of b’s is odd}L_2=\{w \in \Sigma^* : \text{the number of } a \text{'s is even and the number of }b \text{'s is odd} \}

The language L2L_2 contains strings like bb, aabaab, abaaba, bbbaabbbaa, bababbabab, etc.

There are two types of counters involved in the recognition of this language. After a transition, the number of aa's could be even or odd. Similarly, after a transition the number of bb's could be even or odd. At the start of the DFA recognizing this language, both of these counters will be even because the number of aa's and the number of bb's are zero at the start. In other words, the number of aa's is even and the number of bb's is even at the start state, i.e. qevenq_{even}. We have four such possible combinations that need to be modeled in the DFA. A combination with an even number of aa's and an even number of bb's will be represented by a state in the DFA notated as qEEq_{EE}, where EEEE notates even and even. The first symbol in this notation represents the number of aa's and the second symbol notates the number of bb's. We model the DFA using this notation of the states and link them with the corresponding transitions as shown below.

DFA to accept strings containing an even number a's and an odd number of b's
DFA to accept strings containing an even number a's and an odd number of b's
Question 1

What will be the behavior of the DFA above if we give the string (ww) aba as an input to it?

Show Answer
Question 2

What will be the behavior of the DFA on the input string (ww) ba?

Show Answer

Example 3: Recognizing integers divisible by 3#

The language containing integers that are divisible by 3 can be formally described as follows over Σ={0,1,2,...,9}\Sigma=\{0,1,2,...,9\}:

L3={wΣ:w%3=0 and w1}L_3=\{w\in \Sigma^*: w \% 3 = 0 \text{ and }|w| \geq 1\}

This language is composed of non-empty strings that can contain any digit from 00 to 99. It’s given that any integer (which is a string made up of characters of Σ\Sigma) can be a multiple of 33 or not. When it’s a multiple of 33 then the remainder will be zero. When the given string isn’t a multiple of 33 then the possible remainder can be either 11 or 22. Memorizing the remainders is important in this context. So, each state is expected to represent a possible remainder when the string consumed till that state is divided by 33. We label these states as 00, 11, and 22 to represent the respective remainders. Additionally, we need to have a separate start state ss that will take us to either of the states notating a remainder.

For example, if the starting digit of the input is 11, 44, or 77 then its remainder (when divided by 33) is 11. These three possible input characters should take us to the state 11. When at state 11, if the next digit is 00, it makes up the string 1010, 4040, or 7070, depending on what the first character was. When divided by 33, its remainder remains 11. That’s why the loop with the labels of these possible input characters keeps us in the same state. But if we read 11, 44 or 77 after the first character, the second character will be appended with the remainder computed so far, i.e. 11. So, the input string of two characters will be of the form 1111, 1414 or 1717. When divided by 3, it will take us to state 22 because the remainder of all these combinations when divided by 33 is 22. Similarly, if the second input character is 22, 55, or 88 then the resultant becomes 1212, 1515, or 1818 after appending with the remainder 11. Because its remainder when divided by 33 is 00, it takes us to state 00. This logic is replicated across all possible transitions from all the states.

DFA to accept integers that are divisible by 3
DFA to accept integers that are divisible by 3
Question 1

Should the input (ww) 222222 be accepted by the DFA? How will the DFA behave on it?

Show Answer
Question 2

Should the input string (ww) 404404 be accepted by the DFA? How will it behave on this input?

Show Answer
Question 3

When is a string rejected by the DFA?

Show Answer

Applications of DFA#

DFAs can be used to model many interesting real-world applications. Some examples are listed below:

Way forward#

We hope that you enjoyed this introduction to DFA. For an in-depth understanding of the theory of automata, you can explore the following:

Theory of Computation

Cover
Theory of Computation

What are the mathematics behind computers? What are the theoretical foundations of computer languages? Such questions can be explored by understanding formal languages and automata models. In this comprehensive course, you’ll explore formal languages, regular languages, and how to model them with their associated automata models. Next, you’ll cover regular expressions and grammar, and their equivalents. Then, you’ll explore context-free languages (the foundations of programming languages), pushdown automata, and the relationship between them. Toward the end, you’ll learn about recursively enumerable languages, Turing machines with their variations, context-sensitive languages, and unrestricted grammars. By the end of the course, you’ll have gained a deep understanding of several formal languages and their associated automata. Also, since there are plenty of exercises throughout the course, your understanding and problem-solving skills will be thoroughly reinforced.

15hrs
Beginner
10 Playgrounds
9 Quizzes

  

Free Resources