Feature #1: Remove Comments

Implementing the "Remove Comments" feature for our "Language Compiler" project.

Description

The first feature we will build removes comments from a piece of code before it is compiled. C++ has two types of comments: inline comments and block comments.

  • Inline comment: The string // starts an inline comment, which means that all characters to the right of // on the same line are ignored.
  • Block comment: A block comment is enclosed between non-overlapping occurrences of /* and */. Everything between these delimiters is ignored. Occurrences are matched in reading order, that is, line by line from left to right. Note that the string /*/ does not end the block comment, because the closing */ would overlap the opening /*.

Note: If comment markers are nested, the comment that starts first takes precedence over the others. In other words, if the string // occurs in a block comment, it is ignored. Similarly, if the string /* occurs in an inline or block comment, it is also ignored. For example:

  • // this is an inline /* comment */
  • /* this is a /* // block comment */
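The non-overlapping rule and the precedence rule can be illustrated with a small sketch. The helper name blockCommentEnd is hypothetical and not part of the project; it assumes the caller already knows where a /* opener starts and only wants to find where that comment ends.

```cpp
#include <string>

// Hypothetical helper: given the index of a "/*" opener in `text`, returns the
// index just past the closing "*/", or std::string::npos if the comment is not
// closed within `text`.
std::size_t blockCommentEnd(const std::string& text, std::size_t open) {
    // Start searching after both characters of the opener, so the '*' in "/*"
    // cannot be reused as the first character of "*/". This is why "/*/" does
    // not close the comment. Any "//" or "/*" found along the way is skipped,
    // because only "*/" is searched for.
    std::size_t close = text.find("*/", open + 2);
    return close == std::string::npos ? std::string::npos : close + 2;
}
```

Under these assumptions, calling the helper on the second example above with an opener index of 0 returns the length of the whole string, because the inner /* and // are ignored and only the final */ closes the comment.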

When implementing this feature, we first have to detect the comments in the code and then remove them. If a line of code becomes empty after its comments are removed, we should discard that line as well. A line that contains only tabs or spaces is not considered empty, so it is kept. We can also assume that // and /* only appear as comment markers and never as part of strings or statements in the code.
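One possible way to implement these rules is a single pass over the characters of each line, with a flag that records whether we are currently inside a block comment. The sketch below is a minimal illustration, not the course's reference solution. The function name removeComments, the choice of a vector of lines as input and output, and the decision to join the text before a multi-line block comment with the text after it into one output line are all assumptions made for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical function: returns the source lines with // and /* */ comments
// removed. Lines that become empty are dropped; whitespace-only lines are kept.
std::vector<std::string> removeComments(const std::vector<std::string>& source) {
    std::vector<std::string> result;
    std::string pending;   // code characters kept for the current output line
    bool inBlock = false;  // true while inside an unterminated /* ... */

    for (const std::string& line : source) {
        std::size_t i = 0;
        while (i < line.size()) {
            const bool hasNext = i + 1 < line.size();
            if (inBlock) {
                // Inside a block comment only "*/" matters; "//" and "/*" are ignored.
                if (hasNext && line[i] == '*' && line[i + 1] == '/') {
                    inBlock = false;
                    i += 2;
                } else {
                    ++i;
                }
            } else if (hasNext && line[i] == '/' && line[i + 1] == '/') {
                break;  // inline comment: ignore the rest of this line
            } else if (hasNext && line[i] == '/' && line[i + 1] == '*') {
                inBlock = true;
                i += 2;  // skip both characters so "/*/" cannot close the comment
            } else {
                pending.push_back(line[i]);
                ++i;
            }
        }
        // Assumption: while a block comment is still open, the line break is
        // treated as part of the comment, so output continues on the same line.
        if (!inBlock) {
            if (!pending.empty()) {
                result.push_back(pending);
            }
            pending.clear();
        }
    }
    return result;
}
```

For instance, given the two lines "int a; // set a" and "/* note */", this sketch returns the single line "int a; " (the trailing space is kept), because the second line becomes empty and is discarded. Each character is visited once, so the running time is linear in the total size of the input.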

Imagine that we have to remove comments from the following C++ code:
