What is a non-capturing group in regular expressions?

A regular expression, also known as regex or regexp, is a sequence of characters that specifies a search pattern. It is a powerful tool used by programming languages to find, replace, and validate strings of text based on patterns rather than just string literals.

Many programming languages, including JavaScript, Python, Perl, and Ruby, support regular expressions. They are also commonly used by text editors such as Sublime Text, Notepad++, and Vim.

Note: Learn more about regular expressions at Regular Expressions for Programmers.

Regular expressions also allow us to create complex search patterns. One of their most useful features is the capture group, which lets us extract specific parts of a matched string. However, sometimes we want to group parts of a regular expression without capturing them. This is where non-capturing groups come in.

What is a non-capturing group?

A non-capturing group is a way to group a set of characters or expressions in a regular expression without capturing the matched text. In contrast to capturing groups, non-capturing groups are marked by a special syntax that tells the regular expression engine not to store the matched text in a separate memory slot.

Here is the syntax for a non-capturing group in JavaScript:

(?:expression)
Non-capturing group regex syntax

The (?:...) syntax denotes a non-capturing group, and expression represents the regular expression pattern to be matched inside it.
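As a quick illustration of the difference, the sketch below runs the same pattern once with a capturing group and once with a non-capturing group. The sample string and the variable names are made up for this illustration.

```javascript
// Sample input; the string and variable names are illustrative only.
const text = "ababab";

// Capturing group: the last repetition matched by the group is stored at index 1.
const capturing = text.match(/(ab)+/);

// Non-capturing group: same overall match, but nothing is stored for the group.
const nonCapturing = text.match(/(?:ab)+/);

console.log(capturing[0], capturing.length);       // ababab 2
console.log(capturing[1]);                         // ab
console.log(nonCapturing[0], nonCapturing.length); // ababab 1
```

Both patterns match the same text; only the number of entries in the result array differs.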

Example

Let's look at a simple example:

Suppose we have a string that contains a phone number in the format (123) 456-7890. We want to match the entire phone number, and we want to group the digits that follow the area code (i.e., the digits after the closing parenthesis) without storing them separately. Here is how we can use a non-capturing group to achieve this:

const str = "(123) 456-7890";
const regex = /\(\d{3}\)\s*(?:\d{3}-\d{4})/;
const match = str.match(regex);
console.log(match[0]); // "(123) 456-7890"
Regular expression to match a phone number

In this example, the regular expression /\(\d{3}\)\s*(?:\d{3}-\d{4})/ matches a phone number in the format (123) 456-7890. The \(\d{3}\) pattern matches the area code: three digits enclosed in literal parentheses. The \s* pattern matches zero or more whitespace characters between the area code and the remaining digits. The non-capturing group (?:\d{3}-\d{4}) groups the remaining digits of the phone number without capturing them, so the resulting array holds only the full match and no separate group entries. Finally, we use the match method to find the first match of the regular expression in the string, and we access the entire match using the [0] index of the resulting array.

Another example is using a non-capturing group to match a string that starts with either Mr. or Ms., followed by a name:

const regex = /^(?:Mr\.|Ms\.) ([A-Za-z]+)$/;
const str1 = 'Mr. Smith';
const str2 = 'Ms. Johnson';
console.log(regex.test(str1)); // true
console.log(regex.test(str2)); // true
Regular expression to match a string

In this example, the non-capturing group (?:Mr\.|Ms\.) matches either Mr. or Ms., but does not capture the matched text. The rest of the regular expression matches a space, followed by one or more letters (using the character class [A-Za-z] with the + quantifier, wrapped in a capturing group that stores the name). The ^ and $ anchors require the pattern to span the whole string, from start to end.

Note: The test method of the regular expression object is used to check whether the regular expression matches the given string. It returns true if the regular expression matches the string, and false otherwise.
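To see what the title pattern above does and does not store, we can inspect the array returned by exec instead of the boolean returned by test. This is a small sketch; the names titleRegex and result are chosen for this illustration.

```javascript
// Same pattern as above: the title is grouped but not captured,
// while the name is captured.
const titleRegex = /^(?:Mr\.|Ms\.) ([A-Za-z]+)$/;
const result = titleRegex.exec("Ms. Johnson");

console.log(result[0]);     // Ms. Johnson
console.log(result[1]);     // Johnson
console.log(result.length); // 2

// A title outside the alternation does not match at all.
console.log(titleRegex.test("Dr. Brown")); // false
```

Because the title group is non-capturing, the name lands at index 1 rather than index 2.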

Example

The following is an example to try out, with the explanation given below:

// Regular expression with capturing groups
const regexWithCapturingGroup = /(\d{3})-(\d{3})-(\d{4})/;
const phoneNumber1 = '123-456-7890';
const phoneNumber2 = '555-867-5309';
console.log(phoneNumber1.match(regexWithCapturingGroup)); // Output: ["123-456-7890", "123", "456", "7890"]
console.log(phoneNumber2.match(regexWithCapturingGroup)); // Output: ["555-867-5309", "555", "867", "5309"]
// Regular expression with a non-capturing group
const regexWithNonCapturingGroup = /(?:\d{3})-(\d{3})-(\d{4})/;
const phoneNumber3 = '123-456-7890';
const phoneNumber4 = '555-867-5309';
console.log(phoneNumber3.match(regexWithNonCapturingGroup)); // Output: ["123-456-7890", "456", "7890"]
console.log(phoneNumber4.match(regexWithNonCapturingGroup)); // Output: ["555-867-5309", "867", "5309"]

Explanation

In this example, we have two regular expressions:

  • one with a capturing group

  • one with a non-capturing group

The regular expressions are used to match phone numbers in the format xxx-xxx-xxxx.

The first regular expression, regexWithCapturingGroup, uses capturing groups to capture each part of the phone number: the first three digits, the second three digits, and the last four digits. When we call match on the phone numbers, the regular expression returns an array of matches, with the entire phone number as the first element, and each of the capturing groups as subsequent elements.

The second regular expression, regexWithNonCapturingGroup, uses a non-capturing group to match the first three digits of the phone number, but does not capture them. This means that when we call match, the first element of the resulting array will still be the entire phone number, but the second and third elements will be the second three digits and the last four digits, respectively.

By using a non-capturing group in this case, we avoid capturing unnecessary information and keep the result array limited to the parts we actually need, which can make the code that consumes the match easier to read.

Advantages of using non-capturing groups

Using non-capturing groups in regular expressions can offer several benefits over capturing groups:

  1. Potentially improved performance and efficiency of regular expressions: Capturing groups require the regex engine to record the text matched by each group. Non-capturing groups skip this bookkeeping, which can make them slightly faster and lighter on memory, although the difference is usually small and depends on the engine and the pattern.

  2. Avoidance of side effects caused by capturing groups: Capturing groups can have unintended side effects in some situations. For example, every capturing group is assigned a number, so adding one shifts the numbers of the groups after it. A back reference in a replacement string, such as $1, may then refer to different text than intended, and the replacement may include unwanted content. Non-capturing groups can help avoid such side effects by not capturing the text in the first place.

  3. Simplification of regular expressions and better code readability: By using non-capturing groups, we can simplify our regular expressions and make them easier to read and understand. Non-capturing groups signal to other developers that their matched text is not important for the operation of the regular expression and that the group is only being used for grouping purposes.
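The second advantage can be illustrated with the standard replace and split string methods, both of which react to capturing groups in the pattern. The sample strings and variable names below are assumptions made for this sketch.

```javascript
const fullName = "Ms. Ada Lovelace"; // illustrative input

// Capturing the title shifts the numbering: $1 is the title, $2 the first name.
const shifted = fullName.replace(/^(Mr\.|Ms\.) (\w+) (\w+)$/, "$2, $1");

// With a non-capturing title, $1 and $2 are the first and last names.
const intended = fullName.replace(/^(?:Mr\.|Ms\.) (\w+) (\w+)$/, "$2, $1");

console.log(shifted);  // Ada, Ms.
console.log(intended); // Lovelace, Ada

const fruits = "apples, oranges; pears"; // illustrative input

// split() adds captured separators to its result array.
const withSeparators = fruits.split(/(,|;)\s*/);

// A non-capturing group keeps the separators out of the result.
const withoutSeparators = fruits.split(/(?:,|;)\s*/);

console.log(withSeparators);    // [ 'apples', ',', 'oranges', ';', 'pears' ]
console.log(withoutSeparators); // [ 'apples', 'oranges', 'pears' ]
```

In both cases the patterns match the same text; only the captured content, and therefore the result, changes.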

When to use non-capturing groups

Non-capturing groups are useful in situations where we need to group a set of characters or expressions, but we don't need to capture the matched text. For example:

  • When a group exists only so that a quantifier or an alternation can apply to it, and there is no need to capture the matched text.

  • When we want to group together a set of characters or expressions but don't need to reference them later in the regular expression.

  • When we want to improve the performance of a regular expression by not capturing text that we don't need.
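A common case of grouping without capturing is applying a quantifier to a repeated sub-pattern. The sketch below uses a hypothetical dottedQuad pattern that checks only the shape of a dotted address, not whether each number is in a valid range.

```javascript
// Illustrative pattern: four dot-separated runs of one to three digits.
// It checks only the shape, not that each number is within 0-255.
const dottedQuad = /^\d{1,3}(?:\.\d{1,3}){3}$/;

console.log(dottedQuad.test("192.168.0.1")); // true
console.log(dottedQuad.test("192.168.0"));   // false

// The group was needed only for the {3} quantifier, so nothing extra is stored.
const quadMatch = "192.168.0.1".match(dottedQuad);
console.log(quadMatch.length); // 1
```

Here the group is never referenced later, so a non-capturing group states that intent directly.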

Conclusion

Non-capturing groups are a useful feature of regular expressions in JavaScript that can help simplify result handling, avoid unwanted side effects, and in some cases improve performance. Using non-capturing groups can help us write more efficient and maintainable regular expressions in our code.

Copyright ©2024 Educative, Inc. All rights reserved