What is the grepl() function in R?

Key takeaways:

  • grepl() checks for pattern matches in strings.

  • grepl() returns a logical vector (TRUE/FALSE) indicating whether each element contains the pattern.

  • Parameters:

    • pattern: RegEx pattern to search for

    • x: Character vector to search within

    • ignore.case: TRUE for case-insensitive matching

    • perl: TRUE for Perl-compatible regex

    • fixed: TRUE for literal string matching

    • useBytes: TRUE for byte-by-byte matching

  • Common use cases for grepl() include filtering data, finding patterns, and text analysis.

Regular expressions (RegEx), introduced by the mathematician Stephen Cole Kleene, are a compact notation for searching, matching, and manipulating text patterns. In R, grepl() applies a regular expression to each element of a character vector and reports whether it matches, which makes it a convenient building block for pattern matching in data.

The grepl() function

The name “grepl” stands for “grep logical.” The grepl() function searches each element of a character vector for a match to a given pattern (a character or sequence of characters) and reports whether one was found.

Basic usage of the grepl() function

The grepl() function helps with various tasks, such as quickly identifying and extracting rows that match a specific pattern, locating keywords or phrases in text data, and determining which elements of a character vector contain a given pattern.
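
As a minimal sketch of the row-filtering task mentioned above, the example below uses a small, made-up data frame (the vehicles data and its column names are assumptions for illustration only). The logical vector returned by grepl() is used directly as a row index.

```r
# Hypothetical example data; the column names are assumptions for illustration
vehicles <- data.frame(
  model = c("City Car", "Mountain Bike", "Cargo Bike", "Sports Car"),
  wheels = c(4, 2, 2, 4),
  stringsAsFactors = FALSE
)

# Logical vector marking the rows whose model contains "Bike"
is_bike <- grepl("Bike", vehicles$model)

# Keep only the matching rows
bikes <- vehicles[is_bike, ]
print(bikes)
```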

How to use grepl() in R

The syntax for the grepl() function is as follows:

grepl(pattern, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)

Parameters

  • pattern: A character string containing the regular expression (or the literal string, if fixed = TRUE) to match against each element of x.

  • x: The character vector to search in.

  • ignore.case: A logical value. If TRUE, matching is case-insensitive. This is optional and defaults to FALSE.

  • perl: A logical value. If TRUE, Perl-compatible regular expressions (RegExps) are used. This is optional.

  • fixed: A logical value. If TRUE, the pattern is matched as a literal string rather than as a regular expression. This is optional.

  • useBytes: A logical value. If TRUE, matching is done byte-by-byte instead of character-by-character, which can be faster. This is optional.

Return value

The grepl() function returns a logical vector with one element for each element of x: TRUE where the pattern is found and FALSE where it is not.
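
Because the result is an ordinary logical vector, it can be counted, located, or used as an index. The following is a small sketch with made-up input values (the fruit names are an assumption for illustration):

```r
# Hypothetical input vector
fruits <- c("apple", "banana", "grape")

# One TRUE/FALSE per element of fruits
matches <- grepl("ap", fruits)
print(matches)

# TRUE counts as 1, so sum() gives the number of matching elements
n_matches <- sum(matches)

# Positions of the matching elements
positions <- which(matches)

# The matching elements themselves
matched <- fruits[matches]
```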

Using grepl() for basic pattern matching

Let’s see the code below:

# Creating string vector
x <- c("CAR", "BIKE")
# Calling the grepl() function
grepl("CA", x)

From the output of the code:

[1]  TRUE FALSE

The first value is TRUE because CA occurs in the first element of the vector, CAR. The second value is FALSE because CA does not occur in the second element, BIKE.

Using ignore.case of grepl()

Let’s see the code below:

# creating a string vector
name <- c("CAR", "bIKE", "BICYCLE", "AEROPLANE")
# passing ignore.case argument to the grepl() function
grepl("bi", name, ignore.case = TRUE)

From the output of the code above:

[1] FALSE  TRUE  TRUE FALSE

We can see that it returns TRUE for the second and third elements of the vector, "bIKE" and "BICYCLE". This happens even though their letter case differs from that of the lowercase pattern "bi" we pass to the grepl() function. In this way, the ignore.case parameter makes grepl() perform a case-insensitive search.

Using the perl and fixed parameters of grepl()

Let's see the code below:

# creating a string vector
name <- c("CAR", "b|ke", "BICYCLE", "AEROPLANE")
# Without fixed parameter
grepl("b.", name, fixed = FALSE)
# With fixed parameter
grepl("b.", name, fixed = TRUE)
# creating another vector
phrases <- c("Good Educative platform", "Educative good platform", "Educative platform", "R course Educative platform", "platform Educative")
# using grepl() with a Perl-compatible RegEx pattern
result_perl <- grepl("(?<=\\bEducative\\s)platform\\b", phrases, perl = TRUE)
# displaying the result
print(result_perl)

From the output of the code with fixed = FALSE:

[1] FALSE  TRUE FALSE FALSE

In a regular expression, the . metacharacter matches any single character. Therefore, the pattern b. matches any string in the name vector that has a lowercase b followed by any single character. As a result, it returns TRUE for the element b|ke, since it contains a b followed by another character. BICYCLE does not match because the search is case-sensitive by default.

From the output of the code with fixed = TRUE:

[1] FALSE FALSE FALSE FALSE

With fixed = TRUE, the . is interpreted literally, so the pattern only matches the exact two-character sequence "b." (a b followed by a period). Since none of the elements in the name vector contain that sequence, the function returns FALSE for all elements.

From the output of the code with perl = TRUE:

[1]  TRUE FALSE  TRUE  TRUE FALSE

With perl = TRUE, the Perl-compatible RegEx pattern (?<=\\bEducative\\s)platform\\b matches the word “platform” only if it is immediately preceded by the word “Educative” and a whitespace character. Here, (?<=...) is a lookbehind assertion, ensuring that “Educative” appears directly before “platform.” The \\b denotes a word boundary, so “Educative” must start at the beginning of a word and “platform” must end at the end of one. The \\s matches a single whitespace character, ensuring “Educative” is followed by a space. The backslashes are doubled because R string literals require a backslash to be escaped.
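
The fixed = TRUE call in the example above never returns TRUE, so the following sketch uses a different, made-up vector (the file names are assumptions for illustration) in which the literal sequence "b." does occur. It also shows that escaping the dot in a regular expression gives the same result as fixed = TRUE for this pattern.

```r
# Hypothetical file names, chosen so that some contain a literal "b."
files <- c("b.txt", "bat", "a.b", "lib.R")

# Regular expression: "b" followed by any single character
as_regex <- grepl("b.", files)

# Literal match: the two characters "b" and "."
as_literal <- grepl("b.", files, fixed = TRUE)

# Regular expression with the dot escaped; behaves like the literal match here
as_escaped <- grepl("b\\.", files)

print(as_regex)
print(as_literal)
print(as_escaped)
```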

Quiz!

1. What does grepl() stand for in R?

  A) Graphical regular expression locator

  B) General regular expression pattern locator

  C) Grep logical

  D) None of the above

In summary, the grepl() function performs pattern matching on text data in R, using regular expressions to test whether each element of a character vector contains a match. Understanding its parameters and return value can help us complete data manipulation and data analysis tasks in R.

Frequently asked questions

What is the difference between grep() and grepl() in R?

The grep() function returns the indices of elements that match a pattern, while the grepl() function returns a logical vector indicating whether each element matches the pattern.
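
A short side-by-side sketch of the two functions on the same made-up vector:

```r
# Hypothetical input vector
x <- c("CAR", "BIKE", "CART")

# grep() returns the positions of the matching elements
idx <- grep("CA", x)

# grep() with value = TRUE returns the matching elements themselves
vals <- grep("CA", x, value = TRUE)

# grepl() returns one TRUE/FALSE per element
flags <- grepl("CA", x)

print(idx)
print(vals)
print(flags)
```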


What is the equivalent of grepl() in R?

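The closest equivalent outside base R is str_detect() from the stringr package. Note that it takes the string first and the pattern second, i.e., str_detect(x, pattern), and that stringr uses a different regular expression engine, so complex patterns may not behave identically.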
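
The comparison can be sketched as follows. This assumes the stringr package may or may not be installed, so the stringr call is guarded and skipped when the package is unavailable.

```r
# Hypothetical input vector
x <- c("CAR", "BIKE")

# Base R: pattern first, then the vector
base_result <- grepl("CA", x)
print(base_result)

# stringr (only if installed): vector first, then the pattern
if (requireNamespace("stringr", quietly = TRUE)) {
  tidy_result <- stringr::str_detect(x, "CA")
  print(tidy_result)
}
```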


How do I search multiple patterns in grepl()?

We can search for multiple patterns using the | operator, e.g., grepl("pattern1|pattern2", x).
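
A minimal sketch of this, using made-up phrases. It also shows one way to build the alternation from a vector of keywords with paste(), and how to require that two patterns both match by combining two grepl() calls with &.

```r
# Hypothetical input vector
x <- c("red car", "blue bike", "green bus", "red bus")

# Match elements containing "car" OR "bike"
either <- grepl("car|bike", x)

# Build the same pattern from a vector of keywords
keywords <- c("car", "bike")
pattern <- paste(keywords, collapse = "|")
either_built <- grepl(pattern, x)

# Match elements containing "red" AND "bus"
both <- grepl("red", x) & grepl("bus", x)

print(either)
print(both)
```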


What is the difference between gsub() and grepl()?

gsub() is used for substituting matched patterns in a string, while grepl() is used to check if a pattern exists.
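
The contrast can be seen by applying both functions to the same made-up vector with the same pattern:

```r
# Hypothetical input vector
x <- c("CAR", "BIKE", "CART")

# grepl() only reports whether the pattern is present
found <- grepl("CA", x)

# gsub() returns modified strings with every match replaced
replaced <- gsub("CA", "ca", x)

print(found)
print(replaced)
```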


What is the difference between character strings and vectors?

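A character string is a single piece of text, such as "CAR". A character vector is an ordered collection of such strings, such as c("CAR", "BIKE"). In R, a single string is itself a character vector of length one.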


What is the difference between caseless, byte-based, and exact matching in grepl()?

Caseless matching (ignore.case = TRUE) ignores letter case, byte-based matching (useBytes = TRUE) compares bytes rather than characters, and exact matching (fixed = TRUE) treats the pattern as a literal string instead of a regular expression.


How do we handle newline characters in the input string of grepl()?

To handle newline characters in the input string with the grepl() function, we can use the newline character \n in the RegEx pattern to detect and match newlines.
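
A small sketch with made-up strings, one of which contains a line break. In an R string literal, "\n" is the newline character itself, so it can be placed directly in the pattern.

```r
# Hypothetical input: the second element spans two lines
notes <- c("single line", "first line\nsecond line")

# Detect elements that contain a newline
has_newline <- grepl("\n", notes)

# Match text that continues across the line break
spans_break <- grepl("line\nsecond", notes)

print(has_newline)
print(spans_break)
```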


How do we use the negation operator (!) to exclude certain patterns?

Use !grepl("pattern", x) to exclude elements that match the pattern in x.
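
As a brief sketch on a made-up vector, negating the result flips each TRUE and FALSE, and indexing with it keeps only the non-matching elements:

```r
# Hypothetical input vector
x <- c("CAR", "BIKE", "CART")

# TRUE for elements that do NOT contain "CA"
not_ca <- !grepl("CA", x)

# Keep only the elements that do not match
remaining <- x[not_ca]

print(not_ca)
print(remaining)
```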


Copyright ©2025 Educative, Inc. All rights reserved