In R programming, the group_by() function from the dplyr package is applied to data frames or tibbles. It groups their rows by one or more variables so that subsequent operations can be performed on each group. It works similarly to the PivotTable feature in Excel and the GROUP BY clause in SQL.
group_by(.data, ..., .add = FALSE, .drop = group_by_drop_default(.data))
It takes the following arguments:
.data: A data frame or a tibble to be grouped.
...: The variables (columns) to group by.
.add: The default value is FALSE, which means group_by() replaces any existing groups. If it is set to TRUE, the new grouping variables are added to the existing groups instead.
.drop: Whether to drop groups formed by factor levels that do not appear in the data. The default comes from group_by_drop_default(.data), which is TRUE unless .data was previously grouped with .drop = FALSE.
This function returns the given data in grouped form as a grouped tibble.
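The effect of the .add and .drop arguments can be sketched as follows. This is a minimal illustration rather than a complete reference: the toy data frame and its size and value columns are hypothetical and were made up for this example.

```r
library(dplyr, warn.conflicts = FALSE)

# start from mtcars grouped by vs
by_vs <- mtcars %>% group_by(vs)

# .add = FALSE (the default): the new grouping replaces the old one
replaced <- by_vs %>% group_by(am)
group_vars(replaced)  # "am"

# .add = TRUE: am is added to the existing vs grouping
added <- by_vs %>% group_by(am, .add = TRUE)
group_vars(added)     # "vs" "am"

# hypothetical toy data: the factor level "medium" never occurs in the rows
toy <- data.frame(
  size  = factor(c("small", "small", "large"),
                 levels = c("small", "medium", "large")),
  value = c(1, 2, 3)
)

# .drop = TRUE (the default here) ignores the empty "medium" level
n_groups(group_by(toy, size))                 # 2

# .drop = FALSE keeps an empty group for "medium"
n_groups(group_by(toy, size, .drop = FALSE))  # 3
```

Here group_vars() and n_groups() are dplyr helpers that report the grouping variables and the number of groups, which makes them convenient for checking what a group_by() call actually did.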
In the code snippet below, we'll group the mtcars dataset by two of its columns to see how the group_by() function works:
# load the dplyr library
library(dplyr, warn.conflicts = FALSE)
# pipe mtcars into group_by() to group it by vs and am
by_vs_am <- mtcars %>% group_by(vs, am)
# summarise() counts the rows per group and removes the last grouping level (am)
by_vs <- by_vs_am %>% summarise(total = n())
# print the summary, which is still grouped by vs
print(by_vs)
We include the dplyr library in the program, where warn.conflicts = FALSE hides the conflict warnings shown when dplyr functions mask functions of the same name from other loaded packages.
We use group_by(vs, am) to group the mtcars dataset by its vs (engine shape, either V-shaped or straight) and am (transmission, either automatic or manual) columns. The %>% forward pipe operator passes mtcars to group_by() as its first argument.
We use summarise(total = n()) to collapse each group into a single row. It returns a tibble with an additional total column that holds the number of rows for each combination of vs and am. It also removes the last grouping level (am), so the result is still grouped by vs.
We print the resulting 4 x 3 tibble with the vs, am, and total columns to the console.
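One detail worth checking is which grouping remains after summarise(). The sketch below assumes dplyr 1.0.0 or later, where summarise() accepts a .groups argument. The variable name flat is only illustrative.

```r
library(dplyr, warn.conflicts = FALSE)

by_vs_am <- mtcars %>% group_by(vs, am)

# "drop_last" spells out the default behaviour for this case:
# only the last grouping variable (am) is removed
by_vs <- by_vs_am %>% summarise(total = n(), .groups = "drop_last")
group_vars(by_vs)    # "vs"

# "drop" removes every grouping level and returns a plain tibble
flat <- by_vs_am %>% summarise(total = n(), .groups = "drop")
is_grouped_df(flat)  # FALSE

# every row of mtcars belongs to exactly one (vs, am) group,
# so the group counts add up to the number of rows in mtcars
sum(flat$total) == nrow(mtcars)  # TRUE
```

Passing .groups explicitly also suppresses the informational message that summarise() prints about the grouping of its output. An equivalent way to remove all groups afterwards is to call ungroup() on the result.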