Source: R/count-tally.R
count.Rd
count()
lets you quickly count the unique values of one or more variables:df %>% count(a, b)
is roughly equivalent todf %>% group_by(a, b) %>% summarise(n = n())
.count()
is paired with tally()
, a lower-level helper that is equivalentto df %>% summarise(n = n())
. Supply wt
to perform weighted counts,switching the summary from n = n()
to n = sum(wt)
.
add_count()
and add_tally()
are equivalents to count()
and tally()
but use mutate()
instead of summarise()
so that they add a new columnwith group-wise counts.
Usage
count(x, ..., wt = NULL, sort = FALSE, name = NULL)# S3 method for data.framecount( x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = group_by_drop_default(x))tally(x, wt = NULL, sort = FALSE, name = NULL)add_count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = deprecated())add_tally(x, wt = NULL, sort = FALSE, name = NULL)
Arguments
- x
A data frame, data frame extension (e.g. a tibble), or alazy data frame (e.g. from dbplyr or dtplyr).
- ...
<
data-masking
> Variables to groupby.- wt
<
data-masking
> Frequency weights.Can beNULL
or a variable:If
NULL
(the default), counts the number of rows in each group.If a variable, computes
sum(wt)
for each group.
- sort
If
TRUE
, will show the largest groups at the top.- name
The name of the new column in the output.
If omitted, it will default to
n
. If there's already a column calledn
,it will usenn
. If there's a column calledn
andnn
, it'll usennn
, and so on, addingn
s until it gets a new name.- .drop
Handling of factor levels that don't appear in the data, passedon to
group_by()
.For
count()
: ifFALSE
will include counts for empty groups (i.e. forlevels of factors that don't exist in the data).For
add_count()
: deprecated since itcan't actually affect the output.
Value
An object of the same type as .data
. count()
and add_count()
group transiently, so the output has the same groups as the input.
Examples
# count() is a convenient way to get a sense of the distribution of# values in a datasetstarwars %>% count(species)#> # A tibble: 38 × 2#> species n#> <chr> <int>#> 1 Aleena 1#> 2 Besalisk 1#> 3 Cerean 1#> 4 Chagrian 1#> 5 Clawdite 1#> 6 Droid 6#> 7 Dug 1#> 8 Ewok 1#> 9 Geonosian 1#> 10 Gungan 3#> # ℹ 28 more rowsstarwars %>% count(species, sort = TRUE)#> # A tibble: 38 × 2#> species n#> <chr> <int>#> 1 Human 35#> 2 Droid 6#> 3 NA 4#> 4 Gungan 3#> 5 Kaminoan 2#> 6 Mirialan 2#> 7 Twi'lek 2#> 8 Wookiee 2#> 9 Zabrak 2#> 10 Aleena 1#> # ℹ 28 more rowsstarwars %>% count(sex, gender, sort = TRUE)#> # A tibble: 6 × 3#> sex gender n#> <chr> <chr> <int>#> 1 male masculine 60#> 2 female feminine 16#> 3 none masculine 5#> 4 NA NA 4#> 5 hermaphroditic masculine 1#> 6 none feminine 1starwars %>% count(birth_decade = round(birth_year, -1))#> # A tibble: 15 × 2#> birth_decade n#> <dbl> <int>#> 1 10 1#> 2 20 6#> 3 30 4#> 4 40 6#> 5 50 8#> 6 60 4#> 7 70 4#> 8 80 2#> 9 90 3#> 10 100 1#> 11 110 1#> 12 200 1#> 13 600 1#> 14 900 1#> 15 NA 44# use the `wt` argument to perform a weighted count. This is useful# when the data has already been aggregated oncedf <- tribble( ~name, ~gender, ~runs, "Max", "male", 10, "Sandra", "female", 1, "Susan", "female", 4)# counts rows:df %>% count(gender)#> # A tibble: 2 × 2#> gender n#> <chr> <int>#> 1 female 2#> 2 male 1# counts runs:df %>% count(gender, wt = runs)#> # A tibble: 2 × 2#> gender n#> <chr> <dbl>#> 1 female 5#> 2 male 10# When factors are involved, `.drop = FALSE` can be used to retain factor# levels that don't appear in the datadf2 <- tibble( id = 1:5, type = factor(c("a", "c", "a", NA, "a"), levels = c("a", "b", "c")))df2 %>% count(type)#> # A tibble: 3 × 2#> type n#> <fct> <int>#> 1 a 3#> 2 c 1#> 3 NA 1df2 %>% count(type, .drop = FALSE)#> # A tibble: 4 × 2#> type n#> <fct> <int>#> 1 a 3#> 2 b 0#> 3 c 1#> 4 NA 1# Or, using `group_by()`:df2 %>% group_by(type, .drop = FALSE) %>% count()#> # A tibble: 4 × 2#> # Groups: type [4]#> type n#> <fct> <int>#> 1 a 3#> 2 b 0#> 3 c 1#> 4 NA 1# tally() is a lower-level function that assumes you've done the groupingstarwars %>% tally()#> # A tibble: 1 × 1#> n#> <int>#> 1 87starwars %>% group_by(species) %>% tally()#> # A tibble: 38 × 2#> species n#> <chr> <int>#> 1 Aleena 1#> 2 Besalisk 1#> 3 Cerean 1#> 4 Chagrian 1#> 5 Clawdite 1#> 6 Droid 6#> 7 Dug 1#> 8 Ewok 1#> 9 Geonosian 1#> 10 Gungan 3#> # ℹ 28 more rows# both count() and tally() have add_ variants that work like# mutate() instead of summarisedf %>% add_count(gender, wt = runs)#> # A tibble: 3 × 4#> name gender runs n#> <chr> <chr> <dbl> <dbl>#> 1 Max male 10 10#> 2 Sandra female 1 5#> 3 Susan female 4 5df %>% add_tally(wt = runs)#> # A tibble: 3 × 4#> name gender runs n#> <chr> <chr> <dbl> <dbl>#> 1 Max male 10 15#> 2 Sandra female 1 15#> 3 Susan female 4 15