Generate a summary table of descriptive data for the individuals in a dataset, suitable for inclusion in a report.

Usage

dgr_table(
  dat,
  fields,
  names,
  cutoff = 7,
  sig = 3,
  by = NULL,
  idvar = "ID",
  navars = c("-99", "-999"),
  mtype = "geomean"
)

Arguments

dat: An input data frame, with one row per unique individual.
fields: A vector of strings containing the names of the fields to be included in the summary table.
names: A vector of strings containing descriptive names for the fields to be included in the summary table.
cutoff: An integer defining the maximum number of unique values a variable should have to be considered categorical. Fields with more than this number of unique values are considered continuous for the purposes of the summary table (defaults to 7).
sig: The number of significant digits summary values should have (defaults to 3).
by: The field to use for grouping (a string). Defaults to NULL, meaning no grouping. If supplied, the summary table will contain a column for each unique value of this field, as well as a column summarizing across all of its values.
idvar: The field in the dataset identifying each unique individual (defaults to "ID").
navars: A vector containing values that are to be interpreted as missing (defaults to "-99" and "-999"). NA values are always considered to be missing.
mtype: The type of mean to report: "geomean", the geometric mean (the default), or "mean", the arithmetic mean.
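The roles of cutoff, navars and mtype can be illustrated with a small base R sketch. The helpers is_missing, is_categorical and geomean below are hypothetical and are not taken from the package; in particular, whether missing values are excluded before unique values are counted is an assumption.

```r
# Hypothetical helpers for illustration only; not the package's own code.

# Treat NA and any value listed in navars as missing.
is_missing <- function(x, navars = c("-99", "-999")) {
  is.na(x) | as.character(x) %in% navars
}

# A field is categorical when it has at most `cutoff` unique values.
# Assumption: missing values are dropped before counting.
is_categorical <- function(x, cutoff = 7, navars = c("-99", "-999")) {
  length(unique(x[!is_missing(x, navars)])) <= cutoff
}

# Geometric mean: exponential of the mean of the logs (positive values only).
geomean <- function(x) exp(mean(log(x)))

wt  <- c(70, 82, -99, 65, 91, 77, 58, 103, 88, NA)
sex <- c(0, 1, 1, 0, -99, 1, 0, 0, 1, 1)

is_categorical(wt)    # 8 unique non-missing values, so treated as continuous
is_categorical(sex)   # 2 unique non-missing values, so treated as categorical

wt_obs <- wt[!is_missing(wt)]
mean(wt_obs)          # arithmetic mean, as with mtype = "mean"
geomean(wt_obs)       # geometric mean, as with mtype = "geomean"
```

The geometric mean is never larger than the arithmetic mean for positive data, which is worth keeping in mind when comparing tables produced with different mtype settings.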

Value

A data frame summarizing each of the fields listed in fields across the individuals in the dataset (which should not contain duplicated individuals), broken down by the field in by if one is supplied. Continuous fields are summarized as median, mean, range and number of missing values. Categorical fields are summarized as count and relative percentage.
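As a rough illustration of the two kinds of summary described above, the following base R sketch computes them for a single field. The functions summarize_continuous and summarize_categorical are hypothetical, and the layout of their output is an assumption; the data frame actually returned may be arranged differently.

```r
# Hypothetical helpers sketching the two kinds of summary; not the package's code.

# Continuous field: median, (arithmetic) mean, range and number of missing values.
summarize_continuous <- function(x, sig = 3) {
  obs <- x[!is.na(x)]
  c(median  = signif(median(obs), sig),
    mean    = signif(mean(obs), sig),
    min     = signif(min(obs), sig),
    max     = signif(max(obs), sig),
    missing = sum(is.na(x)))
}

# Categorical field: count and relative percentage for each level.
summarize_categorical <- function(x, sig = 3) {
  counts <- table(x)
  data.frame(level   = names(counts),
             n       = as.vector(counts),
             percent = signif(100 * as.vector(counts) / sum(counts), sig))
}

age <- c(34, 51, NA, 46, 29, 62, 38, 55)
sex <- c("F", "M", "M", "F", "F", "M", "F", "F")

summarize_continuous(age)
summarize_categorical(sex)
```

Here sig plays the same role as the sig argument: values are rounded to that many significant digits with signif().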

Author

Justin Wilkins, justin.wilkins@occams.com

Examples

if (FALSE) { # \dontrun{
 count_na(c(0,5,7,NA,3,3,NA))
} # }
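A call to dgr_table itself might look like the following sketch. The dataset demo and its column names are invented for illustration, and the code assumes that the package providing dgr_table is already attached; the call is skipped otherwise. The argument names follow the Usage section above.

```r
# Hypothetical dataset: one row per individual, with -99 marking a missing weight.
demo <- data.frame(
  ID  = 1:12,
  WT  = c(70, 82, 65, 91, 77, 58, 103, 88, 74, -99, 69, 95),
  AGE = c(34, 51, 46, 29, 62, 38, 55, 41, 27, 49, 58, 33),
  SEX = c(0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1),
  ARM = rep(c("A", "B"), each = 6)
)

# Run only when dgr_table is available in the current session.
if (exists("dgr_table")) {
  tab <- dgr_table(
    dat    = demo,
    fields = c("WT", "AGE", "SEX"),
    names  = c("Body weight (kg)", "Age (y)", "Sex"),
    by     = "ARM",
    idvar  = "ID"
  )
  print(tab)
}
```

With the default cutoff of 7, WT and AGE have more than seven unique values and would be treated as continuous, while SEX would be treated as categorical.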