R/as_survey_design.r
as_survey_design.Rd
Create a survey object with a survey design.
as_survey_design(.data, ...)
# S3 method for data.frame
as_survey_design(
.data,
ids = NULL,
probs = NULL,
strata = NULL,
variables = NULL,
fpc = NULL,
nest = FALSE,
check_strata = !nest,
weights = NULL,
pps = FALSE,
variance = c("HT", "YG"),
...
)
# S3 method for survey.design2
as_survey_design(.data, ...)
# S3 method for tbl_lazy
as_survey_design(
.data,
ids = NULL,
probs = NULL,
strata = NULL,
variables = NULL,
fpc = NULL,
nest = FALSE,
check_strata = !nest,
weights = NULL,
pps = FALSE,
variance = c("HT", "YG"),
...
)
.data: A data frame (which contains the variables specified below).
...: Ignored.
ids: Variables specifying cluster ids from largest level to smallest level. Leaving the argument empty, or passing NULL, 1, or 0, indicates no clusters.
probs: Variables specifying cluster sampling probabilities.
strata: Variables specifying strata.
variables: Variables to be included in the survey. Defaults to all variables in .data.
fpc: Variables specifying a finite population correction; see svydesign for more details.
nest: If TRUE, relabel cluster ids to enforce nesting within strata.
check_strata: If TRUE, check that clusters are nested in strata.
weights: Variables specifying weights (inverse of probability).
pps: "brewer" to use Brewer's approximation for PPS sampling without replacement. "overton" to use Overton's approximation. An object of class HR to use the Hartley-Rao approximation. An object of class ppsmat to use the Horvitz-Thompson estimator.
variance: For PPS sampling without replacement, use variance = "YG" for the Yates-Grundy estimator instead of the Horvitz-Thompson estimator.
An object of class tbl_svy
If provided a data.frame, it is a wrapper around svydesign. All survey variables must be included in the data.frame itself. Variables are selected by using bare column names, or convenience functions described in select.
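As a rough illustration of the wrapper relationship, the sketch below builds the same stratified design twice: once with as_survey_design and bare column names, and once with survey::svydesign and one-sided formulas. It assumes the survey and srvyr packages are installed and uses the api data shipped with survey; the object names d_srvyr and d_survey are only illustrative.

```r
library(survey)
library(srvyr)
data(api)

# srvyr interface: columns are passed as bare names
d_srvyr <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

# survey interface: the same columns are passed as one-sided formulas
d_survey <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                      data = apistrat)

# Both designs are expected to carry the same sampling weights
all.equal(as.numeric(weights(d_srvyr)), as.numeric(weights(d_survey)))
```

The bare-name form is what lets the design be created inside a pipeline without building formulas by hand.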
If provided a survey.design2 object from the survey package, it will turn it into a srvyr object, so that srvyr functions will work with it.
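A minimal sketch of that conversion, assuming the survey and srvyr packages are installed: a design is first created with survey::svydesign, then passed to as_survey_design so that srvyr verbs such as summarise can be applied to it. The names d and d_tbl are illustrative.

```r
library(survey)
library(srvyr)
data(api)

# A one-stage cluster design created directly with the survey package
d <- svydesign(ids = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Convert the existing design into a srvyr object
d_tbl <- as_survey_design(d)

# srvyr functions can now be used on the converted design
d_tbl %>%
  summarise(api00_mean = survey_mean(api00))
```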
# Examples from ?survey::svydesign
library(survey)
library(srvyr)
data(api)
# stratified sample
dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)
# one-stage cluster sample
dclus1 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)
# two-stage cluster sample: weights computed from population sizes.
dclus2 <- apiclus2 %>%
  as_survey_design(c(dnum, snum), fpc = c(fpc1, fpc2))
## multistage sampling has no effect when fpc is not given, so
## these are equivalent.
dclus2wr <- apiclus2 %>%
  dplyr::mutate(weights = weights(dclus2)) %>%
  as_survey_design(c(dnum, snum), weights = weights)
dclus2wr2 <- apiclus2 %>%
  dplyr::mutate(weights = weights(dclus2)) %>%
  as_survey_design(c(dnum), weights = weights)
## syntax for stratified cluster sample
## (though the data weren't really sampled this way)
apistrat %>%
  as_survey_design(dnum, strata = stype, weights = pw, nest = TRUE)
#> Stratified 1 - level Cluster Sampling design (with replacement)
#> With (162) clusters.
#> Called via srvyr
#> Sampling variables:
#> - ids: dnum
#> - strata: stype
#> - weights: pw
#> Data variables:
#> - cds (chr), stype (fct), name (chr), sname (chr), snum (dbl), dname (chr),
#> dnum (int), cname (chr), cnum (int), flag (int), pcttest (int), api00
#> (int), api99 (int), target (int), growth (int), sch.wide (fct), comp.imp
#> (fct), both (fct), awards (fct), meals (int), ell (int), yr.rnd (fct),
#> mobility (int), acs.k3 (int), acs.46 (int), acs.core (int), pct.resp (int),
#> not.hsg (int), hsg (int), some.col (int), col.grad (int), grad.sch (int),
#> avg.ed (dbl), full (int), emer (int), enroll (int), api.stu (int), pw
#> (dbl), fpc (dbl)
## PPS sampling without replacement
data(election)
dpps <- election_pps %>%
  as_survey_design(fpc = p, pps = "brewer")
# dplyr 0.7 introduced a new style of NSE called quosures
# See `vignette("programming", package = "dplyr")` for details
st <- quo(stype)
wt <- quo(pw)
dstrata <- apistrat %>%
  as_survey_design(strata = !!st, weights = !!wt)