Create a survey object with a survey design.
as_survey_design(.data, ...)

# S3 method for data.frame
as_survey_design(
  .data,
  ids = NULL,
  probs = NULL,
  strata = NULL,
  variables = NULL,
  fpc = NULL,
  nest = FALSE,
  check_strata = !nest,
  weights = NULL,
  pps = FALSE,
  variance = c("HT", "YG"),
  ...
)

# S3 method for survey.design2
as_survey_design(.data, ...)

# S3 method for tbl_lazy
as_survey_design(
  .data,
  ids = NULL,
  probs = NULL,
  strata = NULL,
  variables = NULL,
  fpc = NULL,
  nest = FALSE,
  check_strata = !nest,
  weights = NULL,
  pps = FALSE,
  variance = c("HT", "YG"),
  ...
)
Argument | Description |
---|---|
.data | A data frame (which contains the variables specified below). |
... | ignored |
ids | Variables specifying cluster ids from largest level to smallest level (leaving the argument empty, or setting it to NULL, 1, or 0, indicates no clusters). |
probs | Variables specifying cluster sampling probabilities. |
strata | Variables specifying strata. |
variables | Variables to be included in the survey. Defaults to all variables in .data. |
fpc | Variables specifying a finite population correction, see svydesign for more details. |
nest | If TRUE, relabel cluster ids to enforce nesting within strata. |
check_strata | If TRUE, check that clusters are nested in strata. |
weights | Variables specifying weights (inverse of probability). |
pps | "brewer" to use Brewer's approximation for PPS sampling without replacement. "overton" to use Overton's approximation. An object of class HR to use the Hartley-Rao approximation. An object of class ppsmat to use the Horvitz-Thompson estimator. |
variance | For pps without replacement, use variance="YG" for the Yates-Grundy estimator instead of the Horvitz-Thompson estimator |
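The relationship between `weights` and `probs` in the table above can be sketched as follows. This is only an illustration, assuming the survey and srvyr packages are installed; the `prob` column and the object names `api_probs`, `d_weights`, and `d_probs` are made up for the example and are not part of either package.

```r
library(survey)
library(srvyr)

data(api)  # loads apistrat and related data sets from the survey package

# Hypothetical helper column: sampling probability as the inverse of the weight
api_probs <- transform(apistrat, prob = 1 / pw)

# Same stratified design, specified once via weights and once via probs
d_weights <- as_survey_design(api_probs, strata = stype, weights = pw)
d_probs   <- as_survey_design(api_probs, strata = stype, probs = prob)

# The resulting design weights should agree up to floating point error
weights_match <- isTRUE(all.equal(
  as.numeric(weights(d_weights)),
  as.numeric(weights(d_probs))
))
weights_match
```

Only one of `weights` or `probs` is normally needed; which one to pass depends on which quantity the data set already carries.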
An object of class tbl_svy
If provided a data.frame, it is a wrapper around svydesign. All survey variables must be included in the data.frame itself. Variables are selected by using bare column names, or convenience functions described in select.
If provided a survey.design2 object from the survey package, it will turn it into a srvyr object, so that srvyr functions will work with it.
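A minimal sketch of the survey.design2 path, assuming the survey and srvyr packages are installed and using the apistrat data shipped with survey. The object names `svy_design` and `srvyr_design` and the summary column `api00_mean` are illustrative only.

```r
library(survey)
library(srvyr)

data(api)  # loads apistrat and related data sets from the survey package

# Build a design the survey-package way (formula interface)
svy_design <- svydesign(
  ids = ~1,
  strata = ~stype,
  weights = ~pw,
  data = apistrat
)

# Convert the existing design so that srvyr functions work with it
srvyr_design <- as_survey_design(svy_design)

# The converted object carries the tbl_svy class
is_tbl_svy <- inherits(srvyr_design, "tbl_svy")

# srvyr verbs can now be applied to the converted design
api_summary <- srvyr_design %>%
  summarise(api00_mean = survey_mean(api00))
```

This route may be convenient when a design has already been defined with survey::svydesign elsewhere and only the analysis step is to be written with srvyr.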
# Examples from ?survey::svydesign
library(survey)
library(srvyr)  # provides as_survey_design()
data(api)

# stratified sample
dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

# one-stage cluster sample
dclus1 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# two-stage cluster sample: weights computed from population sizes.
dclus2 <- apiclus2 %>%
  as_survey_design(c(dnum, snum), fpc = c(fpc1, fpc2))

## multistage sampling has no effect when fpc is not given, so
## these are equivalent.
dclus2wr <- apiclus2 %>%
  dplyr::mutate(weights = weights(dclus2)) %>%
  as_survey_design(c(dnum, snum), weights = weights)

dclus2wr2 <- apiclus2 %>%
  dplyr::mutate(weights = weights(dclus2)) %>%
  as_survey_design(c(dnum), weights = weights)

## syntax for stratified cluster sample
## (though the data weren't really sampled this way)
apistrat %>%
  as_survey_design(dnum, strata = stype, weights = pw, nest = TRUE)
#> Stratified 1 - level Cluster Sampling design (with replacement)
#> With (162) clusters.
#> Called via srvyr
#> Sampling variables:
#> - ids: dnum
#> - strata: stype
#> - weights: pw
#> Data variables: cds (chr), stype (fct), name (chr), sname (chr), snum (dbl),
#> dname (chr), dnum (int), cname (chr), cnum (int), flag (int), pcttest (int),
#> api00 (int), api99 (int), target (int), growth (int), sch.wide (fct),
#> comp.imp (fct), both (fct), awards (fct), meals (int), ell (int), yr.rnd
#> (fct), mobility (int), acs.k3 (int), acs.46 (int), acs.core (int), pct.resp
#> (int), not.hsg (int), hsg (int), some.col (int), col.grad (int), grad.sch
#> (int), avg.ed (dbl), full (int), emer (int), enroll (int), api.stu (int), pw
#> (dbl), fpc (dbl)

## PPS sampling without replacement
data(election)
dpps <- election_pps %>%
  as_survey_design(fpc = p, pps = "brewer")

# dplyr 0.7 introduced a new style of NSE called quosures
# See `vignette("programming", package = "dplyr")` for details
st <- quo(stype)
wt <- quo(pw)
dstrata <- apistrat %>%
  as_survey_design(strata = !!st, weights = !!wt)