Create a survey object by specifying the survey's two-phase design. It is a
wrapper around twophase. All survey variables must be
included in the data.frame itself. Variables are selected by using bare
column names, or convenience functions described in select.
as_survey_twophase(.data, ...)

# S3 method for data.frame
as_survey_twophase(
  .data,
  id,
  strata = NULL,
  probs = NULL,
  weights = NULL,
  fpc = NULL,
  subset,
  method = c("full", "approx", "simple"),
  ...
)

# S3 method for twophase2
as_survey_twophase(.data, ...)
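The twophase2 method in the usage above suggests that a design already built with survey::twophase can be converted directly. The following is a minimal sketch of that path, assuming the survey, survival, and srvyr packages are installed. The object names d2 and d2_tbl are illustrative only, and the design mirrors the pbc example from ?survey::twophase.

```r
library(survey)
library(srvyr)

data(pbc, package = "survival")

# Phase-two membership flag and a unit identifier, as in ?survey::twophase
pbc$randomized <- with(pbc, !is.na(trt) & trt > 0)
pbc$id <- seq_len(nrow(pbc))

# Build the design with the survey package's formula interface
# (the default method = "full" is assumed to yield a twophase2 object)
d2 <- twophase(id = list(~id, ~id), data = pbc, subset = ~randomized)

# Convert the existing design to a tbl_svy
d2_tbl <- as_survey_twophase(d2)
class(d2_tbl)
```

Converting an existing design this way keeps the formula-based specification in one place while still allowing the dplyr-style verbs on the result.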
.data  A data frame (which contains the variables specified below) 

...  ignored 
id  list of two sets of variable names for sampling unit identifiers 
strata  list of two sets of variable names (or 
probs  list of two sets of variable names (or 
weights  Only for method = "approx", list of two sets of variable names (or 
fpc  list of two sets of variables (or 
subset  bare name of a variable which specifies which observations are selected in phase 2 
method  "full" requires (much) more memory, but gives unbiased variance estimates for
general multistage designs at both phases. "simple" or "approx" use less memory, and are correct for
designs with simple random sampling at phase one and stratified random sampling at phase two. See

An object of class tbl_svy
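As a quick check of the return value, the class of the result can be inspected with inherits(). This is a sketch only, assuming dplyr, srvyr, and survival are installed; the randomized and id columns are helper variables created for the illustration.

```r
library(dplyr)
library(srvyr)

data(pbc, package = "survival")

# Build a two-phase design from a data frame using bare column names
d2pbc <- pbc %>%
  mutate(randomized = !is.na(trt) & trt > 0, id = row_number()) %>%
  as_survey_twophase(id = list(id, id), subset = randomized)

# Confirm the object carries the tbl_svy class
is_tbl_svy <- inherits(d2pbc, "tbl_svy")
is_tbl_svy
```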
# Examples from ?survey::twophase
# two-phase simple random sampling.
data(pbc, package = "survival")

library(dplyr)
pbc <- pbc %>%
  mutate(randomized = !is.na(trt) & trt > 0,
         id = row_number())
d2pbc <- pbc %>%
  as_survey_twophase(id = list(id, id), subset = randomized)

d2pbc %>% summarize(mean = survey_mean(bili))
#> # A tibble: 1 × 2
#>    mean mean_se
#>   <dbl>   <dbl>
#> 1  3.26   0.256

# two-stage sampling as two-phase
library(survey)
data(mu284)

mu284_1 <- mu284 %>%
  dplyr::slice(c(1:15, rep(1:5, n2[1:5] - 3))) %>%
  mutate(id = row_number(),
         sub = rep(c(TRUE, FALSE), c(15, 34 - 15)))

dmu284 <- mu284 %>%
  as_survey_design(ids = c(id1, id2), fpc = c(n1, n2))

# first phase cluster sample, second phase stratified within cluster
d2mu284 <- mu284_1 %>%
  as_survey_twophase(id = list(id1, id), strata = list(NULL, id1),
                     fpc = list(n1, NULL), subset = sub)

dmu284 %>%
  summarize(total = survey_total(y1),
            mean = survey_mean(y1))
#> # A tibble: 1 × 4
#>   total total_se  mean mean_se
#>   <dbl>    <dbl> <dbl>   <dbl>
#> 1 15080    2274.  44.4    2.27

d2mu284 %>%
  summarize(total = survey_total(y1),
            mean = survey_mean(y1))
#> # A tibble: 1 × 4
#>   total total_se  mean mean_se
#>   <dbl>    <dbl> <dbl>   <dbl>
#> 1 15080    2274.  44.4    2.27