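If people say they have post-stratified weights, it does not necessarily mean that they implemented proper post-stratification (i.e., rescaling the weights in each demographic cell to the known population total). About 80% of the uses of "post-stratified weights" that I hear actually refer to calibrated weights (i.e., rather than trying to adjust every cell of a five-way table, the weights are adjusted only to match the margins of the five variables individually). I have written what somebody called a methodological rant about this distinction. The distinction does matter, however, as Anthony pointed out in another answer, for standard error calculations. With proper post-stratified weights, you can apply the regular variance estimation formulae, more or less treating your post-strata as sampling strata (save for minor technicalities). With weights that were only calibrated on each margin of the table, the calculations are more complicated.
Either way, both procedures are internalized in the survey package. You only need to supply the post-stratification/calibration variables to the appropriate design object/formula.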
library(survey)
data(api)
# cross-classified post-stratification variable in population
apipop$stype.sch.wide <- 10*as.integer(apipop$stype) +
  as.integer(apipop$sch.wide)
# cross-classified post-stratification variable in sample
apiclus1$stype.sch.wide <-
  10*as.integer(apiclus1$stype) + as.integer(apiclus1$sch.wide)
# population totals
(pop.totals <- xtabs(~stype.sch.wide, data=apipop))
# reference design
dclus1 <- svydesign(id=~dnum,weights=~pw,data=apiclus1,fpc=~fpc)
# post-stratification of the original design
dclus1p <- postStratify(dclus1,~stype.sch.wide, pop.totals)
# design with post-stratified weights, but no evidence of post-stratification
dclus1pfake <- svydesign(id=~dnum,weights=~weights(dclus1p),data=apiclus1,fpc=~fpc)
# starting from the design with known weights, add the post-stratification information
dclus1pp <- postStratify(dclus1pfake,~stype.sch.wide, pop.totals)
# estimates and standard errors: starting point
svymean(~api00,dclus1)
# post-stratification reduces standard errors a bit
svymean(~api00,dclus1p)
# but here we are not aware of the survey being post-stratified
svymean(~api00,dclus1pfake)
# if we just add post-stratification variables to the design object
# that only had post-stratified weights, the result is the same
# as for post-stratified object based on the original weights
svymean(~api00,dclus1pp)
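
For contrast, the calibration-on-margins case described above can be sketched with rake() from the same package. This is only an illustration under stated assumptions: the object names pop.stype, pop.schwide and dclus1r are made up for this example, and the two margins (stype and sch.wide) are chosen to mirror the cross-classified variable used above. Instead of one table of cell totals, the design is given one table per variable.

```r
library(survey)
data(api)

# reference cluster design, same specification as in the listing above
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# population margins: one table per variable, not the cross-classification
# (pop.stype and pop.schwide are illustrative names)
pop.stype   <- xtabs(~stype, data = apipop)
pop.schwide <- xtabs(~sch.wide, data = apipop)

# raking: post-stratify on each margin in turn until the weights settle
dclus1r <- rake(dclus1,
                sample.margins     = list(~stype, ~sch.wide),
                population.margins = list(pop.stype, pop.schwide))

# weighted margins of the raked design, to compare with the population tables
svytotal(~stype, dclus1r)
svytotal(~sch.wide, dclus1r)

# estimate and standard error under raking
svymean(~api00, dclus1r)
```

Comparing the output of the last line with svymean(~api00, dclus1p) from the listing above shows how the two adjustments differ on the same data. Because the design object records that the weights were raked, the standard errors account for the adjustment, which would not happen if the raked weights were simply passed to svydesign() as fixed weights.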