为了扩展韦恩的出色答案,ppml使用稳健的(异方差)方差-协方差矩阵以及对该矩阵的有限样本调整以减少偏差。
sandwich()这些与R 中同名包中的计算非常相似。唯一的区别是有限样本调整的完成方式。在该sandwich(...)函数中,默认情况下根本不进行有限样本调整,即三明治除以 1/n,其中 n 是观察次数。或者,sandwich(..., adjust = TRUE)可以使用除以 1/(n - k),其中 k 是回归器的数量。然而,Stata 除以 1/(n - 1)。
以下是通过使用调整因子为 1/(n-1) 的自定义三明治方差来使 R 与 Stata 匹配的方法:
. clear
. set more off
. capture ssc install rsource
. use http://personal.lse.ac.uk/tenreyro/mock, clear
. saveold ~/Desktop/mock, version(12) replace
(saving in Stata 12 format, which can be read by Stata 11 or 12)
file ~/Desktop/mock.dta saved
. rsource, terminator(XXX) rpath("/usr/local/bin/R") roptions("--vanilla")
Assumed R program path: "/usr/local/bin/R"
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Beginning of R output
R version 3.2.4 (2016-03-10) -- "Very Secure Dishes"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library("foreign")
> library("sandwich")
> library("lmtest")
> mock<-read.dta("~/Desktop/mock.dta")
> glmm<-glm(formula=y ~ x + w, family=quasipoisson(link="log"),data=mock)
>
> sandwich1 <- function(object, ...) sandwich(object) * nobs(object) / (nobs(object) - 1)
> coeftest(glmm,vcov=sandwich1)
z test of coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.516969 0.098062 5.2718 1.351e-07 ***
x 0.125657 0.101591 1.2369 0.2161
w 0.013410 0.710752 0.0189 0.9849
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
End of R output
.
. ppml y x w
note: checking the existence of the estimates
Number of regressors excluded to ensure that the estimates exist: 0
Number of observations excluded: 0
note: starting ppml estimation
note: y has noninteger values
Iteration 1: deviance = 139.7855
Iteration 2: deviance = 137.7284
Iteration 3: deviance = 137.7222
Iteration 4: deviance = 137.7222
Number of parameters: 3
Number of observations: 100
Pseudo log-likelihood: -173.89764
R-squared: .01628639
Option strict is: off
------------------------------------------------------------------------------
| Robust
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | .1256565 .1015913 1.24 0.216 -.0734588 .3247718
w | .0134101 .7107518 0.02 0.985 -1.379638 1.406458
_cons | .5169689 .0980624 5.27 0.000 .3247702 .7091676
------------------------------------------------------------------------------
这是生成上述输出的 Stata/R 代码。我正在使用rsource从 Stata 运行 R(您需要调整rpath()以下内容以匹配您的设置),但这并不是必需的:您可以只rsource从 R 运行该部分。
clear
set more off
capture ssc install rsource
use http://personal.lse.ac.uk/tenreyro/mock, clear
saveold ~/Desktop/mock, version(12) replace
rsource, terminator(XXX) rpath("/usr/local/bin/R") roptions("--vanilla")
library("foreign")
library("sandwich")
library("lmtest")
mock<-read.dta("~/Desktop/mock.dta")
glmm<-glm(formula=y ~ x + w, family=quasipoisson(link="log"),data=mock)
sandwich1 <- function(object, ...) sandwich(object) * nobs(object) / (nobs(object) - 1)
coeftest(glmm,vcov=sandwich1)
XXX
ppml y x w