Lagrange Multiplier Test (Score Test)

The score test takes its name from the score itself, that is, the first derivative of the log-likelihood.

Graphical representation of the Score Test, Likelihood Ratio Test, and Wald Test

(Source: http://www.ats.ucla.edu/stat/mult_pkg/faq/general/nested_tests.htm)

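The Likelihood Ratio Test computes $$\xi^R = 2 (l(\hat{a}) - l(0))$$, which is twice the length of the red vertical line. This statistic approximately follows a chi-squared distribution with 1 degree of freedom (assuming there is only one free parameter).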

$$\xi^R = 2 (l(\hat{a}) - l(0)) = 2 \log(L(\hat{a})/L(0)) \sim \chi^2_1$$

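The Score Test only needs the properties of the likelihood function at the point $$0$$, shown in blue in the figure (it examines the slope and the curvature at this point).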

\begin{align} \text{slope} &= U = l'(0) \\ |\text{curvature}| &= V = |l''(0)| = -l''(0) \\ \xi^S &= U' V^{-1} U = \frac{l'(0)^2}{|l''(0)|} \sim \chi^2_1 \end{align}

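The Wald Test instead examines the likelihood at the maximum likelihood estimate $$\hat{a}$$, where the slope vanishes:

\begin{align} \text{slope} &= l'(\hat{a}) = 0 \\ \xi^W &= \hat{a}^2 |l''(\hat{a})| \sim \chi^2_1 \end{align}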
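To make the three statistics concrete, here is a minimal sketch that computes them for one small data set. It assumes a Poisson model with rate $$e^a$$ and tests $$a = 0$$; the model, the data, and the function name `three_tests` are illustrative assumptions, not part of the original derivation.

```python
import math

def three_tests(x):
    """LR, score and Wald statistics for H0: a = 0 in a Poisson(exp(a)) model.

    Log-likelihood (constants dropped): l(a) = sum(x) * a - n * exp(a).
    """
    n, s = len(x), sum(x)

    def l(a):
        return s * a - n * math.exp(a)

    def d1(a):  # slope l'(a)
        return s - n * math.exp(a)

    def d2(a):  # second derivative l''(a), always negative here
        return -n * math.exp(a)

    a_hat = math.log(s / n)                  # MLE: solves l'(a) = 0
    lr = 2 * (l(a_hat) - l(0.0))             # twice the vertical gap
    score = d1(0.0) ** 2 / abs(d2(0.0))      # slope and curvature at 0
    wald = a_hat ** 2 * abs(d2(a_hat))       # curvature at the MLE
    return lr, score, wald

counts = [1, 2, 0, 1, 3, 1, 2, 1, 0, 2]      # made-up data
lr, score, wald = three_tests(counts)
print(f"LR={lr:.3f}  Score={score:.3f}  Wald={wald:.3f}")
```

For this data the three values are close but not identical, which is what one would expect from statistics that agree only asymptotically.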

The Score Test is asymptotically equivalent to the Likelihood Ratio Test and the Wald Test

$$\xi^{R} = 2 ( l(\hat{a}) - l(0))$$

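Expanding $$l(0)$$ to second order around $$\hat{a}$$ gives

$$l(0) \approx l(\hat{a}) - \hat{a} l'(\hat{a}) + \frac{1}{2} \hat{a}^2 l''(\hat{a})$$

Since $$l'(\hat{a}) = 0$$ at the maximum likelihood estimate, this simplifies to

$$l(0) \approx l(\hat{a}) + \frac{1}{2} \hat{a}^2 l''(\hat{a})$$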

$$\xi^R = 2 (l(\hat{a}) - l(0)) \approx \hat{a}^2 (-l''(\hat{a})) = \hat{a}^2 |l''(\hat{a})| = \xi^W$$

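Similarly, expanding $$l$$ and $$l'$$ around $$0$$ and using $$l'(\hat{a}) = 0$$:

\begin{align} l(\hat{a}) &\approx l(0) + \hat{a} l'(0) + \frac{1}{2}\hat{a}^2 l''(0) \\ l'(\hat{a}) &\approx l'(0) + \hat{a} l''(0) = 0 \end{align}

Solving the second line for $$\hat{a}$$:

$$\hat{a} = - \frac{l'(0)}{l''(0)}$$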

\begin{align} \xi^R &= 2 (l(\hat{a}) - l(0)) \approx 2 \hat{a} l'(0) + \hat{a}^2 l''(0) \\ &= - 2 \frac{l'(0)^2}{l''(0)} + \frac{l'(0)^2}{l''(0)} \\ &= \frac{l'(0)^2}{-l''(0)} \\ &= \frac{l'(0)^2}{|l''(0)|} \\ &= \xi^S \end{align}
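The second-order Taylor expansions used above are exact when the log-likelihood is quadratic. The following minimal sketch assumes i.i.d. $$N(a, 1)$$ observations (an illustrative model, not one from the original text), for which the three statistics coincide up to floating-point error. The function name and data are hypothetical.

```python
def quadratic_case(x):
    """LR, score and Wald statistics for H0: a = 0 with x_i ~ N(a, 1).

    l(a) = -0.5 * sum((x_i - a)^2) is exactly quadratic in a,
    so l'(a) = sum(x_i) - n * a and l''(a) = -n.
    """
    n = len(x)
    a_hat = sum(x) / n                       # MLE is the sample mean

    def l(a):
        return -0.5 * sum((xi - a) ** 2 for xi in x)

    lr = 2 * (l(a_hat) - l(0.0))
    score = sum(x) ** 2 / n                  # l'(0)^2 / |l''(0)|
    wald = a_hat ** 2 * n                    # a_hat^2 * |l''(a_hat)|
    return lr, score, wald

sample = [0.3, -1.2, 0.8, 1.5, 0.1, -0.4, 0.9, 1.1]  # made-up data
lr_q, score_q, wald_q = quadratic_case(sample)
print(lr_q, score_q, wald_q)
```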

General form of the Score Test

\begin{align} U &= \text{Score} = l'(\hat{\theta}) = l'( (\theta_1 = 0, \hat{\theta}_2) ) \\ V &= I^{11}_{\hat{\theta}} = (I_{11} - I_{12} (I_{22})^{-1} I_{21} )_{\hat{\theta}} \end{align}

$$I(\theta) = \left[ \begin{array}{cc} I_{11} & I_{12} \\ I_{21} & I_{22} \end{array} \right]_\theta = \left[ \begin{array}{cc} I_{11}(\theta) & I_{12}(\theta) \\ I_{21}(\theta) & I_{22}(\theta) \end{array} \right]$$

$$X_Z = (I-H) X = (I - Z (Z'Z)^{-1} Z' ) X$$

$$V_{XX} = X_Z' X_Z = X' (I-H)' (I-H) X = X' (I - H) X = X'X - X'Z(Z'Z)^{-1}Z'X$$
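The identity for $$V_{XX}$$ can be checked numerically. This minimal sketch assumes, for simplicity, that $$X$$ and $$Z$$ are single columns, so that $$(Z'Z)^{-1}$$ is a scalar reciprocal. The helper names `dot` and `residualize` and the data are hypothetical.

```python
def dot(u, v):
    """Inner product u'v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def residualize(x, z):
    """Return X_Z = (I - H) X, with H = Z (Z'Z)^{-1} Z' for a single column Z."""
    coef = dot(z, x) / dot(z, z)             # (Z'Z)^{-1} Z'X
    return [xi - coef * zi for xi, zi in zip(x, z)]

X = [1.0, 2.0, 4.0, 3.0, 5.0]   # made-up covariate of interest
Z = [1.0, 1.0, 1.0, 1.0, 1.0]   # made-up nuisance covariate (an intercept)

X_Z = residualize(X, Z)
lhs = dot(X_Z, X_Z)                              # X_Z' X_Z
rhs = dot(X, X) - dot(X, Z) ** 2 / dot(Z, Z)     # X'X - X'Z (Z'Z)^{-1} Z'X
print(lhs, rhs)
```

The residualized column is also orthogonal to $$Z$$, which is why the cross terms vanish when $$(I-H)'(I-H)$$ collapses to $$(I-H)$$.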

Some examples

1. Simple linear regression: $$Y = X b + \epsilon, \epsilon_i \sim N(0,\sigma^2)$$

$$l(b, \sigma^2) = - \frac{n}{2} \log(\sigma^2) - \frac{(Y-Xb)'(Y-Xb)}{2 \sigma^2}$$

$$U_b = \frac{\partial l}{\partial b}\Big|_{b=0} = Y'X / \hat{\sigma}^2 \\ V_{bb} = -\frac{\partial^2 l}{\partial b^2} = X'X / \hat{\sigma}^2$$

2. General linear regression: $$Y = X b + Z r + \epsilon, \epsilon_i \sim N(0,\sigma^2)$$
\begin{align} U_b &= \frac{\partial l}{\partial b}\Big|_{b=0,\, r=\hat{r}} = (Y - Z \hat{r})'X / \hat{\sigma}^2 \\ V_{bb} &= I_{bb} - I_{br} I_{rr}^{-1} I_{rb} = (X'X - X'Z(Z'Z)^{-1}Z'X) / \hat{\sigma}^2 \\ \hat{\sigma}^2 &= \frac{ (Y-Z \hat{r})'(Y-Z \hat{r})}{n} \\ \hat{r} &= (Z'Z)^{-1} Z'Y \end{align}

With $$H_z = Z(Z'Z)^{-1}Z'$$, these can be written as

\begin{align} U_b &= Y'(I-H_z)X / \hat{\sigma}^2 \\ V_{bb} &= X'(I-H_z)X / \hat{\sigma}^2 \\ \hat{\sigma}^2 &= \frac{ Y' (I-H_z) Y }{n} \end{align}

$$X_z = (I-H_z)X \\ Y_z = (I-H_z)Y$$
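Both regression examples can be sketched in a few lines. The code below assumes a single column $$X$$ and at most one nuisance column $$Z$$ (so no matrix inverse is needed). It follows the formulas above: residualize $$X$$ and $$Y$$ on $$Z$$, estimate $$\hat{\sigma}^2$$ under the null, then form $$U_b^2 / V_{bb}$$. The function names and the data are hypothetical.

```python
def dot(u, v):
    """Inner product u'v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def residualize(v, z):
    """(I - H_z) v for a single column z; returns a copy of v if z is None."""
    if z is None:
        return list(v)
    coef = dot(z, v) / dot(z, z)
    return [vi - coef * zi for vi, zi in zip(v, z)]

def score_test(y, x, z=None):
    """Score statistic for H0: b = 0 in Y = X b + Z r + eps."""
    n = len(y)
    x_z, y_z = residualize(x, z), residualize(y, z)
    sigma2 = dot(y_z, y_z) / n        # estimate of sigma^2 under H0
    u = dot(y_z, x_z) / sigma2        # U_b
    v = dot(x_z, x_z) / sigma2        # V_bb
    return u * u / v                  # approximately chi^2_1 under H0

y = [2.1, 2.9, 5.2, 4.1, 6.0]         # made-up response
x = [1.0, 2.0, 4.0, 3.0, 5.0]         # made-up covariate
ones = [1.0] * len(y)

s1 = score_test(y, x)                 # example 1: no nuisance covariate
s2 = score_test(y, x, ones)           # example 2: Z is an intercept
print(s1, s2)
```

When $$Z$$ is an intercept, the statistic reduces to $$n r^2$$, where $$r$$ is the sample correlation between $$X$$ and $$Y$$, so it can never exceed $$n$$.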

Speeding up C++ compilation

make CXX="ccache g++"


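The first build with ccache is slightly slower, because it has to populate its cache of compilation results. After that, rebuilds of unchanged sources are much faster.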