{"id":520,"date":"2014-07-06T00:17:03","date_gmt":"2014-07-06T04:17:03","guid":{"rendered":"http:\/\/zhanxw.com\/blog\/?p=520"},"modified":"2014-07-06T00:17:59","modified_gmt":"2014-07-06T04:17:59","slug":"%e5%8d%a1%e6%96%b9%e6%a3%80%e9%aa%8c","status":"publish","type":"post","link":"https:\/\/zhanxw.com\/blog\/2014\/07\/%e5%8d%a1%e6%96%b9%e6%a3%80%e9%aa%8c\/","title":{"rendered":"Chi-square test"},"content":{"rendered":"<p>Chi-square test<\/p>\n<p>The full name of the chi-square test is Pearson&#8217;s chi-square test, and it is one of the most widely used tests in statistics.<br \/>\nIn terms of use, the chi-square test can do two things:<br \/>\n(1) Goodness-of-fit test: check whether a sample follows a given probability distribution;<br \/>\n(2) Independence test: check whether several variables are independent.<\/p>\n<p>This post is not about how to use the chi-square test. Instead, it covers the formula, its derivation, and its connection to the likelihood-ratio test, with the aim of learning something new by reviewing the old.<\/p>\n<p>The chi-square test statistic is:<br \/>\n[mathjax]<br \/>\n$$ \\chi^2 = \\sum_i \\frac{(O_i - E_i)^2}{E_i} $$<br \/>\nHere [latex]O_i[\/latex] is the observed count and [latex]E_i[\/latex] is the expected count.<br \/>\nThe statistic [latex]\\chi^2[\/latex] follows the chi-square distribution with [latex]df[\/latex] degrees of freedom \uff08[latex]\\chi^2[\/latex] 
distribution\uff09.<\/p>\n<p>Suppose we have a random variable that may follow a multinomial distribution, [latex]X \\sim Multinomial(N, p_1, p_2, \\ldots, p_k)[\/latex].<br \/>\nThe observed counts in the [latex]k[\/latex] categories are [latex]x_1, x_2, \\ldots, x_k[\/latex].<br \/>\nTo test statistically whether [latex]X[\/latex] follows this distribution, take the null hypothesis [latex]H_0[\/latex] to be that [latex]X[\/latex] follows this multinomial distribution and the alternative hypothesis to be that [latex]X[\/latex] does not. With [latex]O_i = x_i[\/latex] and [latex]E_i = N \\cdot p_i[\/latex], we get:<\/p>\n<p>$$ \\chi^2 = \\sum_i \\frac{(O_i - E_i)^2}{E_i} = \\sum_i \\frac{(O_i - N \\cdot p_i)^2}{N \\cdot p_i}$$<\/p>\n<p>Note that the denominator of each term is [latex] N \\cdot p_i [\/latex], whereas according to the Central Limit Theorem,<br \/>\n$$\\frac{O_i - N \\cdot p_i}{\\sqrt{N \\cdot p_i \\cdot (1-p_i)}} \\rightarrow Normal(0, 1)$$<\/p>\n<p>Comparing this with the chi-square formula raises two questions:<br \/>\n1. Why is the chi-square statistic not the sum of squares of the [latex]k[\/latex] terms [latex]\\frac{O_i - N \\cdot p_i}{\\sqrt{N \\cdot p_i \\cdot (1-p_i)}}[\/latex]?<br \/>\n2. Why does [latex] \\chi^2 [\/latex] have \\( k-1 \\) degrees of freedom rather than \\( k \\)?<\/p>\n<p>One way to answer both questions is to derive the distribution of \\( \\chi^2 \\).<br \/>\nFollowing [1], the derivation goes as follows:<\/p>\n<p>1. 
Let \\( Z_i = \\frac{O_i - N \\cdot p_i}{\\sqrt{N \\cdot p_i}} \\), so that \\( \\chi^2 = \\sum_i Z_i^2 \\).<\/p>\n<p>2.<br \/>\nBy the Central Limit Theorem,<br \/>\n\\( Z_i \\rightarrow \\sqrt{1-p_i} \\cdot N(0, 1) = N(0, 1-p_i) \\)<br \/>\nand, since \\( Cov(O_i, O_j) = -N \\cdot p_i \\cdot p_j \\) for \\( i \\neq j \\),<br \/>\n\\( Cov(Z_i, Z_j) = \\frac{Cov(O_i, O_j)}{N \\cdot \\sqrt{p_i \\cdot p_j}} = -\\sqrt{p_i \\cdot p_j} \\)<\/p>\n<p>3. To find the limiting distribution of \\( Z_1^2 + Z_2^2 + \\ldots + Z_k^2 \\),<br \/>\nfirst let \\( \\vec{g} = (g_1, g_2, \\ldots, g_k) \\) be i.i.d. standard normal random variables,<br \/>\nand let \\( \\vec{p} = (\\sqrt{p_1}, \\sqrt{p_2}, \\ldots, \\sqrt{p_k}) \\), which is a unit vector because \\( \\sum_i p_i = 1 \\).<br \/>\nSubtracting from \\( \\vec{g} \\) its projection onto \\( \\vec{p} \\) gives<br \/>\n\\( \\vec{g^p} = \\vec{g} - ( \\vec{g} \\cdot \\vec{p} ) \\cdot \\vec{p} \\), whose distribution is the same as the limiting joint distribution of \\( Z_1, Z_2, \\ldots, Z_k \\).<br \/>\nTo prove this, one can check that the second moments match: \\( E(Z_i^2) = E((g^p_i)^2) = 1 - p_i \\) and \\( E(Z_i Z_j) = E(g^p_i \\cdot g^p_j) = -\\sqrt{p_i \\cdot p_j} \\) for \\( i \\neq j \\).<br \/>\nWith \\( \\vec{g^p} \\) in hand, the limit of \\( \\chi^2 \\) can be written as \\( \\sum_i (g^p_i)^2 \\), that is, the squared length of \\( \\vec{g^p} \\).<\/p>\n<p>4. 
Next, compute the distribution of \\( \\vec{g^p} \\).<br \/>\nThe components of \\( \\vec{g} \\) are i.i.d. standard normal.<br \/>\nBy a property of the normal distribution, after an orthogonal transformation of the coordinates of \\( \\vec{g} \\), the new coordinates are still i.i.d. standard normal (see [2]).<br \/>\nSo we construct a particular orthogonal transformation in which one of the new coordinate axes points in the direction of \\( \\vec{p} \\).<br \/>\nIn the new coordinates \\( \\vec{g} \\) can be written as \\( (g'_1, g'_2, \\ldots, g'_k) \\), with the last axis along \\( \\vec{p} \\), and<br \/>\n\\( \\vec{g^p} = (g'_1, g'_2, \\ldots, g'_{k-1}, 0) \\).<br \/>\nTherefore \\( \\chi^2 = \\sum_{i=1}^{k-1} (g'_i)^2 \\sim \\chi^2_{k-1} \\).<\/p>\n<p>The four steps above prove the chi-square test theoretically. Another approach is to connect the likelihood-ratio test (G-test) with the chi-square test:<br \/>\n$$ \\begin{align} G<br \/>\n&#038;= 2 \\sum_i O_i \\log(\\frac{O_i}{E_i}) \\\\<br \/>\n&#038;= 2 \\sum_i O_i \\log(1 + \\frac{O_i - E_i}{E_i}) \\\\<br \/>\n&#038;\\approx 2 \\sum_i O_i \\left( \\frac{O_i - E_i}{E_i} - \\frac{(O_i - E_i)^2}{2 E_i^2} \\right) \\\\<br \/>\n&#038;\\approx 2 \\sum_i (O_i - E_i) + \\sum_i \\frac{(O_i - E_i)^2}{E_i} \\\\<br \/>\n&#038;= \\sum_i \\frac{(O_i - E_i)^2}{E_i} = \\chi^2<br \/>\n\\end{align}<br \/>\n$$<br \/>\nThe calculation above uses \\( \\log(1+x) \\approx x - x^2\/2 \\) (the second-order term is needed because the first-order terms sum to zero), drops terms of third order in \\( O_i - E_i \\), and uses \\( \\sum_i O_i = \\sum_i E_i \\).<br 
\/>\nSo the two tests are essentially equivalent (the derivation above follows [3], in a more concise form).<\/p>\n<p>However, compared with the chi-square test, the likelihood-ratio test requires computing logarithms, which was very inconvenient in Pearson&#8217;s day.<br \/>\nThe chi-square test, by contrast, is easy to compute by hand, which is perhaps one of the reasons it became so widely used.<\/p>\n<p>[1] Pearson&#8217;s Theorem  <a href=\"http:\/\/ocw.mit.edu\/courses\/mathematics\/18-443-statistics-for-applications-fall-2003\/lecture-notes\/lec23.pdf\" target=\"_blank\">http:\/\/ocw.mit.edu\/courses\/mathematics\/18-443-statistics-for-applications-fall-2003\/lecture-notes\/lec23.pdf<\/a><br \/>\n[2] Orthogonal Transformation of Standard Normal Sample  <a href=\"http:\/\/ocw.mit.edu\/courses\/mathematics\/18-443-statistics-for-applications-fall-2003\/lecture-notes\/lec15.pdf\" target=\"_blank\">http:\/\/ocw.mit.edu\/courses\/mathematics\/18-443-statistics-for-applications-fall-2003\/lecture-notes\/lec15.pdf<\/a><br \/>\n[3] The Two-Way Likelihood Ratio (G) Test  <a href=\"http:\/\/arxiv.org\/pdf\/1206.4881v2.pdf\" target=\"_blank\">http:\/\/arxiv.org\/pdf\/1206.4881v2.pdf<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Chi-square test. The full name of the chi-square test is Pearson&#8217;s chi-square test, and it is one of the most widely used tests in statistics. In terms of use, the chi-square test can do two things: (1) Goodness-of-fit test: check whether a sample follows a given probability distribution; 
(2) Independence test: check whether several variables are independent. This post is not about how to use the chi-square test. Instead, it covers the formula, its derivation, and its connection to the likelihood-ratio test, with the aim of learning something new by reviewing the old. The chi-square test statistic is: [mathjax] $$ \\chi^2 = \\sum_i \\frac{(O_i - E_i)^2}{E_i} $$ Here [latex]O_i[\/latex] is the observed count and [latex]E_i[\/latex] is the expected count. The statistic [latex]\\chi^2[\/latex] follows the chi-square distribution with [latex]df[\/latex] degrees of freedom ([latex]\\chi^2[\/latex] distribution). Suppose we have a random variable that may follow a multinomial distribution, [latex]X \\sim Multinomial(N, p_1, p_2, \\ldots, p_k)[\/latex]. The observed counts in the [latex]k[\/latex] categories are [latex]x_1, x_2, \\ldots, x_k[\/latex]. 
To test statistically whether [latex]X[\/latex] follows this distribution, take the null hypothesis [latex]H_0[\/latex] to be that [latex]X[\/latex] follows this multinomial distribution and the alternative hypothesis to be that [latex]X[\/latex] does not. With [latex]O_i = x_i[\/latex] and [latex]E_i = N \\cdot p_i[\/latex], we get: $$ \\chi^2 = \\sum_i \\frac{(O_i - E_i)^2}{E_i} = \\sum_i \\frac{(O_i - N \\cdot p_i)^2}{N \\cdot [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[15],"tags":[134,133,19],"class_list":["post-520","post","type-post","status-publish","format-standard","hentry","category-statistics","tag-chisquare","tag-pearson","tag-statistics-2"],"_links":{"self":[{"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/posts\/520","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/comments?post=520"}],"version-history":[{"count":0,"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/posts\/520\/revisions"}],"wp:attachment":[{"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/media?parent=520"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/categories?post=520"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/zhanxw.com\/blog\/wp-json\/wp\/v2\/tags?post=520"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}