Feb 212011
 

We will briefly explain why we would be interested in implementing exact logistic regression, then provides C++ and R codes.

1. Why exact test?

Since we want to have a clear mind of how likely/unlikely the realization we observed. In the classic example of 2×2 table without covariates, especially the 2×2 table has very few (<5) occurrence, Fisher exact tests are often applied, and large sample theory cannot give an accurate estimation.

2. Why exact logistic regression?

Fisher’s exact test cannot applied to logistic model. For example, when we have covariates in the model, we want BOTH estimate the effect size and get its exact p-value. In this case, only exact logistic regression provides solution.The theoretical background is provided in reference [1].

My implementation:

Download: exactLogisticRegression.tar

1. I verified the results with SAS.
2. The speed is comparable to, or faster than SAS.

Cons:
1. Only 1 interested parameter conditioning on all other parameter is supported for now.
2. I have not implemented the confidence limit parts, as it’s a bit more tedious.

R binding : mypackage_1.0.tar

See mypackage/R/rcpp_hello_world.R, I wrote a R function to wrap the C++ function.

R binding is helped by RCpp package. It greatly reduced the workload of exchanging date (in the form of matrix, list, vector) between C++ and R. A quick tutorial can be found from RCpp homepage(http://dirk.eddelbuettel.com/code/rcpp.html). For experienced Rcpp user, the quick-ref documentation (http://dirk.eddelbuettel.com/code/rcpp/Rcpp-quickref.pdf) is helpful.

【1】 Exact Logistic Regression, Robert E. Derr, SAS Institute Inc., Cary, NC http://support.sas.com/rnd/app/da/new/daexactlogistic.html

【2】Rcpp: Rcpp: Seamless R and C++ Integration dirk.eddelbuettel.com/code/rcpp.html