
Weisiger (2014) data set used in Rainey and McCaskey (2021) to illustrate Firth's (1993) penalized maximum likelihood estimator. These data reproduce Model 3 in Table 2 of Weisiger (2014, p. 370).




A data frame with 35 observations and seven variables:

resist
whether conquest is followed by significant guerrilla resistance

polity_conq
conqueror’s Polity score

lndist
intercapital distance, logged

terrain
the percentage of a conquered country’s territory that is mountainous

soldperterr
the density of the occupying force, calculated by dividing force size by the area of the conquered country

gdppc2
gross domestic product (GDP) per capita

coord
whether the pre-conquest political leader of the country, who is the most natural leader for any guerrilla resistance, remained free to operate in the country

For further details, see Weisiger (2014, pp. 365-366).


Firth, David. 1993. "Bias Reduction of Maximum Likelihood Estimates." Biometrika 80(1): 27-38. doi:10.1093/biomet/80.1.27

Rainey, Carlisle and Kelly McCaskey. 2021. "Estimating Logit Models with Small Samples." Political Science Research and Methods 9(3): 549-564. doi:10.1017/psrm.2021.9

Weisiger, Alex. 2014. "Victory Without Peace: Conquest, Insurgency, and War Termination." Conflict Management and Peace Science 31(4): 357-382. doi:10.1177/0738894213508691

Weisiger, Alex. 2014. "Replication data for: Victory without Peace: Conquest, Insurgency, and War Termination." Harvard Dataverse, V1. doi:10.7910/DVN/OPCGOE/Q40MGO


# a simple example

weis <- crdata::weisiger2014

# formula for Model 3 in Table 2 of Weisiger (2014)
f <- resist ~ polity_conq + lndist + terrain + soldperterr + gdppc2 + coord

# reproduce Weisiger's LPM estimates
ls <- lm(f, data = weis) # linear probability model
summary(ls)
#> Call:
#> lm(formula = f, data = weis)
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.60206 -0.26321  0.01564  0.18910  0.74388 
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)   
#> (Intercept) -1.619e+00  6.229e-01  -2.599  0.01475 * 
#> polity_conq -2.248e-02  1.350e-02  -1.665  0.10705   
#> lndist       2.433e-01  8.257e-02   2.946  0.00642 **
#> terrain      5.191e-03  3.794e-03   1.368  0.18211   
#> soldperterr -2.544e-02  3.410e-02  -0.746  0.46195   
#> gdppc2      -3.787e-05  4.521e-05  -0.838  0.40938   
#> coord        4.365e-01  1.522e-01   2.867  0.00778 **
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> Residual standard error: 0.3428 on 28 degrees of freedom
#> Multiple R-squared:  0.6082,	Adjusted R-squared:  0.5242 
#> F-statistic: 7.244 on 6 and 28 DF,  p-value: 9.718e-05

# fit a logit model
mle <- glm(f, data = weis, family = "binomial") # logistic regression
summary(mle)
#> Call:
#> glm(formula = f, family = "binomial", data = weis)
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)  
#> (Intercept) -2.820e+01  1.409e+01  -2.001   0.0454 *
#> polity_conq -3.442e-01  2.200e-01  -1.565   0.1176  
#> lndist       3.539e+00  1.884e+00   1.878   0.0604 .
#> terrain      3.385e-02  5.957e-02   0.568   0.5698  
#> soldperterr -6.011e-02  3.982e-01  -0.151   0.8800  
#> gdppc2      -9.639e-04  1.046e-03  -0.922   0.3566  
#> coord        5.648e+00  3.087e+00   1.830   0.0673 .
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> (Dispersion parameter for binomial family taken to be 1)
#>     Null deviance: 47.111  on 34  degrees of freedom
#> Residual deviance: 15.513  on 28  degrees of freedom
#> AIC: 29.513
#> Number of Fisher Scoring iterations: 7
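
The description above mentions Firth's (1993) penalized maximum likelihood estimator, which for a logit model maximizes the log-likelihood plus half the log-determinant of the Fisher information (a Jeffreys-prior penalty). The following is a minimal base-R sketch of that idea, not the implementation used by Rainey and McCaskey (2021). The function name `firth_logit` is hypothetical, and the toy data are made up purely for illustration. In practice, dedicated packages such as brglm2 or logistf are likely a better choice.

```r
# Sketch of Firth's penalized likelihood for a logit model:
#   l*(b) = l(b) + 0.5 * log det(X' W X),  with W = diag(p * (1 - p))
# `firth_logit` is a hypothetical helper, not part of any package.
firth_logit <- function(X, y) {
  pen_loglik <- function(b) {
    eta <- drop(X %*% b)
    p <- plogis(eta)
    ll <- sum(y * eta - log1p(exp(eta)))       # logit log-likelihood
    info <- crossprod(X, X * (p * (1 - p)))    # Fisher information X'WX
    ll + 0.5 * as.numeric(determinant(info, logarithm = TRUE)$modulus)
  }
  # maximize (fnscale = -1) with a general-purpose optimizer, starting at zero
  fit <- optim(rep(0, ncol(X)), pen_loglik, method = "BFGS",
               control = list(fnscale = -1, reltol = 1e-12))
  setNames(fit$par, colnames(X))
}

# made-up toy data with quasi-complete separation: y is always 1 when x = 1,
# so the ordinary maximum likelihood slope is not finite
x <- rep(c(0, 1), each = 4)
y <- c(0, 0, 0, 1, 1, 1, 1, 1)
X <- cbind("(Intercept)" = 1, x = x)

est <- firth_logit(X, y)
print(est) # penalized estimates remain finite
```

With the objects defined in the example above, `firth_logit(model.matrix(f, data = weis), weis$resist)` would apply the same sketch to the Weisiger data, assuming `resist` is coded 0/1 with no missing values. The covariates are on very different scales, so a general-purpose optimizer such as `optim()` may need rescaled inputs to converge reliably.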