A data set on guerrilla resistance; illustrates logistic regression with a small sample

The Weisiger (2014) data set, used by Rainey and McCaskey (2021) to illustrate Firth's (1993) penalized maximum likelihood estimator. These data reproduce Model 3 in Table 2 of Weisiger (2014, p. 370).

Usage

weisiger2014

Format

A data frame with 35 observations and seven variables:

resist: whether conquest is followed by significant guerrilla resistance
polity_conq: conqueror’s Polity score
lndist: intercapital distance, logged
terrain: the percentage of a conquered country’s territory that is mountainous
soldperterr: the density of the occupying force, which is calculated by dividing force size by the area of the conquered country
gdppc2: gross domestic product (GDP) per capita
coord: whether the pre-conquest political leader of the country, who forms the most natural leader for any guerrilla resistance, remained free to operate in the country

For further details, see Weisiger (2014, pp. 365-366).

References

Firth, David. 1993. "Bias Reduction of Maximum Likelihood Estimates." Biometrika 80(1): 27-38. doi:10.1093/biomet/80.1.27

Rainey, Carlisle and Kelly McCaskey. 2021. "Estimating Logit Models with Small Samples." Political Science Research and Methods 9(3): 549-564. doi:10.1017/psrm.2021.9

Weisiger, Alex. 2014. "Victory Without Peace: Conquest, Insurgency, and War Termination." Conflict Management and Peace Science 31(4): 357-382. doi:10.1177/0738894213508691

Weisiger, Alex. 2014. "conq_ins_data.tab." Replication data for: Victory without Peace: Conquest, Insurgency, and War Termination. Harvard Dataverse, V1. doi:10.7910/DVN/OPCGOE/Q40MGO

Examples


# a simple example

weis <- crdata::weisiger2014

# formula for Model 3 in Table 2 of Weisiger (2014)
f <- resist ~ polity_conq + lndist + terrain + soldperterr + gdppc2 + coord

# reproduce Weisiger's LPM estimates
lpm <- lm(f, data = weis) # linear probability model (named lpm so it does not mask base::ls)
summary(lpm)
#> 
#> Call:
#> lm(formula = f, data = weis)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.60206 -0.26321  0.01564  0.18910  0.74388 
#> 
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)   
#> (Intercept) -1.619e+00  6.229e-01  -2.599  0.01475 * 
#> polity_conq -2.248e-02  1.350e-02  -1.665  0.10705   
#> lndist       2.433e-01  8.257e-02   2.946  0.00642 **
#> terrain      5.191e-03  3.794e-03   1.368  0.18211   
#> soldperterr -2.544e-02  3.410e-02  -0.746  0.46195   
#> gdppc2      -3.787e-05  4.521e-05  -0.838  0.40938   
#> coord        4.365e-01  1.522e-01   2.867  0.00778 **
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 0.3428 on 28 degrees of freedom
#> Multiple R-squared:  0.6082,	Adjusted R-squared:  0.5242 
#> F-statistic: 7.244 on 6 and 28 DF,  p-value: 9.718e-05
#> 

# fit a logit model
mle <- glm(f, data = weis, family = "binomial") # logistic regression
summary(mle)
#> 
#> Call:
#> glm(formula = f, family = "binomial", data = weis)
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)  
#> (Intercept) -2.820e+01  1.409e+01  -2.001   0.0454 *
#> polity_conq -3.442e-01  2.200e-01  -1.565   0.1176  
#> lndist       3.539e+00  1.884e+00   1.878   0.0604 .
#> terrain      3.385e-02  5.957e-02   0.568   0.5698  
#> soldperterr -6.011e-02  3.982e-01  -0.151   0.8800  
#> gdppc2      -9.639e-04  1.046e-03  -0.922   0.3566  
#> coord        5.648e+00  3.087e+00   1.830   0.0673 .
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 47.111  on 34  degrees of freedom
#> Residual deviance: 15.513  on 28  degrees of freedom
#> AIC: 29.513
#> 
#> Number of Fisher Scoring iterations: 7
#>
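
The logit estimates above are large relative to the sample size of 35, which is the small-sample setting that Firth's (1993) penalized maximum likelihood estimator targets. The block below is a minimal base-R sketch of that estimator for a logit model, using the modified score X'(y - p + h(1/2 - p)), where h is the diagonal of the hat matrix. The function name `firth_logit`, its arguments, and the `toy` data are hypothetical and are not part of crdata; the sketch uses plain Newton steps without step-halving, so it may fail to converge on difficult data. For real analyses, a tested implementation such as those in the brglm2 or logistf packages is the safer choice.

```r
# Hypothetical helper (not part of crdata): Firth-type penalized ML for logit.
# Iterates beta <- beta + (X'WX)^{-1} X'(y - p + h * (1/2 - p)).
firth_logit <- function(formula, data, tol = 1e-8, max_iter = 100) {
  mf <- model.frame(formula, data)
  X <- model.matrix(formula, mf)
  y <- as.numeric(model.response(mf))   # assumes a 0/1 numeric outcome
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    p <- as.vector(plogis(X %*% beta))  # fitted probabilities
    w <- p * (1 - p)                    # logit working weights
    info_inv <- solve(crossprod(X, w * X))       # inverse Fisher information
    h <- w * rowSums((X %*% info_inv) * X)       # hat-matrix diagonal
    score <- crossprod(X, y - p + h * (0.5 - p)) # Firth-modified score
    step <- as.vector(info_inv %*% score)
    beta <- beta + step
    if (max(abs(step)) < tol) {
      return(setNames(beta, colnames(X)))
    }
  }
  warning("firth_logit did not converge within max_iter iterations")
  setNames(beta, colnames(X))
}

# Toy data with complete separation (made up for illustration only):
# ordinary ML estimates diverge here, while the penalized estimates stay finite.
toy <- data.frame(y = c(0, 0, 0, 1, 1, 1), x = 1:6)
b_toy <- firth_logit(y ~ x, toy)
print(b_toy)

# With the objects from the example above (assumes crdata is installed):
# firth_logit(f, weis)
```

Starting from zero and taking full Newton steps keeps the sketch short; production implementations typically add safeguards such as step-size limits.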