Logistic Regression with a Small Sample • crdata

First, load the weisiger2014 data set.

library(tidyverse)

# load weisiger2014 data
weis <- crdata::weisiger2014

# show weisiger2014 data set
glimpse(weis)
#> Rows: 35
#> Columns: 7
#> $ resist      <dbl> 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0…
#> $ polity_conq <dbl> -6, -5, 3, -6, -6, -9, -9, -5, -9, -4, -9, -9, -4, -9, -7,…
#> $ lndist      <dbl> 9.127671, 5.483439, 7.811700, 7.305770, 7.584401, 6.358510…
#> $ terrain     <dbl> 31.2, 22.2, 47.6, 1.1, 28.4, 0.0, 0.0, 10.5, 10.5, 21.6, 2…
#> $ soldperterr <dbl> -3.97814370, -3.10807820, -3.89651870, -1.29540790, -4.695…
#> $ gdppc2      <dbl> 685.9000, 1457.0000, 912.1334, 676.0000, 681.7000, 4831.00…
#> $ coord       <dbl> 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1…

To facilitate interpretation and comparability, rescale the explanatory variables using arm::rescale() (i.e., so that continuous explanatory variables have mean zero and SD 0.5 and the binary explanatory variables to have mean zero.)

# rescale weisiger2014 explanatory variables using arm::rescale()
rs_weis <- weis %>%
  mutate(across(polity_conq:coord, arm::rescale)) %>%
  glimpse()
#> Rows: 35
#> Columns: 7
#> $ resist      <dbl> 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0…
#> $ polity_conq <dbl> -0.18230763, -0.10050293, 0.55393473, -0.18230763, -0.1823…
#> $ lndist      <dbl> 0.934818228, -0.938114356, 0.258482472, -0.001537566, 0.14…
#> $ terrain     <dbl> 0.32513516, 0.09430898, 0.74575175, -0.44685017, 0.2533225…
#> $ soldperterr <dbl> -0.641316029, -0.467859669, -0.625043259, -0.106485507, -0…
#> $ gdppc2      <dbl> -0.354801313, -0.127853930, -0.288217123, -0.357715045, -0…
#> $ coord       <dbl> 0.7428571, -0.2571429, 0.7428571, 0.7428571, -0.2571429, -…

Now we fit three models.

Least squares, which replicates Model 3 of Table 2 of Weisiger (2014).
A logit alternative, fit with maximum likelihood.
A logit alternative, fit with penalized maximum likelihood using the {brglm2} package.

# fit models 
f <- resist ~ polity_conq + lndist + terrain + soldperterr + gdppc2 + coord
ls <- lm(f, data = rs_weis) # lpm (replicates Table 2 Model 3 of Weisiger 2014)
mle <- glm(f, data = rs_weis, family = "binomial") # logistic regression
pmle <- glm(f, data = rs_weis, family = "binomial", method = brglm2::brglmFit) # logistic regression

To interpret the results, we might focus on Weisiger’s Hypothesis 12.

Hypothesis: Resistance will be more likely when the pre-war political leader remains at large in the conquered country.

The table below compares the estimates.

modelsummary::modelsummary(list("Least Squares" = ls, 
                                "Logit via MLE" = mle, 
                                "Logit via PMLE" = pmle))

	Least Squares	Logit via MLE	Logit via PMLE
(Intercept)	0.400	−0.731	−0.477
	(0.058)	(0.832)	(0.523)
polity_conq	−0.275	−4.208	−2.277
	(0.165)	(2.689)	(1.657)
lndist	0.473	6.886	3.402
	(0.161)	(3.666)	(1.828)
terrain	0.202	1.320	1.102
	(0.148)	(2.323)	(1.358)
soldperterr	−0.128	−0.302	−0.595
	(0.171)	(1.997)	(1.419)
gdppc2	−0.129	−3.275	−1.154
	(0.154)	(3.553)	(1.833)
coord	0.436	5.648	3.051
	(0.152)	(3.087)	(1.548)
Num.Obs.	35	35	35
R2	0.608
R2 Adj.	0.524
AIC	32.6	29.5	32.1
BIC	45.0	40.4	43.0
Log.Lik.	−8.291	−7.756	−9.074
F	7.244	1.210	1.576
RMSE	0.31	0.27	0.27

The plot below compares the estimates. Notice that Firth’s PML estimator produces smaller estimates and CIs.

modelsummary::modelplot(list("Logit via MLE" = mle, "Logit via PMLE" = pmle), conf_level = 0.90)