A data set on enforcement operations; illustrates Poisson regression
Source: R/holland2015.R
Holland (2015) data set used in Rainey (2023) to illustrate the difference between "average-of-simulations" and directly transformed quantities of interest. These data reproduce Model (1) for Bogota, Lima, and Santiago from Holland's Table 2 on p. 368. Note that the coefficients in Table 2 are standardized and exponentiated, and that they are reported with (Stata's) robust standard errors.
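The distinction between the two kinds of estimates can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not a reproduction of the analysis in Rainey (2023): the scenario (all predictors at their Santiago medians), the seed, and the number of simulations are arbitrary choices made here for illustration.

```r
# Minimal sketch: "directly transformed" vs. "average-of-simulations" estimates
# of an expected count from a Poisson regression (Santiago only).
set.seed(1234)  # arbitrary seed, chosen for illustration

holland <- crdata::holland2015
santiago <- holland[holland$city == "santiago", ]
f <- operations ~ lower + vendors + budget + population
fit <- glm(f, family = poisson, data = santiago)

beta_hat <- coef(fit)  # ML estimates
V <- vcov(fit)         # estimated covariance matrix

# Hypothetical scenario (an assumption): every predictor at its Santiago median
predictors <- c("lower", "vendors", "budget", "population")
x <- c(1, sapply(santiago[predictors], median))

# Directly transformed: plug the ML estimates into the quantity of interest
direct <- exp(sum(x * beta_hat))

# Average of simulations: draw coefficients from N(beta_hat, V),
# transform each draw, then average the transformed draws
n_sims <- 10000
k <- length(beta_hat)
z <- matrix(rnorm(n_sims * k), nrow = n_sims)
beta_tilde <- sweep(z %*% chol(V), 2, beta_hat, "+")
sims <- exp(beta_tilde %*% x)
avg_of_sims <- mean(sims)

c(direct = direct, avg_of_sims = avg_of_sims)
```

Because the exponential function is convex, the average of the simulated quantities tends to sit above the directly transformed estimate. The size of the gap depends on the scenario and on the uncertainty in the coefficients.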
Format
A data frame with 89 district-level observations across three cities:
city: the observation's city (Holland performs separate analyses for each city)
district: the name of the district (the unit of observation)
operations: the number of enforcement operations conducted per month by a district (averaged across three or more months per district)
lower: share of lower-class residents in a district
vendors: the number of unlicensed street vendors (in thousands)
budget: district budget per capita
population: district population (scale unclear)
For further details, see Holland (2015, p. 364) and pp. 3-6 of the file ajps12125-sup-0001-SupMat.docx available on the journal website.
Details
In Rainey (2023), I focus on the following hypothesis from Holland (2015):
"My first hypothesis is that enforcement operations drop off with the fraction of poor residents in an electoral district. So district poverty should be a negative and significant predictor of enforcement, but only in politically decentralized cities [Lima and Santiago]. Poverty should have no relationship with enforcement in politically centralized cities [Bogota] once one controls for the number of vendors" (p. 362).
References
Holland, Alisha C. 2015. "The Distributive Politics of Enforcement." American Journal of Political Science 59(2): 357–71. doi:10.1111/ajps.12125.
Holland, Alisha. 2014. "Replication data for: The Distributive Politics of Enforcement." Harvard Dataverse, V2. doi:10.7910/DVN/24859.
Rainey, Carlisle. 2023. "A Careful Consideration of CLARIFY: Simulation-Induced Bias in Point Estimates of Quantities of Interest." Political Science Research and Methods. Forthcoming. doi:10.1017/psrm.2023.8.
Examples
# a simple example
# load data
holland <- crdata::holland2015
# table of observations per city
table(holland$city)
#>
#>   bogota     lima santiago
#>       19       36       34
# formula corresponds to Model (1) for each city in Holland (2015), Table 2
f <- operations ~ lower + vendors + budget + population
# fit Poisson regression model for Santiago
fit <- glm(f, family = poisson, data = holland, subset = city == "santiago")
summary(fit)
#>
#> Call:
#> glm(formula = f, family = poisson, data = holland, subset = city ==
#> "santiago")
#>
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)
#> (Intercept)  2.6190747  0.5185130   5.051 4.39e-07 ***
#> lower       -0.0376151  0.0088506  -4.250 2.14e-05 ***
#> vendors     -0.2209762  0.1216055  -1.817   0.0692 .
#> budget      -0.0005983  0.0004912  -1.218   0.2232
#> population   0.0154822  0.0093948   1.648   0.0994 .
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for poisson family taken to be 1)
#>
#> Null deviance: 226.08 on 33 degrees of freedom
#> Residual deviance: 167.49 on 29 degrees of freedom
#> AIC: 226.79
#>
#> Number of Fisher Scoring iterations: 7
#>
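The hypothesis quoted in Details concerns all three cities, so it may help to fit the same formula to each city separately and compare the coefficient on `lower`. The following is a minimal sketch rather than a reproduction of Holland's Table 2: the estimates are unstandardized and on the log scale, and the standard errors are the classical ones from `glm()`, not Stata's robust standard errors. The object name `lower_coefs` is introduced here for illustration.

```r
# Minimal sketch: fit Model (1) separately for each city and collect
# the estimate and classical standard error for `lower`.
holland <- crdata::holland2015
f <- operations ~ lower + vendors + budget + population

cities <- c("bogota", "lima", "santiago")
lower_coefs <- t(sapply(cities, function(ct) {
  # subset to a single city, as Holland analyzes each city separately
  fit_ct <- glm(f, family = poisson, data = holland[holland$city == ct, ])
  summary(fit_ct)$coefficients["lower", c("Estimate", "Std. Error")]
}))

# one row per city; columns are the estimate and its standard error
lower_coefs
```

The Santiago row should match the `lower` line of the summary output above. Comparing these estimates with the published table would additionally require standardizing the predictors, exponentiating the coefficients, and computing robust standard errors.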