A data set on enforcement operations; illustrates Poisson regression
Source: R/holland2015.R
Holland (2015) data set used in Rainey (2023) to illustrate the difference between "average-of-simulations" and directly transformed estimates of quantities of interest. These are the data needed to reproduce Model (1) for Bogota, Lima, and Santiago from Holland's Table 2 on p. 368. Note that Table 2 reports standardized, exponentiated coefficients with (Stata's) robust standard errors, so the output of an unadjusted glm() fit is on a different scale.
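To get closer to the scale of Table 2, one could standardize the predictors, exponentiate the coefficients, and compute robust standard errors. The following is only a rough sketch for Santiago, not Holland's exact procedure. It assumes the crdata package is installed, assumes "standardized" means rescaling each predictor to mean 0 and standard deviation 1, and computes a plain sandwich (HC0) covariance by hand, whereas Stata applies its own finite-sample correction. The names `fit_std`, `irr`, and `se_robust` are illustrative.

```r
# load the data and keep the Santiago districts (assumes crdata is installed)
holland <- crdata::holland2015
santiago <- subset(holland, city == "santiago")

# assumption: "standardized" means each predictor has mean 0 and SD 1
predictors <- c("lower", "vendors", "budget", "population")
santiago[predictors] <- lapply(santiago[predictors],
                               function(v) as.numeric(scale(v)))

fit_std <- glm(operations ~ lower + vendors + budget + population,
               family = poisson, data = santiago)

# exponentiate: multiplicative change in expected operations per 1-SD increase
irr <- exp(coef(fit_std))

# sandwich (HC0) covariance for a Poisson model with log link, by hand;
# Stata's robust option uses its own small-sample scaling, so values may differ
X <- model.matrix(fit_std)
mu <- fitted(fit_std)
y <- fit_std$y
bread <- solve(t(X) %*% (X * mu))      # inverse of X' diag(mu) X
meat <- t(X) %*% (X * (y - mu)^2)      # X' diag((y - mu)^2) X
se_robust <- sqrt(diag(bread %*% meat %*% bread))

cbind(irr = irr, robust_se_log_scale = se_robust)
```

Under these assumptions, `irr` gives the multiplicative change in expected operations for a one-standard-deviation increase in each predictor, so the values need not match Table 2 exactly.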
Format
A data frame with 89 district-level observations across three cities:
city
the observation's city (Holland performs separate analyses for each city)
district
the name of the district (the unit of observation)
operations
the number of enforcement operations conducted per month by a district (averaged across three or more months per district)
lower
share of lower-class residents in a district
vendors
the number of unlicensed street vendors (in thousands)
budget
district budget per capita
population
district population (unclear scale)
For further details, see Holland (2015, p. 364) and pp. 3-6 of the file ajps12125-sup-0001-SupMat.docx available on the journal website.
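Because Holland performs separate analyses for each city, one convenient way to work with these data is to split them by `city` and fit the same specification to each piece. The sketch below assumes the crdata package is installed and reuses the formula from the Examples section. The names `fits` and `lower_by_city` are illustrative, and the standard errors are the default model-based ones from glm(), not the robust standard errors reported in Holland's Table 2.

```r
# load the data (assumes the crdata package is installed)
holland <- crdata::holland2015

# same specification as model 1 in Holland (2015), Table 2
f <- operations ~ lower + vendors + budget + population

# fit a separate Poisson regression for each city
fits <- lapply(split(holland, holland$city), function(d) {
  glm(f, family = poisson, data = d)
})

# collect the estimate, model-based SE, z value, and p-value for `lower`
lower_by_city <- t(sapply(fits, function(m) coef(summary(m))["lower", ]))
lower_by_city
```

Each row of `lower_by_city` corresponds to one city. The Santiago row should agree with the `lower` line of the summary output in the Examples section.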
Details
In Rainey (2023), I focus on the following hypothesis:
"My first hypothesis is that enforcement operations drop off with the fraction of poor residents in an electoral district. So district poverty should be a negative and significant predictor of enforcement, but only in politically decentralized cities [Lima and Santiago]. Poverty should have no relationship with enforcement in politically centralized cities [Bogota] once one controls for the number of vendors (p. 362)"
References
Holland, Alisha C. 2015. "The Distributive Politics of Enforcement." American Journal of Political Science 59(2): 357–71. doi:10.1111/ajps.12125.
Holland, Alisha. 2014. "Replication data for: The Distributive Politics of Enforcement." Harvard Dataverse, V2. doi:10.7910/DVN/24859.
Rainey, Carlisle. 2023. "A Careful Consideration of CLARIFY: Simulation-Induced Bias in Point Estimates of Quantities of Interest." Political Science Research and Methods. Forthcoming. doi:10.1017/psrm.2023.8.
Examples
# a simple example
# load data
holland <- crdata::holland2015
# table of observations per city
table(holland$city)
#>
#>   bogota     lima santiago
#>       19       36       34
# formula corresponds to Model 1 for each city in Holland (2015), Table 2
f <- operations ~ lower + vendors + budget + population
# fit poisson regression model for Santiago
fit <- glm(f, family = poisson, data = holland, subset = city == "santiago")
summary(fit)
#>
#> Call:
#> glm(formula = f, family = poisson, data = holland, subset = city ==
#>     "santiago")
#>
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)
#> (Intercept)  2.6190747  0.5185130   5.051 4.39e-07 ***
#> lower       -0.0376151  0.0088506  -4.250 2.14e-05 ***
#> vendors     -0.2209762  0.1216055  -1.817   0.0692 .
#> budget      -0.0005983  0.0004912  -1.218   0.2232
#> population   0.0154822  0.0093948   1.648   0.0994 .
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for poisson family taken to be 1)
#>
#> Null deviance: 226.08 on 33 degrees of freedom
#> Residual deviance: 167.49 on 29 degrees of freedom
#> AIC: 226.79
#>
#> Number of Fisher Scoring iterations: 7
#>
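As a further sketch, the code below contrasts the two kinds of point estimate that Rainey (2023) compares, using the Santiago model: a directly transformed estimate (plug the coefficient estimates into the quantity of interest) and an average of simulations (draw coefficients from their estimated sampling distribution, transform each draw, then average). It assumes the crdata package is installed. The scenario (every predictor at its Santiago median), the number of simulations, the seed, and the names `qi_direct` and `qi_avg` are illustrative assumptions, not taken from the paper.

```r
# load the data and fit the Santiago model (assumes crdata is installed)
holland <- crdata::holland2015
f <- operations ~ lower + vendors + budget + population
fit <- glm(f, family = poisson, data = holland, subset = city == "santiago")

# hypothetical scenario: every predictor at its Santiago median
santiago <- subset(holland, city == "santiago")
predictors <- c("lower", "vendors", "budget", "population")
x <- c(1, sapply(santiago[, predictors], median))  # leading 1 for the intercept

beta_hat <- coef(fit)
V <- vcov(fit)

# direct transformation: expected operations at the coefficient estimates
qi_direct <- exp(sum(x * beta_hat))

# average of simulations: draw beta ~ N(beta_hat, V), transform, then average
set.seed(1234)
n_sims <- 10000
z <- matrix(rnorm(n_sims * length(beta_hat)), nrow = n_sims)
beta_sims <- sweep(z %*% chol(V), 2, beta_hat, "+")  # rows are simulated betas
qi_sims <- exp(beta_sims %*% x)
qi_avg <- mean(qi_sims)

c(direct = qi_direct, average_of_simulations = qi_avg)
```

Because exp() is convex, the average of simulations tends to sit above the directly transformed estimate (Jensen's inequality). How large the gap is depends on the uncertainty in the linear predictor for the chosen scenario.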