Why I Don't Like Coefficient Plots
Posted on July 6th, 2012
Over the last few days, I've written a couple of posts (here, here) about creating coefficient plots. I like them way better than tables, but I don't really see a need for them. In fact, I think they can be misleading. In this post, I explain why (for the most part) I do not include regression tables in my papers and what I use in their place.
To understand why regression tables can be misleading, let's return to an example that has come up a couple of times in the last few posts: a simple model of turning out to vote. I run the logistic regression model and obtain the coefficients below.
If we are willing to interpret this regression model causally, then it seems that education, for example, has a positive effect on the probability of turning out. However, that estimate holds all other explanatory variables in the model constant. Does this make sense? I don't think so. Education probably affects both union membership and partisan strength, so holding them constant removes the part of education's effect that works through them. To get a better estimate of the effect of education, we should probably leave the "intervening" variables of union membership and partisan strength out of the model. (This is something that Andrew and Jennifer talk about a good bit in their book, but I don't come across it much in political science research.)
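To make the point about intervening variables concrete, here is a minimal simulation sketch. It is not the turnout model from this post: the data are simulated, every coefficient is invented for illustration, and it uses a linear outcome fit by least squares rather than a logistic regression, purely to keep the arithmetic transparent. The variable names (`education`, `partisan`, `turnout`) and the helper `ols` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process; all numbers are invented for illustration.
education = rng.normal(size=n)
# Partisan strength is an intervening variable: it is partly caused by education.
partisan = 0.8 * education + rng.normal(size=n)
# Education affects the outcome directly (0.5) and through partisan strength (0.5 * 0.8).
turnout = 0.5 * education + 0.5 * partisan + rng.normal(size=n)


def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Leaving the intervening variable out recovers the total effect of education.
total_effect = ols(turnout, education)[1]
# Controlling for it leaves only the direct path.
direct_effect = ols(turnout, education, partisan)[1]

print(f"education coefficient, partisan strength omitted:  {total_effect:.2f}")
print(f"education coefficient, partisan strength included: {direct_effect:.2f}")
```

In this simulated setup the true total effect is 0.5 + 0.5 × 0.8 = 0.9 and the direct effect is 0.5, so the coefficient from the model that "controls" for partisan strength understates how much education matters overall.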
If the coefficients for control variables (predictors other than the primary explanatory variables) cannot be directly interpreted, why include them in the main text? In the interest of transparency and reproducibility, it makes sense to make these available to others in an appendix, but journal space is too valuable to include information so far removed from the main argument.
Instead of hypothesizing about control variables and then presenting (in a table or graph) and "interpreting" their coefficients, I prefer a detailed presentation and discussion of the effects of interest. Control variables should be mentioned, perhaps in the research design section, but discussion of their estimates is unnecessary.
A paper that I recently revised and resubmitted to the American Journal of Political Science illustrates (I think) the power of this approach.
I argue using a formal model that, contrary to the large literature on comparative electoral institutions, proportional electoral rules actually reduce parties' incentives to mobilize. I make three specific claims.
- Competitiveness Hypothesis: In both single-member district plurality (SMDP) and proportional representation (PR) systems, the mobilization effort by a district’s parties increases as the district’s competitiveness increases.
- Disproportionality Hypothesis: At any level of competitiveness in a district, the mobilization effort by the district’s parties is greater in SMDP systems than in PR systems.
- Interaction Hypothesis: The (positive) marginal effect of a district’s level of competitiveness on the mobilization effort by the district’s parties is greater under SMDP rules than under PR rules.
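Each of the three hypotheses above corresponds to a specific quantity that can be computed from a fitted model. The sketch below shows one way to compute them. It is only an illustration, not the model from the paper: it assumes a simple linear specification with an interaction term, and the coefficients `b0` through `b3`, the 0-to-1 competitiveness scale, and the function `predicted_effort` are all hypothetical.

```python
import numpy as np

# Hypothetical coefficients, invented for illustration only, for the linear sketch
# effort = b0 + b1*competitiveness + b2*smdp + b3*competitiveness*smdp
b0, b1, b2, b3 = 1.0, 0.5, 0.3, 0.4


def predicted_effort(competitiveness, smdp):
    """Predicted mobilization effort; smdp is 1 for SMDP rules and 0 for PR rules."""
    return b0 + b1 * competitiveness + b2 * smdp + b3 * competitiveness * smdp


grid = np.linspace(0.0, 1.0, 5)  # assumed competitiveness scale from 0 to 1

# Competitiveness Hypothesis: effort rises with competitiveness under both rules.
effect_pr = predicted_effort(1.0, 0) - predicted_effort(0.0, 0)
effect_smdp = predicted_effort(1.0, 1) - predicted_effort(0.0, 1)

# Disproportionality Hypothesis: SMDP minus PR effort at each level of competitiveness.
gap = predicted_effort(grid, 1) - predicted_effort(grid, 0)

# Interaction Hypothesis: the effect of competitiveness is larger under SMDP than PR.
interaction = effect_smdp - effect_pr

print("effect of competitiveness under PR:  ", round(effect_pr, 2))
print("effect of competitiveness under SMDP:", round(effect_smdp, 2))
print("SMDP minus PR across the grid:       ", np.round(gap, 2))
print("difference in effects (SMDP - PR):   ", round(interaction, 2))
```

In practice each of these quantities would be reported with an uncertainty estimate, but the structure is the same: one quantity per hypothesis, rather than one coefficient per variable.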
The typical approach in political science would be to run a regression model, put the coefficients in a table (or graph), and star the ones that are statistically significant. A sophisticated researcher might include Brambor, Clark, and Golder's "marginal effect" plots.
Instead, I prefer a presentation focused on the quantities relevant to the hypotheses. I provide these figures below. Together with their detailed captions, the figures illustrate how powerful this approach can be.