"Statistically Insignificant" or "Non-Statistically Significant"

I use the phrases "statistical insignificance" and "statistically insignificant" often, but I was recently informed that these terms are not correct. Instead, I was told to say something like "non-statistically significant." In light of this, I'm careful to say "not statistically significant" or "a lack of statistical significance" in my forthcoming AJPS article.

Since then, though, I've been paying attention and notice that researchers smarter than me use both, so I'm not too worried about the distinction.

In Bayesian Methods and "The Insignificance of Null Hypothesis Significance Testing", Jeff Gill uses the phrase "non-statistically significant."

In his blog posts and articles, Andrew Gelman seems to prefer phrases like "statistically insignificant" and "nonsignificant."

I think I prefer "statistically insignificant," since the negation more clearly applies to "significant." "Non-statistically significant" makes it sound like we're talking about some other kind of significance, such as substantive significance.

The danger is that the word "insignificant" implies there is "no effect."

I'd be curious to know what others think.

Some Initial Observations on Replications as Class Projects

I taught the graduate course in linear models at UB last semester and a major portion of the course was a replication project. Here are a few quick observations.

  1. Building the course around a replication project has made organizing the course a lot easier. After all, my ultimate goal at the end of the semester is that students be able to run their own regression. Since students are working on their replication projects and asking questions, I have a pretty good sense of what I should be talking about next.
  2. Based on my discussions with some of the students in the class, the replication project gives the in-class discussions and readings a sense of purpose. I usually try to set up the readings during class, have the students do the necessary reading, and then apply those ideas to their projects. I've given them a clear target to reach by the end of the semester (a high-quality quantitative analysis) and they can see how each topic we discuss helps them get closer to that goal. It's still early, but I think it's been effective so far.
  3. I'm encouraged by the availability of data--it seems like researchers are doing a better job of posting it. Under Rick Wilson's editorship, AJPS has become an example for other journals to follow.
  4. We've had a couple of cases, though, in which the posted data was incomplete. In each case, a variable was missing.
  5. I'm a little surprised at how difficult it is to understand what is going on in an analysis from the paper alone. I just try to imagine how hard it would be to replicate someone's results if they did not provide the computer code. I encouraged students to focus their efforts on papers that provided data and code. In two cases, we still weren't able to replicate due to missing data. In a separate case, the students didn't have a script and were totally lost. They had a complete and well-documented data source, but we eventually had to e-mail the authors for a Stata .do file. With it, we were able to replicate the results.

In the end, these replications were very popular with the students and they managed to write a few great papers. I'm so satisfied with the outcome, I'm doing it again this semester in my advanced methods class.

The Front-End of Methods Training

Based on my own experience and interactions with other professors and students, most methods training in political science starts with a “baby stats” course, continues into a more detailed course on linear models, and finishes with a fairly rigorous course on the generalized linear model that includes a grab bag of the latest and greatest methods. In my experience, the detail and breadth of these courses increases as students go along. Relatedly, departments (with limited methods-oriented faculty) tend to devote their methodologists to the more advanced courses and, if necessary, use more substantively oriented faculty for the introductory courses. My experience in the statistics department at Florida State suggests that a slightly different approach might train students more effectively. While a course on the GLM is crucial (I think I’ve used logit in every paper I’ve ever written), a thorough course in probability seems just as important to me. So what are the key ideas that students should learn in an introductory methods course?

  1. point estimation
  2. interval estimation
  3. hypothesis testing

This could be done in the context of differences-in-means and a simple linear regression with a single explanatory variable (or even multiple regression). I’ve never used a Chi-square test in an actual application and I’ve certainly never done one by hand, so I don’t really see the point of doing several by hand as part of an applied methods class. Methods training in political science falls short of its potential because early methods classes fail to deal head-on with these key concepts, and later classes then try to build on a nonexistent foundation. To really get a handle on the three key ideas of point estimation, interval estimation, and hypothesis testing, students need to be familiar with some basic principles of probability theory.
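To make the three key ideas concrete, here is a minimal sketch in Python of a difference-in-means analysis. The groups, sample sizes, and effect size are simulated assumptions for illustration, not data from any study.

```python
import math
import random
import statistics

# Simulated outcome for a control and a treatment group
# (hypothetical values chosen only to illustrate the three ideas).
random.seed(42)
control = [random.gauss(50, 10) for _ in range(200)]
treatment = [random.gauss(53, 10) for _ in range(200)]

# 1. Point estimation: the difference in sample means.
diff = statistics.mean(treatment) - statistics.mean(control)

# 2. Interval estimation: a 95% confidence interval from the
#    normal approximation to the sampling distribution.
se = math.sqrt(statistics.variance(treatment) / len(treatment) +
               statistics.variance(control) / len(control))
ci = (diff - 1.96 * se, diff + 1.96 * se)

# 3. Hypothesis testing: a z-statistic for the null of no difference.
z = diff / se

print(f"estimate: {diff:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f}), z: {z:.2f}")
```

A whole introductory course could circle these twenty lines: what makes `diff` a good estimate, why the interval has the width it does, and what the z-statistic does and does not tell us.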

  1. probability distributions and random variables (pdfs/pmfs, cdfs, computer simulation)
  2. Bayes’ rule for discrete and continuous events
  3. mean and variance (of a random variable, not the sample mean and variance)
  4. conditional expectation
  5. central limit theorem
  6. sampling distributions
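A short simulation ties several of these items together. This sketch (the exponential distribution and sample sizes are assumptions for illustration) shows a sampling distribution and the central limit theorem at work: means of samples from a skewed distribution pile up, approximately normally, around the true mean.

```python
import random
import statistics

random.seed(1)
true_mean = 1.0  # mean of an Exponential(rate = 1) random variable

# Draw 2,000 samples of size 100 and record each sample mean.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(100))
    for _ in range(2000)
]

# The sampling distribution centers on the true mean, with standard
# error near sigma / sqrt(n) = 1 / 10 = 0.1.
center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)
print(center, spread)
```

Students who can produce and interpret a plot of `sample_means` have, in my view, most of what they need to understand confidence intervals and test statistics later.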

I’d start the class with a scatterplot of two theoretically related variables, such as the incumbent party’s presidential vote share and change in GDP. I’d ask students to think about how these two things might be related. Based on simply inspecting the scatterplot, I’d ask them three specific questions.

  1. For every percentage point increase in the GDP growth rate, how many percentage points does the incumbent party’s vote share increase? Don’t worry about being exactly correct; just come up with a “good estimate.” Call this quantity the “effect.”
  2. Choose two values that you are “fairly confident” lie above and below the actual effect.
  3. Are you “fairly confident” that the actual effect is greater than zero?

I’d then set out to tackle these questions throughout the class. They imply other questions as well, such as what makes an estimate a “good estimate” and the technical meaning of “fairly confident.” I’d note that to answer these questions, we need a statistical model, so I’d suggest \(y_i \sim N(\mu_i, \sigma^2)\), where \(\mu_i = \beta_{cons} + \beta_{x}x_i\). I could then note that this and similar models serve as powerful tools for answering these types of questions and that it’s really important to understand the details. I’d jump in with the normal distribution, expand to other distributions, and work my way down the list, always coming back to the fundamental concepts of point estimation, interval estimation, and hypothesis testing, being very careful with details and not shying away from the mathematical background. I don’t know what sort of textbook would be appropriate for this style of class. My favorite is Casella and Berger, but that’s much too advanced for an introductory class for political science graduate students. I haven’t spent a lot of time with it, but DeGroot and Schervish seems promising. These are just some initial ideas, so let me know what you think, especially if you disagree.
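As a sketch of how the class could answer the three opening questions, here is a simulated version of the GDP-growth/vote-share example. The data, the true slope of 2, and the noise level are assumptions for illustration, not real election figures; the fit is least squares for the model \(y_i \sim N(\mu_i, \sigma^2)\) with \(\mu_i = \beta_{cons} + \beta_{x}x_i\).

```python
import math
import random

# Simulated stand-ins for GDP growth (x) and incumbent-party
# vote share (y), both in percentage points.
random.seed(7)
n = 30
x = [random.uniform(-2, 6) for _ in range(n)]
y = [46 + 2 * xi + random.gauss(0, 3) for xi in x]

# Least-squares estimates of the intercept and slope.
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b_x = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b_cons = y_bar - b_x * x_bar

# Residual standard error and the slope's standard error.
resid = [yi - (b_cons + b_x * xi) for xi, yi in zip(x, y)]
sigma_hat = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))
se_bx = sigma_hat / math.sqrt(sxx)

# The three classroom questions: a point estimate of the "effect",
# an interval around it, and whether we're confident it exceeds zero.
ci = (b_x - 2 * se_bx, b_x + 2 * se_bx)  # rough 95% interval
print(b_x, ci, ci[0] > 0)
```

Every later topic in the course (other distributions, the CLT, sampling distributions) can be motivated as justifying one of these lines.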


One of my favorite pontificators in political science is Fernando Martel Garcia. I got to know him at a replication panel at ISA, where he quite vigorously opposed the APSR's policy of auto-rejecting replication papers. Fernando recently posted this gem to the PolMeth mailing list.

In the real world computers do not work alone but at the behest of the researcher operating them.  And the problem is that the latter are often trying to solve a different minimization problem. Namely, choosing regressors, samples, time periods, functional forms, measures, proxies, etc. that minimize the p-value of interest. Thus, in the context of research practice, or how scientists go about doing science, it might be more appropriate to say that most OLS estimates are JUNK rather than BLUE.  And so educators ought to do a much better job of teaching research practice and good research design, over and above OLS.

p-values get a lot of hate from many in the methodology community, but I actually like them. In fact, I'm growing more and more frequentist in my thinking. However, if researchers use p-values as their optimization criterion, then we are in rough shape. But what can we expect, since it seems that journals use p-values as a rejection criterion?
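Martel Garcia's point is easy to demonstrate by simulation. In this sketch (the 20 candidate regressors, the sample size, and the normal-approximation p-values are assumptions for illustration), the null hypothesis is true in every trial, yet reporting the smallest p-value across specifications rejects far more often than the nominal 5%.

```python
import math
import random
import statistics

random.seed(3)

def p_value_slope(x, y):
    """Two-sided normal-approximation p-value for a bivariate slope."""
    n = len(x)
    xb, yb = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    b = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sxx
    resid = [yi - yb - b * (xi - xb) for xi, yi in zip(x, y)]
    se = math.sqrt(sum(e ** 2 for e in resid) / (n - 2)) / math.sqrt(sxx)
    z = abs(b / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

n, trials, k = 50, 300, 20
honest, hacked = 0, 0
for _ in range(trials):
    y = [random.gauss(0, 1) for _ in range(n)]  # null is true: no effects
    ps = [p_value_slope([random.gauss(0, 1) for _ in range(n)], y)
          for _ in range(k)]
    honest += ps[0] < 0.05   # report the one pre-chosen regressor
    hacked += min(ps) < 0.05  # report the "best" of k candidate regressors

print(honest / trials, hacked / trials)
```

The honest rate sits near 5%, while the specification-search rate climbs toward \(1 - 0.95^{20} \approx 0.64\). The p-value is fine; the search is the problem.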

Another Benefit of Publicly Version-Controlled Research

I've been thinking quite a bit lately about why and how political scientists should publicly version control their research projects. By research projects, I mean data, manuscript, and code. And by publicly version control, I mean using Git for version control and posting a public GitHub repository, from the beginning of the project, so that other researchers are free to follow and borrow as needed. Below I quickly summarize some of the benefits of version control and discuss another benefit that Git and GitHub have had on my own research.

My thinking about public version control for research projects began with Zach Jones' discussion of the idea in a recent issue of The Political Methodologist. Teaching the linear models class here at UB last semester solidified its importance in my mind. We used Dropbox for version control and sharing, but Git and GitHub are better.

Several recent articles and posts outline why researchers (as opposed to programmers) might use Git and GitHub. Here's a brief summary:

  1. A paper by Karthik Ram.
  2. A short essay by Zach Jones.
  3. A short essay by Christopher Gandrud.

If I've missed something, please let me know.

There are a lot of good reasons to do this:

  • History. We all use version control. Most of us do it poorly. Using Git/GitHub, I'm learning to do it better. Git/GitHub offers a formal way to easily maintain a complete history of a project. In general, it's good to avoid filenames like new_final_carlisle_v3c_updated.docx. A recent comic makes this point clear. We need a method of updating files while keeping track of the old versions so that we can go back if needed. But the approach of giving different filenames to different versions is inefficient at best, and my approach of keeping "old files" in a designated folder is no better. Git/GitHub solves these issues. Git also allows you to tag a project at certain stages, such as "Initial submission to AJPS." After getting an invitation to revise and resubmit and making the required changes, I can compare the current version of the project to the (now several months old) version I initially submitted. This makes writing response memos much easier.
  • Transparency. Zach Jones most clearly makes the point that Git/GitHub increases transparency in the context of political science research. Git/GitHub essentially allows others to actively monitor the progress of a project or study its past development. Related to using GitHub in an open manner is the idea of an "open notebook." Carl Boettiger is one of the most well-known proponents of open notebooks. This kind of openness provides a wonderful opportunity to receive comments and suggestions from a wide audience. It allows others to catch errors that might otherwise go unnoticed. It also gives readers a formal avenue to make suggestions, not to mention keeping a complete history of the suggestions and any subsequent discussion. GitHub allows the public to open issues, which is a wonderful way to receive and organize feedback on a paper.
  • Accessibility. Christopher Gandrud makes this point clearly in a recent edition of The Political Methodologist, though he discusses accessibility purely in the context of building data sets. But similar arguments could be made for code. I recently had a graduate student express interest in some MRP estimates of state-level opinion on the Affordable Care Act. I told her that I had spent some time collecting surveys and writing code to produce the estimates. I noted that, ideally, she would not duplicate my work but, if possible, build on it. I was able to point her to the GitHub repository for the project, which hopefully she'll find useful as a starting point for her own work. Based on my experience supervising replication projects in graduate methods classes and my own experience with replication data, the clean, final versions of the data that researchers typically post publicly do not allow future researchers to easily build on previous work. If authors posted the raw data and all the (possibly long and messy) code to do the cleaning and recoding, it would be much easier for future researchers to build on past contributions. Indeed, research shows that making data and code freely available lowers the barriers to reuse and increases citations.
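The tagging idea under "History" above can be sketched with a few Git commands. The repository, file names, and commit messages here are hypothetical.

```shell
cd "$(mktemp -d)"
git init -q paper && cd paper
git config user.name "Author"             # local identity (illustrative)
git config user.email "author@example.com"

echo "draft v1" > manuscript.md
git add manuscript.md
git commit -q -m "Draft submitted to AJPS"
git tag ajps-submission                   # snapshot the submitted version

echo "revised after R&R" > manuscript.md
git commit -q -am "Address reviewer comments"

git diff ajps-submission -- manuscript.md # changes since submission
```

Months later, `git diff ajps-submission` shows exactly what changed since the submitted version, which is the raw material for a response memo.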

These are the commonly cited reasons for using Git and GitHub, but in my own practice I've found another, perhaps more important than those above.

One thing that I first noticed in my students, but now see that I'm just as guilty of, is "the race to a regression." That is, I devote the absolute minimum required effort (or less) to everything leading up to the regression. My attitude is usually that I'll go back later and clean up everything, double-checking along the way, if the line of investigation "proves useful" (i.e., provides stars). I rarely go back later. I find that the script let_me_just_try_this_really_quickly.R quickly becomes a part of analysis.R. This is bad practice and careless.

Instead of a race to the regression, Git encourages me to develop projects a little more carefully, thinking about projects in tiny steps, each to be made public, and each done right and summarized in a nice commit message. The care in my research has noticeably improved. I think about how to do something better, do it better, and explain it in a commit message that I can refer to later.

In my view, project development in Git/GitHub works best when users make small, discrete changes to a project. This takes some thought and discipline, but it is the best way to go. I'm guilty of coming back from a conference and making dozens of small changes to an existing project, incorporating all the suggestions in a single update. I just did it after the State Politics and Policy Conference. It's a poor way to develop a project and a poor way to keep track of changes, but I'm learning.
