For years, the estimates of nonfatal gunshot injuries published by the Centers for Disease Control and Prevention have grown increasingly unreliable — in 2017, they were more suspect than ever. But researchers have continued to cite the numbers as authoritative. Last year, a CDC spokesperson defended the data, saying the agency’s experts were “confident that the sampling and estimation methods are appropriate.”

Now, the CDC is taking measures to curtail the spread of its most unreliable estimates. The 2016 and 2017 gun injury figures have been hidden on the agency’s public data portal, with a footnote stating, “Injury estimate is not shown because it is unstable.” The CDC will hide unstable estimates for all injury types within the next six months, according to a spokesperson. In addition, the option to display statistical information about how reliable the estimates are, which was disabled by default until recently, is now enabled by default.

The changes follow reporting by The Trace and FiveThirtyEight that highlighted the unreliable estimates.

The CDC’s gun injury estimate was vulnerable to unreliability in part because of how few hospitals are surveyed in the data set that feeds it. When one hospital is replaced by another in the database the CDC uses, the change can cause the injury estimate to swing drastically. The CDC now says it is exploring the feasibility of collecting data from more hospitals, which would improve the estimate’s reliability.

An analysis by The Trace and FiveThirtyEight shows just how sensitive the current model is to changes in the sample. There’s no national database dedicated to tracking shooting incidents, so the CDC uses a more general injury database managed by the Consumer Product Safety Commission. The number of gun injuries treated in each hospital in the database is fed into a statistical model that extrapolates a national estimate. The smaller the number of hospitals in the pool, the larger the effect each one has on the estimate. Over time, hospitals leave the sample for a variety of reasons and are replaced.
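The extrapolation described above can be sketched in a few lines of Python. This is a minimal illustration, not the CDC's actual model: it assumes the national estimate is a simple weighted sum of each hospital's case count, and the `HospitalRecord` class, the unit labels, the counts and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HospitalRecord:
    unit_id: str   # hypothetical sampling-unit label
    cases: int     # gunshot injuries treated in one year (made-up value)
    weight: float  # assumed number of hospitals this one stands in for


def national_estimate(records):
    """Weighted sum: each hospital's count is scaled by its weight."""
    return sum(r.cases * r.weight for r in records)


def shares(records):
    """Fraction of the national estimate attributable to each hospital."""
    total = national_estimate(records)
    return {r.unit_id: r.cases * r.weight / total for r in records}


# Illustrative numbers only; they do not come from CDC or CPSC data.
sample = [
    HospitalRecord("A", 10, 50.0),
    HospitalRecord("B", 40, 50.0),
    HospitalRecord("C", 150, 50.0),
]

estimate = national_estimate(sample)
hospital_shares = shares(sample)
print(estimate, hospital_shares)
```

With only three hospitals in this toy sample, the busiest one drives three quarters of the total, which is the kind of sensitivity the analysis describes.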

The trouble is, the departing hospital and its replacement may treat very different numbers of injuries. From at least 2000 to 2010, a hospital labeled Primary Sampling Unit 41 submitted data to the CPSC’s panel. Raw numbers published by the CDC and CPSC show that this hospital treated a very small number of gunshot injuries: fewer than 10 each year from 2005 to 2010, and just 20 total over that six-year span. After this hospital dropped out of the database in 2010, it was replaced halfway through 2012 by a different one that treated a dramatically larger number of gun wounds: 793 during its first full year in the data set.

Using methods developed in a 2017 paper that demonstrated the effect of hospital replacements in a larger but similar database, we analyzed data from the CDC and CPSC to measure the impact of this one substitution. The new hospital added over 22,000 nonfatal gun injuries to the 2015 national estimate, more than 100 times the most ever contributed by its predecessor. This hospital, one of the 60 or so used in the sample, accounts for over one quarter of the total estimated gunshot injuries for 2015, the most recent year for which data is available.
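The arithmetic behind a substitution like this can be sketched as follows. This is a rough illustration under stated assumptions, not the method from the 2017 paper or the CDC's procedure: it assumes the replacement hospital inherits the same weight as its predecessor, and every number below is hypothetical rather than taken from the CDC or CPSC files.

```python
def substitution_effect(old_cases, new_cases, weight):
    """Change in a weighted-sum estimate when one hospital replaces
    another and (by assumption) inherits the same weight."""
    return (new_cases - old_cases) * weight


# Hypothetical inputs, chosen only to show the mechanics.
weight = 30.0                 # assumed weight for this slot in the sample
old_cases = 5                 # departing hospital's yearly count
new_cases = 800               # replacement hospital's yearly count
rest_of_sample = 66_000.0     # weighted contribution of all other hospitals

before = rest_of_sample + old_cases * weight
after = rest_of_sample + new_cases * weight
change = substitution_effect(old_cases, new_cases, weight)
new_share = (new_cases * weight) / after

print(before, after, change, round(new_share, 3))
```

In this toy case, nothing about gun violence changes between the two totals. The entire jump comes from which hospital happens to occupy one slot in the sample.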

When making a substitution in the database, the CPSC attempts to match the replacement hospital to the original based on some characteristics, like its geographic location and size. But according to Guohua Li, editor-in-chief of the medical journal Injury Epidemiology and founding director of Columbia University’s Center for Injury Science and Prevention, the CDC’s methodology doesn’t take into account factors like the volume of gun injuries treated, which leaves the estimate vulnerable to dramatic jumps like this one. He says the quickest way to address the problem would be to adjust the methodology to account for the larger volume of gun injuries.

The CDC acknowledges that hospitals that have recently been added to the system have been adding more gunshot injuries to the national estimate than the hospitals they replaced. “The influence of a gradually changing roster of participating hospitals does not translate to poor data quality,” the spokesperson from the CDC said in an email, “but rather reflects the varying characteristics of these hospitals.”

Other gun injury estimates are less susceptible to the distortions that hospital selection can introduce. The Healthcare Cost and Utilization Project, another database under the Department of Health and Human Services, draws on data from more than 950 hospitals, far more than the CDC’s source, to create its own gun injury estimate. Among multiple sources of national gun injury data that The Trace and FiveThirtyEight reviewed last year, the CDC’s was the only data set that consistently showed an annual increase in gunshot wounds, an indicator that its estimates are out of step with other reliable data sources.

In May, the CDC’s leader acknowledged that the numbers needed to be fixed. Responding to an inquiry by 11 senators, CDC Director Robert Redfield wrote in a letter that the agency intends to “improve the precision and accuracy of [its] non-fatal firearm injury estimates.”

One solution would be to add more hospitals to the sample. “By expanding the roster of participating hospitals,” Redfield wrote, “the influence of any one hospital should be reduced and more stable estimates should be attainable.”
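Redfield's point about sample size can be checked with a back-of-the-envelope calculation. The sketch below assumes every hospital carries the same weight (so the weights cancel) and that all but one hospital treat the same number of cases. Both are simplifications, and the case counts are hypothetical. The sample sizes of 60 and 950 echo the figures cited in this article.

```python
def single_hospital_share(n_hospitals, typical_cases, outlier_cases):
    """Share of the estimate owed to one high-volume hospital when the
    other n - 1 hospitals each report `typical_cases` and all weights
    are equal (an assumption made for simplicity)."""
    others = (n_hospitals - 1) * typical_cases
    return outlier_cases / (outlier_cases + others)


# Hypothetical case counts; only the sample sizes come from the article.
share_small = single_hospital_share(60, 20, 800)
share_large = single_hospital_share(950, 20, 800)

print(round(share_small, 3), round(share_large, 3))
```

Under these assumptions, the same high-volume hospital accounts for roughly 40 percent of the estimate in a 60-hospital sample but about 4 percent in a 950-hospital sample.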

The CDC and CPSC are currently evaluating the system that generates the national injury estimate, including whether it needs to be expanded and how much an expansion would cost.

Senator Bob Menendez, the New Jersey lawmaker who led the group that wrote the letter this spring, is continuing to keep pressure on the agency. A new letter signed by him and four other Democratic senators says that “the CDC’s explanation falls short” and pushes Department of Health and Human Services Secretary Alex Azar for clarifications about several of the points in Redfield’s response.

Li said he is happy that the CDC is willing to make a change. “But I wish they had acknowledged the problems identified in a more straightforward way.”

Footnotes

  1. While we were able to figure out the names of some hospitals that were identified only by number in CPSC data, we are not permitted to share the names of the hospitals due to the terms of use for the anonymized data. The CDC, via the University of Michigan’s Inter-university Consortium for Political and Social Research, which manages the data set, denied our request for an exemption from this rule.
  2. The CDC supplies weights that adjust the number of gun injuries at each hospital, taking into account how big the hospital is, whether it’s a children’s hospital, and how many patients it treats.