“One Weird Trick” to Recover Suppressed Counts from CDC’s WONDER

The Centers for Disease Control and Prevention’s Wide-ranging Online Data for Epidemiological Research (WONDER) data retrieval system provides access to many types of public health information. Mortality and fertility counts for multiple years across standard geographical subdivisions, broken down by race, gender, 5-year age groups, etc., are just some of the data available through this system. Though a vast array of data are available through WONDER, counts that fall between 0 and 9 are replaced with the word “Suppressed,” which generates missing values and hinders research. This post describes a new method to recover some of these suppressed counts.

Method

It’s really quite simple to recover many counts suppressed by CDC’s WONDER. It just takes, as the spam advertisements claim, this “one weird trick.” And, as expected, the trick involves algebra any 5th grader could do:

what you want = a lot of what they have – (a lot of what they have – what you want)
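
As a minimal sketch of that arithmetic in Python, using invented numbers rather than anything returned by WONDER: suppose a pooled count is 25 and the same pool without the piece you want is 19. Both values are 10 or more, so neither would be suppressed, yet their difference is below 10. The function name `recover` is hypothetical.

```python
def recover(a_lot, a_lot_minus_want):
    """Return "what you want" as the difference of two pooled counts."""
    return a_lot - a_lot_minus_want

# Invented numbers for illustration only (not WONDER data).
a_lot = 25              # "a lot of what they have"
a_lot_minus_want = 19   # "(a lot of what they have - what you want)"
print(recover(a_lot, a_lot_minus_want))
```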

Infant Mortality Example

As an example, let’s query infant mortality for all US counties in 2013 using the default method. Go to CDC’s WONDER home page and click the link Multiple cause of death (Detailed Mortality). Click the Data Request link in the Current Multiple Cause of Death Data section, then scroll down and click the “I Agree” button to accept the terms and conditions for accessing these data. Most of the default settings are fine for this request, but let’s also select the following options:

  • From the Organize table layout section, select “County” from the And By menu
  • From the Select demographics section, click the radio button next to Single-Year Ages
  • From the Select demographics section, select “< 1 year” from the Pick between list
  • From the Select year and month section, select “+ 2013” from the Year/Month list
  • From the Other options section, check the box next to Show Zero Values
  • From the Other options section, check the box next to Show Suppressed Values
  • From the Other options section, select “4” from the Precision menu

Click the Send button and CDC’s WONDER will return a row for every US county in 2013, but not a usable value for every county. Most of the counts returned, as you’ll notice, are suppressed. Taking counties as our unit of analysis, the response rate (the share of counties with an unsuppressed value) is approximately 14.86%. Figure 1 shows a map of these data.
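
The response rate used throughout this post is simply the share of counties whose value is not the word “Suppressed.” A minimal Python sketch of that calculation follows. It assumes the Deaths column has been read into a list of strings; the function name `response_rate` is hypothetical and the values are invented.

```python
SUPPRESSED = "Suppressed"

def response_rate(values):
    """Percentage of entries that carry a usable (unsuppressed) value."""
    usable = sum(1 for v in values if v != SUPPRESSED)
    return 100.0 * usable / len(values)

# Invented Deaths column for five made-up counties (not WONDER data).
deaths = [SUPPRESSED, "12", SUPPRESSED, SUPPRESSED, SUPPRESSED]
print(round(response_rate(deaths), 2))
```

For scale, a response rate of 14.86% among 3,142 counties works out to roughly 467 counties with a published value.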

Figure 1: Infant Mortality Rates among US Counties in 2013 (n=3142)

In our effort to get more counts, we repeat the same steps taken to construct Figure 1, but this time we select all available years: “+ 1999”, “+ 2000”, …, “+ 2013”. This action gives us the “a lot of what they have” part of the equation. Figure 2 shows infant mortality rates for all US counties over the years 1999 to 2013. The response rate among these counties is about 81.51%.

Figure 2: Infant Mortality Rates among US Counties years 1999 to 2013 (n=3142)

Recall that the “one weird trick” involves subtracting the “(a lot of what they have – what you want)” part from the “a lot of what they have” part, which we got in the construction of Figure 2. To get the “(a lot of what they have – what you want)” piece of the equation, we accept the default settings in WONDER with the following exceptions:

  • From the Organize table layout section, select “County” from the And By menu
  • From the Select demographics section, click the radio button next to Single-Year Ages
  • From the Select demographics section, select “< 1 year” from the Pick between list
  • From the Select year and month section, select “+ 1999”, “+ 2000”, …, “+ 2012” from the Year/Month list
  • From the Other options section, check the box next to Show Zero Values
  • From the Other options section, check the box next to Show Suppressed Values
  • From the Other options section, select “4” from the Precision menu

Figure 3 shows infant mortality rates for all US counties over the years 1999 to 2012. The response rate among these counties is about 80.43%.
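
One property is worth checking before any subtraction: pooled counts can only grow as years are added, so a county with a published 1999 to 2012 count should also have a published 1999 to 2013 count that is at least as large. That is consistent with the slightly lower response rate here (80.43% versus 81.51%). The Python sketch below illustrates the check. The dictionaries, the county codes, and the name `check_nesting` are hypothetical, and the numbers are invented.

```python
SUPPRESSED = "Suppressed"

def check_nesting(longer_pool, shorter_pool):
    """Return county codes that violate the expected nesting of pooled counts.

    A violation is a county published in the shorter pool that is either
    suppressed or missing in the longer pool, or has a smaller count there.
    """
    problems = []
    for code, short_value in shorter_pool.items():
        if short_value == SUPPRESSED:
            continue
        long_value = longer_pool.get(code, SUPPRESSED)
        if long_value == SUPPRESSED or int(long_value) < int(short_value):
            problems.append(code)
    return problems

# Invented values keyed by made-up county codes (not WONDER data).
pool_1999_2013 = {"00001": "25", "00002": "11", "00003": SUPPRESSED}
pool_1999_2012 = {"00001": "19", "00002": SUPPRESSED, "00003": SUPPRESSED}
print(check_nesting(pool_1999_2013, pool_1999_2012))
```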

Figure 3: Infant Mortality Rates among US Counties years 1999 to 2012 (n=3142)

To recover suppressed infant mortality counts among US counties in 2013, all we need to do is subtract the counts used to construct Figure 3 from those used to construct Figure 2. The subtraction works only for counties whose counts are unsuppressed in both queries, so the 1999 to 2012 response rate sets the ceiling. This method can greatly improve upon the naive approach and yield a higher response rate (80.43% as compared to 14.86%). Due to data use restrictions, no actual differences were taken in the development and presentation of this method. The 2013 response rate discussed here is assumed to match the response rate of the counties across the years 1999 to 2012; the actual response rate has not been calculated. Because of these restrictions, we do not plot or present a map of recovered infant mortality counts among US counties in 2013.
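
The subtraction itself can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used for this post, where no differences were taken. It assumes each query result has been saved as tab-delimited text with columns named `County Code` and `Deaths`; those column names, the function names `read_counts` and `recover_counts`, and the county codes are assumptions, and the inline data are invented.

```python
import csv
import io

SUPPRESSED = "Suppressed"

def read_counts(text, key="County Code", value="Deaths"):
    """Map each county code to its count, or None when the count is suppressed."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {r[key]: (None if r[value] == SUPPRESSED else int(r[value])) for r in rows}

def recover_counts(with_target, without_target):
    """Subtract pooled counts county by county.

    Counties lacking a published value in either pool are skipped.
    """
    recovered = {}
    for code, total in with_target.items():
        partial = without_target.get(code)
        if total is not None and partial is not None:
            recovered[code] = total - partial
    return recovered

# Invented exports with made-up county codes (not WONDER data).
export_1999_2013 = "County Code\tDeaths\n00001\t25\n00002\t11\n00003\tSuppressed\n"
export_1999_2012 = "County Code\tDeaths\n00001\t19\n00002\tSuppressed\n00003\tSuppressed\n"

recovered = recover_counts(read_counts(export_1999_2013), read_counts(export_1999_2012))
print(recovered)
```

Only the first made-up county has a published value in both pools, so it is the only one for which a difference is returned. Any difference of 9 or fewer remains subject to the data use restrictions described in the next section.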

Confidentiality and Data Use Restrictions

The CDC, in case you’re wondering, suppresses all counts between 0 and 9 to ensure confidentiality and protect personal privacy (for more on this, see Assurance of Confidentiality). Recall from the terms and conditions that it is against the law to use these data in certain ways. Things not done in this post include:

  • “present or publish death counts of 9 or fewer or death rates based on counts nine or fewer (in figures, graphs, maps, tables, etc.)”
  • “attempt to learn the identity of any person or establishment included in these data”
  • disclose or make “other use of the identity of any person or establishment discovered inadvertently”

The method described and data used in this post are provided to support “health statistical reporting and analysis only.”
