Workshops on a New Vision for Federal Statistics

When the fourth paragraph of the newly ratified U.S. Constitution established the decennial census in 1788, data were scarce and laborious to collect.* This data scarcity continued for a long time: Many of the surveys used today to measure unemployment, crime, public health, seat belt use, housing availability, retail activity, price increases, and more were launched at a time when taking a survey was the fastest and most economical (sometimes the only) way to obtain information about a large population.**

Today, we are surrounded by data. The World Economic Forum estimates that by 2025, the global production of new data will be 463 exabytes each day (one exabyte is one billion gigabytes). Can incorporating non-survey data sources (administrative records used for programs such as Medicare and the Supplemental Nutrition Assistance Program, credit card transaction data, electronic health records, sensor and satellite data, or data gathered from the internet) improve the speed, detail, accuracy, or cost-efficiency of federal statistics?

These issues will be explored in a webcast workshop on “The Implications of Using Multiple Data Sources for Major Survey Programs,” to be held on Monday, May 16, from 11:00 AM to 5:00 PM and Wednesday, May 18, from 11:00 AM to 3:00 PM (Eastern Daylight Time).

The webcast is free and open to the public, but you need to register in advance. You can see the agenda for the workshop and register at https://www.nationalacademies.org/event/05-16-2022/the-implications-of-using-multiple-data-sources-for-major-survey-programs-workshop.

Copyright (c) 2022 Sharon L. Lohr

Footnotes

*Some of the nation’s Founders envisioned the value that might be provided by more detailed data beyond a mere population count for apportionment purposes. On February 14, 1790, James Madison wrote to Thomas Jefferson:

A Bill for taking a census has passed the House of Representatives, and is with the Senate. It contained a schedule for ascertaining the component classes of the Society, a kind of information extremely requisite to the Legislator, and much wanted for the science of Political Economy. A repetition of it every ten years would hereafter afford a most curious and instructive assemblage of facts. It was thrown out by the Senate as a waste of trouble and supplying materials for idle people to make a book.

**For example, the Current Population Survey was launched in 1940 to provide an accurate measure of unemployment without having to ask everyone in the population. Household members in a sample of about 60,000 households are asked questions about their labor force participation. Because the sample is chosen using random selection methods (that is, the Current Population Survey is a probability sample), the Bureau of Labor Statistics can use it to estimate the unemployment rate for each state as well as for the U.S. as a whole, and can do so faster and more accurately than if it attempted to ask everyone.
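To illustrate why a probability sample of this size can stand in for a full count, here is a minimal Python sketch. It rests on several assumptions that are not part of the actual survey: the population of one million people and its 4% unemployment rate are made up, the sample is a simple random sample of individuals (the real Current Population Survey uses a more complex design and samples households), and the function name `estimate_rate` is hypothetical.

```python
import math
import random


def estimate_rate(population, sample_size, rng):
    """Estimate a proportion from a simple random sample.

    Returns the estimated proportion and an approximate 95% margin of error.
    """
    # Draw the sample without replacement; every person is equally likely to be chosen.
    sample = rng.sample(population, sample_size)
    p_hat = sum(sample) / sample_size

    # Standard error of a proportion, with a finite population correction.
    fpc = 1 - sample_size / len(population)
    se = math.sqrt(fpc * p_hat * (1 - p_hat) / (sample_size - 1))
    return p_hat, 1.96 * se


# Fixed seed so the illustration is reproducible.
rng = random.Random(2022)

# Hypothetical population (an assumption): 1 = unemployed, 0 = not unemployed.
population = [1] * 40_000 + [0] * 960_000
true_rate = sum(population) / len(population)

estimate, margin = estimate_rate(population, 60_000, rng)
print(f"True rate:      {true_rate:.4f}")
print(f"Sample estimate: {estimate:.4f} (plus or minus {margin:.4f})")
```

In this toy setting, only 6% of the population is asked, yet the margin of error is a fraction of a percentage point. The margin of error itself is computed from the sample alone, which is what random selection makes possible.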

Sharon Lohr