High-Income Families and Student Debt: Why Survey Weights Matter

Statistics that are unexpected or surprising often get a lot of media coverage. “Dog bites man” isn’t news, but a surprising statistic — one that contradicts a widely held opinion or presents a new twist about a public policy issue — is.

In January 2019, the Urban Institute reported that 49 percent of all outstanding education debt in 2016 was held by families in the top quartile of the income distribution — a somewhat surprising result, since if education debt were evenly distributed across all income groups, one would expect the 25 percent of families with the highest incomes to have about 25 percent of the debt. Not surprisingly, the statistic was repeated widely. Catherine Rampell cited it in an April 25 column in The Washington Post. Andy Thomason wrote in the Chronicle of Higher Education that it argues in favor of a “more nuanced approach” to student debt. David Leonhardt’s column in The New York Times originally stated that because “the highest-earning quarter of the population holds about half of all student debt … universal student debt cancellation would be a giant welfare program for the bourgeoisie.”

The Urban Institute’s statistic may have been surprising, but it was also wrong. Jordan Weissmann of Slate magazine investigated a discrepancy between the Urban Institute’s statistic and analyses done by Adam Looney of the Brookings Institution, and that led to the discovery of the error. The Urban Institute quickly corrected the statistic on its web page (Rampell and Leonhardt updated their online columns as well); the page now says that 34 percent of outstanding education debt is held by families in the top income quartile. That is somewhat more than the 25 percent that would be expected if student debt were evenly distributed across all income quartiles, but much lower than the original statistic of 49 percent.

Unfortunately, when a widely cited statistic is found to be in error and then corrected, some readers may think that no statistics can be trusted, or that statisticians can wave a magic wand and get any number they like out of a set of data. They can’t — at least not if they’re analyzing the data correctly.

Properties of statistics calculated from survey samples come from mathematical principles. When data analysts violate those principles, they get numbers that are wrong. It’s exactly the same with any kind of math. If you insist that “2 plus 2 equals 5” when doing arithmetic, your sums will be wrong. Let’s look at the statistical principles involved here.

Survey Weights in the Survey of Consumer Finances

The statistic about outstanding education loans was calculated from the 2016 Survey of Consumer Finances (SCF), a survey conducted every three years by the Federal Reserve Board to provide a snapshot of the income, assets, and liabilities of American families. The survey design and methodology are described in the Appendix of the main survey report.

To support statistics about asset classes that are held by relatively few families (such as tax-exempt bonds and certain types of businesses), the SCF oversamples wealthy families. That is, the sample contains more wealthy families than would be expected if the SCF simply drew families at random from the set of all families in the US.

But doesn’t having so many wealthy families make the survey unrepresentative? Not if the survey is analyzed correctly by using the survey weights. You can read about how, and why, survey weights work in Chapters 5 and 6 of my book Measuring Crime: Behind the Statistics.

Briefly, the survey weight of a family in the sample tells how many families in the population are represented by it. A family with survey weight 100 represents 100 families in the US population; a family with weight 30,000 represents 30,000 families in the US population. The SCF contains a higher percentage of families that are wealthy than does the population; consequently, a typical wealthy family in the sample has a lower weight so that it represents fewer families in the population than a nonwealthy family. Then, when you estimate the number of families with income over $100,000 by summing the weights of the families in the sample with income over $100,000, the estimate should be right on target.
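To make the idea concrete, here is a minimal sketch in Python. The five families, their incomes, and their weights are all invented for illustration (they are not SCF records), and the helper name estimated_count is hypothetical. The sketch estimates a population count by summing weights, then compares the weighted share of high-income families with the raw share in the sample.

```python
# Toy sample: every number here is invented for illustration, not SCF data.
sample = [
    {"income": 250_000, "weight": 100},     # oversampled wealthy family
    {"income": 180_000, "weight": 400},     # oversampled wealthy family
    {"income": 120_000, "weight": 12_000},
    {"income": 60_000, "weight": 30_000},
    {"income": 35_000, "weight": 25_000},
]


def estimated_count(families, predicate):
    """Estimate how many population families satisfy `predicate`
    by summing the survey weights of the sampled families that do."""
    return sum(f["weight"] for f in families if predicate(f))


def is_high_income(family):
    return family["income"] > 100_000


# Summing all weights estimates the size of the population.
population_total = estimated_count(sample, lambda f: True)
high_income_total = estimated_count(sample, is_high_income)

# Weighted share describes the population; unweighted share describes
# only the families who happen to be in the sample.
weighted_share = high_income_total / population_total
unweighted_share = sum(is_high_income(f) for f in sample) / len(sample)

print(f"weighted share:   {weighted_share:.1%}")
print(f"unweighted share: {unweighted_share:.1%}")
```

In this toy sample, three of the five families have high incomes, but they carry so little weight that the weighted share is far smaller than the unweighted one. That is the same direction as the gap between weighted and unweighted estimates in the SCF.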

The weights make a big difference for estimates from the SCF, and particularly for estimates of quantities that are related to wealth or income. For example, with the weights it is estimated that about 22 percent of US families have income over $100,000, even though about 36 percent of the 6,248 families in the sample have income over $100,000 (because wealthy families are oversampled). The rest of this section discusses some details about the SCF weights; you can skip to “Interpreting the Statistics” if you’d rather not be immersed in them.

Figure 1. Survey weights vs. family income (both axes in log scale)

The scatterplot in Figure 1 of survey weights vs. income from the 2016 SCF (restricted to the families with income at least $1,000) shows that the weights of high-income families in the SCF tend to be lower than those of other families. I used logarithmic scales for both axes so you can see the weights for the wealthy families on the graph. The tick marks on the horizontal axis represent successive powers of ten, denoting incomes of 1,000, 10,000, 100,000, 1 million, 10 million, and 100 million dollars. Each circle in the graph represents a family in the sample. Most of the high-income families in the sample (in the lower right) have weight less than 500; each represents fewer than 500 families in the population. By contrast, most of the other families in the sample (in the upper left; they appear as a dark cloud because there are so many families in that area of the plot) each represent between 20,000 and 40,000 families in the population.

The three graphs in Figure 2 show the percentage of total outstanding education loan debt held by families in each quartile of the income distribution: when the weights are used correctly (left), when no weights are used (center), and when the incorrect weights from the Urban Institute’s original analysis are used (right). So that my analysis would be comparable with that from the Urban Institute, I restricted the data to families in which the family head is age 25 or older, so all mentions of population for these graphs refer to the set of US families where the head is at least 25 years old.*

Figure 2. Percentage of outstanding education loans held by families in each income quartile, among families in which the head is age 25 or older. The graph on the left gives the correct statistics; the statistics in the other two graphs are incorrect. Bars do not sum to 100 percent because of rounding.

The graph on the left of Figure 2 uses the correct weights (and accords with the statistics in the Urban Institute’s now-corrected post). This graph gives statistics that can be said to describe the population of all families in the US where the family head is age 25 or older. We estimate from the SCF that 34 percent of the outstanding student debt for the population is held by the approximately 25 percent of families with income above $97,000. The margin of error for the percentage of student debt held by the top income quartile of families is approximately 4 percentage points.**
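An income cutoff such as the $97,000 figure is a weighted quantile: the income below which families carrying about 75 percent of the total weight fall. The Python sketch below shows one simple way such a cutoff could be computed. The helper weighted_quantile is hypothetical, the incomes and weights are invented, and statistical packages use various interpolation conventions, so treat this as an illustration of the idea rather than the computation behind the published numbers.

```python
def weighted_quantile(values, weights, q):
    """Return the smallest value at which the cumulative weight
    reaches a fraction `q` of the total weight (no interpolation)."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Invented toy data: high-income families are oversampled, so they
# carry small weights.
incomes = [20_000, 50_000, 80_000, 150_000, 400_000, 900_000]
weights = [30_000, 30_000, 30_000, 25_000, 4_000, 1_000]

# Cutoff for the top quartile using the survey weights.
weighted_cutoff = weighted_quantile(incomes, weights, 0.75)

# Cutoff if every sampled family is (incorrectly) counted equally.
unweighted_cutoff = weighted_quantile(incomes, [1] * len(incomes), 0.75)

print(weighted_cutoff, unweighted_cutoff)
```

With these toy numbers the weighted cutoff is much lower than the unweighted one, because the unweighted calculation treats the oversampled high-income families as if they were as common in the population as they are in the sample.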

The middle graph of Figure 2 shows what happened when I (deliberately) made the common mistake of estimating the percentages without using the weights. The distribution is actually pretty similar to that in the graph of the correct statistics on the left. But the percentages in the middle graph have a different interpretation — they apply only to the families who participated in the survey. The statistic in the last bar, for example, says that 38 percent of the total student debt for the set of 6,248 families in the sample belongs to families with income above $97,000. But the statistic should not be used to estimate the corresponding percentage for the set of all families in the US.***
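The difference between the two calculations can be sketched in a few lines of Python. The function debt_share_by_group is a hypothetical helper, and the five families below are invented, not SCF records. With weights, each family’s debt is multiplied by the number of population families it represents; without weights, each sampled family counts once.

```python
from collections import defaultdict


def debt_share_by_group(families, use_weights=True):
    """Share of total debt held by each income quartile.

    With use_weights=True, each family's debt is multiplied by its
    survey weight; with use_weights=False, every family counts once.
    """
    totals = defaultdict(float)
    for f in families:
        w = f["weight"] if use_weights else 1.0
        totals[f["quartile"]] += w * f["debt"]
    grand_total = sum(totals.values())
    return {q: t / grand_total for q, t in totals.items()}


# Invented toy data: the top quartile has two sampled families
# (oversampling), each with half the weight of the other families.
toy_families = [
    {"quartile": 1, "weight": 30_000, "debt": 10_000},
    {"quartile": 2, "weight": 30_000, "debt": 20_000},
    {"quartile": 3, "weight": 30_000, "debt": 20_000},
    {"quartile": 4, "weight": 15_000, "debt": 30_000},
    {"quartile": 4, "weight": 15_000, "debt": 10_000},
]

weighted = debt_share_by_group(toy_families, use_weights=True)
unweighted = debt_share_by_group(toy_families, use_weights=False)

for q in sorted(weighted):
    print(f"quartile {q}: weighted {weighted[q]:.1%}, "
          f"unweighted {unweighted[q]:.1%}")
```

In this toy example the unweighted share for the top quartile is larger than the weighted share, simply because the top quartile contributes more sampled families. Only the weighted shares can be read as estimates for the population.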

The statistics in the graph on the right of Figure 2 were calculated with an incorrect set of weights, which I’ll call aweights. The collapse command in Stata® (the statistical package used by the Urban Institute) allows different types of weights, and the pweight option uses the survey weights and produces the correct estimates. But the aweight option changes the weights used in the analysis.

By definition, a quarter of the families in the US population are in each income quartile. The sum of the survey weights in each quartile, which estimates the number of US families with family head age 25 and older in that quartile, is approximately 30 million families (about one-quarter of the population). The aweight option changes the weights so they sum to the number of sampled families, not to the number of population families, in each quartile. Each of the first three quartiles has about 1,200 sampled families, but, since wealthy families are oversampled, the top income quartile contains about 2,300 families in the sample. So aweight multiplies each sampled family’s survey weight in the top quartile by about 2,300/(30 million), but it multiplies each sampled family’s survey weight in the other three quartiles by about 1,200/(30 million). After this multiplication, the sum of the aweights for families in the top income quartile is about 40 percent of the total aweights, not the 25 percent that it should be. The survey weight of a family can be interpreted as the number of families in the population represented by it; the aweights don’t really have any meaning in terms of the population. Because the aweights in the top quartile are larger than they should be, using the aweights to estimate the percentage of all loan amounts belonging to families in that quartile gives a number that is too large as well.
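The arithmetic in this paragraph can be checked with a short Python sketch. It uses only the approximate figures quoted above (about 30 million population families per quartile, and about 1,200, 1,200, 1,200, and 2,300 sampled families) and follows the rescaling as described in this post. It is not a reimplementation of Stata’s internals, and the variable names are my own.

```python
# Approximate figures quoted in the text above.
population_per_quartile = 30_000_000
sampled_families = {1: 1_200, 2: 1_200, 3: 1_200, 4: 2_300}

# Survey weights (pweights) in each quartile sum to about the number
# of population families in that quartile.
survey_weight_sum = {q: population_per_quartile for q in sampled_families}

# The rescaling described above multiplies each weight in quartile q
# by (sampled families in q) / (population families in q), so the
# rescaled weights sum to the sample count instead.
rescaled_weight_sum = {
    q: survey_weight_sum[q] * n / population_per_quartile
    for q, n in sampled_families.items()
}


def top_quartile_share(weight_sums):
    """Fraction of the total weight that sits in the top quartile."""
    return weight_sums[4] / sum(weight_sums.values())


print(f"survey weights:   {top_quartile_share(survey_weight_sum):.1%}")
print(f"rescaled weights: {top_quartile_share(rescaled_weight_sum):.1%}")
```

The survey weights put 25 percent of the total weight in the top quartile, as they should. The rescaled weights put 2,300/5,900 of the total there, which is the “about 40 percent” mentioned above.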

Everyone who analyzes data has made a mistake at one time or another. Sometimes proofreading the code will catch a mistake, but in this case, it would be easy for a proofreader to miss the one-letter difference in code (aweight instead of pweight) that led to a 15-percentage-point difference in the estimate for the top income quartile.

But there are several things data analysts can do to prevent errors and check results.

  1. Check the calculations in a different statistical software package or in a spreadsheet. When I performed the calculations using PROC SURVEYMEANS of SAS® software with the SCF survey weights, I immediately obtained the correct estimates displayed in the left graph. I was able, after a few hours of work, to reproduce the incorrect statistics in the graph on the right in SAS® software, but I had to write special code and deliberately calculate an incorrect aweight variable to do so.

  2. Check that the weights used for the analysis sum to the size of the population you are studying. The SCF applies to a population of about 126 million families, so the weights for families in the sample should sum to approximately that amount.

  3. Try the code with a variable for which there are published estimates, such as median income, or the percentage of student loan debt held by young families making more than $60,000.
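The second check is easy to automate. Here is a minimal Python sketch; the function name weights_match_population and the 5 percent tolerance are my own assumptions, and the two weight lists are invented stand-ins rather than SCF data.

```python
def weights_match_population(weights, expected_population, rel_tol=0.05):
    """Return True if the weights sum to within `rel_tol` (as a
    fraction) of the expected population size."""
    total = sum(weights)
    return abs(total - expected_population) <= rel_tol * expected_population


# Invented stand-ins: weights that sum to 126 million families, and
# rescaled weights that sum only to a sample-sized total.
survey_like_weights = [30_000] * 4_200   # sums to 126,000,000
rescaled_like_weights = [1.0] * 5_900    # sums to 5,900

print(weights_match_population(survey_like_weights, 126_000_000))    # True
print(weights_match_population(rescaled_like_weights, 126_000_000))  # False
```

A check like this would flag weights that have been rescaled to sum to the sample size, because that total is nowhere near the population size.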

Interpreting the Statistics

What do these statistics about the percentage of student debt held by different income quartiles mean? There are some caveats to keep in mind when interpreting them.

I consider the SCF to be one of the highest quality sources of information available about the financial situation of American families. Still, the SCF data are from a survey, and depend on the accuracy of the numbers provided by survey participants. The SCF interviewers go to a lot of effort to get correct information from survey participants, but some answers may nevertheless be inaccurate. The survey also fails to obtain responses from some of the families asked to participate. The response rate is about 65 percent for the bulk of the data (it’s much lower for wealthy families), and the SCF adjusts the weights for nonresponse, but it is possible that the weight adjustments do not account for all the differences between families that agree to participate and those that do not. No data sets are perfect, however, and the SCF contains a wealth (sorry about the pun) of information about student debt — a purist who insists that every data set be perfect will end up making decisions without any data at all.

The second consideration, somewhat technical, is that the distribution of outstanding student debt is highly skewed; about half of the outstanding loan amounts are less than $20,000 but some families have debt exceeding $200,000. The megadebt families are scattered across the income quartiles, and are not associated with large weights, so this skewness does not appear to have a large effect on the estimated percentages. But it affects the standard errors of estimates.

And standard errors, which measure the uncertainty about the estimate that arises from taking a sample instead of obtaining data from everyone in the population, are important for interpreting SCF statistics. Overall, the data set contains information on 6,248 families, but sample sizes can be small for subpopulations (for example, families headed by women age 40 to 45 who have a graduate or professional degree) and some estimates can have large standard errors. The statistics in the leftmost graph of Figure 2 indicate that, when we look at all families with heads age 25 or older, the families in the lowest quartile of income have a smaller share of outstanding student debt than those in the other three income quartiles. But the percentage of debt held by the top income quartile is not statistically significantly different from the percentage held by the second income quartile, and the percentage held by the second income quartile is not significantly different from the percentage held by the third.

The third consideration is that the SCF data provide a snapshot at a particular moment in time. The survey tells about the amount of loans outstanding at the time of the survey and the amount that was borrowed for each of those loans; it has no information on loans that the family took out and paid off in the past. As a consequence, the heads of families with loans tend to be younger than the heads of families without loans. The current income quartiles also do not necessarily reflect the earning potential of the persons with loans. A recent medical school graduate may have low income now but potentially large income in a few years. Looking at student debt by income quartile does not tell the whole story, and it is important to obtain other views of the data and look at other data sources.

When survey data are collected and analyzed in accordance with the mathematical theory that justifies their accuracy, the statistics from a survey describe the population. But they describe only that moment in time. The snapshot provided by the SCF data does not forecast what would happen under various proposals to change college tuition or student debt systems, which could result in different college tuition structures, different student choices for whether and where to attend college, and different student borrowing patterns. As statisticians Morris Hansen and W. Edwards Deming pointed out in 1949, “a sample survey can at best only give a picture of the past, not of the future.”

Footnotes and References

Hansen, M. H. and Deming, W. E. (1949). On an important limitation to the use of data from samples. Bulletin of the International Statistical Institute, 32, 214-219.

*I restricted these analyses to families in which the head was age 25 or older. This is slightly different from the Urban Institute’s analysis, which restricted to families in which the average age of the adults was 25 or higher, but the statistics turn out to be almost the same. The SCF report says that the family head is the male in a mixed-sex couple and the older person in a same-sex couple. (The SCF economists actually use the term “Principal Economic Unit,” but you can think of PEUs as families.)

**Standard error calculations for the SCF are more complicated than those for most other surveys, and require special computations. I wrote a macro in SAS® software to calculate standard errors, including both the sampling variability and the variability from the multiple imputations. My standard error calculations treated the income groups as fixed categories.

***Why are the statistics in the first and second graphs in Figure 2 so similar when the survey weights of wealthy families are so much lower than those of non-wealthy families? This finding does not contradict my discussion about the importance of weights, as can be seen by taking a closer look at the weights of families with student loan debt. Although the families in the sample with super-high incomes have very low weights (see Figure 1), most of them have no student debt (a family with income of $10 million usually does not need to take out student loans) and thus do not contribute to the student debt total for the top income quartile. A percentage calculated without weights is the same as that calculated with weights when all weights are equal; although the weights for families in the SCF sample with student debt are not all equal, the median weight for families with outstanding student debt in the top income quartile is only slightly lower than that in the other three quartiles and so the rightmost bar in the middle graph, calculated without weights, is only slightly too high. But you should still always use the survey weights with the SCF, and the weights make a huge difference for statistics about income or net worth.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

Stata® is a registered trademark of StataCorp LLC.

Thanks to Matthew Chingos of the Urban Institute for his helpful comments on an earlier draft of this post.

Copyright (c) 2019 Sharon L. Lohr

Sharon Lohr