“When Should Researchers Use Inferential Statistics When Analyzing Data on Full Populations?”
Abstract: Many researchers uncritically use inferential statistical procedures (e.g., hypothesis tests) when analyzing complete population data—a situation in which inference may seem unnecessary. We begin by reviewing and analyzing the most common rationales for employing inferential procedures when analyzing full population data. Two common rationales—having to do with handling missing data and generalizing results to other times and/or places—either lack merit or amount to analyzing sample (not population) data. Whether it is appropriate to use inferential procedures depends on whether researchers are analyzing sample or population data and on whether they seek to make causal or descriptive claims. When doing descriptive research, the distinction between sample and population data is paramount: Inferential statistics should only be used to analyze sample data (to account for sampling variability) and never to analyze population data. When doing causal research, the distinction between sample data and population data is unimportant: Inferential procedures can and should always be used to distinguish (for example) robust associations from those that may have come about by chance alone. Crucially, using inferential procedures to analyze population data to make descriptive claims can lead to incorrect substantive conclusions—especially when population sizes and/or effect sizes are small.
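The abstract's closing point can be illustrated with a minimal sketch. It is not the authors' analysis: the ten values, the group labels, and the function names below are hypothetical, and an exact permutation test stands in for whichever inferential procedure a researcher might reach for. The sketch assumes the ten units are the entire population, so the difference in group means is a known fact, not an estimate.

```python
from itertools import combinations
from statistics import mean

# Hypothetical complete population: every unit in both groups is observed.
group_a = [52, 55, 58, 61, 64]
group_b = [50, 53, 56, 59, 62]


def mean_difference(a, b):
    """Difference in group means (a minus b)."""
    return mean(a) - mean(b)


def exact_permutation_p_value(a, b):
    """Two-sided exact permutation p-value for the difference in means.

    Enumerates every way of relabeling the pooled units into groups of the
    original sizes and counts relabelings whose absolute mean difference is
    at least as large as the observed one.
    """
    pooled = a + b
    observed = abs(mean_difference(a, b))
    n_extreme = 0
    n_total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point rounding in the means.
        if abs(mean_difference(first, second)) >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


observed_difference = mean_difference(group_a, group_b)
p_value = exact_permutation_p_value(group_a, group_b)

print("Population difference in means:", observed_difference)
print("Permutation p-value:", p_value)
```

With these made-up numbers, the population difference in means is exactly 2, yet the p-value is well above the conventional 0.05 threshold. A researcher who read that p-value as "no difference" would be making a descriptive claim that the full population data contradict.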
*Co-sponsored with the Center for Social Statistics