Robust policy making with finite-sample statistics: An example from the Covid-19 prevalence debate

Monday, April 11 at 4:00 PM


Join the Salem Center and Panos Toulis (Chicago Booth).

The main paradigm in statistics relies on asymptotics, that is, on the limit of an infinitely large sample. It is well known, however, that asymptotic inference is generally not robust to small samples or model misspecification. But what else can be done? This talk is centered around the idea that, in many important settings, it is possible to do inference that is valid in finite samples and does not require a complex model specification.

The primary example will be from Covid-19, where I will present a method for estimating prevalence from serology studies. The data are results from antibody tests in a sample from the population, where the test parameters, such as the true/false positive rates, may be unknown. The method scans the entire parameter space, and rejects parameter values using the joint data density as the test statistic. 
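
As a rough illustration of the scanning idea, and not the speaker's actual implementation, here is a minimal Python sketch under assumptions that go beyond the abstract. It assumes three independent binomial counts: false positives among known-negative validation samples, true positives among known-positive validation samples, and positives in the main serology sample. All function names, grid choices, and numbers are hypothetical. For each candidate parameter triple, the p-value is the total probability of all outcomes whose joint density is no larger than that of the observed data.

```python
from itertools import product
from math import comb


def binom_pmf(n, p):
    """Return [P(X = k) for k = 0..n] where X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def p_value(theta, data, sizes):
    """Exact p-value at theta with the joint data density as test statistic.

    theta = (false positive rate, true positive rate, prevalence)
    data  = (false positives, true positives, positives in main sample)
    sizes = (known negatives, known positives, main sample size)
    """
    fpr, tpr, prev = theta
    x_neg, x_pos, x_main = data
    n_neg, n_pos, n_main = sizes
    # Chance that a randomly sampled person tests positive.
    q = prev * tpr + (1 - prev) * fpr
    f_neg = binom_pmf(n_neg, fpr)
    f_pos = binom_pmf(n_pos, tpr)
    f_main = binom_pmf(n_main, q)
    f_obs = f_neg[x_neg] * f_pos[x_pos] * f_main[x_main]
    cutoff = f_obs * (1 + 1e-9)  # small tolerance for floating-point ties
    # Sum the density of every outcome that is at most as likely as the data.
    return sum(
        a * b * c
        for a, b, c in product(f_neg, f_pos, f_main)
        if a * b * c <= cutoff
    )


def prevalence_set(data, sizes, fpr_grid, tpr_grid, prev_grid, alpha=0.05):
    """Prevalence values that survive for at least one (fpr, tpr) pair."""
    kept = set()
    for theta in product(fpr_grid, tpr_grid, prev_grid):
        if p_value(theta, data, sizes) > alpha:  # theta is not rejected
            kept.add(theta[2])
    return sorted(kept)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    sizes = (10, 10, 30)
    data = (0, 9, 3)
    fpr_grid = [0.0, 0.05, 0.1]
    tpr_grid = [0.8, 0.9, 1.0]
    prev_grid = [i / 50 for i in range(21)]  # 0.00, 0.02, ..., 0.40
    print(prevalence_set(data, sizes, fpr_grid, tpr_grid, prev_grid))
```

The exhaustive enumeration over outcomes and grid points is what makes this kind of procedure computationally heavy, and in this sketch the finite-sample guarantee holds only at the grid points that are scanned.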

The key advantage of this approach over more standard model-based approaches is that it is valid in finite samples, for any data size. Moreover, our method requires only independence of serology test results, and does not rely on asymptotic arguments, normality assumptions, or other approximations. The downside of our method is mainly its computational complexity. We use Covid-19 serology studies in the US from the “early days” of the pandemic, and show that Covid-19 prevalence in the US was highly heterogeneous across states (e.g., 0.7%–1.5% in California, 13%–17% in New York around mid-April of 2020). This was an early indication against universal public health policies, which proved correct in the long run.