Michael Lingzhi Li presents "Statistical Performance Guarantee for Selecting Those Predicted to Benefit Most from Treatment."

Presentation Date: Wednesday, October 4, 2023
Abstract: Across a wide array of disciplines, many researchers use modern machine learning algorithms to identify a subgroup of individuals, called exceptional responders, who are likely to benefit most from a treatment. A common approach is to first estimate the conditional average treatment effect (CATE) or its proxy given a set of pre-treatment covariates and then optimize a cutoff on the resulting treatment prioritization score to decide who should receive the treatment. Unfortunately, since these estimated scores are often biased and noisy in practice, naive reliance on them can lead to misleading inference. Furthermore, practitioners often use the same data both to optimize the cutoff and to evaluate the performance of the resulting subset, which creates a multiple testing problem. In this paper, we propose a methodology that has a uniform statistical performance guarantee for selecting such exceptional responders regardless of how the cutoff is optimized. Specifically, we develop a uniform confidence interval for experimentally evaluating the group average treatment effect (GATE) among the individuals whose estimated score is at least as high as any given quantile value. This uniform confidence interval enables researchers to use arbitrary methods to choose the quantile of the estimated score, including optimizing over the lower confidence bound of the estimated GATE among the selected individuals. The proposed methodology provides this statistical performance guarantee without suffering from multiple testing problems, and it also generalizes to a generic class of statistics beyond the GATE. Importantly, the validity of our methodology depends solely on randomization of treatment and random sampling of units and does not require modeling assumptions or resampling methods. Consequently, our methodology is applicable to any machine learning algorithm and is computationally efficient.
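To make the quantity in the abstract concrete, the following is a minimal sketch of a plain difference-in-means estimate of the GATE among the units whose score falls in the top fraction p, using data from a randomized experiment. It is not the uniform confidence interval proposed in the paper; it only shows the point estimate that such an interval would be built around. The function names `gate_above_quantile` and `simulate`, the simulated data, and the assumed 50/50 treatment assignment are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def gate_above_quantile(scores, treated, outcomes, p):
    """Difference-in-means GATE estimate among the top-p fraction by score.

    Illustrative only: a point estimate with no uncertainty quantification.
    """
    n = len(scores)
    k = max(1, int(p * n))  # number of units selected by the cutoff
    # Rank units from highest to lowest score and keep the top k.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    selected = order[:k]
    y_treated = [outcomes[i] for i in selected if treated[i] == 1]
    y_control = [outcomes[i] for i in selected if treated[i] == 0]
    if not y_treated or not y_control:
        raise ValueError("selected group needs both treated and control units")
    return mean(y_treated) - mean(y_control)


def simulate(n=4000, seed=0):
    """Hypothetical randomized experiment where the effect grows with x."""
    rng = random.Random(seed)
    scores, treated, outcomes = [], [], []
    for _ in range(n):
        x = rng.random()
        t = 1 if rng.random() < 0.5 else 0   # randomized treatment
        y = 2.0 * x * t + rng.gauss(0, 1)    # assumed true effect is 2x
        scores.append(x + rng.gauss(0, 0.1)) # noisy prioritization score
        treated.append(t)
        outcomes.append(y)
    return scores, treated, outcomes


scores, treated, outcomes = simulate()
# Evaluate the estimated GATE at several candidate cutoffs.
estimates = {
    p: gate_above_quantile(scores, treated, outcomes, p)
    for p in (0.1, 0.25, 0.5, 1.0)
}
```

Scanning `estimates` and reporting the cutoff with the largest value, using the same data, is exactly the practice the abstract warns about: a pointwise interval at the chosen cutoff would not account for the selection, which is the gap a uniform interval is meant to close.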
 