David Ham presents "Using Machine Learning to Test Hypotheses in Conjoint Analysis"

Presentation Date: Wednesday, April 7, 2021



Conjoint analysis is a popular experimental design used to measure multidimensional preferences. Researchers examine how varying a factor of interest, while controlling for other relevant factors, affects decision-making. Currently, there are two methodological approaches to analyzing data from a conjoint experiment. The first focuses on estimating the marginal effect of each factor while averaging over the other factors. Although this allows for straightforward nonparametric estimation using a design-based approach, the results depend critically on the distribution of the other factors and on how interaction effects are aggregated. An alternative approach is model-based and in principle can compute any quantity of interest. Its primary drawback is that researchers must correctly specify the model, a challenging task for conjoint analysis with many factors. In addition, the commonly used logistic regression has poor statistical properties even with a moderate number of factors. We propose a new hypothesis testing approach based on the conditional randomization test. We answer the most fundamental question of conjoint analysis: Does a factor of interest matter in any way given the other factors? Our methodology is based solely on the randomization of factors, and hence is free from modeling assumptions. Yet it allows researchers to use any test statistic, including those based on complex machine learning models. As a result, we are able to combine the strengths of the existing design-based and model-based approaches. We illustrate the proposed methodology through a conjoint analysis of immigration preferences. An open-source software package is available for implementing the proposed methodology.
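
To make the idea of a conditional randomization test concrete, here is a minimal sketch in Python. It is not the authors' software package. It assumes a single binary factor of interest `x` that was randomized uniformly and independently of one other binary factor `z`, and it uses simulated data. The names `crt_pvalue`, `interaction_stat`, and `draw_x` are hypothetical, and the least-squares statistic stands in for whatever model (including a machine learning model) a researcher might prefer.

```python
import numpy as np

def crt_pvalue(x, z, y, statistic, draw_x, n_draws=500, seed=0):
    """Randomization p-value for 'x does not matter given z'.

    draw_x(rng, n) must sample the factor of interest from its known
    randomization distribution (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    t_obs = statistic(x, z, y)
    # Re-randomize x while holding z and y fixed, recomputing the statistic.
    t_null = np.array(
        [statistic(draw_x(rng, len(x)), z, y) for _ in range(n_draws)]
    )
    # The +1 terms count the observed draw itself, keeping the test valid.
    return (1 + np.sum(t_null >= t_obs)) / (n_draws + 1)

def interaction_stat(x, z, y):
    """Illustrative statistic: squared size of the fitted x and x*z coefficients."""
    design = np.column_stack([np.ones(len(y)), z, x, x * z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2] ** 2 + coef[3] ** 2

# Simulated example: x has no marginal effect but interacts with z.
rng = np.random.default_rng(1)
n = 2000
x = rng.integers(0, 2, size=n)
z = rng.integers(0, 2, size=n)
prob = 0.5 + 0.2 * (2 * x - 1) * (2 * z - 1)
y_effect = rng.binomial(1, prob)           # x matters only through the interaction
y_null = rng.binomial(1, 0.5, size=n)      # x is irrelevant

draw_x = lambda r, m: r.integers(0, 2, size=m)  # uniform randomization of x
p_effect = crt_pvalue(x, z, y_effect, interaction_stat, draw_x)
p_null = crt_pvalue(x, z, y_null, interaction_stat, draw_x)
print(p_effect, p_null)
```

In this simulated setup the effect of `x` averages to zero over `z`, so a purely marginal analysis would tend to miss it, while the test above can still detect it because the statistic is free to pick up interactions. The validity of the p-value rests only on resampling `x` from its randomization distribution, not on the linear model being correct.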
