Theodore Huang, Gregory Idos, Christine Hong, Stephen Gruber, Giovanni Parmigiani, and Danielle Braun. Forthcoming. “Extending Models Via Gradient Boosting: An Application to Mendelian Models.” Annals of Applied Statistics.
Improving existing widely adopted prediction models is often a more efficient and robust way towards progress than training new models from scratch. Existing models may (a) incorporate complex mechanistic knowledge, (b) leverage proprietary information, and (c) have surmounted barriers to adoption. Compared to model training, model improvement and modification receive little attention. In this paper, we propose a general approach to model improvement: we combine gradient boosting with any previously developed model to improve model performance while retaining important existing characteristics. To exemplify, we consider the context of Mendelian models, which estimate the probability of carrying genetic mutations that confer susceptibility to disease by using family pedigrees and health histories of family members. Via simulations, we show that integration of gradient boosting with an existing Mendelian model can produce an improved model that outperforms both that model and the model built using gradient boosting alone. We illustrate the approach on genetic testing data from the USC-Stanford Cancer Genetics Hereditary Cancer Panel (HCP) study.
Zoe Guan, Theodore Huang, Anne Marie McCarthy, Kevin S Hughes, Alan Semine, Hajime Uno, Lorenzo Trippa, Giovanni Parmigiani, and Danielle Braun. 2020. “Combining Breast Cancer Risk Prediction Models.” arXiv preprint arXiv:2008.01019.
K Yin, Y Liu, B Lamichhane, JF Sandbach, G Patel, G Compagnoni, RH Kanak, B Rosen, DP Ondrula, L Smith, and others. 2020. “Legacy Genetic Testing Results for Cancer Susceptibility: How Common are Conflicting Classifications in a Large Variant Dataset from Multiple Practices?” Annals of Surgical Oncology.
Gavin Lee, Qing Zhang, Jane W Liang, Theodore Huang, Christine Choirat, Giovanni Parmigiani, and Danielle Braun. 2020. “PanelPRO: An R package for multi-syndrome, multi-gene risk modeling for individuals with a family history of cancer.” arXiv preprint arXiv:2010.13011.
Fan Gao, Xuedong Pan, Elissa B Dodd-Eaton, Carlos Vera C Recio, Jasmina Bojadzieva, Phuong L Mai, Kristin Zelley, Valen E Johnson, Danielle Braun, Kim E Nichols, and others. 2020. “A pedigree-based prediction model identifies carriers of deleterious de novo mutations in families with Li-Fraumeni syndrome.” bioRxiv.
Jinbo Chen, Eunchan Bae, Lingjiao Zhang, Kevin Hughes, Giovanni Parmigiani, Danielle Braun, and Timothy R Rebbeck. 2020. “Penetrance of Breast and Ovarian Cancer in Women Who Carry a BRCA1/2 Mutation and Do not Use Risk-Reducing Salpingo-Oophorectomy: An Updated Meta-analysis.” JNCI Cancer Spectrum.
Cathy Wang, Yan Wang, Kevin S Hughes, Giovanni Parmigiani, and Danielle Braun. 2020. “Penetrance of Colorectal Cancer Among Mismatch Repair Gene Mutation Carriers: A Meta-Analysis.” JNCI Cancer Spectrum.
Theodore Huang, Malka Gorfine, Li Hsu, Giovanni Parmigiani, and Danielle Braun. 2020. “Practical implementation of frailty models in Mendelian risk prediction.” Genetic Epidemiology.
Margaux LA Hujoel, Giovanni Parmigiani, and Danielle Braun. 2020. “Statistical approaches for meta-analysis of genetic mutation prevalence.” Genetic Epidemiology.
Theodore Huang, Danielle Braun, Henry T Lynch, and Giovanni Parmigiani. 2020. “Variation in cancer risk among families with genetic susceptibility.” Genetic Epidemiology.
Yuxi Liu, Kanhua Yin, Basanta Lamichhane, John F Sandbach, Gayle Patel, Gia Compagnoni, Richard H Kanak, Barry Rosen, David P Ondrula, Linda Smith, and others. 2019. “Ask2Me VarHarmonizer: A Python-Based Tool to Harmonize Variants from Cancer Genetic Testing Reports and Map them to the ClinVar Database.” arXiv preprint arXiv:1911.08408.
Jaihwan Kim, Danielle Braun, Chinedu Ukaegbu, Tara G Dhingra, Fay Kastrinos, Giovanni Parmigiani, Sapna Syngal, and Matthew B Yurgelun. 2019. “Clinical Factors Associated With Gastric Cancer in Individuals With Lynch Syndrome.” Clinical Gastroenterology and Hepatology.
Francisco Acevedo, Victor Diego Armengol, Zhengyi Deng, Rong Tang, Suzanne Coopey, Emanuele Mazzola, Conor Lanahan, Danielle Braun, Adam Yala, Regina Barzilay, and others. 2019. “Incidental atypical hyperplasia/LCIS in mammoplasty specimens and subsequent risk of breast cancer.”
Anne Marie McCarthy, Zoe Guan, Michaela Welch, Molly E Griffin, Dorothy A Sippo, Zhengyi Deng, Suzanne B Coopey, Ahmet Acar, Alan Semine, Giovanni Parmigiani, and others. 2019. “Performance of Breast Cancer Risk-Assessment Models in a Large Mammography Cohort.” JNCI: Journal of the National Cancer Institute.
Yujia Bao, Zhengyi Deng, Yan Wang, Heeyoon Kim, Victor Diego Armengol, Francisco Acevedo, Nofal Ouardaoui, Cathy Wang, Giovanni Parmigiani, Regina Barzilay, Danielle Braun, and Kevin S Hughes. 2019. “Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes.” JCO Clinical Cancer Informatics, forthcoming.
Zhengyi Deng, Kanhua Yin, Yujia Bao, Victor Diego Armengol, Cathy Wang, Ankur Tiwari, Regina Barzilay, Giovanni Parmigiani, Danielle Braun, and Kevin S Hughes. 2019. “Validation of a Semiautomated Natural Language Processing–Based Procedure for Meta-Analysis of Cancer Susceptibility Gene Penetrance.” JCO Clinical Cancer Informatics.
Danielle Braun, Jiabei Yang, Molly Griffin, Giovanni Parmigiani, and Kevin S Hughes. 2018. “A Clinical Decision Support Tool to Predict Cancer Risk for Commonly Tested Cancer-Related Germline Mutations.” Journal of Genetic Counseling, Pp. 1–13.
Thomas Madsen, Danielle Braun, Peng Gang, Giovanni Parmigiani, and Lorenzo Trippa. 2018. “Efficient computation of the joint probability of multiple inherited risk alleles from pedigree data.” Genetic Epidemiology, Pp. 1–11.
Francisco Acevedo, Diego V Armengol, Zhengyi Deng, Rong Tang, Suzanne B Coopey, Danielle Braun, Adam Yala, Regina Barzilay, Clara Li, Amy Colwell, and others. 2018. “Pathologic findings in reduction mammoplasty specimens: a surrogate for the population prevalence of breast cancer and high-risk lesions.” Breast Cancer Research and Treatment, Pp. 1–7.
Jessica A Cintolo-Gonzalez, Danielle Braun, Amanda L Blackford, Emanuele Mazzola, Ahmet Acar, Jennifer K Plichta, Molly Griffin, and Kevin S Hughes. 2017. “Breast cancer risk models: a comprehensive overview of existing models, validation, and clinical applications.” Breast Cancer Research and Treatment, Pp. 1–22.