IMPORTANCE:
Radiation therapy (RT) is a critical cancer treatment, but the existing radiation oncologist workforce does not meet growing global demand. One key physician task in RT planning is tumor segmentation for targeting, which requires substantial training and is subject to significant interobserver variation.
OBJECTIVE:
To determine whether crowd innovation could be used to rapidly produce artificial intelligence (AI) solutions that replicate the accuracy of an expert radiation oncologist in segmenting lung tumors for RT targeting.
DESIGN, SETTING, AND PARTICIPANTS:
We conducted a 10-week, prize-based, online, 3-phase challenge (prizes totaled $55 000). A well-curated data set, including computed tomographic (CT) scans and lung tumor segmentations generated by an expert for clinical care, was used for the contest (CT scans from 461 patients; median 157 images per scan; 77 942 images in total; 8144 images with tumor present). Contestants were provided a training set of 229 CT scans with accompanying expert contours to develop their algorithms and given feedback on their performance throughout the contest, including from the expert clinician.
MAIN OUTCOMES AND MEASURES:
The AI algorithms generated by contestants were automatically scored on an independent data set that was withheld from contestants, and performance ranked using quantitative metrics that evaluated overlap of each algorithm's automated segmentations with the expert's segmentations. Performance was further benchmarked against human expert interobserver and intraobserver variation.
RESULTS:
A total of 564 contestants from 62 countries registered for this challenge, and 34 (6%) submitted algorithms. The automated segmentations produced by the top 5 AI algorithms, when combined using an ensemble model, had an accuracy (Dice coefficient = 0.79) that was within the benchmark of mean interobserver variation measured between 6 human experts. For phase 1, the top 7 algorithms had average custom segmentation scores (S scores) on the holdout data set ranging from 0.15 to 0.38, and suboptimal performance using relative measures of error. The average S scores for phase 2 increased to 0.53 to 0.57, with a similar improvement in other performance metrics. In phase 3, performance of the top algorithm increased by an additional 9%. Combining the top 5 algorithms from phase 2 and phase 3 using an ensemble model yielded an additional 9% to 12% improvement in performance, with a final S score reaching 0.68.
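The Dice coefficient reported above measures overlap between an automated segmentation and the expert's contour: twice the number of voxels both mark as tumor, divided by the total number of voxels each marks. The following is a minimal sketch of that metric on toy binary masks. The abstract does not say how the ensemble model was built, so the `majority_vote` helper is a hypothetical stand-in rather than the contest's method, and the custom S score is not reproduced here. All mask values are invented for illustration.

```python
from typing import List, Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A and B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def majority_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Hypothetical ensemble: keep a voxel if more than half the masks mark it."""
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks)
    return [1 if 2 * sum(1 for v in votes if v) > n else 0 for votes in zip(*masks)]


# Toy flattened masks (1 = tumor voxel); values are illustrative only.
expert = [0, 1, 1, 1, 0, 0]
algo_a = [0, 1, 1, 0, 0, 0]
algo_b = [0, 0, 1, 1, 1, 0]
algo_c = [1, 1, 1, 1, 0, 0]

ensemble = majority_vote([algo_a, algo_b, algo_c])
print("algo_a vs expert:", dice_coefficient(algo_a, expert))
print("ensemble vs expert:", dice_coefficient(ensemble, expert))
```

In this toy case the individual masks each miss or add a voxel, while the voted mask matches the expert exactly, which illustrates why combining several algorithms can score higher than any single one.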
CONCLUSIONS AND RELEVANCE:
A combined crowd innovation and AI approach rapidly produced automated algorithms that replicated the skills of a highly trained physician for a critical task in radiation therapy. These AI algorithms could improve cancer care globally by transferring the skills of expert clinicians to under-resourced health care settings.
Scientists typically self-organize into teams, matching with others to collaborate in the production of new knowledge. We present the results of a field experiment conducted at Harvard Medical School to understand the extent to which search costs affect matching among scientific collaborators. We generated exogenous variation in search costs for pairs of potential collaborators by randomly assigning individuals to 90-minute structured information-sharing sessions as part of a grant funding opportunity for biomedical researchers. We estimate that the treatment increases the baseline probability of grant co-application by a given pair of researchers by 75% (raising the likelihood of a pair collaborating from 0.16% to 0.28%), with larger effects among those in the same specialization. The findings indicate that matching between scientists is subject to considerable frictions, even in the case of geographically proximate scientists working in the same institutional context with ample access to common information and funding opportunities.
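The 75% figure is a relative effect on a very small baseline, which is easy to misread as a change in percentage points. A short check of the arithmetic, using only the two probabilities quoted above (the function name is chosen here for illustration):

```python
def relative_increase(baseline: float, treated: float) -> float:
    """Return the relative change from baseline to treated as a fraction."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero")
    return (treated - baseline) / baseline


baseline_pct = 0.16  # co-application probability without the session, in percent
treated_pct = 0.28   # co-application probability with the session, in percent

point_change = treated_pct - baseline_pct
rel_change = relative_increase(baseline_pct, treated_pct)

print(f"absolute change: {point_change:.2f} percentage points")
print(f"relative change: {rel_change:.0%}")
```

The absolute change is about 0.12 percentage points, while the relative change is the 75% reported in the abstract.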
Selecting among alternative projects is a core management task in all innovating organizations. In this paper, we focus on the evaluation of frontier scientific research projects. We argue that the "intellectual distance" between the knowledge embodied in research proposals and an evaluator's own expertise systematically relates to the evaluations given. To estimate relationships, we designed and executed a grant proposal process at a leading research university in which we randomized the assignment of evaluators and proposals to generate 2,130 evaluator-proposal pairs. We find that evaluators systematically give lower scores to research proposals that are closer to their own areas of expertise and to those that are highly novel. The patterns are consistent with biases associated with boundedly rational evaluation of new ideas. The patterns are inconsistent with intellectual distance simply contributing "noise" or being associated with private interests of evaluators. We discuss implications for policy, managerial intervention, and allocation of resources in the ongoing accumulation of scientific knowledge.
Harvard Medical School seems an unlikely organization to open up its innovation process. By most measures, the more than 20,000 faculty, research staff and graduate students affiliated with Harvard Medical School are already world class and at the top of the medical research game, with approximately $1.4 billion in annual funding from the U.S. National Institutes of Health (NIH). But in February 2010, Drew Faust, president of Harvard University, sent an email invitation to all faculty, staff and students at the university (more than 40,000 individuals) encouraging them to participate in an ideas challenge that Harvard Medical School had launched to generate research topics in Type 1 diabetes. Eventually, the challenge was shared with more than 250,000 invitees, resulting in 150 research ideas and hypotheses. The goal of opening up idea generation and disaggregating the different stages of the research process was to expand the number and range of people who might participate. Today, seven teams of multidisciplinary researchers are working on the resulting potential breakthrough ideas. In this article, we describe how the leaders of Harvard Catalyst chose to implement principles of open and distributed innovation. Harvard Catalyst is an organization whose mission is to drive therapies from the lab to patients' bedsides faster, and to do so by working across the many silos of Harvard Medical School.