Presentations

Eric Tchetgen Tchetgen presents "An Introduction to Proximal Causal Learning", at https://harvard.zoom.us/j/99424949004?pwd=aWtPNFM3ZzFYbWxIMXNoZDlyUElVZz09, Wednesday, October 21, 2020

A standard assumption for causal inference from observational data is that one has measured a sufficiently rich set of covariates to ensure that, within covariate strata, subjects are exchangeable across observed treatment values. Skepticism about the exchangeability assumption in observational studies is often warranted because it hinges on one's ability to accurately measure covariates capturing all potential sources of confounding. Realistically, confounding mechanisms can rarely, if ever, be learned with certainty from measured covariates. One can therefore only ever hope that...

Luke Miratrix presents "A Practitioner’s Guide to Intent-to-Treat Effects from Multisite (blocked) Individually Randomized Trials: Estimands, Estimators, and Estimates", at https://harvard.zoom.us/j/99424949004?pwd=aWtPNFM3ZzFYbWxIMXNoZDlyUElVZz09, Wednesday, October 14, 2020
There are many ways to estimate an overall average effect of a large-scale multisite individually randomized control trial. The researcher can target the average effect across individuals or sites. Furthermore, the researcher can target the effect for the experimental sample or a larger population. If treatment effects vary across sites, these estimands can differ. Once an estimand is selected, an estimator must be chosen. Standard estimators, such as fixed-effects regression, can be biased. We describe 15 different estimators commonly in use, consider which estimands they are...
Michael Baiocchi presents "When black box algorithms are (not) appropriate: a principled prediction-problem ontology", at Zoom: https://harvard.zoom.us/j/99424949004?pwd=aWtPNFM3ZzFYbWxIMXNoZDlyUElVZz09, Wednesday, September 30, 2020

In the 1980s a new, extraordinarily productive way of reasoning about algorithms emerged. Though this type of reasoning has come to dominate areas of data science, it has been under-discussed and its impact under-appreciated. For example, it is the primary way we reason about "black box" algorithms. In this talk we discuss its current use (i.e., as "the common task framework") and its limitations; we find a large class of prediction-problems are inappropriate for this type of reasoning. Further, we find the common task framework does not provide a foundation for the deployment of an...

Reagan Moze presents "Recent Adventures in Causal(ish) Inference with Text as Data.", at Zoom: https://harvard.zoom.us/j/99424949004?pwd=aWtPNFM3ZzFYbWxIMXNoZDlyUElVZz09, Wednesday, September 23, 2020
Text data have a long history in social science and education research. However, these data are notoriously high-dimensional and characterized by many nuances of language that lack plausible statistical models. As a result, analysis of text data typically involves intensive human coding tasks where particular constructs or features of the text are first defined, and then a collection of documents is inspected and coded for the presence or absence of these constructs. While this process may be feasible in studies with smaller sample sizes, the time and resources required to train and employ...
Connor Jerzak presents "Detecting and Characterizing Latent Influence Dynamics in Social Science Data Using Machine Learning", at Zoom: https://harvard.zoom.us/j/99424949004?pwd=aWtPNFM3ZzFYbWxIMXNoZDlyUElVZz09, Wednesday, September 16, 2020

Unobserved interactions between people and groups play a fundamental role in domestic and international politics. Yet, despite their importance, the vast complexity of these unobserved interactions has typically frustrated efforts to quantify them, forcing scholars to assume that the units in an analysis are independent or to study a limited range of interactions. Here, I develop a framework and machine learning model for detecting and characterizing unobserved interference dynamics using all available information: outcome, covariate, and independent variable data. Given minimal...

Matthew Blackwell presents "Noncompliance and instrumental variables for 2^K factorial experiments", at Zoom: https://harvard.zoom.us/j/99424949004?pwd=aWtPNFM3ZzFYbWxIMXNoZDlyUElVZz09, Wednesday, September 9, 2020

Factorial experiments are widely used to assess the marginal, joint, and interactive effects of multiple concurrent factors. While a robust literature covers the design and analysis of these experiments, there is less work on how to handle treatment noncompliance in this setting. To fill this gap, we introduce a new methodology that uses the potential outcomes framework for analyzing 2^K factorial experiments with noncompliance on any number of factors. This framework builds on and extends the literature on both instrumental variables and factorial experiments in several ways. First, we...

Adam Kapelner presents "Harmonizing Optimized Designs with Classic Randomization in Experiments", at CGIS Knafel Building (K354) - 12-1:30 pm, Wednesday, February 12, 2020
Abstract: There is a long-running debate in experimental design between proponents of the classic randomization design of Fisher, Yates, Kempthorne, and Cochran and those who advocate deterministic assignments based on notions of optimality. In nonsequential trials comparing treatment and control, covariate measurements for each subject are known in advance, and subjects can be divided into two groups based on a criterion of imbalance. With the advent of modern computing, this partition can be made nearly perfectly balanced via numerical optimization, but these...
Gary King presents "Statistically Valid Inferences from Privacy Protected Data", at CGIS Knafel Building (K354) - 12-1:30 pm, Wednesday, February 5, 2020
Abstract: Unprecedented quantities of data that could help social scientists understand and ameliorate the challenges of human society are presently locked away inside companies, governments, and other organizations, in part because of worries about privacy violations. We address this problem with a general-purpose data access and analysis system with mathematical guarantees of privacy for individuals who may be represented in the data, statistical guarantees for researchers...
