Bayesian inference and natural selection

    I saw a thought-provoking post at John Baez's diary the other day pointing out an interesting analogy between natural selection and Bayesian inference, and I can't decide if I should classify it as just "neat" or if it might also be "neat, and potentially deep" (which is where I'm leaning). Because it's a rather lengthy post, I'll just quote the relevant bits:

    The analogy is mathematically precise, and fascinating. In rough terms, it says that the process of natural selection resembles the...
    Read more about Bayesian inference and natural selection

    Bayesian Models of Human Learning and Reasoning: A Recap

    Drew Thomas

    An MIT tag team of Prof. Josh Tenenbaum and his graduate student Charles Kemp presented their research to the IQSS Research Workshop on Wednesday, October 19. The overarching topic of Prof. Tenenbaum's research is machine learning; one major aspect of this is their method of categorizing the structure of the field to be learned.

    For example, it has made sense for hundreds of years that forms of life could be taxonomically identified according to a tree...

    Read more about Bayesian Models of Human Learning and Reasoning: A Recap

    Bayesian Propensity Score Matching

    Many people have realized that the conventional propensity score matching (PSM) method does not take into account the uncertainty in estimating propensity scores. In other words, for each observation, PSM assumes that there is only one fixed propensity score. In contrast, Bayesian methods can generate a sample of propensity scores for any observation, either by monitoring the posterior distributions of the estimated propensity scores directly or by predicting propensity scores from the posterior samples of the parameters of the propensity score model.

    Then matching on thus obtained...

    Read more about Bayesian Propensity Score Matching
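
    As a rough illustration of the second route described in the teaser (predicting propensity scores from posterior samples of the model parameters), here is a minimal sketch. Everything in it is an assumption made for illustration and not the post's own method: the one-covariate logistic model, the Normal(0, 10^2) priors, the random-walk Metropolis sampler, the toy data, and all function names.

```python
import math
import random


def sigmoid(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def log_posterior(beta, xs, ts, prior_sd=10.0):
    """Unnormalized log posterior of P(T=1 | x) = sigmoid(beta[0] + beta[1] * x)
    with independent Normal(0, prior_sd^2) priors (an assumed prior)."""
    lp = -sum(b * b for b in beta) / (2.0 * prior_sd ** 2)
    for x, t in zip(xs, ts):
        eta = beta[0] + beta[1] * x
        # Bernoulli log-likelihood: t * eta - log(1 + exp(eta)), in stable form
        lp += t * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return lp


def sample_posterior(xs, ts, n_draws=2000, burn_in=500, step=0.3, seed=0):
    """Random-walk Metropolis draws of (intercept, slope)."""
    rng = random.Random(seed)
    beta = [0.0, 0.0]
    current = log_posterior(beta, xs, ts)
    draws = []
    for i in range(n_draws + burn_in):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        candidate = log_posterior(proposal, xs, ts)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        if i >= burn_in:
            draws.append(tuple(beta))
    return draws


def propensity_draws(x, draws):
    """One propensity score per posterior draw for an observation with covariate x."""
    return [sigmoid(b0 + b1 * x) for b0, b1 in draws]


# Hypothetical toy data: covariate values and treatment indicators.
XS = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
TS = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

draws = sample_posterior(XS, TS)
scores = propensity_draws(0.5, draws)
mean_score = sum(scores) / len(scores)
print(f"posterior mean propensity at x=0.5: {mean_score:.3f}")
print(f"range across draws: {min(scores):.3f} to {max(scores):.3f}")
```

    Each observation now carries a whole distribution of propensity scores instead of a single fixed value, and it is this spread that conventional PSM ignores.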

    Bayesian vs. frequentist in cogsci

    Bayesian vs. frequentist - it's an old debate. The Bayesian approach views probabilities as degrees of belief in a proposition, while the frequentist says that a probability refers to a set of events, i.e., is derived from observed or imaginary frequency distributions. In order to avoid the well-trod ground comparing these two approaches in pure statistics, I'll consider instead how the debate changes when applied to cognitive science.

    One of the main arguments made against using Bayesian probability in statistics is that it's ill-grounded...

    Read more about Bayesian vs. frequentist in cogsci

    Benjamin Fry and Data Visualization

    Please join us tomorrow (Wednesday, 9/24) when we welcome Ben Fry to the applied statistics workshop. Ben's research explores data visualization; more details can be found here, including details of his recently completed book "Data Visualization" and samples from his previous work.

    The workshop will meet at 12 noon in room K-354, CGIS-Knafel (1737 Cambridge St) with a light lunch served. The presentation will begin at 12:15 and usually ends around 1:30 pm. All are welcome--

    ... Read more about Benjamin Fry and Data Visualization

    Best Practice Stats Reporting (Almost)

    Felix Elwert

    Let’s salute the New York Times for its near-perfect polling documentation. In a recent edition of the Sunday Magazine, the Times includes a two-page spread on a phone survey on New York City politics. Though the survey touches on some life-and-death issues (“Would you ever date a Republican?”), it’s really more for laughs than higher learning. Regardless, the Times goes to great lengths to describe its methodology:

    “Methodology: This telephone poll of a random sample of 1,011 adults in New York City was conducted for the New York Times...

    Read more about Best Practice Stats Reporting (Almost)

    Better Way To Make Cumulative Comparisons With Small Samples?

    On July 15, 1971 the research vessel Lev Berg set sail from Aralsk (Kazakhstan) to survey the Aral Sea, then the 4th largest freshwater lake in the world. The Soviet Union had been steadily draining the Aral for agricultural purposes since the 1950s and the Lev Berg was to measure the ecological damage. This trip included passing by the island Vozrozhdeniye on the South side.

    Lev Berg Image
    (Image Source: "The 1971 Smallpox Epidemic in Aralsk, Kazakhstan, and the...

    Read more about Better Way To Make Cumulative Comparisons With Small Samples?

    Beyond Standard Errors, Part I: What Makes an Inference Prone to Survive Rosenbaum-Type Sensitivity Tests?

    Jens Hainmueller

    Stimulated by the lectures in Statistics 214 (Causal Inference in the Biomedical and Social Sciences), Holger Kern and I have been thinking about Rosenbaum-type tests for sensitivity to hidden bias. Hidden bias is pervasive in observational settings and these sensitivity tests are a tool to deal with it. When done with your inference, it seems constructive to replace the usual qualitative statement that hidden bias “may be a problem” with a precise quantitative statement like “in order to account for my estimated effect, a hidden bias has to be of...

    Read more about Beyond Standard Errors, Part I: What Makes an Inference Prone to Survive Rosenbaum-Type Sensitivity Tests?

    Beyond Standard Errors, Part II: What Makes an Inference Prone to Survive Rosenbaum-Type Sensitivity Tests?

    Jens Hainmueller

    Continuing from my previous post on this subject, sensitivity tests are still somewhat rarely (yet increasingly) used in applied research. This is unfortunate, I think, because, at least according to my own tests on several datasets, observational studies do vary considerably in their sensitivity to hidden bias. Some results go away once you allow for only a tiny amount of hidden bias, while others are rock solid, weathering even the strongest hidden bias. One should always...

    Read more about Beyond Standard Errors, Part II: What Makes an Inference Prone to Survive Rosenbaum-Type Sensitivity Tests?

    Biden-Palin Linguistics

    This post looks at the linguistics of last night's Biden-Palin debate. Palin used the word "reform" 12 times compared to Biden's none. Biden used "middle class" 12 times to Palin's one.

    Here's a sequel to my earlier Obama-Clinton post.

    Overall, Biden uttered 7,065 words and Palin 7,646, with a total of 2,117 unique words. Which words did Biden use significantly more or less than Palin? For each word, we apply a chi-squared test that...

    Read more about Biden-Palin Linguistics
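
    The teaser is cut off before it says exactly which chi-squared test is applied, so the following is only one plausible reading: a Pearson chi-squared test on a 2x2 table (the word versus all other words, by speaker), without continuity correction. The function name and table layout are assumptions for illustration. The counts for "reform" and the word totals are the ones quoted above.

```python
import math


def chi_squared_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-squared test (1 degree of freedom, no continuity correction)
    for whether speaker A and speaker B use a word at different rates.

    Returns (statistic, p_value)."""
    if count_a + count_b == 0:
        raise ValueError("word was used by neither speaker")
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, grand - count_a - count_b]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# "reform": Biden 0 of 7,065 words, Palin 12 of 7,646 words
stat, p_value = chi_squared_2x2(0, 7065, 12, 7646)
print(f"chi-squared = {stat:.2f}, p = {p_value:.4f}")
```

    With counts this small the expected cell values are close to the usual rule-of-thumb minimum of 5, so an exact test may be preferable in practice. Testing every word separately also raises a multiple-comparisons issue that this sketch does not address.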

    Bike helmet laws

    I'm into biking (mostly road-biking these days) so I was interested to read a post on the New York Times' "Freakonomics" blog about a study that uses variation in bike helmet laws across US states to show that helmet laws decrease bike riding among kids and teens. Since I think that most people should ride bikes most of the time AND I have been known to bug people to wear helmets, perhaps I've been...

    Read more about Bike helmet laws

    Bill Support by Page Length

    There was a lot of press on the 1,000+-page length of the House health care bill, H.R. 3962. That got me thinking... didn't we hear the same thing about the stimulus bill and the Patriot Act? Aren't most "controversial" bills also very long?

    It would make sense. Controversial bills require a lot more ink -- pork, special cases, exceptions -- to reel in support. Uncontroversial bills can be written succinctly and pass as is.

    To assess this I scraped bills from OpenCongress, which maintains the full text, voting results and amendment...

    Read more about Bill Support by Page Length

    Beyond scatterplots

    Are scatterplots confusing? Turns out the graphics people at the New York Times, who I think have been putting out some outstanding work in the past few years, think so.

    Matthew Ericson, Deputy Graphics Editor at the NYT, gave a talk recently at the Infovis conference in which he described some of the techniques his staff uses to communicate information to readers. I wasn't there, but I looked through his slides (70 MB zip file), which provide both...

    Read more about Beyond scatterplots

    Blitzstein on "In and Out of Network Sampling"

    Please join us this Wednesday as we welcome Joseph Blitzstein, Department of Statistics, Harvard University, who will present 'In and Out of Network Sampling'.

    Joe provided the following abstract for his talk:

    In recent years it has become extremely common to need to work with
    network data, in applications such as the study of social networks,
    protein interaction networks, and the Internet. This has required the
    development of new generative models such as exponential random graph
    models and power law models. Yet it is usually prohibitively...

    Read more about Blitzstein on "In and Out of Network Sampling"

    Book Review: “The Probability of God”, by Stephen D. Unwin

    Drew Thomas

    I continue with my review of The Probability of God, by Stephen D. Unwin, which I began here.

    The first clue I had that this book would have anything but rigorous mathematical analysis was that I found it in Harvard’s Divinity library. As expected, the book is mainly philosophical in nature, but that doesn’t mean it exceeds its mathematical scope. Indeed, it gives the reader a good introduction to Bayesian inference while being very clear about its limits...

    Read more about Book Review: “The Probability of God”, by Stephen D. Unwin

    Breastfeeding Research and Intention to Treat

    The current issue of The Atlantic features an interesting story by Hanna Rosin called "The Case Against Breastfeeding." Rosin argues that the health benefits of breastfeeding have been overstated by advocates and professional associations and that, given the costs (in mothers' time and independence), mothers should not be made to feel guilty if they decide not to breastfeed for the full recommended period. One of her key points is that observational studies overstate the benefits of breastfeeding by failing to...

    Read more about Breastfeeding Research and Intention to Treat

    breast cancer, rare diseases, and bayes rule, revised

    Happy Thanksgiving!

    Last Thursday, I posted about the recent government recommendations regarding breast cancer screening in women ages 40-49. At least one of you wrote me to say that one of my calculations might have been slightly off (it was), and so I did some more investigation on this issue, as well as on new recommendations on cervical pap smears. (Sorry, it took me a few days to get around to all of this!)

    To back up a second, here's what the controversial new ...

    Read more about breast cancer, rare diseases, and bayes rule, revised

    British Ideal Points

    Mike Kellermann

    We have talked a bit on the blog (here and here) about estimating the ideal points of legislators in different political systems. I've been doing some work on this problem in the United Kingdom, adapting an existing Bayesian ideal point model in an attempt to obtain plausible estimates of the preferences of British legislators.

    The basic Bayesian ideal point...

    Read more about British Ideal Points

    BUCLD: Statistical Learning in Language Development

    Amy Perfors

    The annual Boston University Conference on Language Development (BUCLD), held this year on November 4-6, consistently offered a glimpse into the state of the art in language development. The highlight this year for me was a lunchtime symposium titled "Statistical learning in language development: what is it, what is its potential, and what are its limitations?" It featured a dialogue between three of the biggest names in this area: Jeff Elman at...

    Read more about BUCLD: Statistical Learning in Language Development