Abstract: While it is becoming easier to collect and store all kinds of data, including personal medical data, scientific data, and commercial data, relatively few people are trained in the statistical and machine learning methods required to test hypotheses, make predictions, and otherwise create interpretable knowledge from these data. The automatic statistician project aims to build an artificial intelligence for data science, to help people make sense of their data and to uncover challenging research problems in automatic data analysis. I will discuss an early version of the system, which can build statistical models from an open-ended language of models and then describe them in natural language. I will briefly review the class of regression models that the system constructs and how their properties allow for a modular description generation algorithm. The talk will conclude with examples of the system's output and a discussion of future research directions.