Following Gov 2000 or the equivalent course in linear regression, Gov 2001 is the second in the methods sequence for Government Department graduate and undergraduate students. Although the course is not required, most Government graduate students doing empirical work take it. Graduate students in other departments and schools at Harvard (and in the area) also take the course. Undergraduates are especially welcome to take Gov 1002, which is taught along with this class. Non-Harvard students and others may also take this course by registering through the Harvard Extension School, for course credit or as an auditor (see course number Stat E-200).
If there are seats in the room, you're welcome to attend even if you're not formally registered, but if possible we would appreciate it if you would sign up formally (as our teaching fellows get paid more!). If you are not a Harvard student, you can easily do this via Harvard Extension School course Stat E-200 (and there is financial aid if you need it too).
If you need cross-registration papers signed, please bring them to the first class. You don't need permission from us to take the course. We observe that students who take the course for a grade participate more and get far more out of the experience (even among many of those who think or say it will be otherwise), but pass/fail and formal auditing are okay with us too.
For students who've taken a course in linear regression (such as Harvard's Gov 2000), this course gives you the tools to learn new statistical methods or build them yourself. We focus on methods practically useful in real social science research. We aim to give you two types of skills.
First, we show how to develop new approaches to research methods, data analysis, and statistical theory. More advanced statistical theory is not required when data and variables follow standard assumptions. Since this is not usually the case in most of the social sciences, we often cannot use ready-made statistical procedures developed elsewhere and for other purposes. We teach the underlying theory of inference (which, at its most fundamental, is merely using facts you know to learn about facts you don't know); once that theory is understood, we can easily “reinvent” known statistical solutions to accommodate social science data, learn new techniques as they are developed, or even invent original approaches when required. Students will learn how to read an original scholarly article describing a new statistical technique, implement it in computer code, estimate the model with relevant data, understand and interpret the results, and present and explain the results to someone unfamiliar with statistics.
Gov 2000, a course in linear regression (with matrices), or the equivalent. If you know what
means, you're probably ok.
Most in-class experience will be lecture-based, but some parts are designed as a collective experience. This means that other students will be counting on you (and you on them), and so please come to class prepared. If you don't understand something, that's perfectly fine; we'll figure it out together and make sure no one is left behind. But if you don't put in the effort, it will hurt what everyone gets out of the class. If you have a question about one of the readings, post a question in Perusall. If you think you may know an answer to a query another student posted, or have a suggestion, please try to answer it. In fact, if you merely have an interesting idea, please contribute that as well.
The best way, and often the only way, to learn new statistical procedures is by doing. We will therefore make extensive use of a flexible (open-source and free) statistical software program called R and a companion package called Zelig (which we designed for you and those in your position). R is among the most widely used statistical software programs, and Zelig is a widely used package in R. You will learn how to program in this class, if you do not know how already.
For hardware, you are welcome to use your own computers. To install R and Zelig on your computer, see zeligproject.org. You are also welcome to use the HMDC computer labs (in the concourse and on the 3rd floor of CGIS North-Knafel, 1737 Cambridge Street), which have computers with R already installed on them. Harvard affiliates also have the option of registering for a Research Computing Environment (RCE) account through http://hmdc.harvard.edu. Having an RCE account gives you access to HMDC's cluster of servers, which are fast and well-equipped to handle large data sets or time-intensive procedures. In addition, these servers supply a persistent (Linux) desktop environment that is accessible from any computer with an Internet connection.
Most of the probability and statistical theory in this class will be taught in the context of "Monte Carlo simulation" (which we do not expect you to know prior to the course). We will write computer programs to verify, or substitute for, more difficult formal mathematical proofs. This intuitive technique will make it much easier to understand and to implement new statistical methods.
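As a rough illustration of the idea (not taken from the course materials), the sketch below checks by simulation the textbook result that the variance of the mean of n independent draws is σ²/n. It is written in Python using only the standard library so that it is self-contained, whereas the course itself works in R; the function name, seed, sample size, and number of draws are all arbitrary choices made for this example.

```python
import random
import statistics


def simulate_sample_means(n, draws, rng):
    """Return `draws` sample means, each computed from n standard-normal draws.

    Hypothetical helper written for this illustration only.
    """
    means = []
    for _ in range(draws):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Fixed seed so the simulation is reproducible (the seed value is arbitrary).
rng = random.Random(2001)

n = 25          # observations per simulated data set
draws = 20000   # number of simulated data sets

means = simulate_sample_means(n, draws, rng)

# Variance of the simulated sample means versus the analytical value sigma^2 / n.
simulated_var = statistics.pvariance(means)
theoretical_var = 1.0 / n

print(f"simulated variance of the mean:   {simulated_var:.4f}")
print(f"theoretical variance (sigma^2/n): {theoretical_var:.4f}")
```

The two printed numbers should agree closely, and the agreement tightens as `draws` grows. The same pattern (simulate many data sets, compute the quantity of interest on each, summarize) applies when no convenient analytical result is available.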
Each week, you will have reading and problem sets.
Reading assignments will be acquired and done in Perusall.com. You will also collaboratively annotate the readings, asking questions you may have, answering other students' questions, and generally engaging with the material and each other. To get started in Perusall, see Getting Started. We will explain more in Section as well.
We strongly encourage you to work together in groups on the problem sets, so long as you write up your work on your own, by yourself, without having anyone check your work before you hand it in.
Problem sets must be submitted each week by the beginning of section (Wednesday 6PM). The full solution key will be posted so you can review your answers. Because we will be posting solution keys immediately after deadlines, we can't give credit for late work. You can still turn in late work for feedback and help learning the material. The problem sets, including review of the solution keys, are an extremely important part of the learning process, so please keep up and let us know if you have any questions.
The main assignment is to write a research paper that replicates an existing piece of scholarship. The goal of the paper is to apply some advanced method to, or develop one for, a substantive problem in your field of study. You should aim to produce a publishable article, and, in fact, most students do publish their final paper in a scholarly journal. (I know it sounds absurdly hard, but that's only because you haven't learned some of the material we go over in class!) More information about the paper can be found at http://gking.harvard.edu/papers/.
You must find a co-author and a paper to replicate by Wednesday, February 24, at 5pm, by which point you should upload to Canvas a PDF copy of the paper along with a brief paragraph explaining your choice. You are also required to have one of your classmates who is not your co-author sign off on your article choice after checking that your article meets all the criteria listed in "Publication, Publication".
On Wednesday, March 23, you must turn in a draft of the paper with little text but with figures and tables, and a proposed table of contents for your paper, in a relatively polished form. You should also arrange to hand over all of the data and information necessary to replicate the results of your analysis and reproduce your tables and figures. (We will coordinate the exchange of files and code through Canvas and via announcements in class.) On that day, you will hand over your paper and materials to another student we assign to you, and, in exchange, you will receive a different student's paper. Your task for the following week is to replicate the other student's analysis and write a memo to this student (with a copy to us), pointing out ways to make the paper and the analysis better. You will be evaluated based on how helpful, not how destructive, you are.
The final version of the paper is due the day before Reading Period, Wednesday April 27, at 5pm. You must turn in a hard copy of the paper and all data and code (on Canvas). You must also follow standard academic practice and create a permanent replication archive by uploading all your data and code to the Gov2001 Dataverse (http://dvn.iq.harvard.edu/dvn/dv/gov2001).
If you need an extension on the replication paper, you do not need to ask permission: we will accept papers until Thursday, May 5, at 5pm, but since you will have had more time, papers turned in after the original deadline will be graded according to proportionately higher standards. The number of incompletes we plan to give is governed by a Poisson distribution with λ=0.01, so please plan accordingly.
Once all papers are turned in, you aren't quite finished. We will turn over your replication paper to another student and assign you a small set of replication papers to evaluate. Your last assignment for the class will be to read and comment on a fellow group's work and to grade this paper according to certain guidelines we will provide. Your main objective is to give feedback on what changes and improvements need to happen in order for the paper to be published. As always, you will be evaluated based on how helpful, not how destructive, you are. Your comments on your fellow student's paper are due Friday May 13, at 5pm.
One of the best ways that people learn is by teaching and collaborating with others. We facilitate collaboration in several different ways:
This course is being offered as part of the Harvard Extension School's Distance Education Program. The recorded class meetings that you will view are from the Harvard FAS course Government 2001, which meets once per week throughout the term. Even though your participation will take place online, you are responsible for homework, readings, quizzes, and all other work. There will also be weekly on-campus section meetings and office hours for students who are able to attend; those who cannot may watch the videotape of the section. Please see the Harvard Extension School distance education web site for more information.
Students taking the class through the extension school will complete a final exam instead of the replication paper. They will, however, participate in the replication assignment by replicating others' work.
All students will need to have access to the course webpage, which is operated by FAS. If you do not already have a Harvard ID, please make arrangements to get one or to set up an XID.
If you're in town, we'd love to have you in class physically, as long as there is room (and there usually is).
All reading assignments must be acquired and read at the web site Perusall (which can also be accessed through Canvas, Harvard's learning management system). Perusall will enable you to obtain answers to questions instantly and to work collaboratively with other members of class.
All readings are freely accessible to members of this class, except for the main text, [Gary King, Unifying Political Methodology: The Likelihood Theory of Statistical Inference. University of Michigan Press, 1998], which must be purchased at Perusall. In addition to the required text, we will assign a wide variety of scholarly papers.
Reading assignments will be announced at the end of every class.
Help is available when you need it. If you have any questions about the homework, your paper, or anything else related to the course, please use the class discussion forum on the Canvas site. Since all three of us and all students will be reachable via this platform, it's a very efficient way to get answers to questions that do not fit as comments on the video annotation tool sites. Please also respond to inquiries if you happen to know the answer. (You can control how often the platform emails you a digest of the latest Q&A.)
We will also use Canvas to post announcements regarding course logistics, including readings, video assignments, and problem sets.
Final grades will be a weighted average of the replication paper (or, for Extension School students, the final exam), weekly problem sets, annotation grades in Perusall, and participation. (There will be no final exam for students who write the replication paper.)
"Participation" includes preparing for and joining in the discussion in class, coming to class and section, making a serious effort to contribute to the discussion queries in Perusall, and finding other ways of helping your classmates learn more. Finally, since everyone learns more when more connections exist among students, finding ways to help build class camaraderie will also count as part of participation and be very much appreciated by us!
The timeline below gives the outline of the weekly schedule. Students are expected to:
Keeping up with the weekly schedule is extremely important not only for your learning but for the rest of the class as well.
After the foundational material is presented (roughly the first third of the class), I will introduce a large variety of statistical models and methods. I will choose these based on what makes sense from a pedagogical perspective at first, but as the semester goes on I will choose more and more material based on student interest and class projects.
For more information on the content of the class, see the detailed lecture notes online, which give a general outline. Here's another version of some of the material:
We will not get to all these topics, and the list of topics we do cover will likely include others than those listed here, depending on student interest.
I've written up a version of the theory of teaching behind this class in the article "How Social Science Research Can Improve Teaching". You can also watch the accompanying video at the same link. I try to develop new or improved teaching and learning tools every year, and so you'll likely see differences from this description in class.
King, Gary. 1998. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. Ann Arbor: University of Michigan Press.
A variety of papers will be assigned as well.
It is also helpful to have access to a book on R/S programming such as