Syllabus

This is the syllabus for GOV 2000(e)/1000/E-2000, Quantitative Methods for Political Science I. In addition to the links to the right, the syllabus is available as gov2000syll12_11_13.pdf.

Overview and Class Goals

How can we detect voting irregularities? What causes individuals to vote? In what sense (if any) does democracy (or trade) reduce the probability of war? Quantitative political scientists address these questions and many others by using and developing statistical methods that are informed by theories in political science. In this course, we provide an introduction to the tools used in basic quantitative political methodology. The first four weeks of the course cover introductory univariate statistics, while the remainder of the course focuses on linear regression models. Furthermore, the principles learned in this course provide a foundation for the future study of more advanced topics in quantitative political methodology.

While the tools of statistical inference are worth studying in their own right, the primary goal of this course is to provide graduate students (and some undergraduates) with the necessary skills to critically read, interpret, and replicate the quantitative content of many political science articles. As such, the statistical methods covered in this course will be presented within the context of a number of articles. Throughout the term, we will reanalyze the data and revisit the conclusions from

  • “Racial Prejudice and Attitudes Toward Affirmative Action,” by James H. Kuklinski, Paul M. Sniderman, Kathleen Knight, Thomas Piazza, Philip E. Tetlock, Gordon R. Lawrence, and Barbara Mellers, The American Journal of Political Science, 1997.
  • “Measuring Political Preferences,” by Lee Epstein and Carol Mershon, American Journal of Political Science, 1996.
  • “Female Socialization: How Daughters Affect Their Legislator Fathers’ Voting on Women’s Issues,” by Ebonya L. Washington, The American Economic Review, 2008.
  • “Social Pressure and Voter Turnout: Evidence from a Large-Scale Field Experiment,” by Alan S. Gerber, Donald P. Green, and Christopher W. Larimer, The American Political Science Review, 2008.
  • “Law and Data: The Butterfly Ballot Episode,” by Henry E. Brady, Michael C. Herron, Walter R. Mebane Jr., Jasjeet Sekhon, Kenneth W. Shotts, and Jonathan Wand, PS: Political Science and Politics, 2001.
  • “The Classical Liberals Were Right: Democracy, Interdependence, and Conflict 1950-1985,” by John R. Oneal and Bruce M. Russett, International Studies Quarterly, 1997.
  • “MPs for Sale? Returns to Office in Postwar British Politics,” by Andrew Eggers and Jens Hainmueller, The American Political Science Review, 2009.

Who Takes this Course?

GOV 2000 is designed for students who already have some background in statistics/mathematics/computing, or for beginners who are looking for a challenge. Students taking this section of the course will learn to be flexible data analysts, capable of tailoring standard methods to the unique specifications of each task. As such, these students will be asked in the problem sets to write/adjust the code necessary to replicate and critique results from the literature. This section of the course will be taught within the R statistical computing environment.

GOV 2000e is designed for students with a limited background in statistics/mathematics/computing. Students in this section of the course will focus on the analysis and critique portions of the assignments. This section of the course will be taught with the Stata statistical software, and the students will be provided with the additional code necessary to replicate and critique results from the literature.

GOV 1000 is intended for undergraduate students and will be taught with the Stata statistical software as the default option. Undergraduate students may choose to use R instead, but will be responsible for some additional questions on problem sets.

GOV E-2000 is designed for Harvard Extension School students. This section of the course will be taught with the R statistical computing environment as the default with the belief that concepts such as statistical simulation, which are heavily used in GOV E-2001, are important skills that students should take away from the course. However, we will consider accommodating requests to complete the problem sets using Stata statistical software.

Prerequisites and Recommended Preparation

The prerequisites differ across type of student. For graduate students in the Government Department, there are no prerequisites. For other graduate students, undergraduate students, and Extension School students, the prerequisite is GOV 50, GOV E-1005, or the equivalent.

For any student who meets the prerequisites yet is concerned about his or her preparedness for the course, we strongly encourage the following in advance of the semester. First, we recommend reading and working through the exercises in David Freedman, Robert Pisani, and Roger Purves, Statistics, 2007 (any of the older editions should suffice as well). Next, we encourage familiarization with the appropriate statistical package - R or Stata - for the section of the course the student intends to take. Moreover, if the student plans to typeset problem set answers in LaTeX, familiarity with the LaTeX markup language would be helpful. Resources on R, Stata, and LaTeX are available here.

Class Requirements

Grades will be based on

  • weekly homework assignments (50% of final grade)
  • a midterm exam (10% of final grade)
  • a cumulative take-home final exam (30% of final grade)
  • participation, presentation, annotation, and reading comments (10% of final grade).

I will not give incompletes in this course.

Homework

The weekly homework assignments will consist of analytical problems, computer work, and data analysis. The 2000 section of the course will have additional problems. For all sections, the homework will be assigned on Tuesday night and due the following Tuesday at 1:00pm. Solutions will be posted on Tuesday night, and students will have one week to “self correct” their homework on the basis of the solution key (due the following Tuesday at 1:00pm). These corrections should take the form of comments added to the original homework that indicate where mistakes were made and that demonstrate an understanding of those mistakes.

The homework write-up must be word processed (MS Word is fine), with tables and figures incorporated in the document. No late homework will be accepted except in the case of a documented emergency. All sufficiently attempted homework will be graded on a (+,✓,-) scale, and all sufficiently student-corrected homework will recover half credit (e.g., homework that receives a ✓ and is sufficiently corrected will receive a final grade of ✓/+). All sufficiently attempted homework will be typed and well organized with all problems attempted, and all sufficiently corrected homework will include typed and well organized comments integrated into the original homework. The instructor will determine sufficiency in borderline cases.

We encourage students to work together on the homework assignments, but you must write your own solutions (this includes computer work), and you must write the names of your collaborators on your assignment. We also strongly suggest that you make a solo effort at all the problems before consulting others. The midterm and the final will be very difficult if you have no experience working on your own.

Midterm

The midterm will be a short checkout exam (5 hours) that should take only a few hours to complete and involves only short analytical problems. This exam will be available for checkout one week after we finish the material on univariate statistics, and it is designed to ensure that all students understand the foundational material before we move to regression.

Participation

Ten percent of the grade will be awarded for class participation, quality of presentation on the homework, annotation, and reading comments. A preliminary version of the lecture slides will be posted on Friday with references to pages of the textbook on the slides. Students will do the assigned reading and use the annotation tool to append questions/comments to the slides by Sunday evening. Additionally, students are required to answer a very short set of questions regarding the assigned reading by Sunday evening. The annotations and reading questions provide feedback for tailoring the Tuesday lecture to the needs of the students in the course.

Take-home Final

The take-home final exam will be handed out on Wednesday, December 5, one week before the last day of reading period. It will be due at 5:00pm on Wednesday, December 12, the last day of reading period. The take-home final is an exercise in guided replication and primarily involves data analysis and interpretation. Note that the format and goals for the take-home exam are very different from the format and goals for the midterm exam.

Discussion Sections

There will be two discussion sections for this course. The 2000 section will discuss the concepts and R code needed to complete the GOV 2000 and GOV E-2000 homework. The 2000e/1000 section will discuss the concepts and Stata procedures needed to complete the GOV 2000e and 1000 homework. Both sections will be recorded, and all students are welcome to sit in on both sections of the course.

Course Mailing List

The course mailing list is gov2000-list@lists.fas.harvard.edu. Please subscribe to the list here. If you have trouble subscribing to the list, please email Andy and Konstantin within the first week of the course. This is an ideal forum for posting questions regarding the course material and/or computing. We encourage students to reply to each other’s questions, and a student’s respectful and constructive participation on the mailing list will count toward his/her class participation grade.

Office Hours

Adam Glynn's office hours will immediately follow lecture on Tuesdays from 4:00 - 6:00pm.

The office hours for Andy Hall and Konstantin Kashin will be determined during the first week of the course. Both will be held in the HMDC computer lab. If you have questions about the course material, computational issues, or other course-related issues, please do not hesitate to set up an appointment with Adam, Andy, or Konstantin.

If you have a general question, you can also send it to the course mailing list. This is almost always the fastest way to get an answer. However, you can also email Adam directly at aglynn@fas.harvard.edu. If the question is of general interest, he will forward the question and the answer to the list. Make sure to mention explicitly in your email if you would like to stay anonymous.

Readings

In addition to several articles that will be distributed in pdf form on the course website, the required books for the course are:

  • (ALZ) Ashenfelter, Orley, Levine, Philip, and Zimmerman, David. 2003. Statistics and Econometrics: Methods and Applications. John Wiley & Sons.
  • (GH) Gelman, Andrew and Hill, Jennifer. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

Note: Most of the required material from ALZ can also be found in the Wooldridge text listed first in the optional books section.

Optional Books

The following books are optional but may prove useful to students looking for additional coverage of some of the course topics.

  • Wooldridge, Jeffrey. 2000. Introductory Econometrics. New York: South-Western.
  • Fox, John. 1997. Applied Regression Analysis, Linear Models, and Related Methods. Thousand Oaks, CA: Sage.
  • Fox, John. 2002. An R and S-PLUS Companion to Applied Regression. Thousand Oaks, CA: Sage.
  • Gill, Jeff. 2006. Essential Mathematics for Political and Social Research. 1st Edition. 2nd printing. New York: Cambridge University Press.
  • Weisberg, Sanford. 2005. Applied Linear Regression. 3rd Edition. Hoboken, NJ: John Wiley.
  • Freedman, David; Robert Pisani; and Roger Purves. 1998. Statistics. 3rd Edition. New York: Norton.
  • Agresti, Alan and Finlay, Barbara. 1997. Statistical Methods for the Social Sciences. Upper Saddle River, NJ: Prentice Hall.
  • Cleveland, William S. 1993. Visualizing Data. Summit, NJ: Hobart Press.
  • Simon, Carl and Blume, Lawrence. 1994. Mathematics for Economists. New York: Norton.
  • Kennedy, Peter. 2003. A Guide to Econometrics. 5th Edition. Malden, MA: Blackwell.
  • Wonnacott, Thomas H. and Ronald J. Wonnacott. 1990. Introductory Statistics. 5th Edition. New York: Wiley.
  • Venables, W.N. and B.D. Ripley. 2002. Modern Applied Statistics with S-PLUS. New York: Springer.
  • Gonick, Larry and Smith, Woollcott. 1993. The Cartoon Guide to Statistics. New York: Harper.
  • Tufte, Edward. 2001. The Visual Display of Quantitative Information, 2nd Edition. Cheshire, CT: Graphics Press.

Outline

I. Descriptive Inference

  1. Introduction
    1. Overview and Course Requirements
    2. Course Outline
    3. Introductory Sampling Activity
  2. Descriptive Questions
    1. Describing Univariate Populations
    2. Describing Bivariate and Multivariate Populations
    3. Summarization with Bivariate and Multivariate Regression
  3. Randomly Sampled Observations and Basic Probability
    1. Elementary Probability Theory
    2. Random Variables and Functions of Random Variables (Expectation, Variance, ...)
    3. Joint and Conditional Distributions
  4. Random Samples and Descriptive Inference (Univariate)
    1. Simple Random Sampling (with and without replacement)
    2. Distribution of the Sample as an Estimate of the Population Distribution
    3. Sample Statistics
    4. Sampling Distributions
    5. Point Estimation
    6. Interval Estimation (i.e., confidence intervals)
    7. Hypothesis Testing
  5. Random Samples and Descriptive Inference (Regression)
    1. Simple Random Sampling (with and without replacement)
    2. Stratified Random Sampling (with and without replacement)
    3. Distribution of the Sample as an Estimate of the Population Distribution
    4. Sample Statistics and Sampling Distributions
    5. Point Estimation and Interval Estimation
    6. Hypothesis Testing
  6. Diagnosing and/or Fixing Problems (Part 1)
    1. Nonlinearity
    2. Nonconstant Error Variance and Correlated Errors
    3. Weighted Least Squares and Generalized Least Squares
    4. "Robust" Standard Errors
    5. Nonnormality
    6. Unusual Observations (leverage points, outliers, and influence points)
  7. Diagnosing and/or Fixing Problems (Part 2)
    1. Data Missing at Random (conditional on observed data)
    2. Bounding and Sensitivity Analysis for Data Not Missing at Random

II. Causal Inference

  1. Introduction
    1. Potential Outcomes and Causal Effects
    2. Causal Inference as a Missing Data Problem
    3. Introductory Causal Inference Activity
  2. Causal Questions
    1. Describing Univariate Distributions of Potential Outcomes
    2. Describing Univariate Distributions of Causal Effects
    3. Conditional Distributions of Potential Outcomes and Causal Effects
    4. Causal Questions Not Addressed in the Course
  3. Randomized Treatment Assignment
    1. Identification with Randomized and Conditionally Randomized Treatment Assignment
    2. Estimation, CIs, and Testing with Randomized and Conditionally Randomized Treatment Assignment
  4. Observational Studies with Measured Confounding
    1. The Assumption of No Unmeasured Confounding
    2. Relation to Classical Econometric Assumptions
    3. Choosing Conditioning Variables
    4. Regression Based Estimation (Additive and Interactive)
  5. Diagnosing and/or Fixing Problems
    1. Assessing Overlap and Balance
    2. Revising the Question of Interest
    3. Bounding and Sensitivity Analysis for Unmeasured Confounding