Abstract: Severe degree heterogeneity is a universal phenomenon in large social networks. However, the degree parameters are largely nuisance parameters with respect to our main interest, and their effects can be carefully removed with proper statistical strategies. In the first part of the talk, I will take mixed-membership estimation as an example and present several useful ideas for dealing with degree heterogeneity. We assume the network has K perceivable communities. Each node is associated with a K-dimensional “membership” vector whose entries describe the node’s “weights” on the different communities. The goal is to estimate these membership vectors. We adopt a degree-corrected mixed-membership model and propose a spectral method that is conceptually simple, computationally fast, and rate-optimal.

In the second part, I will showcase a dataset we have collected on the coauthorship and citation networks of statisticians. The dataset consists of the meta-information (e.g., authors, abstracts, and citation counts) of about 70,000 papers in 36 representative journals in statistics and related fields, published from 1984 to 2015. The dataset provides fertile ground for methodological comparisons and for scientific discoveries. We report some exploratory data analysis (EDA) results, such as productivity, journal-to-journal citation exchanges, and citation patterns of individual papers.
Tracy Ke is an Assistant Professor of Statistics at Harvard University.