How CBDB collects data and why you do not always find what you want

August 2012

CBDB is a long-term, open-ended project. For the last several years it has focused on collecting data systematically from reliable digital sources, with the aim of exhausting these sources before turning to new projects. These sources include:

  • Biographical indexes 傳記資料索引 for Song (completed), Yuan (completed), and currently for Ming
  • Digital online systems with biographical data for Tang (from Kyoto University), for Ming-Qing (from Academia Sinica), and for women writers (from the Ming-Qing Women’s Writings project)
  • Other indexes, such as birth-death dates for Qing figures and listings of Song local officials

Thus at the moment CBDB does not try in a systematic way to accumulate data through in-depth research on individuals, although some of the sources it uses are based on exhaustive research.

There are several consequences of this approach.

  • Social associations, such as might be known from an individual’s literary collection 文集, are not exhaustive.
  • Career data, the ranks and positions a person held, will be biased toward higher offices.
  • Kin relations, such as might be known from funerary biographies (e.g. 墓誌銘), are not exhaustive.

In addition, different kinds of data from the same source are mined at different times, and it may take a year or more for all the data to appear in CBDB.

CBDB coverage will increase as primary historical documents are mined for data. CBDB offers some financial support to researchers who wish to prepare data for CBDB from their own systematic mining of historical documents. The following projects are currently underway:

  • Tang and Five Dynasties funerary biographies
  • Song dynasty letters as found in the Complete Song Prose 全宋文
  • Biographies in the Standard Histories 正史: currently the Song History biographies 宋史列傳
  • Biographical data in the Collected Essential Documents of the Song 宋會要
  • A complete set of poetic correspondence

To date CBDB has avoided efforts to systematically mine local gazetteers 地方志 because of the regional bias that would result. However, at some point CBDB does aim to mine all biographical data in all gazetteers.

CBDB makes it possible for individual researchers to manually input information. This sometimes results in in-depth coverage of exceptional individuals (for example, Zhu Xi 朱熹, 1130-1200).

CBDB supports projects that mine a source or collection of sources comprehensively; suggestions and inquiries are welcome.

Please see CBDB Sources in the left menu for the detailed listing of sources harvested by CBDB.