Running Experiments

Building a Corpus of Spontaneous Speech in Tsez

We plan to create a corpus of spontaneous speech in Tsez, an endangered language of the Caucasus spoken by about 6,000 people, and in three endangered Mayan languages. The project will involve collecting, transcribing, and annotating the data so that other researchers can use them. We will then compare speech in these languages with spoken production in several heritage languages (Russian, Chinese, Avar, Spanish, and Mam), whose corpora will also be transcribed and annotated.