Vercellotti Corpus

Mary Lou Vercellotti
Department of English
Ball State University


Participants: 188
Type of Study: classroom
Location: Pittsburgh
Media type: audio
DOI: 10.21415/T5W88X

Browsable transcripts

Download transcripts

Media folder

Citation information

Publications using these data should cite:

Additional publications include:

Vercellotti, M. L., & Packer, J. (2016). Shifting structural complexity: The production of clause types in speeches given by English for Academic Purposes students. Journal of English for Academic Purposes, 22, 179-190.

Vercellotti, M. L. (2018). Finding variation: Assessing the development of syntactic complexity in ESL speech. International Journal of Applied Linguistics, 1-15. DOI: 10.1111/ijal.12225

Vercellotti, M. L., & McCormick, D. E. (2018). Self-correction profiles of L2 English learners: A longitudinal multiple-case study. TESL-EJ, 22(3), 1-25.

Vercellotti, M. L., Juffs, A., & Naismith, B. Multiword sequences in English language learners' speech: The relationship between trigrams and lexical variety across development. System, 98.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This project investigates the development of complexity, accuracy, and fluency in the speech of English language learners. The research is classroom-based and pedagogy-driven. Participants were adult learners entering an Intensive English Program (IEP) in the United States during 2010. The longitudinal data were collected during class meetings, as part of the speaking curriculum at every instruction level in the IEP. The data are two-minute monologues on a given topic, collected from each student in the IEP multiple times per academic semester; many students remained in the IEP across multiple semesters. The topics varied by semester, by level, and sometimes by class section for pedagogical reasons. The speeches were transcribed by a native speaker experienced in transcribing non-native speech and segmented into sentence-level units (AS-units), following Foster, Tonkyn, and Wigglesworth (2000).


Andrew Yankes reformatted this corpus to accord with current versions of CHAT.