BELC (Barcelona English Language Corpus)

Carme Muñoz
English Linguistics and Applied Linguistics
University of Barcelona


Participants: 55
Type of Study: interview / storytelling
Location: Spain
Media type: audio
DOI: doi:10.21415/T5S89C

Browsable transcripts

Download transcripts

Media folder

Citation information

Articles that make use of these data should cite:

For a list of additional publications related to this corpus, please click here.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

The Barcelona English Language Corpus (BELC) has its origin in the Barcelona Age Factor (BAF) project, which examines the effects of age on the acquisition of English as a foreign language.

The BAF Project began at a moment when the changes in the timing of foreign language instruction brought about by a new Education Law were being progressively implemented in both primary and secondary schools around Spain. These changes moved the introduction of the foreign language in primary education from grade 6 (age 11) to grade 3 (age 8). The replacement of the previous curriculum by the new one took eight years, during which it was possible to find pupils who had begun English instruction at the age of 11, under the previous curriculum, and pupils who had begun at the age of 8, under the new curriculum. In addition to these central groups, two other age groups were included in the design of the study: one of adolescents whose initial age of learning English was 14, and one of adults who began instruction in English at the age of 18 or older.

The research on age effects on the learning of English as a foreign language was conducted with students from state schools in Catalonia (Spain). It is important to note that Catalonia is a bilingual community with a majority language, Spanish, known by practically the entire population, and a minority language, Catalan, which is the community language and the language of instruction in the state school system in Catalonia. English is the first foreign language in most schools, and hence the third language of school pupils. It is also important to note that the earlier introduction of the foreign language entailed a decrease in intensity. That is, whereas English had been taught for three hours per week under the former curriculum (beginning in grade 6), at the time of data collection under the new curriculum it was taught for two and a half hours per week on average from grade 3 to grade 10, and for two hours per week in grades 11 and 12. The approximate amount of instruction in English was about 750 hours under the former curriculum, distributed over seven years, and about 800 hours, distributed over ten years, under the new one.

Introduction to the Data

Data were collected at four times: after 200, 416, 726 and 826 hours of instruction (Times 1, 2, 3 and 4, respectively), though only one of the groups was available at all four times (see Table 1 below). There were 2063 subjects in total, but it should be noted that a number of them had received more hours of instruction, either through extracurricular exposure or because they had repeated a grade. Pupils with only school exposure (OSE) fulfilled the conditions for comparison. Table 1 below indicates the number of subjects in each group, the age at which they began instruction in English and each group’s mean chronological age at testing.

Table 1. Characteristics of subjects in the study

Time     Hours    Group A      Group B      Group C      Group D
                  AO = 8       AO = 11      AO = 14      AO = 18+

Time 1   200 h.   A1           B1           C1           D1
                  AT = 10;9    AT = 12;9    AT = 15;9    AT = 28;9
                  N = 284      N = 286      N = 40       N = 91
                  OSE = 164    OSE = 107    OSE = 21     OSE = 67

Time 2   416 h.   A2           B2           C2           D2
                  AT = 12;9    AT = 14;9    AT = 19;1    AT = 39;4
                  N = 278      N = 240      N = 11       N = 44
                  OSE = 140    OSE = 96     OSE = 4      OSE = 21

Time 3   726 h.   A3           B3
                  AT = 16;9    AT = 17;9
                  N = 338      N = 296
                  OSE = 71     OSE = 51

Time 4   826 h.   A4
                  AT = 17;9
                  N = 155
                  OSE = 71

(AO = age of onset; AT = age at testing; N = number of subjects; OSE = only school exposure)
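The total of 2063 subjects quoted above can be reproduced by adding the N values of Table 1 across every time and group, which suggests that the figure counts each group once per testing time. The following is a minimal sketch of that check in Python; the dictionary layout and the function names are illustrative only and are not part of the corpus distribution.

```python
# N values copied from Table 1, keyed by testing time and then by group.
N_BY_CELL = {
    1: {"A": 284, "B": 286, "C": 40, "D": 91},
    2: {"A": 278, "B": 240, "C": 11, "D": 44},
    3: {"A": 338, "B": 296},
    4: {"A": 155},
}


def per_time_totals(table):
    """Return the number of subjects tested at each time, summed over groups."""
    return {time: sum(groups.values()) for time, groups in table.items()}


def total_n(table):
    """Return the sum of N over every time and group in the table."""
    return sum(per_time_totals(table).values())


if __name__ == "__main__":
    print(per_time_totals(N_BY_CELL))
    print(total_n(N_BY_CELL))  # 2063, matching the total given in the text
```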

The data included in BELC correspond to those subjects who could be followed longitudinally and for whom there are two, three or four collection times over a period of seven years, although not all subjects completed all the tasks (see Table 2).


The files in the TalkBank database cover the four collection times and four tasks, and are grouped in folders by task. Each file name gives first the time (1, 2, 3, 4), then the group (A, B, C), then the task (c, i, n, r), then the subject number (L06, etc.).
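The naming scheme just described can be decoded mechanically. The sketch below is hedged throughout: it assumes the four parts are simply concatenated (for example a hypothetical name such as 1AnL06), tolerates an optional "_" or "-" between parts and an optional .cha extension, and assumes the task letters stand for composition, interview, narrative and role-play, as the task names below suggest. None of these details is confirmed by this description, so check them against the actual files before relying on the pattern.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: time digit, group letter, task letter, subject code.
# The optional separators and the ".cha" extension are guesses.
NAME_RE = re.compile(
    r"^(?P<time>[1-4])[_-]?(?P<group>[A-C])[_-]?(?P<task>[cinr])[_-]?"
    r"(?P<subject>[A-Z]\d+)(?:\.cha)?$"
)

# Assumed meaning of the task letters, inferred from the task names.
TASKS = {
    "c": "written composition",
    "i": "oral interview",
    "n": "oral narrative",
    "r": "role-play",
}


class BelcFile(NamedTuple):
    time: int      # collection time, 1 to 4
    group: str     # learner group, A to C
    task: str      # task name looked up in TASKS
    subject: str   # subject code such as "L06"


def parse_belc_name(name: str) -> Optional[BelcFile]:
    """Split a file name into its parts, or return None if it does not fit."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    return BelcFile(
        time=int(match["time"]),
        group=match["group"],
        task=TASKS[match["task"]],
        subject=match["subject"],
    )


if __name__ == "__main__":
    print(parse_belc_name("1AnL06"))   # hypothetical file name
    print(parse_belc_name("notes.txt"))
```

Returning None for names that do not match, rather than raising an error, makes it easy to skip unrelated files when walking a folder.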

Written composition. The written composition dealt with a familiar topic: “Me: my past, present and future”. Students were given a set time (15 minutes), the same for everybody. (Younger and less proficient learners did not use up all the time they were given because of their language limitations.)

Oral narrative. The narrative was elicited from a series of six pictures at which the subjects could freely look before and while they were telling the story in the presence of the researcher. In the story there are two main protagonists, a boy and a girl, who are getting ready for a picnic; a secondary character, their mother; and a character that disappears and later reappears, a dog that gets into the food basket and eats the children's sandwiches.

Oral interview. The interview was semi-guided and began with a series of questions about the subject’s family, daily life and hobbies. This constituted a warm-up phase that helped students feel more at ease. In general, interviewers attempted to elicit as many responses as possible from the learners, and accepted learner-initiated topics in order to create as natural and interactive a situation as possible.

Role-play. The role-play task was performed in randomly chosen pairs. In the role-play one of the students was given the role of the mother/father while the second student was given the role of the son/daughter. The latter had to ask permission to have a party at home and both students were asked to negotiate setting, time, activities (music, eating, drinking), etc. The researcher gave the initial instructions and when needed also elicited talk by reminding learners of topics for discussion or led the task to its completion by asking about the outcome of the negotiation.

Table 2. Spoken tasks performed by BELC longitudinal learners

The main results of the BAF Project so far can be found in the volume Age and the Rate of Foreign Language Learning (see below).

Folders 5 and 6: BAF follow-up study

Publications related to this sub-corpus (among others):

Muñoz, C. (2011). Is input more significant than starting age in foreign language acquisition? International Review of Applied Linguistics (IRAL) 49, 113-133.

Muñoz, C. (2012). The significance of intensive exposure as a turning point in learners’ histories. In C. Muñoz (ed.) Intensive Exposure Experiences in Second Language Learning (pp. 141-160). Multilingual Matters.

Muñoz, C. (2014). Contrasting Effects of Starting Age and Input on the Oral Performance of Foreign Language Learners. Applied Linguistics 35, 463-482.

Muñoz, C. (2014). Starting age and other influential factors: Insights from learner interviews. Studies in Second Language Learning and Teaching 4, 465-484.

Ortega, M. (2016). Crosslinguistic influence in L2 English oral production: the effects of cognitive language learning abilities and input. PhD thesis. Universitat de Barcelona.

Description of the Subjects

The subjects (N=21) constitute a subsample from a larger ongoing project (participants N=232, L1 Spanish and L1s Spanish and Catalan), which explores the influence of such independent variables as starting age, cumulative L2 input and frequency of current contact with an L2, as well as the influence of cognitive abilities (working memory, attention switching capacity and language aptitude), on L2 proficiency and on L2 oral and written performance.

The participants in the subsample presented here (N=21; 6 male, 15 female) were undergraduate students, many of them majoring in English, with an intermediate to advanced level of English. Their average age at first testing was 23.6 (SD 8.3), with a range of 18-52.

This group had at least 6 years of English language learning experience: the average length was 14.2 years (SD 8.2; range 6-38). The mean starting age, defined as the beginning of exposure to English as a foreign language (preschool, primary school or secondary school), was 9.84 (SD 3.33), with a range of 4-15. Most of the participants were multilingual and had been learning an L3 for at least 1 year (mean 2.6, SD 1.2, range 1-5).

Folder 5: L2 written composition

The folder labelled 5-written_college contains the data from the written composition task for the college students. The written composition task dealt with a familiar topic: “My past, present and future expectations”. Students were given 15 minutes to write the task. There is one folder for time1time2 and one for time1time3.

Six participants (N=6) performed the oral production task and the written composition twice, one year apart (Time 1 and Time 2). Fifteen participants (N=15) performed the oral production task and the written composition twice, two years apart (Time 1 and Time 3).

Folder 6: L2 oral production task

The folder labelled 6-narrative_college contains the data from the L2 oral production task. This was a video-retelling task elicited with a video prompt, the 7-minute “Alone and Hungry” episode from a Charlie Chaplin movie. The subjects watched the whole episode once, then watched the first part of the episode (approximately 3.5 minutes) and were asked to retell it. After that, the subjects watched the second part of the episode and retold that part. The transcriptions correspond to the retelling of the first part.

Six participants (N=6) performed the oral production task and the written composition twice, one year apart (Time 1 and Time 2). Fifteen participants (N=15) performed the oral production task and the written composition twice, two years apart (Time 1 and Time 3).

Folder 7 - ELLiC. Ten-year longitudinal study

Citation information:

Tragant, E. & Muñoz, C. (2023). Ten Years of English Learning at School. Palgrave.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by the above reference.

Project description

The source of these texts is the research project ELLiC (English Language Learning in Catalonia), a longitudinal study spanning from grade 1 to grade 10. The ELLiC project started as part of the ELLiE project (years 1-4; see Enever, 2011) and focused on five participating primary schools in the Barcelona area. From G1 to G6, some of the data were gathered collectively from the whole group of students. In those schools, six learners from grade 1 became our focal participants (n = 28), and they were followed year after year until they finished primary education (G6). The ELLiC project continued throughout secondary education and followed a smaller number of the original participants in the eight different high schools to which those learners had transferred. Data collection in those high schools was carried out in the first and last years of secondary education only (G7 and G10). By the time these 28 focal learners were in G10, three students had been lost because they were no longer in their initial high school or could not be contacted. The final number of focal learners from whom we could collect data in both primary school (G1-G6) and secondary school (G7 and G10) is 25.

Introduction to the Data

The compositions were collected from the whole classes at the end of grades 5 and 6, and only from the focal learners in grades 7 and 10. In grade 7 there was an extra data collection in autumn (grade 7A), in addition to the regular one in spring (7S). Students were asked to write a letter to a penfriend telling them about themselves and asking them questions too. They were given 10 minutes but took much less time to complete the task. Names of people and places have been largely anonymized and replaced by pseudonyms.

Who are we?

2014: Our research group (GRAL) consists of the following members. Unless otherwise indicated, all participants are located in the Department of English at the University of Barcelona.

2023: GRAL members and PhD students attached to research projects