
Do children learn from their mistakes? A registered report evaluating error-based theories of language acquisition

Cite this dataset

Fazekas, Judit; Jessop, Andrew; Pine, Julian; Rowland, Caroline (2020). Do children learn from their mistakes? A registered report evaluating error-based theories of language acquisition [Dataset]. Dryad.


Error-based theories of language acquisition suggest that children, like adults, continuously make and evaluate predictions in order to reach an adult-like state of language use. However, while these theories have become extremely influential, their central claim, that unpredictable input leads to higher rates of lasting change in linguistic representations, has scarcely been tested. We designed a prime surprisal-based intervention study to assess this claim.

As predicted, both 5- to 6-year-old children (n=72) and adults (n=72) showed a pre- to post-test shift towards producing the dative syntactic structure they were exposed to in surprising sentences. The effect was significant in both age groups together, and in the child group separately when participants with ceiling performance in the pre-test were excluded. Secondary predictions were not upheld: there were no verb-based learning effects, and there was reliable evidence for immediate prime surprisal effects only in the adult group, not in the child group. To our knowledge this is the first published study demonstrating enhanced learning rates for the same syntactic structure when it appeared in surprising as opposed to predictable contexts, thus providing crucial support for error-based theories of language acquisition.


This dataset is associated with a psycholinguistics study targeting error-based learning theories. The dataset contains video descriptions from 72 adults and 72 children aged 5 to 6 years. The study was carried out in the form of a bingo game. Data collection took place either in the participants' school or in the departmental laboratories. The responses were originally audiotaped; the relevant parts of the audio recordings were then transcribed and the responses were coded (as double object dative, prepositional dative or other response). The transcription and coding procedure is discussed in the associated manuscript. The data uploaded here is a CSV file (Prediction_learning_dataset.csv) featuring the transcribed responses from the anonymised participants and the response codes associated with them. The responses were not processed in any additional way other than being transcribed from the audio recordings.

Usage notes

The dataset (Prediction_learning_dataset.csv) can be used in combination with the ReadMe file (ReadMe for prediction_learning_dataset.docx), which describes the predictors featured in the different columns of the data file. There are missing values and excluded participants in this dataset. The exclusion criteria are detailed in the corresponding manuscript (2.1. Participants, 2.7. Coding and 3. Statistics and data analyses).
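Because the file contains missing values, it can help to count them per column before running any analysis. The following is a minimal sketch using only the Python standard library. The set of strings treated as "missing" and the column names in the in-memory sample are assumptions made for illustration; the actual column names and missing-value convention are documented in the ReadMe file, not here.

```python
import csv
import io
from collections import Counter

# Assumption: cells that are empty or hold one of these strings count as missing.
# Check the ReadMe file for the convention actually used in the dataset.
MISSING_MARKERS = {"", "NA", "NaN"}


def count_missing(csv_file):
    """Return (number of data rows, Counter of missing cells per column)."""
    reader = csv.DictReader(csv_file)
    missing = Counter()
    n_rows = 0
    for row in reader:
        n_rows += 1
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row; ignored in this sketch.
                continue
            # DictReader fills cells absent from a short row with None.
            if value is None or value.strip() in MISSING_MARKERS:
                missing[column] += 1
    return n_rows, missing


# Tiny in-memory stand-in for the real file; these column names are hypothetical.
sample = io.StringIO("participant,response_code\nP01,DOD\nP02,\nP03,NA\n")
n_rows, missing = count_missing(sample)
print(n_rows, dict(missing))
```

To apply the same function to the real data, open Prediction_learning_dataset.csv with `open(..., newline="")` and pass the file object to `count_missing`. The file's text encoding is not stated on this page, so it may need to be specified explicitly.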

In addition to the main dataset, the repository contains the scripts for the analyses described in the manuscript, the original data file these analyses were run on (All_data_JF.csv), the sentence lists and the testing logs. All_data_JF.csv is identical to Prediction_learning_dataset.csv, except that all columns are named in the latter file while the former contains unlabelled columns.


Economic and Social Research Council