Linguistic laws in speech: the case of Catalan and Spanish
Cite this dataset
González Torre, Ivan; Hernández-Fernández, Antoni; Garrido, Juan-María; Lacasa, Lucas (2019). Linguistic laws in speech: the case of Catalan and Spanish [Dataset]. Dryad. https://doi.org/10.6071/M3XW9T
In this work we use an oral corpus of Catalan and Spanish (the Glissando Corpus) to explore four classical linguistic laws in speech (Zipf's law, Herdan's law, the brevity law, and the Menzerath-Altmann law), both in physical units and in symbolic units measured in speech transcriptions. We also review two recently reformulated laws: the lognormality law and the size-rank law. Our results add empirical evidence from two more languages for the 'physical hypothesis', according to which linguistic laws could be explained by physical laws and the principles of information theory. On this view, linguistic laws would have an oral origin, and the evidence found in written texts would be a byproduct of the complexity of speech.
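As a rough illustration of what testing two of these laws on symbolic units can look like, the sketch below estimates a Zipf exponent (frequency against rank) and a Herdan exponent (vocabulary size against text length) from a list of transcribed tokens. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the input is assumed to be an already tokenized transcription, and the exponents come from a plain least-squares fit in log-log space, which is a common simplification rather than a rigorous estimator.

```python
import math
from collections import Counter


def loglog_slope(xs, ys):
    """Ordinary least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


def zipf_exponent(tokens):
    """Estimate alpha in f(r) ~ r**(-alpha) from the rank-frequency list.

    Hypothetical helper: assumes at least two distinct token types.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    ranks = range(1, len(freqs) + 1)
    return -loglog_slope(ranks, freqs)


def herdan_exponent(tokens):
    """Estimate beta in V(L) ~ L**beta from vocabulary growth.

    Hypothetical helper: V(L) is the number of distinct types seen
    in the first L tokens; assumes at least two tokens.
    """
    seen = set()
    lengths, vocab = [], []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        lengths.append(i)
        vocab.append(len(seen))
    return loglog_slope(lengths, vocab)
```

Applied to the word tokens of a transcription, `zipf_exponent` and `herdan_exponent` each return a single fitted exponent. The same fitting helper could in principle be applied to physical units such as durations, but how those are extracted from the corpus is not described here.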