
Data from: Compression principle and Zipf’s law of brevity in infochemical communication


Hernandez-Fernandez, Antoni; G. Torre, Ivan (2022), Data from: Compression principle and Zipf’s law of brevity in infochemical communication, Dryad, Dataset,


Compression has been presented as a general principle of animal communication. Zipf's law of brevity is a manifestation of this postulate and can be generalised as the tendency of more frequent communicative elements to be shorter. Previous works supported this claim, showing evidence of Zipf's law of brevity in animal acoustical communication and human language. However, a significant part of the communicative effort in biological systems is carried out in other transmission channels, such as those based on infochemicals. To fill this gap, we seek, for the first time, shreds of evidence of this principle in infochemical communication by analysing the statistical tendency of more frequent infochemicals to be chemically shorter and lighter. We analyse data from the largest and most comprehensive open-access infochemical database known as Pherobase, recovering Zipf's law of brevity in interspecific communication (allelochemicals) but not in intraspecific communication (pheromones). Moreover, these results are robust even when addressing different magnitudes of study or mathematical approaches. Therefore, different dynamics from the compression principle would dominate intraspecific chemical communication, defying the universality of Zipf's law of brevity. To conclude, we discuss the exception found for pheromones in the light of other potential communicative paradigms such as pressures on successful communication or the Handicap principle.


The dataset was obtained by crawling the Pherobase website.

Usage notes

The database is provided as a standard CSV file. No special program or software is needed beyond a simple text editor.
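As a rough illustration of how such a CSV file could be checked for the law of brevity, the sketch below computes a Spearman rank correlation between how often an infochemical occurs and its molecular weight, using only the Python standard library. The column names (`compound`, `frequency`, `molecular_weight`) and the five rows are invented for this example and are assumptions, not the actual layout or contents of the dataset. This is not the authors' code, which is provided separately as a Jupyter Notebook and R files.

```python
import csv
import io

# Toy rows invented for illustration only; NOT taken from the dataset.
# The column names are hypothetical and may differ from the real CSV.
SAMPLE_CSV = """compound,frequency,molecular_weight
A,120,88.1
B,75,102.2
C,40,130.2
D,12,156.3
E,5,204.4
"""


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def load_columns(text, x_col, y_col):
    """Parse CSV text and return two numeric columns as lists of floats."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return [float(r[x_col]) for r in rows], [float(r[y_col]) for r in rows]


freq, weight = load_columns(SAMPLE_CSV, "frequency", "molecular_weight")
rho = spearman(freq, weight)
print(f"Spearman rho = {rho:.3f}")
```

A negative rho would be consistent with the law of brevity (more frequent compounds tending to be lighter). To try this on the real file, one could read its text with `open(path, newline="")` and pass the actual column names, whatever they turn out to be.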

An Excel file with information about classes and species is also provided. These data are results derived from the processed database.

Additionally, a Python Jupyter Notebook for processing the database and obtaining the results is provided. Correlation tests and distribution fitting are performed with the two R files provided. They are available at: