
Codes: A new approach to interspecific synchrony in population ecology using tail association

Cite this dataset

Ghosh, Shyamolina; Sheppard, Lawrence W.; Reid, Philip C.; Reuman, Daniel C. (2021). Codes: A new approach to interspecific synchrony in population ecology using tail association [Dataset]. Dryad.


Standard methods for studying the association between two ecologically important variables provide only a small slice of the information content of the association, but statistical approaches are available that provide comprehensive information. In particular, available approaches can reveal tail associations, i.e., accentuated or reduced associations between the more extreme values of variables. We here study the nature and causes of tail associations between phenological or population-density variables of co-located species, and their ecological importance. We employ a simple method of measuring tail associations which we call the partial Spearman correlation. Using multidecadal, multi-species spatiotemporal datasets on aphid first flights and marine phytoplankton population densities, we assess the potential for tail association to illuminate two major topics of study in community ecology: the stability or instability of aggregate community measures such as total community biomass and its
relationship with the synchronous or compensatory dynamics of the community’s constituent species; and the potential for fluctuations and trends in species phenology to result in trophic mismatches. We find that positively associated fluctuations in the population densities of co-located species commonly show asymmetric tail associations, i.e., it is common for two species’ densities to be more correlated when large than when small, or vice versa. Ordinary measures of association such as correlation do not take this asymmetry into account. Likewise, positively associated fluctuations in the phenology of co-located species also commonly show asymmetric tail associations. We provide evidence that tail associations between two or more species’ population density or phenology time series can be inherited from mutual tail associations of these quantities with an environmental driver. We argue that our understanding of community dynamics and stability, and of phenologies of interacting species, can be meaningfully improved in future work by taking into account tail associations.
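The abstract names the partial Spearman correlation without defining it. As a rough illustration only, the sketch below assumes the definition given in Ghosh et al. (2020): ranks are normalized to lie strictly between 0 and 1, and only points whose normalized ranks satisfy 2·lb < u + v < 2·ub contribute to the usual Spearman sum. The function name `partial_spearman`, its arguments, and the `rank / (n + 1)` normalization are assumptions made for this example; the code in the repository is the authoritative implementation.

```r
# Sketch of a partial Spearman correlation within the band (lb, ub).
# Assumption: this mirrors the definition in Ghosh et al. (2020); it is not
# copied from the repository code.
partial_spearman <- function(x, y, lb, ub) {
  stopifnot(length(x) == length(y), lb >= 0, ub <= 1, lb < ub)
  n <- length(x)

  # Normalized ranks, strictly inside (0, 1)
  u <- rank(x) / (n + 1)
  v <- rank(y) / (n + 1)

  # Points lying between the two anti-diagonal bounds u + v = 2*lb and u + v = 2*ub
  in_band <- (u + v > 2 * lb) & (u + v < 2 * ub)

  # Portion of the Spearman correlation contributed by the points in the band
  sum((u[in_band] - mean(u)) * (v[in_band] - mean(v))) /
    ((n - 1) * sqrt(var(u) * var(v)))
}

# Simulated example: compare the lower-tail and upper-tail contributions
set.seed(1)
x <- rnorm(200)
y <- x + rnorm(200)
lower_tail <- partial_spearman(x, y, 0, 0.5)
upper_tail <- partial_spearman(x, y, 0.5, 1)
print(c(lower = lower_tail, upper = upper_tail))
```

With the full band (lb = 0, ub = 1) every point is included, so the function returns the ordinary Spearman correlation. A marked difference between the lower-band and upper-band values is the kind of asymmetric tail association the abstract describes.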


Plankton data preprocessing steps were the same as used by Ghosh et al. (Advances in Ecological Research 62:409–468, 2020). First, to reduce the effects of sampling variation on statistical results, we chose the subset of locations for which more than 35 years of data were available for all species. Second, for a given location, we excluded Ceratium species that were undetected for more than 10% of sampled years at that location. Finally, we considered only those locations for which at least two Ceratium species remained. Sea surface temperature data preprocessing was the same as used by Sheppard et al. (EPJ Nonlinear Biomedical Physics 5:1, 2017).

Aphid phenology data were a subset of a larger dataset covering 11 locations, analyzed previously by Sheppard et al. (Nature Climate Change 6:610, 2016) and Ghosh et al. (Advances in Ecological Research 62:409–468, 2020). Data preprocessing was the same as that of Sheppard et al. (2016). Locations were screened by requiring that at least 30 years of data be available for all species, again to reduce sampling variation of statistics; this removed one of the original 11 sampling locations. We also had time series of winter average temperature for each location and year. The winter temperature for year t was the average over December of year t − 1 through March of year t.
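The screening rules and the winter-temperature definition above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the repository's preprocessing code: the function names and the column names (`location`, `species`, `year`, `density`, `month`, `temp`) are hypothetical, a density of 0 is assumed to mean "undetected", `NA` is assumed to mean "not sampled", and temperatures are assumed to be monthly values.

```r
# Hypothetical plankton screening: returns a named list mapping each retained
# location to the species kept there.
screen_plankton <- function(d, min_years = 35, max_undetected = 0.10) {
  keep <- list()
  for (loc in unique(d$location)) {
    dl <- d[d$location == loc & !is.na(d$density), ]
    sp <- as.character(dl$species)

    # Step 1: require more than `min_years` years of data for every species
    n_years <- tapply(dl$year, sp, function(y) length(unique(y)))
    if (length(n_years) == 0 || any(n_years <= min_years)) next

    # Step 2: drop species undetected in more than 10% of sampled years
    frac_zero <- tapply(dl$density == 0, sp, mean)
    ok_species <- names(frac_zero)[as.vector(frac_zero) <= max_undetected]

    # Step 3: keep the location only if at least two species remain
    if (length(ok_species) >= 2) keep[[as.character(loc)]] <- ok_species
  }
  keep
}

# Hypothetical winter mean: December of year t - 1 through March of year t,
# labeled by year t. Winters with fewer than four months of data are dropped.
winter_mean_temp <- function(monthly) {
  w <- monthly[monthly$month %in% c(12, 1, 2, 3), ]
  w$winter_year <- ifelse(w$month == 12, w$year + 1, w$year)
  n_months <- tapply(w$temp, w$winter_year, length)
  means <- tapply(w$temp, w$winter_year, mean)
  complete <- as.vector(n_months) == 4
  setNames(as.vector(means)[complete], names(means)[complete])
}
```

For the aphid data the same screening idea would apply with a threshold of at least 30 years, which corresponds to `min_years = 29` under the strict "more than" comparison used in this sketch.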

Usage notes

The zipped folder is the repository of analyses for the paper "A new approach to interspecific synchrony in population ecology using tail association".

How to compile the code

Knit makefile.Rmd using R Markdown. If all dependencies are in place (see the sections below), this should recompute all analyses from data to paper, resulting in three PDFs: MainText.pdf (the main text of the paper), SuppMat.pdf (the supporting information file for the paper), and makefile.pdf (notes on the compilation process, which can be useful for troubleshooting if compilation fails).

The knit should take about an hour or less on a standard modern laptop. Subsequent knits, if any, can be even faster because packages will already be installed (see below) and because intermediate results are cached. If you try to knit MainText.Rmd or SuppMat.Rmd directly, you may have some limited success, but cross-document references and other features will fail, so this is not recommended. To compile the documents from the command line, use the following:

Rscript -e "library(knitr); knit('makefile.Rmd')"


Core software dependencies

  • R
  • R Markdown
  • RStudio
  • LaTeX
  • BibTeX
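Before knitting, it can help to confirm that the external tools are visible to R. The snippet below is a minimal sketch using only base R. The function name `check_core_dependencies` is hypothetical, and the executable names `pdflatex`, `bibtex`, and `pandoc` are assumptions about how these tools are installed on a given system. In particular, RStudio may bundle its own pandoc that is not on the PATH, in which case the check reports FALSE even though knitting from RStudio works.

```r
# Hypothetical preflight check: reports which command-line tools are on the
# PATH and which R packages can be loaded. It installs nothing.
check_core_dependencies <- function(tools = c("pdflatex", "bibtex", "pandoc")) {
  # Sys.which() returns "" for executables it cannot find
  on_path <- nzchar(Sys.which(tools))
  names(on_path) <- tools

  # requireNamespace() returns FALSE (without error) if a package is missing
  pkgs <- c("rmarkdown", "knitr", "checkpoint")
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

  list(r_version = R.version.string, tools = on_path, packages = installed)
}

print(check_core_dependencies())
```

Any FALSE entry points to a dependency that probably needs attention before knitting makefile.Rmd.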

Data dependencies

Datasets are not included in the Data folder and need to be obtained and put there for the code to run. A dataset that includes the plankton data we used, as a subset, can be obtained from the Dryad Digital Repository. We do not have the rights to release the aphid data, so those data are not in the repository. The aphid data came from the Rothamsted Insect Survey (RIS) of Rothamsted Research. The plankton data came from the Continuous Plankton Recorder (CPR) dataset of the Marine Biological Association of the UK. Both organizations have clear policies for sharing data on their websites. James Bell is our contact at RIS and P. Chris Reid is our contact at CPR. If a user obtains written permission from these organizations, we will be happy to provide these datasets in the format expected by the repository code.

Dependencies on the R checkpoint package

The code uses the R checkpoint package. This is set up in the master file makefile.Rmd in the R chunk checkpoint_chunk, which contains the following line of code specifying a date:

checkpoint("2019-01-01",checkpointLocation = "./")

The checkpoint package then automatically scans through other files looking for other required R packages. It then downloads and installs the versions of those packages that were available on the given date. This helps ensure that re-compiling the document uses exactly the same code that was originally used, in spite of package updates and other changes. This can take some time on the first run, but it is faster on subsequent runs because the packages are already installed. It also means that checkpoint should be the only R package that needs to be installed by hand, since it scans for the other packages and installs them locally. The local package library uses about 300 MB of disk space.

Dependencies on pandoc

The open source program pandoc converts documents from one format to another. Here, the knitr package uses it to convert the markdown files into LaTeX format so that they can then be turned into PDF files. Installers for multiple operating systems are available from the pandoc website.

Dependencies on pdflatex

The makefile makes a system call to pdflatex, so a LaTeX distribution that provides pdflatex needs to be installed.

Additional dependencies?

If you find that additional dependencies were needed on your system, please let us know. The compilation process was tested by Ghosh on Ubuntu 16.04 and by Reuman on a similar computing setup. It has not been tested on other platforms. We have endeavored to list all dependencies we can think of above, but we have only compiled on our own machines, so we cannot guarantee that additional dependencies will not also be needed on other machines, even after data are included (see above). This repository is intended to record a workflow, and is not designed or tested for distribution and wide use on multiple platforms. It is not guaranteed to work on the first try without any hand-holding on arbitrary computing setups.

Intermediate files

Knitting the makefile automatically produces many intermediate files. Files ending in .tex are the documents converted from .Rmd, including all the R code output. The rest (files ending in .log, .aux, .lof, .lot, .toc, and .out) are files that pdflatex uses to keep track of various parts of the document. Some of these can be useful for diagnosing problems, if any arise.
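To inspect the pdflatex bookkeeping files after a knit, a small helper like the one below may be convenient. It is a sketch, not part of the repository: the function name `list_intermediate_files` is hypothetical, and the default extension list simply repeats the extensions named above. The .tex files are deliberately left out because they contain the converted documents.

```r
# Hypothetical helper: list pdflatex bookkeeping files in a directory by extension.
list_intermediate_files <- function(dir = ".",
                                    exts = c("log", "aux", "lof", "lot", "toc", "out")) {
  # Build a regular expression such as "\\.(log|aux|...)$"
  pattern <- paste0("\\.(", paste(exts, collapse = "|"), ")$")
  list.files(dir, pattern = pattern, full.names = TRUE)
}
```

Calling `list_intermediate_files()` in the repository folder shows what was produced. Passing the result to `file.remove()` would delete those files, which is probably best done only after a successful compile, since the .log files in particular help when diagnosing failures.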


We thank the many contributors to the large datasets we used; D. Stevens and P. Verrier for data extraction; and Joel E. Cohen, Lauren Hallet, and Jonathan Walter for helpful suggestions. We thank James Bell of the Rothamsted Insect Survey (RIS). The RIS, a UK Capability, is funded by the Biotechnology and Biological Sciences Research Council under the Core Capability Grant BBS/E/C/000J0200. SG, LWS and DCR were partly funded by US National Science Foundation grants 1714195 and 1442595 and the James S McDonnell Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the McDonnell Foundation.


National Science Foundation, Award: 1714195

James S. McDonnell Foundation

National Science Foundation, Award: 1442595