Data from: A multi-trait analysis of the relationship between parasitism and female preference for orange in Trinidadian guppies (Poecilia reticulata)
Data files
Mar 26, 2026 version, 200.53 KB in total
- Cleaned_for_Preference_Calcs.csv (14.89 KB)
- For_Orange_Calculations.csv (7.10 KB)
- For_Path_Analyses_with_Sources.xlsx (12.61 KB)
- For_Path_Analyses.csv (1.46 KB)
- R_Code.Rmd (60.95 KB)
- Raw_for_Preference_Calcs.csv (86.01 KB)
- README.md (17.51 KB)
Abstract
Several studies suggest that parasite-imposed selection favours elaborate sexual ornaments, as posited by the Hamilton-Zuk hypothesis. However, few have examined the prediction that selection by parasites also promotes heightened female preferences. We explored this prediction by asking whether the strength of female mate preference for the area of male orange colouration in wild Trinidadian guppy populations was associated with Gyrodactylus parasite infection metrics. We further examined how environmental factors might affect sexual selection, parasitism, and their relationship in wild guppies. Our study, based on analyses of between 9 and 17 populations, offers preliminary evidence supporting the Hamilton-Zuk hypothesis and indicates interesting avenues for further research. We found that female preferences were stronger in populations exposed to higher Gyrodactylus intensities, but only if females preferred males with more orange colouration. Orange area was not associated with any parasite metric. This ornament varied with fish community composition, but the strength of female preference did not. Finally, Gyrodactylus prevalence increased with human disturbance, and intensity decreased in populations exposed to higher predation intensities and fish species richness. Our results suggest that parasitism may be one of several factors influencing sexual selection in guppies.
https://doi.org/10.5061/dryad.stqjq2cck
This dataset contains the data and code necessary to replicate the analyses in Clark et al. (2026), which test the relationship between Gyrodactylus parasitism and the strength of female mate preference for the relative area of male orange colouration in the Trinidadian guppy, Poecilia reticulata. The data cover 17 geographically isolated guppy populations from Trinidad’s Northern Range. The data were analyzed using structural equation modelling (path analysis). The analyses found that female preferences are positively associated with Gyrodactylus intensity across populations, but only if females prefer males with more orange colouration. The results further indicate complex relationships between relative orange area, the strength of female preference, parasitism, and environmental variables such as fish community composition, river drainage basin, and the level of human disturbance. The paper thus supports the conclusion that parasitism may be one of several factors influencing sexual selection in guppies.
Description of the data and file structure
The dataset consists of five data files (four CSV files and one Excel workbook) containing the data used in the analyses, an R Markdown file containing the code necessary to replicate the analyses, and the Electronic Supplementary Materials for the paper.
Files and variables
File: For_Path_Analyses.csv
This file includes the data used for the multivariate structural equation modelling (path analysis). Each row represents a unique guppy population. Missing values are coded as "NA".
Variables
- drainage: the river drainage basin in which the guppy population is located.
- River: the river in which the guppy population is located.
- Name: the unique label for each population that identifies the river and predation level for the population (LP = low-predation, HP = high-predation). In cases where there were multiple populations from the same river with the same predation level, the populations were numbered.
- Predation: a binary “low” vs. “high” classification of the degree of predation risk for guppies, determined based on the presence of large fish predators, particularly Crenicichla frenata
- community: a scaled variable generated using correspondence analysis that combined the “Predation” variable (transformed into a binary variable where 0 = low, 1 = high) with the total number of fish species recorded at the site (i.e., species richness)
- Prevalence: the proportion of the guppy population infected by Gyrodactylus ectoparasites
- Intensity: the average number of individual Gyrodactylus parasites per infected guppy
- Preference: the strength of female mate preference for the area of orange colouration on males, measured as the partial regression coefficient of male attractiveness on male orange area. Male attractiveness is defined by the degree of female responsiveness and attention to males.
- Direction: a binary variable describing the sign of the “Preference” variable. “pos” = positive, “neg” = negative. In “pos” populations, males with more orange colouration were preferred, while in “neg” populations, they were disliked.
- OrangeArea: the relative area of orange colouration on males, calculated by dividing the area of orange colouration on the body by the total body area. Note that the body excludes all fins.
- Canopy: the estimated percentage of the water surface where the guppies were collected that is exposed to direct sunlight
- PopSize_Ranks: the guppy population size, transformed into a rank variable. The population size rank was estimated based on the guppy numbers in the collection pools and the expected dispersal distance.
- Density: the number of guppies per m2
- disturbance: the binary disturbance level, where 1 is “no-to-low” disturbance, and 2 is “medium-to-high” disturbance. Disturbance was estimated based on various indicators of human presence, like agriculture, recreational activities, and pollution.
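As a minimal sketch of how the "NA" coding and the “Direction” variable fit together, the Python snippet below parses a few made-up rows and re-derives “Direction” from the sign of “Preference”. The sample values, the population labels, and the helper names `parse_number` and `direction_from_preference` are hypothetical and are not taken from the dataset. The authors' own workflow is in R (see “R_Code.Rmd”), so this is only an independent illustration.

```python
import csv
import io

# Hypothetical rows using column names from the variable list above;
# the values and population labels are invented for illustration only.
SAMPLE = """Name,Preference,Intensity
Site A LP,0.42,1.8
Site B HP,-0.15,NA
Site C HP,NA,2.3
"""

def parse_number(text):
    """Convert a CSV field to float, treating "NA" (or an empty field) as missing."""
    text = text.strip()
    return None if text in ("NA", "") else float(text)

def direction_from_preference(preference):
    """Return "pos" or "neg" from the sign of Preference.

    Missing values give None. A value of exactly zero also gives None,
    which is an assumption: the README only defines "pos" and "neg".
    """
    if preference is None or preference == 0:
        return None
    return "pos" if preference > 0 else "neg"

rows = []
for record in csv.DictReader(io.StringIO(SAMPLE)):
    preference = parse_number(record["Preference"])
    rows.append({
        "Name": record["Name"],
        "Preference": preference,
        "Intensity": parse_number(record["Intensity"]),
        "Direction": direction_from_preference(preference),
    })

for row in rows:
    print(row["Name"], row["Preference"], row["Direction"])
```

Keeping missing values as `None` rather than converting them to zero avoids treating a population without data as one with a measured value of zero.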
File: For_Path_Analyses_with_Sources.xlsx
This sheet is identical to the “For_Path_Analyses.csv” file, except that it includes the data source beside each value. The “drainage” and “River” values are generally accepted and widely published for these well-studied populations, so no single source is listed. “Name”, “community,” “Direction,” and “disturbance” were estimated or calculated by the researchers. Additional notes are given for these six variables below the main table.
File: For_Orange_Calculations.csv
This file includes the data used to calculate the average area of orange colouration on males for seven populations in the dataset (Aripo HP, Aripo LP, El Cedro HP, El Cedro LP, Guanapo HP, Paria LP 1, and Turure HP). Each row represents one of the males tested in a previous study, Valvo et al. (2019).
Variables
- Population: the guppy population of origin of the focal male. This variable is identical to the “Name” variable in the “For_Path_Analyses.csv” file.
- Predation: a binary “low” vs. “high” classification of the degree of predation risk for guppies. This variable is the same as the “Predation” variable in the “For_Path_Analyses.csv” file.
- River: the river in which the population is located. This variable is the same as the “River” variable in the “For_Path_Analyses.csv” file.
- Male_pool: a code identifying the pool within the river from which the male was collected. Pools are separated by riffles of turbulent, fast-flowing water that guppies rarely cross, meaning that guppies from one pool are unfamiliar with conspecifics from other pools in the same population. Pools were labelled alphabetically by position along the river. In one case, a pool was labelled AA and the next was labelled A (for the Guanapo HP population).
- Male_ID: a unique code for the focal male, built from the following parts:
  - “EXP_” stands for experimental, necessary for some males to distinguish them from males that were caught but not used in mate choice trials.
  - Letters 4-5 (EXP males) or 1-2 (non-EXP males): the river (e.g., “AR” for “Aripo”)
  - Letters 6-7 (EXP) or 3-4 (non-EXP): the predation level (“LP” = low, “HP” = high)
  - Letter 8 (EXP males) or 5 (non-EXP): the “Male_pool” value.
  - Letter 9 (EXP males) or 6 (non-EXP): “M” for “male,” to distinguish from females with otherwise identical codes
  - The unique identifier: a unique number or code assigned to the male to distinguish it from the other males taken from its pool. The males from pool 17B of the Paria LP 1 population are instead identified as C1, C2, R, and U. These codes refer to the “Male_Type” variable in the “Raw_for_Preference_Calcs.csv” file.
- Body_Area_1, Body_Area_2, Body_Area_3: the three measurements of the male’s body area (excluding the tail and fins) in mm2. Measurements were repeated three times to increase the accuracy.
- Average_BodyArea: the numerical average of the three body area measurements (“Body_Area_1”, “Body_Area_2”, and “Body_Area_3”).
- SD_BodyArea: the standard deviation of the three body area measurements (“Body_Area_1”, “Body_Area_2”, and “Body_Area_3”). This value was kept below 0.3 to improve precision.
- Total_Orange: the total orange area on the male’s body (excluding the fins) in mm2. This is the sum of the “caud_o” and “body_o” variables in the “Raw_for_Preference_Calcs.csv” file.
- Relative_Orange: “Total_Orange” / “Average_BodyArea”*100
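The derived columns in this file can be illustrated with a short Python sketch. It assumes that “SD_BodyArea” is a sample standard deviation (n - 1 denominator), which the description above does not specify, and it uses invented measurements. The function name `summarise_male` is hypothetical, and the authors' actual calculations are in the R Markdown file.

```python
from statistics import mean, stdev

def summarise_male(body_areas_mm2, total_orange_mm2):
    """Summarise one male from three body-area measurements and total orange area."""
    average_body_area = mean(body_areas_mm2)
    # Sample standard deviation (n - 1 denominator). This is an assumption:
    # the README does not say which form of standard deviation was used.
    sd_body_area = stdev(body_areas_mm2)
    # Relative_Orange = Total_Orange / Average_BodyArea * 100 (a percentage).
    relative_orange = total_orange_mm2 / average_body_area * 100
    return {
        "Average_BodyArea": average_body_area,
        "SD_BodyArea": sd_body_area,
        "Relative_Orange": relative_orange,
    }

# Hypothetical measurements (mm2), invented for illustration only.
result = summarise_male([50.0, 50.2, 49.8], total_orange_mm2=5.0)
print(result)
```

With these made-up inputs, the orange area is about one tenth of the body area, so “Relative_Orange” comes out at roughly 10 (percent).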
File: Raw_for_Preference_Calcs.csv
This file contains the raw data for the de novo calculations of the strength of female mate preference for orange, measured in the same seven populations as relative orange area (see the explanation for the “For_Orange_Calculations.csv” file). The data are identical to the raw data provided by Valvo et al. (2019), except that only the seven target populations are included. In Valvo et al. (2019), each trial involved placing four males into divided compartments on the sides of a large tank. The female was placed in the central compartment, and the amount of time she spent oriented towards each male was taken as an indicator of her level of attraction to that male. In the dataset, each row provides the information for a specific (focal) male in a particular trial.
Variables
- fem_ID: a unique code for the focal female. The meanings of the letters and numbers are identical to those for the “Male_ID” variable in the “For_Orange_Calculations.csv” file, except that the sex letter is “F” for “female,” not “M” for “male.”
- year: the year in which the trial was conducted (2016 or 2017).
- date_obs: a code for the date on which the trial was conducted. Each code represents a day in the month of May. Days in 2016 begin with 575 and days in 2017 begin with 578.
- time: the time at which the trial started, on a 24-hour clock. The first two digits give the hour and the last two digits give the minutes.
- date_col: a code for the date on which the female was collected from the field. The meaning of the codes is as in the date_obs column.
- river: identical to the “River” variable in the three files described above.
- pred: the binary predation metric, identical to the “Predation” variable in the three files described above.
- fem_pool: the pool within the river from which the female was collected. Identical to the “Male_pool” variable in the “For_Orange_Calculations.csv” file.
- male_pool: the pool within the river from which the focal male was collected. Identical to the “Male_pool” variable in the “For_Orange_Calculations.csv” file.
- comp: the compartment in which the focal male was housed during the trial.
- male_Type: a variable describing the degree of familiarity that the female has with the male’s colouration.
  - c = “common”: the male is from the same pool as the female and has similar colouration to most other males in the pool.
  - r = “rare”: the male is from the same pool as the female but has different colouration compared to most other males in the pool.
  - u = “unfamiliar”: the male is from a different pool than the female and has different colouration from the males in the female’s pool of origin.
- photo_ID: an identifier for the photograph of the male that was used to calculate its colouration. “RIGHT” refers to the right side of the body, and the number indicates which image in a series was used. The ID begins with the “Male_ID” variable from the “For_Orange_Calculations.csv” file.
- tail_r2y: the area of red-to-yellow colouration on the tail of the male (mm2).
- tail_o: the area of orange colouration on the tail of the male (mm2).
- caud_r2y: the area of red-to-yellow colouration on the caudal peduncle (the area between the end of the anal fin and the beginning of the tail) of the male (mm2).
- caud_o: the area of orange colouration on the caudal peduncle of the male (mm2).
- body_r2y: the area of red-to-yellow colouration on the body of the male (mm2). The body excludes the caudal peduncle and the fins.
- body_o: the area of orange colouration on the body of the male (mm2).
- TT_trial: the total time of the trial (seconds).
- TT_orient: the total time the female spent facing the focal male while receptive (seconds).
- male_SL: the standard length of the male (cm). Standard length is measured from the tip of the snout to the base of the tail.
- recep: the total amount of time the female was receptive to males during the trial, defined as the amount of time spent within 6 cm of any of the four male compartments (seconds).
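Two of the raw columns lend themselves to a small worked example: the 24-hour “time” code, and the orange areas that are summed into “Total_Orange” in the “For_Orange_Calculations.csv” file (“caud_o” plus “body_o”, leaving out the tail). The Python sketch below uses an invented row. It assumes that a leading zero may have been dropped from morning times when the file was saved, which may not apply to the actual data. The function names are hypothetical.

```python
from datetime import time

def parse_trial_time(code):
    """Parse an HHMM 24-hour code such as "0930" or 930 into a datetime.time."""
    # Restore a leading zero if it was dropped (an assumption about the file).
    digits = str(code).strip().zfill(4)
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"unexpected time code: {code!r}")
    return time(hour=int(digits[:2]), minute=int(digits[2:]))

def total_body_orange(caud_o, body_o):
    """Orange area on the body excluding fins: caudal peduncle plus body (mm2)."""
    return caud_o + body_o

# Hypothetical trial row, invented for illustration only.
row = {"time": "930", "caud_o": 1.2, "body_o": 3.4, "tail_o": 0.6}
start = parse_trial_time(row["time"])
orange = total_body_orange(row["caud_o"], row["body_o"])
print(start.isoformat(timespec="minutes"), round(orange, 2))
```

The “tail_o” value is deliberately left out of the sum, because the body area used for relative orange excludes the fins.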
File: Cleaned_for_Preference_Calcs.csv
This file includes mostly the same variables as the “Raw_for_Preference_Calcs.csv” file. However, the individual trial values are averaged for each male to provide a single point per male to use in the analyses of female preference. Four variables were removed because they are not relevant to the averaged data, and two variables were added because they were necessary for the preference calculations.
Variables Removed
- time: each male was tested at multiple times on each day, so this variable is not relevant for the averaged data.
- fem_ID: values from multiple females were averaged, so this variable is no longer relevant.
- fem_pool: no longer relevant for the same reason as “fem_ID.”
- comp: across trials, individual males were housed in multiple different compartments, so this variable is not relevant for averaged data.
Variables Added
- orient_rel: the average amount of time that females spent oriented towards the focal male (“TT_orient”) divided by the total amount of time that the female was receptive (“recep”), averaged across all trials (proportion)
- body_Area: the body area of the male (mm2). This variable is identical to the “Average_BodyArea” variable in the “For_Orange_Calculations.csv” file.
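One possible reading of the “orient_rel” definition is sketched below in Python: the ratio “TT_orient” / “recep” is computed for each trial and then averaged per male. This is an assumption, since the wording could also describe a ratio of averaged times. The `male_key` field, the sample values, and the choice to skip trials in which the female was never receptive are all hypothetical. The authors' own data cleaning is in the R Markdown file.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial rows (seconds), invented for illustration only.
# "male_key" is a stand-in for whatever column identifies the focal male.
trials = [
    {"male_key": "male_1", "TT_orient": 60.0, "recep": 240.0},
    {"male_key": "male_1", "TT_orient": 90.0, "recep": 180.0},
    {"male_key": "male_2", "TT_orient": 30.0, "recep": 300.0},
    {"male_key": "male_2", "TT_orient": 0.0, "recep": 0.0},  # never receptive
]

def orient_rel_by_male(rows):
    """Average the per-trial ratio TT_orient / recep for each male."""
    ratios = defaultdict(list)
    for row in rows:
        # Skip trials where the ratio is undefined (an assumption, see above).
        if row["recep"] > 0:
            ratios[row["male_key"]].append(row["TT_orient"] / row["recep"])
    return {male: mean(values) for male, values in ratios.items()}

print(orient_rel_by_male(trials))
```

Averaging per-trial ratios gives every trial equal weight, whereas dividing summed times would give more weight to trials in which the female was receptive for longer.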
Code/software
The code for Clark et al. (2026) is provided in the R Markdown file “R_Code.Rmd”. Analyses were run in R Version 4.5.1.
File: R_Code.Rmd
This R Markdown file provides all the code necessary to replicate the analyses in Clark et al. (2026). It includes all path analyses and permutation analyses used to examine the multivariate relationships in the dataset, as well as the supplementary parametric tests included in the paper. Also provided are the data cleaning procedures and linear regressions used to measure the strength of female preference for the seven populations from Valvo et al. (2019). Sections are clearly labelled based on what the analyses were designed to test. Explanatory notes are provided where necessary.
Access information
Published data were derived from the following sources:
- Brown, G. E., C. K. Elvidge, C. J. Macnaughton, I. Ramnarine, and J.-G. J. Godin. 2010. Cross-population responses to conspecific chemical alarm cues in wild Trinidadian guppies (Poecilia reticulata): Evidence for local conservation of cue production. Canadian Journal of Zoology 88:139–147.
- Endler, J. A. 1986. A preliminary report on the distribution and abundance of fishes and crustaceans of the Northern Range Mounts, Trinidad. Port of Spain, Trinidad: Report to the Ministry of Agriculture and Fisheries.
- Endler, J. A., and A. E. Houde. 1995. Geographic variation in female preferences for male traits in Poecilia reticulata. Evolution 49:456–468.
- Fraser, B. A., I. W. Ramnarine, and B. D. Neff. 2010. Temporal variation at the MHC class IIB in wild populations of the guppy (Poecilia reticulata). Evolution 64:2086–2096.
- Furness, A. I., M. R. Walsh, and D. N. Reznick. 2012. Convergence of life-history phenotypes in a Trinidadian killifish (Rivulus hartii). Evolution 66:1240–1254.
- Gilliam, J. F., D. F. Fraser, and M. Alkins-Koo. 1993. Structure of a tropical stream fish community: A role for biotic interactions. Ecology 74:1856–1870.
- Gotanda, K. M., L. C. Delaire, J. A. M. Raeymaekers, F. Pérez-Jvostov, F. Dargent, P. Bentzen, M. E. Scott, et al. 2013. Adding parasites to the guppy-predation story: Insights from field surveys. Oecologia 172:155–166.
- Grether, G. F. 2000. Carotenoid limitation and mate preference evolution: A test of the indicator hypothesis in guppies (Poecilia reticulata). Evolution 54:1712–1724.
- Houde, A. E., and M. A. Hankes. 1997. Evolutionary mismatch of mating preferences and male colour patterns in guppies. Animal Behaviour 53:343–351.
- Karim, N., S. P. Gordon, A. K. Schwartz, and A. P. Hendry. 2007. This is not déjà vu all over again: male guppy colour in a new experimental introduction. Journal of Evolutionary Biology 20:1339–1350.
- Lyles, A. M. 1990. Genetic variation and susceptibility to parasites: Poecilia reticulata infected with Gyrodactylus turnbulli (Doctoral dissertation). Princeton University, Princeton.
- Magurran, A. E., and B. H. Seghers. 1991. Variation in schooling and aggression amongst guppy (Poecilia reticulata) populations in Trinidad. Behaviour 118(3/4):214–234.
- Martin, C. H., and S. Johnsen. 2007. A field test of the Hamilton-Zuk hypothesis in the Trinidadian guppy (Poecilia reticulata). Behavioral Ecology and Sociobiology 61:1897–1909.
- Phillip, D. A. T. 1998. Biodiversity of freshwater fishes of Trinidad and Tobago, West Indies. University of St. Andrews, St. Andrews.
- Reznick, D., and J. A. Endler. 1982. The impact of predation on life history evolution in Trinidadian guppies (Poecilia reticulata). Evolution 36:160–177.
- Stephenson, J. F., C. van Oosterhout, R. S. Mohammed, and J. Cable. 2015. Parasites of Trinidadian guppies: Evidence for sex- and age-specific trait-mediated indirect effects of predators. Ecology 96:489–498.
- Zandonà, E., C. M. Dalton, R. W. El-Sabaawi, J. L. Howard, M. C. Marshall, S. S. Kilham, D. N. Reznick, et al. 2017. Population variation in the trophic niche of the Trinidadian guppy from different predation regimes. Scientific Reports 7:5570.
The data in the path analysis dataset were compiled from published datasets, unpublished sources, and primary analyses. Published data sources were identified using keyword searches in online article databases. Data from the same population were matched using GPS coordinates and maps of the study sites. Most values are as published, but one female preference value was adjusted to match the scale for the other populations. Parasite values were also converted between the two focal metrics to ensure all sites had data for both variables. For some variables (canopy openness, ranked population size, guppy population density, and human disturbance level), estimates were generated based on the authors’ extensive first-hand knowledge of the sites and their conditions, as well as field notes and unpublished data. The dataset also includes the data used to calculate values de novo for the relative area of orange colouration on males and the strength of female preference for orange for seven populations. The methodology to collect those raw data is described in Valvo et al. (2019).
