Behavioral estimates of mating success corroborate genetic evidence for pre-copulatory selection
Cite this dataset
Bhave, Rachana S. et al. (2023). Behavioral estimates of mating success corroborate genetic evidence for pre-copulatory selection [Dataset]. Dryad. https://doi.org/10.5061/dryad.41ns1rnmn
Abstract
In promiscuous species, fitness estimates obtained from genetic parentage may often reflect both pre- and post-copulatory components of sexual selection. Directly observing copulations can help isolate the role of pre-copulatory selection, but such behavioral data are difficult to obtain in the wild and may also overlook post-copulatory factors that alter the relationship between mating success and reproductive success. To overcome these limitations, we combined genetic parentage analysis with behavioral estimates of size-specific mating in a wild population of brown anole lizards (Anolis sagrei). Males of this species are twice as large as females and multiple mating among females is common, suggesting the scope for both pre- and post-copulatory processes to shape sexual selection on male body size. Our genetic estimates of reproductive success revealed strong positive directional selection for male size, which was also strongly associated with the number of mates inferred from parentage. In contrast, a male’s size was not associated with the fecundity of his mates or his competitive fertilization success. By simultaneously tracking copulations in the wild via the transfer of colored powder to females by males from different size quartiles, we independently confirmed that large males were more likely than small males to mate. We conclude that body size is primarily under pre-copulatory sexual selection in brown anoles, and that post-copulatory processes do not substantially alter this pre-copulatory selection. Our study also illustrates the utility of combining both behavioral and genetic methods to estimate mating success to disentangle pre- and post-copulatory processes in promiscuous species.
README: Behavioral estimates of mating success corroborate genetic evidence for pre-copulatory sexual selection in the wild
[Access this dataset on Dryad](https://doi.org/10.5061/dryad.41ns1rnmn)
The findings in this paper are drawn from a mark-recapture and genotyping study carried out on a focal island population of brown anole lizards along the Intracoastal Waterway in the Guana Tolomato Matanzas National Estuarine Research Reserve in Florida. Although animals in this population were tracked between 2015 and 2019 as part of a long-term ongoing study, this dataset primarily focuses on the adult male population present on the island in 2019, their reproductive fitness, and the pre- and post-copulatory components of fitness. We asked whether selection on a measure of body size (here, body mass) arising from total reproductive success is primarily due to higher mating success (pre-copulatory selection) or to competitive fertilization success (post-copulatory selection). We also tested alternative hypotheses for stronger selection on larger male body size by assessing patterns of assortative mating and whether selection on male body size was also driven by the average fecundity of females.
Following the methods, the datasets record for each animal: individual identity (e.g., ID, Sex, Age, Cohort); capture information (year, month of capture, new or recaptured); genetic fitness (total number of offspring, off_year, as reproductive success; total number of mates, mat_year, as mating success; for males only, average per-mate fecundity, competitive fertilization success, and unadjusted proportion paternity; for females only, average female fecundity); and data from the powdering study used to measure behavioral mating success (males: size class and color of powder applied; females: color of powder detected, size class classification, mating status, incidence of multiple mating, and secondary contact).
Two separate datasets are provided. The first file, Males_2019.csv, details adult males captured from March to July 2019. From this dataset:
a) Selection gradients reported in Fig. 1 were measured from adult males captured in March only, with fitness measures standardized within this subset of individuals.
b) Males captured in May and July were part of the fluorescent powdering study, the results and analyses of which are illustrated in Fig. 2.
c) The annual fitness data for adults are also included in the same dataset and were used to illustrate Figs. 4 and 5.
The second file, Females_2019.csv, describes the identity, morphology, behavioral mating success, and genetic fitness of females captured in May and July. Data from this file were used to illustrate Fig. 2 and Fig. 3.
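As a rough illustration of how the Month column can be used to recover the subsets described above, the sketch below filters rows by month of capture. The file name and the `Month` column come from this README; the month labels (`March`, `May`, `July`) and the inline sample rows are assumptions made for illustration, so the actual coding in the file should be checked first.

```python
import csv
import io

# Hypothetical rows standing in for Males_2019.csv; real values will differ.
SAMPLE = """ID,Sex,Month,Mass
101,M,March,4.20
102,M,May,5.10
103,M,July,3.85
104,M,March,NA
"""

def rows_for_months(handle, months):
    """Return the rows whose Month value is one of `months`."""
    wanted = set(months)
    return [row for row in csv.DictReader(handle) if row["Month"] in wanted]

# March captures: the subset used for the selection gradients (Fig. 1).
march = rows_for_months(io.StringIO(SAMPLE), ["March"])
# May and July captures: the fluorescent powdering study (Fig. 2).
powder = rows_for_months(io.StringIO(SAMPLE), ["May", "July"])

print([row["ID"] for row in march])
print([row["ID"] for row in powder])
```

To run this against the real file, replace `io.StringIO(SAMPLE)` with `open("Males_2019.csv", newline="")`.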
Description of the data and file structure
Variables used in Dataset and what they refer to:
The following variables are present in the dataset. Although the data are provided as separate files, they were all collected from the same population described in Bhave et al. (2023). How the data under each variable were collected is explained in the respective Methods section for each analysis described above.
A) Males_2019.csv
This dataset was used to estimate linear and non-linear selection gradients. All individuals in this dataset were included when estimating selection on body mass due to reproductive success and mating success. However, individuals with no offspring (off_year = 0) were excluded when estimating selection due to average per-mate fecundity. Individuals that had no offspring, or that had all their offspring with a single female who had no other mates (mna_PCS = 1), were excluded from estimates of competitive fertilization success (m_PCS). Relative fitness for the selection analyses was estimated by dividing individual fitness by the average fitness of all animals that were successfully genotyped and present in the dataset in March (off_year = 0 or > 0). NA in any column indicates that the value could not be ascertained or determined for that column.
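The relative-fitness calculation described above can be sketched as follows. This is a minimal illustration, not the authors' R code: it assumes that NA entries have already been converted to `None` and that the input has already been restricted to March males. Zeros are kept in the mean, as the description specifies.

```python
def relative_fitness(values):
    """Divide each fitness value by the mean of all non-missing values.

    `values` holds one entry per animal. None marks an animal that was not
    genotyped (NA in the file); zeros are retained when computing the mean.
    """
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("no genotyped animals in the input")
    mean = sum(known) / len(known)
    if mean == 0:
        raise ValueError("mean fitness is zero; relative fitness is undefined")
    return [None if v is None else v / mean for v in values]

# Hypothetical off_year values for five March males (None = NA).
off_year = [0, 2, 4, None, 6]
rel = relative_fitness(off_year)
print(rel)
```

By construction, the non-missing relative fitness values average to 1, which is the usual scaling for estimating selection gradients.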
ID: Continuous variable, Unique identifying ID number given to a captured individual. Each individual is distinguished based on the order in which toes are clipped.
Sex: (Sex at birth) Categorical variable, M = male, F = female
CapDate: Date on which the animal was captured in the field at the start of the breeding season in March 2019. Written in yyyy-mm-dd or mm-dd-yyyy format.
Month: Mentions month in which animals were captured. Note the months indicate when the sampling trip began (although some sampling trips may end in the following month). This column can help filter males for the specific subset used for the analyses.
Recap: Categorical N/R. N = New capture ~ Captured for the first time in the mark-recapture study, R = Recaptured ~ Animal has been captured and tracked between 2015 and 2018.
DaR: Status at release (Dead/Alive). Records if the animal died of unnatural causes due to mishaps at capture or was known to be alive at the time of release back on the site of capture.
Cohort: Assignment based on the year in which the individual was known to have been born (e.g., 2017 would imply the individual was born in 2017). NA values indicate Cohort could not be ascertained and subsequently Age could not be determined.
Age: Determined based on the annual difference between the Cohort in which the animal was born and the year in which it was captured (Age = 0 is a juvenile animal, 1 = animal entering the first breeding season (~1 year old), 2 = animal entering the second breeding season (~2 years old)). NA values indicate unknown cohort and thus unknown age.
SVL: Snout-vent length. Measured in millimeters using a ruler. Estimates the animal's length from the tip of its snout to its vent on the date of capture. Used as a common measure of body size. If an animal was not measured at the time of capture, it is indicated by an NA.
Mass: Measured body weight of animals (in grams) on the date of capture. If an animal was not measured at the time of capture, it is indicated by an NA.
off_year: Estimated number of offspring sired in the year 2019 based on genetic parentage. A p-value corrected with FDR at <0.05 was considered as a successful parentage assignment. Used as a measure of annual reproductive fitness. NA values indicate that the individual was not included in parentage analysis and thus could not be assigned offspring. Zero values, on the other hand, indicate individuals that were included in parentage analysis but were not assigned offspring suggesting they did not successfully mate or sire offspring.
mat_year: Estimated number of unique females with which a male sired offspring in 2019, based on genetic parentage assignments. Used as a measure of annual mating success. NA values indicate that the individual was not included in parentage analysis, could not be assigned offspring, and thus could not be assigned any mates. Zero values, on the other hand, indicate individuals that were included in parentage analysis but had no assigned offspring, indicating that they did not successfully mate. In theory, a zero could also mean that a male mated but the females did not lay eggs sired by him; we considered these components inestimable.
t_FEC: Total number of offspring produced by the unique mates of the focal male. It is an estimate of the total fecundity of all partners of a male. This measure was not included in the selection analyses. NA values indicate that males either were not successfully genotyped or that males were genotyped and included in parentage analyses but were not assigned any offspring (off_year = NA or 0).
m_FEC: Average per mate fecundity. Calculated by dividing t_FEC by mat_year to estimate the average number of offspring produced across all females that a male sired offspring with. NA values indicate that males either were not successfully genotyped or that males were genotyped and included in parentage analyses but were not assigned any offspring (off_year = NA or 0).
mna_PCS: Unadjusted siring success. The average proportion of offspring sired by a male among all the offspring produced by his female mates, calculated as the ratio of off_year to t_FEC. Selection on male body mass due to this measure is shown in Fig. S3. NA values indicate that males either were not successfully genotyped or were included in parentage analyses but did not have any offspring assigned (off_year = NA or 0).
m_PCS: Competitive fertilization success. Calculated by scaling the average proportion paternity of individuals across all the females who had more than 2 mating partners (following Devigili et al. 2015). Calculating this metric requires the total number of mates of each female partner of a male, which is not shown in this dataset for reasons of brevity. We estimated selection on body mass due to this measure of competitive fertilization success. NA values indicate that males either were not successfully genotyped, sired no offspring, or sired all offspring with the same female and had no known competing males, rendering competitive fertilization success inestimable.
Size_class: Size quartile to which a male was assigned based on his body mass in the given month.
Color: Specifies the color of fluorescent powder applied to males based on the size quartile they belonged to in the given month (orange, pink, yellow, green)
Notes: Specifies anything of note in the experiment. Blank values mean that there are no comments against those individuals. Specifically indicates a male that was accidentally included in the powdering experiment but was ascertained as an individual belonging to Cohort 2019 (sexually immature juvenile male).
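The two ratio-based male measures defined above (m_FEC and mna_PCS) can be recomputed from off_year, mat_year, and t_FEC. The sketch below follows the formulas and NA rules stated in the variable descriptions; the function name and the use of `None` for NA are illustrative assumptions. m_PCS is not reproduced because it needs per-female mate counts that are not in this file.

```python
def male_derived_measures(off_year, mat_year, t_fec):
    """Return (m_FEC, mna_PCS) for one male, or (None, None) if inestimable.

    m_FEC   = t_FEC / mat_year   (average per-mate fecundity)
    mna_PCS = off_year / t_FEC   (unadjusted siring success)
    Both are NA when the male was not genotyped (off_year is None) or was
    assigned no offspring (off_year == 0).
    """
    if off_year is None or off_year == 0:
        return None, None
    if not mat_year or not t_fec:
        # Defensive guard: a sire should always have mates and mate offspring.
        return None, None
    return t_fec / mat_year, off_year / t_fec

# Hypothetical male: 3 offspring with 2 females who produced 6 offspring in total.
m_fec, mna_pcs = male_derived_measures(off_year=3, mat_year=2, t_fec=6)
print(m_fec, mna_pcs)

# Hypothetical male with no assigned offspring: both measures are NA.
print(male_derived_measures(off_year=0, mat_year=0, t_fec=None))
```

A male whose mna_PCS equals 1 sired every offspring of his mates, which is the case excluded from the m_PCS estimates.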
B) Females_2019.csv
ID: Continuous variable. Unique identifying ID number given to a captured individual. Each individual is distinguished based on the order in which toes were clipped. A few cases have missing IDs because these animals were captured and measured directly in the field and were eventually excluded from the dataset before analyses. NA values indicate that the animal could not be linked to an ID, although it had been toe-clipped previously. These NA values tended to occur in captures from May, which was the only time we captured, measured, and identified females directly in the field and released them at their site of capture on the same day. We retained these as data points primarily to assess behavioral estimates of mating success.
Sex: (Sex at birth) Categorical variable, M = male, F = female. Since this subset consists of only Females, M is not present.
CapDate: Date on which the animal was captured in the field at the start of the breeding season in March 2019. Written in yyyy-mm-dd or mm-dd-yyyy format.
Month: Mentions month in which animals were captured. Note the months indicate when the sampling trip began (although some sampling trips may end in the following month). This column can filter females for the specific subset used for the analyses (May or July)
Recap: Categorical N/R. N = New capture ~ Captured for the first time in the mark-recapture study, R = Recaptured ~ Animal has been captured and tracked between 2015 and 2018.
DaR: Status at release (Dead/Alive). Records if the animal died of unnatural causes due to mishaps at capture or was known to be alive at the time of release back on the site of capture.
Cohort: Assignment based on the year in which the individual was known to have been born (e.g., 2017 would imply the individual was born/captured as a juvenile in 2017). NA values indicate that the cohort could not be identified, either because the animal was newly captured as an adult or because a previously clipped animal could not be accurately identified.
Age: Determined based on the annual difference between the Cohort in which the animal was born and the year in which it was captured (Age = 0 is a juvenile animal, 1 = animal entering the first breeding season (~1 year old), 2 = animal entering the second breeding season (~2 years old), and so on). NA values indicate the Cohort could not be ascertained confidently.
SVL: Snout-vent length. Measured in millimeters using a ruler. Estimates the animal's length from the tip of its snout to its vent on the date of capture. Used as a common measure of body size.
Mass: Measured body weight of animals (in grams) on the date of capture. If an animal was not measured at the time of capture, it is indicated by an NA.
off_year: Estimated number of offspring produced in the year 2019 based on genetic parentage. A p-value corrected with FDR at <0.05 was considered as a successful parentage assignment. Used as a measure of annual reproductive fitness as well as a measure of total fecundity of the female. NA values indicate that the individual was not included in parentage analysis and thus could not be assigned offspring. Zero values, on the other hand, indicate individuals that were included in parentage analysis but were not assigned offspring suggesting they did not successfully mate or produce viable offspring.
mat_year: Estimated unique number of males with which females produced offspring in 2019 based on genetic parentage assignments. Used as a measure of annual mating success. NA values indicate that the individual was not included in parentage analysis and thus could not be assigned offspring. Zero values, on the other hand, indicate individuals that were included in parentage analysis but were not assigned offspring suggesting they did not successfully mate.
fec_year: Average per-mate fecundity. Calculated by dividing off_year by mat_year to estimate the average number of offspring produced per male with which a female produced offspring. NA values indicate that the individual was not included in parentage analysis and thus could not be assigned offspring. We also assumed that individuals with no assigned offspring had not mated successfully, although other factors, such as not producing viable eggs, may also be responsible. As a result, we considered fec_year inestimable for these individuals too.
Color_Present: Categorical (Yes/No). Specifies if females had color present in and around their venter under UV light suggesting a potential transfer of fluorescent powder from males during mating. Blank values indicate that these individuals were not part of the powdering experiment.
Could_segregate: Categorical (Yes/No). Specifies if the observer could reliably identify the color on the venter. In cases where we could not (Could_segregate = No), those copulations were omitted from further analysis.
Non_copulation contact: Categorical (Yes/No). Indicates if the color was on a part of the body that is usually not in direct contact during mating and is not considered as a copulation event.
Multiple_mating: Categorical (Yes/No). Indicates if two or more colors were detected on the venter, suggesting instances of mating with males belonging to more than one size quartile. Females could also have mated with more than one male from the same size quartile, but we did not count that as an instance of multiple mating. When Multiple_mating = Yes, each additional copulation is reported as a new entry in the dataset, even though all other morphological and fitness data for the ID remain the same.
Color: Indicates the identified color of fluorescent powder present on the female. Categorical variable with one of the following entries - none, orange, pink, green or yellow.
Mating: Indicates if an individual copulated or not, as determined by transfer of fluorescent powder. Categorical (1/0 where 1 = Copulated, 0 = Did not copulate).
Size_class_mating: Indicates the size category of males with which the females were detected to have mated (In May, yellow = S1, green = S2, pink = S3, orange = S4; whereas in July, orange = S1, pink = S2, green = S3, yellow = S4). S0 indicated that a mating was not detected (Color = none).
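Because the color assigned to each size quartile was swapped between May and July, the color-to-size-class lookup depends on the month. The mapping below is taken directly from the Size_class_mating description; the function name, the exact month labels, and the lower-casing of inputs are assumptions made for illustration.

```python
# Color worn by each male size quartile, per sampling month (from the README).
SIZE_CLASS_BY_COLOR = {
    "May": {"yellow": "S1", "green": "S2", "pink": "S3", "orange": "S4"},
    "July": {"orange": "S1", "pink": "S2", "green": "S3", "yellow": "S4"},
}

def size_class_mating(month, color):
    """Map the powder color found on a female to the male size class.

    Returns "S0" when no color was detected (Color = none).
    Raises KeyError for a month or color outside the documented scheme.
    """
    color = color.strip().lower()
    if color == "none":
        return "S0"
    return SIZE_CLASS_BY_COLOR[month][color]

print(size_class_mating("May", "yellow"))
print(size_class_mating("July", "yellow"))
print(size_class_mating("May", "none"))
```

The same color therefore means opposite ends of the size distribution in the two months (yellow is S1 in May but S4 in July), so Color should always be interpreted together with Month.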
Sharing/Access information
Links to other publicly accessible locations of the data:- NA
Data was derived from the following sources:- Data was collected as part of this study.
Code/Software
We used RStudio to write scripts to wrangle and analyze the data. The complete list of packages we used to analyze and plot the data follows. The R code used to reproduce the figures and analyses in the paper is available upon request. Requests may be sent to the corresponding author, Rachana Bhave, at rachanabhave@gmail.com.
- readxl: Wickham H, Bryan J. readxl: Read Excel Files. 2021 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=readxl
- lubridate: Grolemund G, Wickham H. lubridate: Make Dealing with Dates a Little Easier. 2021 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=lubridate
- ggplot2: Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag; 2016 [cited 2023 Nov 1]. Available from: https://ggplot2.tidyverse.org
- ggthemes: Arnold JB. ggthemes: Extra Themes, Scales and Geoms for 'ggplot2'. 2019 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=ggthemes
- data.table: Dowle M, Srinivasan A. data.table: Extension of data.frame. 2021 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=data.table
- tidyverse: Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, et al. Welcome to the tidyverse. Journal of Open Source Software. The Open Journal; 2019 [cited 2023 Nov 1];4(43):1686. Available from: https://doi.org/10.21105/joss.01686
- here: Müller K. here: A Simpler Way to Find Your Files. 2020 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=here
- dplyr: Wickham H, François R, Henry L, Müller K. dplyr: A Grammar of Data Manipulation. 2021 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=dplyr
- sjmisc: Lüdecke D. sjmisc - Data and Variable Transformation Functions. Journal of Open Source Software. The Open Journal; 2018 [cited 2023 Nov 1];3(26):754. Available from: https://doi.org/10.21105/joss.00754
- magrittr: Bache SM, Wickham H. magrittr: A Forward-Pipe Operator for R. 2014 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=magrittr
- ggsignif: Constantinou P, Patil I. ggsignif: Significance Brackets for 'ggplot2' [Internet]. 2020 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=ggsignif
- effects: Fox J, Weisberg S, Price B, Adler D, Bates D, Baud-Bovy G, et al. Package 'effects' - Effect Displays for Linear, Generalized Linear, and Other Models [Internet]. CRAN; 2020 Oct p. 1–121. Report No.: Version 4.2-0. Available from: https://cran.r-project.org/web/packages/effects/effects.pdf
- lmerTest: Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software. Foundation for Open Access Statistics; 2017;82(13). Available from: https://doi.org/10.18637/jss.v082.i13
- ggeffects: Lüdecke D. ggeffects - Tidy Data Frames of Marginal Effects for 'ggplot' from Model Outputs. Journal of Open Source Software. The Open Journal; 2018 [cited 2023 Nov 1];3(26):772. Available from: https://doi.org/10.21105/joss.00772
- car: Fox J, Weisberg S. An R Companion to Applied Regression (Third Edition). Thousand Oaks CA: Sage; 2019 Nov p. xxi + 499 pp., ISBN: 978-1-5443-5626-3. Available from: https://socialsciences.mcmaster.ca/jfox/Books/Companion/
- grid: R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2021 [cited 2023 Nov 1]. Available from: https://www.R-project.org/
- gridExtra: Auguie B. gridExtra: Miscellaneous Functions for "Grid" Graphics. 2017 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=gridExtra
- writexl: Ooms J. writexl: Export Data Frames to Excel 'xlsx' Format. 2021 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=writexl
- MASS: Venables WN, Ripley BD. Modern Applied Statistics with S. Fourth Edition. New York: Springer; 2002.
- lme4: Bates D, Mächler M, Bolker B, Walker S. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software. Foundation for Open Access Statistics; 2015;67(1). Available from: https://doi.org/10.18637/jss.v067.i01
- tiff: Urbanek S. tiff: Read and write TIFF images. 2020 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=tiff
- ggpubr: Kassambara A. ggpubr: 'ggplot2' Based Publication Ready Plots. 2020 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=ggpubr
- broom: Robinson D, Hayes A, Couch S. broom: Convert Statistical Objects into Tidy Tibbles. 2020 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=broom
- jtools: Long JA. jtools: Analysis and Presentation of Social Scientific Data. 2019 [cited 2023 Nov 1]. Available from: https://CRAN.R-project.org/package=jtools
Methods
We studied an island population of brown anole lizards (Anolis sagrei) in the Guana Tolomato Matanzas National Estuarine Research Reserve in northern Florida (29°37′53′′ N, 81°12′46′′ W) using procedures approved by the University of Virginia Animal Care and Use Committee (protocol 3896). To assay the reproductive success of males, we sampled all adults and juveniles of the population at four different times during the breeding season (March, May, July, and October) in 2019. This population has been the focus of a long-term mark-recapture study since 2015, such that most adults in the 2019 breeding season were first captured, marked, and genotyped as juveniles in 2017 or 2018. We measured the snout-vent length (SVL, nearest 1 mm) and body mass (nearest 0.01 g) of all individuals prior to releasing them at their exact site of capture the following day. We extracted DNA from adults captured in the population and used SNPPIT 2.0 (Anderson 2012) to assign genetic parentage. At two points in the middle of the breeding season (May and July), we also tracked copulations in the wild by dusting the venters of adult males with fluorescent powder as we released them and tracking its transfer to females, to evaluate whether, and to what extent, body size mediates higher reproductive success through mating success.
Funding
National Science Foundation, Award: DEB-1453089, CAREER Grant
Explorers Club, Mamont Scholar Field Grant - 2020