Fighting isn’t sexy in lekking Greater Sage-grouse: A relational event model approach for mating interactions
Data files
Apr 08, 2025 version files 2.15 GB
- README.md (25.64 KB)
- simplified_sim_experiment_results_sd_7days_rec.zip (1.10 GB)
- simplified_sim_experiment_results_sd_7days.zip (1.05 GB)
Abstract
The relationship between aggression and mate choice in mating systems is critical for understanding the evolution and diversification of sexual organisms, and yet remains the subject of vigorous debate. A key challenge is that traditional correlational approaches cannot distinguish underlying mechanisms of social interaction and can indicate misleading positive associations between aggression and mating events. We implement a novel Relational Event Model (REM) incorporating temporal dependencies of events in a social network to study natural reproductive behavior in a lek-breeding system where males gather to display and females visit to evaluate mates, often observing both male courtship displays and fights. We find that fighting is not attractive to females. Indeed, males are less likely to start and more likely to leave fights with females present, plausibly to avoid entanglement in protracted combat cycles arising from emergent social processes that reduce availability to mate. However, fighting serves other roles, e.g., to deter copulation interruptions and rebuff competitors. Our findings support the hypothesis that social systems regulating conflict and promoting females’ choice based on display are fundamental to stable lek evolution. Moreover, our analysis highlights the utility of the REM framework in testing mechanistic hypotheses in behavioral ecology and evolution.
This repository contains all the data and code necessary to reproduce the results presented in Snow et al., Fighting isn’t sexy in lekking Greater Sage-grouse: A relational event model approach for mating interactions.
Setup
Follow these steps to set up the project:
Download project repository
- Download the project repository Snow_et_al_data_code_ESM.zip from Zenodo.
- Unzip the archive.
Download data
The two datasets archived here on Dryad must be downloaded in order to reproduce the summary tables from the raw simulation data in Section 6.7 (Simulation Experiments; refer to that section for a detailed explanation of the materials contained).
- Download data ‘simplified_sim_experiment_results_sd_7days.zip’ and ‘simplified_sim_experiment_results_sd_7days_rec.zip’
- Unzip the downloaded .zip files and move them both into the ‘sim_experiments’ directory within the main project folder
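The unzip-and-move step can also be scripted. The following is a minimal Python sketch, not part of the archived code: the archive names come from the file list above, while the function name `extract_archives` and its two directory arguments are hypothetical. It assumes each archive unpacks to a single top-level folder named after the archive.

```python
import zipfile
from pathlib import Path

# Archive names as listed in the Dryad file list.
ARCHIVES = [
    "simplified_sim_experiment_results_sd_7days.zip",
    "simplified_sim_experiment_results_sd_7days_rec.zip",
]


def extract_archives(download_dir, project_dir, archives=ARCHIVES):
    """Extract each archive into <project_dir>/sim_experiments.

    Returns the names of the archives that were found and extracted.
    (Hypothetical helper; the directory arguments are assumptions.)
    """
    target = Path(project_dir) / "sim_experiments"
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for name in archives:
        archive = Path(download_dir) / name
        if not archive.is_file():
            # Report and skip archives that have not been downloaded yet.
            print(f"missing: {archive}")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(name)
    return extracted
```

For example, `extract_archives("Downloads", "Snow_et_al_data_code_ESM")` would unpack both archives into `Snow_et_al_data_code_ESM/sim_experiments`, assuming that is where the Zenodo package was unzipped.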
Steps to reproduce sage grouse REM results
The following is a guide to the use of the materials, broken down by directory contained in the code package archived on Zenodo.
1. Data Processing
- Run the script: full_interaction_data_for_REM_import_code_udata_supp.R
  - This reproduces the consolidated data file: full_data_plus_cop_quest.tsv
- Run the script: rem_udata_centroid_data.R
  - This reproduces the folder: udata_centroid_data and its subfolders
    - contains spatial data for the positions of the sage grouse:
      - centroids of movement for each male on each day
      - pairwise distance between centroids for all males on each day
- Run the script: relevent_formatting_dis_reverse_udata_wincolumn_disturbance. to reproduce:
  - relevent_format_dataset_udata_wincol_disturbance.tsv; the interaction network data formatted for use in the REM analysis.
  - event_types_list_udata.csv; a list of all the unique types of events that can happen in the scheme of the REM
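To make the two spatial products concrete, here is a small Python sketch of per-male daily centroids and the pairwise distances between them. It is an illustration only, not the logic of rem_udata_centroid_data.R: it assumes a centroid is the arithmetic mean of a male's recorded positions on one day and that distances are Euclidean, and the function names and input layout are hypothetical.

```python
from itertools import combinations
from math import hypot


def centroids(positions):
    """Mean position per male for one day.

    positions: iterable of (male_id, x, y) tuples (assumed layout).
    Returns {male_id: (centroid_x, centroid_y)}.
    """
    sums = {}
    for male, x, y in positions:
        sx, sy, n = sums.get(male, (0.0, 0.0, 0))
        sums[male] = (sx + x, sy + y, n + 1)
    return {m: (sx / n, sy / n) for m, (sx, sy, n) in sums.items()}


def pairwise_distances(cents):
    """Euclidean distance between every pair of centroids.

    Returns {(male_a, male_b): distance} with male_a < male_b.
    """
    return {
        (a, b): hypot(cents[a][0] - cents[b][0], cents[a][1] - cents[b][1])
        for a, b in combinations(sorted(cents), 2)
    }
```

Calling `pairwise_distances(centroids(day_positions))` once per observation day would yield one distance table per day, which mirrors the per-day organization described above.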
2. REM pre-analysis
- Script relevent_analysis_intercepts_udata.R shows the process by which we established the base intercept-only model for the REM. Optimized models all use this intercept model as their baseline.
- The two text files contain the final intercept-only model output for the aggression and attractiveness models, respectively. The file intercepts_only_attractiveness_model.txt is the basis for the annotated Table S2.
- Script rem.functions.R contains code for each of the functions that can add a particular sufficient statistic to the REM model. The final models are composed of the basic intercepts and optimized subsets of these statistics.
3. Cluster Optimization: These folders contain all the materials and scripts used for running our model selection algorithm on a high-performance computing cluster from a single directory, subsequent analysis, and the creation of summary figures.
- Aggression Model
  - CHG_2014_male_presence_data_quest.csv, CHG_females_exact.tsv, event_types_list_udata.csv, male_pres_exact_quest.csv, relevent_format_dataset_udata_wincol_disturbance.tsv, rem.functions.R, and pairwise_dist are all data files identical to their counterparts in the data_files and data_processing folders.
  - run_pipeline_stepwise.sh and run_pipeline_stepwise_restart.sh are shell scripts that orchestrate the parallel runs of the REM model on the cluster. The former is for starting fresh, and the latter is for restarting in the middle of a run.
  - genremlist* scripts generate the sets of joblists that get sent to the cluster. genremlist_next* is triggered by the shell script after each round of model selection; it assesses which combination of statistics so far has performed the best and generates new sets to test. genremlist_restart* creates the next joblist after a run on the cluster is paused and needs to be restarted at the point where it left off.
  - The script rem_cluster_model_selection_stepwise_udata_terr.R encodes a single run of the REM model on the cluster for a given set of sufficient statistics.
  - Cluster run outcomes:
    - previous_best_initial_run shows the set of statistics from the initial run of model selection, including some non-significant statistics.
    - previous_best_re-run_without_226_283 shows the result from removing the non-significant statistics from the pool and re-running the algorithm from the beginning.
    - previous_best_remove_226_283_at_the_end shows the result from removing the non-significant statistics from the final results of the initial run, and then restarting the algorithm from that point. It converges immediately and with the best BIC; this represents the final statistic set reflected in the final aggression model results, and this file is the one referred to in the analysis script.
  - The script best_result_full_model_forward_udata_terrsol_analysis.R goes through the final model selection process and the best-fit statistic set result in detail and produces the figures contained in the summary_figs folder, as well as the output files rem_model_coef and rem_model_sd, giving the values for the coefficients for the best-fitting model and the standard deviations for the coefficients, respectively.
  - For reference, aggression_model_output_full.txt contains the full model output for the best-fitting aggression model, reported as Table S4.
- Attractiveness Model
  - The data files, shell scripts, and “genremlist” scripts are equivalent to their counterparts found in the aggression model folder.
  - The script rem_cluster_model_selection_stepwise_udata_terr_id.R encodes a single run of the REM model on the cluster for a given set of sufficient statistics, using the base intercept model that accounts for individual-level attractiveness effects.
  - Cluster run outcomes:
    - previous_best_remove_226_at_the_end shows the set of statistics from the initial run of model selection in the second-to-last row, and the result of removing the non-significant statistic and then restarting the algorithm from that point in the last row. The algorithm converges immediately, and so the result is the same as what one gets by simply removing the single non-significant predictor. This is the final set of statistics reflected in the final attractiveness model results, and this file is the one referred to in the analysis script.
    - previous_best_re_run_without_226 shows the result from removing the non-significant statistic from the pool and re-running the algorithm from the beginning.
  - Analogous to the script in the Aggression Model folder, the script best_result_full_model_forward_udata_terrsol_id_analysis.R goes through the final model selection process and the best-fit statistic set result in detail and produces the figures contained in the summary_figs folder, as well as the output files rem_model_coef and rem_model_sd, giving the values for the coefficients for the best-fitting model and the standard deviations for the coefficients, respectively.
    - The plot allplot_terr_id_model_grid_fem.pdf corresponds to the final Figure 2 in the main text.
  - For reference, attractiveness_model_output_full.txt contains the full model output for the best-fitting attractiveness model, reported as Table S4.
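The round-by-round selection described for the genremlist_next* scripts resembles a greedy forward search scored by BIC. The Python sketch below shows that general technique under stated assumptions; it is not the authors' R and shell pipeline. The function `forward_select`, the toy scoring function, and the numeric statistic IDs are all hypothetical, and in practice each call to `score` would correspond to fitting one REM on the cluster.

```python
def forward_select(candidates, score, base=()):
    """Greedy forward selection that minimizes score(selected_tuple).

    Each round tries adding every remaining candidate to the current set,
    keeps the single addition with the lowest score, and stops as soon as
    no addition improves on the current best score.
    """
    selected = list(base)
    best = score(tuple(selected))
    remaining = [c for c in candidates if c not in selected]
    while remaining:
        trials = [(score(tuple(selected + [c])), c) for c in remaining]
        trial_score, choice = min(trials)
        if trial_score >= best:
            break  # converged: no candidate lowers the score
        selected.append(choice)
        remaining.remove(choice)
        best = trial_score
    return selected, best


def toy_bic(stats):
    """Stand-in for a model fit: statistics 3 and 7 improve the fit,
    and every included statistic pays a fixed complexity penalty."""
    useful = {3, 7}
    return 100 - 10 * len(useful & set(stats)) + 2 * len(stats)


selected, best = forward_select([1, 3, 5, 7], toy_bic)
print(selected, best)
```

The outcome files described above add one further step that this sketch omits: statistics that turn out to be non-significant are dropped from the selected set and the search is restarted from that point.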
4. Adequacy Checking: These folders contain the scripts (and associated products) used for examining the adequacy of the optimized REM models for describing the observed data.
- Aggression Model
  - Run the script relevent_analysis_base_udata_terr_adequacy.R to reproduce the basic adequacy analysis as well as:
    - CSV files per_event_adequacy_data_full and per_event_adequacy_data_summary; these contain the lists of hazards the REM model calculates for each event in the observed dataset and some summary statistics for the calculated hazards, respectively.
    - Folder main plots; contains summary plots describing Aggression Model adequacy, including event matching, event rank histograms, recall, and deviance residuals.
- Attractiveness Model
  - Run the script relevent_analysis_base_udata_terr_id_adequacy.R to reproduce the basic adequacy analysis as well as:
    - CSV files per_event_adequacy_data_full and per_event_adequacy_data_summary; as above, these contain the lists of hazards the REM model calculates for each event in the observed dataset and some summary statistics for the calculated hazards, respectively.
    - Folder main plots; contains summary plots describing Attractiveness Model adequacy, including event matching, event rank histograms, recall, and deviance residuals.
      - recall_rank_comb.pdf corresponds to Figure S7
      - devmeanplot.pdf corresponds to Figure S8
  - Run the script availability_interruption_plots_terr_id_new.R to reproduce:
    - matingplot_new.pdf which corresponds to Figure S4
    - summary plot fight_count_fig.pdf
  - Run the script fight_effect_explore.R to reproduce:
    - fight_effect_fig.pdf which corresponds to Figure 4 in the main text
  - Run the script time_fighting.R to reproduce:
    - Plot time_fighting_fig.pdf which corresponds to Figure S5
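As a rough illustration of the rank and recall summaries mentioned above, the Python sketch below ranks the observed event among all candidate event types by model hazard and then computes recall at a cutoff. This is one common way to define these quantities and may differ from what the adequacy scripts compute. The function names and the dictionary layout are hypothetical and do not reflect the column layout of the per_event_adequacy_data files.

```python
def event_rank(hazards, observed):
    """Rank of the observed event type, where 1 means highest hazard.

    hazards: {event_type: hazard} for one moment in the event history.
    Ties are resolved optimistically: only strictly larger hazards
    push the observed event down the ranking.
    """
    h_obs = hazards[observed]
    return 1 + sum(1 for h in hazards.values() if h > h_obs)


def recall_at_k(ranks, k):
    """Fraction of observed events ranked within the top k candidates."""
    return sum(r <= k for r in ranks) / len(ranks)
```

A histogram of `event_rank` values over all observed events gives an event rank histogram, and `recall_at_k` evaluated over a range of cutoffs gives a recall curve.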
5. Event History Simulation: These folders contain all the materials and scripts used for producing simulated event history datasets based on the best-fit REM models on a high-performance computing cluster from a single directory, subsequent analysis, and the creation of summary figures.
- Aggression Model
  - CHG_2014_disturbance_events.csv, CHG_2014_male_presence_data_quest.csv, CHG_females_exact.tsv, event_types_list_udata.csv, male_pres_exact_quest.csv, relevent_format_dataset_udata_wincol_disturbance.tsv, and pairwise_dist are all data files identical to their counterparts in the data_files and data_processing folders.
  - rem.functions.act.terr.sim.R contains code for each of the functions that can add a particular sufficient statistic to the REM model. They correspond to the functions in the rem.functions.R file in the rem_pre_analysis folder, but in this case they are coded in order to be robust to a growing, simulated event history, rather than an existing observed dataset.
  - previous_best_remove_226_283_at_the_end and rem_model_coef are identical to the files representing the final statistic set for the best-fitting Aggression Model results, and the corresponding coefficients, respectively, found in cluster_optimization/Aggression_model/.
  - run_pipeline_sim_act.sh is a shell script that organizes the parallel simulations on the cluster.
  - genremlist_sim.R script is called by the shell script and generates the joblist that gets sent to the cluster.
  - rem_event_hist_simulation_cluster_udata_terr_actualday_newcensus.R encodes a single simulation run, in this case, based on the best-fit Aggression REM.
  - The folder sim_results_actualday contains all the simulated data produced, in this case based on the best-fit Aggression Model: 100 simulations for each of the 18 days in the empirical dataset.
    - !! Due to storage considerations, the folders within have been compressed as .zip files; in order to run the following analysis script, these must first be expanded.
  - The script sim_output_analysis_cluster_new_3.R brings together the simulation data and produces summary analyses and figures, contained in the folder sim_actualday_plots:
    - Each day_level subfolder contains sets of plots summarizing the simulations for each simulated day in the dataset.
    - Other plots summarize the characteristics of the simulated “season” of data overall, and compare them to the observed data, including overall counts of different types of events and overall network characteristics of the event histories.
      - File matesmall.pdf corresponds to Figure 3, panel A in the main text
- Attractiveness Model
  - The data files, shell script, “genremlist” script, and function list are equivalent to their counterparts found in event_history_simulation/Aggression_model/.
  - previous_best_remove_226_at_the_end and rem_model_coef are identical to the files representing the final statistic set for the best-fitting Attractiveness Model results, and the corresponding coefficients, respectively, found in cluster_optimization/Attractiveness_model/.
  - rem_event_hist_simulation_cluster_udata_terr_id_actualday_newcensus.R encodes a single simulation run, in this case, based on the best-fit Attractiveness REM.
  - The folder sim_results_actualday contains all the simulated data produced, in this case based on the best-fit Attractiveness Model: 100 simulations for each of the 18 days in the empirical dataset.
    - !! Due to storage considerations, the folders within have been compressed as .zip files; in order to run the following analysis script, these must first be expanded.
  - As above, the script sim_output_analysis_cluster_new_3.R brings together the simulation data and produces summary analyses and figures, contained in the folder sim_actualday_plots:
    - As above, each day_level subfolder contains sets of plots summarizing the simulations for each simulated day in the dataset.
      - nodemetsplot_2014-03-28.pdf corresponds to Figure S11
    - Other plots summarize the characteristics of the simulated season of data overall, and compare them to the observed data, including overall counts of different types of events and overall network characteristics of the event histories.
      - File overallcount_plot.pdf corresponds to Figure S9
      - File attendfight_combined.pdf corresponds to Figure S10
      - File overallnet_plot.pdf corresponds to Figure S12
      - File matesmall.pdf corresponds to Figure 3, panel B in the main text
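For readers unfamiliar with how an event history can be generated from model hazards, the Python sketch below shows a generic continuous-time scheme: hazards are treated as constant between events, the waiting time to the next event is exponential with rate equal to the summed hazards, and the event type is drawn in proportion to its hazard. This is a textbook approach offered under those assumptions, not a transcription of the simulation scripts in these folders. The function `simulate_events` and the `hazard_fn` interface are hypothetical, and the sketch ignores details such as male arrivals and female presence.

```python
import random


def simulate_events(hazard_fn, t_end, rng=None, history=None):
    """Simulate (event_type, time) pairs until t_end.

    hazard_fn(history, t) must return {event_type: hazard} given the
    growing simulated history, so statistics can depend on past events.
    """
    rng = rng or random.Random()
    history = list(history or [])
    t = history[-1][1] if history else 0.0
    while True:
        hazards = hazard_fn(history, t)
        total = sum(hazards.values())
        if total <= 0:
            break  # nothing can happen any more
        # Waiting time to the next event of any type.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose the event type with probability proportional to its hazard.
        types = list(hazards)
        weights = [hazards[k] for k in types]
        event = rng.choices(types, weights=weights)[0]
        history.append((event, t))
    return history
```

Because `hazard_fn` receives the history simulated so far, it plays the role described above for the functions in rem.functions.act.terr.sim.R, which are written to work on a growing event history rather than a fixed observed dataset.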
6. Simulation Experiments: This folder contains all the materials and scripts used for producing simulated “experiments” on a simplified, imaginary lek using the best-fit Attractiveness REM models as a basis. Includes code for running simulations on a high-performance computing cluster, subsequent analysis, and the creation of summary figures. See main text for more detailed description of the simulation scenarios.
- CHG_2014_disturbance_events.csv, CHG_2014_male_presence_data_quest.csv, CHG_females_exact.tsv, event_types_list_udata.csv, male_pres_exact_quest.csv, relevent_format_dataset_udata_wincol_disturbance.tsv, and the pairwise_dist directory are all data files identical to their counterparts in the data_files and data_processing folders.
- rem.functions.act.terr.sim.R contains code for each of the functions that can add a particular sufficient statistic to the REM model. As above in the Event History Simulations, they correspond to the functions in the rem.functions.R file in the rem_pre_analysis folder, but in this case they are coded in order to be robust to a growing, simulated event history, rather than an existing observed dataset.
- previous_best_remove_226_at_the_end and rem_model_coef are identical to the files representing the final statistic set for the best-fitting Attractiveness Model results, and the corresponding coefficients, respectively, found in cluster_optimization/Attractiveness_model/.
- run_pipeline_*.sh are shell scripts that organize the parallel simulations on the cluster for the control and treatment simulations for both the Average Receive Hazard case and the High Receive Hazard case.
- genremlist_simex_*.R scripts are called by their respective shell scripts and generate the joblist arrays that get submitted to the cluster.
- sim_ex_simplified_3_sd.R and sim_ex_simplified_3_sd_rec.R each encode a single simulation run for the Average Receive Hazard case and the High Receive Hazard case, respectively.
- The folders simplified_sim_experiment_results_sd_7days and simplified_sim_experiment_results_sd_7days_rec contain all the simulated data produced for the Average Susceptibility case and the High Susceptibility case, respectively: for each case, there are 1000 simulations for each of the three treatments and a control.
  - Inside each folder, you will find a directory called 8days (representing experiments with seven consecutive simulated days and one day of burn-in at the beginning). Within this directory, the folders are structured as follows:
8days
├── attack
│   └── male_1
│       └── level_sd
│           ├── experiment_1_8_0_1_1_sd_001
│           ├── experiment_1_8_0_1_1_sd_002
│           ├── ...
│           └── experiment_1_8_0_1_1_sd_1000
│               ├── evlist_day_1.csv
│               ├── evlist_day_2.csv
│               ├── ...
│               ├── evlist_day_9.csv
│               ├── full_event_list.csv
│               ├── full_event_list_day_1.csv
│               ├── full_event_list_day_2.csv
│               ├── ...
│               ├── full_event_list_day_9.csv
│               ├── hazards
│               ├── pres_exact_day_1.csv
│               ├── pres_exact_day_2.csv
│               ├── ...
│               └── pres_exact_day_9.csv
├── attack_plus_disengage
│   └── male_1
│       └── level_sd
│           ├── experiment_1_8_0_1_1_sd_001
│           ├── experiment_1_8_0_1_1_sd_002
│           └── ...
├── disengage
│   └── male_1
│       └── level_sd
│           ├── experiment_1_8_0_1_1_sd_001
│           ├── experiment_1_8_0_1_1_sd_002
│           └── ...
└── control
    ├── experiment_1_8_0_1_1_sd_001
    ├── experiment_1_8_0_1_1_sd_002
    └── ...
- For the three treatment folders, the subfolders indicate the focal male and the level of the treatment (male 1 and one standard deviation, respectively). Each experiment folder contains the output from one run of the simulation. Only a small subset of folders and files are shown here for clarity. Ellipses (…) indicate omitted, sequentially numbered entries with the same structure/contents.
- Each evlist*.csv file contains the sequential, simulated list of events for the indicated simulated day, with events numerically coded in REM format (i.e., two columns, one for event code and one for time in s). In this case, day 1 is a dummy placeholder and the true simulation starts on day 2.
- The full_event_list.csv file contains the entire simulated sequence of events concatenated across days. It contains four columns: ‘Events’ coded with event type names, ‘Time’ in s, ‘Females’ denoting female presence as 1 for no and 2 for yes, and ‘date’ indicating the simulated day. The other full_event_list_day*.csv files contain the same information and structure but split by simulation day.
- The large hazards.csv file records all of the calculated hazards for each possible event type at each moment for the whole simulation, with a column for each event type (see main text) and rows for timepoints.
- Each pres_exact_day*.csv file contains a dataframe of arrival events for each male on the simulated lek. This has the same format as the full_event_list files but with an additional column for ‘Bird_ID’. For the simulations archived here, all the pres_exact files are identical, recording arrivals for all six males at time 0 and no departures.
The control folder lacks the additional organization by focal male and treatment level and directly contains experiment folders. These all have a similar structure and contents to those within the treatment directories.
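For a quick look at the raw simulation output outside of R, the Python sketch below locates the run folders under a results directory and tallies events per simulated day from each full_event_list.csv. The column names ‘Events’ and ‘date’ are those described above; the function names are hypothetical, and the sketch assumes the CSV header uses exactly those column names. It is independent of the archived summary scripts.

```python
import csv
from collections import Counter
from pathlib import Path


def experiment_dirs(root):
    """Return the sorted run folders named experiment_* anywhere under root.

    This finds runs in both the treatment folders (nested under
    male_1/level_sd) and the control folder (no extra nesting).
    """
    return sorted(p for p in Path(root).rglob("experiment_*") if p.is_dir())


def count_events(experiment_dir):
    """Count rows per (date, event type) in one run's full_event_list.csv."""
    counts = Counter()
    path = Path(experiment_dir) / "full_event_list.csv"
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            counts[(row["date"], row["Events"])] += 1
    return counts
```

For example, looping `count_events` over `experiment_dirs("sim_experiments/simplified_sim_experiment_results_sd_7days/8days/control")` would give per-day event counts for each control run, assuming the archives were unpacked as described in the setup steps.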
- !! Due to storage considerations, these folders must be
downloaded separately from the online Supplemental Material
(Dryad DOI: 10.5061/dryad.w9ghx3g1f). The summary scripts will
not work without the data downloaded and the directories placed
inside the sim_experiments folder.
- The script sim_experiment_simplified_summary_functions.R defines the functions that are used to summarize the raw simulation data.
- The script sim_experiment_simplified_data_prep.R brings together all of the simulation data and summarizes it into data tables, which are pre-saved in this repository and contained in the folder summary_data:
  - simex_1000_7days_focal_boxplot_combined.csv contains the disaggregated summary metrics calculated for the focal male in each of the simulation runs.
  - simex_1000_7days_focal_output_combined.csv contains the mean values for each metric in each treatment, the difference of each treatment from the control, and the z-score representing each contrast.
  - simex_1000_7days_focal_mean_SEs_for_plotting.csv contains the mean values for each metric in each treatment and their respective standard errors.
  - WARNING: If you wish to reproduce figures from raw data, this script is very RAM intensive and takes a long time (~1 hour) to run. We recommend at least 32 GB of free RAM.
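The treatment-versus-control contrasts can be illustrated with a short Python sketch. The formula used here, the difference in means divided by the combined standard error of the two means, is one common definition of a z-score for such a contrast; it is an assumption and may not match the calculation in sim_experiment_simplified_data_prep.R. The function name `contrast` and its inputs are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def contrast(treatment, control):
    """Return (difference in means, z-score) for one metric.

    treatment, control: per-run values of the metric for the focal male.
    The z-score divides the difference by the combined standard error
    of the two means (assumed definition, see the note above).
    """
    diff = mean(treatment) - mean(control)
    se = sqrt(
        stdev(treatment) ** 2 / len(treatment)
        + stdev(control) ** 2 / len(control)
    )
    return diff, diff / se
```

With 1000 runs per treatment, each list passed to `contrast` would hold 1000 values of a single metric.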
- The script sim_experiment_simplified_output_plots.R produces plots based on the data contained in the summary_data folder, and places them in the new directory plots:
  - sim_meanplot_1000_7days_sd_focal_combined.pdf corresponds to Figure S13.
  - sim_bubbleplot_1000_7days_sd_focal_combined.pdf is the direct output of the script and is the basis for Figure 5.
  - sim_bubbleplot_1000_7days_sd_focal_combined_labels.pdf is the final labeled version for Figure 5 in the main text.
7. Other supporting materials
- The folder Predictor_summary_supplement contains the materials for summarizing the set of effects tested in the REM model.
  - Run script funlist_formatting.R to reproduce funlist_text, which is the basis for Table S1, the full annotated function list.
  - Table_S1_categorized.csv corresponds to the full Table S1.
  - Run script predictor_summary_alluvial_plot_clean.R to reproduce plots visually summarizing the model scheme:
    - predictor_summary_alluvial_all.pdf plot corresponds to Figure S2
    - predictor_summary_alluvial_significant.pdf plot corresponds to Figure S3
    - predictor_summary_alluvial_*_labels.pdf files have adjusted labels as seen in the final plots in the text.
- Run script supp_corplot_fight_by_mating.R:
  - Reproduces supp_corplot_fight_by_mating.pdf which corresponds to Figure S6
- The file Appendix_S2_Protocol for Collecting Male Interaction Data.pdf contains the standardized protocol for extracting the behavioral data from videos recorded in the field.
- The file 2014_Chugwater_Lek_Counts_clean.xlsx contains supplemental data on lek conditions and attendance by males and females collected in the field.
- The folder srt files contains the raw .srt subtitle data from video annotations.
- The folder CHG script export materials contains the python script and supporting information on the placement of the stake array (see Methods) necessary for converting the .srt files into usable data tables.
- The folder spatial_mapping contains materials for reproducing territory maps based on positional data.
  - Run script territory_kernel_full_data_plus_pos.R to reproduce territory_plots_pts.pdf which corresponds to Figure S14.