Fighting isn’t sexy in lekking Greater Sage-grouse: A relational event model approach for mating interactions

Snow, Samuel S.1 2 ; Patricelli, Gail L.3; Butts, Carter T.4; Krakauer, Alan H.3; Perry, Anna C.3; Logsdon, Ryane3; Prum, Richard O.2

Published Apr 08, 2025 on Dryad. https://doi.org/10.5061/dryad.w9ghx3g1f

Data files

Apr 08, 2025 version files 2.15 GB

- README.md (25.64 KB)
- simplified_sim_experiment_results_sd_7days_rec.zip (1.10 GB)
- simplified_sim_experiment_results_sd_7days.zip (1.05 GB)

Abstract

The relationship between aggression and mate choice in mating systems is critical for understanding the evolution and diversification of sexual organisms, and yet remains the subject of vigorous debate. A key challenge is that traditional correlational approaches cannot distinguish underlying mechanisms of social interaction and can indicate misleading positive associations between aggression and mating events. We implement a novel Relational Event Model (REM) incorporating temporal dependencies of events in a social network to study natural reproductive behavior in a lek-breeding system where males gather to display and females visit to evaluate mates, often observing both male courtship displays and fights. We find that fighting is not attractive to females. Indeed, males are less likely to start and more likely to leave fights with females present, plausibly to avoid entanglement in protracted combat cycles arising from emergent social processes that reduce availability to mate. However, fighting serves other roles, e.g., to deter copulation interruptions and rebuff competitors. Our findings support the hypothesis that social systems regulating conflict and promoting females’ choice based on display are fundamental to stable lek evolution. Moreover, our analysis highlights the utility of the REM framework in testing mechanistic hypotheses in behavioral ecology and evolution.

This repository contains all the data and code necessary to reproduce the results presented in Snow et al., "Fighting isn't sexy in lekking Greater Sage-grouse: A relational event model approach for mating interactions".

Setup

Follow these steps to set up the project:

Download project repository

1. Download the project repository Snow_et_al_data_code_ESM.zip from Zenodo.
2. Unzip the archive.

Download data

The two datasets archived here on Dryad must be downloaded in order to reproduce the summary tables from the raw simulation data in Section 6.7 (Simulation Experiments; refer to that section for a detailed explanation of the materials contained).

1. Download the archives 'simplified_sim_experiment_results_sd_7days.zip' and 'simplified_sim_experiment_results_sd_7days_rec.zip'.
2. Unzip the downloaded .zip files and move both extracted folders into the 'sim_experiments' directory within the main project folder.
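
The unzip-and-place step above can also be scripted. The following is a minimal Python sketch, not part of the archived code: the function name `extract_archives` and the `download_dir` and `project_dir` arguments are hypothetical, and it assumes each archive unpacks to a single top-level folder.

```python
import zipfile
from pathlib import Path

# Archive names as listed on Dryad.
ARCHIVES = [
    "simplified_sim_experiment_results_sd_7days.zip",
    "simplified_sim_experiment_results_sd_7days_rec.zip",
]


def extract_archives(download_dir, project_dir, archives=ARCHIVES):
    """Extract each archive into <project_dir>/sim_experiments.

    Hypothetical helper: assumes each .zip unpacks to one top-level folder.
    Returns the path of the sim_experiments directory.
    """
    target = Path(project_dir) / "sim_experiments"
    target.mkdir(parents=True, exist_ok=True)
    for name in archives:
        archive = Path(download_dir) / name
        if not archive.is_file():
            raise FileNotFoundError(f"Missing archive: {archive}")
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    return target
```

For example, `extract_archives("~/Downloads", "Snow_et_al_data_code_ESM")` would be a typical call once the paths are expanded to real locations on your machine. Check afterwards that the two result folders sit directly inside sim_experiments.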

Steps to reproduce sage grouse REM results

The following is a guide to the use of the materials, broken down by directory contained in the code package archived on Zenodo.

Data Processing
1. Run the script: full_interaction_data_for_REM_import_code_udata_supp.R
  - This reproduces the consolidated data file:
    full_data_plus_cop_quest.tsv
2. Run the script: rem_udata_centroid_data.R
  - This reproduces the folder: udata_centroid_data and its
    subfolders
    - contains spatial data for the positions of the sage grouse:
      - centroids of movement for each male on each day
      - pairwise distance between centroids for all males on each day
3. Run the script: relevent_formatting_dis_reverse_udata_wincolumn_disturbance. to reproduce:
  - relevent_format_dataset_udata_wincol_disturbance.tsv; the
    interaction network data formatted for use in the REM analysis.
  - event_types_list_udata.csv; a list of all the unique types of events
    that can happen in the scheme of the REM
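
To illustrate the kind of spatial summary produced in step 2, here is a minimal Python sketch. It is not the authors' R code: it assumes a centroid is the arithmetic mean of a male's recorded positions on a given day and that distances are Euclidean, and the function names and input layout are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def daily_centroids(positions):
    """Map each male to the mean (x, y) of his positions for one day.

    positions: iterable of (male_id, x, y) tuples (hypothetical layout).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for male_id, x, y in positions:
        acc = sums[male_id]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {m: (sx / n, sy / n) for m, (sx, sy, n) in sums.items()}


def pairwise_distances(centroids):
    """Euclidean distance between centroids for every unordered pair of males."""
    return {
        (a, b): dist(centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }
```

Running both functions once per observation day would give one centroid table and one pairwise distance table per day, matching the two kinds of output listed above.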
REM pre-analysis
1. Script relevent_analysis_intercepts_udata.R shows the
  process by which we established the base intercept-only model
  for the REM. All optimized models use this intercept model as
  their baseline.
2. The two text files contain the final intercept-only model output
  for the aggression and attractiveness models, respectively. The
  file intercepts_only_attractiveness_model.txt is the basis
  for the annotated Table S2.
3. Script rem.functions.R contains code for each of the
  functions that can add a particular sufficient statistic to the
  REM model. The final models are composed of the basic intercepts
  and optimized subsets of these statistics
Cluster Optimization: These folders contain all the materials and scripts used for running our model selection algorithm on a high-performance computing cluster from a single directory, subsequent analysis, and the creation of summary figures
1. Aggression Model
  - CHG_2014_male_presence_data_quest.csv, CHG_females_exact.tsv,
    event_types_list_udata.csv, male_pres_exact_quest.csv,
    relevent_format_dataset_udata_wincol_disturbance.tsv,
    rem.functions.R, and pairwise_dist are all data files
    identical to their counterparts in the data_files and
    data_processing folders
  - run_pipeline_stepwise.sh
    and run_pipeline_stepwise_restart.sh are shell scripts that
    orchestrate the parallel runs of the REM model on the cluster.
    The former is for starting fresh, and the latter is for
    restarting in the middle of a run
  - genremlist* scripts generate the sets of joblists that
    get sent to the cluster. genremlist_next* is triggered by
    the shell script after each round of model selection; it
    assesses which combination of statistics so far has performed
    the best and generates new sets to test.
    genremlist_restart* creates the next joblist after a run
    on the cluster is paused and needs to be restarted at the
    point where it left off.
  - The script rem_cluster_model_selection_stepwise_udata_terr.R
    encodes a single run of the REM model on the
    cluster for a given set of sufficient statistics.
  - Cluster run outcomes:
    - previous_best_initial_run shows the set of statistics
      from the initial run of model selection, including some
      non-significant statistics
    - previous_best_re-run_without_226_283 shows the result
      from removing the non-significant statistics from the pool
      and re-running the algorithm from the beginning
    - previous_best_remove_226_283_at_the_end shows the result
      from removing the non-significant statistics from the final
      results of the initial run, and then restarting the
      algorithm from that point. It converges immediately and with
      the best BIC; this represents the final statistic set
      reflected in the final aggression model results, and this
      file is the one referred to in the analysis script.
  - The script best_result_full_model_forward_udata_terrsol_analysis.R
    goes through the final model selection
    process and the best-fit statistic set result in detail and produces the
    figures contained in the summary_figs folder, as well as
    the output files rem_model_coef and rem_model_sd,
    giving the values for the coefficients for the best-fitting
    model and the standard deviations for the coefficients, respectively.
  - For reference, aggression_model_output_full.txt contains
    the full model output for the best-fitting aggression model,
    reported as Table S4.
2. Attractiveness Model
  - The data files, shell scripts, and "genremlist" scripts are
    equivalent to their counterparts found in the aggression model
    folder
  - The script rem_cluster_model_selection_stepwise_udata_terr_id.R
    encodes a single run of the REM model
    on the cluster for a given set of sufficient statistics,
    using the base intercept model that accounts for
    individual-level attractiveness effects.
  - Cluster run outcomes:
    - previous_best_remove_226_at_the_end shows the set of
      statistics from the initial run of model selection in the
      second-to-last row, and the result of removing the
      non-significant statistic and then restarting the algorithm
      from that point in the last row. The algorithm converges
      immediately, and so the result is the same as what one gets
      by simply removing the single non-significant predictor.
      This is the final set of statistics reflected in the final
      attractiveness model results, and this file is the one
      referred to in the analysis script.
    - previous_best_re_run_without_226 shows the result from
      removing the non-significant statistic from the pool and
      re-running the algorithm from the beginning.
  - Analogous to the script in the Aggression Model folder,
    the script best_result_full_model_forward_udata_terrsol_id_analysis.R
    goes through the final model selection process
    and the best-fit statistic set result in detail and produces the
    figures contained in the summary_figs folder, as well as
    the output files rem_model_coef and rem_model_sd,
    giving the values for the coefficients for the best-fitting
    model and the standard deviations for the coefficients,
    respectively.
    - The plot allplot_terr_id_model_grid_fem.pdf corresponds
      to the final Figure 2 in the main text
  - For reference, attractiveness_model_output_full.txt
    contains the full model output for the best-fitting
    attractiveness model, reported as Table S4.
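
The round-by-round selection described for the genremlist scripts can be pictured as a greedy forward search on BIC. The Python sketch below is only an illustration under that assumption. It is not the cluster pipeline: `forward_select` and `score` are hypothetical names, and in the real workflow each call to `score` would correspond to fitting one REM on the cluster.

```python
def forward_select(candidates, score, start=()):
    """Greedy forward selection on a criterion where lower is better (e.g. BIC).

    candidates: statistic identifiers that may be added to the model.
    score: callable taking a tuple of identifiers and returning the criterion.
    start: statistics already in the model (useful when restarting a run).
    """
    current = tuple(start)
    best = score(current)
    remaining = [c for c in candidates if c not in current]
    while remaining:
        # One "round": try adding each remaining statistic to the current set.
        trial_score, pick = min((score(current + (c,)), c) for c in remaining)
        if trial_score >= best:
            break  # no addition improves the criterion, so the search has converged
        current, best = current + (pick,), trial_score
        remaining.remove(pick)
    return current, best
```

Passing a previously selected set as `start` mimics restarting the algorithm from an earlier result. If no remaining statistic improves the criterion, the function returns that set unchanged, which is the "converges immediately" behavior noted above.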
Adequacy Checking: These folders contain the scripts (and associated products) used for examining the adequacy of the optimized REM models for describing the observed data.
1. Aggression Model
  - Run the script relevent_analysis_base_udata_terr_adequacy.R to
    reproduce the basic adequacy analysis as well as:
    - CSV files per_event_adequacy_data_full
      and per_event_adequacy_data_summary; these contain the lists
      of hazards the REM model calculates for each event in the
      observed dataset and some summary statistics for the
      calculated hazards, respectively.
    - Folder main plots; contains summary plots describing
      Aggression Model adequacy, including event matching, event
      rank histograms, recall, and deviance residuals.
2. Attractiveness Model
  - Run the script relevent_analysis_base_udata_terr_id_adequacy.R
    to reproduce the basic adequacy analysis as well as:
    - CSV files per_event_adequacy_data_full
      and per_event_adequacy_data_summary; as above,
      these contain the lists of hazards the REM model calculates
      for each event in the observed dataset and some summary statistics
      for the calculated hazards, respectively.
    - Folder main plots; contains summary plots describing
      Attractiveness Model adequacy, including event matching,
      event rank histograms, recall, and deviance residuals.
      - recall_rank_comb.pdf corresponds to Figure S7
      - devmeanplot.pdf corresponds to Figure S8
  - Run the script availability_interruption_plots_terr_id_new.R to
    reproduce:
    - matingplot_new.pdf which corresponds to Figure S4
    - summary plot fight_count_fig.pdf
  - Run the script fight_effect_explore.R to reproduce:
    - fight_effect_fig.pdf which corresponds to Figure 4
      in the main text
  - Run the script time_fighting.R to reproduce:
    - Plot time_fighting_fig.pdf which corresponds to Figure S5
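
As a rough guide to the event rank and recall summaries mentioned above, the following Python sketch ranks the observed event among the model's hazards and computes the share of events ranked within the top k. This is an assumed, simplified definition rather than the authors' R implementation. The function names and input layout are hypothetical.

```python
def observed_rank(hazards, observed):
    """Rank of the observed event type by hazard (1 = highest).

    hazards: dict mapping each possible event type to its model hazard.
    Ties are counted against the model, so tied events share the worst rank.
    """
    h = hazards[observed]
    return sum(1 for value in hazards.values() if value >= h)


def recall_at_k(per_event, k):
    """Fraction of observed events whose rank is at most k.

    per_event: list of (hazards dict, observed event type) pairs.
    """
    ranks = [observed_rank(h, obs) for h, obs in per_event]
    return sum(r <= k for r in ranks) / len(ranks)
```

A histogram of `observed_rank` values concentrated near 1 would indicate that the model tends to assign high hazard to the events that actually occurred.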
Event History Simulation: These folders contain all the materials and scripts used for producing simulated event history datasets based on the best-fit REM models on a high-performance computing cluster from a single directory, subsequent analysis, and the creation of summary figures.
1. Aggression Model
  - CHG_2014_disturbance_events.csv,
    CHG_2014_male_presence_data_quest.csv, CHG_females_exact.tsv,
    event_types_list_udata.csv, male_pres_exact_quest.csv,
    relevent_format_dataset_udata_wincol_disturbance.tsv, and
    pairwise_dist are all data files identical to their
    counterparts in the data_files and data_processing folders
  - rem.functions.act.terr.sim.R contains code for each of the
    functions that can add a particular sufficient statistic to
    the REM model. They correspond to the functions in the
    rem.functions.R file in the rem_pre_analysis folder, but in
    this case they are coded in order to be robust to a growing,
    simulated event history, rather than an existing observed
    dataset.
  - previous_best_remove_226_283_at_the_end and rem_model_coef
    are identical to the files representing the final statistic
    set for the best-fitting Aggression Model results, and the
    corresponding coefficients, respectively, found in
    cluster_optimization/Aggression_model/.
  - run_pipeline_sim_act.sh is a shell script that organizes
    the parallel simulations on the cluster
  - genremlist_sim.R script is called by the shell script and
    generates the joblist that gets sent to the cluster
  - rem_event_hist_simulation_cluster_udata_terr_actualday_newcensus.R
    encodes a single simulation run, in this case, based on the
    best-fit Aggression REM.
  - The folder sim_results_actualday contains all the
    simulated data produced, in this case based on the best-fit
    Aggression Model: 100 simulations for each of the 18 days in
    the empirical dataset.
    - !! Due to storage considerations, the folders within have
      been compressed as .zip files; in order to run the following
      analysis script, these must first be expanded.
  - The script sim_output_analysis_cluster_new_3.R brings
    together the simulation data and produces summary analyses and
    figures, contained in the folder sim_actualday_plots:
    - Each day_level subfolder contains sets of plots
      summarizing the simulations for each simulated day in the
      dataset
    - Other plots summarize the characteristics of the simulated
      "season" of data overall and compare them to the observed
      data, including overall counts of different types of events
      and overall network characteristics of the event histories
      - File matesmall.pdf corresponds to Figure 3, panel A in
        the main text
2. Attractiveness Model
  - The data files, shell script, "genremlist" script, and
    function list are equivalent to their counterparts found in
    event_history_simulation/Aggression_model/
  - previous_best_remove_226_at_the_end and rem_model_coef are
    identical to the files representing the final statistic set
    for the best-fitting Attractiveness Model results, and the
    corresponding coefficients, respectively, found in
    cluster_optimization/Attractiveness_model/.
  - rem_event_hist_simulation_cluster_udata_terr_id_actualday_newcensus.R
    encodes a single simulation run, in this case, based on the
    best-fit Attractiveness REM.
  - The folder sim_results_actualday contains all the
    simulated data produced, in this case based on the best-fit
    Attractiveness Model: 100 simulations for each of the 18 days
    in the empirical dataset.
    - !! Due to storage considerations, the folders within have
      been compressed as .zip files; in order to run the following
      analysis script, these must first be expanded.
  - As above, the script sim_output_analysis_cluster_new_3.R
    brings together the simulation data and produces summary
    analyses and figures, contained in the folder sim_actualday_plots:
    - As above, each day_level subfolder contains sets of
      plots summarizing the simulations for each simulated day in
      the dataset
      - nodemetsplot_2014-03-28.pdf corresponds to Figure S11
    - Other plots summarize the characteristics of the simulated
      season of data overall and compare them to the observed
      data, including overall counts of different types of events
      and overall network characteristics of the event histories
      - File overallcount_plot.pdf corresponds to Figure S9
      - File attendfight_combined.pdf corresponds to Figure S10
      - File overallnet_plot.pdf corresponds to Figure S12
      - File matesmall.pdf corresponds to Figure 3, panel B in
        the main text
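
For readers unfamiliar with how an event history can be simulated from a fitted REM, the sketch below shows the generic idea in Python, assuming hazards stay constant between events: draw an exponential waiting time from the total hazard, then pick the event type in proportion to its hazard. It is not the archived simulation script, which also handles daily presence data and disturbances. The names `simulate_events` and `hazard_fn` are hypothetical.

```python
import random


def simulate_events(hazard_fn, t_end, rng=None):
    """Simulate (event, time) pairs until t_end, assuming piecewise-constant hazards.

    hazard_fn(history, t) returns a dict of event type -> hazard given the
    growing simulated history, which is why the statistics must be computable
    from a history that is still being built.
    """
    if rng is None:
        rng = random.Random()
    t, history = 0.0, []
    while True:
        hazards = hazard_fn(history, t)
        total = sum(hazards.values())
        if total <= 0:
            break  # nothing can happen any more
        t += rng.expovariate(total)  # waiting time to the next event
        if t >= t_end:
            break
        types = list(hazards)
        event = rng.choices(types, weights=[hazards[e] for e in types])[0]
        history.append((event, t))
    return history
```

Repeating such a run many times per day, as in the 100 simulations per day described above, gives a distribution of simulated histories that can be compared with the observed one.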

6. Simulation Experiments: This folder contains all the materials and scripts used for producing simulated "experiments" on a simplified, imaginary lek using the best-fit Attractiveness REM models as a basis. Includes code for running simulations on a high-performance computing cluster, subsequent analysis, and the creation of summary figures. See main text for more detailed description of the simulation scenarios.

- CHG_2014_disturbance_events.csv,
  CHG_2014_male_presence_data_quest.csv, CHG_females_exact.tsv,
  event_types_list_udata.csv, male_pres_exact_quest.csv,
  relevent_format_dataset_udata_wincol_disturbance.tsv, and the
  pairwise_dist directory are all data files identical to their
  counterparts in the data_files and data_processing folders.
- rem.functions.act.terr.sim.R contains code for each of the
  functions that can add a particular sufficient statistic to the
  REM model. As in the Event History Simulation folders, they
  correspond to the functions in the rem.functions.R file in the
  rem_pre_analysis folder, but in this case they are coded to be
  robust to a growing, simulated event history, rather than an
  existing observed dataset.
- previous_best_remove_226_at_the_end and rem_model_coef are
  identical to the files representing the final statistic set for
  the best-fitting Attractiveness Model results, and the
  corresponding coefficients, respectively, found in
  cluster_optimization/Attractiveness_model/.
- run_pipeline_*.sh are shell scripts that organize the
  parallel simulations on the cluster for the control and treatment
  simulations for both the Average Receive Hazard case and the High
  Receive Hazard case.
- genremlist_simex_*.R scripts are called by their respective
  shell scripts and generate the joblist arrays that get submitted
  to the cluster.
- sim_ex_simplified_3_sd.R and sim_ex_simplified_3_sd_rec.R
  each encode a single simulation run for the Average Receive
  Hazard case and the High Receive Hazard case, respectively.
- The folders simplified_sim_experiment_results_sd_7days
  and simplified_sim_experiment_results_sd_7days_rec contain all
  the simulated data produced for the Average Susceptibility case
  and the High Susceptibility case, respectively: for each case,
  there are 1000 simulations for each of the three treatments and a
  control.
- Inside each folder, you will find a directory called 8days (representing experiments with seven consecutive simulated days and one day of burn-in at the beginning). Within this directory, the folders are structured as follows:

    8days
    ├── attack
    │   └── male_1
    │       └── level_sd
    │           ├── experiment_1_8_0_1_1_sd_001
    │           ├── experiment_1_8_0_1_1_sd_002
    │           ├── ...
    │           └── experiment_1_8_0_1_1_sd_1000
    │               ├── evlist_day_1.csv
    │               ├── evlist_day_2.csv
    │               ├── ...
    │               ├── evlist_day_9.csv
    │               ├── full_event_list.csv
    │               ├── full_event_list_day_1.csv
    │               ├── full_event_list_day_2.csv
    │               ├── ...
    │               ├── full_event_list_day_9.csv
    │               ├── hazards
    │               ├── pres_exact_day_1.csv
    │               ├── pres_exact_day_2.csv
    │               ├── ...
    │               └── pres_exact_day_9.csv
    ├── attack_plus_disengage
    │   └── male_1
    │       └── level_sd
    │           ├── experiment_1_8_0_1_1_sd_001
    │           ├── experiment_1_8_0_1_1_sd_002
    │           └──  ...
    ├── disengage
    │   └── male_1
    │       └── level_sd
    │           ├── experiment_1_8_0_1_1_sd_001
    │           ├── experiment_1_8_0_1_1_sd_002
    │           └──  ...
    └── control
        ├── experiment_1_8_0_1_1_sd_001
        ├── experiment_1_8_0_1_1_sd_002
        └──  ...

For the three treatment folders, the subfolders indicate the focal male and the level of the treatment (male 1 and one standard deviation, respectively). Each experiment folder contains the output from one run of the simulation. Only a small subset of folders and files are shown here for clarity. Ellipses (...) indicate omitted, sequentially numbered entries with the same structure/contents.
- Each evlist*.csv file contains the sequential, simulated list of
  events for the indicated simulated day, with events numerically coded
  in REM format (i.e., two columns, one for event code and one for time
  in s). In this case, day 1 is a dummy placeholder and the
  simulation proper starts on day 2.
- The full_event_list.csv file contains the entire simulated
  sequence of events concatenated across days. It contains four columns:
  'Events' coded with event type names, 'Time' in s, 'Females' denoting
  female presence as 1 for no and 2 for yes, and 'date' indicating the
  simulated day. The other full_event_list_day*.csv files contain the
  same information and structure but split by simulation day.
- The large hazards.csv file records all of the calculated hazards
  for each possible event type at each moment for the whole simulation,
  with a column for each event type (see main text) and rows for
  timepoints.
- Each pres_exact_day*.csv file contains a dataframe of arrival events
  for each male on the simulated lek. This has the same format as the
  full_event_list files but with an additional column for 'Bird_ID'. For
  the simulations archived here, all the pres_exact files are identical,
  recording arrivals for all six males at time 0 and no departures.

The control folder lacks the additional organization by focal male and
treatment level and directly contains experiment folders. These all have
similar structure and contents to those within the treatment directories.
!! Due to storage considerations, these folders must be
downloaded separately from the online Supplemental Material
(Dryad DOI: 10.5061/dryad.w9ghx3g1f). The summary scripts will
not work without the data downloaded and the directories placed
inside the sim_experiments folder.
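
To inspect a single experiment folder without running the full summary pipeline, something like the following Python sketch may help. It relies only on the full_event_list.csv column names given above ('Events', 'Time', 'Females'). The function names are hypothetical, and it assumes the 'Females' column parses as the numbers 1 and 2.

```python
import csv
from collections import Counter
from pathlib import Path


def load_full_event_list(experiment_dir):
    """Read full_event_list.csv from one experiment folder into a list of dicts."""
    path = Path(experiment_dir) / "full_event_list.csv"
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    for row in rows:
        row["Time"] = float(row["Time"])  # seconds
        # 'Females' is coded 1 = absent, 2 = present (per the description above).
        row["females_present"] = int(float(row["Females"])) == 2
    return rows


def count_events(rows, females_present=None):
    """Count events by type, optionally restricted by female presence."""
    selected = (
        r for r in rows
        if females_present is None or r["females_present"] == females_present
    )
    return Counter(r["Events"] for r in selected)
```

Calling `count_events(rows, females_present=True)` on one run gives a quick tally of what happened while females were on the simulated lek. The archived R summary functions remain the reference for the reported metrics.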

The script sim_experiment_simplified_summary_functions.R defines the functions that are used to summarize the raw simulation data.
The script sim_experiment_simplified_data_prep.R brings
together all of the simulation data and summarizes it into
data tables, which are pre-saved in this repository and
contained in the folder summary_data:
- simex_1000_7days_focal_boxplot_combined.csv contains the
  disaggregated summary metrics calculated for the focal male in
  each of the simulation runs.
- simex_1000_7days_focal_output_combined.csv contains the mean
  values for each metric in each treatment, the difference of each
  treatment from the control, and the z-score representing each
  contrast.
- simex_1000_7days_focal_mean_SEs_for_plotting.csv contains
  the mean values for each metric in each treatment and their
  respective standard errors.
WARNING: If you wish to reproduce the figures from the raw data, note that this script is very RAM intensive and takes a long time
(~1 hour) to run. We recommend at least 32 GB of free RAM.
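
The summary tables above report means, standard errors, differences from the control, and z-scores. The Python sketch below shows one common way to compute such quantities. The exact formulas used by the archived R scripts are not stated here, so the z-score definition (difference divided by the combined standard error) is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_se(values):
    """Return (mean, standard error of the mean) for a list of per-run metrics."""
    return mean(values), stdev(values) / sqrt(len(values))


def contrast_with_control(treatment, control):
    """Difference in means and an assumed z-score for treatment vs. control."""
    m_t, se_t = mean_se(treatment)
    m_c, se_c = mean_se(control)
    diff = m_t - m_c
    z = diff / sqrt(se_t ** 2 + se_c ** 2)  # assumed definition, see lead-in
    return {"mean": m_t, "se": se_t, "diff": diff, "z": z}
```

With 1000 runs per treatment, each input list would hold 1000 values of one focal-male metric.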
The script sim_experiment_simplified_output_plots.R produces plots based
on the data contained in the summary_data folder, and places them in the new
directory plots:
- sim_meanplot_1000_7days_sd_focal_combined.pdf corresponds to Figure S13.
- sim_bubbleplot_1000_7days_sd_focal_combined.pdf is the direct
  output of the script and is the basis for Figure 5.
- sim_bubbleplot_1000_7days_sd_focal_combined_labels.pdf is the
  final labeled version for Figure 5 in the main text.

7. Other supporting materials

The folder Predictor_summary_supplement contains the
materials for summarizing the set of effects tested in the REM
model
- Run script funlist_formatting.R to reproduce funlist_text,
  which is the basis for Table S1, the
  full annotated function list.
- Table_S1_categorized.csv corresponds to the full Table
  S1
- Run script predictor_summary_alluvial_plot_clean.R to
  reproduce plots visually summarizing the model scheme:
  - predictor_summary_alluvial_all.pdf plot
    corresponds to Figure S2
  - predictor_summary_alluvial_significant.pdf plot
    corresponds to Figure S3
  - predictor_summary_alluvial_*_labels.pdf files have
    adjusted labels as seen in the final plots in the text.
Run script supp_corplot_fight_by_mating.R:
- Reproduces supp_corplot_fight_by_mating.pdf which
  corresponds to Figure S6
The file Appendix_S2_Protocol for Collecting Male Interaction
Data.pdf contains the standardized protocol for extracting the
behavioral data from videos recorded in the field.
The file 2014_Chugwater_Lek_Counts_clean.xlsx contains
supplemental data on lek conditions and attendance by males and
females collected in the field.
The folder srt files contains the raw .srt subtitle data from
video annotations.
The folder CHG script export materials contains the Python
script and supporting information on the placement of the stake
array (see Methods) necessary for converting the .srt files
into usable data tables.
The folder spatial_mapping contains materials for
reproducing territory maps based on positional data
- Run script territory_kernel_full_data_plus_pos.R to
  reproduce territory_plots_pts.pdf which corresponds
  to Figure S14.
