Data from: Repeatability of adaptation in sunflowers reveals that genomic regions harbouring inversions also drive adaptation in species lacking an inversion
Data files
Nov 13, 2023 version files 1.24 GB

README.md

sunflower_repeated_adaptation_dryad.tbz
Abstract
Local adaptation commonly involves alleles of large effect, which experience fitness advantages when in positive linkage disequilibrium (LD). Because segregating inversions suppress recombination and facilitate the maintenance of LD between locally adapted loci, they are also commonly found to be associated with adaptive divergence. However, it is unclear what fraction of an adaptive response can be attributed to inversions and alleles of large effect, and whether the loci within an inversion could still drive adaptation in the absence of its recombination-suppressing effect. Here, we use genome-wide association studies to explore patterns of local adaptation in three species of sunflower: Helianthus annuus, H. argophyllus, and H. petiolaris, which each harbour a large number of species-specific inversions. We find evidence of significant genome-wide repeatability in signatures of association to phenotypes and environments, which are particularly enriched within regions of the genome harbouring an inversion in one species. This shows that while inversions may facilitate local adaptation, at least some of the loci can still harbour mutations that make substantial contributions without the benefit of recombination suppression in species lacking a segregating inversion. While a large number of genomic regions show evidence of repeated adaptation, most of the strongest signatures of association still tend to be species-specific, indicating substantial genotypic redundancy for local adaptation in these species.
README: Repeatability of adaptation in sunflowers reveals that genomic regions harbouring inversions also drive adaptation in species lacking an inversion
This archive contains the intermediate data files and scripts required to create the main and supplementary figures for the paper titled "Repeatability of adaptation in sunflowers: genomic regions harbouring inversions also drive adaptation in species lacking an inversion"
The folder called "scripts" contains the scripts.
The folder called "data_intermediate_files" contains the data necessary to run the scripts.
The folder called "results" includes some of the outputs from the different analyses, including lists of the top windows identified by PicMin.
Versions & Packages
All R scripts are run in R v4.1.2
Scripts use the following packages:
RColorBrewer
gplots
wesanderson
ggplot2
ggdist
data.table
dplyr
ggridges
poolr
pheatmap
GenomicRanges
magrittr
biomaRt
topGO
enrichplot
ggpubr
stringr
foreach
doParallel
Two Perl scripts for processing VCFs are also included (vcf2vertical_sunflower_modified.pl & snp_coverage.pl).
NA values in datasets:
 1. For all scripts running analyses on the matrices of p-values from the genome-wide association tests (with environment and phenotype), NA values indicate that no phenotype/environment was measured in that species.
 2. For all scripts running analyses on the matrices of top candidate scores, NAs occur either when condition 1 applies or when there are not enough SNPs in a window to calculate a score.
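As a minimal sketch of how these NA values might be handled outside the archived R scripts, the Python snippet below reads a small score matrix and keeps only windows with a value in every species. The column names follow the dbinom score files described later in this README; the rows themselves are made up for illustration, and Python is used only as an example language.

```python
import csv
import io
import math

# Hypothetical excerpt of a score matrix; "NA" marks a missing value.
EXAMPLE = """window_id,Annuus,Argophyllus,petfal,petpet
w1,2.5,NA,1.1,0.3
w2,0.7,0.2,0.9,4.0
w3,NA,NA,3.2,1.5
"""

def parse_score(text):
    """Convert a field to float, mapping the literal string 'NA' to NaN."""
    return math.nan if text == "NA" else float(text)

def load_scores(handle):
    """Read rows as dicts with numeric species columns (NaN where NA)."""
    rows = []
    for record in csv.DictReader(handle):
        row = {"window_id": record["window_id"]}
        for key, value in record.items():
            if key != "window_id":
                row[key] = parse_score(value)
        rows.append(row)
    return rows

def complete_rows(rows, species):
    """Keep only windows with a non-missing score in every listed species."""
    return [r for r in rows if not any(math.isnan(r[s]) for s in species)]

rows = load_scores(io.StringIO(EXAMPLE))
kept = complete_rows(rows, ["Annuus", "Argophyllus", "petfal", "petpet"])
print([r["window_id"] for r in kept])
```

Whether dropping incomplete windows is appropriate depends on the analysis; the archived scripts may treat NAs differently.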
SCRIPTS
The following scripts make the figures indicated in their names:
 figure3B_S15_plot_cscores_figure_6way_FIXED_NFFD.R
 figure3A_dbinom_clusters_NFFD15_bothdir.R
 figure4_S3_S13_S14_process_MAX_SIPEC_CRAs_FIGURE_PCA_pairwise.R
 figure5_plot_top_candidate_overlap_inversions_per_species.R
 figureS15CD_process_dbinom_clusters_chyper_find0_6way_FP_any_eref.R
 figureS2_plot_violins_sunflower.R
 figureS5_S6_process_phe_env_corr_topcand_introgression.R
 figureS12_introgression_SIPEC_CRAs.R
 figures4.R
 Figure_S7_A_B_twosided_barplot_convergent_clusters_overlapes_inversions.R
 Figure_S7_C_D_hetmaps_convergent_clusters_overlapes_inversions.R
 Figure_S9_10_GO_Enrichment.R
These scripts are used for preprocessing and must be run before the figure scripts above:
 process_dbinom_clusters_picmin_nopetfal_baypass_fixed.R: runs PicMin on the Baypass results, excluding petiolaris fallax and comparing the other 3 taxa
 process_dbinom_clusters_picmin_nopetfal_fixed.R: runs PicMin on the data, excluding petiolaris fallax and comparing the other 3 taxa
 process_dbinom_clusters_picmin_nopetpet_baypass_fixed.R: runs PicMin on the Baypass results, excluding petiolaris petiolaris and comparing the other 3 taxa
 process_dbinom_clusters_picmin_nopetpet_fixed.R: runs PicMin on the data, excluding petiolaris petiolaris and comparing the other 3 taxa
 process_picmin_results_by_genomic_region_across_vars.R: Runs clustering statistics on the PicMin results, collapsing nearby windows together for each region (across variables)
 process_picmin_results_per_variable.R: Runs clustering statistics on the PicMin results, collapsing nearby windows together for each variable
 process_top_candidate_overlap_inversions_per_species.R: Processes the inversion overlap, used in figure 5.
 pull_annotations_sunflower.R: This pulls annotations for genes near each of the significant PicMin results
Data
The following data files are included in the archive, with descriptions of their contents. Unfortunately, because the main authors were not available to refine the documentation, the meanings of some variables are not known to SY, who prepared these descriptions.
FigureS4data.csv
 species: species
 recombination_bin: which recombination bin
 Values: local recombination rate
 logValues: log transform of column 3
GWAScorrected_nullw_result_recom_bin
 Taxa1: focal species in comparison
 Taxa2: alternate species in comparison
 Variable: environmental variable
 quantile: recombination quantile bin
 Window: window chromosome:position
 W: W statistic
 P: p value
 Mean_test: unknown
 N_samples: number of SNPs
 Mean_BG: unknown
 N_BG: number of SNPs in background window
 Count: unknown
 Pred_P: Z-score
 Emp_p: empirical p-value (does it fall in the top 5% of the tail?)
 window_name: window ID
H.annuus_corr_env_pheno_p1_91_cleaned.txt
 phenotype: phenotype
 environment: environment
 H.annuus: correlation between phenotype and environment
H.argophyllus_corr_env_pheno_output_p1_31_cleaned.txt
 phenotype: phenotype
 environment: environment
 H.argophyllus: correlation between phenotype and environment
H.pet.fall_corr_env_pheno_output_p1_71_cleaned.txt
 phenotype: phenotype
 environment: environment
 H.pet.fall: correlation between phenotype and environment
H.pet.pet_corr_env_pheno_output_p1_71_cleaned.txt
 phenotype: phenotype
 environment: environment
 H.pet.pet: correlation between phenotype and environment
HAN412_Eugene_curated_v1_1.gff3
Standard gff3 format with columns:
 Chromosome
 Eugene
 Gene/mRNA/CDS/exon
 start position
 end position
 .
 orientation
 .
 annotation information
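A minimal sketch of reading one feature line with the nine columns listed above is given below. The field names in `GffRecord` are illustrative labels chosen here, and the example line (sequence name, gene ID) is hypothetical rather than taken from the file.

```python
from typing import NamedTuple

class GffRecord(NamedTuple):
    seqid: str          # chromosome
    source: str         # annotation source (column 2)
    feature_type: str   # gene / mRNA / CDS / exon
    start: int          # 1-based start position
    end: int            # 1-based inclusive end position
    score: str          # "." when absent
    strand: str         # orientation: "+" or "-"
    phase: str          # "." when not applicable
    attributes: dict    # key=value pairs from column 9

def parse_gff3_line(line):
    """Split one tab-delimited GFF3 feature line into its nine columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    attrs = {}
    for pair in fields[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return GffRecord(fields[0], fields[1], fields[2], int(fields[3]),
                     int(fields[4]), fields[5], fields[6], fields[7], attrs)

# Hypothetical feature line for illustration only.
example = "Ha412HOChr01\tEuGene\tgene\t1000\t2500\t.\t+\t.\tID=gene:example0001;Name=example0001"
record = parse_gff3_line(example)
print(record.feature_type, record.start, record.end, record.attributes["ID"])
```

Comment lines beginning with "#" would need to be skipped before calling `parse_gff3_line`.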
Ha412HO_inv.v3.inversions.regions.v1.txt
 group: grouping
 start: start position
 end: end position
 chr: chromosome
 spe: species
 species: species (alternate naming)
 dataset: which dataset identified the putative inversion
 mds: direction
Hannuus_Athaliana_shared_OG.rds
 Orthogroup: orthogroup
 OF_gene: gene ID
 gea_gene: position of gene
 Athal_genes: Arabidopsis thaliana annotations
SNP_count.csv
 window5: 5kb window chromosome/position
 window: unique identifier
 Annuus: number of SNPs in Annuus
 Argophyllus: number of SNPs in Argophyllus
 petpet: number of SNPs in H. pet pet
 petfal: number of SNPs in H. pet fal
 Annuus_Argophyllus: number of SNPs in both Annuus and Argophyllus
 Annuus_petpet: number of SNPs in both Annuus and petpet
 Annuus_petfal: number of SNPs in both Annuus and petfal
 Argophyllus_petpet: number of SNPs in both Argophyllus and petpet
 Argophyllus_petfal: number of SNPs in both Argophyllus and petfal
 petpet_petfal: number of SNPs in both petpet and petfal
all_res_cscore_FP_anygene.txt
 V1: code for comparison among species (1 = H.annuus-H.argophyllus, 2 = H.annuus-H.pet.fallax, 3 = H.annuus-H.pet.petiolaris, 4 = H.argophyllus-H.pet.fallax, 5 = H.argophyllus-H.pet.petiolaris, 6 = H.pet.fallax-H.pet.petiolaris)
 V2: Environmental variable
 V3: code for comparison among species (1 = H.annuus-H.argophyllus, 2 = H.annuus-H.pet.fallax, 3 = H.annuus-H.pet.petiolaris, 4 = H.argophyllus-H.pet.fallax, 5 = H.argophyllus-H.pet.petiolaris, 6 = H.pet.fallax-H.pet.petiolaris)
 V4:V20: the inferred number of genes that would give a cscore of 0 given observed overlap between the pair of species, assuming different false positive rates (each column with a false positive rate ranging from 0 to 0.8 in increments of 0.05).
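The correspondence between columns V4:V20 and false positive rates can be sketched as follows. This assumes the columns are stored in ascending order of rate, which the description implies but does not state explicitly; the function name is illustrative.

```python
def false_positive_rates(start=0.0, stop=0.8, step=0.05):
    """Return the grid of false positive rates from start to stop inclusive."""
    n_steps = round((stop - start) / step)
    # Round to 2 decimals to avoid floating-point artefacts such as 0.15000000000000002.
    return [round(start + i * step, 2) for i in range(n_steps + 1)]

rates = false_positive_rates()

# Assumption: V4 holds the lowest rate and V20 the highest.
column_to_rate = {f"V{4 + i}": rate for i, rate in enumerate(rates)}

print(len(rates))
print(column_to_rate["V4"], column_to_rate["V20"])
```

A grid from 0 to 0.8 in steps of 0.05 has 17 values, which matches the 17 columns V4 through V20.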
all_windows_annot.txt
 V1: top candidate index for annuus
 V2: top candidate index for argophyllus
 V3: top candidate index for petpet
 V4: top candidate index for petfal
 V5: chromosome
 V6: start position
 V7: end position
 V8: Window ID
 V9: Environmental variable
 V10: picmin p-value
 q: FDR-adjusted picmin p-value
 inversion_overlap: whether the window overlaps an inversion (0 = no, 1 = yes)
 recomb_rate: recombination rate
 recomb_quantile: bin for recombination rate (1-5)
 merg1[,1]: chromosome and position
 annot: annotation associated with the window
 nearby_annot: nearest annotation
 nearby_annot_dist: distance in bp to nearest annotation
 nearby_gff_dist: distance to closest gff annotation
 relpos: integer (meaningless, used to check for bugs)
convergent_cluster_inversion_overlap_LD0.9_1cM_ranges_verbose
 data: which type of data is this variable
 analysis: is it structure corrected (baypass = yes, spearman = no, gwas_uncorrected = no, gwas_corrected = yes)
 variable: environmental variable/phenotype
 direction: focal/alternate species to call WRAs
 comparison: focal/alternate species to call WRAs
 chromosome: chromosome
 cluster_start: start position
 cluster_end: end position
 cluster_size: size of cluster
 N_convergent: Number of WRAs
 inversion_species: species in which the inversion was identified
 inversion_ID: ID for inversion
 inversion_start: start position of inversion
 inversion_end: end position of inversion
 inversion_size: size of inversion
 overlap_start: region of overlap start
 overlap_end: region of overlap end
 overlap_size: region of overlap size
 is_cluster_overlapping: yes/no
convergent_clustering_0.9_1_cM_summary
 data: input data type (climate, soil, phenotype)
 analysis: type of analysis (baypass, spearmans, gwas_corrected, gwas_uncorrected)
 variable: environmental variable
 direction: focal and alternate species
 chromosome: chromosome
 range: start:end position
 size: size of region
 N_convergent: number of WRAs in cluster
convergent_clustering_0.9_1cM_summary
NOTE: this file is used to collapse adjacent windows into clusters for Cscores analysis. Not used in other aspects of CRAs.
 V1: species
 V2: analysis (spearman, gwascorrected)
 V3: environmental variable
 V4: chromosome
 V5: start:end position
 V6: window size
 V7: number of windows
dbinom_score_spearman_Eref
 Annuus: top candidate index for Annuus
 Argophyllus: top candidate index for Argophyllus
 petfal: top candidate index for H. pet fal
 petpet: top candidate index for H. pet pet
 chrom: chromosome
 start_window: window start
 end_window: window stop
 window_id: window ID
 variable: environmental variable
dbinom_score_spearman_NFFD
 Annuus: top candidate index for Annuus
 Argophyllus: top candidate index for Argophyllus
 petfal: top candidate index for H. pet fal
 petpet: top candidate index for H. pet pet
 chrom: chromosome
 start_window: window start
 end_window: window stop
 window_id: window ID
 variable: environmental variable
dbinom_scores
NOTE: this directory contains many files, all with the same column format; their filenames indicate the environmental variable and the kind of analysis used (baypass = structure corrected, spearman = non-corrected). The top candidate index values have been -1 * log10 transformed. NAs correspond to missing data due to lack of sequence information.
 Annuus: top candidate index for Annuus
 Argophyllus: top candidate index for Argophyllus
 petfal: top candidate index for H. pet fal
 petpet: top candidate index for H. pet pet
 chrom: chromosome
 start_window: window start
 end_window: window stop
 window_id: window ID
 variable: environmental variable
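As a small sketch of the transform mentioned in the note for this directory, the functions below convert between a probability-scale value and its -log10 form. This assumes the stored scores are -log10 of an underlying p-value-like quantity, which is an interpretation of the note rather than something confirmed here; the function names are illustrative.

```python
import math

def to_top_candidate_index(p_value):
    """Apply the -1 * log10 transform to a probability in (0, 1]."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -math.log10(p_value)

def to_p_value(index):
    """Invert the -1 * log10 transform."""
    return 10.0 ** (-index)

score = to_top_candidate_index(0.001)
print(score)
print(to_p_value(score))
```

On this scale, larger scores correspond to smaller underlying probabilities, so a cutoff of 3 would correspond to 0.001.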
fixmin_tophits_alphatop1000_all_tested_windows.txt
NOTE: lists all of the windows that were tested in picmin
 V4: chromosome
 V5: start position
 V6: end position
 V7: window ID
fixmin_tophits_alphatop1000_recomb_FDR10_3way_pervar.txt
NOTE: This lists the clusters of adjacent windows with significant picmin hits
 cluster_chrom: chromosome of cluster of significant picmin hits
 cluster_start: cluster start position
 cluster_end: cluster end position
 cluster_var: environmental variable
 cluster_inv: is it overlapping an inversion
 cluster_numwin: how many windows are in the cluster
 clus1: mean empirical p for first species
 clus2: mean empirical p for second species
 clus3: mean empirical p for third species
 clus_recomb: average recombination rate for cluster
ha412_chrom_lengths.txt
 chrom: chromosome
 length: length in base pairs
introgression_standing_var_5window.txt
 window5: 5kb window chromosome/position
 window: unique identifier
 Annuus: number of SNPs in Annuus
 Argophyllus: number of SNPs in Argophyllus
 petpet: number of SNPs in H. pet pet
 petfal: number of SNPs in H. pet fal
 Annuus_Argophyllus: number of SNPs in both Annuus and Argophyllus
 Annuus_petpet: number of SNPs in both Annuus and petpet
 Annuus_petfal: number of SNPs in both Annuus and petfal
 Argophyllus_petpet: number of SNPs in both Argophyllus and petpet
 Argophyllus_petfal: number of SNPs in both Argophyllus and petfal
 petpet_petfal: number of SNPs in both petpet and petfal
 chrom: chromosome
 start: start position
 end: end position
 c_Annuus_Argophyllus: cscore for the overlap between these species in SNP counts
 c_Annuus_petfal: cscore for the overlap between these species in SNP counts
 c_Annuus_petpet: cscore for the overlap between these species in SNP counts
 c_Argophyllus_petfal: cscore for the overlap between these species in SNP counts
 c_Argophyllus_petpet: cscore for the overlap between these species in SNP counts
 c_petfal_petpe: cscore for the overlap between these species in SNP counts
overlap_heatmaps_nocluster_6way_0.995_include_inversion_FIXED0.txt
NOTE: includes inversion regions, no clustering of WRAs into CRAs
 variable: environmental variable
 pairwise_code: code for comparison among species (1 = H.annuus-H.argophyllus, 2 = H.annuus-H.pet.fallax, 3 = H.annuus-H.pet.petiolaris, 4 = H.argophyllus-H.pet.fallax, 5 = H.argophyllus-H.pet.petiolaris, 6 = H.pet.fallax-H.pet.petiolaris)
 number_rows: baseline estimate of overlap (add subsequent values to this)
 cut2: minimum cutoff for top candidate index to consider a window as "adapted" (min = 2)
 cut3: minimum cutoff for top candidate index to consider a window as "adapted" (min = 3)
 cut4: minimum cutoff for top candidate index to consider a window as "adapted" (min = 4)
 cut5: minimum cutoff for top candidate index to consider a window as "adapted" (min = 5)
 cut6: minimum cutoff for top candidate index to consider a window as "adapted" (min = 6)
 cut7: minimum cutoff for top candidate index to consider a window as "adapted" (min = 7)
 cut8: minimum cutoff for top candidate index to consider a window as "adapted" (min = 8)
 cut9: minimum cutoff for top candidate index to consider a window as "adapted" (min = 9)
 cut10: minimum cutoff for top candidate index to consider a window as "adapted" (min = 10)
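The six pairwise codes follow the order in which pairs are generated from the four taxa, so a lookup table can be built rather than typed by hand. This is a minimal sketch; the variable names are illustrative and not taken from the archived scripts.

```python
from itertools import combinations

# Taxa in the order implied by the pairwise code definitions.
TAXA = ["H.annuus", "H.argophyllus", "H.pet.fallax", "H.pet.petiolaris"]

# combinations() yields pairs in lexicographic order of position,
# which reproduces codes 1 through 6 as documented.
PAIRWISE_CODES = {code: pair for code, pair in enumerate(combinations(TAXA, 2), start=1)}

for code, (taxon_a, taxon_b) in PAIRWISE_CODES.items():
    print(code, taxon_a, taxon_b)
```

The same table applies to every file in this README that uses a pairwise comparison code.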
overlap_heatmaps_nocluster_othercorrect_6way_0.995_include_inversion_FIXED0.txt
NOTE: includes inversion regions, run on WRAs without clustering, and includes population structure correction results on environment
 variable: environmental variable
 pairwise_code: code for comparison among species (1 = H.annuus-H.argophyllus, 2 = H.annuus-H.pet.fallax, 3 = H.annuus-H.pet.petiolaris, 4 = H.argophyllus-H.pet.fallax, 5 = H.argophyllus-H.pet.petiolaris, 6 = H.pet.fallax-H.pet.petiolaris)
 number_rows: baseline estimate of overlap (add subsequent values to this)
 cut2: minimum cutoff for top candidate index to consider a window as "adapted" (min = 2)
 cut3: minimum cutoff for top candidate index to consider a window as "adapted" (min = 3)
 cut4: minimum cutoff for top candidate index to consider a window as "adapted" (min = 4)
 cut5: minimum cutoff for top candidate index to consider a window as "adapted" (min = 5)
 cut6: minimum cutoff for top candidate index to consider a window as "adapted" (min = 6)
 cut7: minimum cutoff for top candidate index to consider a window as "adapted" (min = 7)
 cut8: minimum cutoff for top candidate index to consider a window as "adapted" (min = 8)
 cut9: minimum cutoff for top candidate index to consider a window as "adapted" (min = 9)
 cut10: minimum cutoff for top candidate index to consider a window as "adapted" (min = 10)
overlap_heatmaps_results_6way_0.995_include_inversion_FIXED0.txt
NOTE: includes inversion regions, collapses adjacent WRAs into clusters
 variable: environmental variable
 pairwise_code: code for comparison among species (1 = H.annuus-H.argophyllus, 2 = H.annuus-H.pet.fallax, 3 = H.annuus-H.pet.petiolaris, 4 = H.argophyllus-H.pet.fallax, 5 = H.argophyllus-H.pet.petiolaris, 6 = H.pet.fallax-H.pet.petiolaris)
 number_rows: baseline estimate of overlap (add subsequent values to this)
 cut2: minimum cutoff for top candidate index to consider a window as "adapted" (min = 2)
 cut3: minimum cutoff for top candidate index to consider a window as "adapted" (min = 3)
 cut4: minimum cutoff for top candidate index to consider a window as "adapted" (min = 4)
 cut5: minimum cutoff for top candidate index to consider a window as "adapted" (min = 5)
 cut6: minimum cutoff for top candidate index to consider a window as "adapted" (min = 6)
 cut7: minimum cutoff for top candidate index to consider a window as "adapted" (min = 7)
 cut8: minimum cutoff for top candidate index to consider a window as "adapted" (min = 8)
 cut9: minimum cutoff for top candidate index to consider a window as "adapted" (min = 9)
 cut10: minimum cutoff for top candidate index to consider a window as "adapted" (min = 10)
annuus_phenotypes.csv
argophyllus_phenotypes.csv
petioaris_fallax_phenotypes.csv
petiolaris_petiolaris_phenotypes.csv
NOTE: For all four of these files, the phenotypes are as shown below:
 Plant ID: plant unique ID
 Genotype ID: unique genotype ID
 TLN: Total leaf number
 LIR: Leaf initiation rate
 Days to budding: days to budding at anthesis
 DTF: Days to Flower
 Stem diamater at flowering: Primary stem diameter at anthesis
 Plant height at flowering: Total length of the main stem
 Internode length: Average length of an internode
 Stem diameter final before 1st node: Diameter of the stem base of fully developed plants
 Stem diameter final after 5th node: Diameter of the stem of fully developed plants
 Distance of first branching from ground: Distance between ground and first node
 Primary branches: Number of branches on the main stem
 SLA: Specific leaf area (mm^2)
 Leaf total N: Amount of nitrogen in leaf tissue
 Leaf total C: Amount of carbon in leaf tissue
 Leaf C N ratio: Ratio between carbon and nitrogen content in leaf tissue
 Disk diameter: Diameter of the inflorescence disk
 Ligule length: Length of individual ligules
 Ligule width: Maximum width of individual ligules
 Flower head diameter: Diameter of the inflorescence
 Ligule LW ratio: Ratio between ligule length and maximum width
 Flower FHDD ratio: Ratio between Inflorescence and disk diameters
 Ligules number: Average number of ligule per Inflorescence
 Stem colour: Presence and intensity of purple colour on main stem (0-4)
 Petiole main veins colour: Presence and intensity of purple colour on main leaf veins (0-4)
 Darker axillae: Presence of purple colour on leaf axillae (presence/absence)
 Leaf perimeter: Perimeter of an individual leaf
 Leaf area: Area of an individual leaf
 Leaf width midheight: Width measured at ½ of the leaf’s length
 Leaf maximum width: Maximum width of the leaf
 Leaf height midwidth: Length measured at ½ of the leaf’s width
 Leaf maximum height: Maximum length of the leaf
 Leaf curved height: Length measured along a curved line through the leaf (equidistant from both leaf borders)
 Leaf curvedHeight maxWidth: Ratio between curved length and maximum leaf width
 Trichomes length: Length of non-glandular trichomes on abaxial side of leaves
 Trichomes density leaf edge flat area: Density of trichomes outside secondary veins near the leaf edge
 Trichomes density leaf edge secondary veins: Density of trichomes on secondary veins near the leaf edge
 Trichomes density leaf center flat area: Density of trichomes outside secondary veins near the leaf main vein
 Trichomes density leaf center secondary veins: Density of trichomes on secondary veins near the leaf main vein
 Trichomes density edge average: Density of trichomes near the leaf edge
 Trichomes density center average: Density of trichomes near the leaf main vein
 Trichomes density nonvein average: Density of trichomes outside secondary veins
 Trichomes density vein average: Density of trichomes on secondary veins
 Leaf shape index external I: The ratio of the Maximum Height to Maximum Width
 Leaf shape index external II: The ratio of Height Midwidth to Width Midheight
 Leaf curved shape index: The ratio of Curved Height to the width of the leaf at mid-curved-height, as measured perpendicular to the curved height line
 Leaf ellipsoid: The ratio of the error resulting from a best-fit ellipse to the area of the leaf. Error is the average magnitude of residuals (Res) along the leaf’s perimeter, divided by the length of the major (longer) axis of the ellipse. Smaller values indicate that the leaf is more ellipsoid
 Leaf circular: The ratio of the error resulting from a best-fit circle to the area of the leaf. Error is the average magnitude of residuals (Res) along the leaf’s perimeter, divided by the radius of the circle. Smaller values indicate that the leaf is more circular
 Leaf rectangular: The ratio of the area of the rectangle bounding the leaf to the area of the rectangle bounded by the leaf
 Leaf obovoid: Obovoid is calculated from the maximum width (W), the height at which the maximum width occurs (y), the average width above that height (w1), and the average width below that height (w2), and a scaling function scale_ob as: Obovoid = 1/2 * scale_ob(y) * (1 – w1/W + w2/W). If Obovoid > 0, subtract 0.4. Otherwise, Obovoid is 0
 Leaf width widest pos: The ratio of the height at which the maximum width occurs to the Maximum Height
 Leaf eccentricity: The ratio of the height of the internal ellipse to the Maximum Height
 Leaf proximal eccentricity: The ratio of the height of the internal ellipse to the distance between the bottom of the ellipse and the top of the leaf
 Leaf Distal eccentricity: The ratio of the height of the internal ellipse to the distance between the top of the ellipse and the bottom of the leaf
 Leaf shape index internal: The ratio of the internal ellipse’s height to its width
 Leaf eccentricity area index: The ratio of the area of the leaf outside the ellipse to the total area of the leaf
 Total RGB: Sum of RGB values for leaf colour
 RGB proportion green: Proportion of green in average leaf colour
 RGB proportion red: Proportion of red in average leaf colour
 RGB proportion blue: Proportion of blue in average leaf colour
 Phyllaries diameter: Diameter of phyllaries whorl
 Phyllaries length: Length of individual phyllaries
 Phyllaries width: Maximum width of individual phyllaries
 Phyllaries LW ratio: Ratio between phyllaries length and maximum width
 Flowerhead to phyllaries diameter ratio: Ratio between flower head and phyllaries whorl diameters
 Disk to phyllaries diameter ratio: Ratio between flower disk and phyllaries whorl diameters
 Seed perimeter: Perimeter of an individual seed
 Seed area: Area of an individual seed
 Seed width mid height: Width measured at ½ of the seed’s length
 Seed maximum width: Maximum width of the seed
 Seed height mid width: Length measured at ½ of the seed’s width
 Seed maximum height: Maximum length of the seed
 Seed curved height: Length measured along a curved line through the seed (equidistant from both seed borders)
 Seed HW ratio: Ratio between curved length and maximum seed width
 Seed shape index external I: The ratio of the Maximum Height to Maximum Width
 Seed shape index external II: The ratio of Height Midwidth to Width Midheight
 Seed curved shape index: The ratio of Curved Height to the width of the seed at mid-curved-height, as measured perpendicular to the curved height line
 Seed ellipsoid: The ratio of the error resulting from a best-fit ellipse to the area of the seed. Error is the average magnitude of residuals (Res) along the seed’s perimeter, divided by the length of the major (longer) axis of the ellipse. Smaller values indicate that the seed is more ellipsoid
 Seed circular: The ratio of the error resulting from a best-fit circle to the area of the seed. Error is the average magnitude of residuals (Res) along the seed’s perimeter, divided by the radius of the circle. Smaller values indicate that the seed is more circular
 Seed rectangular: The ratio of the area of the rectangle bounding the seed to the area of the rectangle bounded by the seed
 Seed ovoid: Ovoid is calculated from the maximum width (W), the height at which the maximum width occurs (y), the average width above that height (w1), and the average width below that height (w2), and a scaling function scale_ov as: Ovoid = 1/2 * scale_ov(y) * (1 – w2/W + w1/W). If Ovoid > 0, subtract 0.4. Otherwise, Ovoid is 0
 Seed width widest pos: The ratio of the height at which the maximum width occurs to the Maximum Height
 Seed eccentricity: The ratio of the height of the internal ellipse to the Maximum Height
 Seed proximal eccentricity: The ratio of the height of the internal ellipse to the distance between the bottom of the ellipse and the top of the seed
 Seed distal eccentricity: The ratio of the height of the internal ellipse to the distance between the top of the ellipse and the bottom of the seed
 Seed shape index internal: The ratio of the internal ellipse’s height to its width
 Seed eccentricity area index: The ratio of the area of the seed outside the ellipse to the total area of the seed
recombination_bins.txt
 window: chromosome + position
 recomb_rate: estimated rate of recombination
 recomb_quantile: recombination bin (out of 5) with percentile range
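One way to assign windows to five recombination bins is by rank, so that each bin holds roughly the same number of windows. The sketch below illustrates that idea only; how the authors actually defined the bins is not documented here, and the rates used are made up.

```python
def quantile_bins(values, n_bins=5):
    """Assign each value to a rank-based bin from 1 (lowest) to n_bins (highest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # Integer arithmetic splits the ranks into n_bins near-equal groups.
        bins[idx] = rank * n_bins // len(values) + 1
    return bins

# Hypothetical recombination rates for ten windows.
rates = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
bins = quantile_bins(rates)
print(bins)
```

Ties are broken by input order in this sketch, which may differ from the behaviour of quantile functions in R.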
results_WRAs_overlap_inversions_500reps_per_species.txt
 V1: species 1
 V2: species 2
 V3: environmental variable
 V4: how many windows that are non-WRAs are overlapping an inversion
 V5: how many windows that are non-WRAs are not overlapping an inversion
 V6: how many windows that are WRAs are overlapping an inversion
 V7: how many windows that are WRAs are not overlapping an inversion
 V8: what proportion of permuted nulls had a greater chi-square statistic than observed?
 V9: proportion of non-WRAs that overlap an inversion out of all non-WRAs
 V10: proportion of WRAs that overlap an inversion out of all WRAs
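The relationship between the count columns (V4 to V7) and the derived columns (V8 to V10) can be sketched as below. The counts and null statistics are hypothetical, the chi-square shown is the plain Pearson statistic for a 2x2 table without continuity correction, and whether the authors used exactly this form is an assumption.

```python
def inversion_overlap_proportions(non_wra_in, non_wra_out, wra_in, wra_out):
    """Return (V9, V10): the share of non-WRAs and of WRAs that overlap an inversion."""
    prop_non_wra = non_wra_in / (non_wra_in + non_wra_out)
    prop_wra = wra_in / (wra_in + wra_out)
    return prop_non_wra, prop_wra

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def empirical_p(observed, null_stats):
    """Proportion of permuted null statistics greater than the observed one (as in V8)."""
    return sum(1 for s in null_stats if s > observed) / len(null_stats)

# Hypothetical counts in the order V4, V5, V6, V7.
v4, v5, v6, v7 = 900, 8100, 40, 60
v9, v10 = inversion_overlap_proportions(v4, v5, v6, v7)
observed = chi_square_2x2(v4, v5, v6, v7)
print(v9, v10, observed)
```

In the archived results the null statistics come from 500 permutations per species, as the file name indicates.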
spearman_nullw_result_recom_bin
 Taxa1: focal species in comparison
 Taxa2: alternate species in comparison
 Variable: environmental variable
 quantile: recombination quantile bin
 Window: window chromosome:position
 W: W statistic
 P: p value
 Mean_test: unknown
 N_samples: number of SNPs
 Mean_BG: unknown
 N_BG: number of SNPs in background window
 Count: unknown
 Pred_P: Z-score
 Emp_p: empirical p-value (does it fall in the top 5% of the tail?)
 window_name: window ID
sunflower_environments_clipped.csv
 Population ID: unique ID for each population
 Individuals: unique ID for each individual
 Taxon: species/subspecies
 Latitude: Latitude
 Longitude: Longitude
 Elevation: Elevation
 MAT: mean annual temperature (°C)
 MWMT: mean warmest month temperature (°C)
 MCMT: mean coldest month temperature (°C)
 TD: Continentality, temperature difference between MWMT and MCMT (°C)
 MAP: Mean annual precipitation (mm)
 MSP: May to September precipitation (mm)
 AHM: Annual heat-moisture index ((MAT+10)/(MAP/1000))
 SHM: Summer heat-moisture index ((MWMT)/(MSP/1000))
 DD_0: Degree-days below 0°C, chilling degree-days (°C)
 DD5: Degree-days above 5°C, growing degree-days
 DD_18: Degree-days below 18°C
 DD18: Degree-days above 18°C
 NFFD: Number of frost-free days
 bFFP: The day of the year on which FFP begins
 eFFP: The day of the year on which FFP ends
 FFP: Frost-free period (days)
 EMT: Extreme minimum temperature over 30 years (°C)
 EXT: Extreme maximum temperature over 30 years (°C)
 Eref: Hargreaves reference evaporation (mm)
 CMD: Hargreaves climatic moisture deficit (mm)
 RH: Relative humidity
 OM: Organic matter percentage
 P1: phosphorus, weak Bray (ppm)
 P2: phosphorus, strong Bray (ppm)
 BICARB: sodium bicarbonate (ppm)
 K: potassium (ppm)
 MG: magnesium (ppm)
 CA: calcium (ppm)
 NA: sodium (ppm)
 PH: soil pH
 CEC: cation exchange capacity (meq/100g)
 PERCENT_K: percent base saturation K (%)
 PERCENT_MG: percent base saturation Mg (%)
 PERCENT_CA: percent base saturation Ca (%)
 PERCENT_NA: percent base saturation Na (%)
 SOL_SALTS: soluble salts (mmhos/cm)
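Several of the climate variables above are defined by simple formulas on other columns (TD, AHM and SHM). A minimal sketch of those definitions follows; the input values are made up, and the function names are illustrative.

```python
def continentality(mwmt, mcmt):
    """TD: difference between mean warmest and mean coldest month temperature (°C)."""
    return mwmt - mcmt

def annual_heat_moisture(mat, map_mm):
    """AHM: (MAT + 10) / (MAP / 1000)."""
    return (mat + 10.0) / (map_mm / 1000.0)

def summer_heat_moisture(mwmt, msp_mm):
    """SHM: MWMT / (MSP / 1000)."""
    return mwmt / (msp_mm / 1000.0)

# Hypothetical site: MAT 10 °C, MWMT 25 °C, MCMT -5 °C, MAP 500 mm, MSP 250 mm.
print(continentality(25.0, -5.0))
print(annual_heat_moisture(10.0, 500.0))
print(summer_heat_moisture(25.0, 250.0))
```

Recomputing these values from MAT, MWMT, MCMT, MAP and MSP is a quick consistency check on the table.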
sunflower_picmin_allvars_top1000_recomb_fix2_long.txt
4-species picmin analysis
 V1: top candidate index for annuus
 V2: top candidate index for argophyllus
 V3: top candidate index for petpet
 V4: top candidate index for petfal
 V5: chromosome
 V6: start position
 V7: end position
 V8: Window ID
 V9: Environmental variable
 V10: picmin p-value
 V11: FDR-adjusted picmin p-value
sunflower_picmin_allvars_top1000_recomb_nopetfal_baypass_fix2_long.txt
3-species picmin analysis excluding petfal and using baypass (structure-corrected) instead of raw Spearman's correlation.
 V1: top candidate index for annuus
 V2: top candidate index for argophyllus
 V3: top candidate index for petpet
 V4: top candidate index for petfal
 V5: chromosome
 V6: start position
 V7: end position
 V8: Window ID
 V9: Environmental variable
 V10: picmin p-value
 V11: FDR-adjusted picmin p-value
sunflower_picmin_allvars_top1000_recomb_nopetfal_fix2_long.txt
3-species picmin analysis excluding petfal and using raw Spearman's correlation.
 V1: top candidate index for annuus
 V2: top candidate index for argophyllus
 V3: top candidate index for petpet
 V4: top candidate index for petfal
 V5: chromosome
 V6: start position
 V7: end position
 V8: Window ID
 V9: Environmental variable
 V10: picmin p-value
 V11: FDR-adjusted picmin p-value
sunflower_picmin_allvars_top1000_recomb_nopetpet_baypass_fix2_long.txt
3-species picmin analysis excluding petpet and using baypass (structure-corrected) instead of raw Spearman's correlation.
 V1: top candidate index for annuus
 V2: top candidate index for argophyllus
 V3: top candidate index for petpet
 V4: top candidate index for petfal
 V5: chromosome
 V6: start position
 V7: end position
 V8: Window ID
 V9: Environmental variable
 V10: picmin p-value
 V11: FDR-adjusted picmin p-value
sunflower_picmin_allvars_top1000_recomb_nopetpet_fix2_long.txt
3-species picmin analysis excluding petpet and using raw Spearman's correlation.
 V1: top candidate index for annuus
 V2: top candidate index for argophyllus
 V3: top candidate index for petpet
 V4: top candidate index for petfal
 V5: chromosome
 V6: start position
 V7: end position
 V8: Window ID
 V9: Environmental variable
 V10: picmin p-value
 V11: FDR-adjusted picmin p-value
top_candidate_spearman_H.argophyllus
 V1: environmental variable
 V2: window
 V3: number of SNPs
 V4: number of outliers
 V5: proportion
 V6: quantile expectation for binomial with 1e-4
 V7: quantile expectation for binomial with 1e-8
 V8: species
convergent_cluster_inversion_overlap_LD0.9_1cM_ranges_verbose.table
 data: type of data (climate, soil and phenotype)
 analysis: type of analysis (Spearman, baypass, GWAS)
 variable : variable type
 direction : direction of the pair
 comparison : type of comparison
 chromosome : chromosome
 cluster_start : start position of the convergent cluster
 cluster_end : end position of the convergent cluster
 cluster_size : size of the convergent cluster (bp)
 N_convergent : number of convergent clusters
 inversion_species : inversion detected in which species
 inversion_ID : inversion ID
 inversion_start : start position of inversion
 inversion_end: end position of inversion
 inversion_size : size of inversion (bp)
 overlap_start : start of overlap between inversion and convergent cluster
 overlap_end : end of overlap between inversion and convergent cluster
 overlap_size : size of overlap between inversion and convergent cluster
 is_cluster_overlapping : whether convergent cluster overlaps with any inversion
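
The relationship between the cluster, inversion, and overlap columns listed above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and the convention that the overlap size equals `overlap_end - overlap_start` is an assumption (the archive may count inclusive coordinates, which would add 1).

```python
def interval_overlap(cluster_start, cluster_end, inversion_start, inversion_end):
    """Return (overlap_start, overlap_end, overlap_size), or None if there is no overlap.

    Assumes start <= end for both intervals and measures size as end - start.
    """
    overlap_start = max(cluster_start, inversion_start)
    overlap_end = min(cluster_end, inversion_end)
    if overlap_start > overlap_end:
        return None  # the cluster lies entirely outside the inversion
    return overlap_start, overlap_end, overlap_end - overlap_start


# Made-up coordinates: a cluster partly inside an inversion, and one outside it.
partial = interval_overlap(100, 500, 300, 900)
outside = interval_overlap(100, 200, 300, 400)
print(partial, outside)
```

Under this reading, `is_cluster_overlapping` would correspond to whether the function returns a result for at least one inversion.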
convergent_clusters_overlaps_with_inversions_pervariable.Spearman.table
 N_cluster : number of clusters overlapping
 porportion_number_cluster_overlap_inversion : proportion of clusters (by number) overlapping with inversions
 porportion_number_cluster_overlap_inversion_P : P-value for the proportion of clusters (by number) overlapping with inversions
 porportion_length_cluster_overlap_inversion : proportion of cluster length overlapping with inversions
 porportion_length_cluster_overlap_inversion_P : P-value for the proportion of cluster length overlapping with inversions
 var : variable
 comparison : pair type
 type : type of data (climate, soil and phenotype)
 analysis: type of analysis (Spearman, baypass, GWAS)
convergent_inversion_overlap_merged_P_LD0.9_1cM.table
 data : type of data (climate, soil and phenotype)
 analysis : type of analysis (Spearman, baypass, GWAS)
 variable: variable type
 direction : direction of the pair
 N_cluster : number of convergent clusters
 total_cluster_length : total length of the convergent clusters
 total_overlap_length : total length of the convergent clusters overlapping with inversions
 N_cluster_overlap : number of convergent clusters overlapping with inversions
 porportion_number_cluster_overlap_inversion : proportion of clusters (by number) overlapping with inversions
 porportion_number_cluster_overlap_inversion_NULL_mean : mean of the null distribution for the proportion of clusters (by number) overlapping with inversions
 porportion_number_cluster_overlap_inversion_P : P-value for the proportion of clusters (by number) overlapping with inversions
 porportion_length_cluster_overlap_inversion : proportion of cluster length overlapping with inversions
 porportion_length_cluster_overlap_inversion_NULL_mean : mean of the null distribution for the proportion of cluster length overlapping with inversions
 porportion_length_cluster_overlap_inversion_P: P-value for the proportion of cluster length overlapping with inversions
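
The summary columns above can be related to each other with a short Python sketch. It assumes that the two proportions are simple ratios of the count and length columns, and it shows one common way to turn a null distribution into a one-sided empirical P-value. Both function names are hypothetical, and the archive's scripts may compute these values differently.

```python
def overlap_proportions(n_cluster, n_cluster_overlap,
                        total_cluster_length, total_overlap_length):
    """Return (proportion by number, proportion by length) of clusters overlapping inversions."""
    prop_number = n_cluster_overlap / n_cluster
    prop_length = total_overlap_length / total_cluster_length
    return prop_number, prop_length


def empirical_p(observed, null_values):
    """One-sided empirical P-value: share of null draws >= observed, with a +1 correction."""
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Made-up numbers for illustration only.
props = overlap_proportions(10, 4, 1000, 250)
p_number = empirical_p(props[0], [0.1, 0.2, 0.5, 0.3])
print(props, p_number)
```

The `_NULL_mean` columns would then correspond to the mean of the values passed as `null_values`, under the same assumption.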
out_res_GOEnrichment_TopGo_arabidopsis_homologs_convergent_windowos_ElimFisher_CC.csv
 GO.ID : GO ID
 Term : GO term
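 Annotated : number of genes annotated to the GO term
 Significant: number of significant genes annotated to the GO term
 Expected: expected number of significant genes
 elimFisher: P-value from Fisher's exact test with the elim algorithm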
 p.adj: adjusted P-value (Benjamini-Hochberg, BH)
 comparison: comparison type
 analysis: analysis type
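
For readers unfamiliar with the BH adjustment behind the `p.adj` column, the Python sketch below implements the standard Benjamini-Hochberg procedure. It is a generic illustration with made-up input values, not the code used to produce the file, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up P-values for illustration only.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```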
out_res_Inversion_Convergent_CLUSTERS_overlapping_Summed_Size_number_per_chromosome_Climate_Spearman_union_percomparison.table
 cluster_start : start position of the convergent cluster
 cluster_end : end position of the cluster
 total_cluster_length : length of the cluster (bp)
 total_N.cluster : total number of convergent clusters
 total_overlap_size : size of overlap (bp)
 total_N.overlapped : number of overlapping convergent clusters
 direction : direction of the analysis
 chromosome: chromosome
out_res_Inversion_Convergent_CLUSTERS_overlapping_Summed_Size_number_per_chromosome_Phenotype_Spearman_union_percomparison.table
 cluster_start : start position of the convergent cluster
 cluster_end : end position of the cluster
 total_cluster_length : length of the cluster (bp)
 total_N.cluster : total number of convergent clusters
 total_overlap_size : size of overlap (bp)
 total_N.overlapped : number of overlapping convergent clusters
 direction : direction of the analysis
 chromosome: chromosome
CONV_WIN_GENE_overlap_Ath_HanXRQr12_Ha412HOv2_NA.table
 data: data category (env: environments, gwa: phenotypes)
 type: type of data (environment, soil, phenotype)
 analysis: analysis (spearman, GWAS corrected, GWAS uncorrected)
 Taxa1: first taxon of the pair
 Taxa2: second taxon of the pair
 comparison: type of comparison
 variable: variable type
 window_name: name of 5k window
 window5 : window ID
 Chr: chromosome
 window_start: start position of the window
 window_end: end position of the window
 Ha412HOv2.0_annot2.1_start: start position on Ha412HOv2.0_annot2.1 assembly
 Ha412HOv2.0_annot2.1_end: end position on Ha412HOv2.0_annot2.1 assembly
 Ha412HOv2.0_annot2.1: Ha412HOv2.0_annot2.1 assembly
 arabidopsis: arabidopsis assembly
 HanXRQr1.0: ortholog HanXRQr1.0
 HanXRQv2: ortholog HanXRQv2