Data from: Macroevolution of floral scent chemistry across radiations of male euglossine bee-pollinated plants

Liu, Jasen; Milet-Pinheiro, Paulo; Gerlach, Günter; Ayasse, Manfred; Nunes, Carlos; Alves-dos-Santos, Isabel; Ramirez, Santiago

Published Oct 24, 2023 on Dryad. https://doi.org/10.25338/B85938

Data files

Oct 24, 2023 version files (943.87 KB total):
- README.md (12.21 KB)
- Scripts_and_data.zip (931.66 KB)

Abstract

Floral volatiles play key roles as signaling agents that mediate interactions between plants and animals. Despite their importance, few studies have investigated broad patterns of volatile variation across groups of plants that share pollinators, particularly in a phylogenetic context. The “perfume flowers”, Neotropical plant species exhibiting exclusive pollination by male euglossine bees in search of chemical rewards, present an intriguing system to investigate these patterns due to the unique function of their chemical phenotypes as both signaling agents and rewards. We leverage recently-developed phylogenies and knowledge of biosynthesis along with decades of chemical ecology research to characterize axes of variation in the chemistry of perfume flowers, as well as understand their evolution at finer taxonomic scales. We detect pervasive chemical convergence, with many species across families exhibiting similar volatile phenotypes. Scent profiles of most species are dominated by compounds of either the phenylpropanoid or terpenoid biosynthesis pathways, while terpenoid compounds drive more subtle axes of variation. We find recapitulation of these patterns within two independent radiations of perfume flower orchids, in which we further detect evidence for rapid evolution of divergent floral chemistries, consistent with the putative importance of scent in the process of adaptation and speciation.

Perfume flower dataset

Scent data from male euglossine bee-pollinated plants, together with pollinator information. The zipped folder contains 5 folders. Apart from the "Raw_data" folder, each folder contains all the data and source code required to run its primary R script.

Analyses were run in R version 4.2.1 with the following packages:
dplyr v. 1.1.2
tidyr v. 1.3.0
ggplot2 v. 3.4.2
compositions v. 2.0.6
ggpubr v. 0.6.0
pez v. 1.2.4
dendextend v. 1.17.1
ecodist v. 2.0.9
ape v. 5.7.1
geomorph v. 4.0.5
phytools v. 1.5.1
geiger v. 2.0.11
phylogram v. 2.1.0
picante v. 1.8.2
chemodiv v. 0.2.0
webchem v. 1.2.0
ChemmineR v. 3.48.0
fmcsR v. 1.38.0
fBasics v. 4022.94
ade4 v. 1.7.22

Cells containing NA mark cases where a value could not be calculated because of an inherent property of the data (e.g., when only 1 compound is present, diversity metrics cannot be calculated).

These NA values are removed from analyses when diversity metrics are used as predictors.
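
The NA handling described above can be sketched outside R as well. The following Python snippet is only an illustration under stated assumptions: the `species` column name and the sample values are invented, and `rows_with_metric` is a hypothetical helper, not part of the deposited scripts. It keeps only the rows where a chosen diversity metric has a usable value.

```python
import csv
import io

# Hypothetical excerpt: species names and values are invented for illustration.
SAMPLE = """species,richness,fmcs_funchill
Species_A,12,3.41
Species_B,1,NA
Species_C,7,2.08
"""


def rows_with_metric(handle, metric):
    """Return rows whose `metric` cell is a usable number, skipping NA cells."""
    kept_rows = []
    for row in csv.DictReader(handle):
        value = row[metric].strip()
        if value == "" or value.upper() == "NA":
            continue  # the metric could not be computed for this species
        row[metric] = float(value)
        kept_rows.append(row)
    return kept_rows


kept = rows_with_metric(io.StringIO(SAMPLE), "fmcs_funchill")
print([row["species"] for row in kept])
```

Filtering per metric, rather than dropping every row that contains any NA, keeps species available for analyses that do not use the affected metric.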

Description of scripts and datasets within each folder

Raw_data: raw data used for the analyses prior to filtering

Plant_pollinator.csv - csv of plant species with pollinator info and scent data; each row corresponds to one plant species - pollinator combination (so a plant with multiple pollinators occupies as many rows as it has pollinators).
- "plantspecies" corresponds to plant species in a plant-pollinator combination
- "plantfamily" corresponds to the family the plant is in
- "plantgenus" corresponds to the genus the plant is in
- "beespecies" corresponds to a bee pollinator of the plant
- "beegenus" corresponds to the bee genus ("Eg" = Euglossa, "Ag" = Aglae, "Ex" = Exaerete, "El" = Eulaema, "Ef" = Eufriesea)
- all other columns correspond to compounds present in the plant's perfume, with values ranging from 0 to 100 corresponding to the percentage of the compound in the total blend
plant_VOC_averages_all.csv - csv of all plant species with scent data; each row corresponds to a species while columns correspond to compounds; values range from 0 to 100, corresponding to the percentage of the compounds in the blend
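
Because Plant_pollinator.csv stores one row per plant-pollinator combination, a plant's scent values repeat across its rows. The sketch below shows one way such a layout could be collapsed to a set of pollinators per plant. The sample rows, the compound column names, and the `pollinators_by_plant` helper are all hypothetical. Only the five identifier columns and the genus codes come from the description above.

```python
import csv
import io
from collections import defaultdict

# Invented rows that mimic the documented layout; compound columns are placeholders.
SAMPLE = """plantspecies,plantfamily,plantgenus,beespecies,beegenus,compound_x,compound_y
Plant_one,Orchidaceae,Plant,Bee_a,Eg,80,20
Plant_one,Orchidaceae,Plant,Bee_b,El,80,20
Plant_two,Araceae,Planta,Bee_a,Eg,5,95
"""

# Genus codes as listed in the column description.
BEE_GENERA = {
    "Eg": "Euglossa",
    "Ag": "Aglae",
    "Ex": "Exaerete",
    "El": "Eulaema",
    "Ef": "Eufriesea",
}


def pollinators_by_plant(handle):
    """Map each plant species to the set of (bee species, bee genus) pairs."""
    plants = defaultdict(set)
    for row in csv.DictReader(handle):
        genus = BEE_GENERA.get(row["beegenus"], row["beegenus"])
        plants[row["plantspecies"]].add((row["beespecies"], genus))
    return plants


result = pollinators_by_plant(io.StringIO(SAMPLE))
for plant, bees in sorted(result.items()):
    print(plant, len(bees), sorted(bees))
```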

Chemodiv_analyses: data and script to generate the chemodiv distance matrices used for downstream analyses

Files in this folder include:

chemodivscript.R - Primary R script to generate chemical distance matrices used in downstream analyses in other subfolders.
cleaneddataset_with_features_FINAL.csv - csv of plant species with their chemical traits after removing species with less than 70% of their total scent profile resolved and standardizing values so that they correspond to proportions, plus some additional features for exploratory plotting
- "Family" corresponds to the family the plant is in
- "SuborFam" corresponds to the subfamily the plant is in, if it is an orchid, or the family if not
- "richness" corresponds to the total number of compounds
- "aroprop" corresponds to the proportion of the perfume comprised of aromatic compounds
- "monoprop" corresponds to the proportion of the perfume comprised of monoterpenoid compounds
- "terpprop" corresponds to the proportion of the perfume comprised of all terpenoid compounds
- "sesprop" corresponds to the proportion of the perfume comprised of sesquiterpenoid compounds
- "faprop" corresponds to the proportion of the perfume comprised of fatty acid derivative compounds
- "carprop" corresponds to the proportion of the perfume comprised of carotenoid compounds
- "broadclass" corresponds to the broad chemical class of the plant ("T" = terpenoid-dominated, "AT" = mix of aromatic and terpenoid, "A" = aromatic-dominated, "F" = fatty acid derivative-dominated)
- "most_abundant" corresponds to the most abundant compound in the blend
- "ses_jac" corresponds to the standardized effect size calculated using Jaccard distances
- "ses_bray" corresponds to the standardized effect size calculated using Bray-Curtis distances
R Compound Properties - csv of compounds present in the dataset with their InChIKey and SMILES codes for chemodiv
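
The filtering and standardization applied to cleaneddataset_with_features_FINAL.csv can be illustrated with a minimal sketch. Two assumptions are made here that the README does not confirm: "resolved" is taken to mean the sum of the identified compound percentages in a row, and "standardizing" is taken to mean dividing each value by that row total so the proportions sum to 1. The function name and the sample profiles are hypothetical.

```python
def filter_and_standardize(profiles, min_resolved=70.0):
    """Drop poorly resolved species, then rescale each blend to proportions.

    `profiles` maps species -> {compound: percentage (0 to 100)}.
    Assumption: the resolved share of a profile is the sum of its percentages.
    """
    cleaned = {}
    for species, blend in profiles.items():
        total = sum(blend.values())
        if total < min_resolved:
            continue  # less than 70% of the scent profile resolved
        # Assumption: proportions are obtained by dividing by the row total.
        cleaned[species] = {compound: value / total for compound, value in blend.items()}
    return cleaned


# Invented example profiles.
profiles = {
    "Species_A": {"compound_x": 60.0, "compound_y": 30.0},  # 90% resolved, kept
    "Species_B": {"compound_x": 40.0, "compound_y": 10.0},  # 50% resolved, dropped
}
cleaned = filter_and_standardize(profiles)
print(cleaned)
```

If the deposited script instead divides by 100, the kept proportions would sum to the resolved fraction rather than to 1, so chemodivscript.R remains the reference for the exact behaviour.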

Analyses_with_full_dataset: data and scripts to perform analyses across all species

Files in this folder include:

Broad_patterns_Final.R - Primary R script used to generate biosynthetic distances using the method of Junker 2018, Chemoecology, generate ordinations, and perform correlative analyses of axes of variation with specific chemical traits.
SI1_sourceFunctions_BioSynDist_copy.R - Source code from Junker 2018, Chemoecology, to generate biosynthetic distances among species
Coded.csv - csv containing taxonomic information
- "Family" corresponds to the family the plant is in
- "SuborFam" corresponds to the subfamily the plant is in, if it is an orchid, or the family if not
compoundxproperty_sorted_filtered - csv where rows correspond to compounds present in dataset and columns correspond to biosynthetic pathways and functional group information. Entries are coded 1 or 0 based on presence / absence
master_VOC_averages_no_poll_cleaned.csv - csv where rows correspond to species present in dataset while columns correspond to different compounds present after data filtering, and values correspond to percentage (0 to 100)
diversity_metrics_chemodiv.csv - csv of chemical diversity metrics generated from the chemodivscript
- finger_funchill corresponds to functional Hill diversity of compounds calculated using the "fingerprint" scheme
- finger_hill corresponds to Hill diversity of compounds calculated using the "fingerprint" scheme
- fmcs_funchill corresponds to functional Hill diversity of compounds calculated using the "fMCS" scheme
- fmcs_hill corresponds to Hill diversity of compounds calculated using the "fMCS" scheme
fingerprintdisdata_full.csv - csv of distance matrix from "fingerprints" scheme in chemodiv
fmcsdisdata_full.csv - csv of distance matrix from "fMCS" scheme in chemodiv
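
For readers unfamiliar with the Hill diversity columns, the general definition of a Hill number can be sketched as follows. This is the textbook formula, not a reproduction of the chemodiv implementation. The diversity order `q = 1` used by default below is an assumption, since the README does not state which order was used, and functional Hill diversity (which also needs compound dissimilarities) is not covered.

```python
import math


def hill_diversity(proportions, q=1.0):
    """Hill number of order q for a vector of relative abundances."""
    present = [x for x in proportions if x > 0]
    if not present:
        raise ValueError("at least one compound must have a positive abundance")
    total = sum(present)
    p = [x / total for x in present]  # renormalize so the values sum to 1
    if abs(q - 1.0) < 1e-12:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


even = hill_diversity([0.25, 0.25, 0.25, 0.25])
skewed = hill_diversity([0.97, 0.01, 0.01, 0.01])
print(even, skewed)
```

A perfectly even blend of four compounds yields a value of 4 (up to floating-point error), while a blend dominated by one compound yields a value much closer to 1.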

Pollinator_analyses: data and scripts for performing analyses involving pollinators

Files in this folder include:

pollinator_analyses_FINAL.R - Primary R script to perform correlative tests between chemical traits and pollinators using chemical distance matrices generated earlier.
dataset_with_everything.csv - csv containing filtered chemical traits, PCO scores, and chemodiv metrics generated from "Analyses_with_full_dataset"
master_VOC_regions_20220224_edited - csv where rows correspond to plant species and columns correspond to pollinator species.
- "plantspecies" corresponds to plant species in a plant-pollinator combination
- "plantfamily" corresponds to the family the plant is in
- "plantgenus" corresponds to the genus the plant is in
- "beespecies" corresponds to a bee pollinator of the plant
- "beegenus" corresponds to the bee genus ("Eg" = Euglossa, "Ag" = Aglae, "Ex" = Exaerete, "El" = Eulaema, "Ef" = Eufriesea)
- all other columns correspond to compounds present in the plant's perfume, with values ranging from 0 to 100 corresponding to the percentage of the compound in the total blend
complete_AE_Plant.tre - most general phylogeny of plants within the dataset
euglossine_tree.tree - euglossine bee phylogeny (Ramirez et al. 2011).
fingerprintdisdata_full.csv - csv of distance matrix from "fingerprints" scheme in chemodiv
fmcsdisdata_full.csv - csv of distance matrix from "fMCS" scheme in chemodiv
myscheme_full.csv - csv of distance matrix from "simple" scheme using Junker 2018 method produced in "Analyses_with_full_dataset"
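
Before reusing any of the distance matrix files, it can help to confirm that a matrix is square, symmetric, and has a zero diagonal. The sketch below assumes a layout that the README does not specify: labels in the header row and again in the first column of each data row. The sample matrix and both helper functions are hypothetical.

```python
import csv
import io

# Invented 3 x 3 matrix; the real files may be laid out differently.
SAMPLE = """,Species_A,Species_B,Species_C
Species_A,0,0.4,0.9
Species_B,0.4,0,0.7
Species_C,0.9,0.7,0
"""


def read_distance_matrix(handle):
    """Read a labelled square matrix into nested dicts: matrix[row][column]."""
    reader = csv.reader(handle)
    labels = next(reader)[1:]  # skip the empty corner cell
    matrix = {}
    for row in reader:
        if not row:
            continue
        matrix[row[0]] = dict(zip(labels, (float(v) for v in row[1:])))
    return labels, matrix


def is_valid_distance_matrix(labels, matrix, tol=1e-9):
    """Check that the matrix is square, symmetric, and zero on the diagonal."""
    if set(labels) != set(matrix):
        return False
    for a in labels:
        if abs(matrix[a][a]) > tol:
            return False
        for b in labels:
            if abs(matrix[a][b] - matrix[b][a]) > tol:
                return False
    return True


labels, matrix = read_distance_matrix(io.StringIO(SAMPLE))
print(is_valid_distance_matrix(labels, matrix))
```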

Phylogenetic_analyses: data and scripts for performing phylogenetic comparative methods

Files in this folder include:

Phylogenetic_Comparative_FINAL.R - Primary R script for calculating phylogenetic signal and disparity through time in two separate clades.
Adams_function.r - R source code to calculate phylogenetic signal
complete_AE_Plant.tre - most general phylogeny of plants within the dataset
euglossine_tree.tree - euglossine bee phylogeny (Ramirez et al. 2011).
cattreeforanalyses.tre - complete_AE_Plant.tre pruned to just the Catasetinae
stantreeforanalyses.tre - complete_AE_Plant.tre pruned to just the Stanhopeinae
bigtreechemodataset_new.csv - csv containing scent data for all species in the dataset with phylogenetic information. Rows correspond to species while the first 167 columns correspond to compounds, with values representing the relative proportion of that compound in the species' blend. Other columns:
- "Family" corresponds to the family the plant is in
- "SuborFam" corresponds to the subfamily the plant is in, if it is an orchid, or the family if not
- "richness" corresponds to the total number of compounds
- "aroprop" corresponds to the proportion of the perfume comprised of aromatic compounds
- "monoprop" corresponds to the proportion of the perfume comprised of monoterpenoid compounds
- "terpprop" corresponds to the proportion of the perfume comprised of all terpenoid compounds
- "sesprop" corresponds to the proportion of the perfume comprised of sesquiterpenoid compounds
- "faprop" corresponds to the proportion of the perfume comprised of fatty acid derivative compounds
- "carprop" corresponds to the proportion of the perfume comprised of carotenoid compounds
- "linearmono" corresponds to the proportion of the perfume comprised of linear monoterpenoid compounds
- "ringedmono" corresponds to the proportion of the perfume comprised of ringed monoterpenoid compounds
- "cineolecasette" corresponds to the proportion of the perfume comprised of cineole cassette compounds
- "carvones" corresponds to the proportion of the perfume comprised of carvone compounds
- "broadclass" corresponds to the broad chemical class of the plant ("T" = terpenoid-dominated, "AT" = mix of aromatic and terpenoid, "A" = aromatic-dominated, "F" = fatty acid derivative-dominated)
- "most_abundant" corresponds to the most abundant compound in the blend
- "ses_jac" corresponds to the standardized effect size calculated using Jaccard distances
- "ses_bray" corresponds to the standardized effect size calculated using Bray-Curtis distances
- finger_funchill corresponds to functional Hill diversity of compounds calculated using the "fingerprint" scheme
- finger_hill corresponds to Hill diversity of compounds calculated using the "fingerprint" scheme
- fmcs_funchill corresponds to functional Hill diversity of compounds calculated using the "fMCS" scheme
- fmcs_hill corresponds to Hill diversity of compounds calculated using the "fMCS" scheme
- pcoa1_full corresponds to values of PCo 1 for the species calculated using the "simple" scheme
- pcoa2_full corresponds to values of PCo 2 for the species calculated using the "simple" scheme
- pcoa3_full corresponds to values of PCo 3 for the species calculated using the "simple" scheme
- pcoa4_full corresponds to values of PCo 4 for the species calculated using the "simple" scheme
- pcoa1_finger corresponds to values of PCo 1 for the species calculated using the "fingerprint" scheme
- pcoa2_finger corresponds to values of PCo 2 for the species calculated using the "fingerprint" scheme
- pcoa3_finger corresponds to values of PCo 3 for the species calculated using the "fingerprint" scheme
- pcoa4_finger corresponds to values of PCo 4 for the species calculated using the "fingerprint" scheme
- pcoa1_fmcs corresponds to values of PCo 1 for the species calculated using the "fMCS" scheme
- pcoa2_fmcs corresponds to values of PCo 2 for the species calculated using the "fMCS" scheme
- pcoa3_fmcs corresponds to values of PCo 3 for the species calculated using the "fMCS" scheme
- pcoa4_fmcs corresponds to values of PCo 4 for the species calculated using the "fMCS" scheme
Catchemodataset_new.csv - csv containing filtered scent data for the Catasetinae. See bigtreechemodataset description for column meanings
Stanchemodataset_new.csv - csv containing filtered scent data for the Stanhopeinae. See bigtreechemodataset description for column meanings
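
The "ses_jac" and "ses_bray" columns rest on two standard dissimilarity measures and on a standardized effect size. The sketch below gives the textbook definitions only. The README does not state which observed statistic or null model was used, so the `standardized_effect_size` inputs are placeholders, and all example values are invented.

```python
import statistics


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    denominator = sum(a + b for a, b in zip(x, y))
    if denominator == 0:
        raise ValueError("both blends are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator


def jaccard(x, y):
    """Jaccard dissimilarity based on compound presence / absence."""
    present_x = {i for i, a in enumerate(x) if a > 0}
    present_y = {i for i, b in enumerate(y) if b > 0}
    union = present_x | present_y
    if not union:
        raise ValueError("both blends are empty")
    return 1.0 - len(present_x & present_y) / len(union)


def standardized_effect_size(observed, null_values):
    """(observed - mean of null) / sample standard deviation of null."""
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)


# Two invented blends that share one of three compounds.
blend_a = [0.5, 0.5, 0.0]
blend_b = [0.5, 0.0, 0.5]
bc = bray_curtis(blend_a, blend_b)
jac = jaccard(blend_a, blend_b)
ses = standardized_effect_size(0.5, [0.6, 0.7, 0.8])
print(bc, jac, ses)
```

Bray-Curtis uses the abundances themselves, whereas Jaccard only asks whether a compound is present, so the two can rank the same pair of species differently.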
