Data from: Urbanization correlates with genetic and plastic variation of Impatiens capensis flower morphology
Data files
May 27, 2026 version files 47.99 GB
Abstract
The spectacular diversity of flowers is largely driven by pollinator-mediated selection that favors attractive flowers and effective pollen transfer. Urbanization has the potential to affect floral trait evolution by altering pollinator communities through environmental changes. Additionally, abiotic changes in urban habitats can induce phenotypic plasticity, further shaping evolutionary trajectories. We investigated how urbanization affects the genetic and plastic components of flower morphology of Impatiens capensis (Balsaminaceae) across four Canadian cities. We found that urbanization influenced the pollinator community composition and the body size of bumblebees, the species’ main pollinator, although the magnitude of the size effect varied among cities. Using a combination of field surveys and a common garden experiment, our results suggest that urbanization affects sepal size—a tubular floral organ in which pollinators enter to access the nectar—through both genetic and plastic responses. While plasticity sometimes masked the genetic determination of sepal size in the field, we observed a positive correlation between the genetic component of the sepal size and bumblebee body size. These results suggest that urban habitats may drive evolutionary changes in floral traits by modifying pollinator communities.
Dataset DOI: 10.5061/dryad.931zcrk17
Description of the data and file structure
We photographed flowers of Impatiens capensis for 15 populations in 2021 and 16 populations in 2022, collected pollinators, and recorded pollinator visitation rates. We also collected seeds in 2021 from 30–100 plants per population to conduct a common‑garden experiment in 2022. In this common garden, we photographed flowers of Impatiens capensis for 15 populations.
We also characterized the populations using bioclimatic variables and several urbanization intensity metrics, including the Normalized Difference Vegetation Index (NDVI), Global Man‑made Impervious Surfaces (GMIS), and land‑use/land‑cover categories.
Files and variables
1. Pictures folder
Urbanization_correlates_with_flower_morphology_Pictures.zip contains all pictures used in the morphometric analyses for the 2021, 2022, and common‑garden sampling events. Pictures are sorted by:
- Experiment (2021, 2022, or common garden)
- Region (Montreal, Ottawa, Quebec, or Toronto)
- Population
- Front or profile of the flower
2. Data folder
The Data folder (Urbanization_correlates_with_flower_morphology_Data.zip) contains two folders.
a. CSV
In all CSV files, the decimal separator is "." and the field delimiter is a semicolon (";").
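The CSV convention above (semicolon-delimited fields, "." decimals) can be read with a standard CSV parser, as long as the delimiter is set explicitly. The following is a minimal Python sketch using only the standard library; it is not part of the dataset's R scripts. The sample rows, including the population codes AAA and BBB, are hypothetical and serve only to show the parsing.

```python
import csv
import io


def read_semicolon_csv(handle):
    """Return the rows of a semicolon-delimited CSV file object as dicts."""
    return list(csv.DictReader(handle, delimiter=";"))


# Hypothetical two-row sample for illustration only (not real data).
sample = io.StringIO(
    "Population_code;Habitat;NDVI\n"
    "AAA;Urban;0.41\n"
    "BBB;Natural;0.78\n"
)

rows = read_semicolon_csv(sample)

# Values use "." as the decimal separator, so float() parses them directly.
ndvi = [float(row["NDVI"]) for row in rows]
```

For a real file, the same function can be given a handle from `open(path, newline="")`. The file encoding is not stated in this README, so it may need to be set explicitly if accented column names such as Élévation are misread.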
Biosim_12y_monthly_model.csv
Each row reports, for a population, geographic information (Latitude and Longitude in decimal degrees; Élévation, the altitude in meters above sea level), the date (year and month for which climatic data were interpolated), and climatic data.
For each population, the following monthly climatic data are interpolated:
- LowestTmin: lowest temperature (in degrees Celsius)
- MeanTmin: average minimum temperature (in degrees Celsius)
- MeanTair: average temperature (in degrees Celsius)
- MeanTmax: average maximum temperature (in degrees Celsius)
- HighestTmax: highest temperature (in degrees Celsius)
- Total Prcp: total precipitation (in millimeters)
- MeanTdew: average dew point temperature (in degrees Celsius)
- MeanRelH: mean relative humidity (in percent)
- TotalRadiation: solar radiation (in W/m²)
- FrostDay: number of days when the temperature was below 0 degrees Celsius
- FrostFreeDay: number of days when the temperature was above 0 degrees Celsius
- WetDay: number of days with precipitation
- DryDay: number of days without precipitation
Front_Flower_Info.csv
Contains population information for each flower photographed.
For each corolla picture:
- Individual: individual number
- Photo: picture name
- Population: population name
- Habitat: type of habitat (urban or natural)
- Sex: sex of the flower, either Male (M) or Female (F)
- Region: region of study (Toronto, Ottawa, Montreal, or Quebec)
- NDVI: Normalized Difference Vegetation Index, calculated within a 500-meter radius around the center of the population
- Type: whether the picture is the original (Original) or a duplicate (Copy) made to assess technical error
- Environment: whether the individual was collected in the field or in the common garden experiment
- Population_code: three-letter population acronym
- Year: year of sampling
Profile_Flower_Info.csv
Contains population information for each flower photographed.
For each posterior sepal picture:
- Individual: individual number
- Photo: picture name
- Population: population name
- Habitat: type of habitat (urban or natural)
- Sex: sex of the flower, either Male (M) or Female (F)
- Region: region of study (Toronto, Ottawa, Montreal, or Quebec)
- NDVI: Normalized Difference Vegetation Index, calculated within a 500-meter radius around the center of the population
- Type: whether the picture is the original (Original) or a duplicate (Copy) made to assess technical error
- Environment: whether the individual was collected in the field or in the common garden experiment
- Population_code: three-letter population acronym
- Year: year of sampling
Pollinator_abundance_2021.csv
Matrix of pollinator abundances, with species as columns, for each population (rows are three-letter acronyms) sampled in 2021.
Pollinator_abundance_2022.csv
Matrix of pollinator abundances, with species as columns, for each population (rows are three-letter acronyms) sampled in 2022.
Pollinator_abundance_2021+2022.csv
Matrix of pollinator abundances, with species as columns, for each population (rows are three-letter acronyms) from the 2021 and 2022 sampling efforts.
Pollinator_observation_2021_2022.csv
A specific README is provided in the CSV folder for this file.
Pollinator_size_2021_2022.csv
Contains all inter‑tegular distances (ITD, in millimeters) measurements for collected pollinators and includes their taxonomic information.
- Tag: tag associated with the pollinator specimen in the Robert Ouellet entomological collection
- Year: year of sampling
- Month: month of sampling
- Day: day of sampling
- Province: province of sampling
- City: city of sampling
- Region: region of sampling (Toronto, Ottawa, Montreal, or Quebec)
- Population_code: three-letter acronym for the population
- Habitat: habitat type, either Urban or Natural
- Latitude and Longitude: geographic coordinates of each population, in decimal degrees
- Family: family of the specimen
- Genus: genus of the specimen
- Species: species of the specimen
- ITD: inter-tegular distance of the specimen (in millimeters)
- Population: population where the specimen was collected
Pollinator_Visitation_Rates_Pop.csv
Contains pollinator visitation rates for all populations sampled in 2021 and 2022.
Population is the name of the population, Population_code is a three-letter acronym for the population, Year is the year for which visitation rates were calculated, Habitat is the habitat type (either Urban or Natural), Region is the region studied (either Toronto, Ottawa, Montreal or Quebec), and Visitation_rates is the mean visitation rate in each population, expressed as the number of pollinators per flower per hour.
Sampling_2021_dates.csv
Contains the sampling dates for each population in 2021.
Region is the region studied (either Toronto, Ottawa, Montreal or Quebec), Visit indicates whether it was the first or second annual visit to the population (1 or 2), Date is the date of the visit, Population AM is the population sampled in the morning, Population PM is the population sampled in the afternoon, and Details contains other information about the sampling.
Sampling_2022_dates.csv
Contains the sampling dates for each population in 2022.
Region is the region studied (either Toronto, Ottawa, Montreal or Quebec), Visit indicates whether it was the first or second annual visit to the population (1 or 2), Date is the date of the visit, Population AM is the population sampled in the morning, Population PM is the population sampled in the afternoon, and Details contains other information about the sampling.
Sites_characteristics_500m.csv
Indicates, for each population (identified with their three-letter acronym in the rows): the percentage of land‑use/land‑cover categories (either Trees, Crops, Built Areas, Bare Ground or Rangeland), the mean NDVI (Normalized Difference Vegetation Index) value and the mean GMIS (Global Man-Made Impervious Surfaces) value. Percentages and NDVI / GMIS values are calculated within a 500‑meter radius.
Sites_characteristics_1000m.csv
Indicates, for each population (identified with their three-letter acronym in the rows): the percentage of land‑use/land‑cover categories (either Trees, Crops, Built Areas, Bare Ground or Rangeland), the mean NDVI (Normalized Difference Vegetation Index) value and the mean GMIS (Global Man-Made Impervious Surfaces) value. Percentages and NDVI / GMIS values are calculated within a 1000‑meter radius.
Sites_characteristics_2000m.csv
Indicates, for each population (identified with their three-letter acronym in the rows): the percentage of land‑use/land‑cover categories (either Trees, Crops, Built Areas, Bare Ground or Rangeland), the mean NDVI (Normalized Difference Vegetation Index) value and the mean GMIS (Global Man-Made Impervious Surfaces) value. Percentages and NDVI / GMIS values are calculated within a 2000‑meter radius.
Sites_coordinates.csv
Contains decimal‑degree coordinates for all populations.
Population is the population studied, Region here contains both region (either Toronto, Ottawa, Montreal or Quebec) and habitat (either Urban or Natural) studied, Latitude is the latitude of the population in decimal degrees, Longitude is the longitude of the population in decimal degrees, Altitude is the altitude of the population in meters above sea level.
Size_Bombus.csv
Contains the sample size (n), mean ITD (ITD column), standard deviation (sd), standard error (se) and confidence interval (ci) for bumblebees pooled by population (Population column). N represents the sample size for each habitat (urban or natural) within each region. Habitat is the habitat type (either Urban or Natural) and Region is the region where data were collected (either Toronto, Ottawa, Montreal or Quebec).
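The per-population summary in Size_Bombus.csv can be approximated from individual ITD measurements. The sketch below is in Python (standard library only) rather than the R used by the dataset's scripts. It assumes the sample standard deviation (n - 1 denominator) and se = sd / sqrt(n); the README does not state how sd, se, or ci were computed, so these are assumptions, and the confidence interval is left out because neither its level nor its method is given. The records and population codes are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def summarize_itd(records):
    """Summarize (population, ITD) pairs as n, mean ITD, sd, and se per population."""
    groups = defaultdict(list)
    for population, itd in records:
        groups[population].append(itd)

    summary = {}
    for population, values in groups.items():
        n = len(values)
        # Sample sd needs at least two values; otherwise report NaN.
        sd = statistics.stdev(values) if n > 1 else float("nan")
        summary[population] = {
            "n": n,
            "ITD": statistics.mean(values),
            "sd": sd,
            "se": sd / math.sqrt(n),  # assumed definition: sd / sqrt(n)
        }
    return summary


# Hypothetical measurements in millimeters (not real data).
records = [("AAA", 4.0), ("AAA", 5.0), ("AAA", 6.0), ("BBB", 3.5)]
summary = summarize_itd(records)
```

Populations with a single specimen get NaN for sd and se rather than an error, which keeps the table complete while flagging that no spread can be estimated.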
b. TPS_files
Contains all .tps files used for the morphometric analyses.
Each .tps file includes: a scale value and the Cartesian coordinates of landmarks and semilandmarks for each picture.
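As a rough illustration of that layout, the Python sketch below reads the common TPS convention: an `LM=n` line followed by n coordinate lines, then `KEY=value` lines such as `IMAGE`, `ID`, and `SCALE`. It assumes the files follow this convention, and it ignores coordinates stored under `CURVES=`/`POINTS=` blocks, so semilandmarks saved as curves would need extra handling. The sample text is hypothetical, and the dataset's own analyses use R, not this code.

```python
def parse_tps(text):
    """Parse TPS-formatted text into a list of specimen dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    specimens = []
    i = 0
    while i < len(lines):
        key, sep, value = lines[i].partition("=")
        key = key.strip().upper()
        if sep and key == "LM":
            # "LM=n" starts a specimen; the next n lines hold x y coordinates.
            n = int(value)
            coords = [
                tuple(float(v) for v in ln.split())
                for ln in lines[i + 1 : i + 1 + n]
            ]
            specimens.append({"landmarks": coords})
            i += 1 + n
            continue
        if sep and specimens:
            # Other KEY=value lines are attached to the current specimen.
            if key == "SCALE":
                specimens[-1]["scale"] = float(value)
            else:
                specimens[-1][key.lower()] = value.strip()
        # Lines without "=" outside an LM block (e.g. curve points) are skipped.
        i += 1
    return specimens


# Hypothetical single-specimen example (not taken from the dataset).
sample_tps = """LM=3
10.0 20.0
15.5 22.0
12.0 30.0
IMAGE=example.jpg
ID=0
SCALE=0.05
"""

specimens = parse_tps(sample_tps)
```

Multiplying each coordinate by the specimen's `scale` value converts pixel coordinates to the physical unit used when the scale was set.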
The filename structure is as follows:
- Begins with CG (Common Garden) or F (Field)
- Followed by the year (2021 or 2022)
- Then the three-letter region code (MTL, OTT, QUE, TOR)
- Then the three-letter population code
- Followed by Profile (posterior sepal) or Front (corolla)
- Rand indicates pictures were randomized before placing landmarks
- Final indicates the final files used for the publication
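The naming scheme above can be decoded programmatically when processing many files. The Python sketch below assumes the components are separated by underscores, hyphens, or spaces, which this README does not specify. The example filename and the population codes ABC and XYZ are hypothetical. Tokens are recognized mostly by value rather than by strict position, so a missing component simply stays unset.

```python
import re

REGIONS = {"MTL", "OTT", "QUE", "TOR"}
ENVIRONMENTS = {"CG": "Common Garden", "F": "Field"}


def parse_tps_name(filename):
    """Split a .tps filename into the components described in the README."""
    stem = filename.rsplit(".", 1)[0]
    # Assumed separators: underscore, hyphen, or whitespace.
    tokens = [t for t in re.split(r"[_\-\s]+", stem) if t]
    info = {
        "environment": None,
        "year": None,
        "region": None,
        "population_code": None,
        "view": None,
        "randomized": False,
        "final": False,
    }
    for index, tok in enumerate(tokens):
        upper = tok.upper()
        if index == 0 and upper in ENVIRONMENTS:
            info["environment"] = ENVIRONMENTS[upper]
        elif tok in ("2021", "2022"):
            info["year"] = int(tok)
        elif info["region"] is None and upper in REGIONS:
            info["region"] = upper
        elif (
            info["region"] is not None
            and info["population_code"] is None
            and len(tok) == 3
            and tok.isalpha()
        ):
            # First three-letter token after the region code.
            info["population_code"] = upper
        elif upper in ("PROFILE", "FRONT"):
            info["view"] = tok.capitalize()
        elif upper == "RAND":
            info["randomized"] = True
        elif upper == "FINAL":
            info["final"] = True
    return info


# Hypothetical filenames built from the documented components.
field_file = parse_tps_name("F_2021_MTL_ABC_Profile_Rand_Final.tps")
garden_file = parse_tps_name("CG_2022_TOR_XYZ_Front_Final.tps")
```

If the real filenames use a different separator or order, only the `re.split` pattern and the position check for the first token should need adjusting.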
Code/software
To run the code, download the Data folder and set it as the working directory.
The “Code” folder (Urbanization_correlates_with_flower_morphology_Code.zip) contains the following five scripts:
Geometric_Morphometrics_Front_Impatiens.R
- Code to run geometric morphometric analyses of the corolla of Impatiens capensis (R v4.3.3 or later).
Geometric_Morphometrics_Profile_Impatiens.R
- Code to run geometric morphometric analyses of the posterior sepal of Impatiens capensis (R v4.3.3 or later).
Google_Earth_Engine_Code.txt
- Code used in the Google Earth Engine to compute the NDVI map and calculate mean NDVI values for each population.
Pollinator_analysis_Impatiens.R
- Code to run analyses on pollinator communities and pollinator size of Impatiens capensis (R v4.3.3 or later).
Site_Analysis_Impatiens.R
- Code used to classify populations as urban or natural based on bioclimatic and urbanization intensity variables (R v4.3.3 or later).
