Skip to main content

Data used in: Phenological sensitivities to climate are similar in two Clarkia congeners: Indirect evidence for facilitation, convergence, niche conservatism, or genetic constraints

Cite this dataset

Mazer, Susan et al. (2022). Data used in: Phenological sensitivities to climate are similar in two Clarkia congeners: Indirect evidence for facilitation, convergence, niche conservatism, or genetic constraints [Dataset]. Dryad.


To date, most herbarium-based studies of phenological sensitivity to climate and of climate-driven phenological shifts fall into two categories: detailed species-specific studies vs. multi-species investigations designed to explain inter-specific variation in sensitivity to climate and/or the magnitude and direction of their long-term phenological shifts. Few herbarium-based studies, however, have compared the phenological responses of closely related taxa to detect: (1) phenological divergence, which may result from selection for the avoidance of heterospecific pollen transfer or competition for pollinators, or (2) phenological similarity, which may result from phylogenetic niche conservatism, parallel or convergent adaptive evolution, or genetic constraints that prevent divergence. Here, we compare two widespread Clarkia species in California with respect to: the climates that they occupy; mean flowering date, controlling for local climate; the degree and direction of climate change to which they have been exposed over the last ~115 years; the sensitivity of flowering date to inter-annual and to long-term mean maximum spring temperature and annual precipitation across their ranges; and their phenological change over time. Specimens of C. cylindrica were sampled from sites that were chronically cooler and drier than those of C. unguiculata, although their climate envelopes broadly overlapped. Clarkia cylindrica flowers ~ 3.5 days earlier than C. unguiculata when controlling for the effects of local climatic conditions and for quantitative variation in the phenological status of specimens. However, the congeners did not differ in their sensitivities to the climatic variables examined here; cumulative annual precipitation delayed flowering and higher spring temperatures advanced flowering. In spite of significant spring warming over the sampling period, neither species exhibited a long-term phenological shift. Precipitation and spring temperature interacted to influence flowering date: the advancing effect on flowering date of high spring temperatures was greater in dry than in mesic regions, and the delaying effect of high precipitation was greater in warm than in cool regions. The similarities between these species in their phenological sensitivity and behavior are consistent with the interpretation that facilitation by pollinators and/or shared environmental conditions generate similar patterns of selection, or that limited genetic variation in flowering time prevents evolutionary divergence between these species.


This study comprised five steps. First, we sampled reproductive specimens of each species from throughout its range to record each specimen’s date and site of collection, as well as a quantitative estimate of its phenological status, which ranged from bearing only flower buds to bearing only ripe fruits. Second, we compared the climatic conditions occupied by each species to assess their habitat preferences and tested for species differences in mean flowering date independent of local, long-term climatic conditions. Third, we used linear models to detect the degree of climate change that each species experienced over the past 112 – 119 years across their sampled range (1900 – 2012 for C. cylindrica and 1892-2011 for C. unguiculata). Fourth, we constructed and tested phenoclimatic models to detect the effects of local mean maximum spring temperature (Spring Tmax) and annual precipitation (AP) on each species’ flowering date, which was estimated as the day of the year (DOY) on which a reproductive specimen was collected; these linear models also controlled statistically for each specimen’s phenological status (estimated using a quantitative index of its reproductive progression; Love et al. 2019).

The phenoclimatic models tested here provide measures of the sensitivity of flowering phenology to climatic conditions estimated at two temporal scales: decadal and year-of-collection.  Specifically, each site of specimen collection was characterized by its mean climatic conditions from 1921-2010 (i.e., long-term climate, or climate “normals”), and by the deviation between climatic conditions at the site in the year of collection and the site’s mean long-term climate conditions (i.e., climate anomalies due to inter-annual variation in temperature and annual precipitation [AP]). When both kinds of parameters are included as explanatory variables in linear models designed to predict the DOY of flowering, the sensitivity of DOY to climate normals reflects a combination of local adaptation and plastic responses to spatial variation in chronic climatic conditions, while phenological responsiveness to interannual variation is due primarily or wholly to plastic responses to short-term local conditions.  Other investigators have used this approach, including Mazer et al.’s (2020) study of seed size variation in Clarkia; Pearson et al.’s (2022) investigation of Eschscholzia californica (California Poppy; Papaveraceae), and Parker’s (2022) study of five species of Arctostaphylos (Ericaceae) and Ceanothus (Rhamnaceae).

In all models, we controlled for variation in the phenological status of specimens, which can confound the relationship between a specimen’s collection date and the actual date of flowering onset. These models were then used to compare the direction and magnitude of the two species’ phenological sensitivities to local temperature and precipitation normal and anomalies, and to test for interactions between climate variables that may have influenced flowering time. Finally, we tested for phenological shifts in estimated flowering date within each species during the ~115-year sampling period to determine whether long-term temporal trends in flowering phenology were consistent with each species’ sensitivity to inter-annual variation in temperature and precipitation, and with the degree of climate change that each species experienced.

Study species

Clarkia is well-studied genus of ~40 species of self-compatible, annual, herbaceous wildflowers native to the western U.S. (Lewis and Lewis, 1955). Wherever they occur, populations of Clarkia are among the last spring wildflowers to bloom (typically flowering in May or June), and they produce dense and showy floral displays. The two taxa selected for the current study – Clarkia cylindrica ssp. clavicarpa W. Davis (subgenus Peripetasma) and C. unguiculata Lindley (subgenus Phaeostoma) – inhabit open or disturbed habitats and roadsides in the foothills, grasslands, and oak/pine woodlands of the Coastal Ranges, Transverse Ranges, and Sierra Nevada in California. These taxa are adapted to a Mediterranean climate, although the sites sampled for the current study vary widely with respect to long-term conditions. Among sites, long-term MAP estimated from 1921-2010 ranges from 141 – 1377 mm, and mean spring Tmax for the same period ranges from 9.9-24.9 °C. Clarkia cylindrica has been described as “normally outcrossing” (Davis, 1970), and C. unguiculata is predominantly outcrossing, although populations differ in their outcrossing rates (Vasek, 1958; Hove et al., 2016; Ivey et al., 2016). Both species are restricted to California, are typically found at elevations below 1500m, and are diploid (n=9).

The two species have been found to co-occur regularly in the southern Sierra Nevada (Moeller, 2004), but they are also found with other congeners. Lewis and Lewis (1955) reported that, at the time of the publication of their 1955 monograph on Clarkia, C. unguiculata and C. cylindrica could be found in mixed colonies with, respectively, any of 17 or 10 other Clarkia species (including each other). So, the potential for these species to compete for pollinators or to facilitate the pollination of congeners is not restricted to their interactions with one another. In addition, Lewis and Lewis (1955) found these two species to be genetically incompatible; no hybrids were formed following interspecific pollination. This incompatibility means that, while divergent flowering times might help sympatric populations of C. unguiculata and C. cylindrica to avoid interspecific pollination and stigma clogging, phenological divergence is not needed to achieve reproductive isolation.

Both species are pollinated by a small number of specialist, solitary “Clarkia bees” (Lewis and Lewis, 1955; MacSwain et al., 1973) that are attracted to the conspicuously pigmented, flecked, and spotted cup-shaped flowers of C. cylindrica and the rotate, clawed-petalled flowers of C. unguiculata, both of which provide pollen and nectar rewards (Fig. 1). Clarkia cylindrica is commonly visited for its pollen by Andrena lewisorum Thorp, which has also been found to visit C. unguiculata (MacSwain et al., 1973). Other bee species that visit C. cylindrica and C. unguiculata (but not necessarily at the same locations, at similar frequencies per plant species, or with equivalent pollination efficacy) include: Andrena omninigra Viereck, Hesperapis regularis (Cresson), Megachile gravita Mitchell, M. pascoensis Mitchell, Diadasia angusticeps Timberlake, Melissodes clarkiae LaBerge, Synhalonia venusta carinata (Timberlake), Ceratina sequoiae Michener, Lasioglossum pullilabre (Vachal) and Apis mellifera Linnaeus (MacSwain et al., 1973).

Herbarium data

The two Clarkia species examined here are well represented in the holdings of the Consortium of California Herbaria. For the current study, we borrowed specimens of each species from the Jepson Herbarium (JEPS) and the University Herbarium (UC) at the University of California, Berkeley; Rancho Santa Ana Botanic Garden (RSA); the University of California, Riverside (UCR), and the Santa Barbara Botanic Garden (SBBG). Each specimen’s label was examined to extract its collection date (recorded as the day of year on which the specimen was collected [1-365 or 366 on leap years], year of collection, elevation (m), latitude and longitude. Specimens that lacked GPS coordinates were georeferenced using the label’s description of the collection site and the online utility, GEOLOCATE ( Where elevation was not recorded on the label, we used the specimen’s latitude and longitude (with GEOLOCATE) to estimate its elevation. Only specimens that bore one or more individual plants that had begun to reproduce (i.e., bearing flower buds, open or spent flowers, expanding ovaries, or fruits) at the time of collection were included in this study. Individual plants that had not begun to reproduce were not scored.

Because herbarium specimens may have been collected at any time during an individual’s reproductive cycle, a specimens’ collection date (DOY) is not a precise proxy for date of onset of flowering. Moreover, DOY is generally positively correlated with an individual plant’s phenological status such that, under similar environmental conditions, individuals collected at early stages of reproduction (e.g., when bearing only closed flower buds) are represented by earlier DOYs than those collected at later stages (e.g., when bearing only ripe fruits). Because DOY is confounded with reproductive stage, predictive models that control for the phenological status of sampled plants when examining the relationship between DOY and climatic conditions explain a higher proportion of the variance in DOY than those that do not control for this source of variance (Love et al., 2019).

To provide a quantitative estimate of each specimen’s phenological status, we used a phenological index (PI) that ranges from 1 (for plants comprised entirely of buds) to 4 (for plants comprised of all fruits) (Love et al., 2019). For each individual plant specimen (including multiple plants when a given herbarium sheet contained more than one individual), we counted the numbers of buds (> 5mm in length), open flowers, wilted flowers or expanding ovaries, and fully developed fruits (full-sized and/or beginning to dehisce). Each organ type was assigned a value that reflected its developmental stage (buds=1; open flowers = 2; spent flowers or expanding ovaries = 3; fully developed fruits = 4) and used in the following equation:

where Px represents the proportion of reproductive organs in a given stage and i represents the value assigned to that class (e. g., buds have a value of 1). The PI is therefore the weighted average of the proportions of buds, open flowers, spent flowers or developing ovaries, and ripe fruits. For herbarium sheets on which multiple, complete individuals were pressed, we assigned the sheet the mean of the PI values of its component individuals. Partial individuals were not scored for their PI. For C. unguiculata, a total of 608 plants on 231 sheets were scored; for C. cylindrica, a total of 1042 plants on 306 sheets were scored.

Duplicate specimens were considered to be those that were: collected < 500 m away from the nearest specimen in both latitude and longitude; collected on the same day of the same year; and represented by the same mean annual temperature normal, mean spring maximum temperature normal, and annual precipitation normal (as extracted using the climate database, ClimateNA.) Two sites separated by latitudinal and longitudinal distances of 500 m would be a linear distance of ~707 m apart, and were usually associated with distinct climate variables due to a difference in slope, aspect, or elevation, which ClimateNA uses to estimate climatic parameters. Following the elimination of duplicate specimens, 226 sheets of C. unguiculata collected from 1902-2011 and 284 sheets of C. cylindrica collected from 1900-2012 were analyzed here.

Within each species, some specimens were retained even if they were collected < 707m from another collection site. For 85 specimens of Clarkia cylindrica, the distance to the nearest collection site was < 707m, but only 24 of these specimens (distributed in 10 groups of 2 or 3 specimens) were collected on the same day in the same year as a nearby specimen. Based on the climate data retrieved from ClimateNA, the 2-3 specimens in each of these groups were represented by different combinations of mean annual temperature, annual precipitation, and mean Spring Tmax (likely due to differences between them in slope, aspect, and/or elevation, for which ClimateNA interpolated distinct climate parameters), and so they were retained for analysis. Twenty-two of the 226 specimens of Clarkia unguiculata were collected < 707m from the next closest site. Of these specimens, two were collected on the same day in the same year as their nearest neighbor, but 40 meters apart in elevation.

The range of PI values recorded for the sheets of C. cylindrica was 1.0-3.77; the range for C. unguiculata was 1.0 – 3.95. In all analyses described below, the PI was log-transformed to improve normality.


Climate data

We evaluated climate variables commonly found to influence flowering date in other taxa (Anderson et al., 2012; Cleland et al., 2012; Park and Mazer, 2018; Berg et al., 2018): annual precipitation (AP; this includes cumulative rain and snow, with the latter converted to water-equivalents); mean maximum spring temperature (Spring Tmax, the mean maximum daily temperature from March-April), mean minimum spring temperature (Spring Tmin), and the number of frost-free days in winter and in spring (NFFD). For each collection site and climatic parameter, two values were extracted from ClimateNA (Wang et al., 2016), an application that assembles downscaled monthly climatic parameters (e.g., the mean of daily values) for a wide range of parameters recorded from 1901 onwards. First, we obtained the long-term mean value of each parameter from 1921-2010. Second, we extracted the climate conditions for the year of collection (YOC). For each collection site x year combination, we then calculated the deviation between the annual conditions in the YOC and long-term climate mean (hereafter referred to as the “normal”). The value of this deviation (referred to here as the “anomaly” for the focal variable) indicates whether, in the year of specimen collection, a given site was warmer- (or cooler) or drier- (or wetter) than its long-term average.

Exploratory linear models were constructed and tested to determine whether spring Tmax or spring Tmin better explained variation in DOY. Each of two models was tested separately on C. cylindrica and C. unguiculata, and then on the pooled data including both species. The first model included the following variables as predictors: Log(PI), AP normal, Spring Tmax normal, AP anomaly, and spring Tmax anomaly. The second included the same predictors but using the Spring Tmin normals and anomalies instead of Tmax. The models in which the predictor variables included Spring Tmax performed better (had higher R2 values) than those that included Spring Tmin (Table A1). Because Spring Tmax and Spring Tmin are collinear among collection sites (Spring Tmax normal vs. Spring Tmin normal: r = 0.55, P < 0.0001, n=509; Spring Tmax anomaly vs. spring Tmin anomaly: r = 0.75, P < 0.0001, n=509), we did not include both in the same model.

In addition to examining the effects of Annual Precipitation, Spring Tmax, and Spring Tmin on flowering date, we used data from ClimateNA to calculate the anomalies for Winter Tmax, Winter Tmin, precipitation as snow (PAS), and the number of frost-free days (NFFD) in winter and spring in order to examine long-term temporal trends in climate over the ~115 years of specimen collection represented by the data analyzed here.



Climatic conditions and geographic locations occupied by each species. To compare species with respect to the combinations of climatic and geographic conditions they occupy throughout their sampled ranges, we examined the bivariate space occupied by each species’ collection sites with respect to their Spring Tmax and AP normals, as well as their elevation, latitude and longitude. Linear models were then constructed to test for significant differences between species’ means with respect to each of the four focal climate variables (AP normal, Spring Tmax normal, AP anomalies, and Spring Tmax anomalies) while controlling for variation in the other three variables. In each of these models, the focal climate variable was the response variable and the other three climate variables and Species were included as fixed main effects. One-way ANOVAs were conducted to compare species’ means with respect to elevation.

Climate change during sampling period: Temporal change in climate anomalies. Analyzing each species separately over its ~115-year sampling period, we conducted simple regressions to test for temporal trends in each of our focal climate variables: AP, PAS, winter Tmin, spring Tmin, winter Tmax, spring Tmax, winter NFFD, and spring NFFD. In each regression, the anomaly for a given climate variable was included as the response variable and Year as the independent variable. In these analyses, positive slopes of the regression of the temperature- (or precipitation-) based anomalies on year mean that, as time progresses, the sampled sites are becoming warmer (or wetter) than average.

Factors influencing flowering date: species identity, spring Tmax and AP normals and anomalies, and phenological status. We constructed and tested a suite of linear models to detect significant differences between species in mean DOY and to measure the independent effects on DOY of the normals for AP and spring Tmax, of the AP and spring Tmax anomalies, and of the phenological status (estimated as log[PI]) of individuals. Because interactions between climate variables can influence the interpretation of their individual effects on DOY, we also sought evidence for significant interactions between each pair of the climate variables that were included as main effects. This analysis was conducted in several steps; in all models, DOY was the response variable.

The first model tested only for the following main fixed effects: Species, Log(PI), AP and spring Tmax normals, and AP and spring Tmax anomalies. We then tested each of 15 two-way interactions by adding one of the following interactions to the first model to test its contribution to variance in DOY: AP normal|Tmax normal, AP normal|AP anomaly, MAP normal|Tmax anomaly, Tmax normal|AP anomaly, Tmax normal|Tmax anomaly, AP anomaly|Tmax anomaly, LogPI|Species, AP normal|Species, spring Tmax normal|Species, AP anomaly|Species, Spring Tmax anomaly|Species, AP normal|Log(PI), Spring Tmax normal|Log(PI), AP anomaly|Log(PI), and Spring Tmax anomaly|Log(PI). We chose to test the two-way interactions individually rather than to include all of them in a single model in order to facilitate the biological interpretation of the coefficients of each interaction and main effect. Additionally, potential collinearity between predictors (such as the cross-products that comprise interactions) can lead to variance inflation, making detection of significant effects difficult (i.e., increased Type II errors).

Among all of these models, the only significant two-way interaction was the AP normal|Tmax normal interaction (see Results). This interaction term was also statistically significant (P = 0.0178) when tested in a model that included the six main effects above plus all of  two-, three-, and four-way interaction terms between and among the four climate variables (AP and Spring Tmax normal and anomalies; results not shown). We then tested for differences between species with respect to this interaction by constructing and testing a linear model that included the following predictors: Species, Log(PI), AP and Spring Tmax normals, AP and Spring Tmax anomalies, and the AP normal|Tmax normal and the Species|MAP normal|Tmax normal interaction terms.

Long-term phenological shifts. To detect long-term phenological shifts in flowering dates in each species across the sampling period, we constructed and tested linear models in which DOY was the response variable and log(PI), year, latitude, longitude, and elevation were treated as fixed independent variables. In these main-effects models, there was no significant effect of year on DOY. We then tested for two-way interactions between year and each geographic attribute: Year|Latitude, Year|Longitude, and Year|Elevation. Each two-way interaction was tested separately by adding it to the main-effects model. In the absence of any significant two-way interaction, a significant (or non-significant) effect of Year on DOY could be interpreted as an effect of Year that is independent of the effects on DOY of latitude, longitude, or elevation. In turn, a significant interaction would indicate that the rate of change in DOY over time (i.e., the effect of Year) varies among specimens collected in different latitudes, longitudes, or elevations.

All linear models were conducted using the lm function (stats 4.0.2 of the R Stats Package) in R Studio version 1.2.5042; figures were created using visreg v2.7.0 (Breheny and Burchett, 2017), and ggplot2 v3.3.2 (Wickham 2016). In all of these linear models, significance testing was conducted using Type III sums of squares; the effects on DOY of each independent variable or interaction term was tested when placed last into the model. In multiple linear models that included species as an independent variable, the lsmeans and eff_size functions (in the emmeans package) were used to test for significant differences between species’ least squares means and for effect sizes.

Usage notes

Please refer to the README.txt file. 


National Science Foundation, Award: DBI-1802181

National Park Service, Award: Cooperative Agreement H8C07080001

University of California, Santa Barbara, Award: Chancellor's Fellowship to Tadeo Ramirez-Parada