Retention of nonfunctional traits over evolutionary time is puzzling, because the cost of trait production should drive loss. Indeed, several studies have found nonfunctional traits are rapidly eliminated by selection. However, theory suggests that complex genetic interactions and a lack of genetic variance can constrain evolution, including trait loss. In the mustard family Brassicaceae the conserved floral condition includes four long and two short stamens, but we show that short stamens in the highly self-pollinating mustard Arabidopsis thaliana do not significantly increase selfed seed set, suggesting that the trait has lost most or all of its function after the transition to selfing. We find that short stamen loss is common in native populations. Loss is incomplete and decreases with increasing latitude, a cline unexplained by correlations with flowering time or ovule count (which also vary with latitude). Using recombinant inbred lines derived from a cross between plants at the latitudinal extremes of the native range, we found three QTLs affecting short stamen number, with epistasis among them constraining stamen loss. Constraints on stamen loss from both epistasis and low genetic variance may be augmented by high selfing rates, suggesting that these kinds of constraints may be common in inbred species.
Native range accessions common garden data
This is a csv file containing all the data from our analysis of geographic variation in stamen and ovule number. Each row represents a single flower. Multiple flowers were sampled per plant, multiple plants per accession, and multiple accessions per geographic location. Geographic location is in column 1 (Abbrev.), populated with abbreviations for the locations (see Table S1 for more information). Accessions (genotypes, maintained by self-pollination from a single field collection) are indicated by unique numbers in column 2 (Stock#). Individual plant IDs are in column 3; these numbers are unique within a Stock# but repeated across Stock#s (e.g., both stock #36 and #22582 will have plants 1, 2, 3...). Column 4 (Coll. Date) is the date the flower was collected, which was used to estimate plant flowering times. Column 5 (fl rank) gives the rank of the flower. Flowers on the main flowering stalk were numbered from the first to open (at the bottom), with floral rank increasing up the stalk. Flowers on side branches were not assigned ranks and were instead labeled "axil." Columns 6 and 7 give the number of long and short stamens in the flower, respectively. Column 8 gives the number of ovules in the flower; those data are only available for a subset of flowers. The remaining columns give the latitude and longitude, with hours, minutes, and seconds separated into different columns (e.g., "lat h" is hours, "lat m" is minutes, "lat s" is seconds).
Native range accessions common garden.csv
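Because the coordinates are split across three columns, they usually need to be combined into decimal values before latitude can be used as a covariate. The following is a minimal sketch of that conversion, not the authors' code. It assumes that the "lat h" column holds the whole-degree part of the coordinate (the description above calls it "hours") and that any sign is carried by that column. The row shown is made up for illustration.

```python
def dms_to_decimal(whole, minutes, seconds):
    """Combine whole/minutes/seconds parts into one signed decimal value.

    Assumption: the sign of the coordinate, if any, is carried by `whole`.
    """
    sign = -1.0 if whole < 0 else 1.0
    return sign * (abs(whole) + minutes / 60.0 + seconds / 3600.0)


# Made-up row keyed by the column labels quoted in the description above.
row = {"lat h": "62", "lat m": "48", "lat s": "36"}

latitude = dms_to_decimal(
    float(row["lat h"]), float(row["lat m"]), float(row["lat s"])
)
print(round(latitude, 4))
```

A coordinate whose whole part is zero cannot carry a sign this way, so a file containing such values would need a separate hemisphere flag.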
RIL stamen number by flower, for variance partitioning analysis
This is a csv file with the RIL short stamen number data used for partitioning variance. Each row is a single flower. Column 1, "Line" is the recombinant inbred line (RIL). "Plant.nested.in.line," Column 2, gives the number identifying each individual plant (each plant within a RIL has a unique number, but those numbers are replicated across RILs). Column 3, "shorts," is the number of short stamens in each flower.
RILs_forVariance.csv
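The nested layout of this file (flowers within plants within lines) can be illustrated with a simple sums-of-squares decomposition. This is a sketch only and is not necessarily the variance partitioning method used in the manuscript. The column names come from the description above; the rows are made up so that the example runs without the data file.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the layout described above; not real measurements.
SAMPLE = """Line,Plant.nested.in.line,shorts
1,1,2
1,1,1
1,2,2
1,2,2
2,1,0
2,1,1
2,2,0
2,2,0
"""


def mean(values):
    return sum(values) / len(values)


def nested_sums_of_squares(rows):
    """Split total variation in `shorts` into line, plant-within-line,
    and flower-within-plant sums of squares."""
    by_line = defaultdict(list)
    by_plant = defaultdict(list)
    for r in rows:
        y = float(r["shorts"])
        by_line[r["Line"]].append(y)
        # Plant numbers repeat across lines, so the key includes the line.
        by_plant[(r["Line"], r["Plant.nested.in.line"])].append(y)

    all_values = [y for ys in by_line.values() for y in ys]
    grand = mean(all_values)

    ss_line = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in by_line.values())
    ss_plant = sum(
        len(ys) * (mean(ys) - mean(by_line[line])) ** 2
        for (line, _plant), ys in by_plant.items()
    )
    ss_flower = sum((y - mean(ys)) ** 2 for ys in by_plant.values() for y in ys)
    ss_total = sum((y - grand) ** 2 for y in all_values)
    return {"line": ss_line, "plant": ss_plant, "flower": ss_flower, "total": ss_total}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ss = nested_sums_of_squares(rows)
print(ss)
```

To run this on the real file, replace `io.StringIO(SAMPLE)` with an opened handle to RILs_forVariance.csv. The three components always sum to the total, which is a useful sanity check on the grouping keys.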
Short stamen number for parents of RILs, for variance partitioning
This is a csv file with the RIL parent short stamen number data used for partitioning variance. Each row is a single flower. Column 1, "ShortStamen," is the number of short stamens in each flower. Column 2, "Pop", indicates which parent the flower was sampled from (Italian or Swedish). Column 3, "Line," is the parental subline (see manuscript). Column 4, "Rep," gives the number identifying each individual plant (each plant within a parental subline has a unique number, but those numbers are replicated across sublines).
RILparents_forVariance.csv
Variation in seed set with natural variation in short stamen number
This is a csv file with the data used to assess function of short stamens using natural variation in short stamen number. Each row corresponds to a single flower/fruit. Column 1, "Pop," identifies the geographic location the accession is associated with. Column 2, "Stock#," gives the ID of the accession (when available, these are ABRC stock numbers). Multiple accessions were sampled per location. Multiple plants per accession were also sampled; individual plant IDs are in column 3, "PlantUnique." Columns 4 and 5 are the number of long and short stamens, respectively. The number of seeds produced by the fruit is in column 6.
StamenLossFunction_NaturalVariation.csv
Short stamen function, stamen removal test data
This is a csv file containing the data on short stamen function based on treatments removing stamens in different configurations. Each row is a single flower/fruit. Column 1 specifies which geographic location the accession was originally from (see Table S1). Column 2 ("plant") gives a unique ID for the specific plant the flower was sampled from. Each plant received each treatment at least once. Column 3 ("treatment") shows which treatment the flower received ("remall" = all stamens removed; "remnone" = no stamens removed, but stamens probed with forceps as a control; "remlong" = only long stamens removed; "remshort" = only short stamens removed; "unmanip" = unmanipulated, not touched by forceps). Column 4 ("seed set") gives the resulting number of seeds produced by that flower/fruit, and column 5 ("year") shows which year the treatment was applied (in 2007 plants were kept in a growth chamber; in 2011 plants were put outside after treatments were applied until flowers wilted).
final stamen removal data for analysis.csv
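A first look at this file might compare mean seed set across the five treatment codes. The sketch below assumes the header labels "plant", "treatment", "seed set", and "year" quoted above; the name of the first column is not given in the description, so "location" is a hypothetical stand-in. The rows are made up so that the example runs without the data file and do not represent the study's results.

```python
import csv
import io
from collections import defaultdict

# Made-up rows; "location" is a hypothetical header for column 1.
SAMPLE = """location,plant,treatment,seed set,year
X,1,remall,0,2007
X,1,remnone,40,2007
X,1,remshort,38,2007
X,2,remnone,44,2007
X,2,remshort,42,2007
"""


def mean_seed_set_by_treatment(rows):
    """Return {treatment code: mean seed set} over all flowers."""
    seeds = defaultdict(list)
    for r in rows:
        seeds[r["treatment"]].append(float(r["seed set"]))
    return {trt: sum(v) / len(v) for trt, v in seeds.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
means = mean_seed_set_by_treatment(rows)
for treatment, value in sorted(means.items()):
    print(treatment, value)
```

Because each plant received each treatment at least once and the two years differ in growing conditions, a fuller analysis would account for plant and year rather than pooling flowers as done here.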
RIL line means for QTL and epistasis analyses
This csv file contains the RIL data used for QTL analysis, including the separate analysis of epistasis. Each row represents a single plant; multiple flowers were sampled per plant, so the stamen numbers are averages. The first three columns are total stamen numbers (long and short together): column 1, "QN_mean_stamen_number," is quantile-normalized; column 2, "no6s_stamen_number," includes only plants that had some stamen loss; column 3, "mean_stamen_number," is the raw mean. Column 4, "id," is the RIL number. Columns 5-7 give the genotype at each QTL (those on chromosomes 1, 3, and 5, respectively) for each line; "a" indicates lines homozygous for the Italian (stamen loss) allele, while "b" indicates lines homozygous for the Swedish (short stamen production) allele. Column 8 gives the mean short stamen number (the raw mean minus the four long stamens; mean_stamen_number - 4).
RIL line means_QTL and epistasis analyses.csv
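The relationship between the total and short stamen columns, and the kind of genotype-class summary that precedes an epistasis test, can be sketched as follows. This is an illustration only and not the authors' QTL analysis. The headers "mean_stamen_number" and "id" come from the description above, while "qtl_chr1", "qtl_chr3", and "qtl_chr5" are hypothetical names for the three genotype columns. The rows are made up.

```python
import csv
import io
from collections import defaultdict

# Made-up rows; the three qtl_* headers are hypothetical names.
SAMPLE = """mean_stamen_number,id,qtl_chr1,qtl_chr3,qtl_chr5
6.0,1,b,b,b
5.5,2,a,b,b
5.0,3,a,a,a
5.2,4,a,a,a
"""

LONG_STAMENS = 4  # the description subtracts four long stamens


def short_stamen_means_by_genotype(rows):
    """Return {(chr1, chr3, chr5) genotype: mean short stamen number}."""
    groups = defaultdict(list)
    for r in rows:
        short = float(r["mean_stamen_number"]) - LONG_STAMENS
        key = (r["qtl_chr1"], r["qtl_chr3"], r["qtl_chr5"])
        groups[key].append(short)
    return {key: sum(v) / len(v) for key, v in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
genotype_means = short_stamen_means_by_genotype(rows)
for genotype, value in sorted(genotype_means.items()):
    print(genotype, round(value, 3))
```

With three biallelic loci in homozygous lines there are at most eight genotype classes. Comparing class means against the values expected from summing single-locus effects is one informal way to see non-additivity before fitting a formal model.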