Datasets for microsatellite genotype for natural populations and seedlings from six mother trees, and seedling survival and growth records
Data files
Jan 31, 2025 version files, 146.84 KB
README.md (3.93 KB)
ShoreaCurtisiiPopulationGenotypeSTRUCTURE.csv (12.46 KB)
ShoreaCurtisiiSeedlingGenotypeCERVUS.csv (56.36 KB)
ShoreaCurtisiiSeedlingSurvivalGrowthRecord.csv (74.09 KB)
Abstract
To assess the genetic factors that affect the fitness of seedlings of Rubroshorea curtisii, a dominant canopy tree species in hill dipterocarp forests, the inter- and intra-population genetic structure of reproductive-stage individuals was studied, together with seedling survival rate and growth performance in relation to the bi-parental genetic relationship. A Bayesian clustering analysis revealed three genetically distinct clusters, which were observed in almost all populations throughout the distributional range of the species in the Malay Peninsula and which also provided the optimum explanation for the genetic structure of 182 mature individuals in two permanent plots in a hill dipterocarp forest. Two of the clusters showed greater genetic differentiation from the ancestral admixture population, whereas the third was not differentiated. A total of 460 seedlings derived from six mother trees in the plot were raised in a nursery, and their pollen donors were identified using genetic-marker-based paternity assignment. Seed weight, bi-parental genetic relatedness, and bi-parental genetic heterogeneity based on the clustering analysis were tested for their effects on seedling fitness. Greater bi-parental genetic heterogeneity was associated with a significantly higher probability of seedling survival and with greater vertical growth of the seedlings, whereas seed weight and genetic relatedness had no significant effect on either. This evidence suggests that fitter seedlings derived from mating between parents from different genetic clusters contribute to maintaining genetic diversity through negative frequency-dependent selection and may have an important role in adaptation in the tropical forest plant community.
README: Datasets for microsatellite genotype for natural populations and seedlings from six mother trees, and seedling survival and growth records
https://doi.org/10.5061/dryad.nzs7h451m
Description of the data and file structure
Population Genotype Data Set: The csv file "ShoreaCurtisiiPopulationGenotypeSTRUCTURE.csv" contains genotypes at 10 microsatellite markers for 163 Shorea curtisii DNA samples collected from 9 natural populations in the Malay Peninsula. The file is formatted for the software "STRUCTURE": the first line is intended to hold the names of the microsatellite markers from Ujino et al. (1998), Lee, Tani, Ng, and Tsumura (2004), and Lee et al. (2006), and it is followed by the genotype data of each individual. The first column gives the individual name, which consists of the abbreviation of the population name to the left of the underscore and the individual tag number to the right of it. The population abbreviations correspond to those in Tani et al. (2022). The second and third columns are the PopData and PopFlag fields of STRUCTURE, respectively. The microsatellite genotypes of each individual are recorded from the fourth column onward as PCR fragment lengths. Missing data are scored as "-9". Because of the requirements of the Dryad file upload system, the first line of the deposited file describes the contents of the columns instead. Before analysis, it should be replaced with the microsatellite locus names: shc04, shc07, shc09, sle074, sle384, sle293, sle562, sle566, slu044, slu175.
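The layout described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the deposited scripts: the function names, the comma separator used when rewriting the header, and the two-locus sample rows (labels and allele sizes) are all invented for the example, and the real file may arrange alleles differently.

```python
import csv
import io

# Locus names as listed in the README, in the order given there.
LOCI = ["shc04", "shc07", "shc09", "sle074", "sle384",
        "sle293", "sle562", "sle566", "slu044", "slu175"]

MISSING = "-9"  # missing-data code used in the STRUCTURE file


def split_label(label):
    """Split 'POP_TAG' into (population abbreviation, tag number)."""
    pop, _, tag = label.partition("_")
    return pop, tag


def replace_header(text, loci=LOCI, sep=","):
    """Return the file text with its first line replaced by locus names.

    The separator is an assumption; STRUCTURE itself reads
    whitespace-delimited input.
    """
    lines = text.splitlines()
    lines[0] = sep.join(loci)
    return "\n".join(lines) + "\n"


def parse_structure_csv(text):
    """Parse rows laid out as: label, PopData, PopFlag, allele sizes."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the first line (column description or locus names)
    records = []
    for row in reader:
        if not row:
            continue
        pop, tag = split_label(row[0])
        # Convert allele sizes to integers; the missing code becomes None.
        alleles = [None if a.strip() == MISSING else int(a) for a in row[3:]]
        records.append({"pop": pop, "tag": tag, "popdata": row[1],
                        "popflag": row[2], "alleles": alleles})
    return records


# Hypothetical two-locus sample; labels and allele sizes are invented.
SAMPLE = """label,popdata,popflag,alleles
AAA_001,1,0,152,156,-9,-9
BBB_017,2,0,150,150,201,205
"""

records = parse_structure_csv(SAMPLE)
print(records[0]["pop"], records[0]["alleles"])
```

Converting "-9" to `None` at load time keeps the missing-data code from being mistaken for a fragment length in any later summary.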
Seedling Genotype Data Set: The csv file "ShoreaCurtisiiSeedlingGenotypeCERVUS.csv" contains genotypes at 10 microsatellite markers for 145 mature trees (more than 20 cm in diameter at breast height) and 481 seedlings derived from the six mother trees in a plot of undisturbed forest at Semangkok Forest Reserve, Malay Peninsula. The first line gives the microsatellite marker names, followed by the genotypes of the mature individuals (lines 2 to 146) and then the genotypes of the seedlings (lines 147 to 627). The first column gives the names of the mature individuals and seedlings, and the microsatellite alleles are recorded from the second column onward as PCR fragment lengths. Missing data are scored as "0".
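As a rough sketch of how the line ranges above translate into code, the snippet below separates the mature trees from the seedlings and counts missing alleles. The function names and the tiny sample are hypothetical; only the count of 145 mature trees and the "0" missing code come from the description.

```python
import csv
import io

N_ADULTS = 145   # mature trees occupy file lines 2 to 146
MISSING = "0"    # missing-data code used in the CERVUS file


def split_cervus_rows(text, n_adults=N_ADULTS):
    """Return (header, adult rows, seedling rows) from the file text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # line 1: marker names
    rows = [r for r in reader if r]
    return header, rows[:n_adults], rows[n_adults:]


def missing_fraction(rows):
    """Fraction of allele cells scored as missing (column 1 is the ID)."""
    cells = [c.strip() for r in rows for c in r[1:]]
    if not cells:
        return 0.0
    return sum(c == MISSING for c in cells) / len(cells)


# Hypothetical one-locus sample with two adults and one seedling.
SAMPLE = """ID,locA_a,locA_b
T1,150,152
T2,0,0
S1,150,0
"""

header, adults, seedlings = split_cervus_rows(SAMPLE, n_adults=2)
print(len(adults), len(seedlings), missing_fraction(adults))
```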
Seedling Survival Growth Record: The csv file "ShoreaCurtisiiSeedlingSurvivalGrowthRecord.csv" is the input file for the survival analysis performed in the R statistical software using the package "survival" (Andersen & Gill, 1982; R Core Team, 2014) to fit maternal effects (seed weight, represented as w) and biparental effects (r, qdis and fdis) to the survival of seedlings. The same input file is also used for the growth analysis, which uses a generalized linear mixed model (GLMM) with a normal error structure in Stan 2.5 (Carpenter et al., 2017) through the package "RStan" in R 3.1.1 (Gelman, Lee, & Guo, 2015; R Core Team, 2014; Stan Development Team, 2020). The first line gives the column titles: Seedling ID, Indium (consecutive number of seedlings), MT ID (mother tree ID), Mtnum (consecutive number of mother trees), CanF ID (candidate pollen donor ID), CanFnum (consecutive number of candidate pollen donors), w (seed weight of each seedling), r (relatedness between the mother tree and the pollen donor of each seedling), mk1~mk3 (Qk values of the mother tree of each seedling), fk1~fk3 (Qk values of the pollen donor of each seedling), qdis (qdis statistic of each seedling), fdis (fdis statistic of each seedling), days (survival days of each seedling), and cens (censoring information for each seedling, where 1 indicates death and 0 indicates censoring at the end of observation). The columns following cens record the height of each seedling on each observation date, until the death of the seedling or the end of the observation period. The survival, growth and explanatory variables for each seedling are given from the second line onward.
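To show how the days and cens fields are read together, here is a minimal Kaplan-Meier estimator in plain Python. It is a deliberately simpler stand-in and not the Cox-type model fitted by the authors in R; the function name and the four example seedlings are invented for illustration.

```python
def kaplan_meier(days, cens):
    """Return [(time, survival probability)] at each observed death time.

    cens follows the README coding: 1 = death observed, 0 = censored
    at the end of observation.
    """
    if len(days) != len(cens):
        raise ValueError("days and cens must have the same length")
    survival = 1.0
    curve = []
    # Only times at which at least one death was observed change the curve.
    for t in sorted({d for d, c in zip(days, cens) if c == 1}):
        at_risk = sum(1 for d in days if d >= t)
        deaths = sum(1 for d, c in zip(days, cens) if d == t and c == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical records: two deaths (days 5 and 10), two censored seedlings.
example_days = [5, 10, 10, 20]
example_cens = [1, 1, 0, 0]
for time, prob in kaplan_meier(example_days, example_cens):
    print(time, round(prob, 3))
```

Censored seedlings stay in the risk set up to their last observation day but never count as deaths, which is why the 1/0 coding of cens matters when the file is loaded.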
Survival analysis: survival_coxph.r is the R script for the survival analysis.
Growth analysis: growth_glmmstan.r is the R script for the growth analysis.
Methods
Leaf or inner bark tissue was collected from 134 R. curtisii individuals representing nine natural populations throughout the natural distribution of the species across the Malay Peninsula (Table 1, Fig 1). Tissues were collected from individuals that were at least 20 m apart, regardless of the age or size of the trees, to avoid sampling genetically related individuals. Samples were stored at -20 ˚C prior to DNA extraction.
Semangkok Forest Reserve, a designated hill dipterocarp forest conservation area, is located in and governed by Selangor state, 60 km north of Kuala Lumpur on the Malay Peninsula. In 1993, Niiyama et al. (1999) established a 6-ha permanent plot (200 m × 300 m) in an undisturbed forest on a narrow ridge and steep slope, ranging from 340 to 450 m above sea level (3°37'07''N, 101°44'15''E). Another ca. 4-ha (100 m × 400 m) permanent plot was established within a selectively logged area of forest in 1994 and was extended to about 5.4 ha (ca. 140 m × 400 m) in 2007 (3°37'23''N, 101°44'15''E); it is only 200 m away from the 6-ha permanent plot (Yagihashi et al., 2010).
Leaf or inner bark samples were collected from 144 and 38 R. curtisii individuals (with dbh greater than 20 cm) from the undisturbed plot and the logged plot, respectively, of which 17 and three trees were growing in areas adjacent to the undisturbed and the logged plots, respectively. The 17 trees growing in areas adjacent to the undisturbed plot have previously been identified as candidate pollen donors (Tani et al., 2012; Tani et al., 2015). These samples were also stored at -20 ˚C before DNA extraction.
A sporadic synchronized flowering event was observed in Semangkok Forest Reserve in October 2011, and fruit set occurred around February 2012. Seeds were collected from six selected mother trees in the undisturbed plot. After the removal of wings from the seeds, seed weight was measured, and then the seeds were placed on seedbeds consisting of river sand on February 20th, 2012 to investigate germination. The result of the germination test is presented in Table 2. The germinated seedlings were potted in a mixture of river sand and soil without fertilizer on March 26th, 2012, and maintained in the nursery of the Forest Research Institute Malaysia (FRIM, 3°14'01''N, 101°38'00''E) under 50% shade, with a water sprinkler system for irrigation. Height and survival of seedlings were monitored weekly before and monthly after transplantation.
Genomic DNA was extracted using the method described by Murray & Thompson (1980). The material analyzed came from inner bark tissue of the adult trees in the research plots, from either inner bark or leaf tissue for the population samples, and from the leaves of the seedlings. The DNA extracted from the adult trees and population samples was further purified using a High Pure PCR Template Preparation Kit (Roche). After RNA digestion, the DNA was diluted to a concentration of about 2 ng/mL. All samples were genotyped at ten microsatellite markers previously developed by Ujino et al. (1998), Lee et al. (2004) and Lee et al. (2006). Polymerase chain reaction (PCR) amplifications were carried out in total reaction volumes of 10 μL using a GeneAmp 9700 (Applied Biosystems). The PCR mixture contained 0.2 μM of each primer, 1x QIAGEN Multiplex PCR Master Mix (Qiagen), and 0.5-3 ng of template DNA. The temperature profile was: 15 min at 95 ºC, then 30-35 cycles of 30 sec at 94 ºC, 90 sec at 50-57 ºC and 90 sec at 72 ºC, with a 10 min final extension step at 72 ºC. Amplified PCR fragments were separated electrophoretically using a 3100 genetic analyzer (Applied Biosystems) with a calibrated internal size standard (GeneScan ROX 400HD). The genotype of each individual was determined from the resulting electropherograms using GeneMarker (SoftGenetics). The microsatellite genotypic data from the population samples, the adult trees in the plots, and the seedlings in the nursery are referred to as the population genotype, plot genotype and seedling genotype, respectively. The plot genotypes were obtained in previous studies (Tani et al., 2012; Tani et al., 2015) and have already been deposited in Dryad (http://dx.doi.org/10.5061/dryad.7k434).
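As a small worked example of the temperature profile above, the programmed hold time can be added up for the lower and upper cycle counts. The function name is hypothetical, and the figure excludes the thermal cycler's ramp time between steps, so an actual run would take longer.

```python
def pcr_hold_seconds(cycles):
    """Total programmed hold time, in seconds, for the stated profile."""
    initial_denaturation = 15 * 60   # 15 min at 95 ºC
    per_cycle = 30 + 90 + 90         # denaturation + annealing + extension
    final_extension = 10 * 60        # 10 min at 72 ºC
    return initial_denaturation + cycles * per_cycle + final_extension


for n_cycles in (30, 35):
    print(n_cycles, "cycles:", pcr_hold_seconds(n_cycles) / 60, "min")
```

With 30 cycles the holds sum to 130 minutes, and with 35 cycles to 147.5 minutes.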