Microsatellite data, chloroplast and nuclear rRNA sequences of Avicennia marina from Vietnam, Malaysia, and The Philippines
Data files
May 12, 2025 version files 132.06 KB
- Microsatellite_data_15_loci_10_pops_Vietnam.xlsx (91.83 KB)
- README.md (1.53 KB)
- Suppl_table_S3_cpDNA_mutations.xlsx (25.68 KB)
- Suppl_Table_S4_rRNA_cistron.xlsx (13.01 KB)
Abstract
Mangrove forests maintain connectivity and stay genetically linked through ocean-dispersed propagules. Most propagules are dispersed over relatively short distances, but those reaching open waters can be transported over relatively long distances. Avicennia species exhibit a pronounced genetic structure across varying distances following a stepping-stone migration model, with connectivity patterns linked to the strength and direction of ocean-surface currents. Here, we use the present-day spatial genetic structure of A. marina populations as an imprint of connectivity. This should allow us to estimate their migration history in relation to coastal configuration and Holocene sea-level rise on the Sunda Shelf. We examined the genetic diversity and structure, as well as the demographic and evolutionary history of establishment, of ten A. marina populations across coastal stretches of Vietnam, using nuclear microsatellite markers in 558 individual trees. Additionally, genome skimming of 24 samples allowed detailed analysis of complete chloroplast genome and nuclear ribosomal cistron sequences.
Dataset DOI: 10.5061/dryad.3r2280gss
Description of the data and file structure
The microsatellite data comprise 15 loci for individual genotypes in 10 Avicennia marina populations of Vietnam.
The chloroplast and nuclear rRNA cistron sequences were obtained from NGS data of Avicennia marina individuals from Vietnam, Malaysia, and The Philippines.
Files and variables
File: Microsatellite_data_15_loci_10_pops_Vietnam.xlsx
Description: nuclear microsatellites of Avicennia marina
Variables
- 15 loci
- 10 populations of Vietnam
File: Suppl_table_S3_cpDNA_mutations.xlsx
Description: Chloroplast DNA with 128 mutational changes, namely 35 transitions (ti), 43 transversions (tv), 9 insertion-deletions (indel) and 41 mononucleotide repeats (µsat) in Avicennia marina individuals of ten populations in Vietnam, three in West Malaysian Peninsula and seven in The Philippines.
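As a quick consistency check on the description above, the four mutation classes can be tallied to confirm that they add up to the stated 128 changes. This is only a minimal sketch: the counts are copied from the description, while the dictionary keys and variable names are illustrative and do not reflect the column names in the spreadsheet.

```python
# Mutation classes as listed in the description of
# Suppl_table_S3_cpDNA_mutations.xlsx (key names are illustrative).
mutation_counts = {
    "transition": 35,
    "transversion": 43,
    "indel": 9,
    "mononucleotide_repeat": 41,
}

# Total number of mutational changes across all four classes.
total_mutations = sum(mutation_counts.values())

# Transition/transversion ratio, using the two substitution classes only.
ti_tv_ratio = mutation_counts["transition"] / mutation_counts["transversion"]

print(total_mutations)        # 128
print(round(ti_tv_ratio, 2))  # 0.81
```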
File: Suppl_Table_S4_rRNA_cistron.xlsx
Description: Incomplete homogenization at 21 nucleotide positions of the nuclear ribosomal RNA cistron (SSrRNA, ITS1, 5.8S rRNA, ITS2, LSrRNA) in Avicennia marina individuals of ten populations in Vietnam, three in West Malaysian Peninsula and seven in The Philippines. Colors indicate the positions with incomplete homogenization for each region.
Code/software
Any tabular data software
Genomic DNA was extracted from approximately 20 mg of dried leaf tissue per sample using the E.Z.N.A. SP Plant DNA Mini Kit (Omega Bio-tek, Norcross, GA, USA). DNA concentrations of individual samples ranged from 10 to 200 ng/µl. The multiplexed PCR reactions comprised 15 microsatellite markers: Avma1, Avma02, Avma03, Avma05, Avma6, Avma8, Avma10, Avma14, Avma17 (Geng et al., 2007); Am3, Am81 (Maguire et al., 2000a); Aa22, Aa23, Aa67 (Teixeira et al., 2003); and AMK6 (Triest et al., 2020). Primers were fluorescence-labelled with four different dyes (6FAM/VIC/NED/PET) and combined in a primer mix at 0.2 µM of each primer. Each multiplex PCR reaction contained 6.25 µl master mix (Qiagen Multiplex PCR kit), 1.25 µl primer mix, 2.5 µl H2O, and 2.5 µl genomic DNA. PCR was performed in a thermal cycler (Bio-Rad MyCycler) with an initial denaturation at 95 °C for 15 min, followed by 35 cycles of 30 s denaturation at 95 °C, 90 s annealing at 57 °C, and 80 s elongation at 72 °C, and a final extension of 30 min at 60 °C. All PCR products were separated on an ABI3730XL sequencer (Macrogen, Seoul, Korea) and allele sizes were determined with GeneMarker v.2.60 (SoftGenetics LLC, State College, USA).
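The reaction volumes and cycling profile above lend themselves to a short worked calculation. The sketch below sums the per-reaction volumes, scales the shared reagents for a batch, and adds up the programmed hold times. The `overage` allowance and the function names are assumptions made for illustration and are not part of the described protocol; ramping time between temperature steps is ignored.

```python
# Per-reaction volumes in µl, as stated in the text.
REACTION_UL = {
    "master_mix": 6.25,
    "primer_mix": 1.25,
    "water": 2.5,
    "genomic_dna": 2.5,
}


def batch_volumes(n_reactions, overage=0.10):
    """Scale the shared reagents (everything except template DNA) for a batch.

    `overage` is a hypothetical pipetting allowance, not a value from the text.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if name != "genomic_dna"
    }


def programmed_minutes(cycles=35):
    """Programmed hold time of the thermal profile, ignoring ramping."""
    initial_denaturation_s = 15 * 60
    per_cycle_s = 30 + 90 + 80  # denaturation + annealing + elongation
    final_extension_s = 30 * 60
    total_s = initial_denaturation_s + cycles * per_cycle_s + final_extension_s
    return total_s / 60


total_reaction_ul = sum(REACTION_UL.values())
print(total_reaction_ul)               # 12.5
print(round(programmed_minutes(), 1))  # 161.7
```

For example, `batch_volumes(10, overage=0)` gives 62.5 µl of master mix for ten reactions with no allowance.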
A comparative analysis of Southern Vietnam populations with A. marina from The Philippines (Leyte) and Peninsular Malaysia (Southwest and West) was done for the nine microsatellite loci available overall (out of the 15 loci listed above): Avma1, Avma02, Avma6, Avma8, Avma10, Avma14, Avma17, Am3 and Am81. These nine loci were used only for positioning in a combined PCoA and for evolutionary model testing with ABC-RF (see further under microsatellite analysis).
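The relation between the full 15-locus panel and the nine-locus comparative set can be made explicit in a few lines. This is a minimal sketch: the locus names come from the text, whereas the `{locus: (allele1, allele2)}` genotype layout and the helper `subset_genotype` are hypothetical and do not describe the layout of the spreadsheet.

```python
# The 15 loci genotyped in the Vietnam populations (from the text).
ALL_LOCI = [
    "Avma1", "Avma02", "Avma03", "Avma05", "Avma6", "Avma8", "Avma10",
    "Avma14", "Avma17", "Am3", "Am81", "Aa22", "Aa23", "Aa67", "AMK6",
]

# The nine loci used for the comparison with The Philippines and Malaysia.
SHARED_LOCI = [
    "Avma1", "Avma02", "Avma6", "Avma8", "Avma10",
    "Avma14", "Avma17", "Am3", "Am81",
]

# Loci in the full panel that are left out of the comparative set.
excluded_loci = [locus for locus in ALL_LOCI if locus not in SHARED_LOCI]


def subset_genotype(genotype):
    """Keep only the shared loci from a {locus: (allele1, allele2)} mapping.

    The mapping layout is a hypothetical representation for illustration.
    """
    return {locus: genotype[locus] for locus in SHARED_LOCI if locus in genotype}


print(excluded_loci)
# ['Avma03', 'Avma05', 'Aa22', 'Aa23', 'Aa67', 'AMK6']
```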
For the complete chloroplast genome and rRNA cistron analysis through genome skimming, we considered sixteen individuals covering all Vietnam sites (one in each transect from N1 to C8 and double in S9 and S10). For comparison, we newly included and analyzed seven individuals from the western coast of Leyte Island in The Philippines (one from each population AM5, AM6, AM8, AM9, AM14, AM15, AM16 as mentioned in Triest et al., 2021a) and three individuals from the western coast of Peninsular Malaysia (one from each population W1, W2, W3 as mentioned in Triest et al., 2021b). Genomic DNA of the 26 samples was extracted at the Plant Biology and Nature Management (APNA) lab of the Vrije Universiteit Brussel (VUB) using the E.Z.N.A. SP Plant DNA Mini Kit (Omega Bio-tek, Norcross, GA, USA) and processed for next-generation sequencing. Quantity and purity (260/280 and 260/230 ratios) of the DNA were determined using a NanoDrop One spectrophotometer (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Extractions were repeated for samples with a 260/280 ratio below 1.8 and/or a concentration lower than 5 ng/µl. If necessary, multiple DNA extractions of one sample were pooled and concentrated by ethanol precipitation. An Illumina paired-end library was constructed using the TruSeq Nano DNA Kit. After passing quality inspection (DNA concentration between 5 and 15 ng/µl), the library was sequenced with 2 x 300 bp paired-end reads on an Illumina MiSeq platform (Macrogen, Seoul, South Korea).
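The re-extraction rule stated above (260/280 ratio below 1.8 and/or concentration below 5 ng/µl) can be written as a simple predicate. The sketch below uses only the two thresholds from the text; the function name and the sample readings are hypothetical and serve only as illustration.

```python
def needs_reextraction(ratio_260_280, conc_ng_per_ul,
                       min_ratio=1.8, min_conc_ng_per_ul=5.0):
    """Return True when an extract fails either threshold from the text."""
    return ratio_260_280 < min_ratio or conc_ng_per_ul < min_conc_ng_per_ul


# Hypothetical readings: (260/280 ratio, concentration in ng/µl).
readings = {
    "sample_A": (1.85, 42.0),  # passes both criteria
    "sample_B": (1.72, 30.0),  # purity too low
    "sample_C": (1.90, 3.5),   # concentration too low
}

to_repeat = [name for name, (ratio, conc) in readings.items()
             if needs_reextraction(ratio, conc)]
print(to_repeat)  # ['sample_B', 'sample_C']
```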
Raw data were filtered to remove adapter sequences and low-quality reads, yielding high-quality clean data. The Illumina paired-end next generation sequencing (NGS) reads were used for genome skimming and de novo chloroplast assembly. A de novo chloroplast assembly was first made for a good-quality fresh A. marina sample from Kenya (Gazi Bay) using NOVOPlasty with a k-mer of 33 (Dierckxsens et al., 2017). All assemblies were executed by taking a single read that originates from the targeted plastid as seed (rbcL) and a 30% subsample of the FASTA file, with default parameters. Thereafter, the paired-end reads were assembled against the existing annotated A. marina chloroplast genome (GenBank accession number MT108381 from Fujian, China; Li et al., 2020), which appeared similar in genome structure and aligned perfectly with our reference sample from Kenya, though it did not align well with GenBank accession number MT012822 (A. marina from Oman; Khan et al., 2020, unpublished). All 26 individual samples (16 from Vietnam, 7 from The Philippines, and 3 from Malaysia) were assembled using the ‘assemble to reference’ function in Geneious Prime® 2024.0.5 (Biomatters Ltd.) to obtain complete chloroplast genome sequences from the Illumina 2 x 300 bp paired-end reads. The assemblies comprised 13,316 to 119,539 reads per sample, with mean read depth ranging from 27x to 239x. This approach facilitated the analysis without the need to describe de novo chloroplast genome features and gene annotations. The 26 consensus complete chloroplast sequences were aligned with MAFFT v7.388 (Katoh and Standley, 2013). We further adjusted the alignment manually at sites with mononucleotide repeats and insertions/deletions (indels).
Genome skimming allowed assembly of the nuclear ribosomal cistron (18S, ITS1, 5.8S, ITS2, and 26S), starting from a 671 bp template sequence of Avicennia marina (GenBank AF365978; Schwarzbach and McDade, 2002) containing internal transcribed spacer 1 (partial sequence), the 5.8S ribosomal RNA gene (complete sequence) and internal transcribed spacer 2 (partial sequence). The template was used as a seed in four successive rounds in Geneious Prime® 2024.0.5 to progressively extend the flanking regions. A 5,772 bp nuclear ribosomal cistron was obtained for one sample (from the C7 population) and used as a reference to map every sample, with 3,932 to 23,717 mapped reads per sample and mean read depth ranging from 123x to 713x. The consensus sequences of 5,772 bp for all 26 samples (16 from Vietnam, 7 from The Philippines, and 3 from Malaysia) were aligned with MAFFT v7.388 (Katoh and Standley, 2013) for a comparative description of mutated positions.
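The read counts and depths reported above are linked by the usual approximation: mean depth ≈ number of mapped reads × mean mapped read length / reference length. The sketch below applies it to the 5,772 bp cistron. Two points are assumptions rather than facts from the text: the mean mapped read length is not reported, so it is left as a parameter, and pairing the lowest read count with the lowest depth (and the highest with the highest) is a guess, since the two ranges may not describe the same samples.

```python
CISTRON_BP = 5772  # length of the nuclear ribosomal cistron reference (from the text)


def mean_depth(n_reads, mean_read_length_bp, reference_length_bp):
    """Approximate mean depth as total mapped bases over reference length."""
    return n_reads * mean_read_length_bp / reference_length_bp


def implied_read_length(n_reads, depth, reference_length_bp):
    """Invert the same relation to get the mean mapped read length in bp."""
    return depth * reference_length_bp / n_reads


# Assumption: the range endpoints are paired (lowest with lowest, highest with highest).
low = implied_read_length(3932, 123, CISTRON_BP)
high = implied_read_length(23717, 713, CISTRON_BP)
print(round(low, 1), round(high, 1))  # 180.6 173.5
```

Under that pairing, the reported figures would correspond to a mean mapped read length of roughly 175 to 180 bp, shorter than the nominal 300 bp, which would be consistent with quality trimming; this is an inference from the arithmetic, not a statement made in the source.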
