Data from: Uncovering new lineages in the Sunda Pangolin (Manis javanica) with museum mitogenomics
Data files
Sep 29, 2025 version files 3.20 MB
- BEAST_input_alignments_BEAST.zip (64.78 KB)
- BEAST_input_relaxed.xml (1.23 MB)
- BEAST_MCC_Manis_final (103.04 KB)
- cytb_genetic_distance_plot.zip (6.99 KB)
- cytb_input_tree_final.fasta (339.82 KB)
- cytb.contree (11.40 KB)
- IQTREE_input_alignment.fasta (1.44 MB)
- IQTREE_mitogenome_sp_partition_merge.contree (3.68 KB)
- IQTREE_mtgenome_input_partitions.nex (1.72 KB)
- README.md (2.64 KB)
Abstract
Accurately identifying evolutionarily significant units (ESUs) is crucial for conservation planning, especially for species like pangolins threatened by overhunting and habitat loss. ESUs help categorize different pangolin populations, aiding in understanding their genetic diversity and distribution, which is vital for targeted conservation efforts. This research generated mitochondrial genomes from historical museum specimens of Sunda pangolins (Manis javanica) from underrepresented locations, uncovering a new evolutionary lineage from the Mentawai Islands. The novel sequences provide resources for forensic labs tracing the origin of confiscated scales and limit the potential distribution of the "mysterious pangolin." The Mentawai Archipelago represents a divergent ESU with a small distribution, important for conservation planning. Additionally, this research confirmed the presence of two major M. javanica lineages in Java and extended the known distribution to Bali and East Kalimantan. These findings support the "Out of Borneo" hypothesis and suggest a recent colonization of pangolins across Indochina and west Sundaland. This study also highlights the need for further investigation into the taxonomic status of these lineages and their management as subspecies.
Dataset DOI: 10.5061/dryad.7d7wm3876
Description of the data and file structure
This research generated mitochondrial genomes from historical museum specimens of Sunda pangolins (Manis javanica) from underrepresented locations, uncovering a new evolutionary lineage from the Mentawai Islands that diverged from mainland and west Sundaland populations around 760 thousand years ago. The novel sequences provide resources for forensic labs tracing the origin of confiscated scales and shed light on the potential distribution of the "mysterious pangolin".
Files and variables
File: IQTREE_input_alignment.fasta
Description: Mitochondrial genome alignment used in IQTREE phylogenetic analysis.
File: IQTREE_mtgenome_input_partitions.nex
Description: Partition file for the mitochondrial genome alignment used in IQTREE phylogenetic analysis.
File: IQTREE_mitogenome_sp_partition_merge.contree
Description: Maximum likelihood phylogenetic tree inferred from mitochondrial genomes (15,254 bp) with IQTREE (Figure 1).
File: BEAST_input_relaxed.xml
Description: BEAST analysis input file.
File: BEAST_MCC_Manis_final
Description: Maximum Clade Credibility (MCC) timetree inferred from mitochondrial genomes (11,510 bp) using BEAST (Figure S3).
File: BEAST_input_alignments_BEAST.zip
Description: Mitochondrial genome partition alignments included in, and excluded from, the BEAST analysis. This ZIP file contains two directories: one with the partition alignments included in the BEAST analysis and one with those excluded due to saturation.
File: cytb_genetic_distance_plot.zip
Description: Input and R code to generate Cytochrome b genetic distance plot (Figure S2).
File: cytb_input_tree_final.fasta
Description: Cytochrome b alignment used in IQTREE phylogenetic analysis.
File: cytb.contree
Description: Maximum likelihood phylogenetic tree inferred from Cytochrome b (1,040 bp) with IQTREE (Figure S1).
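Before reusing the FASTA alignments listed above, it can help to confirm that every sequence in a file has the same length, as an alignment should. The following is a minimal sketch in Python and is not part of the dataset; the function names `read_fasta` and `alignment_length` and the toy records are illustrative assumptions. To check a real file, pass an opened file handle (for example `open("cytb_input_tree_final.fasta")`) instead of the in-memory demo.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {name: sequence}.

    The record name is the first whitespace-delimited token after '>'.
    Sequences wrapped over several lines are joined.
    """
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def alignment_length(records):
    """Return the shared sequence length, or raise if lengths differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


# Toy input (hypothetical records, not taken from the dataset).
demo = StringIO(">sample_A\nACGT-ACGT\n>sample_B\nACGTTAC\nGT\n")
records = read_fasta(demo)
print(len(records), "sequences, alignment length", alignment_length(records))
```

A length reported this way can be compared against the alignment lengths stated in the tree descriptions, keeping in mind that an input file may contain more columns than were used for a given tree.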
Code/software
These analyses were run in IQTREE, BEAST, and R. All of the code used to compute the genetic distance plot is provided in "cytb_genetic_distances_plot.R", which is found in "cytb_genetic_distance_plot.zip".
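For readers unfamiliar with pairwise genetic distances, the idea can be illustrated with an uncorrected p-distance: the proportion of compared sites at which two aligned sequences differ. The sketch below is written in Python and is not the authors' R script; the distance measure actually used in the dataset is not stated here, so the choice of p-distance, the function name `p_distance`, the set of ignored characters, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, ignore="-N?"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a character in `ignore`
    (gap, N, or ? by default) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ignore or b in ignore:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned sequences (hypothetical, not taken from the dataset).
toy = {
    "hap_1": "ACGTACGTAC",
    "hap_2": "ACGTACGTTC",
    "hap_3": "ACG-ACGAAC",
}

for (name_a, seq_a), (name_b, seq_b) in combinations(toy.items(), 2):
    print(name_a, name_b, round(p_distance(seq_a, seq_b), 4))
```

Skipping gapped and ambiguous sites pairwise is one common convention; other tools delete such columns across the whole alignment or apply a substitution model, and the results will differ accordingly.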
Access information
Other publicly accessible locations of the data:
- Newly generated mitogenome sequences, along with previously published ones, including their GenBank accession numbers and associated metadata, are listed in Table S1.
Data was derived from the following sources:
- GenBank
