Exome capture design for the strawberry poison frog, Oophaga pumilio, in Bocas del Toro
Data files
Aug 19, 2025 version files, 132.56 MB total
- annotation_complete (5.56 MB)
- IRN1000004503_Opumilio_21Feb2020_capture_targets.bed (11.22 MB)
- IRN1000004503_Opumilio_21Feb2020_coverage_summary.txt (530 B)
- IRN1000004503_Opumilio_21Feb2020_coverage.txt (12.53 MB)
- IRN1000004503_Opumilio_21Feb2020_predicted_no_coverage_regions.bed (1.70 MB)
- IRN1000004503_Opumilio_21Feb2020_primary_targets.bed (7.27 MB)
- Opumilio.target.noendline.fasta (94.27 MB)
- README.md (1.96 KB)
Abstract
The aposematic strawberry poison frog, Oophaga pumilio, is an iconic model system for studying the evolution and maintenance of color variation. Through most of its range, this frog is red with blue limbs. However, frogs from the Bocas del Toro Province, Panama, show striking variance in color and pattern, both sympatrically and allopatrically. This observation contradicts standard models of the evolution of aposematism and has led to substantial speculation about its evolutionary and molecular causes. Since the enigma of O. pumilio phenotypic variation is partly unresolved because of its large, ~6.7 Gb genome, we here sequence exomes from 347 individuals from ten populations and map a number of genetic factors responsible for the color and pattern variation. The kit gene is the primary candidate underlying the blue-red polymorphism in Dolphin Bay, where an increase in melanosomes is correlated with blue coloration. Additionally, the ttc39b gene, a known enhancer of yellow-to-red carotenoid conversion in birds, is the primary factor behind the yellow-red polymorphism in the Bastimentos West area. The causal genetic regions show evidence of selective sweeps acting locally to spread the rare phenotype. Our analyses suggest an evolutionary model in which selection is driving the formation of new morphs in a dynamic system resulting from a trade-off between predation avoidance, intraspecific competition, and mate choice.
Dataset DOI: 10.5061/dryad.np5hqc055
Description of the data and file structure
The exome capture was designed from a skin transcriptome covering different developmental stages of Oophaga pumilio. The target sequences were sent to NimbleGen for probe design.
Files and variables
File: IRN1000004503_Opumilio_21Feb2020_capture_targets.bed
Description: BED file of all regions that we designed to capture, including padding around the primary regions
File: IRN1000004503_Opumilio_21Feb2020_predicted_no_coverage_regions.bed
Description: BED file of all regions that NimbleGen predicted to have no coverage
File: IRN1000004503_Opumilio_21Feb2020_coverage_summary.txt
Description: Summary of the expected coverage using the exome capture
File: IRN1000004503_Opumilio_21Feb2020_primary_targets.bed
Description: BED file with the coordinates of the probe targets, without padding
File: IRN1000004503_Opumilio_21Feb2020_coverage.txt
Description: Estimated coverage in target regions
File: Opumilio.target.noendline.fasta
Description: FASTA file submitted as the primary target (the sequences we want to capture from the whole genome)
File: annotation_complete
Description: Annotation of the transcripts
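As a quick sanity check on the BED files listed above, the total number of targeted bases can be summed directly from the interval coordinates. The sketch below is a minimal illustration, not part of the original pipeline: it assumes standard BED conventions (whitespace-separated columns, 0-based start, end-exclusive, so each interval spans `end - start` bases), and the function name `bed_total_bp` and the example records are hypothetical.

```python
import io


def bed_total_bp(lines):
    """Sum interval lengths (end - start) over BED records.

    Assumes standard BED columns: chrom, start (0-based), end (exclusive).
    Overlapping intervals are counted twice; merge them first if that matters.
    """
    total = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        total += end - start
    return total


# Hypothetical records for illustration only, not taken from the dataset.
example = io.StringIO("contig_1\t0\t150\ncontig_1\t300\t420\ncontig_2\t10\t60\n")
print(bed_total_bp(example))  # 150 + 120 + 50 = 320
```

To apply it to one of the files, pass an open file handle, for example `bed_total_bp(open("IRN1000004503_Opumilio_21Feb2020_primary_targets.bed"))`.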
Access information
Other publicly accessible locations of the data:
- PRJNA760522: Sequence data generated using these probes and Illumina sequencing
Data was derived from the following sources:
- Transcriptome from this article: Stuckert, A.M.M., Freeborn, L., Howell, K.A., Yang, Y., Nielsen, R., Richards-Zawacki, C., and MacManes, M.D. (2023). Transcriptomic analyses during development reveal mechanisms of integument structuring and color production. Evol. Ecol. https://doi.org/10.1007/s10682-023-10256-2.
The initial transcriptome was 108,640,165 bp represented by 152,862 transcripts. The transcripts were mapped to the Oophaga pumilio and the Ranitomeya imitator genomes using STARlong 2.7.0d with the following parameters: --outFilterMismatchNmax 5000, --outFilterMismatchNoverReadLmax .1, --seedSearchStartLmax 20, --seedPerReadNmax 50000, --sjdbOverhang 100, --outFilterScoreMin 0, --outFilterScoreMinOverLread 0, --outFilterMatchNminOverLread .5, --seedPerWindowNmax 1000. We removed the transcripts that did not map to these two assemblies, which left 141,482 transcripts and 104,671,021 bp. We then used transcript_filter.pl v0.2.0 to remove isoforms in two rounds of filtering with the following parameters: (1) minimum 80% coverage and 98% identity, then (2) 90% coverage and 95% identity. After isoform removal, 97 Mb remained. We removed most mitochondrial genes using the annotation and kept cytB. We used 6-Process Annotation (https://github.com/CGRL-QB3-UCBerkeley/MarkerDevelopmentPopGen) to filter out sequences shorter than 150 bp and sequences with a GC content below 35% or above 75%. This script also masked repeats, with the parameter set to “Xenopus genus” and “vertebrates”. Finally, we checked whether a list of candidate color genes was present and added the missing genes from the Oophaga pumilio or the Ranitomeya imitator genomes. The final design comprised 90 Mb across 115,420 regions and was sent to NimbleGen for approval. NimbleGen made 116,121 probes that targeted 80 Mb, estimating a final coverage of 86 Mb across 110,329 transcripts.
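The length and GC-content filter described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the 6-Process Annotation script itself: the function names are hypothetical, the thresholds (150 bp, 35%, 75%) come from the text, and the handling of boundary values and of ambiguous bases (GC computed over A/C/G/T only) is an assumption.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_fraction(seq):
    """GC fraction over unambiguous bases (assumption: N and others are ignored)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def passes_filter(seq, min_len=150, min_gc=0.35, max_gc=0.75):
    """Keep sequences of at least min_len bp with GC content in [min_gc, max_gc]."""
    return len(seq) >= min_len and min_gc <= gc_fraction(seq) <= max_gc


# Hypothetical records for illustration only, not taken from the dataset.
fasta_lines = [
    ">keep_me", "ACGT" * 50,    # 200 bp, GC = 50%
    ">too_short", "ACGT" * 10,  # 40 bp
    ">low_gc", "AAAT" * 50,     # 200 bp, GC = 0%
]
kept = [name for name, seq in read_fasta(fasta_lines) if passes_filter(seq)]
print(kept)  # ['keep_me']
```

The same `read_fasta` generator accepts an open file handle, so it could be pointed at Opumilio.target.noendline.fasta to check how many target sequences fall inside these bounds.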
