Data from: DNA methylation site loss for plasticity-led novel trait genetic fixation

Katsumura, Takafumi; Sato, Suguru; Yamashita, Kana; Oda, Shoji; Gakuhari, Takashi; Tanaka, Shodai; Fujitani, Kazuko; Nishimaki, Toshiyuki; Imai, Tadashi; Yoshiura, Yasutoshi; Takeshima, Hirohiko; Hashiguchi, Yasuyuki; Sekita, Yoichi; Mitani, Hiroshi; Ogawa, Motoyuki; Takeuchi, Hideaki; Oota, Hiroki

Published Apr 22, 2026 on Dryad. https://doi.org/10.5061/dryad.cz8w9gjb3

Data files

Apr 22, 2026 version files: 99.97 MB

DeltaUpPlxnb3Medaka.csv (2.60 KB)
Gut_IHC_images.zip (13.10 MB)
PopulationLocations_forAnnualTemperature.csv (1.23 KB)
README.md (17.14 KB)
SNPsRADseq.vcf (86.79 MB)
WildCapturedMedaka.csv (20.84 KB)
WildDerivedLabMedaka.csv (16.54 KB)
WildDerivedLabMedaka2021.csv (12.15 KB)

Abstract

Phenotypic plasticity allows organisms to adapt traits in response to environmental changes, yet the molecular basis by which such plastic traits become genetically fixed remains unclear. Here, we investigated gut length plasticity in medaka fish (Oryzias latipes) through genome-wide methylation profiling, CRISPR/Cas9-mediated deletion, and population genomic analyses. We found that seasonal methylation of CpG sites upstream of Plxnb3 is correlated with gut length plasticity, and that deletion of this region abolishes plasticity. Additionally, standing variation in Ppp3r1 is associated with genetically fixed longer gut length in populations lacking plasticity. These results suggest that loss of epigenetic regulation via CpG site reduction triggers the genetic fixation of novel traits. Our findings provide molecular evidence linking epigenetic plasticity and genetic assimilation, advancing understanding of plasticity-led evolution in natural populations.

This dataset supports the analyses reported in Katsumura et al. (2026), PNAS 123(13), e2534817123 (https://doi.org/10.1073/pnas.2534817123). It comprises morphometric measurements (standard length and gut length) of Japanese medaka (Oryzias latipes) from three origins — wild-caught individuals, laboratory-reared fish derived from wild populations, and CRISPR/Cas9-generated plxnb3 upstream-deleted (∆upPlxnb3) mutants — along with immunohistochemistry images of medaka gut tissue, genome-wide SNP genotypes obtained by RAD-seq, and geographic coordinates of the sampling regions. These resources underpin Fig. 1, Fig. 3, Fig. 4, Fig. 5, and Fig. S6 of the manuscript.

Description of the data and file structure

Morphometric and geographic data are provided as CSV files (comma-delimited, UTF-8 with BOM, a single header row). SNP data are provided in a standard VCF file. Immunohistochemistry images are organised in per-individual folders bundled into a single ZIP archive.
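The CSV encoding noted above matters when the files are read programmatically. The following is a minimal sketch using only Python's standard library; it assumes the header strings match the variable names given in this README, and the sample row is invented for illustration. Decoding with `utf-8-sig` strips the byte-order mark so that the first column name is read as `ID` rather than `\ufeffID`.

```python
import csv
import io


def read_csv_rows(raw: bytes) -> list[dict[str, str]]:
    """Decode a UTF-8-with-BOM CSV and return one dict per data row."""
    # "utf-8-sig" removes the leading BOM if present.
    text = raw.decode("utf-8-sig")
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical content: column names assumed from this README, values made up.
sample = (
    "\ufeffID,Sex,Standard length (mm),Gut length (mm)\n"
    "X01,female,30.0,25.0\n"
).encode("utf-8")

rows = read_csv_rows(sample)
print(list(rows[0].keys())[0])  # first header, without the BOM
```

For a file on disk, `open(path, encoding="utf-8-sig", newline="")` passed to `csv.DictReader` gives the same result.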

Population genetic grouping. Medaka populations are assigned to seven genetic subgroups following the structure reported in Katsumura et al. (2019, G3 9: 217–228): six Japanese subgroups — NJPN1 (derived Northern Japanese group: Kamikita, Yokote, Niigata, Kaga, Maizuru, Miyazu); NJPN2 (ancestral Northern Japanese group: Amino, Kumihama, Toyooka, Kinosaki, Hamasaka); SJPN1 (PO.NEH: Ichinoseki, Toyota, Mishima, Iwata, Sakura, Shingu); SJPN2 (Sanyo/Shikoku/Kinki: Ayabe, Tanabe, Okayama, Takamatsu, Kudamatsu, Misho); SJPN3 (San-in: Kasumi, Tsuma, Tottori, Hagi); SJPN4 (Northern and Southern Kyushu: Kusu, Arita, Hisayama, Fukue, Kazusa, Izumi, Hiwaki, Nago) — and KOR (KOR/CHN: Yongcheon, Maegok, Bugang, Sacheon, Shanghai) for populations from the Korean peninsula and eastern China.

Missing data. No missing values are present in the CSV files.

Files and variables

File: `WildCapturedMedaka.csv`

Standard length and gut length measurements of wild-caught medaka (and of their descendants reared at an outdoor facility for the transplantation experiment), used in Fig. 1 and related analyses. Each row corresponds to a single individual (n = 309).

ID — Unique identifier for each specimen. The prefix encodes the sampling river (e.g., KMB = Kabe, EJ = Ejiri, SHD = Shihodo).
Sex — Sex of the individual. Values: female, male.
River — Name of the river or stream from which the lineage was captured in the wild (Kabe, Ejiri, Shihodo; all in Kagawa Prefecture, Japan).
Place — Location at which the individual itself was sampled and measured. Two values occur:
- Kagawa — individuals sampled directly from the wild in Kagawa Prefecture.
- Kashiwa — individuals from the transplantation / outdoor breeding experiment (Fig. S5). In August 2015, wild medaka from the Kabe river were transported to the outdoor breeding facility on the Kashiwa campus of The University of Tokyo; the offspring bred there were sampled in September 2016 and February 2017.
Sampling date — Date of sampling, formatted as YYYY/M/D (e.g., 2015/2/22). Range: 2014/1/25 – 2017/2/20.
Year — Year of sampling, as an integer (2014–2017).
Season — Season of sampling. Values: Summer, Winter.
Standard length (mm) — Standard body length, in millimetres, measured from the most anterior part of the head to the end of the vertebral column on photographs using ImageJ.
Gut length (mm) — Length of the isolated gut (from esophagus to anus), in millimetres, fixed in 4 % paraformaldehyde and measured on photographs using ImageJ.

File: `WildDerivedLabMedaka.csv`

Standard length and gut length measurements of laboratory-reared medaka derived from 40 wild populations, sampled in 2015 at the outdoor breeding facility of The University of Tokyo, and used in Fig. 1 and related analyses. Each row corresponds to a single individual (n = 341). The stocks are descendants of populations collected from each wild habitat in 1983 (Shima et al., 1985) and have been maintained by taking offspring every 1–2 years in independent tanks under common rearing conditions.

ID — Unique identifier for each fish (prefix encodes the population of origin; e.g., YC = Yongcheon).
Sex — Sex of the individual. Values: female, male.
Population — Population of origin (locality name).
Subgroup — Genetic subgroup to which the population belongs (NJPN1, NJPN2, SJPN1, SJPN2, SJPN3, SJPN4, KOR; see "Population genetic grouping" above).
Sampling date — Date of sampling at the breeding facility, formatted as YYYY.M.D (e.g., 2015.8.25). All entries fall within 2015.
Standard length (mm) — Standard body length, in millimetres.
Gut length (mm) — Length of the isolated gut, in millimetres.
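Because the sampling dates use a slash separator in `WildCapturedMedaka.csv` and a dot separator in this file, a single parser that accepts both can simplify merging. The sketch below is one possible approach (the function name `parse_sampling_date` is illustrative, not part of the dataset); it accepts months and days without zero padding.

```python
import re
from datetime import date


def parse_sampling_date(value: str) -> date:
    """Parse 'YYYY/M/D' or 'YYYY.M.D' into a datetime.date."""
    match = re.fullmatch(r"(\d{4})[/.](\d{1,2})[/.](\d{1,2})", value.strip())
    if match is None:
        raise ValueError(f"unrecognised sampling date: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


print(parse_sampling_date("2015/2/22"))  # slash style -> 2015-02-22
print(parse_sampling_date("2015.8.25"))  # dot style   -> 2015-08-25
```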

File: `WildDerivedLabMedaka2021.csv`

Standard length and gut length measurements from the common-garden experiment in 2021, used in Fig. S6. Medaka were sampled on 3 February 2021 (winter) and 30 August 2021 (summer) from available wild-derived laboratory stocks. Each row corresponds to a single individual (n = 167).

ID — Unique identifier for each fish (e.g., MY-w01, where MY = Miyazu and w = winter sampling).
Sex — Sex of the individual. Values: female, male.
Population — Population of origin (locality name).
Subgroup — Genetic subgroup (NJPN1, NJPN2, SJPN1, SJPN2).
Sampling date — Date of sampling at the breeding facility, formatted as YYYY/M/D (all within 2021).
Year — Year of sampling (all 2021).
Season — Season of sampling. Values: Summer, Winter.
Standard length (mm) — Standard body length, in millimetres.
Gut length (mm) — Length of the isolated gut, in millimetres.
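As a quick descriptive check on a file with `Season`, `Standard length (mm)`, and `Gut length (mm)` columns, the gut length can be scaled by standard length and averaged per season. This is only an illustrative summary under the assumption that the column names match this README; it is not the model used in the manuscript (which lists brms for Bayesian GLMM / GLM), and the rows below are invented.

```python
from collections import defaultdict
from statistics import mean


def mean_relative_gut_length(rows, group_key="Season"):
    """Average gut length / standard length within each group."""
    ratios = defaultdict(list)
    for row in rows:
        ratio = float(row["Gut length (mm)"]) / float(row["Standard length (mm)"])
        ratios[row[group_key]].append(ratio)
    return {group: mean(values) for group, values in ratios.items()}


# Hypothetical rows, shaped like csv.DictReader output.
rows = [
    {"Season": "Winter", "Standard length (mm)": "20.0", "Gut length (mm)": "10.0"},
    {"Season": "Summer", "Standard length (mm)": "20.0", "Gut length (mm)": "30.0"},
    {"Season": "Summer", "Standard length (mm)": "40.0", "Gut length (mm)": "20.0"},
]
summary = mean_relative_gut_length(rows)
print(summary)
```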

File: `DeltaUpPlxnb3Medaka.csv`

Standard length and gut length measurements of CRISPR/Cas9-generated medaka used in Fig. 3. The mutant allele (∆upPlxnb3) carries a 334-bp targeted deletion of the seasonally methylated region upstream of plxnb3; individuals are F3 progeny from heterozygous F2 × F2 mating, reared at 28 °C under a 14 h light / 10 h dark cycle. Each row corresponds to a single individual (n = 59).

ID — Unique identifier (dP01–dP60; one of the 60 genotyped F3 individuals was excluded because the gut could not be isolated).
Sex — Sex of the individual. Values: female, male.
Day after hatch — Age at measurement, in days post-hatching (range: 94–122; mean ≈ 103 days).
Genotype — Genotype at the plxnb3 upstream-deletion locus:
- +/+ — Homozygous wild-type (reference allele on both chromosomes) (n = 25).
- +/- — Heterozygous (one wild-type allele and one ∆upPlxnb3 allele) (n = 22).
- -/- — Homozygous ∆upPlxnb3 (deletion on both chromosomes) (n = 12).
Standard length (mm) — Standard body length, in millimetres.
Gut length (mm) — Length of the isolated gut, in millimetres.

File: `PopulationLocations_forAnnualTemperature.csv`

Geographic coordinates and genetic subgroup assignments for the 35 sampling regions. Each row corresponds to one population/locality. These coordinates were used in Fig. 5C to relate gut-length variation to the climatic environment of each region. Annual mean air temperatures used in the manuscript were obtained from the Global Solar Atlas database (https://globalsolaratlas.info; CC BY 4.0 license); they are not included in this file to maintain compatibility with the Dryad CC0 license waiver.

Population — Locality name (35 populations in total).
Subgroup — Genetic subgroup to which the population is assigned (NJPN1, NJPN2, SJPN1, SJPN2, SJPN3, SJPN4; see "Population genetic grouping" above).
Latitude — Latitude of the sampling site, in decimal degrees (WGS84). Range: 26.59°–40.77° N.
Longitude — Longitude of the sampling site, in decimal degrees (WGS84). Range: 127.98°–141.26° E.

File: `Gut_IHC_images.zip`

Immunohistochemistry (IHC) image data of medaka gut tissue used in Fig. 3D. Wild-type (+/+) and homozygous ∆upPlxnb3 (-/-) medaka were compared (n = 5 individuals per genotype). Five-micrometre serial tissue sections were prepared and incubated with a mouse monoclonal anti-phosphorylated neurofilament H antibody (BioLegend, #smi-31, 1:1000) whose cross-reactivity for medaka has been confirmed (Uemura et al., 2015, PLoS Genet. 11, e1005065). An average of six images of intestinal villi per individual were captured at 100× objective magnification on a light microscope (Olympus BX63) equipped with a digital camera (Olympus DP74).

Archive structure. When extracted, the archive contains ten per-individual folders and an accompanying README.txt:

drR_1 – drR_5 — Wild-type (+/+) individuals (n = 5). The prefix drR refers to the d-rR/Tokyo medaka line used as the genetic background for the CRISPR/Cas9 experiment (unedited siblings).
dUpPlxnb3_1 – dUpPlxnb3_5 — Homozygous ∆upPlxnb3 (-/-) mutant individuals (n = 5).

Image files. Each folder contains 2–8 JPEG (.jpg) images of intestinal villi captured on the dates 2019-07-26 and 2020-03-12. File names follow the pattern YYYYMMDD_NNNN.jpg, where YYYYMMDD is the imaging date and NNNN is the camera-assigned sequential image number. A few file names include + or - between numbers (e.g., 20200312_2858+2859.jpg, 20200312_2889-2893.jpg); these denote composite / stitched images assembled from adjacent captures on the same specimen.
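The folder and file naming conventions above can be decoded with a short helper when building an image index. The following sketch assumes the names follow exactly the patterns described here; the function names are illustrative, and the sequence number is matched as one or more digits rather than strictly four.

```python
import re

# YYYYMMDD_NNNN.jpg, optionally NNNN+MMMM or NNNN-MMMM for composite images.
IMAGE_NAME = re.compile(r"(\d{8})_(\d+)(?:([+-])(\d+))?\.jpg")


def parse_image_name(name: str) -> dict:
    """Split an image file name into imaging date and image number(s)."""
    match = IMAGE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected image name: {name!r}")
    day, first, joiner, second = match.groups()
    return {
        "date": f"{day[:4]}-{day[4:6]}-{day[6:]}",
        "first": int(first),
        "second": int(second) if second is not None else None,
        "composite": joiner is not None,
    }


def genotype_from_folder(folder: str) -> str:
    """Map a per-individual folder name to its genotype label."""
    if folder.startswith("drR_"):
        return "+/+"
    if folder.startswith("dUpPlxnb3_"):
        return "-/-"
    raise ValueError(f"unexpected folder name: {folder!r}")


print(parse_image_name("20200312_2858+2859.jpg"))
print(genotype_from_folder("dUpPlxnb3_3"))
```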

File: `SNPsRADseq.vcf`

Single-nucleotide polymorphism (SNP) genotypes obtained by RAD-seq and used in Fig. 4 (association scan) and Fig. 1C / Fig. 5 (phylogeny and molecular-evolution analyses). The file conforms to the Variant Call Format v4.2 specification (https://samtools.github.io/hts-specs/). Genotypes are imputed and phased with Beagle v4.1 (beagle.21Jan17.6cc.jar); this is reflected in the phased-genotype separator (|) in the GT field and the imputation-quality metrics in the INFO field (see below). The distributed file contains 63,265 SNPs across 24 autosomes (chromosomes 1–24).

Reference genome. Reads were aligned to the PacBio-assembled medaka reference Medaka-Hd-rR-pacbio_version2.2.4.fasta (http://utgenome.org/medaka_v2/#!Assembly.md).

Standard VCF fields.

CHROM — Reference chromosome (autosomes 1–24 of the Hd-rR PacBio assembly v2.2.4).
POS — 1-based position on the reference chromosome.
ID — Variant identifier formatted as {LocusID}_{SNP_position_within_locus} as output by the Stacks populations module (e.g., 12700_17).
REF — Reference allele.
ALT — Alternative allele(s).
QUAL — Phred-scaled variant quality score. Encoded as . (missing) because values are not recomputed after imputation.
FILTER — Filter status. All retained sites are marked PASS.
INFO — Variant-level annotations:
- AF — Estimated ALT allele frequency (Float).
- AR2 — Allelic R-squared: estimated squared correlation between the most probable REF dose and the true REF dose (Beagle imputation-accuracy metric; Float).
- DR2 — Dosage R-squared: estimated squared correlation between the estimated REF dose [P(RA) + 2·P(RR)] and the true REF dose (Beagle imputation-accuracy metric; Float).
- IMP — Flag indicating that the marker is imputed.
FORMAT — Per-sample genotype specification:
- GT — Phased genotype (e.g., 0|0, 0|1, 1|1).
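- DS — Estimated ALT allele dose (Float, 0–2). The formula is listed as [P(RA) + P(AA)], following the Beagle header description; note that an ALT dose on the 0–2 scale corresponds to P(RA) + 2·P(AA).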
- GP — Estimated genotype probabilities for the three possible genotypes (Float triple).
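To make the field layout concrete, the sketch below splits one VCF data line into its fixed columns, INFO annotations, and per-sample FORMAT values. It is a simplified illustration in plain Python, not a full VCF parser; the record is hypothetical (only the ID `12700_17` is taken from the example above), and dedicated tools such as bcftools remain the safer choice for real work.

```python
def parse_info(info: str) -> dict:
    """Turn 'AR2=0.95;AF=0.25;IMP' into a dict; flags map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True  # flags such as IMP carry no value
    return fields


def parse_record(line: str) -> dict:
    """Parse one tab-delimited VCF data line (not a header line)."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, _qual, flt, info, fmt = cols[:9]
    keys = fmt.split(":")
    samples = [dict(zip(keys, cell.split(":"))) for cell in cols[9:]]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": var_id,
        "ref": ref,
        "alt": alt.split(","),
        "filter": flt,
        "info": parse_info(info),
        "samples": samples,
    }


# Hypothetical record with two samples; values are invented.
line = "\t".join([
    "1", "12345", "12700_17", "A", "G", ".", "PASS",
    "AR2=0.95;DR2=0.96;AF=0.25;IMP", "GT:DS:GP",
    "0|1:1:0,1,0", "0|0:0:1,0,0",
])
record = parse_record(line)
print(record["info"], record["samples"][0]["GT"])
```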

Sample columns. One column per individual. Sample IDs are formatted as {RegionPrefix}_{PopulationCode}{IndividualNumber} (e.g., EKOR_YC2 = individual YC2 from the Yongcheon population within the KOR genetic group). The PopulationCode portion corresponds to the ID prefix used in WildDerivedLabMedaka.csv (e.g., YC = Yongcheon). Population-to-subgroup (NJPN1/2, SJPN1–4, KOR) assignments can be looked up via PopulationLocations_forAnnualTemperature.csv or WildDerivedLabMedaka.csv.
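The sample ID pattern described above can be split into its parts as sketched below. The regular expression assumes that population codes consist only of letters and that the individual number is the trailing run of digits; this is an assumption based on the single example `EKOR_YC2`, so IDs that deviate from it will raise an error rather than be silently misparsed.

```python
import re

SAMPLE_ID = re.compile(r"([^_]+)_([A-Za-z]+)(\d+)")


def parse_sample_id(sample_id: str) -> tuple[str, str, int]:
    """Return (region prefix, population code, individual number)."""
    match = SAMPLE_ID.fullmatch(sample_id)
    if match is None:
        raise ValueError(f"unexpected sample ID: {sample_id!r}")
    region, population, number = match.groups()
    return region, population, int(number)


print(parse_sample_id("EKOR_YC2"))  # ('EKOR', 'YC', 2)
```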

Note on regional groupings used during preprocessing. The distributed file was produced by merging seven per-region VCFs with bcftools merge v1.2: 1_KOR, 2_Kyushu, 3_Sanin, 4_Sanyo-Shikoku, 5_Chubu-Kanto-KitaNihon, 6_Hybrid, and 7_NJPN. These regional labels are coarser geographic pre-groupings used only for the imputation workflow and are distinct from the finer-grained genetic subgroups (NJPN1/2, SJPN1–4, KOR) reported in the manuscript and the CSV files.

SNP calling and processing pipeline (exact parameters as reported in the manuscript):

Single-end reads (51 bp, HiSeq 2500) quality-trimmed with Cutadapt v1.12 (-m 50 -e 0.2).
Demultiplexed with Stacks process_radtags v1.44 (-c -r -t 44 -q -s 0 --barcode_dist_1 2).
Aligned to the reference with BWA aln/samse v0.7.15-r1140 (-n 0.06 -k 3); multimapped reads filtered with SAMtools v1.9 (-q 1 -F 4 -F 256 -F 2048).
SNPs called with the Stacks pipeline: pstacks -m 2 --model_type snp --alpha 0.05; cstacks -g -n 1; sstacks -g; rxstacks --lnl_filter --lnl_lim -10 --conf_filter --conf_lim 0.75 --prune_haplo --model_type bounded --bound_low 0 --bound_high 0.1; cstacks -g -n 1; sstacks -g; populations -r 0.5 -p 7 -m 6 -f p_value -a 0.0 --p_value_cutoff 0.1 --lnl_lim -10.
Genotype imputation and haplotype phasing performed per regional subset with Beagle v4.1 (beagle.21Jan17.6cc.jar; GT format).
The seven imputed/phased regional VCFs merged with bcftools merge v1.2 (htslib v1.2.1) / VCFtools into the distributed file (63,265 SNPs).

For downstream association analyses (Fig. 4A), a minor-allele-frequency filter (MAF > 0.05) was applied within the analyses, yielding 22,235 SNPs; this filter is not pre-applied to the distributed file, so all 63,265 SNPs are available.
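A minor-allele-frequency filter of this kind can be sketched directly from the phased GT strings. The example assumes biallelic sites with no missing genotypes (consistent with an imputed file, but not verified here) and uses invented genotypes; it illustrates the MAF > 0.05 criterion and is not necessarily how the filter was implemented in the original analyses.

```python
def minor_allele_frequency(genotypes: list[str]) -> float:
    """Compute MAF from phased genotypes such as '0|1' (biallelic assumed)."""
    alleles = [allele for gt in genotypes for allele in gt.split("|")]
    alt_freq = sum(allele != "0" for allele in alleles) / len(alleles)
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf(genotypes: list[str], threshold: float = 0.05) -> bool:
    """Keep a site only if its MAF is strictly greater than the threshold."""
    return minor_allele_frequency(genotypes) > threshold


# Hypothetical sites.
common_site = ["0|1", "1|1", "0|0", "0|0"]  # 3 ALT alleles out of 8
rare_site = ["0|0"] * 19 + ["0|1"]          # 1 ALT allele out of 40
print(passes_maf(common_site), passes_maf(rare_site))
```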

Sharing / Access information

Links to other publicly accessible locations of the data associated with this study:

Manuscript: Katsumura et al. (2026) PNAS 123(13), e2534817123. https://doi.org/10.1073/pnas.2534817123
Raw sequencing data (DDBJ Sequence Read Archive):
- DRA010581 — MBD-seq reads.
- DRA010605 — RAD-seq reads (input to SNPsRADseq.vcf).
- DRA015161, DRA015162 — tBS / WGBS reads.
- DRA017652 — additional sequencing data.
Mitochondrial DNA D-loop sequences (DDBJ): accession LC719344–LC719462.
Wild-derived laboratory stocks (living biological material): available from NBRP Medaka (https://shigen.nig.ac.jp/medaka/).

Data were generated by the authors. Annual mean air temperature values used in the manuscript (Fig. 5C) were obtained from the Global Solar Atlas (https://globalsolaratlas.info; CC BY 4.0 license) and are not redistributed in this repository.

Code / Software

Analyses reported in the manuscript were conducted with the following software. Versions are as reported in the manuscript; only the tools directly relevant to reproducing analyses of the files in this repository are listed.

R v3.5.2 (initial analyses) / v4.1.1 (later analyses). Key packages: brms (Bayesian GLMM / GLM via MCMC), lmerTest (tBS analysis), NSM3 (Steel–Dwass test), SNPRelate (PCA), ggplot2 (visualisation).
ImageJ (Schneider et al. 2012, Nat. Methods 9: 671–675) — morphometric measurements on photographs and quantification of IHC staining.
Cutadapt v1.12 — read trimming.
Stacks v1.44 (process_radtags, pstacks, cstacks, sstacks, rxstacks, populations) — RAD-seq locus assembly and SNP calling.
BWA v0.7.15-r1140 — read alignment.
SAMtools v1.9 — BAM filtering.
Beagle v4.1 (beagle.21Jan17.6cc.jar) — genotype imputation and haplotype phasing.
bcftools v1.2 (htslib v1.2.1) — VCF merging.
VCFtools — VCF handling.
PLINK v1.90b3.46 — FST calculation and association testing with permutation.

Analysis scripts are not deposited in this repository.

Usage notes

Microsoft Excel, R, Python (pandas), or any text editor can be used to open the CSV files.
bcftools, VCFtools, or R (VariantAnnotation) are recommended for handling SNPsRADseq.vcf.
Image files inside Gut_IHC_images.zip can be viewed with Fiji/ImageJ or any standard image viewer.

Sampling and measurement of gut length

To detect the seasonal plasticity of gut length, we collected 228 medaka individuals from three rivers in Kagawa Prefecture from January 2014 to August 2015. The details are in supplementary table S1.

To explore genetically fixed gut length in medaka populations, we sampled 400 individuals from 40 wild-derived laboratory stocks that originated from wild populations and represent seven genetic backgrounds (SJPN1-4, NJPN1, NJPN2, and KOR) (1). Each stock descends from a population collected from its wild habitat in 1983 (2) and has been maintained by taking offspring every 1-2 years, close to the generation time in the field (3), in independent tanks under common conditions (i.e., the same food and feeding times) at our outdoor breeding facility at The University of Tokyo (1, 4) (see fig. S9 in Katsumura et al. [2014] (4)).

The sampled stocks were: SJPN1 (PO.NEH): Ichinoseki, Toyota, Mishima, Iwata, Sakura, Shingu; SJPN2 (Sanyo/Shikoku/Kinki): Ayabe, Tanabe, Okayama, Takamatsu, Kudamatsu, Misho; SJPN3 (San-in): Kasumi, Tsuma, Tottori, Hagi; SJPN4 (Northern and Southern Kyushu): Kusu, Arita, Hisayama, Fukue, Kazusa, Izumi, Hiwaki, Nago; NJPN1 (derived NJPN): Kamikita, Yokote, Niigata, Kaga, Maizuru, Miyazu; NJPN2 (ancestral NJPN): Amino, Kumihama, Toyooka, Kinosaki, Hamasaka; KOR (KOR/CHN): Yongcheon, Maegok, Bugang, Sacheon, Shanghai. The subgroup names in parentheses are defined in our prior study (1).

We isolated the gut (from the esophagus to the anus), fixed it in 4% paraformaldehyde for 1 hour on ice, and photographed it laid out on a glass slide to measure its length. Of the 400 sampled individuals, 32 died before this analysis and 27 were excluded because they showed abnormal body forms due to aging (e.g., spondylosis). We therefore obtained standard and gut lengths from photos of 341 individuals using ImageJ software (5).

Generating genome-wide SNP data

For the 341 medaka with gut-length measurements, genomic DNA was extracted from muscle using NucleoSpin Tissue (Macherey-Nagel) according to the manufacturer's protocol. After a quality check using a Nanophotometer (IMPLEN) and 0.5% agarose gel electrophoresis, six individuals were removed because of low DNA concentrations. For the remaining 335 individuals, we modified the original RAD-seq protocol (6) by adding an indexing PCR step to adjust for the sample size, and generated 14 RAD-seq libraries (24 individuals per library, 23 individuals in the last library) by the following method. The designed P1 adaptors included 24 in-line barcodes, each differing from the others by two nucleotides. Genomic DNA was digested with SbfI-HF (New England Biolabs) for 90 min at 37 °C and ligated to the P1 adaptors with T4 ligase (New England Biolabs). After pooling, the DNA was sonicated using an S220 Focused-ultrasonicator (Covaris) to a target size of 300 bp and purified using the GeneRead Size Selection Kit (Qiagen) according to the manufacturer's protocol. The end-repair, A-tailing, and P2 adaptor ligation steps were performed using the NEBNext Ultra DNA Library Prep Kit for Illumina (New England Biolabs). After size selection using AMPure XP beads (Beckman Coulter), indexing PCR was performed using Q5 (New England Biolabs) under the following conditions: an initial denaturing step at 90 °C for 30 sec; 14 cycles of denaturation at 98 °C for 10 sec, annealing at 68 °C for 30 sec, and extension at 72 °C for 20 sec; and a final extension step at 72 °C for 5 min. The PCR products were purified with AMPure XP beads and then validated using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific) for DNA concentration, a 4200 TapeStation (Agilent Technologies) for fragment length, and a MiSeq (Illumina) for library quality. Finally, RAD-seq data were generated by Macrogen Japan on three lanes of a HiSeq 2500 (Illumina) with 51 bp single-end reads.
The data have been submitted to the DRA database under accession number: DRA010605.

The single-end reads were filtered with Cutadapt v1.12 (7) using the options “-m 50 -e 0.2” and demultiplexed with process_radtags (v1.44) implemented in Stacks (8) using the options “-c -r -t 44 -q -s 0 --barcode_dist_1 2”. After quality filtering and checking, one individual (from the Kazusa population) was removed because of low-quality reads. The reads were aligned to the draft medaka genome sequenced on the PacBio platform (Medaka-Hd-rR-pacbio_version2.2.4.fasta; http://utgenome.org/medaka_v2/#!Assembly.md) using BWA backtrack 0.7.15-r1140 (9) with the options “-n 0.06 -k 3”. After mapping, multi-mapped reads were removed using SAMtools v1.9 (10) with the options “-q 1 -F 4 -F 256 -F 2048”. SNPs were called with the Stacks pipeline: pstacks -m 2 --model_type snp --alpha 0.05; cstacks -g -n 1; sstacks -g; rxstacks --lnl_filter --lnl_lim -10 --conf_filter --conf_lim 0.75 --prune_haplo --model_type bounded --bound_low 0 --bound_high 0.1; cstacks -g -n 1; sstacks -g; populations -r 0.5 -p 7 -m 6 -f p_value -a 0.0 --p_value_cutoff 0.1 --lnl_lim -10. After the medaka populations were assigned to seven genetic groups based on our prior study (1), genotypes were imputed with Beagle (v4.1) using the GT format (11). Finally, the seven genotype-imputed data sets were merged using VCFtools (12), yielding a data set of 63,265 SNPs.
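The sample counts reported in the two sections above can be checked with simple arithmetic. The sketch below only restates the numbers given in the text (400 sampled, 32 died, 27 excluded, six removed for low DNA concentration, one removed for low-quality reads, 24 barcodes per library); the final figure is what this arithmetic implies and has not been checked against the number of sample columns in the VCF.

```python
import math

sampled = 400                        # individuals taken from the 40 stocks
measured = sampled - 32 - 27         # died before analysis / abnormal form
library_input = measured - 6         # removed for low DNA concentration
after_read_qc = library_input - 1    # removed for low-quality reads

barcodes_per_library = 24
libraries = math.ceil(library_input / barcodes_per_library)
last_library = library_input - (libraries - 1) * barcodes_per_library

print(measured, library_input, after_read_qc)  # 341 335 334
print(libraries, last_library)                 # 14 23
```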

Analysis of Plxnb3 upstream-deleted medaka

To examine the phenotype of Plxnb3 upstream-deleted (∆upPlxnb3) medaka, we conducted three tests: segregation of genotypes, a comparison of gut and body length, and an immunohistological comparison. For these tests, we mated heterozygous F2 individuals and obtained fertilized eggs. We reared the larvae in a tank (W52 × D21 × H23 cm) until 2 weeks after hatching, then divided the juveniles into groups of four to six individuals and transferred them to 2 L tanks, where they were raised until an average of 103 days after hatching. Each genotype was determined using the PCR-based method described in our paper, and the segregation ratio was evaluated using the chi-squared test. To examine the relationship between gut and body length, we used 59 of the F3 individuals tested for genotype segregation; one individual was excluded because a comparable gut could not be isolated.
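A chi-squared test of genotype segregation can be sketched as follows. The 1:2:1 expectation for a heterozygote × heterozygote cross is assumed here as the null ratio (the text above does not state which ratio was tested), the counts in the example are hypothetical, and the p-value uses the closed form of the chi-squared survival function for two degrees of freedom, exp(−χ²/2).

```python
import math


def segregation_chi2(n_wt: int, n_het: int, n_hom: int, ratio=(1, 2, 1)):
    """Chi-squared goodness-of-fit test against an expected genotype ratio.

    Returns (chi2, p_value). The p-value formula is valid only for three
    genotype classes, i.e. 2 degrees of freedom.
    """
    observed = (n_wt, n_het, n_hom)
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-chi2 / 2)  # exact survival function for 2 d.f.
    return chi2, p_value


# Hypothetical counts, not taken from the dataset.
print(segregation_chi2(15, 30, 15))  # perfect 1:2:1 fit
print(segregation_chi2(20, 30, 10))
```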

For immunohistological analysis, 5 µm serial tissue sections from wild-type and mutant individuals (n = 5 each) were prepared using our previously described method (13). The sections were incubated with a mouse monoclonal anti-phosphorylated neurofilament H antibody (BioLegend, #smi-31, 1:1000), whose cross-reactivity with medaka had been confirmed in a previous study (14). The immunohistological staining signal was quantified using ImageJ software (5). We obtained an average of six images of intestinal villi per individual with a 100× objective on a light microscope (BX63, Olympus) equipped with a digital camera (DP74, Olympus).

Outdoor breeding experiment in artificial environment

To test whether the food source affects gut length, we transported wild medaka from the Kabe river to the outdoor breeding facility on the Kashiwa campus of The University of Tokyo in August 2015. Then, in September 2016 and February 2017, we sampled medaka bred in the outdoor facility and wild medaka from the Kabe river and compared their gut lengths. Gut lengths were measured by the method described above.

Common garden experiment

To examine whether NJPN1 medaka show a shorter gut in winter than in summer, we sampled and measured the medaka gut lengths on 3 February and 30 August 2021 using available wild-derived laboratory stocks. The sampled stocks were: NJPN1: Yokote (n = 10, 10), Miyazu (n = 10, 10), Kaga (n = 7, 10); NJPN2: Kumihama (n = 9, 10), Amino (n = 10, 10), Toyooka (n = 11, 8); SJPNs as control: Ichinoseki (n = 8, 10), Okayama (n = 10, 5), Mishima (n = 10, 9).
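The per-stock sample sizes listed above can be tallied as a consistency check against the row count of WildDerivedLabMedaka2021.csv (n = 167). The sketch assumes that each pair of numbers follows the order of the sampling dates, i.e. February (winter) first and August (summer) second; the text does not state this order explicitly.

```python
# (first n, second n) per stock; order assumed to be (winter, summer).
sample_sizes = {
    "Yokote": (10, 10), "Miyazu": (10, 10), "Kaga": (7, 10),        # NJPN1
    "Kumihama": (9, 10), "Amino": (10, 10), "Toyooka": (11, 8),     # NJPN2
    "Ichinoseki": (8, 10), "Okayama": (10, 5), "Mishima": (10, 9),  # SJPN controls
}

first_total = sum(first for first, _ in sample_sizes.values())
second_total = sum(second for _, second in sample_sizes.values())
grand_total = first_total + second_total

print(first_total, second_total, grand_total)  # 85 82 167
```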

Annual temperature

To test whether the dispersion of gut length increases with the dispersion of air temperature in each region, we downloaded annual temperature data from the Global Solar Atlas database (https://globalsolaratlas.info).

References

1. T. Katsumura, S. Oda, H. Mitani, H. Oota, Medaka Population Genome Structure and Demographic History Described via Genotyping-by-Sequencing. G3. 9, 217–228 (2019).
2. A. Shima, A. Shimada, M. Sakaizumi, N. Egami, First listing of wild stocks of the Medaka Oryzias latipes currently kept by Zoological Institute, Faculty of Science, University of Tokyo. J Fac Sci Univ Tokyo Sec IV. 16, 27–35 (1985).
3. R. T. Leaf, Y. Jiao, B. R. Murphy, J. I. Kramer, K. M. Sorensen, V. G. Wooten, Life-History Characteristics of Japanese Medaka Oryzias latipes. Copeia. 2011, 559–565 (2011).
4. T. Katsumura, S. Oda, S. Nakagome, T. Hanihara, H. Kataoka, H. Mitani, S. Kawamura, H. Oota, Natural allelic variations of xenobiotic-metabolizing enzymes affect sexual dimorphism in Oryzias latipes. Proc. Biol. Sci. 281 (2014), doi:10.1098/rspb.2014.2259.
5. C. A. Schneider, W. S. Rasband, K. W. Eliceiri, NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 9, 671–675 (2012).
6. N. A. Baird, P. D. Etter, T. S. Atwood, M. C. Currey, A. L. Shiver, Z. A. Lewis, E. U. Selker, W. A. Cresko, E. A. Johnson, Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One. 3, e3376 (2008).
7. M. Martin, Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 17, 10–12 (2011).
8. J. Catchen, P. A. Hohenlohe, S. Bassham, A. Amores, W. A. Cresko, Stacks: an analysis tool set for population genomics. Mol. Ecol. 22, 3124–3140 (2013).
9. H. Li, R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25, 1754–1760 (2009).
10. H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, 1000 Genome Project Data Processing Subgroup, The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25, 2078–2079 (2009).
11. S. R. Browning, B. L. Browning, Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).
12. P. Danecek, A. Auton, G. Abecasis, C. A. Albers, E. Banks, M. A. DePristo, R. E. Handsaker, G. Lunter, G. T. Marth, S. T. Sherry, G. McVean, R. Durbin, 1000 Genomes Project Analysis Group, The variant call format and VCFtools. Bioinformatics. 27, 2156–2158 (2011).
13. K. Nagata, C. Hashimoto, T. Watanabe-Asaka, K. Itoh, T. Yasuda, K. Ohta, H. Oonishi, K. Igarashi, M. Suzuki, T. Funayama, Y. Kobayashi, T. Nishimaki, T. Katsumura, H. Oota, M. Ogawa, A. Oga, K. Ikemoto, H. Itoh, N. Kutsuna, S. Oda, H. Mitani, In vivo 3D analysis of systemic effects after local heavy-ion beam irradiation in an animal model. Sci. Rep. 6, 28691 (2016).
14. N. Uemura, M. Koike, S. Ansai, M. Kinoshita, T. Ishikawa-Fujiwara, H. Matsui, K. Naruse, N. Sakamoto, Y. Uchiyama, T. Todo, S. Takeda, H. Yamakado, R. Takahashi, Viable neuronopathic Gaucher disease model in Medaka (Oryzias latipes) displays axonal accumulation of alpha-synuclein. PLoS Genet. 11, e1005065 (2015).
