Quantitative trait loci mapping in cichlid fishes: Aulonocara koningsi x Metriaclima mbenjii and Labidochromis caeruleus x Labeotropheus trewavasae
Data files
Aug 24, 2022 version files 2.22 MB
- LcLt_Head_lateral_landmarks.TPS
- LcLt_Head_ventral_landmarks.TPS
- LcLt_HeadPhenotype_Master.csv
- LcLt_HeadQTL_genotypes.csv
- README.txt
Dec 28, 2022 version files 6.42 MB
- LcLt_BodyPhenotype_Master.csv
- LcLt_BodyQTL_genotypes.csv
- LcLt_Head_lateral_landmarks.TPS
- LcLt_Head_ventral_landmarks.TPS
- LcLt_HeadPhenotype_Master.csv
- LcLt_HeadQTL_genotypes.csv
- LcLt_MmAk_body_landmarks.TPS
- MmAk_BodyPhenotype_Master.csv
- MmAk_BodyQTL_genotypes.csv
- README.txt
Jan 09, 2023 version files 6.42 MB
- LcLt_BodyPhenotype_Master.csv
- LcLt_BodyQTL_genotypes.csv
- LcLt_Head_lateral_landmarks.TPS
- LcLt_Head_ventral_landmarks.TPS
- LcLt_HeadPhenotype_Master.csv
- LcLt_HeadQTL_genotypes.csv
- LcLt_MmAk_body_landmarks.TPS
- MmAk_BodyPhenotype_Master.csv
- MmAk_BodyQTL_genotypes.csv
- README.txt
Jul 31, 2023 version files 8.50 MB
- LcLt_BodyPhenotype_Master.csv
- LcLt_BodyQTL_genotypes.csv
- LcLt_Head_lateral_landmarks.TPS
- LcLt_Head_ventral_landmarks.TPS
- LcLt_HeadPhenotype_Master.csv
- LcLt_HeadQTL_genotypes.csv
- LcLt_MmAk_body_landmarks.TPS
- MmAk_BodyPhenotype_Master.csv
- MmAk_BodyQTL_genotypes.csv
- MmAk_Pigment_genotypes.csv
- MmAk_PigmentPhenotype_Master.csv
- README.txt
Apr 08, 2024 version files 8.51 MB
- LcLt_BodyPhenotype_Master.csv
- LcLt_BodyQTL_genotypes.csv
- LcLt_Head_lateral_landmarks.TPS
- LcLt_Head_ventral_landmarks.TPS
- LcLt_HeadPhenotype_Master.csv
- LcLt_HeadQTL_genotypes.csv
- LcLt_MmAk_body_landmarks.TPS
- MmAk_BodyPhenotype_Master.csv
- MmAk_BodyQTL_genotypes.csv
- MmAk_Pigment_genotypes.csv
- MmAk_PigmentPhenotype_Master.csv
- README.md
Abstract
Since the time of Darwin, biologists have sought to understand the evolution and origins of phenotypic variation. To understand the genetic and molecular sources of morphological differences, we capitalize on the cichlid fish system. Cichlids of the East African Rift Lakes have undergone an extensive adaptive radiation, including variation in body shape, head shape, and pigmentation. These morphological differences are often intimately linked to the ecology and behavior of these animals. Here, we investigate the genetic basis of these phenotypes through quantitative trait loci (QTL) mapping in four genera of Lake Malawi cichlids and two F2 hybrid populations. The first hybrid cross is between Aulonocara koningsi, which lives in the open sandy region and feeds on insects from the open sand, and Metriaclima mbenjii, an omnivorous rock-dwelling fish. The second cross is between Labidochromis caeruleus, a suction-feeding insectivore that swims continuously searching for prey, and Labeotropheus trewavasae, which feeds by biting or scraping attached algae from the rocks in its benthic habitat. Such work can provide insights into the molecular basis of phenotypic adaptation, the genetic architecture of morphology, and the evolution of cichlid fishes.
README: Quantitative trait loci mapping in cichlid fishes: Aulonocara koningsi x Metriaclima mbenjii and Labidochromis caeruleus x Labeotropheus trewavasae
https://doi.org/10.5061/dryad.4mw6m90cz
This data set describes data from two F2 hybrid populations of cichlid fishes, further described below.
Description of the data and file structure
All of the files included in this repository were collected to compare morphologies and perform quantitative trait loci mapping for four genera of Lake Malawi cichlid fishes and two F2 hybrid populations. The first hybrid cross is between Aulonocara koningsi, which lives in the open sandy region and feeds on insects from the open sand, and Metriaclima mbenjii, an omnivorous rock-dwelling fish. The second cross is between Labidochromis caeruleus, a suction-feeding insectivore that swims continuously searching for prey, and Labeotropheus trewavasae, which feeds by biting or scraping attached algae from the rocks in its benthic habitat. File names with "MmAk" indicate the Metriaclima x Aulonocara cross, while "LcLt" refers to Labidochromis x Labeotropheus.
.TPS files contain landmark data used for geometric morphometric shape analysis, prior to size correction. These landmark data were collected from 2D images using TPSdig and converted to XY coordinates using TPSutil; geometric morphometric analysis, including size correction, was conducted using the geomorph package in R.
The LcLt_Head_lateral_landmarks.TPS file has landmarks at the (1) anterior insertion of the dorsal fin, (2) center of the eye, (3) tip of the snout, (4) insertion of the pelvic fin, (5) dorsal reach of the operculum, (6) ventral end of the operculum, and (7) ventral end of the pre-operculum, as well as semilandmarks from the dorsal fin to the snout.
The LcLt_Head_ventral_landmarks.TPS file has landmarks at the (1) left joint of the mandible and quadrate, (2) anterior-most tip of the mandible, (3) right joint of the mandible and quadrate, (4) right lateral edge of the opercle, (5) right posterior edge of the branchial rays, and (6) midline insertion of the pelvic muscles, as well as semilandmarks on the mandible.
The LcLt_MmAk_body_landmarks.TPS file has landmarks at the (1) tip of the snout, (2) anterior edge of the eye, (3) posterior edge of the eye, (4) anterior insertion of the dorsal fin, (5) dorsal insertion of the pectoral fin, (6) ventral insertion of the pectoral fin, (7) insertion of the pelvic fin, (8) anterior insertion of the anal fin, (9) posterior insertion of the anal fin, (10) ventral edge of the caudal peduncle, (11) dorsal edge of the caudal peduncle, and (12) posterior insertion of the dorsal fin.
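For users who prefer to read the landmark files directly rather than through TPSutil, the following is a minimal Python sketch of a TPS reader. It assumes the common TPS layout in which each specimen starts with an `LM=n` line followed by `n` rows of x y coordinates and then optional `KEY=value` lines such as `ID=` or `IMAGE=`. The function name `parse_tps` and the example text are hypothetical, and the sketch does not read `CURVES=`/`POINTS=` blocks, so check how semilandmarks are stored in these files before relying on it.

```python
def parse_tps(text):
    """Parse TPS-formatted text into a list of specimens.

    Each specimen is a dict holding 'landmarks' (a list of (x, y) floats)
    plus any KEY=value lines (e.g. ID, IMAGE, SCALE) stored as strings.
    Only the LM= block is read as coordinates; coordinates that belong to
    CURVES/POINTS blocks are skipped in this sketch.
    """
    specimens = []
    current = None
    remaining = 0  # landmark rows still expected for the current specimen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.upper().startswith("LM="):
            # A new LM= line starts a new specimen.
            current = {"landmarks": []}
            specimens.append(current)
            remaining = int(line.split("=", 1)[1])
        elif remaining > 0:
            x, y = line.split()[:2]
            current["landmarks"].append((float(x), float(y)))
            remaining -= 1
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip().upper()] = value.strip()
    return specimens


# Made-up example with two specimens and three landmarks each.
example_tps = """LM=3
1.0 2.0
3.5 4.5
5.0 6.0
ID=fish_001
LM=3
1.1 2.1
3.6 4.6
5.1 6.1
ID=fish_002
"""

specimens = parse_tps(example_tps)
for spec in specimens:
    print(spec["ID"], len(spec["landmarks"]))
```

The coordinates returned here are raw digitized values; Procrustes alignment and size correction, which the authors performed in geomorph, are not part of this sketch.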
The Phenotype_Master.csv files include various phenotypic measures from each animal, described below. All linear measures were collected from images in ImageJ as a number of pixels, then converted to cm in Excel using the measured length of a scale included in each picture.
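The pixel-to-cm conversion can be reproduced from the pixel and scale columns. The sketch below is illustrative only: the function name `pixels_to_cm` and the example numbers are made up, and it assumes a simple proportional conversion (measure in pixels times the known scale length in cm, divided by the scale length in pixels).

```python
def pixels_to_cm(measure_px, scale_px, scale_cm):
    """Convert a pixel measurement to cm using a scale bar in the same image.

    measure_px: the measurement in pixels
    scale_px:   the length of the scale bar in pixels
    scale_cm:   the known length of the scale bar in cm
    """
    if scale_px <= 0:
        raise ValueError("scale_px must be positive")
    return measure_px * scale_cm / scale_px


# Made-up values: a 1200 px standard length and a 2.855 cm scale spanning 400 px.
standard_length_cm = pixels_to_cm(1200.0, 400.0, 2.855)
print(round(standard_length_cm, 3))
```

Because each image has its own scale measurement, the conversion should be applied row by row with that row's scale value rather than with a single global factor.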
LcLt_HeadPhenotype_Master.csv includes
(1) the ID number for each fish;
(2) whether the fish is a Labidochromis parental, Labeotropheus parental, or F2 hybrid;
(3) the sex of the animal (0=female, 1=male) determined based on gonad dissection and omitted if there was any ambiguity;
(4) the number of pixels in a 2.855 cm scale in each lateral picture;
(5) the standard length (SL) of the animal (snout to caudal peduncle) in pixels and, separately, in cm following conversion with the scale;
(6) the number of pixels in a 0.5 cm scale in the ventral picture;
and (7) a series of linear measures, described in the next sentence, in separate columns as measured by the number of pixels (px), in cm following conversion with the scale, or after processing to a residual after regression to standard length using R, using a data set with both parental species and F2 hybrids (no units).
LcLt_HeadPhenotype linear measures are:
(1) head length, from tip of the snout to the opercle;
(2) head proportion, head length/standard length;
(3) the anterior insertion of the dorsal fin to the pelvic fin insertion;
(4) the pelvic fin insertion to the snout;
(5) the preorbital length, from the tip of the snout to the anterior edge of the eye;
(6) the eye diameter, as well as the radius and eye area that was derived from that measure;
(7) the mouth angle, measured in degrees of the angle formed by the line measuring head length and the preorbital length as described above;
(8) the mandible width, measured from the right and left joints with the quadrate;
(9) the mandible length from the midline to the intersection with the line measuring mandible width;
(10) the length from the edge of the opercle to the joint of the mandible and quadrate;
(11) the edge of the opercle to the midline of the fish;
(12) the angle, measured in degrees, formed by lines from the right joint of the mandible and quadrate to the midline and from the left joint of the mandible and quadrate to the midline of the mandible.
Also included in this file are principal component (PC) scores from geometric morphometric analysis of the TPS files described above, following size correction: the first 5 PC scores from the lateral view and the first 3 PC scores from the ventral view.
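The residual columns can be approximated from the cm columns. The source states only that measures were processed to a residual after regression to standard length in R, so the sketch below assumes an ordinary least-squares regression of the untransformed trait on untransformed standard length; the authors' exact model may differ. The function name `residuals_on_length` and the example values are hypothetical.

```python
def residuals_on_length(trait, standard_length):
    """Return residuals of a simple linear regression of trait on standard length.

    Both arguments are equal-length sequences of numbers, one entry per fish.
    Parental and F2 individuals should be pooled before calling, because the
    source describes a regression fitted on the combined data set.
    """
    n = len(trait)
    if n != len(standard_length) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(standard_length) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in standard_length)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(standard_length, trait))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed value minus the value predicted from standard length.
    return [y - (intercept + slope * x) for x, y in zip(standard_length, trait)]


# Made-up example: head length (cm) and standard length (cm) for five fish.
head_length = [2.1, 2.4, 2.2, 2.9, 3.1]
sl = [6.0, 7.0, 6.5, 8.0, 9.0]
head_resid = residuals_on_length(head_length, sl)
print([round(r, 3) for r in head_resid])
```

Residuals from a least-squares fit with an intercept sum to zero, which gives a quick sanity check when comparing against the residual columns in the files.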
The LcLt_BodyPhenotype_Master.csv and MmAk_BodyPhenotype_Master.csv files contain the same columns but for each of the two separate crosses in separate files. Columns are
(1) the ID number for each fish;
(2) whether the fish is a Labidochromis parental, Labeotropheus parental, Metriaclima parental, Aulonocara parental, F2 hybrid from the Labidochromis x Labeotropheus cross, or F2 hybrid from the Metriaclima x Aulonocara cross;
(3) the sex of the animal (0=female, 1=male) determined based on gonad dissection and omitted if there was any ambiguity;
and (4) a series of linear measures, described in the next sentence, measured in cm following conversion with the scale, or after processing to a residual after regression to standard length using R, using a data set with both parental species and F2 hybrids (no units).
Linear measures are:
(1) standard length (snout to posterior end of caudal peduncle),
(2) head length (snout to opercle),
(3) body depth (anterior insertion of dorsal fin to insertion of pelvic fin),
(4) caudal peduncle depth,
(5) distance between caudal peduncle and anal fin insertion,
(6) length of the anal fin base,
(7) distance between anal fin and pelvic fin,
and (8) pectoral fin depth.
Head proportion was calculated by dividing head length by standard length. Also included in these files are the first 3 principal component (PC) scores from geometric morphometric analysis of the TPS files described above, which included size correction.
The MmAk_PigmentPhenotype_Master.csv file includes the columns
(1) the ID number for each fish;
(2) whether the fish is a Metriaclima parental, Aulonocara parental, or F2 hybrid from the Metriaclima x Aulonocara cross;
(3) the sex of the animal (0=female, 1=male) determined based on gonad dissection and omitted if there was any ambiguity;
(4) the mass of the animal at time of sacrifice in grams;
(5) the standard length (SL) of the animal (snout to caudal peduncle) in cm; and
(6) a series of pigment measures extracted from the pictures as described below. All measures are reported in the unit described below, as well as a separate column where that value was corrected for size of the animal (standard length) by processing to a residual after regression to standard length using R, using a data set with both parental species and F2 hybrids (no units).
Measures are all from an isolated region of the flank that was 10 pixels high, with the ventral side aligned with the guide at the top of the caudal peduncle, the opercle at the anterior end, and the dorsal fin at the posterior end. This isolated 10-pixel-high bar was uploaded to FIJI software and converted to 32-bit grayscale; the Plot Profile command in FIJI was then used to convert the image to numerical gray values from 0 (pure black) to 255 (pure white), averaging the values of the 10 pixels in each column. From this plot profile output, we report
(1) darkest intensity (0-255 gray value units),
(2) lightest intensity (0-255 gray value units),
and the (3) range of intensity, from the lightest intensity minus the darkest intensity (0-255 gray value units).
We calculated (4) the covariance of the intensity measure with anterior-posterior pixel position using the covar function in R (no units). We used a custom Perl script to further analyze the plot profile output and classify pixels as being present in a bar (darker than average pixel color) or interbar (lighter than average pixel color). The Perl script is available at https://github.com/kpowder/Biology2022.
From these classifications, we calculated (5) the average gray value of bar classified pixels;
(6) the average gray value of interbar classified pixels;
(7) the differential intensity of bar versus interbars, calculated as the average gray value of interbars minus the average gray value of bars;
(8) the total count of bars;
(9) the average width of bars (in pixels);
(10) the average width of interbars (in pixels);
and (11) the percent of barring, measured as the total length of regions in pixels classified as bars divided by the total length of the isolated region in pixels.
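The eleven pigment measures above can be illustrated with a short Python sketch operating on a plot-profile vector (one mean gray value per pixel column). This is not the authors' Perl script, which is available at the GitHub link above; it is an independent sketch under stated assumptions: pixels exactly equal to the mean are treated as interbar, covariance uses the sample (n - 1) denominator, and barring is returned as a fraction (multiply by 100 for a percentage). The function name `summarize_profile` and the example profile are made up.

```python
def _mean(values):
    """Arithmetic mean, or None for an empty list."""
    return sum(values) / len(values) if values else None


def summarize_profile(gray):
    """Summarize a plot profile (gray values, anterior to posterior)."""
    n = len(gray)
    if n < 2:
        raise ValueError("profile needs at least 2 pixel columns")
    mean_gray = sum(gray) / n
    mean_pos = (n - 1) / 2  # mean of positions 0..n-1
    covariance = sum((p - mean_pos) * (g - mean_gray)
                     for p, g in enumerate(gray)) / (n - 1)

    # Bar = darker than the average gray value; everything else = interbar.
    is_bar = [g < mean_gray for g in gray]

    # Collapse the per-pixel classification into runs of [is_bar, width].
    runs = []
    for flag in is_bar:
        if runs and runs[-1][0] == flag:
            runs[-1][1] += 1
        else:
            runs.append([flag, 1])
    bar_widths = [width for flag, width in runs if flag]
    interbar_widths = [width for flag, width in runs if not flag]

    bar_mean = _mean([g for g, flag in zip(gray, is_bar) if flag])
    interbar_mean = _mean([g for g, flag in zip(gray, is_bar) if not flag])
    differential = (interbar_mean - bar_mean
                    if bar_mean is not None and interbar_mean is not None
                    else None)
    return {
        "darkest": min(gray),
        "lightest": max(gray),
        "range": max(gray) - min(gray),
        "covariance": covariance,
        "bar_mean": bar_mean,
        "interbar_mean": interbar_mean,
        "differential": differential,
        "bar_count": len(bar_widths),
        "bar_width_mean": _mean(bar_widths),
        "interbar_width_mean": _mean(interbar_widths),
        "bar_fraction": sum(bar_widths) / n,
    }


# Made-up 10-column profile with two dark bars.
profile = [200, 200, 50, 50, 50, 200, 200, 60, 60, 200]
summary = summarize_profile(profile)
print(summary["bar_count"], summary["bar_width_mean"], summary["bar_fraction"])
```

Thresholding on the profile mean follows the "darker than average" and "lighter than average" wording above; any smoothing or minimum-width rules the Perl script may apply are not reproduced here.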
QTL_genotypes.csv files (LcLt_HeadQTL_genotypes.csv, LcLt_BodyQTL_genotypes.csv, MmAk_BodyQTL_genotypes.csv, and MmAk_Pigment_genotypes.csv) are all formatted in the same way. Each file contains measures duplicated from the respective Phenotype_Master.csv file together with RADseq genotypes. These data are for hybrids only, in a format that is ready for quantitative trait loci mapping using the MQM program in R/qtl. Genotype data came from DNA extracted from caudal fin tissue, which was sequenced as RADseq libraries and processed using the Stacks software. In the first cross, the A allele is designated as coming from the Metriaclima granddam and the B allele from the Aulonocara grandsire. In the second cross, the A allele is designated as coming from the Labidochromis granddam and the B allele from the Labeotropheus grandsire. QTL mapping of residual phenotypes using these genotype data was conducted using MQM in R/qtl. Per the required formatting of R/qtl, the first row is a header; the second and third rows contain the linkage group and centimorgan (cM) position, respectively, for each genetic marker; and the remaining rows include phenotype data and the genotype call for that individual at each genetic marker (as AA, AB, BB, or omitted: -). Each marker is named by its physical position in the Metriaclima zebra UMD2a reference genome, as contig_nucleotide position.
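The genotype files are intended for R/qtl, but the row layout described above can also be inspected outside R. The following Python sketch separates phenotype columns from marker columns by checking which columns have a linkage group in the second row. The function name `read_rqtl_csv`, the column names, and the marker names in the example are hypothetical, and the sketch assumes a rectangular comma-separated file with blank cells under the phenotype columns in rows two and three.

```python
import csv
from io import StringIO


def read_rqtl_csv(handle):
    """Read an R/qtl-style CSV into (markers, individuals).

    Row 1: column names (phenotypes first, then markers).
    Row 2: linkage group for marker columns, blank for phenotype columns.
    Row 3: cM position for marker columns, blank for phenotype columns.
    Remaining rows: one individual per row.
    """
    rows = list(csv.reader(handle))
    header, group_row, pos_row, data = rows[0], rows[1], rows[2], rows[3:]
    marker_cols = [i for i, cell in enumerate(group_row) if cell.strip()]
    marker_set = set(marker_cols)
    pheno_cols = [i for i in range(len(header)) if i not in marker_set]

    markers = [
        {"name": header[i], "linkage_group": group_row[i], "cM": float(pos_row[i])}
        for i in marker_cols
    ]
    individuals = []
    for row in data:
        phenotypes = {header[i]: row[i] for i in pheno_cols}
        # Missing genotype calls ("-" or blank) are stored as None.
        genotypes = {
            header[i]: (None if row[i].strip() in ("-", "") else row[i].strip())
            for i in marker_cols
        }
        individuals.append({"phenotypes": phenotypes, "genotypes": genotypes})
    return markers, individuals


# Made-up example: two phenotype columns and three markers.
example_csv = """ID,HeadLength_resid,contig1_1000,contig1_52000,contig7_300
,,1,1,2
,,0.0,12.5,3.2
F2_001,0.12,AA,AB,-
F2_002,-0.05,BB,AB,AA
"""

markers, individuals = read_rqtl_csv(StringIO(example_csv))
print(len(markers), len(individuals))
```

To read one of the repository files, pass an open file handle (for example `open("MmAk_BodyQTL_genotypes.csv", newline="")`) in place of the `StringIO` object.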
Sharing/Access information
Additional descriptions of the data are provided in the associated manuscripts.
Code/Software
.CSV files can be opened in Microsoft Excel, OpenOffice, or a data analysis environment such as R (https://www.r-project.org/). .TPS files can be opened and analyzed using software available at https://sbmorphometrics.org/, specifically TPSutil, TpsDig, and TPSrelw for file management (e.g. sample removal), landmark digitization, and geometric morphometric analysis, respectively. .TPS files can be converted in TPSutil to be analyzed in R. Genotype files are formatted for upload and analysis in R/qtl. File names with "MmAk" indicate the Metriaclima x Aulonocara cross, while "LcLt" refers to Labidochromis x Labeotropheus.
See associated scripts for processing of RADseq sequencing data and pigment data at https://github.com/kpowder/Biology2022
Methods
For the first hybrid cross, a single Metriaclima female was crossed to two Aulonocara males; the inclusion of the second grandsire was inadvertent and resulted from an unexpected fertilization event in these species with external fertilization. This single F1 family was subsequently incrossed to produce a hybrid F2 population of 491 fishes. For the second cross, a single Labidochromis caeruleus female was crossed with a single Labeotropheus trewavasae male to create one F1 family, which was subsequently incrossed to produce a hybrid F2 population of 447 fishes. Hybrid fish and 10 each of the pure parental species were euthanized and 2D imaged at five months, with each image including a color standard and scale. Sex of each animal was determined based on gonad dissection and omitted if there was any ambiguity. Linear measures were collected from these images in ImageJ as pixels, converted to cm in Excel using the measured length of a scale in each picture, and processed to residuals after regression to standard length (snout to caudal peduncle) in R. Landmark data were collected from 2D images using TPSdig and converted to XY coordinates using TPSutil; geometric morphometric analysis, including size correction, was conducted using the geomorph package in R. Pigment data were collected as described in the associated manuscript.
Genotype data came from DNA extracted from caudal fin tissue, which was sequenced as RADseq libraries and processed using the Stacks software. In the first cross, the A allele is designated as coming from the Metriaclima granddam and the B allele from the Aulonocara grandsire. In the second cross, the A allele is designated as coming from the Labidochromis granddam and the B allele from the Labeotropheus grandsire. QTL mapping of residual phenotypes using these genotype data was conducted using MQM in R/qtl.