The hundreds of cichlid fish species in Lake Malawi constitute the most extensive recent vertebrate adaptive radiation. Here we characterize its genomic diversity by sequencing 134 individuals covering 73 species across all major lineages. The average sequence divergence between species pairs is only 0.1–0.25%. These divergence values overlap diversity within species, with 82% of heterozygosity shared between species. Phylogenetic analyses suggest that diversification initially proceeded by serial branching from a generalist Astatotilapia-like ancestor. However, no single species tree adequately represents all species relationships, with evidence for substantial gene flow at multiple times. Common signatures of selection on visual and oxygen transport genes shared by distantly related deep-water species point to both adaptive introgression and independent selection. These findings enhance our understanding of genomic processes underlying rapid species diversification, and provide a platform for future genetic analysis of the Malawi radiation.
Amino Acid Alignments For Genes in Fig6a
These are the haplotypes used to build the haplotype trees showing shared depth adaptation of the Diplotaxodon and 'deep benthic' groups. The alignments are in the 'fasta' format. Haplotype phase is based on the BEAGLE output (not shapeit). Both haplotypes are included per species. The fasta headers indicate group assignment (e.g. mbuna), and then the first three letters of the genus name and the first three letters of the species name.
AminoAcidAlignmentsForFig6.tar.gz
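As a rough illustration of how these alignments might be inspected, the sketch below reads FASTA records and lists the alignment columns at which the haplotypes differ. It is a minimal example using only the Python standard library. The toy headers and residues are invented, and headers are kept as raw strings because the exact delimiter used in the real files is not specified here.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def variable_columns(records):
    """Return 0-based alignment columns where at least two haplotypes differ."""
    seqs = [seq for _, seq in records]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences are not aligned to equal length")
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


# Hypothetical toy alignment: headers and residues are invented for illustration
# and do not come from AminoAcidAlignmentsForFig6.tar.gz.
example = StringIO(">mbuna_hapA\nMKTAY\n>deep_hapB\nMKSAY\n>deep_hapC\nMKSAF\n")
records = list(read_fasta(example))
print(variable_columns(records))
```

For a real file, the `StringIO` object would be replaced by an open text handle on one of the extracted alignments.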
All whole-genome variant calls (after BEAGLE genotype refinement)
The final product of our variant calling pipeline, obtained as described under "Variant calling, filtering, and genotype refinement" in Supplementary Methods. This file also includes the A. calliptera samples from the Indian Ocean catchment and outgroup genotypes (N. brichardi), based on a whole-genome alignment between the N. brichardi reference and the Lake Malawi M. zebra reference (also described in Supplementary Methods).
Malinsky_et_al_2018_LakeMalawiCichlids.vcf.gz
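One possible first look at a VCF of this kind is to count heterozygous genotype calls per sample. The sketch below does this with only the Python standard library, assuming standard VCF columns with a `GT` entry in the FORMAT field. The sample names and records in the toy input are invented. The real file could be streamed with `gzip.open(path, "rt")` in place of the toy lines.

```python
def het_counts(lines):
    """Count heterozygous genotype calls per sample from VCF text lines."""
    samples, counts = [], []
    for line in lines:
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            counts = [0] * len(samples)
            continue
        gt_index = fields[8].split(":").index("GT")
        for i, call in enumerate(fields[9:]):
            genotype = call.split(":")[gt_index]
            # Treat phased ("|") and unphased ("/") separators alike.
            alleles = genotype.replace("|", "/").split("/")
            if "." in alleles:
                continue  # missing call
            if len(set(alleles)) > 1:
                counts[i] += 1
    return dict(zip(samples, counts))


# Hypothetical toy VCF: sample names and genotypes are invented for illustration.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "sampleA", "sampleB"]),
    "\t".join(["chr1", "100", ".", "A", "G", ".", "PASS", ".", "GT", "0|1", "1|1"]),
    "\t".join(["chr1", "200", ".", "C", "T", ".", "PASS", ".", "GT", "0|0", "1|0"]),
    "\t".join(["chr1", "300", ".", "G", "A", ".", "PASS", ".", "GT", "0|1", "./."]),
]
result = het_counts(toy_vcf)
print(result)
```

Reading line by line keeps memory use flat, which matters for a whole-genome call set.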
Phylogenetic trees
All trees linked from Fig 2c, also all SNAPP MCMC samples, and all local Maximum Likelihood trees, both without and with the Indian Ocean (IO) catchment A. calliptera. For details see Methods.
phylogenies.tar.gz
D statistics
This table gives values of Patterson’s D(h1, h2, h3, h4) for all combinations of samples of the cichlid species given in Supplementary Table S1 of Malinsky et al. 2018, where the outgroup (h4) is fixed as N. brichardi from Lake Tanganyika. Note that the different geographic variants of A. calliptera were treated separately.
D_statistics_allSpeciesTrioCombinations.tsv.gz
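To make the quantity in this table concrete, the sketch below computes Patterson's D from per-site derived-allele frequencies using the commonly used ABBA-BABA formulation. It is an illustrative example only: the estimator and filtering used to produce the table may differ in detail, and the frequencies are invented. Under this convention, a positive D indicates excess derived-allele sharing between h2 and h3.

```python
def patterson_d(p1, p2, p3, p4):
    """Patterson's D from per-site derived-allele frequencies in h1..h4.

    ABBA_i = (1 - p1) * p2 * p3 * (1 - p4)
    BABA_i = p1 * (1 - p2) * p3 * (1 - p4)
    D = (sum ABBA - sum BABA) / (sum ABBA + sum BABA)
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all populations need the same number of sites")
    abba = baba = 0.0
    for a, b, c, d in zip(p1, p2, p3, p4):
        abba += (1 - a) * b * c * (1 - d)
        baba += a * (1 - b) * c * (1 - d)
    total = abba + baba
    if total == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / total


# Hypothetical frequencies at three sites; the outgroup h4 carries the
# ancestral allele everywhere (frequency 0 of the derived allele).
h1 = [0.0, 1.0, 0.0]
h2 = [1.0, 0.0, 1.0]
h3 = [1.0, 1.0, 1.0]
h4 = [0.0, 0.0, 0.0]
d_value = patterson_d(h1, h2, h3, h4)
print(round(d_value, 4))
```

Swapping h1 and h2 flips the sign of D, so it can be useful to check the ordering of a trio before interpreting the sign.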
f statistics
This table gives values of the f4 admixture ratio f(h1, h2, h3, h4) (see SOM18 in ref. 31, and fG in ref. 48 in Malinsky et al. 2018) for all combinations of species for which D(h1, h2, h3, h4) > 0. The outgroup (h4) is fixed as N. brichardi from Lake Tanganyika. Note that the different geographic variants of A. calliptera were treated separately.
f-stats.tsv.gz