Dryad

Evolutionary insights into Felidae iris color through ancestral state reconstruction

Cite this dataset

Tabin, Julius (2023). Evolutionary insights into Felidae iris color through ancestral state reconstruction [Dataset]. Dryad. https://doi.org/10.5061/dryad.s4mw6m9b0

Abstract

There have been almost no studies with an evolutionary perspective on eye (iris) color, outside of humans and domesticated animals. Extant members of the family Felidae have a great interspecific and intraspecific diversity of eye colors, in stark contrast to their closest relatives, all of which have only brown eyes. This makes the felids a great model to investigate the evolution of eye color in natural populations. Through machine learning cluster image analysis of publicly available photographs of all felid species, as well as a number of subspecies, five felid eye colors were identified: brown, hazel/green, yellow/beige, gray, and blue. Using phylogenetic comparative methods, the presence or absence of these colors was reconstructed on a phylogeny. Additionally, through a new color analysis method, the specific shades of the ancestors’ eyes were quantitatively reconstructed. The ancestral felid population was predicted to have brown-eyed individuals, as well as a novel evolution of gray-eyed individuals, the latter being a key innovation that allowed the rapid diversification of eye color seen in modern felids, including numerous gains and losses of different eye colors. It was also found that the loss of brown eyes and the gain of yellow/beige eyes is associated with an increase in the likelihood of evolving round pupils, which in turn influence the shades present in the eyes. Along with these important insights, the methods presented in this work are widely applicable and will facilitate future research into phylogenetic reconstruction of color beyond irises.

README: Reference Information

Provenance for this README

  • File name: README_FelidDataset.md
  • Authors: Julius A. Tabin
  • Other contributors: Katherine A. Chiasson
  • Date created: 2023-02-15
  • Date most recently modified: 2023-10-01

Dataset Version and Release History

  • Current Version:
    • Number: 2.0.0
    • Date: 2023-10-01
    • Persistent identifier: DOI: 10.5061/dryad.s4mw6m9b0
    • Summary of changes: Updated to reflect an improved analysis pipeline
  • Embargo Provenance: n/a
    • Scope of embargo: n/a
    • Embargo period: n/a

Dataset Attribution and Usage

  • Dataset Title: Data for the article "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction"
  • Persistent Identifier: <https://doi.org/10.5061/dryad.s4mw6m9b0>
  • Dataset Contributors:
    • Creators: Julius A. Tabin and Katherine A. Chiasson
  • Date of Issue: 2023-02-15
  • License: Use of these data from Zenodo or GitHub is covered by the following license:
    • Title: GNU General Public License v3.0
    • Specification: <https://www.gnu.org/licenses/gpl-3.0.en.html>
  • License: Use of these data from Dryad is covered by the following license:
    • Title: CC0 1.0 Universal (CC0 1.0)
    • Specification: <https://creativecommons.org/publicdomain/zero/1.0/>
  • Data Reuse
    • The authors respectfully request to be contacted by researchers interested in the reuse of these data so that the possibility of collaboration can be discussed.
  • Suggested Citations:

    • Dataset citation:

      > Tabin J.A. and K.A. Chiasson. 2023. Data for the article "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction", Dryad, Dataset, <https://doi.org/10.5061/dryad.s4mw6m9b0>

    • Corresponding publication:

      > Tabin J.A. and K.A. Chiasson. 2023. Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction. iScience. Submitted.

Contact Information

  • Name: Julius A. Tabin
  • Affiliations: Department of Organismic and Evolutionary Biology, Harvard University
  • ORCID ID: <https://orcid.org/0000-0002-3591-6620>
  • Email: <jtabin1@gmail.com>
  • Alternate Email: <jtabin@g.harvard.edu>
  • Address: e-mail preferred
  • Contributor ORCID IDs:
    • Julius A. Tabin: <https://orcid.org/0000-0002-3591-6620>
    • Katherine A. Chiasson: <https://orcid.org/0000-0002-9729-1718>

Additional Dataset Metadata

Acknowledgements

  • Funding sources: This work was supported in part by a graduate stipend from the Department of Organismic and Evolutionary Biology at Harvard University.
  • Formatting for this README file is based on the README file of LaPergola, J.B., C. Riehl, J.E. Martínez-Gómez, B. Roldán-Clarà, and R.L. Curry. 2022. Data for the article "Extra-pair paternity correlates with genetic diversity, but not breeding density, in a Neotropical passerine, the Black Catbird", Dryad, Dataset, <https://doi.org/10.5061/dryad.2bvq83btg>

Methodological Information

  • Methods of data collection/generation: see manuscript and supplemental methods for details

Data and File Overview

Summary Metrics

  • File count: 29
  • Range of individual file sizes: 4.0 KB - 3.4 MB
  • File formats: .csv, .Rmd, .ipynb, .pdf, .xlsx, .NEX

Table of Contents

  • Tabin_Chiasson_Supplemental_Methods_and_Results.pdf
  • Tabin_Chiasson_Supplemental_Figures.pdf
  • Tabin_Chiasson_Supplemental_Table_1.xlsx
  • Tabin_Chiasson_Supplemental_Table_2.xlsx
  • Data Collection Script.ipynb
  • Color Presence Reconstruction.Rmd
  • Quantitative Color Reconstruction.Rmd
  • Output Specific Colors.ipynb
  • Find Correlations.Rmd
  • Carnivore_phylo_Nyakatura2012.NEX
  • enviro_data.csv
  • enviro_data_onlytree.csv
  • poly_data_main_subset.csv
  • poly_data_subset.csv
  • general_data_reordered_withsub.csv
  • Tip_col_data.csv
  • Node_col_data.csv
  • general_data_brown_only.csv
  • general_data_hazgre_only.csv
  • general_data_yelbei_only.csv
  • general_data_grey_only.csv
  • general_data_blue_only.csv
  • general_data_reordered.csv
  • col_data.csv
  • col_data_nosub.csv
  • col_data_onlytree.csv
  • dom_col_data.csv
  • dom_col_data_nosub.csv
  • dom_col_data_onlytree.csv

Setup

  • Unpacking instructions: n/a
  • Recommended software/tools: Python version 3.8.8; RStudio 2021.05.24; R version 4.2.1
  • Raw data files used for this analysis can be found at <https://github.com/jtabin/Felid-Eyes>

Notes

  • Any cell left empty in a data file is empty because no data exists for that taxon. The analysis programs were designed with these gaps in mind; the gaps are intentional and do not indicate missing data.
  • For some files below, there are groups of columns corresponding to each of the five eye colors identified in the study: brown, hazgre (hazel/green), yelbei (yellow/beige), gray, and blue. In the following data descriptions, an x stands in for any color name.

File Details

Details for: Tabin_Chiasson_Supplemental_Methods_and_Results.pdf

  • Description: a .pdf file containing the supplemental methods and results for "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction"
  • Format(s): .pdf

Details for: Tabin_Chiasson_Supplemental_Figures.pdf

  • Description: a .pdf file containing the supplemental figures and figure captions for "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction"
  • Format(s): .pdf

Details for: Tabin_Chiasson_Supplemental_Table_1.xlsx

  • Description: an Excel sheet containing Supplemental Table 1 for "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction"
  • Format(s): .xlsx
  • Variables:
    • Rename: Taxon name
    • Common Name: The common name of each taxon
    • Number of Images: The number of images used for data collection

Details for: Tabin_Chiasson_Supplemental_Table_2.xlsx

  • Description: an Excel sheet containing Supplemental Table 2 for "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction". The table has been split up for clarity, each part corresponding to AIC values for one part of the paper's analysis: the main phylogeny, the main phylogeny with only the most common colors considered, and the full phylogeny, including all the subspecies.
  • Format(s): .xlsx
  • Variables:
    • Model: the phylogenetic model of trait evolution
    • K: the number of parameters
    • logLik: the log likelihood of that model
    • AICc: the calculated AICc value
    • deltaAICc: the difference in AICc score between the best model and the compared model
    • Weight: the proportion of total predictive power of the model
    • Evidence ratio: the ratio of the model's weight compared to the best model
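
The columns above are related by standard model-comparison formulas. The following is a minimal Python sketch of those relationships, not the authors' code (the analysis itself was run in R). The sample size `n`, the example model names, and the log likelihood values are hypothetical, and the evidence ratio is assumed here to be the best model's weight divided by the compared model's weight.

```python
import math

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1 for the AICc correction")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def compare_models(models, n):
    """models maps a model name to (K, logLik); returns the table columns per model."""
    scores = {name: aicc(ll, k, n) for name, (k, ll) in models.items()}
    best = min(scores.values())
    deltas = {name: s - best for name, s in scores.items()}
    # Akaike weights: relative likelihoods exp(-delta / 2), normalised to sum to 1.
    rel = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(rel.values())
    weights = {name: r / total for name, r in rel.items()}
    best_weight = max(weights.values())
    return {
        name: {
            "AICc": scores[name],
            "deltaAICc": deltas[name],
            "Weight": weights[name],
            # Assumption: ratio of the best model's weight to this model's weight.
            "Evidence ratio": best_weight / weights[name],
        }
        for name in models
    }

# Hypothetical values for two rate models fit to 40 taxa.
table = compare_models({"ER": (1, -20.0), "ARD": (2, -19.5)}, n=40)
for name, row in table.items():
    print(name, row)
```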

Details for: Data Collection Script.ipynb

  • Description: a Jupyter Notebook file containing code that takes a folder of iris images for a species as its input and outputs which colors are present and their various shades. Some of this must be manually determined according to the methods outlined in "Evolutionary Insights Into Felidae Iris Color Through Ancestral State Reconstruction".
  • Format(s): .ipynb

Details for: Color Presence Reconstruction.Rmd

  • Description: an R Markdown file containing code for reconstructing which color eyes were present in the populations of each ancestor. This takes the output of "Data Collection Script.ipynb" as its input and outputs the reconstructions at each phylogenetic node, along with figures.
  • Format(s): .Rmd

Details for: Quantitative Color Reconstruction.Rmd

  • Description: an R Markdown file containing code for performing the reconstruction for the more specific colors (i.e. not just whether or not a color is present, but what its shades were quantitatively). This also takes the output of "Data Collection Script.ipynb" as its input and outputs the reconstructions at each phylogenetic node, along with figures.
  • Format(s): .Rmd

Details for: Output Specific Colors.ipynb

  • Description: a Jupyter Notebook file containing code for transforming the RGB .csv output of "Data Collection Script.ipynb" and "Quantitative Color Reconstruction.Rmd" into colorful images for figure creation. It also takes "shade_bg.png" as a background input, provided in the <https://github.com/jtabin/Felid-Eyes> repository.
  • Format(s): .ipynb

Details for: Find Correlations.Rmd

  • Description: an R Markdown file containing code for taking the output of "Data Collection Script.ipynb" and performing phylogenetic and tetrachoric correlations, resulting in raw data, as well as figures.
  • Format(s): .Rmd

Details for: Carnivore_phylo_Nyakatura2012.NEX

  • Description: a NEXUS file containing the species-level supertree for Carnivora used in this paper. It was taken from Nyakatura, K, Bininda-Emonds ORP. 2012. Updating the evolutionary history of Carnivora (Mammalia): a new species-level supertree complete with divergence time estimates. BMC Biology 10: 1-31.
  • Format(s): .NEX

Details for: Tip_col_data.csv

  • Description: a comma-delimited file containing the eye color information for the tips of the phylogenetic tree loaded in "Color Presence Reconstruction.Rmd" and "Quantitative Color Reconstruction.Rmd". This is the input to those files.
  • Format(s): .csv
  • Variables:
    • Rename: Taxon name
    • x_col_num: How many distinct shades of a certain color appear for that species/node (determined by "Data Collection Script.ipynb")
    • x_order: The order of shades in the eyes (i.e. which shades are more abundant). This is 1-4 letters (d, m, and l), corresponding to dark, medium, and light shades. Thus, if a cell contained mdl, then the order of shade abundance in the eye of that row's taxon would be medium > dark > light.
    • x_pri and x_sec: The primary and secondary shades in the eye individually. For the mdl example, x_pri would contain m and x_sec would contain d.
    • x_light_R, x_light_G, and x_light_B: The red, green, and blue RGB values, respectively, for the light shade, provided it exists.
    • x_med_R, x_med_G, and x_med_B: The red, green, and blue RGB values, respectively, for the medium shade, provided it exists.
    • x_dark_R, x_dark_G, and x_dark_B: The red, green, and blue RGB values, respectively, for the dark shade, provided it exists.
    • x_excl_R, x_excl_G, and x_excl_B: The red, green, and blue RGB values, respectively, for a rare fourth color, if it exists for some species, that was excluded in the comparative analyses.
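
As an illustration of how these columns fit together, here is a minimal Python sketch that reads one row and pulls out the shade order and an RGB triple, treating empty cells as "shade not present" as described in the Notes. The sample row is invented for the example, and the exact column spellings (e.g. `brown_order`, `brown_med_R`) are assumed from substituting a color name for x in the variable names above.

```python
import csv
import io

SHADE_NAMES = {"d": "dark", "m": "med", "l": "light"}

def shade_order(row, color):
    """Return (order, primary, secondary) letters for one color, e.g. ('mdl', 'm', 'd')."""
    order = (row.get(f"{color}_order") or "").strip()
    primary = order[0] if len(order) >= 1 else None
    secondary = order[1] if len(order) >= 2 else None
    return order, primary, secondary

def shade_rgb(row, color, shade_letter):
    """Return the (R, G, B) triple for a shade, or None if the cells are empty."""
    name = SHADE_NAMES[shade_letter]
    cells = [row.get(f"{color}_{name}_{channel}") for channel in "RGB"]
    if any(cell is None or cell.strip() == "" for cell in cells):
        return None  # intentional gap: this shade does not exist for the taxon
    return tuple(float(cell) for cell in cells)

# Invented example row; real files contain one such block of columns per color.
sample = io.StringIO(
    "Rename,brown_col_num,brown_order,"
    "brown_light_R,brown_light_G,brown_light_B,"
    "brown_med_R,brown_med_G,brown_med_B,"
    "brown_dark_R,brown_dark_G,brown_dark_B\n"
    "Example_taxon,2,md,,,,120,80,40,60,40,20\n"
)
row = next(csv.DictReader(sample))
print(shade_order(row, "brown"))
print(shade_rgb(row, "brown", "m"), shade_rgb(row, "brown", "l"))
```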

Details for: Node_col_data.csv

  • Description: a comma-delimited file containing the output of the phylogenetic reconstructions done by the programs.
  • Format(s): .csv
  • Variables:
    • Node: The node ID (ordered as the R package ape orders the nodes, with 1 being the common ancestor of the whole tree)
    • Other columns are identical to those for Tip_col_data.csv above.

Details for: general_data_brown_only.csv, general_data_hazgre_only.csv, general_data_yelbei_only.csv, general_data_grey_only.csv, general_data_blue_only.csv, general_data_reordered.csv, and general_data_reordered_withsub.csv

  • Description: comma-delimited files containing subsets of the "Tip_col_data.csv" file, which are better for comparisons using "Quantitative Color Reconstruction.Rmd".
  • Format(s): .csv
  • Variables:
    • x_ter and x_qua: The tertiary and quaternary shades in the eye individually. The _R, _G, _B suffixes indicate the red, green, and blue RGB values, respectively.
    • Other columns are identical to those for Tip_col_data.csv above.

Details for: col_data.csv, dom_col_data.csv, col_data_nosub.csv, dom_col_data_nosub.csv, col_data_onlytree.csv, and dom_col_data_onlytree.csv

  • Description: comma-delimited files containing just the presence or absence of each overall eye color for each felid taxon considered in the study. Any file beginning with "dom_" just contains the most common eye colors, determined using our experimental methods. Any file ending with "_nosub" only has species (with subspecies removed) and any file ending with "_onlytree" only has species that appear on the Nyakatura and Bininda-Emonds (2012) Carnivora tree.
  • Format(s): .csv
  • Variables:
    • Rename: Taxon name
    • x_pres: Whether the taxon for each row contains that color eyes in its population (1 for yes, 0 for no).

Details for: poly_data_subset.csv and poly_data_main_subset.csv

  • Description: comma-delimited files containing the presence or absence of each overall eye color for each felid taxon considered in the study, formatted as a polymorphic trait. "poly_data_main_subset.csv" is the same, but only containing the most common eye colors, determined using our experimental methods.
  • Format(s): .csv
  • Variables:
    • Rename: Taxon name
    • color: Which eye colors are present for that taxon, separated by + signs.
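
The polymorphic format and the x_pres format carry the same information. A minimal Python sketch of converting between them follows; the color tokens used inside the color column are assumed to be the abbreviations listed in the Notes (brown, hazgre, yelbei, gray, blue), which may differ from the actual spelling in the files.

```python
COLORS = ("brown", "hazgre", "yelbei", "gray", "blue")  # assumed token spellings

def poly_to_presence(color_field):
    """Turn 'brown+yelbei' into {'brown_pres': 1, 'hazgre_pres': 0, ...}."""
    present = {token.strip() for token in color_field.split("+") if token.strip()}
    unknown = present - set(COLORS)
    if unknown:
        raise ValueError(f"unrecognised color token(s): {sorted(unknown)}")
    return {f"{color}_pres": int(color in present) for color in COLORS}

def presence_to_poly(row):
    """Inverse conversion: join every color scored 1 with '+' signs."""
    return "+".join(color for color in COLORS if int(row[f"{color}_pres"]) == 1)

scores = poly_to_presence("brown+yelbei")
print(scores)
print(presence_to_poly(scores))
```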

Details for: enviro_data.csv and enviro_data_onlytree.csv

  • Description: comma-delimited files containing the environmental/morphological data collected and made into parameters using our methods and supplemental methods. "enviro_data_onlytree.csv" only has species that appear on the Nyakatura and Bininda-Emonds (2012) Carnivora tree.
  • Format(s): .csv
  • Variables:
    • Pupil_type: The pupil information for each species looked at in the study (i.e. whether they have round, vertical, or subcircular pupils).
    • Pupil_type_bin: The pupil information as numbers: 0 = vertical, 1 = subcircular, and 2 = round.
    • Pupil_type_revised: The pupil information for each species looked at in the study with subcircular pupils considered vertical (i.e. whether they have round or vertical pupils).
    • Pupil_type_revised_bin: The pupil information, with subcircular pupils considered vertical, as numbers: 0 = vertical, 1 = round.
    • Activity_type: The animal's observed activity habits (diurnal, nocturnal, and/or crepuscular) from the University of Michigan Animal Diversity Web.
    • Nocturnal, Crepuscular, Diurnal: Each corresponds to one activity mode with a 1 if that activity mode is present and a 0 if it is absent.
    • Nocturnal_prop: A metric for how nocturnal the animal is with a 3 for fully nocturnal, 2 if there is one other activity mode, 1 if there are two others, and 0 if the animal isn't nocturnal.
    • Region: Data on the zoogeographical region that each species is mainly found in (ethiopian, oriental, palearctic, nearctic, or neotropical). The regions and names are from the paper Johnson WE, Eizirik E, Pecon-Slattery J, Murphy WJ, Antunes A, Teeling E, O'Brien SJ. 2006. The late Miocene radiation of modern Felidae: a genetic assessment. Science 311(5757):73-77.
    • Ethiopian, Oriental, Palearctic, Nearctic, Neotropical: Each corresponds to one zoogeographical region with a 1 if the animal is present in the area and a 0 if it is absent.
    • Habitat: The animal's main habitat(s), determined by the University of Michigan Animal Diversity Web. Non-mutually exclusive possible options are desert, forest, savanna, mountains, rainforest, swamp, marsh, tundra, and taiga.
    • Desert, Savanna, Forest, Rainforest, Forest_Rainforest, Mountains: Each corresponds to one habitat with a 1 if the animal is present in the habitat and a 0 if it is absent. Forest_Rainforest is either/or Forest and Rainforest.
    • Habitat_num: The number of different habitats occupied by the animals.
    • Low_elevation_m, High_elevation_m: The lowest and highest elevation each taxon has been observed in (in meters). IMPORTANT: THIS DATA IS INCOMPLETE!
    • Length_low_cm, Length_high_cm, Length_avg_cm: Low, high, and average body length in cm. IMPORTANT: THIS DATA IS INCOMPLETE!
    • Skull_Length_mm: The skull length in mm. IMPORTANT: THIS DATA IS INCOMPLETE!
    • Mating: The mating system (promiscuous, polygynous, and/or monogamous).
    • Coat_pattern: Data on the coat pattern of each taxon: flecks, uniform, stripes, sblotch (small blotches), rosettes, and/or blotches. This is based on Werdelin L, Olsson L. 1997. How the leopard got its spots: a phylogenetic view of the evolution of felid coat patterns. Biol. J. Linn. Soc. 62(3):383-400.
    • Flecks, Uniform, Stripes, SBlotch, Rosettes, Blotches: Each corresponds to one coat pattern with a 1 if the animal has that pattern and a 0 if it doesn't.
    • Black_body_morethaneye: 1 if the black fur on the animal's body covers a greater area than the animal's eye, 0 otherwise.
    • Black_tail_morethaneye: 1 if the black fur on the animal's tail covers a greater area than the animal's eye, 0 otherwise.
    • Nose_color: Whether the animal has a black or pink nose.
    • Nose_black, Nose_pink: Each corresponds to one nose color with a 1 if the animal has that color and a 0 if it doesn't.
    • Hybridization: A list of the species that each animal has been seen to hybridize with in the modern day.
    • Ancient Hybridization: A list of the species that each animal is hypothesized to have hybridized with historically.
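
Two of the derived columns above follow simple rules that can be written out explicitly. The Python sketch below restates the Nocturnal_prop and Pupil_type_revised_bin definitions as functions; the function names are hypothetical and this is not the code used to build the files.

```python
def nocturnal_prop(nocturnal, crepuscular, diurnal):
    """3 = nocturnal only, 2 = one other activity mode, 1 = two others, 0 = not nocturnal."""
    if not nocturnal:
        return 0
    other_modes = int(bool(crepuscular)) + int(bool(diurnal))
    return 3 - other_modes

def revised_pupil_bin(pupil_type_bin):
    """Map Pupil_type_bin (0 vertical, 1 subcircular, 2 round) to the revised coding,
    where subcircular pupils are treated as vertical (0 vertical, 1 round)."""
    return {0: 0, 1: 0, 2: 1}[pupil_type_bin]

print(nocturnal_prop(1, 0, 0), nocturnal_prop(1, 1, 0), nocturnal_prop(1, 1, 1), nocturnal_prop(0, 1, 1))
print([revised_pupil_bin(code) for code in (0, 1, 2)])
```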

END OF README

Methods

Data set: In order to sample all felid species, we took advantage of public databases. Images of individuals from 40 extant felid species (all but Felis catus, excluded due to the artificial selection on eye color in domesticated cats by humans), as well as 12 identifiable subspecies and four outgroups (banded linsang, Prionodon linsang; spotted hyena, Crocuta crocuta; common genet, Genetta genetta; and fennec fox, Vulpes zerda), were found using Google Images and iNaturalist using both the scientific name and the common name for each species as search terms. This approach, taking advantage of the enormous resource of publicly available images, allows access to a much larger data set than in the published scientific literature or than would be possible to obtain de novo for this study. Public image-based methods for character state classification have been used previously, such as in a phylogenetic analysis of felid coat patterns (Werdelin and Olsson 1997). However, this approach does require implementing strong criteria for selecting images.

Criteria used to choose images included selecting images where the animal was facing towards the camera, at least one eye was unobstructed, the animal was a non-senescent adult, and the eye was not in direct light, causing glare, or completely in shadow, causing unwanted darkening. The taxonomic identity of the animal in each selected image was verified through images present in the literature, as well as the “research grade” section of iNaturalist. When possible, we collected five images per taxon, although some rarer taxa had fewer than five acceptable images available. In addition, some species with a large number of eye colors needed more than five images to capture their variation, determined by quantitative methods discussed below. Each of the 56 taxa and the number of images used are given in Supplementary Table 1. 

Once the images were selected, they were manually edited using MacOS Preview. This editing process involved choosing the “better” of the two eyes for each image (i.e. the one that is most visible and with the least glare and shadow). Then, the section of the iris for that eye without obstruction, such as glare, shadow, or fur, was cropped out. An example of this is given in Figure S1. This process resulted in a data set of 279 cropped, standardized irises. These images, along with the original photos, can be found on GitHub.

Eye color identification: To impartially identify the eye color(s) present in each felid population, the data set images were loaded by species into Python (version 3.8.8) using the Python Imaging Library (PIL) (Van Rossum and Drake 2009; Clark 2015). For each image, the red, green, and blue (RGB) values for each of its pixels were extracted. Then, they were averaged and the associated hex color code for the average R, G, and B values was printed. The color associated with this code was identified using curated and open-source color identification programs (Aerne 2022; Cooper 2022). This data allowed the color of each eye in the data set to be correctly identified, removing a great deal of the bias inherent in a researcher subjectively deciding the color of each iris. 
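
The averaging step can be sketched in a few lines of Python. This is a minimal illustration rather than the notebook's code: it works on any iterable of (R, G, B) tuples, which in practice could be obtained from a cropped iris image with Pillow (for example via `Image.open(path).convert("RGB").getdata()` in Pillow versions contemporary with the study). Rounding the channel means to integers before formatting is an assumption.

```python
def mean_rgb(pixels):
    """Average the R, G, and B channels over an iterable of (R, G, B) tuples."""
    sums = [0, 0, 0]
    count = 0
    for r, g, b in pixels:
        sums[0] += r
        sums[1] += g
        sums[2] += b
        count += 1
    if count == 0:
        raise ValueError("no pixels supplied")
    return tuple(round(total / count) for total in sums)

def to_hex(rgb):
    """Format an (R, G, B) triple of 0-255 integers as a hex color code."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# Two made-up pixels standing in for a cropped iris image.
pixels = [(100, 50, 0), (200, 150, 100)]
print(to_hex(mean_rgb(pixels)))
```

The resulting hex code is what would then be looked up in a color identification program.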

Eye colors were assigned on this basis to one of five fundamental color groups: brown, hazel/green, yellow/beige, gray, and blue. To ensure no data was missed due to low sample size, the first 500 Google Images, as well as all the “research grade” images on iNaturalist, were viewed for each species. Any missed colors were added to the data set. This method nonetheless has a small, but non-zero, chance to miss rare eye colors that are present in species. However, overall, it provides a robust and repeatable way to identify the general iris colors present in animals. 

In addition, if, for a given species, one or two eye colors were greatly predominant in the available data online (>80% for one or ~40% for both, respectively), they were defined as being the most common eye color(s). With this assessment, the phylogenetic analysis below could be carried out both with all recorded eye colors and using only the most common eye colors, thereby assuring that rare eye colors did not skew the results. All the eye colors present for each species are displayed in Figure 1.
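
The predominance rule can be expressed as a small function. The Python sketch below is one reading of the thresholds, under stated assumptions: a single color counts as most common when its share exceeds 80%, two colors count when each reaches at least 40% (the text says "~40%", so the exact cutoff is a guess), and an empty list is returned when neither condition holds. The image counts are invented.

```python
def most_common_colors(counts, single=0.80, pair=0.40):
    """counts maps a color name to the number of images showing it."""
    total = sum(counts.values())
    if total == 0:
        return []
    shares = sorted(((n / total, color) for color, n in counts.items()), reverse=True)
    if shares[0][0] > single:
        return [shares[0][1]]
    # If the second-ranked color reaches the pair cutoff, the first one does too.
    if len(shares) >= 2 and shares[1][0] >= pair:
        return [shares[0][1], shares[1][1]]
    return []

print(most_common_colors({"brown": 9, "gray": 1}))
print(most_common_colors({"brown": 5, "yelbei": 4, "gray": 1}))
print(most_common_colors({"brown": 5, "yelbei": 3, "gray": 2}))
```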

Shade measurements within each color group: For each species, the images were sorted into their color groups. For each group, RGB values for each pixel in each image were again extracted, resulting in a three-dimensional data set. This was reduced to two dimensions using Uniform Manifold Approximation and Projection (UMAP), a method selected for its preservation of local structure, important for potential fine shade differences (McInnes et al. 2018). The UMAP projection for each image was then analyzed using k-means clustering through the package scikit-learn (version 1.2.0) (Pedregosa et al. 2011). The number of clusters (k), indicating the number of distinct shades of color in the iris of each animal, was determined using elbow plots. 
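
To make the elbow-plot idea concrete, here is a minimal, dependency-free Python sketch. It is a stand-in, not the study's pipeline: it implements plain Lloyd's k-means directly on RGB triples instead of calling scikit-learn, it skips the UMAP projection entirely, and the pixel values are invented. The quantity plotted in an elbow plot is the within-cluster sum of squares (inertia) for each candidate k.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [
            tuple(sum(channel) / len(group) for channel in zip(*group)) if group else centers[i]
            for i, group in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia

def elbow_curve(points, k_values=range(1, 4), seed=0):
    """Inertia for each candidate k; the 'elbow' is where the drop levels off."""
    return {k: kmeans(points, k, seed=seed)[1] for k in k_values}

# Invented pixels forming two obvious shades (dark and light).
points = [(10, 10, 10), (12, 11, 10), (11, 12, 11),
          (200, 180, 90), (202, 181, 92), (199, 179, 91)]
for k, inertia in elbow_curve(points).items():
    print(k, round(inertia, 2))
```

For data like this, the inertia falls sharply from k = 1 to k = 2 and changes little afterwards, which is the pattern an elbow plot is read for.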

After this was done for all images in the group, the k values were averaged and each image was clustered using the average k value, rounded to the nearest integer. This was done to standardize within groups, avoid confounders based on lower-quality images, and allow for comparative analysis. After this, the average RGB values for each cluster for each image were calculated. Then, the clusters were matched up based on similarity. To do this, one image from the group had its clusters labeled in order (if there were three clusters, they would be 0, 1 and 2). Then, another image from the group would have the distances in 3D space between each of its clusters compared to each of the labeled clusters. The optimal arrangement of clusters was found by calculating the sum of squared errors for every possible combination of clusters and taking the minimum. Then, the clusters were merged. This method was repeated for every image in the group. Doing this for every color of every species resulted in an output with the number of shades within the iris for each color in each species, as well as an average of each different shade across the data. Throughout this process, images were not resized, in order to allow higher quality images, which have more pixels, to contribute a greater amount to the average. This was done to ensure any blurring from lower-quality images did not obscure the true shade variety in each eye.
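
The cluster-matching step can be sketched as follows. This minimal Python example tries every ordering of one image's cluster means against the labeled reference clusters and keeps the ordering with the smallest sum of squared errors, as described above. The merge step is an assumption: it uses a pixel-count-weighted mean, which is consistent with larger images contributing more to the average but is not confirmed by the text. All values are invented.

```python
from itertools import permutations

def sq_dist(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_permutation(reference, candidate):
    """Return perm such that candidate[perm[i]] is matched to reference[i] with minimum SSE."""
    return min(
        permutations(range(len(candidate))),
        key=lambda perm: sum(sq_dist(reference[i], candidate[j]) for i, j in enumerate(perm)),
    )

def merge_clusters(reference, ref_counts, candidate, cand_counts):
    """Merge matched clusters using pixel-count-weighted means (assumed weighting)."""
    perm = best_permutation(reference, candidate)
    means, counts = [], []
    for i, j in enumerate(perm):
        n_ref, n_new = ref_counts[i], cand_counts[j]
        total = n_ref + n_new
        means.append(tuple((a * n_ref + b * n_new) / total
                           for a, b in zip(reference[i], candidate[j])))
        counts.append(total)
    return means, counts

reference = [(20, 10, 5), (120, 80, 40), (220, 180, 120)]   # clusters labeled 0, 1, 2
candidate = [(118, 82, 38), (222, 178, 122), (22, 12, 6)]   # same shades, different order
print(best_permutation(reference, candidate))
print(merge_clusters(reference, [100, 100, 100], candidate, [100, 300, 100]))
```

Exhaustive search over permutations is practical here because the number of shades per eye is small.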

The final, combined clusters were ranked by how prevalent they were within the eyes, calculated by the number of pixels in each group. The groups for each shade were categorized as “dark”, “medium”, or “light” according to the procedure provided in the Supplementary Methods. The importance of this pipeline is to create a data set that can be compared in a standardized way. The information about which shades are most represented was also collected and saved. This data can be found in the Supplementary Material.
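
The ranking by pixel count can be illustrated with a short Python sketch. It assumes the dark/medium/light labels have already been assigned (that procedure is in the Supplementary Methods and is not reproduced here) and encodes the ranking as a string of the letters d, m, and l, most abundant first. The pixel counts are invented.

```python
def order_string(pixel_counts):
    """pixel_counts maps a shade letter ('d', 'm', 'l') to its total pixel count.
    Returns the letters ordered from most to least abundant, skipping absent shades."""
    ranked = sorted(pixel_counts.items(), key=lambda item: item[1], reverse=True)
    return "".join(letter for letter, count in ranked if count > 0)

# Medium is most abundant, then dark, then light.
print(order_string({"d": 300, "m": 500, "l": 100}))
```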

Phylogeny: The phylogeny used for this work was subset from the Carnivora supertree from Nyakatura and Bininda-Emonds (2012). This ultrametric phylogeny takes into account 188 literature and gene trees and includes members of all eight Felidae lineages. More recent phylogenies are largely congruent, differing mainly in the placement of the Bay Cat Lineage and the Pallas’s cat (Otocolobus manul), partly due to differences in Y chromosome evolutionary evidence compared to other lines of evidence (Li et al. 2016). Alternate placements were tested and were found to not produce a significant difference, making these discrepancies irrelevant to this study. 

This Carnivora supertree is missing 9 of the extant felid groups for which data was collected. Thus, a second tree (termed the “full” tree) was created with the missing species being added manually according to their placements on a Felidae-specific tree from Johnson et al. (2006) and/or the more recent tree from Li et al. (2016). The subspecies added were defined according to the most recent identification based on Kitchener et al. (2017) and Liu et al. (2018). Subspecies were added as a polytomy next to the previously defined species on the tree. Since divergence data was unavailable for some of the species and subspecies, the additions were made with branch lengths equal to the nearest resolved neighboring branch, a severe overestimation of the divergence between groups. 

It is important to note that this method of manually adding taxa to a tree is flawed without proper sequence data and certainly should not be relied upon for ancestral state predictions or to make broad claims, as there is no guarantee that any addition reflects true divergence. However, this tree was created purely to try and provide some insight into local areas of the tree at the species level (e.g. what was the eye color of the ancestral tiger?). Even still, these predictions must be understood as far more uncertain than analyses with the original supertree with more limited taxa. The main tree with all the eye colors present for each species is shown in Figure 1. The full tree and the main tree created only considering the most common eye colors are presented in Figure S2.

General color reconstruction: To begin the process of ancestral state reconstruction, the phylogenetic trees were read into R (version 4.2.1) using the package ape (version 5.6-2) (R Core Team 2022; Paradis and Schliep 2019). A table of taxa, and the colors represented for each, was loaded in and scored with 0/1 for absence/presence. The same table with just the most common eye colors was also loaded in. 

The command rayDISC() from corHMM (version 2.8) was used for each of the five eye colors independently across the tree (Beaulieu et al. 2022). Although the presence/absence of each eye color was analyzed on its own, the colors are likely not fully independent. Therefore, they were also analyzed together as a polymorphic trait using stochastic mapping through fitpolyMk() from the R package phytools (version 1.2-0) (Revell 2012). Since the polymorphic character had far too many states (2^5 - 1 = 31), and correspondingly high parameter complexity, for adequate interpretation, and since the two analyses generally aligned (data not shown), the independent model was used for the rest of the analysis. A color was said to be present at any given node (Figure 2) if the marginal maximum likelihood ancestral state reconstruction for that color was 60% or greater. The optimal model of trait evolution was determined using an Akaike information criterion (AIC) analysis done on the results of the fitDiscrete() command from geiger comparing equal/symmetric and asymmetric rates (Pennell et al. 2014). This process was done for the data of all the observed eye colors, as well as for the data for the most common eye colors and for the full phylogeny, with the AIC output and weights given in Supplementary Table 2.
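
Two details of this paragraph can be made concrete with a short Python sketch: the size of the polymorphic state space (every non-empty combination of the five colors) and the 60% presence rule applied to per-color marginal reconstructions. The color abbreviations follow the README, and the marginal probabilities are invented for illustration.

```python
from itertools import combinations

COLORS = ("brown", "hazgre", "yelbei", "gray", "blue")

def polymorphic_states(colors=COLORS):
    """Every non-empty combination of colors, written with '+' separators."""
    return ["+".join(combo)
            for size in range(1, len(colors) + 1)
            for combo in combinations(colors, size)]

def colors_present(marginals, threshold=0.60):
    """Colors whose marginal probability of presence at a node meets the threshold."""
    return [color for color in COLORS if marginals.get(color, 0.0) >= threshold]

print(len(polymorphic_states()))
# Invented marginal probabilities for one ancestral node.
print(colors_present({"brown": 0.97, "gray": 0.61, "blue": 0.12}))
```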

Quantitative color reconstruction: After data was collected on the eye colors present for every node on the tree, more specific reconstructions were possible. For each node, a new tree was created for each eye color present at that node. Each of these subset trees included every descendant of that node that shared each eye color with it, except for those where the color was lost and then re-arose independently. For example, an ancestral node that was determined to have hazel/green eyes and brown eyes present would have one tree with all its continuous, green-eyed descendants and another tree with all its continuous, brown-eyed descendants. A diagram of this method is given in Figure S3. This method was done to most accurately reconstruct along plausible evolutionary pathways. If one wants to predict the eye shade of a specific color for a specific node, one should omit both taxa that have lost that eye color (since their present condition cannot communicate any relevant information about the shade of that color for their ancestor) and taxa that have lost that eye color and then regained it (since it is unknown whether their present condition is at all related to the shade of that color for their ancestor).
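
The subsetting rule can be sketched as a tree traversal. In the minimal Python example below, the tree is a hypothetical dictionary from each node to its children, and `present` records whether the color was reconstructed (or observed) at each node or tip. A tip is kept only if the color is present at every node on the path from the focal node to that tip, so lineages that lost the color, or lost and regained it, are dropped. The data structures and node names are illustrative, not those used in the R code.

```python
def continuous_tips(children, present, node):
    """Tips below `node` reachable without passing through a node lacking the color."""
    if not present.get(node, False):
        return []  # color absent here: this lineage and everything below it is dropped
    kids = children.get(node, [])
    if not kids:
        return [node]  # a tip that still carries the color
    tips = []
    for kid in kids:
        tips.extend(continuous_tips(children, present, kid))
    return tips

# Hypothetical tree: N1 -> (N2 -> (T1, T2), N3 -> (T3, T4)).
children = {"N1": ["N2", "N3"], "N2": ["T1", "T2"], "N3": ["T3", "T4"]}
# The color is lost at N2 and regained at T1; it persists through N3 to T3 only.
present = {"N1": True, "N2": False, "T1": True, "T2": False,
           "N3": True, "T3": True, "T4": False}
print(continuous_tips(children, present, "N1"))
```

In this example T1 is excluded even though it has the color, because the color re-arose after being lost at N2.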

After the trees were created, the specific colors were reconstructed using maximum likelihood methods with the function fastAnc() from the R package phytools (version 1.2-0) (Revell 2012). This was done independently for the red, green, and blue values for each of the data sets collected for the light, medium, and dark shades. Since RGB values can only be from 0-255, it was heartening that the 95% confidence intervals for the quantitative reconstructions were almost always well within the realistic range, lending considerable support to the reconstructions. Large confidence intervals are a known limitation of continuous trait likelihood reconstructions, so one should not understand the reconstructions to always communicate the exact eye shades of the felid ancestors, but they are useful in comparison to one another to illuminate larger trends.

Beyond reconstructing the colors themselves, corHMM’s rayDISC() was again used to reconstruct the number of shades within each eye color for each node, using the shade representation data as a discrete, multistate trait. This was also done for the primary and secondary shades within each eye. Put together, these methods allow for a high-resolution understanding of the iris color of ancestral felids. For each ancestral felid population, we are able to know: which color eyes were present (out of brown, hazel/green, yellow/beige, gray, and blue), how many different shades they had in their eyes for each color, which shades were more or less common, and approximately what those shades would have been. All of this is present in the Supplementary Material.

Correlation analysis: Apart from reconstructing ancestral states, different correlations were performed in order to investigate the possible evolutionary interactions related to eye color variation. Data on pupil shape was obtained from Banks et al. (2015) and data on activity by time of day and primary habitat(s) was obtained from the University of Michigan Animal Diversity Web (Myers et al. 2022). Data on zoogeographical regions were based on Johnson et al. (2006) and data on coat patterns were based on Werdelin and Olsson (1997). Nose color data (pink or black) and whether or not any black was present in the coat or tail were determined manually from observation of images. These were each converted into a set of binary traits, according to the procedure given in the Supplementary Methods.

This data, along with the presence/absence data for each eye color, was analyzed with a maximum likelihood approach using BayesTraits (version 3.0.5), made accessible in R through the package btw (version 2.0) (Pagel et al. 2004; Griffin 2018). This was done by building two models, one where the evolution of two binary traits is independent and one where their evolution is dependent on one another (i.e. where the rate of change in one trait is influenced by the state of the other trait). Then, the models were evaluated using a calculated log Bayes Factor, with a log Bayes Factor over 2 indicating positive evidence for the dependent model. Given the stochasticity of these models, the model comparisons were done 100 times and the calculated log Bayes Factors were averaged, ensuring robust and reproducible results. This process was done by comparing the presence of each eye color to all others, as well as the environmental/physical data to the presence of each eye color, the average shade of the RGB values in each eye color, and the average shade of the RGB values in all eye colors overall. This latter average was computed for all taxa by dropping NA values in the averages. To transform the average values into discrete traits, each value was categorized using Jenks natural breaks optimization, performed through the getJenksBreaks() command in the package BAMMtools (version 2.1.10) (Rabosky et al. 2014). Finally, tetrachoric correlation coefficients were calculated using the tetrachoric() command in the package psych (version 2.2.9), to indicate the direction of each association (Revelle 2022). For the shade correlations, a positive association indicates that the trait is associated with lighter shades.
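
The model comparison can be summarized in a short Python sketch. The text does not give the formula, so this assumes the convention in the BayesTraits manual, where the log Bayes Factor is twice the difference between the log likelihoods of the dependent and independent models. The log likelihood values and function names are invented, and only two runs are shown where the study used 100.

```python
from statistics import mean

def log_bayes_factor(loglik_dependent, loglik_independent):
    """Assumed convention: 2 * (dependent log likelihood - independent log likelihood)."""
    return 2.0 * (loglik_dependent - loglik_independent)

def averaged_log_bf(runs):
    """runs is a sequence of (dependent, independent) log likelihood pairs, one per repeat."""
    return mean(log_bayes_factor(dep, ind) for dep, ind in runs)

def supports_dependent(avg_log_bf, threshold=2.0):
    """A log Bayes Factor over 2 is read as positive evidence for the dependent model."""
    return avg_log_bf > threshold

runs = [(-30.0, -33.0), (-31.0, -33.0)]  # invented values for two repeats
avg = averaged_log_bf(runs)
print(avg, supports_dependent(avg))
```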

Funding

Harvard University