Skip to main content

Evolutionary insights into Felidae iris color through ancestral state reconstruction


Tabin, Julius (2023), Evolutionary insights into Felidae iris color through ancestral state reconstruction, Dryad, Dataset,


There have been almost no studies with an evolutionary perspective on eye (iris) color, outside of humans and domesticated animals. Extant members of the family Felidae have a great interspecific and intraspecific diversity of eye colors, in stark contrast to their closest relatives, all of which have only brown eyes. This makes the felids a great model to investigate the evolution of eye color in natural populations. Through machine learning cluster image analysis of publicly available photographs of all felid species, as well as a number of subspecies, five felid eye colors were identified: brown, hazel/green, yellow/beige, gray, and blue. Using phylogenetic comparative methods, the presence or absence of these colors was reconstructed on a phylogeny. Additionally, through a new color analysis method, the specific shades of the ancestors’ eyes were reconstructed to a high degree of precision. The ancestral felid population was found to have brown-eyed individuals, as well as a novel evolution of gray-eyed individuals, the latter being a key innovation that allowed the rapid diversification of eye color seen in modern felids, including numerous gains and losses of different eye colors. It was also found that the loss of brown eyes and the gain of yellow/beige eyes is associated with an increase in the likelihood of evolving round pupils, which in turn influence the shades present in the eyes. Along with these important insights, the unique methods presented in this work are widely applicable and will facilitate future research into phylogenetic reconstruction of color beyond irises.


Data set: In order to sample all felid species, we took advantage of public databases. Images of individuals from 40 extant felid species (all but Felis catus, excluded due to the artificial selection on eye color in domesticated cats by humans), as well as 13 identifiable subspecies and the banded linsang as an outgroup (Prionodon linsang), were found using Google Images and iNaturalist using both the scientific name and the common name for each species as search terms. This approach, taking advantage of the enormous resource of publicly available images, allows access to a much larger data set than in the published scientific literature or that would be possible to obtain de novo for this study. Public image-based methods for character state classification have been used previously, such as in a phylogenetic analysis of felid coat patterns (Werdelin and Olsson 1997). However, this approach does require implementing strong criteria for selecting images.

Criteria used to choose images included selecting images where the animal was facing toward the camera, at least one eye was unobstructed, the animal was a non-senescent adult, and the eye was not in direct light, causing glare, or completely in shadow, causing unwanted darkening. The taxonomic identity of the animal in each selected image was verified through images present in the literature, as well as the “research grade” section of iNaturalist. When possible, we collected five images per taxon, although some rarer taxa had fewer than five acceptable images available. In addition, some species with a large number of eye colors needed more than five images to capture their variation, determined by quantitative methods discussed below.

Once the images were selected, they were manually edited using MacOS Preview. This editing process involved choosing the “better” of the two eyes for each felid image (i.e. the one that is most visible and with the least glare and shadow). Then, the section of the iris for that eye without obstruction, such as glare, shadow, or fur, was cropped out. This process resulted in a data set of 269 cropped, standardized, felid irises.

Eye color identification: To impartially identify the eye color(s) present in each felid population, the data set images were loaded by species into Python (version 3.8.8) using the Python Imaging Library (PIL) (Van Rossum and Drake 2009; Clark 2015). For each image, the red, green, and blue (RGB) values for each of its pixels were extracted. Then, they were averaged and the associated hex color code for the average R, G, and B values was printed. The color associated with this code was identified using curated and open-source color identification programs (Aerne 2022; Cooper 2022). This data allowed the color of each eye in the data set to be correctly identified, removing a great deal of the bias inherent in a researcher subjectively deciding the color of each iris. 

Eye colors were assigned on this basis to one of five fundamental color groups: brown, hazel/green, yellow/beige, gray, and blue. To ensure no data was missed due to low sample size, the first 500 Google Images, as well as all the “research grade” images on iNaturalist, were viewed for each species. Any missed colors were added to the data set. This method nonetheless has a small, but non-zero, chance to miss rare eye colors that are present in species. However, overall, it provides a robust and repeatable way to identify the general iris colors present in animals. 

In addition, if, for a given species, one or two eye colors were greatly predominant in the available data online (>80% for one or ~40% for both, respectively), they were defined as being the most common eye color(s). With this assessment, the phylogenetic analysis below could be carried out both with all recorded eye colors and using only the most common eye colors, thereby assuring that rare eye colors did not skew the results.

Shade measurements within each color group: For each species, the images were sorted into their groups by assigned color. For each group, RGB values for each pixel in each image were again extracted, resulting in a three-dimensional data set. This was reduced to two dimensions using Uniform Manifold Approximation and Projection (UMAP) (McInnes et al. 2018). The graph for each image was then analyzed using k-means clustering through the package scikit-learn (version 1.2.0) (Pedregosa et al. 2011). The number of clusters (k), indicating the number of distinct shades of color in the iris of each animal, was determined using elbow plots. 

After this was done for all images in the group, the k values were averaged and each image was clustered using the average k value, rounded to the nearest integer. This was done to standardize within groups, avoid confounders based on lower-quality images, and allow for comparative analysis. After this, the average RGB values for each cluster for each image were calculated. Then, the clusters were matched up based on similarity. To do this, one image from the group had its clusters labeled in order (if there were three clusters, they would be 0, 1, and 2). Then, another image from the group would have the distances in 3D space between each of its clusters compared to each of the labeled clusters. The optimal arrangement of clusters was found by calculating the sum of squared errors for every possible combination of clusters and taking the minimum. Then, the clusters were merged. This method was repeated for every image in the group. Doing this for every color of every species resulted in an output with the number of shades within the iris for each color in each species, as well as an average of each different shade across the data. Throughout this process, images were not resized, in order to allow higher quality images, which have more pixels, to contribute a greater amount to the average.

The final, combined clusters were ranked by how prevalent they were within the eyes, calculated by the number of pixels in each group. The groups for each shade were categorized as “dark”, “medium”, or “light” according to the procedure provided in the Supplementary Methods. The importance of this pipeline is to create a data set that can be compared in a standardized way. The information about which shades are most represented was also collected and saved.

Phylogeny: The phylogeny used for this work was taken from Johnson et al. (2006), with the banded linsang (Prionodon linsang) as an outgroup (Johnson et al. 2006). This is a molecular phylogeny of all eight Felidae lineages with fossil calibrations and usage of Y chromosome data. More recent phylogenies are largely congruent, differing mainly in the placement of the Bay Cat Lineage, due to differences in Y chromosome evolutionary evidence compared to other lines of evidence (Li et al. 2016). Both Bay Cat Lineage placements were tested and were found to not produce a significant difference, making the discrepancy irrelevant to this study. Importantly, the Johnson et al. tree does not include every extant felid species, nor subspecies, looked at in this study. To remedy this, the missing species and subspecies were added manually according to their placements on the more recent tree from Li et al. (2006) or, in the case of subspecies, as a polytomy next to the previously defined species on the tree. Since divergence data was unavailable for some of the species and subspecies, two trees were created. One tree had the species in their proper relationships, but with branch lengths of nearly zero (0.0000001), a severe underestimation of the divergence between groups. This was termed the “short” tree. The other tree also had proper relationships, but with branch lengths equal to the nearest resolved neighboring branch, a severe overestimation of the divergence between groups. This was termed the “long” tree. Completing analyses with both trees and comparing the results bypasses the need for exact branch lengths for the uncertain taxa, provided the results match. A limitation of this method is that it results in uncertainty if the results do not align, which turned out to not be a problem in this case.

General color reconstruction: To begin the process of ancestral state reconstruction, the short and long phylogenetic trees were read into R (version 4.2.1) using the package ape (version 5.6-2) (R Core Team 2022; Paradis and Schliep 2019). A table of taxa, and the colors represented for each, was loaded in and scored with 0/1 for absence/presence. The same table with just the most common eye colors was also loaded in. 

The command rayDISC() from corHMM (version 2.8) was used for each of the five eye colors independently across the tree (Beaulieu et al. 2022). The presence/absence of each eye color was taken on its own because there would be far too many states to run it as a multistate trait. The accepted state at each node was determined using maximum likelihood methods. All models of trait evolution (equal rates, symmetric rates, and asymmetric rates) were tried, but there were no differences in results. An Akaike information criterion (AIC) analysis done on the results of the fitDiscrete() command from geiger revealed that an asymmetric rates model was best supported, regardless of the lack of difference its use makes in the results (Pennell et al. 2014). This process was done for the data of all the observed eye colors, as well as for the data for the most common eye colors. 

Exact color reconstruction: After data was collected on the eye colors present for every node on the tree, more specific reconstructions were possible. For each node, a new tree was created for each eye color present at that node. Each new tree included every descendant of that node that shared each eye color with it, except for those where the color was lost and then re-arose independently. For example, an ancestral node that was determined to have hazel/green eyes and brown eyes present would have one tree with all its continuous, green-eyed descendants and another tree with all its continuous, brown-eyed descendants.

After the trees were created, the specific colors were reconstructed using maximum likelihood methods with the function fastAnc() from the R package phytools (version 1.2-0) (Revell 2012). This was done independently for the red, green, and blue values for each of the data sets collected for the light, medium, and dark shades. The 95% confidence intervals for the exact reconstructions were extremely wide when using the “short” tree, averaging over -1000 to 1000. Large confidence intervals are a known limitation of continuous trait likelihood reconstructions, but this is particularly egregious given that RGB values can only be from 0-255. However, when the long tree was used, the end results did not change significantly, but the confidence intervals decreased massively, almost always being well within the 0-255 realistic range. This lends considerable support to the reconstructions. 

Beyond reconstructing the colors themselves, corHMM’s rayDISC() was again used to reconstruct the number of shades within each eye color for each node, using the shade representation data as a discrete, multistate trait. This was also done for the primary and secondary shades within each eye. Put together, these methods allow for a high-resolution understanding of the iris color of ancestral felids. For each ancestral felid population, we are able to know: which color eyes were present (out of brown, hazel/green, yellow/beige, gray, and blue), how many different shades they had in their eyes for each color, which shades were more or less common, and approximately what those shades would have been.

Correlation analysis: Apart from reconstructing ancestral states, different correlations were performed in order to investigate the possible evolutionary interactions related to eye color variation. Data on pupil shape was obtained from Banks et al. (2015) and data on activity by time of day and primary habitat(s) was obtained from the University of Michigan Animal Diversity Web (Banks et al. 2015; Myers et al. 2022). Data on ​​zoogeographical regions were based on Johnson et al. (2006) and data on coat patterns were based on Werdelin et al. (1997). Nose color data (pink or black) and whether or not any black was present in the coat or tail were determined manually from observation of images. These were each converted into a set of binary traits.

This data, along with the presence/absence data for each eye color, was analyzed with a maximum likelihood approach using BayesTraits (version 3.0.5), made accessible in R through the package btw (version 2.0) (Pagel et al. 2004; Griffin 2018). This was done by building two models, one where the evolution of two binary traits is independent and one where their evolution is dependent on one another (i.e. where the rate of change in one trait is influenced by the state of the other trait). Then, the models were evaluated using a calculated log Bayes Factor, with a log Bayes Factor over 2 indicating positive evidence for the dependent model. This process was done by comparing the presence of each eye color to all others, as well as the environmental/physical data to the presence of each eye color, the average shade of the RGB values in each eye color, and the average shade of the RGB values in all eye colors overall. This latter average was computed for all taxa by dropping NA values in the averages. To transform the average values into discrete traits, each value was categorized using Jenks natural breaks optimization, performed through the getJenksBreaks() command in the package BAMMtools (version 2.1.10) (Rabosky et al. 2014). Finally, tetrachoric correlation coefficients were calculated using the tetrachoric() command in the package psych (version 2.2.9), to indicate the direction of each association (Revelle 2022). For the shade correlations, a positive association indicates that the trait is associated with lighter shades.


Harvard University