Data and code from: Diffeomorphic independent contrasts for ancestral reconstruction of shapes
Data files
Apr 08, 2026 version; files total 1.37 GB
- fig3_kernel_test.zip (22.54 MB)
- fig4_5_root_comparison.zip (1.34 GB)
- fig8_emperical_kernel.zip (441.69 KB)
- help_functions.zip (12.50 KB)
- README.md (10.76 KB)
Abstract
Ancestral reconstruction is a fundamental challenge in evolutionary biology, requiring methods that can capture complex morphological changes while accounting for phylogenetic relationships. Current approaches are based on linear assumptions that often oversimplify the spatial relationships between anatomical features and fail to account for landmark correlations within shapes. Here, we introduce a novel method that combines the ability of Large Deformation Diffeomorphic Metric Mapping (LDDMM) to model smooth, invertible transformations between shapes while preserving the relationships between landmarks with Felsenstein's Independent Contrasts (IC) to iteratively reconstruct ancestral shapes along the branches of a phylogenetic tree. We call this method Diffeomorphic Independent Contrasts for Ancestral Reconstruction of Shapes (DICAROS). We validate DICAROS against two existing methods: (1) Linear predictors using Ordinary Least Squares and (2) Ancestral character estimation using maximum likelihood under Brownian Motion, and apply DICAROS to a dataset of swallowtail butterfly species (Family Papilionidae, Order Lepidoptera) to reconstruct the ancestral shape and visualize evolutionary trajectories in a phylomorphospace from the contrasts. We conclude that DICAROS outperforms the existing methods in terms of accuracy and provides a more accurate reconstruction of the ancestral shape for non-symmetric phylogenetic trees. With DICAROS, we show a transition between untailed and tailed Papilionidae species while also illustrating how images of modern species would look under the DICAROS ancestral reconstruction.
https://doi.org/10.5061/dryad.41ns1rnr7
Description of the data and file structure
This repository contains the code and simulation data for the manuscript "DICAROS: Diffeomorphic Independent Contrasts for Ancestral Reconstruction of Shapes" (https://doi.org/10.1093/sysbio/syag019) by Severinsen et al. 2026.
DICAROS is a method for reconstructing ancestral shapes from landmarks. It also allows the shape reconstruction to be applied to matching images. It combines Large Deformation Diffeomorphic Metric Mapping (LDDMM), which models smooth, invertible transformations between shapes while preserving the relationships between landmarks, with Felsenstein's Independent Contrasts (IC) to iteratively reconstruct ancestral shapes along the branches of a phylogenetic tree.
Files and variables
Description:
Folders starting with "fig" contain what is needed to reproduce the corresponding figure in the manuscript.
help_functions.zip contains the underlying software and functions shared across all figures.
For the empirical example, see https://github.com/MichaelSev/DICAROS
For updates on Hyperiax, see https://github.com/ComputationalEvolutionaryMorphometry/hyperiax
The most recent version of DICAROS, kept compatible with Hyperiax, can be found at:
https://github.com/ComputationalEvolutionaryMorphometry/hyperiax/blob/main/examples/DICAROS.ipynb
When applicable, directory names follow this pattern:
(01)_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_(06)_depth_5_sigma_(07) where:
- (01) denotes the root shape, either birdbeak, butterfly, circle, or sphere
- (02) denotes the tree structure, either symmetric (sym) or asymmetric (asym)
- (03) denotes the tree size, referring to the number of leaves
- (04) denotes the kalpha value used for shape simulation
- (05) denotes the ksigma value used for shape simulation
- (06) denotes the number of landmarks in each shape
- (07) denotes the kernel value used for reconstruction with DICAROS
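The naming scheme above can be unpacked programmatically. The following is a minimal sketch, not part of the repository: the function name `parse_run_directory` and the example directory name are hypothetical, and all values are kept as strings because this README does not state how numbers are formatted in directory names. The trailing `_sigma_(07)` part is treated as optional, since the directories in fig4_5_root_comparison end at `depth_5`.

```python
import re

# Pattern mirrors the documented scheme; values are kept as strings because
# the README does not state how numbers are formatted in directory names.
DIR_PATTERN = re.compile(
    r"^(?P<root_shape>[^_]+)_(?P<tree>[^_]+)_leaf_(?P<n_leaves>[^_]+)"
    r"_kalpha_(?P<kalpha>[^_]+)_ksigma_(?P<ksigma>[^_]+)"
    r"_nlandmarks_(?P<n_landmarks>[^_]+)_depth_5"
    r"(?:_sigma_(?P<sigma>[^_]+))?$"
)


def parse_run_directory(name):
    """Return the fields encoded in a directory name, or None if it does not match."""
    match = DIR_PATTERN.match(name)
    return match.groupdict() if match else None


# Hypothetical directory name, made up for illustration only.
example = "Circle_sym_leaf_16_kalpha_0.01_ksigma_0.5_nlandmarks_30_depth_5_sigma_0.3"
print(parse_run_directory(example))
```

Names that do not follow the scheme (for example `treeform.txt`) return `None`, so the function can be used as a filter while walking an unzipped archive.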
When applicable, file names follow this pattern
pred_root_iteration[N].txt, where N denotes the iteration number
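When collecting these files, sorting by the numeric value of N avoids the lexicographic trap where iteration 10 sorts before iteration 2. The sketch below is illustrative only: the helper names are hypothetical, and the pattern accepts N both with and without literal square brackets, since this README does not say whether the brackets are part of the actual file names.

```python
import re

# Accepts both "pred_root_iteration7.txt" and a literal "pred_root_iteration[7].txt",
# since the README does not say whether the brackets are part of the name.
ITERATION_PATTERN = re.compile(r"^pred_root_iteration\[?(\d+)\]?\.txt$")


def iteration_number(file_name):
    """Return N as an int, or None for files that do not follow the pattern."""
    match = ITERATION_PATTERN.match(file_name)
    return int(match.group(1)) if match else None


def sort_by_iteration(file_names):
    """Keep only reconstructed-root files and order them numerically by N."""
    numbered = [(iteration_number(name), name) for name in file_names]
    return [name for _, name in sorted(p for p in numbered if p[0] is not None)]


# Example listing of one run directory (file names chosen for illustration).
names = [
    "pred_root_iteration10.txt",
    "treeform.txt",
    "pred_root_iteration2.txt",
    "true_root.txt",
]
print(sort_by_iteration(names))
```

Files such as `treeform.txt` and `true_root.txt` are dropped from the result, so the returned list contains only the reconstructed roots in iteration order.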
File: fig3_kernel_test.zip
Description: Reproduces Figure 3, which tests the robustness of DICAROS to different kernel sizes.
The directory contains 26,076 files and 290 folders.
The directory tree structure and main files are visible below, with explanations in parentheses.
fig3_kernel_test
├── 01_reconstructed_roots
│ ├── Birdbeak
│ │ └── Birdbeak_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_79_depth_5_sigma_(07)
│ │ ├── pred_root_iteration[N].txt (reconstructed root)
│ │ ├── treeform.txt (tree structure)
│ │ └── true_root.txt (original root shape)
│ ├── Butterfly
│ │ └── Butterfly_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_118_depth_5_sigma_(07)
│ │ ├── pred_root_iteration[N].txt (reconstructed root)
│ │ ├── treeform.txt (tree structure)
│ │ └── true_root.txt (original root shape)
│ ├── Circle
│ │ └── Circle_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_30_depth_5_sigma_(07)
│ │ ├── pred_root_iteration[N].txt (reconstructed root)
│ │ ├── treeform.txt (tree structure)
│ │ └── true_root.txt (original root shape)
│ ├── Sphere
│ │ └── Sphere_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_50_depth_5_sigma_(07)
│ │ ├── pred_root_iteration[N].txt (reconstructed root)
│ │ ├── treeform.txt (tree structure)
│ │ └── true_root.txt (original root shape)
├── start_shapes (root shape for shape simulation across the phylogeny - all shapes are normalized)
│ ├── Birdbeak
│ │ └── Birdbeak.csv (start shape for simulation)
│ ├── Butterfly
│ │ └── Butterfly.csv (start shape for simulation)
│ ├── Circle
│ │ └── Circle.csv (start shape for simulation)
│ └── Sphere
│ └── (start shape for simulation)
├── 01_sim_shape_sigma.sh (bash file to execute Python scripts for varying kernel)
├── 02_plot_results.ipynb (plot the results from directories 01_reconstructed_roots)
├── simulate_shape_changing_sigma.py (called by 01_sim_shape_sigma.sh )
└── simulate_shape_changing_sigma_3d.py
File: fig4_5_root_comparison.zip
Description: Reproduces Figures 4 and 5 of the manuscript, which compare DICAROS to other methods from the R packages geomorph, phytools, and mvMORPH.
fig4_5_root_comparison
├── 01_simulated_shapes_and_reconstruction
│ ├── Birdbeak
│ │ └── Birdbeak_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_79_depth_5
│ │ ├── pred_root_iteration[N].txt (reconstructed root by DICAROS)
│ │ ├── landmark_iteration[N].txt (leaf shapes used for reconstruction)
│ │ ├── treeform.txt (tree structure)
│ │ └── true_root.txt (original root shape)
│ ├── Butterfly (same format as Birdbeak)
│ ├── Circle (same format as Birdbeak)
│ └── Sphere (same format as Birdbeak)
├── 02_mean_distances (results for distance estimation between DICAROS, phytools, mvMORPH, and geomorph)
│ ├── Birdbeak
│ │ ├── Birdbeak_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_79_depth_5.csv (the mean distance per method)
│ │ └── Birdbeak_(02)_leaf_(03)_kalpha_(04)_ksigma_(05)_nlandmarks_79_depth_5_sum.csv (the summed distance per method)
│ ├── Butterfly (same format as Birdbeak)
│ ├── Circle (same format as Birdbeak)
│ └── Sphere (same format as Birdbeak)
├── start_shapes (root shape for shape simulation across the phylogeny - all shapes are normalized)
│ ├── Birdbeak
│ │ └── Birdbeak.csv (start shape for simulation)
│ ├── Butterfly
│ │ └── Butterfly.csv (start shape for simulation)
│ ├── Circle
│ │ └── Circle.csv (start shape for simulation)
│ └── Sphere
│ └── (start shape for simulation)
├── 01_simulate_shapes.sh (bash file to execute Python scripts for varying parameters)
├── 02_R_comparison.sh (reconstruct root using Geomorph, Phytools, and mvMORPH and save output to 02_mean_distances )
├── 03_plot_results.ipynb (plot the results from directories 02_mean_distances)
├── simulate_shape_changing_sigma.py (called by 01_simulate_shapes.sh)
└── simulate_shape_changing_sigma_3d.py (called by 01_simulate_shapes.sh)
File: fig8_emperical_kernel.zip
Description: Reproduces Figure 8, which shows the reconstruction for different kernel widths.
fig8_emperical_kernel
├── butterfly_kernel_width_example.ipynb
├── Parides_photinus.txt (original shape)
├── root_yss.txt (reconstructed roots y-coordinates)
└── root_xss.txt (reconstructed roots x-coordinates)
File: help_functions.zip
Description: Helper functions shared by all code in this repository.
help_functions
├── align_shapes.py (functions to perform Procrustes alignment in Python)
├── DICAROS.py (functions to perform DICAROS in 2d)
├── DICAROS_3d.py (functions to perform DICAROS in 3d)
├── DICAROS_phylomorphospace.py (functions related to phylomorphospace using DICAROS in Figure 7)
├── image_manipulation.py (functions related to the trajectory example shown in Figure 9)
├── kunita_flow.py (functions related to performing shape simulation)
├── load_func.R (functions related to loading in shapes in R)
├── SDE (functions related to performing shape simulation, see jaxgeometry for details )
└── writing_functions.py (functions related to writing simulation results to files)
Code/Software
Dependencies
- Jax Geometry: A library for differential geometry computations
- Repository: jaxgeometry
- Version: 0.9.4
- Hyperiax: A library for tree traversals and computations on trees using JAX
- Repository: hyperiax
- Version: 1.0.1
- A version specific to this project can be found at https://github.com/MichaelSev/DICAROS
Installation
- Create and activate a conda environment:
bash
conda create -n DICAROS python=3.13
conda activate DICAROS
- Install required packages:
bash
pip install "jax[cuda12]==0.4.34" jaxlib==0.4.34
pip install jaxdifferentialgeometry==0.9.4
pip install numpy==1.26.4 pandas==2.2.3 matplotlib==3.8.4 scikit-learn==1.4.2
pip install HeapDict==1.0.1
- For the R simulations, we used the following versions
R
# install.packages() has no version argument, so remotes::install_version() is used to pin each release
install.packages("remotes", repos = "http://cran.us.r-project.org")
remotes::install_version("geomorph", version = "4.0.6", repos = "http://cran.us.r-project.org")
remotes::install_version("phytools", version = "2.3-0", repos = "http://cran.us.r-project.org")
remotes::install_version("mvMORPH", version = "1.2.1", repos = "http://cran.us.r-project.org")
remotes::install_version("readxl", version = "1.4.5", repos = "http://cran.us.r-project.org")
remotes::install_version("foreach", version = "1.5.2", repos = "http://cran.us.r-project.org")
remotes::install_version("doParallel", version = "1.0.17", repos = "http://cran.us.r-project.org")
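After installation, it can be useful to confirm that the active environment matches the pinned Python versions listed above. The following is a small sketch using only the standard library; the helper names `installed_version` and `find_mismatches` are hypothetical and not part of DICAROS, and the distribution names are assumed to be the same as those passed to pip.

```python
from importlib import metadata

# Python pins copied from the pip commands in this README.
PINNED = {
    "jax": "0.4.34",
    "jaxlib": "0.4.34",
    "jaxdifferentialgeometry": "0.9.4",
    "numpy": "1.26.4",
    "pandas": "2.2.3",
    "matplotlib": "3.8.4",
    "scikit-learn": "1.4.2",
    "HeapDict": "1.0.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (pinned, found)."""
    return {
        name: (wanted, lookup(name))
        for name, wanted in pinned.items()
        if lookup(name) != wanted
    }


if __name__ == "__main__":
    # Prints nothing when every pinned package is installed at the pinned version.
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: pinned {wanted}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without touching the real environment; in normal use it can be left at its default.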
Sharing/Access information
Other publicly accessible locations of the data:
Data was derived from the following sources:
- The original butterfly images were obtained from GBIF.org.
- The source of each butterfly image is detailed in the metadata file within "Papilnodae_dataset.", which can be found in the GitHub repository above.
