This file explains the software programs associated with our paper "Practical performance of tree comparison metrics" (Kuhner and Yamato). These programs are in the forms used to write the paper, and are not particularly user-friendly or robust to unexpected input. We will develop a user-friendly version of the tree comparison measures for release in the future.

*** PYTHON PROGRAMS

These Python programs, written by Mary Kuhner and Jon Yamato, compute distance measures on trees and tabulate and graph the results. They are archived as kuhner_pyprogs.tgz (tar gzip file).

They use several external Python packages, specifically numpy, ete2, munkres, and matplotlib. These must be installed before the programs will run. All programs with "_graph" in the name will attempt to open windows and display graphics in them, and will fail if the environment does not permit this.

PROGRAMS IMPLEMENTING DISTANCE MEASURES

dist_fns.py -- contains routines for most of the distance measures used in this study, as well as utility routines for tree handling
mast.py -- routines for the Maximum Agreement SubTree measure
align.py -- routines for the Align measure
nodedist.py -- routines for the Node measure
dist_harness.py -- test harness for all of the distance-measure routines

PROGRAMS APPLYING DISTANCE MEASURES TO DATA

inf_analyze.py -- apply distance measures to data generated for the "bullseye" experiment in our paper
inf_graph.py -- graph the results of inf_analyze.py
n_analyze.py -- apply distance measures to data generated for the "n-away" experiment in our paper
n_graph.py -- graph the results of n_analyze.py

SAMPLE APPLICATION

paper_samples.py -- a tiny program which applies the 9 distance measures to a pair of hard-coded trees. This may be useful as a guide to incorporating the routines in your own code. The example used corresponds to Figure 1 in our paper.

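The package requirements for the Python programs (numpy, ete2, munkres, and matplotlib) can be checked before running anything. The following is a minimal sketch and is not part of the archived programs. It assumes that each package's import name is the same as the package name given in this file; the function name missing_packages and the list REQUIRED are hypothetical names used only for illustration.

```python
# Sketch only: report which required packages cannot be imported.
# Assumption: import names match the package names listed in this README.
REQUIRED = ["numpy", "ete2", "munkres", "matplotlib"]


def missing_packages(names):
    """Return the names in `names` that fail to import, in order."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            # The package is not installed for this interpreter.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the same Python interpreter that will be used to run the programs, since packages are installed per interpreter. A successful import does not guarantee that the "_graph" programs can open windows in your environment.
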
*** C PROGRAMS

These C programs, written by Joe Felsenstein and archived here with his permission, were used to simulate genetic data for the "bullseye" experiment. The versions archived here have been modified by Mary Kuhner and Jon Yamato to read their input from a parameter file (and to read ancestral recombination graphs, a capability not used here). They are archived as kuhner_cprogs.tgz (tar gzip file).

For the "n-away" experiment we used the tree simulation program ms, available from Richard Hudson, and the DNA simulation program seq-gen.

We include all needed parameter files for a toy case which simulates trees and DNA for 5 tips and 2000 bp. The commands needed to run this case, including compilation commands (using the gcc compiler) and all file manipulation required, are given in commands.txt.

PROGRAMS AND PARAMETER FILES

rantree.c -- simulate phylogenetic trees from a branching-process or coalescent model. Produces Newick format trees.
ranparm -- annotated parameter file for rantree
rectreedna.c -- simulate DNA data under the Kimura 2-parameter model on a given Newick-format tree.
dnaparm -- annotated parameter file for rectreedna
seedfile -- random number seed file used by rantree and rectreedna; output is deterministic given this seed.
tree_header.txt -- line to prepend to the output of rantree in order to prepare it for input into rectreedna; should consist of the range of sites to simulate, e.g. 1 2000 to simulate 2000 bp of DNA.
commands.txt -- sample commands to run these programs

For more information about these programs or assistance with their use, please contact Mary Kuhner: mkkuhner@uw.edu
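
As an illustration of the tree_header.txt step described above, the following minimal sketch prepends a site-range line such as "1 2000" to a block of Newick trees. It is not part of the archived programs, and commands.txt remains the authoritative description of the file manipulation we used. The function names prepend_header and prepend_header_file are hypothetical, and the sketch assumes that the header must simply appear on its own line before the rantree output.

```python
# Sketch only: prepend the site-range header to rantree output.
# Assumption: the header occupies one line placed before the trees.


def prepend_header(header_line, trees_text):
    """Return trees_text with header_line placed on its own first line."""
    return header_line.rstrip("\n") + "\n" + trees_text


def prepend_header_file(header_path, trees_path, out_path):
    """File-based wrapper; all three paths are supplied by the caller."""
    with open(header_path) as f:
        header_line = f.read()
    with open(trees_path) as f:
        trees_text = f.read()
    with open(out_path, "w") as f:
        f.write(prepend_header(header_line, trees_text))
```

For example, prepend_header_file("tree_header.txt", "trees.txt", "trees_with_header.txt") would write the combined file, where the last two file names are placeholders rather than names used in commands.txt.
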