# Data from: Solution of the Generalized Noah's Ark Problem

## Citation

Billionnet, Alain (2012), Data from: Solution of the Generalized Noah's Ark Problem, Dryad, Dataset, https://doi.org/10.5061/dryad.7q804

## Abstract

The phylogenetic diversity (PD) of a set of species is a measure of the evolutionary distance among the species in the collection, based on a phylogenetic tree. Such a tree is composed of a root, of internal nodes and of leaves that correspond to the set of taxa under study. With each edge of the tree is associated a non-negative branch length (evolutionary distance). If a particular survival probability is associated with each taxon, the PD measure becomes the expected PD measure. In the Noah’s Ark Problem (NAP) introduced by Weitzman (1998), these survival probabilities can be increased at some cost. The problem is to determine how best to allocate a limited amount of resources to maximize the expected PD of the considered species. It is easy to formulate the NAP as a (difficult) nonlinear 0-1 programming problem. The aim of this article is to show that a general version of the NAP (GNAP) can be solved simply and efficiently with any set of edge weights and any set of survival probabilities by using standard mixed-integer linear programming software. The crucial point to move from a nonlinear program in binary variables to a mixed-integer linear program, is to approximate the logarithmic function by the lower envelope of a set of tangents to the curve. Solving the obtained mixed-integer linear program provides not only a near-optimal solution but also an upper bound on the value of the optimal solution. We also applied this approach to a generalization of the Nature Reserve Problem (GNRP) that consists of selecting a set of regions to be conserved so that the expected phylogenetic diversity of the set of species present in these regions is maximized. In this case, the survival probabilities of different taxa are not independent of each other. Computational results are presented to illustrate potentialities of the approach. Near-optimal solutions with hypothetical phylogenetic trees comprising about 4,000 taxa are obtained in a few seconds or minutes of computing time for the GNAP, and in about 30 minutes for the GNRP. In all the cases the average guarantee varies from 0% to 1.20%.