
Data from: Sampling from commercial vessel routes can capture marine biodiversity distributions effectively

Cite this dataset

Boyse, Elizabeth; Beger, Maria; Valsecchi, Elena; Goodman, Simon (2023). Data from: Sampling from commercial vessel routes can capture marine biodiversity distributions effectively [Dataset]. Dryad.


Collecting fine-scale occurrence data for marine species across large spatial scales is logistically challenging but is important to determine species distributions and for conservation planning. Inaccurate descriptions of species ranges could result in designating protected areas with inappropriate locations or boundaries. Optimising sampling strategies, therefore, is a priority for scaling up survey approaches using tools such as environmental DNA (eDNA) to capture species distributions. In a marine context, commercial vessels, such as ferries, could provide sampling platforms allowing access to under-sampled areas and repeatable sampling over time to track community changes. However, sample collection from commercial vessels could be biased and may not represent biological and environmental variability. Here, we evaluate whether sampling along Mediterranean ferry routes can yield unbiased biodiversity survey outcomes, based on perfect knowledge from a stacked species distribution model (SSDM) of marine megafauna from online data repositories. Simulations to allocate sampling point locations were carried out representing different sampling strategies (random vs regular), frames (ferry routes vs unconstrained) and number of sampling points. SSDMs were remade from different sampling simulations and compared to the ‘perfect knowledge’ SSDM to quantify the bias associated with different sampling strategies. Ferry routes detected more species and were able to recover known patterns in species richness at smaller sample sizes better than unconstrained sampling points. However, to minimise potential bias, ferry routes should be chosen to cover the variability in species composition and its environmental predictors in the SSDMs. The workflow presented here can be used to design effective sampling strategies using commercial vessel routes globally, including for eDNA analyses. 
This approach has the potential to provide a cost-effective method for accessing remote oceanic areas on a regular basis and can recover meaningful data on spatiotemporal biodiversity patterns.
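The difference between the two sampling strategies (random vs regular) and the two frames (ferry routes vs unconstrained) can be illustrated with a minimal Python sketch. This is not the authors' simulation code: the toy grid, the straight-line "route", and the function names `random_sample` and `regular_sample` are all hypothetical, and "regular" here simply means evenly spaced along the order in which the frame's cells are listed.

```python
import random


def random_sample(frame, n, seed=0):
    """Draw n distinct sampling cells at random from the frame."""
    if n > len(frame):
        raise ValueError("cannot draw more points than the frame contains")
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(frame, n)


def regular_sample(frame, n):
    """Pick n cells evenly spaced along the listed order of the frame."""
    if n > len(frame):
        raise ValueError("cannot draw more points than the frame contains")
    step = len(frame) / n
    return [frame[int(i * step)] for i in range(n)]


# Hypothetical 10 x 5 grid of cells standing in for the study area
# (the unconstrained frame), indexed as (column, row).
grid = [(x, y) for y in range(5) for x in range(10)]

# Hypothetical ferry route: a single row of cells crossing the grid.
route = [(x, 2) for x in range(10)]

route_regular = regular_sample(route, 5)
route_random = random_sample(route, 5, seed=1)
grid_random = random_sample(grid, 5, seed=1)

print(route_regular)
print(route_random)
print(grid_random)
```

In a real workflow the frame would be the set of raster cells crossed by actual ferry routes (or all marine cells for the unconstrained case), and each set of sampled cells would feed a rebuilt SSDM for comparison against the reference model.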


This dataset includes binary species distribution models for 43 species of marine predators (9 mammals, 13 elasmobranchs, 20 fishes, and one turtle) from the Mediterranean Sea, and a binary stacked species distribution model showing the species richness of all marine predators. Models are available at 0.083° × 0.083° resolution in a WGS84 projection. Species distribution models were built from occurrence data collated from the publicly available data sources GBIF, OBIS, EurOBIS, and ACCOBAMS, and from the Medlem database, which is available upon request from its authors, together with environmental predictors from Bio-Oracle and Marspec. Occurrence records were quality-checked prior to modelling: records whose GPS coordinates had fewer than three decimal places were removed, as were duplicate records sharing the same species, coordinates, year and month. Records were then manually filtered to identify those with the same species, year and month but different coordinates, which may result from rounding differences between the datasets.
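The two automated quality checks described above (coordinate precision and duplicate removal) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the field names, the function names `decimal_places` and `filter_records`, and the example records are invented. Coordinates are kept as text so that trailing zeros still count as recorded precision, and a record is dropped if either coordinate has fewer than three decimal places.

```python
def decimal_places(value):
    """Count the digits after the decimal point in a coordinate given as text."""
    _, _, fraction = value.partition(".")
    return len(fraction)


def filter_records(records, min_decimals=3):
    """Drop low-precision records, then drop exact duplicates.

    Assumption: a record is low precision if EITHER coordinate has fewer
    than min_decimals decimal places. Duplicates share the same species,
    coordinates, year and month; the first occurrence is kept.
    """
    seen = set()
    kept = []
    for rec in records:
        if (decimal_places(rec["lat"]) < min_decimals
                or decimal_places(rec["lon"]) < min_decimals):
            continue
        key = (rec["species"], rec["lat"], rec["lon"], rec["year"], rec["month"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Invented example records, for illustration only.
records = [
    {"species": "Tursiops truncatus", "lat": "43.1234", "lon": "9.4321", "year": 2015, "month": 6},
    {"species": "Tursiops truncatus", "lat": "43.1234", "lon": "9.4321", "year": 2015, "month": 6},
    {"species": "Caretta caretta", "lat": "40.1", "lon": "12.5678", "year": 2017, "month": 7},
    {"species": "Caretta caretta", "lat": "38.2500", "lon": "15.6000", "year": 2018, "month": 8},
]

cleaned = filter_records(records)
for rec in cleaned:
    print(rec["species"], rec["lat"], rec["lon"], rec["year"], rec["month"])
```

The manual step, finding records with the same species, year and month but slightly different coordinates, is deliberately left out, since the description gives no distance threshold for it.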

Usage notes

The .tiff files can be opened with GIS software or in R using the raster package.


Funding

Leeds Doctoral Scholarship