Data from: Global tracking of marine megafauna space use reveals how to achieve conservation targets

Sequeira, Ana M. M.1 ; Rodriguez, Jorge P.2

Published Apr 24, 2026 on Dryad. https://doi.org/10.5061/dryad.x95x69ptv

Data files

Apr 24, 2026 version files 53.96 MB

1_Tracked_Individuals.zip (958.85 KB)
2_Detected_behaviours.zip (2.43 MB)
3_Areas_for_MPA_extension.zip (189.61 KB)
4_Anthropogenic_threats.txt (84 B)
5_Supplementary_Tables.zip (60.79 KB)
6_Spatial_Extent_of_Space-Use_by_Species.zip (50.30 MB)
README.md (10.93 KB)

Abstract

The recent Kunming-Montreal Global Biodiversity Framework (GBF) sets ambitious goals, but no clear pathway for how the zero loss of important biodiversity areas and halting human-induced extinction of threatened species will be achieved. We assembled a multi-taxa tracking dataset (11 million geopositions from 15,845 tracked individuals across 121 species) to provide a global assessment of space use of highly mobile marine megafauna, showing that 63% of the area they cover is used 80% of the time as important migratory corridors or residence areas. The GBF 30% threshold (Target 3) will be insufficient for marine megafauna’s effective conservation, leaving important areas exposed to major anthropogenic threats. Coupling area protection with mitigation strategies (e.g., fishing regulation, wildlife-traffic separation) will be essential to reaching international goals and conserving biodiversity.


Description of the data and file structure

The dataset used for the MegaMove paper by Sequeira, Rodriguez & 376 co-authors was collected to capture movement of marine megafauna across the global ocean and was analysed at a global scale at 1 degree resolution.

All data and source code used for the MegaMove paper by Sequeira, Rodriguez & 376 co-authors are provided in Dryad and Zenodo. Dryad holds six folders with the datasets needed to reproduce the results and figures; for licensing reasons, one of these folders has been replaced with a text file indicating that the folder must be downloaded from Zenodo instead. Zenodo holds five folders with the code and data used for all analyses, plus the folder that was replaced with a text file in Dryad.
Empty grid-cells indicate 'no data', which can represent locations where no tracking data were collected, where residency or migratory behaviours were not identified, where there are no environmental data (e.g., due to cloud cover), or where there are no anthropogenic threats data.

Files and variables

File: 1_Tracked_Individuals.zip

Description: Folder containing files with number of tracked individuals in each grid cell used to create all figures containing this information. The folder also includes a sub-folder called "tests", with the additional datasets used for testing effects of biases.

Latitude: latitude in degrees
Longitude: longitude in degrees
nind: number of individuals in each grid cell
{sp}_nind: number of individuals of species {sp} in each grid cell
{tx}_nind: number of individuals of taxon {tx} in each grid cell
Nspec_eff: effective number of species
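
As an illustration of how these variables might be read, the following minimal Python sketch parses a grid file and skips empty ('no data') cells. The sample content, the comma delimiter and the helper name read_grid are assumptions made for this example; the actual file names and layouts inside the zip are not listed here.

```python
import csv
import io

# Hypothetical sample that mimics the documented columns. The real files in
# 1_Tracked_Individuals.zip may use a different delimiter or column order.
SAMPLE_CSV = """Latitude,Longitude,nind
-35.5,150.5,12
-34.5,150.5,
-34.5,151.5,3
"""


def read_grid(handle, value_column="nind"):
    """Return {(latitude, longitude): value}, skipping empty cells."""
    grid = {}
    for row in csv.DictReader(handle):
        raw = (row.get(value_column) or "").strip()
        if raw == "" or raw.upper() in {"NA", "NAN"}:
            continue  # an empty grid cell indicates 'no data'
        grid[(float(row["Latitude"]), float(row["Longitude"]))] = float(raw)
    return grid


grid = read_grid(io.StringIO(SAMPLE_CSV))
# Cell with the largest number of tracked individuals in the sample.
busiest_cell = max(grid, key=grid.get)
print(len(grid), busiest_cell)
```

The same helper could be pointed at a species or taxon column by passing, for example, value_column="{sp}_nind" with {sp} replaced by the relevant species code.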

File: 2_Detected_behaviours.zip

Description: Folder containing data on the area used and time spent in each behaviour (i.e., migration or residency), with sub-folders containing shapefiles of the corridors, residences and IMMegAs detected in the paper. For the file Time_Behaviours.csv, the units are percentage of time (in the respective behaviours).

File: 3_Areas_for_MPA_extension.zip

Description: csv files with the grid cells selected for protection, based on the data and on the models

fmpa: fraction of the grid cell area covered by marine protected areas
toprot: grid-cells to be included in the 30% of protected area (if value equals 1).
Ecol: represents the areas of ecological interest, i.e. IMMegAs (if value equals 1).
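
A minimal Python sketch of how the flags above might be used to extract the selected cells. The sample rows, the presence of Latitude and Longitude columns, and the helper name is_flagged are assumptions for illustration only.

```python
import csv
import io

# Hypothetical rows using the documented variable names; the real csv files
# may contain additional columns.
SAMPLE = """Latitude,Longitude,fmpa,toprot,Ecol
10.5,-60.5,0.25,1,1
11.5,-60.5,0.00,1,0
12.5,-60.5,0.00,0,1
13.5,-60.5,1.00,0,0
"""


def is_flagged(value):
    """True when a flag column holds the value 1; empty cells count as unflagged."""
    raw = (value or "").strip()
    return raw != "" and float(raw) == 1.0


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
# Cells to be included in the 30% of protected area.
selected = [r for r in rows if is_flagged(r["toprot"])]
# Selected cells that are also areas of ecological interest (IMMegAs).
selected_ecological = [r for r in selected if is_flagged(r["Ecol"])]
share_selected = len(selected) / len(rows)
print(len(selected), len(selected_ecological), share_selected)
```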

File: 4_Anthropogenic_threats.txt

Description: This text file explains that a zipped folder with the anthropogenic threats data ("4_Anthropogenic threats.zip") needs to be downloaded from Zenodo. That zipped folder contains csv files of the major anthropogenic threats considered, expressed as averages and scored from high (10) to low (1).

File: 5_Supplementary_Tables.zip

Description: Six supplementary tables, as indicated in the manuscript:

Supplementary Table S1 Summary of the satellite tracking dataset
Supplementary Table S2 Tag deployment information and metadata
Supplementary Table S3 Morphometric data and maximum travel speeds
Supplementary Table S4 List of threats and existing assessment data
Supplementary Table S5 Conservation management measures
Supplementary Table S6 References included in supplementary tables

File: 6_Spatial_Extent_of_Space-Use_by_Species.zip

Description: Maps showing the spatial extent of space-use by each of the 111 species

Code/software

The code uploaded to Zenodo is written in R, C++ and Python (Jupyter notebooks). It is organized in the folders behavior_detection, models and optimization, plus a folder with auxiliary code. The contents of each folder are described below:

Behavior detection

This folder includes code to detect residence areas and corridors from displacement data derived from the tracking. The code files provided should be run sequentially (from 01 to 08). Their inputs and outputs are in the ../source_data/ folder.

The functions of the code files are the following:

- 01_averagedisplbyspecies_multiscale.ipynb: to assign to each individual the average displacement of its species.

- 02_normalizedisplacementsmultiscale.cc: to normalize the displacements with species average.

- 03_coherencebytaxa.ipynb: to measure the average coherence by taxa and, based on that, select the time lag T.

- 04_extractdispl.ipynb: to extract the displacements associated with the selected T (in 03_coherencebytaxa), so that lighter files can be read in later steps.

- 05_searchcorridors.ipynb: to extract corridors from displacements data.

- 06_averagedisplbyspecies_SD.ipynb: to assign to each individual the average displacement of its species and its standard deviation.

- 07_residence_selectedT_multiscale.ipynb: to extract residency areas as locations with displacements lower than average in 1 SD or less.

- 08_combineoutput.ipynb: to organize information at cell level: presence, residence and corridor by taxa.
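
Steps 01 and 02 can be pictured with the following minimal Python sketch. It is not the code from the repository: the record layout, the function names, the use of a pooled mean over all displacements of a species, and normalization by division are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (individual_id, species, displacement_km).
records = [
    ("a1", "species_A", 10.0),
    ("a1", "species_A", 30.0),
    ("a2", "species_A", 20.0),
    ("b1", "species_B", 100.0),
    ("b1", "species_B", 300.0),
]


def species_means(records):
    """Average displacement per species, pooled over all its individuals."""
    by_species = defaultdict(list)
    for _, species, displacement in records:
        by_species[species].append(displacement)
    return {species: mean(values) for species, values in by_species.items()}


def normalize(records):
    """Divide each displacement by the average displacement of its species."""
    means = species_means(records)
    return [(ind, sp, d / means[sp]) for ind, sp, d in records]


normalized = normalize(records)
for row in normalized:
    print(row)
```

After this kind of normalization, a value of 1 corresponds to the species average, which makes displacements comparable across species with very different movement ranges.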

Models

This folder includes the code to run the models used in the manuscript to predict areas used for residence or corridors. The code is organised in three separate files written in R language (with extension .R), each corresponding to:

- MegaMove_GLM Models_Corridor.R: code to predict Corridors

- MegaMove_GLM Models_Residence.R: code to predict Residences

- MegaMove_GLM_Models_Prediction Manuscript Maps.R: code to reproduce maps with prediction results (stored in the folder Final Maps)

Input data used to run the code is in ~/source_data and includes two files for each code file:

- Prevalent_RandList: a list of 7 random sets of prevalent presence and absence grid-cells indicating the existence of the behaviour (migration reflecting Corridors, or residency reflecting Residences), with the corresponding environmental variables needed to run the models for each taxon

- PresenceList: a list of 7 sets of all presence grid-cells derived from the tracking dataset, with the corresponding environmental variables used to generate the model predictions for each taxon

Outputs from the R code are stored in the folder models/Output, which should contain two subfolders, each to store the results for the Corridors model and the Residences model per taxa. Each of these subfolders should also contain a subfolder to store the Predictions results, as follows:

- CORRIDORvNONE_PerTaxa / PREDICTIONS

- RESIDENCEvNONE_PerTaxa / PREDICTIONS

The summary results will be written to the Output folder, while the remaining results will be stored inside the respective subfolders.
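
If the output folders do not exist yet, they could be created with a short Python helper such as the sketch below. The function name make_output_tree is hypothetical, and the demo writes into a temporary directory; in practice the argument would be the path to your copy of the models folder.

```python
import tempfile
from pathlib import Path


def make_output_tree(models_dir):
    """Create Output/<model>/PREDICTIONS for both models; return the paths."""
    output = Path(models_dir) / "Output"
    created = []
    for model in ("CORRIDORvNONE_PerTaxa", "RESIDENCEvNONE_PerTaxa"):
        target = output / model / "PREDICTIONS"
        # parents=True also creates Output/ and the per-model subfolder.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


# Demo in a throwaway directory so that nothing is written to the real tree.
with tempfile.TemporaryDirectory() as tmp:
    for path in make_output_tree(tmp):
        print(path.relative_to(tmp))
```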

Optimization

This folder includes the code file optimization_algorithm.ipynb, which selects cells to protect, according to the behaviors detected in the IMMegAs, in order to reach protection of 30% of the area covered by the MegaMove dataset.
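
To give a feel for what selecting cells up to a 30% area target involves, here is a simple greedy illustration in Python. It is not the algorithm in optimization_algorithm.ipynb: the per-cell score, the function names and the greedy ranking are assumptions. The only geometric fact used is that the area of a 1-degree cell is approximately proportional to the cosine of its central latitude.

```python
import math


def cell_area_weight(lat_center):
    """Approximate relative area of a 1-degree cell (proportional to cos(lat))."""
    return math.cos(math.radians(lat_center))


def greedy_select(cells, target_fraction=0.30):
    """Pick the highest-scoring cells until the area target is reached.

    cells: list of dicts with 'lat' (cell-centre latitude) and 'score'
    (a hypothetical priority, e.g. how many behaviours were detected).
    Returns the selected cells and the fraction of total area they cover.
    """
    total = sum(cell_area_weight(c["lat"]) for c in cells)
    selected, covered = [], 0.0
    for cell in sorted(cells, key=lambda c: c["score"], reverse=True):
        if covered >= target_fraction * total:
            break
        selected.append(cell)
        covered += cell_area_weight(cell["lat"])
    return selected, covered / total


# Eight hypothetical cells on the same latitude band, with scores 1 to 8.
cells = [{"lat": 0.5, "lon": i + 0.5, "score": i + 1} for i in range(8)]
selected, fraction = greedy_select(cells)
print([c["score"] for c in selected], round(fraction, 3))
```

Because whole cells are added one at a time, a greedy pass like this can overshoot the target slightly; with eight equal-area cells it stops at three cells.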

Source Data

The 'source data' folder includes all input data for each of the code files.

Auxiliary

This folder includes code to process data for some of the testing done and to calculate displacements. The following three files are included:

- displacementsmultiscale.cc: to measure displacements at different time lags T.

- filtertagloc.cc: to filter positions around the tagging location.

- randomization.ipynb: to create randomized tracking datasets
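
The idea of measuring displacements at different time lags T can be sketched in Python as follows. This is an illustration rather than a port of displacementsmultiscale.cc: it assumes positions sampled at regular time steps, expresses the lag as a number of steps, and uses the haversine great-circle distance with a mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def displacements_at_lag(track, lag):
    """Distances between positions separated by `lag` steps along a track.

    track: list of (latitude, longitude) tuples at regular time steps.
    """
    return [
        haversine_km(*track[i], *track[i + lag])
        for i in range(len(track) - lag)
    ]


# Hypothetical track moving one degree east along the equator per step.
track = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
for lag in (1, 2):
    print(lag, [round(d, 1) for d in displacements_at_lag(track, lag)])
```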

How to run the code

C++

You need to have installed a C++ compiler, for example g++.

To install g++ (in Ubuntu), the following commands need to be executed:

sudo apt-get update
sudo apt-get install g++

The code can then be compiled and run as follows, replacing <codename> with the name of the specific code file to execute:

g++ <codename>.cc -o torun
./torun

Python (notebooks)

Code can be run using Jupyter notebooks. We recommend creating a new environment (named megamove) with the packages needed to run them:

conda create --name megamove

To install the needed packages:

conda activate megamove
conda install -c anaconda ipykernel
python -m ipykernel install --user --name=megamove
conda install jupyter
conda install pandas=1.2.4
conda install matplotlib=3.4.2
conda install geopandas=0.9.0

To open jupyter notebooks:

jupyter notebook

Then, you can click on each notebook and select the megamove kernel to run it.

R

The simplest way to run the code is to:

1. Download R from https://www.r-project.org/ and install it. (The latest version used to run the models at the time of submission was R 4.4.1)

2. Copy the folder "models" to your computer.

3. Open the code files (this can be done with Notepad), and edit the start of each code file under ### Set up working directory, so you set your working directory as the path to where you copied the models folder, i.e., replace "~/models/" with the full path to the folder, in:

### Set up working directory:

 wd <- ("~/models/")

4. Open R

5. Sequentially copy the code from each code file into R to reproduce all the results.

Access information

Datasets used in the paper were derived from multiple sources, including:

E.U. Copernicus Marine Service Information (CMEMS) Marine Data Store (MDS) Global Ocean Physics Reanalysis (DOI: 10.48670/moi-00021; accessed Dec 2020)
E.U. Copernicus Marine Service Information (CMEMS) Global Ocean Biology Hindcast replaced in July 2022 by the Global Ocean Biogeochemistry Hindcast (DOI: 10.48670/moi-00019; accessed Dec 2020)
NASA Ocean Biology Processing Group Level-3 SeaWiFS (1998-2003) and MODIS-Aqua (2003-2018) Ocean Color Data (NASA Ocean Biology Distributed Active Archive Center, https://oceandata.sci.gsfc.nasa.gov/opendap/SeaWiFS/L3SMI/ and https://oceandata.sci.gsfc.nasa.gov/opendap/MODISA/L3SMI/contents.html; accessed Dec 2020)
European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim Reanalysis product (https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-interim; accessed Dec 2020)
Global Fishing Watch, https://www.globalfishingwatch.org
Marine AIS Data, https://www.exactearth.com.
E. van Sebille et al., A global inventory of small floating plastic debris. Environ Res Lett 10, (2015).
J. Hansen, R. Ruedy, M. Sato, K. Lo, Global surface temperature change. Reviews in Geophysics 48, 1-29 (2010).
NASA Goddard Institute for Space Studies, GISS Surface Temperature Analysis (GISTEMP). https://data.giss.nasa.gov/gist, (2021).