African elephant reference microsatellite data (v5.32) and sampling locations for geolocation and sample matching
Data files
Apr 16, 2026 version files 338.20 KB
- README.md (2.92 KB)
- REFELE_5.32_filtered_forest.tsv (100.93 KB)
- REFELE_5.32_filtered_savannah.tsv (230.94 KB)
- zones_44_forest.tsv (1.39 KB)
- zones_44_savannah.tsv (2.02 KB)
Abstract
This package contains the Center for Environmental Forensic Science (CEFS) reference elephant microsatellite data, version 5.32, and zone files giving sampling location information, version 44. It does not contain ivory seizure data due to law enforcement sensitivity. The data comprise genotypes for 16 microsatellites for 683 African forest elephants (Loxodonta cyclotis) and 1571 African savannah elephants (Loxodonta africana) from known sampling locations.
These data can be used for geolocation and sample matching, but microsatellite reads must first be calibrated by sharing known samples between labs. They can also be used to survey the genetic structure of African elephant populations. Sampling aimed for wide coverage of all areas where elephants were found between 2006 and 2023, including areas where elephants are no longer present. Duplicate samples from the same sampling zone have been removed because they probably represent the same elephant, but one case of duplicate samples across two different zones has been retained.
Dataset DOI: 10.5061/dryad.2z34tmq12
Description of the data and file structure
This package contains the CEFS reference elephant microsatellite data, version 5.32, used in preparing the linked manuscript. It does not contain ivory seizure data due to law enforcement sensitivity. The data comprise genotypes for 16 microsatellites for 683 African forest elephants (Loxodonta cyclotis) and 1571 African savannah elephants (Loxodonta africana) from known sampling locations. Most samples were dung although a few were ivory, tissue, hair or bone. Coordinates given are for the center of the park or protected area, or midpoint of a group of samples if no protected area was involved; they are not coordinates of individual samples.
Files and variables
File: README.md
Description: Information on file formats and data interpretation.
File: REFELE_5.32_filtered_forest.tsv
Description: Forest elephant reference microsatellite genotypes
Variables
- SID: sample ID (two lines per sample)
- zone: area from which the sample was collected (cross-reference to the zones files)
- All remaining variables are microsatellite genotypes (as fragment lengths, not repeat counts) with missing data coded as -999.
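As a rough illustration of the layout described above, the sketch below groups the rows of a genotype table by sample ID and converts the -999 code to `None`. It assumes the header names are exactly `SID` and `zone`, that fragment lengths are integers, and that the two lines per sample hold the two alleles at each locus; the locus names `locusA` and `locusB` and the demo rows are made up for the example and are not taken from the data files.

```python
import csv
import io

MISSING = "-999"  # missing-data code given in the variable list


def read_genotypes(handle):
    """Group genotype rows by sample ID.

    Returns {SID: {"zone": str, "loci": {locus: [allele, ...]}}}.
    Each allele is an int fragment length, or None if coded as missing.
    Assumes header columns named "SID" and "zone" (an assumption).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    locus_names = [c for c in reader.fieldnames if c not in ("SID", "zone")]
    samples = {}
    for row in reader:
        entry = samples.setdefault(row["SID"], {"zone": row["zone"], "loci": {}})
        for locus in locus_names:
            value = row[locus].strip()
            allele = None if value == MISSING else int(value)
            entry["loci"].setdefault(locus, []).append(allele)
    return samples


# Hypothetical two-locus table with one sample spread over two lines.
demo = io.StringIO(
    "SID\tzone\tlocusA\tlocusB\n"
    "S1\t7\t150\t-999\n"
    "S1\t7\t154\t-999\n"
)
genotypes = read_genotypes(demo)
```

To read a real file, pass an open handle instead of the in-memory table, for example `open("REFELE_5.32_filtered_forest.tsv", newline="")`.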
File: REFELE_5.32_filtered_savannah.tsv
Description: Savannah elephant reference microsatellite genotypes
Variables
- SID: sample ID
- zone: area from which the sample was collected (cross-reference to the zones files)
- All remaining variables are microsatellite genotypes (as fragment lengths, not repeat counts) with missing data coded as -999.
File: zones_44_forest.tsv
Description: locations of forest elephant sampling zones
Variables
- Location: abbreviated name of nearby park, reserve, or other landmark
- zoneID: ID number of input zone
- subregion: broad area in which zone is located (0 = western forest, 1 = central/eastern forest, 2 = northern savannah, 3 = northeastern savannah, 4 = southeastern savannah, 5 = southern savannah)
- latitude: decimal latitude of input zone (not of individual sample)
- longitude: decimal longitude of input zone
File: zones_44_savannah.tsv
Description: locations of savannah elephant sampling zones
Variables
- Location: abbreviated name of nearby park, reserve, or other landmark
- zoneID: ID number of input zone
- subregion: broad area in which zone is located (0 = western forest, 1 = central/eastern forest, 2 = northern savannah, 3 = northeastern savannah, 4 = southeastern savannah, 5 = southern savannah)
- latitude: decimal latitude of input zone (not of individual sample)
- longitude: decimal longitude of input zone
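The zone variables above can be turned into a simple lookup table, sketched below. This assumes the header names match the variable names listed (`Location`, `zoneID`, `subregion`, `latitude`, `longitude`) and that the `zone` value in a genotype file equals a `zoneID` here; the zone name, ID, and coordinates in the demo row are invented for illustration.

```python
import csv
import io

# Subregion codes as listed in the variable description.
SUBREGIONS = {
    0: "western forest",
    1: "central/eastern forest",
    2: "northern savannah",
    3: "northeastern savannah",
    4: "southeastern savannah",
    5: "southern savannah",
}


def read_zones(handle):
    """Return {zoneID: zone record} from a tab-separated zones table.

    Coordinates describe the zone as a whole, not any individual sample.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    zones = {}
    for row in reader:
        zones[row["zoneID"]] = {
            "location": row["Location"],
            "subregion": int(row["subregion"]),
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
        }
    return zones


# Hypothetical single-zone table; values are not from the real files.
demo = io.StringIO(
    "Location\tzoneID\tsubregion\tlatitude\tlongitude\n"
    "ExamplePark\t12\t1\t0.5\t15.25\n"
)
zones = read_zones(demo)
zone = zones["12"]  # look up by the zone value recorded for a sample
subregion_name = SUBREGIONS[zone["subregion"]]
```

Keying the table on the zone ID as a string keeps it directly comparable with the `zone` column as read from a genotype file, under the assumption that both files write the ID the same way.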
Code/software
All data are in .tsv format and can be viewed with standard tools.
Samples were prepared and genotyped as described in Mailand and Wasser 2007.
The species of each sample was assigned using EBHybrids (Mondol et al. 2015), and samples with an inferred hybrid probability >= 50% were removed. Samples with fewer than 10 fully genotyped loci were removed; loci where one allele was coded as missing were recoded as fully missing.
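The missing-data rules in the preceding paragraph could be applied to new genotypes along the following lines. This is only a sketch, not the code used to prepare the files: it assumes each sample is held as a mapping from locus name to a list of two alleles, with `None` standing for the -999 missing code, and the function names and demo samples are hypothetical.

```python
def is_complete(alleles):
    """True if no allele at this locus is missing."""
    return all(a is not None for a in alleles)


def recode_half_missing(loci):
    """Return a copy in which any locus with a missing allele is fully missing."""
    return {
        name: list(alleles) if is_complete(alleles) else [None] * len(alleles)
        for name, alleles in loci.items()
    }


def count_complete(loci):
    """Number of loci with every allele genotyped."""
    return sum(is_complete(alleles) for alleles in loci.values())


def passes_filter(loci, min_loci=10):
    """Keep a sample only if it has at least min_loci fully genotyped loci."""
    return count_complete(loci) >= min_loci


# Hypothetical 16-locus samples: one with 10 complete loci, one with 9.
kept = {f"locus{i}": [150, 154] for i in range(10)}
kept.update({f"locus{i}": [150, None] for i in range(10, 16)})
dropped = {f"locus{i}": [150, 154] for i in range(9)}
dropped.update({f"locus{i}": [None, 154] for i in range(9, 16)})

kept_recoded = recode_half_missing(kept)
```

Because a locus with one missing allele never counts as fully genotyped, the outcome of `passes_filter` is the same whether it runs before or after recoding.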
Associated zone files give latitude and longitude of the approximate sampling zone (park, protected area, or other habitat) from which samples were derived, but do not represent the coordinates of specific samples.
