Daily totals of odontocete acoustic presence at western North Atlantic acoustic monitoring sites

Cite this dataset

Cohen, Rebecca E. et al. (2022). Daily totals of odontocete acoustic presence at western North Atlantic acoustic monitoring sites [Dataset]. Dryad.


A combination of machine learning and expert analyst review was used to detect odontocete echolocation clicks, identify dominant click types, and classify clicks in a cumulative 32 years of acoustic data collected at 11 autonomous monitoring sites in the western North Atlantic between 2016 and 2019. Previously described click types for eight known odontocete species or genera were identified in this data set: Blainville’s beaked whales (Mesoplodon densirostris), Cuvier’s beaked whales (Ziphius cavirostris), Gervais’ beaked whales (Mesoplodon europaeus), Sowerby’s beaked whales (Mesoplodon bidens), True’s beaked whales (Mesoplodon mirus), Kogia spp., Risso’s dolphin (Grampus griseus), and sperm whales (Physeter macrocephalus). Six novel delphinid echolocation click types were identified and named according to their median peak frequencies. Consideration of the spatiotemporal distribution of these unidentified click types, and comparison to historical sighting data, enabled assignment of a probable species identity to three of the six types, and a group identity to a fourth type. UD36, UD26, and UD28 were attributed to Risso’s dolphin (Grampus griseus), short-finned pilot whale (Globicephala macrorhynchus), and short-beaked common dolphin (Delphinus delphis), respectively, based on similar regional distributions and seasonal presence patterns. UD19 was attributed to one or more species in the subfamily Globicephalinae based on spectral content and signal timing. UD47 and UD38 represent distinct types for which no clear spatiotemporal match was apparent. This approach leveraged the power of big acoustic and big visual data to add to the catalog of known species-specific acoustic signals and yield new inferences about odontocete spatiotemporal distribution patterns. The tools and call types described here can be used for efficient analysis of other existing and future passive acoustic data sets from this region.


Acoustic data were collected using High-frequency Acoustic Recording Packages (Wiggins & Hildebrand 2007) deployed at deep-water sites along the continental shelf break and slope in the western North Atlantic, between ~30°N and ~40°N. An energy detector was used to identify odontocete echolocation clicks in the raw time series data, and then an unsupervised clustering algorithm was used to cluster clicks in 5-minute bins, generating summary spectra, inter-click-interval distributions, and mean waveform envelopes for the resultant clusters. This clustering step was taken to identify recurring patterns in a naturally variable signal type, and also to reduce the size of the data set to improve computational efficiency. Using a neural network-based classifier, the summary spectra were then classified into one of 20 classes, which had been identified via a second clustering step. Daily acoustic presence of each class, given in hours, was computed per deployment.
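
The last step, turning labeled 5-minute bins into daily hours of presence, comes down to simple arithmetic. The following is a minimal Python sketch, not the authors' code. It assumes that each labeled bin contributes 5/60 hours to the day in which it starts, and that bin start times are given in fractional days. The names `BIN_HOURS` and `daily_presence_hours` are hypothetical.

```python
import math
from collections import Counter

# One 5-minute bin expressed in hours (about 0.0833).
BIN_HOURS = 5 / 60


def daily_presence_hours(bin_starts):
    """Sum acoustic presence per day from labeled 5-minute bin start times.

    bin_starts: iterable of bin start times in fractional days; the
    integer part identifies the day (assumption: a bin is credited to
    the day in which it starts).
    Returns a dict mapping integer day number -> hours of presence.
    """
    bins_per_day = Counter(math.floor(t) for t in bin_starts)
    return {day: n * BIN_HOURS for day, n in sorted(bins_per_day.items())}


# Example: two bins on one day and a single bin on the next day.
example = daily_presence_hours([736330.0, 736330.5, 736331.25])
print(example)
```

A day with a single labeled bin therefore receives 5/60 hours, which is the smallest nonzero value this arithmetic can produce.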

Usage notes

These .mat (Matlab) files correspond to consecutive acoustic device deployments at 11 acoustic monitoring sites, given by the file name. Each file contains data from a single deployment, stored in 4 variables:

spNameList - 21x1 cell array; contains class names for known species, novel putative delphinid click types (UD), and common noise sources included in this analysis.

labeledBins - 1x21 cell array; cell {1,i} contains an Nx2 double matrix of the start (first column) and end (second column) datetimes of all 5-minute bins containing clicks labeled as class spNameList{i,1} in this deployment. N varies by class, depending on how prevalent that class was during this deployment. Datetimes are in the Matlab serial date number format.

dailyTots - Nx22 double matrix; the first column contains the dates of days for which there was recording effort during this deployment; columns 2-22 correspond to the classes in spNameList. The value at index (j,i) gives the total hours of acoustic presence of class spNameList{i-1,1} on day dailyTots(j,1). The minimum temporal resolution is presence in a 5-minute bin, so the minimum acoustic presence recorded is 0.0833 hours. Datetimes are in the Matlab serial date number format.

binFeatures - 3x21 cell array; cells {1:3,i} each contain an Nx1 double, where N corresponds to the number of 5-minute bins containing clicks which were labeled as spNameList{i,1} in this deployment. Cell {1,i} gives the classifier confidence associated with the labels for each 5-minute bin; cell {2,i} gives the mean peak-to-peak received level of all clicks labeled as spNameList{i,1} in each 5-minute bin; cell {3,i} gives the number of clicks labeled as spNameList{i,1} in each 5-minute bin.
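
Two details above are easy to get wrong outside Matlab: the serial date number format, and the one-column offset between dailyTots and spNameList. The following Python sketch illustrates both. It assumes the values have already been read from a .mat file into plain Python numbers and lists. The function names and the class names in the example are hypothetical placeholders, not names taken from the data set.

```python
from datetime import datetime, timedelta

# Matlab serial date numbers count days from year 0000, while Python
# ordinals start at 0001-01-01 = 1, so the two differ by 366 days.
MATLAB_ORDINAL_OFFSET = 366


def datenum_to_datetime(datenum):
    """Convert a Matlab serial date number to a Python datetime."""
    whole_days = int(datenum)
    day_fraction = datenum - whole_days
    return (
        datetime.fromordinal(whole_days)
        + timedelta(days=day_fraction)
        - timedelta(days=MATLAB_ORDINAL_OFFSET)
    )


def daily_tots_column(sp_names, class_name):
    """Return the 0-based dailyTots column holding hours for class_name.

    Column 0 holds the dates, so the class at 0-based position k in
    sp_names is found in column k + 1.
    """
    return sp_names.index(class_name) + 1


# Example with placeholder class names (hypothetical, for illustration).
names = ["classA", "classB", "classC"]
row = [736330.0, 0.0, 0.25, 1.5]  # one dailyTots row: date, then hours
col = daily_tots_column(names, "classB")
print(datenum_to_datetime(row[0]).date(), row[col])
```

The same offset applies to the bin start and end times in labeledBins, which use the same serial date number format.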