Behavioural context of call production in humpback whale calves: Identification of potential begging calls in a Mysticete species
Data files (Nov 11, 2024 version, 40.46 KB total):
- data_codes.zip (33.66 KB)
- README.md (6.80 KB)
Abstract
Baleen whale calves vocalize, but the behavioural context and role of their social calls in mother-calf interactions have yet to be fully documented. We investigated the context of call production in humpback whale (Megaptera novaeangliae) calves using camera-equipped animal-borne multi-sensor tags. Behavioural states, including suckling sessions, were identified using accelerometer, depth, and video data. Call types were categorized through clustering techniques. We found that call types and rates predict the occurrence of a given state. Milling, resting, and traveling were associated with a median call rate of 0 calls min⁻¹, while surface play, tagging responses, and suckling were associated with higher call rates, reaching a median of 0.5 calls min⁻¹ for suckling. Suckling sessions were mainly associated with two sets of low-frequency calls corresponding to previously described burping, barking, and snorting sounds. Surface play sessions featured mid-frequency calls with whoop-like sounds and other call types. These results highlight the significance of vocal signalling in mother-calf communication and the calf’s development, including the first identification of potential begging calls. Overall, this study offers new insights into baleen whale behaviour, underscores the importance of social calls in mother-calf interactions, and enhances our understanding of communication systems in aquatic mammalian mother-young pairs.
Baleen whale calves vocalize, but the behavioral context and role of their social calls in mother-calf interactions require further documentation. Using camera-equipped multi-sensor tags, we identified behavioral states in humpback whale calves and analyzed the relationship between call types and behavioral states, including suckling sessions.
1. Dataset Description
The dataset (data.csv) contains acoustic characteristics of social calls and behavioral state data for studying the context of vocalizations in humpback whale calves. Each row corresponds to a single recorded vocalization event, together with information on the behavioral state segment in which it occurred (1 segment = 1 continuous behavioral state). A single behavioral state segment may therefore span multiple rows, one per call. If a behavioral state segment does not contain any calls, it is still included as one row labeled “O.Silence.”
Columns in the Dataset:
- depID: Unique identifier for each deployment/calf.
- AV_call_type: Provisionally assigned name based on aural and visual examination following the descriptions in Saloma et al. (2022). “O.Silence” is used for segments without any calls (silent).
- st: Start time of the call relative to the original Audio/Video file, measured in seconds.
- en: End time of the call relative to the original Audio/Video file, measured in seconds.
- Dur: Duration of the call, expressed in seconds.
- F0: Fundamental frequency of the call, measured in Hz.
- Fmax: Maximum frequency of the call, measured in Hz.
- Q25: First energy quartile frequency, measured in Hz.
- Q50: Median energy quartile frequency, measured in Hz.
- Q75: Third energy quartile frequency, measured in Hz.
- Bdw: Frequency bandwidth within which the total energy falls within 12 dB of Fmax, measured in Hz.
- stateNum: Numerical identifier for the corresponding behavioral state segment.
- state_st: Start time of the behavioral state segment relative to the deployment data start time, measured in seconds.
- state_en: End time of the behavioral state segment relative to the deployment data start time, measured in seconds.
- state_vidNam: Name of the original Audio/Video file.
- state_Start_s_: Start time of the state relative to the original Audio/Video file, measured in seconds.
- state_Stop_s_: End time of the state relative to the original Audio/Video file, measured in seconds.
- raw_depID: Original unique identifier for the deployment/calf prior to any processing. This identifier is based on the tagging date, along with other information such as the tag unit used (identified by color) and the category of the tagged individual.
- behavior: Describes the behavioral state. Mi: Milling, Pl: Surface play session, Po: Response to tagging, Re: Resting, Su: Suckling session, T: Traveling.
For rows labeled “O.Silence” (see details above), the columns describing call characteristics (st, en, Dur, F0, Fmax, Q25, Q50, Q75, and Bdw) are marked as missing data (“NA”), since no acoustic parameters are associated with segments lacking calls.
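As a minimal base-R sketch of how the call rows and silent segments separate cleanly, here with a small toy data frame standing in for data.csv (the column values are illustrative only):

```r
# Toy stand-in for a few rows of data.csv; depID values are hypothetical.
acou <- data.frame(
  depID        = c("calf01", "calf01", "calf02"),
  AV_call_type = c("burp", "O.Silence", "whoop"),
  Dur          = c(0.8, NA, 1.2),   # NA for the silent segment
  F0           = c(120, NA, 450),
  behavior     = c("Su", "Re", "Pl")
)

calls   <- subset(acou, AV_call_type != "O.Silence")  # rows with acoustic parameters
silence <- subset(acou, AV_call_type == "O.Silence")  # segments without calls

nrow(calls)               # 2
all(is.na(silence$Dur))   # TRUE: no acoustic parameters for silent segments
```

In the real dataset, `read.csv("data.csv")` would replace the toy data frame.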
2. Code Overview
Language: R
Version: v.1
2.1 Initial Data Preparation
The code reads in the dataset and filters out silence (`O.Silence` rows).
2.2 Cluster Analysis
- Principal Component Analysis (PCA): PCA is performed on the acoustic parameters (after handling missing data with multiple imputations) to reduce dimensionality and explore the structure of the call data.
- Hierarchical Clustering: Clustering is applied using the results of PCA, determining 13 distinct clusters. These clusters represent different “Types” of vocalizations.
- Cluster Assignment: The original dataset is updated with cluster labels (`clust`) for each vocalization, and a contingency table is generated to compare the clusters with the provisionally assigned categories.
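The released code performs this step with missMDA and FactoMineR; the base-R sketch below (`prcomp`, `hclust`, `cutree` on random toy data) illustrates the same PCA-then-hierarchical-clustering idea, using k = 3 instead of the 13 clusters found in the actual analysis:

```r
# Toy acoustic matrix standing in for the seven call parameters.
set.seed(1)
acoustic <- matrix(rnorm(70), ncol = 7)
colnames(acoustic) <- c("Dur", "F0", "Fmax", "Q25", "Q50", "Q75", "Bdw")

pca   <- prcomp(acoustic, scale. = TRUE)          # dimensionality reduction
hc    <- hclust(dist(pca$x), method = "ward.D2")  # clustering on PCA scores
clust <- cutree(hc, k = 3)                        # k = 13 in the actual analysis

table(clust)  # cluster sizes; compare with call types via table(clust, AV_call_type)
```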
2.3 Data Preparation for Behavioral Analysis
- Combining Silence and Calls: Silence and vocalization data are combined back into a full dataset (`acou_classified`), where each call is assigned a cluster number (silence = 0).
- Vocal Activity Calculation: A binary indicator (`vox`) is added to each row to distinguish between calls (1) and silence (0). The duration of each behavioral state segment is calculated, and the number of calls per state segment is aggregated.
- Call Type Occurrence: The presence or absence of each call type in the segment is recorded for further analysis.
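The released code does this aggregation with dplyr; a base-R sketch of the same per-segment call-rate calculation, on a toy `acou_classified` data frame with made-up segment times, might look like:

```r
# Toy combined dataset: one row per call, cluster 0 = silence.
acou_classified <- data.frame(
  stateNum = c(1, 1, 2, 3),
  clust    = c(4, 7, 0, 2),
  state_st = c(0, 0, 300, 480),
  state_en = c(300, 300, 480, 600),
  behavior = c("Su", "Su", "Re", "Pl")
)
# Binary vocal-activity indicator: 1 = call, 0 = silence.
acou_classified$vox <- as.integer(acou_classified$clust != 0)

# Number of calls per behavioral state segment.
calls_per_seg <- aggregate(vox ~ stateNum, data = acou_classified, FUN = sum)

# One row per unique segment, merged with the call counts.
segs <- unique(acou_classified[, c("stateNum", "behavior", "state_st", "state_en")])
seg  <- merge(segs, calls_per_seg, by = "stateNum")
seg$call_rate <- seg$vox / ((seg$state_en - seg$state_st) / 60)

seg$call_rate  # 0.4, 0.0, 0.5 calls per minute for segments 1-3
```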
2.4 Visualizations
- PCA Visualization: The PCA results are plotted to visualize the contributions of variables and the clustering of the calls.
- Call Frequency Plot: A bar plot shows the frequency of different call types for each individual (`depID`), excluding silence segments.
- Call Rate by Behavior: A box plot illustrates the distribution of call rates across different behavioral states.
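The released plots use ggplot2; as a minimal base-R stand-in for the call-rate-by-behavior box plot, with hypothetical rates and state labels:

```r
# Toy per-segment call rates and behavioral state codes.
rate     <- c(0, 0, 0.1, 0.4, 0.5, 0.6, 0.2, 0.3)
behavior <- c("Re", "Re", "Mi", "Su", "Su", "Su", "Pl", "Pl")

# plot = FALSE returns the box-plot statistics without drawing.
stats <- boxplot(rate ~ behavior, plot = FALSE)
stats$names  # behavioral state labels, sorted: "Mi" "Pl" "Re" "Su"
# boxplot(rate ~ behavior) would draw the figure itself.
```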
2.5 Random Forest Classification
- Random Forest Model: A random forest model is built to predict behavioral states based on the call rates and presence/absence of each call type.
- Confusion Matrix: The model’s accuracy is evaluated using a confusion matrix, comparing the correct classifications with those expected by chance.
- Variable Importance Plot: The importance of different call types in predicting behavior is visualized in a heatmap.
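The model itself is built with rfPermute; the confusion-matrix evaluation step can be sketched in base R, here with hypothetical observed and predicted behavioral state labels:

```r
# Hypothetical true states and model predictions for six segments.
observed  <- c("Su", "Su", "Re", "Pl", "Re", "Su")
predicted <- c("Su", "Re", "Re", "Pl", "Re", "Su")

cm <- table(observed, predicted)        # rows: true state, cols: predicted state
accuracy <- sum(diag(cm)) / sum(cm)     # proportion on the diagonal
accuracy                                # 5 of 6 correct
```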
3. Notes
- Load Dataset: Make sure `data.csv` is in the working directory before running the code.
- Run Cluster Analysis: The clustering part of the code identifies the distinct call types; run it before the behavioral analysis, which depends on the cluster labels.
- Behavioral Context Analysis: The behavioral state analysis requires the combined dataset with silence and calls, and several aggregation operations are performed at this stage to obtain a table with each row corresponding to a unique behavioral state segment.
4. Required Libraries
Ensure that the following R libraries are installed:
- `dplyr`: For data manipulation.
- `NbClust`: For determining the optimal number of clusters.
- `ggplot2`: For plotting.
- `missMDA`: For multiple imputation of missing data.
- `FactoMineR` and `factoextra`: For PCA and clustering.
- `rfPermute`: For building random forest models.
5. File Outputs
The code generates multiple plots and tables, which can be saved or visualized directly:
- PCA plots of acoustic parameters.
- Cluster visualization plots.
- Call frequency and rate plots.
- Random forest confusion matrix and variable-importance heatmap.
6. Contact
For questions or clarifications about the dataset or code, please contact the author.