Acoustic communication is widespread in the animal kingdom and often fundamental to early life survival. Extant archosaurs, birds and crocodilians, display complex vocalizations in early life that function in parental care and embryo communication. Turtles are a sister group to archosaurs, and vocalize, but lack parental care, providing an opportunity to decouple parental care from other factors driving the evolution of neonatal communication. Turtle hatchlings produce vocalizations in the subterranean nest before emergence, but their function remains unclear. We hypothesized that acoustic cues may facilitate emergence from the nest by cuing hatchlings to begin digging out of the subterranean nest cavity. We test whether hatchling vocalizations are positively associated with digging activity in hatchling snapping turtles (Chelydra serpentina). In two successive years, we monitored subterranean hatchling behaviour in situ with acoustic recorders in semi-natural turtle nests. Structural Equation Modelling revealed that vocalization was causally associated with hatchling movement. Our findings support a hypothesis that early-life acoustic communication coordinates a social activity in the absence of parental care, a function with relatively few described parallels in the animal kingdom.

Authors: Claudia Lacroix, Christina M. Davy, Njal Rollinson

Last Updated: 2025-04-17

Abstract: Acoustic communication is vital for early life survival in animals. Extant archosaurs, birds and crocodilians, display vocalisations that function in parental care and neonatal communication. Turtles also vocalize in early life and are a sister group to archosaurs. However, turtles lack parental care, providing an opportunity to decouple factors related to parental care and social learning. We tested the hypothesis that hatchling turtle vocalizations coordinate digging in the nest. We monitored snapping turtle hatchlings over two summers with acoustic recorders and wildlife cameras. Machine learning-based acoustic signal processing and structural equation modeling showed a strong positive association between hatchling vocalization and nest movement. We also found evidence of emergence synchrony within nests. This supports the hypothesis that hatchling vocalizations aid in coordinating subterranean movements during nest emergence, a social behavior among juveniles that occurs in the absence of parental care, and one that has few parallels in the animal kingdom.

Description of the data, file structure, and associated code.

The R code is published on Zenodo. See the Related Work link for Software.

Part 1 - Nest Emergence

Description: the R script NestEmergence.R uses data from camera trap recordings in 2021 and 2022. The script uses 3 formats of post-processed data to summarise different nest emergence behaviours.

Dataset 1: the dataframe hatch.emergence.data.csv describes the total time it took for hatchlings to climb out of the nest. This was calculated, for each nest, by subtracting the time of the first detection of movement in the nest (from the audio recordings, a proxy for hatching) from the time of the first emergence event (from the camera traps).

Column Name	Type	Unit	Description
year	ordinal	date	Year of sampling
clutch_ID	nominal	NA	unique identifier for each nest.
Group	nominal	NA	unique identifier for each nest pair.
Channel	nominal	NA	audio channel (left, right)
emergence	ordinal	date & time	date and time of the first emergence event inferred from monitoring the camera trap data.
hatching	ordinal	date & time	date and time of the first detected movement in the nest (proxy for hatching event) inferred from manually scanning the audio recordings.
duration	ordinal	days	time between hatching and the first emergence event.
ID	nominal	NA	unique identifier for each nest and year.
NestID	nominal	NA	unique identifier for each nest.
first_date	ordinal	date & time	date and hour of the first emergence event inferred from monitoring the camera trap data
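
The duration calculation described above can be sketched as follows. This is a minimal Python illustration, not the published R code in NestEmergence.R; the timestamp format and the example values are assumptions, and the function name is hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout; the real CSV may use a different format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def emergence_duration_days(hatching: str, emergence: str) -> float:
    """Days elapsed from first detected movement (hatching) to first emergence."""
    start = datetime.strptime(hatching, TIMESTAMP_FORMAT)
    end = datetime.strptime(emergence, TIMESTAMP_FORMAT)
    # duration = emergence - hatching, expressed in days
    return (end - start).total_seconds() / 86400.0


# Hypothetical values for illustration only, not taken from the dataset.
print(emergence_duration_days("2021-09-01 06:00:00", "2021-09-04 18:00:00"))  # 3.5
```

A positive result means emergence followed hatching, which is the expected order for every nest that emerged.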

Dataset 2: The dataframe emerge.events.data.csv describes each emergence event observed in the camera traps, summarised per hour.

Column Name	Type	Unit	Description
Group	nominal	NA	unique identifier for each nest pair.
clutch_ID	nominal	NA	unique identifier for each nest.
DateHour	ordinal	date & time	Date and time of the emergence event summarised per hour.
n	discrete	hatchlings	number of hatchlings that emerged within the hour.
year	ordinal	date	Year of sampling
first_emerge	ordinal	date & time	date and hour of the first emergence event for the nest inferred from monitoring the camera trap data.
time_since_emergence	continuous	seconds	time since the first emergence event of that nest.
ID	nominal	NA	unique identifier for each nest and year.
NestID	nominal	NA	unique identifier for each nest.
first_date	ordinal	date	date of the first emergence event for the nest inferred from monitoring the camera trap data.
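
As a rough illustration of how the hourly columns relate to one another, the sketch below bins individual emergence times into clock hours, counts hatchlings per hour (n), and measures each hour against the first one (time_since_emergence, in seconds). It is a Python sketch under assumptions: the function name is hypothetical, the input times are invented, and measuring elapsed time from the truncated hour rather than the exact first event is a guess at the convention used.

```python
from collections import Counter
from datetime import datetime


def summarise_by_hour(event_times):
    """Count emergences per clock hour and time each hour against the first hour."""
    # Truncate every event to the start of its clock hour.
    hours = [t.replace(minute=0, second=0, microsecond=0) for t in event_times]
    counts = Counter(hours)
    first = min(counts)  # hour of the first emergence event
    return [
        {
            "DateHour": hour,
            "n": counts[hour],
            "time_since_emergence": (hour - first).total_seconds(),
        }
        for hour in sorted(counts)
    ]


# Hypothetical emergence times for one nest.
events = [
    datetime(2021, 9, 4, 10, 5),
    datetime(2021, 9, 4, 10, 40),
    datetime(2021, 9, 4, 13, 15),
]
for row in summarise_by_hour(events):
    print(row["DateHour"], row["n"], row["time_since_emergence"])
```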

Dataset 3: The dataframe emerge.interval.data.csv describes the time between emergence events per nest.

Column Name	Type	Unit	Description
Group	nominal	NA	unique identifier for each nest pair.
Side	nominal	NA	position of the nest cage the observation was recorded based on the camera trap video (left or right)
year	ordinal	date	Year of sampling
clutch_ID	nominal	NA	unique identifier for each nest.
Num	ordinal	integer	sequence of emergence events per nest, such that 1 is the first emergence event and 1+n is the subsequent emergence event for that nest.
DateTime	ordinal	date & time	Date and time of the observation in hours.
diff.minutes	continuous	minutes	time since the previous emergence event
ID	nominal	NA	unique identifier for each nest and year.
NestID	nominal	NA	unique identifier for each nest.
first_date	ordinal	date	date of the first emergence event for the nest inferred from monitoring the camera trap data.
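
The relationship between Num and diff.minutes can be sketched as below. This is a Python illustration with a hypothetical function name and invented times; representing the first event's interval as None is an assumption, and the dataset may encode it differently.

```python
from datetime import datetime


def emergence_intervals(event_times):
    """Number emergence events in order and compute minutes since the previous one."""
    ordered = sorted(event_times)
    rows = []
    for num, current in enumerate(ordered, start=1):
        if num == 1:
            diff_minutes = None  # no previous event for the first emergence
        else:
            previous = ordered[num - 2]
            diff_minutes = (current - previous).total_seconds() / 60.0
        rows.append({"Num": num, "DateTime": current, "diff.minutes": diff_minutes})
    return rows


# Hypothetical emergence times for one nest.
events = [
    datetime(2022, 9, 10, 8, 0),
    datetime(2022, 9, 10, 8, 45),
    datetime(2022, 9, 10, 11, 0),
]
for row in emergence_intervals(events):
    print(row["Num"], row["diff.minutes"])
```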

Part 2 - Structural Equation Modeling (manually labelled, 2021 dataset)

Description: the R script manual_SEM.R fits structural equation models to the 2021 data, which contain manually labelled acoustic detections.

Dataset 1: the dataframe manual_model_data.Apr13.2023.csv describes the number of vocalisations and movement observed per hour between the first sign of hatching and the first detected date of emergence. Movement and Vocalisations were quantified using cluster analysis via Kaleidoscope Pro and manually labelled.

Column Name	Type	Unit	Description
Date.Hour	ordinal	date & time	Date and time of the observation in hours.
AirTemp	continuous	degrees Celsius	Average air temperature per hour recorded from a nearby temperature logger.
nestID	nominal	NA	unique identifier for each nest.
NestTemp	continuous	degrees Celsius	Average nest temperature per hour recorded from a buried temperature logger. Nest temperature is unique for each nest pair.
group	nominal	NA	unique identifier for each nest pair.
BehavPeriod	nominal	NA	'Pre-Hatching', 'Hatching', 'Post-Hatching' and 'Emergence' behavioural periods during recording. Behavioural periods were delimited by using the first acoustic sign of movement, and the first emergence event recorded in the camera traps.
Vocalisation	discrete	unit per hour	Total number of vocalisations detected per hour. vocalisations were quantified using a cluster-based algorithm via Kaleidoscope Pro and manually labelled detections.
Movement	discrete	unit per hour	Total number of bouts of movement detected per hour. Movement was quantified using a cluster-based algorithm via Kaleidoscope Pro and manually labelled detections.
Hour	ordinal	hour	hour of the day
ClutchSize	discrete	number of eggs	Number of live eggs/hatchlings in the nest.
Date	ordinal	date	Date (yyyy-mm-dd)
ordinal	ordinal	date	Ordinal day of the year
Year	ordinal	date	Year of sampling (2021)
Month	ordinal	date	Month of the year as numeric
Day	ordinal	date	Day of the month
precip_mean	continuous	millimeters	mean total hourly precipitation per day
precip_tot	continuous	millimeters	total precipitation per day

Part 3 - Accuracy calculations of the automatically labelled model

Description: the R script ModelAccuracy.R uses the validation dataset to summarise accuracy metrics of the cluster model's performance.

Dataset 1: the dataframe file.metadata.Apr14.2022.csv describes all the metadata for the audio files recorded in 2021.

Column Name	Type	Unit	Description
File Name	nominal	NA	original WAV file name
Group	nominal	NA	unique identifier for each nest pair.
Channel	discrete	NA	audio channel (left= 0, right = 1)
Subset	nominal	NA	Indicator whether data from the file was included in the analysis or not. (Y=included, N=excluded)
BehavPeriod	nominal	NA	'Pre-Hatching', 'Hatching', 'Post-Hatching' and 'Emergence' behavioural periods during recording. Behavioural periods were delimited by using the first acoustic sign of movement, and the first emergence event recorded in the camera traps. "NO DATA" means that the nest did not emerge and therefore the behavioural periods were not defined.
DateTime.Modified	ordinal	date & time	Date and time that the file was created or last modified.
Size(GB)	continuous	gigabytes	size of the audio file
NestID	nominal	NA	unique identifier for each nest.
DateTime.file	ordinal	date & time	Date and time that the audio file started recording.

Dataset 2: the dataframe validation_subset_data_labelled.May29.2023.csv contains the results from the validation dataset. It contains both manually labelled detections and automatically labelled detections, which are then used to produce accuracy measurement and the probability matrix of the cluster model's performance. This dataset is also used in part 4 (see below) to generate 1000 simulated datasets.

Column Name	Type	Unit	Description
INDIR	nominal	NA	output from Kaleidoscope Pro. directory that the audio file is contained in.
FOLDER	nominal	NA	output from Kaleidoscope Pro. folder the audio file is contained in.
IN FILE	nominal	NA	output from Kaleidoscope Pro. Original WAV file name.
CHANNEL	discrete	NA	output from Kaleidoscope Pro. Audio channel (left= 0, right = 1).
OFFSET	continuous	seconds	output from Kaleidoscope Pro. Time since the start of the audio file the detection originates from.
DURATION	continuous	seconds	output from Kaleidoscope Pro. The total duration of the detection.
Fmin	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the minimum frequency of the detection.
Fmean	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the mean frequency of the detection.
Fmax	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the maximum frequency of the detection.
DATE	ordinal	date	Date the detection was recorded.
TIME	ordinal	time	Time of day the detection was recorded.
HOUR	ordinal	hour	Hour of day the detection was recorded.
TOP1MATCH*	nominal	NA	output from Kaleidoscope Pro. The label that the cluster model assigned to the detection.
TOP1DIST	continuous	NA	output from Kaleidoscope Pro. Distance from the cluster center.
TOP2MATCH	nominal	NA	output from Kaleidoscope Pro. The second most likely label that the cluster model assigned to the detection.
TOP2DIST	continuous	NA	output from Kaleidoscope Pro. Distance from the cluster center.
TOP3MATCH	nominal	NA	output from Kaleidoscope Pro. The third most likely label that the cluster model assigned to the detection.
VOCALIZATIONS	discrete	NA	output from Kaleidoscope Pro. Value of 1 assigned to each row to enable calculations.
MANUAL ID	nominal	NA	The label that we manually labelled and assigned to the detection.
INPATHMD5	nominal	NA	output from Kaleidoscope Pro.
Group	nominal	NA	unique identifier for each nest pair.
Subset	nominal	NA	Indicator whether data from the file was included in the analysis or not. (Y=included, N=excluded)
BehavPeriod	nominal	NA	'Pre-Hatching', 'Hatching', 'Post-Hatching' and 'Emergence' behavioural periods during recording. Behavioural periods were delimited by using the first acoustic sign of movement, and the first emergence event recorded in the camera traps.
Size(GB)	continuous	gigabytes	size of the audio file
NestID	nominal	NA	unique identifier for each nest.
DateHour	ordinal	date & time	Date and time of the observation in hours.
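
One way to derive an accuracy measurement and a probability matrix from paired labels (TOP1MATCH* and MANUAL ID) is sketched below in Python. It is not the code in ModelAccuracy.R: the function names and label values are hypothetical, and conditioning on the automatic label (so each row gives the probability of each manual label given an automatic label) is an assumption about how the matrix is oriented.

```python
from collections import Counter, defaultdict


def accuracy(pairs):
    """Fraction of detections where the automatic label matches the manual label."""
    return sum(auto == manual for auto, manual in pairs) / len(pairs)


def probability_matrix(pairs):
    """Estimate P(manual label | automatic label) from (auto, manual) pairs."""
    counts = defaultdict(Counter)
    for auto, manual in pairs:
        counts[auto][manual] += 1
    return {
        auto: {manual: n / sum(row.values()) for manual, n in row.items()}
        for auto, row in counts.items()
    }


# Hypothetical (automatic, manual) label pairs, not taken from the dataset.
pairs = [
    ("vocalisation", "vocalisation"),
    ("vocalisation", "vocalisation"),
    ("vocalisation", "noise"),
    ("noise", "noise"),
]
print(accuracy(pairs))  # 0.75
print(probability_matrix(pairs))
```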

Dataset 3: the dataframe behaviour.temp.summary.May24.2022.csv describes the number of bouts of movement recorded per hour. Bouts of movement were quantified by running a separate cluster analysis across a wide acoustic window and manually labelling the detections. In this script, the dataset is used to validate the total duration of noise as a proxy for bouts of movement.

Column Name	Type	Unit	Description
Date.Hour	ordinal	date & time	Date and time of the observation in hours.
AirTemp	continuous	degrees celsius	Air temperature measure from a HOBO data logger hung on a nearby bush.
nestID	nominal	NA	unique identifier for each nest.
NestTemp	continuous	degrees celsius	Hourly nest temperature measured by a HOBO or ibutton data logger.
group	nominal	NA	unique identifier for each nest pair.
BehavPeriod	nominal	NA	'Pre-Hatching', 'Hatching', 'Post-Hatching' and 'Emergence' behavioural periods during recording. Behavioural periods were delimited by using the first acoustic sign of movement, and the first emergence event recorded in the camera traps. "NO DATA" means that the nest did not emerge and therefore the behavioural periods were not defined.
Vocalisation	discrete	number of vocalisations	Number of vocalisations detected per hour. Vocalisations were calculated from manually labelled detections obtained from a cluster model in Kaleidoscope Pro.
Movement	discrete	number of bouts of movement	Number of bouts of movement detected per hour. Bouts of movement were calculated from manually labelled detections obtained from a cluster model in Kaleidoscope Pro.
Hour	ordinal	time	hour of the day.
ClutchSize	discrete	number of eggs	number of viable eggs in the nest.

Part 4 - Structural Equation Modeling using a Probability Matrix (automatically labelled, 2022 dataset)

Description: the R script auto_SEM.R fits structural equation models to the 2022 data, which contain automatically labelled acoustic detections.

Dataset 1: the dataframe validation_subset_data_labelled.May29.2023.csv contains the results from the validation dataset. It contains both manually labelled detections and automatically labelled detections, which are then used to produce accuracy measurement and the probability matrix of the cluster model's performance.

Column Name	Type	Unit	Description
INDIR	nominal	NA	output from Kaleidoscope Pro. directory that the audio file is contained in.
FOLDER	nominal	NA	output from Kaleidoscope Pro. folder the audio file is contained in.
IN FILE	nominal	NA	output from Kaleidoscope Pro. Original WAV file name.
CHANNEL	discrete	NA	output from Kaleidoscope Pro. Audio channel (left= 0, right = 1).
OFFSET	continuous	seconds	output from Kaleidoscope Pro. Time since the start of the audio file the detection originates from.
DURATION	continuous	seconds	output from Kaleidoscope Pro. The total duration of the detection.
Fmin	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the minimum frequency of the detection.
Fmean	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the mean frequency of the detection.
Fmax	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the maximum frequency of the detection.
DATE	ordinal	date	Date the detection was recorded.
TIME	ordinal	time	Time of day the detection was recorded.
HOUR	ordinal	hour	Hour of day the detection was recorded.
TOP1MATCH*	nominal	NA	output from Kaleidoscope Pro. The label that the cluster model assigned to the detection.
TOP1DIST	continuous	NA	output from Kaleidoscope Pro. Distance from the cluster center.
VOCALIZATIONS	discrete	NA	output from Kaleidoscope Pro. Value of 1 assigned to each row to enable calculations.
MANUAL ID	nominal	NA	The label that we manually labelled and assigned to the detection.
Group	nominal	NA	unique identifier for each nest pair.
Subset	nominal	NA	Indicator whether data from the file was included in the analysis or not. (Y=included, N=excluded)
BehavPeriod	nominal	NA	'Pre-Hatching', 'Hatching', 'Post-Hatching' and 'Emergence' behavioural periods during recording. Behavioural periods were delimited by using the first acoustic sign of movement, and the first emergence event recorded in the camera traps.
Size(GB)	continuous	gigabytes	size of the audio file
NestID	nominal	NA	unique identifier for each nest.
DateHour	ordinal	date & time	Date and time of the observation in hours.
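
A probability matrix of this kind can be used to generate simulated datasets by redrawing a label for every automatically labelled detection. The Python sketch below shows the general idea only; it is not the procedure in auto_SEM.R. The function name, the label values, the example probabilities and the fixed seed are all assumptions.

```python
import random


def simulate_labels(auto_labels, prob_matrix, n_sims=1000, seed=0):
    """Draw n_sims relabelled copies of auto_labels from P(manual | automatic)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    simulations = []
    for _ in range(n_sims):
        simulated = []
        for auto in auto_labels:
            candidates = list(prob_matrix[auto])
            weights = [prob_matrix[auto][c] for c in candidates]
            simulated.append(rng.choices(candidates, weights=weights)[0])
        simulations.append(simulated)
    return simulations


# Hypothetical probability matrix and automatic labels, for illustration only.
prob_matrix = {
    "vocalisation": {"vocalisation": 0.8, "noise": 0.2},
    "noise": {"noise": 1.0},
}
auto_labels = ["vocalisation", "noise", "vocalisation"]
sims = simulate_labels(auto_labels, prob_matrix)
print(len(sims), len(sims[0]))  # 1000 3
```

Each simulated dataset can then be summarised per hour and passed to the same model, so that labelling uncertainty carries through to the estimates.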

Dataset 2: the dataframe auto_model_data.May29.2023.csv describes the dataset that was automatically labelled by the built cluster model.

Column Name	Type	Unit	Description
INDIR	nominal	NA	output from Kaleidoscope Pro. directory that the audio file is contained in.
FOLDER	nominal	NA	output from Kaleidoscope Pro. folder the audio file is contained in.
year	ordinal	date	Year of sampling.
group	nominal	NA	unique identifier for each nest pair.
File	nominal	NA	Original WAV file name.
channel	discrete	NA	audio channel (left= 0, right = 1)
OFFSET	continuous	seconds	output from Kaleidoscope Pro. Time since the start of the audio file the detection originates from.
DURATION	continuous	seconds	output from Kaleidoscope Pro. The total duration of the detection.
Fmin	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the minimum frequency of the detection.
Fmean	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the mean frequency of the detection.
Fmax	continuous	Hertz (Hz)	output from Kaleidoscope Pro. the maximum frequency of the detection.
TOP1MATCH*	nominal	NA	output from Kaleidoscope Pro. The label that the cluster model assigned to the detection.
TOP1DIST	continuous	NA	output from Kaleidoscope Pro. Distance from the cluster center.
VOCALIZATIONS	discrete	NA	output from Kaleidoscope Pro. Value of 1 assigned to each row to enable calculations.
MANUAL ID	nominal	NA	The label that we manually labelled and assigned to the detection.
FilePath2	nominal	NA	full path of the directory that the audio file is contained in.
FilePath	nominal	NA	alternative path of the directory that the audio file is contained in.
size	continuous	gigabytes	size of the audio file
clutch_id	nominal	NA	unique identifier for each nest.
DateTime.file	ordinal	date & time	date and time that the audio file containing the detection started recording.
date	ordinal	date	Date the detection was recorded.
month	ordinal	month	Month of the year of the detection.
day	ordinal	day	Day of the month of the detection.
time	ordinal	time	Time of day the detection was recorded.
hour	ordinal	hour	Hour of day the detection was recorded.
date_hour	ordinal	date & time	date and hour of the day of the detection
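
The time columns in this table can be related to DateTime.file and OFFSET as sketched below: adding the offset (seconds) to the file start time gives the detection time, which can then be truncated to the hour and counted. This is a Python illustration with hypothetical function names and invented values, not the authors' code.

```python
from collections import Counter
from datetime import datetime, timedelta


def detection_hour(file_start, offset_seconds):
    """Clock hour in which a detection occurred, given file start and offset."""
    detected_at = file_start + timedelta(seconds=offset_seconds)
    return detected_at.replace(minute=0, second=0, microsecond=0)


def hourly_counts(detections):
    """Count detections per clock hour from (file_start, offset_seconds) pairs."""
    return Counter(detection_hour(start, offset) for start, offset in detections)


# Hypothetical detections: one file starting at 23:30, three detections.
start = datetime(2022, 9, 10, 23, 30)
counts = hourly_counts([(start, 60.0), (start, 2400.0), (start, 3000.0)])
for hour, n in sorted(counts.items()):
    print(hour, n)
```

Note that a detection late in a file can fall in a later hour, or on a later date, than the file's own start time.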

Dataset 3: the dataframe auto_model_data_missingfiles.May29.2023.csv describes the metadata of all acoustic files, irrespective of whether acoustic detections are present.

Column Name	Type	Unit	Description
year	ordinal	date	Year of sampling.
group	nominal	NA	unique identifier for each nest pair.
channel	discrete	NA	audio channel (left= 0, right = 1)
clutch_id	nominal	NA	unique identifier for each nest.
DateTime.file	ordinal	date & time	date and time that the audio file started recording.
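
A file list like this one makes it possible to tell "recorded, but nothing detected" apart from "not recorded". The Python sketch below assumes that is the purpose, which the description does not state outright; the function name, the key structure (clutch_id, file start) and the example values are hypothetical.

```python
from collections import Counter
from datetime import datetime


def detections_per_file(all_files, detections):
    """Count detections per recorded file, keeping files with zero detections.

    all_files: iterable of (clutch_id, file_start) keys, one per audio file.
    detections: iterable of the same keys, one entry per detection.
    """
    counts = Counter(detections)
    return {key: counts.get(key, 0) for key in all_files}


# Hypothetical files and detections for one nest.
files = [
    ("nest_A", datetime(2022, 9, 10, 0, 0)),
    ("nest_A", datetime(2022, 9, 10, 1, 0)),
]
detections = [("nest_A", datetime(2022, 9, 10, 0, 0))] * 3
print(detections_per_file(files, detections))
```

Files absent from the metadata stay absent from the result, so they can be treated as missing rather than as zero counts.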

Dataset 4: the dataframe NestTemperature.May30.2023.csv describes the hourly average nest temperature per nest, per year.

Column Name	Type	Unit	Description
nestID	nominal	NA	unique identifier for each nest.
Date.Hour	ordinal	date & time	date and hour of the day of nest temperature.
Date	ordinal	date	Date temperature was recorded and summarised.
Time	ordinal	time	Time of day temperature was recorded.
NestTemp_C	continuous	degrees Celsius	Average nest temperature per hour recorded from a buried temperature logger. Nest temperature is unique for each nest.

Dataset 5: the dataframe DailyPrecipitation.May9.2023.csv describes the total precipitation per day, acquired from a nearby weather station.

Column Name	Type	Unit	Description
Year	ordinal	date	Year precipitation was recorded.
Month	ordinal	month	Month of the year precipitation was recorded.
Day	ordinal	day	Day of the month precipitation was recorded.
precip_mean	continuous	millimeters	mean total hourly precipitation per day.
precip_tot	continuous	millimeters	total precipitation per day.
Date	ordinal	date	Date precipitation was recorded and summarised.

Dataset 6: the dataframe TurtleInNests.May17.2023.csv describes the number of viable eggs contained in each nest, each year.

Column Name	Type	Unit	Description
Year	ordinal	date	Year of sampling.
NestID	nominal	NA	unique identifier for each nest.
ClutchSize	discrete	number of eggs	number of viable eggs in the nest.

Part 5 - Supplemental Figure S1

Description: the R script ValidatingTemperature.R validates the iButton and HOBO temperature loggers between adjacent nests and over time.

Data and code supporting acoustic and environmental factors driving digging behavior in the early life of a freshwater turtle

README: Acoustic communication and temperature drive coordinated digging behavior during an early life stage transition of a freshwater turtle
