Vibroscape analysis reveals acoustic niche overlap and plastic alteration of vibratory courtship signals in ground-dwelling wolf spiders
Data files
Feb 23, 2024 version files 4.15 GB
- audio_all.zip
- MS_bout_noise.csv
- MS_bout_schiz_interaction.csv
- pianka_idx.csv
- pitfall.csv
- README.md
- schiz_property_noise.csv
- temperature.csv
Abstract
Soundscape ecology has enabled researchers to investigate natural interactions among biotic and abiotic sounds as well as their influence on local animals. To expand the scope of soundscape ecology to encompass substrate-borne vibrations (i.e. vibroscapes), we developed methods for recording and analyzing sounds produced by ground-dwelling arthropods, characterizing the vibroscape of a deciduous forest floor using inexpensive contact microphone arrays followed by automated sound filtering and detection in large audio datasets. Using the collected data, we tested the hypothesis that closely related species of Schizocosa wolf spider partition their acoustic niche. In contrast to previous studies on acoustic niche partitioning, two closely related species - S. stridulans and S. uetzi - showed high acoustic niche overlap across space, time, and/or signal structure. Finally, we examined whether substrate-borne noise, including anthropogenic noise (e.g., airplanes) and heterospecific signals, promotes behavioral plasticity in signaling behavior to reduce the risk of signal interference. We found that all three focal Schizocosa species increased the dominant frequency of their vibratory courtship signals in noisier signaling environments. Also, S. stridulans males displayed increased vibratory signal complexity with an increased abundance of S. uetzi, the sister species with which they overlap most heavily in the acoustic niche.
README: Vibroscape analysis reveals acoustic niche overlap and plastic alteration of vibratory courtship signals in ground-dwelling wolf spiders
Data description
audio_all.zip
Audio files of sounds detected in the raw audio datasets
MS_bout_noise.csv
Detected sound bouts with information on the abiotic/biotic noise present at the time of sound production
Columns
- glob_cat: global categories of sound bouts
- date: recording date
- plot: recording plot
- rec_ID: recorder id (all = air-borne sounds)
- file_begin: original recording file in which the bout starts
- file_end: original recording file in which the bout ends
- bout_loc_begin: time point of bout beginning within an audio file
- bout_loc_end: time point of bout ending within an audio file
- bout_glob_begin: time point of bout beginning in a whole recording duration
- bout_glob_end: time point of bout ending in the whole recording duration
- mean_temp: average temperature (recorded every 15 min) during sound production
** temperature for plot 'B', date '180518', from bout_glob_begin '29219.13042' to bout_glob_begin '62121.04488' was not measured due to an error in the temperature logger
- med_freq_cent: median frequency centroid
- med_dom_freq: median dominant frequency
- peak_rate: peak rate of sounds
- n_idle: number of idles (only for S. stridulans)
- n_noise_30m: number of abiotic/biotic noise events within 15 minutes before and after the sound bout
- t_noise_30m: number of noise types (glob_cat) within 15 minutes before and after the bout
- ent_noise_30m: Shannon entropy of the abiotic/biotic noise types that occurred within 15 minutes before and after the sound bout
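The entropy column above can be reproduced from the noise-type labels alone. A minimal sketch, assuming the glob_cat labels within the 30-minute window are supplied as a plain list and that the entropy uses the natural logarithm (the README does not state the base):

```python
import math
from collections import Counter

def noise_type_entropy(noise_types):
    """Shannon entropy (natural log) of noise-type frequencies.

    noise_types: list of glob_cat labels detected within 15 minutes
    before and after a bout (hypothetical input format).
    """
    counts = Counter(noise_types)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    return -sum(p * math.log(p) for p in probs)
```

For example, a window containing two equally frequent noise types yields ln(2) ≈ 0.693, while a window with a single type yields 0.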
schiz_property_noise.csv
Detected signal bouts of S. duplex, S. stridulans, and S. uetzi with information on general noise and conspecific/heterospecific signals during the interaction time window (15 minutes before and after each signal bout)
Columns
- glob_cat: global categories of sound bouts
- date: recording date
- plot: recording plot
- rec_ID: recorder id (all = air-borne sounds)
- bout: signal bout id within a day from a single recording unit
- file_begin: original recording file in which the bout starts
- file_end: original recording file in which the bout ends
- bout_loc_begin: time point of bout beginning within an audio file
- bout_loc_end: time point of bout ending within an audio file
- bout_glob_begin: time point of bout beginning in a whole recording duration
- bout_glob_end: time point of bout ending in the whole recording duration
- mean_temp: average temperature (recorded every 15 min) during sound production
- med_dom_freq: median of the dominant frequency of the signal bout
- max_dom_freq: maximum of the dominant frequency within a signal bout
- min_dom_freq: minimum of the dominant frequency within a signal bout
- med_zcr: median of the zero-crossing rate of the signal bout
- max_zcr: maximum of the zero-crossing rate within a signal bout
- min_zcr: minimum of the zero-crossing rate within a signal bout
- med_spec_cent: median of the spectral centroid of the signal bout
- max_spec_cent: maximum of the spectral centroid within a signal bout
- min_spec_cent: minimum of the spectral centroid within a signal bout
- med_spec_bw: median of the spectral bandwidth of the signal bout
- max_spec_bw: maximum of the spectral bandwidth within a signal bout
- min_spec_bw: minimum of the spectral bandwidth within a signal bout
- t_noise_30m: number of noise types (glob_cat) within 15 minutes before and after the bout
- n_noise_30m: number of detected sounds within 15 minutes before and after the bout
- ent_noise_30m: Shannon entropy of the noise types (glob_cat) within 15 minutes before and after the bout
- n_cs_30min: number of conspecific sounds within 15 minutes before and after the bout
- n_hs_30min: number of heterospecific sounds within 15 minutes before and after the bout
- n_idle: number of idles (only for S. stridulans)
- dr_idle: the length of idles (only for S. stridulans)
pianka_idx.csv
Pianka's niche overlap index between types of sounds
Columns
- cat_1: first category of sound
- cat_2: second category of sound
- plot_pi: the overlap in occurrence plots between cat_1 and cat_2
- date_pi: the overlap in occurrence date between cat_1 and cat_2
- time_pi: the overlap in occurrence time between cat_1 and cat_2
- freq_pi: the overlap in dominant frequencies between cat_1 and cat_2
- all_pi: plot_pi x date_pi x time_pi x freq_pi
- cat_1_type: airborne (ac) or vibratory (vi)
- cat_2_type: airborne (ac) or vibratory (vi)
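As a hedged illustration of how the overlap columns could be computed, Pianka's index for two sound categories takes their occurrence proportions across the same ordered resource bins (plots, dates, hours, or frequency bins). The function name and input format below are hypothetical, not the authors' code:

```python
import math

def pianka_overlap(counts_1, counts_2):
    """Pianka's niche overlap index between two sound categories.

    counts_1, counts_2: occurrence counts of each category across the
    same ordered resource bins (e.g. plots, dates, or hours).
    Returns a value in [0, 1]: 0 = no overlap, 1 = identical use.
    """
    total_1, total_2 = sum(counts_1), sum(counts_2)
    p1 = [c / total_1 for c in counts_1]
    p2 = [c / total_2 for c in counts_2]
    numerator = sum(a * b for a, b in zip(p1, p2))
    denominator = math.sqrt(sum(a * a for a in p1) * sum(b * b for b in p2))
    return numerator / denominator
```

The all_pi column is then simply the product of the four per-dimension indices, as stated above.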
pitfall.csv
Number of Schizocosa wolf spiders collected from pitfall traps
Columns
- species: species name
- date: collection date
- time: time period of collection (0: 0000-0800, 8: 0800-1600, 16: 1600-0000)
- plot: collection site (recording plot id)
- male: number of males collected
temperature.csv
Temperature data during the recording period
Columns
- plot: recording plot
- date: recording date
- time: temperature recording time (e.g. 1226 = 12:26)
- temp: temperature in Celsius
- rec_time: corresponding recording time
Code description
code1_filtering.py
Python code for audio filtering
code2_finding_bouts.py
Python code for finding bouts
code3_spectrogram.py
Python code to create spectrograms for manual sound classification
vibroscape_analysis.R
R code for statistical analyses and figures
test_methods.ipynb
Jupyter Notebook for testing the effects of audio processing methods on acoustic properties (Supplementary material S1)
Supplementary_S2.pptx
PowerPoint file explaining the background algorithms for the code
Supplementary_S3.zip
Audio files and spectrograms of unclassified vibratory sounds (unknown1, unknown2, unknown3)
Methods
Field recording:
For field recordings, we chose five study plots (10 m x 10 m) at the field station of the University of Mississippi at Abbeville, Mississippi, USA (34˚43’ N 89˚39’ W). In each study plot, we deployed a TemLog20 temperature logger (Tamtop, Milpitas, California, USA), 25 recording units consisting of a contact microphone (35 mm diameter, Goedrum Co., Changhua, Taiwan) and a Toobom R01 8GB acoustic recorder (Toobom, China), and four pitfall traps (Carolina Biological Supply Company, Burlington, North Carolina, USA). The temperature loggers recorded the temperature at each recording plot every 15 minutes during the experimental periods. In total, we deployed 125 recording units, 5 temperature loggers, and 20 pitfall traps across our five study plots. We conducted a 24-hour recording every three days from May 15th to July 15th, 2018, resulting in a total of 1950 24-hour recordings across 13 days. Substrate-borne vibrations in the study plots were recorded continuously for 24 hours starting at 0800, except for a 10-minute break at 1600 to replace the audio recorders, which was necessary due to their limited battery capacity.
After a ~24-hour recording, we extracted uncompressed WAV files at a 48-kHz sampling rate from the recorders to an 8 TB external hard drive (Seagate Technology LLC., Cupertino, California, USA). On the same day, we collected specimens from the pitfall traps at three different times (0800, 1600, and 0000) to observe temporal variation in the activity of species in the study plots. We sorted the collected specimens by time of collection, collection date, and study plot, and preserved them in 95% ethanol for later species identification by PM. We used the collected specimens to corroborate our species identification of sound recordings across locations.
Data processing:
To automate signal detection for classification, we wrote Python programs to filter background noise, detect pulses, and group pulses into biologically meaningful signal bouts. Before processing, we divided each 24-hour WAV recording into 10-minute chunks using FFmpeg to speed up processing.
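The authors used FFmpeg for this chunking step; purely as an illustration, the same split can be sketched with Python's standard-library wave module (function name and output naming are hypothetical):

```python
import wave

def split_wav(path, chunk_seconds=600, out_prefix="chunk"):
    """Split a long WAV recording into fixed-length chunks.

    A stdlib stand-in for the FFmpeg step described above; the last
    chunk may be shorter than chunk_seconds.
    """
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = params.framerate * chunk_seconds
        written = []
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            name = f"{out_prefix}_{index:04d}.wav"
            with wave.open(name, "wb") as dst:
                dst.setparams(params)  # header nframes is patched on close
                dst.writeframes(frames)
            written.append(name)
            index += 1
    return written
```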
Noise filtering:
Due to spatial/temporal variation in background noise, we conducted adaptive noise filtering using the unique frequency spectrum of the background noise of each 10-minute WAV chunk. To acquire this spectrum, the program calculated an amplitude threshold by sigma clipping as m + ασ, where m is the median and σ the standard deviation of the amplitudes (mV) of all sampling points in the WAV chunk. The constant α was chosen from values between 1 and 10 at intervals of 0.3 by the elbow method applied to the number of sampling points above the amplitude threshold. Once the amplitude threshold was determined, the program extracted the frequency spectrum of the longest segment below the threshold by Fast Fourier Transform and filtered the WAV chunk using this ‘background noise’ spectrum.
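The thresholding step above can be sketched in a few lines. The authors used the kneed package for the elbow search; the maximum-curvature heuristic below is a rough stand-in for it, and both function names are hypothetical:

```python
import statistics

def sigma_clip_threshold(samples, alpha):
    """Amplitude threshold m + alpha * sigma (median plus scaled SD)."""
    m = statistics.median(samples)
    s = statistics.pstdev(samples)
    return m + alpha * s

def choose_alpha(samples):
    """Pick alpha in [1, 10] (step 0.3) by a simple elbow heuristic
    on the count of sampling points above each candidate threshold."""
    alphas = [round(1 + 0.3 * i, 1) for i in range(31)]
    counts = [sum(1 for x in samples if x > sigma_clip_threshold(samples, a))
              for a in alphas]
    # discrete curvature of the count curve: c[i-1] - 2*c[i] + c[i+1]
    curvature = [counts[i - 1] - 2 * counts[i] + counts[i + 1]
                 for i in range(1, len(counts) - 1)]
    return alphas[curvature.index(max(curvature)) + 1]
```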
Sound detection and classification:
After noise filtering, we updated the amplitude threshold of each file by sigma clipping, using the same procedure as in noise filtering to find the optimal α. Using the updated threshold, we detected pulses above it and recorded the amplitude and time of each detected pulse within a WAV chunk for pulse grouping and sound classification. For pulse grouping, the program calculated the time intervals between adjacent detected pulses and applied a Gaussian Mixture Model (GMM) to classify the intervals into three categories: within-bout, between bouts within a single signaling activity, and between signaling activities. We then grouped pulses into bouts according to the GMM results for sound classification.
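The grouping step can be illustrated without the GMM. The pipeline classifies inter-pulse intervals with a Gaussian Mixture Model; the sketch below substitutes a fixed gap threshold (within_bout_gap, a hypothetical value) just to show how classified intervals translate into bouts:

```python
def group_pulses(pulse_times, within_bout_gap=1.0):
    """Group detected pulse times (in seconds) into bouts.

    Pulses separated by more than within_bout_gap start a new bout.
    In the actual pipeline this cut-off is learned by a GMM rather
    than fixed in advance.
    """
    bouts = []
    current = []
    for t in sorted(pulse_times):
        if current and t - current[-1] > within_bout_gap:
            bouts.append(current)
            current = []
        current.append(t)
    if current:
        bouts.append(current)
    return bouts
```

For example, pulses at 0.0, 0.2, 0.5, 5.0, and 5.1 s form two bouts under a 1-second gap.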
An expert in spider sound analysis (NC) classified detected sounds by visual inspection of spectrograms. To classify non-spider sounds, we used BirdNET (Kahl et al., 2021) and the library of Singing Insects of North America (SINA; Walker & Moore, 2003). For BirdNET, we accepted the species with the highest probability value from the online bird sound identification system. If consecutive conspecific (or same-class) sounds were recorded at the same vibratory sensor within one minute of each other, we grouped them into a single signal bout. Also, if conspecific sounds from the same recording plot were detected by multiple sensors simultaneously, we classified them as airborne sounds transmitted to the ground and counted them as a single signal bout. When we could not identify a reliable species or source for a sound, we labeled its type ‘unknown’.
Usage notes
Python packages: Pydub (Robert & Webbie, 2018), kneed (Satopää et al., 2011), Noisereduce (Sainburg et al., 2020), SciPy (Virtanen et al., 2020), and scikit-learn (Pedregosa et al., 2011).
R was used for the statistical analyses.