Estimating spatio-temporal reproductive dynamics of fish populations with passive acoustic monitoring: A state-space model approach
Data files
Dec 10, 2025 version files 842.43 MB
- audio_data.zip (440.67 MB)
- bayesian_p_LMP_xyrhoConst_log_5_6_2.ipynb (67.43 KB)
- chorus_laplace_scale2.R (32.82 KB)
- chorustime_auto4.R (24.76 KB)
- coastline_location.out (1.49 KB)
- coastline_plot.R (1.92 KB)
- environment.yml (13.83 KB)
- file_names2.csv (30.62 KB)
- kashima_csvs3_TL2.zip (89.26 MB)
- MCMC_results.zip (305.74 MB)
- mic_location_xycoordinate_info.out (1.19 KB)
- mic_location_xycoordinate_info.R (1.36 KB)
- mic_location_xycoordinate.csv (127 B)
- mic_location.csv (184 B)
- model_check_LMPxyrhoConst_log10_5_6_2.ipynb (3.65 MB)
- numerical_integration.zip (133.49 KB)
- ParticleFilter_LMPxyrhoConst_date_MP1_log10_5_6_2.ipynb (90.12 KB)
- ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim1_1.ipynb (145.96 KB)
- ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim2_1.ipynb (151.96 KB)
- ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim3_1.ipynb (157.71 KB)
- ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim4_1.ipynb (144.58 KB)
- Particlefilter_matrix_making3-15.ipynb (86.36 KB)
- Particlefilter_matrix_making3-2.ipynb (26.75 KB)
- Particlefilter_matrix_making3.ipynb (110.15 KB)
- plot_for_paper_appendix3.R (28.90 KB)
- plot_for_paper3.R (29.98 KB)
- pop_location_log_5_6_2.R (7.86 KB)
- prior_predictive_check_5_6_2.ipynb (1.21 MB)
- q25_autoChorusDetect4.zip (34.11 KB)
- R_sessionInfo.txt (995 B)
- ReadMe_for_codes.txt (5.25 KB)
- ReadMe_for_data.txt (5.60 KB)
- README.md (23.84 KB)
- requirements.txt (1 KB)
- sim_result2.R (31.33 KB)
- sound_attenuation_matrix_test.ipynb (437.68 KB)
- sound_propagate2.ipynb (46.05 KB)
- tide_data_processing.R (11.05 KB)
- TL_detect3.m (6.77 KB)
- water_temp_dayaverage_ibaraki.R (1.44 KB)
Abstract
Passive acoustic monitoring (PAM) has been used to estimate the presence and spatial distribution of target organisms using biological sounds received by microphones. Due to its cost-effectiveness and non-invasiveness, PAM is becoming a promising approach for studying the spatiotemporal dynamics of large groups in response to environmental changes. However, conventional PAM can face significant constraints in dealing with the collective vocalisations of organisms. Here, we extend the traditional sound propagation equation for targeting a collectively vocalising group and propose a state-space model that estimates the acoustic group’s location and size with a limited number of microphones. These developments overcome traditional limitations, such as financial, operational, and methodological costs and problems associated with the use of numerous microphones. Simulation studies confirmed that the model produced systematically unbiased estimates of acoustic group size and location across varying group parameters. Furthermore, the estimations of the acoustic group locations were robust to a violation of the model assumption that the spatial extent of the acoustic group is temporally constant. As an empirical demonstration, applying the proposed approach to white croaker (Pennahia argentata) tracked the daily movement of the acoustic group centre in relation to tidal conditions. Moreover, the approach not only revealed temporal variations in acoustic group size with a 10-minute resolution but also revealed temporal shifts in the peak timing of choruses. These findings demonstrate the potential for novel biological insights. The proposed approach enables long-term and comprehensive assessment of biological status and supports effective resource management through spatiotemporally fine-scale predictions of future distribution and abundance. 
Moreover, this approach can be applied to collectively vocalising groups across a wide range of taxonomic groups, including birds, insects and amphibians, even when using a limited number of microphones.
Description of the data and file structure
In this analysis, we used audio data recorded by 5 hydrophones. These recordings were converted to files of sound intensity in the 500-1000 Hz band. The sound intensity data were then cropped to the duration of each fish chorus to create chorus sound volume data. From these chorus sound volume data, we estimated the centre and size of the fish group simultaneously via Particle Filter MCMC. This calculation relies on numerical integration.
Model performance was checked under varying parameter settings and under several situations, both those assumed and those not assumed by the model. The model was then applied to the chorus of white croaker as a case study. From the MCMC results, we estimated the relationship between fish chorus features and environmental factors as a secondary analysis.
In each code file, detailed explanations of each calculation are given alongside the code.
You can reproduce the results by following the workflow described below.
The detailed descriptions of each code and data are provided after the workflow.
Workflow to reproduce the results
To fully reproduce the results reported in the manuscript, users can follow these steps:
- Run TL_detect3.m to compute sound intensity (input: audio_data.zip and file_names2.csv, output: kashima_csvs3_TL2.zip).
- Run chorustime_auto4.R to create chorus sound volume data (input: kashima_csvs3_TL2.zip, output: q25_autoChorusDetect4.zip).
- Run mic_location_xycoordinate_info.R to convert hydrophone coordinates from the longitude-latitude scale to xy-coordinates. In addition, it shows the geographical distance between the hydrophones and their relative positioning (input: mic_location_xycoordinate_info.out, output: mic_location_xycoordinate.csv).
- Run Particlefilter_matrix_making3.ipynb (and the related files Particlefilter_matrix_making3-15.ipynb and Particlefilter_matrix_making3-2.ipynb) to compute the numerical integration results (output: numerical_integration.zip).
- Run ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim*_1.ipynb for the simulation studies (* is the simulation number, 1-4) (input: numerical_integration.zip, output: MCMC_results.zip).
- Run ParticleFilter_LMPxyrhoConst_date_MP1_log10_5_6_2.ipynb for the case study (input: numerical_integration.zip and q25_autoChorusDetect4.zip, output: MCMC_results.zip).
- Run water_temp_dayaverage_ibaraki.R and tide_data_processing.R to prepare the water temperature and tide information for the later secondary analysis (input: day_average_wtemp.csv and kashima_tide_August_2015.txt, output: day_average_wtemp_analizable.csv and kashima_tide.csv).
- Run pop_location_log_5_6_2.R to convert the location of the group centre from the longitude-latitude scale to the distance from the coastline (input: MCMC_results.zip, output: pop_loc_CoastDist.csv and a figure of the group centre).
- Run sim_result2.R to calculate and show the results of the simulation studies (input: MCMC_results.zip, output: figures of the simulation results).
- Run plot_for_paper3.R to conduct statistical tests and generate the main figures in the manuscript (input: MCMC_results.zip, output: figures and tables of the case study results).
- Run plot_for_paper_appendix3.R to create the plots in the appendix (input: MCMC_results.zip, output: figures and tables shown in the appendix).
- (Optional) Run bayesian_p_LMP_xyrhoConst_log_5_6_2.ipynb, prior_predictive_check_5_6_2.ipynb, and model_check_LMPxyrhoConst_log10_5_6_2.ipynb for the model validation analyses shown in the appendix (input: MCMC_results.zip, output: figures and tables of the Bayesian p-value, the prior predictive check, and the difference between model prediction and observed data).
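Before running the steps above, it may help to confirm that each step's input files are present. The following is a minimal Python sketch and is not part of the archived code. The `WORKFLOW_INPUTS` table transcribes the inputs listed above for the first few steps only, `missing_inputs` is a hypothetical helper name, and the sketch assumes all files sit in one directory.

```python
from pathlib import Path

# Inputs per workflow step, transcribed from the list above (first steps only).
WORKFLOW_INPUTS = {
    "TL_detect3.m": ["audio_data.zip", "file_names2.csv"],
    "chorustime_auto4.R": ["kashima_csvs3_TL2.zip"],
    "mic_location_xycoordinate_info.R": ["mic_location_xycoordinate_info.out"],
    "ParticleFilter_LMPxyrhoConst_date_MP1_log10_5_6_2.ipynb": [
        "numerical_integration.zip",
        "q25_autoChorusDetect4.zip",
    ],
}


def missing_inputs(base_dir, workflow=WORKFLOW_INPUTS):
    """Return {script: [missing input files]} for scripts whose inputs are absent."""
    base = Path(base_dir)
    report = {}
    for script, inputs in workflow.items():
        absent = [name for name in inputs if not (base / name).exists()]
        if absent:
            report[script] = absent
    return report


if __name__ == "__main__":
    # Check the current directory; an empty report means nothing is missing.
    for script, absent in missing_inputs(".").items():
        print(f"{script}: missing {', '.join(absent)}")
```

Extending the table with the remaining steps follows the same pattern.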
Files and variables
File: ReadMe_for_data.txt
Description: ReadMe_for_data.txt contains summary information on each dataset generated or used in this study. Its main content is the same as this README.
File: ReadMe_for_codes.txt
Description: ReadMe_for_codes.txt contains summary information on each code file used in this study. Its main content is the same as this README.
File: audio_data.zip
Description: This file contains a subset of raw audio data, recorded on 2015/08/07 at the MP3 site during 14:00-21:45. If more raw data is required, please contact the authors.
File: kashima_csvs3_TL2.zip
Description: kashima_csvs3_TL2 contains the sound intensity in the 500-1000 Hz band calculated in TL_detect3.m.
File: numerical_integration.zip
Description: This folder contains the results of the numerical integration. These data are created in Particlefilter_matrix_making3.ipynb, Particlefilter_matrix_making3-15.ipynb, and Particlefilter_matrix_making3-2.ipynb, and are used for the Particle Filter MCMC. The details are described in Particlefilter_matrix_making3.ipynb.
File: MCMC_results.zip
Description: This file contains the results of MCMC. Four types of simulations were computed in "ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim1_1.ipynb", "ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim2_1.ipynb", "ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim3_1.ipynb", and "ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim4_1.ipynb". To run these simulations with other parameter settings and seed values, change the parameters and seed values in the script. The pairs of parameters and seed values used in the manuscript can be checked in sim_result2.R.
The case study was computed in "ParticleFilter_LMPxyrhoConst_date_MP1_log10_5_6_2.ipynb". To compute other days of the case study, the sound volume file name (contained in the "q25_autoChorusDetect4" folder) and nday should be changed in the script.
File: q25_autoChorusDetect4.zip
Description: q25_autoChorusDetect4 contains the sound intensity of the chorus and the chorus-info file, both calculated in chorustime_auto4.R. The sound volume data are used for the Particle Filter MCMC. The chorus-info file is used in the later analysis in plot_for_paper3.R.
File: day_average_wtemp.csv
Description: This is the raw water temperature data from the Japan Meteorological Agency (https://www.data.jma.go.jp/kaiyou/data/db/kaikyo/series/engan/txt/area138-past.txt).
We copied the data from 4 August 2015 to 3 September 2015 from this source text file and pasted it into a CSV file named 'day_average_wtemp.csv'.
The raw data from the source is difficult to analyse directly, so a more convenient dataset is recreated in water_temp_dayaverage_ibaraki.R.
The specifications of this data are described in water_temp_dayaverage_ibaraki.R.
This file is distributed as supplemental information via Zenodo.
Variables
- 2015: year
- 08: month
- 04: day
- 138: area number
- R: flag showing whether the value is a reanalysis value. If the flag is R, the water temperature is a reanalysis value.
- 26.09: water temperature
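As an illustration of the column layout above, the following Python sketch parses header-less rows of this form. It is not the processing done in water_temp_dayaverage_ibaraki.R. The comma delimiter is an assumption about how the rows were pasted into the CSV file, and the field names in `FIELDS` are chosen here for readability.

```python
import csv
import io
from datetime import date

# Column order follows the variable list above; the names are illustrative.
FIELDS = ["year", "month", "day", "area_no", "flag", "temp"]


def parse_wtemp(text):
    """Parse header-less rows into dicts with a date, a reanalysis flag, and a temperature."""
    rows = []
    for rec in csv.reader(io.StringIO(text)):
        if not rec:
            continue  # skip blank lines
        raw = dict(zip(FIELDS, (value.strip() for value in rec)))
        rows.append({
            "date": date(int(raw["year"]), int(raw["month"]), int(raw["day"])),
            "area_no": int(raw["area_no"]),
            "is_reanalysis": raw["flag"] == "R",
            "temp": float(raw["temp"]),
        })
    return rows


# The example row is built from the values shown in the variable list above.
sample = "2015,08,04,138,R,26.09\n"
print(parse_wtemp(sample))
```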
File: day_average_wtemp_analizable.csv
Description: This is the water temperature data that is recreated in water_temp_dayaverage_ibaraki.R for later analysis. This file is distributed as supplemental information via Zenodo.
Variables
- day: day
- areaNo: number of area
- flag: flag
- temp: water temperature
File: coastline_location.out
Description: This file shows the specifications for converting the coastline locations from the longitude-latitude scale to xy-coordinates.
This conversion was conducted using the website of the Japanese "Geographical Survey Institute" (https://vldb.gsi.go.jp/sokuchi/surveycalc/surveycalc/bl2xyf.html).
The specifications of each column of this file are described in coastline_location.out.
File: file_names2.csv
Description: file_names2 contains the names of the audio files. This file is used for "TL_detect3.m".
Variables
- MP1: site name of the recording files
- kashima150804_0906/AUSOMS/MP1/DS800151.MP3: filename of the recording files
- 2015/8/4 21:45: the end time of the recording
File: mic_location_xycoordinate.csv
Description: This file contains the locations of the hydrophones in the xy-coordinate system. This file was created in mic_location_xycoordinate_info.R.
Variables
- The first column: the number of each row
- x: Longitudinal location of hydrophones [m]
- y: Latitudinal location of hydrophones [m]
- z: Site name
File: mic_location_xycoordinate_info.out
Description: This file shows the specifications for converting the hydrophone locations from the longitude-latitude scale to xy-coordinates. This conversion was conducted using the website of the Japanese "Geographical Survey Institute" (https://vldb.gsi.go.jp/sokuchi/surveycalc/surveycalc/bl2xyf.html). The specifications of the columns in this file are described in mic_location_xycoordinate_info.out.
File: kashima_tide_August_2015.txt
Description: This is the raw tide data from the Japan Meteorological Agency (https://www.data.jma.go.jp/kaiyou/data/db/tide/suisan/txt/2015/D2.txt). From this source txt file, we used only the data from 1 August to 4 September and saved it as the renamed text file "kashima_tide_August_2015.txt". The structure of the text file is described in "tide_data_processing.R". When creating "kashima_tide_August_2015.txt", wherever the high-tide and low-tide times ran directly into the preceding field, we separated them with half-width spaces as pre-processing so that the text could be converted easily. This file is distributed as supplemental information via Zenodo.
File: kashima_tide.csv
Description: This file is the tide table, containing the hourly tide levels and the levels and times of the high and low tides. It is recreated in tide_data_processing.R from kashima_tide_August_2015.txt to make the analysis easier. This file is distributed as supplemental information via Zenodo.
Variables
- tide0-23: tide levels from 0:00 to 23:00 each day
- year: year
- date: date
- site_name: name of the observation site
- high_tide_time1: time of the first high tide of the day
- high_tide1: height of the first high tide of the day
- high_tide_time2: time of the second high tide of the day
- high_tide2: height of the second high tide of the day
- low_tide_time1: time of the first low tide of the day
- low_tide1: height of the first low tide of the day
- low_tide_time2: time of the second low tide of the day
- low_tide2: height of the second low tide of the day
File: mic_location.csv
Description: This file contains the locations of hydrophones on a longitude-latitude scale.
Variables
- id: row number
- microphone: site name of each hydrophone
- longitude: longitude of each hydrophone
- latitude: latitude of each hydrophone
File: mesh500_35_140.txt
Description: This file contains the depth at each longitude–latitude coordinate. This raw data is from the Japan Coast Guard (https://www.jodc.go.jp/vpage/depth500_file.html). From this data source, to create the depth data("mesh500_35_140.txt"), select the mesh for Latitude[deg.]: 35.00 - 36.00 and Longitude[deg.]: 140.00 - 141.00, and then download it. The downloaded file should match the one used in our analysis. The specifications of each column of this data are described in plot_for_paper_appendix3.R.
This data source does not permit the use of data without distribution or citation. Therefore, if you wish to use this data, you must download it yourself. Additionally, please review the data usage terms when utilizing the data.
File: pop_loc_CoastDist.csv
Description: This file contains the distance between the coastline and the group centre. This file is created in pop_location_log_5_6_2.R.
Please note that a CSV file cannot retain the date format, so please open this file via R.
Variables
- x: Distance from the coastline of each group centre.
- y: Distance along the shoreline direction.
- day_n: the number of days
- day: observation day (mm-dd format)
File: mito_suntime.csv
Description: This file contains the sunrise and sunset times at the study site. The data sources are https://eco.mtk.nao.ac.jp/koyomi/dni/2015/s0808.html.en and https://eco.mtk.nao.ac.jp/koyomi/dni/2015/s0809.html.en. To create "mito_suntime.csv", we copied the data from the tables at these links for the period from August 1, 2015, to September 3, 2015, into a CSV file and named it mito_suntime.csv. The column names were changed, from left to right, to Day, sunrise, sunrise_degree, Median_altitude, Median_altitude_degree, sunset, and sunset_degree.
This data source does not permit the use of data without distribution or citation. Therefore, if you wish to use this data, you must download it yourself. Additionally, please review the data usage terms when utilizing the data.
Variables
- Day: day of the sunrise and sunset (August and September are abbreviated, and added in script "plot_for_paper_appendix3.R")
- sunrise: sunrise time
- sunrise_degree: degree of the sunrise
- Median_altitude: Time of meridian transit
- Median_altitude_degree: Southern Latitude
- sunset: sunset time
- sunset_degree: degree of the sunset
File: requirements.txt
Description: This file contains the package version information for the Python environment used in this study. It was created using pip freeze --exclude-editable | grep -v @ > requirements.txt under a conda computational environment. To use this environment setting, run pip install -r requirements.txt in your own computational environment.
File: R_sessionInfo.txt
Description: This is the information on the R environment. This file contains the information provided via sessionInfo() in the R environment.
File: environment.yml
Description: This file includes information on the Python computational environment. To create the Python environment as a conda environment, use conda env create -f environment.yml
Code/software
In these analyses, MATLAB, Python, and R are used.
Descriptions of the codes
File: TL_detect3.m
Description: This script calculates the sound intensity of each segment (2^18 points, approximately 6 seconds). The TL (500-1000 Hz) calculated in this file is used in the following analysis.
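To make the segment-wise band intensity concrete, here is a minimal Python sketch of one way to compute a 500-1000 Hz level per non-overlapping segment with an FFT. It is not a port of TL_detect3.m. The function name `band_level_db`, the scaling, and the dB reference are assumptions, and the sampling rate `fs` must be supplied by the user (2^18 points in roughly 6 seconds would correspond to a rate near 44 kHz).

```python
import numpy as np


def band_level_db(signal, fs, seg_len=2**18, f_lo=500.0, f_hi=1000.0):
    """Return one band level (dB, arbitrary reference) per non-overlapping segment."""
    signal = np.asarray(signal, dtype=float)
    n_seg = len(signal) // seg_len  # trailing samples that do not fill a segment are dropped
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    levels = np.empty(n_seg)
    for i in range(n_seg):
        seg = signal[i * seg_len:(i + 1) * seg_len]
        power = np.abs(np.fft.rfft(seg)) ** 2 / seg_len**2
        # Sum the power of the bins inside the band; the tiny offset avoids log10(0).
        levels[i] = 10.0 * np.log10(power[in_band].sum() + np.finfo(float).tiny)
    return levels


if __name__ == "__main__":
    fs = 8000  # illustrative sampling rate, not the recorder's
    t = np.arange(4 * 4096) / fs
    tone = np.sin(2 * np.pi * 750 * t)  # a tone inside the band
    print(band_level_db(tone, fs, seg_len=4096))
```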
File: chorustime_auto4.R
Description: This script calculates the start and end times of the chorus and creates the sound volume data for the chorus period.
File: mic_location_xycoordinate_info.R
Description: This script converts the locations of the hydrophones from the longitude-latitude scale to xy-coordinates. In addition, it shows the geographical distance between the hydrophones and their relative positioning.
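For readers who want a quick cross-check of hydrophone spacing, the sketch below uses a local equirectangular approximation on a spherical Earth. This is a swapped-in simplification, not the Geographical Survey Institute conversion used in this study, so results will differ slightly from mic_location_xycoordinate.csv. The function names and the example coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def lonlat_to_local_xy(lon, lat, lon0, lat0):
    """Approximate east (x) and north (y) offsets in metres from the reference (lon0, lat0)."""
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def pairwise_distances(points):
    """Return {(name_a, name_b): distance in metres} for all pairs of named xy points."""
    names = sorted(points)
    out = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            out[(a, b)] = math.dist(points[a], points[b])
    return out


if __name__ == "__main__":
    # Hypothetical positions, NOT the actual hydrophone coordinates.
    lonlat = {"A": (140.700, 35.900), "B": (140.705, 35.900), "C": (140.700, 35.905)}
    lon0, lat0 = lonlat["A"]
    xy = {name: lonlat_to_local_xy(lon, lat, lon0, lat0) for name, (lon, lat) in lonlat.items()}
    for pair, dist in pairwise_distances(xy).items():
        print(pair, round(dist, 1))
```

The approximation is reasonable only over short distances around the reference point; for anything beyond a quick check, a proper map projection is the safer choice.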
File: Particlefilter_matrix_making3.ipynb and other related files
Description: This script calculates in advance the numerical integration of the sound attenuation from the group with K=10, and creates the files that will be used in the Particle Filter MCMC. The files created in this script are the results of the numerical integration. The other files, "Particlefilter_matrix_making3-15.ipynb" and "Particlefilter_matrix_making3-2.ipynb", compute the numerical integration with K=15 and K=20, respectively. The files created in "Particlefilter_matrix_making3-**.ipynb" will be used in the simulation studies.
File: ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim1_1.ipynb
Description: This file calculates the results of simulation 1. To run it with other parameter settings and seed values, change the parameters and seed values in the script. The pairs of parameters and seed values used in the manuscript can be checked in sim_result2.R.
File: ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim2_1.ipynb
Description: This file calculates the results of simulation 2. To run it with other parameter settings and seed values, change the parameters and seed values in the script. The pairs of parameters and seed values used in the manuscript can be checked in sim_result2.R.
File: ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim3_1.ipynb
Description: This file calculates the results of simulation 3. To run it with other parameter settings and seed values, change the parameters and seed values in the script. The pairs of parameters and seed values used in the manuscript can be checked in sim_result2.R.
File: ParticleFilter_LMPxyrhoConst_MP1_log10_5_6_2_sim4_1.ipynb
Description: This file calculates the results of simulation 4 (not described in the manuscript; this simulation was run in response to a referee). To run it with other parameter settings and seed values, change the parameters and seed values in the script. The pairs of parameters and seed values used in the manuscript can be checked in sim_result2.R.
File: ParticleFilter_LMPxyrhoConst_date_MP1_log10_5_6_2.ipynb
Description: This file applies the model to the case study. The script estimates x, y, b, sigma_r, and rho using Particle Filter MCMC. To compute other days, change the sound volume file name (contained in the "q25_autoChorusDetect4" folder) and nday in the script.
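For readers unfamiliar with particle filtering, the sketch below shows a generic bootstrap particle filter on a toy model: a one-dimensional random walk observed with Laplace noise. It is not the state-space model in these notebooks. The toy model, the function names, and all parameter values are assumptions made for illustration. The returned log-likelihood estimate is the kind of quantity a particle MCMC scheme typically feeds into its Metropolis-Hastings acceptance step.

```python
import numpy as np


def bootstrap_filter(obs, n_particles, init_mean, init_sd, sigma_state, scale_obs, rng):
    """Bootstrap particle filter for a 1-D random walk observed with Laplace noise.

    Returns the filtered state means and an estimate of the log marginal likelihood.
    """
    particles = rng.normal(init_mean, init_sd, size=n_particles)
    means = np.empty(len(obs))
    loglik = 0.0
    for t, y in enumerate(obs):
        if t > 0:
            # Propagate each particle through the random-walk state equation.
            particles = particles + rng.normal(0.0, sigma_state, size=n_particles)
        # Laplace log-density of the observation given each particle.
        logw = -np.abs(y - particles) / scale_obs - np.log(2.0 * scale_obs)
        shift = logw.max()  # subtract the maximum for numerical stability
        w = np.exp(logw - shift)
        loglik += shift + np.log(w.mean())
        w /= w.sum()
        means[t] = np.sum(w * particles)
        # Multinomial resampling in proportion to the normalised weights.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return means, loglik


def simulate(n_steps, sigma_state, scale_obs, rng):
    """Simulate the toy random walk and its noisy observations."""
    state = np.cumsum(rng.normal(0.0, sigma_state, size=n_steps))
    obs = state + rng.laplace(0.0, scale_obs, size=n_steps)
    return state, obs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state, obs = simulate(200, sigma_state=0.1, scale_obs=1.0, rng=rng)
    means, loglik = bootstrap_filter(obs, 500, 0.0, 1.0, 0.1, 1.0, rng)
    print("log-likelihood estimate:", loglik)
    print("RMSE of filtered mean:", np.sqrt(np.mean((means - state) ** 2)))
```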
File: water_temp_dayaverage_ibaraki.R
Description: This script creates a water temperature file (csv) for later analysis.
File: tide_data_processing.R
Description: This script creates a tide-level file (csv) for later analysis.
File: coastline_plot.R
Description: This script plots the coastline data.
File: pop_location_log_5_6_2.R
Description: This script converts the location of the group centre from the longitude-latitude scale to the distance from the coastline.
File: plot_for_paper3.R
Description: This script conducts the statistical tests and analyses the results reported in the manuscript.
File: sim_result2.R
Description: This script analyses and shows the results of the simulation studies.
File: plot_for_paper_appendix3.R
Description: This script creates the results shown in the appendix of the manuscript.
File: bayesian_p_LMP_xyrhoConst_log_5_6_2.ipynb
Description: This script calculates the Bayesian p-value to check for the model misfit described in the appendix.
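As a reminder of what a Bayesian p-value measures, the sketch below computes a posterior predictive p-value for a toy normal model with a sum-of-squared-residuals discrepancy. The model, the discrepancy, and the function name are assumptions made for illustration. The notebook's own model and discrepancy measure are described in the notebook and the appendix, not here.

```python
import numpy as np


def bayesian_p_value(y_obs, mu_draws, sigma_draws, rng):
    """Posterior predictive p-value: share of draws where the replicated discrepancy
    is at least as large as the observed one."""
    y_obs = np.asarray(y_obs, dtype=float)
    exceed = 0
    for mu, sigma in zip(mu_draws, sigma_draws):
        # Replicate a dataset from the model at this posterior draw.
        y_rep = rng.normal(mu, sigma, size=y_obs.shape)
        t_obs = np.sum((y_obs - mu) ** 2)
        t_rep = np.sum((y_rep - mu) ** 2)
        exceed += int(t_rep >= t_obs)
    return exceed / len(mu_draws)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.normal(0.0, 1.0, size=50)
    # Stand-in "posterior draws"; a real analysis would use MCMC output.
    mu_draws = np.zeros(1000)
    sigma_draws = np.ones(1000)
    print(bayesian_p_value(y, mu_draws, sigma_draws, rng))
```

Values near 0 or 1 suggest that the model reproduces the chosen discrepancy poorly, while values around 0.5 give no sign of misfit for that discrepancy.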
File: prior_predictive_check_5_6_2.ipynb
Description: This script runs the prior predictive check to assess the validity of the prior distributions.
File: model_check_LMPxyrhoConst_log10_5_6_2.ipynb
Description: This script calculates and shows the differences between the model predictions and the observed values.
File: chorus_laplace_scale2.R
Description: This script confirms the scale of the observational noise in the observed chorus sound volume data.
File: sound_attenuation_matrix_test.ipynb
Description: This script checks the performance of the mathematical approximation and the effect of adding 1 to the distance on that performance.
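The following Python sketch illustrates why adding 1 to a distance inside a logarithm matters numerically. It does not reproduce the notebook's attenuation formula; it assumes only that distance enters through a base-10 logarithm, and the function name and distance values are illustrative. The point is that log10(r) diverges at r = 0, whereas log10(r + 1) stays finite, and the gap between the two, log10(1 + 1/r), shrinks as r grows.

```python
import numpy as np


def log_distance_terms(r):
    """Return (log10(r), log10(r + 1)) for an array of non-negative distances."""
    r = np.asarray(r, dtype=float)
    with np.errstate(divide="ignore"):
        plain = np.log10(r)  # -inf at r = 0
    shifted = np.log10(r + 1.0)  # 0 at r = 0, finite everywhere
    return plain, shifted


if __name__ == "__main__":
    r = np.array([0.0, 1.0, 10.0, 100.0, 1000.0])  # illustrative distances
    plain, shifted = log_distance_terms(r)
    for ri, p, s in zip(r, plain, shifted):
        print(f"r={ri:7.1f}  log10(r)={p:8.4f}  log10(r+1)={s:8.4f}")
```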
File: sound_propagate2.ipynb
Description: This script checks the log(x+1) approximation by displaying it together with the results of Takahashi 2018.
Software information
To reproduce the computational environment, we recommend using:
- MATLAB R2023a with the toolboxes listed below
- Python 3.10.12 with the specified packages (can be installed via pip install -r requirements.txt or conda env create -f environment.yml)
- R 4.3.1; package versions can be confirmed via the sessionInfo() output provided in R_sessionInfo.txt
MATLAB ver.: 9.14.0.2337262 (R2023a) Update 5
Operating system: Microsoft Windows 11 Home Version 10.0 (Build 26100)
Java ver.: Java 1.8.0_202-b08 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
MATLAB ver. 9.14 (R2023a)
Simulink ver. 10.7 (R2023a)
Audio Toolbox ver. 3.4 (R2023a)
DSP System Toolbox ver. 9.16 (R2023a)
Deep Learning Toolbox ver. 14.6 (R2023a)
Image Processing Toolbox ver. 11.7 (R2023a)
Parallel Computing Toolbox ver. 7.8 (R2023a)
Signal Processing Toolbox ver. 9.2 (R2023a)
Wavelet Toolbox ver. 6.3 (R2023a)
Python ver. 3.10.12
numpy ver. 1.26.3
scipy ver. 1.12.0
pandas ver. 2.2.0
matplotlib ver. 3.8.2
tqdm ver. 4.66.1
numba ver. 0.58.1
R ver. 4.3.1
tidyverse ver. 2.0.0
ggbreak ver. 0.1.2
metR ver. 0.15.0
cowplot ver. 1.1.3
patchwork ver. 1.2.0
car ver. 3.1.3
RColorBrewer ver. 1.1.3
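To compare an existing Python environment against the package versions listed above, a small check such as the following may be useful. It is a sketch and is not part of the archived code. The `EXPECTED` table copies the Python package versions from the list above, and the helper names are hypothetical.

```python
from importlib import metadata

# Python package versions as listed above.
EXPECTED = {
    "numpy": "1.26.3",
    "scipy": "1.12.0",
    "pandas": "2.2.0",
    "matplotlib": "3.8.2",
    "tqdm": "4.66.1",
    "numba": "0.58.1",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected=EXPECTED, lookup=installed_version):
    """Return {package: (expected, found)} for packages whose version differs or is missing."""
    found = {name: lookup(name) for name in expected}
    return {
        name: (want, found[name])
        for name, want in expected.items()
        if found[name] != want
    }


if __name__ == "__main__":
    for name, (want, have) in version_mismatches().items():
        print(f"{name}: expected {want}, found {have}")
```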
Access information
Data was derived from the following sources:
- Water temperature data was from the Japan Meteorological Agency (https://www.data.jma.go.jp/kaiyou/data/db/kaikyo/series/engan/txt/area138-past.txt)
We copied the data from 4 August 2015 to 3 September 2015 from this source text file and pasted it into a CSV file named 'day_average_wtemp.csv'.
- Tidal data was from the Japan Meteorological Agency (https://www.data.jma.go.jp/kaiyou/data/db/tide/suisan/txt/2015/D2.txt)
From this txt file, we used only the data from 1 August to 4 September and saved it as the renamed text file "kashima_tide_August_2015.txt". The structure of the text file is described in "tide_data_processing.R". When creating "kashima_tide_August_2015.txt", wherever the high-tide and low-tide times ran directly into the preceding field, we separated them with half-width spaces as pre-processing so that the text could be converted easily.
Japan Meteorological Agency permits users to use this content under the CC BY license.
- Depth data was from the Japan Coast Guard (https://www.jodc.go.jp/vpage/depth500_file.html)
From this data source, to create the depth data ("mesh500_35_140.txt"), select the mesh for Latitude[deg.]: 35.00 - 36.00 and Longitude[deg.]: 140.00 - 141.00, then download it. The downloaded file should match the one used in our analysis.
- Sunset/sunrise data was from the National Astronomical Observatory (https://eco.mtk.nao.ac.jp/koyomi/dni/2015/s0808.html.en and https://eco.mtk.nao.ac.jp/koyomi/dni/2015/s0809.html.en)
To create "mito_suntime.csv", we copied the data from the tables at these links for the period from August 1, 2015, to September 3, 2015, into a CSV file and named it mito_suntime.csv. The column names were changed, from left to right, to Day, sunrise, sunrise_degree, Median_altitude, Median_altitude_degree, sunset, and sunset_degree.
These data sources (including the Japan Meteorological Agency) do not permit the use of data without distribution or citation. Therefore, if you wish to use this data, you must download it yourself. Additionally, please review the data usage terms when utilizing the data.
