Data and code from: Polarized schooling emerges in tetra species with cohesive social networks
Data files
Jun 02, 2026 version files (1.53 MB)
- compiled_data.csv (23.08 KB)
- experiment_log.csv (4.60 KB)
- kinematics_acquisition.zip (54.33 KB)
- README_compiled_data.md (10.42 KB)
- README_experiment_log.md (7.48 KB)
- README.md (10.66 KB)
- schooling_analysis.zip (1.42 MB)
Abstract
Fish schooling depends on social interactions between animals. Studies on schooling overwhelmingly focus on a single species, which challenges our ability to resolve what features of this collective behavior are universal and how it has diversified over the course of evolution. Here, we studied interspecific variation in schooling behavior among five species of Neotropical tetras to examine how social networks relate to schooling kinematics among species. We quantified differences in speed, polarization, spacing, mutual information, and network properties within and between species. Our results demonstrate substantial interspecific variation in schooling behavior, with polarized species exhibiting higher speeds and more cohesive social networks. In contrast, shoaling species showed greater variability in their spatial arrangement and a less cohesive social structure. This comparison demonstrates how closely related species are capable of exhibiting distinct forms of schooling that reflect divergent traits in sensing, motivation, and locomotor control.
https://doi.org/10.5061/dryad.x0k6djj0f
kinematics_acquisition.zip
Python code for controlling devices while experiments run and for acquiring kinematic data from the video recordings.
Running experiments
Running experiments entails video-recording the swimming of fish under controlled lighting conditions.
- schooling_experiments.ipynb - Jupyter notebook that steps through the running of experiments.
- def_runexperiments.py - Python functions called by schooling_experiments.ipynb to run experiments.
Acquiring video
Code execution is controlled by the "experiment_log" spreadsheet (CSV format), saved in the root directory for the project. See README_experiment_log.md for more details on that file.
Directory structure
The code assumes the following directory structure. This will be self-generated when running code in schooling_experiments.ipynb.
- "waketracking" [root_proj] - Directory holding all videos and data.
  - "data"
    - [project directory] - Named after the project name, specified in schooling_experiments.ipynb.
      - "experiment_log.csv" - Experiment catalog, downloaded from Google Sheets.
      - "data"
        - "raw" - Directory holding the data generated from the videos by TRex.
        - "fishdata" - Directory holding the data generated from the videos by TRex.
        - "settings" - Directory holding the settings files used by TRex.
        - "matlab" - mat files of the TRex data, analyzed in Matlab.
      - experiment_schedules - csv files generated to control the camera and lights on a schedule.
      - masks - Image files generated by the code to generate a mask over the videos.
      - calibration_images - Image files generated for measuring the calibration constant.
  - "video"
    - [project directory] - Named after the project name, specified in schooling_experiments.ipynb.
      - "raw" - Directory holding the recordings from experiments.
        - (date) - Directories for each date of experiments (e.g. '2022-10-03')
      - "compressed" - Code will generate compressed mp4 videos here.
      - "calibration" - Generated from the raw videos for measuring calibration constant.
      - "tmp" - Directory that generates temporary video files while compressing videos.
      - "dv" - dv-formatted videos, generated by TGrabs.
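The layout above can be sketched in a few lines of Python. This is only an illustration: the helper name make_project_tree is hypothetical, the nesting of subdirectories is an assumption read from the list above, and the project's own code in schooling_experiments.ipynb may build the tree differently.

```python
from pathlib import Path

# Subdirectories inside data/[project directory], following the list above.
# The nesting is an assumption read from that list, not taken from the project code.
DATA_SUBDIRS = [
    "data/raw",
    "data/fishdata",
    "data/settings",
    "data/matlab",
    "experiment_schedules",
    "masks",
    "calibration_images",
]

# Subdirectories inside video/[project directory].
VIDEO_SUBDIRS = ["raw", "compressed", "calibration", "tmp", "dv"]


def make_project_tree(root, project):
    """Create the data and video directories for one project (hypothetical helper)."""
    root = Path(root)
    paths = [root / "data" / project / sub for sub in DATA_SUBDIRS]
    paths += [root / "video" / project / sub for sub in VIDEO_SUBDIRS]
    for path in paths:
        # parents=True builds intermediate directories; exist_ok makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

For example, make_project_tree("waketracking", "my_project") would create the folders under a local "waketracking" directory without touching any that already exist.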
Acquisition files
- def_paths.py - Defines the data and video paths for the project. You need to add root paths for each new user or machine included in the project.
- def_acquisition.py - Functions for running data acquisition.
- videotools.py - Copied from kineKit; a series of functions for manipulating and interacting with video. Requires ffmpeg and OpenCV.
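Adding root paths for a new user or machine might look roughly like the sketch below. The actual contents of def_paths.py are not shown here, so everything in this example is an assumption: the registry ROOT_PATHS, the function give_paths, the username, and the paths are all placeholders.

```python
import getpass
from pathlib import Path

# Hypothetical registry with one entry per user.
# The username and paths are placeholders, not values from the project.
ROOT_PATHS = {
    "alice": {
        "data": "/Users/alice/waketracking/data",
        "video": "/Users/alice/waketracking/video",
    },
}


def give_paths(user=None, registry=ROOT_PATHS):
    """Return the data and video root paths for a user (hypothetical helper)."""
    user = user or getpass.getuser()
    if user not in registry:
        raise KeyError(
            f"No root paths defined for user '{user}'; add an entry to the registry."
        )
    return {key: Path(value) for key, value in registry[user].items()}
```

Failing loudly for an unknown user is a deliberate choice in this sketch, so that a missing entry is noticed before any files are written to an unexpected location.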
Additional files
- gui_functions.py - GUI helper functions for the acquisition interface.
- run_acq_kinematics.py - Script for running kinematic data acquisition.
- run_acquisition.py - Script for running data acquisition.
- video_preprocess.py - Functions for preprocessing video files.
schooling_analysis.zip
MATLAB code for analyzing the results of schooling experiments, using data generated by TRex, as part of the wake_tracking project in the McHenry Lab at UC Irvine. Most of the code was developed by Ashley Peterson, PhD and Matt McHenry, PhD.
Core analysis pipeline
- wk_main.m - Master script to run all parts of the data analysis. Executes the following m-files:
  - wk_raw.m - Compiles wanted fish data for an individual video trial into data structure for processing and analysis. Files:
    - '_fishdata.mat' - Writes 'd' structure for individual fish data with respect to time.
    - '_rawfish.mat' - Writes 'r' structure of the properties of each trial/fish.
  - wk_displacement.m - Expresses individual kinematics with respect to displacement, rather than time.
    - '_fishdata.mat' - Reads 'd'.
    - '_rawfish.mat' - Reads 'r'.
    - '_displacement.mat' - Writes 'l' structure, with displacement data with respect to distance.
  - wk_schoolingvars.m - Finds the schooling variables from the raw positional data of the school.
    - '_fishdata.mat' - Reads 'd'.
    - '_rawfish.mat' - Reads 'r'.
    - '_schooldata.mat' - Writes 's' structure for schooling variables with respect to time.
  - wk_trial_sum.m - Timeseries plots for each trial.
    - '_fishdata.mat' - Reads 'd'.
  - wk_compile.m - Runs basic processing of raw data. Returns 'D' structure to workspace.
    - '_fishdata.mat' - Reads 'd'.
    - '_schooldata.mat' - Reads 's'.
  - wk_comp_sum.m - Plots comparative analysis between schooling species. Accepts 'D' structures from multiple species.
  - wk_pop_sum.m - Runs population-level analysis. Summarizes across experimental trials.
  - wk_animate.m - Generates an animation of fish in a particular trial. Can render each fish in a unique color, or color-codes the background with a 2D colormap of R and P values.
- run_batch.m - Runs batch analysis across all sequences in the experiment log.
- setpath.m - Sets paths for the project and required toolboxes.
- jbfill.m - For confidence intervals in plots. Downloaded from MATLAB Central.
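To illustrate what expressing kinematics with respect to displacement (as wk_displacement.m is described as doing) can mean, the Python sketch below indexes each sample by cumulative path length. This is an assumption about the general idea, not a port of the MATLAB code; the function name cumulative_displacement is hypothetical.

```python
import math


def cumulative_displacement(x, y):
    """Return the cumulative path length at each sample, starting at 0."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) == 0:
        return []
    total = 0.0
    path_length = [0.0]
    for i in range(1, len(x)):
        # Straight-line distance between consecutive centroid positions.
        total += math.hypot(x[i] - x[i - 1], y[i] - y[i - 1])
        path_length.append(total)
    return path_length
```

Any per-frame variable (speed, heading) can then be plotted against this path length instead of against time, so that pauses do not stretch the horizontal axis.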
wk_main.m Structure
The wk_main.m script is organized into several sections that control the execution of the analysis pipeline:
Parameters
- local - Sets the path to data files. Options: 'work' (vortex), 'home' (google drive), 'travel' (local drive), or no input (defaults to vortex).
- db (debugging parameters):
  - db.specific: Set to 1 to work with specific fish numbers (defined in db.focalFish)
  - db.debug: Set to 1 for easier debugging
  - db.view: Whether to visualize steps in analysis
  - db.overwrite: Always run and create new data files
  - db.parallel: Whether to run trials in parallel
- Animation parameters: num_frames, start_frame, and ani_type ('unique', 'colormapped', 'particles', 'info', or 'info 2')
- clist - List of projects to analyze. Can be a single project or multiple projects for comparative analysis. Projects ending in 'Basic' are used for comparative analyses.
- export_csv - Whether to save compiled data in CSV format.
- exclude5 - Whether to exclude schools of 5 fish in scaling data analysis.
- time_limit - Maximum time (in minutes) for timeseries data (set to inf if none).
Batch Analysis Functions
These functions process data for all sequences in the log:
- run_batch('wk_raw',...) - Basic processing of raw data, saves 'r' structure to '_rawfish.mat' files
- run_batch('wk_schoolingvars',...) - Calculates schooling variables (nearest-neighbor distance, polarization, etc.), saves 's' structure to '_schooldata.mat' files
- run_batch('wk_focal',...) - Transforms centroid coordinates with respect to focal fish, saves 'f' structure to 'focalfish.mat' files
- run_batch('wk_peaks',...) - Finds peaks in velocity and rate of heading change, saves 'L' and 'p' structures to '_peaks.mat' files
- run_batch('wk_ramp',...) - Analysis of ramp experiments
- run_batch('wk_info',...) - Analysis of mutual information
- run_batch('wk_body',...) - Analysis of body posture data
- run_batch('wk_bodylength',...) - Measures body length
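For readers unfamiliar with the schooling variables named above, the Python sketch below computes polarization and nearest-neighbor distance for a single frame. It uses definitions that are common in the collective-motion literature (polarization as the length of the mean unit heading vector); whether wk_schoolingvars.m uses exactly these definitions is an assumption, and the function names are hypothetical.

```python
import math


def polarization(headings):
    """Length of the mean unit heading vector; headings are in radians.

    Returns 1 when all fish point the same way and values near 0 when
    headings cancel out.
    """
    n = len(headings)
    mean_cos = sum(math.cos(h) for h in headings) / n
    mean_sin = sum(math.sin(h) for h in headings) / n
    return math.hypot(mean_cos, mean_sin)


def nearest_neighbor_distances(points):
    """Distance from each fish to its closest neighbor.

    points is a list of (x, y) pairs and must contain at least two fish.
    """
    distances = []
    for i, (xi, yi) in enumerate(points):
        distances.append(
            min(
                math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points)
                if j != i
            )
        )
    return distances
```

Applying both functions frame by frame yields timeseries comparable in spirit to those stored in the 's' structure.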
Functions to Run After Batch Analysis
- wk_network_full(...) - Calculates network properties. Requires BCT Toolbox (path set in setpath.m). Uses threshold type specified by MI_thresh_type ('species mean', 'species median', or 'fixed').
- wk_compile(...) - Compiles summary information for all experiments, extracts mean and variance of schooling parameters. Returns 'D' structure to workspace.
- wk_repackage(...) - Repackages RN_Ramp data to extract only the first lights-on period for 'basic' use.
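The thresholding step mentioned for wk_network_full can be illustrated with the Python sketch below, which turns a square matrix of pairwise values (such as mutual information) into a binary adjacency matrix. This is a simplified assumption of how a threshold might be applied: the cutoff here is computed from one matrix, whereas the 'species mean' and 'species median' options presumably pool values across a species. The function name threshold_network and its arguments are hypothetical.

```python
from statistics import mean, median


def threshold_network(values, kind="mean", fixed_value=None):
    """Binarize a square matrix of pairwise values into an adjacency matrix.

    kind is 'mean', 'median', or 'fixed'. Returns (adjacency, cutoff).
    """
    n = len(values)
    # Ignore the diagonal: a fish is not its own neighbor.
    off_diagonal = [values[i][j] for i in range(n) for j in range(n) if i != j]
    if kind == "mean":
        cutoff = mean(off_diagonal)
    elif kind == "median":
        cutoff = median(off_diagonal)
    elif kind == "fixed":
        if fixed_value is None:
            raise ValueError("fixed_value is required when kind='fixed'")
        cutoff = fixed_value
    else:
        raise ValueError(f"unknown threshold kind: {kind!r}")
    # An edge exists where the pairwise value exceeds the cutoff.
    adjacency = [
        [1 if i != j and values[i][j] > cutoff else 0 for j in range(n)]
        for i in range(n)
    ]
    return adjacency, cutoff
```

The resulting adjacency matrix is the kind of input that graph-analysis routines, such as those in the BCT Toolbox, typically operate on.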
Plotting, Visualization, and Comparative Functions
- run_batch('wk_trial_sum',...) - Plots individual trial summaries
- wk_scaling_summ(D) - Summary analysis for scaling or proportion experiments
- wk_ramp_sum(D) - Summary analysis for ramp experiments
- wk_spd_analysis(...) - Analyzes speed timeseries data for ramp experiments
- wk_heatmap(...) - Generates heatmaps for various metrics: 'nearest neighbor', 'tank', 'corr coef', 'MI', 'time lag', 'school center', or 'in dark'
- wk_heatmap_PR(...) - Heatmap of polarization and rotation
- wk_comparative(D) - Runs comparative-level analysis
- wk_animate(...) - Makes animation of data
- wk_bodylength(...) - Measures body length
- wk_MI_summ(...) - Mutual information as a function of distance
- wk_network_summ(...) - Network properties summary
- wk_kinematics(...) - Kinematic analysis
- wk_network_comp(...) - Compares network properties of 5 species, creates scatterplots of R, P, NND
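As background for the mutual information analyses listed above, the Python sketch below shows a plug-in estimate of mutual information between two sequences of discrete symbols. How wk_info and wk_MI_summ actually estimate mutual information is not described here, so this is an illustrative assumption; continuous signals such as speed would first need to be binned, and the function name is hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    joint_counts = Counter(zip(a, b))
    a_counts = Counter(a)
    b_counts = Counter(b)
    mi = 0.0
    for (u, v), count in joint_counts.items():
        # p(u, v) * log2( p(u, v) / (p(u) * p(v)) ), written with counts.
        mi += (count / n) * math.log2(count * n / (a_counts[u] * b_counts[v]))
    return mi
```

Two identical binary sequences with balanced symbols share one bit of information, while two independent sequences share none; estimates from short sequences tend to be biased upward.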
Data catalog
run_acquisition.py, from the wake_tracking project, exports the following TRex data:
- 'posture_fish' files. Saved in project_root/matlab/posture. There is one file per fish. The measurements include a variety of TRex-generated variables, including 'midline_angle', 'midline_centimeters', 'midlength_centimeters', 'midline_lengths', 'midline_offsets', 'midline_points', 'midline_points_raw', 'outline_lengths', 'outline_points', and 'posture_area'.
- 'fish' files. Saved in project_root/matlab/centroid. Includes TRex parameters, including ACCELERATION_pcentroid, ACCELERATION_wcentroid, ANGLE, ANGULAR_A_centroid, ANGULAR_V_centroid, AX, AY, BORDER_DISTANCE_pcentroid, MIDLINE_OFFSET, SPEED, SPEED_pcentroid, SPEED_wcentroid, VX, VY, X, X_wcentroid, Y, Y_wcentroid, frame, frame_segments, midline_length, midline_x, midline_y, missing, normalized_midline, num_pixels, segment_length, segment_vxys, time, timestamp.
Data files generated by our MATLAB code:
- 'rawfish' files. Saved in project_root/matlab/centroid, with 'rawfish' in the file name. Contain the 'r' structure of centroid data for each fish in the school.
- 'schooldata' files. Saved in project_root/matlab/centroid, with 'schooldata' in the file name. Contain the 's' structure of schooling parameters.
- 'frame' files. Saved in project_root/matlab/posture. Include the word "posture_frame". Generated by wk_animate. Organizes the outline and midline coordinates from the 'posture_fish' data by frame of video.
Summary data
Summary data are compiled in compiled_data.csv. This CSV file contains the compiled results across all five species analyzed in the study, including schooling parameters (polarization, rotation, nearest-neighbor distance), network metrics, and species-level summaries. See README_compiled_data.md for more details.
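A quick way to inspect the summary file is to list its columns and read its rows with Python's standard library, as in the sketch below. The column names are documented in README_compiled_data.md and are not assumed here; the helper name load_table is hypothetical.

```python
import csv


def load_table(path):
    """Read a CSV file and return (column_names, rows) with rows as dicts."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return columns, rows
```

Calling load_table("compiled_data.csv") returns every value as a string, so numeric columns need to be converted (for example with float) before any calculation.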
