Data from: Developmental arcs of plasticity in whole movement repertoires of a clonal fish
Data files
Mar 04, 2026 version files 8.71 GB
- fe_dataset_csv.zip (5.41 GB)
- Projections_FE.zip (3.31 GB)
- README.md (5.45 KB)
Abstract
Developmental plasticity at the behavioral repertoire level allows animals to incrementally adjust their behavioral phenotypes to match their environments through ontogeny, serving as a lynchpin between ecological factors that cue phenotypic adjustments and evolutionary forces that select upon emergent phenotypic variation. Quantifying the continuous arcs of plasticity throughout animals’ development, however, has often been prohibitively challenging. Here, we leverage recent advancements in high-resolution behavioral tracking and analysis to (i) track the behavior of 45 genetically identical fish clones (Poecilia formosa) reared in near-identical environments during their first four weeks of life at 0.2 s resolution and (ii) quantify the continuous arcs of plasticity across entire behavioral repertoires through development. Doing so, we empirically address one of the most fundamental theoretical predictions from Bayesian models of development that, in stable (but initially unknown) environments, behavioral plasticity should gradually decrease as individuals age. Using two approaches to measure plasticity across ontogeny, we first quantify plasticity in individual behavioral metrics before also developing a novel whole-repertoire approach that calculates plasticity as the degree of ‘behavioral entropy’ across a multi-dimensional behavioral phenotype space. We robustly find – despite experimentally matching as best as possible the assumptions of models that predict decreasing plasticity – a ~two-week initial increase in plasticity in movement behaviors before plasticity subsequently decreased. Our results help address one of the most widespread intuitions about the optimal developmental course of plasticity through early ontogeny, thereby also demonstrating the value of long-term behavioral tracking approaches for testing fundamental predictions on phenotypic development.
Dataset DOI: 10.5061/dryad.x69p8cztw
Description of the data and file structure
This data repository contains all data associated with the manuscript "Developmental arcs of plasticity in whole movement repertoires of a clonal fish".
The full dataset is a timeseries, measured at 5 Hz, of x-y coordinate points, representing fish movements in lab-based tank housing from the first day of life to day 28. Contained in the data are full 28-day timeseries for 45 genetically identical Amazon molly (Poecilia formosa) individuals.
For ease and future convenience, the data is represented in two different formats:
(1) 'fe_dataset_csv.zip' contains 45 .csv files, with each file corresponding to one individual's full 28-day timeseries (at 5 observations per second) of x-y coordinate points through time. The CSV data includes the following columns:
- individual: a unique identifier for each of 45 individuals
- day: the date of the observation (YYYYMMDD)
- df_time_index: the timestamp for the observation
- positions_x: the position of (the centroid of) a fish in the x-dimension
- positions_y: the position of (the centroid of) a fish in the y-dimension
- step_size: the Euclidean distance between any given point and the preceding point, observed 0.2 seconds earlier
- turning_angle: the turning angle in radians of a fish's heading relative to their previous heading
- dist_wall: shortest distance from a fish's position to the nearest tank wall in centimeters
- zVals_x: the position of a fish in the x-dimension of a 2-D UMAP embedding space representing the 'behavioral phenotype space' after wavelet transforms and UMAP projection.
- zVals_y: the position of a fish in the y-dimension of a 2-D UMAP embedding space representing the 'behavioral phenotype space' after wavelet transforms and UMAP projection.
(2) 'Projections_FE.zip' contains the same information as the above .csv files, but as MATLAB files (.mat) where each file is a unique individual*day identifier. Thus, each of 45 individuals has a unique .mat file for each of 28 days. These files are used in combination with the GitHub code repository to conduct wavelet transforms, create a 75-dimensional feature embedding space, and calculate Shannon 'behavioral entropy' as outlined in the associated manuscript. Each file (e.g., 'Projections/**_pcaModes.mat') includes the x-y coordinate positions, the three features of step length, turning angle, and distance to the nearest tank wall (in that order) as the three values in 'projections', and relevant metadata (fish_key, area definitions for the tank arenas).
- The naming convention for each file follows a structured format:
- Individual Identifier: Unique identifier for each fish in the format (block_tankid_compartment)
- Date: Date of tracking in the format YYYYMMDD.
- Tracking Starting Hour: The hour at which tracking for the day commenced, represented as 0600.
- The data files also include information detailing the experimental area setup, providing insights into the spatial context of the fish tanks, as well as the df_time_index, day, and fish_key (individual ID) for every datapoint, in the following columns:
- area
- day
- df_time_index
- fish_key
- Information about the position and the features ('projections') are included in the following columns:
- position (x-y coordinate positions)
- projections (each entry is a vector of three values in the following order: [step length, turning angle, distance to the nearest tank wall in centimeters])
Files and variables
File: Projections_FE.zip
Description: A .zip folder containing MATLAB (.mat) files for each unique individual*day combination, such that each of 45 fish individuals has one file for each of 28 days of observation during development. Each file is thus a timeseries of that day's x-y coordinate positions (as well as step length, turning angle, and distance to the nearest tank wall) for a given individual, recorded continuously and consecutively at 5 observations per second.
File: fe_dataset_csv.zip
Description: A .zip folder containing 45 .csv files, where each file is a timeseries of a given individual's position in x-y coordinate space (along with step length, turning angle, and distance to the nearest tank wall) for the duration of the 28-day observation.
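Because each per-individual .csv file is large, it can help to stream it row by row rather than load it whole. The following is a minimal sketch, not part of the associated code repository, that counts observations per day using the 'day' column described above and compares them with the nominal frame count implied by the recording protocol (5 frames per second for 8 hours per day). The function names and the tiny in-memory sample are illustrative assumptions; the sample values are made up and are not real data.

```python
import csv
import io
from collections import Counter

# Recording parameters stated in this README: 5 frames per second, 8 hours per day.
FRAMES_PER_SECOND = 5
HOURS_FILMED_PER_DAY = 8
EXPECTED_FRAMES_PER_DAY = FRAMES_PER_SECOND * 3600 * HOURS_FILMED_PER_DAY  # 144000


def count_rows_per_day(csv_file):
    """Stream one individual's CSV and count observations for each day.

    `csv_file` is any open text file (or file-like object) whose header
    includes the 'day' column. Only one row is held in memory at a time.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row["day"]] += 1
    return dict(counts)


def coverage_per_day(counts):
    """Fraction of the nominal daily frame count that is present for each day."""
    return {day: n / EXPECTED_FRAMES_PER_DAY for day, n in counts.items()}


if __name__ == "__main__":
    # Made-up three-row sample standing in for a real file (illustration only).
    sample = io.StringIO(
        "individual,day,df_time_index\n"
        "A,20210901,0\n"
        "A,20210901,1\n"
        "A,20210902,0\n"
    )
    print(count_rows_per_day(sample))
```

For a real file, the same function can be called on a handle opened with `open(path, newline="")`. Days with coverage well below 1 would indicate dropped or missing frames, under the assumption that every day was filmed for the full eight hours.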
Code/software
The dataset is associated with the following code repository, which can be used to reproduce the results presented in the associated manuscript:
https://www.github.com/smehlman/behavioral_entropy
Please note that the code repository is best used with the .mat file format provided in the 'Projections_FE.zip' data file.
Access information
Other publicly accessible locations of the data:
- None
Data was derived from the following sources:
- 45 individual Amazon molly (Poecilia formosa) clones tracked in lab-based tank environments at Humboldt University of Berlin in Fall 2021.
The dataset consists of x-y coordinate points through time at 5 Hz, in which these points represent the spatial position of 45 genetically identical fish (Amazon mollies, Poecilia formosa) in tank space, tracked from their first day of life to day 28. From x-y coordinate points, we also derive and provide timeseries of three behavioral metrics ('step length', 'turning angle', and 'distance to nearest tank wall'), which we describe in more detail below. This data was obtained via the following methods:
Genetically identical gravid mollies were isolated from a single isogenic stock kept at Humboldt Universität zu Berlin (Berlin, Germany). The Amazon molly is a gynogenetic species of freshwater fish, the first described species of clonal vertebrate, with a diverse behavioral repertoire through development. Offspring from three of these isogenic mothers were used as experimental animals in behavioral observations. In addition to the original three mothers of experimental animals being genetically identical and arising from the same stock tank, we also accounted for individual mother ID in all statistical models (see associated manuscript, Supplements 2-4). In this way, we both minimized any maternal effects due to obvious differential experiences of mothers and ensured that all experimental animals were genetically identical. In total, three mothers provided 45 experimental fish, which were transferred on the day of their birth to large individual observation tanks (associated manuscript, Supplementary Fig 1.1). From the next day (their first full day of life) until an age of 28 days, fish were filmed from above using a Basler acA5472-5gm camera fitted with a 16mm lens (Basler Lens C11-1620-12M-P f16mm) at five frames per second continuously for eight hours per day. Given that mollies reach sexual maturity at approximately three months, behavioral observations covered a full third of development to maturity for these fish. Note that while overhead filming meant that the third (i.e., vertical) dimension of movement was not recorded, the water level in tanks was kept relatively shallow (at a depth of ~7cm) so that the dominant dimensions of possible movement were largely confined to two dimensions. Fish were kept on a 12:12h light:dark cycle with an air temperature of approximately 24 ± 1°C and fed daily during a two-hour period following the 8-hour filming period with a stationary ‘food patch’ consisting of Sera vipan baby fish food fixed in agar.
Observation tanks were illuminated from below with four LEDs per tank (each LED was 100cm in length, 12V, color temperature = 5500 K, light output = ~1570 lumen); tanks were manufactured from white polyethylene, which allowed for diffuse light from LEDs to illuminate the tank evenly. Illumination from below minimized glare on the water surface and allowed for ease of automated video tracking conducted using the software Biotracker. This generated data in the form of a timeseries of fish positions in x-y coordinate space with 0.2 second temporal resolution. From these coordinates, we also obtained three timeseries of the most basic behavioral metrics for which the original 0.2-second temporal resolution could be maintained: (1) an instantaneous measure of a fish’s activity, for which we used the Euclidean distance (‘step length’) between two consecutive x-y coordinate points, (2) a measure of a fish’s bearing relative to their movement vector in the preceding timepoint (i.e., their ‘turning angle’), and (3) a measure of a fish’s position in space relative to a salient aspect of their environment, for which we chose the distance to the nearest tank wall.
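The three metrics described above can be sketched from raw x-y coordinates as follows. This is a minimal illustration, not the authors' implementation: the wrapping of turning angles to the range from -pi to pi and the treatment of the tank as an axis-aligned rectangle with known bounds (the hypothetical parameters `x_min`, `x_max`, `y_min`, `y_max`) are assumptions that this README does not confirm. Units follow whatever units the coordinates are in.

```python
import math


def step_lengths(xs, ys):
    """Euclidean distance between each point and the point before it."""
    return [
        math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
        for i in range(1, len(xs))
    ]


def turning_angles(xs, ys):
    """Change in heading between consecutive steps, in radians.

    Assumption: angles are wrapped to the range from -pi to pi. A step of
    zero length has no defined heading; math.atan2(0, 0) returns 0.0 here.
    """
    headings = [
        math.atan2(ys[i] - ys[i - 1], xs[i] - xs[i - 1])
        for i in range(1, len(xs))
    ]
    angles = []
    for previous, current in zip(headings, headings[1:]):
        delta = current - previous
        # Wrap the raw difference back into the range from -pi to pi.
        angles.append(math.atan2(math.sin(delta), math.cos(delta)))
    return angles


def distance_to_nearest_wall(x, y, x_min, x_max, y_min, y_max):
    """Shortest distance from (x, y) to the edge of a rectangular tank.

    Assumption: the tank is an axis-aligned rectangle with these bounds
    and the point lies inside it.
    """
    return min(x - x_min, x_max - x, y - y_min, y_max - y)
```

For example, a fish moving from (0, 0) to (1, 0) to (1, 1) takes two steps of length 1 with a single left turn of pi/2 between them. Note that n positions yield n - 1 step lengths and n - 2 turning angles, so the first rows of a timeseries have no defined value for these metrics.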
These data were then used to calculate both the coefficient of variation (CoV) in each behavioral metric and the degree of Shannon 'behavioral entropy', both used as measures of an individual's plasticity in movement behavior during the first four weeks of their development. These measures of entropy can be computed directly from the data provided (using the MATLAB projections files), as described in the associated manuscript. Relevant code and detailed instructions needed for these calculations are provided in the associated public code repository (https://github.com/smehlman/behavioral_entropy).
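To make the two plasticity measures concrete, the sketch below computes a coefficient of variation for one behavioral metric and a Shannon entropy over positions in the 2-D embedding space (the zVals_x and zVals_y columns). It is only an illustration under stated assumptions: the use of the population standard deviation, the fixed-width grid binning, and the `bin_width` parameter are hypothetical choices made here, and the procedure in the associated manuscript and code repository may differ.

```python
import math
import statistics
from collections import Counter


def coefficient_of_variation(values):
    """Standard deviation divided by the mean.

    Assumption: population standard deviation. The result is undefined
    when the mean is zero.
    """
    return statistics.pstdev(values) / statistics.fmean(values)


def shannon_entropy_2d(xs, ys, bin_width=1.0):
    """Shannon entropy (in bits) of points binned on a square grid.

    Assumption: occupancy of the embedding space is estimated by counting
    points per grid cell of side `bin_width` (a hypothetical parameter).
    """
    cells = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(xs, ys)
    )
    total = sum(cells.values())
    return -sum((n / total) * math.log2(n / total) for n in cells.values())
```

Under this sketch, points spread evenly over four grid cells give an entropy of 2 bits, while points confined to a single cell give 0 bits. Higher values therefore correspond to a broader use of the behavioral phenotype space.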
