Data from: Non-lethal imaging and modeling approaches for estimating dry mass in aquatic larvae
Abstract
Body mass is crucial for scaling and comparing physiological rates. For example, dry body mass is important in determining an organism’s metabolic rate since it excludes metabolically inactive water weight. Obtaining repeated measurements of wet body mass throughout an individual’s lifetime is straightforward. In contrast, we are normally able to obtain only a single estimate of dry body mass per individual since classical methods require end-point euthanasia followed by drying. We present imaging and modeling techniques for estimating individual dry body mass in African clawed frog (Xenopus laevis) tadpoles, which allow repeated sampling of the same individuals. We applied allometric principles and tested whether external anatomy would yield reliable estimates of dry body mass. Specifically, we describe a procedure to embed tadpoles in agarose media for obtaining morphological data in 3-D, and then we evaluate dry mass predictions among nine cross-validated maximum likelihood and machine learning models. The best-performing and most flexible model is an allometric model that uses estimates of body volume to predict dry body mass (validation r2 = 0.75). However, other models based only on wet body mass or meant to reduce the number of necessary input variables may also be logistically tractable. We discuss the pros, cons, and future directions of all nine models and give practical advice for users on data collection and analysis. This research develops a strong foundation for continued research on the biological importance of dry body mass, particularly in the context of growth and physiological ecology. Future development of similar approaches is crucial for understanding the importance of body mass indices for the standardization and comparison of physiological rates in plants and animals.
Dataset DOI: 10.5061/dryad.prr4xgz1q
Description of the data and file structure
The purpose of the study was to develop models to predict dry body mass from morphology. The data collection involved imaging tadpoles, measuring their anatomy, drying them to obtain dry body mass, and predictive modelling.
Files and variables
The provided Dryad.zip archive contains two folders, Code and Data. The Data folder contains 3 main items: Data Table.csv, Tadpole Images, and NN.
Data Table.csv contains morphological measurements of N = 61 Xenopus laevis tadpoles. Below are the data columns and their definitions. Variables are described in further detail in the associated manuscript and protocols.io submission. Missing data is labeled as NA.
- species = species
- id = id
- date = date
- mslide = mass of the glass slide used in weighing (g)
- wmass = wet mass of tadpole (g)
- mass (24hr) = dry mass after 24 hours (g)
- mass(48hr) = dry mass after 48 hours (g)
- mass (72hr) = dry mass after 72 hours (g)
- DBL = dorsal body length (mm)
- DBW = dorsal body width (mm)
- DBA = dorsal body area (mm2)
- DTW = dorsal tail width (mm)
- DTL = dorsal tail length (mm)
- DTA = dorsal tail area (mm2)
- LBL = lateral body length (mm)
- LBH = lateral body height (mm)
- LBA = lateral body area (mm2)
- LTH = lateral tail height (mm)
- LTL = lateral tail length (mm)
- LTA = lateral tail area (mm2)
- LLBA = lateral limb bud area (mm2)
- LTMA = lateral tail muscle area (mm2)
- FBH = frontal body height (mm)
- FBW = frontal body width (mm)
- FBA = frontal body area (mm2)
- Dorsal Stitching 1 = correlation coefficient of the overlapping area of the two images combined to form the dorsal composite image, recorded when 2 or 3 images had to be combined
- Dorsal Stitching 2 = correlation coefficient of the overlapping area of the two images combined to form the dorsal composite image, recorded when 3 original images had to be combined
- Lateral Stitching 1 = same as Dorsal Stitching 1, but for the lateral view
- Lateral Stitching 2 = same as Dorsal Stitching 2, but for the lateral view
- Notes = notes
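As a minimal sketch of how the table might be read outside R, the Python snippet below parses CSV text with the standard library, converts NA to missing values, and fits a log-log (allometric) line of dry mass on wet mass by ordinary least squares. The column headers are assumed to match the list above exactly, the demo rows are made-up placeholders rather than values from the dataset, and the fit is an illustration only, not one of the nine models in the manuscript.

```python
import csv
import io
import math


def read_table(handle):
    """Read CSV rows; 'NA' or empty cells become None, numeric cells become floats."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for key, value in raw.items():
            text = (value or "").strip()
            if text in ("", "NA"):
                row[key] = None
                continue
            try:
                row[key] = float(text)
            except ValueError:
                row[key] = text  # keep non-numeric fields (e.g. species, Notes) as text
        rows.append(row)
    return rows


def fit_allometry(rows, x_col="wmass", y_col="mass (72hr)"):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b).

    Rows with a missing value in either column are skipped.
    """
    pairs = [
        (math.log(r[x_col]), math.log(r[y_col]))
        for r in rows
        if r.get(x_col) is not None and r.get(y_col) is not None
    ]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


# Made-up placeholder rows (NOT real measurements); T4 has a missing dry mass.
DEMO_CSV = """id,wmass,mass (72hr)
T1,0.2,0.01
T2,0.4,0.02
T3,0.8,0.04
T4,0.5,NA
"""

rows = read_table(io.StringIO(DEMO_CSV))
a, b = fit_allometry(rows)
print(f"dry mass ~ {a:.3f} * wet mass^{b:.3f}")
```

To use the real file, replace the `io.StringIO(DEMO_CSV)` handle with `open("Data Table.csv", newline="")`, adjusting the path and column names to match the downloaded copy.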
Tadpole Images contains the raw photographs of each tadpole. The photographs contain the following information in the name: ID, orientation, photo number (some tadpoles required 2 or 3 images to capture their whole bodies), and the date in MM-DD-YY format. There are 377 photographs. The Tadpole Images folder also contains another folder called Scale, which contains 2 pictures of the 1mmx1mm grid paper used for converting pixels to distances, at different magnifications.
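The grid photographs in the Scale folder provide the pixel-to-distance conversion. The Python sketch below shows only the arithmetic involved. It assumes the number of pixels spanning a known number of 1 mm grid squares has already been measured at the matching magnification. All numbers are made up, and the function names are hypothetical rather than taken from the deposited code. Lengths are divided by the pixels-per-mm scale and areas by its square.

```python
def pixels_per_mm(pixel_distance, n_grid_squares, square_size_mm=1.0):
    """Scale factor from a grid photo: pixels spanned per millimetre."""
    if pixel_distance <= 0 or n_grid_squares <= 0 or square_size_mm <= 0:
        raise ValueError("all inputs must be positive")
    return pixel_distance / (n_grid_squares * square_size_mm)


def length_mm(length_px, scale):
    """Convert a length in pixels to millimetres."""
    return length_px / scale


def area_mm2(area_px, scale):
    """Convert an area in square pixels to square millimetres."""
    return area_px / scale ** 2


# Made-up example: 5 grid squares span 500 pixels, so the scale is 100 px/mm.
scale = pixels_per_mm(500.0, 5)
body_length = length_mm(1250.0, scale)   # 1250 px at 100 px/mm
body_area = area_mm2(40000.0, scale)     # 40000 px^2 at 100 px/mm
print(scale, body_length, body_area)
```

Because the two grid pictures were taken at different magnifications, each photograph would need the scale factor from the grid picture matching its own magnification.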
The NN folder contains outputs from the neural network analyses. Specifically, it contains the output of the hyperparameter searches across 3 replicate runs (folders named First, Second, and Third). All these data are summarized in a single file, combined_samples_all.
combined_samples_all contains the following data columns:
- rep = replicate
- layers = neural layers
- dense = neurons per layer
- mse = mean square error
- mae = mean absolute error
- r2 = coefficient of determination
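One way to summarize such a table is to average an error metric over replicates for each architecture and then rank the architectures. The Python sketch below does this with the standard library, assuming the column names listed above. The rows are made-up placeholders, and the selection rule (lowest mean mse) is an assumption for illustration, not necessarily the criterion used in the manuscript.

```python
import csv
import io
from collections import defaultdict


def best_architecture(handle, metric="mse"):
    """Average `metric` over replicates per (layers, dense); return the lowest.

    Returns ((layers, dense), mean_metric).
    """
    values = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (int(float(row["layers"])), int(float(row["dense"])))
        values[key].append(float(row[metric]))
    means = {key: sum(v) / len(v) for key, v in values.items()}
    best = min(means, key=means.get)
    return best, means[best]


# Made-up placeholder rows (NOT real results) with the documented columns.
DEMO_CSV = """rep,layers,dense,mse,mae,r2
1,1,8,0.30,0.40,0.60
2,1,8,0.34,0.42,0.58
1,2,16,0.20,0.30,0.72
2,2,16,0.24,0.32,0.70
"""

best, mean_mse = best_architecture(io.StringIO(DEMO_CSV))
print(f"layers={best[0]}, dense={best[1]}, mean mse={mean_mse:.3f}")
```

Passing `metric="mae"` ranks by mean absolute error instead. Ranking by r2 would need `max` rather than `min`, since higher values are better.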
The First, Second, and Third folders contain the hyperparameter search, which varies the number of neural layers and the number of neurons per layer. The combined_samples.csv file in each of the 3 folders combines the data in the other 6 files. The 6 files are named according to the number of neural layers being searched.
Each hyperparameter search file contains the following columns:
- dense = neurons per layer
- learning = the learning rate
- epochs = the epochs used in the analysis
- layers = the neural layers
- mse = mean squared error
- r2 = coefficient of determination
- mae = mean absolute error
Code/software
The code can be opened with R. The measurement data can be opened with any program that reads .csv files, and the photographs with any program that reads .jpg files.
The package versions used to run the code are described in full in the manuscript.
The Code folder contains 5 items. drymass_predictions is the predictive modelling and main analysis file. Together with caret_mae.R, it completes all analyses in the associated publication and produces all data in the NN folder (NN stands for Neural Network). caret_mae.R is a wrapper around another function that calculates the mean absolute error through the caret library. Figures.R produces the figures in the manuscript. combine_nn_results takes the data in Hyperparameter Search (a Data folder) and combines all outputs into a single file (combined_samples_all.csv). The Model Fits folder contains all the fitted model objects, which others can use to implement the new models described in the paper. Model Fits contains .RDS or .keras files that can be loaded directly into R.
The Data folder contains 3 items. Data Table.csv contains all the morphological measurements and other associated data. NN contains the results of the neural network analyses. Tadpole Images contains all the raw tadpole images and the scales (1mmx1mm grid paper at different magnifications).
