DLC networks from: Application of a novel deep learning based 3D videography workflow to bat flight data
Abstract
Studying the detailed biomechanics of flying animals relies on producing accurate three-dimensional coordinates for key anatomical landmarks. Traditionally, this is achieved through manual digitization of animal videos, a labor-intensive task that grows more so with increasing frame rates and numbers of cameras. In this study, we present a workflow that combines deep learning-powered automatic digitization with intelligent filtering and correction of mislabeled points using 3D information. We tested our workflow using a particularly challenging scenario – bat flight. First, we documented bats flying steadily in a wind tunnel. We compared the results from manually digitizing bats with markers applied to anatomical landmarks against using our automatic workflow on the same bats without markers. In our second test case, we compared manual digitization against our automated workflow for bats exhibiting complex maneuvers in a large flight arena. We found that the variation between the 3D coordinates from our workflow and those from manual digitization was less than a millimeter larger than the variation between 3D coordinates resulting from two different human digitizers. The reduced reliance on manual digitization stemming from this work has the potential to significantly increase the scalability of studies into the detailed biomechanics of animal flight.
README: Application of a Novel Deep Learning Based 3D Videography Workflow to Bat Flight data
https://doi.org/10.5061/dryad.0cfxpnw7x
This dataset accompanies the article Application of a Novel Deep Learning Based 3D Videography Workflow to Bat Flight, in preparation to be submitted to the Annals of the New York Academy of Sciences. The dataset contains training data and trained DeepLabCut networks produced for that manuscript.
Description of the data and file structure
Overview
Data consists of two DeepLabCut (DLC) networks with associated training data. See DeepLabCut/DeepLabCut: Official implementation of DeepLabCut: Markerless pose estimation of user-defined features with deep learning for all animals incl. humans (github.com) for details on how to set up and run DeepLabCut.
Note that this repository will be of little use if you are not familiar with DeepLabCut or related packages.
Also note that the code for working with this repository is not provided here, but rather in the related GitHub repository (see the Code/Software section).
File structure
The DeepLabCut projects are provided as two project folders ("Austin2020_accuracy_test-DLTdv-2023-04-07" and "Brown2021_accuracy_test-DLTdv-2023-01-28") compressed as zip files. Once extracted, each project folder consists of four folders and a yaml file, described below. Both projects are based on, and were used for analyzing, videos of bats in flight.
config.yaml. This file contains information about the project, such as the video sets the training data is based on, the labels for the tracked bodyparts, and so on. Before use, the project_path parameter needs to be changed to reflect where you are storing the network on your machine. Furthermore, the snapshotindex parameter should be changed if you wish to use a different training state of the network than the one we used. Most importantly, note that this file is the one you need to refer to when using most DeepLabCut functions. For instance, if you want to analyze a video, you would use the command

deeplabcut.analyze_videos(config_path, videos=videoListOrFolderPath, shuffle=shuffleNumber, save_as_csv=TrueOrFalse)

where config_path would be the path to config.yaml.

dlc-models (folder). This folder contains the trained DeepLabCut models. For the "Brown2021" project we trained only one model. For the "Austin2020" project we trained two: one that did not use the augmentation for reducing the propensity for mixing up left and right (shuffle 3), and one that did (shuffle 4). This folder is important for two reasons. 1) It contains the trained models with which to analyze videos, so-called snapshots, meaning that without it you cannot use the project for analyzing videos. The snapshots are stored in dlc-models/iteration-0/NameOfProjectXXshuffleN, where XX is a number indicating the proportion of training data to total data as a percentage (default 95%), and N refers to the shuffle number. A shuffle can be thought of as a way of storing multiple networks within a project, e.g. to try different train/test splits or different network settings. Each stored snapshot is composed of three files: a meta file, an index file, and a data file with a custom file ending, e.g. "snapshot-750000.meta", "snapshot-750000.index", and "snapshot-750000.data-00000-of-00001" for the snapshot resulting from 750,000 iterations of training. This subfolder also contains the "pose_cfg.yaml" file, which holds the parameters used when training the network, such as the base network, optimizer, and augmentation parameters, as well as the learning stats file "learning_stats.csv", which records the training loss and loss ratio during training. 2) The trained models within it can be used as base networks on which to train a new network with additional training data. If, for instance, you have a dataset of flying bats but the models we provide here do not give acceptable accuracy (they probably won't on their own), you might want to consider replacing a more common base network, such as ResNet50, with one of our snapshots, and then training a new network with data from your own project. See the DeepLabCut documentation for how to do this.
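To make point 2 a bit more concrete, here is a minimal, hedged sketch of pointing a new project's training configuration at one of our snapshots before training. Every path below is a placeholder (the real folder names depend on your project and shuffle), and you should follow the DeepLabCut documentation for the recommended transfer-learning procedure.

import yaml  # PyYAML

# Hypothetical example: make a new DLC project start training from one of our
# snapshots instead of a generic ImageNet-pretrained backbone. Paths are placeholders.
pose_cfg_path = "/path/to/NewProject/dlc-models/iteration-0/NewProjectTrainset95shuffle1/train/pose_cfg.yaml"
snapshot_prefix = "/path/to/Austin2020_accuracy_test-DLTdv-2023-04-07/dlc-models/iteration-0/Austin2020ProjectFolder/train/snapshot-750000"

with open(pose_cfg_path) as f:
    pose_cfg = yaml.safe_load(f)

# init_weights is the checkpoint training starts from; note that there is no
# file extension, since TensorFlow resolves the .meta/.index/.data files itself.
pose_cfg["init_weights"] = snapshot_prefix

# Plain YAML dumping drops comments and key order from pose_cfg.yaml; editing
# the file by hand achieves the same thing if you prefer.
with open(pose_cfg_path, "w") as f:
    yaml.safe_dump(pose_cfg, f)

After this, training proceeds as usual, e.g. with deeplabcut.train_network(config_path, shuffle=1).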
evaluation-results (folder). This folder contains the results from comparing the landmark coordinates arrived at by the DLC network against coordinates resulting from manual digitization, stored as CSV files. Results are available for each evaluated snapshot. We use them to decide which snapshot performs best. For example, for the "Brown2021" project, the evaluation CSV is located in "\Brown2021_accuracy_test-DLTdv-2023-01-28\evaluation-results\iteration-0\CombinedEvaluation-results.csv", and its first rows look like this:

|   | Training iterations: | %Training dataset | Shuffle number | Train error(px) | Test error(px) | p-cutoff used | Train error with p-cutoff | Test error with p-cutoff |
|---|---|---|---|---|---|---|---|---|
| 0 | 100000 | 95 | 1 | 4.12 | 6.74 | 0.6 | 3.53 | 5.22 |
| 1 | 150000 | 95 | 1 | 3.3 | 7.3 | 0.6 | 3.01 | 5.79 |
| 2 | 200000 | 95 | 1 | 2.9 | 7.38 | 0.6 | 2.72 | 6.05 |
| 3 | 250000 | 95 | 1 | 2.69 | 6.73 | 0.6 | 2.58 | 5.52 |
| 4 | 300000 | 95 | 1 | 2.65 | 6.52 | 0.6 | 2.49 | 5.22 |
| 5 | 350000 | 95 | 1 | 2.38 | 6.47 | 0.6 | 2.29 | 5.75 |

"Training iterations" refers to the number of training iterations, i.e. the snapshot trained for that many iterations. "%Training dataset" refers to the train/test split proportion of that snapshot, i.e. the percentage of the labeled data used as training data, here 95%. "Shuffle number" is explained above. "error(px)" refers to the average Euclidean pixel distance between the coordinates of the manual label and the corresponding automatically placed label, and the qualifier before it ("Train" or "Test") indicates whether the accuracy was measured on data designated as training or testing data, i.e. whether or not the network was trained on the frames it was tested on. Train error is typically lower than Test error since the network has never "seen" the test data; it is only used for testing. "p-cutoff used" refers to the DLC-assigned confidence limit applied to the data considered in the last two columns. The confidence is a measure of how certain the network is of a prediction. If, for instance, a prediction is made with a confidence of less than 0.1 (10%), it might, depending on the context, be wise to disregard that prediction, i.e. filter it out. So for the last two columns, predictions with a confidence of less than 0.6 (60%) are filtered out before calculating the mean error, which is why the columns "Train error with p-cutoff" and "Test error with p-cutoff" have lower values than the corresponding unfiltered error columns. Of the snapshots reported in the example above (100,000 to 350,000 iterations), I would likely choose the 300,000-iteration one, since it has a comparatively low error both before and after filtering out low-confidence predictions.
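If you prefer to inspect these evaluation results programmatically, a minimal sketch could look like this (assuming pandas is installed; column names are taken from the CSV above, and the path must be adjusted to where you extracted the project):

import pandas as pd

# Evaluation summary inside the extracted project folder (adjust the path to your machine)
eval_csv = r"Brown2021_accuracy_test-DLTdv-2023-01-28\evaluation-results\iteration-0\CombinedEvaluation-results.csv"
results = pd.read_csv(eval_csv)

# Rank snapshots by test error after low-confidence predictions are filtered out;
# lower values mean better performance on frames the network was not trained on.
ranked = results.sort_values("Test error with p-cutoff")
print(ranked[["Training iterations:", "Test error(px)", "Test error with p-cutoff"]])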
labeled-data (folder). This folder contains frames from the videos we digitized, along with the coordinates of the landmarks digitized in each frame. Each subfolder represents a digitized video and contains the extracted frames as PNG files; the coordinates of the digitized labels are stored in a CSV file and an H5 file, where the H5 file is intended for internal use by DLC and the CSV file is better suited to human reading. To explain the stored data in more detail, here follow the first six rows and seven columns of an example CSV file with coordinates:

| scorer | | | DLTdv | DLTdv | DLTdv | DLTdv |
|---|---|---|---|---|---|---|
| bodyparts | | | t3L | t3L | wstL | wstL |
| coords | | | x | y | x | y |
| labeled-data | AFAMcarollia_ ... _F1_croppedtraining | AFAMcarollia_ ... _F1_croppedtraining260.png | 231.1788 | 38.5795 | 146.6479 | 100.0577 |
| labeled-data | AFAMcarollia_ ... _F1_croppedtraining | AFAMcarollia_ ... _F1_croppedtraining270.png | 257.6643 | 44.6277 | 154.557 | 74.335 |
| labeled-data | AFAMcarollia_ ... _F1_croppedtraining | AFAMcarollia_ ... _F1_croppedtraining280.png | 276.2536 | 51.5159 | 164.5954 | 62.2425 |

"scorer" refers to the name of the person performing the labeling; we used "DLTdv" as the scorer name since we digitized the videos using DLTdv. "bodyparts" refers to the names of the anatomical landmarks; visible in this example are "t3L", the tip of the third digit on the left wing, and "wstL", the wrist on the left wing. "x" and "y" indicate whether the numbers in the rows below are the x or the y coordinates. Starting on the fourth row is the actual data: the first three columns make up the path to the PNG file, so the first one would be "labeled-data\AFAMcarollia_ ... _F1_croppedtraining\AFAMcarollia_ ... _F1_croppedtraining260.png", and the following columns are the x and y coordinates of the digitized landmarks. Overall, the data in this folder ("labeled-data") is very useful if you want to create a new DeepLabCut project that uses your own data together with ours; that way you have a head start and won't need to do as much digitizing of your own. Note that the digitizing scheme we use is as pictured below, meaning you would need to use the same scheme in your project (although see SuperAnimal models pretrained for plug-and-play analysis of animal behavior (arxiv.org) for a way to combine training data of different digitizing schemes).
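If you want to load one of these label files programmatically, here is a minimal sketch. The folder and file names are illustrative (the CSV file name follows the usual DeepLabCut CollectedData_<scorer> convention), so use the actual names present in the subfolder you are inspecting.

import pandas as pd

# DeepLabCut label CSVs have three header rows (scorer, bodyparts, coords) and,
# in this project, three leading columns that together form the path to each PNG frame.
labels = pd.read_csv(
    "labeled-data/SomeVideoFolder/CollectedData_DLTdv.csv",
    header=[0, 1, 2],
    index_col=[0, 1, 2],
)

# x/y pixel coordinates of the left wingtip (t3L) for every labeled frame
print(labels[("DLTdv", "t3L")])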
training-datasets (folder). This folder contains subfolders (e.g. "\training-datasets\iteration-0\UnaugmentedDataSet_Brown2021_accuracy_testJan28") that contain overviews of all the training data, i.e. a combination of all the CSVs and H5s in the "labeled-data" folder. It also contains a MAT (Matlab) file and a pickle file for the training data of each shuffle. These are intended for internal use by DLC and don't lend themselves well to human reading; however, the pickle files can be used if one wants to create a training dataset with an identical train/test split. See the code snippet below for an example of how to do that.
import pandas as pd
import deeplabcut

# Path to the Documentation_data pickle of the shuffle whose train/test split you want to reuse
obj = pd.read_pickle(r'\Path\To\Project\Folder\training-datasets\iteration-0\UnaugmentedDataSet_ProjectName\Documentation_data-ProjectName_95shuffle1.pickle')
trainIndices = obj[1]  # indices of the labeled frames used for training
testIndices = obj[2]   # indices of the labeled frames held out for testing

# Recreate a training dataset with exactly the same train/test split
config_path = r'\Path\To\Project\Folder\config.yaml'
deeplabcut.create_training_dataset(config_path, trainIndices=[trainIndices], testIndices=[testIndices])
Code/Software
The data contained in this repo was, to be clear, mostly created automatically by the Python package DeepLabCut. The labeled data was created by using DLTdv to digitize videos of bats in flight and converting the DLTdv projects from the DLTdv format into one readable by DeepLabCut using in-house-developed Matlab code. That code will be available in the related GitHub repo:
biol-jsh/DLC-DLTdv-workflow: Code used for manuscript Application of deep learning based 3D videography to bat flight (github.com)
Note that the repo is in the process of being updated at the time of submission.
For more information on DeepLabCut and DLTdv, please see the DeepLabCut GitHub repo:
DeepLabCut/DeepLabCut: Official implementation of DeepLabCut: Markerless pose estimation of user-defined features with deep learning for all animals incl. humans (github.com)
and the DLTdv GitHub repo: tlhedrick/dltdv: DLTdv MATLAB based video digitizing and annotation tool (github.com) and webpage: DLTdv digitizing tool | Hedrick Lab :: Comparative Biomechanics (unc.edu)
More information
A link to the related article will be added when the DOI becomes available.
Methods
Overview
The data shared here consists of two DeepLabCut projects trained on bats in flight: "Brown2021_accuracy_test-DLTdv-2023-01-28" and "Austin2020_accuracy_test-DLTdv-2023-04-07". Both are trained and ready for use, and both contain the data on which the networks were trained.
See the related article and supplement for more details and for a link to the GitHub repo hosting the related code.
Filming
Filming for "Brown2021_accuracy_test-DLTdv-2023-01-28" was performed in the Animal Flight and Aeromechanics (AFAM) Wind Tunnel Facility, Prince Engineering Labs, Brown University.
Filming for "Austin2020_accuracy_test-DLTdv-2023-04-07" was performed at the Austin Bat Refuge.
Digitizing
The training/testing data were digitized using DLTdv and then converted into a DeepLabCut-compatible format using custom Matlab scripts.
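The actual conversion code lives in the GitHub repo linked above. As a rough, purely illustrative sketch of what such a conversion has to produce (all names and values below are made up, and this is not the code we used), the snippet builds a DeepLabCut-style labeled-data table with its three header rows from plain per-frame pixel coordinates:

import pandas as pd

# Made-up 2D pixel coordinates for two landmarks in two frames of one camera view;
# in the real workflow these come from the DLTdv project files.
frames = ["img0260.png", "img0270.png"]
points = {("t3L", "x"): [231.2, 257.7], ("t3L", "y"): [38.6, 44.6],
          ("wstL", "x"): [146.6, 154.6], ("wstL", "y"): [100.1, 74.3]}

# DeepLabCut expects a column MultiIndex of (scorer, bodyparts, coords) and, for
# projects like ours, a row index giving the path to each extracted frame.
scorer = "DLTdv"
columns = pd.MultiIndex.from_tuples([(scorer, bp, c) for bp, c in points],
                                    names=["scorer", "bodyparts", "coords"])
index = pd.MultiIndex.from_tuples([("labeled-data", "SomeVideoFolder", f) for f in frames])
df = pd.DataFrame(list(zip(*points.values())), index=index, columns=columns)

# DLC reads the H5 file; the CSV is the human-readable counterpart
# (the key name follows DLC's usual convention).
df.to_csv("CollectedData_DLTdv.csv")
df.to_hdf("CollectedData_DLTdv.h5", key="df_with_missing", mode="w")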