Piecewise continuous sampling: a method for minimizing bias and sampling effort for estimated metrics of animal behavior
Data files
Apr 05, 2024 version files 2.56 MB
- agreementData.csv
- rawDataActivity.csv
- rawDataTask.csv
- README.md
Abstract
Capturing qualitative features of animal behavior requires recording occurrences of behavior over time. Continuous sampling is best for capturing brief behaviors, but can be very time-consuming. Instantaneous sampling can reduce the amount of labor required, but can miss short-duration behaviors. We therefore synthesized these techniques by continuously sampling during randomly scattered time intervals, a technique we call piecewise continuous sampling. To optimize and test the efficacy of this technique, we collected a continuous behavioral dataset of harvester ant workers, and then we developed a protocol to estimate the amount of sampling time necessary to reconstruct the proportion of time animals spend in different behavioral states. This protocol finds the sample size needed for the variance of the sample to converge on the variation of the population. We then divided this estimated time into equal-duration intervals that were randomly distributed across the entire continuous dataset. Finally, we calculated both time-dependent and time-independent error from this sample. We found that 4 to 16 sampling intervals minimize both types of error simultaneously. This finding was robust to differences in underlying behavior and was validated with simulations, implying that this method could be used for many types of organisms.
README: Piecewise continuous sampling: a method for minimizing bias and sampling effort for estimated metrics of animal behavior
https://doi.org/10.5061/dryad.p8cz8w9z5
This archive contains all of the code and data used for the manuscript titled, "Piecewise continuous sampling: a method for minimizing bias and sampling effort for estimated metrics of animal behavior". The goal of this manuscript is to establish piecewise continuous sampling as a flexible method for capturing animal behavior and to give suggestions as to how ethologists can optimally sample their data. This work is based on the observation of ant behavior, but should have applications for other animals as well.
Description of the data and file structure
The file rawDataTask.csv contains second-by-second records of the tasks that 9 Pogonomyrmex californicus ants were performing over 11,041 seconds. This data is used for nearly all analyses in the manuscript, from the sample size estimate to the evaluation of error across different numbers of intervals. Each column represents a different ant (codes like BBB give the color marks present on the ant, in this case blue-blue-blue) and every row represents the behavior that the ant was performing during that second (one 'sample').
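As an illustration of how this layout can be used, the sketch below computes the proportion of time one ant spends in each behavioral state from a per-second column. It is written in Python for illustration only (the archive's own analyses are in R), and it assumes that the CSV has a header row of ant codes. The helper names `read_columns` and `time_budget` and the toy labels are hypothetical, not taken from the authors' script.

```python
import csv
from collections import Counter


def read_columns(path):
    """Read a CSV into {column name: list of values}.

    Assumption: the first row is a header of ant codes (e.g. BBB).
    """
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        return {}
    return {name: [row[name] for row in rows] for name in rows[0]}


def time_budget(states):
    """Proportion of samples (seconds) spent in each behavioral state."""
    if not states:
        raise ValueError("need at least one sample")
    counts = Counter(states)
    total = len(states)
    return {state: n / total for state, n in counts.items()}


# Toy per-second record standing in for one ant's column; the labels
# below are made up and are not the task codes used in the dataset.
example_ant = ["rest"] * 6 + ["brood"] * 3 + ["food"] * 1
budget = time_budget(example_ant)
```

With the real file, something like `time_budget(read_columns("rawDataTask.csv")["BBB"])` would give the continuous-sampling time budget for that ant, provided the header assumption holds.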
The file rawDataActivity.csv has the same structure as rawDataTask.csv, except that the behaviors of the same 9 ants were categorized with a slightly different method. Instead of categorizing their behaviors into 9 tasks, we categorized them into 3 activity levels. This data is used to validate the results drawn from the task dataset. As in rawDataTask.csv, each column represents a different ant and every row represents the behavior that the ant was performing during that second.
Finally, agreementData.csv contains the results of an experiment in which we evaluated the effect of increasing the number of intervals on observational data. Here, two experimentalists each recorded the behavior of a single ant for 660 seconds, categorizing its behavior in the same manner as in rawDataTask.csv. We then compared the two observers' records and calculated the degree to which they agreed (number of seconds in which the recorded task was identical for both observers / 660). Each 660-second sample was drawn at random from a longer 3-hour video and was divided into I = 1, 2, 4, 8, 16, or 32 non-overlapping intervals. We did this for 4 colonies. The file contains the final agreement values (percentageAgreement), I, and the colony ID.
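The two operations described above, placing I non-overlapping equal-length intervals at random within a longer recording and scoring agreement between two observers, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R implementation. The function names are hypothetical, the interval length is floored to whole seconds when the sampling budget is not divisible by I, and the placement scheme (sorted random offsets) is just one simple way to guarantee that intervals do not overlap.

```python
import random


def sample_intervals(n_seconds, budget, n_intervals, rng):
    """Place non-overlapping, equal-length intervals at random.

    Returns sorted (start, end) pairs in seconds, with end exclusive.
    Assumption: interval length is floor(budget / n_intervals).
    """
    length = budget // n_intervals
    free = n_seconds - length * n_intervals  # seconds left unsampled
    if length <= 0 or free < 0:
        raise ValueError("budget and recording length are incompatible")
    # Sorted random offsets into the unsampled time; shifting the k-th
    # offset by k * length keeps consecutive intervals from overlapping.
    offsets = sorted(rng.randint(0, free) for _ in range(n_intervals))
    return [(off + k * length, off + (k + 1) * length)
            for k, off in enumerate(offsets)]


def agreement(obs_a, obs_b):
    """Fraction of seconds in which two observers recorded the same task."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("records must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(obs_a, obs_b))
    return matches / len(obs_a)


# Example: a 660 s budget split into I = 4 intervals within a 3 h video.
rng = random.Random(1)
intervals = sample_intervals(3 * 60 * 60, 660, 4, rng)

# Toy records from two hypothetical observers (labels are made up).
score = agreement(["a", "b", "c", "d"], ["a", "b", "x", "d"])
```

In this sketch adjacent intervals may touch end to start but never overlap, and every interval lies within the recording.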
Code
All simulations and analyses were performed in R. For brevity, we compiled all functions into a single script (piecewiseContinuousSampling.R), which is partitioned into segments that correspond to different parts of the paper. Comments in the script explain the logic. The necessary packages are loaded in the first few lines of the script, and the only line of code that needs to be changed is the working directory. Make sure that the script is in the same folder as the three data files.
Methods
To create a continuously sampled dataset against which to compare sampling methods, we manually coded the behavior of nine Pogonomyrmex californicus ants continuously over a three-hour timespan. Six were from a small colony (~30 workers, 2 queens) and three were from a larger one (~110 workers, 2 queens), though both colonies were still considered small, as colonies in this species typically reach a size of 2,000–4,500 workers in the field (Johnson 2000). The nest was partitioned into a foraging arena and a brood chamber with a total surface area of 242 cm². The workers that were followed were selected based on the task they were doing at the beginning of the video, so as to capture a range of repertoires: brood care (interacting with brood), food processing (interacting with seeds or artificial diet), or resting (immobile). For each task group, two workers were selected from the small colony and one from the larger colony. Switches between activities were manually coded using the program Cowlog (Version 3.0.2; Hänninen & Pastell 2009) and results were visualized with the ggplot2 package in R. We categorized behaviors into 9 discrete tasks and 3 activity levels (Table S1). Our lab-reared colonies were started from newly mated P. californicus foundresses that were collected in 2017 from Pine Valley, California (lat 32°49′20″N, long 116°31′43″W, 1136 m elevation).