Data from: Long-term, high-resolution field monitoring reveals increased temporal persistence of larger aggregations in fruit flies
Data files
Feb 20, 2026 version files (33.92 KB)
- analysis.R (2.43 KB)
- Data.txt (29.06 KB)
- README.md (2.42 KB)
Abstract
Understanding how insect behaviors studied under controlled laboratory conditions unfold in natural environments remains a major challenge, which limits our understanding of the ecological relevance of the studied behaviors. The genetic and molecular mechanisms of social interactions are intensively studied in Drosophila melanogaster, often by exposing flies to group settings for periods ranging from minutes to several days. However, the duration of group formation in the wild remains poorly understood, limiting our ability to assess whether laboratory-identified social behaviors actually have the temporal opportunity to occur in nature. Using long-term, high-resolution field monitoring, we found that groups persist up to ten hours and that the duration of group formation increases with the number of flies but is not modulated by environmental factors such as temperature. These results provide rare empirical evidence for multi-hour group persistence in wild D. melanogaster, demonstrating that natural conditions can readily support the temporal windows required for many social and reproductive behaviors typically studied in the laboratory. More broadly, our findings highlight that aggregation size may be a key driver of group stability in natural insect populations, advancing efforts to link mechanistic behavioral research with ecological reality.
Dataset DOI: 10.5061/dryad.dfn2z35gd
Description of the data and file structure
Files and variables
File: analysis.R
Description: R script used for statistical analysis.
Workflow of the analysis:
1) Generation of additional variables from the dataset that include the total number of flies collected each day (variable: total_number_flies_per_day)
2) Generation of a vector that contains, for each day, the longest stretch of consecutive bins with trapped flies (no zero in between). These values were determined manually from the raw data.
3) Generation of variables that hold the environmental variables, reduced to one value per day (previously one value per bin).
4) Generation of a table that includes the longest group formation, environmental variables, and the total number of flies of that day. Each row is one observation day. The table is then saved.
5) Rows of the table in which total_number_flies_per_day was 0 or 1 were removed.
6) Descriptive statistics (mean and standard error of the mean) for the longest stretch of flies trapped without a zero in between per day.
7) A Generalized Linear Model is run that tests whether the longest stretch of flies trapped without a zero in between per day is influenced by environmental factors (temperature, humidity, daylight) and the total number of flies that day.
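Steps 1 and 3 to 6 of the workflow can be sketched as follows. This is a minimal Python illustration, not the authors' analysis.R: the per-bin rows, the `longest_stretch` values (standing in for the manually scored vector of step 2), and all names other than `total_number_flies_per_day` are made up for the example, and only temperature is carried along as an environmental variable.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Made-up per-bin rows (NOT values from Data.txt):
# (date, males, females, mean temperature of the day in °C).
bin_rows = [
    ("2025-06-02", 2, 1, 24.5),
    ("2025-06-02", 0, 3, 24.5),
    ("2025-06-02", 0, 0, 24.5),
    ("2025-06-05", 0, 1, 27.0),
    ("2025-06-05", 0, 0, 27.0),
    ("2025-06-09", 4, 2, 22.0),
    ("2025-06-09", 1, 0, 22.0),
    ("2025-06-09", 3, 3, 22.0),
]

# Made-up stand-in for the manually scored vector of step 2.
longest_stretch = {"2025-06-02": 2, "2025-06-05": 1, "2025-06-09": 3}

# Step 1: total number of flies per day (males + females over all bins).
total_number_flies_per_day = defaultdict(int)
# Step 3: the daily value is repeated in every bin row, so keep the first one.
temperature_per_day = {}
for date, males, females, temperature in bin_rows:
    total_number_flies_per_day[date] += males + females
    temperature_per_day.setdefault(date, temperature)

# Step 4: one row per observation day.
daily_table = [
    {
        "date": date,
        "longest_stretch": longest_stretch[date],
        "temperature": temperature_per_day[date],
        "total_number_flies_per_day": total,
    }
    for date, total in sorted(total_number_flies_per_day.items())
]

# Step 5: remove days on which 0 or 1 flies were caught.
kept = [row for row in daily_table if row["total_number_flies_per_day"] > 1]

# Step 6: mean and standard error of the mean (sample SD / sqrt(n)).
values = [row["longest_stretch"] for row in kept]
mean_stretch = mean(values)
sem_stretch = stdev(values) / sqrt(len(values))
```

In this toy input the day with a single fly is dropped in step 5, so the descriptive statistics are computed over the two remaining days.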
File: Data.txt
Description: Raw data
Variables
- Date_observation_started: Date on which the 24h monitoring interval started
- Month: Month of the respective monitoring day
- Day_of_the_year: Day of the year on which the 24h monitoring interval started
- Time_window: Time window of the open bin of the automated trap system
- D.melanogaster_Males: Number of D. melanogaster males found in the respective bin
- D.melanogaster_Females: Number of D. melanogaster females found in the respective bin
- Temperature: Mean temperature of observation day [°C]
- Humidity: Mean humidity of observation day [%]
- Minutes_of_Daylight: Minutes of daylight on observation day
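For readers who want to load the table outside R, the sketch below parses rows with the variable names listed above. It is only an illustration under assumptions: the tab delimiter, the presence of a header row, the date format, and the `bin_1`/`bin_2` labels for Time_window are guesses, and the two sample rows are invented rather than taken from Data.txt.

```python
import csv
import io

# Invented two-row excerpt; delimiter and value formats are assumptions.
sample = (
    "Date_observation_started\tMonth\tDay_of_the_year\tTime_window\t"
    "D.melanogaster_Males\tD.melanogaster_Females\t"
    "Temperature\tHumidity\tMinutes_of_Daylight\n"
    "2025-06-02\tJune\t153\tbin_1\t2\t1\t24.5\t61\t865\n"
    "2025-06-02\tJune\t153\tbin_2\t0\t3\t24.5\t61\t865\n"
)


def parse_row(raw):
    """Convert one text row into typed values (units as given in the variable list)."""
    males = int(raw["D.melanogaster_Males"])
    females = int(raw["D.melanogaster_Females"])
    return {
        "date": raw["Date_observation_started"],
        "day_of_year": int(raw["Day_of_the_year"]),
        "time_window": raw["Time_window"],
        "males": males,
        "females": females,
        "flies": males + females,  # total flies in this bin
        "temperature_c": float(raw["Temperature"]),
        "humidity_pct": float(raw["Humidity"]),
        "daylight_min": float(raw["Minutes_of_Daylight"]),
    }


reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
rows = [parse_row(raw) for raw in reader]
```

To read the real file, `io.StringIO(sample)` would be replaced by an opened file handle, after checking the actual delimiter and header in Data.txt.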
Code/software
R v. 4.5.2 with the packages car, lme4, lmerTest, and plotrix.
Access information
Other publicly accessible locations of the data:
- N/A
Data was derived from the following sources:
- N/A
Field monitoring took place from May 19th to November 10th, 2025, at our established field site (Figure 1A). Twice a week, an automated pet feeder with six compartments (Amazon Basics; Figure 1B) was deployed. Compartments of the feeder were filled with 0.4 g active dry yeast, 1.4 g sucrose, 100 ml water, 2 drops of dish soap, and a 5 mm thick slice of banana. Feeders were programmed so that one compartment at a time opened for two hours before closing automatically, providing six distinct 2-hour sampling intervals over 12 hours. After 12 hours, the feeder was replaced by a second feeder prepared in the same way to extend monitoring to 24 hours. Prior to being deployed, each feeder was kept for one hour at room temperature, allowing the yeast to become activated and to begin metabolizing the sucrose. All flies found in each compartment were counted, identified to the species level using Markow & O’Grady (2006), and sexed. D. melanogaster and D. simulans can be distinguished morphologically only in males, so both species were counted together and are referred to as D. melanogaster throughout the paper. For each observation day, mean temperature and humidity values were retrieved from wunderground.com based on measurements from the station “KTNMEMPH309”, positioned 685 m from the study location. The duration of daylight on each observation day was retrieved from www.timeanddate.com using Memphis, TN, as the location.
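The sampling design above implies a fixed number of bins per observation day. The short Python sketch below makes that arithmetic explicit; the constant and function names are ours, and bin times are expressed as hours since deployment because the clock time of deployment is not stated here.

```python
COMPARTMENTS_PER_FEEDER = 6   # six compartments per feeder
HOURS_PER_COMPARTMENT = 2     # each compartment is open for two hours
FEEDERS_PER_DAY = 2           # second feeder extends monitoring to 24 hours

BINS_PER_DAY = COMPARTMENTS_PER_FEEDER * FEEDERS_PER_DAY
HOURS_MONITORED = BINS_PER_DAY * HOURS_PER_COMPARTMENT


def bin_offsets():
    """Return (bin number, start hour, end hour) relative to deployment."""
    offsets = []
    for i in range(BINS_PER_DAY):
        start = i * HOURS_PER_COMPARTMENT
        offsets.append((i + 1, start, start + HOURS_PER_COMPARTMENT))
    return offsets


def bins_to_hours(n_bins):
    """Time spanned by a run of n consecutive 2-hour bins."""
    return n_bins * HOURS_PER_COMPARTMENT
```

Under this layout, bin 7 is the first compartment of the second feeder, and a run of five consecutive occupied bins spans ten hours.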
We quantified the longest continuous stretch of bins containing flies per observation day, representing the maximum duration of uninterrupted group activity. This variable was named Longest continuous group formation. For instance, if flies were found in bins one, two, and four, but not in three, this sequence was scored as “2”, i.e., two consecutive bins with captured flies. As days with no or only a single captured fly would have automatically resulted in a value of 0 or 1, respectively, these days were excluded from the analysis. The term “group” is used descriptively, as we did not assess whether group formation occurs, which has previously been shown (Dukas, 2020), but instead quantified the temporal persistence of aggregations. We then ran a Generalized Linear Model (GLM) with the longest continuous group formation as the response variable and the total number of flies captured on that day, temperature, humidity, and length of daylight as explanatory variables.
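The scoring rule can be written as a short function. The README states that these values were collected manually from the raw data, so the Python sketch below is only an automated equivalent of the rule as described, with a function name and example counts of our own choosing.

```python
def longest_continuous_group_formation(counts_per_bin):
    """Length of the longest run of consecutive bins with at least one fly.

    counts_per_bin: fly counts for one observation day, in bin order.
    """
    longest = 0
    current = 0
    for count in counts_per_bin:
        if count > 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # an empty bin interrupts the run
    return longest


# Example from the text: flies in bins one, two, and four, but not in three.
# The counts themselves are made up; only their zero/non-zero pattern matters.
example_day = [3, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0]
score = longest_continuous_group_formation(example_day)  # 2
```

A day with a single captured fly can only score 1, and a day without flies scores 0, which is why such days carry no information about persistence and were excluded.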
