COSMOS: A data-driven probabilistic time series simulator for chemical plumes across spatial scales
Abstract
The development of robust odor navigation strategies for automated environmental monitoring applications requires realistic simulations of odor time series for agents moving across large spatial scales. Traditional approaches that rely on computational fluid dynamics (CFD) methods can capture the spatiotemporal dynamics of odor plumes, but are impractical for large-scale simulations due to their computational expense. On the other hand, puff-based simulations, although computationally tractable for large scales and capable of capturing the stochastic nature of plumes, fail to reproduce naturalistic odor statistics. Here, we present COSMOS (Configurable Odor Simulation Model over Scalable Spaces), a data-driven probabilistic framework that synthesizes realistic odor time series from spatial and temporal features of real datasets. COSMOS generates similar distributions of key statistical features such as whiff frequency, duration, and concentration as observed in real data, while dramatically reducing computational overhead. By reproducing critical statistical properties across a variety of flow regimes and scales, COSMOS enables the development and evaluation of agent-based navigation strategies with naturalistic odor experiences. To demonstrate its utility, we compare odor-tracking agents exposed to CFD-generated plumes versus COSMOS simulations, showing that both their odor experiences and resulting behaviors are quite similar.
Data Files Walkthrough
Data Description:
All files are in pandas .h5 or NumPy .npz format and are compatible with 1.0.0 < pandas <= 1.5.3.
Download the data from Dryad. The data folder can be placed in the home folder under ~/COSMOS/.
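As a rough illustration of reading the two file formats, the sketch below dispatches on the file extension. The helper name `load_cosmos_file` is hypothetical and not part of the COSMOS scripts. It assumes each .h5 file holds a single pandas object (otherwise `pd.read_hdf` needs an explicit `key`) and that the .npz arrays are plain numeric arrays that load without pickling.

```python
from pathlib import Path

import numpy as np


def load_cosmos_file(path):
    """Load a data file by extension (hypothetical helper, not from the COSMOS repo).

    .npz -> dict mapping array names to NumPy arrays
    .h5  -> pandas object (requires pandas and PyTables)
    """
    path = Path(path).expanduser()
    suffix = path.suffix.lower()
    if suffix == ".npz":
        # np.load returns a lazy archive; copy the arrays out and close the file.
        # Assumption: arrays are not pickled objects (allow_pickle stays False).
        with np.load(path) as archive:
            return {name: archive[name] for name in archive.files}
    if suffix == ".h5":
        # Imported here so that reading .npz files does not require pandas.
        import pandas as pd

        # Assumption: the file contains exactly one pandas object.
        return pd.read_hdf(str(path))
    raise ValueError(f"Unsupported file type: {path.name}")


# Example calls, assuming the data folder was placed under ~/COSMOS/:
# hws = load_cosmos_file("~/COSMOS/data/hws/hws.h5")
# hmap = load_cosmos_file("~/COSMOS/data/hws/hmap.npz")
```

If a .h5 file turns out to contain several objects, `pd.HDFStore(path).keys()` lists the available keys to pass to `pd.read_hdf`.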
Folder Structure
├── data # Contains all data for COSMOS analysis
├── algorithm # datasets for algorithm visualization
├── forest # datasets and trained model for forest environment
├── hws # datasets and trained model for hws environment
├── lws # datasets and trained model for lws environment
├── rigolli # datasets and trained model for rigolli odor simulator
├── tracking # datasets for trajectory analysis in rigolli
├── svgs # contains svg files for reproducing the figure as seen in the paper
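After downloading, a quick check that the layout matches the tree above can save debugging time. The following is a minimal sketch: the function name `missing_subfolders` is hypothetical, and it assumes the six dataset folders sit directly inside the data folder.

```python
from pathlib import Path

# Subfolder names taken from the folder structure listed above.
EXPECTED_SUBFOLDERS = ("algorithm", "forest", "hws", "lws", "rigolli", "tracking")


def missing_subfolders(data_dir):
    """Return the expected subfolder names that are not present in data_dir."""
    data_dir = Path(data_dir).expanduser()
    return [name for name in EXPECTED_SUBFOLDERS if not (data_dir / name).is_dir()]


# Example call, assuming the data folder was placed under ~/COSMOS/:
# print(missing_subfolders("~/COSMOS/data"))
```

An empty list means every expected subfolder was found.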
Below are the file descriptions under respective folders:
- algorithm
WindyMASigned.h5: Contains interpolated sensor and odor data from Desert higher wind speeds.
intermediates.h5: Contains COSMOS’s intermediate signals.
whiff.h5: Contains empirical whiff statistics for WindyMASigned.h5.
- forest
forest.h5: Contains interpolated sensor and odor data from Whittell Forest.
forest_hmap_with_edges.npz: Trained spatial model for the forest dataset.
- hws
hws.h5: Contains empirical and COSMOS-simulated odor data for Desert higher wind speeds (HWS).
hmap.npz: Trained spatial model for the Desert HWS dataset.
whiff.h5: Contains whiff statistics for wind speed > 3 m/s for desert.
nowhiff.h5: Contains blank statistics for wind speed > 3 m/s for desert.
- lws
lws.h5: Contains empirical and COSMOS-simulated odor data for Desert lower wind speeds (LWS).
lws_hmap_with_edges.npz: Trained spatial model for the Desert LWS dataset.
whiff.h5: Contains whiff statistics for wind speed < 3 m/s for desert.
nowhiff.h5: Contains blank statistics for wind speed < 3 m/s for desert.
- rigolli
rigolli.h5: Contains empirical and COSMOS-simulated odor from Rigolli’s CFD odor simulator.
hmap.npz: Trained spatial model for the Rigolli odor simulator.
whiff.h5: Contains whiff statistics for the Rigolli simulator.
nowhiff.h5: Contains blank statistics for the Rigolli simulator.
- tracking
labels_150.npy: Cluster labels for the UMAP projection of the features of trajectories run through CFD and COSMOS.
X_umap_150.npy: Clusters in the UMAP projection of the features of trajectories run through CFD and COSMOS.
plot_trajs: Folder used to plot the trajectories run through the CFD and COSMOS simulators using the surge-and-cast algorithm, for the paper plot.
trajectories: All trajectories, with their motion and odor experience, run through the CFD and COSMOS simulators using the surge-and-cast algorithm, for the paper plot.
timing: CPU performance and analysis datasets for the trajectories that were run in the above simulators.
Figure:
Figure folder is available for download. The following are the files that can be found in the Figure folder:
algorithmv3.svg: (Figure 1) Flow diagram and overview of COSMOS.
results_hws.svg: (Figure 2) COSMOS results for the HWS dataset.
results_rigolli.svg: (Figure 3) COSMOS results for the Rigolli odor simulator.
results_trackingv1.svg: (Figure 4) Surge-and-cast algorithm run through CFD and COSMOS, showing experience similarities and time taken.
results_lws.svg: (Figure 5) COSMOS results for the LWS dataset.
results_forest.svg: (Figure 6) COSMOS results for the Forest dataset.
S1.svg: (Figure 7) Supplemental figure showing the bins for empirical whiff statistics, the stepwise output of concentration modeling, and the flow diagram of the intermittency modeling block.
Usage
This dataset is intended for use in developing and testing algorithms related to odor source localization in outdoor wind conditions in desert and forest terrains. Researchers are encouraged to utilize this simulator for developing odor tracing strategies for varying environmental conditions.
Dependencies
To reproduce the figures and view the results and calculations, you will need to install the following:
Follow the FigureFirst setup instructions for Inkscape.
Install Virtualenv
- Create the virtualenv:
virtualenv -p /usr/bin/python3.10 <env-name>
- Install Packages:
pip install pandas
pip install h5py
pip install numpy
pip install matplotlib
pip install figurefirst
pip install seaborn
pip install scikit-learn
pip install tables
pip install tsfresh
python -m pip install statsmodels
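Because the data files are stated to be compatible only with 1.0.0 < pandas <= 1.5.3, it can help to confirm the installed version before loading anything. The sketch below is one possible check using only the standard library. The function names are hypothetical, and the parsing is deliberately simple: it reads only the leading numeric components and ignores pre-release suffixes such as `rc1`.

```python
import re


def parse_version(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def pandas_version_supported(version):
    """Return True if 1.0.0 < version <= 1.5.3 (range stated in Data Description)."""
    # Pad to three components so that "1.3" is compared as (1, 3, 0).
    major_minor_patch = (parse_version(version) + (0, 0, 0))[:3]
    return (1, 0, 0) < major_minor_patch <= (1, 5, 3)


# Example call inside the virtualenv:
# import pandas as pd
# print(pandas_version_supported(pd.__version__))
```

If the check fails, pinning the install with `pip install "pandas<=1.5.3"` is one way to stay inside the stated range.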
Software/Scripts
The scripts can be found in the COSMOS repository on GitHub.