Dryad

Data from: Odor source distance is predictable from time-histories of odor statistics for large scale outdoor plumes

Cite this dataset

Nag, Arunava; van Breugel, Floris (2024). Data from: Odor source distance is predictable from time-histories of odor statistics for large scale outdoor plumes [Dataset]. Dryad. https://doi.org/10.5061/dryad.2547d7wvr

Abstract

Odor plumes in turbulent environments are intermittent and sparse. Lab-scaled experiments suggest that information about the source distance may be encoded in odor signal statistics, yet it is unclear whether useful and continuous distance estimates can be made under real-world flow conditions. Here we analyze odor signals from outdoor experiments with a sensor moving across large spatial scales in desert and forest environments to show that odor signal statistics can yield useful estimates of distance. We show that achieving accurate estimates of distance requires integrating statistics from 5-10 seconds, with a high temporal encoding of the olfactory signal of at least 20 Hz. By combining distance estimates from a linear model with wind-relative motion dynamics, we achieved source distance estimates in a 60 × 60 m² search area with median errors of 3-8 meters, a distance at which point odor sources are often within visual range for animals such as mosquitoes.

README: Odor source distance is predictable from time-histories of odor statistics for large scale outdoor plumes

This repository consists of the data and analysis for the odor tracking experiment.


Preprint: "Odor source location can be predicted from a time-history of odor statistics for a large-scale outdoor plume"

Data Files Walkthrough

Data Description:

This contains all datasets used in the analysis for the preprint "Odor source location can be predicted from a time-history of odor statistics for a large-scale outdoor plume". The analysis repository is available on GitHub. All files are pandas HDF5 files (.h5 or .hdf) and are compatible with 1.0.0 < pandas <= 1.5.3.
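
The stated pandas range can be checked before loading any file. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `pandas_version_supported` are hypothetical, and only the numeric major.minor.patch part of the version string is compared.

```python
import re

# Bounds taken from the compatibility note above: 1.0.0 < pandas <= 1.5.3
LOWER_EXCLUSIVE = (1, 0, 0)
UPPER_INCLUSIVE = (1, 5, 3)


def parse_version(version: str) -> tuple:
    """Return the leading numeric components of a version string as a 3-tuple."""
    numbers = [int(n) for n in re.findall(r"\d+", version.split("+")[0])[:3]]
    # Pad short versions such as "1.5" to three components.
    return tuple(numbers + [0] * (3 - len(numbers)))


def pandas_version_supported(version: str) -> bool:
    """Hypothetical helper: True if `version` lies in the documented range."""
    return LOWER_EXCLUSIVE < parse_version(version) <= UPPER_INCLUSIVE
```

In a session where pandas is installed, the check could be called as `pandas_version_supported(pandas.__version__)`.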

Download the data from Dryad. The data folder, along with the Figure folder, can be placed in the home folder under ~/odor_anaylsis/ .
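
Once the folders are in place, a file can be read with `pandas.read_hdf`. The sketch below is one possible way to do this; the helper names `dataset_path` and `load_dataset` are hypothetical, the root folder is copied from the text above, and it is an assumption that each file stores a single pandas object (so no `key` argument is needed).

```python
from pathlib import Path

# Root folder as written in the README above; adjust if your copy lives elsewhere.
DEFAULT_ROOT = Path("~/odor_anaylsis").expanduser()


def dataset_path(filename: str, subfolder: str = "data", root: Path = DEFAULT_ROOT) -> Path:
    """Build the path to a dataset file below the chosen root folder."""
    return Path(root) / subfolder / filename


def load_dataset(filename: str, subfolder: str = "data", root: Path = DEFAULT_ROOT):
    """Read one .h5 file into a pandas DataFrame.

    Assumes each file stores a single pandas object, so no `key` is passed.
    """
    import pandas as pd  # imported here so the path helper works without pandas

    return pd.read_hdf(str(dataset_path(filename, subfolder, root)))
```

For example, `load_dataset("ForestMASigned.h5")` would read the forest dataset from the data folder, and `load_dataset("WindyStatsTime_std.h5", subfolder="derived_data")` would read one of the derived files.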

The datasets can be divided into:

  1. Interpolated sensor and stationary wind sensor data: These datasets include data from the mobile sensor stack (IMU, GPS and odor sensor) and from the stationary wind sensors (ambient wind velocity and direction). All data are interpolated to the odor sensor's sampling rate.

  2. Derived dataframes: Dataframes derived for specific analyses, for example whiff statistics calculation or lookback time analysis, so that they can be used directly.

Folder Structure

├── data                       # Contains mobile sensor data from desert and forest
├── derived_data
   ├── KF                      # datasets for kalman filter analysis 
   ├── stationery_wind_data    # contains ambient wind sensor data from desert and forest
   ├── wind_lag_analysis       # datasets for wind lag analysis 
├── Figure                     # contains svg files for reproducing the figure as seen in the paper

Below are the file descriptions under respective folders:

  1. data
    • ForestMASigned.h5 : Contains interpolated sensor data and signed distance axis for general Whittell Forest data analysis.
    • NotWindyLR.h5 : Low resolution interpolated sensor data for wind speed < 3m/s for desert.
    • NotWindyMASigned.h5: Contains high resolution interpolated clean sensor data for wind speed < 3m/s for desert.
    • WindyMASigned.h5: Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  2. derived_data
    • NotWindyStatsTime_std.h5: Contains derived lookback time whiff statistics for wind speed < 3m/s for desert.
    • WindyStatsTime_std.h5: Contains derived lookback time whiff statistics for wind speed > 3m/s for desert.
    • ForestStatsTime_std.h5 : Contains derived lookback whiff statistics for forest.
    • 1hz.h5, 10hz.h5, 60hz.h5 : Odor signal low-pass filtered with a 2nd-order Butterworth filter. Use with the script Figure 8.
    • LpfForestFiltered.h5, LpfHWSFiltered.h5, LpfLWSFiltered.h5 : Datasets showing the effect of low-pass filtering the odor signal on the R-squared value. Use with Figure 8.
    • HWSLTall.h5, LWSLTall.h5, ForestLTall.h5, R2LtTime.h5 : Datasets used in the script Figure 5 to analyze the effect of the lookback window on R-squared when correlated with distance.
    • aic_filtered_model_params.h5 : Contains coefficients from the statsmodels model relating distance to the AIC-filtered whiff statistics for all three wind scenarios. Run the script Figure S9.
    • lt_whiff_statistics.h5, All_AICDeltaTab.h5, all_Rsquared.h5, AllRsquaredAicCombinations.h5 : Contain all 25 whiff statistics calculated at different distances from the source over a lookback time of 10 seconds. Run the script Figure 5, sections Bootstrapped R2 and Bootstrapped R2 and AIC for filtered parameters, to see the results.
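
The low-pass files above are described as the odor signal passed through a 2nd-order Butterworth filter. As a rough illustration of that kind of filter (not the authors' script), the sketch below builds a 2nd-order Butterworth low-pass as a single biquad section in plain Python. The 200 Hz sampling rate comes from the Methods section; reading the file names 1hz, 10hz and 60hz as cutoff frequencies is an assumption, as are the function names and the single forward (causal) pass.

```python
import math


def butter2_lowpass_coeffs(cutoff_hz: float, fs_hz: float):
    """Coefficients (b, a) of a 2nd-order Butterworth low-pass biquad.

    Uses the bilinear-transform biquad form with Q = 1/sqrt(2).
    """
    if not 0 < cutoff_hz < fs_hz / 2:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")
    w0 = 2 * math.pi * cutoff_hz / fs_hz
    alpha = math.sin(w0) / math.sqrt(2)   # sin(w0) / (2 * Q) with Q = 1/sqrt(2)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 - cos_w0) / 2 / a0, (1 - cos_w0) / a0, (1 - cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def lowpass(signal, cutoff_hz: float, fs_hz: float = 200.0):
    """Filter `signal` (a sequence of floats) with one forward pass."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    out = []
    x1 = x2 = y1 = y2 = 0.0               # previous inputs and outputs
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

A call such as `lowpass(odor, cutoff_hz=10.0)` would then correspond to the assumed 10 Hz case. With SciPy installed, `scipy.signal.butter(2, cutoff_hz, fs=fs_hz)` is the usual way to obtain such a design.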

Figure:

Figure folder is available for download. The following are the files that can be found in the Figure folder, which can be seen in the paper:

  • method1.svg : (Figure1) Contains interpolated sensor data and signed distance axis for general Whittell Forest data analysis.
  • method2.svg : (Figure2) Low resolution interpolated sensor data for wind speed < 3m/s for desert.
  • fig3_revised.svg : (Figure3) Contains high resolution interpolated clean sensor data for wind speed < 3m/s for desert.
  • fig4.svg : (Figure4) Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  • r2AicStatPlot.svg : (Figure5) Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  • clustering.svg : (Figure6) Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  • fig7.svg : (Figure7) Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  • lpf.svg : (Figure8) Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  • verticalMovement.svg : Figure S1: Contains high resolution interpolated sensor data for wind speed > 3m/s for desert.
  • motionanalysis.svg : Figure S2: Motion analysis showing agent's movements similar to casting motions.
  • normalityAnalysis.svg : Figure S3: Comprehensive residual analysis for model validation
  • windLagAnalysis.svg : Figure S4: Wind characteristics from our LWS, HWS and Forest scenarios span a representative range of near surface wind characteristics from a larger dataset.
  • time_spent.svg : Figure S5: Time spent - vs number of encounters
  • individualDatasetWhiffStat.svg : Figure S6: Time spent - vs number of encounters
  • mc_wsd.svg : Figure S7: Time spent - vs number of encounters
  • windAicParams.svg : Figure S9: AIC filtered coefficients across various wind scenarios
  • klmsupplmental.svg : Figure S10: Kalman smoothed estimates of the distance to the odor source zoomed

Data Fields

Interpolated sensor and stationary ambient sensor data

This section presents real-time and interpolated measurements from both mobile sensors on an agent and fixed ambient sensors, providing a comprehensive view of the environmental and agent-specific dynamics.

| Field Name | Description |
| --- | --- |
| master_time | Unix timestamp (time since epoch) |
| time | Timestamps starting from 0 |
| lon | GPS longitude |
| lat | GPS latitude |
| alt | GPS altitude |
| xsrc | GPS coordinate converted to x-axis position in meters |
| ysrc | GPS coordinate converted to y-axis position in meters |
| imu_angular_x | Angular velocity about the x-axis |
| imu_angular_y | Angular velocity about the y-axis |
| imu_angular_z | Angular velocity about the z-axis |
| imu_linear_acc_x | Linear acceleration along the x-axis |
| imu_linear_acc_y | Linear acceleration along the y-axis |
| imu_linear_acc_z | Linear acceleration along the z-axis |
| U | East-west velocity of ambient wind (from stationary ground sensor) |
| V | North-south velocity of ambient wind (from stationary ground sensor) |
| D | X-Y ambient wind direction (from stationary ground sensor) |
| S | X-Y ambient wind magnitude (from stationary ground sensor) |
| S2 | Ambient wind speed (from stationary ground sensor) |
| corrected_u | Ambient wind velocity in east-west direction |
| corrected_v | Ambient wind velocity in north-south direction |
| nearest_from_streakline | Odor encounter in streakline coordinates in meters |
| distance_along_streakline | Distance along streakline to odor encounter in meters |
| distance_from_source | Distance from source to odor encounter in meters |
| relative_parallel_comp | Relative parallel motion component of the agent with respect to wind |
| relative_perpendicular_comp | Relative perpendicular motion component of the agent with respect to wind |
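
To make the wind-related fields concrete, the sketch below derives a horizontal wind speed and direction from east-west and north-south velocities such as U and V, and splits an agent velocity into components parallel and perpendicular to the wind. The direction convention (angle of the wind vector in radians, counter-clockwise from east), the sign of the perpendicular component and the function names are assumptions for illustration; the dataset's own definitions are given in the manuscript and analysis scripts.

```python
import math


def wind_speed_direction(u: float, v: float):
    """Horizontal wind magnitude and angle from east-west (u) and north-south (v) velocity."""
    speed = math.hypot(u, v)
    direction = math.atan2(v, u)  # assumed convention: radians counter-clockwise from east
    return speed, direction


def wind_relative_components(vx: float, vy: float, u: float, v: float):
    """Project an agent velocity (vx, vy) onto the wind direction.

    Returns (parallel, perpendicular). If the wind speed is zero the wind
    direction is undefined and (0.0, 0.0) is returned.
    """
    speed = math.hypot(u, v)
    if speed == 0.0:
        return 0.0, 0.0
    ux, uy = u / speed, v / speed          # unit vector along the wind
    parallel = vx * ux + vy * uy           # dot product with the wind direction
    perpendicular = ux * vy - uy * vx      # 2-D cross product: positive to the left of the wind
    return parallel, perpendicular
```

For instance, an agent moving north at 1 m/s in a wind blowing due east has a parallel component of 0 and a perpendicular component of 1 under these conventions.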

Derived Analysis Data

This section details analytics derived from the raw and interpolated data, aimed at understanding patterns, frequencies, and statistical measures related to odor encounters and agent navigation relative to the source.

| Field Name | Description |
| --- | --- |
| type | Classification of encounter distance from source (0: 0-5 m, 1: 5-30 m, 2: >30 m) |
| distance | Average distance over a look back time |
| avg_dist_from_source | Averaged distance from source in meters within a whiff |
| avg_dist_from_streakline | Averaged distance along streakline in meters within a whiff |
| mean_whiff_time | Average time duration of whiffs over a look back time |
| nwhiffs | Number of whiffs in a look back time |
| efreq | Encounter frequency in Hz |
| mean_ef | Average encounter frequency in Hz |
| std_whiff | Mean standard deviation of odor within a whiff |
| whiff_ma | Average moving average across a whiff |
| mc_min | Minimum of whiff concentration over a look back time |
| mc_max | Maximum of whiff concentration over a look back time |
| mc_mean | Average of whiff concentration over a look back time |
| mc_std_dev | Standard deviation of whiff concentration over a look back time |
| mc_k | Kurtosis of whiff concentration over a look back time |
| wf_min | Minimum of whiff frequency over a look back time |
| wf_max | Maximum of whiff frequency over a look back time |
| wf_mean | Average of whiff frequency over a look back time |
| wf_std_dev | Standard deviation of whiff frequency over a look back time |
| wf_k | Kurtosis of whiff frequency over a look back time |
| wd_min | Minimum of whiff duration over a look back time |
| wd_max | Maximum of whiff duration over a look back time |
| wd_mean | Average of whiff duration over a look back time |
| wd_std_dev | Standard deviation of whiff duration over a look back time |
| wd_k | Kurtosis of whiff duration over a look back time |
| ma_min | Minimum of moving average over a look back time |
| ma_max | Maximum of moving average over a look back time |
| ma_mean | Average of moving average over a look back time |
| ma_std_dev | Standard deviation of moving average over a look back time |
| ma_k | Kurtosis of moving average over a look back time |
| st_min | Minimum of standard deviation over a look back time |
| st_max | Maximum of standard deviation over a look back time |
| st_mean | Average of standard deviation over a look back time |
| st_std_dev | Standard deviation of standard deviation over a look back time |
| st_k | Kurtosis of standard deviation over a look back time |
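
Several of the fields above summarize whiffs inside a look-back window. The sketch below shows one simple way such statistics could be computed from a sampled odor signal, taking a whiff to be a run of consecutive samples above a threshold. The threshold, the function names and the exact definitions (for example, averaging per-whiff mean concentrations for `mc_mean`) are assumptions made for illustration; the manuscript and the analysis repository hold the authors' own definitions.

```python
from statistics import mean


def find_whiffs(odor, threshold):
    """Return (start, stop) sample index pairs for runs where odor > threshold (stop exclusive)."""
    whiffs, start = [], None
    for i, value in enumerate(odor):
        if value > threshold and start is None:
            start = i                          # a whiff begins
        elif value <= threshold and start is not None:
            whiffs.append((start, i))          # the whiff ended at the previous sample
            start = None
    if start is not None:                      # signal ends inside a whiff
        whiffs.append((start, len(odor)))
    return whiffs


def lookback_stats(odor, fs_hz, threshold, lookback_s=10.0):
    """Whiff count, mean whiff duration (s) and mean whiff concentration
    over the last `lookback_s` seconds of `odor`."""
    n = int(round(lookback_s * fs_hz))
    window = list(odor[-n:]) if n > 0 else []
    whiffs = find_whiffs(window, threshold)
    if not whiffs:
        return {"nwhiffs": 0, "wd_mean": 0.0, "mc_mean": 0.0}
    durations = [(stop - start) / fs_hz for start, stop in whiffs]
    concentrations = [mean(window[start:stop]) for start, stop in whiffs]
    return {"nwhiffs": len(whiffs), "wd_mean": mean(durations), "mc_mean": mean(concentrations)}
```

With a 200 Hz signal, `lookback_stats(odor, fs_hz=200.0, threshold=some_threshold)` would summarize the most recent 10 seconds, where `some_threshold` is a placeholder to be chosen for the sensor in use.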

Usage

This dataset is intended for use in developing and testing algorithms related to odor source localization in outdoor wind conditions in desert and forest terrains. Researchers are encouraged to utilize this data for exploring innovative approaches for odor source localization under varying environmental conditions.

Dependencies

To visualize the figures and reproduce the results and calculations, install the following:

  1. FlyPlotLib
  2. FigureFirst
  3. Inkscape 1.2

Follow the FigureFirst instructions to set it up in Inkscape.

Setup Environment

Folder Structure

├── data_exploration             # Contains all the analysis scripts
   ├── bag2h5.py                 # script for changing rosbag file to pandas dataframe
   ├── dataInterpolation.ipynb   # interpolation to match odor sensor sampling rate
   ├── odor_statitics_lib.py
   ├── figure                    # all script for generating figure
      └── supplemental           # contains scripts to generate supplemental figures
   └── demo                      # some analysis demo scripts
├── dataFileDescription.md       # readme for datafiles submitted in Data Dryad
└── README.md

Install Virtualenv

  1. Create the virtualenv:

    virtualenv -p /usr/bin/python3.8 <env-name>  
    
  2. Install Packages:

    pip install "pandas>1.0.0,<=1.5.3"
    pip install h5py
    pip install numpy
    pip install matplotlib
    pip install figurefirst
    pip install seaborn
    pip install scikit-learn
    pip install tables
    python -m pip install statsmodels
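
After installation, a quick check that the packages can be found may save time later. The script below is a small sketch using only the standard library; the function name `missing_modules` is hypothetical, and the list holds the import names corresponding to the packages installed above.

```python
import importlib.util

# Import names for the packages installed above (scikit-learn imports as `sklearn`).
REQUIRED_MODULES = ["pandas", "h5py", "numpy", "matplotlib", "figurefirst",
                    "seaborn", "sklearn", "tables", "statsmodels"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All packages found." if not missing else f"Missing: {', '.join(missing)}")
```

Run it with the virtualenv's Python interpreter so that it inspects the environment the packages were installed into.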
    

Software/Scripts

The odor_analysis.zip folder contains the software, which is also available on GitHub with a detailed description of how to use it.

Methods

The setup can be divided into two components: the mobile sensor stack, which was carried by a human to collect odor signals, and 8 stationary wind sensors placed around the odor source for ambient wind measurements. The odor source was a propylene gas tank fitted with a stationary GPS antenna, which sent Real-Time Kinematic (RTK) correction data to the antenna mounted on the mobile sensor stack to provide a high-resolution position with an accuracy close to 1 cm.

The mobile sensor stack included an odor sensor, a GPS antenna to receive accurate location measurements, and an IMU that provided angular velocity measurements. The sensor stack was balanced on a gimbal for stability and ease of carrying. The odor sensor data was collected using a data acquisition (DAQ) unit, which was connected along with all the other sensors to a computer that ran ROS as middleware and recorded data in real time. Because the sensors had different sampling rates, the data was interpolated to the rate of the odor sensor, which sampled at 200 Hz.
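
The resampling step described above can be illustrated with a small example. This sketch linearly interpolates a slower sensor stream onto the odor sensor's 200 Hz timestamps; linear interpolation, the clamping at the ends, the function name and the example 10 Hz stream are assumptions, since the text only states that the data was interpolated to the odor sensor's rate.

```python
from bisect import bisect_right


def interpolate_to(target_times, source_times, source_values):
    """Linearly interpolate (source_times, source_values) at each target time.

    `source_times` must be sorted in increasing order. Targets outside the
    source range take the nearest end value.
    """
    if len(source_times) != len(source_values) or not source_times:
        raise ValueError("source_times and source_values must be non-empty and equal in length")
    result = []
    for t in target_times:
        if t <= source_times[0]:
            result.append(source_values[0])
        elif t >= source_times[-1]:
            result.append(source_values[-1])
        else:
            j = bisect_right(source_times, t)      # source_times[j-1] <= t < source_times[j]
            t0, t1 = source_times[j - 1], source_times[j]
            v0, v1 = source_values[j - 1], source_values[j]
            result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return result


# Example: a hypothetical 10 Hz position stream resampled onto 200 Hz odor timestamps.
odor_times = [i / 200 for i in range(41)]          # 0.0 .. 0.2 s
gps_times = [0.0, 0.1, 0.2]
gps_x = [0.0, 1.0, 3.0]
gps_x_200hz = interpolate_to(odor_times, gps_times, gps_x)
```

For array data, `numpy.interp` performs the same kind of linear interpolation with end clamping.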

Around the odor source, 7 ambient wind sensors were placed in a square arrangement approximately 30 meters from the source, and one additional wind sensor was placed 1 meter from the odor source for accurate measurement of wind speed and direction near the source.

Refer to the manuscript for more details.

Usage notes

These are data frames with .hdf and .h5 extensions, processed with Python pandas. They can be opened using Python pandas: https://pandas.pydata.org/pandas-docs/version/1.5/getting_started/install.html

Inkscape is needed to open or produce the figures.

Funding

United States Air Force Research Laboratory, Award: FA8651-20-1-0002

United States Air Force Office of Scientific Research, Award: FA9550-21-0122

Alfred P. Sloan Foundation, Award: FG-2020-13422

National Science Foundation, Award: 2112085