Data from: Odor source distance is predictable from time-histories of odor statistics for large scale outdoor plumes
Cite this dataset
Nag, Arunava; van Breugel, Floris (2024). Data from: Odor source distance is predictable from time-histories of odor statistics for large scale outdoor plumes [Dataset]. Dryad. https://doi.org/10.5061/dryad.2547d7wvr
Abstract
Odor plumes in turbulent environments are intermittent and sparse. Lab-scaled experiments suggest that information about the source distance may be encoded in odor signal statistics, yet it is unclear whether useful and continuous distance estimates can be made under real-world flow conditions. Here we analyze odor signals from outdoor experiments with a sensor moving across large spatial scales in desert and forest environments to show that odor signal statistics can yield useful estimates of distance. We show that achieving accurate estimates of distance requires integrating statistics from 5-10 seconds, with a high temporal encoding of the olfactory signal of at least 20 Hz. By combining distance estimates from a linear model with wind-relative motion dynamics, we achieved source distance estimates in a 60 m × 60 m search area with median errors of 3-8 meters, a distance at which point odor sources are often within visual range for animals such as mosquitoes.
README: Odor source distance is predictable from time-histories of odor statistics for large scale outdoor plumes
This repository contains the data analysis for the odor tracking experiment.
Abstract
Odor plumes in turbulent environments are intermittent and sparse. Lab-scaled experiments suggest that information about the source distance may be encoded in odor signal statistics, yet it is unclear whether useful and continuous distance estimates can be made under real-world flow conditions. Here we analyze odor signals from outdoor experiments with a sensor moving across large spatial scales in desert and forest environments to show that odor signal statistics can yield useful estimates of distance. We show that achieving accurate estimates of distance requires integrating statistics from 5-10 seconds, with a high temporal encoding of the olfactory signal of at least 20 Hz. By combining distance estimates from a linear model with wind-relative motion dynamics, we achieved source distance estimates in a 60x60 meter square search area with median errors of 3-8 meters, a distance at which point odor sources are often within visual range for animals such as mosquitoes.
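As a rough illustration of how per-window distance estimates might be combined with motion information, the following is a minimal sketch of a scalar Kalman filter. It is not the filter used in the paper: the function name `kalman_distance`, the `approach_speeds` input (speed towards the source), and the noise variances `q`, `r`, and `p0` are all hypothetical choices made for this example.

```python
def kalman_distance(measurements, approach_speeds, dt, q=0.5, r=25.0, p0=100.0):
    """Smooth noisy distance estimates with a one-state Kalman filter.

    measurements    : distance estimates in metres (e.g. from a linear model)
    approach_speeds : speed towards the source in m/s (positive = getting closer);
                      a hypothetical stand-in for wind-relative motion
    dt              : time step between samples in seconds
    q, r, p0        : process, measurement, and initial variances (assumed values)
    """
    d = measurements[0]  # initialise the state with the first measurement
    p = p0
    estimates = []
    for z, v in zip(measurements, approach_speeds):
        # Predict: moving towards the source shrinks the distance.
        d = d - v * dt
        p = p + q
        # Update: blend the prediction with the new measurement.
        k = p / (p + r)
        d = d + k * (z - d)
        p = (1.0 - k) * p
        estimates.append(d)
    return estimates


if __name__ == "__main__":
    noisy = [12.0, 8.0, 12.0, 8.0]
    print(kalman_distance(noisy, [0.0] * len(noisy), dt=1.0))
```

A larger `r` relative to `q` makes the filter trust the motion model more and the individual distance estimates less.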
Data Files Walkthrough
Data Description:
This dataset contains all data used in the analysis for the preprint "Odor source location can be predicted from a time-history of odor statistics for a large-scale outdoor plume". The analysis code is available in the GitHub repository. All files are in pandas .h5 or .hdf format and are compatible with 1.0.0 < pandas <= 1.5.3.
Download the data from Dryad. The data folder, along with the figure folder, can be placed in the home folder under ~/odor_anaylsis/.
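A minimal sketch of loading one of the files, assuming the folders sit directly under ~/odor_anaylsis/ as described above. The helper names (`pandas_version_supported`, `dataset_path`, `load_dataset`) are hypothetical and not part of the analysis repository; `pandas.read_hdf` is the standard pandas reader and needs the `tables` package installed.

```python
import re
from pathlib import Path

# Folder name spelled exactly as in this README.
DATA_ROOT = Path.home() / "odor_anaylsis"


def pandas_version_supported(version):
    """Return True if version satisfies 1.0.0 < pandas <= 1.5.3."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        return False
    parts = tuple(int(group) for group in match.groups())
    return (1, 0, 0) < parts <= (1, 5, 3)


def dataset_path(folder, filename, root=DATA_ROOT):
    """Build the path to a dataset file, e.g. data/WindyMASigned.h5."""
    return Path(root) / folder / filename


def load_dataset(folder, filename, root=DATA_ROOT):
    """Read one .h5/.hdf file into a pandas DataFrame."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    if not pandas_version_supported(pd.__version__):
        raise RuntimeError(f"pandas {pd.__version__} is outside the supported range")
    # The key can be omitted when the file holds a single pandas object.
    return pd.read_hdf(str(dataset_path(folder, filename, root)))
```

For example, `load_dataset("data", "WindyMASigned.h5")` would return the high wind speed desert dataset, provided the file is stored at that location.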
The datasets can be divided into:
- Interpolated sensor and stationary wind sensor data: These datasets include data from the mobile sensor stack (IMU, GPS, and odor sensor) and data from the stationary wind sensors (ambient wind velocity and direction). All data are interpolated to the odor sensor's sampling rate.
- Derived dataframes: Dataframes derived for specific analyses, for example whiff statistics calculation or lookback time analysis, so that they can be used directly.
Folder Structure
├── data # Contains mobile sensor data from desert and forest
├── derived_data
├── KF # datasets for kalman filter analysis
├── stationery_wind_data # contains ambient wind sensor data from desert and forest
├── wind_lag_analysis # datasets for wind lag analysis
├── Figure # contains svg files for reproducing the figure as seen in the paper
Below are the file descriptions under respective folders:
- data
  - ForestMASigned.h5: Contains interpolated sensor data and signed distance axis for general Whittell Forest data analysis.
  - NotWindyLR.h5: Low resolution interpolated sensor data for wind speed < 3 m/s for desert.
  - NotWindyMASigned.h5: Contains high resolution interpolated clean sensor data for wind speed < 3 m/s for desert.
  - WindyMASigned.h5: Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- derived_data
  - NotWindyStatsTime_std.h5: Contains derived lookback time whiff statistics for wind speed < 3 m/s for desert.
  - WindyStatsTime_std.h5: Contains derived lookback time whiff statistics for wind speed > 3 m/s for desert.
  - ForestStatsTime_std.h5: Contains derived lookback whiff statistics for forest.
  - 1hz.h5, 10hz.h5, 60hz.h5: Odor signal low-pass filtered with a 2nd-order Butterworth filter. Use with the script for Figure 8.
  - LpfForestFiltered.h5, LpfHWSFiltered.h5, LpfLWSFiltered.h5: Datasets showing the effect on the R-squared value when the odor signal is low-pass filtered. Use with the script for Figure 8.
  - HWSLTall.h5, LWSLTall.h5, ForestLTall.h5, R2LtTime.h5: Datasets used in the script for Figure 5 to analyse the effect of the lookback window on R-squared when correlated with distance.
  - aic_filtered_model_params.h5: Contains coefficients from the statsmodels fit of AIC-filtered whiff statistics correlated with distance, for all three wind scenarios. Run the script for Figure S9.
  - lt_whiff_statistics.h5, All_AICDeltaTab.h5, all_Rsquared.h5, AllRsquaredAicCombinations.h5: Contain all 25 whiff statistics calculated across the different distances from the source over a lookback time of 10 seconds. Run the script for Figure 5, sections "Bootstrapped R2" and "Bootstrapped R2 and AIC for filtered parameters", to see the results.
Figure:
Figure folder is available for download. The following are the files that can be found in the Figure folder, which can be seen in the paper:
- method1.svg: (Figure 1) Contains interpolated sensor data and signed distance axis for general Whittell Forest data analysis.
- method2.svg: (Figure 2) Low resolution interpolated sensor data for wind speed < 3 m/s for desert.
- fig3_revised.svg: (Figure 3) Contains high resolution interpolated clean sensor data for wind speed < 3 m/s for desert.
- fig4.svg: (Figure 4) Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- r2AicStatPlot.svg: (Figure 5) Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- clustering.svg: (Figure 6) Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- fig7.svg: (Figure 7) Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- lpf.svg: (Figure 8) Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- verticalMovement.svg: (Figure S1) Contains high resolution interpolated sensor data for wind speed > 3 m/s for desert.
- motionanalysis.svg: (Figure S2) Motion analysis showing agent's movements similar to casting motions.
- normalityAnalysis.svg: (Figure S3) Comprehensive residual analysis for model validation.
- windLagAnalysis.svg: (Figure S4) Wind characteristics from our LWS, HWS and Forest scenarios span a representative range of near surface wind characteristics from a larger dataset.
- time_spent.svg: (Figure S5) Time spent vs. number of encounters.
- individualDatasetWhiffStat.svg: (Figure S6) Time spent vs. number of encounters.
- mc_wsd.svg: (Figure S7) Time spent vs. number of encounters.
- windAicParams.svg: (Figure S9) AIC filtered coefficients across various wind scenarios.
- klmsupplmental.svg: (Figure S10) Kalman smoothed estimates of the distance to the odor source, zoomed.
Data Fields
Interpolated Sensor and stationary ambient sensor data
This section presents real-time and interpolated measurements from both mobile sensors on an agent and fixed ambient sensors, providing a comprehensive view of the environmental and agent-specific dynamics.
Field Name | Description |
---|---|
master_time | Unix timestamp (since epoch) |
time | Timestamps starting from 0 |
lon | GPS longitude |
lat | GPS latitude |
alt | GPS altitude |
xsrc | GPS coordinate converted to x axis in meters |
ysrc | GPS coordinate converted to y axis in meters |
imu_angular_x | Angular velocity about the x-axis |
imu_angular_y | Angular velocity about the y-axis |
imu_angular_z | Angular velocity about the z-axis |
imu_linear_acc_x | Linear acceleration along the x-axis |
imu_linear_acc_y | Linear acceleration along the y-axis |
imu_linear_acc_z | Linear acceleration along the z-axis |
U | East-west velocity of ambient wind (from stationary ground sensor) |
V | North-south velocity of ambient wind (from stationary ground sensor) |
D | X-Y ambient wind direction (from stationary ground sensor) |
S | X-Y ambient wind magnitude (from stationary ground sensor) |
S2 | Ambient wind speed (from stationary ground sensor) |
corrected_u | Ambient wind velocity in east-west direction |
corrected_v | Ambient wind velocity in north-south direction |
nearest_from_streakline | Odor encounter in streakline coordinates in meters |
distance_along_streakline | Distance along streakline to odor encounter in meters |
distance_from_source | Distance from source to odor encounter in meters |
relative_parallel_comp | Relative parallel motion component of the agent with respect to wind |
relative_perpendicular_comp | Relative perpendicular motion component of the agent with respect to wind |
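The wind-related fields above can be related to each other with basic vector arithmetic. The sketch below is illustrative only: the angle convention (counter-clockwise from east) and the definition of the parallel and perpendicular components as projections of the agent's velocity onto the wind direction are assumptions, and the function names are hypothetical. Check the analysis repository for the conventions actually used in the datasets.

```python
import math


def wind_speed_direction(u, v):
    """Return (speed, direction) from east-west (u) and north-south (v) wind."""
    speed = math.hypot(u, v)
    # Assumption: angle in radians, measured counter-clockwise from east (+x).
    direction = math.atan2(v, u)
    return speed, direction


def relative_components(agent_vx, agent_vy, wind_u, wind_v):
    """Split the agent velocity into parts parallel and perpendicular to the wind.

    Assumption: "relative" here means a projection onto the wind direction.
    """
    speed = math.hypot(wind_u, wind_v)
    if speed == 0.0:
        raise ValueError("wind direction is undefined for zero wind speed")
    ex, ey = wind_u / speed, wind_v / speed          # unit vector along the wind
    parallel = agent_vx * ex + agent_vy * ey         # dot product
    perpendicular = ex * agent_vy - ey * agent_vx    # 2-D cross product
    return parallel, perpendicular


if __name__ == "__main__":
    print(wind_speed_direction(3.0, 4.0))
    print(relative_components(1.0, 2.0, 2.0, 0.0))
```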
Derived Analysis Data
This section details analytics derived from the raw and interpolated data, aimed at understanding patterns, frequencies, and statistical measures related to odor encounters and agent navigation relative to the source.
Field Name | Description |
---|---|
type | Classification of encounter distance from source (0: 0-5 m, 1: 5-30 m, 2: >30 m) |
distance | Average distance over a lookback time |
avg_dist_from_source | Average distance from source in meters within a whiff |
avg_dist_from_streakline | Average distance along streakline in meters within a whiff |
mean_whiff_time | Average time duration of whiffs over a lookback time |
nwhiffs | Number of whiffs in a lookback time |
efreq | Encounter frequency in Hz |
mean_ef | Average encounter frequency in Hz |
std_whiff | Mean standard deviation of odor within a whiff |
whiff_ma | Average moving average across a whiff |
mc_min | Minimum of whiff concentration over a lookback time |
mc_max | Maximum of whiff concentration over a lookback time |
mc_mean | Average of whiff concentration over a lookback time |
mc_std_dev | Standard deviation of whiff concentration over a lookback time |
mc_k | Kurtosis of whiff concentration over a lookback time |
wf_min | Minimum of whiff frequency over a lookback time |
wf_max | Maximum of whiff frequency over a lookback time |
wf_mean | Average of whiff frequency over a lookback time |
wf_std_dev | Standard deviation of whiff frequency over a lookback time |
wf_k | Kurtosis of whiff frequency over a lookback time |
wd_min | Minimum of whiff duration over a lookback time |
wd_max | Maximum of whiff duration over a lookback time |
wd_mean | Average of whiff duration over a lookback time |
wd_std_dev | Standard deviation of whiff duration over a lookback time |
wd_k | Kurtosis of whiff duration over a lookback time |
ma_min | Minimum of moving average over a lookback time |
ma_max | Maximum of moving average over a lookback time |
ma_mean | Average of moving average over a lookback time |
ma_std_dev | Standard deviation of moving average over a lookback time |
ma_k | Kurtosis of moving average over a lookback time |
st_min | Minimum of standard deviation over a lookback time |
st_max | Maximum of standard deviation over a lookback time |
st_mean | Average of standard deviation over a lookback time |
st_std_dev | Standard deviation of standard deviation over a lookback time |
st_k | Kurtosis of standard deviation over a lookback time |
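To make the naming of these fields concrete, here is a minimal sketch of how a few of them could be computed from an odor signal within one lookback window. It assumes a whiff is a contiguous run of samples above a concentration threshold and that the `mc_*` fields summarise the mean concentration of each whiff. Both are assumptions for illustration, and `find_whiffs` and `lookback_whiff_stats` are hypothetical names, not functions from the analysis repository.

```python
from statistics import mean, pstdev


def find_whiffs(signal, threshold):
    """Return (start, end) index pairs (end exclusive) where signal > threshold."""
    whiffs = []
    start = None
    for i, value in enumerate(signal):
        if value > threshold and start is None:
            start = i                      # whiff begins
        elif value <= threshold and start is not None:
            whiffs.append((start, i))      # whiff ends
            start = None
    if start is not None:                  # signal ended inside a whiff
        whiffs.append((start, len(signal)))
    return whiffs


def lookback_whiff_stats(signal, threshold, sample_rate_hz):
    """Summarise whiff durations and concentrations in one lookback window."""
    whiffs = find_whiffs(signal, threshold)
    if not whiffs:
        return {"nwhiffs": 0}
    durations = [(end - start) / sample_rate_hz for start, end in whiffs]
    concentrations = [mean(signal[start:end]) for start, end in whiffs]
    return {
        "nwhiffs": len(whiffs),
        "wd_min": min(durations),
        "wd_max": max(durations),
        "wd_mean": mean(durations),
        "wd_std_dev": pstdev(durations),
        "mc_min": min(concentrations),
        "mc_max": max(concentrations),
        "mc_mean": mean(concentrations),
        "mc_std_dev": pstdev(concentrations),
    }
```

The kurtosis fields and the whiff frequency, moving average, and standard deviation families would follow the same pattern with a different per-whiff quantity.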
Usage
This dataset is intended for use in developing and testing algorithms related to odor source localization in outdoor wind conditions in desert and forest terrains. Researchers are encouraged to utilize this data for exploring innovative approaches for odor source localization under varying environmental conditions.
Dependencies
To visualize the figures and see the results and calculations, you will need to install FigureFirst and set it up in Inkscape by following the FigureFirst setup instructions.
Setup Environment
Folder Structure
├── data_exploration # Contains all the analysis scripts
├── bag2h5.py # script for changing rosbag file to pandas dataframe
├── dataInterpolation.ipynb # interpolation to match odor sensor sampling rate
├── odor_statitics_lib.py
├── figure # all script for generating figure
└── supplemental # contains scripts to generate supplemental figures
└── demo # some analysis demo scripts
├── dataFileDescription.md # readme for datafiles submitted in Data Dryad
└── README.md
Install Virtualenv
Create the virtualenv:
virtualenv -p /usr/bin/python3.8 <env-name>
Install Packages:
pip install pandas
pip install h5py
pip install numpy
pip install matplotlib
pip install figurefirst
pip install seaborn
pip install scikit-learn
pip install tables
python -m pip install statsmodels
Software/Scripts
The odor_analysis.zip folder contains the software, which can also be found on GitHub with a detailed description of how to use it.
Methods
The setup can be divided into two components: the mobile sensor stack, which was carried by a human to collect odor signals, and 8 stationary wind sensors placed around the odor source for ambient wind measurements. The odor source was a propylene gas tank mounted with a stationary GPS antenna, which sent Real Time Kinematics (RTK) correction data to the antenna mounted on the mobile sensor stack for a high-resolution position with an accuracy close to 1 cm.
The mobile sensor stack included an odor sensor, a GPS antenna to receive accurate location measurements, and an IMU that provided angular velocity measurements. The sensor stack was balanced on a gimbal for stability and ease of carrying. The odor sensor data was collected using a data acquisition (DAQ) unit, which was connected along with all the other sensors to a computer that was running ROS as middleware and recorded data in real time. Because the sensors had different sampling rates, the data was interpolated to the rate of the odor sensor, which sampled at 200 Hz.
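The resampling step can be sketched as linear interpolation of a slower sensor stream onto the odor sensor's timestamps. This is an illustrative, dependency-free version with hypothetical function names. The repository's dataInterpolation.ipynb holds the actual procedure, which may differ.

```python
from bisect import bisect_right

ODOR_RATE_HZ = 200  # odor sensor sampling rate stated in the Methods


def odor_timestamps(duration_s, rate_hz=ODOR_RATE_HZ):
    """Evenly spaced timestamps from 0 to duration_s at the odor sensor rate."""
    n_steps = int(round(duration_s * rate_hz))
    return [i / rate_hz for i in range(n_steps + 1)]


def interpolate_to(target_times, source_times, source_values):
    """Linearly interpolate a sensor stream at target_times.

    source_times must be strictly increasing. Targets outside the source
    range are clamped to the first or last value.
    """
    result = []
    for t in target_times:
        if t <= source_times[0]:
            result.append(source_values[0])
            continue
        if t >= source_times[-1]:
            result.append(source_values[-1])
            continue
        hi = bisect_right(source_times, t)   # first source sample after t
        lo = hi - 1
        t0, t1 = source_times[lo], source_times[hi]
        v0, v1 = source_values[lo], source_values[hi]
        frac = (t - t0) / (t1 - t0)
        result.append(v0 + frac * (v1 - v0))
    return result
```

With NumPy arrays, `numpy.interp(target_times, source_times, source_values)` performs the same linear interpolation, including the clamping at both ends.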
Around the odor source, 7 ambient wind sensors were placed in a square arrangement approximately 30 meters from the source, and one additional wind sensor was placed 1 meter from the odor source for accurate measurement of wind speed and direction near the source.
Refer to the manuscript for more details.
Usage notes
These are data frames with .hdf and .h5 extensions, processed with Python pandas. They can be opened using pandas: https://pandas.pydata.org/pandas-docs/version/1.5/getting_started/install.html
Inkscape is needed to open or produce the figures.
Funding
United States Air Force Research Laboratory, Award: FA8651-20-1-0002
United States Air Force Office of Scientific Research, Award: FA9550-21-0122
Alfred P. Sloan Foundation, Award: FG-2020-13422
National Science Foundation, Award: 2112085