Data from: Insights into summertime surface ozone formation from diurnal variations in formaldehyde and nitrogen dioxide along a transect through New York City
Data files
Mar 31, 2025 version files 10.36 MB
- bayonne_Hourly_B3GTS_S_allvars_JJA2018.csv (671.17 KB)
- bayonne_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (428.38 KB)
- bayonne_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (124.32 KB)
- bayonne_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (122.64 KB)
- bayonne_Hourly_WRF_PBL_JJA2018.csv (120.10 KB)
- bronx-pfizer_Hourly_B3GTS_S_allvars_JJA2018.csv (673.93 KB)
- bronx-pfizer_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (438.63 KB)
- bronx-pfizer_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (135.19 KB)
- bronx-pfizer_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (134.52 KB)
- bronx-pfizer_Hourly_WRF_PBL_JJA2018.csv (120.02 KB)
- flax-pond_Hourly_B3GTS_S_allvars_JJA2018.csv (657.72 KB)
- flax-pond_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (436.13 KB)
- flax-pond_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (127.63 KB)
- flax-pond_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (124.76 KB)
- flax-pond_Hourly_WRF_PBL_JJA2018.csv (120.11 KB)
- new-haven_Hourly_B3GTS_S_allvars_JJA2018.csv (671.32 KB)
- new-haven_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (433.57 KB)
- new-haven_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (127.63 KB)
- new-haven_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (124.97 KB)
- new-haven_Hourly_WRF_PBL_JJA2018.csv (120.14 KB)
- queens-college_Hourly_B3GTS_S_allvars_JJA2018.csv (681.28 KB)
- queens-college_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (444.10 KB)
- queens-college_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (139.62 KB)
- queens-college_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (138.49 KB)
- queens-college_Hourly_WRF_PBL_JJA2018.csv (120.04 KB)
- README.md (3.60 KB)
- rutgers_Hourly_B3GTS_S_allvars_JJA2018.csv (655.34 KB)
- rutgers_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (429.46 KB)
- rutgers_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (124.27 KB)
- rutgers_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (121.03 KB)
- rutgers_Hourly_WRF_PBL_JJA2018.csv (120.17 KB)
- westport_Hourly_B3GTS_S_allvars_JJA2018.csv (666.63 KB)
- westport_Hourly_CMAQSurfACONC_allvars_JJA2018.csv (433.12 KB)
- westport_Hourly_CMAQVCDTrop_HCHO_JJA2018.csv (125.66 KB)
- westport_Hourly_CMAQVCDTrop_NO2_JJA2018.csv (122.92 KB)
- westport_Hourly_WRF_PBL_JJA2018.csv (120.11 KB)
Abstract
Estimating tropospheric ozone (O3) production from observations is challenging but possible given the close coupling of O3 with formaldehyde (HCHO) and nitrogen dioxide (NO2), two remotely sensed air pollutants. Previous reliance on once-daily satellite overpasses highlights the need to study diurnal changes and surface-column relationships. Using surface observations, Pandora spectrometer retrievals, and a high-resolution (1.33 km) air quality model (WRF-CMAQ), we characterize diurnal patterns of HCHO and NO2 at seven locations along an upwind-downwind pathway through NYC during June-August 2018. Diurnal patterns of the few available surface HCHO concentrations suggest biogenic emission influence, while a bimodal surface NO2 pattern implies local anthropogenic NOx emissions. Details of these patterns vary by site: an afternoon NO2 spike at New Haven (CT) indicates traffic emissions, while a delayed daily HCHO peak at Westport (CT) relative to other sites likely reflects sea breeze dynamics. Peak column concentrations generally lag surface peaks by about four hours, occurring at 9-10 AM for morning NO2 (from Pandora and WRF-CMAQ) and around 4 PM for midday HCHO (from WRF-CMAQ). TROPOMI overpass time at 1:30 PM misses peak column HCHO and NO2 concentrations. A box model (F0AM) constrained with site-level observations and WRF-CMAQ fields indicates 1-9 ppb hr-1 higher noontime local O3 production rates on three sets of paired high- versus mid-to-low-O3 days. F0AM sensitivity analyses on these six days suggest a predominantly transitional O3 formation regime at urban and downwind sites, differing at some sites from the NOx-saturated regime diagnosed for summertime average conditions via the weekday-weekend effect.
https://doi.org/10.5061/dryad.f7m0cfz3w
Description of the data and file structure
Source: Assembled and calculated by authors
This dataset comprises a series of CSV files generated from WRF-CMAQ simulations for June to August 2018 (JJA2018). The data capture variables across seven different sites, including planetary boundary layer heights, biogenic emissions, and both surface and column concentrations.
File Naming Convention
Files are named following the pattern: Sitei_Hourly_data source_varname_JJA2018.csv
- Sitei represents the site name associated with the data.
- Hourly indicates that the data are provided at hourly resolution.
- data source and varname refer to the specific data source and variable names described below.
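As an illustration of the naming pattern, the following minimal Python sketch splits a file name into its site, data source, and variable parts. The regular expression and the function name parse_filename are hypothetical helpers written for this example; the list of data sources is taken from the file listing, and the sketch assumes site names contain hyphens but no underscores.

```python
import re

# Data sources as they appear in the file listing; "WRF" pairs with the
# variable name "PBL", "B3GTS_S" and "CMAQSurfACONC" with "allvars".
FILENAME_RE = re.compile(
    r"^(?P<site>[^_]+)_Hourly_"
    r"(?P<source>B3GTS_S|CMAQSurfACONC|CMAQVCDTrop|WRF)"
    r"_(?P<varname>[^_]+)_JJA2018\.csv$"
)

def parse_filename(name):
    """Return a dict with the site, data source, and variable name of a data file."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return match.groupdict()

print(parse_filename("bronx-pfizer_Hourly_CMAQVCDTrop_NO2_JJA2018.csv"))
```

Listing the data sources explicitly avoids ambiguity for B3GTS_S, which itself contains an underscore.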
Data Sources and Variables
All data files include the following common columns:
- site: site name.
- datetime, Date Local, Time Local: columns representing the local date and time for each row of data. Timestamps follow a 24-hour clock format (from 00:00 to 23:00), for example: 6/1/18 4:00, 6/1/18, 4:00.
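The timestamp format can be parsed with the Python standard library, as in the minimal sketch below. It assumes a month/day/two-digit-year order (consistent with a June to August 2018 record beginning 6/1/18). The sample row, its column order, and the PBL value are made up for illustration and are not taken from the data files.

```python
import csv
import io
from datetime import datetime

def parse_local_timestamp(text):
    """Parse a local timestamp such as '6/1/18 4:00' (assumed M/D/YY, 24-hour clock)."""
    return datetime.strptime(text, "%m/%d/%y %H:%M")

# In-memory stand-in for one data file; the values are illustrative only.
sample = io.StringIO(
    "site,datetime,Date Local,Time Local,PBL_m\n"
    "bayonne,6/1/18 4:00,6/1/18,4:00,250.0\n"
)
rows = list(csv.DictReader(sample))
stamp = parse_local_timestamp(rows[0]["datetime"])
print(stamp.isoformat())  # 2018-06-01T04:00:00
```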
Below is a breakdown of each data source_varname file type and the variables it contains:
WRF_PBL
- Description: Planetary Boundary Layer height from the Weather Research and Forecasting (WRF) model.
- Units: Meters (m)
- Variable(s):
PBL_m
: Planetary boundary layer height in meters.
B3GTS_S
- Description: Biogenic emissions derived from the Biogenic Emission Inventory System (BEIS).
- Units: Kilograms per hour (kg/hr)
- Variable(s):
  - Variables are formatted as: B3GTS_S_{species name}_kgPhr
    Example: B3GTS_S_FORM_kgPhr (for formaldehyde)
  - Includes biogenic emissions for 22 species.
CMAQSurfACONC
- Description: Hourly averaged surface-level concentrations from WRF-CMAQ simulations.
- Units: Parts per million by volume (ppmv)
- Variable(s):
  - Variables are formatted as: CMAQSurfACONC_{species name}_ppmV
    Example: CMAQSurfACONC_NO_ppmV (for nitric oxide)
  - Includes data for 7 chemical species.
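Because the surface concentrations are stored in ppmv while the abstract reports quantities in ppb, users may want to convert units when reading these columns. The short sketch below shows the conversion; the helper name ppmv_to_ppbv and the sample value are hypothetical.

```python
PPB_PER_PPM = 1000.0  # 1 ppmv = 1000 ppbv

def ppmv_to_ppbv(value_ppmv):
    """Convert a mixing ratio from parts per million to parts per billion by volume."""
    return value_ppmv * PPB_PER_PPM

# Illustrative row; the concentration is made up, not read from the dataset.
row = {"CMAQSurfACONC_NO_ppmV": "0.0025"}
no_ppb = ppmv_to_ppbv(float(row["CMAQSurfACONC_NO_ppmV"]))
print(no_ppb)
```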
CMAQVCDTrop
- Description: Approximated tropospheric vertical column densities from WRF-CMAQ simulations.
- Units: Molecules per square centimeter (molec/cm²)
- Variable(s):
  - Two separate files:
    - CMAQVCDTrop_HCHO_molecPcm2: Tropospheric column density of formaldehyde (HCHO).
    - CMAQVCDTrop_NO2_molecPcm2: Tropospheric column density of nitrogen dioxide (NO₂).
Data Approximation Methodology
The geographic coordinates for each site are detailed in Supplement Table S1. The values in the CSV files are taken from the land-based pixel nearest to these coordinates in the original model simulations, which have a horizontal resolution of 1.33 km x 1.33 km. Surface concentrations are approximated with data from the near-surface model layer. Tropospheric vertical column densities (VCDTrop) are calculated by aggregating number densities up to the tropopause, which we define consistently across the dataset as a pressure of 200 hPa, corresponding to an altitude of roughly 13 km at a latitude near 40° N. Users should account for these approximations (nearest land-based pixel, near-surface layer, fixed 200 hPa tropopause) in any subsequent analyses.
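The column aggregation described above can be sketched as a sum of layer partial columns below the 200 hPa cutoff. This is only an illustrative sketch, not the authors' processing code: the per-layer inputs (number density, thickness, pressure) and all sample values are hypothetical, and the exact handling of the layer that straddles the tropopause is not specified in this description.

```python
def tropospheric_column(number_density_cm3, thickness_m, pressure_hpa,
                        tropopause_hpa=200.0):
    """Sum layer partial columns (molec/cm^2) for layers at pressures >= the tropopause."""
    total = 0.0
    for n, dz_m, p in zip(number_density_cm3, thickness_m, pressure_hpa):
        if p >= tropopause_hpa:        # layer lies below the assumed tropopause
            total += n * dz_m * 100.0  # convert thickness from m to cm
    return total

# Three made-up layers; the last one (150 hPa) is above the cutoff and is skipped.
vcd = tropospheric_column(
    number_density_cm3=[2.0e10, 1.0e10, 1.0e9],
    thickness_m=[100.0, 1000.0, 2000.0],
    pressure_hpa=[1000.0, 800.0, 150.0],
)
print(f"{vcd:.3e} molec/cm2")
```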
We use version 5.3.1 of the Community Multiscale Air Quality Modeling System coupled online with version 4.1.1 of the Weather Research and Forecasting model (WRF-CMAQ) at 1.33 km by 1.33 km horizontal resolution. This simulation provides hourly preceding-hour-average estimates of surface HCHO, NO2, NO, CO, and O3 at the near-surface layer and instantaneous at-the-hour estimates of HCHO and NO2 on 36 vertical sigma-pressure levels. The simulations follow the model specifications, emissions, and meteorology of the 1.33 km × 1.33 km simulation described in Torres-Vazquez et al. (2022). Emissions are sourced from the 2016 modeling platform version 7.2, based on the 2014 U.S. National Emissions Inventory (NEI) and the 2017 NEI with some additional sector-specific updates. Namely, on-road and non-road emissions were processed down to the county level for 21 counties in and around NYC for 2018 using the Motor Vehicle Emissions Simulator (MOVES) version 2014b. The inventory also incorporates updates from 2018 Continuous Emission Monitoring data for electric generating unit (EGU) emissions, as well as inline biogenic emissions computed using the Biogenic Emission Inventory System (BEIS) version 3.61. We approximate the values at each site using the land-based pixel nearest to the site's geographic coordinates in the original model simulations.
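The nearest land-based pixel selection can be illustrated with the sketch below. It assumes great-circle (haversine) distance and a per-pixel land flag; the description does not state the distance metric actually used, and the coordinates shown are hypothetical rather than taken from Supplement Table S1.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def nearest_land_pixel(site_lat, site_lon, pixels):
    """Return the index of the closest pixel flagged as land.

    pixels is a sequence of (lat, lon, is_land) tuples.
    """
    best_index, best_dist = None, float("inf")
    for i, (lat, lon, is_land) in enumerate(pixels):
        if not is_land:
            continue  # skip water pixels even if they are closer
        dist = haversine_km(site_lat, site_lon, lat, lon)
        if dist < best_dist:
            best_index, best_dist = i, dist
    if best_index is None:
        raise ValueError("no land pixel available")
    return best_index

# Hypothetical grid: the closest pixel is water, so the next-closest land pixel wins.
grid = [(40.70, -74.00, False), (40.71, -74.00, True), (40.75, -74.00, True)]
print(nearest_land_pixel(40.70, -74.00, grid))  # 1
```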