Design principles for ESIPT-based fluorophores: Effects of heteroatoms, substituents, and solvent polarity in benzazole derivatives
Abstract
This dataset provides comprehensive computational and spectroscopic data supporting the analysis of enol–keto tautomerism, excited-state properties, and reaction kinetics. The collection is organized into three main components: (i) a Supporting Information (SI) file containing all essential data required to reproduce the reported results; (ii) a structured archive (draft-data.rar) including detailed Atoms in Molecules (AIM) analyses for enol and keto forms, raw and processed UV–Vis absorption spectra, calculated fluorescence properties (e.g., HOMO–LUMO energy gaps and S₁ electronic transitions), potential energy surface (PES) profiles from scan calculations, and derived kinetic parameters; and (iii) a data archive (data.rar) containing complete input files for all quantum chemical calculations. The dataset includes both raw and post-processed values, enabling transparency and reproducibility of electronic structure, spectroscopic, and kinetic analyses. It is suitable for reuse in benchmarking quantum chemical methods, studying excited-state intramolecular proton transfer (ESIPT), and modeling reaction mechanisms in related molecular systems. All data are computationally generated, and no ethical concerns are associated with this dataset. The dataset can be freely reused for academic and research purposes, provided that the original source is cited appropriately.
I-GENERAL INFORMATION
1. Title of Dataset: Design Principles for ESIPT-Based Fluorophores: Effects of Heteroatoms, Substituents, and Solvent Polarity in Benzazole Derivatives
2. Authors Information:
Nguyen Linh Nam: The University of Danang-University of Technology and Education;
Mai Van Bay: The University of Danang;
Adam Mechler: La Trobe University;
Pham Cam Nam: The University of Danang-University of Technology and Sciences;
Nguyen Minh Thong*: The University of Danang-University of Sciences and Education;
Quan V. Vo*: The University of Danang-University of Technology and Education;
3. Date of data collection: 2024-01-01 to 2025-01-07
4. Geographic location of data collection:
The University of Danang-University of Sciences and Education, Vietnam and La Trobe University, Australia.
5. Information about funding sources that supported the collection of the data:
This research is funded by the Vietnam Ministry of Education and Training under grant No. B2025.DNA.02.
II-SHARING/ACCESS INFORMATION
1. Links to publications that cite or use the data: https://doi.org/10.1098/rsos.251822
2. Was data derived from another source? yes/no: no
3. Recommended citation for this dataset: https://doi.org/10.1098/rsos.251822
III-DATA & FILE OVERVIEW
This dataset contains computational chemistry data describing the structural, electronic, photophysical, and kinetic properties of benzazole-based fluorophores exhibiting excited-state intramolecular proton transfer (ESIPT).
The DATA.ZIP includes:
1-SI FILE
2-DRAFT-DATA
3-INPUT FILES
The data enable full reproducibility of all reported results and support reuse in method benchmarking, ESIPT mechanism studies, and photophysical modeling.
Further details of the computational methods are provided in the associated manuscript referenced above.
IV-METHODOLOGICAL INFORMATION
The dataset was generated with the Gaussian 16 suite of programs at The University of Danang - University of Technology and Education, Vietnam, and La Trobe University, Australia. Geometry optimizations for both the ground state (S₀) and first excited state (S₁) of the investigated compounds were carried out at the B3LYP/def2-TZVP level. Vibrational frequency analyses at the same level confirmed minima (no imaginary frequencies) and transition states (one imaginary frequency). Solvent effects were modeled using the SMD approach with three representative solvents of increasing polarity: heptane (nonpolar), chloroform (moderately polar), and water (polar protic). Topological analyses of the electron density were performed using the AIM software package, with wavefunctions generated at the B3LYP/def2-TZVP level.
V-DATA-SPECIFIC INFORMATION FOR SI FILE
Description: Supporting Information file associated with the manuscript. This file contains supplementary tables and figures, including photophysical benchmarks, structural parameters, AIM/BCP data, frontier molecular orbital data, absorption and fluorescence properties, HOMO/LUMO orbital plots, absorption spectra, and IRC profiles.
Contents of SI FILE:
Table S1. Comparison of calculated and experimental photophysical data for representative HBX analogs, page S3.
Table S2. Bond lengths (Å) and bond angles (°) related to the intramolecular hydrogen bond of studied compounds in enol and keto forms at S1 states in heptane solvent, pages S3-S4.
Table S3. Bond lengths (Å) and bond angles (°) related to the intramolecular hydrogen bond of studied compounds in enol and keto forms at S1 states in water solvent, pages S4-S5.
Table S4. BCP parameters related to the intramolecular hydrogen bond of studied compounds for the enol form at S1 state in heptane and water solvents, pages S5-S6.
Table S5. BCP parameters related to the intramolecular hydrogen bond of studied compounds for the keto form at S1 state in heptane and water solvents, pages S6-S8.
Table S6. Frontier molecular orbital energies and energy gaps of studied compounds in heptane solvent, pages S8-S9.
Table S7. Frontier molecular orbital energies and energy gaps of studied compounds in water solvent, pages S9-S11.
Table S8. Calculated absorption properties of studied compounds at S0 states in heptane solvent, page S11.
Table S9. Calculated absorption properties of studied compounds at S0 states in water solvent, page S12.
Table S10. Calculated fluorescence properties for the enol and keto forms of studied compounds at S1 states in heptane solvent, pages S12-S14.
Table S11. Calculated fluorescence properties for the enol and keto forms of studied compounds at S1 states in water solvent, pages S14-S15.
Figure S1. HOMO and LUMO orbitals of benzimidazole derivatives in chloroform solvent, page S16.
Figure S2. HOMO and LUMO orbitals of benzoxazole derivatives in chloroform solvent, page S17.
Figure S3. HOMO and LUMO orbitals of benzothiazole derivatives in chloroform solvent, page S18.
Figure S4. HOMO and LUMO orbitals of benzoselenazole derivatives in chloroform solvent, page S19.
Figure S5. Absorption spectra of studied compounds in heptane solvent, page S20.
Figure S6. Absorption spectra of studied compounds in chloroform solvent, page S20.
Figure S7. Absorption spectra of studied compounds in water solvent, page S21.
Figure S8. IRC profiles for the ESIPT process of representative HBX systems, page S21.
VI-DATA-SPECIFIC INFORMATION FOR DRAFT-DATA
DRAFT-DATA contains the following CSV files:
1. AIM data files
Files:
1. Data-AIM analysis-enol forms.csv
2. Data-AIM analysis-keto forms.csv
Description:
These files contain AIM bond critical point parameters for intramolecular hydrogen bonds in the enol and keto forms, respectively.
Variables:
Form
Definition: Tautomeric form analyzed by AIM/QTAIM.
Allowed values: Enol, Keto.
Unit: dimensionless.
Solvent
Definition: Solvent condition used in the calculation.
Examples: water, heptane.
Unit: dimensionless.
R
Definition: Substituent group attached to the parent benzazole derivative.
Examples: H, NO2, N(CH3)2.
Unit: dimensionless.
X
Definition: Heteroatom or azole-ring identity in the benzazole framework.
Examples: NH, O, S, Se.
Unit: dimensionless.
Compounds
Definition: Internal compound index used in the manuscript and Supporting Information.
Unit: dimensionless.
Contacts
Definition: Atom pair defining the intramolecular hydrogen-bond contact analyzed by AIM.
Examples: N12 - H24, O23 - H25.
Unit: dimensionless.
rho_au
Definition: Electron density at the bond critical point, ρ(r).
Unit: atomic units, a.u.
lambda1_au
Definition: First eigenvalue of the Hessian matrix of electron density at the bond critical point.
Unit: atomic units, a.u.
lambda2_au
Definition: Second eigenvalue of the Hessian matrix of electron density at the bond critical point.
Unit: atomic units, a.u.
lambda3_au
Definition: Third eigenvalue of the Hessian matrix of electron density at the bond critical point.
Unit: atomic units, a.u.
laplacian_rho_au
Definition: Laplacian of electron density at the bond critical point, ∇²ρ(r), calculated as lambda1_au + lambda2_au + lambda3_au.
Unit: atomic units, a.u.
G(r) (au)
Definition: Local kinetic energy density at the bond critical point, G(r).
Unit: atomic units, a.u.
V(r) (au)
Definition: Local potential energy density at the bond critical point, V(r).
Unit: atomic units, a.u.
G(r)/|V(r)|
Definition: Ratio between local kinetic energy density and the absolute value of local potential energy density.
Unit: dimensionless.
H(r) (au)
Definition: Total local energy density at the bond critical point, H(r) = G(r) + V(r).
Unit: atomic units, a.u.
E_HB (kj/mol)
Definition: Estimated hydrogen-bond energy calculated from the potential energy density.
Unit: kJ mol^-1.
E_HB (kcal/mol)
Definition: Estimated hydrogen-bond energy converted from kJ mol^-1 to kcal mol^-1.
Unit: kcal mol^-1.
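The relations stated in the variable definitions above (the Laplacian as the sum of the three Hessian eigenvalues, H(r) = G(r) + V(r), the ratio G(r)/|V(r)|, and the kJ to kcal conversion) can be cross-checked against the tabulated columns with a short script. The following is a minimal sketch only: the function names and the sample numbers are hypothetical and are not taken from the dataset, and the conversion factor of 4.184 kJ per kcal (thermochemical calorie) is an assumption about the convention used.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie; assumed conversion convention


def derive_bcp_quantities(lambda1, lambda2, lambda3, g, v):
    """Recompute derived BCP columns from the primary ones (all in a.u.)."""
    laplacian = lambda1 + lambda2 + lambda3  # corresponds to laplacian_rho_au
    h = g + v                                # corresponds to H(r) (au)
    ratio = g / abs(v)                       # corresponds to G(r)/|V(r)|
    return {"laplacian_rho_au": laplacian, "H_au": h, "G_over_absV": ratio}


def kj_to_kcal(energy_kj_per_mol):
    """Convert an energy from kJ mol^-1 to kcal mol^-1."""
    return energy_kj_per_mol / KJ_PER_KCAL


# Hypothetical values for one hydrogen-bond BCP (not from the dataset).
example = derive_bcp_quantities(-0.08, -0.07, 0.30, g=0.040, v=-0.050)
print(example)
print(kj_to_kcal(41.84))
```

A difference larger than rounding error between a recomputed value and the corresponding tabulated column would suggest a transcription or unit issue worth checking.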
2. Absorption spectra data files
Files:
3. DATA-Absorption Spectra-chloroform.csv
4. DATA-Absorption Spectra-heptane.csv
5. DATA-Absorption Spectra-water.csv
Description:
These files contain simulated UV-Vis absorption spectra of the studied benzazole derivatives in chloroform, heptane, and water.
Variables:
solvent
Definition: Solvent used in the absorption-spectrum calculation.
Allowed values: chloroform, heptane, water.
Unit: dimensionless.
compound
Definition: Compound label, including the parent benzazole core and substituent when applicable.
Examples: HBN, HBO, HBS, HBSe, HBN-NO2, HBN-N(CH3)2, HBSe-NO2, HBSe-N(CH3)2.
Unit: dimensionless.
wavelength_nm
Definition: Wavelength used to construct the simulated absorption spectrum.
Unit: nm.
absorption_intensity_au
Definition: Simulated absorption intensity after spectral broadening.
Unit: arbitrary units, a.u.
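As an illustration of how the absorption files might be reused, the sketch below finds the wavelength of maximum simulated absorption for each compound. It assumes that the files are stored in long format, with one row per compound and wavelength, as the variable list above suggests. The sample rows and the function name absorption_maxima are hypothetical and do not come from the dataset.

```python
import csv
import io

# Hypothetical rows in the documented column layout (not real dataset values).
SAMPLE = """solvent,compound,wavelength_nm,absorption_intensity_au
heptane,HBO,300,0.20
heptane,HBO,320,0.90
heptane,HBO,340,0.40
heptane,HBS,320,0.30
heptane,HBS,340,0.80
heptane,HBS,360,0.50
"""


def absorption_maxima(handle):
    """Return {compound: (wavelength_nm, intensity)} at the strongest point."""
    best = {}
    for row in csv.DictReader(handle):
        wavelength = float(row["wavelength_nm"])
        intensity = float(row["absorption_intensity_au"])
        name = row["compound"]
        # Keep the row with the highest intensity seen so far per compound.
        if name not in best or intensity > best[name][1]:
            best[name] = (wavelength, intensity)
    return best


maxima = absorption_maxima(io.StringIO(SAMPLE))
print(maxima)
```

For a real file, the io.StringIO object would be replaced by an open file handle, for example open("DATA-Absorption Spectra-heptane.csv", newline="").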
3. Fluorescence and frontier molecular orbital data files
Files:
6. DATA-Fluorescence properties-chloroform.csv
7. DATA-Fluorescence properties-heptane.csv
8. DATA-Fluorescence properties-water.csv
Description:
These files contain calculated fluorescence properties, electronic-state information, orbital contributions, and frontier molecular orbital energies for the studied compounds.
Variables:
solvent
Definition: Solvent used in the calculation.
Allowed values: chloroform, heptane, water.
Unit: dimensionless.
R
Definition: Substituent group attached to the parent benzazole derivative.
Examples: H, NO2, N(CH3)2.
Unit: dimensionless.
X
Definition: Heteroatom or azole-ring identity in the benzazole framework.
Examples: NH, O, S, Se.
Unit: dimensionless.
compound
Definition: Internal compound index used in the manuscript and Supporting Information. The heptane fluorescence file uses the header compound, whereas the chloroform and water fluorescence files use the header compound_id; these fields have the same meaning.
Unit: dimensionless.
form
Definition: Molecular form considered in the fluorescence calculation.
Allowed values: enol, keto.
Unit: dimensionless.
lambda_flu_nm
Definition: Calculated fluorescence emission wavelength.
Unit: nm.
oscillator_strength
Definition: Oscillator strength of the electronic transition.
Unit: dimensionless.
State
Definition: Electronic state involved in the transition.
Example: S1.
Unit: dimensionless.
transition
Definition: Dominant molecular orbital transition associated with the fluorescence process.
Example: LUMO -> HOMO.
Unit: dimensionless.
contribution_percent
Definition: Percentage contribution of the listed orbital transition to the electronic excitation or emission process.
Unit: %.
f1
Definition: First configuration-interaction coefficient associated with the dominant orbital contribution.
Unit: dimensionless.
f2
Definition: Second configuration-interaction coefficient associated with the dominant orbital contribution.
Unit: dimensionless.
f1_squared
Definition: Squared value of f1.
Unit: dimensionless.
f2_squared
Definition: Squared value of f2.
Unit: dimensionless.
percent_f1
Definition: Relative percentage contribution of f1, calculated from f1_squared and f2_squared.
Unit: %.
percent_f2
Definition: Relative percentage contribution of f2, calculated from f1_squared and f2_squared.
Unit: %.
E_HOMO_hartree
Definition: HOMO energy.
Unit: Hartree.
E_LUMO_hartree
Definition: LUMO energy.
Unit: Hartree.
E_HOMO_eV
Definition: HOMO energy converted from Hartree to electronvolt.
Unit: eV.
E_LUMO_eV
Definition: LUMO energy converted from Hartree to electronvolt.
Unit: eV.
Egap_eV
Definition: HOMO-LUMO energy gap, calculated as E_LUMO_eV - E_HOMO_eV.
Unit: eV.
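Two details of the fluorescence files lend themselves to a short example: the differing compound and compound_id headers, and the Hartree to eV conversion behind Egap_eV. The sketch below is illustrative only. The function names and the sample row are hypothetical, and the conversion factor (1 Hartree = 27.211386 eV, a rounded CODATA value) is an assumption, since the factor actually used for the tabulated values is not stated here.

```python
HARTREE_TO_EV = 27.211386  # rounded CODATA value; assumed conversion factor


def normalise_header(row):
    """Map the 'compound_id' header onto 'compound' so all files match."""
    row = dict(row)  # copy, so the caller's row is left untouched
    if "compound_id" in row and "compound" not in row:
        row["compound"] = row.pop("compound_id")
    return row


def homo_lumo_gap_ev(e_homo_hartree, e_lumo_hartree):
    """Return (E_HOMO_eV, E_LUMO_eV, Egap_eV) from orbital energies in Hartree."""
    e_homo_ev = e_homo_hartree * HARTREE_TO_EV
    e_lumo_ev = e_lumo_hartree * HARTREE_TO_EV
    return e_homo_ev, e_lumo_ev, e_lumo_ev - e_homo_ev


# Hypothetical row as it might be read from the chloroform or water file.
row = normalise_header(
    {"compound_id": "1", "E_HOMO_hartree": "-0.220", "E_LUMO_hartree": "-0.070"}
)
homo, lumo, gap = homo_lumo_gap_ev(
    float(row["E_HOMO_hartree"]), float(row["E_LUMO_hartree"])
)
print(row["compound"], round(gap, 4))
```

Normalising the header first allows the three solvent files to be concatenated and compared without special-casing the heptane file.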
4. Scanned potential energy curve data files
Files:
9. Data-scanned potential energy curves-HBN.csv
10. Data-scanned potential energy curves-HBO.csv
11. Data-scanned potential energy curves-HBS.csv
12. Data-scanned potential energy curves-HBSe.csv
13. Data-scanned potential energy curves-HBSe-N(CH3)2.csv
14. Data-scanned potential energy curves-HBSe-NO2.csv
Description:
These files contain scanned potential energy curve data for the ESIPT proton-transfer coordinate. Each file corresponds to one compound or substituted derivative.
Variables:
Distance of O-H (Å)
Definition: Constrained O-H bond distance used as the proton-transfer scan coordinate.
Unit: Å.
Relative energy (kcal/mol) for S0
Definition: Relative energy of the S0 state at the corresponding O-H scan point.
Unit: kcal mol^-1.
Relative energy (kcal/mol) for S1
Definition: Relative energy of the S1 state at the corresponding O-H scan point.
Unit: kcal mol^-1.
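A scanned curve can be summarised by the energy of its highest interior maximum relative to the minima on either side. The sketch below shows one simple way to do this. The function name scan_barriers and the sample curve are hypothetical, and the resulting numbers are only approximate barriers along the constrained O-H coordinate, not a substitute for the optimized transition states reported in the kinetic data.

```python
def scan_barriers(distances, energies):
    """Locate the highest interior maximum and the minima on either side.

    Returns None when the curve has no interior maximum (for example a
    barrierless or monotonic profile).
    """
    interior = range(1, len(energies) - 1)
    peaks = [
        i for i in interior
        if energies[i] > energies[i - 1] and energies[i] > energies[i + 1]
    ]
    if not peaks:
        return None
    top = max(peaks, key=lambda i: energies[i])
    left = min(range(top), key=lambda i: energies[i])
    right = min(range(top + 1, len(energies)), key=lambda i: energies[i])
    return {
        "r_top": distances[top],
        "forward_barrier": energies[top] - energies[left],
        "reverse_barrier": energies[top] - energies[right],
        "reaction_energy": energies[right] - energies[left],
    }


# Hypothetical S1 scan (O-H distance in Angstrom, relative energy in kcal/mol).
r_oh = [0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8]
e_s1 = [2.0, 0.0, 1.5, 3.0, 4.0, 3.2, 1.0, -1.5, -2.0, -1.0]
summary = scan_barriers(r_oh, e_s1)
print(summary)
```

The same function can be applied to the S0 column; a None result there would indicate that no interior maximum was found along the scanned coordinate.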
5. Kinetic data file
File:
15. Data-kinetic analysis.csv
Description:
This file contains transition-state-theory kinetic parameters for the ESIPT process in chloroform.
Variables:
Compounds
Definition: Compound label, tautomeric form or transition-state label, electronic state, and solvent.
Examples: HBN-enol-S1-Chloroform, HBN-S1-Chloroform-TS, HBN-keto-S1-Chloroform.
Unit: dimensionless.
ZPE
Definition: Zero-point vibrational energy correction.
Unit: Hartree.
TCE
Definition: Thermal correction to internal energy.
Unit: Hartree.
TCH
Definition: Thermal correction to enthalpy.
Unit: Hartree.
TCG
Definition: Thermal correction to Gibbs free energy.
Unit: Hartree.
E
Definition: Electronic energy of the optimized structure.
Unit: Hartree.
imaginary frequency (cm-1)
Definition: Imaginary vibrational frequency used to confirm a transition state. This value is reported only for transition states.
Unit: cm^-1.
ΔH(#,0K,f)
Definition: Forward activation enthalpy at 0 K.
Unit: kcal mol^-1.
ΔH(#,0K,r)
Definition: Reverse activation enthalpy at 0 K.
Unit: kcal mol^-1.
ΔG(#,298.15,1M)
Definition: Gibbs free energy of activation at 298.15 K and 1 M standard state.
Unit: kcal mol^-1.
rate constant(s^-1)
Definition: Rate constant calculated from transition state theory.
Unit: s^-1.
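The activation quantities in this file can in principle be recomputed from the tabulated energies and thermal corrections. The sketch below uses the conventional Eyring expression, k = (k_B T / h) exp(-ΔG‡ / RT), for a unimolecular step, where the 1 M standard-state correction cancels between reactant and transition state. This is an assumption about the form of the rate expression: any tunnelling correction or reaction symmetry number that the associated manuscript may apply is not included. Function names and sample energies are hypothetical.

```python
import math

HARTREE_TO_KCAL = 627.5095   # kcal mol^-1 per Hartree
K_B = 1.380649e-23           # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34    # Planck constant, J s
R_KCAL = 1.987204e-3         # gas constant, kcal mol^-1 K^-1


def barrier_0k(e_ts, zpe_ts, e_min, zpe_min):
    """Activation enthalpy at 0 K (kcal/mol) from E + ZPE differences."""
    return ((e_ts + zpe_ts) - (e_min + zpe_min)) * HARTREE_TO_KCAL


def activation_free_energy(e_ts, tcg_ts, e_min, tcg_min):
    """Gibbs free energy of activation (kcal/mol) from E + TCG differences."""
    return ((e_ts + tcg_ts) - (e_min + tcg_min)) * HARTREE_TO_KCAL


def eyring_rate(dg_kcal, temperature=298.15):
    """Plain Eyring rate constant in s^-1, with no tunnelling correction."""
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-dg_kcal / (R_KCAL * temperature))


# Hypothetical enol minimum and transition state (Hartree), not dataset values.
dg = activation_free_energy(
    e_ts=-799.995, tcg_ts=0.148, e_min=-800.000, tcg_min=0.150
)
k = eyring_rate(dg)
print(f"dG = {dg:.3f} kcal/mol, k = {k:.3e} s^-1")
```

The forward values use the enol form as the minimum and the reverse values use the keto form, following the f and r labels in the column headers.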
VII-INPUT FILES
Input files are provided for all compounds in three solvents.
Description:
All quantum chemical input files used in calculations.
Organization:
By solvent:
1-heptane (nonpolar)
2-chloroform (moderately polar)
3-water (polar)
File Contents:
Atomic coordinates (Cartesian, Å)
Charge and multiplicity
Method: B3LYP/def2-TZVP
Solvent model: SMD
Naming conventions: HBX-enol/enol-S0/s1-solvents-opt (X = N/O/S; solvents = heptane/chloroform/water)
Software guidance: All files can be opened in GaussView or as plain text (.txt) and run with the Gaussian software.
