Data for: Thermodynamics of the gas-phase dimerization of formic acid: fully anharmonic finite temperature calculations at the CCSD(T) and many DFT levels
Data files
Apr 02, 2024 version files 540.61 MB
-
AIMD_D_PBE_D2.dat
4.50 MB
-
AIMD_D_PBE.dat
4.50 MB
-
AIMD_D.xyz
91.31 MB
-
AIMD_M_PBE_D2.dat
4.50 MB
-
AIMD_M_PBE.dat
4.50 MB
-
AIMD_M.xyz
45.57 MB
-
MC_CCSDT.dat
2.36 MB
-
MC_CCSDT.xyz
52.72 MB
-
MC_HSE06.dat
3.46 MB
-
MC_HSE06.xyz
79.38 MB
-
MC_vdW-DF.dat
3.18 MB
-
MC_vdW-DF.xyz
73.47 MB
-
predicted_D_CCSDT.dat
4.50 MB
-
predicted_D_HSE06_D3.dat
4.50 MB
-
predicted_D_HSE06_D3BJ.dat
4.50 MB
-
predicted_D_HSE06_MBD.dat
4.50 MB
-
predicted_D_HSE06.dat
4.50 MB
-
predicted_D_optB86b-vdW.dat
4.50 MB
-
predicted_D_PBE_D3.dat
4.50 MB
-
predicted_D_PBE_D3BJ.dat
4.50 MB
-
predicted_D_PBE_HI.dat
4.50 MB
-
predicted_D_PBE_MBD.dat
4.50 MB
-
predicted_D_PBE.dat
4.50 MB
-
predicted_D_SCAN_D3.dat
4.50 MB
-
predicted_D_SCAN_D3BJ.dat
4.50 MB
-
predicted_D_SCAN_MBD.dat
4.50 MB
-
predicted_D_SCAN_RVV10.dat
4.50 MB
-
predicted_D_SCAN.dat
4.50 MB
-
predicted_D_vdW-DF-cx.dat
4.50 MB
-
predicted_D_vdW-DF.dat
4.50 MB
-
predicted_D_vdW-DF2-B86R.dat
4.50 MB
-
predicted_M_CCSDT.dat
4.50 MB
-
predicted_M_HSE06_D3.dat
4.50 MB
-
predicted_M_HSE06_D3BJ.dat
4.50 MB
-
predicted_M_HSE06_MBD.dat
4.50 MB
-
predicted_M_HSE06.dat
4.50 MB
-
predicted_M_optB86b-vdW.dat
4.50 MB
-
predicted_M_PBE_D3.dat
4.50 MB
-
predicted_M_PBE_D3BJ.dat
4.50 MB
-
predicted_M_PBE_HI.dat
4.50 MB
-
predicted_M_PBE_MBD.dat
4.50 MB
-
predicted_M_PBE.dat
4.50 MB
-
predicted_M_SCAN_D3.dat
4.50 MB
-
predicted_M_SCAN_D3BJ.dat
4.50 MB
-
predicted_M_SCAN_MBD.dat
4.50 MB
-
predicted_M_SCAN_RVV10.dat
4.50 MB
-
predicted_M_SCAN.dat
4.50 MB
-
predicted_M_vdW-DF-cx.dat
4.50 MB
-
predicted_M_vdW-DF.dat
4.50 MB
-
predicted_M_vdW-DF2-B86R.dat
4.50 MB
-
README.md
8.04 KB
-
training_D_CCSDT.dat
2.25 KB
-
training_D_HSE06_D3.dat
2.25 KB
-
training_D_HSE06_D3BJ.dat
2.25 KB
-
training_D_HSE06_MBD.dat
2.25 KB
-
training_D_HSE06.dat
2.25 KB
-
training_D_optB86b-vdW.dat
2.25 KB
-
training_D_PBE_D3.dat
2.25 KB
-
training_D_PBE_D3BJ.dat
2.25 KB
-
training_D_PBE_HI.dat
2.25 KB
-
training_D_PBE_MBD.dat
2.25 KB
-
training_D_PBE.dat
2.25 KB
-
training_D_SCAN_D3.dat
2.25 KB
-
training_D_SCAN_D3BJ.dat
2.25 KB
-
training_D_SCAN_MBD.dat
2.25 KB
-
training_D_SCAN_RVV10.dat
2.25 KB
-
training_D_SCAN.dat
2.25 KB
-
training_D_vdW-DF-cx.dat
2.25 KB
-
training_D_vdW-DF.dat
2.25 KB
-
training_D_vdW-DF2-B86R.dat
2.25 KB
-
training_D.xyz
45.19 KB
-
training_M_CCSDT.dat
2.25 KB
-
training_M_HSE06_D3.dat
2.25 KB
-
training_M_HSE06_D3BJ.dat
2.25 KB
-
training_M_HSE06_MBD.dat
2.25 KB
-
training_M_HSE06.dat
2.25 KB
-
training_M_optB86b-vdW.dat
2.25 KB
-
training_M_PBE_D3.dat
2.25 KB
-
training_M_PBE_D3BJ.dat
2.25 KB
-
training_M_PBE_HI.dat
2.25 KB
-
training_M_PBE_MBD.dat
2.25 KB
-
training_M_PBE.dat
2.25 KB
-
training_M_SCAN_D3.dat
2.25 KB
-
training_M_SCAN_D3BJ.dat
2.25 KB
-
training_M_SCAN_MBD.dat
2.25 KB
-
training_M_SCAN_RVV10.dat
2.25 KB
-
training_M_SCAN.dat
2.25 KB
-
training_M_vdW-DF-cx.dat
2.25 KB
-
training_M_vdW-DF.dat
2.25 KB
-
training_M_vdW-DF2-B86R.dat
2.25 KB
-
training_M.xyz
22.73 KB
Abstract
We provide the data files needed to reproduce the MLPT calculations presented in the work of Dávid Vrška, Michal Pitoňák and Tomáš Bučko, "Thermodynamics of the gas-phase dimerization of formic acid: Fully anharmonic finite temperature calculations at the CCSD(T) and many DFT levels", or to perform such calculations for any other electronic structure method not considered in that work. The structural data are provided in the standard xyz file format, containing the atomic labels and the atomic coordinates in Angstroms (Å). The energies are provided as data files in a two-column format: the first column holds the identification number of each configuration and the second column the corresponding energy in electronvolts (eV).
README: Data for: Thermodynamics of the gas-phase dimerization of formic acid: Fully anharmonic finite temperature calculations at the CCSD(T) and many DFT levels
file formats
- .xyz files contain the atomic labels and atomic coordinates in Å, in the standard xyz format.
- .dat files consist of two columns: the items in the first column represent the identification numbers of each configuration and in the second column are the corresponding energies in eV.
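The two formats described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not part of the deposited data: the function names `parse_dat` and `parse_xyz` are hypothetical, the configuration identifiers are assumed to be integers, and the .xyz files are assumed to follow the usual multi-frame layout (atom count, comment line, then one line per atom).

```python
from typing import Dict, List, Tuple

# One frame = list of (atomic label, x, y, z), coordinates in Angstrom.
Frame = List[Tuple[str, float, float, float]]


def parse_dat(text: str) -> Dict[int, float]:
    """Parse two-column text: configuration ID and energy in eV."""
    energies: Dict[int, float] = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank or incomplete lines and comment lines (assumed to start with '#').
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        # Assumption: IDs are integers, possibly written as e.g. "12" or "12.0".
        energies[int(float(fields[0]))] = float(fields[1])
    return energies


def parse_xyz(text: str) -> List[Frame]:
    """Parse a (possibly multi-frame) xyz file into a list of frames."""
    lines = text.splitlines()
    frames: List[Frame] = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i].split()[0])
        frame: Frame = []
        # Line i + 1 is the free-form comment line; atoms start at i + 2.
        for atom_line in lines[i + 2 : i + 2 + n_atoms]:
            label, x, y, z = atom_line.split()[:4]
            frame.append((label, float(x), float(y), float(z)))
        frames.append(frame)
        i += 2 + n_atoms
    return frames
```

For a real file, the text can be obtained with, for example, `open("training_M.xyz").read()` before passing it to `parse_xyz`.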
structures
- AIMD_M.xyz Cartesian coordinates from PBE+D2 AIMD for monomer
- AIMD_D.xyz Cartesian coordinates from PBE+D2 AIMD for dimer
- training_M.xyz Cartesian coordinates used for MLPT training for monomer
- training_D.xyz Cartesian coordinates used for MLPT training for dimer
- MC_HSE06.xyz Cartesian coordinates from PBE+D2/HSE06 MLMC
- MC_vdW-DF.xyz Cartesian coordinates from PBE+D2/vdW-DF MLMC
- MC_CCSDT.xyz Cartesian coordinates from PBE+D2/CCSD(T) MLMC
energies
AIMD
- AIMD_M_PBE.dat PBE energies from PBE+D2 AIMD for monomer
- AIMD_M_PBE_D2.dat PBE+D2 energies from PBE+D2 AIMD for monomer
- AIMD_D_PBE.dat PBE energies from PBE+D2 AIMD for dimer
- AIMD_D_PBE_D2.dat PBE+D2 energies from PBE+D2 AIMD for dimer
MLPT training
- training_D_PBE.dat PBE energies used for training of MLPT for dimer
- training_D_PBE_D3.dat PBE+D3(0) energies used for training of MLPT for dimer
- training_D_PBE_D3BJ.dat PBE+D3BJ energies used for training of MLPT for dimer
- training_D_PBE_MBD.dat PBE+MBD energies used for training of MLPT for dimer
- training_D_PBE_HI.dat PBE+TS/HI energies used for training of MLPT for dimer
- training_D_SCAN.dat SCAN energies used for training of MLPT for dimer
- training_D_SCAN_D3.dat SCAN+D3(0) energies used for training of MLPT for dimer
- training_D_SCAN_D3BJ.dat SCAN+D3BJ energies used for training of MLPT for dimer
- training_D_SCAN_MBD.dat SCAN+MBD energies used for training of MLPT for dimer
- training_D_SCAN_RVV10.dat SCAN+RVV10 energies used for training of MLPT for dimer
- training_D_HSE06.dat HSE06 energies used for training of MLPT for dimer
- training_D_HSE06_D3.dat HSE06+D3(0) energies used for training of MLPT for dimer
- training_D_HSE06_D3BJ.dat HSE06+D3BJ energies used for training of MLPT for dimer
- training_D_HSE06_MBD.dat HSE06+MBD energies used for training of MLPT for dimer
- training_D_vdW-DF.dat vdW-DF energies used for training of MLPT for dimer
- training_D_vdW-DF-cx.dat vdW-DF-cx energies used for training of MLPT for dimer
- training_D_vdW-DF2-B86R.dat vdW-DF2-B86R energies used for training of MLPT for dimer
- training_D_optB86b-vdW.dat optB86b-vdW energies used for training of MLPT for dimer
- training_D_CCSDT.dat CCSD(T) energies used for training of MLPT for dimer
- training_M_PBE.dat PBE energies used for training of MLPT for monomer
- training_M_PBE_D3.dat PBE+D3(0) energies used for training of MLPT for monomer
- training_M_PBE_D3BJ.dat PBE+D3BJ energies used for training of MLPT for monomer
- training_M_PBE_MBD.dat PBE+MBD energies used for training of MLPT for monomer
- training_M_PBE_HI.dat PBE+TS/HI energies used for training of MLPT for monomer
- training_M_SCAN.dat SCAN energies used for training of MLPT for monomer
- training_M_SCAN_D3.dat SCAN+D3(0) energies used for training of MLPT for monomer
- training_M_SCAN_D3BJ.dat SCAN+D3BJ energies used for training of MLPT for monomer
- training_M_SCAN_MBD.dat SCAN+MBD energies used for training of MLPT for monomer
- training_M_SCAN_RVV10.dat SCAN+RVV10 energies used for training of MLPT for monomer
- training_M_HSE06.dat HSE06 energies used for training of MLPT for monomer
- training_M_HSE06_D3.dat HSE06+D3(0) energies used for training of MLPT for monomer
- training_M_HSE06_D3BJ.dat HSE06+D3BJ energies used for training of MLPT for monomer
- training_M_HSE06_MBD.dat HSE06+MBD energies used for training of MLPT for monomer
- training_M_vdW-DF.dat vdW-DF energies used for training of MLPT for monomer
- training_M_vdW-DF-cx.dat vdW-DF-cx energies used for training of MLPT for monomer
- training_M_vdW-DF2-B86R.dat vdW-DF2-B86R energies used for training of MLPT for monomer
- training_M_optB86b-vdW.dat optB86b-vdW energies used for training of MLPT for monomer
- training_M_CCSDT.dat CCSD(T) energies used for training of MLPT for monomer
MLPT predictions
- predicted_D_PBE.dat PBE energies from MLPT prediction for dimer
- predicted_D_PBE_D3.dat PBE+D3(0) energies from MLPT prediction for dimer
- predicted_D_PBE_D3BJ.dat PBE+D3BJ energies from MLPT prediction for dimer
- predicted_D_PBE_MBD.dat PBE+MBD energies from MLPT prediction for dimer
- predicted_D_PBE_HI.dat PBE+TS/HI energies from MLPT prediction for dimer
- predicted_D_SCAN.dat SCAN energies from MLPT prediction for dimer
- predicted_D_SCAN_D3.dat SCAN+D3(0) energies from MLPT prediction for dimer
- predicted_D_SCAN_D3BJ.dat SCAN+D3BJ energies from MLPT prediction for dimer
- predicted_D_SCAN_MBD.dat SCAN+MBD energies from MLPT prediction for dimer
- predicted_D_SCAN_RVV10.dat SCAN+RVV10 energies from MLPT prediction for dimer
- predicted_D_HSE06.dat HSE06 energies from MLPT prediction for dimer
- predicted_D_HSE06_D3.dat HSE06+D3(0) energies from MLPT prediction for dimer
- predicted_D_HSE06_D3BJ.dat HSE06+D3BJ energies from MLPT prediction for dimer
- predicted_D_HSE06_MBD.dat HSE06+MBD energies from MLPT prediction for dimer
- predicted_D_vdW-DF.dat vdW-DF energies from MLPT prediction for dimer
- predicted_D_vdW-DF-cx.dat vdW-DF-cx energies from MLPT prediction for dimer
- predicted_D_vdW-DF2-B86R.dat vdW-DF2-B86R energies from MLPT prediction for dimer
- predicted_D_optB86b-vdW.dat optB86b-vdW energies from MLPT prediction for dimer
- predicted_D_CCSDT.dat CCSD(T) energies from MLPT prediction for dimer
- predicted_M_PBE.dat PBE energies from MLPT prediction for monomer
- predicted_M_PBE_D3.dat PBE+D3(0) energies from MLPT prediction for monomer
- predicted_M_PBE_D3BJ.dat PBE+D3BJ energies from MLPT prediction for monomer
- predicted_M_PBE_MBD.dat PBE+MBD energies from MLPT prediction for monomer
- predicted_M_PBE_HI.dat PBE+TS/HI energies from MLPT prediction for monomer
- predicted_M_SCAN.dat SCAN energies from MLPT prediction for monomer
- predicted_M_SCAN_D3.dat SCAN+D3(0) energies from MLPT prediction for monomer
- predicted_M_SCAN_D3BJ.dat SCAN+D3BJ energies from MLPT prediction for monomer
- predicted_M_SCAN_MBD.dat SCAN+MBD energies from MLPT prediction for monomer
- predicted_M_SCAN_RVV10.dat SCAN+RVV10 energies from MLPT prediction for monomer
- predicted_M_HSE06.dat HSE06 energies from MLPT prediction for monomer
- predicted_M_HSE06_D3.dat HSE06+D3(0) energies from MLPT prediction for monomer
- predicted_M_HSE06_D3BJ.dat HSE06+D3BJ energies from MLPT prediction for monomer
- predicted_M_HSE06_MBD.dat HSE06+MBD energies from MLPT prediction for monomer
- predicted_M_vdW-DF.dat vdW-DF energies from MLPT prediction for monomer
- predicted_M_vdW-DF-cx.dat vdW-DF-cx energies from MLPT prediction for monomer
- predicted_M_vdW-DF2-B86R.dat vdW-DF2-B86R energies from MLPT prediction for monomer
- predicted_M_optB86b-vdW.dat optB86b-vdW energies from MLPT prediction for monomer
- predicted_M_CCSDT.dat CCSD(T) energies from MLPT prediction for monomer
MLMC
- MC_HSE06.dat HSE06 energies from HSE06 MLMC
- MC_vdW-DF.dat vdW-DF energies from vdW-DF MLMC
- MC_CCSDT.dat CCSD(T) energies from CCSD(T) MLMC
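The deposited energy files are named with underscores between the functional and its dispersion correction (for example `training_D_PBE_D3.dat`), and CCSD(T) and PBE+TS/HI appear as `CCSDT` and `PBE_HI`. When looping over methods in a script, a small helper can translate a method label into the file name. This is only an illustrative sketch: `energy_filename` and `SPECIAL_TOKENS` are hypothetical names, and the mapping covers only the training and prediction files listed above.

```python
# Method labels whose file-name token is not obtained by simply
# replacing "+" with "_" (taken from the file list of this deposit).
SPECIAL_TOKENS = {
    "CCSD(T)": "CCSDT",
    "PBE+TS/HI": "PBE_HI",
}


def energy_filename(kind: str, system: str, method: str) -> str:
    """Build the name of a training or prediction energy file.

    kind   : "training" or "predicted"
    system : "M" (monomer) or "D" (dimer)
    method : label as used in the descriptions, e.g. "PBE+D3(0)", "SCAN+RVV10"
    """
    if kind not in ("training", "predicted"):
        raise ValueError(f"unknown kind: {kind!r}")
    if system not in ("M", "D"):
        raise ValueError(f"unknown system: {system!r}")
    token = SPECIAL_TOKENS.get(method, method.replace("+", "_"))
    # "D3(0)" (zero damping) is written simply as "D3" in the file names.
    token = token.replace("(0)", "")
    return f"{kind}_{system}_{token}.dat"
```

For example, `energy_filename("predicted", "M", "CCSD(T)")` gives `predicted_M_CCSDT.dat`.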
Methods
Ab initio density functional theory (DFT) simulations were performed using the periodic DFT software package VASP [1-3]. Coupled-cluster (CCSD(T)) [4] calculations were performed using the program ORCA (version 5.0.4) [5,6].
Ab initio molecular dynamics (AIMD) calculations with a time step of 0.5 fs were performed at the PBE+D2 level. The simulation temperature of 300 K was maintained with an Andersen thermostat [7], and the mass of the hydrogen atoms was increased to 3 amu. The total length of all AIMD trajectories was 100 ps, out of which the equilibration period (up to 15 ps), identified via the Mann-Kendall test [8], was discarded.
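The Mann-Kendall test mentioned above checks a time series for a monotonic trend. The sketch below shows the textbook form of the statistic (without a correction for ties) and one possible way of using it to locate the end of an equilibration period. How the test was actually applied in the original work (block size, significance level, monitored quantity) is not stated here, so `equilibration_blocks`, `z_crit` and `min_blocks` are assumptions made purely for illustration.

```python
import math
from typing import Optional, Sequence, Tuple


def mann_kendall(series: Sequence[float]) -> Tuple[int, float]:
    """Return the Mann-Kendall statistic S and its normalized score Z."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference
    # Variance of S for a series without tied values.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


def equilibration_blocks(
    block_means: Sequence[float], z_crit: float = 1.96, min_blocks: int = 8
) -> Optional[int]:
    """Number of leading blocks to discard so that the rest shows no trend.

    Hypothetical procedure: drop blocks from the start one at a time until
    |Z| falls below z_crit (1.96 corresponds to a two-sided 5 % level).
    Returns None if no trend-free tail with at least min_blocks is found.
    """
    for start in range(len(block_means) - min_blocks + 1):
        _, z = mann_kendall(block_means[start:])
        if abs(z) < z_crit:
            return start
    return None
```

The double loop scales quadratically with the series length, so it is meant for block averages of the energy rather than for every AIMD step.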
To determine the energies at the target level of theory (CCSD(T) and various density functional approximations, DFAs), we employed the Machine Learning Perturbation Theory (MLPT) approach using data obtained with a computationally less expensive method (PBE+D2). The training set consisted of 100 configurations, for which single-point calculations were performed at the target level. Within MLPT, kernel ridge regression [9], as implemented in the scikit-learn library [10,11], is used with the REMatch kernel [12] and the SOAP (smooth overlap of atomic positions) descriptors [13], as implemented in the DScribe library [14].
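Once target-level energies are available for the configurations sampled at the PBE+D2 level, the perturbative step can be illustrated with the standard free-energy perturbation (exponential averaging) expression, ΔA = -k_B T ln⟨exp(-(E_target - E_PBE+D2)/k_B T)⟩, where the average runs over the PBE+D2 ensemble. The sketch below assumes this standard formula, assumes that the two energy lists are total energies in eV aligned configuration by configuration, and uses the hypothetical function name `free_energy_shift`. It is not the authors' code.

```python
import math
from typing import Sequence

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def free_energy_shift(
    e_production: Sequence[float],
    e_target: Sequence[float],
    temperature: float = 300.0,
) -> float:
    """Free-energy difference (eV) between target and production levels.

    e_production : energies of the sampled configurations at the level
                   used for sampling (here assumed to be PBE+D2)
    e_target     : energies of the same configurations at the target level
    """
    if len(e_production) != len(e_target) or len(e_production) == 0:
        raise ValueError("energy lists must be non-empty and of equal length")
    beta = 1.0 / (K_B_EV * temperature)
    exponents = [-beta * (t - p) for p, t in zip(e_production, e_target)]
    # Log-sum-exp trick: factor out the largest exponent to avoid overflow.
    shift = max(exponents)
    total = sum(math.exp(x - shift) for x in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -log_mean / beta
```

If the energy difference is the same for every configuration, the function returns exactly that difference. Otherwise the result lies below the plain average of the differences, because low-energy-difference configurations dominate the exponential average.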
In cases of poor overlap, corresponding to an index Iw < 0.05, we performed the machine learning Monte Carlo (MLMC) resampling proposed by Herzog et al. [15].
References:
[1] G. Kresse and J. Hafner, Phys. Rev. B 48, 13115 (1993)
[2] G. Kresse and J. Hafner, J. Phys. Condens. Matter 6, 8245 (1994)
[3] G. Kresse and J. Hafner, Phys. Rev. B 47, 558 (1993)
[4] K. Raghavachari, G. W. Trucks, J. A. Pople, and M. Head-Gordon, Chem. Phys. Lett. 157, 479 (1989)
[5] F. Neese, Wiley Interdiscip. Rev. Comput. Mol. Sci. 2, 73 (2011)
[6] F. Neese, Wiley Interdiscip. Rev. Comput. Mol. Sci. 12 (2022), 10.1002/wcms.1606
[7] H. C. Andersen, J. Chem. Phys. 72, 2384 (1980)
[8] S. K. Schiferl and D. C. Wallace, J. Chem. Phys. 83, 5203 (1985)
[9] M. Rupp, Int. J. Quantum Chem. 115, 1058 (2015)
[10] L. Himanen, M. O. Jäger, E. V. Morooka, F. F. Canova, Y. S. Ranawat, D. Z. Gao, P. Rinke, and A. S. Foster, Comput. Phys. Commun. 247, 106949 (2020)
[11] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, A. Müller, J. Nothman, G. Louppe, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, J. Mach. Learn. Res. (2012), 10.48550/ARXIV.1201.0490
[12] S. De, A. P. Bartók, G. Csányi, and M. Ceriotti, Phys. Chem. Chem. Phys. 18, 13754 (2016)
[13] A. P. Bartók, R. Kondor, and G. Csányi, Phys. Rev. B 87, 184115 (2013)
[14] L. Himanen, M. O. Jäger, E. V. Morooka, F. Federici Canova, Y. S. Ranawat, D. Z. Gao, P. Rinke, and A. S. Foster, Comput. Phys. Commun. 247, 106949 (2020)
[15] B. Herzog, M. C. da Silva, B. Casier, M. Badawi, F. Pascale, T. Bučko, S. Lebègue, and D. Rocca, J. Chem. Theory Comput. 18, 1382 (2022)