Data from: High-throughput computational screening of bioinspired dual atom alloys for CO2 activation
Data files
Oct 02, 2024 version files (2.05 GB)
- catML-dryad.tar (2.05 GB)
- README.md (2.81 KB)
Abstract
This is a large DFT dataset from a screening of dual atom alloys (DAAs) embedded in the Cu(111) surface for their ability to bind CO and CO2. It also includes their relative formation energies and relevant machine learning features.
README: Data for "High-throughput computational screening of bioinspired dual atom alloys for CO2 activation"
https://doi.org/10.5061/dryad.866t1g20b
Description of the data and file structure
A high-throughput screening study of catalytic binding energies and thermodynamics was performed by replacing atom sites in the Cu(111) surface with different transition metals.
File: catMLfinal.tar
Subfolders contain their own more detailed READMEs
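As a minimal sketch for getting oriented before extraction, Python's standard `tarfile` module can list the top-level entries of the archive. The helper name `top_level_entries` is hypothetical and is not one of the dataset's own scripts, and the archive path is whatever location the tar file was downloaded to.

```python
import tarfile


def top_level_entries(archive_path):
    """Return the sorted top-level names stored in a tar archive."""
    names = set()
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # Treat "./plots/x.csv" and "plots/x.csv" the same way: keep "plots".
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if parts:
                names.add(parts[0])
    return sorted(names)
```

Calling `top_level_entries("catMLfinal.tar")` (adjust the path as needed) should return the directory names described in this README. Reading the member list does not unpack anything, so it is a cheap check before committing about 2 GB of disk space to extraction.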
/plots
All of the primary metadata is contained in the ./plots directory, which contains all the information used for plots in the original manuscript.
In all CSV files:
- Dopants lists the atom types inserted into the DAA
- xxx BE(eV) is the calculated binding energy of the xxx species on the surface
- Elemental FE(eV) is the calculated formation energy for the dual atom alloy species
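The column layout above can be read with Python's standard `csv` module, as in the following sketch. Several things here are assumptions: the exact header spelling (for example `CO2 BE(eV)`), the dopant label format, and the helper name `read_energy_table`. The sample rows are made-up placeholders that only mimic the layout and are not values from the dataset.

```python
import csv
import io

# Placeholder rows that only mimic the documented column layout;
# the numbers are made up and are NOT values from the dataset.
SAMPLE_CSV = """Dopants,CO2 BE(eV),Elemental FE(eV)
A-B,-0.10,0.20
C-D,-0.30,-0.40
"""


def read_energy_table(handle, species="CO2"):
    """Parse rows into dicts with the dopant label and float energies."""
    be_column = f"{species} BE(eV)"  # assumed header spelling
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "dopants": row["Dopants"],
            "binding_energy_eV": float(row[be_column]),
            "formation_energy_eV": float(row["Elemental FE(eV)"]),
        })
    return rows


rows = read_energy_table(io.StringIO(SAMPLE_CSV))
# Example query: the entry with the most negative binding energy.
strongest = min(rows, key=lambda r: r["binding_energy_eV"])
print(strongest["dopants"])
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` wrapper, and check the header row of that file first, since the actual column names may differ in spacing or species label.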
/BE-FEscatter contains raw data for the initial screening of all 400 SAAs and DAAs
- sorted by early/mid/late transition metals and row of periodic table
/ML contains the data used in Fig. 3, which covers the machine learning feature selection and a plot of specific values
- featureselection.py performs the ML feature selection and plots feature importance
- the features and methods used can be tuned
- correlationmatrix.py produces the Pearson correlation plot
- crossvalidate.py produces n-fold cross-validation plots for each ML method
- plotallscatter, plotscatter, and plot produce basic scatter plots from input CSV files
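For readers unfamiliar with the quantity behind the Pearson correlation plot, the sketch below computes it for every pair of feature columns using only the standard library. It is not the code in correlationmatrix.py, which may well rely on other libraries; the function names `pearson` and `correlation_matrix` are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each ordered pair of column names to its Pearson coefficient."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}
```

Here `columns` is a dict mapping a feature name to a list of numbers, one per alloy. Values near +1 or -1 between two features indicate that they carry largely redundant information, which is the usual motivation for inspecting such a matrix before feature selection.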
/BE-COscatter contains the Fig. 4 data for CO binding and CO2 binding
- Lines and colors in the final plot were added in post-processing
/accurate contains the data for the SI plot comparing gamma-point and denser k-point grid calculations
Each directory contains its own README (except pseudos, tests, and manuscripts)
/COLE
./COLE contains adsorbate DFT relaxation calculations and code from the collaborator Cole Clark
/database
./database contains all of the possible DAA combinations of two different dopants
Each subdirectory in ./database*/dopants/ (must be unzipped first) contains all raw DFT relaxation calculations for that DAA
./databasehomo
contains DAAs formed by homodoping (two dopant atoms of the same element)
./databasesingle
contains the SAA for each element and pure Cu
./databaseCO
contains all DAAs and SAAs from the second step of screening
./scripts
contains generation and workup scripts
./pseudos
contains the pseudopotentials that were used for this project
- norm-conserving, srl, generated with OPIUM
Code/software
Python version 3 and Quantum ESPRESSO version 7 are required to reproduce the results. Only Python is required for viewing the plots.
Access information
Data was derived from the following sources:
Methods
This dataset was generated from atomic relaxations using density functional theory and processed with machine learning algorithms from scikit-learn.