Data from: High-throughput computational screening of bioinspired dual atom alloys for CO2 activation

Behrendt, Drew1; Banerjee, Sayan1; Clark, Cole1; Rappe, Andrew M.1

Research facility: University of Pennsylvania

Published Oct 02, 2024 on Dryad. https://doi.org/10.5061/dryad.866t1g20b

Data files

Oct 02, 2024 version files (2.05 GB)

catML-dryad.tar (2.05 GB)
README.md (2.81 KB)

Abstract

This is a large DFT dataset from a screening of dual atom alloys (DAAs) embedded in the Cu(111) surface for their ability to bind CO and CO2. It also includes their relative formation energies and relevant machine learning features.

Description of the data and file structure

A high-throughput screening study of catalytic binding energies and thermodynamics was performed by substituting different transition metals at atom sites in the Cu(111) surface.

File: catMLfinal.tar

Subfolders contain their own more detailed READMEs
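Before unpacking an archive of this size, it can help to check which top-level folders it holds. The following is a minimal sketch using Python's standard tarfile module; the helper name top_level_entries is hypothetical and not part of the dataset, and the archive path should be whatever name the downloaded file actually has.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names found inside a tar archive."""
    names = set()
    # Mode "r" lets tarfile detect compression (if any) automatically.
    with tarfile.open(archive_path, "r") as tar:
        for member in tar:
            # PurePosixPath drops a leading "./", so "./plots/a.csv"
            # is reported under "plots".
            parts = [p for p in PurePosixPath(member.name).parts if p != "/"]
            if parts:
                names.add(parts[0])
    return sorted(names)
```

If the archive wraps everything in a single root folder, the function returns only that folder name; in that case the folders described below sit one level deeper.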

/plots

All of the primary metadata is in the ./plots directory, which holds all the information used for the plots in the original manuscript.

In all CSV files:

Dopants is the atom types inserted into the DAA (dual atom alloy)
xxx BE(eV) is the calculated binding energy, in eV, of the xxx species on the surface
Elemental FE(eV) is the calculated formation energy, in eV, of the dual atom alloy species
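The column layout described above can be read with Python's standard csv module. This is a minimal sketch, not the authors' workflow: the header "CO2 BE(eV)" is assumed from the "xxx BE(eV)" pattern, the dopant labels and numbers are placeholders invented for illustration (they are not values from the dataset), and load_rows is a hypothetical helper.

```python
import csv
import io

# Placeholder rows for illustration only; NOT values from the dataset.
# The exact header spelling in the real files may differ.
SAMPLE_CSV = """Dopants,CO2 BE(eV),Elemental FE(eV)
A-B,-0.10,0.20
C-D,-0.50,-0.30
"""


def load_rows(handle):
    """Parse a CSV handle into dicts, converting every '(eV)' column to float."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for key, value in raw.items():
            row[key] = float(value) if key.endswith("(eV)") else value
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
# Sort by the assumed CO2 binding-energy column, most negative value first.
ranked = sorted(rows, key=lambda r: r["CO2 BE(eV)"])
print([r["Dopants"] for r in ranked])
```

To use a real file, pass an open file handle (for example `open(path, newline="")`) to load_rows in place of the in-memory sample.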

/BE-FEscatter contains raw data for the initial screening of all 400 SAAs (single atom alloys) and DAAs

The data are sorted by early/mid/late transition metals and by row of the periodic table

/ML contains the data used in Fig. 3, which shows the machine learning feature selection and a plot of specific values

featureselection.py performs the ML feature selection and plots feature importance
- the features and methods used can be tuned
correlationmatrix.py plots the Pearson correlation matrix
crossvalidate.py produces n-fold cross-validation plots for each ML method
plotallscatter, plotscatter, and plot produce basic scatter plots from input CSV files
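For readers unfamiliar with the quantity that correlationmatrix.py plots, the sketch below computes Pearson correlation coefficients in plain Python. It is an illustration of the statistic only and is not taken from the dataset's scripts; the function names and the feature names f1, f2, f3 are hypothetical, and the numbers are placeholders.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each ordered pair of column names to its Pearson coefficient."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Placeholder feature columns, invented for illustration.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.0],
    "f3": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(features)
```

Pairs of features with a coefficient close to +1 or -1 carry largely redundant information, which is one common reason to inspect such a matrix before feature selection.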

/BE-COscatter contains the Fig. 4 data for CO binding and CO2 binding

Lines and colors in the final plot were added in post-processing

/accurate contains the comparison between Gamma-point and denser k-point grid calculations that is plotted in the SI

Each directory contains its own README (except pseudos, tests, and manuscripts)

/COLE

./COLE contains adsorbate DFT relaxation calculations and code from the collaborator Cole Clark

/database

./database contains all of the possible DAA combinations of two different dopants

Each subdirectory in ./database*/dopants/ (which must be unzipped first) contains all raw DFT relaxation calculations for that DAA

./databasehomo

contains DAAs in which both dopant atoms are the same element (homodoping)

./databasesingle

contains the SAA for each element, plus pure Cu

./databaseCO

contains all DAAs and SAAs from the second step of screening

./scripts

contains generation and workup scripts

./pseudos

contains the pseudopotentials that were used for this project
- norm-conserving, scalar-relativistic (srl), generated by OPIUM

Code/software

Python version 3 and Quantum ESPRESSO version 7 are required to reproduce the results. Only Python is required for viewing the plots.
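A quick way to see whether both requirements are present is sketched below. It assumes the Quantum ESPRESSO plane-wave executable is installed under its usual name, pw.x, and is on PATH; the helper check_environment is hypothetical, and the sketch does not verify the Quantum ESPRESSO version number.

```python
import shutil
import sys


def check_environment(qe_executable="pw.x"):
    """Report whether Python 3 is running and a QE executable is on PATH."""
    return {
        "python3": sys.version_info.major >= 3,
        # shutil.which returns None when the executable is not found on PATH.
        "quantum_espresso": shutil.which(qe_executable) is not None,
    }


status = check_environment()
print(status)
```

If "quantum_espresso" is reported as False, the plotting scripts can still be run, since only Python is needed for viewing plots.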

Access information

Data was derived from the following sources:

https://doi.org/10.1021/jacs.2c13253